How to create and use containers

How to use Singularity containers on rhino for interactive data analysis in R and Python

Steps on the remote machine (for example, Fred Hutch rhino cluster).

  • In a terminal, log in to rhino:
ssh rhino02
  • Make sure that any conda initialization is commented out in your .bashrc or .bash_profile file on the remote machine. This step is important. Otherwise, VScode will not recognize the conda environments within the Singularity container.

  • Do the remote operations below from within a tmux session so that you can detach (press Ctrl-b, then d), log out of your remote session, and still keep the container running. You can reattach later with tmux attach -t tunnel.

  • Start a tmux session:

tmux new -s tunnel
  • Make Singularity available:
module load Singularity
  • Go to the folder with our lab Singularity images and start the appropriate Singularity container (-B /fh bind-mounts the /fh filesystem into the container):

cd /fh/working/subramaniam_a/singularity/
singularity exec -B /fh r_python_1.3.0.sif /bin/bash

  • Start a VScode CLI tunnel from within the container:
./code tunnel --cli-data-dir=/home/$USER/R_PYTHON 

If you are doing this for the first time, you will have to log in to GitHub using the displayed code and also name the tunnel.

  • Give the tunnel the name of the container so that you can easily identify it later.
  • You can open new tmux panes within the same tmux session and run multiple different containers at the same time.

Steps on local machine (for example, your lab desktop computer)

  • On VScode, open a remote window
  • Search for the command Remote-Tunnels: Connect to Tunnel.... Alternatively, click the >< symbol in the bottom left-hand corner to Open Remote Window.
  • You might have to login to your GitHub account at this point.
  • Select the active tunnel that you started on rhino.
  • You can now open any folder on the remote machine and create a Jupyter notebook.
  • You should be able to pick the Python interpreter at /opt/conda/bin/python or the Jupyter R kernel at /opt/conda/envs/R/lib/R/bin/R to run your Python or R notebook.

How to use Singularity containers in Snakemake workflows on the Fred Hutch cluster

We often use our lab's standard Docker containers for running specific Snakemake rules on the Fred Hutch cluster. If you use a specification of the form singularity: "docker://ghcr.io/rasilab/r:1.0.0" within a Snakemake rule, then this Docker container will be pulled to your local directory, converted to a Singularity container, and stored at .snakemake/singularity in your current working directory. However, it takes a long time to download and convert each Docker container and each container also uses up several GB of space. To avoid this, Rasi has stored a copy of all commonly used Singularity containers in our lab at /fh/working/subramaniam_a/singularity. You can symbolically link this folder to your workflow by executing the following from the folder that contains your Snakemake file:

mkdir .snakemake
cd .snakemake
ln -s /fh/working/subramaniam_a/singularity .

This will make all common Singularity containers (files ending with .simg) immediately available to your Snakemake workflow. Note that Snakemake names the Singularity containers based on their SHA IDs. You can get the SHA ID for any container by using the hash_generator.py script located in the above folder. For example:

python3 hash_generator.py docker://ghcr.io/rasilab/r_python:1.3.0
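For reference, the singularity: specification mentioned at the top of this section sits inside an individual Snakemake rule. A minimal sketch (the rule name, input/output files, and script name are illustrative, not from our workflows):

```python
# Snakefile fragment: Snakemake pulls the container (or finds it in
# .snakemake/singularity) the first time this rule runs with --use-singularity.
rule plot_counts:
    input: "results/counts.tsv"
    output: "figures/counts.pdf"
    singularity: "docker://ghcr.io/rasilab/r:1.0.0"
    script: "scripts/plot_counts.R"
```

Remember to invoke Snakemake with the --use-singularity flag; otherwise the directive is ignored.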

General organization for Docker and Singularity images in the lab

  • Each Docker image should be built from a Dockerfile that lives in its own public GitHub repository in the rasilab GitHub organization account.

  • All Docker images should be publicly available in our lab's GitHub container repository.

  • For routine analyses and writing, use our lab's default R, Python, and Latex images. These images contain commonly used libraries in our lab, as you can see in the Dockerfile of each image.

  • If you require additional R, Python, or Latex libraries not present in the above default images, build child images based on the above base images. For example, see the MaGeCK image built on our default Python image.

  • If you need standalone bioinformatics packages, derive them from the Miniconda3 4.12.0 base image. For example, see the Samtools package in our lab's package repository. You can also use these images as the basis for other packages. For example, see the Bowtie2 package in our lab's package repository.

  • Note that Docker is not allowed on the Fred Hutch cluster. You can use Singularity images instead.

Steps to build and tag Docker images

To create or update a Dockerfile and the associated image, follow these steps:

  • Edit the Dockerfile in VScode to add or remove software. Always make modifications at the end of the Dockerfile so that only a few intermediate images need to be re-created. This decreases the space and time requirements for building and pushing the image.

  • If you are creating a new GitHub repository for a new package, use a Dockerfile and README template from a previous package GitHub repository (e.g., MaGeCK). Make sure to set the repository name and description appropriately, and update its visibility to public.

  • Build the Docker image from the folder that contains the Dockerfile. Give a specific tag (see below for tag convention).

    docker build -t <image_name:tag> .
    # example
    docker build -t ghcr.io/rasilab/mageck:0.5.9 .
    
  • Assuming the above build command is successful, first commit and tag the Dockerfile. If you created a new GitHub repository, you might have to set the remote using git remote set-url origin <url>; GitHub displays this URL when you create the repository.

    git add Dockerfile
    git commit -m "Update Dockerfile with <software>"
    git tag -a <tag> -m "Update Dockerfile with <software>"
    git push origin <tag>
    git push --all
    
  • We use the same tag for the GitHub repository containing the Dockerfile and the corresponding image. Both tags follow the semantic versioning convention. In practice, this means:

    1. Use tags of the form X.Y.Z, starting at 0.1.0 (X=0, Y=1, Z=0).

    2. If your image is for a single software package (e.g., MaGeCK), then use the same version tag as the software you are installing. For example, see our lab's MaGeCK image.

    3. If you fix an error in a Dockerfile/image that contains several different packages (e.g., our lab's Python image), increment the third digit Z.

    4. If you add an extra package that will not break compatibility of the Docker image with previous versions, increment the second digit Y.

    5. If you make any incompatible changes, increment the first digit X (avoid this as much as possible).
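As a concrete illustration of these rules, here is how a tag would be bumped in each case (this helper is purely illustrative and not part of the lab's tooling):

```shell
# bump: print the next X.Y.Z tag given the current tag and the kind of change.
# patch = bug fix (Z), minor = compatible addition (Y), major = breaking change (X).
bump() {
  local ver=$1 part=$2 x y z
  x=${ver%%.*}; z=${ver##*.}; y=${ver#*.}; y=${y%.*}
  case $part in
    major) echo "$((x+1)).0.0" ;;
    minor) echo "${x}.$((y+1)).0" ;;
    patch) echo "${x}.${y}.$((z+1))" ;;
  esac
}
bump 0.1.0 patch   # prints 0.1.1
bump 0.1.1 minor   # prints 0.2.0
```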

  • Push the Docker image to the GitHub container registry. The first time you do this from a computer, you will have to set up a Personal Access Token (classic version) as explained here. Make sure to give the write:packages permission when you create the token.

    docker push <image_name:tag>
    # example
    docker push ghcr.io/rasilab/python:1.2.0
    
    - Make sure to make the GitHub package and repository public so that you and other lab members can access it from the Fred Hutch cluster.
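The first-time registry login mentioned above can be done by piping the token to docker login (a sketch; USERNAME is a placeholder for your GitHub username, and CR_PAT is assumed to hold your Personal Access Token):

```shell
# Log docker in to the GitHub container registry with a classic PAT.
# CR_PAT and USERNAME are placeholders, not lab-specific values.
echo $CR_PAT | docker login ghcr.io -u USERNAME --password-stdin
```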

  • You can pull these images as Singularity images using the following command (used on Fred Hutch cluster):

    module load Singularity
    singularity pull docker://<image_name:tag>
    # example
    singularity pull docker://ghcr.io/rasilab/r_python:1.3.0
    

How to use images locally

  • You can pull images to your local machine with:

    docker pull ghcr.io/rasilab/PACKAGE_NAME:X.Y.Z
    
  • You can create a named container from the above image by:

    docker run -i -d --name pandoc-latex -v $HOME:$HOME ghcr.io/rasilab/pandoc-latex:1.1.0
    
  • You can use the container to run a command. For example,

    docker exec -w $(pwd) pandoc-latex pandoc manuscript.md --citeproc --template=template.tex --metadata-file=pandoc-options.yaml --pdf-engine=xelatex -o manuscript.pdf --filter=pandoc-svg.py
    
  • You can also use the containers in a VSCode .devcontainer.json file (e.g., to run Jupyter notebooks):

    "image": "ghcr.io/rasilab/r:1.0.0",