README.md 8.99 KiB
Stefan Kesselheim committed
# Singularity + Docker + Jupyter
In this recipe, you will learn how to create your own container-based environment that
you can use at home and on the supercomputer. The picture below outlines the workflow:
![](img/outline.png)
You will create a docker-based environment on your machine that can be used to run 
a Jupyter server. Then you transfer the docker image to the supercomputer and make it available 
within Jupyter-JSC. Of course, this works without Jupyter, but it is much less fun.

In general, using singularity containers on HPC systems is recommended. If you use containers from the
[NVIDIA container registry](https://catalog.ngc.nvidia.com/), you can be quite sure that performance is as good as it gets.

The workflow is compatible with Windows, Linux, and macOS, but only if your host has an x86 architecture (newer Macs might pose a problem).   
However, we did not get a chance to try this. 

You will perform the following steps:

0. Install docker
1. Create a docker container that contains the environment
2. Run the docker container to serve a local jupyter server and to execute programs
3. Export the docker image, transfer it to the supercomputer and convert it to a singularity container
4. Run things in the singularity container.

Finally, we say a few words about recommended workflows and give a few details that we have omitted before. 

# Install docker on your home machine
This step depends on the OS you use. Please follow the instructions on the [docker web page](https://docs.docker.com/get-docker/). After that you should be able to 
start a simple docker container: 
```bash
docker run -it ubuntu bash
```
You can leave the container by typing `exit`.

On Windows, it is highly recommended to install the [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/about). This will provide you with the WSL console, where you have a Linux-like 
environment. Please check that you can execute the commands above. On macOS and Linux, you already have the required environment. 

For convenience, we recommend enabling the option to run docker without the sudo command. On Linux, you can follow [this procedure](https://docs.docker.com/engine/install/linux-postinstall/). Otherwise, you will need 
to adjust the scripts in the `docker` subdirectory.

# Create a docker image and container
In this step, you will create a custom docker image and a docker container that contains the environment.
First, clone this repository and `cd` into it. Please pick a good path for that, as you might keep this repository for a long time. Here we assume it is `/path/to`.
```bash
cd /path/to
git clone https://gitlab.jsc.fz-juelich.de/AI_Recipe_Book/recipes/singularity_docker_jupyter
cd singularity_docker_jupyter
```
Everything related to the docker image is in the subdirectory `docker`. The rules to build the docker image are found in the [Dockerfile](docker/Dockerfile). If you look inside, you will 
see that we start from a plain Ubuntu image, install Python with `apt-get`, and install Jupyter with `pip`. Build the docker image with the following commands:
```bash
cd docker
./build.sh
```
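The contents of `build.sh` are not reproduced in this README. As a rough sketch only, assuming the script is a thin wrapper around `docker build`, the build step could look like the hypothetical function below. The function name `build_image` and the `DRY_RUN` switch are illustrative and not part of the repository.

```bash
# Hypothetical sketch of a build step; the real build.sh may differ.
# Set DRY_RUN=1 to print the command instead of executing it.
build_image() {
  # First argument: build context containing the Dockerfile (default: current directory).
  local context="${1:-.}"
  ${DRY_RUN:+echo} docker build -t singularity_docker_jupyter "$context"
}
```

Calling `build_image` inside the `docker` subdirectory would then build and tag the image; `DRY_RUN=1 build_image` only prints the command.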
The build script will build the image and tag it with `singularity_docker_jupyter`. Once the image is built, you can run the jupyter server with the script `run_jupyter.sh`: 
```bash
./run_jupyter.sh
```
The run script will start a docker container hosting a jupyter server that you can access by navigating to http://localhost:8889/. It will be restarted automatically when your system reboots or the container exits. In order to permanently
remove it, execute
```bash
docker rm -f singularity_docker_jupyter_cont
```
We have also created small scripts to run commands and an interactive shell in the container. Execute
```bash
./run_bash_in_container.sh
```
to run an interactive bash shell inside of the container. Exit it by typing `exit`. Executing commands is possible with
the script `run_command_in_container.sh`. If you execute 
```bash
./run_command_in_container.sh python3 --version
```
it will invoke the Python interpreter that has been installed into the container image and show you its version. Of course, you can execute any other command instead and pass arbitrary arguments.

A note on GPUs: if you want to be able to use GPUs inside of the container, you must use the [NVIDIA Container runtime](https://developer.nvidia.com/nvidia-container-runtime) and 
add the flag `--runtime=nvidia` to all calls to `docker run`.
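To check that GPUs are visible from a container, a quick test along the following lines may help. This is a sketch that follows the sample workload from the NVIDIA Container Toolkit documentation; the `--gpus all` flag is taken from there, not from this recipe, and needs a reasonably recent Docker version. The function name and the `DRY_RUN` switch are hypothetical.

```bash
# Sketch: run nvidia-smi in a throwaway container using the NVIDIA runtime.
# Assumes the NVIDIA driver and container runtime are installed on the host.
# Set DRY_RUN=1 to print the command instead of executing it.
check_gpu_in_container() {
  ${DRY_RUN:+echo} docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
}
```

If `check_gpu_in_container` prints the usual `nvidia-smi` table, the runtime is set up correctly.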

# Export the docker container 
Before you start this step, ssh to the supercomputer and clone the repository there as well.
```bash
cd /path/to
git clone https://gitlab.jsc.fz-juelich.de/AI_Recipe_Book/recipes/singularity_docker_jupyter
cd singularity_docker_jupyter
```

In this step, you will export the docker image you have created as a singularity container. It requires the following steps:

**1.** Save the docker image in a tarball
```bash
docker save singularity_docker_jupyter -o singularity_docker_jupyter.tar
```
Note that a tarball `singularity_docker_jupyter.tar` has been created in your current working directory. 

**2.** Copy the image to one of the JSC machines. In this example, we place the file into the directory where we have
cloned the repo before. 
```bash
scp singularity_docker_jupyter.tar surname1@jusuf.fz-juelich.de:/path/to/singularity_docker_jupyter
```
Note that this can take a while, depending on your connection.

**3.** ssh to the machine and convert the tarball into a singularity image. 
```bash
ssh surname1@jusuf.fz-juelich.de
cd /path/to/singularity_docker_jupyter
module load Singularity-Tools
singularity build singularity_docker_jupyter.sif docker-archive://singularity_docker_jupyter.tar
```
This will create a file `singularity_docker_jupyter.sif` in your current working directory on the supercomputer.
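If you repeat the export often, the three steps can be wrapped in a small helper. The function below is only a sketch: the name `export_image`, its arguments, and the `DRY_RUN` switch are hypothetical, and it assumes that the `module` command is available in a login shell started through `ssh`, which may not hold on every system.

```bash
# Sketch: save the image, copy it, and convert it on the remote machine.
# Usage: export_image <user@host> <remote_dir>
# Set DRY_RUN=1 to print the commands instead of executing them.
export_image() {
  local image="singularity_docker_jupyter"
  local remote="$1"      # e.g. surname1@jusuf.fz-juelich.de
  local remote_dir="$2"  # e.g. /path/to/singularity_docker_jupyter

  # Step 1: save the docker image in a tarball.
  ${DRY_RUN:+echo} docker save "$image" -o "$image.tar" || return 1

  # Step 2: copy the tarball to the remote directory.
  ${DRY_RUN:+echo} scp "$image.tar" "$remote:$remote_dir" || return 1

  # Step 3: convert the tarball into a singularity image on the remote side.
  # Assumption: a login shell (bash -l) provides the `module` command.
  ${DRY_RUN:+echo} ssh "$remote" \
    "bash -lc 'cd $remote_dir && module load Singularity-Tools && singularity build $image.sif docker-archive://$image.tar'"
}
```

Running `DRY_RUN=1 export_image surname1@jusuf.fz-juelich.de /path/to/singularity_docker_jupyter` first lets you inspect the commands before anything is transferred.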

If your local machine is a Linux machine with singularity installed, you also have the option to create the singularity image `singularity_docker_jupyter.sif` on your local machine.
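For the local variant, singularity can read an image directly from the local docker daemon through the `docker-daemon://` source, so no tarball is needed. The following is a minimal sketch; it assumes the image carries the default tag `latest`, and the function name and `DRY_RUN` switch are hypothetical.

```bash
# Sketch: build the .sif locally from the image held by the docker daemon.
# Assumes singularity is installed locally and the image is tagged "latest".
# Set DRY_RUN=1 to print the command instead of executing it.
build_sif_locally() {
  ${DRY_RUN:+echo} singularity build singularity_docker_jupyter.sif \
    docker-daemon://singularity_docker_jupyter:latest
}
```

The resulting `.sif` file can then be copied to the supercomputer with `scp` in place of the tarball.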

# Use the container with singularity on the supercomputer
The typical usage of a singularity container is `singularity run image.sif command --with-args`. We demonstrate this 
in an example submission script, `example_submission_script.sh`, that you can submit to Slurm. It is also possible to 
execute it directly on the login node. Here is the output:
![](img/singularity1.png)
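The repository's `example_submission_script.sh` is not shown in this README. To give an idea of the general shape, the snippet below writes a hypothetical batch script named `my_submission_script.sh`. The account, partition, and time limit are placeholders you would need to replace, and the sketch uses `singularity exec`, which runs the given command directly, rather than `singularity run`.

```bash
# Sketch: write a minimal Slurm batch script that runs a command in the container.
# All #SBATCH values below are placeholders, not values from the recipe.
cat > my_submission_script.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=container_test
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --account=your_project
#SBATCH --partition=your_partition

# Make singularity available on the compute node.
module load Singularity-Tools

# Run the Python interpreter that is installed inside the container.
srun singularity exec singularity_docker_jupyter.sif python3 --version
EOF
```

After adjusting the placeholders, the script could be submitted with `sbatch my_submission_script.sh`.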

Furthermore, you can create a jupyter kernel for jupyter-jsc. We have created a small script that will 
do it for you. Execute
```bash
./create_kernel.sh
```
and it will create a file `~/.local/share/jupyter/kernels/singularity_docker_jupyter/kernel.json` that contains the description
of a kernel. After logging out of Jupyter-JSC and logging in again, you will be able to pick a kernel that
was made from your own singularity container:
![](img/jupyter_container.png)
If you look at the file `jupyter-jsc/kernel.sh`, you will see that it is only a thin wrapper that makes sure the kernel 
is executed in the singularity container.
![](img/kernel.png)
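To illustrate the idea of such a wrapper in text form, here is a sketch that writes a kernel description and a wrapper script into a directory of your choice. It is not the repository's `create_kernel.sh` or `kernel.sh`: the function name `write_kernel_spec`, the display name, and the assumptions that `ipykernel` is installed in the container and that `singularity` is on the `PATH` of the Jupyter session are all hypothetical.

```bash
# Sketch: create a Jupyter kernel spec whose kernel runs inside a singularity container.
# Usage: write_kernel_spec <absolute_kernel_dir> <absolute_path_to_sif>
write_kernel_spec() {
  local kernel_dir="$1"  # e.g. $HOME/.local/share/jupyter/kernels/my_container_kernel
  local sif="$2"         # absolute path to the .sif file

  mkdir -p "$kernel_dir" || return 1

  # Thin wrapper: forward all arguments from Jupyter to ipykernel in the container.
  cat > "$kernel_dir/kernel.sh" <<EOF
#!/bin/bash
exec singularity exec "$sif" python3 -m ipykernel_launcher "\$@"
EOF
  chmod +x "$kernel_dir/kernel.sh"

  # Kernel description: Jupyter replaces {connection_file} when it starts the kernel.
  cat > "$kernel_dir/kernel.json" <<EOF
{
  "argv": ["$kernel_dir/kernel.sh", "-f", "{connection_file}"],
  "display_name": "my_container_kernel",
  "language": "python"
}
EOF
}
```

Jupyter starts the program listed in `argv`, so pointing it at the wrapper is enough to move the kernel process into the container.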

# Workflow
Your first shot at creating an environment will never be your last one. Creating an environment is an iterative task, and you'll soon find a library or a tool that is required in 
your workflow. An option that may improve your productivity is to use a virtual environment that is stored outside of the container. This makes it much easier to add libraries. 
We have covered how to do this in another recipe. In general, we discourage the use of `docker commit`, even though it might feel like a very easy way to develop an environment step by step, because you give 
up the reproducibility you achieve with Dockerfiles. 
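As a rough illustration of the virtual-environment idea, and not necessarily the approach taken in the other recipe, the sketch below creates a virtual environment in a directory outside the image and installs packages into it with the container's Python. It assumes that the `venv` module is available in the container and that the target directory is visible inside it (singularity typically binds your home directory). The function names and the `DRY_RUN` switch are hypothetical.

```bash
# Sketch: keep additional Python packages in a virtual environment outside the image.
# Set DRY_RUN=1 to print the commands instead of executing them.

# Usage: create_external_venv <path_to_sif> <venv_dir>
create_external_venv() {
  local sif="$1" venv="$2"
  # --system-site-packages keeps the packages installed in the image visible.
  ${DRY_RUN:+echo} singularity exec "$sif" python3 -m venv --system-site-packages "$venv"
}

# Usage: install_into_venv <path_to_sif> <venv_dir> <package> [<package> ...]
install_into_venv() {
  local sif="$1" venv="$2"
  shift 2
  ${DRY_RUN:+echo} singularity exec "$sif" "$venv/bin/pip" install "$@"
}
```

Packages installed this way live in the virtual environment directory, so the image itself does not need to be rebuilt for every new library.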

# Details about how the docker container is started
The script `run_jupyter.sh` does a few things that are unusual when using docker. Here are the most important points.
* We do not store any information in the container. Instead, your home directory is mounted into the container with `-v $HOME:$HOME`, at the same path as on the 
host computer. This ensures that no path inconsistencies occur. 
* The `HOME` environment variable is exported into the container. This is done with the option `-e HOME`.
* The user id and group id are not set to `root`/0 as typical for docker, but with the option `--user $(id -u $USER):$(id -g $USER)` we make sure that UID and GID inside the container are the same as the ones of the user who starts the container.
All files modified in the container will be accessed with the UID and GID of the same user. If you are the only user, you will not even realize you are inside a container.
* The port 8888 of the container is mapped to the port 8889 of your local computer. The jupyter server started in the container by default serves on port 8888. To avoid conflicts with another potentially
running jupyter environment on your local machine, the container-based server serves on port 8889.
* The container is given the name `singularity_docker_jupyter_cont`
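Put together, the points above correspond to a `docker run` call roughly like the one sketched below. This is not a copy of `run_jupyter.sh`: the `-d` flag and the `--restart always` policy are guesses based on the restart behaviour described earlier, and the function name and `DRY_RUN` switch are hypothetical.

```bash
# Sketch: start the container with the options described in the list above.
# Any arguments are passed on as the command to run inside the container.
# Set DRY_RUN=1 to print the command instead of executing it.
start_jupyter_container() {
  # -v:     mount the home directory at the same path as on the host
  # -e:     export the HOME variable into the container
  # --user: use the UID and GID of the calling user instead of root
  # -p:     map container port 8888 to host port 8889
  ${DRY_RUN:+echo} docker run -d \
    --name singularity_docker_jupyter_cont \
    --restart always \
    -v "$HOME:$HOME" \
    -e HOME \
    --user "$(id -u):$(id -g)" \
    -p 8889:8888 \
    singularity_docker_jupyter \
    "$@"
}
```

`DRY_RUN=1 start_jupyter_container` prints the fully expanded command, which is a convenient way to compare it with what `run_jupyter.sh` actually does.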

We start the jupyter server with a few options:
* We don't restrict IPs that can use the server
* We don't open a browser
* We disable access tokens
* We use the `$HOME` directory as base directory for the jupyter server.  
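With the classic Jupyter Notebook server, these four options could be expressed as in the sketch below. The exact command used in the image is not shown here, so treat the flags as an assumption; in particular, newer Jupyter versions name the token option `--ServerApp.token` instead of `--NotebookApp.token`. The function name and `DRY_RUN` switch are hypothetical.

```bash
# Sketch: start a Jupyter server with the options listed above (classic Notebook flags).
# Set DRY_RUN=1 to print the command instead of executing it.
start_jupyter_server() {
  # --ip=0.0.0.0:           do not restrict the IPs that can use the server
  # --no-browser:           do not open a browser
  # --NotebookApp.token='': disable access tokens
  # --notebook-dir:         use $HOME as the base directory
  ${DRY_RUN:+echo} jupyter notebook \
    --ip=0.0.0.0 \
    --no-browser \
    --NotebookApp.token='' \
    --notebook-dir="$HOME"
}
```

Disabling tokens is convenient on a personal machine, but it means anyone who can reach the port can use the server.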