In this recipe, you will learn how to create your own container-based environment that
you can use both at home and on the supercomputer. The overall workflow is outlined in this picture:
![](img/outline.png)
You will create a docker-based environment on your machine that can be used to serve
a jupyter server. Then you transfer the docker image to the supercomputer and make it available
within Jupyter-JSC. Of course, this also works without Jupyter, but it is much less fun.
In general, the usage of singularity containers on HPC systems is recommended. If you use containers from the
[NVIDIA container registry](https://catalog.ngc.nvidia.com/), you can be quite sure that your environment is as fast as it gets.
The workflow is compatible with Windows as well as Linux and macOS, however only if your host has an x86 architecture (newer Macs might pose a problem; we did not get a chance to try).
You will perform the following steps:
0. Install docker
1. Create a docker container that contains the environment
2. Run the docker container to serve a local jupyter server and to execute programs
3. Export the docker image, transfer it to the supercomputer and convert it to a singularity container
4. Run things in the singularity container.
Finally, we say a few words about recommended workflows and give a few details that we have omitted before.
# Install docker
This step depends on the OS you use. Please follow the instructions on the [docker web page](https://docs.docker.com/get-docker/). After that, you should be able to
start a simple docker container:
```bash
docker run -it ubuntu bash
```
You can leave the container by typing `exit`.
On Windows, it is highly recommended to install the [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/about). This will provide you with the WSL console, where you have a Linux-like
environment. Please check that you can execute the commands above. On macOS and Linux, you already have the required environment.
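If you want an additional quick check that both the docker client and the docker daemon are reachable from your shell, something along these lines should work (standard docker commands, independent of this recipe):
```bash
# Verify the docker client and daemon are reachable
docker --version
docker run --rm -it ubuntu bash -c "echo 'hello from inside the container'"
```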
For convenience, we recommend enabling the option to run docker without the sudo command. On Linux, you can follow [this procedure](https://docs.docker.com/engine/install/linux-postinstall/). Otherwise, you will need
to adjust the scripts in the `docker` subdirectory.
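On Linux, the linked post-installation procedure boils down to roughly the following commands (taken from the docker documentation; log out and back in, or run `newgrp docker`, for the group change to take effect):
```bash
# Create the docker group (it often already exists) and add your user to it
sudo groupadd docker
sudo usermod -aG docker $USER
# Verify that docker now works without sudo
docker run hello-world
```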
# Create a docker image and container
In this step, you will create a custom docker image and a docker container that contains the environment.
First, clone this repository and `cd` into it. Please pick a good path for that, since you might keep this repository for a long time. Here we assume it is `/path/to`.
```bash
cd /path/to
git clone https://gitlab.jsc.fz-juelich.de/AI_Recipe_Book/recipes/singularity_docker_jupyter
cd singularity_docker_jupyter
```
Everything related to the docker image is in the subdirectory `docker`. The rules to build the docker image are found in the [Dockerfile](docker/Dockerfile). If you look inside, you will
see that we start from a plain Ubuntu image, install Python with `apt-get`, and install Jupyter with `pip` (a rough sketch of such a Dockerfile is shown after the build commands below). Build the docker container with the following commands:
```bash
cd docker
./build.sh
```
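For orientation, a Dockerfile following this pattern might look roughly like the sketch below (the base image tag and package names are assumptions, not necessarily the exact contents of the file in the repository):
```dockerfile
# Rough sketch of a Dockerfile in the spirit of docker/Dockerfile
FROM ubuntu:22.04

# Install Python and pip with apt-get
RUN apt-get update && \
    apt-get install -y --no-install-recommends python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

# Install Jupyter with pip
RUN pip3 install --no-cache-dir jupyter

# The jupyter server inside the container listens on port 8888
EXPOSE 8888
```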
The build script will build the container and tag it with `singularity_docker_jupyter`. Once the container is built, you can run the jupyter server with the script `run_jupyter.sh`:
```bash
./run_jupyter.sh
```
The run script will start a docker container hosting a jupyter server that you can access by navigating to http://localhost:8889/. It will be restarted automatically when your system reboots or the container exits. In order to permanently
remove it, execute
```bash
docker rm -f singularity_docker_jupyter_cont
```
We have also created small scripts to run commands and an interactive shell in the container. Execute
```bash
./run_bash_in_container.sh
```
to run an interactive bash shell inside of the container. Exit it by typing `exit`. Executing commands is possible with
the script `run_command_in_container.sh`. If you execute
```bash
./run_command_in_container.sh python3 --version
```
it will invoke the Python interpreter that has been installed into the container image and print its version. You can, of course, execute any other command and pass arbitrary arguments instead.
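For reference, a wrapper like `run_command_in_container.sh` can be very small. Here is one possible sketch, assuming it simply forwards its arguments into the running container with `docker exec` (the script in the repository may be implemented differently):
```bash
#!/bin/bash
# Hypothetical sketch: run an arbitrary command inside the running container,
# passing all script arguments through unchanged.
docker exec -it singularity_docker_jupyter_cont "$@"
```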
A note on GPUs: if you want to be able to use GPUs inside of the container, you must use the [NVIDIA Container Runtime](https://developer.nvidia.com/nvidia-container-runtime) and
add the flag `--runtime=nvidia` to all calls to `docker run`.
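For example, a quick manual check that GPUs are visible inside the container could look like this (assuming the NVIDIA container runtime is installed on the host):
```bash
# With the NVIDIA runtime, nvidia-smi is injected into the container and
# should list the GPUs of the host.
docker run --rm --runtime=nvidia singularity_docker_jupyter nvidia-smi
```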
# Export the docker container
Before you start this step, ssh to the supercomputer and clone the repository there as well.
```bash
cd /path/to
git clone https://gitlab.jsc.fz-juelich.de/AI_Recipe_Book/recipes/singularity_docker_jupyter
cd singularity_docker_jupyter
```
In this step, you will export the docker image you have created and convert it into a singularity container. It requires the following steps:
**1.** Save the docker image as a tarball on your local machine:
```bash
docker save singularity_docker_jupyter -o singularity_docker_jupyter.tar
```
Note that a tarball `singularity_docker_jupyter.tar` has been created in your local directory.
**2.** Copy the image to one of the JSC machines. In this example, we place the file into the directory where we have
cloned the repo before.
```bash
scp singularity_docker_jupyter.tar surname1@jusuf.fz-juelich.de:/path/to/singularity_docker_jupyter
```
Note that this can take a while, depending on your connection.
**3.** ssh to the machine and convert the tarball into a singularity image.
```bash
cd /path/to/singularity_docker_jupyter
module load Singularity-Tools
singularity build singularity_docker_jupyter.sif docker-archive://singularity_docker_jupyter.tar
```
This will create a file `singularity_docker_jupyter.sif` in the current directory.
If your local machine is a Linux machine, you also have the option to create the singularity image `singularity_docker_jupyter.sif` directly on your local machine.
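In that case you can skip the tarball and build the image directly from your local docker daemon, roughly like this (assuming singularity is installed on your Linux machine):
```bash
# Build the singularity image directly from the local docker daemon,
# then copy it to the supercomputer.
singularity build singularity_docker_jupyter.sif \
    docker-daemon://singularity_docker_jupyter:latest
scp singularity_docker_jupyter.sif surname1@jusuf.fz-juelich.de:/path/to/singularity_docker_jupyter
```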
# Use the container with singularity on the supercomputer
The typical usage of a singularity container is `singularity run image.sif command --with-args`. We demonstrate this
in the example submission script `example_submission_script.sh`, which you can submit to Slurm. It is also possible to
execute it directly on the login node. Here is the output:
![](img/singularity1.png)
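A submission script of this kind typically follows a pattern like the sketch below (account, partition, and paths are placeholders, not the contents of the actual `example_submission_script.sh`):
```bash
#!/bin/bash
#SBATCH --job-name=singularity_test
#SBATCH --account=<your-compute-budget>
#SBATCH --partition=<partition>
#SBATCH --nodes=1
#SBATCH --time=00:10:00

module load Singularity-Tools

# Run a command with arguments inside the container image
srun singularity run /path/to/singularity_docker_jupyter/singularity_docker_jupyter.sif \
    python3 --version
```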
Furthermore, you can create a jupyter kernel for Jupyter-JSC. We have created a small script that will
do this for you. Execute
```bash
./create_kernel.sh
```
and it will create a file `~/.local/share/jupyter/kernels/singularity_docker_jupyter/kernel.json` that contains the description
of a kernel. After logging out of Jupyter-JSC and logging in again, you will be able to pick a kernel that
was made from your own singularity container:
![](img/jupyter_container.png)
If you look at the file `jupyter-jsc/kernel.sh`, you will see that it is only a thin wrapper that makes sure the kernel
is executed in the singularity container.
![](img/kernel.png)
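Such a wrapper might look roughly like the sketch below, assuming the kernel arguments from Jupyter are simply forwarded into the container and `ipykernel` is available there (the actual `jupyter-jsc/kernel.sh` may differ in details):
```bash
#!/bin/bash
# Hypothetical sketch of a kernel wrapper: start the Python kernel inside the
# singularity container and forward the connection-file arguments from Jupyter.
module load Singularity-Tools
singularity exec /path/to/singularity_docker_jupyter/singularity_docker_jupyter.sif \
    python3 -m ipykernel_launcher "$@"
```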
# Workflow
Your first shot at creating an environment will never be your last one. Creating an environment is an iterative task, and you will soon find another library or tool that is required in
your workflow. An option that speeds up your productivity is to use a virtual environment that is stored outside of the container, which makes it much easier to add libraries.
We have covered how to do this in another recipe. In general, we discourage the use of `docker commit`, even though it might feel like a very easy way to develop an environment step by step,
because you give up the reproducibility you achieve with Dockerfiles.
The script `run_jupyter.sh` does a few things that are untypical when using docker. Here are the most important points.
* We do not store any information in the container. Your home directory is mounted into the container with `-v $HOME:$HOME`, at the same path as on the
host computer. This ensures that no path inconsistencies occur.
* The `HOME` environment variable is exported into the container. This is done with the option `-e HOME`.
* The user id and group id are not set to `root`/0, as is typical for docker. Instead, with the option `--user $(id -u $USER):$(id -g $USER)` we make sure that the UID and GID inside the container are the same as those of the user who starts the container.
All files modified in the container will be accessed with the UID and GID of that user. If you are the only user, you will not even realize you are inside a container.
* The port 8888 of the container is mapped to the port 8889 of your local computer. The jupyter server started in the container by default serves on port 8888. To avoid conflicts with another potentially
running jupyter environment on your local machine, the container-based server serves on port 8889.
* The container is given the name `singularity_docker_jupyter_cont`.
We start the jupyter server with a few options (a sketch of the full `docker run` call is shown after this list):
* We don't restrict IPs that can use the server
* We don't open a browser
* We disable access tokens
* We use the `$HOME` directory as base directory for the jupyter server.
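Putting these points together, the core of `run_jupyter.sh` is a `docker run` call roughly like the following sketch (the restart policy and the exact jupyter flags are assumptions based on the points above; the actual script may differ in details):
```bash
#!/bin/bash
# Sketch of the docker run call behind run_jupyter.sh:
#   --restart always   restart the container after a reboot or after it exits
#   -v $HOME:$HOME     mount the home directory at the same path as on the host
#   -e HOME            export the HOME environment variable into the container
#   --user ...         keep the UID and GID of the calling user inside the container
#   -p 8889:8888       map container port 8888 to port 8889 on the local machine
docker run -d \
    --name singularity_docker_jupyter_cont \
    --restart always \
    -v $HOME:$HOME \
    -e HOME \
    --user $(id -u $USER):$(id -g $USER) \
    -p 8889:8888 \
    singularity_docker_jupyter \
    jupyter notebook --ip=0.0.0.0 --no-browser --NotebookApp.token='' --notebook-dir=$HOME
```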