This repository holds the documentation for the Helmholtz GPU Hackathon 2024 at CASUS Görlitz.
For additional info, please write to the #cluster-support channel on Slack.
## Sign-Up
Please use JuDoor to sign up for our training project, training2406: [https://judoor.fz-juelich.de/projects/join/training2406](https://judoor.fz-juelich.de/projects/join/training2406)
Make sure to accept the usage agreement for JURECA-DC and JUWELS Booster.
Please upload your SSH key to the system via JuDoor. The key needs to be restricted to accept connections only from specific sources, as specified through the `from` clause. Please have a look at the associated documentation ([SSH Access](https://apps.fz-juelich.de/jsc/hps/juwels/access.html) and [Key Upload](https://apps.fz-juelich.de/jsc/hps/juwels/access.html#key-upload-key-restriction)).
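As an illustration, a restricted public-key entry could look like the following; the network range and the key material are placeholders and need to be replaced with your own values (for example the address range of your home or institute network):

```
from="192.0.2.0/24,2001:db8::/32" ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAA... user@laptop
```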
## HPC Systems
We are primarily using JURECA-DC for the Hackathon, a system with 768 NVIDIA A100 GPUs. As an optional alternative, the JUWELS Booster system with its 3600 A100 GPUs and stronger node-to-node interconnect (4×200 Gbit/s, instead of 2×200 Gbit/s for JURECA-DC) can also be used; the focus should be on JURECA-DC, though.
For the system documentation, see the following websites:

* JURECA: [https://apps.fz-juelich.de/jsc/hps/jureca/](https://apps.fz-juelich.de/jsc/hps/jureca/)
* JUWELS: [https://apps.fz-juelich.de/jsc/hps/juwels/](https://apps.fz-juelich.de/jsc/hps/juwels/)
After successfully uploading your key through JuDoor, you should be able to access JURECA-DC via
```bash
ssh user1@jureca.fz-juelich.de
```
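Optionally, an entry in your local `~/.ssh/config` can shorten the command; the user name and key path below are placeholders:

```
Host jureca
    HostName jureca.fz-juelich.de
    User user1
    IdentityFile ~/.ssh/id_ed25519
```

With such an entry, `ssh jureca` suffices.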
An alternative way of accessing the systems is through _Jupyter JSC_, JSC's Jupyter service.
On the systems, different directories are accessible to you. To set environment variables according to a project, call the following snippet after logging in:
```bash
jutil env activate -p training2406-A training2406
```
This will, for example, make the directory `$PROJECT` available to use, which you can use to store data. Your `$HOME` will not be a good place for data storage, as it is severely limited! Use `$PROJECT` (or `$SCRATCH`, see documentation on [_Available File Systems_](https://apps.fz-juelich.de/jsc/hps/jureca/environment.html#available-file-systems)).
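For example, after activating the project environment, the locations can be inspected and used directly (the per-user subdirectory below `$PROJECT` is just one possible way to organize data):

```bash
echo $PROJECT              # shared project storage for data and results
echo $SCRATCH              # large scratch file system, subject to automatic cleanup
mkdir -p $PROJECT/$USER    # hypothetical layout: one subdirectory per team member
cd $PROJECT/$USER
```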
The JSC systems use a special flavor of Slurm as the workload manager (PSSlurm). Most of the vanilla Slurm commands are available, with some Jülich-specific additions. An overview of Slurm is available in the corresponding documentation, which also gives example job scripts and interactive commands: [https://apps.fz-juelich.de/jsc/hps/jureca/batchsystem.html](https://apps.fz-juelich.de/jsc/hps/jureca/batchsystem.html)
Please account your jobs to the `training2406` project, either by setting the corresponding environment variable with the `jutil` command shown above, or by manually adding `-A training2406` to your batch jobs.
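For example, the account can be passed on the command line or as a directive in the job script (the script name `job.sh` is a placeholder):

```bash
sbatch -A training2406 job.sh
# or, equivalently, as a directive inside the job script:
#   #SBATCH --account=training2406
```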
Different partitions are available (see the [documentation for limits](https://apps.fz-juelich.de/jsc/hps/jureca/batchsystem.html#jureca-dc-module-partitions)); an example job script is shown after the list:
* `dc-gpu`: All GPU-equipped nodes
* `dc-gpu-devel`: Some nodes available for development
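As an illustration, a minimal batch script for the `dc-gpu` partition could look like the following; the resource requests and the application name are placeholders to be adapted to your code:

```bash
#!/bin/bash
#SBATCH --account=training2406
#SBATCH --partition=dc-gpu
#SBATCH --nodes=1
#SBATCH --gres=gpu:4            # request the GPUs of the node (4× A100 per GPU node)
#SBATCH --time=00:30:00
#SBATCH --output=job-%j.out

srun ./my_gpu_app               # placeholder for your application
```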
For the days of the Hackathon, the following reservations will be in place to accelerate scheduling of jobs (see the usage example after the list):
* Day 1: `--reservation gpuhack24`
* Day 2: `--reservation gpuhack24-2024-04-23`
* Day 3: `--reservation gpuhack24-2024-04-24`
* Day 4: `--reservation gpuhack24-2024-04-25`
* Day 5: `--reservation gpuhack24-2024-04-26`
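To use a reservation, add the flag to your submission, for example (the job script name is a placeholder):

```bash
sbatch --reservation gpuhack24-2024-04-23 -A training2406 job.sh
```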
X-forwarding is sometimes a bit of a challenge; please consider using _Xpra_ in your browser through Jupyter JSC!