From 49181c1e16ae6e4ea81f88d04d51c5b0891527e0 Mon Sep 17 00:00:00 2001
From: Andreas Herten <a.herten@fz-juelich.de>
Date: Mon, 24 Mar 2025 17:23:53 +0100
Subject: [PATCH] Update for 2025

---
 README.md | 55 +++++++++++++++++--------------------------------
 1 file changed, 17 insertions(+), 38 deletions(-)

diff --git a/README.md b/README.md
index aa19226..ed901ac 100644
--- a/README.md
+++ b/README.md
@@ -1,47 +1,46 @@
-# Helmholtz GPU Hackathon 2024
+# Helmholtz GPU Hackathon 2025

-This repository holds the documentation for the Helmholtz GPU Hackathon 2024 at CASUS Görlitz.
+This repository holds the documentation for the Helmholtz GPU Hackathon 2025 at Forschungszentrum Jülich.

 For additional info, please write #cluster-support on Slack.

 ## Sign-Up

-Please use JuDoor to sign up for our training project, training2406: [https://judoor.fz-juelich.de/projects/join/training2406](https://judoor.fz-juelich.de/projects/join/training2406)
+Please use JuDoor to sign up for our training project, training2508: [https://judoor.fz-juelich.de/projects/join/training2508](https://judoor.fz-juelich.de/projects/join/training2508)

-Make sure to accept the usage agreement for JURECA-DC and JUWELS Booster.
+Make sure to accept the usage agreement for JEDI.

-Please upload your SSH key to the system via JuDoor. The key needs to be restricted to accept accesses only from a specific source, as specified through the `from` clause. Please have a look at the associated documentation ([SSH Access](https://apps.fz-juelich.de/jsc/hps/juwels/access.html) and [Key Upload](https://apps.fz-juelich.de/jsc/hps/juwels/access.html#key-upload-key-restriction)).
+Please upload your SSH key to the system via JuDoor. The key needs to be restricted to accept accesses only from a specific source, as specified through the `from` clause. Please have a look at the associated documentation ([SSH Access](https://apps.fz-juelich.de/jsc/hps/jedi/access.html) and [Key Upload](https://apps.fz-juelich.de/jsc/hps/jedi/access.html#key-upload-key-restriction)).

 ## HPC Systems

-We are using primarily JURECA-DC for the Hackathon, a system with 768 NVIDIA A100 GPUs.
+We are primarily using JEDI for the Hackathon, the JUPITER precursor system with 192 NVIDIA Hopper GPUs.

 For the system documentation, see the following websites:

-* [JURECA-DC](https://apps.fz-juelich.de/jsc/hps/jureca/configuration.html)
-* [JUWELS Booster](https://apps.fz-juelich.de/jsc/hps/juwels/booster-overview.html)
+* [JEDI](https://apps.fz-juelich.de/jsc/hps/jedi/configuration.html)

 ## Access

-After successfully uploading your key through JuDoor, you should be able to access JURECA-DC via
+After successfully uploading your key through JuDoor, you should be able to access JEDI via

 ```bash
-ssh user1@jureca.fz-juelich.de
+ssh user1@login.jedi.fz-juelich.de
 ```

-The hostname for JUWELS Booster is `juwels-booster.fz-juelich.de`.
+The hostname for JEDI is `login.jedi.fz-juelich.de`.

-An alternative way of access the systems is through _Jupyter JSC_, JSC's Jupyter-based web portal available at [https://jupyter-jsc.fz-juelich.de](https://jupyter-jsc.fz-juelich.de). Sessions should generally be launched on the login nodes. A great alternative to X is available through the portal called Xpra. It's great to run the Nsight tools!
+An alternative way of accessing the systems is through _Jupyter JSC_, JSC's Jupyter-based web portal, available at [https://jupyter.jsc.fz-juelich.de/workshops/gpuhack25](https://jupyter.jsc.fz-juelich.de/workshops/gpuhack25) (link to a pre-configured session). Sessions should generally be launched on the login nodes. The portal also offers Xpra, a convenient alternative to X forwarding that works well for running the Nsight tools!

 ## Environment

 On the systems, different directories are accessible to you. To set environment variables according to a project, call the following snippet after logging in:

 ```bash
-jutil env activate -p training2406 -A training2406
+jutil env activate -p training2508 -A training2508
 ```

-This will, for example, make the directory `$PROJECT` available to use, which you can use to store data. Your `$HOME` will not be a good place for data storage, as it is severely limited! Use `$PROJECT` (or `$SCRATCH`, see documentation on [_Available File Systems_](https://apps.fz-juelich.de/jsc/hps/jureca/environment.html#available-file-systems)).
+This will, for example, make the directory `$PROJECT` available, which you can use to store data. Your `$HOME` will not be a good place for data storage, as it is severely limited! Use `$PROJECT` (or `$SCRATCH`; see the documentation on [_Available File Systems_](https://apps.fz-juelich.de/jsc/hps/jedi/environment.html#available-file-systems)).

 Different software can be loaded to the environment via environment modules, via the `module` command. To see available compilers (the first level of a toolchain), type `module avail`.
 The most relevant modules are
@@ -51,7 +50,7 @@ The most relevant modules are

 ## Containers

-JSC supports containers thorugh Apptainer (previously: Singularity) on the HPC systems. The details are covered in a [dedicated article in the systems documetnation](https://apps.fz-juelich.de/jsc/hps/jureca/container-runtime.html). Access is subject to accepting a dedicated license agreement (because of special treatment regarding support) on JuDoor.
+JSC supports containers through Apptainer (previously: Singularity) on the HPC systems. The details are covered in a [dedicated article in the systems documentation](https://apps.fz-juelich.de/jsc/hps/jedi/container-runtime.html). Access is subject to accepting a dedicated license agreement (because of special treatment regarding support) on JuDoor.

 Once access is granted (check your `groups`), Docker containers can be imported and executed similarly to the following example:

@@ -62,32 +61,12 @@ $ srun -n 1 --pty apptainer exec --nv tf.sif python3 myscript.py

 ## Batch System

-The JSC systems use a special flavor of Slurm as the workload manager (PSSlurm). Most of the vanilla Slurm commands are available with some Jülich-specific additions. An overview of Slurm is available in the according documentation which also gives example job scripts and interactive commands: [https://apps.fz-juelich.de/jsc/hps/jureca/batchsystem.html](https://apps.fz-juelich.de/jsc/hps/jureca/batchsystem.html)
+The JSC systems use a special flavor of Slurm as the workload manager (PSSlurm). Most of the vanilla Slurm commands are available with some Jülich-specific additions. An overview of Slurm is available in the corresponding documentation, which also gives example job scripts and interactive commands: [https://apps.fz-juelich.de/jsc/hps/jedi/batchsystem.html](https://apps.fz-juelich.de/jsc/hps/jedi/batchsystem.html)

-Please account your jobs to the `training2406` project, either by setting the according environment variable with the above `jutil` command (as above), or by manually adding `-A training2406` to your batch jobs.
+Please account your jobs to the `training2508` project, either by setting the corresponding environment variable with the `jutil` command shown above, or by manually adding `-A training2508` to your batch jobs.

-Different partitions are available (see [documentation for limits](https://apps.fz-juelich.de/jsc/hps/jureca/batchsystem.html#jureca-dc-module-partitions)):
-
-* `dc-gpu`: All GPU-equipped nodes
-* `dc-gpu-devel`: Some nodes available for development
+Only one partition is available on JEDI, called `all` (see [documentation for limits](https://apps.fz-juelich.de/jsc/hps/jedi/batchsystem.html)).

 For the days of the Hackathon, reservations will be in place to accelerate scheduling of jobs.

-* Day 1: `--reservation gpuhack24`
-* Day 2: `--reservation gpuhack24-2024-04-23`
-* Day 3: `--reservation gpuhack24-2024-04-24`
-* Day 4: `--reservation gpuhack24-2024-04-25`
-* Day 5: `--reservation gpuhack24-2024-04-26`
-
 X-forwarding sometimes is a bit of a challenge, please consider using _Xpra_ in your Browser through Jupyter JSC!
-
-## Etc
-
-
-### Previous Documentation
-
-More (although slightly outdated) documentation is available from the 2021 Hackathon [in the according JSC Gitlab Hackathon docu branch](https://gitlab.version.fz-juelich.de/gpu-hackathon/doc/-/tree/2021).
-
-### PDFs
-
-See the directory `./pdf/` for PDF version of the documentation.
--
GitLab
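
For illustration, a minimal JEDI batch script combining the pieces from the patch above might look like the sketch below. It assumes the `training2508` account and the `all` partition named in the README; the GPU request, time limit, reservation name, and executable are placeholders, since the 2025 reservation names and per-node GPU counts are not stated here.

```bash
#!/bin/bash
#SBATCH --account=training2508                  # project account, as set up via jutil/JuDoor above
#SBATCH --partition=all                         # the single JEDI partition
#SBATCH --nodes=1
#SBATCH --gres=gpu:4                            # assumption: request the GPUs of one node
#SBATCH --time=00:30:00                         # placeholder time limit
##SBATCH --reservation=<hackathon-reservation>  # placeholder: use the reservation announced for each day

srun ./my_gpu_app                               # placeholder executable
```

Submit it with `sbatch`; interactive use via `salloc`/`srun` is covered in the batch system documentation linked in the patch.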