Commit 6c95a2d4 authored by kasravi1

add group name into args

parent 3063c793
@@ -87,9 +87,10 @@ If you would like to install PyTorch, you can find the PyTorch NVIDIA container
 you can proceed to build the base NVIDIA image using the following command:
 ```bash
-bash scripts/build_base_nvidia_image.sh -A {account} -p {partition} -V {Nvidia_container_version} -N {base_image_name}
+bash scripts/build_base_nvidia_image.sh -A {account} -G {group_name} -P {partition} -V {Nvidia_container_version} -N {base_image_name}
 ```
 - `account`: The account name for using supercomputers.
+- `group_name`: The group name for using supercomputers (you can find the name of the Group on JuDoor). Sometimes account and group name are the same.
 - `partition`: The partition name on the supercomputer. Please use a partition that has internet access (e.g., `devel`, `develgpus`, `develbooster`, etc.).
 - `Nvidia_container_version`: The Nvidia container (e.g., `nvcr.io/nvidia/pytorch:23.05-py3`).
 - `base_image_name`: The name of the base image with a `.sif` extension (e.g., `pytorch-23-05-py3.sif`).
@@ -98,14 +99,14 @@ bash scripts/build_base_nvidia_image.sh -A {account} -p {partition} -V {Nvidia_c
 Here is the example how to build the base image:
 ```bash
-bash scripts/build_base_nvidia_image.sh -A atmlaml -P devel -V nvcr.io/nvidia/pytorch:23.05-py3 -N pytorch-23-05-py3.sif
+bash scripts/build_base_nvidia_image.sh -A atmlaml -G atmlaml -P devel -V nvcr.io/nvidia/pytorch:23.05-py3 -N pytorch-23-05-py3.sif
 ```
 After running scripts/build_base_nvidia_image.sh, a SLURM job will be launched, and a report will be generated in the `slurm_report` path. Please make sure to review this report to confirm that your image was created without any errors.
 The above command will create an singularity image in the following path:
-`/p/project1/{account}/{$USER}/apptainer/pytorch-23-05-py3.sif`
+`/p/project1/{account}/$USER/apptainer/pytorch-23-05-py3.sif`
@@ -116,9 +117,10 @@ Note: Since you are using Nvidia container, PyTorch and other dependencies is al
 If there is a `torch` or `torchvision` inside of `requirements.txt`, please remove it in order to avoid conflict between the Nvidia PyTorch inside of the conatiner.
 ```bash
-bash scripts/build_customized_nvidia_image.sh -A {account} -P {partition} -S {absolute_path_base_image} -R {absolute_path_requirements.txt} -F {final_image_name}
+bash scripts/build_customized_nvidia_image.sh -A {account} -G {group_name} -P {partition} -S {absolute_path_base_image} -R {absolute_path_requirements.txt} -F {final_image_name}
 ```
 - `account`: The account name for using supercomputers.
+- `group_name`: The group name for using supercomputers (you can find the name of the Group on JuDoor). Sometimes account and group name are the same.
 - `partition`: The partition name on the supercomputer. Please use a partition that has internet access (e.g., `devel`, `develgpus`, `develbooster`, etc.).
 - `absolute_path_base_image`: The absolute path of the base image that was created before.
 - `absolute_path_requirements.txt`: The absolute path of the requirements.txt that all containes all pip dependencies.
@@ -129,7 +131,7 @@ Here is the example how to build the final image:
 ```bash
-bash scripts/build_customized_nvidia_image.sh -A atmlaml -P devel -S /p/project1/atmlaml/{$USER}/apptainer/pytorch-23-05-py3.sif -R {path_to_requirements}/requirements.txt -F pytorch-23-05-py3-final.sif
+bash scripts/build_customized_nvidia_image.sh -A atmlaml -G atmlaml -P devel -S /p/project1/atmlaml/{$USER}/apptainer/pytorch-23-05-py3.sif -R {path_to_requirements}/requirements.txt -F pytorch-23-05-py3-final.sif
 ```
 Check slurm report in `slurm_report` to make sure the singularity image has been created sucessfully.
@@ -137,7 +139,7 @@ Athe end of this report, if you see `Hello world. This is my Python version` can
 You can find the final singularity image in the following path:
-`/p/project1/{account}/{$USER}/apptainer/pytorch-23-05-py3-final.sif`
+`/p/project1/{account}/$USER/apptainer/pytorch-23-05-py3-final.sif`
 Congratulations!!! You have successfully created a singularity image. To use this image for your application, you can find an example submission script in `example_submission_script.sh`.
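Reviewer note: the README asks the reader to inspect the SLURM report by hand. A minimal sketch of automating that check follows. It assumes that the line `Hello world. This is my Python version` (quoted in the truncated hunk header above) marks a successful build and that reports are plain files directly under `slurm_report`; neither is confirmed by this diff, and `report_has_marker` is a hypothetical helper name.

```bash
# Hypothetical helper: succeeds if the given report file contains the marker line.
# Assumption: the marker text indicates a successful image build.
report_has_marker() {
    grep -q -F 'Hello world. This is my Python version' "$1"
}

# Scan whatever files exist under slurm_report (file naming is an assumption).
for report in slurm_report/*; do
    [ -f "$report" ] || continue   # skip if the glob matched nothing
    if report_has_marker "$report"; then
        echo "OK:      $report"
    else
        echo "CHECK:   $report"
    fi
done
```

A report flagged with `CHECK` still needs to be read manually, since the sketch only looks for the one marker line.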
......
@@ -5,25 +5,27 @@ ACCOUNT_NAME=""
 PARTITION_NAME=""
 NVIDIA_CONTAINER_VERSION=""
 SINGULARITY_IMG_NAME=""
+GROUP_NAME=""
 # Parse options
-while getopts "A:P:V:N:" opt; do
+while getopts "A:G:P:V:N:" opt; do
     case $opt in
         A) ACCOUNT_NAME=$OPTARG ;;
+        G) GROUP_NAME=$OPTARG ;;
         P) PARTITION_NAME=$OPTARG ;;
         V) NVIDIA_CONTAINER_VERSION=$OPTARG ;;
         N) SINGULARITY_IMG_NAME=$OPTARG ;;
         *)
-            echo "Usage: $0 -A <account_name> -P <partition_name> -V <nvidia_container_version> -N <singularity_base_img_name>"
+            echo "Usage: $0 -A <account_name> -G <group_name> -P <partition_name> -V <nvidia_container_version> -N <singularity_base_img_name>"
             exit 1
             ;;
     esac
 done
 # Check if required arguments are provided
-if [ -z "$ACCOUNT_NAME" ] || [ -z "$PARTITION_NAME" ] || [ -z "$NVIDIA_CONTAINER_VERSION" ] || [ -z "$SINGULARITY_IMG_NAME" ]; then
+if [ -z "$ACCOUNT_NAME" ] || [ -z "$GROUP_NAME" ] || [ -z "$PARTITION_NAME" ] || [ -z "$NVIDIA_CONTAINER_VERSION" ] || [ -z "$SINGULARITY_IMG_NAME" ]; then
-    echo "Usage: $0 -A <account_name> -P <partition_name> -V <nvidia_container_version> -N <singularity_base_img_name>"
+    echo "Usage: $0 -A <account_name> -G <group_name> -P <partition_name> -V <nvidia_container_version> -N <singularity_base_img_name>"
     exit 1
 fi
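Reviewer note: to exercise the new `-G` handling without launching a SLURM job, the parsing logic from this hunk can be sketched as a standalone function. `parse_build_args` is a hypothetical wrapper introduced here for illustration; the option string and variable names are taken from the diff, but the script itself parses its arguments at top level and exits rather than returning.

```bash
# Hypothetical wrapper around the option parsing shown in the hunk above.
# Returns 1 (instead of exiting) on an unknown option or a missing required value.
parse_build_args() {
    OPTIND=1   # reset so the function can be called more than once
    ACCOUNT_NAME=""
    GROUP_NAME=""
    PARTITION_NAME=""
    NVIDIA_CONTAINER_VERSION=""
    SINGULARITY_IMG_NAME=""
    while getopts "A:G:P:V:N:" opt; do
        case $opt in
            A) ACCOUNT_NAME=$OPTARG ;;
            G) GROUP_NAME=$OPTARG ;;
            P) PARTITION_NAME=$OPTARG ;;
            V) NVIDIA_CONTAINER_VERSION=$OPTARG ;;
            N) SINGULARITY_IMG_NAME=$OPTARG ;;
            *) return 1 ;;
        esac
    done
    # All five options are required, including the newly added -G.
    if [ -z "$ACCOUNT_NAME" ] || [ -z "$GROUP_NAME" ] || [ -z "$PARTITION_NAME" ] \
        || [ -z "$NVIDIA_CONTAINER_VERSION" ] || [ -z "$SINGULARITY_IMG_NAME" ]; then
        return 1
    fi
}

# Usage with the values from the README example.
if parse_build_args -A atmlaml -G atmlaml -P devel \
        -V nvcr.io/nvidia/pytorch:23.05-py3 -N pytorch-23-05-py3.sif; then
    echo "account=$ACCOUNT_NAME group=$GROUP_NAME partition=$PARTITION_NAME"
fi
```

Because `-G` is now part of the required-argument check, invocations written against the previous interface (without `-G`) fall through to the usage message.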
@@ -47,7 +49,7 @@ module load Stages
 module try-load GCC Apptainer-Tools
 # Define the directory path
-DIR="/p/project1/${ACCOUNT_NAME}/${USER}/apptainer"
+DIR="/p/project1/${GROUP_NAME}/${USER}/apptainer"
 # Ensure directory exists
 if [ ! -d "\$DIR" ]; then
@@ -59,7 +61,7 @@ else
 fi
 # Create temporary directory for Apptainer build
-TMP=\$(mktemp -p /p/scratch/${ACCOUNT_NAME}/apptainer -d) || { echo "Failed to create temporary directory"; exit 1; }
+TMP=\$(mktemp -p /p/scratch/${GROUP_NAME}/apptainer -d) || { echo "Failed to create temporary directory"; exit 1; }
 export APPTAINER_TMPDIR=\$TMP
 # Create the Singularity definition file in TMP
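Reviewer note: the two path changes above move both the image directory and the scratch build directory from the account name to the group name. The sketch below shows the resulting locations for a given group and user. The helper names `image_dir` and `scratch_dir` are hypothetical; only the path layout is copied from the diff.

```bash
# Hypothetical helpers that reproduce the path layout from the changed lines.
# $1 = group name, $2 = user name
image_dir() {
    printf '/p/project1/%s/%s/apptainer\n' "$1" "$2"
}

# $1 = group name
scratch_dir() {
    printf '/p/scratch/%s/apptainer\n' "$1"
}

# Example with the group used in the README.
image_dir atmlaml "${USER:-someuser}"
scratch_dir atmlaml
```

One thing that may be worth a follow-up: the README hunks in this commit still describe the output path as `/p/project1/{account}/$USER/apptainer/...`, whereas the script now builds it from the group name. When account and group differ, the image would presumably land under the group directory, assuming the rest of the script uses `DIR` unchanged.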
......