Commit 6c95a2d4 authored by kasravi1

add group name into args

parent 3063c793
......@@ -87,9 +87,10 @@ If you would like to install PyTorch, you can find the PyTorch NVIDIA container
you can proceed to build the base NVIDIA image using the following command:
```bash
-bash scripts/build_base_nvidia_image.sh -A {account} -p {partition} -V {Nvidia_container_version} -N {base_image_name}
+bash scripts/build_base_nvidia_image.sh -A {account} -G {group_name} -P {partition} -V {Nvidia_container_version} -N {base_image_name}
```
- `account`: The account name for using supercomputers.
+- `group_name`: The group name for using supercomputers (you can find the name of the Group on JuDoor). Sometimes account and group name are the same.
- `partition`: The partition name on the supercomputer. Please use a partition that has internet access (e.g., `devel`, `develgpus`, `develbooster`, etc.).
- `Nvidia_container_version`: The Nvidia container (e.g., `nvcr.io/nvidia/pytorch:23.05-py3`).
- `base_image_name`: The name of the base image with a `.sif` extension (e.g., `pytorch-23-05-py3.sif`).
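Before passing `-G`, it may help to confirm that the group name is one your login belongs to. The following is a minimal sketch and not part of the repository: `in_group` is a hypothetical helper, and it assumes that the group shown on JuDoor also appears as a Unix group on the system, which this README does not state.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository scripts.
# Assumption: the JuDoor group appears among the user's Unix groups.
in_group() {
    local wanted="$1" g
    # id -Gn prints the names of all groups the current user belongs to
    for g in $(id -Gn); do
        [ "$g" = "$wanted" ] && return 0
    done
    return 1
}

# atmlaml is the group name used in this README's examples
if in_group atmlaml; then
    echo "atmlaml is one of your groups"
else
    echo "atmlaml not found; check the group name on JuDoor"
fi
```

If the check fails even though JuDoor lists the group, the assumption above probably does not hold on your system, and the JuDoor entry should be trusted instead.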
......@@ -98,14 +99,14 @@ bash scripts/build_base_nvidia_image.sh -A {account} -p {partition} -V {Nvidia_c
Here is an example of how to build the base image:
```bash
-bash scripts/build_base_nvidia_image.sh -A atmlaml -P devel -V nvcr.io/nvidia/pytorch:23.05-py3 -N pytorch-23-05-py3.sif
+bash scripts/build_base_nvidia_image.sh -A atmlaml -G atmlaml -P devel -V nvcr.io/nvidia/pytorch:23.05-py3 -N pytorch-23-05-py3.sif
```
After running `scripts/build_base_nvidia_image.sh`, a SLURM job will be launched, and a report will be generated in the `slurm_report` path. Please review this report to confirm that your image was created without any errors.
The above command will create a Singularity image in the following path:
-`/p/project1/{account}/{$USER}/apptainer/pytorch-23-05-py3.sif`
+`/p/project1/{account}/$USER/apptainer/pytorch-23-05-py3.sif`
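To confirm that the image landed where expected, the path can be composed and tested from the shell. This is a sketch only: `image_path` is a hypothetical helper, and it takes the group name as its first argument because the script hunk further down in this commit builds the directory from `GROUP_NAME`. When account and group name are the same, as in the README example, both spellings give the same path.

```bash
#!/usr/bin/env bash
# Hypothetical helper: compose the path where the build is expected to place the image.
# Assumption: layout /p/project1/<group>/<user>/apptainer/<image>, mirroring the DIR
# variable in scripts/build_base_nvidia_image.sh.
image_path() {
    local group_name="$1" image_name="$2"
    printf '/p/project1/%s/%s/apptainer/%s\n' "$group_name" "$USER" "$image_name"
}

IMG="$(image_path atmlaml pytorch-23-05-py3.sif)"
if [ -f "$IMG" ]; then
    echo "Image found: $IMG"
else
    echo "Image not found yet: $IMG"
fi
```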
......@@ -116,9 +117,10 @@ Note: Since you are using Nvidia container, PyTorch and other dependencies is al
If `torch` or `torchvision` is listed in `requirements.txt`, please remove it to avoid conflicts with the Nvidia PyTorch build inside the container.
```bash
-bash scripts/build_customized_nvidia_image.sh -A {account} -P {partition} -S {absolute_path_base_image} -R {absolute_path_requirements.txt} -F {final_image_name}
+bash scripts/build_customized_nvidia_image.sh -A {account} -G {group_name} -P {partition} -S {absolute_path_base_image} -R {absolute_path_requirements.txt} -F {final_image_name}
```
- `account`: The account name for using supercomputers.
+- `group_name`: The group name for using supercomputers (you can find the name of the Group on JuDoor). Sometimes account and group name are the same.
- `partition`: The partition name on the supercomputer. Please use a partition that has internet access (e.g., `devel`, `develgpus`, `develbooster`, etc.).
- `absolute_path_base_image`: The absolute path of the base image that was created before.
- `absolute_path_requirements.txt`: The absolute path of the `requirements.txt` file that contains all pip dependencies.
......@@ -129,7 +131,7 @@ Here is the example how to build the final image:
```bash
-bash scripts/build_customized_nvidia_image.sh -A atmlaml -P devel -S /p/project1/atmlaml/{$USER}/apptainer/pytorch-23-05-py3.sif -R {path_to_requirements}/requirements.txt -F pytorch-23-05-py3-final.sif
+bash scripts/build_customized_nvidia_image.sh -A atmlaml -G atmlaml -P devel -S /p/project1/atmlaml/{$USER}/apptainer/pytorch-23-05-py3.sif -R {path_to_requirements}/requirements.txt -F pytorch-23-05-py3-final.sif
```
Check the SLURM report in `slurm_report` to make sure the Singularity image has been created successfully.
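Reviewing the report can be partly automated. The sketch below is not part of the repository and rests on several assumptions: that the reports are plain-text files inside the given directory, that the line `Hello world. This is my Python version` signals a successful build, and that failures contain the words "error" or "fatal". The helper name `check_reports` is hypothetical, and a clean result is no substitute for reading the report.

```bash
#!/usr/bin/env bash
# Hypothetical helper: scan report files for error lines and for the success marker.
# Assumptions: reports are plain text; "error"/"fatal" indicate a failed build.
check_reports() {
    local dir="$1"
    local marker="Hello world. This is my Python version"
    if [ ! -d "$dir" ]; then
        echo "No such directory: $dir"
        return 3
    fi
    # Case-insensitive search; may also match harmless lines that mention "error"
    if grep -rqiE 'error|fatal' "$dir"; then
        echo "Possible errors found in $dir"
        return 1
    fi
    # Fixed-string search for the success marker
    if grep -rqF "$marker" "$dir"; then
        echo "Success marker found in $dir"
        return 0
    fi
    echo "No success marker found in $dir"
    return 2
}

check_reports slurm_report || echo "Please review the report by hand."
```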
......@@ -137,7 +139,7 @@ Athe end of this report, if you see `Hello world. This is my Python version` can
You can find the final singularity image in the following path:
-`/p/project1/{account}/{$USER}/apptainer/pytorch-23-05-py3-final.sif`
+`/p/project1/{account}/$USER/apptainer/pytorch-23-05-py3-final.sif`
Congratulations!!! You have successfully created a singularity image. To use this image for your application, you can find an example submission script in `example_submission_script.sh`.
......
......@@ -5,25 +5,27 @@ ACCOUNT_NAME=""
PARTITION_NAME=""
NVIDIA_CONTAINER_VERSION=""
SINGULARITY_IMG_NAME=""
+GROUP_NAME=""
# Parse options
-while getopts "A:P:V:N:" opt; do
+while getopts "A:G:P:V:N:" opt; do
case $opt in
A) ACCOUNT_NAME=$OPTARG ;;
+G) GROUP_NAME=$OPTARG ;;
P) PARTITION_NAME=$OPTARG ;;
V) NVIDIA_CONTAINER_VERSION=$OPTARG ;;
N) SINGULARITY_IMG_NAME=$OPTARG ;;
*)
-echo "Usage: $0 -A <account_name> -P <partition_name> -V <nvidia_container_version> -N <singularity_base_img_name>"
+echo "Usage: $0 -A <account_name> -G <group_name> -P <partition_name> -V <nvidia_container_version> -N <singularity_base_img_name>"
exit 1
;;
esac
done
# Check if required arguments are provided
-if [ -z "$ACCOUNT_NAME" ] || [ -z "$PARTITION_NAME" ] || [ -z "$NVIDIA_CONTAINER_VERSION" ] || [ -z "$SINGULARITY_IMG_NAME" ]; then
-echo "Usage: $0 -A <account_name> -P <partition_name> -V <nvidia_container_version> -N <singularity_base_img_name>"
+if [ -z "$ACCOUNT_NAME" ] || [ -z "$GROUP_NAME" ] || [ -z "$PARTITION_NAME" ] || [ -z "$NVIDIA_CONTAINER_VERSION" ] || [ -z "$SINGULARITY_IMG_NAME" ]; then
+echo "Usage: $0 -A <account_name> -G <group_name> -P <partition_name> -V <nvidia_container_version> -N <singularity_base_img_name>"
exit 1
fi
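The effect of making `-G` a required option can be tried in isolation. The sketch below wraps the same `getopts` string and the same emptiness check in a function. The function name `parse_args` and the use of return codes instead of `exit` are illustrative choices, not how the script itself is structured.

```bash
#!/usr/bin/env bash
# Sketch of the option handling above, wrapped in a function so it can be called repeatedly.
# parse_args is a hypothetical name; the real script parses at top level and exits on error.
parse_args() {
    local opt OPTIND=1   # reset getopts state for each call
    ACCOUNT_NAME="" GROUP_NAME="" PARTITION_NAME=""
    NVIDIA_CONTAINER_VERSION="" SINGULARITY_IMG_NAME=""
    while getopts "A:G:P:V:N:" opt; do
        case $opt in
            A) ACCOUNT_NAME=$OPTARG ;;
            G) GROUP_NAME=$OPTARG ;;
            P) PARTITION_NAME=$OPTARG ;;
            V) NVIDIA_CONTAINER_VERSION=$OPTARG ;;
            N) SINGULARITY_IMG_NAME=$OPTARG ;;
            *) return 1 ;;
        esac
    done
    # All five values are required, including the new group name
    [ -n "$ACCOUNT_NAME" ] && [ -n "$GROUP_NAME" ] && [ -n "$PARTITION_NAME" ] &&
        [ -n "$NVIDIA_CONTAINER_VERSION" ] && [ -n "$SINGULARITY_IMG_NAME" ]
}

if parse_args -A atmlaml -G atmlaml -P devel -V nvcr.io/nvidia/pytorch:23.05-py3 -N pytorch-23-05-py3.sif; then
    echo "account=$ACCOUNT_NAME group=$GROUP_NAME partition=$PARTITION_NAME"
fi
```

Calling `parse_args` with the pre-commit argument list (no `-G`) returns a non-zero status, which corresponds to the usage message and `exit 1` in the script.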
......@@ -47,7 +49,7 @@ module load Stages
module try-load GCC Apptainer-Tools
# Define the directory path
-DIR="/p/project1/${ACCOUNT_NAME}/${USER}/apptainer"
+DIR="/p/project1/${GROUP_NAME}/${USER}/apptainer"
# Ensure directory exists
if [ ! -d "\$DIR" ]; then
......@@ -59,7 +61,7 @@ else
fi
# Create temporary directory for Apptainer build
-TMP=\$(mktemp -p /p/scratch/${ACCOUNT_NAME}/apptainer -d) || { echo "Failed to create temporary directory"; exit 1; }
+TMP=\$(mktemp -p /p/scratch/${GROUP_NAME}/apptainer -d) || { echo "Failed to create temporary directory"; exit 1; }
export APPTAINER_TMPDIR=\$TMP
# Create the Singularity definition file in TMP
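`mktemp -p <dir> -d` fails when `<dir>` does not exist, so the temporary-directory step above depends on `/p/scratch/${GROUP_NAME}/apptainer` already being present. The sketch below shows the same pattern with the parent directory created first. The helper name `make_build_tmp`, the added `mkdir -p` step, and the demo location under `/tmp` are assumptions for illustration, not part of the script.

```bash
#!/usr/bin/env bash
# Hypothetical helper: create a private temporary directory below a base directory.
# Assumption: creating the base directory on demand is acceptable on your system.
make_build_tmp() {
    local base="$1"
    mkdir -p "$base" || return 1   # mktemp -p needs an existing parent directory
    mktemp -p "$base" -d           # prints the path of the new directory
}

# Demo location; the script itself uses /p/scratch/${GROUP_NAME}/apptainer
BASE="${TMPDIR:-/tmp}/apptainer-demo"
TMP="$(make_build_tmp "$BASE")" || { echo "Failed to create temporary directory"; exit 1; }
export APPTAINER_TMPDIR="$TMP"
echo "APPTAINER_TMPDIR=$APPTAINER_TMPDIR"
```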
......