Commit 08e7d8e2 authored by Andreas Herten

Initial commit
# Advanced CPU Mask for GPU/non-GPU Processes
Task: Utilize most cores of JUWELS Booster by launching 4 GPU processes from CPU cores with GPU affinity and launching a CPU-only process from the remainder of the cores. There should be 4 ranks per node; in each rank, 1 GPU process is launched and the remaining cores are given to the CPU-only process.
## Usage
Launch `split_mask.sh` with 2 arguments. The first argument is the process which should not run on the GPU; the second argument is the process to be run on the GPU. The according CPU masks are set, and `CUDA_VISIBLE_DEVICES` is set as well. If only 1 argument is provided, it is used for both cases. No arguments or more than 2 arguments result in the masks being printed (for debugging).
```bash
srun -n 2 --cpu-bind=mask_cpu:0xfff000000000fff,0xfff000000000fff000 \
bash split_mask.sh ./app1 ./app2
```
(See the Sample Output below for an invocation covering all four NUMA-domain pairs.)
The script makes implicit assumptions about the AMD EPYC CPUs in JUWELS Booster. Handle with care on other systems.
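To see which cores such a hex mask actually selects, the mask can be decoded bit by bit. A minimal sketch (the mask value is taken from the `srun` example above; `mask_to_cores` is a hypothetical helper name, and the core-numbering interpretation assumes JUWELS Booster's layout):

```python
# Decode an srun-style CPU mask into the core IDs it covers.
# On JUWELS Booster, IDs 0-47 are physical cores and 48-95 their SMT siblings.
def mask_to_cores(mask_hex):
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if mask >> bit & 1]

# First mask from the srun example: physical cores of NUMA domains 0 and 1,
# plus their SMT siblings
print(mask_to_cores("0xfff000000000fff"))  # -> cores 0-11 and 48-59
```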
## Helpers
In addition, the following helpers are provided:
* `calc_pinning.py`: Takes CPU hex masks for individual NUMA domains and combines them pairwise into per-rank masks. The individual and combined masks are printed.
* `process_info.sh`: Simple script to print some info relating to the current affinity: MPI rank, GPU ID, CPU mask, and the CPU core last run on
* `get_close_gpu.sh`: Helper script needed by `split_mask.sh` to get a close GPU to a NUMA domain
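The pairwise combination performed by `calc_pinning.py` boils down to OR-ing the two domains' masks (equivalently, adding them, since they are disjoint). A minimal sketch using the first two masks from that script (`combine` is a hypothetical helper name):

```python
# Combine the hex masks of two adjacent NUMA domains into one per-rank mask.
# The per-domain masks are disjoint, so bitwise OR equals addition here.
def combine(mask_a, mask_b):
    return hex(int(mask_a, 16) | int(mask_b, 16))

# Masks of NUMA domains 0 and 1 (values from calc_pinning.py)
print(combine("0x3f00000000003f", "0xfc0000000000fc0"))
# -> 0xfff000000000fff, the first mask in the srun example above
```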
## Sample Output
```bash
❯ srun -n 9 --cpu-bind=verbose,mask_cpu:0xfff000000000fff,0xfff000000000fff000,0xfff000000000fff000000,0xfff000000000fff000000000 bash split_mask.sh "bash process_info.sh" |& sort
cpu_bind=MASK - jwb0001, task 0 0 [2983]: mask 0xfff000000000fff set
cpu_bind=MASK - jwb0001, task 1 1 [2984]: mask 0xfff000000000fff000 set
cpu_bind=MASK - jwb0001, task 2 2 [2987]: mask 0xfff000000000fff000000 set
cpu_bind=MASK - jwb0001, task 3 3 [2991]: mask 0xfff000000000fff000000000 set
cpu_bind=MASK - jwb0001, task 4 4 [2994]: mask 0xfff000000000fff set
cpu_bind=MASK - jwb0001, task 5 5 [2997]: mask 0xfff000000000fff000 set
cpu_bind=MASK - jwb0001, task 6 6 [3000]: mask 0xfff000000000fff000000 set
cpu_bind=MASK - jwb0001, task 7 7 [3002]: mask 0xfff000000000fff000000000 set
cpu_bind=MASK - jwb0001, task 8 8 [3005]: mask 0xfff000000000fff set
MPI Rank: 0;CUDA_VIS_DEV: 1;pid 3146's current affinity list: 6;Last CPU core: 6
MPI Rank: 0;CUDA_VIS_DEV: ;pid 3112's current affinity list: 0-5,7-11,48-55,57-59;Last CPU core: 2
MPI Rank: 1;CUDA_VIS_DEV: 0;pid 3147's current affinity list: 18;Last CPU core: 18
MPI Rank: 1;CUDA_VIS_DEV: ;pid 3120's current affinity list: 12-17,19-23,60-71;Last CPU core: 14
MPI Rank: 2;CUDA_VIS_DEV: 3;pid 3151's current affinity list: 30;Last CPU core: 30
MPI Rank: 2;CUDA_VIS_DEV: ;pid 3122's current affinity list: 24-29,31-35,72-83;Last CPU core: 28
MPI Rank: 3;CUDA_VIS_DEV: 2;pid 3158's current affinity list: 42;Last CPU core: 42
MPI Rank: 3;CUDA_VIS_DEV: ;pid 3125's current affinity list: 36-41,43-47,84-95;Last CPU core: 88
MPI Rank: 4;CUDA_VIS_DEV: 1;pid 3150's current affinity list: 6;Last CPU core: 6
MPI Rank: 4;CUDA_VIS_DEV: ;pid 3123's current affinity list: 0-5,7-11,48-55,57-59;Last CPU core: 3
MPI Rank: 5;CUDA_VIS_DEV: 0;pid 3148's current affinity list: 18;Last CPU core: 18
MPI Rank: 5;CUDA_VIS_DEV: ;pid 3121's current affinity list: 12-17,19-23,60-71;Last CPU core: 65
MPI Rank: 6;CUDA_VIS_DEV: 3;pid 3154's current affinity list: 30;Last CPU core: 30
MPI Rank: 6;CUDA_VIS_DEV: ;pid 3124's current affinity list: 24-29,31-35,72-83;Last CPU core: 72
MPI Rank: 7;CUDA_VIS_DEV: 2;pid 3155's current affinity list: 42;Last CPU core: 42
MPI Rank: 7;CUDA_VIS_DEV: ;pid 3126's current affinity list: 36-41,43-47,84-95;Last CPU core: 94
MPI Rank: 8;CUDA_VIS_DEV: 1;pid 3152's current affinity list: 6;Last CPU core: 6
MPI Rank: 8;CUDA_VIS_DEV: ;pid 3127's current affinity list: 0-5,7-11,48-55,57-59;Last CPU core: 50
```
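The per-rank core split visible above can be reproduced in a few lines. A sketch assuming 6 physical cores per NUMA domain, as hard-coded in `split_mask.sh` (`split_cores` is a hypothetical helper name):

```python
# Reproduce split_mask.sh's core split for one rank: the first core of the
# odd NUMA domain drives the GPU, the remainder go to the CPU-only process.
# Assumes 6 physical cores per NUMA domain (JUWELS Booster).
N_CORES_PER_DOMAIN = 6

def split_cores(cores):
    gpu_core = cores[N_CORES_PER_DOMAIN]
    rest = [c for c in cores if c != gpu_core]
    return gpu_core, rest

cores = list(range(12)) + list(range(48, 60))  # rank 0's mask, decoded
gpu_core, rest = split_cores(cores)
print(gpu_core)  # -> 6, the first core of NUMA domain 1
```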
-Andreas Herten, 12 December 2020
#!/usr/bin/env python3
## calc_pinning.py: Helper to combine the masks of two NUMA domains into one. Prints intermediate output along the way.
## -Andreas Herten, 12 December 2020
def print_mask(mask):
    ## Visualize the lower 48 bits (the physical cores) of a hex mask,
    ## least-significant bit first, grouped into the 8 NUMA domains of 6 cores each
    print(" ".join([f'{int(mask.split(",")[0], base=16):096b}'[::-1][:48][6*i:6*(i+1)] for i in range(8)]))
def sum_mask(mask1, mask2):
    ## Masks of distinct NUMA domains are disjoint, so addition equals bitwise OR
    return int(mask1, 16) + int(mask2, 16)
# One mask per NUMA domain
masks = [
'0x3f00000000003f',
'0xfc0000000000fc0',
'0x3f00000000003f000',
'0xfc0000000000fc0000',
'0x3f00000000003f000000',
'0xfc0000000000fc0000000',
'0x3f00000000003f000000000',
'0xfc0000000000fc0000000000'
]
print(",".join(masks))
print("")
for mask in masks:
    print_mask(mask)
print("\n")
converted_masks = [hex(sum_mask(a, b)) for a, b in zip(masks[::2], masks[1::2])]
print(",".join(converted_masks))
print("")
for mask in converted_masks:
    print_mask(mask)
#!/usr/bin/env bash
## get_close_gpu.sh: Map an odd NUMA domain ID to the ID of the GPU closest to it (JUWELS Booster topology)
declare -A gpus=( ["3"]="0" ["1"]="1" ["7"]="2" ["5"]="3" )
echo "${gpus[$1]}"
#!/usr/bin/env bash
## process_info.sh: Print MPI rank, visible GPU, current CPU affinity, and the CPU core last run on
echo -n "MPI Rank: $MPI_LOCALRANKID;"
echo -n "CUDA_VIS_DEV: $CUDA_VISIBLE_DEVICES;"
echo -n "$(taskset -c -p $$);"
echo -n "Last CPU core: $(awk '{print $39}' /proc/$$/stat)" ## field 39 of /proc/<pid>/stat is the CPU last executed on
echo ""
#!/usr/bin/env bash
## split_mask.sh: Set masks and launch processes; masks used: one GPU-close core, and the remaining cores of the two NUMA domains.
## Arguments to script:
## * None, >2: Print masks
## * 1: Launch same process with different masks
## * 2: Launch first process with other mask, launch second process with GPU mask
## NOTE: Highly tuned to the AMD EPYC topology of JUWELS Booster. May work for other topologies, but needs fine-tuning
## -Andreas Herten, 12 December 2020
## Get odd NUMA domain (which is always close to a GPU)
numa_domains=$(numactl -s | grep nodebind | sed "s/nodebind: //")
for domain in $numa_domains; do
if [ $((domain%2)) == 1 ]; then
odd_domain=$domain
fi
done
## GPU for odd NUMA domain
gpu_id=$(bash get_close_gpu.sh $odd_domain)
## Get cores of NUMA domain
numa_cores=$(numactl -s | grep physcpubind | sed "s/physcpubind: //")
numa_cores_array=($numa_cores) ## convert to array
## Split the list of cores into a single GPU-close core and the remaining ones
N_CORES_PER_DOMAIN=6 ## this could probably be retrieved from the system somewhere^TM
core_gpu=${numa_cores_array[$N_CORES_PER_DOMAIN]} ## this implicitly assumes even NUMA domain before odd NUMA domain
## Compare exactly; a pattern replacement like ${numa_cores_array[@]/$core_gpu} would also mangle cores such as 56 or 60
core_rest=""
for core in "${numa_cores_array[@]}"; do
    if [ "$core" != "$core_gpu" ]; then
        core_rest="$core_rest $core"
    fi
done
core_rest=${core_rest# } ## strip leading space
core_rest__commasep=$(echo $core_rest | tr " " ",")
## Mask program calls to match the domains, including GPU
case $# in
1)
env -u CUDA_VISIBLE_DEVICES numactl --physcpubind=$core_rest__commasep $1
CUDA_VISIBLE_DEVICES=$gpu_id numactl --physcpubind=$core_gpu $1
;;
2)
env -u CUDA_VISIBLE_DEVICES numactl --physcpubind=$core_rest__commasep $1
CUDA_VISIBLE_DEVICES=$gpu_id numactl --physcpubind=$core_gpu $2
;;
*)
echo "DEBUG OUTPUT"
echo " GPU: $gpu_id"
echo " GPU Core: $core_gpu"
echo " Other Cores: $core_rest"
;;
esac