... | @@ -104,6 +104,44 @@ Except for the MPI and the node-internal CUDA transport layer, all layers utiliz |
With any transport layer other than MPI or intra-node CUDA, make sure that the PMI (not MPI) environment is set up correctly. The easiest way to achieve this with Slurm is `srun --mpi=pmi2` or `srun --mpi=pmix`. If this option is not available or not supported by your Slurm installation, please consult the relevant PMI documentation for your system.
## Usage of TCP Communication API Without miniPMI
Linktest can be configured to test MPI or TCP without the miniPMI library. For MPI no additional work is necessary beyond launching with `mpiexec` or the like, and Linktest can be used as described above. When testing TCP communication without the miniPMI library, the cluster configuration must be specified explicitly via the following four environment variables: `LINKTEST_TCP_SIZE`, `LINKTEST_TCP_RANK`, `LINKTEST_TCP_IPADDR_<<<RANK>>>` and `LINKTEST_TCP_PORT_<<<RANK>>>`.

`LINKTEST_TCP_SIZE`: An integer indicating the number of tasks to be used for the test.

`LINKTEST_TCP_RANK`: The rank of the current task.

`LINKTEST_TCP_IPADDR_<<<RANK>>>`: The IP address of rank `<<<RANK>>>`, where `<<<RANK>>>` is the eight-digit zero-filled integer rank to which the environment variable corresponds.

`LINKTEST_TCP_PORT_<<<RANK>>>`: The communication port of rank `<<<RANK>>>`, where `<<<RANK>>>` is the eight-digit zero-filled integer rank to which the environment variable corresponds. These ports must be free on the respective machines. Linktest does not check this, nor does it port-scan to find free ports and communicate them to its partners. Choosing free ports is the user's responsibility.

For a given task, `LINKTEST_TCP_SIZE` and `LINKTEST_TCP_RANK` must be specified. `LINKTEST_TCP_IPADDR_<<<RANK>>>` and `LINKTEST_TCP_PORT_<<<RANK>>>` must also be specified for all other tasks.
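
As a minimal sketch of the naming scheme, the following snippet sets these variables by hand for a two-task run, as seen from rank 0. The helper `linktest_tcp_set_peer`, the IP addresses (taken from the 192.0.2.0/24 documentation range) and the ports are hypothetical placeholders and not part of Linktest.

```bash
# Hypothetical helper (not part of Linktest): export the address and port
# variables for one rank, using the eight-digit zero-filled rank as suffix.
linktest_tcp_set_peer() {
    local rank="$1" ip="$2" port="$3"
    local suffix
    suffix=$(printf "%08d" "${rank}")
    export "LINKTEST_TCP_IPADDR_${suffix}=${ip}"
    export "LINKTEST_TCP_PORT_${suffix}=${port}"
}

# Two tasks in total; this shell acts as rank 0.
export LINKTEST_TCP_SIZE=2
export LINKTEST_TCP_RANK=0

# Placeholder addresses and ports; replace with values valid on your system.
linktest_tcp_set_peer 0 192.0.2.10 60000
linktest_tcp_set_peer 1 192.0.2.11 60001
```

On the second machine the same lines would be used with `LINKTEST_TCP_RANK=1`, so that both tasks agree on the address and port of every rank.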

With the cluster environment configured in this way, Linktest can be executed as usual. Below is an example of how to configure the environment from a host-name list, which here is obtained from a Slurm environment variable, assuming that the script is submitted via Slurm and that there is one task per node:
```BASH
# 1. List of Host Names
hosts=($(scontrol show hostnames "${SLURM_JOB_NODELIST}"))

# 2. Export TCP Size & Rank
export LINKTEST_TCP_SIZE=${SLURM_NTASKS}
for i in "${!hosts[@]}"; do
    # The rank is the index of this node in the host list
    if [ "${HOSTNAME}" == "${hosts[${i}]}" ]; then
        export LINKTEST_TCP_RANK=${i}
    fi
done

# 3. Export TCP IP-Address & Port
base_port=60000
for i in "${!hosts[@]}"; do
    task=$(printf "%08d" "${i}")  # eight-digit zero-filled rank
    export "LINKTEST_TCP_IPADDR_${task}=$(getent hosts "${hosts[${i}]}" | awk '{ print $1 }')"
    export "LINKTEST_TCP_PORT_${task}=$((base_port + i))"
done

# 4. Execute Linktest
linktest --mode tcp --num-warmup-messages 10 --num-messages 1000 --size-messages 1024 --output tcp.sion
```
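Because Linktest expects the address and port of every rank to be present, a small pre-flight check can catch an incomplete environment before the run starts. The following is only a sketch: the function `linktest_tcp_check_env` is hypothetical and not part of Linktest, and it checks only that the variables are set, not that the ports are actually free.

```bash
# Hypothetical pre-flight check (not part of Linktest): verify that size, rank
# and the per-rank address/port variables are all set.
linktest_tcp_check_env() {
    local rank=0 missing=0 suffix name value
    if [ -z "${LINKTEST_TCP_SIZE:-}" ] || [ -z "${LINKTEST_TCP_RANK:-}" ]; then
        echo "LINKTEST_TCP_SIZE and LINKTEST_TCP_RANK must be set" >&2
        return 1
    fi
    while [ "${rank}" -lt "${LINKTEST_TCP_SIZE}" ]; do
        suffix=$(printf "%08d" "${rank}")
        for name in "LINKTEST_TCP_IPADDR_${suffix}" "LINKTEST_TCP_PORT_${suffix}"; do
            # Look up the variable whose name is stored in ${name};
            # the name consists of a fixed prefix plus digits only.
            eval "value=\${${name}:-}"
            if [ -z "${value}" ]; then
                echo "missing: ${name}" >&2
                missing=1
            fi
        done
        rank=$((rank + 1))
    done
    return "${missing}"
}
```

It could be called just before step 4 of the script above, for example as `linktest_tcp_check_env || exit 1`.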
# JSC Run Examples
**Linktest on 2048 nodes, 1 task per node, message size 16 MiB, 2 warmup messages and 4 messages for measurement:**