[[_TOC_]]

# Usage

Linktest has to be started in parallel with an even number of processes, for example using `srun --ntasks 2 linktest`. You can control its execution via the following command-line arguments:

`-h` or `--help`: Prints a help message similar to the following:

```
Version : <<<VERSION>>>
Usage : linktest [options]

Possible options (default values in parathesis):

-h/--help                     print help message and exit
-v/--version                  print version and exit
-w/--num-warmup-messages VAL  number of warmup pingpong messages [REQUIRED]
-n/--num-messages VAL         number of pingpong messages [REQUIRED]
-s/--size-messages VAL        message size in bytes [REQUIRED]
-m/--mode VAL                 transport Layer to be used [REQUIRED]*
--all-to-all                  additionally perform MPI all-to-all tests (0)
--bidirectional               perform bidirectional tests (0)
--use-gpu-memory              use GPU memory to store messages (0)
--bisection                   perform a bandwidth tests between bisecting halves (0)
--randomize                   randomize processor numbers (0)
--serial-tests                serialize tests (0)
--no-sion-file                do not write data to sion file (0)
--parallel-sion-file          write data in parallel (sion) (0)
--num-slowest VAL             number of slowest pairs to be retested (10)
--min-iterations VAL          linktest repeats for at least <min_iterations> (1)
--min-runtime VAL             linktest runs for at least <min_runtime> seconds communication time (0.0)
-o/--output VAL               output file name (pingpong_results_bin.sion)

* This build supports [<<<SUPPORTED COMMUNICATION APIs>>>].
Alternatively to --mode, the transport layer can be defined by using linktest.LAYER
or setting environment variable LINKTEST_VCLUSTER_IMPL
```

where `<<<VERSION>>>` is the three-part version of the Linktest executable and `<<<SUPPORTED COMMUNICATION APIs>>>` is a list of supported communication APIs/layers. This option supersedes all others. When executed with this option, Linktest does not need to be run in parallel.

`-v` or `--version`: Prints the following version information:

```
FZJ Linktest (<<<VERSION>>>)
```

where `<<<VERSION>>>` is the three-part version of the Linktest executable. As with `-h` or `--help`, Linktest does not need to be run in parallel with this option. This option supersedes all others aside from `-h` or `--help`.

`-w` or `--num-warmup-messages`: Specifies that the following integer indicates the number of warm-up messages used to warm up a connection before testing it. This argument is required unless help or version information is being printed.

`-n` or `--num-messages`: Specifies that the following integer indicates the number of messages over which measurements are averaged during testing. This argument is required unless help or version information is being printed.

`-s` or `--size-messages`: Specifies that the following integer indicates the message size in bytes for testing. This argument is required unless help or version information is being printed.

`-m` or `--mode`: Specifies that the following ASCII string indicates the communication API to use for testing. Alternatively, the communication API can be extracted from the extension of the Linktest executable name or from the `LINKTEST_VCLUSTER_IMPL` environment variable. When the communication API is specified in more than one way, `-m` or `--mode` supersedes the Linktest executable extension, which in turn supersedes the `LINKTEST_VCLUSTER_IMPL` environment variable.

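The precedence described for `-m` or `--mode` can be sketched as a small shell function. This is only an illustration of the documented order, not Linktest's actual implementation; the function name `resolve_layer` and the layer name `tcp` used in the usage line are assumptions made for this example.

```sh
# Illustrative only: mimics the documented precedence, not Linktest's own code.
# Usage: resolve_layer MODE_ARG EXECUTABLE_NAME ENV_VALUE
resolve_layer() (
    mode_arg=$1          # value given to -m/--mode, empty if not given
    exe_base=${2##*/}    # strip any directory part: ./linktest.mpi -> linktest.mpi
    env_value=$3         # value of LINKTEST_VCLUSTER_IMPL, empty if unset

    # Extract the extension of the executable name, if it has one.
    case $exe_base in
        *.*) exe_ext=${exe_base##*.} ;;
        *)   exe_ext= ;;
    esac

    if [ -n "$mode_arg" ]; then
        printf '%s\n' "$mode_arg"     # 1. -m/--mode supersedes everything else
    elif [ -n "$exe_ext" ]; then
        printf '%s\n' "$exe_ext"      # 2. then the executable extension
    elif [ -n "$env_value" ]; then
        printf '%s\n' "$env_value"    # 3. then the environment variable
    else
        echo "no communication API specified" >&2
        exit 1
    fi
)

# The executable extension wins over the environment value here.
resolve_layer "" ./linktest.mpi tcp
```
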
`--all-to-all`: Specifies that the following integer, if non-zero, indicates that all-to-all testing should be done before and after the main Linktest test if the communication API in use is MPI.

`--bidirectional`: Specifies that the following integer, if non-zero, indicates that testing should occur bidirectionally instead of semi-directionally, which is the default.

`--bisection`: Specifies that the following integer, if non-zero, indicates that the tasks for testing should be split into two halves and that testing should only occur between these two halves. This is useful for determining bisection bandwidths.

`--randomize`: Specifies that the following integer, if non-zero, indicates that the order in which tests are performed is to be randomized.

`--serial-tests`: Specifies that the following integer, if non-zero, indicates that connections should be tested in serial. By default, testing occurs in parallel.

`--no-sion-file`: Specifies that the following integer, if non-zero, indicates that the collected results should not be written out into a SION file.

`--parallel-sion-file`: Specifies that the following integer, if non-zero, indicates that the collected results should be written out into a SION file in parallel, if writing is enabled.

`--num-slowest`: Specifies that the following integer indicates the number of slowest connections to serially retest after the end of the main test.

`--min-iterations`: Specifies that the following integer indicates the number of times the Linktest benchmark should be repeated. If this value is not one, the writing of SION files is disabled. This command-line argument is useful for applying a communication load to the system.

`--min-runtime`: Specifies that the following floating-point number indicates the number of seconds for which Linktest should repeat itself. If this value is non-zero, the writing of SION files is disabled. This command-line argument is useful for applying a communication load to the system.

`-o` or `--output`: Specifies that the following string indicates the filename of the output SION file.

The arguments `--num-warmup-messages`, `--num-messages` and `--size-messages` are required. The transport layer is usually given through the `--mode` option. In rare cases where this doesn't work, you can fall back to the `linktest.LAYER` executables and/or set the environment variable `LINKTEST_VCLUSTER_IMPL`.

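A job script can check these requirements itself before launching. The following is a minimal sketch, assuming a hypothetical helper named `linktest_cmd` that only prints the `srun` command line (a dry run) after checking that all required values are present and that the task count is even. It is not part of Linktest.

```sh
# Hypothetical helper: validates the inputs and prints the command line
# instead of running it, so it can be inspected first.
# Usage: linktest_cmd NTASKS LAYER WARMUP MESSAGES SIZE_BYTES
linktest_cmd() (
    if [ $# -ne 5 ]; then
        echo "usage: linktest_cmd NTASKS LAYER WARMUP MESSAGES SIZE_BYTES" >&2
        exit 1
    fi
    ntasks=$1
    # Linktest has to be started with an even number of processes.
    if [ "$ntasks" -lt 2 ] || [ $((ntasks % 2)) -ne 0 ]; then
        echo "error: need an even number of tasks (at least 2), got $ntasks" >&2
        exit 1
    fi
    printf 'srun --ntasks %s linktest --mode %s --num-warmup-messages %s --num-messages %s --size-messages %s\n' \
        "$1" "$2" "$3" "$4" "$5"
)

# 16 MiB messages: $((16*1024*1024)) expands to 16777216 bytes.
linktest_cmd 4 mpi 10 100 $((16*1024*1024))
```

Printing the command rather than executing it keeps the sketch usable on machines without Slurm; on a real system the printed line can be run as is.
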
```
# Option 1: Using mode to specify the virtual-cluster implementation
srun --ntasks 4 \
    linktest \
        --mode mpi \
        --num-warmup-messages 10 \
        --num-messages 100 \
        --size-messages $((16*1024*1024));

# Option 2: Using a linktest executable with a suffix
srun --ntasks 4 \
    linktest.mpi \
        --num-warmup-messages 10 \
        --num-messages 100 \
        --size-messages $((16*1024*1024));

# Option 3: Using the LINKTEST_VCLUSTER_IMPL environment variable
export LINKTEST_VCLUSTER_IMPL=mpi;
srun --ntasks 4 \
    linktest \
        --num-warmup-messages 10 \
        --num-messages 100 \
        --size-messages $((16*1024*1024));
```

Except for the MPI and the node-internal CUDA transport layers, all layers use the TCP sockets implementation underneath to set up connections and exchange data in non-benchmark code segments. The TCP layer implementation looks up the hostname of the node to determine the IPs for the initial connection setup. There are currently only limited ways to customize this behavior. The code supports setting `LINKTEST_SYSTEM_NODENAME_SUFFIX`, a suffix that is appended to the short hostname. For example, on JSC systems, `LINKTEST_SYSTEM_NODENAME_SUFFIX=i` may need to be exported to make sure the out-of-band connection setup is done via the IPoIB network.

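To see which name would be looked up with a given suffix, the composition can be sketched in shell. This only illustrates "short hostname plus suffix" as described above and is not Linktest's code; the function name `lookup_name` and the host name `node01` mentioned in the comments are assumptions.

```sh
# Illustrative only: builds the name "short hostname + suffix".
# Usage: lookup_name SHORT_HOSTNAME [SUFFIX]
lookup_name() {
    printf '%s%s\n' "$1" "${2:-}"
}

# Short hostname of this node: drop everything after the first dot.
short=$(uname -n)
short=${short%%.*}

# With the suffix "i", a node called node01 would be looked up as node01i.
LINKTEST_SYSTEM_NODENAME_SUFFIX=i
lookup_name "$short" "$LINKTEST_SYSTEM_NODENAME_SUFFIX"
```

If the printed name does not resolve on the compute nodes, the out-of-band setup is unlikely to work with that suffix.
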
With any transport layer other than MPI or intra-node CUDA, it is important to make sure that the PMI (not MPI) environment is correctly set up. The easiest way to achieve this with Slurm is `srun --mpi=pmi2` or `srun --mpi=pmix`. If this option is not available or not supported by Slurm, please consult the relevant PMI documentation for your system.

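Put together with the required arguments, such a launch might look like the following dry run. It is a sketch only: the helper `pmi_cmd` is hypothetical, and `LAYER` is a placeholder for one of the non-MPI layer names listed by `linktest -h` for your build.

```sh
# Hypothetical helper: prints an srun command line with an explicit PMI type.
# Usage: pmi_cmd PMI_TYPE LAYER
pmi_cmd() (
    case $1 in
        pmi2|pmix) ;;   # the two values named in the text above
        *) echo "error: expected pmi2 or pmix, got '$1'" >&2; exit 1 ;;
    esac
    printf 'srun --mpi=%s --ntasks 4 linktest --mode %s --num-warmup-messages 10 --num-messages 100 --size-messages 1024\n' \
        "$1" "$2"
)

# Dry run: replace LAYER with a layer name supported by your build.
pmi_cmd pmix LAYER
```
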
# JSC Run Examples