... | @@ -11,31 +11,35 @@ Usage : linktest [options] |
... | @@ -11,31 +11,35 @@ Usage : linktest [options] |
|
|
|
|
|
Possible options (default values in parathesis):
|
|
Possible options (default values in parathesis):
|
|
|
|
|
|
-h/--help Print this help message and exit
|
|
-h/--help Print this help message and exit
|
|
-v/--version Print LinkTest version and exit
|
|
-v/--version Print LinkTest version and exit
|
|
-m/--mode VAL Transport Layer to be used [REQUIRED]*
|
|
-m/--mode VAL Transport Layer to be used [REQUIRED]*
|
|
-w/--num-warmup-messages VAL Number of warm-up messages [REQUIRED]
|
|
-w/--num-warmup-messages VAL Number of warm-up messages [REQUIRED]
|
|
-n/--num-messages VAL Number of messages [REQUIRED]
|
|
-n/--num-messages VAL Number of messages [REQUIRED]
|
|
-s/--size-messages VAL Message size in bytes [REQUIRED]
|
|
-s/--size-messages VAL Message size in bytes [REQUIRED]
|
|
-o/--output VAL output file name (pingpong_results_bin.sion)
|
|
-o/--output VAL output file name (pingpong_results_bin.sion)
|
|
--no-sion-file Do not write data to sion file (0)
|
|
--no-sion-file Do not write data to sion file (0)
|
|
--parallel-sion-file Write data SION file in parallel (0)
|
|
--parallel-sion-file Write data SION file in parallel (0)
|
|
--num-slowest VAL Number of slowest pairs to be retested (10)
|
|
--num-slowest VAL Number of slowest pairs to be retested (10)
|
|
--min-iterations VAL LinkTest repeats for at least <min_iterations> (1)
|
|
--min-iterations VAL LinkTest repeats for at least <min_iterations> (1)
|
|
--min-runtime VAL LinkTest runs for at least <min_runtime> seconds communication time (0.0)
|
|
--min-runtime VAL LinkTest runs for at least <min_runtime> seconds communication time (0.0)
|
|
--memory_buffer_allocator VAL Allocator type for memory (DEFAULT)
|
|
--memory-buffer-allocator VAL Allocator type for memory buffers (DEFAULT)
|
|
--all-to-all Additionally perform MPI all-to-all tests (0)
|
|
--all-to-all Additionally perform MPI all-to-all tests (0)
|
|
--unidirectional Perform unidirectional tests (0)
|
|
--unidirectional Perform unidirectional tests (0)
|
|
--bidirectional Perform bidirectional tests (0)
|
|
--bidirectional Perform bidirectional tests (0)
|
|
--bisection Test between bisecting halves (0)
|
|
--bisection Test between bisecting halves (0)
|
|
--serial-tests Serialize tests (0)
|
|
--serial-tests Serialize tests (0)
|
|
--randomize Randomize test order (0)
|
|
--randomize-steps Randomize step execution order (0)
|
|
--use-gpu-memory Use GPU memory to store message buffers (0)
|
|
--seed-randomize-steps VAL Seed for step randomization (9876543210)
|
|
--multi_buf Use multiple send and receive buffers (0)
|
|
--use-gpu-memory Use GPU memory to store message buffers (0)
|
|
--num_multi_buf VAL Number of buffers when using multiple buffers (0)
|
|
--multi-buffer Use multiple send and receive buffers (0)
|
|
--randomize-buffers Randomize buffers (0)
|
|
--num-multi-buffer VAL Number of buffers when using multiple buffers (0)
|
|
--mt_seed VAL Seed for buffer randomization (1234567890)
|
|
--randomize-buffers Randomize buffers (0)
|
|
--check_buffers Check buffers after timing kernel (0)
|
|
--seed-randomize-buffers VAL Seed for buffer randomization (1234567890)
|
|
|
|
--check-buffers Check buffers after timing kernel (0)
|
|
|
|
--num-randomize-tasks VAL Use VAL different randomly assigned process IDs for communication scheme (0)
|
|
|
|
--seed-randomize-tasks VAL Seed for task randomization (29309775)
|
|
|
|
--group-processes-by-hostname Group processes by hostnames and only test group to group (0)
|
|
|
|
|
|
* This build supports [<<<SUPPORTED COMMUNICATION APIs>>>].
|
|
* This build supports [<<<SUPPORTED COMMUNICATION APIs>>>].
|
|
Alternatively to --mode, the transport layer can be defined by using linktest.LAYER
|
|
Alternatively to --mode, the transport layer can be defined by using linktest.LAYER
|
... | @@ -49,7 +53,7 @@ FZJ Linktest (<<<VERSION>>>) |
... | @@ -49,7 +53,7 @@ FZJ Linktest (<<<VERSION>>>) |
|
```
|
|
```
|
|
where `<<<VERSION>>>` is the three-part version of the LinkTest executable. Like the `-h` or `--help` option LinkTest does not need to be executed with this option. This option supersedes all others aside from the `-h` or `--help` option.
|
|
where `<<<VERSION>>>` is the three-part version of the LinkTest executable. Like the `-h` or `--help` option LinkTest does not need to be executed with this option. This option supersedes all others aside from the `-h` or `--help` option.
|
|
|
|
|
|
`-m` or `--mode`: Specifies that the following ASCII string indicates the communication API to use for testing. Alternatively the communication API can be extracted from the extension of the LinkTest executable name or from the `LINKTEST_VCLUSTER_IMPL` environment variable. When multiple ways of specifying the communication API are used then `-m` or `--mode` supersedes the linktest executable extension, which in turn also supersedes the `LINKTEST_VCLUSTER_IMPL` environment variable.
|
|
`-m` or `--mode`: Specifies that the following ASCII string indicates the communication API to use for testing. Alternatively the communication API can be extracted from the extension of the LinkTest executable name or from the `LINKTEST_VCLUSTER_IMPL` environment variable. When multiple ways of specifying the communication API are used then `-m` or `--mode` supersedes the LinkTest executable extension, which in turn also supersedes the `LINKTEST_VCLUSTER_IMPL` environment variable.
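The precedence described above can be sketched as a small shell function. This is only an illustration of the documented order, not LinkTest's actual implementation; the function name `resolve_api` and its arguments are hypothetical.

```shell
# Hypothetical helper, not part of LinkTest: mirrors the documented precedence.
# $1 = value passed to -m/--mode (may be empty)
# $2 = name of the executable, e.g. linktest.mpi or plain linktest
resolve_api() {
    mode_arg="$1"
    exe_name=$(basename -- "$2")
    if [ -n "$mode_arg" ]; then
        # 1. -m/--mode supersedes everything else
        printf '%s\n' "$mode_arg"
    elif [ "${exe_name#*.}" != "$exe_name" ]; then
        # 2. extension of the executable name (text after the last dot)
        printf '%s\n' "${exe_name##*.}"
    elif [ -n "${LINKTEST_VCLUSTER_IMPL:-}" ]; then
        # 3. environment variable as the last resort
        printf '%s\n' "$LINKTEST_VCLUSTER_IMPL"
    else
        echo "no communication API specified" >&2
        return 1
    fi
}

resolve_api "" ./linktest.mpi
```

With an empty `--mode` value the extension of `linktest.mpi` is used; passing a non-empty first argument would override it.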
|
|
|
|
|
|
`-w` or `--num-warmup-messages`: Specifies that the following integer indicates the number of warm-up messages to use to warm up a connection before testing it. When not printing help or version information this command-line argument is required.
|
|
`-w` or `--num-warmup-messages`: Specifies that the following integer indicates the number of warm-up messages to use to warm up a connection before testing it. When not printing help or version information this command-line argument is required.
|
|
|
|
|
... | @@ -65,7 +69,7 @@ where `<<<VERSION>>>` is the three part version of LinkTest executable. Like the |
... | @@ -65,7 +69,7 @@ where `<<<VERSION>>>` is the three part version of LinkTest executable. Like the |
|
|
|
|
|
`--num-slowest`: Specifies that the following integer indicates the number of slowest connections to serially retest after the end of the main test.
|
|
`--num-slowest`: Specifies that the following integer indicates the number of slowest connections to serially retest after the end of the main test.
|
|
|
|
|
|
`--min-iterations`: Specifies that the following integer indicates the number of times the linktest benchmark should be repeated. If not one the writing of SION files is disabled. This command-line argument is useful to apply a communication load to the system.
|
|
`--min-iterations`: Specifies that the following integer indicates the number of times the LinkTest benchmark should be repeated. If the value is not one, the writing of SION files is disabled. This command-line argument is useful to apply a communication load to the system.
|
|
|
|
|
|
`--min-runtime`: Specifies that the following floating-point number indicates the number of seconds that LinkTest should repeat itself for. If non-zero, the writing of SION files is disabled. This command-line argument is useful to apply a communication load to the system.
|
|
`--min-runtime`: Specifies that the following floating-point number indicates the number of seconds that LinkTest should repeat itself for. If non-zero, the writing of SION files is disabled. This command-line argument is useful to apply a communication load to the system.
|
|
|
|
|
... | @@ -85,11 +89,13 @@ where `<<<VERSION>>>` is the three part version of LinkTest executable. Like the |
... | @@ -85,11 +89,13 @@ where `<<<VERSION>>>` is the three part version of LinkTest executable. Like the |
|
|
|
|
|
`--bidirectional`: Specifies that testing should occur bidirectionally instead of semi-directionally, which is the default. Cannot be used in conjunction with `--unidirectional`.
|
|
`--bidirectional`: Specifies that testing should occur bidirectionally instead of semi-directionally, which is the default. Cannot be used in conjunction with `--unidirectional`.
|
|
|
|
|
|
`--bisection`: Specifies that the tasks for testing should be split in two halves and that testing should only occur between these two. This is useful for determining bisection bandwidths.
|
|
`--bisection`: Specifies that the tasks for testing should be split in two halves and that testing should only occur between these two. This is useful for determining bisection bandwidths. For more information see [Communication Patterns](#Communications-Patterns).
|
|
|
|
|
|
`--serial-tests`: Specifies that connections should be tested in serial. By default testing occurs in parallel.
|
|
`--serial-tests`: Specifies that connections should be tested in serial. By default testing occurs in parallel.
|
|
|
|
|
|
`--randomize`: Specifies that the order in which tests are performed is to be randomized.
|
|
`--randomize-steps`: Specifies that the step order in which tests are performed is to be randomized.
|
|
|
|
|
|
|
|
`--seed-randomize-steps`: Specifies that the following integer is to be used as a seed for step randomization. This option is only important if `--randomize-steps` is specified. The seed value can be between 1 and 2^32-1.
|
|
|
|
|
|
`--use-gpu-memory`: Specifies that GPU memory should be used for the message buffers.
|
|
`--use-gpu-memory`: Specifies that GPU memory should be used for the message buffers.
|
|
|
|
|
... | @@ -99,11 +105,17 @@ where `<<<VERSION>>>` is the three part version of LinkTest executable. Like the |
... | @@ -99,11 +105,17 @@ where `<<<VERSION>>>` is the three part version of LinkTest executable. Like the |
|
|
|
|
|
`--randomize-buffers`: Specifies that the buffers should be randomized before sending and receiving. Randomization is done using the Mersenne Twister 19937 algorithm, which has a period of 2^19937-1. Currently does not work for GPU buffers.
|
|
`--randomize-buffers`: Specifies that the buffers should be randomized before sending and receiving. Randomization is done using the Mersenne Twister 19937 algorithm, which has a period of 2^19937-1. Currently does not work for GPU buffers.
|
|
|
|
|
|
`--mt_seed`: Specifies that the following integer is to be used as a seed for buffer randomization. This option is only important if `--randomize-buffers` is specified. The seed value can be between 1 and 2^32-1. Currently does not work for GPU buffers.
|
|
`--seed-randomize-buffers`: Specifies that the following integer is to be used as a seed for buffer randomization. This option is only important if `--randomize-buffers` is specified. The seed value can be between 1 and 2^32-1. Currently does not work for GPU buffers.
|
|
|
|
|
|
`check_buffers`: Specifies that buffers should be checked after each step. This only detects errors if during the last time the buffer was written to the buffer was corrupted. If one of the messages in the middle is incorrectly transferred this will not detect it.
|
|
`--check-buffers`: Specifies that buffers should be checked after each step. This only detects errors if the buffer was corrupted the last time it was written to. If one of the messages in the middle is incorrectly transferred, this check will not detect it.
|
|
|
|
|
|
The arguments num-warmup-messages, num-messages & size-messages are required. The transport layer is usually given through the --mode option. In rare cases where this doesn't work, you can fall back to the linktest.LAYER executables, and/or set the environment variable `LINKTEST_VCLUSTER_IMPL`.
|
|
`--num-randomize-tasks`: Specifies that the following integer indicates the number of times the test should be repeated for random permutations of the process IDs. If `0` then the process IDs are not randomized.
|
|
|
|
|
|
|
|
`--seed-randomize-tasks`: Specifies that the following integer is to be used as a seed for process-ID randomization. This option is only important if the value of `--num-randomize-tasks` is nonzero. The seed value can be between 1 and 2^32-1.
|
|
|
|
|
|
|
|
`--group-processes-by-hostname`: Specifies that process IDs should be grouped according to their hostname. Testing then exclusively occurs between group pairs. When testing a group pair all possible connection pairs of the processes belonging to the two groups are iterated through while ensuring that communication only happens between the two groups. This is done for all possible group pairs. If `--bisection` is also specified then the group of groups is split into bisecting halves and tests only occur between the two halves. For more information see [Communication Patterns](#Communications-Patterns).
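To make "all possible group pairs" concrete, the sketch below enumerates the unordered pairs of hostname groups. It is an assumption that each pair is visited once and in this order; the actual scheduling inside LinkTest may differ. The function name `group_pairs` and the host names are hypothetical.

```shell
# Hypothetical illustration: list every unordered pair of hostname groups.
# Arguments: one hostname per group.
group_pairs() {
    while [ "$#" -gt 1 ]; do
        first="$1"
        shift
        # pair the current group with every group that follows it
        for other in "$@"; do
            echo "$first <-> $other"
        done
    done
}

group_pairs node01 node02 node03
```

For `k` groups this yields `k*(k-1)/2` pairs; within each pair, all process pairs spanning the two groups would then be tested.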
|
|
|
|
|
|
|
|
The arguments `--num-warmup-messages`, `--num-messages` and `--size-messages` are required. The transport layer is usually given through the `--mode` option. In rare cases where this doesn't work, you can fall back to the linktest.LAYER executables, and/or set the environment variable `LINKTEST_VCLUSTER_IMPL`.
|
|
|
|
|
|
```
|
|
```
|
|
# Option 1: Using mode to specify the virtual-cluster implementation
|
|
# Option 1: Using mode to specify the virtual-cluster implementation
|
... | @@ -133,9 +145,7 @@ Except for the MPI and the node-internal CUDA transport layer, all layers utiliz |
... | @@ -133,9 +145,7 @@ Except for the MPI and the node-internal CUDA transport layer, all layers utiliz |
|
|
|
|
|
With any transport layer but MPI or intra-node CUDA it is important to make sure that the PMI (not MPI) environment is correctly set up. The easiest way to achieve this using slurm is: `srun --mpi=pmi2` or `srun --mpi=pmix`. If this option is not available or not supported by slurm please consult the relevant PMI documentation for your system.
|
|
With any transport layer but MPI or intra-node CUDA it is important to make sure that the PMI (not MPI) environment is correctly set up. The easiest way to achieve this using slurm is: `srun --mpi=pmi2` or `srun --mpi=pmix`. If this option is not available or not supported by slurm please consult the relevant PMI documentation for your system.
|
|
|
|
|
|
## WIP: Supported Combinations of Communication APIs & Various Options
|
|
## Supported Combinations of Communication APIs & Various Options
|
|
|
|
|
|
!!! WORK IN PROGRESS !!!
|
|
|
|
|
|
|
|
Not all option combinations are currently possible. The following table shows supported combinations.
|
|
Not all option combinations are currently possible. The following table shows supported combinations.
|
|
|
|
|
... | @@ -151,23 +161,22 @@ Not all option combinations are currently possible. The following table shows su |
... | @@ -151,23 +161,22 @@ Not all option combinations are currently possible. The following table shows su |
|
| Parallel SION File | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
|
| Parallel SION File | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
|
| Min. Iterations | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
|
| Min. Iterations | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
|
| Min. Runtime | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
|
| Min. Runtime | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
|
| Use Multi. Buffers\* | :heavy_check_mark:\*\* | :x: | :x: | :x: | :x: | :x: |
|
|
| Use Multi. Buffers | :heavy_check_mark:\* | :x: | :x: | :x: | :x: | :x: |
|
|
| Check Buffers\* | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
|
| Check Buffers | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
|
|
| Randomize Buffers\* | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
|
|
| Randomize Buffers | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
|
|
|
|
|
|
:heavy_check_mark: : Implemented
|
|
:heavy_check_mark: : Implemented
|
|
:x: : Not implemented
|
|
:x: : Not implemented
|
|
\* : Not in yet in master branch
|
|
\* : Only works for unidirectional
|
|
\*\* : Only works for unidirectional
|
|
|
|
|
|
|
|
## Performing bisection bandwidth tests
|
|
## Performing bisection bandwidth tests
|
|
**Currently only works with stable branch and MPI**
|
|
**Currently only works with MPI**
|
|
|
|
|
|
To perform a bisection bandwidth test, in which the parallel bandwidth between two bisecting halves of tasks is measured, usually with the tasks placed in a specific configuration of interest with respect to the network topology, two options need to be set.
|
|
To perform a bisection bandwidth test, in which the parallel bandwidth between two bisecting halves of tasks is measured, usually with the tasks placed in a specific configuration of interest with respect to the network topology, two options need to be set.
|
|
|
|
|
|
`--bisection` splits the tasks into two halves and only tests between them. If the tasks assigned to linktest are enumerated `0` to and including `n-1`, where `n` is even, then the tasks `0` to and including `n/2-1` are assigned to the first half and the tasks `n/2` to and including `n-1` are assigned to the second half. Tasks should be pinned to nodes such that the desired test configuration is achieved.
|
|
`--bisection` splits the tasks into two halves and only tests between them. If the tasks assigned to linktest are enumerated `0` to and including `n-1`, where `n` is even, then the tasks `0` to and including `n/2-1` are assigned to the first half and the tasks `n/2` to and including `n-1` are assigned to the second half. Tasks should be pinned to nodes such that the desired test configuration is achieved.
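The split rule can be checked for a given task count with a few lines of shell arithmetic. This is a minimal sketch of the enumeration described above; the function name `bisection_halves` is hypothetical.

```shell
# Hypothetical helper: print the task ranges of the two bisecting halves.
# $1 = total number of tasks n (must be even)
bisection_halves() {
    n="$1"
    if [ $((n % 2)) -ne 0 ]; then
        echo "n must be even" >&2
        return 1
    fi
    half=$((n / 2))
    echo "first half: 0..$((half - 1))"     # tasks 0 to n/2-1
    echo "second half: $half..$((n - 1))"   # tasks n/2 to n-1
}

bisection_halves 8
```

For `n = 8` the first half is tasks `0..3` and the second half is tasks `4..7`, so the pinning of these two ranges to nodes determines which part of the network is bisected.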
|
|
|
|
|
|
`--unidirectional` causes linktest to test unidirectionally connections in parallel. Testing semidirectionally or bidirectionally does not ensure that communication occurs unidirectionally between the two halves at any given point in time. `--bidirictional` can be used with the understanding that at no point the tests guarantee a certain communication pattern and direction between the two bisecting halves. The individual communications can not be sufficiently synchronized for this. For `--semidirectional` we have seen that the communication organizes itself in such a way that on a given link communication occurs in one direction, but the direction any given link communicates at any given time is random.
|
|
`--unidirectional` causes LinkTest to test connections unidirectionally in parallel. Testing semidirectionally or bidirectionally does not ensure that communication occurs unidirectionally between the two halves at any given point in time. `--bidirectional` can be used with the understanding that at no point do the tests guarantee a certain communication pattern and direction between the two bisecting halves. The individual communications cannot be sufficiently synchronized for this. For `--semidirectional` we have seen that the communication organizes itself in such a way that on a given link communication occurs in one direction, but the direction in which any given link communicates at any given time is random.
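Putting the two options together, a bisection run might be launched as in the sketch below. The task count, message counts, message size and the `./linktest` path are illustrative assumptions, and `bisection_cmd` is a hypothetical helper. It only prints the command so it can be inspected before being run inside a real allocation.

```shell
# Dry run: print the bisection command instead of executing it.
# All numeric values and the executable path are placeholders.
bisection_cmd() {
    ntasks="$1"   # should be even so the tasks split into two equal halves
    echo "srun --ntasks=$ntasks ./linktest --mode mpi" \
         "--num-warmup-messages 10 --num-messages 100 --size-messages 1024" \
         "--bisection --unidirectional"
}

bisection_cmd 8
```

Task placement (which nodes host which half) still has to be arranged through the job launcher and is not covered by this sketch.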
|
|
|
|
|
|
## Usage of TCP Communication API Without miniPMI
|
|
## Usage of TCP Communication API Without miniPMI
|
|
LinkTest can be configured to test MPI or TCP without the miniPMI library. In the case of MPI no additional work is necessary, aside from executing with `mpiexec` or the like, and linktest can be used as above. When testing TCP communication without the miniPMI library the cluster configuration needs to be specified explicitly via the following four environment variables: `LINKTEST_TCP_SIZE`, `LINKTEST_TCP_RANK`, `LINKTEST_TCP_IPADDR_<<<RANK>>>` and `LINKTEST_TCP_PORT_<<<RANK>>>`.
|
|
LinkTest can be configured to test MPI or TCP without the miniPMI library. In the case of MPI no additional work is necessary, aside from executing with `mpiexec` or the like, and linktest can be used as above. When testing TCP communication without the miniPMI library the cluster configuration needs to be specified explicitly via the following four environment variables: `LINKTEST_TCP_SIZE`, `LINKTEST_TCP_RANK`, `LINKTEST_TCP_IPADDR_<<<RANK>>>` and `LINKTEST_TCP_PORT_<<<RANK>>>`.
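As a rough sketch of how these variables could be generated for one process, the helper below prints the corresponding `export` statements. It rests on several assumptions not confirmed by the text above: that `<<<RANK>>>` stands for the zero-based rank number, that one address and one port variable exist per rank, and that consecutive ports starting at a base port are acceptable. The function name `tcp_env` and the addresses are hypothetical.

```shell
# Hypothetical helper: print export statements for one rank.
# Assumes <<<RANK>>> is replaced by the zero-based rank number.
# $1 = rank of this process, $2 = base port, remaining args = one IP address per rank
tcp_env() {
    rank="$1"
    base_port="$2"
    shift 2
    echo "export LINKTEST_TCP_SIZE=$#"      # number of participating processes
    echo "export LINKTEST_TCP_RANK=$rank"
    i=0
    for addr in "$@"; do
        echo "export LINKTEST_TCP_IPADDR_$i=$addr"
        echo "export LINKTEST_TCP_PORT_$i=$((base_port + i))"
        i=$((i + 1))
    done
}

tcp_env 0 5000 192.0.2.10 192.0.2.11
```

On each host the output could be applied with `eval "$(tcp_env <rank> 5000 192.0.2.10 192.0.2.11)"` before starting LinkTest, with `<rank>` set to that process's rank.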
|
... | @@ -245,7 +254,7 @@ $ xenv -L GCC -L CUDA -L ParaStationMPI \ |
... | @@ -245,7 +254,7 @@ $ xenv -L GCC -L CUDA -L ParaStationMPI \ |
|
LinkTest writes measurement results to stdout and monitoring information to stderr. Additionally by default a binary file in sion format will be produced containing detailed measurement data. These files are often quite sparse, therefore they can be compressed very efficiently if needed.
|
|
LinkTest writes measurement results to stdout and monitoring information to stderr. Additionally by default a binary file in sion format will be produced containing detailed measurement data. These files are often quite sparse, therefore they can be compressed very efficiently if needed.
|
|
|
|
|
|
## stdout
|
|
## stdout
|
|
The stdout output starts with the settings that were given for this run
|
|
The stdout output starts with the settings that were given for this run (exact output depends on configuration)
|
|
```
|
|
```
|
|
-------------------- LinkTest Args -------------------------
|
|
-------------------- LinkTest Args -------------------------
|
|
Virtual-Cluster Implementation: mpi
|
|
Virtual-Cluster Implementation: mpi
|
... | | ... | |