
LinkTest

LinkTest performs parallel end-to-end bandwidth tests. In addition to MPI, it supports several other transport layers. Depending on the transport layer, both single-node and multi-node tests can be performed. Results are written through SIONlib, a highly scalable I/O library, so a large number of processes can be tested.

Wiki

Our wiki contains further information, including details on how to build LinkTest, run it, and analyze its results:
https://gitlab.jsc.fz-juelich.de/cstao-public/linktest/-/wikis/home

Code Structure

The LinkTest repository consists of the following parts:

  1. benchmark/
    • Performs a parallel all-to-all performance test and reports the results on stdout and, optionally, in a binary file in SION format
    • Makefile produces the linktest executable
  2. python/
    • Python-based analysis tool that visualizes data from the SION files as a PDF
    • python -m pip installs the linktest-report package
  3. example/
    • Python notebook showcasing how to write a custom (advanced) analysis based on the Python API
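
An all-to-all test has to measure every pair of processes. The sketch below shows one common way to organise such a test, a round-robin (circle-method) schedule in which every rank has exactly one partner per step. It is an illustration only: the function name round_robin_schedule is hypothetical, and nothing here is taken from the LinkTest source, which may order its pairs differently.

```python
def round_robin_schedule(ntasks):
    """Return a list of steps; each step is a list of (a, b) rank pairs.

    Circle method: the first rank stays fixed while the others rotate by
    one position per step. Hypothetical helper, not part of LinkTest.
    """
    if ntasks < 2 or ntasks % 2 != 0:
        raise ValueError("ntasks must be an even number >= 2")
    ranks = list(range(ntasks))
    half = ntasks // 2
    steps = []
    for _ in range(ntasks - 1):
        # Pair the i-th rank from the front with the i-th rank from the back.
        steps.append([(ranks[i], ranks[ntasks - 1 - i]) for i in range(half)])
        # Rotate every rank except the first one by one position.
        ranks = [ranks[0]] + [ranks[-1]] + ranks[1:-1]
    return steps


schedule = round_robin_schedule(4)
for step, pairs in enumerate(schedule):
    print(step, pairs)
```

With an even number of ranks, this schedule needs ntasks - 1 steps and leaves no rank idle in any step.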

Copyright

Copyright (c) 2008-2022 Dr. Wolfgang Frings, Dr. Dorian Krause, Yannik Müller & Dr. Max Holicki Forschungszentrum Juelich GmbH - Juelich Supercomputing Centre

Read the copyright terms in COPYRIGHT before use or distribution.

Quickstart

For the quickstart examples, you will need:

Build procedure

In general, you have to perform the following steps:

  1. Install the dependencies
  2. Set up the environment, paths, etc.
  3. Install linktest with make, enabling transport layers other than MPI as needed
  4. [Optional] Install linktest-report via pip (update PYTHONPATH or use a virtual environment)

Quick-start example that installs LinkTest with MPI support only (it might not work unchanged on your system):

mkdir -p install
cd benchmark
ml GCC ParaStationMPI SIONlib SciPy-Stack                     # 1 + 2
make PREFIX=../install clean install                          # 3
python -m pip install ../python --prefix=../install           # 4 (pip takes the project directory, not setup.py)

Full Example

The example scripts are written for the JUWELS system. They assume that MPI, TCP, InfiniBand and UCX are available. They also use specific Slurm accounts in order to work out of the box; you will likely want to change those. To work through the example, do the following:

  1. Build the linktest executable and linktest-report: ./exampleBuild.sh.
  2. Execute LinkTest in parallel with an even number of processes: ./exampleRun.sh.

A file named pingpong_results_bin.sion will be produced, and a text based report will be printed on stdout.

  3. Analyse the produced SION file with linktest-report: ./exampleAnalysis.sh.

A file named report.pdf should be produced. This report shows the communication matrix (ntasks x ntasks), a histogram of the connection bandwidth, the parameters that were active during the run, and the results of the retest of the slowest connections.
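
To make the contents of the report more concrete, here is a minimal pure-Python sketch of the two ideas behind it: picking the slowest connections out of an ntasks x ntasks bandwidth matrix and binning the bandwidths into a histogram. It does not use the linktest-report API. The function names, the row-is-sender convention and the bandwidth values are all made up for illustration.

```python
def slowest_connections(matrix, count):
    """Return the `count` slowest (bandwidth, src, dst) entries, skipping the diagonal."""
    links = [
        (bw, src, dst)
        for src, row in enumerate(matrix)
        for dst, bw in enumerate(row)
        if src != dst
    ]
    return sorted(links)[:count]


def histogram(values, nbins):
    """Count values in `nbins` equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / nbins or 1.0  # avoid a zero width if all values are equal
    counts = [0] * nbins
    for v in values:
        # The maximum value would land one past the end, so clamp it into the last bin.
        index = min(int((v - lo) / width), nbins - 1)
        counts[index] += 1
    return counts


# Made-up bandwidths for 4 tasks; entry [i][j] is assumed to be the link i -> j.
bandwidth = [
    [0.0, 11.8, 12.1, 11.9],
    [11.7, 0.0, 6.2, 12.0],
    [12.2, 6.4, 0.0, 11.6],
    [11.9, 12.1, 11.8, 0.0],
]

worst = slowest_connections(bandwidth, 2)
off_diagonal = [bw for bw, _, _ in slowest_connections(bandwidth, 12)]
counts = histogram(off_diagonal, 3)
print(worst)
print(counts)
```

In this toy matrix the two directions of the link between tasks 1 and 2 stand out as the slowest, which is the kind of connection a retest would target.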