... | ... | @@ -13,7 +13,7 @@ Probably, here are a few things that will speed up report generation: |
|
|
|
|
|
2. Use the `--downsampling_factor_matrix_ticks` option; see [the Linktest Report options](Linktest-Report#options). Plotting tick labels in Matplotlib is very slow, so reducing the number of tick labels also speeds up report generation. As an added bonus, the remaining tick labels may become larger, making them easier to read.
|
|
|
|
|
|
|
|
|
3. If you tend to cancel seemingly hanging processes early because there is no command-line output, use the `verbose` option to see timing information for each segment of the report generation.
|
|
|
|
|
|
4. Use a newer version of Python or Matplotlib. Although the report tool was originally developed for Python 3.8.5 and Matplotlib 3.3.1, upgrading to Matplotlib 3.3.4 improved a 2-minute run using a defragmented SION file by approximately 15%. Upgrading to Python 3.9.0 cut the time to just above 1 minute. The main bottleneck is the slow Matplotlib backend for generating plots: the backends are optimized for quality, not performance. Profiling indicates that for larger SION files, 500 MiB and above, the Matplotlib backend accounts for about 80% of the report's compute time.
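
To illustrate why downsampling tick labels helps, here is a minimal sketch of the general idea, assuming that a downsampling factor of `k` means "keep every `k`-th tick". The helper `downsample_ticks` is hypothetical and is not part of the Linktest report tool; how the `--downsampling_factor_matrix_ticks` option is actually implemented may differ.

```python
def downsample_ticks(n_entries, factor):
    """Return every `factor`-th index of range(n_entries) as tick positions.

    Hypothetical helper for illustration only; not Linktest code.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return list(range(0, n_entries, factor))


# A 16x16 timing matrix with a factor of 4 needs only 4 labels per axis
# instead of 16, so far fewer text objects have to be rendered.
positions = downsample_ticks(n_entries=16, factor=4)
labels = [str(p) for p in positions]
print(positions)  # [0, 4, 8, 12]

# With Matplotlib, these would be applied to an existing Axes object `ax`
# via ax.set_xticks(positions) and ax.set_xticklabels(labels).
```

Because the number of labels drops by roughly the chosen factor, the time spent rendering tick text drops accordingly, and the freed space allows a larger font.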
|
|
|
|
... | ... | @@ -56,4 +56,3 @@ Note that Linktest does not have the ability to explicitly pre-initialise hardwa |
|
|
TLDR: Change your process pinning.
|
|
|
|
|
|
This is an artifact of your process pinning. Due to the way modern CPUs are constructed, certain CPU cores have faster access to certain hardware devices, and hence faster connections to the CPUs of other nodes, than other cores. This commonly manifests itself as checkerboard patterns in the timing matrix. The checkerboard pattern can usually be avoided by reorganizing the rows and columns, which can be achieved by changing the process pinning when Linktest is executed. For more information on how this is done, please see the documentation for the tools you use to execute Linktest in parallel, for example `mpiexec` or `srun`. |
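
The effect of reorganizing rows and columns can be sketched with a toy model. The code below is an illustration only, not Linktest code: it assumes that each rank is either close to or far from the network device, and that every far endpoint adds a fixed penalty to the message time. The names `timing_matrix`, `reorder`, and `far` are hypothetical.

```python
def timing_matrix(far_from_nic, base=1.0, penalty=1.0):
    """Toy model: each endpoint far from the network device adds `penalty`."""
    return [[base + penalty * (a + b) for b in far_from_nic]
            for a in far_from_nic]


def reorder(matrix, order):
    """Apply the same permutation to the rows and the columns."""
    return [[matrix[i][j] for j in order] for i in order]


# Assumed pinning: ranks alternate between near (0) and far (1) cores.
far = [0, 1, 0, 1]
alternating = timing_matrix(far)

# Group ranks by distance, which is what a different pinning would do.
order = sorted(range(len(far)), key=lambda rank: far[rank])
blocked = reorder(alternating, order)

for row in alternating:
    print(row)  # fast and slow entries alternate along each row
print()
for row in blocked:
    print(row)  # fast and slow entries now form contiguous blocks
```

In practice the reordering is not done after the fact on the matrix; it follows from how ranks are mapped to cores, which is why the pinning options of the launcher are the place to change it.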
|
|
|