PDC Summer School: Performance Engineering Lab / General Instructions
Introduction
The goal of this lab is to practise the performance engineering approaches and strategies presented in the accompanying lectures. The lab comprises two parts:
- Performance analysis and optimisation of a Poisson equation solver
- Performance analysis and optimisation of a dense matrix-matrix multiplier
In the next section you will find information on the tools that you are expected to use for this lab.
Profiling using Scalasca
To use Scalasca and related tools you need to load a number of modules:
module load qt papi gcc openmpi score-p scalasca cubegui
Performing measurements requires instrumentation of the executable using Score-P:
scorep <compile_command>
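As an illustration, instrumenting a build amounts to prefixing the usual compile and link command with scorep. The sketch below assumes an MPI C code built with mpicc; the file names poisson.c and poisson are hypothetical placeholders, not the lab's actual files. The command is executed only if scorep and the source file are found, so the snippet can also serve as a dry run.

```shell
# Hypothetical names: replace with the lab's actual source file and executable.
SRC=poisson.c
EXE=poisson

# The unmodified compile command (assumes the mpicc wrapper from the openmpi module).
COMPILE_CMD="mpicc -O2 -o $EXE $SRC"

# Instrumenting means prefixing that command with scorep.
INSTRUMENTED_CMD="scorep $COMPILE_CMD"
echo "$INSTRUMENTED_CMD"

# Execute only when Score-P is loaded and the source file exists.
if command -v scorep >/dev/null 2>&1 && [ -f "$SRC" ]; then
    $INSTRUMENTED_CMD
fi
```

In a Makefile-based build, the same effect is usually achieved by setting the compiler variable to the scorep-prefixed compiler.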
Finally, profiling statistics can be generated using the Scalasca tool scan:
scan <run_command>
If you want Scalasca to record hardware counters, which are accessible via the PAPI library, define the environment variable SCOREP_METRIC_PAPI, e.g.
SCOREP_METRIC_PAPI=PAPI_L1_DCM scan <run_command>
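Several counters can be recorded in one run by giving SCOREP_METRIC_PAPI a comma-separated list. The following is a minimal sketch, assuming a hypothetical executable ./poisson launched with mpirun on 4 processes, and assuming that both counters are reported by papi_avail -a on the target machine; adapt the launcher to the one used on your system.

```shell
# Comma-separated list of PAPI preset counters (check availability with papi_avail -a).
export SCOREP_METRIC_PAPI=PAPI_L1_DCM,PAPI_L2_DCM

# Hypothetical run command: adapt the launcher, process count and executable.
RUN_CMD="mpirun -np 4 ./poisson"
echo "SCOREP_METRIC_PAPI=$SCOREP_METRIC_PAPI scan $RUN_CMD"

# Start the measurement only when Scalasca is loaded and the executable exists.
if command -v scan >/dev/null 2>&1 && [ -x ./poisson ]; then
    scan $RUN_CMD
fi
```

The number of counters that can be measured together is limited by the hardware, so long lists may need to be split over several runs.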
The list of available hardware counters can be obtained using the command
papi_avail -a
The Scalasca tool scan creates a subdirectory where it stores the output in a Cube file. Such files can be analysed using the Scalasca graphical tool square:
square <path_to_cube_file>
Alternatively, one can use command line tools like cube_stat to display the content of Cube files, e.g.
cube_stat -p <path_to_cube_file>
cube_stat -m PAPI_L1_DCM -p <path_to_cube_file>
cube_stat -r <name_of_routine> -p <path_to_cube_file>
Here the first invocation returns the default metric, i.e. time, while the second invocation returns the PAPI_L1_DCM counter measurement. The third invocation returns the execution time of a specific routine.
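When several counters were recorded, the cube_stat query can be repeated for each of them in a small loop. The sketch below uses only the -m and -p options shown above; the Cube file path and the list of metrics are hypothetical placeholders that have to match your own experiment.

```shell
# Hypothetical placeholders: set these to match your own experiment.
CUBE_FILE=path/to/profile.cubex
METRICS="PAPI_L1_DCM PAPI_L2_DCM"

# Build one cube_stat call per recorded metric.
for metric in $METRICS; do
    cmd="cube_stat -m $metric -p $CUBE_FILE"
    echo "$cmd"
    # Run the query only when cube_stat is loaded and the Cube file exists.
    if command -v cube_stat >/dev/null 2>&1 && [ -f "$CUBE_FILE" ]; then
        $cmd
    fi
done
```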
For further information consult the Scalasca Quick Reference or User Guide, which you can find on the Scalasca 2.x documentation page.