Overview
In this lab you will become more familiar with advanced MPI topics, including one-sided communication and MPI I/O.
Goals
Gain experience with MPI one-sided communication, MPI I/O, and MPI topologies.
Duration
Three hours
Source Codes
- MPI One-sided. Prototype in C and Fortran (game_of_life-one_sided-prototype.c and game_of_life-one_sided-prototype.f90)
- MPI Topology. Simple 1D topology example in C and Fortran (simple_1d_topology.c and simple_1d_topology.f90)
- MPI I/O. Serial hello world in C and Fortran (hello_mpi.c and hello_mpi.f90)
- MPI Latency. C and Fortran (mpi_latency.c and mpi_latency.f90)
- MPI Bandwidth. C and Fortran (mpi_bandwidth.c and mpi_bandwidth.f90)
- MPI Bandwidth, Non-Blocking. C and Fortran (mpi_bandwidth-nonblock.c and mpi_bandwidth-nonblock.f90)
Preparation
In preparation for this lab, read the general instructions which will help you get going on Beskow.
Exercise 1 - One sided communication
Take the prototype one-sided communication code and complete it by adding the correct one-sided MPI calls so that the program works. The number of live cells after the calculation should be the same for any number of tasks that evenly divides the grid. The solution provided towards the end of the class uses MPI_Get, but something similar could also be done with MPI_Put.
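For orientation only, the one-sided exchange could be structured roughly as in the sketch below. This is an illustration, not the prototype or the solution: it assumes a 1D row decomposition with one halo row on each side, and the variable names, grid sizes, and data type are placeholders.

```c
/* halo_get_sketch.c -- illustrative only; names and layout are assumptions,
   not the prototype's actual variables. */
#include <mpi.h>
#include <stdlib.h>

#define NROWS_LOCAL 4   /* interior rows per rank (assumed) */
#define NCOLS       8   /* columns in the grid (assumed)    */

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Local grid with one halo row above (row 0) and below (row NROWS_LOCAL+1). */
    int *grid = calloc((NROWS_LOCAL + 2) * NCOLS, sizeof(int));

    /* Periodic neighbours, computed "by hand" as in the prototype. */
    int up   = (rank - 1 + size) % size;
    int down = (rank + 1) % size;

    /* Expose the whole local grid (including halos) in a window. */
    MPI_Win win;
    MPI_Win_create(grid, (NROWS_LOCAL + 2) * NCOLS * sizeof(int),
                   sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    MPI_Win_fence(0, win);
    /* Top halo <- last interior row of the rank above. */
    MPI_Get(&grid[0], NCOLS, MPI_INT, up,
            (MPI_Aint)NROWS_LOCAL * NCOLS, NCOLS, MPI_INT, win);
    /* Bottom halo <- first interior row of the rank below. */
    MPI_Get(&grid[(NROWS_LOCAL + 1) * NCOLS], NCOLS, MPI_INT, down,
            (MPI_Aint)NCOLS, NCOLS, MPI_INT, win);
    MPI_Win_fence(0, win);

    /* ... update the interior cells using the halo rows ... */

    MPI_Win_free(&win);
    free(grid);
    MPI_Finalize();
    return 0;
}
```

The pair of MPI_Win_fence calls delimits the access epoch, so the halo data is guaranteed to have arrived before the grid update starts.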
Exercise 2 - Topologies
Part A
Run the simple topology example program and make sure you understand how it works. Notice that the rank order in the MPI_COMM_WORLD communicator is not necessarily the same as in the cart_comm communicator.
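For reference, the essential calls in such a program typically look like the following sketch (a minimal illustration with placeholder names, not the provided example code). Passing reorder = 1 to MPI_Cart_create is what allows the ranks in cart_comm to differ from those in MPI_COMM_WORLD.

```c
/* cart_reorder_sketch.c -- illustrative sketch, not the provided example. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int size, world_rank, cart_rank;
    int dims[1] = {0}, periods[1] = {1};   /* periodic 1D topology */
    MPI_Comm cart_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Let MPI choose the dimension size and allow it to reorder ranks. */
    MPI_Dims_create(size, 1, dims);
    MPI_Cart_create(MPI_COMM_WORLD, 1, dims, periods, 1, &cart_comm);
    MPI_Comm_rank(cart_comm, &cart_rank);

    printf("world rank %d -> cart rank %d\n", world_rank, cart_rank);

    MPI_Comm_free(&cart_comm);
    MPI_Finalize();
    return 0;
}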
Part B
The code in Exercise 1 uses a simple, manually implemented "topology". Re-implement the calculation of which MPI task to read the halo data from using the MPI topology functions: set up a simple periodic 1D topology and then use MPI_Cart_shift to obtain the ranks of the neighbours to get the data from.
Note that a task's position in the new topology is not necessarily the same as its position in MPI_COMM_WORLD, so make sure that the initial grid setup reflects that. A sketch of the relevant calls is given below.
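The following is a minimal sketch of the topology setup and neighbour lookup, assuming a 1D periodic decomposition; the variable names are placeholders, and wiring it into the game of life code is left to you.

```c
/* cart_shift_sketch.c -- illustrative sketch of using MPI_Cart_shift
   to find the neighbours for the halo exchange in Exercise 1. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    int size, cart_rank, up, down;
    int dims[1] = {0}, periods[1] = {1};   /* periodic in the one dimension */
    MPI_Comm cart_comm;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    MPI_Dims_create(size, 1, dims);
    MPI_Cart_create(MPI_COMM_WORLD, 1, dims, periods, 1, &cart_comm);
    MPI_Comm_rank(cart_comm, &cart_rank);

    /* Shift by one step along dimension 0: 'up' is the source rank,
       'down' the destination rank of the shift. */
    MPI_Cart_shift(cart_comm, 0, 1, &up, &down);

    printf("cart rank %d: up neighbour %d, down neighbour %d\n",
           cart_rank, up, down);

    /* Use cart_rank (not the MPI_COMM_WORLD rank) when setting up the
       initial grid, and up/down as the targets of the one-sided calls. */
    MPI_Comm_free(&cart_comm);
    MPI_Finalize();
    return 0;
}
```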
Exercise 3 - MPI I/O
MPI I/O allows results to be written to the same file in parallel. Take the serial hello world programs and modify them so that, instead of writing the output to the screen, the output is written to a file using MPI I/O.
The simplest solution is probably to create a character buffer and then use the MPI_File_write_at function.
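A minimal sketch of this approach is shown below. It assumes each rank writes a fixed-length record at an offset proportional to its rank; the file name and record length are arbitrary choices, not part of the exercise.

```c
/* hello_mpiio_sketch.c -- a minimal sketch, not the provided solution. */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

#define MSG_LEN 40   /* fixed record length per rank (an assumption) */

int main(int argc, char *argv[])
{
    int rank, n;
    char buf[MSG_LEN];
    MPI_File fh;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Build a fixed-length record, padded with spaces and ending in '\n'. */
    memset(buf, ' ', MSG_LEN);
    n = snprintf(buf, MSG_LEN, "Hello world from rank %d", rank);
    buf[n] = ' ';              /* overwrite the '\0' left by snprintf */
    buf[MSG_LEN - 1] = '\n';

    /* All ranks open the same file and write at their own offset. */
    MPI_File_open(MPI_COMM_WORLD, "hello.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at(fh, (MPI_Offset)rank * MSG_LEN, buf, MSG_LEN,
                      MPI_CHAR, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}
```

After running on a few tasks you can inspect the file (e.g. with cat) to check that every rank's line is present exactly once.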
Exercise 4 - Bandwidth and latency between nodes
Use MPI_Wtime to compute latency and bandwidth with the bandwidth and latency codes listed above.
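As a reminder, MPI_Wtime returns wall-clock time in seconds, so the timing simply brackets the communication loop, roughly as in the sketch below (a minimal ping-pong illustration with placeholder message size and repetition count, not the LLNL code).

```c
/* pingpong_sketch.c -- minimal MPI_Wtime timing sketch; run with 2 tasks. */
#include <mpi.h>
#include <stdio.h>

#define REPS   1000   /* number of round trips (an assumption) */
#define NBYTES 1      /* message size in bytes (an assumption) */

int main(int argc, char *argv[])
{
    int rank;
    char buf[NBYTES] = {0};
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (int i = 0; i < REPS; i++) {
        if (rank == 0) {
            MPI_Send(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, NBYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Send(buf, NBYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("average one-way latency: %g microseconds\n",
               (t1 - t0) / (2.0 * REPS) * 1.0e6);

    MPI_Finalize();
    return 0;
}
```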
Note: in adapting the original exercises provided by LLNL, we had to make a small change to the latency code, because the Cray's latency is much better than the test was designed for: when the latency is well below a millisecond, writing it out as an integer number of milliseconds does not make much sense.
For this exercise it is instructive to compare running on the same node, e.g.
salloc -N 1 --ntasks-per-node=2 -A <project> -t 00:05:00
srun -n 2 ./mpi_latency.x
with running on separate nodes
salloc -N 2 --ntasks-per-node=1 -A <project> -t 00:05:00
srun -n 2 ./mpi_latency.x
Similarly for the bandwidth.
As you would expect, the latency is much better on a single node than across nodes. Perhaps unexpectedly, with only two MPI tasks the bandwidth is better between nodes than within a single node (probably due to reduced contention for shared resources such as the Gemini network chips and the L3 cache).
Solutions
The solutions will be made available at the end of the lab.
Acknowledgment
The examples in this lab are provided for educational purposes by the National Center for Supercomputing Applications (in particular their Cyberinfrastructure Tutor), Lawrence Livermore National Laboratory, and Argonne National Laboratory. Much of the LLNL MPI material comes from the Cornell Theory Center. We would like to thank them for allowing us to adapt the material for the machines at PDC. You might find other useful educational materials at these sites.