In this exercise, you learn about the heart of MPI: point-to-point message-passing routines in both their blocking and non-blocking forms as well as the various modes of communication.
Your task is to parallelize the "Parallel Search" problem. In the parallel search problem, the program should find all occurrences of a certain integer, called the target. It should then write the target value, the indices, and the number of occurrences to an output file. In addition, the program should read both the target value and all the array elements from an input file.
Hint: One issue that comes up when parallelizing a serial code is handling I/O. As you can imagine, having multiple processes writing to the same file at the same time can produce useless results. A simple solution is to have each process write to an output file named with its rank. Output to these separate files removes the problem. Here is how to do that in C and Fortran:
# Overview

In this lab, you'll get familiar with MPI's Collective Communication routines, using them on programs you previously wrote with point-to-point calls. You'll also explore non-blocking behavior.
### Goals

Get familiar with MPI Collective Communication routines and non-blocking calls
- Calculation of PI: Serial C and Fortran ([pi_serial.c](pi_serial.c) and [pi_serial.f90](pi_serial.f90))
- Send data across all processes: No source provided
- Parallel Search: Serial C and Fortran ([parallel_search-serial.c](parallel_search-serial.c) and [parallel_search-serial.f90](parallel_search-serial.f90))
- Input file used in the Parallel Search program: [b.data](b.data)
- Output file from the Parallel Search program: [reference.found.data](reference.found.data)
- Game of Life: Serial C and Fortran ([game_of_life-serial.c](game_of_life-serial.c) and [game_of_life-serial.f90](game_of_life-serial.f90))
# Preparation

In preparation for this lab, read the [general instructions](../README.md) which will help you get going on Beskow.
# Exercise 1: Calculate π Using Collectives

This exercise calculates π using a "dartboard" algorithm. If you're unfamiliar with this algorithm, check out the Wikipedia page on
[Monte Carlo Integration](http://en.wikipedia.org/wiki/Monte_Carlo_Integration) or
*Fox et al. (1988), Solving Problems on Concurrent Processors, vol. 1, page 207.*
Hint: All processes should contribute to the calculation, with the master averaging the values for π. Consider using `mpi_reduce` to collect results.
# Exercise 2: Send Data Across All Processes Using Non-Blocking Communications
To see what happens without synchronization, leave out the `wait`.
# Exercise 3: Find π Using Non-Blocking Communications

Use a non-blocking send to try to overlap communication and computation. Take the code from Exercise 1 as your starting point.
# Exercise 4: Implement the "Parallel Search" and "Game of Life" Using Collectives

In almost every MPI program there are instances where all the processors in a communicator need to perform some sort of data transfer or calculation. These "collective communication" routines are the subject of this exercise, and the "Parallel Search" and "Game of Life" programs are no exception.
### Your First Challenge
Modify your previous "Parallel Search" code to change how the master first sends out the target and subarray data to the slaves. Use the MPI broadcast routines to give each slave the target. Use the MPI scatter routine to give each processor the section of the array `b` it will search.

Hint: When you use the standard MPI scatter routine, you will see that the global array `b` is split into four parts, and the master process now has the first fourth of the array to search. You should therefore add a search loop (similar to the workers') in the master section of code to search for the target and write its results to the output file. This is actually an improvement in performance, since all the processors perform part of the search in parallel.
### Your Second Challenge
Modify your previous "Game of Life" code to use `mpi_reduce` to compute the total number of live cells, rather than individual sends and receives.
# Solutions

The examples in this lab are provided for educational purposes by
We would like to thank them for allowing us to develop the material for machines at PDC.
You might find other useful educational materials at these sites.