diff --git a/advanced_lab/README.md b/advanced_lab/README.md
index 8ce624e287f138fe1fefaf7e9e3c2d287bf8a12e..a5dca5f3f0f05f6f1774638e2b01ec47b3d18570 100644
--- a/advanced_lab/README.md
+++ b/advanced_lab/README.md
@@ -8,7 +8,7 @@ Your task is to parallelize a finite-volume solver for the two dimensional shall
 
 ## Algorithm
 
-For this exercise we solve the shallow water equations on a square domain using a simple dimensional splitting approach. Updating volumes *Q* with numerical fluxes *F* and *G*, first in the x and then in the y direction, more easily expressed with the following pseudo-code
+For this exercise we solve the shallow water equations on a square domain using a simple dimensional splitting approach. Updating volumes *Q* with numerical fluxes *F* and *G*, first in the x and then in the y direction, more easily expressed with the following pseudo-code:
 
 ```
 for each time step do
@@ -30,8 +30,7 @@ Choose to work with either the given serial C/Fortran 90 code or, if you think y
 
 ## 1. Parallelize the code
 
-A serial version of the code is provided here: [shwater2d.c](c/shwater2d.c) or [shwater2d.f](f90/shwater2d.f90). Remember not to try parallelising everything.
- add OpenMP statements to make it run in parallel and make sure the computed solution is correct.Do not parallelize everything Some advices are provided below.
+A serial version of the code is provided here: [shwater2d.c](c/shwater2d.c) or [shwater2d.f](f90/shwater2d.f90). Remember not to try parallelising everything. Add OpenMP statements to make it run in parallel and make sure the computed solution is correct. Some advices are provided below.
 
 ### Tasks and questions to be addressed
 
@@ -67,23 +66,22 @@ and
 
 _Hint: How are threads created/destroyed by OpenMP? How can it impact performance?_
 
-## 2. Measure parallel performance.
+## 2. Measure parallel performance
 
-In this exercise, we explore parallel performance refers to the computational speed-up *S*<sub>n</sub>_ = $\Delta$*T*<sub>1</sub>/$\Delta$*T*<sub>n</sub>_, using _n_ threads.
+In this exercise, we explore parallel performance refers to the computational speed-up *S*<sub>*n*</sub> = ($\Delta$*T*<sub>1</sub>/$\Delta$*T*<sub>*n*</sub>), where *n* is the number of threads.
 
 ### Tasks and questions to be addressed
 
-1) Measure run time $\Delta$T for 1, 2, ..., 24 threads and calculate the speed-up.
+1) Measure run time $\Delta$*T*<sub>*n*</sub> for *n* = 1, 2, ..., 24 threads and calculate the speed-up.
 2) Is it linear? If not, why?
 3) Finally, is the obtained speed-up acceptable?
 4) Try to increase the space discretization (M,N) and see if it affects the speed-up.
 
 Recall from the OpenMP exercises that the number of threads is determined by an environment variable ``OMP_NUM_THREADS``. One could change the variable or use the shell script provided in Appendix B.
 
-### 3. Optimize the code.
+## 3. Optimize the code
 
-The given serial code is not optimal, why? If you have time, go ahead and try to make it faster. Try to decrease the serial run time. Once the serial
-performance is optimal, redo the speedup measurements and comment on the result.
+The given serial code is not optimal, why? If you have time, go ahead and try to make it faster. Try to decrease the serial run time. Once the serial performance is optimal, redo the speedup measurements and comment on the result.
 
 For debugging purposes you might want to visualize the computed solution. Uncomment the line ``save_vtk``. The result will be stored in ``result.vtk``, which can be opened in ParaView, available on Tegner after ``module add paraview``. Beware that the resulting file could be rather large, unless the space discretization (M,N) is decreased.
 
@@ -105,10 +103,23 @@ where *f* and *g* are the flux functions, derived from (1). For simplicity we us
 
 <img src="image/eq_4.png" alt="Eq_4" width="800px"/>
 
-## B. Run script for changing ``OMP_NUM_THREADS``
+## B. Run scripts for changing ``OMP_NUM_THREADS``
+
+Bash:
+
+```
+#!/bin/bash
+
+for n in `seq 1 1 16`; do
+    OMP_NUM_THREADS=$n srun -n 1 ./a.out
+done
+```
+
+C shell:
 
 ```
 #!/bin/csh
+
 foreach n (`seq 1 1 16`)
     env OMP_NUM_THREADS=$n srun -n 1 ./a.out
 end
diff --git a/advanced_lab/README.pdf b/advanced_lab/README.pdf
index 7e1b046a337f3fc8809487904ddc36ad07634832..9fa3fde22a814f2a5ad3fcf34359f3cb77689943 100644
Binary files a/advanced_lab/README.pdf and b/advanced_lab/README.pdf differ
diff --git a/intro_lab/README.pdf b/intro_lab/README.pdf
deleted file mode 100644
index a9988a97b64cc46b1cff42c65b5ce9b8a9197aa7..0000000000000000000000000000000000000000
Binary files a/intro_lab/README.pdf and /dev/null differ