From 5c835a4dd97ad8f5f7a38c1268486505bac4cf47 Mon Sep 17 00:00:00 2001
From: Xin Li <lixin.reco@gmail.com>
Date: Mon, 23 Aug 2021 15:29:19 +0200
Subject: [PATCH] intro_lab: fix OpenMP compiler flags and runtime routine hints

---
 intro_lab/README.md      | 14 +++++++-------
 intro_lab/pi.f90         |  3 ++-
 intro_lab/stream-triad.c |  8 ++++----
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/intro_lab/README.md b/intro_lab/README.md
index 55c3332..47053af 100644
--- a/intro_lab/README.md
+++ b/intro_lab/README.md
@@ -9,14 +9,14 @@ To run your code, you need first to generate your executable. It is very importa
 To compile your C OpenMP code using ``gcc``, therefore, use
 
 ```
-gcc -O2 -openmp -o myprog.x myprog.c -lm
+gcc -O2 -fopenmp -o myprog.x myprog.c -lm
 ```
 
 In Fortran, it is recommended to use the Intel compiler
 
 ```
 module load i-compilers
-ifort -O2 -fopenmp -o myprog.x myprog.f90 -lm
+ifort -O2 -qopenmp -o myprog.x myprog.f90 -lm
 ```
 
 To run your code, you will need to have an (e.g., interactive) allocation:
@@ -83,10 +83,10 @@ This implementation performs repeated execution of the benchmarked kernel to mak
 ### Tasks and questions to be addressed
 
 1) Create a parallel version of the programs using a parallel construct: ``#pragma omp parallel for``. In addition to a parallel construct, you might need some runtime library routines:
-   - ``int omp_get_num_threads()`` to get the number of threads in a team
+   - ``int omp_get_max_threads()`` to get the upper bound on the number of threads a new parallel region can use
    - ``int omp_get_thread_num()`` to get thread ID
    - ``double omp_get_wtime()`` to get the time in seconds since a fixed point in the past
-   - ``omp_set_num_threads()`` to request a number of threads in a team
+   - ``omp_set_num_threads()`` to set the number of threads used by subsequent parallel regions
 2) Run the parallel code and take the execution time with 1, 2, 4, 12, 24 threads for different array length ``N``. Record the timing.
 3) Produce a plot showing execution time as a function of array length for different number of threads.
 4) How large does ``N`` has to be for using 2 threads becoming more beneficial compared to a single thread?
@@ -132,10 +132,10 @@ A simple serial C code to calculate $\pi$ is the following:
 ### Tasks and questions to be addressed
 
 1) Create a parallel version of the [pi.c](pi.c) / [pi.f90](pi.f90) program using a parallel construct: ``#pragma omp parallel``.  Pay close attention to shared versus private variables. In addition to a parallel construct, you might need some runtime library routines:
-   - ``int omp_get_num_threads()`` to get the number of threads in a team
+   - ``int omp_get_max_threads()`` to get the upper bound on the number of threads a new parallel region can use
    - ``int omp_get_thread_num()`` to get thread ID
    - ``double omp_get_wtime()`` to get the time in seconds since a fixed point in the past
-   - ``omp_set_num_threads()`` to request a number of threads in a team
+   - ``omp_set_num_threads()`` to set the number of threads used by subsequent parallel regions
 2) Run the parallel code and take the execution time with 1, 2, 4, 8, 12, 24 threads. Record the timing.
 3) How does the execution time change varying the number of threads? Is it what you expected? If not, why do you think it is so?
 4) Is there any technique you heard of in class to improve the scalability of the technique? How would you implement it?
@@ -180,4 +180,4 @@ Here we are going to implement a fourth parallel version of the [pi.c](pi.c) / [
 
 Hints:
 
-- To change the schedule, you can either change the environment variable with ``export OMP_SCHEDULE=type`` where ``type`` can be any of static, dynamic, guided or in the source code as ``omp parallel for schedule(type)``.
\ No newline at end of file
+- To change the schedule, either set the environment variable with ``export OMP_SCHEDULE=type``, where ``type`` is one of ``static``, ``dynamic`` or ``guided`` (this takes effect only for loops declared with ``schedule(runtime)``), or specify it in the source code as ``#pragma omp parallel for schedule(type)``.
diff --git a/intro_lab/pi.f90 b/intro_lab/pi.f90
index 127a5fe..0c8dcb2 100644
--- a/intro_lab/pi.f90
+++ b/intro_lab/pi.f90
@@ -27,6 +27,7 @@ enddo
 pi = pi * 4.0D0 * dx
 run_time = OMP_GET_WTIME() - start_time
 ref_pi = 4.0D0 * atan(1.0D0)
-print '("pi with ", i0, " steps is ", f16.10, " in ", f12.6, " seconds (error=", e12.6, ")")', NSTEPS, pi, run_time, abs(ref_pi - pi)
+print '("pi with ", i0, " steps is ", f16.10, " in ", f12.6, " seconds (error=", e12.6, ")")', &
+    NSTEPS, pi, run_time, abs(ref_pi - pi)
 
 end program
diff --git a/intro_lab/stream-triad.c b/intro_lab/stream-triad.c
index 9ee42ac..1a77a89 100644
--- a/intro_lab/stream-triad.c
+++ b/intro_lab/stream-triad.c
@@ -14,18 +14,18 @@ int main() {
     int i, j;
     double s;
 
-     /* Initialise b, c and s */
+    /* Initialise b, c and s */
     s = 0.1;
     for (i = 0; i < N; i++) {
         b[i] = (double) i;
         c[i] = (double) i;
     }
 
-     /* Run benchmark loop M times */
-     for (j = 0; j < M; j++) {
+    /* Run benchmark loop M times */
+    for (j = 0; j < M; j++) {
         for (i = 0; i < N; i++)
             a[i] = b[i] + s * c[i];
     }
 
     return 0;
-}
\ No newline at end of file
+}
-- 
GitLab