CHAPTER 3
Conditional Compilation

C / C++

The following example illustrates the use of conditional compilation using the OpenMP macro _OPENMP. With OpenMP compilation, the _OPENMP macro becomes defined.

Example cond_comp.1c

    #include <stdio.h>

    int main()
    {

    # ifdef _OPENMP
        printf("Compiled by an OpenMP-compliant implementation.\n");
    # endif

        return 0;
    }

C / C++

Fortran

The following example illustrates the use of the conditional compilation sentinel. With OpenMP compilation, the conditional compilation sentinel !$ is recognized and treated as two spaces. In fixed form source, statements guarded by the sentinel must start after column 6.

Example cond_comp.1f

          PROGRAM EXAMPLE

    C234567890
    !$    PRINT *, "Compiled by an OpenMP-compliant implementation."

          END PROGRAM EXAMPLE

Fortran
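The same macro can also guard the inclusion of omp.h and calls to runtime library routines, so that one source file builds both with and without OpenMP. The following is a minimal sketch of that idea, not part of the original examples: the helper names openmp_version_date, max_threads_or_one, and report_build are hypothetical. It relies on the fact that _OPENMP, when defined, expands to a decimal yyyymm value giving the date of the supported specification version.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>   /* only included when compiling with OpenMP enabled */
#endif

/* Hypothetical helper: value of _OPENMP (yyyymm), or 0 for a serial build. */
long openmp_version_date(void)
{
#ifdef _OPENMP
    return (long)_OPENMP;
#else
    return 0L;
#endif
}

/* Hypothetical helper: upper bound on the team size, or 1 for a serial build. */
int max_threads_or_one(void)
{
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

/* Hypothetical helper: report how this translation unit was compiled. */
void report_build(void)
{
    long date = openmp_version_date();

    if (date != 0L)
        printf("OpenMP build, _OPENMP=%ld, up to %d thread(s)\n",
               date, max_threads_or_one());
    else
        printf("Serial build, 1 thread\n");
}
```

Calling report_build() from main prints one line describing the build. Which line appears depends only on whether the compiler was invoked with OpenMP enabled.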
CHAPTER 4
Internal Control Variables (ICVs)

According to Section 2.3 of the OpenMP 4.0 specification, an OpenMP implementation must act as if there are ICVs that control the behavior of the program. This example illustrates two ICVs, nthreads-var and max-active-levels-var. The nthreads-var ICV controls the number of threads requested for encountered parallel regions; there is one copy of this ICV per task. The max-active-levels-var ICV controls the maximum number of nested active parallel regions; there is one copy of this ICV for the whole program.

In the following example, the nest-var, max-active-levels-var, dyn-var, and nthreads-var ICVs are modified through calls to the runtime library routines omp_set_nested, omp_set_max_active_levels, omp_set_dynamic, and omp_set_num_threads respectively. These ICVs affect the operation of parallel regions. Each implicit task generated by a parallel region has its own copy of the nest-var, dyn-var, and nthreads-var ICVs.

In the following example, the new value of nthreads-var applies only to the implicit tasks that execute the call to omp_set_num_threads. There is one copy of the max-active-levels-var ICV for the whole program and its value is the same for all tasks. This example assumes that nested parallelism is supported.

The outer parallel region creates a team of two threads; each of the threads will execute one of the two implicit tasks generated by the outer parallel region.

Each implicit task generated by the outer parallel region calls omp_set_num_threads(3), assigning the value 3 to its respective copy of nthreads-var. Then each implicit task encounters an inner parallel region that creates a team of three threads; each of the threads will execute one of the three implicit tasks generated by that inner parallel region.

Since the outer parallel region is executed by 2 threads, and the inner by 3, there will be a total of 6 implicit tasks generated by the two inner parallel regions.

Each implicit task generated by an inner parallel region will execute the call to omp_set_num_threads(4), assigning the value 4 to its respective copy of nthreads-var.
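The count of six implicit tasks can be checked with a small sketch that increments a shared counter once per inner implicit task. This is an illustration under stated assumptions, not part of the original example: count_inner_implicit_tasks is a hypothetical helper, and the _OPENMP guards are added so that the file also builds without OpenMP, in which case the pragmas are ignored and the count is 1. With OpenMP enabled and nested parallelism supported, the count should be 2 x 3 = 6; an implementation that supplies fewer threads would report a smaller number.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: count the implicit tasks generated by all
   inner parallel regions, using the same ICV settings as the example. */
int count_inner_implicit_tasks(void)
{
    int count = 0;

#ifdef _OPENMP
    omp_set_nested(1);              /* nest-var: allow nested parallelism */
    omp_set_max_active_levels(8);   /* max-active-levels-var */
    omp_set_dynamic(0);             /* dyn-var: no dynamic adjustment */
    omp_set_num_threads(2);         /* nthreads-var of the initial task */
#endif

    #pragma omp parallel shared(count)
    {
#ifdef _OPENMP
        /* Each outer implicit task sets its own copy of nthreads-var. */
        omp_set_num_threads(3);
#endif
        #pragma omp parallel shared(count)
        {
            /* Executed once by every inner implicit task. */
            #pragma omp atomic
            count++;
        }
    }
    return count;
}
```

The atomic construct is used here only to make the shared increment safe; it does not affect how many implicit tasks are generated.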
The print statement in the outer parallel region is executed by only one of the threads in the team. So it will be executed only once.

The print statement in an inner parallel region is also executed by only one of the threads in the team. Since we have a total of two inner parallel regions, the print statement will be executed twice, once per inner parallel region.

C / C++

Example icv.1c

    #include <stdio.h>
    #include <omp.h>

    int main (void)
    {
      omp_set_nested(1);
      omp_set_max_active_levels(8);
      omp_set_dynamic(0);
      omp_set_num_threads(2);
      #pragma omp parallel
      {
        omp_set_num_threads(3);

        #pragma omp parallel
        {
          omp_set_num_threads(4);
          #pragma omp single
          {
            /*
             * The following should print:
             * Inner: max_act_lev=8, num_thds=3, max_thds=4
             * Inner: max_act_lev=8, num_thds=3, max_thds=4
             */
            printf ("Inner: max_act_lev=%d, num_thds=%d, max_thds=%d\n",
                    omp_get_max_active_levels(), omp_get_num_threads(),
                    omp_get_max_threads());
          }
        }

        #pragma omp barrier
        #pragma omp single
        {
          /*
           * The following should print:
           * Outer: max_act_lev=8, num_thds=2, max_thds=3
           */
          printf ("Outer: max_act_lev=%d, num_thds=%d, max_thds=%d\n",
                  omp_get_max_active_levels(), omp_get_num_threads(),
                  omp_get_max_threads());
        }
      }
      return 0;
    }

C / C++

Fortran

Example icv.1f

          program icv
          use omp_lib

          call omp_set_nested(.true.)
          call omp_set_max_active_levels(8)
          call omp_set_dynamic(.false.)
          call omp_set_num_threads(2)

    !$omp parallel
          call omp_set_num_threads(3)

    !$omp parallel
          call omp_set_num_threads(4)
    !$omp single
    !      The following should print:
    !      Inner: max_act_lev= 8 , num_thds= 3 , max_thds= 4
    !      Inner: max_act_lev= 8 , num_thds= 3 , max_thds= 4
          print *, "Inner: max_act_lev=", omp_get_max_active_levels(),
         &         ", num_thds=", omp_get_num_threads(),
         &         ", max_thds=", omp_get_max_threads()
    !$omp end single
    !$omp end parallel

    !$omp barrier
    !$omp single
    !      The following should print:
    !      Outer: max_act_lev= 8 , num_thds= 2 , max_thds= 3
          print *, "Outer: max_act_lev=", omp_get_max_active_levels(),
         &         ", num_thds=", omp_get_num_threads(),
         &         ", max_thds=", omp_get_max_threads()
    !$omp end single
    !$omp end parallel
          end

Fortran

OpenMP Examples Version 4.0.2 - March 2015
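The statement that a new value of nthreads-var applies only to the implicit tasks that execute the call to omp_set_num_threads can be observed in isolation. The sketch below is an illustration, not part of the original example: observe_nthreads_var is a hypothetical helper, and the serial fallback under #else is an added assumption so that the file builds without OpenMP. Only thread 0 of a two-thread team calls omp_set_num_threads(3); each thread then records omp_get_max_threads(), which returns the value of nthreads-var for the current task.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: seen[i] receives omp_get_max_threads() as observed
   by thread i of a two-thread team after only thread 0 has called
   omp_set_num_threads(3). Returns the team size actually obtained. */
int observe_nthreads_var(int seen[2])
{
    int team_size = 1;

    seen[0] = 0;
    seen[1] = 0;

#ifdef _OPENMP
    omp_set_dynamic(0);
    omp_set_num_threads(2);

    #pragma omp parallel shared(seen, team_size)
    {
        int iam = omp_get_thread_num();

        if (iam == 0) {
            omp_set_num_threads(3);   /* changes thread 0's copy only */
            team_size = omp_get_num_threads();
        }
        if (iam < 2)
            seen[iam] = omp_get_max_threads();
    }
#else
    seen[0] = 1;   /* serial build: a single thread and no ICVs */
#endif
    return team_size;
}
```

With OpenMP enabled and a team of two threads, seen[0] should be 3 and seen[1] should remain 2, the value inherited from the task that encountered the parallel region.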
CHAPTER 5
The parallel Construct

The parallel construct can be used in coarse-grain parallel programs. In the following example, each thread in the parallel region decides what part of the global array x to work on, based on the thread number:

C / C++

Example parallel.1c

    #include <omp.h>

    void subdomain(float *x, int istart, int ipoints)
    {
      int i;

      for (i = 0; i < ipoints; i++)
          x[istart+i] = 123.456;
    }

    void sub(float *x, int npoints)
    {
        int iam, nt, ipoints, istart;

    #pragma omp parallel default(shared) private(iam,nt,ipoints,istart)
        {
            iam = omp_get_thread_num();
            nt =  omp_get_num_threads();
            ipoints = npoints / nt;    /* size of partition */
            istart = iam * ipoints;  /* starting array index */
            if (iam == nt-1)     /* last thread may do more */
              ipoints = npoints - istart;
            subdomain(x, istart, ipoints);
        }
    }
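The partition arithmetic used inside sub can be examined without any threads. The sketch below is an illustration, not part of the original example: partition and covered_points are hypothetical helpers that repeat the same integer arithmetic for a given thread number and then check that the blocks are contiguous and together cover every point.

```c
/* Hypothetical helper mirroring the arithmetic in sub():
   compute the block owned by thread iam in a team of nt threads. */
void partition(int npoints, int nt, int iam, int *istart, int *ipoints)
{
    *ipoints = npoints / nt;          /* size of partition */
    *istart = iam * (*ipoints);       /* starting array index */
    if (iam == nt - 1)                /* last thread takes the remainder */
        *ipoints = npoints - *istart;
}

/* Hypothetical helper: total number of points assigned across all
   threads, or -1 if the blocks are not contiguous. */
int covered_points(int npoints, int nt)
{
    int total = 0;
    int expected_start = 0;
    int iam;

    for (iam = 0; iam < nt; iam++) {
        int istart, ipoints;

        partition(npoints, nt, iam, &istart, &ipoints);
        if (istart != expected_start)
            return -1;
        expected_start += ipoints;
        total += ipoints;
    }
    return total;
}
```

For npoints = 10 and nt = 4, threads 0 to 2 each receive 2 points and the last thread receives 4, starting at index 6. The scheme is simple, but the last thread can carry up to nt-1 extra points, so the load is not perfectly balanced when npoints is not a multiple of nt.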