OpenMP Parallel Programming
Jyothi Krishna V S
Jan 23, 2018
OpenMP API
Not a parallel programming language.
Multithreaded, shared-memory parallelism.
OpenMP specifications: latest is 4.5 (Nov 2015).
The specification covers:
    Data environment
    Work sharing
    Synchronization
Behaviour is undefined for all programs that are non-compliant.
OpenMP - C/C++
#pragma omp: compiler directives. For example, #pragma omp parallel creates a parallel region.
Number of threads, in order of precedence: set in the program > environment variable (OMP_NUM_THREADS) > system default (number of hardware threads).
Header file: #include <omp.h>
gcc flag: -fopenmp (used both when compiling and when linking).
gcc version 6.4 is complete with respect to the OpenMP 4.5 specs. If your gcc version is lower, some of the OpenMP 4.5 features might not work.
Fork Join Model
Master thread: thread id 0.
Runtime functions: omp_get_num_threads(), omp_get_thread_num(), omp_set_num_threads()
Image source: https://computing.llnl.gov/tutorials/openMP/
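A minimal sketch of the fork-join calls listed above. The helper name team_ids_are_unique, the MAX_TEAM bound, and the serial fallback stubs are assumptions made for illustration. It forks a team, lets every thread record its id, and checks after the join that the ids were exactly 0 .. team size - 1.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (an assumption of this sketch) so it also builds
   without -fopenmp, where the pragmas are simply ignored. */
static int  omp_get_thread_num(void)   { return 0; }
static int  omp_get_num_threads(void)  { return 1; }
static void omp_set_num_threads(int n) { (void)n; }
#endif

#define MAX_TEAM 64   /* illustrative upper bound on the team size */

/* Hypothetical helper: returns 1 if every thread of the team saw a
   distinct id in the range 0 .. team_size-1, and 0 otherwise. */
int team_ids_are_unique(int requested)
{
    assert(requested > 0 && requested <= MAX_TEAM);

    int seen[MAX_TEAM] = {0};
    int team_size = 0;

    omp_set_num_threads(requested);          /* a request, not a guarantee */

    #pragma omp parallel shared(seen, team_size)   /* fork */
    {
        int id = omp_get_thread_num();       /* the master thread has id 0 */
        if (id == 0)
            team_size = omp_get_num_threads();
        seen[id] += 1;                       /* each thread writes its own slot */
    }                                        /* join: implicit barrier */

    for (int i = 0; i < team_size; i++)
        if (seen[i] != 1)
            return 0;
    return team_size >= 1;
}
```

Built with gcc -fopenmp, team_ids_are_unique(4) runs with at most four threads; the runtime may provide fewer than requested, which is why the team size is read inside the region.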
Hello World
#include <stdio.h>
#include <omp.h>

int main()
{
    #pragma omp parallel
    {
        printf("Hello World \n");
    }
    return 0;
}
Output (considering 4 threads): Hello World (printed once by each thread)
Data Environment
[Figure: Thread 0 and Thread 1 each have their own private memory (P Memory); both access the shared memory.]
Shared environment directives: private, shared
    private: #pragma omp parallel private(i)
    firstprivate / lastprivate: private copies with copy-in of the original value (firstprivate) or copy-out of the value from the last iteration (lastprivate).
    reduction: on a shared variable and a reducible operation. reduction(op: varlist)
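The copy-in and copy-out behaviour of firstprivate and lastprivate can be sketched as follows. The function name last_iteration_value and its parameters are hypothetical, chosen only for this illustration.

```c
#include <assert.h>

/* Hypothetical helper: returns the value that lastprivate copies back
   to `last` after the loop, i.e. the value assigned by the
   sequentially last iteration (i == n-1). */
int last_iteration_value(int n, int offset)
{
    assert(n > 0);
    int last = -1;

    /* offset: firstprivate, so every thread's private copy starts with
               the caller's value (copy-in).
       last:   lastprivate, so the value from iteration n-1 is copied
               back to the original variable (copy-out). */
    #pragma omp parallel for firstprivate(offset) lastprivate(last)
    for (int i = 0; i < n; i++) {
        last = offset + i;
    }
    return last;
}
```

For example, last_iteration_value(100, 5) yields 104 regardless of which thread happened to run iteration 99.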
Work-Sharing Constructs
OpenMP for (#pragma omp for <clauses>): distributes the iterations of a loop among the threads.
OpenMP sections (#pragma omp sections <clauses>)
    Contains multiple omp section blocks (#pragma omp section).
OpenMP tasks (#pragma omp task <clauses>)
OpenMP single (#pragma omp single <clauses>)
Special mentions:
    #pragma omp simd
    #pragma omp master
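As a rough illustration of sections and single, the sketch below runs two independent assignments as separate sections and then lets exactly one thread update a counter. The function name run_sections_then_single and the values used are made up for this example.

```c
#include <assert.h>

/* Hypothetical helper: two sections run (possibly concurrently) on
   different threads, then one thread executes the single block. */
int run_sections_then_single(void)
{
    int a = 0, b = 0, announced = 0;

    #pragma omp parallel shared(a, b, announced)
    {
        #pragma omp sections
        {
            #pragma omp section
            a = 10;            /* executed once, by some thread */

            #pragma omp section
            b = 32;            /* may run at the same time on another thread */
        }                      /* implicit barrier at the end of sections */

        #pragma omp single
        announced += 1;        /* executed by exactly one thread of the team */
    }

    assert(announced == 1);    /* single ran once, not once per thread */
    return a + b;
}
```

The result is 42 for any team size, because each section and the single block execute once no matter how many threads are available.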
OpenMP For
#pragma omp parallel
{
    #pragma omp for schedule(..)
    for (i = 0; i < 100; i++)
    {
        printf("The itr is %d with threadid %d\n", i, omp_get_thread_num());
    }
}
Output: The itr is 0 with threadid 0
Another form: #pragma omp for schedule(dynamic) ordered
Clauses:
    schedule: static / dynamic / guided
    shared / private / firstprivate / lastprivate (shared is accepted only on the combined parallel for)
    ordered (iterations are unordered by default)
    reduction
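Whatever schedule is chosen, every iteration is executed exactly once by some thread. The sketch below checks this for a dynamic schedule; the helper name iterations_run_once, the MAX_N bound, and the chunk size of 4 are assumptions for illustration.

```c
#include <assert.h>

#define MAX_N 1000   /* illustrative upper bound on the loop length */

/* Hypothetical helper: returns how many of the n iterations were
   executed exactly once. */
int iterations_run_once(int n)
{
    assert(n > 0 && n <= MAX_N);
    int hits[MAX_N] = {0};

    #pragma omp parallel shared(hits)
    {
        /* dynamic: a thread that becomes idle grabs the next chunk
           of 4 iterations */
        #pragma omp for schedule(dynamic, 4)
        for (int i = 0; i < n; i++)
            hits[i] += 1;      /* one slot per iteration, so no data race */
    }

    int once = 0;
    for (int i = 0; i < n; i++)
        if (hits[i] == 1)
            once++;
    return once;
}
```

Replacing schedule(dynamic, 4) with static or guided changes which thread runs which iteration, but not the count returned.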
OpenMP Tasks
int fib(int n)
{
    int i, j;
    if (n < 2)
        return n;
    else
    {
        #pragma omp task shared(i)
        i = fib(n-1);
        #pragma omp task shared(j)
        j = fib(n-2);
        #pragma omp taskwait
        return i + j;
    }
}
Puts newly created tasks in a task pool.
Tasks are assigned to threads at certain task scheduling points.
taskwait: fence for the tasks created at this level (waits for the child tasks of the current task).
Binding: tied / untied.
taskyield: lets the current task be suspended in favour of other tasks; reduces contention.
Suited to recursive parallelism.
The task scheduling order is not deterministic.
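The fib() routine above only creates tasks; it still needs a parallel region to supply the threads that execute them. A common pattern, sketched here with a hypothetical driver named fib_parallel, is to let a single thread start the recursion inside a parallel region while the rest of the team picks up the generated tasks.

```c
#include <assert.h>

/* Task-parallel Fibonacci, same shape as the fib() shown above. */
static int fib(int n)
{
    int i, j;
    if (n < 2)
        return n;

    #pragma omp task shared(i)
    i = fib(n - 1);
    #pragma omp task shared(j)
    j = fib(n - 2);
    #pragma omp taskwait       /* wait for both child tasks before adding */
    return i + j;
}

/* Hypothetical driver: one thread seeds the task tree, the other
   threads of the team execute tasks from the pool. */
int fib_parallel(int n)
{
    assert(n >= 0);
    int result = 0;

    #pragma omp parallel shared(result)
    {
        #pragma omp single
        result = fib(n);
    }                          /* implicit barrier: all tasks are finished */
    return result;
}
```

One task per call is very fine-grained, so this sketch demonstrates the construct rather than a fast way to compute Fibonacci numbers.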
Synchronization
Barriers: #pragma omp barrier
Implicit barriers:
    End of parallel regions
    End of work-sharing constructs (can be removed with nowait)
taskwait / taskyield: for tasks
ordered
Data synchronization: atomics and criticals
Flushes: #pragma omp flush
OpenMP locks (omp_lock_t): similar in use to pthread locks.
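The effect of the implicit barrier and of nowait can be sketched with three loops in one parallel region. The helper name barrier_demo_mismatches and the array contents are assumptions for illustration. The second loop reads elements written by other threads in the first loop, so it relies on the implicit barrier; the third loop does not depend on the second, so the barrier between them can be dropped.

```c
#include <assert.h>

#define MAX_N 1024   /* illustrative upper bound on the array length */

/* Hypothetical helper: returns the number of wrong elements found
   after the three loops (0 when the synchronization is correct). */
int barrier_demo_mismatches(int n)
{
    assert(n > 0 && n <= MAX_N);
    int a[MAX_N], b[MAX_N], c[MAX_N];

    #pragma omp parallel shared(a, b, c)
    {
        #pragma omp for
        for (int i = 0; i < n; i++)
            a[i] = i * i;
        /* implicit barrier: all of a[] is written before anyone continues */

        #pragma omp for nowait
        for (int i = 0; i < n; i++)
            b[i] = a[n - 1 - i];   /* reads values written by other threads */
        /* nowait is safe here: the next loop never touches b[] */

        #pragma omp for
        for (int i = 0; i < n; i++)
            c[i] = 2 * a[i];
    }

    int bad = 0;
    for (int i = 0; i < n; i++) {
        if (b[i] != (n - 1 - i) * (n - 1 - i)) bad++;
        if (c[i] != 2 * i * i) bad++;
    }
    return bad;
}
```

Adding nowait to the first loop as well would introduce a race, because the second loop could then read elements of a[] that have not been written yet.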
Atomic and Critical
#pragma omp critical
{
    sharedvariableupdate();
}

#pragma omp critical (updateb)
sharedvariableupdateb();

#pragma omp atomic
sharedc++;

#pragma omp atomic write
sharedc = 7;

#pragma omp atomic read
i = sharedc;

Atomic vs. critical:
    Critical is always implemented with locks.
    Atomics can use the system's atomic operations.
    Critical sections can be named.
    Atomic modes: update / read / write / capture.
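The capture mode is the one not shown above. A minimal sketch, assuming a hypothetical ticket counter: each iteration atomically reads the old counter value and increments it, so every iteration obtains a distinct ticket. A named critical section then guards a compound update that a single atomic cannot express. The names tickets_are_unique, maxupdate, and MAX_N are made up for this example.

```c
#include <assert.h>

#define MAX_N 1000   /* illustrative upper bound on the number of tickets */

/* Hypothetical helper: returns 1 if the n iterations drew the tickets
   0 .. n-1 exactly once each, and 0 otherwise. */
int tickets_are_unique(int n)
{
    assert(n > 0 && n <= MAX_N);
    int next = 0;               /* shared ticket counter */
    int taken[MAX_N] = {0};
    int max_ticket = -1;

    #pragma omp parallel for shared(next, taken, max_ticket)
    for (int i = 0; i < n; i++) {
        int mine;

        /* read the old value and increment, as one atomic step */
        #pragma omp atomic capture
        mine = next++;

        taken[mine] += 1;       /* tickets are distinct, so no race here */

        /* compare-and-assign needs a critical section */
        #pragma omp critical (maxupdate)
        {
            if (mine > max_ticket)
                max_ticket = mine;
        }
    }

    if (next != n || max_ticket != n - 1)
        return 0;
    for (int i = 0; i < n; i++)
        if (taken[i] != 1)
            return 0;
    return 1;
}
```

Replacing the capture with a plain, unprotected mine = next++ would let two threads draw the same ticket.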
Reductions
Without a reduction clause (data race on sum):
int sum = 0;
#pragma omp parallel for
for (int i = 0; i < 100; i++)
{
    sum += i;
}
printf("sum 1 = %d \n", sum);

With a reduction clause:
int sum = 0;
#pragma omp parallel for reduction(+: sum)
for (int i = 0; i < 100; i++)
{
    sum += i;
}
printf("sum 2 = %d \n", sum);

Output:
    Run 1: sum 1 = 4824, sum 2 = 4950
    Run 2: sum 1 = 4242
    Run 3: sum 1 = 4950
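A loop can carry more than one reduction, and the operator does not have to be +. The sketch below combines a sum and a maximum (the max operator is available for C since OpenMP 3.1). The names stats_t and sum_and_max and the value formula (i * 7) % 100 are assumptions chosen so that the result is easy to check by hand.

```c
#include <assert.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    int sum;
    int max;
} stats_t;

/* Hypothetical helper: reduces the values (i * 7) % 100 for
   i = 0 .. n-1 with two reductions on the same loop. */
stats_t sum_and_max(int n)
{
    assert(n > 0);
    int sum = 0;
    int largest = -1;

    /* each thread works on private copies of sum and largest;
       the copies are combined when the loop ends */
    #pragma omp parallel for reduction(+: sum) reduction(max: largest)
    for (int i = 0; i < n; i++) {
        int value = (i * 7) % 100;
        sum += value;
        if (value > largest)
            largest = value;
    }

    stats_t result = { sum, largest };
    return result;
}
```

For n = 100 the values are a permutation of 0 .. 99, so the sum is 4950 and the maximum is 99 on every run.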
Optimizations & Default Values
Nested parallelism: default value is false.
OMP_MAX_ACTIVE_LEVELS: maximum number of nested active parallel regions; relevant if nested is true.
Data: default status is shared.
omp for schedule: default is typically static (implementation-defined in the specification).
dynamic/guided chunk size: default is 1.
Number of threads: default is the number of hardware threads.
OMP_DYNAMIC: lets the runtime adjust the number of threads dynamically. The default is implementation-defined, so set it explicitly if it matters.
OMP_WAIT_POLICY: active / passive. Can have a big impact on energy consumption.
OMP_STACKSIZE: stack size for threads.
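Because several of these defaults vary between implementations, it can help to query them at run time. A minimal sketch, assuming a hypothetical helper named report_runtime_settings; the runtime calls used are from the OpenMP API, and the #ifdef keeps the sketch buildable without -fopenmp.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: prints the settings in effect and returns the
   maximum team size a parallel region would get by default. */
int report_runtime_settings(void)
{
    int max_threads = 1;

#ifdef _OPENMP
    omp_sched_t kind;
    int chunk;
    /* schedule used by loops declared with schedule(runtime) */
    omp_get_schedule(&kind, &chunk);

    max_threads = omp_get_max_threads();
    printf("max threads        : %d\n", max_threads);
    printf("dynamic adjustment : %d\n", omp_get_dynamic());
    printf("max active levels  : %d\n", omp_get_max_active_levels());
    printf("runtime schedule   : kind=%d chunk=%d\n", (int)kind, chunk);
#else
    printf("compiled without OpenMP support\n");
#endif

    assert(max_threads >= 1);
    return max_threads;
}
```

Running the same binary with, for example, OMP_NUM_THREADS or OMP_DYNAMIC set in the environment shows how the environment variables override the defaults.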
OpenMP
Multithreaded, shared-memory parallelism.
Compliant programs produce the intended outputs.
Fine tuning based on:
    Algorithm
    Inputs
Resources:
    OpenMP home: http://www.openmp.org/
    4.5 specs: http://www.openmp.org/wp-content/uploads/openmp-4.5.pdf
    Tutorials: https://computing.llnl.gov/tutorials/openMP/