Presentation on theme: "OpenMP Parallel Programming"— Presentation transcript:

1 OpenMP Parallel Programming
Jyothi Krishna V S Jan 23, 2018

2 OpenMP API
Not a parallel programming language: a directive-based API for multithreaded, shared-memory parallelism.
OpenMP specifications: latest 4.5 (Nov 2015). The specification defines:
  Data environment
  Work sharing and synchronization
Behaviour is undefined for programs that are non-compliant.

3 OpenMP - C/C++
#pragma omp : compiler directives. For example,
#pragma omp parallel : creates a parallel region.
Number of threads: in-program setting > environment variable (OMP_NUM_THREADS) > system default (number of hardware threads).
Header file: #include <omp.h>
gcc link flag: -fopenmp
gcc 6.4 fully supports the OpenMP 4.5 specs. If your gcc version is lower, some OpenMP 4.5 features may not work.

4 Fork Join Model
Master thread: thread id 0.
omp_get_num_threads(), omp_get_thread_num(), omp_set_num_threads()

5 Hello World
#include <stdio.h>
#include <omp.h>

int main() {
  #pragma omp parallel
  {
    printf("Hello World\n");
  }
  return 0;
}

Output (considering 4 threads):
Hello World
Hello World
Hello World
Hello World

6 Data Environment
Shared environment directives: private, shared
Private: #pragma omp parallel private(i)
firstprivate / lastprivate: copy-in / copy-out
Reduction: on a shared variable with a reducible operation. reduction(op: varlist)
[Diagram: two threads accessing a common Shared Memory region, each holding its own private copy of variable P in thread-local memory.]

7 Work-Sharing Constructs
OpenMP for (#pragma omp for <clauses>): parallelizes loops
OpenMP sections (#pragma omp sections <clauses>), containing multiple #pragma omp section blocks
OpenMP tasks (#pragma omp task <clauses>)
OpenMP single (#pragma omp single <clauses>)
Special mentions:
#pragma omp simd
#pragma omp master

8 OpenMP For
#pragma omp parallel
{
  #pragma omp for schedule(..)
  for (i = 0; i < 100; i++) {
    printf("The itr is %d with threadid %d\n", i, omp_get_thread_num());
  }
}

Output: The itr is 0 with threadid 0 ...

With clauses, e.g.: #pragma omp for schedule(dynamic) ordered
Clauses:
  schedule: static / dynamic / guided
  shared / private / firstprivate / lastprivate
  ordered
  reduction

9 OpenMP Tasks
int fib(int n) {
  int i, j;
  if (n < 2) return n;
  #pragma omp task shared(i)
  i = fib(n - 1);
  #pragma omp task shared(j)
  j = fib(n - 2);
  #pragma omp taskwait
  return i + j;
}

Newly created tasks are placed in a task pool.
Assignment of tasks to threads happens at certain scheduling points.
taskwait: fence for the tasks created at this level.
Binding: tied / untied.
taskyield: reduces contention.
Enables recursive parallelism.
The task scheduling pattern is non-deterministic.

10 Synchronization
Barriers: #pragma omp barrier
Implicit barriers:
  End of parallel regions
  End of work-sharing constructs
Remove an implicit barrier with: nowait
taskwait / taskyield: for tasks
ordered
Data synchronization: atomics and criticals
Flushes: #pragma omp flush
OpenMP locks: extensions of pthread-style locks.

11 Atomic and Critical
#pragma omp critical
{ sharedvariableupdate(); }

Named critical section:
#pragma omp critical (updateb)
sharedvariableupdateb();

#pragma omp atomic
sharedc++;
#pragma omp atomic write
sharedc = 7;
#pragma omp atomic read
i = sharedc;

Atomic vs critical:
  Critical is always implemented with locks.
  Atomics can use hardware atomic operations.
  Critical sections can be named.
  Atomic modes: update / read / write / capture.

12 Reductions
int sum = 0;
#pragma omp parallel for
for (int i = 0; i < 100; i++) {
  sum += i;
}
printf("sum 1 = %d\n", sum);

int sum = 0;
#pragma omp parallel for reduction(+: sum)
for (int i = 0; i < 100; i++) {
  sum += i;
}
printf("sum 2 = %d\n", sum);

Output:
Run 1: sum 1 = 4824, sum 2 = 4950
Run 2: sum 1 = 4242
Run 3: sum 1 = 4950

Without the reduction clause, sum is a shared variable updated with a data race, so sum 1 varies between runs; with reduction(+: sum), each thread accumulates a private copy that is combined at the end, so sum 2 is always 4950.

13 Optimizations & Default Values
Nested parallelism: default value false.
OMP_MAX_ACTIVE_LEVELS: limits nesting depth when nested parallelism is enabled.
Data: default sharing attribute is shared.
omp for schedule: default is static.
dynamic/guided chunk size: default is 1.
Number of threads: default is the number of hardware threads.
OMP_DYNAMIC: default value is true; enables a dynamic number of threads.
OMP_WAIT_POLICY: active / passive. Can have a big impact on energy consumption.
OMP_STACKSIZE: stack size for threads.

14 OpenMP
Multithreaded, shared memory parallelism.
Compliant programs produce the intended outputs.
Fine tuning based on:
  Algorithm
  Inputs
Resources:
  OpenMP Home:
  4.5 specs:
  Tutorials:

