
1 Parallel Programming With OpenMP

2 Contents
• Overview of Parallel Programming & OpenMP
• Difference between OpenMP & MPI
• OpenMP Programming Model
• OpenMP Environment Variables
• OpenMP Clauses
• OpenMP Runtime Routines
• General Code Structure & Sample Examples
• Pros & Cons of OpenMP
• Performance of one program (Serial vs Parallel)

3 Parallel Programming
• Decomposes an algorithm or data into parts, which are processed by multiple processors simultaneously.
• Coordinates work and communication between those processors.
• Threaded applications are ideal for multi-core systems.

OpenMP
• Open specifications for Multi Processing, based on a thread paradigm.
• Three primary components: compiler directives, runtime library routines, and environment variables.
• Extensions exist for Fortran, C, and C++.

4 OpenMP vs MPI
OpenMP:
• Shared memory model
• Directive based
• Easier to program and debug
• Supported by GCC 4.2 and higher

MPI:
• Distributed memory model
• Message-passing style
• More flexible and scalable
• Supported by the MPICH2 library

5 Why OpenMP (Shared Memory/OpenMP vs. Distributed Memory/MPI)

Feature: Ability to parallelize small parts of an application at a time
– Shared memory: Relatively easy to do.
– Distributed memory: Relatively difficult to do; requires an all-or-nothing effort.

Feature: Feasibility of scaling an application to a large number of processors
– Shared memory: Few vendors provide scalable shared memory systems (i.e. ccNUMA).
– Distributed memory: Most vendors provide the ability to cluster non-shared memory systems with high-performance interconnects.

Feature: Additional complexity over serial code to be addressed by the programmer
– Shared memory: Easy and fast to implement.
– Distributed memory: Significant additional overhead and complexity, even for implementing simple and localized constructs.

Feature: Impact on code quantity and quality (readability)
– Shared memory: Small increase (usually 2-25%). Requires some knowledge of shared memory constructs.
– Distributed memory: Requires extra copying of data into temporary message buffers, resulting in a significant amount of message-handling code. The extra code complexity means readability suffers.

Feature: Availability of application development and debugging environments
– Shared memory: Requires a special compiler and runtime library that supports OpenMP. Code will run on one processor without an OpenMP compiler. Debugging tools are an extension of existing serial-code debuggers.
– Distributed memory: Does not require a special compiler; only a library for the target computer is required. However, debuggers are more difficult to implement since a direct global view of all program memory is not available.

6 OpenMP Programming Model
• Shared memory, thread-based parallelism.
• Explicit parallelism.
• Fork-Join Model:
– Execution starts with one thread: the master thread.
– Parallel regions fork off new threads on entry: the team threads.
– Threads join back together at the end of the region; only the master thread continues.

7 OpenMP Environment Variables
• OMP_SCHEDULE
• OMP_NUM_THREADS
• OMP_DYNAMIC
• OMP_NESTED
• OMP_THREAD_LIMIT
• OMP_STACKSIZE
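As a sketch of how these variables take effect, assuming a POSIX shell and a GCC build with -fopenmp (the file name envdemo.c is illustrative, not from the slides):

/* envdemo.c – build: gcc -fopenmp envdemo.c -o envdemo
   run:   OMP_NUM_THREADS=4 OMP_SCHEDULE="dynamic,16" ./envdemo */
#include <stdio.h>
#include <omp.h>

int main(void)
{
    #pragma omp parallel
    {
        /* The team size is taken from OMP_NUM_THREADS. */
        printf("Thread %d of %d\n", omp_get_thread_num(), omp_get_num_threads());
    }

    /* schedule(runtime) defers the scheduling choice to OMP_SCHEDULE. */
    int i;
    #pragma omp parallel for schedule(runtime)
    for (i = 0; i < 8; i++)
        printf("Iteration %d ran on thread %d\n", i, omp_get_thread_num());

    return 0;
}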

8 OpenMP Clauses
• Data scoping clauses (shared, private, default)
• Initialization clauses (firstprivate, lastprivate, threadprivate)
• Data copying clauses (copyin, copyprivate)
• Worksharing clauses (do/for directive, sections directive, single directive, parallel do/for, parallel sections)
• Scheduling clauses (static, dynamic, guided)
• Synchronization clauses (master, critical, atomic, ordered, barrier, nowait, flush)
• Reduction clause (operator : list)
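As a minimal sketch of the initialization clauses (variable names are illustrative): firstprivate copies the master's value into each thread's private copy, while lastprivate copies the value from the sequentially last iteration back out of the loop.

#include <stdio.h>
#include <omp.h>

int main(void)
{
    int start = 10;   /* each thread gets its own copy, initialized to 10 */
    int last = 0;     /* receives the value from the final iteration (i == 3) */

    #pragma omp parallel for firstprivate(start) lastprivate(last)
    for (int i = 0; i < 4; i++)
        last = start + i;

    printf("last = %d\n", last);   /* prints 13 */
    return 0;
}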

9 OpenMP Runtime Routines
• To set and get the number of threads:
– OMP_SET_NUM_THREADS
– OMP_GET_NUM_THREADS
• To get the thread number of a thread within a team:
– OMP_GET_THREAD_NUM
• To get the number of processors available to the program:
– OMP_GET_NUM_PROCS
• To determine whether the code is currently executing in a parallel region:
– OMP_IN_PARALLEL
• To enable or disable dynamic adjustment of the number of threads:
– OMP_SET_DYNAMIC
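A minimal sketch exercising these routines (the output depends on the machine):

#include <stdio.h>
#include <omp.h>

int main(void)
{
    omp_set_num_threads(4);                          /* request a team of 4 */
    printf("processors: %d\n", omp_get_num_procs());
    printf("in parallel? %d\n", omp_in_parallel());  /* 0: serial region */

    #pragma omp parallel
    {
        if (omp_get_thread_num() == 0)               /* master thread only */
            printf("team size: %d, in parallel? %d\n",
                   omp_get_num_threads(), omp_in_parallel());
    }
    return 0;
}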

10 OpenMP Runtime Routines Cont.
• To determine whether dynamic thread adjustment is enabled:
– OMP_GET_DYNAMIC
• To initialize and destroy a lock associated with a lock variable:
– OMP_INIT_LOCK
– OMP_DESTROY_LOCK
• To acquire and release a lock:
– OMP_SET_LOCK
– OMP_UNSET_LOCK
• To get the resolution of the wall-clock timer:
– OMP_GET_WTICK
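The lock routines pair up as in this minimal sketch; the critical directive (slide 15) is often simpler, but explicit locks allow finer-grained control:

#include <stdio.h>
#include <omp.h>

int main(void)
{
    omp_lock_t lock;
    int counter = 0;

    omp_init_lock(&lock);            /* create the lock */

    #pragma omp parallel
    {
        omp_set_lock(&lock);         /* acquire: other threads block here */
        counter++;                   /* protected update */
        omp_unset_lock(&lock);       /* release */
    }

    omp_destroy_lock(&lock);         /* dispose of the lock */
    printf("counter = %d\n", counter);
    return 0;
}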

11 General Code Structure

#include <omp.h>

main () {
    int var1, var2, var3;
    // Serial code

    // Beginning of parallel section; specify variable scoping
    #pragma omp parallel private(var1, var2) shared(var3)
    {
        // Parallel section executed by all threads
        // All threads join the master thread and disband
    }

    // Resume serial code
}

The omp keyword distinguishes the pragma as an OpenMP pragma, so it is processed only by OpenMP-aware compilers.

12 Parallel Region Example

#include <stdio.h>
#include <omp.h>

main () {
    int nthreads, tid;

    /* Fork a team of threads */
    #pragma omp parallel private(tid)
    {
        tid = omp_get_thread_num();   /* Obtain thread id */
        printf("Hello World from thread = %d\n", tid);

        if (tid == 0) {               /* Only master thread does this */
            nthreads = omp_get_num_threads();
            printf("Number of threads = %d\n", nthreads);
        }
    } /* All threads join master thread and terminate */
}

13 “for” Directive Example

#include <omp.h>
#define CHUNKSIZE 10
#define N 100

main () {
    int i, chunk;
    float a[N], b[N], c[N];

    for (i = 0; i < N; i++)
        a[i] = b[i] = i * 1.0;
    chunk = CHUNKSIZE;

    #pragma omp parallel shared(a,b,c,chunk) private(i)
    {
        #pragma omp for schedule(dynamic,chunk) nowait
        for (i = 0; i < N; i++)
            c[i] = a[i] + b[i];
    } /* end of parallel section */
}

14 “sections” Directive Example

#include <omp.h>
#define N 1000

main () {
    int i;
    float a[N], b[N], c[N], d[N];

    for (i = 0; i < N; i++) {
        a[i] = i * 1.5;
        b[i] = i + 22.35;
    }

    #pragma omp parallel shared(a,b,c,d) private(i)
    {
        #pragma omp sections nowait
        {
            #pragma omp section
            for (i = 0; i < N; i++)
                c[i] = a[i] + b[i];

            #pragma omp section
            for (i = 0; i < N; i++)
                d[i] = a[i] * b[i];
        } /* end of sections */
    } /* end of parallel section */
}

15 “critical” Directive Example

#include <omp.h>

main() {
    int x;
    x = 0;

    #pragma omp parallel shared(x)
    {
        #pragma omp critical
        x = x + 1;
    } /* end of parallel section */
}

16 “threadprivate” Directive Example

#include <stdio.h>
#include <omp.h>

int a, b, i, tid;
float x;
#pragma omp threadprivate(a, x)

main () {
    /* Explicitly turn off dynamic threads */
    omp_set_dynamic(0);

    printf("1st Parallel Region:\n");
    #pragma omp parallel private(b,tid)
    {
        tid = omp_get_thread_num();
        a = tid;
        b = tid;
        x = 1.1 * tid + 1.0;
        printf("Thread %d: a,b,x= %d %d %f\n", tid, a, b, x);
    } /* end of parallel section */

    printf("Master thread doing serial work here\n");

    printf("2nd Parallel Region:\n");
    #pragma omp parallel private(tid)
    {
        tid = omp_get_thread_num();
        printf("Thread %d: a,b,x= %d %d %f\n", tid, a, b, x);
    } /* end of parallel section */
}

17 “reduction” Clause Example

#include <stdio.h>

main () {
    int i, n, chunk;
    float a[100], b[100], result;

    n = 100;
    chunk = 10;
    result = 0.0;
    for (i = 0; i < n; i++) {
        a[i] = i * 1.0;
        b[i] = i * 2.0;
    }

    #pragma omp parallel for default(shared) private(i) \
        schedule(static,chunk) reduction(+:result)
    for (i = 0; i < n; i++)
        result = result + (a[i] * b[i]);

    printf("Final result= %f\n", result);
}

18 OpenMP - Pros and Cons
Pros:
• Simple.
• Incremental parallelism.
• Decomposition is handled automatically.
• Unified code for both serial and parallel applications.

Cons:
• Runs only on shared-memory multiprocessors.
• Scalability is limited by the memory architecture.
• Reliable error handling is missing.

19 Performance of “arrayUpdate.c”
Test done on a 2 GHz Intel Core 2 Duo with 1 GB 667 MHz DDR2 SDRAM.

Array Size    Serial (sec)    Parallel (sec)
1000          0.000221        0.000389
5000          0.001060        0.000999
10000         0.002201        0.001323
50000         0.011266        0.005892
100000        0.22638         0.011715
500000        0.114033        0.068110
1000000       0.227713        0.123106
5000000       1.134773        0.579176
10000000      2.307644        1.151099
50000000      12.536466       5.772921
100000000     194.245929      58.532328

20 arrayUpdate.c Cont.
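The arrayUpdate.c source itself is not reproduced in this transcript. A minimal sketch of what such a benchmark might look like, with all names and details assumed rather than taken from the original, is:

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(int argc, char *argv[])
{
    long i, n = (argc > 1) ? atol(argv[1]) : 1000000;  /* array size */
    double *a = malloc(n * sizeof *a);
    if (a == NULL) return 1;

    double t0 = omp_get_wtime();
    for (i = 0; i < n; i++)              /* serial update */
        a[i] = i * 2.0 + 1.0;
    double serial = omp_get_wtime() - t0;

    t0 = omp_get_wtime();
    #pragma omp parallel for
    for (i = 0; i < n; i++)              /* parallel update */
        a[i] = i * 2.0 + 1.0;
    double parallel = omp_get_wtime() - t0;

    printf("n=%ld serial=%fs parallel=%fs\n", n, serial, parallel);
    free(a);
    return 0;
}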

21 References
http://www.openmp.org/
Parallel Programming in OpenMP, Morgan Kaufmann Publishers.

22 Thank You

