OpenMP Parallel Programming


OpenMP Parallel Programming Jyothi Krishna V S Jan 23, 2018

OpenMP API
Not a parallel programming language: an API for multithreaded, shared-memory parallelism.
OpenMP specification: latest is 4.5 (Nov 2015).
The specification covers the data environment, work sharing, and synchronization; behaviour is undefined for programs that are non-compliant.

OpenMP - C/C++
#pragma omp : compiler directives. For example, #pragma omp parallel creates a parallel region.
Number of threads: in-program setting > environment variable > system default (number of hardware threads).
Header file: #include <omp.h>
gcc link flag: -fopenmp
gcc 6.4 is complete with the OpenMP 4.5 specification; if your gcc version is lower, some of the OpenMP 4.5 features might not work.

Fork-Join Model
Master thread: thread id 0.
Runtime routines: omp_get_num_threads(), omp_get_thread_num(), omp_set_num_threads().
Image source: https://computing.llnl.gov/tutorials/openMP/

Hello World
#include <stdio.h>
#include <omp.h>
int main() {
  #pragma omp parallel
  {
    printf("Hello World\n");
  }
  return 0;
}
Output (with 4 threads): "Hello World" is printed once by each thread, i.e. 4 times.

Data Environment
[Figure: each thread has its own private memory in addition to the shared memory visible to all threads.]
Clauses: private, shared.
Private: #pragma omp parallel private(i)
firstprivate: copy-in from the enclosing context; lastprivate: copy-out to it.
Reduction: on a shared variable with a reducible operation: reduction(op: varlist)

WorkSharing Constructs
OpenMP for (#pragma omp for <clauses>): distributes loop iterations among threads.
OpenMP sections (#pragma omp sections <clauses>), containing multiple #pragma omp section blocks.
OpenMP tasks (#pragma omp task <clauses>)
OpenMP single (#pragma omp single <clauses>)
Special mentions: #pragma omp simd, #pragma omp master

OpenMP For
#pragma omp parallel
{
  #pragma omp for schedule(...)
  for (i = 0; i < 100; i++) {
    printf("The itr is %d with threadid %d\n", i, omp_get_thread_num());
  }
}
Output: The itr is 0 with threadid 0 ... (iterations are divided among the threads)
Example with clauses: #pragma omp for schedule(dynamic) ordered
Schedule kinds: static / dynamic / guided
Other clauses: shared / private / firstprivate / lastprivate, ordered, reduction

OpenMP Tasks
int fib(int n) {
  int i, j;
  if (n < 2) return n;
  #pragma omp task shared(i)
  i = fib(n-1);
  #pragma omp task shared(j)
  j = fib(n-2);
  #pragma omp taskwait
  return i + j;
}
Newly created tasks go into a task pool; tasks are assigned to threads at certain scheduling points.
taskwait: a fence for the tasks created at this level.
Binding: tied / untied. taskyield: reduces contention.
Enables recursive parallelism; the task scheduling order is nondeterministic.

Synchronization
Barriers: #pragma omp barrier
Implicit barriers: at the end of parallel regions and at the end of work-sharing constructs.
Remove an implicit barrier with nowait.
taskwait / taskyield: for tasks.
ordered: for ordered loop iterations.
Data synchronization: atomics and critical sections.
Flushes: #pragma omp flush
OpenMP locks: extensions of the pthread lock model.

Atomic and Critical
#pragma omp critical
{
  sharedvariableupdate();
}
#pragma omp critical (updateb)
sharedvariableupdateb();
#pragma omp atomic
sharedc++;
#pragma omp atomic write
sharedc = 7;
#pragma omp atomic read
i = sharedc;
Atomic vs. critical: critical is always implemented with locks, while atomic can use the system's atomic operations.
Critical sections can be named, as with (updateb) above.
Atomic modes: update / read / write / capture.

Reductions
int sum = 0;
#pragma omp parallel for
for (int i = 0; i < 100; i++) {
  sum += i;
}
printf("sum 1 = %d\n", sum);    /* data race on sum */

int sum = 0;
#pragma omp parallel for reduction(+: sum)
for (int i = 0; i < 100; i++) {
  sum += i;
}
printf("sum 2 = %d\n", sum);

Output:
Run 1: sum 1 = 4824, sum 2 = 4950
Run 2: sum 1 = 4242
Run 3: sum 1 = 4950
sum 1 races and varies from run to run; the reduction version always yields sum 2 = 4950.

Optimizations & Default Values
Nested parallelism: default false. OMP_MAX_ACTIVE_LEVELS applies when nesting is enabled.
Data: default sharing attribute is shared.
omp for schedule: default is static; for dynamic/guided the default chunk size is 1.
Number of threads: default is the number of hardware threads.
OMP_DYNAMIC: lets the runtime choose the number of threads dynamically; the default is implementation-defined.
OMP_WAIT_POLICY: active / passive; can have a big impact on energy consumption.
OMP_STACKSIZE: stack size for threads.

OpenMP
Multithreaded, shared-memory parallelism.
Compliant programs produce the intended output.
Fine-tune based on the algorithm, the inputs, and the available resources.
Resources:
OpenMP home: http://www.openmp.org/
4.5 specification: http://www.openmp.org/wp-content/uploads/openmp-4.5.pdf
Tutorials: https://computing.llnl.gov/tutorials/openMP/