OpenMP – Part 2 *

* Compiled from the UHEM summer workshop notes. (uhem.itu.edu.tr)


Work-Sharing (Lab 2)

SECTIONS construct:
– Easiest way to get different threads to carry out different kinds of work.
– Each section must be a structured block of code that is independent of the other sections.
– If there are fewer code blocks than threads, the remaining threads will be idle.
– If there are fewer threads than code blocks, some or all of the threads execute multiple code blocks.
– Depending on the type of work, this construct might lead to a load-balancing problem.

SECTIONS construct for 2 functions (one thread each):

    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
            {
                FUNCTION_1(MAX);
            }
            #pragma omp section
            {
                FUNCTION_2(MIN);
            }
        } // sections end here
    }     // parallel ends here

This example demonstrates the OpenMP SECTIONS work-sharing construct. Note how the PARALLEL region is divided into separate sections, each of which is executed by one thread. Run the program several times and observe any differences in output. Because there are only two sections, you should notice that some threads do no work. You may or may not notice that the threads doing the work vary from run to run: for example, threads 0 and 1 may do the work the first time, and threads 0 and 3 the next.

    bash: $ icc -openmp omp_workshare2.c -o omp_workshare2
    bash: $ ./omp_workshare2
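For reference, here is a minimal sketch in the spirit of the LLNL omp_workshare2.c program (the array size, initialization values, and printed messages are assumptions, not the exact lab code):

    #include <stdio.h>
    #include <omp.h>

    #define N 50

    int main(void)
    {
        int i;
        float a[N], b[N], c[N], d[N];

        /* Initialize the input arrays. */
        for (i = 0; i < N; i++) {
            a[i] = i * 1.5f;
            b[i] = i + 22.35f;
        }

        #pragma omp parallel shared(a, b, c, d) private(i)
        {
            #pragma omp sections nowait
            {
                #pragma omp section        /* done by one thread */
                for (i = 0; i < N; i++) {
                    c[i] = a[i] + b[i];
                    printf("thread %d: c[%d] = %f\n", omp_get_thread_num(), i, c[i]);
                }

                #pragma omp section        /* done by another thread */
                for (i = 0; i < N; i++) {
                    d[i] = a[i] * b[i];
                    printf("thread %d: d[%d] = %f\n", omp_get_thread_num(), i, d[i]);
                }
            } /* end of sections */
        }     /* end of parallel region */

        return 0;
    }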

Work-Sharing Constructs: SINGLE

SINGLE construct:
– Specifies that the enclosed code is to be executed by only one thread in the team.
– The thread chosen can vary from one run to another.
– Threads that are not executing the SINGLE block wait at the end of the SINGLE construct unless NOWAIT is specified.

C/C++:

    #pragma omp single [clause ...]
        structured_block

SINGLE construct example: only one thread initializes the shared variable a, as in the sketch below.
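A minimal sketch of this idea (the value assigned to a is an assumption):

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        int a = 0;   /* shared variable */

        #pragma omp parallel shared(a)
        {
            #pragma omp single
            {
                a = 10;   /* exactly one thread performs the initialization */
                printf("thread %d initialized a\n", omp_get_thread_num());
            }
            /* Implicit barrier at the end of SINGLE: every thread now sees a == 10. */
            printf("thread %d reads a = %d\n", omp_get_thread_num(), a);
        }
        return 0;
    }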

Synchronization (BARRIER)

BARRIER directive:
– Synchronizes all threads in the team.
– When a BARRIER directive is reached, a thread waits at that point until all other threads have reached that barrier.
– All threads then resume executing the code that follows the barrier in parallel.

C/C++:

    #pragma omp barrier

(The directive stands alone; unlike SINGLE or MASTER, it does not apply to a structured block.)
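A small sketch of the two-phase pattern a barrier enables (the array and thread count are assumptions):

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        int done[4] = {0};

        #pragma omp parallel num_threads(4)
        {
            int tid = omp_get_thread_num();
            done[tid] = 1;              /* phase 1: each thread finishes its own work */

            #pragma omp barrier         /* wait until all threads have finished phase 1 */

            /* phase 2: now safe to read results written by the other threads */
            printf("thread %d sees done = %d %d %d %d\n",
                   tid, done[0], done[1], done[2], done[3]);
        }
        return 0;
    }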

Synchronization (MASTER)

MASTER directive:
– Specifies a region that is to be executed only by the master thread of the team.
– All other threads in the team skip this section of code.
– It is similar to the SINGLE construct, except that the executing thread is always the master (thread 0) and, unlike SINGLE, there is no implied barrier at the end.

C/C++:

    #pragma omp master
        structured_block
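A minimal sketch contrasting MASTER with SINGLE (the printed messages are assumptions):

    #include <stdio.h>
    #include <omp.h>

    int main(void)
    {
        #pragma omp parallel
        {
            #pragma omp master
            {
                /* Only thread 0 executes this; the other threads do NOT wait here. */
                printf("master thread %d does the setup\n", omp_get_thread_num());
            }
            #pragma omp barrier   /* add an explicit barrier if the others must wait */

            printf("thread %d continues\n", omp_get_thread_num());
        }
        return 0;
    }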

Reduction

A private copy of each list variable is created for each thread. At the end of the reduction, the reduction operator is applied to all private copies, and the final result is written to the global shared variable.

C/C++:

    #pragma omp … reduction(operator : list)
        structured_block

Reduction

The syntax of the clause is:

    reduction(operator : list)

where list is the list of variables the operator will be applied to, and operator is one of the C/C++ operators +, -, *, &, |, ^, &&, or || (OpenMP 3.1 and later also allow min and max).

Reduction by multiplication

This example calculates a factorial using threads: at the beginning of the parallel block, a private copy of the reduction variable is made for each thread and preinitialized to the operator's identity value (1 for multiplication). At the end of the parallel block, the private copies are atomically merged into the shared variable using the defined operator.
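A minimal sketch of the factorial (the variable names and n = 10 are assumptions):

    #include <stdio.h>

    int main(void)
    {
        const int n = 10;
        long long factorial = 1;   /* shared result variable */
        int i;

        /* Each thread gets a private copy preinitialized to 1 (the identity
           for *); the copies are multiplied together when the loop ends. */
        #pragma omp parallel for reduction(* : factorial)
        for (i = 1; i <= n; i++)
            factorial *= i;

        printf("%d! = %lld\n", n, factorial);   /* 10! = 3628800 */
        return 0;
    }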

Reduction by sum
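A minimal sketch of a sum reduction (the array contents are assumptions):

    #include <stdio.h>

    #define N 1000

    int main(void)
    {
        int a[N], i;
        long sum = 0;

        for (i = 0; i < N; i++)
            a[i] = i + 1;                      /* 1, 2, ..., N */

        /* Private partial sums start at 0 and are added together at the end. */
        #pragma omp parallel for reduction(+ : sum)
        for (i = 0; i < N; i++)
            sum += a[i];

        printf("sum = %ld\n", sum);            /* N*(N+1)/2 = 500500 */
        return 0;
    }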

Reduction by max/min
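A minimal sketch (the min and max reduction operators require OpenMP 3.1 or later in C/C++; the array values are assumptions):

    #include <stdio.h>

    #define N 8

    int main(void)
    {
        int a[N] = {3, 41, 7, 2, 56, 11, 9, 18};
        int i, maxval = a[0], minval = a[0];

        /* Each thread keeps private max/min candidates; the runtime combines them. */
        #pragma omp parallel for reduction(max : maxval) reduction(min : minval)
        for (i = 0; i < N; i++) {
            if (a[i] > maxval) maxval = a[i];
            if (a[i] < minval) minval = a[i];
        }

        printf("max = %d, min = %d\n", maxval, minval);   /* max = 56, min = 2 */
        return 0;
    }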