

September 4, 1997 Parallel Processing (CS 730) Lecture 5: Shared Memory Parallel Programming with OpenMP* Jeremy R. Johnson *Parts of this lecture were derived from chapters 1-2 of Chandra et al. Oct. 23, 2002 Parallel Processing

Introduction
Objective: To further study the shared memory model of parallel programming, and to introduce the OpenMP standard for shared memory parallel programming.
Topics: Introduction to OpenMP; hello.c; hello.f; loop level parallelism; shared vs. private variables; synchronization (implicit and explicit); parallel regions.

OpenMP
An extension to FORTRAN and C/C++ with a shared memory model.
Uses directives (comments in FORTRAN, pragmas in C/C++) that are ignored without compiler support; some library support is required.
Shared memory model:
- parallel regions
- loop level parallelism
- implicit thread model
- communication via shared address space
- private vs. shared variables (declaration)
- explicit synchronization via directives (e.g. critical)
- library routines for returning thread information (e.g. omp_get_num_threads(), omp_get_thread_num())
- environment variables used to provide system information (e.g. OMP_NUM_THREADS)

Benefits
Provides incremental parallelism with a small increase in code size.
Simpler model than message passing; easier to use than a thread library.
With hardware and compiler support, allows smaller granularity than message passing.
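The incremental path can be seen in a small sketch (my own example, not from the slides): a serial loop becomes parallel with a single directive, and a compiler without OpenMP support simply ignores the pragma and produces the same serial result.

```c
/* Serial dot product turned parallel by one directive.
   Without OpenMP support the pragma is ignored and the
   function still computes the same result serially. */
double dot(const double *x, const double *y, int n)
{
    double s = 0.0;
    #pragma omp parallel for reduction(+:s)
    for (int i = 0; i < n; i++)
        s += x[i] * y[i];
    return s;
}
```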

Further Information
Initiated by SGI; adopted as a standard in 1997. See www.openmp.org.
Chandra, Dagum, Kohr, Maydan, McDonald, and Menon, "Parallel Programming in OpenMP", Morgan Kaufmann Publishers, 2001.

Shared vs. Distributed Memory
[Figure: shared memory: processors P0 ... Pn all connected to a single Memory. Distributed memory: processors P0 ... Pn, each with a local memory M0 ... Mn, connected by an Interconnection Network.]

Shared Memory Programming Model
Shared memory programming does not require physically shared memory, so long as there is support for logically shared memory (in either hardware or software).
With logically shared memory, the cost of a memory access may depend on the physical location of the data:
UMA (uniform memory access): e.g. an SMP (symmetric multiprocessor), typically with memory connected to the processors via a bus.
NUMA (non-uniform memory access): typically physically distributed memory connected via an interconnection network.

IBM S80
An SMP with up to 24 processors (RS64 III processors). http://www.rs6000.ibm.com/hardware/enterprise/s80.html
Name: Goopi.coe.drexel.edu. Machine type: S80, 12-way with 8 GB RAM. Specifications: 2 x 6-way 450 MHz RS64 III processor cards with 8 MB L2 cache; 2 x 4096 MB memory; 9 x 18.2 GB Ultra SCSI hot-swappable hard disk drives.
Name: bagha.coe.drexel.edu. Machine type: 44P Model 270, 4-way with 2 GB RAM. 2 x 2-way 375 MHz POWER3-II processors with 4 MB L2 cache; 4 x 512 MB SDRAM DIMMs; 2 x 9.1 GB Ultra SCSI hard disk drives.

hello.c

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(int argc, char **argv)
{
    int n = atoi(argv[1]);
    omp_set_num_threads(n);
    /* note: outside a parallel region, omp_get_num_threads() returns 1 */
    printf("Number of threads = %d\n", omp_get_num_threads());
    #pragma omp parallel
    {
        int id = omp_get_thread_num();
        if (id == 0)
            printf("Hello World from %d\n", id);
    }
    exit(0);
}

hello.f

      program hello
      integer omp_get_thread_num, omp_get_num_threads
      print *, "Hello parallel world from threads"
      print *, "Num threads = ", omp_get_num_threads()
!$omp parallel
      print *, omp_get_thread_num()
!$omp end parallel
      print *, "Back to the sequential world"
      end

Compiling and Executing OpenMP Programs on the IBM S80
To compile a C program with OpenMP directives:
  cc_r -qsmp=omp hello.c -o hello
To compile a Fortran program with OpenMP directives:
  xlf_r -qsmp=omp hello.f -o hello
The environment variable OMP_NUM_THREADS controls the number of threads used in OpenMP parallel regions. It can be set from the C shell:
  setenv OMP_NUM_THREADS <count>
where <count> is a positive integer.
On Sun machines:
  cc -xopenmp hello.c -o hello

Parallel Loop

      subroutine saxpy(z, a, x, y, n)
      integer i, n
      real z(n), a, x(n), y(n)
      do i = 1, n
        z(i) = a * x(i) + y(i)
      end do
      return
      end

      subroutine saxpy(z, a, x, y, n)
      integer i, n
      real z(n), a, x(n), y(n)
!$omp parallel do
      do i = 1, n
        z(i) = a * x(i) + y(i)
      end do
      return
      end
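For comparison, a C sketch of the same loop (the C version is mine, not from the slides): each iteration is independent, so the work-sharing directive needs no synchronization.

```c
/* z[i] = a*x[i] + y[i]; the directive splits the iterations
   across the thread team, with no synchronization needed
   because every iteration writes a distinct element. */
void saxpy(double *z, double a, const double *x, const double *y, int n)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        z[i] = a * x[i] + y[i];
}
```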

Execution Model
[Figure: fork-join execution. The master thread runs serially; at each parallel region, master and slave threads are created implicitly; at the end of the region an implicit barrier synchronization joins them, and the master thread continues alone.]
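A minimal C sketch of the fork-join model (the function name is illustrative, not from the slides): the team forks at the directive, each member contributes, and the implicit barrier joins the team before the function returns.

```c
/* Counts the members of the team executing the parallel region.
   Threads fork at the pragma; the implicit barrier at the end of
   the region joins them all before the function returns. */
int team_size(void)
{
    int count = 0;
    #pragma omp parallel reduction(+:count)
    {
        count += 1;   /* each team member contributes one */
    }
    return count;
}
```

Without OpenMP support the region executes once and the function returns 1, which is also the correct serial answer.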

More Complicated Example

      real*8 x, y
      integer i, j, m, n, maxiter
      integer depth(n, m)
      integer mandel_val
      ...
      maxiter = 200
      do i = 1, m
        do j = 1, n
          x = i/real(m)
          y = j/real(n)
          depth(j,i) = mandel_val(x, y, maxiter)
        end do
      end do

Parallel Loop

      maxiter = 200
!$omp parallel do private(j, x, y)
      do i = 1, m
        do j = 1, n
          x = i/real(m)
          y = j/real(n)
          depth(j,i) = mandel_val(x, y, maxiter)
        end do
      end do
!$omp end parallel do

Parallel Loop

      maxiter = 200
!$omp parallel do private(j, x, y)
      do i = 1, m
        do j = 1, n
          x = i/real(m)
          y = j/real(n)
          depth(j,i) = mandel_val(x, y, maxiter)
        end do
      end do
!$omp end parallel do
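The same scoping issue arises in C; in this sketch (my own names, not the slides'), j and x are written by every iteration, so each thread needs a private copy, while the output array is safely shared because the writes are disjoint.

```c
/* Fills a row-major m x n table. x and j must be private because
   every iteration writes them; out is shared, which is safe since
   each iteration writes a distinct element. */
void fill_table(double *out, int m, int n)
{
    int i, j;
    double x;
    #pragma omp parallel for private(j, x)
    for (i = 0; i < m; i++) {
        for (j = 0; j < n; j++) {
            x = (double)(i + 1) / (double)(i + j + 1);
            out[i * n + j] = x;
        }
    }
}
```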

Explicit Synchronization

      maxiter = 200
      total_iters = 0
!$omp parallel do private(j, x, y)
      do i = 1, m
        do j = 1, n
          x = i/real(m)
          y = j/real(n)
          depth(j,i) = mandel_val(x, y, maxiter)
!$omp critical
          total_iters = total_iters + depth(j,i)
!$omp end critical
        end do
      end do
!$omp end parallel do
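In C the same pattern looks like this (a sketch with made-up names): the critical directive serializes the update of the shared accumulator, so threads cannot interleave the read-modify-write.

```c
/* Sums i*i for i = 1..n. The squaring is private work done in
   parallel; the critical section admits one thread at a time,
   so updates to the shared total never interleave. */
int sum_squares(int n)
{
    int total = 0;
    #pragma omp parallel for
    for (int i = 1; i <= n; i++) {
        int sq = i * i;       /* private per-iteration work */
        #pragma omp critical
        total += sq;          /* serialized update of shared total */
    }
    return total;
}
```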

Reduction Variables

      maxiter = 200
      total_iters = 0
!$omp parallel do private(j, x, y)
!$omp+ reduction(+:total_iters)
      do i = 1, m
        do j = 1, n
          x = i/real(m)
          y = j/real(n)
          depth(j,i) = mandel_val(x, y, maxiter)
          total_iters = total_iters + depth(j,i)
        end do
      end do
!$omp end parallel do
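The reduction clause expresses the same accumulation without a critical section (again a sketch with my own names): each thread keeps a private copy of the variable and the copies are combined when the loop joins.

```c
/* Same sum as the critical-section version, but each thread
   accumulates into a private copy of total, and the copies are
   combined with + at the implicit barrier. */
int sum_squares_reduction(int n)
{
    int total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 1; i <= n; i++)
        total += i * i;
    return total;
}
```

The reduction version typically scales better than the critical version because threads synchronize only once, at the join, rather than on every iteration.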