SHARED-MEMORY PROGRAMMING (6th week)
Khoa Công Nghệ Thông Tin, Đại Học Bách Khoa Tp.HCM (Faculty of Information Technology, Ho Chi Minh City University of Technology)



SHARED-MEMORY PROGRAMMING (6th week): OUTLINE
- References
- Introduction
- The ANSI X3H5 Shared-Memory Model
- The POSIX Threads Model
- The OpenMP Standard

REFERENCES
- Kai Hwang and Zhiwei Xu, Scalable Parallel Computing: Technology, Architecture and Programming, ch. 12
- Yang-Suk Kee, Parallel Processing Course, School of EECS, Seoul National University

Introduction to Shared-Memory Programming Model
[Figure: Shared-Memory Model / Shared Address Space (SAS) Model. Two threads (processes), each running on a processor, access a shared variable X in system memory through read(X) and write(X).]

Introduction (cont'd)
- Naming
  - Any process can name any variable in the shared space
- Operations
  - Loads and stores, plus those needed for ordering
- Simplest ordering model
  - Within a process/thread: sequential program order
  - Across threads: some interleaving (as in time-sharing)
  - Additional orders through synchronization
  - Compilers and hardware can still violate these orders, as long as the program cannot observe the violation

SYNCHRONIZATION
- Mutual exclusion (locks)
  - Ensures that certain operations on certain data can be performed by only one process at a time
  - Like a room that only one person can enter at a time
  - No ordering guarantees
- Event synchronization
  - Ordering of events to preserve dependences
  - e.g. producer -> consumer of data
  - 3 main types: point-to-point, global, group

NAMING AND OPERATIONS
- Naming and operations in the programming model can be supported directly by lower levels, or translated by the compiler, libraries or the OS
- Example: shared virtual address space in the programming model
- Hardware interface supports a shared physical address space
  - Direct support by hardware through virtual-to-physical (v-to-p) mappings, no software layers

NAMING AND OPERATIONS (cont'd)
- Hardware supports independent physical address spaces
  - Can provide SAS through the OS, so in the system/user interface
    - v-to-p mappings only for data that are local
    - remote data accesses incur page faults; the data are brought in via page fault handlers
    - same programming model, different hardware requirements and cost model
  - Or through compilers or the runtime, so above the system/user interface
    - shared objects, instrumentation of shared accesses, compiler support

SHARED-MEMORY STANDARDS
- No widely accepted standard
- Three popular platform-independent shared-memory standards are
  - X3H5
  - OpenMP
  - POSIX Pthreads

THE ANSI X3H5 MODEL
- Established in 1993
- Has greatly influenced many commercial shared-memory systems
- Defines one conceptual standard programming model and 3 bindings, for C, Fortran 77 and Fortran 90

THE ANSI X3H5 MODEL (cont'd)
- Main features
  - Parallelism constructs
  - Parallel blocks
  - Parallel loop
  - Implicit barrier
  - Support for thread interaction and synchronization

PARALLELISM CONSTRUCTS
- A parallel construct is a pair of parallel and end parallel statements together with the code they enclose
- The program starts in sequential mode with one initial thread (base thread / master thread)
- When the program encounters a parallel, it switches to parallel mode by creating a number of child threads. The team of master thread and child threads executes in parallel until an end parallel
- After the end parallel, the program switches back to sequential mode (only the base thread continues execution)

PARALLEL CONSTRUCTS IN AN X3H5 PROGRAM

    program main
      A                  ! executed by only the base thread
      parallel
        B                ! executed by every thread in the team (parallel mode)
        psections
          section
            C            ! executed by one team member
          section
            D            ! executed by another thread
        end psections
        psingle
          E              ! executed by only one thread
        end psingle
        pdo i=1,6
          F(i)           ! all threads share the 6 iterations of the loop
        end pdo no wait
        G
      end parallel
      H                  ! (sequential mode)
    end

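PARALLEL CONSTRUCTS IN AN X3H5 PROGRAM: ILLUSTRATION
[Figure: execution timeline of the program on three threads P, Q and R. A runs first; B runs on every thread; C, D and E follow; the loop is split into F(1:2), F(3:4) and F(5:6); G runs on every thread; H runs last. The figure marks where an implicit barrier applies and where there is no implicit barrier.]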

OTHER CONSTRUCTS
- Inside a parallel construct, there are
  - Work-sharing constructs
    - Parallel block
    - Parallel loop (pdo ... end pdo)
    - A single process (psingle ... end psingle)
  - Other code, which is executed redundantly by every thread in the team
- Parallel block
  - Consists of many sections (psections ... end psections)
  - Used to specify MPMD parallelism
  - Each section is executed by one team member

OTHER CONSTRUCTS (cont'd)
- Parallel loop (pdo ... end pdo)
  - Used to specify SPMD parallelism
  - The same code is executed by all team members, each on its share of the iterations

OTHER FEATURES OF X3H5
- Implicit barrier
  - At parallel, end parallel, end psections, end pdo and end psingle (use no wait to avoid it)
  - A fence operation forces all memory accesses up to the barrier point to be consistent
- Parallel and work-sharing constructs can be nested
- Support for thread interaction
  - shared/private variables in a parallel construct
  - implicit and explicit barriers
  - 4 types of synchronization objects: latch, lock, event and ordinal
- Support for thread synchronization
  - Lock/event synchronization
  - Critical region and ordinal objects

THE POSIX THREADS (Pthreads) MODEL
- Established by IEEE in 1995
- Functionality and interface are similar to those of Solaris Threads
- Defines a set of primitive routines to manage and synchronize threads
- Uses mutex objects and condition variables for thread synchronization

THE Pthreads MODEL (cont'd)
- Thread management
  - pthread_create
  - pthread_exit
  - pthread_join
  - pthread_self
- Thread synchronization primitives
  - pthread_mutex_init
  - pthread_mutex_destroy
  - pthread_mutex_lock
  - pthread_mutex_trylock
  - pthread_mutex_unlock
  - pthread_cond_init
  - pthread_cond_destroy
  - pthread_cond_wait
  - pthread_cond_timedwait
  - pthread_cond_signal
  - pthread_cond_broadcast

HELLO WORLD PROGRAM: PTHREAD VERSION

    #include <pthread.h>
    #include <stdio.h>

    /* thread body: print the index passed in through arg */
    void *thrfunc(void *arg) {
        printf("hello from thread %d\n", *(int *)arg);
        return NULL;
    } /* end thrfunc */

    int main(void) {
        pthread_t thread[4];
        pthread_attr_t attr;
        int arg[4] = {0, 1, 2, 3};
        int i;

        /* set up joinable threads with system scope */
        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
        pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);

        /* create N threads */
        for (i = 0; i < 4; i++)
            pthread_create(&thread[i], &attr, thrfunc, (void *)&arg[i]);

        /* wait for the N threads to finish */
        for (i = 0; i < 4; i++)
            pthread_join(thread[i], NULL);

        pthread_attr_destroy(&attr);
        return 0;
    } /* end main */

THE OpenMP STANDARD
- An Application Program Interface (API) used to explicitly direct multi-threaded, shared-memory parallelism
- Inherits many concepts from the ANSI X3H5 model
- Three API components
  - Compiler directives
  - Runtime library routines
  - Environment variables
- Portable
  - APIs for C/C++ and Fortran
  - Multiple platforms: most Unix platforms and Windows NT

THE OpenMP STANDARD (cont'd)
- Standardized
  - Jointly proposed by a group of major computer hardware and software vendors
  - Expected to become an ANSI standard
- What does OpenMP stand for?
  - Open specifications for multi-processing
  - Collaborative work with interested parties from the hardware and software industry, government and academia

OpenMP IS NOT...
- Meant for distributed-memory parallel systems by itself
- Implemented identically by all vendors
- Guaranteed to make the most efficient use of shared memory
  - There are no data locality constructs

GOALS OF OpenMP
- Standardization
  - Provide a standard among a variety of shared-memory architectures (platforms)
  - High-level interfaces to thread programming
- Lean and mean
  - A simple and limited set of directives for shared address space programming
  - Just 3 or 4 directives are enough to represent significant parallelism

HELLO WORLD: OpenMP VERSION

    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        /* every thread in the team executes the printf */
        #pragma omp parallel
        printf("hello from thread %d\n", omp_get_thread_num());
        return 0;
    }

GOALS OF OpenMP (cont'd)
- Ease of use
  - Incrementally parallelize a serial program, unlike the all-or-nothing approach of message passing
  - Implement both coarse-grain and fine-grain parallelism
- Portability
  - Fortran (77, 90, and 95), C, and C++
  - Public forum for API and membership

MATRIX MULTIPLICATION: SEQUENTIAL VERSION

    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            temp = 0;
            for (k = 0; k < N; k++)
                temp += a[i][k] * b[k][j];
            c[i][j] = temp;
        }
    }

MATRIX MULTIPLICATION: OPENMP VERSION
Add one directive in front of the outer loop:

    /* i is private automatically as the parallel loop index; j, k and temp
       must be listed as private if they are declared outside the loop */
    #pragma omp parallel for private(j, k, temp) schedule(static)
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            temp = 0;
            for (k = 0; k < N; k++)
                temp += a[i][k] * b[k][j];
            c[i][j] = temp;
        }
    }

PROGRAMMING MODEL
- Thread-based parallelism
  - A shared-memory process with multiple threads
  - Based upon multiple threads in the shared-memory programming paradigm
- Explicit parallelism
  - Explicit (not automatic) programming model
  - Offers the programmer full control over parallelization

PROGRAMMING MODEL (cont'd)
- Fork-join model
  - All OpenMP programs begin as a single sequential process: the master thread
  - Fork at the beginning of parallel constructs
    - The master thread creates a team of parallel threads
    - The statements enclosed by the parallel region construct are executed in parallel
  - Join at the end of parallel constructs
    - The threads synchronize and terminate after completing the statements in the parallel construct
    - Only the master thread continues execution

FORK-JOIN MODEL
[Figure: fork-join model]

PROGRAMMING MODEL (cont'd)
- Compiler directive based
  - Parallelism is specified through compiler directives embedded in C/C++ or Fortran source code
- Nested parallelism support
  - Parallel constructs may include other parallel constructs inside
  - Implementation-dependent
- Dynamic threads
  - Alter the number of threads used to execute parallel regions
  - Implementation-dependent

GENERAL CODE STRUCTURE

    #include <omp.h>

    int main(void) {
        int var1, var2, var3;

        /* Serial code ... */

        /* Beginning of parallel section. Fork a team of threads.
           Specify variable scoping. */
        #pragma omp parallel private(var1, var2) shared(var3)
        {
            /* Parallel section executed by all threads ... */

            /* All threads join the master thread and disband */
        }

        /* Resume serial code ... */
        return 0;
    }

OPENMP COMPONENTS
- Directives
  - Work-sharing constructs
  - Data environment clauses
  - Synchronization constructs
- Runtime libraries
- Environment variables

COMPARISON OF 5 SHARED-MEMORY PROGRAMMING STANDARDS

    Attribute                    X3H5  MPI     Pthreads   HPF     OpenMP
    Scalable                     No    Yes     Sometimes  Yes     Yes
    Fortran binding              Yes   Yes     No         Yes     Yes
    C binding                    Yes   Yes     Yes        No      Planned
    High level                   Yes   No      No         Yes     Yes
    Performance oriented         No    Yes     No         Yes     Yes
    Supports data parallelism    Yes   No      No         Yes     Yes
    Portable                     Yes   Yes     Yes        Yes     Yes
    Vendor support               No    Widely  Unix SMP   Widely  Starting
    Incremental parallelization  Yes   No      No         No      Yes

Courtesy: OpenMP Standards Board, 1997