INTEL CONFIDENTIAL Confronting Race Conditions Introduction to Parallel Programming – Part 6.

Presentation transcript:

INTEL CONFIDENTIAL Confronting Race Conditions Introduction to Parallel Programming – Part 6

Review & Objectives

Previously:
Described how to add OpenMP pragmas to programs that have suitable blocks of code or for loops
Demonstrated how to use private and reduction clauses

At the end of this part you should be able to:
Give practical examples of ways that threads may contend for shared resources
Describe what race conditions are and explain how to eliminate them in OpenMP code

Motivating Example

double area, pi, x;
int i, n;
...
area = 0.0;
for (i = 0; i < n; i++) {
    x = (i + 0.5)/n;
    area += 4.0/(1.0 + x*x);
}
pi = area / n;

What happens when we make the for loop parallel?
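The transcript does not show the parallel version this question refers to. As a hedged sketch (the wrapper name compute_pi_racy is invented, not from the slides), simply adding a parallel for pragma leaves area and x shared, which is exactly the race condition examined on the following slides:

/* Naive parallelization -- INCORRECT: area and x default to shared. */
double compute_pi_racy(int n)
{
    double area = 0.0, x;
    int i;

    #pragma omp parallel for           /* i is private, but area and x are shared */
    for (i = 0; i < n; i++) {
        x = (i + 0.5) / n;             /* threads overwrite each other's x */
        area += 4.0 / (1.0 + x*x);     /* unsynchronized read-modify-write on area */
    }
    return area / n;
}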

Race Condition

A race condition is nondeterministic behavior caused by the order in which two or more threads access a shared variable.

For example, suppose both Thread 1 and Thread 2 are executing the statement

area += 4.0 / (1.0 + x*x);

One Timing → Correct Sum

[Slide table: the value of area over time as Thread 1 and Thread 2 take turns; with this interleaving the final sum is correct.]

Another Timing → Incorrect Sum

[Slide table: the value of area over time as Thread 1 and Thread 2 interleave; with this interleaving one update is lost and the final sum is incorrect.]
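Because the two timing tables do not survive in this transcript, the following stand-alone sketch (not part of the original slides; all names are invented) makes the same lost-update effect observable with an unprotected shared counter:

#include <stdio.h>

int main(void)
{
    const int N = 1000000;
    long total = 0;

    #pragma omp parallel for           /* total is shared and unprotected */
    for (int i = 0; i < N; i++) {
        total += 1;                    /* load, add, store: not a single atomic step */
    }

    printf("expected %d, got %ld\n", N, total);
    return 0;
}

Built with OpenMP enabled (for example, gcc -fopenmp) and run with several threads, the printed total is usually smaller than N and varies from run to run; with one thread, or with a reduction clause or critical construct protecting total, it always matches.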

Another Race Condition Example

struct Node {
    int data;
    struct Node *next;
};

struct List {
    struct Node *head;
};

void AddHead (struct List *list, struct Node *node)
{
    node->next = list->head;
    list->head = node;
}

[Slide diagram: a List whose head field points to a Node with data and next fields.]

Original Singly-Linked List

void AddHead (struct List *list, struct Node *node)
{
    node->next = list->head;
    list->head = node;
}

[Slide diagram: list->head points to node_a.]

Thread 1 after Stmt. 1 of AddHead

(The same AddHead code is shown on the slide.)

[Slide diagram: Thread 1 is adding node_b; after statement 1, node_b->next points to node_a while list->head still points to node_a.]

Thread 2 Executes AddHead

(The same AddHead code is shown on the slide.)

[Slide diagram: Thread 2 adds node_c, executing both statements: node_c->next points to node_a and list->head now points to node_c.]

Thread 1 After Stmt. 2 of AddHead

(The same AddHead code is shown on the slide.)

[Slide diagram: Thread 1 now executes statement 2, so list->head points to node_b; node_c is no longer reachable from the list.]
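As a hedged, self-contained reconstruction of this scenario (the harness, the num_threads choice, and all names are illustrative, not from the slides), two threads calling AddHead on the same list at nearly the same time can lose one of the nodes:

#include <stdio.h>
#include <omp.h>

struct Node { int data; struct Node *next; };
struct List { struct Node *head; };

static void AddHead(struct List *list, struct Node *node)
{
    node->next = list->head;           /* statement 1 */
    list->head = node;                 /* statement 2 */
}

int main(void)
{
    struct List list = { NULL };
    struct Node a = { 1, NULL }, b = { 2, NULL };

    #pragma omp parallel num_threads(2)
    {
        if (omp_get_thread_num() == 0)
            AddHead(&list, &a);
        else
            AddHead(&list, &b);
    }

    int count = 0;
    for (struct Node *p = list.head; p != NULL; p = p->next)
        count++;
    printf("nodes in list: %d (expected 2)\n", count);
    return 0;
}

Because the vulnerable window is only two statements wide, the loss may appear only occasionally; running the program many times makes it easier to observe.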

Why Race Conditions Are Nasty

Programs with race conditions exhibit nondeterministic behavior:
Sometimes they give the correct result
Sometimes they give an erroneous result
Programs often work correctly on trivial data sets and with a small number of threads
Errors are more likely to occur as the number of threads and/or the execution time increases
Hence debugging race conditions can be difficult

How to Avoid Race Conditions

Scope variables to be private to threads:
Use the OpenMP private clause
Variables declared within threaded functions
Allocate on thread's stack (pass as parameter)

Control shared access with a critical region:
Mutual exclusion and synchronization
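A minimal sketch applying both ideas to the pi loop from the earlier slide (the wrapper name compute_pi is invented; the reduction clause from the previous part would avoid serializing the update, but this version shows private scoping plus a critical region):

double compute_pi(int n)
{
    double area = 0.0, x;
    int i;

    #pragma omp parallel for private(x)    /* each thread gets its own x */
    for (i = 0; i < n; i++) {
        x = (i + 0.5) / n;
        #pragma omp critical                /* one thread at a time updates area */
        area += 4.0 / (1.0 + x*x);
    }
    return area / n;
}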

Mutual Exclusion

We can prevent the race conditions described earlier by ensuring that only one thread at a time references and updates a shared variable or data structure.
Mutual exclusion refers to a kind of synchronization that allows only a single thread or process at a time to have access to a shared resource.
Mutual exclusion is implemented using some form of locking.
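The slides express mutual exclusion with the critical pragma, but the "form of locking" can also be made explicit with OpenMP's lock routines. The sketch below is one assumed way of writing AddHead that way; the lock field added to List and the function names are not from the slides:

#include <omp.h>

struct Node { int data; struct Node *next; };
struct List { struct Node *head; omp_lock_t lock; };   /* the lock lives with the data */

void ListInit(struct List *list)
{
    list->head = NULL;
    omp_init_lock(&list->lock);
}

void AddHeadLocked(struct List *list, struct Node *node)
{
    omp_set_lock(&list->lock);       /* acquire: only one thread proceeds */
    node->next = list->head;
    list->head = node;
    omp_unset_lock(&list->lock);     /* release */
}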

Critical Regions

A critical region is a portion of code that threads execute in a mutually exclusive fashion.
The critical pragma in OpenMP immediately precedes a statement or block representing a critical section.
Good news: critical regions eliminate race conditions.
Bad news: critical regions are executed sequentially.
More bad news: you have to identify critical regions yourself.

Motivating Example

void AddHead (struct List *list, struct Node *node)
{
    node->next = list->head;
    list->head = node;
}

Motivating Example

void AddHead (struct List *list, struct Node *node)
{
    node->next = list->head;
    #pragma omp critical
    list->head = node;
}

Motivating Example

(The same AddHead code, with #pragma omp critical protecting only the second statement, is shown on the slide.)

[Slide diagram: Thread 1 calls AddHead with its node.]

Motivating Example

(The same partially protected AddHead code is shown on the slide.)

[Slide diagram: Thread 1 has executed the first statement, pointing its node at the current head of the list.]

Motivating Example

(The same partially protected AddHead code is shown on the slide.)

[Slide diagram: Thread 2 also calls AddHead with its own node.]

Motivating Example

(The same partially protected AddHead code is shown on the slide.)

[Slide diagram: both threads have pointed their nodes at the same head before either update inside the critical region takes effect.]

Motivating Example

(The same partially protected AddHead code is shown on the slide.)

[Slide diagram: after both threads update list->head, only one of the two nodes remains reachable; protecting a single statement is not enough.]

Protect All References to Shared Data

You must protect both read and write accesses to any shared data.
For the AddHead() function, both lines need to be protected.

Corrected Example

void AddHead (struct List *list, struct Node *node)
{
    #pragma omp critical
    {
        node->next = list->head;
        list->head = node;
    }
}

Corrected Example

(The same corrected AddHead code is shown on the slide.)

[Slide diagram: Thread 1 enters the critical region with its node.]

Corrected Example

(The same corrected AddHead code is shown on the slide.)

[Slide diagram: Thread 2 arrives with its own node but must wait while Thread 1 is inside the critical region.]

Corrected Example

(The same corrected AddHead code is shown on the slide.)

[Slide diagram: Thread 1 finishes both statements and leaves the critical region; its node is now at the head of the list.]

Corrected Example

(The same corrected AddHead code is shown on the slide.)

[Slide diagram: Thread 2 then enters the critical region and adds its node; both nodes end up in the list.]

Important: Lock Data, Not Code

Locks should be associated with data objects
Different data objects should have different locks
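A hedged illustration of this advice (assuming the List type with an omp_lock_t field from the earlier sketch; the function and critical-section names are invented): a critical region, even a named one, is attached to the code, so calls operating on two unrelated lists still serialize against each other, whereas a lock stored in each List makes only threads touching the same list contend:

/* Lock tied to code: every list that goes through this function shares one lock. */
void AddHeadCritical(struct List *list, struct Node *node)
{
    #pragma omp critical(list_update)
    {
        node->next = list->head;
        list->head = node;
    }
}

/* Lock tied to data: only threads using the SAME list contend for its lock. */
void AddHeadPerList(struct List *list, struct Node *node)
{
    omp_set_lock(&list->lock);
    node->next = list->head;
    list->head = node;
    omp_unset_lock(&list->lock);
}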

OpenMP atomic Construct

A special case of a critical section that ensures an atomic update to a memory location.
Applies only to simple operations:
pre- or post-increment (++)
pre- or post-decrement (--)
assignment with a binary operator (of scalar types)
Works on a single statement:

#pragma omp atomic
counter += 5;

Critical vs. Atomic

With critical:

#pragma omp parallel for
for (i = 0; i < n; i++) {
    #pragma omp critical
    x[index[i]] += WorkOne(i);
    y[i] += WorkTwo(i);
}

critical protects:
Call to WorkOne()
Finding value of index[i]
Addition of x[index[i]] and the result of WorkOne()
Assignment to the x array element
Essentially, updates to elements in the x array are serialized

With atomic:

#pragma omp parallel for
for (i = 0; i < n; i++) {
    #pragma omp atomic
    x[index[i]] += WorkOne(i);
    y[i] += WorkTwo(i);
}

atomic protects:
Addition and assignment to the x array element
Non-conflicting updates will be done in parallel
Protection is needed only if two threads have matching index[i] values

References

Andrea C. Arpaci-Dusseau and Remzi H. Arpaci-Dusseau, "Deadlock," CS 537: Introduction to Operating Systems, Computer Sciences Department, University of Wisconsin-Madison.
Jim Beveridge and Robert Wiener, Multithreading Applications in Win32, Addison-Wesley (1997).
Richard H. Carver and Kuo-Chung Tai, Modern Multithreading: Implementing, Testing, and Debugging Java and C++/Pthreads/Win32 Programs, Wiley-Interscience (2006).
Michael J. Quinn, Parallel Programming in C with MPI and OpenMP, McGraw-Hill (2004).
Brent E. Rector and Joseph M. Newcomer, Win32 Programming, Addison-Wesley (1997).
N. Wirth, Programming in Modula-2, Springer (1985).