INTEL CONFIDENTIAL
Confronting Race Conditions
Introduction to Parallel Programming – Part 6

Copyright © 2009, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. * Other brands and names are the property of their respective owners.

Review & Objectives
Previously:
- Described how to add OpenMP pragmas to programs that have suitable blocks of code or for loops
- Demonstrated how to use private and reduction clauses
At the end of this part you should be able to:
- Give practical examples of ways that threads may contend for shared resources
- Describe what race conditions are and explain how to eliminate them in OpenMP code

Motivating Example

    double area, pi, x;
    int i, n;
    ...
    area = 0.0;
    for (i = 0; i < n; i++) {
        x = (i + 0.5)/n;
        area += 4.0/(1.0 + x*x);
    }
    pi = area / n;

What happens when we make the for loop parallel?

Race Condition
- A race condition is nondeterministic behavior caused by the order in which two or more threads access a shared variable
- For example, suppose both Thread 1 and Thread 2 are executing the statement

      area += 4.0 / (1.0 + x*x);

One Timing: Correct Sum
[Figure: timeline with columns Value of area, Thread 1, Thread 2]

Another Timing: Incorrect Sum
[Figure: timeline with columns Value of area, Thread 1, Thread 2]
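
The two timings can be replayed deterministically by splitting `area += inc` into explicit load, add, and store steps. The following is a minimal sketch, not the figures from the slides: the function names and the sample values are hypothetical, and everything runs in a single thread so that the ordering of steps is fixed.

```c
/* Replays "area += inc" for two threads as explicit load/add/store steps.
   The "threads" are only a fixed ordering of steps (an assumption made
   for illustration), so the outcome is deterministic. */

/* Timing 1: Thread 1 finishes its whole update before Thread 2 starts. */
double update_serialized(double area, double inc1, double inc2)
{
    double t1 = area;   /* Thread 1 loads area */
    t1 += inc1;         /* Thread 1 adds */
    area = t1;          /* Thread 1 stores */

    double t2 = area;   /* Thread 2 loads the updated area */
    t2 += inc2;         /* Thread 2 adds */
    area = t2;          /* Thread 2 stores */
    return area;
}

/* Timing 2: both threads load area before either one stores. */
double update_interleaved(double area, double inc1, double inc2)
{
    double t1 = area;   /* Thread 1 loads area */
    double t2 = area;   /* Thread 2 loads the same, now stale, value */
    t1 += inc1;
    t2 += inc2;
    area = t1;          /* Thread 1 stores */
    area = t2;          /* Thread 2 overwrites it: Thread 1's update is lost */
    return area;
}
```

Starting from 10.0 with increments 3.0 and 4.0, the serialized timing yields 17.0, while the interleaved timing yields 14.0 because one increment is overwritten.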

Another Race Condition Example

    struct Node {
        int data;
        struct Node *next;
    };

    struct List {
        struct Node *head;
    };

    void AddHead (struct List *list, struct Node *node) {
        node->next = list->head;
        list->head = node;
    }

[Figure: a Node with fields data and next; a List with field head]

Original Singly-Linked List
[Figure: list with its head pointer; node_a with data and next fields]

Thread 1 after Stmt. 1 of AddHead
[Figure: list, node_a, node_b]
Thread 1 is adding node_b. After statement 1, node_b->next refers to the old head, and list->head is unchanged.

Thread 2 Executes AddHead
[Figure: list, node_a, node_b, node_c]
Thread 2 runs both statements for node_c, so list->head now refers to node_c.

Thread 1 After Stmt. 2 of AddHead
[Figure: list, node_a, node_b, node_c]
Thread 1 sets list->head to node_b, whose next still refers to the old head. node_c is no longer reachable from the list.
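
The interleaving in this sequence can be replayed step by step in a single thread. The sketch below assumes Thread 1 inserts node_b and Thread 2 inserts node_c, as the slide titles suggest; `list_length` and `replay_lost_insert` are hypothetical helper names, not part of the original deck.

```c
#include <stddef.h>

struct Node { int data; struct Node *next; };
struct List { struct Node *head; };

/* Counts the nodes reachable from list->head. */
int list_length(const struct List *list)
{
    int n = 0;
    for (const struct Node *p = list->head; p != NULL; p = p->next)
        n++;
    return n;
}

/* Replays the unlucky interleaving sequentially and returns the number
   of nodes that are still reachable afterwards. */
int replay_lost_insert(void)
{
    struct Node node_a = { 1, NULL };
    struct Node node_b = { 2, NULL };
    struct Node node_c = { 3, NULL };
    struct List list = { &node_a };

    node_b.next = list.head;  /* Thread 1, statement 1: node_b -> node_a */

    node_c.next = list.head;  /* Thread 2, statement 1: node_c -> node_a */
    list.head = &node_c;      /* Thread 2, statement 2: head -> node_c */

    list.head = &node_b;      /* Thread 1, statement 2: head -> node_b,
                                 so node_c is no longer reachable */
    return list_length(&list);
}
```

Three nodes exist, but only two (node_b and node_a) can be reached from the list after this ordering of steps.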

Why Race Conditions Are Nasty
- Programs with race conditions exhibit nondeterministic behavior
  - Sometimes they give the correct result
  - Sometimes they give an erroneous result
- Programs often work correctly on trivial data sets and small numbers of threads
- Errors are more likely to occur as the number of threads or the execution time increases
- Hence debugging race conditions can be difficult

How to Avoid Race Conditions
- Scope variables to be private to threads
  - Use OpenMP private clause
  - Variables declared within threaded functions
  - Allocate on thread's stack (pass as parameter)
- Control shared access with critical region
  - Mutual exclusion and synchronization
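
As a sketch of the first approach applied to the pi computation from the motivating example: `x` is declared inside the loop body so each thread gets its own copy, and `area` is combined with the reduction clause covered in the previous part. The function name `compute_pi` is an assumption made for illustration.

```c
#include <assert.h>

/* Midpoint-rule estimate of pi, following the motivating example.
   Without OpenMP enabled, the pragma is ignored and the loop runs serially. */
double compute_pi(int n)
{
    assert(n > 0);
    double area = 0.0;
    int i;
    #pragma omp parallel for reduction(+:area)
    for (i = 0; i < n; i++) {
        double x = (i + 0.5) / n;     /* private: declared inside the loop */
        area += 4.0 / (1.0 + x * x);  /* each thread adds to its own copy of area */
    }
    return area / n;                  /* per-thread copies were summed at loop end */
}
```

With this scoping no thread ever writes a variable that another thread reads during the loop, so no critical region is needed.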

Mutual Exclusion
- We can prevent the race conditions described earlier by ensuring that only one thread at a time references and updates a shared variable or data structure
- Mutual exclusion refers to a kind of synchronization that allows only a single thread or process at a time to have access to a shared resource
- Mutual exclusion is implemented using some form of locking

Critical Regions
- A critical region is a portion of code that threads execute in a mutually exclusive fashion
- The critical pragma in OpenMP immediately precedes a statement or block representing a critical section
- Good news: critical regions eliminate race conditions
- Bad news: critical regions are executed sequentially
- More bad news: you have to identify critical regions yourself
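
A minimal sketch of a critical region, again using the pi computation: the shared update of `area` is protected, while the per-iteration term is computed outside the region so that only the addition is serialized. The name `compute_pi_critical` is hypothetical.

```c
#include <assert.h>

/* Midpoint-rule estimate of pi with the shared sum guarded by a
   critical region instead of a reduction. */
double compute_pi_critical(int n)
{
    assert(n > 0);
    double area = 0.0;
    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        double x = (i + 0.5) / n;           /* private to each thread */
        double term = 4.0 / (1.0 + x * x);  /* computed outside the critical region */
        #pragma omp critical
        area += term;                       /* only one thread at a time runs this */
    }
    return area / n;
}
```

The result is correct, but every iteration passes through the critical region, which illustrates the "bad news" above: that part of the loop runs sequentially.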

Motivating Example

    void AddHead (struct List *list, struct Node *node) {
        node->next = list->head;
        list->head = node;
    }

Motivating Example

    void AddHead (struct List *list, struct Node *node) {
        node->next = list->head;
        #pragma omp critical
        list->head = node;
    }
[Animation frames: list; Thread 1 with its node; then Thread 2 with its node]

Protect All References to Shared Data
- You must protect both read and write accesses to any shared data
- For the AddHead() function, both lines need to be protected

Corrected Example

    void AddHead (struct List *list, struct Node *node) {
        #pragma omp critical
        {
            node->next = list->head;
            list->head = node;
        }
    }
[Animation frames: list; Thread 1 with its node; Thread 2 with its node]
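
One way to exercise the corrected function is to insert nodes from a parallel loop and count them afterwards. The following is a sketch under assumptions: `insert_and_count` is a hypothetical driver, and the nodes come from a single array purely to keep the example short.

```c
#include <stddef.h>
#include <stdlib.h>

struct Node { int data; struct Node *next; };
struct List { struct Node *head; };

/* Corrected version: the read and the write of list->head are protected together. */
void AddHead(struct List *list, struct Node *node)
{
    #pragma omp critical
    {
        node->next = list->head;
        list->head = node;
    }
}

/* Inserts n nodes from a parallel loop, then counts them serially.
   Returns the count, or -1 if allocation fails. */
int insert_and_count(int n)
{
    if (n <= 0)
        return 0;
    struct List list = { NULL };
    struct Node *nodes = malloc((size_t)n * sizeof *nodes);
    if (nodes == NULL)
        return -1;

    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        nodes[i].data = i;          /* each iteration touches only its own node */
        AddHead(&list, &nodes[i]);  /* shared list: protected inside AddHead */
    }

    int count = 0;
    for (struct Node *p = list.head; p != NULL; p = p->next)
        count++;
    free(nodes);
    return count;
}
```

With both statements inside the critical region, every inserted node stays reachable, so the count equals n regardless of the number of threads.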

Important: Lock Data, Not Code
- Locks should be associated with data objects
- Different data objects should have different locks
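
A sketch of what associating a lock with each data object might look like, using the OpenMP lock routines (`omp_init_lock`, `omp_set_lock`, `omp_unset_lock`, `omp_destroy_lock`). The `LockedList` type and the function names are hypothetical, and the serial fallback stubs are an assumption added so that the sketch also builds without OpenMP.

```c
#include <stddef.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback (assumption): no-op stand-ins for the lock routines. */
typedef int omp_lock_t;
static void omp_init_lock(omp_lock_t *l)    { *l = 0; }
static void omp_set_lock(omp_lock_t *l)     { *l = 1; }
static void omp_unset_lock(omp_lock_t *l)   { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

struct Node { int data; struct Node *next; };

/* Each list carries its own lock, so inserts into different lists
   do not wait on each other. */
struct LockedList {
    struct Node *head;
    omp_lock_t lock;
};

void AddHeadLocked(struct LockedList *list, struct Node *node)
{
    omp_set_lock(&list->lock);    /* lock this list only */
    node->next = list->head;
    list->head = node;
    omp_unset_lock(&list->lock);
}

static int count_nodes(const struct LockedList *list)
{
    int n = 0;
    for (const struct Node *p = list->head; p != NULL; p = p->next)
        n++;
    return n;
}

/* Puts even values on one list and odd values on another, in parallel.
   Returns 0 on success, -1 if allocation fails. */
int split_by_parity(int n, int *even_count, int *odd_count)
{
    struct LockedList even = { NULL }, odd = { NULL };
    struct Node *nodes = malloc((size_t)(n > 0 ? n : 1) * sizeof *nodes);
    if (nodes == NULL)
        return -1;
    omp_init_lock(&even.lock);
    omp_init_lock(&odd.lock);

    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        nodes[i].data = i;
        AddHeadLocked(i % 2 == 0 ? &even : &odd, &nodes[i]);
    }

    *even_count = count_nodes(&even);
    *odd_count = count_nodes(&odd);
    omp_destroy_lock(&even.lock);
    omp_destroy_lock(&odd.lock);
    free(nodes);
    return 0;
}
```

By contrast, an unnamed `critical` region around the body of AddHead makes all lists share one lock, because the protection is tied to the code rather than to the list being modified.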

OpenMP atomic Construct
- Special case of a critical section to ensure atomic update to memory location
- Applies only to simple operations:
  - pre- or post-increment (++)
  - pre- or post-decrement (--)
  - assignment with binary operator (of scalar types)
- Works on a single statement

      #pragma omp atomic
      counter += 5;
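
A small sketch of the atomic construct on a shared counter. The function `count_negative` is a hypothetical example; a reduction would usually be the more natural choice for a sum like this, and atomic is used here only to show the construct.

```c
#include <assert.h>

/* Counts how many entries of data[] are negative, using an atomic
   increment on the shared counter. */
int count_negative(const int *data, int n)
{
    assert(n >= 0);
    int counter = 0;
    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        if (data[i] < 0) {
            #pragma omp atomic
            counter++;          /* single-statement update: eligible for atomic */
        }
    }
    return counter;
}
```

Only the increment itself is protected; the test `data[i] < 0` runs in parallel because it reads data that no thread modifies.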

Critical vs. Atomic

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        #pragma omp critical
        x[index[i]] += WorkOne(i);
        y[i] += WorkTwo(i);
    }

critical protects:
- Call to WorkOne()
- Finding value of index[i]
- Addition of x[index[i]] and result of WorkOne()
- Assignment to x array element
Essentially, updates to elements in the x array are serialized

    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        #pragma omp atomic
        x[index[i]] += WorkOne(i);
        y[i] += WorkTwo(i);
    }

atomic protects:
- Addition and assignment to x array element
Non-conflicting updates will be done in parallel
Protection needed only if there are two threads where the index[i] values match