CS510 Concurrent Systems Class 5 Threads Cannot Be Implemented As a Library.

Slides:



Advertisements
Similar presentations
Threads Cannot be Implemented As a Library Andrew Hobbs.
Advertisements

Global Environment Model. MUTUAL EXCLUSION PROBLEM The operations used by processes to access to common resources (critical sections) must be mutually.
Memory Consistency Models Kevin Boos. Two Papers Shared Memory Consistency Models: A Tutorial – Sarita V. Adve & Kourosh Gharachorloo – September 1995.
CS 162 Memory Consistency Models. Memory operations are reordered to improve performance Hardware (e.g., store buffer, reorder buffer) Compiler (e.g.,
Chapter 6: Process Synchronization
Background Concurrent access to shared data can lead to inconsistencies Maintaining data consistency among cooperating processes is critical What is wrong.
Silberschatz, Galvin and Gagne ©2013 Operating System Concepts – 9 th Edition Chapter 5: Process Synchronization.
5.1 Silberschatz, Galvin and Gagne ©2009 Operating System Concepts with Java – 8 th Edition Chapter 5: CPU Scheduling.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 6: Process Synchronization.
Process Synchronization. Module 6: Process Synchronization Background The Critical-Section Problem Peterson’s Solution Synchronization Hardware Semaphores.
“THREADS CANNOT BE IMPLEMENTED AS A LIBRARY” HANS-J. BOEHM, HP LABS Presented by Seema Saijpaul CS-510.
By Sarita Adve & Kourosh Gharachorloo Review by Jim Larson Shared Memory Consistency Models: A Tutorial.
Chapter 2.3 : Interprocess Communication
1 Sharing Objects – Ch. 3 Visibility What is the source of the issue? Volatile Dekker’s algorithm Publication and Escape Thread Confinement Immutability.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Operating Systems CMPSCI 377 Lecture.
1 Lecture 22: Synchronization & Consistency Topics: synchronization, consistency models (Sections )
Memory Consistency Models Some material borrowed from Sarita Adve’s (UIUC) tutorial on memory consistency models.
© 2009 Matthew J. Sottile, Timothy G. Mattson, and Craig E Rasmussen 1 Concurrency in Programming Languages Matthew J. Sottile Timothy G. Mattson Craig.
CS510 Concurrent Systems Introduction to Concurrency.
Foundations of the C++ Concurrency Memory Model Hans-J. Boehm Sarita V. Adve HP Laboratories UIUC.
COMP 111 Threads and concurrency Sept 28, Tufts University Computer Science2 Who is this guy? I am not Prof. Couch Obvious? Sam Guyer New assistant.
Operating Systems ECE344 Ashvin Goel ECE University of Toronto Mutual Exclusion.
By Sarita Adve & Kourosh Gharachorloo Slides by Jim Larson Shared Memory Consistency Models: A Tutorial.
Shared Memory Consistency Models. SMP systems support shared memory abstraction: all processors see the whole memory and can perform memory operations.
Memory Consistency Models. Outline Review of multi-threaded program execution on uniprocessor Need for memory consistency models Sequential consistency.
Threads Cannot be Implemented as a Library Hans-J. Boehm.
Multiprocessor Cache Consistency (or, what does volatile mean?) Andrew Whitaker CSE451.
Threads cannot be implemented as a library Hans-J. Boehm (presented by Max W Schwarz)
Chapter 6: Process Synchronization. 6.2 Silberschatz, Galvin and Gagne ©2005 Operating System Concepts Module 6: Process Synchronization Background The.
Operating Systems CMPSC 473 Mutual Exclusion Lecture 11: October 5, 2010 Instructor: Bhuvan Urgaonkar.
CS399 New Beginnings Jonathan Walpole. 2 Concurrent Programming & Synchronization Primitives.
CS533 Concepts of Operating Systems Jonathan Walpole.
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science Computer Systems Principles Synchronization Emery Berger and Mark Corner University.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 6: Process Synchronization.
Week 9, Class 3: Java’s Happens-Before Memory Model (Slides used and skipped in class) SE-2811 Slide design: Dr. Mark L. Hornick Content: Dr. Hornick Errors:
CSC 660: Advanced Operating SystemsSlide #1 CSC 660: Advanced OS Synchronization.
Synchronization Questions answered in this lecture: Why is synchronization necessary? What are race conditions, critical sections, and atomic operations?
pThread synchronization
Symmetric Multiprocessors: Synchronization and Sequential Consistency
Lecture 20: Consistency Models, TM
Speculative Lock Elision
Background on the need for Synchronization
Threads Cannot Be Implemented As a Library
Atomic Operations in Hardware
Multithreading Tutorial
Atomic Operations in Hardware
CS533 Concepts of Operating Systems Class 3
Chapter 5: Process Synchronization
Threads and Memory Models Hal Perkins Autumn 2011
Symmetric Multiprocessors: Synchronization and Sequential Consistency
The C++ Memory model Implementing synchronization)
Symmetric Multiprocessors: Synchronization and Sequential Consistency
Implementing synchronization
Multithreading Tutorial
Threads and Memory Models Hal Perkins Autumn 2009
Concurrency: Mutual Exclusion and Process Synchronization
Chapter 6: Process Synchronization
Multithreading Tutorial
CS533 Concepts of Operating Systems Class 3
Multithreading Tutorial
Kernel Synchronization II
CSE 451: Operating Systems Autumn 2003 Lecture 7 Synchronization
CSE 451: Operating Systems Autumn 2005 Lecture 7 Synchronization
CSE 451: Operating Systems Winter 2003 Lecture 7 Synchronization
CSE 153 Design of Operating Systems Winter 19
CS333 Intro to Operating Systems
Chapter 6: Synchronization Tools
Relaxed Consistency Finale
Programming with Shared Memory Specifying parallelism
Atomicity, Mutex, and Locks
Presentation transcript:

CS510 Concurrent Systems Class 5 Threads Cannot Be Implemented As a Library

Reasoning about program outcomes  What are the valid outcomes for this program? o Is it valid for both r1 and r2 to contain 0 ? CS510 - Concurrent Systems 2 Thread 1 x = 1; r1 = y; Thread 2 y = 1; r2 = x; Initialization:x = y = 0;

Sequential consistency  If your intuition is based on sequential consistency, you may think that r1 = r2 = 0 is invalid  But sequential consistency does not hold on most (any?) modern architectures o Compilers may reorder memory operations under the (invalid) assumption that the program is sequential! o Hardware may reorder memory operations, as we saw in the last class You are expected to use memory barriers if this hardware reordering would be problematic for your program! CS510 - Concurrent Systems 3

4 The problem  Languages such as C and C++ do not support concurrency o C/C++ compilers do not implement it o Threads are implemented in library routines (i.e. Pthreads) o The compiler, unaware of threads, reorders operations as if it was dealing with a sequential program o Some of this reordering is invalid for concurrent programs  But C programs with Pthreads work just fine … don’t they?

The Pthreads solution  Pthreads defines synchronization primitives o Pthread_mutex_lock o Pthread_mutex_unlock  Programmers must use them to prevent concurrent access to memory o Define critical sections and protect them with mutex locks o Do not allow “data races” in the program CS510 - Concurrent Systems 5

Pthreads synchronization primitives  Lock and unlock must be implemented carefully because they themselves contain data races! o Use memory barriers to prevent hardware reordering o Disable compiler optimizations to prevent compiler reordering  This seems ok, doesn’t it? o Lock and unlock are implemented carefully o The code between lock and unlock is executed sequentially due to mutual exclusion enforced by locks o The code outside the critical section has no races CS510 - Concurrent Systems 6

The catch  The programmer needs to determine where there is a race in order to place the locking primitives  How can the programmer tell whether there is a data race? o Need to reason about ordering of memory operations to determine if there is a race o This requires the programming language to formally define a memory model C / C++ do not! Without a memory model, how can you tell if there is a race? CS510 - Concurrent Systems 7

Example  Does this program contain a race? o Is it possible for either x or y to be modified? o Is x == y == 1 a possible outcome? CS510 - Concurrent Systems 8 Initially: x = y = 0; if (x == 1) ++y; if (y == 1) ++x;

Valid compiler transformations  Assuming sequential consistency, there is no race o But sequential consistency is a bad assumption!  Also, a compiler could legally transform CS510 - Concurrent Systems 9 if (x == 1) ++y; if (y == 1) ++x; ++y; if (x != 1) --y; ++x; if (y != 1) --x; to This transformation produces a race! How can the compiler know not to do this?

The programmer’s problem  Programmer needs to know the constraints on compiler or hardware memory reordering in order to determine whether there is a race o Then needs to prevent the race by using the mutual exclusion primitives o Some hardware defines a formal memory consistency model o But the C programming language doesn’t define a formal memory model, so its not possible to answer the first question with confidence in all cases! CS510 - Concurrent Systems 10

The compiler developer’s problem  The compiler needs to know about concurrency in order to know that its not OK to use certain sequential optimizations o Code transformations that are valid for sequential programs but not concurrent ones o Alternatively, the compiler must be conservative all the time, leading to performance degradation for sequential programs CS510 - Concurrent Systems 11

Another example  Rewriting adjacent data (bit fields)  If thread 1 updates x.a and thread 2 updates x.b, is there a race? CS510 - Concurrent Systems 12 struct { int a:17; int b:15 } x;

Another example  Rewriting adjacent data (bit fields) CS510 - Concurrent Systems 13 struct { int a:17; int b:15 } x; x.a = 42 possibly implemented by the compiler as … { tmp = x; // Read both fields into a // 32 bit temporary variable tmp &= ~0x1ffff; // Mask off old a; tmp |= 42;// Or in new value x = tmp; // Overwrite all of x }

The programmer’s problem  An update to x.a also updates x.b by side effect! o There appears to be no race in the program, but this transparent compiler optimization creates one o Mutual exclusion primitives should have been used! o … but a separate lock for x.a and x.b will not do either!  If x.a and x.b are protected using separate locks a thread updating only x.a would need to also acquire the lock for x.b o How is the programmer to know this? CS510 - Concurrent Systems 14

The compiler developer’s problem  How can you know that this optimization (which is fine for a sequential program) is not ok? o If your language has no concept of concurrency?  The granularity of memory operations (bit, byte, word etc) is architecture specific o Programs that need to know this kind of detail are not portable! o If the language defined a minimum granularity for atomic updates, what should it be? CS510 - Concurrent Systems 15

Register promotion  Register promotion can also introduce updates where there were non before. o Example: conditional locking for multithreaded execution: CS510 - Concurrent Systems 16 for (...) {... if (mt) pthread_mutex_lock (...); x =... x... if (mt) pthread_mutex_unlock (...); }

Register promotion  Register promotion can also introduce updates where there were non before. o Example: conditional locking for multithreaded execution: Transformed to CS510 - Concurrent Systems 17

Register promotion  Register promotion can also introduce updates where there were non before. o Example: conditional locking for multithreaded execution: CS510 - Concurrent Systems 18 r = x; for (...) {... if (mt) { x = r; pthread_mutex_lock (...); r = x; } r =... r... if (mt) { x = r; pthread_mutex_unlock (...); r = x; }

The problem  The variable x is now accessed outside the critical section  The compiler caused this problem (i.e., broke the program) o but how is the compiler supposed to know this is a mistake if it is unaware of concurrency and hence unaware of critical sections?  Bottom line: the memory abstraction is under- defined! CS510 - Concurrent Systems 19

Other problems  Sticking to the Pthreads rules prevents important performance optimizations o not just in the compiler  It also makes programs based on non-blocking synchronization illegal! o NBS programs contain data races, by definition!  Even with a formal memory model, how would we extend the Pthreads approach to cover NBS? CS510 - Concurrent Systems 20