
Mutual Exclusion

Overview
Concurrent programming and race conditions
Mutual exclusion
Implementing mutual exclusion
Deadlocks, starvation, livelock

Concurrent Programming
Programming with two or more threads that cooperate to perform a common task. Threads cooperate by sharing data via a shared address space. What types of data/variables are shared? The shared variables are globals and heap variables.
Problems:
Race conditions: e.g., two threads T1 and T2 read and update the same variable, so access to the variable must be exclusive (i.e., one thread at a time).
Synchronization: e.g., T1 initializes a variable and T2 runs after the variable is initialized, so the ordering between T1 and T2 must be enforced.

Race Condition Example
What thread interleaving would lead to problems?

worker() { ...; counter = counter + 1; ... }

Dump of assembler code for function worker (both Thread 1 and Thread 2 run this code):
0x00401398 <+15>: mov 0x406018,%eax  ; 1. read counter from memory
0x0040139d <+20>: add $0x1,%eax      ; 2. increment register
0x004013a0 <+23>: mov %eax,0x406018  ; 3. write counter to memory

Say Thread 1 starts executing and completes step 1. Before it can complete step 3, assume that a thread switch occurs and Thread 2 starts running and executes its step 1. Then both threads have read the old value of counter, so only one of the increments will be recorded, i.e., one of the increments will be lost.
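To make the lost-update interleaving easy to reproduce, here is a minimal sketch using POSIX threads (the loop count and two-thread setup are illustrative, not from the slides):

#include <pthread.h>
#include <stdio.h>

int counter = 0;  /* shared: a global variable */

void *worker(void *arg) {
    for (int i = 0; i < 1000000; i++)
        counter = counter + 1;  /* read, increment, write: not atomic */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %d\n", counter);  /* expect 2000000; typically prints less */
    return 0;
}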

Why do Races Occur?
The result depends on the timing of thread execution: some execution sequences lead to unexpected results. (Ri is a read of the counter value, Wi is a write of the counter value, CS is a context switch.)
How can we avoid this problem? We need to ensure that the operation appears atomic; see the next slide.
A special kind of race condition is a data race: two threads access a shared variable concurrently, at least one access is a write, and the threads do not use any synchronization to control access to the variable. The example on the previous slide is a data race; race conditions can be of other types as well.

Atomicity and Mutual Exclusion
We need to ensure that reading and updating the counter is an atomic operation. An operation is atomic if it appears to occur instantaneously to the rest of the system: the operation appears indivisible, so the rest of the system observes either none of the effects of the operation or all of them.
One way to ensure atomicity is to ensure that only one thread can read and update the counter at a time. This is called mutual exclusion. The code region on which mutual exclusion is enforced is called a critical section. Accesses to shared variables must be done in critical sections.
We say that an operation appears to occur instantaneously because it doesn't actually occur instantaneously: other threads can run while the operation is in progress, but they should not observe any intermediate state of the operation.

Mutex Lock Abstraction
A mutex lock helps ensure mutual exclusion:
mutex = lock_create(): create a free lock, called mutex.
lock_destroy(mutex): destroy the mutex lock.
lock(mutex): acquire the lock if it is free; otherwise wait (or sleep) until it can be acquired. The lock is now acquired.
unlock(mutex): release the lock; if there are waiting threads, wake up one of them. The lock is now free.
The critical section is accessed between lock and unlock.
A toilet is a critical section! You don't want any races there. You acquire the lock by closing the door, and release the lock by opening the door.

Mutex Locks
[Figure: a timeline of a mutex alternating between acquired (from lock to unlock) and free states.]

Using a Mutex Lock

// counter and lock are located in shared address space
int counter;
struct lock *l; // same lock used by both threads

Thread 1 and Thread 2 both run:
while (1) {
    lock(l);
    // critical section
    counter++;
    unlock(l);
    // remainder section
}

Using locks: it is the programmer's responsibility to add locks to critical sections. The code would not work if we added two different locks, e.g., if T1 used lock L1 and T2 used lock L2. What if T2 performs "counter--"? Should we use the same lock l for both threads or not? Yes, we should, because we want to protect both sections of code. In general, it is important to think about the shared data structures (globals and heap) you are trying to protect, create a single lock for a specific data structure (or component of a data structure, e.g., a linked-list item) that can be read and modified by multiple threads concurrently, and then use that lock in all code that accesses the data structure.
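The same pattern with a real lock API, sketched with a POSIX pthread_mutex (an assumption; the slides use a course-specific struct lock):

#include <pthread.h>

int counter = 0;                               /* shared data */
pthread_mutex_t l = PTHREAD_MUTEX_INITIALIZER; /* the same lock used by both threads */

void *worker(void *arg) {
    for (int i = 0; i < 1000000; i++) {
        pthread_mutex_lock(&l);   /* enter critical section */
        counter++;
        pthread_mutex_unlock(&l); /* leave critical section */
    }
    return NULL;
}

With the same main() as the earlier race sketch, the final count is now always 2000000.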

Mutual Exclusion Conditions
No two threads may be simultaneously in the critical section.
No assumptions may be made about the speed of thread execution.
No thread running outside its critical section may block another thread. Why? This would be bad for performance: presumably a thread outside the critical section is not doing anything critical, and so should not stop a thread in a critical section. Also, if a thread in a critical section is blocked, then all threads waiting to enter the critical section are blocked as well.
No thread should have to wait forever to enter its critical section. Why? Otherwise we get starvation: no progress is made.

Implementing Mutex Locks
Naive implementation: use a global variable (int l) to track whether a thread is in the critical section.

lock(l) {
    while (l == TRUE)
        ; // no-op
    l = TRUE;
}

unlock(l) {
    l = FALSE;
}

Is there a problem with this implementation? lock() and unlock() access a shared variable, so they themselves need to be atomic!
The problem: one thread invokes lock(). It checks (l == TRUE) and finds l is false. Before it can set l to TRUE, there is a context switch, and another thread invokes lock(). It also finds l false, sets it to TRUE, and acquires the lock. Then the first thread resumes, also sets l to TRUE, and also acquires the lock. So there is no mutual exclusion.

Implementing Mutex Locks
We can't seem to implement locks by just reading and writing the lock variable, so let's try another option: interrupt disabling. Naive implementation: make lock() atomic. Disabling interrupts ensures that preemption doesn't occur in the lock() code, ensuring it runs atomically.

lock(l) {
    disable interrupts;
    while (l == TRUE)
        ; // no-op
    l = TRUE;
    enable interrupts;
}

unlock(l) {
    l = FALSE;
}

Is there a problem with this implementation? Yes: one thread acquires the lock, but before it calls unlock(), another thread runs (note that interrupts are enabled between the lock() and unlock() calls). The second thread disables interrupts and gets stuck in the while loop, because the first thread has set the lock variable to TRUE. Since interrupts are now disabled, the first thread never gets to run, so both threads are stuck forever. This is called a deadlock.

Implementation 1: Interrupt Disabling
What about this implementation? Any problem with it?

lock() {
    disable_interrupts;
}

unlock() {
    enable_interrupts;
}

This implementation works on a single CPU because the critical section executes atomically, without preemption. In practice, the implementation of unlock() sets the interrupt level back to its value before the call to lock(): if interrupts are enabled and lock is called twice, then two unlocks are needed before interrupts are enabled again.
The problem: this doesn't work on multiprocessors, as we see next.

Atomic Instructions
The previous implementation only works on a single CPU: interrupts are disabled only on the local CPU, but threads could still run on another CPU, causing a race.
Hardware support for locking: interrupts provide h/w support for locking on a single CPU, but we need h/w support for locking on multiprocessors. Multiprocessor h/w provides atomic instructions: atomic increment, atomic test-and-set, atomic compare-and-swap. These instructions operate on a memory word; notice that they perform two operations on the word indivisibly.
How does h/w perform these operations indivisibly? The CPU requests that the memory controller lock the memory location, so the two operations on the location are performed without other CPUs interfering. In essence, to build a lock primitive in software, we again use a lock/atomic primitive provided by h/w: previously we used interrupt disabling on a uniprocessor, now we use atomic instructions on multiprocessors.
(Demo: the worker function in threads-atomic.exe uses an atomic increment instruction; threads_sync.exe uses pthread locks, which are blocking locks that we will discuss later.)
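As a concrete illustration, here is a sketch using C11 <stdatomic.h>, which compiles down to such atomic instructions (an assumption for illustration, not the course's demo code):

#include <stdatomic.h>

atomic_int counter = 0;

void worker(void) {
    /* a single indivisible read-modify-write; on x86 this typically
       compiles to a lock-prefixed add, so no increment can be lost */
    atomic_fetch_add(&counter, 1);
}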

Test-and-Set Lock Instruction
The tset instruction operates on an integer: it reads and returns the old value of the integer, and it updates the value of the integer to 1. These two operations are performed atomically.

int tset(int *lock) { // atomic in hardware
    int old = *lock;
    *lock = 1;
    return old;
}

While atomic instructions such as atomic increment allow performing specific operations (e.g., incrementing a counter) atomically, the tset instruction is a general atomic instruction for implementing spin locks, as described on the next slide.

Implementation 2: Spin Locks
lock() uses tset in a loop; *l is initialized to 0. If the returned value is 0, the lock has been acquired. If the returned value is 1, someone else holds the lock, so try again.

lock(int *l) {
    while (tset(l))
        ; // no-op
}

unlock(int *l) {
    *l = FALSE;
}

This mutex lock is called a spin lock because threads wait in a tight loop. Problem: while a thread waits, the CPU performs no useful work.
Spin locks allow implementing a critical section of arbitrary size by surrounding the critical section code with lock and unlock.
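A runnable version of this spin lock can be sketched with C11's atomic_flag, whose test-and-set plays the role of tset (an assumption; the slides' tset is a hardware instruction):

#include <stdatomic.h>

void lock(atomic_flag *l) {  /* *l initialized with ATOMIC_FLAG_INIT */
    while (atomic_flag_test_and_set(l))
        ; /* returns the old value: 1 means another thread holds the lock, so spin */
}

void unlock(atomic_flag *l) {
    atomic_flag_clear(l);  /* set the flag back to 0, freeing the lock */
}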

Implementation 3: Yielding Locks
Yield the CPU voluntarily while waiting for the lock. Recall that thread_yield runs another thread, so the CPU can perform useful work. This mutex is a yielding lock.

lock_s(int *l) {
    while (tset(l))
        thread_yield();
}

unlock_s(int *l) {
    *l = FALSE;
}

Problem: the scheduler determines when thread_yield() returns. A thread might get unlucky and wait a long time in thread_yield(), even if the other thread releases the lock soon after the call to yield.
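A sketch of the yielding variant, substituting POSIX sched_yield() for the course's thread_yield() (an assumption), built on the atomic_flag spin lock above:

#include <stdatomic.h>
#include <sched.h>

void lock_s(atomic_flag *l) {
    while (atomic_flag_test_and_set(l))
        sched_yield();  /* give up the CPU so another thread can run */
}

void unlock_s(atomic_flag *l) {
    atomic_flag_clear(l);
}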

Implementation 4: Blocking Locks
Both spin and yielding locks essentially poll for the lock to become available, and choosing the right polling frequency is not simple: spin locks waste CPU, while yielding locks can delay lock acquisition. Ideally, lock() would block until unlock() is called: invoke thread_sleep() when the lock is not available, and have unlock() invoke thread_wakeup(). But these functions access the shared ready list, so they themselves need to be critical sections; i.e., we need locking while trying to implement blocking locks! How can we solve this problem?
Note that lock() blocking until unlock() is called has similarities with interrupts letting the CPU know when a device has completed its work: while lock() is blocked, the CPU can do work on behalf of other ready threads.

Using a Previous Solution
The previous solutions work correctly but don't block: interrupt disabling works correctly on a single CPU, and spin locks work correctly on a multiprocessor. We can use these solutions to access the shared data structures in the thread scheduler. The scheduler implements blocking, so it can't itself use a blocking lock! Lab 3 requires you to implement blocking locks.
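A minimal sketch of how such a blocking lock might look on a single CPU, assuming course-style primitives (interrupts_off/interrupts_set, thread_sleep, thread_wakeup, and struct wait_queue are all assumed names, not a definitive Lab 3 solution):

struct lock {
    int held;               /* 1 if some thread holds the lock */
    struct wait_queue *wq;  /* threads blocked waiting for this lock */
};

void lock(struct lock *l) {
    int enabled = interrupts_off();  /* protect lock and scheduler state (single CPU) */
    while (l->held) {
        /* assumed to atomically enqueue this thread on wq and switch to
           another thread; interrupts stay off across the enqueue */
        thread_sleep(l->wq);
    }
    l->held = 1;
    interrupts_set(enabled);         /* restore previous interrupt state */
}

void unlock(struct lock *l) {
    int enabled = interrupts_off();
    l->held = 0;
    thread_wakeup(l->wq);            /* wake one thread waiting for the lock */
    interrupts_set(enabled);
}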

Locking Solutions
Notice how locking solutions depend on lower-level locking:
Uniprocessor: blocking lock -> interrupt disabling
Multiprocessor: blocking lock -> spin lock -> atomic instruction
The implementation of lower-level locks is more efficient, so why use higher-level locks? For example, an atomic instruction is a single instruction, spin locks loop on an atomic instruction, while blocking locks require manipulating the ready list, wait list, etc.

Which Lock to Use?
Atomic instruction: most efficient; use when available.
Interrupt disabling, spin locks: use when critical sections are short, in particular when the critical section will not block (i.e., it will not call thread_sleep or thread_yield).
Blocking locks: use when critical sections are long, especially if the critical section may block.
"Use when available": say all we want to do is increment a counter atomically; since an atomic instruction is available, we can simply use it. Why would you block in a critical section? We will see later that we may need to block in a critical section in order to synchronize with other threads.

Using Locks
Note that to protect shared variables, we need to create lock variables that are also shared: locks must be global variables or allocated on the heap. We have talked about how locks are implemented; now let's see how they are used.

Using Locks
When using locks, make sure to use the same lock in all critical sections that access the same shared data.

// counter and lock are located in shared address space
int counter;
struct lock *l;

Thread 1:
while (1) {
    lock(l);
    // critical section
    counter++;
    unlock(l);
    // remainder section
}

Thread 2:
while (1) {
    lock(l);
    // critical section
    counter--;
    unlock(l);
    // remainder section
}

If Thread 1 and Thread 2 used different locks while accessing the counter variable, then they could run concurrently, possibly corrupting counter.

Using Locks
Say multiple threads access a linked list: one thread adds elements to the list, and another thread deletes elements from it. Should the add and delete code use the same lock or different locks? The same lock, because they access the same data structure.
How many lock variables should be created? We could create one lock for the entire list (as in the sketch below), or one lock per list node. More locks allow more concurrency but more potential for bugs: for example, with a separate lock per node, two threads can update two different nodes of the list concurrently, but adding and removing elements becomes trickier, because a thread needs to acquire multiple locks to perform the list update correctly. Let's look at one kind of bug that arises when using multiple locks.
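A sketch of the coarse-grained choice, one lock for the entire list (pthreads assumed; the list layout is illustrative):

#include <pthread.h>
#include <stdlib.h>

struct node { int val; struct node *next; };

struct list {
    struct node *head;
    pthread_mutex_t lock;  /* one lock protects the whole list */
};

void list_add(struct list *lst, int val) {
    struct node *n = malloc(sizeof(*n));
    n->val = val;
    pthread_mutex_lock(&lst->lock);  /* same lock as list_del */
    n->next = lst->head;
    lst->head = n;
    pthread_mutex_unlock(&lst->lock);
}

void list_del(struct list *lst, int val) {
    pthread_mutex_lock(&lst->lock);
    struct node **pp = &lst->head;
    while (*pp && (*pp)->val != val)
        pp = &(*pp)->next;
    if (*pp) {
        struct node *dead = *pp;
        *pp = dead->next;            /* unlink the node */
        free(dead);
    }
    pthread_mutex_unlock(&lst->lock);
}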

Deadlocks
A set of threads is deadlocked if each thread is waiting for a resource (or an event) that another thread in the set holds (or can perform), so no thread can run. Breaking deadlocks generally requires killing threads.

Thread_A() {
    lock(resource_2);
    lock(resource_1);
    // use resources 1 and 2
    unlock(resource_1);
    unlock(resource_2);
}

Thread_B() {
    lock(resource_1);
    lock(resource_2);
    // use resources 1 and 2
    unlock(resource_2);
    unlock(resource_1);
}

Databases have deadlocks too; they need heavy machinery to deal with them.

Deadlock Conditions
A deadlock can occur if and only if the following conditions hold simultaneously:
Mutual exclusion: each resource is assigned to at most one thread.
Hold and wait: threads can hold one resource while waiting for another.
No preemption: acquired resources cannot be preempted.
Circular wait: the threads form a circular chain, each waiting for a resource held by the next thread in the chain.

Examples of Deadlock: Mahjong, traffic gridlock.

Detecting Deadlocks
Deadlocks can be detected using wait-for graphs: deadlock ⟺ a cycle in the wait-for graph.
[Figure: a wait-for graph over threads P1, P2 and resources R1, R2, with "holds" and "requests" edges (e.g., P1 requests R2, P2 holds R2) forming a cycle.]

Preventing Deadlocks
Two approaches: avoid hold and wait, or prevent circular wait.
Avoid hold and wait: if a lock is unavailable, release the previously acquired locks, and try to reacquire all the locks again. What are the problems with this approach? (1) It can cause livelock (no progress), since we may keep retrying to acquire the locks. (2) We need to make sure that any data-structure changes made while holding the previous locks are undone before releasing them; in practice, this is trickier than it sounds.
Prevent circular wait: number each of the resources, and require each thread to acquire lower-numbered resources before higher-numbered resources (see the sketch below). Problems? It is hard to use third-party software, since it is hard to number all resources when software comes from different sources.
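As an illustration of the numbering rule, here is a minimal sketch assuming pthreads; ordering locks by their addresses is one common way to assign the numbers:

#include <pthread.h>
#include <stdint.h>

/* Acquire two locks in a fixed global order (here: by address).
   Since every thread acquires a and b in the same order, no
   circular wait can form between them. */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    if ((uintptr_t)a < (uintptr_t)b) {
        pthread_mutex_lock(a);
        pthread_mutex_lock(b);
    } else {
        pthread_mutex_lock(b);
        pthread_mutex_lock(a);
    }
}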

Deadlock, Starvation, Livelock
Deadlock: a particular set of threads performs no work because of a circular wait condition. Once a deadlock occurs, it does not go away.
Starvation: a particular set of threads performs no work because the resources they need are constantly being used by others. Starvation can be a temporary condition.
Livelock: a set of threads continues to run but makes no progress! Examples include interrupt livelock. How can we solve interrupt livelock? Disable interrupts and switch to polling when interrupts arrive too often; think about how you check email, text messages, etc.

Summary
Concurrent programming model: threads enable concurrent execution, and threads cooperate by accessing shared variables.
Races: concurrent accesses to shared variables can lead to races, i.e., incorrect execution under some thread interleavings.
Critical sections and mutual exclusion: avoiding races requires defining critical code sections that are run atomically (indivisibly) using mutual exclusion, i.e., only one thread accesses the critical section at a time. Mutual exclusion is implemented using locks, and locking requires h/w support (interrupts, atomic instructions).

Think Time
What is a race condition? A race condition occurs when certain thread interleavings are possible that lead to incorrect results.
How can we protect against race conditions? Certain code needs to be executed atomically/indivisibly.
Can locks be implemented by reading and writing a binary variable? No; the read and the write need to be performed atomically.
Why is it better to block rather than spin on a uniprocessor? With spinning, no progress is possible, since there is only one processor.
Why is a blocking lock better than interrupt disabling or spin locks? Interrupt disabling is in effect for the entire critical section, and similarly, spin locks are held for the entire critical section. With blocking locks, interrupts are disabled or spin locks are held only inside the blocking-lock implementation, not for the entire critical section. For example, if the critical section accesses the disk, it is best to put the thread to sleep and let other threads run on the same processor.
Is the blocking lock always better? No; blocking locks have overhead because they call the thread scheduler to perform a thread switch. When the critical section is short (say, a few instructions), it is better to use interrupt disabling (on a single CPU) or spin locks (on a multiprocessor), because the blocking lock has high overhead for short critical sections.

Think Time
How can one avoid starvation? Use FIFO queuing whenever a thread needs to wait on a condition (e.g., while trying to acquire a lock), so the first thread to wait on the condition gets served first (e.g., the lock implementation can use a wait queue from which threads acquire the lock in FIFO order).
How can one avoid livelock? We need to ensure that threads run for a while before we switch to running some other thread.