Lecture 4: Synchronization


Lecture 4: Synchronization SE350: Operating Systems

Main Points Challenges: Why does execution depend on the interleaving of threads? Locks and condition variables: How can we enforce a logical sequence of operations? Designing shared objects: What is a good way of writing multithreaded programs?

Synchronization Motivation When threads concurrently read/write shared memory, program behavior is undefined Two threads write to a variable; which one should win? Thread schedule is non-deterministic Behavior changes over different runs of the same program Compiler and hardware reorder instructions You want program behavior to be a specific function of the input – not of the sequence of who went first. You want the behavior to be deterministic – not to vary from run to run. Even if you ignore those, the compiler will mess you up (compared to what you think will happen). And even if you ignore that, the hardware will mess you up too. To solve any of these problems, you need synchronization.
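To make the non-determinism concrete, here is a minimal C++ sketch (my example, not from the slides): two threads increment a plain shared counter, and the final value varies from run to run because counter++ is a separate load, add, and store.

#include <iostream>
#include <thread>

int counter = 0;  // shared, deliberately unsynchronized

void work() {
    for (int i = 0; i < 1000000; i++)
        counter++;  // load, add, store: interleavings can lose updates
}

int main() {
    std::thread a(work), b(work);
    a.join();
    b.join();
    // Expected 2000000; with a data race, any result is possible
    // (formally, undefined behavior in C++).
    std::cout << counter << "\n";
}

Running this a few times typically prints a different (too small) number each time, which is exactly the run-to-run variation described above.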

Question: Can This Panic? // Thread 1 p = someComputation(); pInitialized = true; // Thread 2 while (!pInitialized) ; q = someFunc(p); if (q != someFunc(p)) panic();

Why Reordering? Why do compilers reorder instructions? Generating efficient code needs control/data dependency analysis If variables can spontaneously change, most compiler optimizations become impossible Why do CPUs reorder instructions? Write buffering: allow the next instruction to execute while a write is being completed Fix: memory barrier Instruction to compiler/CPU All ops before the barrier complete before the barrier returns No op after the barrier starts until the barrier returns And how can you tell if your compiler might be doing this to you? Even if you wrote the program and set up the Makefile to use the right compiler flags, what is to keep someone else from coming along, seeing the Makefile, and saying: hm, I wonder why they haven't turned on optimization? I need it to go fast, so let's try that. And it works, so you move on. Only months later, when it gets out into the real world, it starts crashing! And it's not just the compiler: the hardware reorders too! To be able to reason about programs, for the rest of the lecture we assume instructions are not reordered, unless explicitly mentioned otherwise.
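Here is one hedged sketch (mine, using C++11 atomics rather than a raw barrier) of how the panic question above gets fixed: marking pInitialized atomic with release/acquire ordering forces the write to p to become visible before the flag, and stops both the compiler and the CPU from reordering across it.

#include <atomic>

int* p = nullptr;                        // names mirror the slide's example
std::atomic<bool> pInitialized{false};

void thread1() {
    p = new int(42);                     // stands in for someComputation()
    // release store: all writes above are visible before the flag is set
    pInitialized.store(true, std::memory_order_release);
}

void thread2() {
    // acquire load: pairs with the release store above
    while (!pInitialized.load(std::memory_order_acquire))
        ;                                // spin until initialized
    int q = *p;                          // safe: p's write is visible here
    (void)q;
}

With plain (non-atomic) variables, the spin loop itself could also be hoisted by the optimizer into an infinite loop, which is another reason the original version can fail.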

Too Much Milk Example Roommate A: 12:30 Look in fridge; out of milk. 12:35 Leave for store. 12:40 Arrive at store. 12:45 Buy milk. 12:50 Arrive home, put milk away. Roommate B: 12:40 Look in fridge; out of milk. 12:45 Leave for store. 12:50 Arrive at store. 12:55 Buy milk. 01:00 Arrive home, put milk away. Oh no, too much milk!

Definitions Race condition: output of a concurrent program depends on order of operations between threads Mutual exclusion: only one thread does a particular thing at a time Critical section: piece of code that only one thread can execute at once

Definitions (cont.) Lock: prevent someone from doing something Lock before entering critical section, before accessing shared data Unlock when leaving, after done accessing shared data Wait if locked (all synchronization involves waiting!) Safety: program never enters a bad state Never more than one person buys milk Liveness: program eventually enters a good state Someone eventually buys milk if needed

Too Much Milk (Try #1) if (!milk) { if (!note) { leave note; buy milk; remove note; } } What can go wrong? Safety! Both Thread A and Thread B run this same code.

Too Much Milk (Try #2) // Thread A leave note A; if (!note B) { if (!milk) buy milk; } remove note A; // Thread B leave note B; if (!note A) { if (!milk) buy milk; } remove note B; This is safe but what can go wrong? Liveness!

Too Much Milk (Try #3) // Thread A leave note A; while (note B) // X do nothing; if (!milk) buy milk; remove note A; // Thread B leave note B; if (!note A) { // Y if (!milk) buy milk; } remove note B; Can guarantee at X and Y that either: it is safe for me to buy, or the other will buy, so it is ok to quit. At Y: if no note A, it is safe for B to buy (it means A hasn't started yet); if note A, then A is either buying or waiting for B to quit, so it is ok for B to quit. At X: if no note B, it is safe for A to buy; if note B, A doesn't know, so A hangs around. Either B buys and we're done, or B doesn't buy and A will.

Lessons The solution is complicated; “obvious” code often has bugs. Modern compilers/architectures reorder instructions, making reasoning even more difficult. Generalizing to many threads/processors is even more complex: see Peterson’s algorithm. We can use atomic loads and stores and get safety and liveness. This is not satisfying, though, because the code is complex, it busy-waits, and it could fail if instructions are reordered. The generalization of this solution to n threads is Peterson’s algorithm, which is even more complex.

Roadmap Synchronization variables layer Atomic instructions layer (we don’t want to build synchronization using only atomic load and store; read-modify-write instructions are processor-specific) Hardware layer

Locks Lock::acquire() Wait until lock is free, then take it Lock::release() Release lock, waking up anyone waiting for it At most one lock holder at a time (safety) If no one is holding it, acquire gets the lock (progress) If all lock holders finish and there are no higher-priority waiters, a waiter eventually gets the lock (bounded waiting) Progress + bounded waiting = liveness

Question: Why Only Acquire/Release? Suppose we add a method to a lock, to ask if the lock is free. Suppose it returns true. Is the lock: Free? Busy? Don’t know?

Too Much Milk (Try #4) lock.acquire(); if (!milk) buy milk; lock.release(); Locks allow concurrent code to be much simpler
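In modern C++ the same pattern is a few lines; this sketch (my names, not the slide's) uses std::lock_guard so the release happens automatically at scope exit:

#include <mutex>

std::mutex milkLock;
bool milk = false;

void buyMilkIfNeeded() {
    std::lock_guard<std::mutex> guard(milkLock);  // lock.acquire()
    if (!milk)
        milk = true;                              // buy milk
}                                                 // lock.release() at scope exit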

Lock Example: Malloc/Free char *malloc (n) { heaplock.acquire(); p = allocate memory; heaplock.release(); return p; } void free(char *p) { heaplock.acquire(); put p back on free list; heaplock.release(); } It is essential to use the same lock!

Rules for Using Locks Lock is initially free Always acquire before accessing shared data Beginning of procedure! Always release after finishing with shared data End of procedure! Only the lock holder can release DO NOT hand the lock off for someone else to release Never access shared data without the lock Danger! Don’t do it even if it’s tempting!

Double Checked Locking getP() { if (p == NULL) { lock.acquire(); if (p == NULL) p = newP(); lock.release(); } return p; } newP() { tmp = malloc(sizeof(p)); tmp->field1 = …; tmp->field2 = …; return tmp; } Will this code work? No! The write to p can be reordered by the hardware/compiler so that p becomes non-NULL before initialization is complete.
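For reference, here is a hedged sketch (mine, not the slide's code) of a double-checked locking variant that does work in C++11 and later, by publishing the pointer with release/acquire ordering; std::call_once is a simpler alternative.

#include <atomic>
#include <mutex>

struct P { int field1; int field2; };

std::atomic<P*> p{nullptr};
std::mutex lock;

P* getP() {
    P* tmp = p.load(std::memory_order_acquire);   // first (unlocked) check
    if (tmp == nullptr) {
        std::lock_guard<std::mutex> g(lock);
        tmp = p.load(std::memory_order_relaxed);  // re-check under the lock
        if (tmp == nullptr) {
            tmp = new P{1, 2};                       // fully initialize first...
            p.store(tmp, std::memory_order_release); // ...then publish
        }
    }
    return tmp;
}

The release store guarantees that field1 and field2 are written before p becomes non-NULL to any acquiring reader, which is exactly what the broken version above fails to guarantee.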

Single Checked Locking getP() { lock.acquire(); if (p == NULL) p = newP(); lock.release(); return p; } newP() { tmp = malloc(sizeof(p)); tmp->field1 = …; tmp->field2 = …; return tmp; } Expensive but safe!

Example: Bounded Buffer tryget() { lock.acquire(); item = NULL; if (front < tail) { item = buf[front%MAX]; front++; } lock.release(); return item; } tryput(item) { lock.acquire(); success = FALSE; if ((tail–front) < MAX) { buf[tail%MAX] = item; tail++; success = TRUE; } lock.release(); return success; } For simplicity, assume no wraparound on the integers front and tail; I’ll assume you can fix that if you want. front = total number of items that have ever been removed; tail = total number ever inserted. Lock at the beginning of each procedure; unlock at the end; no access outside the lock. Note that we don’t know when we release the lock whether the buffer is still empty – we only know the state of the buffer while holding the lock! Initially: front = tail = 0; lock = FREE; MAX is buffer capacity

Question If tryget() returns NULL, do we know the buffer is empty? Hint: NO ! If we poll tryget() in a loop, what happens to a thread calling tryput()? Hint: delay! No – someone else might have filled the buffer. We only know the buffer *was* empty

Condition Variables CV::wait(Lock *lock) Atomically release lock and relinquish processor Reacquire the lock when wakened CV::signal() Wake up a waiter, if any CV::broadcast() Wake up all waiters, if any Waiting inside a critical section Called only when holding a lock

Properties of Condition Variables CV is memoryless No internal memory except a queue of waiting threads No effect in calling signal/broadcast on empty queue wait atomically releases the lock No separation between adding the thread to the waiting queue and releasing the lock Re-enabled waiting thread may not run immediately No atomicity between signal/broadcast and the return from wait

Condition Variable Design Pattern methodThatWaits() { lock.acquire(); // Read/write shared state while (!testSharedState()) cv.wait(&lock); lock.release(); } methodThatSignals() { lock.acquire(); // Read/write shared state // If testSharedState is now true cv.signal(); lock.release(); } An efficient way to wait for something to happen.

Example: Bounded Buffer get() { lock.acquire(); while (front == tail) empty.wait(&lock); item = buf[front % MAX]; front++; full.signal(); lock.release(); return item; } put(item) { lock.acquire(); while ((tail–front) == MAX) full.wait(&lock); buf[tail % MAX] = item; tail++; empty.signal(); lock.release(); } For simplicity, assume no wraparound on the integers front and tail; I’ll assume you can fix that if you want. Lock at the beginning of each procedure; unlock at the end; no access outside the lock. Initially: front = tail = 0; MAX is buffer capacity; empty and full are condition variables
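For comparison, here is a compilable C++ version of the same buffer (my sketch; std::condition_variable's predicate form bakes the wait-in-a-loop rule in):

#include <condition_variable>
#include <mutex>

const int MAX = 8;
int buf[MAX];
int front = 0, tail = 0;          // running totals, as on the slide
std::mutex m;
std::condition_variable empty_cv, full_cv;

int get() {
    std::unique_lock<std::mutex> l(m);
    empty_cv.wait(l, [] { return front != tail; });  // while-loop wait
    int item = buf[front % MAX];
    front++;
    full_cv.notify_one();        // full.signal()
    return item;
}

void put(int item) {
    std::unique_lock<std::mutex> l(m);
    full_cv.wait(l, [] { return tail - front != MAX; });
    buf[tail % MAX] = item;
    tail++;
    empty_cv.notify_one();       // empty.signal()
}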

Question Does the kth call to get return the kth item put? Hint: wait must re-acquire the lock after the signaller releases it. 

Pre/Post Conditions What is state of the bounded buffer at lock acquire? front <= tail front + MAX >= tail These are also true on return from wait And at lock release Allows for proof of correctness

Pre/Post Conditions methodThatWaits() { lock.acquire(); // Pre-condition: State is consistent // Read/write shared state while (!testSharedState()) cv.wait(&lock); // WARNING: shared state may have // changed! But testSharedState is // TRUE and pre-condition is true lock.release(); } methodThatSignals() { lock.acquire(); // Pre-condition: State is consistent // Read/write shared state // If testSharedState is now true cv.signal(); // NO WARNING: signal keeps lock lock.release(); } An efficient way to wait for something to happen.

Rules for Condition Variables ALWAYS hold the lock when calling wait, signal, broadcast CV::wait MUST be called in a loop: while (needToWait()) { condition.wait(&lock); } And not: if (needToWait()) { condition.wait(&lock); }

Java Manual When waiting upon a Condition, a “spurious wakeup” is permitted to occur, in general, as a concession to the underlying platform semantics. This has little practical impact on most application programs as a Condition should always be waited upon in a loop, testing the state predicate that is being waited for.

Structured Synchronization Identify objects or data structures that can be accessed by multiple threads concurrently Add locks to the object/module Grab the lock at the start of every method/procedure Release the lock on finish If you need to wait Use while (needToWait()) { condition.wait(lock); } Do not assume that when you wake up, the signaler just ran If you do something that might wake someone up Signal or Broadcast Always leave shared state variables in a consistent state When the lock is released, or when waiting

Remember the Rules Use consistent structure Always use locks and condition variables Always acquire lock at the beginning of procedure, release at the end Always hold lock when using a condition variable Always wait in while loop Never spin in sleep()

Roadmap

Implementing Synchronization Take 1: using memory load/store See too much milk solution/Peterson’s algorithm Take 2: Lock::acquire() { disable interrupts } Lock::release() { enable interrupts }

Lock Implementation (Uniprocessor) Lock::acquire() { disableInterrupts(); if (value == BUSY) { waiting.add(myTCB); myTCB->state = WAITING; next = readyList.remove(); thread_switch(myTCB, next); myTCB->state = RUNNING; } else { value = BUSY; } enableInterrupts(); } Lock::release() { disableInterrupts(); if (!waiting.Empty()) { next = waiting.remove(); next->state = READY; readyList.add(next); } else { value = FREE; } enableInterrupts(); } Note that release hands the lock directly to the next waiting thread: value stays BUSY. Also: enable/disable interrupts is a memory barrier operation, so it forces all memory writes to complete first.

What Thread is Currently Running? Scheduler needs to know TCB of running thread  To suspend and switch to a new thread  To check if current thread holds a lock before acquiring or releasing it  On a uniprocessor, easy: just use a global variable  Change the value in switch  On a multiprocessor?

What Thread is Currently Running? (Multiprocessor Version) Compiler dedicates a register  Hardware register holds processor number  x86 RDTSCP: read timestamp counter and processor ID  OS keeps an array, indexed by processor ID, listing current thread on each CPU  Fixed-size thread stacks: put a pointer to the TCB at the bottom of its stack  Find it by masking the current stack pointer 

Mutual Exclusion Support on a Multiprocessor Read-modify-write instructions Atomically read a value from memory, operate on it, and then write it back to memory Intervening instructions prevented in hardware Examples Test and set Intel: xchgb, lock prefix Compare and swap Any of these can be used for implementing locks and condition variables!

Spinlocks A spinlock is a lock where the processor waits in a loop for the lock to become free Assumes lock will be held for a short time Used to protect CPU scheduler and to implement locks

Spinlocks (cont.) Spinlock::acquire() { while (testAndSet(&lockValue) == BUSY) ; } Spinlock::release() { lockValue = FREE; memorybarrier(); }
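A self-contained C++ version of the same idea (my sketch): std::atomic_flag::test_and_set is the read-modify-write instruction, and the acquire/release orderings play the role of memorybarrier().

#include <atomic>

struct Spinlock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;

    void acquire() {
        // atomic test-and-set; acquire ordering is the entry barrier
        while (flag.test_and_set(std::memory_order_acquire))
            ;  // spin while BUSY
    }
    void release() {
        // release ordering makes critical-section writes visible first
        flag.clear(std::memory_order_release);
    }
};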

Spinlocks and Interrupt Handlers Suppose an interrupt handler needs to access some shared data ⇒ acquires spinlock  To put a thread on the ready list (I/O completion) To switch between threads (time slice)  What happens if a thread holds that spinlock with interrupts enabled?  Deadlock is possible unless ALL uses of that spinlock are with interrupts disabled

How Many Spinlocks? Various data structures Queue of waiting threads on lock X Queue of waiting threads on lock Y List of threads ready to run One spinlock per kernel? Bottleneck! One spinlock per lock One spinlock for the scheduler ready list Per-core ready list: one spinlock per core Scheduler lock requires interrupts off!

Lock Implementation (Multiprocessor) Lock::acquire() { disableInterrupts(); spinLock.acquire(); if (value == BUSY) { waiting.add(myTCB); suspend(&spinLock); } else { value = BUSY; } spinLock.release(); enableInterrupts(); } Lock::release() { disableInterrupts(); spinLock.acquire(); if (!waiting.Empty()) { next = waiting.remove(); scheduler.makeReady(next); } else { value = FREE; } spinLock.release(); enableInterrupts(); } Assumes that suspend releases the spinlock once it’s safe to do so. Also, note that the scheduler is protected by a different spinlock. myTCB is the macro from the previous slide – whatever machine-dependent way there is to find the current TCB.

Semaphores Semaphore has a non-negative integer value P() atomically waits for value to become > 0, then decrements V() atomically increments value (waking up waiter if needed) Semaphores are like integers except: Only operations are P and V Operations are atomic If value is 1, two P’s will result in value 0 and one waiter Semaphores are useful for Unlocked wait: interrupt handler, fork/join
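A small C++20 sketch (mine) of the unlocked-wait use mentioned last: a join implemented with a semaphore, where V() in one thread wakes a P() in another with no lock held.

#include <semaphore>
#include <thread>

std::binary_semaphore done{0};   // initial value 0, so P() must wait

void child() {
    // ... do work ...
    done.release();              // V(): increment, waking the waiter
}

int main() {
    std::thread t(child);
    done.acquire();              // P(): wait for value > 0, then decrement
    t.join();
}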

Semaphore Implementation Semaphore::P() { oldIPL = setInterrupts(OFF); spinLock.acquire(); if (value == 0) { waiting.add(myTCB); suspend(&spinLock); } else { value--; } spinLock.release(); setInterrupts(oldIPL); } Semaphore::V() { oldIPL = setInterrupts(OFF); spinLock.acquire(); if (!waiting.Empty()) { next = waiting.remove(); scheduler.makeReady(next); } else { value++; } spinLock.release(); setInterrupts(oldIPL); } Assumes that suspend releases the spinlock once it’s safe to do so. Also, note that the scheduler is protected by a different spinlock. myTCB is the macro from the previous slide – whatever machine-dependent way there is to find the current TCB.

Lock Implementation (Multiprocessor) Sched::suspend(SpinLock *lock) { TCB *next; oldIPL = setInterrupts(OFF); schedSpinLock.acquire(); lock->release(); myTCB->state = WAITING; next = readyList.remove(); thread_switch(myTCB, next); myTCB->state = RUNNING; schedSpinLock.release(); setInterrupts(oldIPL); } Sched::makeReady(TCB *thread) { oldIPL = setInterrupts(OFF); schedSpinLock.acquire(); readyList.add(thread); thread->state = READY; schedSpinLock.release(); setInterrupts(oldIPL); }

Lock Implementation, Linux Most locks are free most of the time. Why? Linux implementation takes advantage of this fact Fast path If lock is FREE, and no one is waiting, two instructions to acquire lock If no one is waiting, two instructions to release the lock Slow path If lock is BUSY or someone is waiting (see multiproc) Two versions: one with interrupts off, one w/o

Lock Implementation, Linux (cont.) struct mutex { // 1: unlocked; 0: locked; // negative : locked, // possible waiters atomic_t count; spinlock_t wait_lock; struct list_head wait_list; }; // atomic decrement // %eax is pointer to count lock decl (%eax) jns 1f // jump if not signed // (if value is now 0) call slowpath_acquire 1:

Application Locks A system call for every lock acquire/release Context switch in the kernel Faster alternative:  Spinlock at user level  “Lazy” switch into kernel if spin for period of time  Or scheduler activations:  Thread context switch at user level

Readers/Writers Lock A common variant for mutual exclusion One writer at a time, if no readers  Many readers, if no writer  How might we implement this?  ReaderAcquire(), ReaderRelease()  WriterAcquire(), WriterRelease()  Need a lock to keep track of shared state  Need CV for waiting if readers/ writers are in progress  Some state variables

Readers/Writers Lock (cont.) Lock lock = FREE; CV okToRead = NULL; CV okToWrite = NULL; int AW = 0; //active writers int AR = 0; // active readers int WW = 0; // waiting writers int WR = 0; // waiting readers

Readers/Writers Lock (cont.) // Readers lock.acquire(); while (AW > 0 || WW > 0) { WR++; okToRead.wait(&lock); WR--; } AR++; lock.release(); // Read data! lock.acquire(); AR--; if (AR == 0 && WW > 0) okToWrite.signal(); lock.release(); // Writers lock.acquire(); while (AW > 0 || AR > 0) { WW++; okToWrite.wait(&lock); WW--; } AW++; lock.release(); // Write data! lock.acquire(); AW--; if (WW > 0) { okToWrite.signal(); } else if (WR > 0) { okToRead.broadcast(); } lock.release();

Questions Can readers starve? Yes: writers take priority. Can writers starve? Yes: a waiting writer may not be able to proceed if another writer slips in between signal and wakeup.
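As an aside (my example, not from the slides): C++17 ships this abstraction as std::shared_mutex, so in practice the reader/writer entry and exit protocols collapse to two lock types:

#include <shared_mutex>

std::shared_mutex rwlock;
int sharedData = 0;

int reader() {
    std::shared_lock<std::shared_mutex> r(rwlock); // ReaderAcquire()
    return sharedData;                             // many readers at once
}                                                  // ReaderRelease() at scope exit

void writer(int v) {
    std::unique_lock<std::shared_mutex> w(rwlock); // WriterAcquire(): exclusive
    sharedData = v;
}

Whether readers or writers can starve then depends on the implementation's scheduling policy rather than on code you write.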

Readers/Writers Lock, w/o Writer Starvation (Take 1) Writer() { lock.acquire(); // check if another thread is already waiting while ((AW + AR + WW) > 0) { WW++; okToWrite.wait(&lock); WW--; } AW++; lock.release(); ... }

Readers/Writers Lock, w/o Writer Starvation (Take 2) // check in lock.acquire(); myPos = numWriters++; while ((AW + AR > 0) || (myPos > nextToGo)) { WW++; okToWrite.wait(&lock); WW--; } AW++; lock.release(); // check out lock.acquire(); AW--; nextToGo++; if (WW > 0) { okToWrite.signal(); } else if (WR > 0) { okToRead.broadcast(); } lock.release();

Readers/Writers Lock, w/o Writer Starvation (Take 3) // check in lock.acquire(); myPos = numWriters++; myCV = new CV; writers.append(myCV); while ((AW + AR > 0) || (myPos > nextToGo)) { WW++; myCV.wait(&lock); WW--; } AW++; delete myCV; lock.release(); // check out lock.acquire(); AW--; nextToGo++; if (WW > 0) { cv = writers.removeFront(); cv.signal(); } else if (WR > 0) { okToRead.broadcast(); } lock.release();

Mesa vs. Hoare Semantics Mesa Signal puts waiter on ready list Signaller keeps lock and processor Hoare Signal gives processor and lock to waiter When waiter finishes, processor/lock are given back to signaller Nested signals possible!

FIFO Bounded Buffer (Hoare Semantics) get() { lock.acquire(); if (front == tail) empty.wait(&lock); item = buf[front % MAX]; front++; full.signal(); lock.release(); return item; } put(item) { lock.acquire(); if ((tail – front) == MAX) full.wait(&lock); buf[tail % MAX] = item; tail++; empty.signal(); // CAREFUL: someone else ran lock.release(); } For simplicity, assume no wraparound on the integers front and tail; I’ll assume you can fix that if you want. Lock at the beginning of each procedure; unlock at the end; no access outside the lock. Walk through an example. Initially: front = tail = 0; MAX is buffer capacity; empty and full are condition variables

FIFO Bounded Buffer (Mesa Semantics) Create a condition variable for every waiter Queue condition variables (in FIFO order) Signal picks the front of the queue to wake up Careful if spurious wakeups! Easily extends to case where queue is LIFO, priority, priority donation, … With Hoare semantics, not as easy

FIFO Bounded Buffer (Mesa Semantics, put() is Similar) get() { lock.acquire(); myPosition = numGets++; self = new Condition; nextGet.append(self); while (front < myPosition || front == tail) { self.wait(&lock); } // nextGet.first == self delete nextGet.remove(); item = buf[front % MAX]; front++; if (next = nextPut.first()) next->signal(); lock.release(); return item; } For simplicity, assume no wraparound on the integers front and tail; I’ll assume you can fix that if you want. Lock at the beginning of each procedure; unlock at the end; no access outside the lock. Walk through an example. Initially: front = tail = numGets = 0; MAX is buffer capacity; nextGet and nextPut are queues of condition variables

Semaphore Bounded Buffer get() { fullSlots.P(); mutex.P(); item = buf[front % MAX]; front++; mutex.V(); emptySlots.V(); return item; } put(item) { emptySlots.P(); mutex.P(); buf[last % MAX] = item; last++; mutex.V(); fullSlots.V(); } Why does producer P + V different semaphores than the consumer? Producer creates full buffers; destroys empty buffers!   Is order of P's important? Yes! Deadlock. Is order of V's important? No, except it can affect scheduling efficiency. What if we have 2 producers or 2 consumers? Do we need to change anything? Can we use semaphores for FIFO ordering? Initially: front = last = 0; MAX is buffer capacity mutex = 1; emptySlots = MAX; fullSlots = 0;
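The same structure in C++20 (my sketch; std::counting_semaphore's acquire/release correspond to P/V):

#include <mutex>
#include <semaphore>

const int MAX = 8;
int buf[MAX];
int front = 0, last = 0;                       // names as on the slide
std::mutex mtx;                                // mutex = 1
std::counting_semaphore<MAX> emptySlots{MAX};  // emptySlots = MAX
std::counting_semaphore<MAX> fullSlots{0};     // fullSlots = 0

void put(int item) {
    emptySlots.acquire();                      // P(): wait for an empty slot
    {
        std::lock_guard<std::mutex> g(mtx);
        buf[last % MAX] = item;
        last++;
    }
    fullSlots.release();                       // V(): announce a full slot
}

int get() {
    fullSlots.acquire();                       // P(): wait for a full slot
    int item;
    {
        std::lock_guard<std::mutex> g(mtx);
        item = buf[front % MAX];
        front++;
    }
    emptySlots.release();                      // V(): announce an empty slot
    return item;
}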

Implementing Condition Variables using Semaphores (Take 1) wait(lock) { lock.release(); semaphore.P(); lock.acquire(); } signal() { semaphore.V(); Condition variables have no history, but semaphores do have history.   What if thread signals and no one is waiting? No op. What if thread later calls wait? Thread waits. What if thread V's and no one is waiting? Increment. What if thread later does P? Decrement and continue. In other words, P + V are commutative -- result is the same no matter what order they occur. Condition variables are not commutative: wait doesn't return until next signal. That's why they must be in a critical section -- need to access state variables to do their job. With monitors, if I signal 15000 times, when no one is waiting, next wait will still go to sleep! But with the above code, next 15000 threads that wait will return immediately!

Implementing Condition Variables using Semaphores (Take 2) wait(lock) { lock.release(); semaphore.P(); lock.acquire(); } signal() { if (semaphore is not empty) semaphore.V(); Does this work? For one, not legal to look at contents of semaphore queue. But also: race condition -- signaller can slip in after lock is released, and before wait. Then waiter never wakes up!   Or: suppose, put thread on separate Condition variable queue. Then release lock, then P. Won't work because a third thread can come in and try to wait, gobbling up the V, so that the original waiter never wakes up! Need to release lock and go to sleep atomically.

Implementing Condition Variables using Semaphores (Take 3) wait(lock) { semaphore = new Semaphore; queue.Append(semaphore); // queue of waiting threads lock.release(); semaphore.P(); lock.acquire(); } signal() { if (!queue.Empty()) { semaphore = queue.Remove(); semaphore.V(); // wake up waiter } } This one works: each waiter gets its own semaphore, and semaphores have history. If the signaller slips in after lock.release() and before semaphore.P(), the V is remembered, so the waiter’s P returns immediately rather than being lost. And a third thread can no longer gobble up the V, because each V goes to one specific waiter’s semaphore.

Communicating Sequential Processes (CSP/Google Go) A thread per shared object Only that thread is allowed to touch the object’s data To call a method on the object, send the thread a message with the method name and arguments The thread waits in a loop: get a message, do the operation No memory races!

Example: Bounded Buffer get() { lock.acquire(); while (front == tail) empty.wait(&lock); item = buf[front % MAX]; front++; full.signal(); lock.release(); return item; } put(item) { lock.acquire(); while ((tail–front) == MAX) full.wait(&lock); buf[tail % MAX] = item; tail++; empty.signal(); lock.release(); } For simplicity, assume no wraparound on the integers front and tail; I’ll assume you can fix that if you want. Lock at the beginning of each procedure; unlock at the end; no access outside the lock. Walk through an example. Initially: front = tail = 0; MAX is buffer capacity; empty and full are condition variables

Bounded Buffer (CSP) while (cmd = getNext()) { if (cmd == GET) { if (front < tail) { // do get // send reply // if pending put, do it // and send reply } else { // queue get operation } } else { // cmd == PUT if ((tail – front) < MAX) { // do put // send reply // if pending get, do it // and send reply } else { // queue put operation } } }
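To make the pattern concrete outside of Go, here is a hedged C++ sketch (all names mine): one owner thread holds the data; clients only send messages into a mailbox, so the only shared structure is the queue itself.

#include <condition_variable>
#include <mutex>
#include <queue>

enum class Op { GET, PUT };
struct Msg { Op op; int arg; };

std::queue<Msg> mailbox;        // the only shared state
std::mutex m;
std::condition_variable nonempty;

void send(Msg msg) {            // called by client threads
    std::lock_guard<std::mutex> g(m);
    mailbox.push(msg);
    nonempty.notify_one();
}

void ownerLoop() {              // the one thread allowed to touch `data`
    int data = 0;
    for (;;) {
        Msg msg;
        {
            std::unique_lock<std::mutex> l(m);
            nonempty.wait(l, [] { return !mailbox.empty(); });
            msg = mailbox.front();
            mailbox.pop();
        }
        if (msg.op == Op::PUT)
            data = msg.arg;     // do put (reply channel omitted)
        // else: do get and send a reply (omitted)
    }
}

Note the inversion: the lock protects only the mailbox, never the data, so there are no races on the object itself, which is the CSP claim from two slides back.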

Locks/CVs vs. CSP Create a lock on shared data = create a single thread to operate on data Call a method on a shared object = send a message/wait for reply Wait for a condition = queue an operation that can’t be completed just yet Signal a condition = perform a queued operation, now enabled

Remember the Rules Use consistent structure Always use locks and condition variables Always acquire lock at beginning of procedure, release at end Always hold lock when using a condition variable Always wait in while loop Never spin in sleep()

Acknowledgment This lecture is a slightly modified version of the one prepared by Tom Anderson