Review: threads and synchronization


Review: threads and synchronization Landon Cox January 13/18, 2017

Intro to processes. Remember, for any area of OS, ask: what is the physical reality, what interface does the hardware provide, and what interface does the OS provide? Physical reality: a single computer (CPUs + memory) executing instructions from many programs. What applications see: each app thinks it has its own CPU + memory.

Hardware, OS interfaces. (Diagram: each application job sees its own CPU + memory; the OS provides these virtual interfaces on top of the physical CPUs and memory of the hardware.)

What is a process? Informal: a program in execution; running code plus the things it can read and write; a process is not the same as a program. Formal: one or more threads in their own address space (soon threads will share an address space).

Parts of a process. Thread: a sequence of executing instructions; active, it does things. Address space: the data the process uses as it runs; passive, acted upon by threads.

Play analogy. A process is like a play performance, and a program is like the play's script. The threads are the actors, and the address space is the set and props they act upon.

What is in the address space? Program code: instructions, also called "text". Data segment: global and static variables. Heap: where "new" memory comes from. Stack: where local variables are stored.

Review of the stack. Each stack frame contains a function's local variables, parameters, return address, and saved values of the calling function's registers. The stack is what enables recursion.

Example stack. Code:

    void C () { A(0); }
    void B () { C(); }
    void A (int tmp) { if (tmp) B(); }
    int main () { A(1); return 0; }

After main calls A(1), which calls B, then C, then A(0), the stack holds (newest frame first, SP at the top): A (tmp=0, RA=0x8048347), C (const=0, RA=0x8048354), B (RA=0x8048361), A (tmp=1, RA=0x804838c), main (const1=1, const2=0). Each RA is the saved return address back into that frame's caller.

The stack and recursion. Code:

    void A (int bnd) { if (bnd) A(bnd-1); }
    int main () { A(3); return 0; }

The stack holds one frame per active call (newest first): A (bnd=0, RA=0x8048361), A (bnd=1, RA=0x8048361), A (bnd=2, RA=0x8048361), A (bnd=3, RA=0x804838c), main (const1=3, const2=0). How can recursion go wrong? It can overflow the stack: keep adding frame after frame.

What is missing? What state isn't in the address space? The registers: the program counter (PC) and the general-purpose registers. Review your architecture course for more details.

Multiple threads in an addr space Several actors on a single set Sometimes they interact (speak, dance) Sometimes they are apart (different scenes)

Private vs global thread state What state is private to each thread? PC (where actor is in his/her script) Stack, SP (actor’s mindset) What state is shared? Global variables, heap (props on set) Code (like the script)

Concurrency: having multiple threads active at a time. Concurrent != parallel; the thread is the unit of concurrency. Primary topics: how threads cooperate on a single task, and how multiple threads can share the CPU.

Address spaces: the unit of "state partitioning". Primary topics: many address spaces sharing physical memory, efficiency, and safety (protection).

Cooperating threads. (Diagram: threads A, B, and C, each on its own CPU, sharing memory.) Assume each thread has its own CPU; we will relax this assumption later. CPUs run at unpredictable speeds and can be stalled for any number of reasons, which is a source of non-determinism.

Non-determinism and ordering. (Diagram: events from threads A, B, and C interleaved into one global ordering over time.) Why do we care about the global ordering? There might be dependencies between events, and different orderings can produce different results. Why is this ordering unpredictable? We can't predict how fast processors will run.

Non-determinism example 1. Thread A: cout << "ABC"; Thread B: cout << "123"; Possible outputs? "A1BC23", "ABC123", … Impossible outputs? "321CBA", "B12C3A", … Why? Each thread's own characters still appear in program order. What is shared between threads? The screen, and maybe the output buffer.
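A minimal C++ sketch of example 1 (assumed code, not from the slides): the interleaving of the two threads' output can change from run to run.

    #include <iostream>
    #include <thread>

    int main() {
        // Two threads share the screen (std::cout); their output may interleave.
        std::thread a([] { std::cout << "ABC"; });
        std::thread b([] { std::cout << "123"; });
        a.join();
        b.join();
        std::cout << "\n";
        return 0;
    }

Outputs like "A1BC23" are possible; "321CBA" is not, because each thread still emits its own characters in program order.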

Non-determinism example 2 y=10; Thread A: int x = y+1; Thread B: y = y*2; Possible results? A goes first: x = 11 and y = 20 B goes first: y = 20 and x = 21 What is shared between threads? Variable y

Non-determinism example 3 Thread A: x = 1; Thread B: x = 2; Possible results? B goes first: x = 1 A goes first: x = 2 Is x = 3 possible?

Example 3, continued. What if "x = <int>;" is implemented as two instructions: x := x & 0, then x := x | <int>? Consider this schedule: Thread A: x := x & 0; Thread B: x := x & 0; Thread B: x := x | 1; Thread A: x := x | 2. Now x ends up as 3, a value neither thread assigned, so x = 3 is possible after all.

Atomic operations. We must know which operations are atomic before we can reason about cooperation. Atomic means indivisible: it happens without interruption, and between the start and end of an atomic action, no events from other threads can occur.

Review of examples Print example (ABC, 123) What did we assume was atomic? What if “print” is atomic? What if printing a char was not atomic? Arithmetic example (x=y+1, y=y*2)

Atomicity in practice On most machines Memory assignment/reference is atomic E.g.: a=1, a=b Many other instructions are not atomic E.g.: double-precision floating point store (often involves two memory operations)
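As an illustration (an assumed example, not from the slides), a plain int counter loses updates because `count = count + 1` is several instructions, while std::atomic's fetch_add is a single atomic read-modify-write:

    #include <atomic>
    #include <iostream>
    #include <thread>

    int plain_count = 0;               // unsynchronized: increments can be lost
    std::atomic<int> atomic_count{0};  // each increment is atomic

    void work() {
        for (int i = 0; i < 100000; i++) {
            // load, add, store: not atomic (and formally a data race,
            // shown here only to illustrate lost updates)
            plain_count = plain_count + 1;
            atomic_count.fetch_add(1);  // single atomic read-modify-write
        }
    }

    int main() {
        std::thread t1(work), t2(work);
        t1.join(); t2.join();
        std::cout << "plain:  " << plain_count  << "\n";  // often < 200000
        std::cout << "atomic: " << atomic_count << "\n";  // always 200000
        return 0;
    }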

Virtual/physical interfaces. (Layer diagram: applications sit on software atomic operations provided by the OS, which are built on hardware atomic operations.) If you don't have atomic operations, you can't make one.

Constraining concurrency. Synchronization means controlling thread interleavings. Some events are independent: no shared state, so their relative order doesn't matter. Other events are dependent: the output of one can be the input of another, so their order can affect program results.

Goals of synchronization All interleavings must give correct result Correct = works no matter how fast threads run Constrain program as little as possible Why? Constraints slow program down Constraints create complexity

Raising the level of abstraction: locks. Also called mutexes, locks provide mutual exclusion: they prevent more than one thread from being in a critical section at a time. Lock operations: Lock (aka Lock::acquire) and Unlock (aka Lock::release).

Lock operations. Lock: wait until the lock is free, then acquire it. A busy-waiting implementation:

    do {
        if (lock is free) {   // lock == 0
            lock = 1
            break
        }
    } while (1)

This check-and-set must be atomic with respect to other threads calling the same code. Unlock: lock = 0 (atomic).
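In practice you would rely on a library lock rather than writing the busy-wait yourself; a minimal sketch (assumed example using the standard C++ std::mutex, not the slides' pseudocode):

    #include <mutex>

    std::mutex m;
    int shared_value = 0;

    void update() {
        m.lock();          // Lock: waits until the lock is free, then acquires it
        shared_value++;    // critical section
        m.unlock();        // Unlock: lets one waiting thread (if any) proceed
    }

    void update_raii() {
        // Idiomatic form: acquires in the constructor, releases in the destructor.
        std::lock_guard<std::mutex> g(m);
        shared_value++;
    }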

Elements of locking. The lock is initially free. Threads acquire the lock before an action and release it when the action completes; Lock() must wait if someone else holds the lock. Key idea: all synchronization involves waiting, and threads are either running or blocked.

Example: thread-safe queue.

    enqueue () {
        lock (qLock);
        // ptr is private, head is shared
        new_element = new node();
        if (head == NULL) {
            head = new_element;
        } else {
            node *ptr;
            // find queue tail
            for (ptr = head; ptr->next != NULL; ptr = ptr->next) {}
            ptr->next = new_element;
        }
        new_element->next = 0;
        unlock (qLock);
    }

    dequeue () {
        lock (qLock);
        element = NULL;
        if (head != NULL) {            // if queue non-empty
            if (head->next != 0) {     // remove head
                element = head->next;
                head->next = head->next->next;
            } else {
                element = head;
                head = NULL;
            }
        }
        unlock (qLock);
        return element;
    }

What can go wrong?

Thread-safe queue. Can enqueue unlock anywhere? No: it must leave shared data in a consistent/sane state. Data invariant: a statement about the data's "consistent/sane state" that is "always" true.

    enqueue () {
        lock (qLock);
        // ptr is private, head is shared
        new_element = new node();
        if (head == NULL) {
            head = new_element;
        } else {
            node *ptr;
            // find queue tail
            for (ptr = head; ptr->next != NULL; ptr = ptr->next) {}
            ptr->next = new_element;
        }
        unlock (qLock);          // safe?
        new_element->next = 0;
    }

Invariants. What are the queue invariants? Each node appears once (following next pointers from head to null); enqueue results in the prior list plus the new element; dequeue removes exactly one element. Can invariants ever be false? They must be, otherwise you could never change state.

More on invariants So when is the invariant broken? Can only be broken while lock is held And only by thread holding the lock

BROKEN INVARIANT (CLOSE AND LOCK DOOR) http://www.flickr.com/photos/jacobaaron/3489644869/

INVARIANT RESTORED (UNLOCK DOOR) http://www.flickr.com/photos/jacobaaron/3489644869/

More on invariants. So when is the invariant broken? It can only be broken while the lock is held, and only by the thread holding the lock. It is really a "public" invariant: the data's state when the lock is free, like having your house tidy before guests arrive. Hold the lock whenever accessing shared data.

More on invariants What about reading shared data? Still must hold lock Else another thread could break invariant (Thread A prints Q as Thread B enqueues)
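A small sketch of that rule (assumed code, reusing the slides' idea of a linked queue guarded by a lock): even a read-only traversal takes the lock, or it could see a half-updated list while another thread is inside enqueue.

    #include <iostream>
    #include <mutex>

    // Hypothetical names: a singly linked queue like the one in the slides.
    struct node { int value; node *next; };
    node *head = nullptr;
    std::mutex qlock;

    void print_queue() {
        std::lock_guard<std::mutex> g(qlock);  // readers hold the same lock as writers
        for (node *p = head; p != nullptr; p = p->next)
            std::cout << p->value << " ";
        std::cout << "\n";
    }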

Intro to ordering constraints. Say you want dequeue to wait while the queue is empty. Can we just busy-wait? No! We are still holding the lock:

    dequeue () {
        lock (qLock);
        element = NULL;
        while (head == NULL) {}   // spins forever: no enqueuer can acquire the lock
        // remove head
        element = head->next;
        head->next = NULL;
        unlock (qLock);
        return element;
    }

Release lock before spinning?

    dequeue () {
        lock (qLock);
        element = NULL;
        unlock (qLock);
        while (head == NULL) {}
        // remove head
        element = head->next;
        head->next = NULL;
        return element;
    }

What can go wrong? Head might be NULL again by the time we try to remove the entry, since we no longer hold the lock.

One more try.

    dequeue () {
        lock (qLock);
        element = NULL;
        while (head == NULL) {
            unlock (qLock);
            lock (qLock);
        }
        // remove head
        element = head->next;
        head->next = NULL;
        unlock (qLock);
        return element;
    }

Does it work? Seems ok. Why? The shared state is protected. Downside? Busy-waiting, which is wasteful.

Ideal solution. We would like the dequeueing thread to "sleep": pause execution and add itself to a "waiting list"; the enqueuer can resume it when the queue is non-empty. Problem: what to do with the lock? Why can't the dequeueing thread sleep while holding the lock? The enqueuer would never be able to add anything.

Release the lock before sleep?

    enqueue () {
        acquire lock
        find tail of queue
        add new element
        if (dequeuer waiting) {
            remove from wait list
            wake up dequeuer
        }
        release lock
    }

    dequeue () {
        acquire lock
        ...
        if (queue empty) {
            release lock
            add self to wait list
            sleep
        }
        ...
    }

Does this work?

Release the lock before sleep? No: with the code above, the dequeuer can (1) see that the queue is empty and release the lock, the enqueuer can then (2) add an element and find no one on the wait list, and the dequeuer can then (3) add itself to the wait list and sleep. The thread can sleep forever.

Release the lock before sleep? Try reordering dequeue so the thread is on the wait list before it releases the lock (enqueue is unchanged):

    dequeue () {
        acquire lock
        ...
        if (queue empty) {
            add self to wait list
            release lock
            sleep
        }
        ...
    }

Release the lock before sleep? Still broken: (1) the dequeuer adds itself to the wait list and releases the lock, (2) the enqueuer then adds an element and wakes the dequeuer, but (3) the dequeuer has not yet gone to sleep, so the wake-up is lost and it sleeps anyway. Problem: a missed wake-up. Note: this can be fixed, but it's messy.

Two types of synchronization As before we need to raise the level of abstraction Mutual exclusion One thread doing something at a time Use locks Ordering constraints Describe “before-after” relationships One thread waits for another Use monitors: a lock + its condition variable

Locks and condition variables. Condition variables let threads sleep inside a critical section by making the key steps internally atomic (for now, by definition). CV state = a queue of waiting threads + one lock. Wait does, atomically: release the lock, put the thread on the wait queue, go to sleep.

Condition variable operations.

    wait (lock) {        // lock always held; atomic
        release lock
        put thread on wait queue
        go to sleep
        // after wake up
        acquire lock
    }

    signal () {          // lock usually held; atomic
        wake up one waiter (if any)
    }

    broadcast () {       // lock usually held; atomic
        wake up all waiters (if any)
    }
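For reference, these operations map onto the standard C++ API (a sketch, assuming std::condition_variable; `ready` is a hypothetical condition): wait atomically releases the lock and re-acquires it before returning, notify_one plays the role of signal, and notify_all the role of broadcast.

    #include <condition_variable>
    #include <mutex>

    std::mutex m;
    std::condition_variable cv;
    bool ready = false;   // hypothetical condition the waiter cares about

    void waiter() {
        std::unique_lock<std::mutex> lk(m);  // lock always held before wait
        while (!ready)
            cv.wait(lk);   // atomically: release m, sleep, re-acquire m on wake-up
        // ... proceed with m held ...
    }

    void signaler() {
        std::lock_guard<std::mutex> lk(m);   // lock usually held when signaling
        ready = true;
        cv.notify_one();   // signal: wake one waiter (notify_all = broadcast)
    }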

CVs and invariants. Is it ok to leave invariants violated before calling wait? No: wait releases the lock. There is a larger rule about returning from wait: the lock may have changed hands, and state can change between wait's entry and its return, so don't make assumptions about shared state.

Multi-threaded queue.

    enqueue () {
        acquire lock
        find tail of queue
        add new element
        signal (lock, CV)
        release lock
    }

    dequeue () {
        acquire lock
        if (queue empty) {
            wait (lock, CV)
        }
        remove item from queue
        release lock
        return removed item
    }

What if "queue empty" takes more than one instruction? Any problems with the "if" statement in dequeue?

Multi-threaded queue, with wait expanded (enqueue is unchanged):

    dequeue () {
        acquire lock
        if (queue empty) {
            // begin atomic wait
            release lock
            add to wait list, sleep
            // end atomic wait
            re-acquire lock
        }
        remove item from queue
        return removed item
    }

Multi-threaded queue: the problem with "if". (1) A dequeuer finds the queue empty and waits, releasing the lock. (2) An enqueuer adds an element and signals. (3) Before the waiter re-acquires the lock, a second dequeuer acquires it and removes the item. (4) The original waiter then returns from wait, re-acquires the lock, and tries to remove an item from a queue that is empty again.

Multi-threaded queue (same code as above). How to solve?

Multi-threaded queue: solve with a while loop ("loop before you leap"); this is the "condition" in condition variable.

    enqueue () {
        acquire lock
        find tail of queue
        add new element
        signal (lock, CV)
        release lock
    }

    dequeue () {
        acquire lock
        while (queue empty) {
            wait (lock, CV)
        }
        remove item from queue
        release lock
        return removed item
    }
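Putting the pieces together, here is a compilable sketch of the blocking queue (an assumed implementation on top of std::mutex and std::condition_variable; the slides' pseudocode does not name a particular library):

    #include <condition_variable>
    #include <mutex>
    #include <queue>

    class BlockingQueue {
        std::queue<int> q;
        std::mutex lock;
        std::condition_variable nonEmpty;
    public:
        void enqueue(int item) {
            std::lock_guard<std::mutex> g(lock);   // acquire lock
            q.push(item);                          // add new element
            nonEmpty.notify_one();                 // signal (lock, CV)
        }                                          // release lock (destructor)

        int dequeue() {
            std::unique_lock<std::mutex> g(lock);  // acquire lock
            while (q.empty())                      // "loop before you leap"
                nonEmpty.wait(g);                  // releases lock while asleep
            int item = q.front();                  // remove item from queue
            q.pop();
            return item;                           // release lock (destructor)
        }
    };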

Recap and looking ahead. (Layer diagram: applications sit on threads and synchronization primitives provided by the OS, which builds them on hardware mechanisms: atomic load-store, interrupt enable-disable, and atomic test-and-set.)

Threads that aren’t running What is a non-running thread? thread=“sequence of executing instructions” non-running thread=“paused execution” Must save thread’s private state To re-run, re-load private state Want thread to start where it left off

Private vs global thread state What state is private to each thread? PC (where actor is in his/her script) Stack, SP (actor’s mindset) What state is shared? Global variables, heap (props on set) Code (like lines of a play)

Thread control block (TCB). What needs to access a thread's private data? The CPU: while the thread runs, this info lives in the PC, SP, and other registers. The OS also needs pointers to non-running threads' data, so it keeps a thread control block (TCB) per thread: a container for a non-running thread's private data (values of the PC and SP, registers, and pointers to its code and stack).
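As a hypothetical sketch (names and layout assumed, not from the slides), a user-level thread library's per-thread bookkeeping might look like:

    #include <ucontext.h>   // on Linux, ucontext_t stores registers, PC, and SP
    #include <queue>

    // Hypothetical thread control block for a user-level thread library.
    struct TCB {
        ucontext_t context;   // saved registers, PC, SP of a non-running thread
        char *stack;          // the thread's stack (its saved SP points into it)
        enum { RUNNING, READY, BLOCKED } state;
    };

    std::queue<TCB*> ready_queue;   // non-running but runnable threads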

Thread control block. (Diagram: three threads, each with a TCB holding its saved PC, SP, and registers and pointing at its own code and stack in the address space; thread 1 is running on the CPU, and the ready queue tracks the others.)

Thread control block, continued. (Diagram: thread 1 is running, so its PC, SP, and registers live in the CPU rather than in a TCB; TCB2 and TCB3 hold the saved state of the non-running threads on the ready queue.)

Thread states. Running: currently using the CPU. Ready: ready to run, but waiting for the CPU. Blocked: stuck in lock(), wait(), or down().

Switching threads. What needs to happen to switch threads? (1) The thread returns control to the OS, for example via the "yield" call. (2) The OS chooses the next thread to run. (3) The OS saves the state of the current thread to its thread control block. (4) The OS loads the context of the next thread from its thread control block. (5) The OS runs the next thread. On Linux, steps 3 through 5 are what swapcontext does.
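The slide mentions swapcontext on Linux; a minimal sketch (assumed example using the glibc ucontext API) of saving one context and switching to another:

    #include <iostream>
    #include <ucontext.h>

    ucontext_t main_ctx, thread_ctx;
    char thread_stack[64 * 1024];

    void thread_func() {
        std::cout << "in thread\n";
        swapcontext(&thread_ctx, &main_ctx);   // save our state, resume main
    }

    int main() {
        getcontext(&thread_ctx);                    // initialize the context
        thread_ctx.uc_stack.ss_sp = thread_stack;   // give the thread its own stack
        thread_ctx.uc_stack.ss_size = sizeof(thread_stack);
        thread_ctx.uc_link = &main_ctx;             // where to go if thread_func returns
        makecontext(&thread_ctx, thread_func, 0);

        std::cout << "switching to thread\n";
        swapcontext(&main_ctx, &thread_ctx);        // save main, run thread_func
        std::cout << "back in main\n";
        return 0;
    }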

1. Thread returns control to OS How does the thread system get control? Voluntary internal events Thread might block inside lock or wait Thread might call into kernel for service (read a file) Thread might call yield Are internal events enough?

1. Thread returns control to OS Involuntary external events (events not initiated by the thread) Hardware interrupts Transfer control directly to OS interrupt handlers From your architecture course CPU checks for interrupts while executing Jumps to OS code with interrupt mask set Interrupts lead to pre-emption (a forced yield) Common interrupt: timer interrupt

2. Choosing the next thread. If there are no ready threads, just spin (modern CPUs execute a "halt" instruction instead); the loop switches to a thread as soon as one becomes ready. There are many ways to prioritize ready threads, and a huge literature on scheduling algorithms.

3. Saving state of current thread What needs to be saved? Registers, PC, SP What makes this tricky? Self-referential sequence of actions Need registers to save state But you’re trying to save all the registers Saving the PC is particularly tricky

Saving the PC Why won’t this work? Instruction address Returning thread will execute instruction at 100 And just re-execute the switch Really want to save address 102 Instruction address 100 store PC in TCB 101 switch to next thread

4. OS loads the next thread Where is the next thread’s state/context? Thread control block (in memory) How to load the registers? Use load instructions to grab from memory How to load the stack? Stack is already in memory, load SP

5. OS runs the next thread. How do we resume the thread's execution? Jump to the saved PC. On whose stack are these steps running, i.e., who jumps to the saved PC? The thread that called yield (or was interrupted, or called lock/wait). How does that thread run again? Some other thread must switch back to it.

Why use locks? If we have interrupt disable-enable, why do we need locks? A program could bracket critical sections with disable-enable, but then a buggy section like "disable interrupts; while (1) {}" can never give control back to the thread library, and there is effectively only one big lock, which over-constrains concurrency.

Why use locks? How do we know whether disabling interrupts is safe? It needs hardware support: the CPU has to know whether the running code is trusted (i.e., is the OS). This is an example of why we need the kernel. Other things user programs shouldn't do: manipulate page tables, reboot the machine, communicate directly with hardware. We will cover these in the upcoming memory review.

Lock implementation #1: kernel implementation, interrupt disable + busy-waiting.

    lock () {
        disable interrupts
        while (value != FREE) {
            enable interrupts
            disable interrupts
        }
        value = BUSY
        enable interrupts
    }

    unlock () {
        disable interrupts
        value = FREE
        enable interrupts
    }

Why is it ok for the lock code to disable interrupts? It's in the trusted kernel (we have to trust something).

Lock implementation #1, continued. Do we need to disable interrupts in unlock? Only if "value = FREE" takes multiple instructions (it is safer to disable anyway).

Lock implementation #1, continued. Why enable and then re-disable interrupts in the body of the lock loop? Otherwise no one else could run while we spin, including any thread that would unlock.

Using read-modify-write instructions. Disabling interrupts is ok on a uni-processor but breaks on a multi-processor: disabling interrupts on one CPU does not stop threads running on the other CPUs. We could use atomic load-store to build a lock, but that is inefficient, with lots of busy-waiting. Hardware people to the rescue!

Using read-modify-write instructions Most modern processor architectures Provide an atomic read-modify-write instruction Atomically Read value from memory into register Write new value to memory Implementation details Lock memory location at the memory controller

Test&set on most architectures (slightly different on x86, where Exchange atomically swaps a value between a register and memory):

    test&set (X) {
        tmp = X
        X = 1
        return tmp
    }

Set: sets the location to 1. Test: returns the old value.

Lock implementation #2: use test&set; initially, value = 0.

    lock () {
        while (test&set(value) == 1) { }
    }

    unlock () {
        value = 0
    }

What happens if value = 1? test&set returns 1 and leaves value = 1, so the caller keeps spinning. What happens if value = 0? test&set sets value to 1 and returns 0, so the caller acquires the lock.
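A user-level version of this lock can be sketched (assumption: C++11's std::atomic_flag, whose test_and_set is an atomic read-modify-write like the slides' test&set):

    #include <atomic>

    std::atomic_flag value = ATOMIC_FLAG_INIT;   // clear == FREE

    void lock() {
        // test&set: atomically set the flag and return its old value;
        // keep spinning while the old value was set (someone else held the lock).
        while (value.test_and_set(std::memory_order_acquire)) { }
    }

    void unlock() {
        value.clear(std::memory_order_release);  // value = 0
    }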

Locks and busy-waiting All implementations have used busy-waiting Wastes CPU cycles To reduce busy-waiting, integrate Lock implementation Thread dispatcher data structures

Lock implementation #3: interrupt disable, no busy-waiting.

    lock () {
        disable interrupts
        if (value == FREE) {
            value = BUSY                          // lock acquired
        } else {
            add thread to queue of threads waiting for lock
            switch to next ready thread           // don't add current thread to ready queue
        }
        enable interrupts
    }

    unlock () {
        disable interrupts
        value = FREE
        if anyone on queue of threads waiting for lock {
            take waiting thread off queue, put on ready queue
            value = BUSY
        }
        enable interrupts
    }

Lock implementation #3, continued. Who gets the lock after someone calls unlock? The thread at the front of the waiting queue: unlock moves it to the ready queue and leaves value = BUSY on its behalf. This is called a "hand-off" lock.

Lock implementation #3, continued. Who might get the lock if it weren't handed off directly (i.e., if value weren't set to BUSY in unlock)? Any thread that runs and calls lock() before the woken waiter gets the CPU could grab it first.

Lock implementation #3, continued. What kind of lock-acquisition ordering does the hand-off lock guarantee? Waiters acquire the lock in the order they queued (FIFO, assuming a FIFO wait queue). What about a "fumble" lock, which just sets value = FREE and lets anyone grab it? It guarantees no particular order.

Lock implementation #3, continued. What does "switch to next ready thread" mean here? Are we saving the PC?

Lock implementation #3, continued. No: we just add a pointer to the waiting thread's TCB/context, and swapcontext saves and restores the rest.

    lock () {
        disable interrupts
        if (value == FREE) {
            value = BUSY    // lock acquired
        } else {
            lockqueue.push(&current_thread->ucontext);
            swapcontext(&current_thread->ucontext, &new_thread->ucontext);
        }
        enable interrupts
    }

Example interleaving (B holds the lock): Thread A calls lock(), disables interrupts, finds the lock busy, and switches (A->B). Thread B, which had earlier switched away inside yield(), comes back from that switch, enables interrupts, and exits the thread library. B runs user code and eventually calls unlock(), which moves A to the ready queue. When B later switches back (B->A), Thread A returns from its switch inside lock(), enables interrupts, and exits lock(), now holding the lock.

Lock implementation #4: test&set, minimal busy-waiting.

    lock () {
        while (test&set(guard)) {}       // like interrupt disable
        if (value == FREE) {
            value = BUSY
        } else {
            put thread on queue of threads waiting for lock
            switch to another thread     // don't add to ready queue
        }
        guard = 0                        // like interrupt enable
    }

    unlock () {
        while (test&set(guard)) {}       // like interrupt disable
        value = FREE
        if anyone on queue of threads waiting for lock {
            take waiting thread off queue, put on ready queue
            value = BUSY
        }
        guard = 0                        // like interrupt enable
    }

Lock implementation #4, continued. Why is this better than the test&set-only lock (implementation #2)? We only busy-wait while another thread is inside lock() or unlock(), which are short; before, we busy-waited for as long as the lock was held.

Lock implementation #4, continued. What is the switch invariant? Threads promise to call switch with guard set to 1; the code that resumes after a switch can therefore assume guard is still 1 and is responsible for clearing it.

Summary of implementing locks. Synchronization code needs atomicity. Three options: atomic load-store (lots of busy-waiting); interrupt disable-enable (no busy-waiting, but breaks on a multi-processor machine); atomic test-and-set (minimal busy-waiting, and works on multi-processor machines).

Upcoming. Next: a review of memory and address spaces. After that: we'll start reading papers and start programming Project 1. Questions?