Chien-Chung Shen CIS/UD


Chapter 28 Locks
Chien-Chung Shen, CIS/UD
cshen@udel.edu

Basic Ideas

Problem in concurrent programming:
- need to execute a sequence of instructions atomically
- interrupts may occur at any point, even on a single CPU
Solution: put locks around critical sections

    lock_t mutex;            // some globally-allocated lock 'mutex'
    ...
    lock(&mutex);
    balance = balance + 1;   // critical section
    unlock(&mutex);

- States of a lock: available/unlocked/free vs. acquired/locked/held
- lock() will not return while the lock is held by another thread

Semantics of Lock and Unlock

    lock(&mutex);
    balance = balance + 1;   // critical section
    unlock(&mutex);

- lock() tries to acquire lock mutex
  - if no other thread holds the lock (i.e., it is free), the thread acquires the lock, enters the critical section, and becomes the owner of the lock
  - if another thread then calls lock() on that same lock, the call will not return while the lock is held; in this way, other threads are prevented from entering the critical section while the first thread holds the lock
- Once the owner of the lock calls unlock(), the lock is available (free) again
  - if there are waiting threads (stuck in lock()), one of them will (eventually) notice (or be informed of) this change of the lock's state, acquire the lock, and enter the critical section

Lock and Scheduling

- Threads are entities created by the programmer but scheduled by the OS, in any fashion the OS chooses
- Locks give some minimal amount of control over scheduling back to programmers
  - e.g., guarantee that no more than a single thread can ever be active within a critical section

Pthread Locks

Mutex lock for mutual exclusion between threads:

    pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    Pthread_mutex_lock(&lock);   // wrapper for pthread_mutex_lock()
    balance = balance + 1;
    Pthread_mutex_unlock(&lock);

Wrappers that abort the program on failure:

    #include <pthread.h>
    #include <assert.h>

    void Pthread_mutex_lock(pthread_mutex_t *m) {
        int rc = pthread_mutex_lock(m);
        assert(rc == 0);   // abort the program if assertion is false
    }

    void Pthread_mutex_unlock(pthread_mutex_t *m) {
        int rc = pthread_mutex_unlock(m);
        assert(rc == 0);   // abort the program if assertion is false
    }

Coarse-grained Locking

    int count;
    int salary;
    mutex_lock A;

    thread 1          thread 2
    lock(A);          lock(A);
    count += 1;       count += 2;
    salary += 50;     salary += 100;
    unlock(A);        unlock(A);

How to allow more threads to execute (more) different critical sections at once? How to increase concurrency?

Fine-grained Locking

Allow more threads to execute (more) different critical sections at once:

    int count;
    int salary;
    mutex_lock A, B;

    thread 1          thread 2
    lock(A);          lock(A);
    count += 1;       count += 2;
    unlock(A);        unlock(A);
    lock(B);          lock(B);
    salary += 50;     salary += 100;
    unlock(B);        unlock(B);
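The fine-grained scheme can be sketched as a runnable program: one pthread mutex per shared variable, so a thread updating count never blocks a thread updating salary. This is an illustrative sketch, not the slide's exact code; the variable and thread names follow the slide.

```c
#include <pthread.h>
#include <stddef.h>

static int count = 0;
static int salary = 0;
static pthread_mutex_t A = PTHREAD_MUTEX_INITIALIZER; /* protects count  */
static pthread_mutex_t B = PTHREAD_MUTEX_INITIALIZER; /* protects salary */

static void *thread1(void *arg) {
    pthread_mutex_lock(&A);
    count += 1;
    pthread_mutex_unlock(&A);
    pthread_mutex_lock(&B);
    salary += 50;
    pthread_mutex_unlock(&B);
    return NULL;
}

static void *thread2(void *arg) {
    pthread_mutex_lock(&A);
    count += 2;
    pthread_mutex_unlock(&A);
    pthread_mutex_lock(&B);
    salary += 100;
    pthread_mutex_unlock(&B);
    return NULL;
}

/* run both threads; returns 1 when both final values are correct */
int demo(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, thread1, NULL);
    pthread_create(&t2, NULL, thread2, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return count == 3 && salary == 150;
}
```

Whichever thread runs first, each update is protected by the lock covering that variable, so the final values are deterministic even though the two critical sections can overlap in time.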

Building a Lock

- Now we understand how a lock works from the perspective of a programmer (i.e., how to use a lock)
- But how should we build a lock? OS (software) support? Hardware support?
- Over the years, different hardware primitives have been added to the instruction sets of various computer architectures
  - we won't study how these instructions are implemented (an issue of computer architecture, CISC 360)
  - we will study how to use them in order to build a lock
  - we will study how the OS gets involved to complete the picture and enable us to build a sophisticated locking library

Locks: Goals and Evaluation

Three criteria for evaluating lock implementations:
- correctness – guarantee mutual exclusion
- fairness – avoid starvation
- performance – overhead incurred by using locks
  - no contention: when a single thread is running and grabs and releases the lock, what is the overhead of doing so?
  - contention on a single CPU: when multiple threads are contending for the lock, are there performance concerns?
  - multiple CPUs: how does the lock perform when threads on each CPU contend for the lock?

Root Cause

On a single CPU, what is the root cause of race conditions among multiple threads? Interrupts

Solution #1

Disable interrupts for critical sections. Simple: no interrupts, no interference on a single CPU!

    void lock() {
        DisableInterrupt();   // special hardware instruction
    }

    void unlock() {
        EnableInterrupt();    // special hardware instruction
    }

    lock();
    balance = balance + 1;
    unlock();

Negatives?
- allows the calling thread to perform privileged operations – "trust?"
- does not work on multiprocessors (other threads may enter the CS)
- interrupts can be lost if they are turned off for a long time

First Attempt – use a variable

    typedef struct __lock_t { int flag; } lock_t;

    void init(lock_t *mutex) {
        mutex->flag = 0;   // 0 -> lock is available, 1 -> held
    }

    void lock(lock_t *mutex) {
        while (mutex->flag == 1)   // TEST the flag
            ;                      // spin-wait (do nothing)
        mutex->flag = 1;           // now SET it!
    }

    void unlock(lock_t *mutex) {
        mutex->flag = 0;
    }

What problems does this solution have?

Malicious Scheduler

- Pretend to be a malicious scheduler, one that interrupts threads at the most inopportune times in order to foil their feeble attempts at building synchronization primitives
- Although the exact sequence of interrupts may be improbable, it is possible, and that is all we need to demonstrate that a particular approach does not work
- It can be useful to think maliciously (sometimes)!

Thread Interleaving

(figure: an interleaving in which both threads read flag while it is still 0 (false), both pass the test, and both set flag to 1, so both enter the critical section)

Problems?
- correctness – no guarantee of mutual exclusion
- performance – spin-waiting: on a single CPU, the thread the waiter is waiting for cannot even run (at least, until a context switch occurs)!

Solution #2

Need other hardware support: the Test-and-Set (atomic exchange) instruction

Semantics:

    int TestAndSet(int *old_ptr, int new) {
        int old = *old_ptr;   // fetch old value at old_ptr
        *old_ptr = new;       // store 'new' into old_ptr
        return old;           // return the old value
    }

- returns the old value pointed to by old_ptr, and simultaneously updates that value to new
- makes the "test" (of the old value) and the "set" (of the new value) a single atomic operation (one assembly instruction)
  - SPARC: ldstub (load/store unsigned byte)
  - x86: xchg (atomic exchange)

Spin Lock with Test-and-Set

    typedef struct __lock_t { int flag; } lock_t;

    void init(lock_t *lock) {
        lock->flag = 0;   // 0 -> lock is available, 1 -> held
    }

    void lock(lock_t *lock) {
        while (TestAndSet(&lock->flag, 1) == 1)   // TEST-and-SET the flag
            ;                                     // spin-wait (do nothing)
    }

    void unlock(lock_t *lock) {
        lock->flag = 0;
    }

- As long as the lock is held by another thread, TestAndSet() will repeatedly return 1, and the calling thread will spin-wait
- What kind of scheduler do we need on a single processor? A preemptive scheduler (one that interrupts threads via a timer)
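The test-and-set spin lock can be made runnable with C11 atomics; this is a sketch in which atomic_exchange plays the role of the TestAndSet hardware instruction (the slide's pseudocode, not a production lock).

```c
#include <stdatomic.h>
#include <pthread.h>
#include <stddef.h>

typedef struct __lock_t { atomic_int flag; } lock_t;

void init(lock_t *lock) { atomic_store(&lock->flag, 0); }

void lock(lock_t *lock) {
    /* atomic_exchange returns the old value and stores 1, atomically:
       exactly the TestAndSet semantics from the slide */
    while (atomic_exchange(&lock->flag, 1) == 1)
        ;   /* spin-wait (do nothing) */
}

void unlock(lock_t *lock) { atomic_store(&lock->flag, 0); }

/* demo: two threads increment a shared counter under the lock */
static lock_t l;
static int counter = 0;

static void *worker(void *arg) {
    for (int i = 0; i < 100000; i++) {
        lock(&l);
        counter++;        /* critical section */
        unlock(&l);
    }
    return NULL;
}

int demo(void) {
    init(&l);
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;   /* 200000 iff mutual exclusion held throughout */
}
```

If the plain-int "first attempt" lock were used instead, some increments could be lost; with the atomic exchange, the final count is deterministic.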

Evaluation of Spin Lock

Three criteria (how effective is it?):
- correctness (mutual exclusion)? Yes
- fairness (avoid starvation)? No – a waiting thread may wait (spin) forever under contention
- performance (overhead)?
  - bad on a single CPU – with N threads contending, up to N-1 time slices may be wasted spinning
  - reasonably good on multiple CPUs (if # threads ~ # CPUs), assuming critical sections are short

Solution #3

On SPARC: the compare-and-swap instruction; on x86: compare-and-exchange (cmpxchgl)

Semantics:

    int CompareAndSwap(int *ptr, int expected, int new) {
        int actual = *ptr;
        if (actual == expected)
            *ptr = new;
        return actual;
    }

- test whether the value at the address specified by ptr is equal to expected; if so, update the memory location pointed to by ptr with the new value; if not, do nothing; either way, return the original value at ptr

Lock:

    void lock(lock_t *lock) {
        while (CompareAndSwap(&lock->flag, 0, 1) == 1)
            ;   // spin
    }

- checks whether flag is 0 and, if so, atomically swaps in 1, thus acquiring the lock

CompareAndSwap

C-callable x86 version of compare-and-swap (inline assembly around the cmpxchgl instruction; listing not reproduced here)
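The x86 inline-assembly listing is not reproduced in this transcript; as a portable stand-in (an assumption of mine, not the slide's code), C11's atomic_compare_exchange_strong provides the same semantics as the CompareAndSwap pseudocode:

```c
#include <stdatomic.h>

/* Same semantics as the slide's pseudocode: if *ptr == expected,
   store new_val; either way, return the original value at ptr. */
int CompareAndSwap(atomic_int *ptr, int expected, int new_val) {
    int actual = expected;
    /* On success *ptr becomes new_val and actual is untouched (== old
       value); on failure actual is overwritten with the value found at
       ptr. Either way, actual ends up holding the original value. */
    atomic_compare_exchange_strong(ptr, &actual, new_val);
    return actual;
}

int demo(void) {
    atomic_int x = 0;
    if (CompareAndSwap(&x, 0, 1) != 0) return -1; /* success: old was 0 */
    if (CompareAndSwap(&x, 0, 2) != 1) return -2; /* failure: returns 1 */
    return atomic_load(&x);   /* still 1: failed CAS left x unchanged */
}
```

The hardware instruction and the C11 builtin both make the compare and the conditional store a single indivisible step, which is what the spin loop in lock() relies on.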

Which One is More Powerful?

    int TestAndSet(int *old_ptr, int new) {
        int old = *old_ptr;
        *old_ptr = new;
        return old;
    }

    int CompareAndSwap(int *ptr, int expected, int new) {
        int actual = *ptr;
        if (actual == expected)
            *ptr = new;
        return actual;
    }

Compare-and-swap is the more powerful primitive: it can be used for lock-free synchronization
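One way to see that compare-and-swap is at least as powerful as test-and-set: TestAndSet can be emulated by retrying CompareAndSwap until it succeeds. This is an illustrative sketch using C11 atomics (the function name is mine, not from the slide):

```c
#include <stdatomic.h>

/* Emulate TestAndSet with a CAS retry loop: unconditionally install
   new_val and return whatever value was there before. */
int TestAndSetFromCAS(atomic_int *ptr, int new_val) {
    int old = atomic_load(ptr);
    /* compare_exchange_weak refreshes 'old' with the current value on
       each failure, so the loop always retries against a fresh value */
    while (!atomic_compare_exchange_weak(ptr, &old, new_val))
        ;
    return old;
}

int demo(void) {
    atomic_int x = 5;
    int old = TestAndSetFromCAS(&x, 7);   /* old == 5, x becomes 7 */
    return old * 10 + atomic_load(&x);    /* 57 */
}
```

The reverse emulation does not work: no number of TestAndSet calls can implement CAS's conditional update, which is why CAS supports lock-free algorithms that TestAndSet cannot.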

Let's Design a Lock

- So far, the designs do not provide fairness
- Any inspiration from your daily life?

Solution #4

Ticket lock with FetchAndAdd, which atomically increments a value while returning the old value at a particular address:

    int FetchAndAdd(int *ptr) {   // semantics
        int old = *ptr;
        *ptr = old + 1;
        return old;
    }

    typedef struct __lock_t {
        int ticket;
        int turn;
    } lock_t;

    void lock_init(lock_t *lock) {
        lock->ticket = 0;
        lock->turn = 0;
    }

    void lock(lock_t *lock) {
        int myturn = FetchAndAdd(&lock->ticket);   // get a ticket (for my turn)
        while (lock->turn != myturn)
            ;   // spin if not my turn
    }

    void unlock(lock_t *lock) {
        lock->turn = lock->turn + 1;   // enable the next waiting thread
    }                                  // (if any) to enter the CS

Anything good? It ensures progress for all threads and is fair
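The ticket lock becomes runnable with C11 atomics, where atomic_fetch_add stands in for the FetchAndAdd instruction; a sketch following the slide's structure:

```c
#include <stdatomic.h>
#include <pthread.h>
#include <stddef.h>

typedef struct __lock_t {
    atomic_int ticket;   /* next ticket to hand out            */
    atomic_int turn;     /* ticket currently allowed to enter  */
} lock_t;

void lock_init(lock_t *lock) {
    atomic_store(&lock->ticket, 0);
    atomic_store(&lock->turn, 0);
}

void lock(lock_t *lock) {
    /* atomic_fetch_add returns the old value: my ticket number */
    int myturn = atomic_fetch_add(&lock->ticket, 1);
    while (atomic_load(&lock->turn) != myturn)
        ;   /* spin if not my turn */
}

void unlock(lock_t *lock) {
    atomic_fetch_add(&lock->turn, 1);   /* admit the next waiter, FIFO */
}

/* demo: two threads increment a shared counter under the ticket lock */
static lock_t tl;
static int counter = 0;

static void *worker(void *arg) {
    for (int i = 0; i < 50000; i++) {
        lock(&tl);
        counter++;
        unlock(&tl);
    }
    return NULL;
}

int demo(void) {
    lock_init(&tl);
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;
}
```

Because tickets are handed out in order and turn only ever advances by one, waiters enter in FIFO order: this is what gives the ticket lock its fairness guarantee.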

Too Much Spinning: What Now?

- Hardware support provides correctness and fairness; the OS, in addition to hardware, supports efficiency
- When a context switch occurs inside a CS, "other" threads start to spin endlessly, waiting for the interrupted (lock-holding) thread to run again
- How to avoid spinning? If the lock has been acquired, just yield the CPU (an OS primitive)

    void init() {
        flag = 0;
    }

    void lock() {
        while (TestAndSet(&flag, 1) == 1)   // TEST "flag"
            yield();   // give up CPU and move to READY state;
    }                  // another thread is promoted to RUNNING

    void unlock() {
        flag = 0;
    }
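The test-and-set-plus-yield lock can be sketched with C11 atomics and POSIX sched_yield() standing in for the slide's yield() primitive (an assumption for illustration; the slide's yield() is an abstract OS call):

```c
#include <stdatomic.h>
#include <sched.h>
#include <pthread.h>
#include <stddef.h>

static atomic_int flag = 0;

void lock(void) {
    while (atomic_exchange(&flag, 1) == 1)   /* TEST "flag" */
        sched_yield();   /* give up the CPU instead of burning the
                            rest of the time slice spinning */
}

void unlock(void) { atomic_store(&flag, 0); }

/* demo: two threads increment a shared counter under the lock */
static int counter = 0;

static void *worker(void *arg) {
    for (int i = 0; i < 50000; i++) {
        lock();
        counter++;
        unlock();
    }
    return NULL;
}

int demo(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;
}
```

Correctness is identical to the pure spin lock; only the behavior while waiting changes, trading wasted time slices for context-switch overhead.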

Yield: Cost and Issue

- Overhead
  - 2 threads on one CPU: fine
  - how about 100 threads on one CPU? If one thread acquires the lock and is preempted before releasing it, the other 99 will each call lock(), find the lock held, and yield the CPU (99 context switches)
  - better than a spin lock, which would waste 99 time slices spinning, but the context-switch overhead remains
- Still one problem not solved
  - starvation: a thread may get caught in an endless yield loop while other threads repeatedly enter and exit the critical section (there is no discipline on which thread is executed next)

Starvation !!!

Which Data Structure to Use?

- To prevent starvation, we need to explicitly exert some control over which thread gets to acquire the lock next after the current holder releases it
- What "data structure" would you use to keep track of which threads are waiting to acquire the lock? A queue
- OS support
  - park() – put the calling thread to sleep (not READY)
  - unpark(thread_ID) – wake up thread thread_ID
  - used to build a lock that puts a caller to sleep if it tries to acquire a held lock, and wakes a sleeping thread (if any) when the lock is freed

Queue and Yield/Wakeup

A more efficient lock with a queue: no starvation; guard serves as a spin lock around the flag and queue manipulation code

    typedef struct __lock_t {
        int flag;
        int guard;
        queue_t *q;   // queue of lock waiters
    } lock_t;

    void lock_init(lock_t *m) {
        m->flag = 0;
        m->guard = 0;
        queue_init(m->q);
    }

    void lock(lock_t *m) {
        while (TestAndSet(&m->guard, 1) == 1)
            ;   // acquire guard lock by spinning
        if (m->flag == 0) {
            m->flag = 1;   // lock is acquired
            m->guard = 0;
        } else {
            queue_add(m->q, gettid());   // add to the lock's queue
            m->guard = 0;                // (an explicit queue of waiters)
            park();                      // put calling thread to sleep
        }
    }

    void unlock(lock_t *m) {
        while (TestAndSet(&m->guard, 1) == 1)
            ;   // acquire guard lock by spinning
        if (queue_empty(m->q))
            m->flag = 0;   // let go of lock; no one wants it
        else
            unpark(queue_remove(m->q));   // wake up next thread and hold
                                          // lock (for the next thread!)
        m->guard = 0;
    }

Questions

    void lock(lock_t *m) {
        while (TestAndSet(&m->guard, 1) == 1)
            ;   // acquire guard lock by spinning
        if (m->flag == 0) {
            m->flag = 1;   // lock is acquired
            m->guard = 0;
        } else {
            queue_add(m->q, gettid());   // add to the lock's queue
            x: m->guard = 0;
            y: park();   // put calling thread to sleep
        }
    }

    void unlock(lock_t *m) {
        while (TestAndSet(&m->guard, 1) == 1)
            ;   // acquire guard lock by spinning
        if (queue_empty(m->q))
            m->flag = 0;   // let go of lock; no one wants it
        else
            unpark(queue_remove(m->q));   // hold lock (for next thread!)
        m->guard = 0;
    }

- Why is guard used? guard is used as a spin lock to protect the flag and queue manipulation of lock m
- Can x and y be swapped?
- Why does flag not get set to 0 when another thread gets woken up? The woken thread does not hold the guard; the lock is passed directly to it, so flag stays 1