OSE 2013 – synchronization (lec3) 1 Operating Systems Engineering Locking & Synchronization [chapter #4] By Dan Tsafrir, 2013-04-03.



FINE-GRAINED SYNCHRONIZATION
(when the critical section is very short)

Agenda: mutex locks
- Locks
  - Why do we need them?
  - How do we implement them? (in xv6 in particular)
  - How do we use them?

Why locks?
- The point of multiprocessors / multicores?
  - Goal: increase performance
  - Means: running different tasks concurrently
- Problem
  - Simultaneous use of the same data structure(s) might cause incorrect results
- Solution
  - Synchronize & coordinate, so as to serialize access where needed
  - To do that, we will use spinlocks

Example

struct List {
  int data;
  struct List *next;
};

List *list = 0;

insert(int data) {
  List *l = new List;
  l->data = data;
  l->next = list;
  list = l;
}

- Works correctly when serial
- Breaks when two cores do it simultaneously

Race condition

Core #1:
insert(int data) {    // data=3
  List *l = new List;
  l->data = data;
  l->next = list;
  list = l;           // list = [3 -> null]
}

Core #2:
insert(int data) {    // data=2
  List *l = new List;
  l->data = data;
  l->next = list;
  list = l;           // list = [2 -> null]
}
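The lost update can be replayed deterministically on a single core by hand-interleaving the two cores' steps. A minimal C sketch (not from the lecture; the `race_demo` name is mine, and malloc stands in for the slides' `new`):

```c
#include <stdlib.h>

struct List { int data; struct List *next; };

/* Simulate the bad interleaving by hand: both "cores" read the old
 * head before either one writes the new head. */
static struct List *race_demo(void) {
    struct List *list = NULL;

    struct List *l1 = malloc(sizeof *l1);   /* core #1: insert(3) */
    struct List *l2 = malloc(sizeof *l2);   /* core #2: insert(2) */
    l1->data = 3;
    l2->data = 2;

    l1->next = list;   /* core #1 reads head (NULL)            */
    l2->next = list;   /* core #2 reads the SAME head (NULL)   */
    list = l1;         /* core #1 publishes its node           */
    list = l2;         /* core #2 overwrites it: node 3 lost!  */

    return list;       /* [2 -> NULL]; the insert of 3 vanished */
}
```

Both cores read the same old head before either publishes, so the second `list = l` overwrites the first and the node holding 3 is lost.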

Race condition
- Why is it called a “race”?
  - Nondeterministic nature
  - Debug printing might eliminate the race, or introduce a new one…
  - This nondeterministic behavior makes it very hard to debug
- How to avoid races? Invariants
  - Introduce a lock to protect a collection of invariants
    - Invariants = data-structure properties that are maintained across ops
    - For a linked list: ‘list’ points to the head + each ‘next’ points to the next element
  - An op might temporarily violate invariants
    - Which is why we need it to be “atomic”

Serialize with “lock abstraction”

insert(int data) {
  acquire( &list_lock );
  List *l = new List;
  l->data = data;
  l->next = list;
  list = l;
  release( &list_lock );
}

- Associate a lock object with the list
- 2 ops + 2 matching states:
  1. acquire => “locked”
  2. release => “unlocked”
- Semantics:
  - Only one thread can ‘acquire’ the lock
  - Other threads must wait until it is ‘release’-d

Serialize with “lock abstraction”

insert(int data) {
  List *l = new List;
  l->data = data;
  acquire( &list_lock );
  l->next = list;
  list = l;
  release( &list_lock );
}

- Would this work?
- Why would we want to do it?

Acquire & Release (broken)

struct Lock {
  int locked; // 0=free; 1=taken
};

void broken_acquire(Lock *lock) {
  while(1) {
    if(lock->locked == 0) {
      lock->locked = 1;
      break;
    }
  }
}

void release(Lock *lock) {
  lock->locked = 0;
}

Acquire & Release (broken)

struct Lock {
  int locked; // 0=free; 1=taken
};

void broken_acquire(Lock *lock) {
  while(1) {
    if(lock->locked == 0) {  // can be acquired simultaneously:
      lock->locked = 1;      // we want these 2 lines to be atomic
      break;
    }
  }
}

void release(Lock *lock) {
  lock->locked = 0;
}

Atomic instructions
- Most CPUs provide atomic operations
  - Often doing exactly what we needed in the previous slide
- In x86
  - xchg %eax, addr
  - Is equivalent to swap(%eax, *addr):
    1. freeze any memory activity for address addr
    2. temp := *addr
    3. *addr := %eax
    4. %eax := temp
    5. un-freeze other memory activity

Acquire & Release (works)

int xchg(addr, value) {
  %eax = value
  xchg %eax, (addr)   // now it’s atomic
  return %eax
}

void acquire(Lock *lock) {
  while(1)
    if( xchg(&lock->locked, 1) == 0 )
      break;
}

void release(Lock *lock) {
  xchg(&lock->locked, 0);
  // why not do: lock->locked = 0 ?
  // [see next slide…]
}
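The same spinlock can be sketched in user-space C, assuming GCC/Clang: the `__sync_lock_test_and_set` builtin plays the role of the xchg-based helper above (the `spin_acquire`/`spin_release` names and the 4-thread demo are mine, not xv6's):

```c
#include <pthread.h>

struct Lock { int locked; };             /* 0 = free, 1 = taken */

/* __sync_lock_test_and_set atomically stores 1 and returns the
 * previous value -- the same effect as xchg(&lock->locked, 1). */
static void spin_acquire(struct Lock *lk) {
    while (__sync_lock_test_and_set(&lk->locked, 1) != 0)
        ;  /* spin until we observe the lock was free */
}

static void spin_release(struct Lock *lk) {
    __sync_lock_release(&lk->locked);    /* atomic store of 0 + barrier */
}

static struct Lock lk;
static long counter;

static void *worker(void *arg) {
    for (int i = 0; i < 100000; i++) {
        spin_acquire(&lk);
        counter++;                       /* critical section */
        spin_release(&lk);
    }
    return NULL;
}

/* Run 4 threads; with a correct lock the final count is exact. */
static long run_spinlock_demo(void) {
    pthread_t t[4];
    counter = 0;
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    return counter;
}
```

With the broken test-then-set acquire from the previous slide, the final count would usually fall short of 400000; with the atomic exchange it is always exact.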

Memory ordering (memory fence/barrier)

insert(int data) {
  acquire( &list_lock );  // p0
  List *l = new List;
  l->data = data;
  l->next = list;         // p1
  list = l;               // p2
  release( &list_lock );
}

- Both the processor & the compiler can reorder operations (out-of-order execution) to maximize ILP
- We must make sure p1 & p2 don’t get reordered to end up before p0
- xchg serves as a memory “fence” or “barrier”
  - Full fence: all reads/writes prior to p0 occur before any reads/writes after p0
  - Partial fence: all writes before p0 occur before any reads after p0
  - (Also, xchg is “asm volatile”)
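The ordering guarantee the fence buys can be illustrated with C11 atomics: a release store keeps earlier writes from sinking past it, and an acquire load keeps later reads from rising above it. A user-space sketch (not xv6 code; `payload` and `ready` are made-up names):

```c
#include <stdatomic.h>
#include <pthread.h>

/* Producer writes the data, then raises the flag with release
 * ordering; the consumer spins on the flag with acquire ordering
 * before reading the data. This mirrors what xchg's implicit full
 * fence provides for acquire()/release(). */
static int payload;
static atomic_int ready;

static void *producer(void *arg) {
    payload = 42;                                            /* like p1/p2 */
    atomic_store_explicit(&ready, 1, memory_order_release);  /* fence + flag */
    return NULL;
}

static int consume(void) {
    pthread_t t;
    payload = 0;
    atomic_store(&ready, 0);
    pthread_create(&t, NULL, producer, NULL);
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;  /* spin; acquire makes the producer's writes visible */
    int v = payload;   /* guaranteed to see 42, not a stale 0 */
    pthread_join(t, NULL);
    return v;
}
```

Without the release/acquire pair, the compiler or CPU would be allowed to reorder the `payload` write past the flag store, and the consumer could legally read a stale 0.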

Spin or block?
- Constructs like while(1) { xchg } are called
  - “Busy waiting” or “spinning”
  - Indeed, the lock we’ve used is called a “spinlock”
- Not advisable when the critical section is long
  - (critical section = everything between acquire and release)
  - Because other threads could utilize the core instead
- The alternative: blocking
  - In xv6: sleep() & wakeup()
  - Not advisable when the wait is short (otherwise it induces overhead)
- There’s also two-phase waiting (“the ski-rental problem”)
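Two-phase waiting can be sketched in C: spin for a bounded number of iterations, then give up the core, with `sched_yield()` standing in for a real sleep()/wakeup() blocking path. All names and the 1000-iteration bound are arbitrary illustrative choices, not from the lecture:

```c
#include <pthread.h>
#include <sched.h>

struct TPLock { int locked; };   /* 0 = free, 1 = taken */

static void tp_acquire(struct TPLock *lk) {
    for (;;) {
        for (int i = 0; i < 1000; i++)               /* phase 1: spin */
            if (__sync_lock_test_and_set(&lk->locked, 1) == 0)
                return;
        sched_yield();   /* phase 2: stop burning the core */
    }
}

static void tp_release(struct TPLock *lk) {
    __sync_lock_release(&lk->locked);
}

static struct TPLock tpl;
static long tp_counter;

static void *tp_worker(void *arg) {
    for (int i = 0; i < 50000; i++) {
        tp_acquire(&tpl);
        tp_counter++;                                /* critical section */
        tp_release(&tpl);
    }
    return NULL;
}

static long run_tp_demo(void) {
    pthread_t t[2];
    tp_counter = 0;
    for (int i = 0; i < 2; i++) pthread_create(&t[i], NULL, tp_worker, NULL);
    for (int i = 0; i < 2; i++) pthread_join(t[i], NULL);
    return tp_counter;
}
```

The idea is the ski-rental trade-off: spinning is cheap if the wait turns out to be short, and the bounded spin caps how much is wasted before falling back to the (heavier) blocking path.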

Locking granularity
- How much code should we lock?
  - Once, Linux had a “big lock” (for the entire kernel!)
    - (Completely removed only in 2011…)
  - What’s the appropriate locking granularity for a file system:
    - All of it with a single lock? Every directory? Every file?
- Coarse granularity
  - Pros: simple, easier, tends to be more correct…
  - Cons: performance degradation, negates concurrency
- Fine granularity
  - Pros: if done right, allows concurrency most of the time
  - Cons: messes up the code, hard to get right, hard to debug
- xv6
  - To simplify, tends to favor coarse (but no “kernel big lock”)
  - E.g., locking the entire process table instead of individual entries
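Fine granularity can be illustrated with a hash table that holds one lock per bucket, so inserts into different buckets proceed concurrently while inserts into the same bucket are still serialized. A sketch only; `NBUCKET`, `Table`, and the helper names are illustrative, not xv6 structures:

```c
#include <pthread.h>

#define NBUCKET 8

struct Node  { int key; struct Node *next; };
struct Table {
    pthread_mutex_t lock[NBUCKET];   /* fine: one lock per bucket */
    struct Node *bucket[NBUCKET];
};

static void table_init(struct Table *t) {
    for (int i = 0; i < NBUCKET; i++) {
        pthread_mutex_init(&t->lock[i], NULL);
        t->bucket[i] = NULL;
    }
}

static void table_insert(struct Table *t, struct Node *n) {
    int b = (unsigned)n->key % NBUCKET;
    pthread_mutex_lock(&t->lock[b]);     /* only this bucket is serialized */
    n->next = t->bucket[b];
    t->bucket[b] = n;
    pthread_mutex_unlock(&t->lock[b]);
}

static int table_count(struct Table *t) {
    int c = 0;
    for (int i = 0; i < NBUCKET; i++)
        for (struct Node *n = t->bucket[i]; n; n = n->next)
            c++;
    return c;
}

static int run_table_demo(void) {
    static struct Table t;
    static struct Node nodes[100];
    table_init(&t);
    for (int i = 0; i < 100; i++) {
        nodes[i].key = i;
        table_insert(&t, &nodes[i]);
    }
    return table_count(&t);
}
```

The coarse alternative would be a single `pthread_mutex_t` for the whole table: simpler and easier to reason about, but every insert then serializes against every other, which is exactly the trade-off the slide describes.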

Lock ordering
- When utilizing more than one lock
  - All code paths must acquire the locks in the same order
  - Otherwise => deadlock!
- xv6
  - Simple / coarse => not a lot of locks held together
  - The longest chain is 2, e.g.:
    - When removing a file: lock dir => lock file (in that order)
    - ideintr() / ide lock => wakeup / ptable lock
    - …
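One standard way to enforce a single global order is to sort the locks by address before acquiring them, so opposite-direction callers still lock in the same order. A sketch under assumed names (the account/transfer example is illustrative, not from xv6):

```c
#include <pthread.h>

struct Account { pthread_mutex_t lock; long balance; };

/* Always lock the lower-addressed account first: every code path
 * that needs both locks therefore acquires them in the same order. */
static void acquire_pair(struct Account *a, struct Account *b) {
    if (a < b) { pthread_mutex_lock(&a->lock); pthread_mutex_lock(&b->lock); }
    else       { pthread_mutex_lock(&b->lock); pthread_mutex_lock(&a->lock); }
}

static void release_pair(struct Account *a, struct Account *b) {
    pthread_mutex_unlock(&a->lock);
    pthread_mutex_unlock(&b->lock);
}

static void transfer(struct Account *from, struct Account *to, long amt) {
    acquire_pair(from, to);   /* same lock order no matter who calls */
    from->balance -= amt;
    to->balance   += amt;
    release_pair(from, to);
}

static long run_transfer_demo(void) {
    struct Account a = { PTHREAD_MUTEX_INITIALIZER, 100 };
    struct Account b = { PTHREAD_MUTEX_INITIALIZER, 0 };
    transfer(&a, &b, 30);   /* a -> b */
    transfer(&b, &a, 10);   /* b -> a: opposite direction, same lock order */
    return a.balance * 1000 + b.balance;   /* encode both balances */
}
```

Without the address ordering, `transfer(&a,&b,…)` on one core and `transfer(&b,&a,…)` on another could each grab their first lock and wait forever for the other's: the classic deadlock the slide warns about.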

Synchronizing with interrupts
- A core uses locks to synchronize with other cores
  - (both their interrupt and non-interrupt code)
- A core must also synchronize with itself
  - Namely, its non-interrupt + interrupt code
  - Both are essentially different threads of execution that might manipulate the same data
- Example: both of the following use tickslock (3063)
  - the tick handler (timer interrupt; 3116), and
  - sys_sleep (3473)
- xv6 solution
  - Disable interrupts while holding a lock
  - Thus, interrupts will never be processed while a lock is held
  - Which ensures no deadlocks due to interrupts
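xv6 implements this with nested pushcli()/popcli() calls around lock acquire/release: interrupt-disabling nests, and interrupts come back on only when the outermost critical section ends (and only if they were on to begin with). A user-space sketch of that nesting logic, with cli()/sti() simulated by a flag since we cannot execute the privileged instructions here:

```c
/* Simulated interrupt state -- stand-ins for the x86 cli/sti
 * instructions and the saved eflags bit xv6 actually uses. */
static int int_enabled = 1;   /* 1 = interrupts "on" */
static int ncli;              /* pushcli nesting depth */
static int was_enabled;       /* state to restore at depth 0 */

static void cli(void) { int_enabled = 0; }
static void sti(void) { int_enabled = 1; }

static void pushcli(void) {
    int old = int_enabled;
    cli();                    /* disable first, then bookkeep */
    if (ncli++ == 0)
        was_enabled = old;    /* remember the outermost caller's state */
}

static void popcli(void) {
    if (--ncli == 0 && was_enabled)
        sti();                /* re-enable only at the outermost pop */
}
```

The nesting matters because acquiring a second lock while already holding one must not re-enable interrupts when the inner lock is released; only the final popcli() restores the original state.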

xv6 locking code
- Pages 14–15

Locks mess up modularity
- When calling a function F, it would be best if we didn’t need to know whether it uses a lock
  - Supposedly these are “internal details”
- Alas, in the kernel, it oftentimes doesn’t work that way
  - Performance is paramount; we typically don’t sacrifice performance for elegance
  - If I hold lock L and invoke F, F must not attempt to acquire L
  - If F acquires L, I must not call it while holding L
- => Locks break modularity and contaminate the OS code
- => Consider locks as part of the function specification
  - “call this function while holding L1…”
  - “if that function returns -2, release L2…”
  - The ordering of locks needs to be known too

Real world
- Locking is hard
  - Despite years of experience, it’s still very challenging!
  - Deadlocks are common during development, even for experts
- Our spinlock, with xchg, keeps the memory system busy => suboptimal
  - We can sometimes do better (see later lecture)
  - RCU (read-copy-update) optimizes for mostly-read shared state; heavily used in Linux
- Synchronization is an active field of research
  - Transactional memory (via software, hardware, or both)
  - Lock-free data structures
  - A research OS proposed in 2010 that is always deterministic (see later lecture)
  - Many papers about automatically identifying races