Linux Synchronization

Slides:



Advertisements
Similar presentations
Operating Systems Semaphores II
Advertisements

1 Interprocess Communication 1. Ways of passing information 2. Guarded critical activities (e.g. updating shared data) 3. Proper sequencing in case of.
Global Environment Model. MUTUAL EXCLUSION PROBLEM The operations used by processes to access to common resources (critical sections) must be mutually.
Resource management and Synchronization Akos Ledeczi EECE 354, Fall 2010 Vanderbilt University.
Ch. 7 Process Synchronization (1/2) I Background F Producer - Consumer process :  Compiler, Assembler, Loader, · · · · · · F Bounded buffer.
Chapter 6: Process Synchronization
Background Concurrent access to shared data can lead to inconsistencies Maintaining data consistency among cooperating processes is critical What is wrong.
Silberschatz, Galvin and Gagne ©2009 Operating System Concepts – 8 th Edition, Chapter 6: Process Synchronization.
Process Synchronization. Module 6: Process Synchronization Background The Critical-Section Problem Peterson’s Solution Synchronization Hardware Semaphores.
Kernel Synchronization
Synchronization. Shared Memory Thread Synchronization Threads cooperate in multithreaded environments – User threads and kernel threads – Share resources.
CS533 Concepts of Operating Systems Class 4 Linux Kernel Locking Issues.
CS510 Concurrent Systems Class 1
Concurrency: Mutual Exclusion, Synchronization, Deadlock, and Starvation in Representative Operating Systems.
CS510 Concurrent Systems Class 1 Linux Kernel Locking Techniques.
CS533 Concepts of Operating Systems Class 17 Linux Kernel Locking Techniques.
Race Conditions CS550 Operating Systems. Review So far, we have discussed Processes and Threads and talked about multithreading and MPI processes by example.
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Emery Berger University of Massachusetts, Amherst Operating Systems CMPSCI 377 Lecture.
CS533 Concepts of Operating Systems Linux Kernel Locking Techniques.
Instructor: Umar KalimNUST Institute of Information Technology Operating Systems Process Synchronization.
1 Race Conditions/Mutual Exclusion Segment of code of a process where a shared resource is accessed (changing global variables, writing files etc) is called.
Operating Systems ECE344 Ashvin Goel ECE University of Toronto Mutual Exclusion.
CSC 660: Advanced Operating SystemsSlide #1 CSC 660: Advanced OS Synchronization.
Kernel Locking Techniques by Robert Love presented by Scott Price.
Chapter 6 – Process Synchronisation (Pgs 225 – 267)
CS 3204 Operating Systems Godmar Back Lecture 7. 12/12/2015CS 3204 Fall Announcements Project 1 due on Sep 29, 11:59pm Reading: –Read carefully.
Background Concurrent access to shared data may result in data inconsistency Maintaining data consistency requires mechanisms to ensure the orderly execution.
CS533 Concepts of Operating Systems Jonathan Walpole.
CS399 New Beginnings Jonathan Walpole. 2 Concurrent Programming & Synchronization Primitives.
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science Computer Systems Principles Synchronization Emery Berger and Mark Corner University.
Operating System Concepts and Techniques Lecture 13 Interprocess communication-2 M. Naghibzadeh Reference M. Naghibzadeh, Operating System Concepts and.
CSCI1600: Embedded and Real Time Software Lecture 17: Concurrent Programming Steven Reiss, Fall 2015.
CSC 660: Advanced Operating SystemsSlide #1 CSC 660: Advanced OS Synchronization.
Mutual Exclusion -- Addendum. Mutual Exclusion in Critical Sections.
CS162 Section 2. True/False A thread needs to own a semaphore, meaning the thread has called semaphore.P(), before it can call semaphore.V() False: Any.
6.1 Silberschatz, Galvin and Gagne ©2005 Operating System Principles 6.5 Semaphore Less complicated than the hardware-based solutions Semaphore S – integer.
Inside The RT Patch Steven Rostedt (Red Hat) Darren V Hart (IBM) Talk: Benchmarks :
Kernel Synchronization David Ferry, Chris Gill CSE 522S - Advanced Operating Systems Washington University in St. Louis St. Louis, MO
Interprocess Communication Race Conditions
Chapter 6: Process Synchronization
CS703 – Advanced Operating Systems
Process Synchronization
Process Synchronization: Semaphores
Chapter 5: Process Synchronization – Part 3
Background on the need for Synchronization
Synchronization.
Atomic Operations in Hardware
Atomic Operations in Hardware
SYNCHRONIZATION IN LINUX
Chapter 5: Process Synchronization
Topic 6 (Textbook - Chapter 5) Process Synchronization
Jonathan Walpole Computer Science Portland State University
COT 5611 Operating Systems Design Principles Spring 2012
Lecture 2 Part 2 Process Synchronization
Chapter 7: Synchronization Examples
CSCI1600: Embedded and Real Time Software
CS510 Concurrent Systems Class 1a
Kernel Synchronization II
Userspace Synchronization
CSE 153 Design of Operating Systems Winter 19
CS333 Intro to Operating Systems
Chapter 6: Synchronization Tools
CSCI1600: Embedded and Real Time Software
Lecture 20: Synchronization
CS510 Concurrent Systems Jonathan Walpole.
Linux Kernel Locking Techniques
“The Little Book on Semaphores” Allen B. Downey
Process/Thread Synchronization (Part 2)
Akos Ledeczi EECE 6354, Fall 2017 Vanderbilt University
Akos Ledeczi EECE 6354, Fall 2015 Vanderbilt University
Presentation transcript:

Linux Synchronization Computer Science & Engineering Department Arizona State University Tempe, AZ 85287 Dr. Yann-Hang Lee yhlee@asu.edu (480) 727-7507

Synchronization Primitives Hardware provides simple low-level atomic operations, upon which we can build high-level synchronization primitives Then, implementation of critical sections and build correct multi-threaded/multi-process programs Low-level synchronization primitives in Linux Memory barrier Atomic operations Synchronize with interrupts Spin locks High-level synchronization primitives in Linux Completion Semaphore Mutex Futex

Memory Barrier Motivation Memory Barriers: Linux barrier operation Reorder code as long as it correctly maintains data flow dependencies within a function and with called functions Reorder instruction execution as long as it correctly maintains register flow dependencies Reorder memory modification as long as it correctly maintains data flow dependencies Memory Barriers: instructions to compiler and/or hardware to complete all pending accesses before issuing any more prevent compiler/hardware reordering Linux barrier operation barrier – prevent only compiler reordering (For gcc: __asm__ __volatile__("": : :"memory")) mb – prevents load and store reordering (asm volatile("mfence":::"memory") rmb – prevents load reordering ( asm volatile("lfence":::"memory") wmb – prevents store reordering ( asm volatile("sfence" ::: "memory")

Atomic Operations In RISC, load-link/store conditional (ldrex/strex) Atomic operations provide instructions that are executable atomically; without interruption Not possible for two atomic operations by a single CPU to occur concurrently Atomic 80x86 instructions Instructions that make zero or one aligned memory access Read-modify-write instructions (inc or dec) Read-modify-write instructions whose opcode is prefixed by the lock byte (0xf0) In RISC, load-link/store conditional (ldrex/strex) store can succeed only if no updates have occurred to that location since the load-link. Linux kernel two sets of interfaces for atomic operations: one for integers and another for individual bits Implementation is hardware-dependent.

Linux Atomic Operations Uses atomic_t data type Atomic operations on integer counter in Linux For example: a counter to be incremented by multiple threads Atomic operate at the bit level, such as unsigned long word = 0; set_bit(0, &word); /* bit zero is now set (atomically) */ Function Description atomic_read(v) atomic_set(v,i) atomic_add(i,v) atomic_sub(i,v) atomic_sub_and_test(i,v) atomic_inc(v) atomic_dec(v) atomic_dec_and_test(v) atomic_inc_and_test(v) atomic_add_negative(i,v) Return *v set *v to i add i to *v subtract i from *v subtract i from *v and return 1 if result is 0 add 1 to *v subtract 1 from *v subtract 1 from *v and return 1 if result is 0 add 1 to *v and return 1 if result is 0 add i to *v and return 1 if result is negative

Spinlock Distinct implementations in SMP and UP kernels. Ensuring mutual exclusion using a busy-wait lock. if the lock is available, it is taken, the mutually-exclusive action is performed, and then the lock is released. If the lock is not available, the thread busy-waits on the lock until it is available. it keeps spinning, thus wasting the processor time If the waiting duration is short, faster than putting the thread to sleep and then waking it up later when the lock is available. really only useful in SMP systems Distinct implementations in SMP and UP kernels. certain locks not to exist at all in a UP kernel. different combinations of CONFIG_SMP and CONFIG_PREEMPT Same APIs.

Linux Spinlock Operations Basic spin_lock_init – initialize a spin lock before using it for the first time spin_lock – acquire a spin lock, spin waiting if it is not available spin_unlock – release a spin lock spin_unlock_wait – spin waiting for spin lock to become available, but don't acquire it spin_trylock – acquire a spin lock if it is currently free, otherwise return error spin_is_locked – return spin lock state Spinlock with local CPU interrupt disable spin_lock_irqsave( &my_spinlock, flags ); // critical section spin_unlock_irqrestore( &my_spinlock, flags ); Reader/writer spinlock – allows multiple readers with no writer

Linux Spinlock Implementation in UP #define _raw_spin_lock_irqsave(lock, flags) __LOCK_IRQSAVE(lock, flags) #define __LOCK_IRQSAVE(lock, flags) \ do { local_irq_save(flags); __LOCK(lock); } while (0) \ /* irq disabled (cli) */ #define local_irq_save(flags) do { raw_local_irq_save(flags); } while (0) #define __LOCK(lock) \ do { preempt_disable(); __acquire(lock); (void)(lock); } while (0) #define preempt_disable() barrier() \ /* need preempt disable/enable to be barriers, \ processor stop a while and it make sure any memory \ operation (especially write) has been done */ # define __acquire(x) (void)0 /*no op */ Note: Any expression that doesn't have any side-effects can be treated as a no-op by the compiler, which dosn't have to generate any code for it (though it may do{ } while(0) --- Helps grouping multiple statements into a single one, so that a function-like macro can actually be used as a function.

Linux Spinlock Implementation in SMP (1) static inline unsigned long __raw_spin_lock_irqsave(raw_spinlock_t *lock) { unsigned long flags; local_irq_save(flags); preempt_disable(); spin_acquire(&lock->dep_map, 0, 0, _RET_IP_); do_raw_spin_lock_flags(lock, &flags); return flags; } do_raw_spin_lock_flages()  arch_spin_lock_flags() A simple implementation of spinning represented by an integer value. 1 indicates that the lock is available. spin_lock() code works by decrementing the value, then looking to see whether the result is zero; if so, the lock has been successfully obtained. If negative, the lock is owned by somebody else. busy-waiting until the value of the lock becomes positive; then tries again lock of fairness and wait time can be arbitrary long

Linux Spinlock Implementation in SMP (2) static inline unsigned long __raw_spin_lock_irqsave(raw_spinlock_t *lock) { unsigned long flags; local_irq_save(flags); preempt_disable(); spin_acquire(&lock->dep_map, 0, 0, _RET_IP_); do_raw_spin_lock_flags(lock, &flags); return flags; } do_raw_spin_lock_flages()  arch_spin_lock_flags() A simple implementation of spinning represented by an integer value. 1 indicates that the lock is available. spin_lock() code works by decrementing the value, then looking to see whether the result is zero; if so, the lock has been successfully obtained. If negative, the lock is owned by somebody else. busy-waiting until the value of the lock becomes positive; then tries again lock of fairness and wait time can be arbitrary long

Linux Spinlock Implementation in SMP (3) Ticket spinlock Each process keeps track of its turn via the value of its ticket Implementation /arch/x86/include/asm/spinlock.h two parts, one indicating the current head of the queue, and the other indicating the current tail. acquire lock by atomically noting the tail and incrementing it by one waiting until the head becomes equal to the initial value of the tail use an xadd instruction with lock prefix: xadd() adds "inc" to "*ptr" and atomically returns the previous value of "*ptr". ticketLock_init(int *next_ticket, int *now_serving) { *now_serving = *next_ticket = 0; } ticketLock_acquire(int *next_ticket, int *now_serving) my_ticket = fetch_and_inc(next_ticket); while(*now_serving != my_ticket) {} ticketLock_release(int *now_serving) now_serving++;

Linux Semaphore Kernel semaphores struct semaphore: count, wait queue, and number of sleepers void sem_init(struct semaphore *sem, int val); // Initialize a semaphore’s counter sem->count to given value void down(struct semaphore *sem); //try to lock the critical section by decreasing sem->count void up(struct semaphore *sem); // release the semaphore int down_trylock(struct semaphore *sem); // try to get the semaphore without blocking, otherwise return an error blocked thread can be in TASK_UNINTERRUPTIBLE or TASK_INTERRUPTIBLE (by timer or signal) Read/Write semaphores

Linux Semaphore Implementation Goal: optimize for uncontended (common) case Implementation idea Uncontended case: use atomic operations Contended case: use spin locks and wait queues in __down_common() call schedule_timeout() to put the process into sleep state in __up() struct semaphore { raw_spinlock_t lock; unsigned int count; struct list_head wait_list; }; void down(struct semaphore *sem) { unsigned long flags; raw_spin_lock_irqsave(&sem->lock, flags); if (likely(sem->count > 0)) sem->count--; else __down(sem); raw_spin_unlock_irqrestore(&sem->lock, flags); } static noinline void __sched __up(struct semaphore *sem) { struct semaphore_waiter *waiter = list_first_entry(&sem->wait_list, struct semaphore_waiter, list); list_del(&waiter->list); waiter->up = true; wake_up_process(waiter->task); }

Mutex in Linux (1) Two states: locked and unlocked. if locked, wait until it is unlocked only the thread that locked the mutex may unlock it Various implementations for performance/function tradeoffs Speed or correctness (deadlock detection) lock the same mutex multiple times priority-based and priority inversion forget to unlock or terminate unexpectedly Available types normal fast error checking recursive: owner can lock multiple times (couting) robust: return an error code when crashes while holding a lock RT: priority inheritance

Mutex in Linux(2) Operations: struct mutex mutex_unlock – release the mutex mutex_lock – get the mutex (can block) mutex_lock_interruptible – get the mutex, but allow interrupts mutex_trylock – try to get the mutex without blocking, otherwise return an error mutex_is_locked – determine if mutex is locked struct mutex count=1 if the mutex is unlocked, or 0 if locked. spinlock is to protect the wait queue Fast path – non-contended case, an atomic “decl count” Mid path Optimistic spinning while the lock owner is running. Only if there are no other processes ready to run that have higher priority. Slow path – add to wait queue, change task state to TASK_INTERRUPTIBLE call schedule_preempt_disabled() struct mutex { atomic_t count; spinlock_t wait_lock; struct list_head wait_list; };

Pthread Futex Lightweight and scalable In the non-contended case can be acquired/released from userspace without having to enter the kernel. lock is a user-space address, e.g. a 32-bit lock variable field. “uncontended” and “waiter-pending” kernel provides futex queue, and sys_futex system call invoke sys_futex only when there is a need to use futex queue need atomic operations in user space race condition: atomic update of ulock and system call are not atomic typedef struct ulock_t { long status; } ulock_t;

Linux Completions To wait for the completion, call A common pattern in kernel programming (wait and wake-up) Start a new thread or IO Wait for that activity to complete or an interrupt arrives To use and create a completion #include <linux/completion.h> DECLARE_COMPLETION(my_completion); DECLARE_COMPLETION_ONSTACK(setup_done); struct completion my_completion; init_completion(&my_completion); struct completion { unsigned int done; wait_queue_head_t wait; }; To wait for the completion, call void wait_for_completion(struct completion *c); void wait_for_completion_interruptible(struct completion *c); void wait_for_completion_timeout(struct completion *c, unsigned long timeout); To signal a completion event, call one of the following void complete(struct completion *c); void complete_all(struct completion *c);

Preempt-RT Patch Main features of preempt-RT Wiki – https://rt.wiki.kernel.org/index.php/Main_Page Design goals: 100% Preemptible kernel Not actually possible, Removal of disabling of interrupts and disabling other forms of preemption Quick reaction times! bring latencies down to a minimum In preemptable Linux kernel Preempt anywhere except within spin_locks and some minor other areas (preempt_disable). Main features of preempt-RT spin_locks are now mutexes. (can be preempted, sleeping spinlock) Interrupts as threads (interrupt handlers can be scheduled) request_threaded_irq() Priority inheritance inside the kernel (not just for user mutexes) High resolution timers

Spin_lock in Preempt-RT Source code: /include/linux/spinlock_rt.h /kernel/locking/rtmutex.c Converting into mutexes Use rt_mutex for PI support Enables preemption within critical sections. If cannot enter a critical section, scheduled out and sleep until the mutex is released. Interrupt handlers must also be converted to threads since interrupt handlers also use spin_locks. raw_spin_lock (the original spin lock) is still used in scheduler, rt_mutex, debugging/tracing, and timer interrupts void __lockfunc rt_spin_lock(spinlock_t *lock) { migrate_disable(); rt_spin_lock_fastlock(&lock->lock, rt_spin_lock_slowlock); spin_acquire(&lock->dep_map, 0, 0, _RET_IP_); } static inline void rt_spin_lock_fastlock(struct rt_mutex *lock, void (*slowfn)(struct rt_mutex *lock)) might_sleep(); if (likely(rt_mutex_cmpxchg(lock, NULL, current))) rt_mutex_deadlock_account_lock(lock, current); else slowfn(lock);