Kernel Synchronization

Presentation transcript:

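Kernel Synchronization
Professor 羅習五, Graduate Institute of Computer Science and Information Engineering, National Chung Cheng University
A small portion of the content is adapted from Professor 薛智文.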

Chapter 5: Kernel Synchronization
- Kernel Control Paths
- When Synchronization Is Not Necessary
- Synchronization Primitives
- Synchronizing Accesses to Kernel Data Structures
- Examples of Race Condition Prevention

Kernel
You can think of the kernel as a server that answers requests. These requests come either from a process running on a CPU or from an external device issuing an interrupt request.
(Diagram labels: top halves, bottom halves)

Kernel Control Paths
- Kernel Control Path (KCP): a sequence of instructions executed by the kernel to handle an interrupt or exception of some kind.
- Each kernel request is handled by a different KCP.
- Example, a system call request (software interrupt): system_call → ret_from_sys_call
(Diagram labels: system call, top halves, bottom halves)

Kernel Requests
- A process executing in User Mode causes an exception (e.g., a division by zero).
- A process executing in Kernel Mode causes a Page Fault exception.
- An external device sends a signal to a programmable interrupt controller (PIC), and the corresponding interrupt is enabled.
- A running process raises an interprocessor interrupt (IPI).

Kernel Control Paths
The CPU interleaves KCPs when:
- A process switch occurs (the process relinquishes control of the CPU, e.g., sleep/wait).
- An interrupt occurs.
- A deferrable function is executed.
Interleaving improves the throughput of the PIC and of device controllers.

A fully preemptable kernel
- Nonpreemptive kernel vs. preemptive kernel?
- Nonpreemptive kernel: Linux kernel up to 2.4
- Preemptive kernel: Linux kernel 2.6
- Kernel 2.4 + preempt_count* = kernel 2.6
The value of preempt_count is greater than 0 when:
- The kernel is executing an ISR.
- The deferrable functions are disabled.
- Kernel preemption has been explicitly disabled.
*: This field is in the thread_info descriptor.

When Sync. Is Not Necessary
Simplifying assumptions:
- Interrupt handlers and tasklets need not be coded as reentrant functions.
- Interrupt handlers, softirqs, and tasklets are both nonpreemptable and non-blocking.
- Per-CPU variables accessed only by softirqs and tasklets do not require synchronization.
- A data structure accessed by only one kind of tasklet does not require synchronization.

Synchronization Primitives
- Per-CPU variables: one element per CPU in the system
- Atomic operations: memory bus lock, read-modify-write (rmw) ops
- Memory barriers: avoid compiler and CPU instruction re-ordering
- Spin locks: only on SMP systems; keep them short! Kinds: general, read/write, big reader
- Semaphores: general, read/write
- Local interrupt disabling
- Local softirq disabling
- Read-copy-update (RCU)

Synchronization Primitives
Technique | Description | Scope
Atomic operation | Atomic read-modify-write instruction to a counter | All CPUs
Memory barrier | Avoid instruction re-ordering | Local CPU
Spin lock | Lock with busy wait | All CPUs
Semaphore | Lock with blocking wait | All CPUs
Local interrupt disabling | Forbid interrupt handling on a single CPU | Local CPU
Local softirq disabling | Forbid deferrable function handling on a single CPU | Local CPU
Global interrupt disabling | Forbid interrupt and softirq handling on all CPUs | All CPUs

Atomic Operations
- Many instructions are not atomic in hardware (MP):
  - rmw instructions: inc, test-and-set, swap
  - unaligned memory access
  - rep instructions
- The compiler may not generate atomic code: even i++ is not necessarily atomic! (i = i + 1)
- Linux: atomic_ macros
  - atomic_t: 24-bit atomic counters
- Intel implementation (atomic, for MP): lock prefix byte 0xf0, which locks the memory bus

Atomic operations in Linux
Function | Description
atomic_read(v) | Return *v
atomic_set(v, i) | Set *v to i
atomic_add(i, v) | Add i to *v
atomic_sub(i, v) | Subtract i from *v
atomic_sub_and_test(i, v) | Subtract i from *v and return 1 if the result is zero; 0 otherwise
atomic_inc(v) | Add 1 to *v
atomic_dec(v) | Subtract 1 from *v
atomic_dec_and_test(v) | Subtract 1 from *v and return 1 if the result is zero; 0 otherwise
atomic_inc_and_test(v) | Add 1 to *v and return 1 if the result is zero; 0 otherwise
atomic_add_negative(i, v) | Add i to *v and return 1 if the result is negative; 0 otherwise
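The semantics in the table above can be tried out in user space. The following is a minimal sketch, assuming C11 <stdatomic.h> as a stand-in for the kernel macros; the my_atomic_* names are hypothetical and are not the kernel API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* User-space stand-in for the kernel's atomic_t (hypothetical type). */
typedef struct { atomic_int counter; } my_atomic_t;

static inline int my_atomic_read(my_atomic_t *v)
{
    return atomic_load(&v->counter);
}

static inline void my_atomic_set(my_atomic_t *v, int i)
{
    atomic_store(&v->counter, i);
}

static inline void my_atomic_add(int i, my_atomic_t *v)
{
    atomic_fetch_add(&v->counter, i);
}

static inline void my_atomic_inc(my_atomic_t *v)
{
    atomic_fetch_add(&v->counter, 1);
}

/* Subtract 1 and report whether the result is zero.
 * atomic_fetch_sub returns the OLD value, so the result is zero
 * exactly when the old value was 1. */
static inline bool my_atomic_dec_and_test(my_atomic_t *v)
{
    return atomic_fetch_sub(&v->counter, 1) == 1;
}

/* Add i and report whether the result is negative. */
static inline bool my_atomic_add_negative(int i, my_atomic_t *v)
{
    return atomic_fetch_add(&v->counter, i) + i < 0;
}
```

Each function is a single indivisible read-modify-write, which is what a plain i++ does not guarantee.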

Atomic bit handling functions in Linux
Function | Description
test_bit(nr, addr) | Return the value of the nr-th bit of *addr
set_bit(nr, addr) | Set the nr-th bit of *addr
clear_bit(nr, addr) | Clear the nr-th bit of *addr
change_bit(nr, addr) | Invert the nr-th bit of *addr
test_and_set_bit(nr, addr) | Set the nr-th bit of *addr and return its old value
test_and_clear_bit(nr, addr) | Clear the nr-th bit of *addr and return its old value
test_and_change_bit(nr, addr) | Invert the nr-th bit of *addr and return its old value
atomic_clear_mask(mask, addr) | Clear all bits of addr specified by mask
atomic_set_mask(mask, addr) | Set all bits of addr specified by mask
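The test-and-modify variants can be modeled in user space with atomic fetch-or, fetch-and, and fetch-xor. This is a sketch under the assumption that the bitmap is an array of atomic_ulong words; the my_* names are hypothetical, not the kernel functions.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Return the value of bit nr in the bitmap starting at addr. */
static inline bool my_test_bit(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    return (atomic_load(addr + nr / BITS_PER_WORD) & mask) != 0;
}

/* Set bit nr and return its old value, as one atomic step. */
static inline bool my_test_and_set_bit(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    return (atomic_fetch_or(addr + nr / BITS_PER_WORD, mask) & mask) != 0;
}

/* Clear bit nr and return its old value. */
static inline bool my_test_and_clear_bit(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    return (atomic_fetch_and(addr + nr / BITS_PER_WORD, ~mask) & mask) != 0;
}

/* Invert bit nr and return its old value. */
static inline bool my_test_and_change_bit(unsigned nr, atomic_ulong *addr)
{
    unsigned long mask = 1UL << (nr % BITS_PER_WORD);
    return (atomic_fetch_xor(addr + nr / BITS_PER_WORD, mask) & mask) != 0;
}
```

Because the old value comes back from the same atomic operation that changes the bit, two callers racing on test-and-set of one bit cannot both see 0.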

Memory Barriers
- Compilers and hardware re-order memory accesses as an optimization. This is true on SMP and even on UP systems!
- Memory barrier: an instruction to the hardware/compiler to complete all pending accesses before issuing more.
  - read memory barrier: acts on read requests
  - write memory barrier: acts on write requests
- Linux macros:
  - for UP and MP: mb(), rmb(), wmb()
  - for MP only: smp_mb(), smp_rmb(), smp_wmb()

Memory barriers in Linux
Macro | Description
mb( ) | Memory barrier for MP and UP
rmb( ) | Read memory barrier for MP and UP
wmb( ) | Write memory barrier for MP and UP
smp_mb( ) | Memory barrier for MP only
smp_rmb( ) | Read memory barrier for MP only
smp_wmb( ) | Write memory barrier for MP only
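A common use of a write barrier paired with a read barrier is publishing data through a flag. The sketch below is a user-space analogy using C11 fences, not the kernel macros: the release fence plays a role similar to smp_wmb() and the acquire fence a role similar to smp_rmb(), although the C11 fences are somewhat stronger. The names publish and try_consume are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain data, written before the flag */
static atomic_bool ready;  /* flag that publishes the payload */

/* Producer: write the data, then the flag, in that order. */
void publish(int value)
{
    payload = value;
    /* Keep the payload store ahead of the flag store
     * (similar in role to smp_wmb()). */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Consumer: read the flag, then the data, in that order. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return false;
    /* Keep the flag load ahead of the payload load
     * (similar in role to smp_rmb()). */
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return true;
}
```

Without the two fences, a consumer on another CPU could observe ready == true and still read a stale payload.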

Peterson’s Solution
- Two-process solution.
- Assume that the LOAD and STORE instructions are atomic; that is, they cannot be interrupted.
- The two processes share two variables: int turn; Boolean flag[2]
- The variable turn indicates whose turn it is to enter the critical section.
- The flag array is used to indicate if a process is ready to enter the critical section. flag[i] = true implies that process Pi is ready!

Algorithm for Process Pi
    while (true) {
        flag[i] = TRUE;
        turn = j;
        while (flag[j] && turn == j);
        /* CRITICAL SECTION */
        flag[i] = FALSE;
        /* REMAINDER SECTION */
    }
?

Task_i and Task_j with the two stores reordered. Initial state: flag[i] = FALSE, turn = i.

Task_i:
    while (true) {
        turn = j;
        flag[i] = TRUE;
        while (flag[j] && turn == j);
        /* CRITICAL SECTION */
        flag[i] = FALSE;
        /* REMAINDER SECTION */
    }

Task_j:
    while (true) {
        turn = i;
        flag[j] = TRUE;
        while (flag[i] && turn == i);
        /* CRITICAL SECTION */
        flag[j] = FALSE;
        /* REMAINDER SECTION */
    }

Peterson’s Solution
    while (true) {
        flag[i] = TRUE;
        mb( );
        turn = j;
        while (flag[j] && turn == j);
        /* CRITICAL SECTION */
        flag[i] = FALSE;
        /* REMAINDER SECTION */
    }
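Peterson's algorithm with an explicit barrier can be sketched in user-space C11. This is an illustration under stated assumptions, not the kernel's code: peterson_lock and peterson_unlock are hypothetical names, the shared variables are C11 atomics, and the full fence stands in for mb(). The sketch issues the fence after both stores and before the wait loop, because the loads in the loop must not be satisfied ahead of the stores; the exact barrier placement a given machine needs depends on its memory model.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool flag[2];  /* flag[i]: thread i wants to enter */
static atomic_int turn;      /* which thread must wait on a tie */

/* Enter the critical section; i is 0 or 1. */
void peterson_lock(int i)
{
    int j = 1 - i;

    atomic_store(&flag[i], true);
    atomic_store(&turn, j);
    /* Full fence, the role mb() plays for plain accesses. With the
     * sequentially consistent stores and loads used here it is
     * redundant; it marks where the ordering is required. */
    atomic_thread_fence(memory_order_seq_cst);
    while (atomic_load(&flag[j]) && atomic_load(&turn) == j)
        ;  /* busy wait */
}

/* Leave the critical section. */
void peterson_unlock(int i)
{
    atomic_store(&flag[i], false);
}
```

Two threads would call peterson_lock(0) and peterson_lock(1) respectively around the code that touches the shared data, and peterson_unlock with the same index afterwards.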

Spin Lock
A special kind of lock designed to work in a multiprocessor environment.
- Spin lock
- R/W spin lock
- Sequential lock
Useless in a uniprocessor environment (?)

Spin lock functions
Function | Description
spin_lock_init( ) | Set the spin lock to 1 (unlocked)
spin_lock( ) | Cycle until spin lock becomes 1 (unlocked), then set it to 0 (locked)
spin_unlock( ) | Set the spin lock to 1 (unlocked)
spin_unlock_wait( ) | Wait until the spin lock becomes 1 (unlocked)
spin_is_locked( ) | Return 0 if the spin lock is set to 1 (unlocked); 1 otherwise
spin_trylock( ) | Set the spin lock to 0 (locked), and return 1 if the lock is obtained; 0 otherwise

Spin lock functions

spin_lock(slp):
    1:  lock; decb slp
        jns 3f
    2:  cmpb $0,slp
        pause
        jle 2b
        jmp 1b
    3:

spin_unlock(slp):
        lock; movb $1, slp
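The control flow of the assembly above can be mirrored in portable C. The sketch below is a user-space model using C11 atomics under the same convention (1 = unlocked, 0 or less = locked); the my_spin_* names are hypothetical, and the real kernel code differs in details such as the pause hint.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* 1 = unlocked, <= 0 = locked, as in the slide. */
typedef struct { atomic_int slock; } my_spinlock_t;

void my_spin_lock_init(my_spinlock_t *l)
{
    atomic_store(&l->slock, 1);
}

void my_spin_lock(my_spinlock_t *l)
{
    for (;;) {
        /* Label 1, "lock; decb": if the old value was 1 the result
         * is 0, which is not negative, so the lock is ours ("jns 3f"). */
        if (atomic_fetch_sub_explicit(&l->slock, 1,
                                      memory_order_acquire) == 1)
            return;
        /* Label 2: spin with plain reads until the lock looks free
         * ("cmpb $0,slp; jle 2b"), then retry ("jmp 1b"). */
        while (atomic_load_explicit(&l->slock,
                                    memory_order_relaxed) <= 0)
            ;
    }
}

/* One attempt only: store 0 and check whether the lock was free. */
bool my_spin_trylock(my_spinlock_t *l)
{
    return atomic_exchange_explicit(&l->slock, 0,
                                    memory_order_acquire) > 0;
}

void my_spin_unlock(my_spinlock_t *l)
{
    atomic_store_explicit(&l->slock, 1, memory_order_release);
}

bool my_spin_is_locked(my_spinlock_t *l)
{
    return atomic_load(&l->slock) <= 0;
}
```

Spinning on a plain read in the inner loop keeps waiting CPUs from issuing a bus-locked decrement on every iteration.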

Read/Write Spin Locks
(Diagram of the lock word; field labels: lock, number of readers, write)
State | Lock value
Initial | 0x01000000
One reader | 0x00ffffff
Two readers | 0x00fffffe

Read Spin Lock

read_lock(rwlp):
        movl $rwlp,%eax
        lock; subl $1,(%eax)
        jns 1f
        call __read_lock_failed
    1:

__read_lock_failed:
        lock; incl (%eax)
    1:  cmpl $1,(%eax)
        js 1b
        lock; decl (%eax)
        js __read_lock_failed
        ret

read_unlock(rwlp):
        lock; incl rwlp

Write Spin Lock

write_lock(rwlp):
        movl $rwlp,%eax
        lock; subl $0x01000000,(%eax)
        jz 1f
        call __write_lock_failed
    1:

__write_lock_failed:
        lock; addl $0x01000000,(%eax)
    1:  cmpl $0x01000000,(%eax)
        jne 1b
        lock; subl $0x01000000,(%eax)
        jnz __write_lock_failed
        ret

write_unlock(rwlp):
        lock; addl $0x01000000,rwlp
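The biased-counter scheme used by the read and write lock assembly can be modeled in C. This is a user-space sketch with C11 atomics; the my_* function names are hypothetical, and the constant follows the 0x01000000 value shown in the slides. A reader subtracts 1 and succeeds if the result is not negative; a writer subtracts the whole bias and succeeds only if the result is exactly zero.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RW_LOCK_BIAS 0x01000000

typedef struct { atomic_int lock; } my_rwlock_t;

void my_rwlock_init(my_rwlock_t *rw)
{
    atomic_store(&rw->lock, RW_LOCK_BIAS);  /* unlocked */
}

/* One attempt to take the lock for reading. */
bool my_read_trylock(my_rwlock_t *rw)
{
    if (atomic_fetch_sub(&rw->lock, 1) - 1 >= 0)
        return true;                 /* no writer holds the lock */
    atomic_fetch_add(&rw->lock, 1);  /* undo, as __read_lock_failed does */
    return false;
}

/* One attempt to take the lock for writing. */
bool my_write_trylock(my_rwlock_t *rw)
{
    if (atomic_fetch_sub(&rw->lock, RW_LOCK_BIAS) == RW_LOCK_BIAS)
        return true;                 /* no readers and no writer */
    atomic_fetch_add(&rw->lock, RW_LOCK_BIAS);  /* undo */
    return false;
}

void my_read_lock(my_rwlock_t *rw)
{
    while (!my_read_trylock(rw))
        while (atomic_load(&rw->lock) < 1)
            ;  /* spin until a reader slot looks available */
}

void my_write_lock(my_rwlock_t *rw)
{
    while (!my_write_trylock(rw))
        while (atomic_load(&rw->lock) != RW_LOCK_BIAS)
            ;  /* spin until the lock looks completely free */
}

void my_read_unlock(my_rwlock_t *rw)
{
    atomic_fetch_add(&rw->lock, 1);
}

void my_write_unlock(my_rwlock_t *rw)
{
    atomic_fetch_add(&rw->lock, RW_LOCK_BIAS);
}
```

With one reader the word reads 0x00ffffff and with two readers 0x00fffffe, matching the values listed for the lock word.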

Seqlock (sequential lock)
- A seqlock is a locking mechanism in Linux for supporting fast writes of shared variables.
- seqlock := sequence number + lock
- The lock supports synchronization between two writers.
- The counter indicates consistency to readers.

Seqlock (sequential lock)
- The writer increments the sequence number both after acquiring the lock and before releasing the lock.
- Readers read the sequence number before and after reading the shared data.

    do {
        while (((old_seq_num = seq_num) % 2) != 0);
        // READER: critical section
    } while (old_seq_num != seq_num);

- Seqlock was first applied to system time counter updating.
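The reader and writer protocols can be put together in a small user-space model. The sketch below uses C11 atomics and protects a pair of integers; the type and function names are hypothetical, and the protected fields are declared atomic only so that the concurrent accesses are well defined in C. It is an illustration of the idea, not the kernel's seqlock_t.

```c
#include <stdatomic.h>

typedef struct {
    atomic_flag writer_lock;  /* serializes writers */
    atomic_uint seq;          /* even: stable, odd: write in progress */
    atomic_int x, y;          /* the protected data */
} my_seq_pair_t;

void pair_init(my_seq_pair_t *p)
{
    atomic_flag_clear(&p->writer_lock);
    atomic_init(&p->seq, 0);
    atomic_init(&p->x, 0);
    atomic_init(&p->y, 0);
}

void pair_write(my_seq_pair_t *p, int x, int y)
{
    while (atomic_flag_test_and_set_explicit(&p->writer_lock,
                                             memory_order_acquire))
        ;  /* spin: only one writer at a time */
    unsigned s = atomic_load_explicit(&p->seq, memory_order_relaxed);
    /* First increment, after acquiring the lock: seq becomes odd. */
    atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->x, x, memory_order_relaxed);
    atomic_store_explicit(&p->y, y, memory_order_relaxed);
    /* Second increment, before releasing the lock: even again. */
    atomic_store_explicit(&p->seq, s + 2, memory_order_release);
    atomic_flag_clear_explicit(&p->writer_lock, memory_order_release);
}

void pair_read(my_seq_pair_t *p, int *x, int *y)
{
    unsigned before, after;
    do {
        before = atomic_load_explicit(&p->seq, memory_order_acquire);
        *x = atomic_load_explicit(&p->x, memory_order_relaxed);
        *y = atomic_load_explicit(&p->y, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&p->seq, memory_order_relaxed);
        /* Retry if a write was in progress or happened meanwhile. */
    } while ((before & 1u) || before != after);
}
```

Readers never write shared state and never block the writer, which is why the scheme favors fast writes; the cost is that a reader may have to repeat its work.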

MONITOR & MWAIT (x86, for thread synchronization)
- MONITOR defines an address range used to monitor write-back stores.
- MWAIT is used to indicate that the software thread is waiting for a write-back store to the address range defined by the MONITOR instruction.

Read-copy-update (RCU)
- It allows extremely low overhead, wait-free reads.
- RCU updates can be expensive, because they must leave the old versions of the data structure in place to accommodate pre-existing readers. These old versions are reclaimed after all pre-existing readers finish their accesses.
- RCU is a new addition in Linux 2.6; it is used in the networking layer and in the virtual file system (VFS).
Reference: Paul E. McKenney, Read-copy-update (RCU), http://www.rdrop.com/users/paulmck/rclock/ (IPDPS 2006 Best Paper)

Read-copy-update (RCU), diagram sequence
1. A reader accesses the data through Local_PTR, its local copy of PTR. RCU allows extremely low overhead, wait-free reads.
2. The writer creates a new version of the data (kmalloc + copy + update), reachable through New_PTR. RCU updates can be very expensive.
3. The writer performs PTR = New_PTR, an atomic operation. This removes pointers to the old data structure, so that subsequent readers cannot gain a reference to it.
4. The writer waits for all previous readers to complete their RCU read-side critical sections.
5. The writer or the "GC" calls kfree(old_ptr); the old version of the data can now be safely reclaimed.
6. Only the new data remains, reachable through PTR.

Read-copy-update (RCU)
(Timeline diagram labels: lock scheduler, unlock scheduler, CTX_SW, reader, writer, GC)
- Lock_scheduler := preempt_count++
- Unlock_scheduler := preempt_count--
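The copy, publish, wait, and reclaim sequence can be imitated in user space. The following is a deliberately simplified model and not real RCU: it assumes a single writer and tracks readers with one shared counter, whereas kernel RCU readers do not write any shared state. All names (demo_data, reader_get_value, writer_update, active_readers) are hypothetical.

```c
#include <stdatomic.h>
#include <stdlib.h>

struct demo_data { int value; };

static _Atomic(struct demo_data *) global_ptr;  /* the shared PTR */
static atomic_int active_readers;  /* stand-in for read-side tracking */

/* Reader: enter the read-side section, follow PTR, leave.
 * Returns -1 if nothing has been published yet. */
int reader_get_value(void)
{
    atomic_fetch_add(&active_readers, 1);   /* "lock scheduler" */
    struct demo_data *p = atomic_load(&global_ptr);
    int v = p ? p->value : -1;
    atomic_fetch_sub(&active_readers, 1);   /* "unlock scheduler" */
    return v;
}

/* Writer (single writer assumed): copy, update the copy, publish,
 * wait for pre-existing readers, then reclaim the old version.
 * Returns 0 on success, -1 if allocation fails. */
int writer_update(int new_value)
{
    struct demo_data *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return -1;
    struct demo_data *old = atomic_load(&global_ptr);
    if (old != NULL)
        *fresh = *old;               /* copy */
    fresh->value = new_value;        /* update the private copy */
    old = atomic_exchange(&global_ptr, fresh);  /* PTR = New_PTR */
    while (atomic_load(&active_readers) != 0)
        ;                            /* wait for earlier readers */
    free(old);                       /* reclaim (free(NULL) is a no-op) */
    return 0;
}
```

A reader that picked up the old pointer has already incremented the counter, so the writer cannot free the old version until that reader has left its read-side section.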

Semaphores
- Kernel semaphores
  - used by kernel control paths.
  - can be acquired only by functions that are allowed to sleep; interrupt handlers and deferrable functions cannot use them.
- System V IPC semaphores
  - used by User Mode processes.

Semaphores
- struct semaphore
  - count (atomic_t): >0 free; 0 in use, no waiters; <0 in use, waiters
  - wait: wait queue
  - sleepers: 0 (none), 1 (some), occasionally 2
- The implementation requires lower-level synchronization: atomic updates, spinlock, interrupt disabling
- Optimized assembly code for the normal case (down())
- C code for the slower "contended" case (__down())

Semaphores

up:
        movl $sem,%ecx
        lock; incl (%ecx)
        jg 1f
        pushl %eax
        pushl %edx
        pushl %ecx
        call __up
        popl %ecx
        popl %edx
        popl %eax
    1:

down:
        movl $sem,%ecx
        lock; decl (%ecx)
        jns 1f
        pushl %eax
        pushl %edx
        pushl %ecx
        call __down
        popl %ecx
        popl %edx
        popl %eax
    1:
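The split between an atomic fast path and a slower contended path can be sketched in user space. This model is an assumption-laden illustration: it uses a C11 atomic for count and a pthread mutex plus condition variable in place of the kernel wait queue, and it counts wake-up tokens instead of using the kernel's sleepers field. The my_* names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    atomic_int count;      /* >0 free; 0 in use; <0 in use, waiters */
    pthread_mutex_t lock;  /* protects wakeups */
    pthread_cond_t wait;   /* stands in for the wait queue */
    int wakeups;           /* wake-up tokens posted by my_up() */
} my_sem_t;

void my_sem_init(my_sem_t *s, int value)
{
    atomic_init(&s->count, value);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->wait, NULL);
    s->wakeups = 0;
}

void my_down(my_sem_t *s)
{
    /* Fast path ("lock; decl; jns 1f"): a non-negative result
     * means the semaphore was acquired without contention. */
    if (atomic_fetch_sub(&s->count, 1) - 1 >= 0)
        return;
    /* Slow path, the role of __down(): sleep until a token arrives. */
    pthread_mutex_lock(&s->lock);
    while (s->wakeups == 0)
        pthread_cond_wait(&s->wait, &s->lock);
    s->wakeups--;
    pthread_mutex_unlock(&s->lock);
}

void my_up(my_sem_t *s)
{
    /* Fast path ("lock; incl; jg 1f"): a positive result means
     * nobody is waiting. */
    if (atomic_fetch_add(&s->count, 1) + 1 > 0)
        return;
    /* Slow path, the role of __up(): wake one sleeper. */
    pthread_mutex_lock(&s->lock);
    s->wakeups++;
    pthread_cond_signal(&s->wait);
    pthread_mutex_unlock(&s->lock);
}
```

In the uncontended case neither function touches the mutex, which corresponds to the short inline assembly path that skips the call to __down or __up.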

__down
(Diagram labels: WaitingQ.ins, WaitingQ.del)

Read/Write Semaphores
- New feature of Linux 2.4
- FIFO
- Complex implementation, similar to regular semaphores
- Operations: down_read(), down_write(), up_read(), up_write()

Read/Write Semaphores
- The first process is always awoken.
- If it is a writer, the other processes in the wait queue continue to sleep.
- If it is a reader, any other reader following the first process is also woken up and gets the lock.
- However, readers that have been queued after a writer continue to sleep.
Example wait queue (head first): R R R W R W R R
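The wake-up rule can be expressed as a small function over the wait queue. This is only a sketch of the policy as stated above; the function name and the string encoding of the queue ('R' for a reader, 'W' for a writer, head first) are assumptions made for illustration.

```c
#include <stddef.h>

/* Return how many waiters at the head of the queue are woken.
 * queue is a string of 'R' and 'W' characters in FIFO order. */
size_t rwsem_wake_count(const char *queue)
{
    size_t n = 0;

    if (queue[0] == '\0')
        return 0;          /* nobody is waiting */
    if (queue[0] == 'W')
        return 1;          /* a writer is woken alone */
    while (queue[n] == 'R')
        n++;               /* wake the leading run of readers */
    return n;
}
```

For the queue R R R W R W R R, the three leading readers are woken, and everything from the first writer onward keeps sleeping.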

Completions
- The current implementation of up( ) and down( ) also allows them to execute concurrently on the same semaphore.
- up( ) might attempt to access a data structure that no longer exists.
- Completions replace the two operations: up( ) → complete( ), down( ) → wait_for_completion( ).

Completions
(Diagram: two execution paths, 1 and 2, with the operations create_sem, down, up, del_sem)
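A completion can be modeled in user space with a mutex, a condition variable, and a counter. This is a sketch assuming POSIX threads, with hypothetical my_* names; it is not the kernel's struct completion. The point it illustrates is that the signaling side does all of its work while holding the lock, so the waiter cannot return, and then destroy the object, while the signaler is still touching it.

```c
#include <pthread.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;  /* held for the whole of complete and wait */
    pthread_cond_t wait;
    unsigned done;         /* number of completions not yet consumed */
} my_completion_t;

void my_init_completion(my_completion_t *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->wait, NULL);
    c->done = 0;
}

/* Signal that the event has happened. */
void my_complete(my_completion_t *c)
{
    pthread_mutex_lock(&c->lock);
    c->done++;
    pthread_cond_signal(&c->wait);
    pthread_mutex_unlock(&c->lock);
}

/* Block until my_complete() has been called, then consume it. */
void my_wait_for_completion(my_completion_t *c)
{
    pthread_mutex_lock(&c->lock);
    while (c->done == 0)
        pthread_cond_wait(&c->wait, &c->lock);
    c->done--;
    pthread_mutex_unlock(&c->lock);
}
```

A typical use is a thread that creates the object, starts some work elsewhere, and calls my_wait_for_completion before releasing the memory that holds it.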

Local Interrupt Disabling
- Local interrupt disabling does not protect against concurrent accesses to data structures by interrupt handlers running on other CPUs.
- In multiprocessor systems, local interrupt disabling is often coupled with spin locks.
- Spin locks: only on SMP systems; keep them short! Kinds: general, read/write, big reader

Global Interrupt Disabling
- A typical scenario consists of a driver that needs to reset the hardware device.
- Global interrupt disabling significantly lowers the system concurrency level.
- An interrupt service routine should never execute the cli( ) macro.

__global_cli()
- wait for top and bottom halves to complete
- disable local interrupts
- grab spinlock
- disable all interrupts

Disabling Deferrable Functions
- Disabling interrupts also disables deferred functions.
- It is possible to disable deferred functions without disabling all interrupts.
- Operations (macros): local_bh_disable(), local_bh_enable()

Choosing Synch Primitives
- Avoid synchronization if possible! (clever instruction ordering)
  - Example: inserting in a linked list (still needs a barrier)
  - Example: task migration
- Use atomics or rw spinlocks if possible.
- Use semaphores if you need to sleep.
- Complicated structures accessed by deferred functions

Example Race Conditions
- Reference counters for sharing structs
  - get/put functions
  - deallocate when 0
- Memory map semaphore
- Slab cache list semaphore
- Inode semaphore
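The reference-counter pattern (get/put, deallocate when the count reaches 0) can be sketched as follows. This is a user-space illustration with C11 atomics; struct shared_obj and the obj_* functions are hypothetical names, not kernel structures.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct shared_obj {
    atomic_int refcount;  /* number of holders of this object */
    int payload;
};

/* Create an object owned by the caller (refcount starts at 1).
 * Returns NULL if allocation fails. */
struct shared_obj *obj_create(int payload)
{
    struct shared_obj *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    atomic_init(&o->refcount, 1);
    o->payload = payload;
    return o;
}

/* Take an additional reference. */
void obj_get(struct shared_obj *o)
{
    atomic_fetch_add(&o->refcount, 1);
}

/* Drop a reference; free the object when the last one goes away.
 * Returns true if this call deallocated the object. */
bool obj_put(struct shared_obj *o)
{
    /* The decrement and the zero test are one atomic step, so two
     * concurrent callers cannot both conclude they were the last. */
    if (atomic_fetch_sub(&o->refcount, 1) == 1) {
        free(o);
        return true;
    }
    return false;
}
```

If the decrement and the test were separate steps, two control paths could each see a count of zero and free the structure twice, or neither could see it and the structure would leak.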