Chapter 6: Synchronization Tools


1 Chapter 6: Synchronization Tools

2 Chapter 6: Synchronization Tools
Background
The Critical-Section Problem
Peterson’s Solution
Hardware Support for Synchronization
Mutex Locks
Semaphores
Monitors
Liveness
Evaluation

3 Objectives
Describe the critical-section problem and illustrate a race condition
Illustrate hardware solutions to the critical-section problem using memory barriers, compare-and-swap operations, and atomic variables
Demonstrate how mutex locks, semaphores, monitors, and condition variables can be used to solve the critical-section problem
Evaluate tools that solve the critical-section problem in low-, moderate-, and high-contention scenarios

4 Background
We have already seen that processes can execute concurrently or in parallel
A cooperating process is one that can affect or be affected by other processes executing in the system
Cooperating processes can either directly share a logical address space (that is, both code and data) or be allowed to share data only through shared memory or message passing
A process may be interrupted at any point in its instruction stream, and the CPU may be assigned to execute instructions of another process
Thus, concurrent access to shared data may result in data inconsistency
Various mechanisms exist to ensure the orderly execution of cooperating processes that share a logical address space, so that data consistency is maintained

5 Producer-Consumer Problem
A common paradigm for cooperating processes: a producer process produces information that is consumed by a consumer process
Example: compilers. A compiler may produce assembly code that is consumed by an assembler; the assembler, in turn, may produce object modules that are consumed by the loader
Client–server provides a useful metaphor for the producer–consumer paradigm: the server acts as a producer and the client as a consumer
A web server produces (provides) web content such as HTML files and images, which are consumed (read) by the client web browser requesting the resource
To allow producer and consumer processes to run concurrently, we may have a buffer of items that can be filled by the producer and emptied by the consumer
This buffer resides in a region of memory that is shared by the producer and consumer processes
A producer can produce one item while the consumer is consuming another item
The producer and consumer must be synchronized so that the consumer does not try to consume an item that has not yet been produced
This bounded buffer can be used to enable processes to share memory

6 Producer-Consumer Problem
Suppose that we want a solution to the producer–consumer problem in which two processes work on the same bounded buffer
We can do so by having an integer counter that keeps track of the number of items in the buffer
Initially, counter is set to 0
counter is incremented by the producer after it produces a new item into the buffer
counter is decremented by the consumer after it consumes an item from the buffer

7 Producer-Consumer
Although the producer and consumer routines shown are correct separately, they may not function correctly when executed concurrently.

8 Race Condition: Example 1
Suppose that the value of the variable counter is currently 5 and that the producer and consumer processes concurrently execute the statements "counter++" and "counter--"
After these two statements execute, the value of counter may be 4, 5, or 6! The only correct result, though, is counter == 5, which is generated correctly if the producer and consumer execute separately
counter++ could be implemented in machine language as
    register1 = counter
    register1 = register1 + 1
    counter = register1
counter-- could be implemented as
    register2 = counter
    register2 = register2 - 1
    counter = register2
Consider this execution interleaving with counter = 5 initially:
    S1: producer executes register1 = counter        {register1 = 5}
    S2: producer executes register1 = register1 + 1  {register1 = 6}
    S3: consumer executes register2 = counter        {register2 = 5}
    S4: consumer executes register2 = register2 - 1  {register2 = 4}
    S5: producer executes counter = register1        {counter = 6}
    S6: consumer executes counter = register2        {counter = 4}
Even though register1 and register2 may be the same physical register, remember that the contents of this register will be saved and restored by the interrupt handler during CPU scheduling
We arrive at this incorrect state because we allowed both processes to manipulate the variable counter concurrently
The concurrent execution of "counter++" and "counter--" is equivalent to a sequential execution in which the lower-level statements are interleaved in some arbitrary order
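The interleaving S1–S6 above can be replayed deterministically in plain C. This is a sketch for illustration only: the function name replay_interleaving is invented here, and ordinary local variables stand in for the CPU registers, so the "race" is simulated rather than real.

```c
/* Deterministic replay of the interleaving S1..S6 from the slide.
 * Plain variables stand in for the CPU registers of the producer
 * and consumer; the statement order below is exactly S1..S6. */
int replay_interleaving(void) {
    int counter = 5;
    int register1, register2;

    register1 = counter;          /* S1: producer loads 5      */
    register1 = register1 + 1;    /* S2: producer computes 6   */
    register2 = counter;          /* S3: consumer loads 5      */
    register2 = register2 - 1;    /* S4: consumer computes 4   */
    counter   = register1;        /* S5: producer stores 6     */
    counter   = register2;        /* S6: consumer stores 4     */

    return counter;               /* 4, not the correct 5      */
}
```

Running the six steps in this order loses the producer's update entirely, which is exactly the lost-update outcome the slide describes.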

9 Process Synchronization & Coordination
A race condition is a situation where several processes are allowed to access and manipulate the same shared data concurrently, and the outcome of the execution depends on the particular order in which the accesses take place
To guard against the race condition above and guarantee correctness, we need to ensure that only one process at a time can manipulate the variable counter; that is, we must synchronize these processes in some way
These situations occur frequently in operating systems as different parts of the system manipulate resources
The prominence of multicore systems has brought an increased emphasis on developing multithreaded applications, with several threads quite possibly sharing data and running in parallel on different processing cores
Clearly, we want any changes that result from such activities not to interfere with one another

10 Critical Section Problem

11 Critical Section Problem
Consider a system of n processes {P0, P1, ..., Pn-1}
Each process has a segment of code, called its critical section
In its critical section, the process may be accessing and updating data that is shared with at least one other process: changing common variables, updating a table, writing a file, etc.
Rule: when one process is executing in its critical section, no other process may be executing in its critical section
The critical-section problem is to design a protocol that the processes can use to synchronize their activity and cooperatively, safely share data
Each process must request permission to enter its critical section; the section of code implementing this request is the entry section
The critical section may be followed by an exit section
The remaining code is the remainder section
Rule: each process must ask permission to enter its critical section in the entry section, may follow the critical section with an exit section, and then executes the remainder section

12 Solution to Critical-Section Problem
A solution to the critical-section problem must satisfy the following requirements:
Mutual exclusion: if process Pi is executing in its critical section, then no other process can be executing in its critical section
Progress: if no process is executing in its critical section and some processes wish to enter their critical sections, then only processes that are not executing in their remainder sections can participate in deciding which will enter its critical section next, and this selection cannot be postponed indefinitely
Bounded waiting: a bound/limit must exist on the number of times that other processes are allowed to enter their critical sections after a process has made a request to enter its critical section and before that request is granted
Assume that each process executes at a nonzero speed
No assumption is made concerning the relative speed of the n processes

13 Critical-Section Handling in OS
At a given point in time, many kernel-mode processes may be active in the OS
As a result, the code implementing an OS (kernel code) is subject to several possible race conditions
Example 1: the kernel maintains the list of all open files as a kernel data structure. This list must be updated when a new file is opened or closed. If two processes were to open files simultaneously, the separate updates to this list could result in a race condition
Example 2: the kernel structures for maintaining memory allocation are likewise prone to possible race conditions
Kernel developers have to ensure that the operating system is free from such potential race conditions
The critical-section problem could be solved simply in a single-core environment if we could prevent interrupts from occurring while a shared variable is being modified
Unfortunately, this solution is not feasible in multiprocessor environments

14 Race Condition in OS: Example 3
Processes P0 and P1 are creating child processes using the fork() system call
There is a race condition on the kernel variable next_available_pid, which represents the next available process identifier (pid)
Unless there is mutual exclusion, the same pid could be assigned to two different processes! OOPS!

15 Critical-Section Handling in OS
Two general approaches are used to handle critical sections in the OS:
Preemptive kernels
Allow a process to be preempted while it is running in kernel mode
Especially difficult to design for SMP architectures, since in these environments it is possible for two kernel-mode processes to run simultaneously on different CPU cores
May be more responsive, since no kernel-mode process will run for an arbitrarily long period of time
More suitable for real-time programming
Non-preemptive kernels
Do not allow a process running in kernel mode to be preempted; a kernel-mode process will run until it exits kernel mode, blocks, or voluntarily yields control of the CPU
Essentially free of race conditions in kernel mode, as only one process is active in the kernel at a time

16 Peterson’s Solution
One of the software-based solutions to the critical-section (CS) problem
We refer to it as a software-based solution because it involves no special support from the operating system, nor does it require any specific hardware instructions to ensure mutual exclusion.

17 Peterson’s Solution to CS
A classic software-based solution to the critical-section problem
Not guaranteed to work on modern architectures!
However, it is a good algorithmic description of solving the CS problem, and it illustrates some of the complexities involved in designing software that addresses the requirements of mutual exclusion, progress, and bounded waiting
A two-process solution: it is restricted to two processes that alternate execution between their critical sections and remainder sections
Assume that the load and store machine-language instructions are atomic; that is, they cannot be interrupted
The two processes share two variables:
    int turn;
    boolean flag[2];
The variable turn indicates whose turn it is to enter the critical section; if turn == i, then process Pi is allowed to execute in its critical section
The flag array is used to indicate if a process is ready to enter the critical section; flag[i] == true implies that process Pi is ready to enter its critical section

18 Peterson’s Solution (Cont.)
To enter the critical section, process Pi first sets flag[i] to true and then sets turn to the value j, thereby asserting that if the other process (Pj) wishes to enter the critical section, it can do so
If both processes try to enter at the same time, turn will be set to both i and j at roughly the same time. Only one of these assignments will last; the other will occur but will be overwritten immediately
The eventual value of turn determines which of the two processes is allowed to enter its critical section first
Pi:
    while (true) {
        flag[i] = true;
        turn = j;
        while (flag[j] && turn == j)
            ; /* busy waiting */
        /* critical section */
        flag[i] = false;
        /* remainder section */
    }
Pj:
    while (true) {
        flag[j] = true;
        turn = i;
        while (flag[i] && turn == i)
            ; /* busy waiting */
        /* critical section */
        flag[j] = false;
        /* remainder section */
    }
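A compilable sketch of the protocol above follows. Note the hedge: the slide's plain-variable version is not guaranteed to work on modern hardware, so this sketch uses C11 sequentially consistent atomics for flag and turn to restore the ordering Peterson's algorithm assumes; the names worker and run_peterson_demo are illustrative, not from the source.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Peterson's algorithm for two threads, written with C11 seq_cst
 * atomics so the entry/exit protocol is not reordered by the
 * compiler or CPU. */
static atomic_bool flag[2];
static atomic_int turn;
static long counter = 0;               /* shared data being protected */

static void peterson_enter(int i) {
    int j = 1 - i;
    atomic_store(&flag[i], true);      /* I am ready                  */
    atomic_store(&turn, j);            /* but you may go first        */
    while (atomic_load(&flag[j]) && atomic_load(&turn) == j)
        ;                              /* busy wait                   */
}

static void peterson_exit(int i) {
    atomic_store(&flag[i], false);
}

static void *worker(void *arg) {
    int i = *(int *)arg;
    for (int n = 0; n < 100000; n++) {
        peterson_enter(i);
        counter++;                     /* critical section            */
        peterson_exit(i);
    }
    return NULL;
}

long run_peterson_demo(void) {
    pthread_t t0, t1;
    int id0 = 0, id1 = 1;
    pthread_create(&t0, NULL, worker, &id0);
    pthread_create(&t1, NULL, worker, &id1);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return counter;                    /* 200000 iff exclusion held   */
}
```

If the atomics are replaced by plain variables, updates to counter can be lost on a weakly ordered machine, which is precisely the failure discussed on the later slides.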

19 Peterson’s Solution (Cont.)
It is provable that the three critical-section (CS) requirements are met:
1. Mutual exclusion is preserved
Pi enters its CS only if either flag[j] == false or turn == i
Note: if both processes were executing in their critical sections at the same time, then flag[0] == flag[1] == true
These two observations imply that P0 and P1 could not have successfully executed their while statements at about the same time, since the value of turn can be either 0 or 1 but not both
Hence one of the processes, say Pj, must have successfully executed its while statement, whereas Pi had to execute at least one additional statement ("turn == j")
However, at that time flag[j] == true and turn == j, and this condition will persist as long as Pj is in its critical section; as a result, mutual exclusion is preserved
2. The progress requirement is satisfied
3. The bounded-waiting requirement is met
To prove properties 2 and 3, we note that a process Pi can be prevented from entering the critical section only if it is stuck in the while loop with the condition flag[j] == true and turn == j; this loop is the only one possible
If Pj is not ready to enter the critical section, then flag[j] == false, and Pi can enter its critical section
If Pj has set flag[j] to true and is also executing in its while statement, then either turn == i or turn == j. If turn == i, then Pi will enter the critical section; if turn == j, then Pj will enter the critical section
However, once Pj exits its critical section, it will reset flag[j] to false, allowing Pi to enter its critical section
If Pj then resets flag[j] to true, it must also set turn to i
Thus, since Pi does not change the value of turn while executing its while statement, Pi will enter the critical section (progress) after at most one entry by Pj (bounded waiting)

20 Peterson’s Solution
Although useful for demonstrating an algorithm, Peterson’s solution is not guaranteed to work on modern architectures
Understanding why it will not work is also useful for better understanding race conditions
To improve performance, processors and/or compilers may reorder operations that have no dependencies
For single-threaded programs this is fine, as the result will always be the same
For multithreaded programs the reordering may produce inconsistent or unexpected results!

21 Peterson’s Solution
Example: consider the following data shared between two threads:
    boolean flag = false;
    int x = 0;
Thread 1 performs:
    while (!flag)
        ; /* busy waiting */
    print x;
Thread 2 performs:
    x = 100;
    flag = true;
What is the expected output? Is it 100?
It is possible that a processor may reorder the instructions of Thread 2 so that flag is assigned true before the assignment x = 100. In this situation, Thread 1 would output 0 for the variable x
Less obviously, the processor may also reorder the statements issued by Thread 1 and load the variable x before loading the value of flag. If this were to occur, Thread 1 would output 0 for x even if the instructions issued by Thread 2 were not reordered

22 Peterson’s Solution
100 is the expected output
However, the operations of Thread 2 may be reordered:
    flag = true;
    x = 100;
If this occurs, the output may be 0!
Instruction reordering has the same effect in Peterson’s solution: it can allow both processes to be in their critical sections at the same time!

23 Hardware Support for Synchronization

24 Synchronization Hardware
On uniprocessors, we could disable interrupts: the currently running code would execute without preemption
This is generally too inefficient on multiprocessor systems, and operating systems using it are not broadly scalable
Multiprocessor systems instead provide hardware support for implementing critical-section code
This support is presented as primitive operations for solving the critical-section problem
These primitive operations can be used directly as synchronization tools, or can form the foundation of more abstract synchronization mechanisms (high-level software tools)
We will look at three forms of hardware support:
1. Memory barriers
2. Hardware instructions
3. Atomic variables

25 1. Memory Barriers
A memory model is the set of memory guarantees a computer architecture makes to application programs
Memory models may be either:
Strongly ordered: a memory modification on one processor is immediately visible to all other processors
Weakly ordered: a memory modification on one processor may not be immediately visible to all other processors
Memory models vary by processor type, so kernel developers cannot make assumptions about the visibility of modifications
Computer architectures therefore provide instructions to control visibility: a memory barrier (memory fence) is an instruction that forces any change in memory to be propagated (made visible) to all other processors

26 1. Memory Barrier (Cont.)
When a memory barrier instruction is performed, the system ensures that all loads and stores are completed before any subsequent load or store operations are performed
We can add memory barriers to the previous example to ensure Thread 1 outputs 100. The data shared between the two threads:
    boolean flag = false;
    int x = 0;
Thread 1 now performs:
    while (!flag)
        memory_barrier();
    print x;
Thread 2 now performs:
    x = 100;
    memory_barrier();
    flag = true;
The barrier in Thread 1 guarantees that the value of flag is loaded before the value of x
The barrier in Thread 2 ensures that the assignment to x occurs before the assignment to flag
Note: memory barriers are considered very low-level operations and are typically used only by kernel developers when writing specialized code that ensures mutual exclusion.
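In portable C, the slide's memory_barrier() can be approximated with C11 atomic_thread_fence. The sketch below reworks the flag/x example with explicit fences; the function names writer and read_x_after_flag are invented for this illustration, and the flag itself is made atomic so the busy-wait loop is well defined.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* The flag/x example with explicit fences.  The release fence in the
 * writer orders the store to x before the store to flag; the acquire
 * fence in the reader orders the load of flag before the load of x. */
static int x = 0;
static atomic_bool flag = false;

static void *writer(void *arg) {
    (void)arg;
    x = 100;
    atomic_thread_fence(memory_order_release);   /* memory_barrier() */
    atomic_store_explicit(&flag, true, memory_order_relaxed);
    return NULL;
}

int read_x_after_flag(void) {
    pthread_t t;
    pthread_create(&t, NULL, writer, NULL);
    while (!atomic_load_explicit(&flag, memory_order_relaxed))
        ;                                        /* busy wait */
    atomic_thread_fence(memory_order_acquire);   /* memory_barrier() */
    pthread_join(t, NULL);
    return x;                                    /* guaranteed 100 */
}
```

Without the two fences (and with relaxed atomics), a weakly ordered machine is permitted to return 0 here, which is the failure mode the slide describes.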

27 2. Hardware Instructions
Special hardware instructions allow us to either test-and-modify the content of a word, or swap the contents of two words, atomically (uninterruptibly)
These allow us to solve the critical-section (CS) problem in a relatively simple manner:
The test_and_set() instruction
The compare_and_swap() instruction

28 2. test_and_set Instruction (Cont.)
Definition:
    boolean test_and_set(boolean *target) {
        boolean rv = *target;
        *target = true;
        return rv;
    }
Executed atomically
Returns the original value of the passed parameter
Sets the new value of the passed parameter to true
Solution using a shared boolean variable lock, initialized to false:
    do {
        while (test_and_set(&lock))
            ; /* do nothing: busy waiting */
        /* critical section */
        lock = false;
        /* remainder section */
    } while (true);
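The test_and_set() primitive maps directly onto C11's atomic_flag: atomic_flag_test_and_set atomically sets the flag and returns its previous value, exactly like the pseudocode above. A minimal spinlock sketch (the names spin_acquire and spin_release are mine, not from the source):

```c
#include <stdatomic.h>

/* Spinlock built on the hardware test-and-set primitive, exposed in
 * C11 as atomic_flag_test_and_set.  The flag plays the role of the
 * slide's shared boolean variable lock, initialized to false. */
static atomic_flag lock = ATOMIC_FLAG_INIT;

void spin_acquire(void) {
    /* Loop until the previous value was false, i.e. we set it. */
    while (atomic_flag_test_and_set(&lock))
        ;   /* do nothing: busy waiting */
}

void spin_release(void) {
    atomic_flag_clear(&lock);   /* lock = false */
}
```

A thread brackets its critical section with spin_acquire()/spin_release(); while one thread holds the lock, any other caller of spin_acquire() keeps seeing a previous value of true and spins.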

29 2. compare_and_swap Instruction (Cont.)
Definition:
    int compare_and_swap(int *value, int expected, int new_value) {
        int temp = *value;
        if (*value == expected)
            *value = new_value;
        return temp;
    }
Executed atomically
Returns the original value of the parameter value
Sets the variable value to new_value, but only if *value == expected is true; that is, the swap takes place only under this condition
Solution using a shared integer lock initialized to 0:
    while (true) {
        while (compare_and_swap(&lock, 0, 1) != 0)
            ; /* do nothing: busy waiting */
        /* critical section */
        lock = 0;
        /* remainder section */
    }
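In C11 the same primitive is available as atomic_compare_exchange_strong. Its interface differs slightly from the slide's pseudocode: it returns a bool and writes the old value back into `expected`, so a small wrapper (a sketch; the wrapper and the cas_acquire/cas_release names are mine) recovers the return-the-old-value behaviour:

```c
#include <stdatomic.h>

/* The slide's compare_and_swap, expressed with C11's CAS primitive.
 * Note: C11 requires the shared word to have an atomic type. */
int compare_and_swap(atomic_int *value, int expected, int new_value) {
    /* On success *value becomes new_value and expected is unchanged;
     * on failure the current *value is copied into expected.  Either
     * way, expected now holds the value *value had before the call. */
    atomic_compare_exchange_strong(value, &expected, new_value);
    return expected;
}

static atomic_int lock = 0;

void cas_acquire(void) {
    while (compare_and_swap(&lock, 0, 1) != 0)
        ;   /* do nothing: busy waiting */
}

void cas_release(void) {
    atomic_store(&lock, 0);
}
```

While the lock is held, compare_and_swap(&lock, 0, 1) returns 1 (the swap does not take place), so competing threads keep spinning, matching the solution on the slide.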

30 Bounded-waiting Mutual Exclusion with compare-and-swap
while (true) {
    waiting[i] = true;
    key = 1;
    while (waiting[i] && key == 1)
        key = compare_and_swap(&lock, 0, 1);
    waiting[i] = false;
    /* critical section */
    j = (i + 1) % n;
    while ((j != i) && !waiting[j])
        j = (j + 1) % n;
    if (j == i)
        lock = 0;
    else
        waiting[j] = false;
    /* remainder section */
}

31 3. Atomic Variables
Typically, instructions such as compare_and_swap() are used as building blocks for other synchronization tools
One such tool is an atomic variable, which provides atomic (uninterruptible) updates on basic data types such as integers and booleans
For example, the increment() operation on the atomic variable sequence ensures that sequence is incremented without interruption:
    increment(&sequence);
The increment() function can be implemented as follows:
    void increment(atomic_int *v) {
        int temp;
        do {
            temp = *v;
        } while (temp != compare_and_swap(v, temp, temp + 1));
    }
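In C11, the same atomic increment is available directly as atomic_fetch_add, which performs the read-modify-write as one uninterruptible operation, so the CAS retry loop is not needed. A minimal sketch:

```c
#include <stdatomic.h>

/* C11 equivalent of the slide's increment(): atomic_fetch_add does
 * the load, add, and store as a single atomic read-modify-write. */
void increment(atomic_int *v) {
    atomic_fetch_add(v, 1);
}
```

Calling increment(&sequence) from several threads concurrently never loses an update, unlike the plain sequence++ seen in the race-condition example.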

32 Higher-Level Software Tools to the CS Problem
Mutex Locks Semaphores Monitors

33 Software Based Tools for CS
The hardware-based solutions to the critical-section problem presented so far are complicated, as well as generally inaccessible to application programmers
Instead, OS designers build higher-level software tools to solve the critical-section problem. Examples:
Mutex locks
Semaphores
Monitors

34 1. Mutex Locks
Mutex is short for MUTual EXclusion
A very simple lock used to protect critical sections and prevent race conditions
A process must acquire the lock before entering a critical section and release the lock when it exits the critical section
Protect a critical section by first calling acquire() on a lock, then calling release() afterward
A boolean variable indicates whether the lock is available or not
Calls to acquire() and release() must be atomic (uninterruptible); they are usually implemented via hardware atomic instructions such as compare_and_swap()
Cons: this solution requires busy waiting. While a process is in its critical section, any other process that tries to enter its critical section must loop continuously in the call to acquire()
Busy waiting is bad in a real multiprogramming system, in which a single CPU core is shared among many processes: it wastes CPU cycles
Because a process "spins" while waiting for an unavailable lock to become available, this type of lock is called a spinlock (how can the spinning be avoided? What about sleeping?)
Spinlocks are widely used in many operating systems. What about Pthreads?

35 Mutex Lock Definitions
while (true) {
    acquire lock
        /* critical section */
    release lock
        /* remainder section */
}

acquire() {
    while (!available)
        ; /* busy wait */
    available = false;
}

release() {
    available = true;
}

These two functions, acquire() and release(), must be implemented atomically
A hardware instruction such as compare_and_swap() can be used to implement them
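The acquire()/release() pattern above can also be written with the Pthreads mutex API, which blocks the caller instead of spinning. A sketch, assuming a POSIX system; the worker and run_mutex_demo names are invented for this example:

```c
#include <pthread.h>

/* Two threads increment a shared counter; the mutex makes counter++
 * a critical section, so no updates are lost. */
static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&mutex);     /* acquire()          */
        counter++;                      /* critical section   */
        pthread_mutex_unlock(&mutex);   /* release()          */
    }
    return NULL;
}

long run_mutex_demo(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return counter;                     /* 200000 when exclusion holds */
}
```

Removing the lock/unlock pair reintroduces exactly the counter++ race condition from the earlier slides.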

36 Mutex Lock (figure slide)

37 2. Semaphore
A synchronization tool that provides more sophisticated ways (than mutex locks) for processes to synchronize their activities
A semaphore S is an integer variable
Apart from initialization, it can be accessed only via two indivisible (atomic) operations: wait() and signal() (originally called P() and V()), introduced by Edsger Dijkstra
When one process modifies the semaphore value, no other process can simultaneously modify that same semaphore value
In the case of wait(S), the testing of the integer value of S (S <= 0), as well as its possible modification (S--), must be executed without interruption
Definition of the wait() operation:
    wait(S) {
        while (S <= 0)
            ; // busy wait
        S--;
    }
Definition of the signal() operation:
    signal(S) {
        S++;
    }

38 Semaphore Usage
Two types of semaphore:
Counting semaphore: the integer value can range over an unrestricted domain
Binary semaphore: the integer value can range only between 0 and 1; same as a mutex lock
Semaphores can solve various synchronization problems
Consider P1 and P2 and a requirement that statement S1 happen before statement S2
Create a semaphore synch initialized to 0
    P1:
        S1;
        signal(synch);  /* increment: synch++ */
    P2:
        wait(synch);    /* while (synch <= 0) busy wait; then synch-- */
        S2;
Can we implement a counting semaphore S using a binary semaphore?

39 Semaphore Implementation
Must guarantee that no two processes can execute wait() and signal() on the same semaphore at the same time
Thus, the implementation becomes a critical-section (CS) problem itself, where the wait and signal code are placed in the CS
This implementation could now have busy waiting in its CS
But the implementation code is short, so there is little busy waiting if the critical section is rarely occupied
Note that applications may spend lots of time in critical sections, so for them this is not a good solution

40 Semaphore Implementation with NO Busy waiting
With each semaphore, there is an associated waiting queue
Each entry in a waiting queue has two data items:
value (of type integer)
pointer to the next record in the list (PCB)
Two operations:
block – place the process invoking the operation on the appropriate waiting queue
wakeup – remove one of the processes from the waiting queue and place it in the ready queue
    typedef struct {
        int value;
        struct process *list;
    } semaphore;

41 Implementation with NO Busy waiting (Cont.)
wait(semaphore *S) {
    S->value--;
    if (S->value < 0) {
        add this process to S->list;
        block(); /* sleep */
    }
}

signal(semaphore *S) {
    S->value++;
    if (S->value <= 0) {
        remove a process P from S->list;
        wakeup(P);
    }
}
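In user space, the kernel's block()/wakeup() pair can be approximated with a Pthreads mutex and condition variable, whose wait queue plays the role of S->list. This is a sketch, not the slide's exact algorithm: it uses the common non-negative-count variant rather than the negative-count bookkeeping above, and the _nb names are chosen to avoid clashing with POSIX sem_wait.

```c
#include <pthread.h>

/* Blocking semaphore sketch: value never goes negative; waiters
 * sleep on the condition variable instead of busy waiting. */
typedef struct {
    int value;
    pthread_mutex_t mutex;
    pthread_cond_t cond;
} semaphore;

void sem_init_nb(semaphore *S, int value) {
    S->value = value;
    pthread_mutex_init(&S->mutex, NULL);
    pthread_cond_init(&S->cond, NULL);
}

void sem_wait_nb(semaphore *S) {
    pthread_mutex_lock(&S->mutex);
    while (S->value <= 0)
        pthread_cond_wait(&S->cond, &S->mutex);  /* block() */
    S->value--;
    pthread_mutex_unlock(&S->mutex);
}

void sem_signal_nb(semaphore *S) {
    pthread_mutex_lock(&S->mutex);
    S->value++;
    pthread_cond_signal(&S->cond);               /* wakeup(P) */
    pthread_mutex_unlock(&S->mutex);
}
```

The while loop around pthread_cond_wait is essential: condition variables permit spurious wakeups, so the waiter must re-check the count after being woken.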

42 Problems with Semaphores
Although semaphores provide a convenient and effective mechanism for process synchronization, using them incorrectly can result in timing errors that are difficult to detect
Suppose all processes share a binary semaphore variable mutex, initialized to 1
Each process must execute wait(mutex) before entering the critical section and signal(mutex) afterward. If this sequence is not preserved, two processes may be in their critical sections simultaneously
Example 1: suppose that a program interchanges the order of wait() and signal():
    signal(mutex)
    ... critical section ...
    wait(mutex)
Several processes may then execute their critical sections simultaneously, violating mutual exclusion
Example 2: suppose that a program replaces signal(mutex) with wait(mutex):
    wait(mutex)
    ... critical section ...
    wait(mutex)
The process will then permanently block on the second call to wait()
Example 3: suppose that a process omits wait(mutex) and/or signal(mutex). Then either mutual exclusion is violated or the process will permanently block
These, and others, are examples of what can occur when semaphores and other synchronization tools are used incorrectly

43 Liveness

44 Liveness
Liveness refers to a set of properties that a system must satisfy to ensure that processes make progress
Processes may have to wait indefinitely while trying to acquire a synchronization tool such as a mutex lock or semaphore
Waiting indefinitely violates the progress and bounded-waiting criteria discussed at the beginning of this chapter
Indefinite waiting is an example of a liveness failure

45 Liveness
Deadlock: two or more processes are waiting indefinitely for an event that can be caused by only one of the waiting processes
Let S and Q be two semaphores initialized to 1
    P0:              P1:
    wait(S);         wait(Q);
    wait(Q);         wait(S);
    ...              ...
    signal(S);       signal(Q);
    signal(Q);       signal(S);
Consider if P0 executes wait(S) and P1 executes wait(Q). When P0 executes wait(Q), it must wait until P1 executes signal(Q)
However, P1 is waiting until P0 executes signal(S)
Since these signal() operations will never be executed, P0 and P1 are deadlocked.

46 Liveness
Other forms of liveness failure:
Starvation – indefinite blocking: a process may never be removed from the semaphore queue in which it is suspended
Priority inversion – a scheduling problem that arises when a lower-priority process holds a lock needed by a higher-priority process
Solved via a priority-inheritance protocol

47 End of Chapter 6

48 Priority Inheritance Protocol
Consider a scenario with three processes P1, P2, and P3, where P1 has the highest priority, P2 the next highest, and P3 the lowest
Assume P3 is assigned a resource R that P1 wants; thus, P1 must wait for P3 to finish using the resource
However, P2 becomes runnable and preempts P3
What has happened is that P2, a process with a lower priority than P1, has indirectly prevented P1 from gaining access to the resource by delaying P3
To prevent this from occurring, a priority-inheritance protocol is used
This protocol simply allows the priority of the highest-priority thread waiting to access a shared resource to be assigned to the thread currently using the resource
Thus, the current owner of the resource is assigned the priority of the highest-priority thread wishing to acquire the resource

