
Parallel Processing (CS526) Spring 2012 (Week 6)

 A parallel algorithm is a group of partitioned tasks that work with each other to solve a large problem.
 Working together requires communication and coordination.
 Communication and coordination, in turn, require synchronization.

 There are two kinds of synchronization:
 ◦ Mutual exclusion: we want to prevent two or more threads from being active concurrently for some period, because their actions may interfere incorrectly.
 ◦ Condition synchronization: we want to delay an action until some condition becomes true, either a condition on shared variables (as in producer-consumer) or on the progress of other threads (as in a barrier).

 Assume a shared counter x is initially 0, and two concurrent threads increment x. The expected result is x == 2.
 Process histories:
 P1: …; load value of x to reg; incr reg; write reg to x; …
 P2: …; load value of x to reg; incr reg; write reg to x; …
 Without synchronization, the final result can be x == 1 or x == 2.
 The statements accessing shared variables are critical sections that must be executed one at a time, i.e. with mutual exclusion (atomically).
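A minimal C sketch of this scenario (the function and variable names are illustrative, not from the slides); because the increment compiles to a separate load, increment, and store, the two updates can interleave and one can be lost:

    #include <pthread.h>
    #include <stdio.h>

    int x = 0;                          /* shared counter, no protection */

    void *incr(void *arg) {
        x = x + 1;                      /* load x, incr in a reg, store back:
                                           NOT atomic */
        return NULL;
    }

    int main(void) {
        pthread_t p1, p2;
        pthread_create(&p1, NULL, incr, NULL);
        pthread_create(&p2, NULL, incr, NULL);
        pthread_join(p1, NULL);
        pthread_join(p2, NULL);
        printf("x == %d\n", x);         /* expected 2, but 1 is possible */
        return 0;
    }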

 Mutual Exclusion: We want to prevent two or more threads from being active concurrently for some period, because their actions may interfere incorrectly.

 A critical section is a section of code that accesses a shared resource (e.g. a shared variable) and may be executed by only one process at a time.
 The Critical Section (CS) Problem: find a mechanism that guarantees execution of critical sections one at a time, i.e. with mutual exclusion.
 ◦ It arises in many concurrent programs, for example: shared linked lists in an OS, database records, shared counters, etc.

 The challenge is to make the critical section atomic. We must design code to execute before (the entry protocol) and after (the exit protocol) the critical section.
 A solution should have the following important properties:
 ◦ Mutual exclusion. At most one thread is executing the critical section at a time.
 ◦ Absence of deadlock (or livelock). If two or more threads are trying to enter the critical section, at least one succeeds.
 ◦ Absence of unnecessary delay. If a thread is trying to enter its critical section while the other threads are executing their non-critical sections, or have terminated, it is not prevented from entering.
 ◦ Eventual entry (or no starvation). A thread that is attempting to enter its critical section will eventually succeed.

 The entry and exit protocol code has to operate on one or more shared variables. Conventionally we call such variables locks, and the protocol code sequences locking and unlocking. Shared-variable libraries often abstract these as functions.

 We can have one lock for all shared variables:
 ◦ inefficient; decreases parallelism.
 Or each shared variable can have its own lock:
 ◦ a large degree of parallelism can be achieved, but this may cause very large synchronization overhead (memory and time).
 There is thus a tradeoff between the number of locks (synchronization overhead) and the number of variables protected by one lock (available parallelism), contrasted in the sketch below.
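As an illustration of the tradeoff (the struct and field names are invented for this example), coarse- vs. fine-grained locking of two counters with POSIX mutexes:

    #include <pthread.h>

    /* Coarse-grained: one lock for all shared fields. Cheap in memory,
       but every access serializes against every other. */
    struct stats_coarse {
        pthread_mutex_t lock;           /* guards both counters */
        long hits;
        long misses;
    };

    /* Fine-grained: one lock per field. Threads touching different
       fields proceed in parallel, at the cost of extra locks (memory)
       and extra lock/unlock operations (time). */
    struct stats_fine {
        pthread_mutex_t hits_lock;
        long hits;
        pthread_mutex_t misses_lock;
        long misses;
    };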

 Unfair (spin) locks – unfair but efficient:
 ◦ Short latency and low memory demand.
 ◦ Poor fairness; may cause starvation.
 ◦ Good in case of low contention (a few processes).
 ◦ Examples: test&set lock, test-test&set lock, test&set lock with backoff.
 Fair (queuing) locks – fair but more expensive:
 ◦ Longer latency and more memory – the price for fairness.
 ◦ Examples: tie-breaker lock, ticket lock, bakery lock.

 A spin lock is a Boolean variable that indicates whether or not some process is in its critical section:
 ◦ lock == 1 – some process is in its CS (the lock is “locked”).
 ◦ lock == 0 – no process is in its CS (the lock is “unlocked”).

 A simple approach is to implement each lock with a shared Boolean variable.
 If the variable has value false, then one locking thread can set it and be allowed to proceed; other threads attempting to lock must wait.
 To unlock the lock, the lock-holding thread simply sets the variable back to false.
 We can specify this behavior in pseudo notation.

 The sketch below gives this behavior, assuming the test and the set of the flag happen atomically.
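A minimal C-style sketch of this naive lock (the names acquire and release are illustrative); note that without hardware support the test and the set are two separate steps, so two threads could both see false and both enter:

    #include <stdbool.h>

    bool lock = false;              /* false: unlocked, true: locked */

    void acquire(void) {
        while (lock)                /* test: wait until the lock looks free */
            ;                       /* (busy waiting)                       */
        lock = true;                /* set: claim it -- correct only if the */
    }                               /* test and the set are atomic          */

    void release(void) {
        lock = false;
    }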

 The spin lock requires atomicity in its own implementation:
 ◦ HW support for synchronization – a special atomic memory instruction, such as test&set, swap, compare&swap, or fetch&increment.
 Unlock is implemented with an ordinary store operation: location = false;
 A test-and-set lock built on a Test&Set (t&s) instruction is sketched below.
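A sketch using C11's atomic_flag_test_and_set, which plays the role of the t&s instruction (it atomically sets the flag and returns its old value); the acquire/release names are illustrative:

    #include <stdatomic.h>

    atomic_flag lock = ATOMIC_FLAG_INIT;        /* clear == unlocked */

    void acquire(void) {
        /* t&s: atomically set the flag and get its OLD value;
           spin as long as it was already set by someone else */
        while (atomic_flag_test_and_set(&lock))
            ;
    }

    void release(void) {
        atomic_flag_clear(&lock);               /* ordinary store */
    }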

 The test&set lock causes high memory contention while waiting for the lock:
 ◦ t&s is treated as a write operation – it invalidates cached copies, if any.
 ◦ Unsuccessful t&s operations generate memory accesses (bus traffic).
 ◦ CPU time is also wasted on busy waiting.
 Enhancements to the simple test&set lock:
 ◦ SW solutions:
  Test&set lock with (exponential) backoff
  Test-test&set lock
 ◦ Improved HW primitives:
  Load-Locked (LL) and Store-Conditional (SC) instructions

 Test&set lock with exponential backoff:
 ◦ Back off (pause) after an unsuccessful t&s (attempt to lock).
 ◦ This reduces the frequency of issuing test&sets while waiting.
 ◦ Don't back off too much, or the process will still be backed off when the lock becomes free.
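A sketch of the backoff loop; the delay constants and the use of usleep for the pause are illustrative choices, not from the slides:

    #include <stdatomic.h>
    #include <unistd.h>                 /* usleep, for the pause */

    atomic_flag lock = ATOMIC_FLAG_INIT;

    void acquire(void) {
        unsigned delay = 1;             /* microseconds; start value arbitrary */
        while (atomic_flag_test_and_set(&lock)) {
            usleep(delay);              /* back off before the next attempt */
            if (delay < 1024)           /* cap the backoff so we notice a   */
                delay *= 2;             /* released lock reasonably quickly */
        }
    }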

 Idea: keep testing with an ordinary load; when the value changes (to 0), try a test&set.
 Slightly higher latency, much less memory contention.
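A sketch of the test-test&set lock with C11 atomics, where atomic_exchange stands in for the t&s instruction (names are illustrative):

    #include <stdatomic.h>
    #include <stdbool.h>

    atomic_bool lock = false;

    void acquire(void) {
        for (;;) {
            /* "test": spin on an ordinary load, which hits in the local
               cache and causes no bus traffic while the lock is held */
            while (atomic_load(&lock))
                ;
            /* "test&set": the lock looked free, now try to grab it */
            if (!atomic_exchange(&lock, true))
                return;                 /* old value was false: we got it */
        }
    }

    void release(void) {
        atomic_store(&lock, false);
    }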

 Uncontended latency:
 ◦ Low if repeatedly accessed by the same processor; independent of n.
 Traffic:
 ◦ Lots if many processors compete; poor scaling with n.
 ◦ Each t&s generates invalidations, and all waiters rush out again to t&s.
 Storage: very small (a single variable); independent of n.
 Fairness: poor; can cause starvation.
 Test&set with backoff: similar, but less traffic.
 Test-and-test&set: slightly higher latency, much less traffic.

 Spin locks are efficient (low latency and memory demand), but unfair:
 ◦ When a lock becomes free, spinning processes rush to grab it in an arbitrary order; one succeeds, the others fail and spin again.
 ◦ The same process can grab the lock again.
 Queuing locks provide a fair solution to the CS problem:
 ◦ Waiting processes are queued on the lock.
 ◦ A released lock is passed to the process at the head of the queue.
 ◦ Examples: the ticket and bakery algorithms.

 The ticket lock works like a waiting line at a post office or a bank.
 Two shared counters per lock:
 ◦ number, to be “drawn” by one process at a time;
 ◦ next, to indicate which process can enter its critical section.
 CS enter (lock the lock):
 ◦ Read a number from number and increment number; wait until next equals the number drawn, then enter the CS.
 CS exit (unlock the lock):
 ◦ Increment next, which allows the next waiting process (if any) to enter its CS.

 The ticket lock can be implemented as a structure with fields number and next.
 turn[i] can be a local variable turn in the Lock procedure.
 The ticket lock needs a special atomic memory instruction for number drawing – fetch&increment (load a location to a register and increment the location). A sketch follows.
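A sketch with C11 atomics, where atomic_fetch_add plays the role of fetch&increment (the acquire/release names are illustrative):

    #include <stdatomic.h>

    struct ticket_lock {
        atomic_uint number;     /* next ticket to be drawn            */
        atomic_uint next;       /* ticket now allowed to enter the CS */
    };

    void acquire(struct ticket_lock *l) {
        /* fetch&increment: draw a ticket atomically */
        unsigned my_turn = atomic_fetch_add(&l->number, 1);
        while (atomic_load(&l->next) != my_turn)
            ;                   /* spin until our number comes up */
    }

    void release(struct ticket_lock *l) {
        /* pass the lock on; only the holder writes next, so the
           atomic increment here is only needed for visibility */
        atomic_fetch_add(&l->next, 1);
    }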

 The ticket algorithm is fair if fetch&op is available.
 ◦ Otherwise it requires mutual exclusion for number drawing, which can be unfair.
 The bakery algorithm works like a line in a bakery without a number-drawing machine – a process looks around and takes a number one larger than any other.
 ◦ Requires a shared int array turn[n] per lock.
 ◦ Does not need a special instruction. A sketch follows.
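A sketch of Lamport-style bakery locking (the choosing array is standard in the algorithm, but the constant N and the exact names here are illustrative); it assumes sequentially consistent memory – on real hardware the shared arrays would need atomic types and fences:

    #include <stdbool.h>

    #define N 8                         /* number of processes, illustrative */

    volatile bool choosing[N];          /* choosing[i]: i is drawing a number */
    volatile int  turn[N];              /* 0 means "not competing"            */

    void lock(int i) {
        choosing[i] = true;             /* announce that we are drawing */
        int max = 0;
        for (int j = 0; j < N; j++)     /* look around and take a number */
            if (turn[j] > max)          /* one larger than any other     */
                max = turn[j];
        turn[i] = max + 1;
        choosing[i] = false;
        for (int j = 0; j < N; j++) {
            while (choosing[j])         /* wait until j has finished drawing */
                ;
            /* wait while j holds a smaller number (ties broken by id) */
            while (turn[j] != 0 &&
                   (turn[j] < turn[i] || (turn[j] == turn[i] && j < i)))
                ;
        }
    }

    void unlock(int i) {
        turn[i] = 0;                    /* stop competing */
    }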