1 Lecture #21 Shared Objects and Concurrent Programming. This material is not available in the textbook; the online PowerPoint presentations contain the text explanations given in class.

Art of Multiprocessor Programming 2 Moore's Law: clock speed is flattening sharply, while the transistor count is still rising.

Art of Multiprocessor Programming 3 Vanishing from your Desktops: The Uniprocessor (figure: a single CPU connected to memory).

Art of Multiprocessor Programming 4 Your Server: The Shared Memory Multiprocessor (SMP) (figure: several processors, each with its own cache, connected by a bus to shared memory).

Art of Multiprocessor Programming 5 Your New Server or Desktop: The Multicore Processor (CMP). The cores, their caches, and the bus to shared memory are all on the same chip (example: Sun T2000 Niagara).

Art of Multiprocessor Programming 6 From the 2008 press… …Intel has announced a press conference in San Francisco on November 17th, where it will officially launch the Core i7 Nehalem processor… …Sun’s next generation Enterprise T5140 and T5240 servers, based on the 3rd Generation UltraSPARC T2 Plus processor, were released two days ago…

Art of Multiprocessor Programming 7 Why is Kunle Smiling? Niagara 1

© 2006 Herlihy and Shavit 8 Traditional Software Scaling (figure: the same user code runs 1.8x, then 3.6x, then 7x faster over successive uniprocessor generations, riding Moore's law).

© 2006 Herlihy and Shavit 9 Multicore Software Scaling (figure: the hoped-for picture, with user code speeding up 1.8x, 3.6x, then 7x as cores are added). Unfortunately, it is not so simple…

© 2006 Herlihy and Shavit 10 Real-World Software Scaling (figure: in practice the speedup is more like 1.8x, 2x, 2.9x as cores are added). Parallelization and synchronization require great care…

11 Concurrent Programming (figure: several threads accessing an object in shared memory). Challenge: coordinating access.

12 Persistent vs. Transient Communication. Persistent communication medium: sending information changes the state of the medium permanently (example: a blackboard). Transient communication medium: the change of state lasts only for a limited time (example: talking).

13 Parallel Primality Testing. Task: print all primes from 1 to 10^10, in some order. Available: a machine with 10 processors. Goal: speed the work up 10 times, that is, the time to print all the primes should be 1/10 of the single-processor time.

14 Parallel Primality Testing: split the work among the processors! Each processor P_i gets 10^9 numbers to test.

15 Parallel Primality Testing

(define (P i)
  (let ((counter (+ 1 (* (- i 1) (power 10 9))))
        (upto (* i (power 10 9))))
    (define (iter)
      (if (< counter upto)
          (begin (if (prime? counter) (display counter) #f)
                 (set! counter (+ counter 1))
                 (iter))
          'done))
    (iter)))

(parallel-execute (P 1) (P 2) ... (P 10))

16 Problem: the work is split unevenly. Some processors have fewer primes to test… Some composite numbers are easier to reject than others… We need to split the work range dynamically!

Art of Multiprocessor Programming 17 Shared Counter: each thread takes the next number to test from a shared counter.

18 A Shared Counter Object

(define (make-shared-counter value)
  (define (fetch) value)
  (define (increment) (set! value (+ 1 value)))
  (define (dispatch m)
    (cond ((eq? m 'fetch) (fetch))
          ((eq? m 'increment) (increment))
          (else (error "unknown request"))))
  dispatch)

(define shared-counter (make-shared-counter 1))

19 Using the Shared Counter

(define (P i)
  (define (iter)
    (let ((index (shared-counter 'fetch)))
      (if (< index (power 10 10))
          (begin (if (prime? index) (display index) #f)
                 (shared-counter 'increment)
                 (iter))
          'done)))
  (iter))

(parallel-execute (P 1) (P 2) ... (P 10))

20 This Solution Doesn't Work

The increment (set! value (+ 1 value)) is not atomic: P1 reads the value 77, P2 then increments the counter 10 times up to 87, and P1 finally writes back 78, wiping out P2's updates. Error!

The fetch in (let ((index (shared-counter 'fetch))) ...) races as well: P1 fetches 77 and P2 also fetches 77, so both processors test the same number. Error!

Art of Multiprocessor Programming 21 Is this problem inherent? It would go away if we could only glue each read and the write that follows it together into a single indivisible step…

22 The Fetch-and-Increment Operation

(define (make-shared-counter value)
  (define (fetch-and-increment)
    (let ((old value))
      (set! value (+ old 1))
      old))
  (define (dispatch m)
    (cond ((eq? m 'fetch-and-increment) (fetch-and-increment))
          (else (error "unknown request -- counter" m))))
  dispatch)

The counter's fetch-and-increment must appear instantaneous (atomic).

© 2006 Herlihy and Shavit 23 Where Things Reside (figure: the code and local variables live in each processor's cache; the shared counter, initially 1, lives in shared memory, reached over the bus)

void primePrint() {
  int i = ThreadID.get();  // IDs in {0..9}
  for (j = i*10^9 + 1; j < (i+1)*10^9; j++) {
    if (isPrime(j)) print(j);
  }
}

24 A Correct Shared Counter

(define shared-counter (make-shared-counter 1))

(define (P i)
  (define (iter)
    (let ((index (shared-counter 'fetch-and-increment)))
      (if (< index (power 10 10))
          (begin (if (prime? index) (display index) #f)
                 (iter))
          'done)))
  (iter))

(parallel-execute (P 1) (P 2) ... (P 10))

25 Implementing Fetch-and-Inc. To make the program work we need an "instantaneous" implementation of fetch-and-increment. There are two ways to get one: special hardware, using built-in synchronization instructions; or special software, using only regular instructions, in which case the solution will involve waiting. The software approach is mutual exclusion.

26 Mutual Exclusion

(mutex 'start)
(let ((old value))
  (set! value (+ old 1))
  old)
(mutex 'end)

Only one process at a time can execute the instructions between (mutex 'start) and (mutex 'end). (figure: P1 and P2 contend for the mutex-protected counter; whichever enters first returns 1.)
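Putting the pieces together, the whole counter can be guarded this way. The sketch below is not from the slides; it assumes a constructor make-mutex whose result accepts the 'start and 'end messages used above, with the stated semantics (only one process at a time between them).

;; A sketch (not from the slides): fetch-and-increment protected by a mutex.
;; Assumes (make-mutex) returns a procedure accepting 'start and 'end, so
;; that only one process at a time is between (mutex 'start) and (mutex 'end).
(define (make-shared-counter value)
  (let ((mutex (make-mutex)))
    (define (fetch-and-increment)
      (mutex 'start)
      (let ((old value))
        (set! value (+ old 1))
        (mutex 'end)
        old))
    (define (dispatch m)
      (cond ((eq? m 'fetch-and-increment) (fetch-and-increment))
            (else (error "unknown request -- counter" m))))
    dispatch))

With this counter, the fetch-and-increment of slide 22 behaves as if it were instantaneous, at the cost of making other processes wait.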

27 The Story of Alice and Bob (figure: Bob's house, Alice's house, and the shared yard between them) * As told by Leslie Lamport

28 The Mutual Exclusion Problem Requirements: Mutual Exclusion: there will never be two dogs simultaneously in the yard. No Deadlock: if only one dog wants to be in the yard it will succeed, and if both dogs want to go out, at least one of them will succeed.

29 Cell Phone Solution (figure: Bob, Alice, and the yard)

30 Coke Can Solution (figure: Bob, Alice, and the yard)

31 Flag Solution -- Alice

(define (Alice)
  (loop                               ;; "repeat forever"
    (set! Alice-flag 'up)             ;; Alice wants to enter
    (do ((= Bob-flag 'up)) (skip))    ;; loop until Bob lowers his flag
    (Alice-dog-in-yard)               ;; the dog can enter the yard
    (set! Alice-flag 'down)))         ;; Alice is leaving

32 Flag Solution -- Bob

(define (Bob)
  (loop                                  ;; "repeat forever"
    (set! Bob-flag 'up)                  ;; Bob wants to enter
    (do ((= Alice-flag 'up))             ;; while Alice wants to enter:
        (set! Bob-flag 'down)            ;;   Bob is a gentleman
        (do ((= Alice-flag 'up)) (skip)) ;;   loop (skip) till Alice leaves
        (set! Bob-flag 'up))             ;;   raise the flag and check again
    (Bob-dog-in-yard)                    ;; the dog can enter the yard
    (set! Bob-flag 'down)))              ;; Bob is leaving

33 Flag Solution -- Both: Alice and Bob each run their procedure above, concurrently.
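The loop, do, and skip forms above are slide pseudo-constructs, not standard Scheme. As a rough sketch (not from the slides), assuming Alice-flag and Bob-flag are ordinary shared variables visible to both processes and that busy-waiting is acceptable, they can be rendered with named lets:

;; A sketch (not from the slides): the flag protocol with the pseudo-constructs
;; loop / do / skip rendered as named-let loops.
(define Alice-flag 'down)
(define Bob-flag 'down)

(define (Alice)
  (let loop ()                             ;; repeat forever
    (set! Alice-flag 'up)                  ;; Alice wants to enter
    (let wait ()                           ;; busy-wait until Bob's flag is down
      (if (eq? Bob-flag 'up) (wait)))
    (Alice-dog-in-yard)                    ;; critical section
    (set! Alice-flag 'down)                ;; Alice is leaving
    (loop)))

(define (Bob)
  (let loop ()                             ;; repeat forever
    (set! Bob-flag 'up)                    ;; Bob wants to enter
    (let defer ()                          ;; while Alice wants to enter, back off
      (if (eq? Alice-flag 'up)
          (begin
            (set! Bob-flag 'down)          ;; Bob is a gentleman
            (let wait ()                   ;; busy-wait until Alice leaves
              (if (eq? Alice-flag 'up) (wait)))
            (set! Bob-flag 'up)            ;; raise the flag and check again
            (defer))))
    (Bob-dog-in-yard)                      ;; critical section
    (set! Bob-flag 'down)                  ;; Bob is leaving
    (loop)))

The structure is unchanged: raise your own flag, wait (or defer) while the other flag is up, enter, then lower your flag.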

34 Intuition: Why Mutual Exclusion is Preserved. Each performs the same protocol: first raise the flag, to signal interest; then look to see whether the other one has raised the flag. One can claim that the following flag principle holds: since Alice and Bob each raise their own flag and then look at the other's flag, the last one to start looking must notice that both flags are up.

Art of Multiprocessor Programming 35 Proof of Mutual Exclusion. Assume both dogs are in the yard; derive a contradiction by reasoning backwards. Consider the last time Alice and Bob each looked before letting their dogs in. Without loss of generality, assume Alice was the last to look…

Art of Multiprocessor Programming 36 Proof. On the timeline, Bob last raised his flag before his last look, which came before Alice's last look (she looked last), and Bob has not lowered his flag since (his dog is in the yard). So when Alice took her last look she must have seen Bob's flag up, and would not have let her dog in. A contradiction. QED

37 Why is there no Deadlock? Because Alice has priority over Bob: if neither is in the critical section and both are repeatedly trying, Bob will give way to Alice. Unfortunately, the algorithm is not a fair one, and Bob's dog might eventually grow very anxious :-)

38 The Morals of our Story. The mutual exclusion problem cannot be solved using transient communication (i.e., cell phones). It cannot be solved using interrupts or interrupt bits (i.e., cans). It can be solved with one-bit registers (i.e., flags): memory locations that can be read and written (set!-ed). We cheated a little: the arbiter problem…

Art of Multiprocessor Programming 39 The Arbiter Problem (an aside) (figure: the hardware must "pick a point" to decide which of two nearly simultaneous events came first).

40 The Solution and Conclusion

(define (Alice)
  (loop
    (mutex 'begin)
    (Alice-dog-in-yard)   ;; critical section
    (mutex 'end)))

Question: then why not execute all the code of the parallel prime-printing algorithm in a critical section?
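To make the question concrete, here is what that would look like (a rough sketch, not from the slides, reusing the shared-counter of slide 18 and the 'begin / 'end mutex convention of this slide, with mutex assumed to be a single shared mutex object): each iteration's whole body, including the expensive primality test, runs inside the critical section, so the processors effectively take turns.

;; A sketch (not from the slides): the entire loop body inside the
;; critical section.  Still correct, and the plain fetch/increment counter
;; suffices, but the primality tests no longer overlap, so the ten
;; processors give essentially no speedup.
(define (P i)
  (define (iter)
    (mutex 'begin)
    (let ((index (shared-counter 'fetch)))
      (if (< index (power 10 10))
          (begin (if (prime? index) (display index) #f)
                 (shared-counter 'increment)
                 (mutex 'end)
                 (iter))
          (begin (mutex 'end) 'done))))
  (iter))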

Art of Multiprocessor Programming 41-45 Answer: Amdahl's Law. The speedup of a computation given n CPUs instead of 1 is

    Speedup = 1 / ((1 - p) + p/n)

where p is the parallel fraction of the computation, (1 - p) is the sequential fraction, and n is the number of processors.

Art of Multiprocessor Programming 46 Example Ten processors 60% concurrent, 40% sequential How close to 10-fold speedup?

Art of Multiprocessor Programming 47 Example Ten processors 60% concurrent, 40% sequential How close to 10-fold speedup? Speedup = 1 / (0.4 + 0.6/10) = 2.17

Art of Multiprocessor Programming 48 Example Ten processors 80% concurrent, 20% sequential How close to 10-fold speedup?

Art of Multiprocessor Programming 49 Example Ten processors 80% concurrent, 20% sequential How close to 10-fold speedup? Speedup = 1 / (0.2 + 0.8/10) = 3.57

Art of Multiprocessor Programming 50 Example Ten processors 90% concurrent, 10% sequential How close to 10-fold speedup?

Art of Multiprocessor Programming 51 Example Ten processors 90% concurrent, 10% sequential How close to 10-fold speedup? Speedup = 1 / (0.1 + 0.9/10) = 5.26

Art of Multiprocessor Programming 52 Example Ten processors 99% concurrent, 1% sequential How close to 10-fold speedup?

Art of Multiprocessor Programming 53 Example Ten processors 99% concurrent, 1% sequential How close to 10-fold speedup? Speedup = 1 / (0.01 + 0.99/10) = 9.17
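The arithmetic behind these numbers is easy to check. The small helper below is not from the slides; it simply evaluates Amdahl's formula for a parallel fraction p and n processors.

;; A sketch (not from the slides): Amdahl's Law as a Scheme procedure.
;; p is the parallel fraction of the computation, n the number of processors.
(define (amdahl-speedup p n)
  (/ 1 (+ (- 1 p) (/ p n))))

;; Reproduces the numbers above:
(amdahl-speedup 0.6 10)   ; => ~2.17
(amdahl-speedup 0.8 10)   ; => ~3.57
(amdahl-speedup 0.9 10)   ; => ~5.26
(amdahl-speedup 0.99 10)  ; => ~9.17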

Art of Multiprocessor Programming 54 Back to Real-World Multicore Scaling (figure: user code again speeding up only about 1.8x, 2x, 2.9x as cores are added). Why the bad performance?

55 Amdahl's Law: pay for N = 8 cores with a sequential part of 25%, and the speedup is only 2.9 times! As the number of cores grows, the effect of that 25% becomes more acute: roughly 2.3x on 4 cores, 2.9x on 8, 3.4x on 16, 3.7x on 32…. We must parallelize applications at a very fine grain! So where is the sequential code coming from?
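For reference, the per-core-count numbers quoted here fall out of the amdahl-speedup helper sketched above (again, not from the slides):

(amdahl-speedup 0.75 4)   ; => ~2.29
(amdahl-speedup 0.75 8)   ; => ~2.91
(amdahl-speedup 0.75 16)  ; => ~3.37
(amdahl-speedup 0.75 32)  ; => ~3.66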

56 Need Fine-Grained Locking (figure: 75% of the work is unshared, 25% touches shared data; with coarse-grained locking the shared 25% is guarded by a single lock that serializes all the cores, which is the reason we get only a 2.9x speedup, whereas fine-grained locking lets the cores overlap even within the shared part). Fine-grained synchronization has a huge performance benefit.
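As a rough illustration (not from the slides, reusing the make-mutex assumption from earlier): coarse-grained locking guards an entire object with a single mutex, while fine-grained locking gives each independently updatable part its own mutex, so operations on different parts can proceed in parallel.

;; A sketch (not from the slides), using the same make-mutex assumption as before.

;; Coarse-grained: one mutex guards both counters, so two processes
;; updating different counters still wait for each other.
(define (make-coarse-pair)
  (let ((a 0) (b 0) (mutex (make-mutex)))
    (define (dispatch m)
      (cond ((eq? m 'inc-a) (mutex 'start) (set! a (+ a 1)) (mutex 'end))
            ((eq? m 'inc-b) (mutex 'start) (set! b (+ b 1)) (mutex 'end))
            (else (error "unknown request -- pair" m))))
    dispatch))

;; Fine-grained: each counter has its own mutex, so updates to different
;; counters proceed in parallel.
(define (make-fine-pair)
  (let ((a 0) (b 0) (mutex-a (make-mutex)) (mutex-b (make-mutex)))
    (define (dispatch m)
      (cond ((eq? m 'inc-a) (mutex-a 'start) (set! a (+ a 1)) (mutex-a 'end))
            ((eq? m 'inc-b) (mutex-b 'start) (set! b (+ b 1)) (mutex-b 'end))
            (else (error "unknown request -- pair" m))))
    dispatch))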

57 Multicores are here …

58 Programming Multicore Machines. “Life is the synchronicity of chance.” You have just seen a bit of what concurrent programming is about. Today we do not yet have sufficient expertise in how to make good use of multicore machines… You are the generation that will get to use them, and hopefully to develop this expertise.