Published by Raymond Sowell. Modified over 10 years ago.
1
The Art of Multiprocessor Programming Nir Shavit, Ori Shalev CS 0368-4061-01 Spring 2007 (Based on the book by Herlihy and Shavit)
2
© 2006 Herlihy and Shavit
Requirements
Required readings
6 problem sets
–20% of grade
–See submission instructions later
–Late loses 10%/day
Final exam
3
From the New York Times …
SAN FRANCISCO, May 7, 2004 - Intel said on Friday that it was scrapping its development of two microprocessors, a move that is a shift in the company's business strategy….
4
Moore's Law
Clock speed flattening sharply
Transistor count still rising
5
On Your Desktop: The Uniprocessor
(Figure: a cpu connected to memory)
6
In the Enterprise: The Shared Memory Multiprocessor (SMP)
(Figure: processors, each with its own cache, connected by a bus to shared memory)
7
Your New Desktop: The Multicore Processor (CMP)
(Figure: caches, bus, and shared memory, all on the same chip; example: Sun T2000 Niagara)
8
Multicores Are Here
Intel ups ante with 4-core chip. New microprocessor, due this year, will be faster, use less electricity... [San Fran Chronicle]
AMD will launch a dual-core version of its Opteron server processor at an event in New York on April 21. [PC World]
Sun's Niagara…will have eight cores, each core capable of running 4 threads in parallel, for 32 concurrently running threads. …. [The Inquirer]
9
Why do we care?
Time no longer cures software bloat
–The free ride is over
When you double your program's path length
–You can't just wait 6 months
–Your software must somehow exploit twice as much concurrency
10
Traditional Scaling Process
(Figure: the same user code on successive traditional uniprocessors; speedup 1.8x, 3.6x, 7x over time, thanks to Moore's law)
11
Multicore Scaling Process
(Figure: user code on successive multicores; speedup 1.8x, 3.6x, 7x)
Unfortunately, not so simple…
12
Real-World Scaling Process
(Figure: user code on successive multicores; speedup 1.8x, 2x, 2.9x)
Parallelization and synchronization require great care…
13
Multiprocessor Programming: Course Overview
Fundamentals
–Models, algorithms, impossibility
Real-World programming
–Architectures
–Techniques
14
Sequential Computation
(Figure: one thread operating on an object in memory)
15
Concurrent Computation
(Figure: several threads operating on objects in memory)
16
Asynchrony
Sudden unpredictable delays
–Cache misses (short)
–Page faults (long)
–Scheduling quantum used up (really long)
17
Model Summary
Multiple threads
–Sometimes called processes
Single shared memory
Objects live in memory
Unpredictable asynchronous delays
18
Road Map
We are going to focus on principles first, then practice
–Start with idealized models
–Look at simplistic problems
–Emphasize correctness over pragmatism
–Correctness may be theoretical, but incorrectness has practical impact
19
Concurrency Jargon
Hardware
–Processors
Software
–Threads, processes
Sometimes OK to confuse them, sometimes not.
20
Parallel Primality Testing
Challenge
–Print primes from 1 to 10^10
Given
–Ten-processor multiprocessor
–One thread per processor
Goal
–Get ten-fold speedup (or close)
21
Load Balancing
Split the work evenly
Each thread tests a range of 10^9 numbers
(Figure: the interval 1 … 10^10 cut at 10^9, 2·10^9, …; the pieces are assigned to threads P0, P1, …, P9)
22
Procedure for Thread i
void primePrint() {
  int i = ThreadID.get(); // IDs in {0..9}
  long range = 1000000000L; // 10^9 numbers per thread
  for (long j = i * range + 1; j <= (i + 1) * range; j++) {
    if (isPrime(j)) print(j);
  }
}
23
Issues
Larger numbers imply fewer primes
Yet larger numbers are harder to test
Thread workloads
–Uneven
–Hard to predict
Need dynamic load balancing
24
Issues
Larger number ranges have fewer primes
Larger numbers are harder to test
Thread workloads
–Uneven
–Hard to predict
Need dynamic load balancing
(Stamp over the static split: rejected)
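
The uneven-workload claim can be checked with a small experiment. The sketch below is not from the slides: it assumes trial division as the primality test, scales the limit down from 10^10 to 100,000 so it runs quickly, and uses hypothetical names (`RangeCost`, `divisionsFor`, `costOfRange`). It counts how many trial divisions each of the ten static ranges would need.

```java
// Sketch: how much trial-division work does each static range cost?
// Assumptions: trial division as the primality test, LIMIT scaled down from 10^10.
class RangeCost {
    static final int THREADS = 10;
    static final long LIMIT = 100_000;

    // Number of trial divisions needed to decide whether n is prime.
    static long divisionsFor(long n) {
        long steps = 0;
        for (long d = 2; d * d <= n; d++) {
            steps++;
            if (n % d == 0) break; // composite: stop at the first divisor
        }
        return steps;
    }

    // Total trial divisions for the range that thread i would be given.
    static long costOfRange(int i) {
        long width = LIMIT / THREADS;
        long cost = 0;
        for (long j = i * width + 1; j <= (i + 1) * width; j++) {
            cost += divisionsFor(j);
        }
        return cost;
    }

    public static void main(String[] args) {
        for (int i = 0; i < THREADS; i++) {
            System.out.println("range " + i + ": " + costOfRange(i) + " divisions");
        }
    }
}
```

Under these assumptions the last range costs more than the first, so the thread holding it finishes last and the others sit idle.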
25
Shared Counter
(Figure: a shared counter handing out 17, 18, 19, …)
Each thread takes a number
26
Procedure for Thread i
Counter counter = new Counter(1);
void primePrint() {
  long j = 0;
  while (j < 10000000000L) { // 10^10
    j = counter.getAndIncrement();
    if (isPrime(j)) print(j);
  }
}
27
Procedure for Thread i
Counter counter = new Counter(1); // shared counter object
void primePrint() {
  long j = 0;
  while (j < 10000000000L) { // 10^10
    j = counter.getAndIncrement();
    if (isPrime(j)) print(j);
  }
}
28
Where Things Reside
(Figure: the primePrint code and its local variables sit with each processor, next to its cache; the shared counter, currently 1, sits in shared memory across the bus)
29
Procedure for Thread i
Counter counter = new Counter(1);
void primePrint() {
  long j = 0;
  while (j < 10000000000L) { // stop when every value is taken
    j = counter.getAndIncrement();
    if (isPrime(j)) print(j);
  }
}
30
Procedure for Thread i
Counter counter = new Counter(1);
void primePrint() {
  long j = 0;
  while (j < 10000000000L) { // 10^10
    j = counter.getAndIncrement(); // increment & return each new value
    if (isPrime(j)) print(j);
  }
}
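
A runnable sketch of the shared-counter scheme, under stated assumptions: `java.util.concurrent.atomic.AtomicLong` stands in for the slides' `Counter`, the limit is a parameter rather than 10^10, and primes are counted instead of printed. The names `SharedCounterPrimes` and `countPrimes` are hypothetical. Unlike the slide's loop, it checks the limit against the value just taken, so no thread tests a number past the end.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: dynamic load balancing, each worker takes the next untested number.
class SharedCounterPrimes {
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long d = 2; d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Counts the primes in [1, limit) using `threads` workers and one shared counter.
    static long countPrimes(long limit, int threads) {
        AtomicLong counter = new AtomicLong(1); // stand-in for the slides' Counter
        AtomicLong found = new AtomicLong(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                while (true) {
                    long j = counter.getAndIncrement();
                    if (j >= limit) break; // every value has been taken
                    if (isPrime(j)) found.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return found.get();
    }

    public static void main(String[] args) {
        System.out.println(countPrimes(1000, 4)); // prints 168
    }
}
```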
31
Counter Implementation
public class Counter {
  private long value;
  public long getAndIncrement() {
    return value++;
  }
}
32
Counter Implementation
public class Counter {
  private long value;
  public long getAndIncrement() {
    return value++;
  }
}
OK for uniprocessor, not for multiprocessor
33
What It Means
public class Counter {
  private long value;
  public long getAndIncrement() {
    return value++;
  }
}
34
What It Means
public class Counter {
  private long value;
  public long getAndIncrement() {
    return value++;
  }
}
Here "return value++" stands for three separate steps:
long temp = value;
value = value + 1;
return temp;
35
Not so good…
(Timeline figure: the value starts at 1. Two threads both read 1. One writes 2. A later call reads 2 and writes 3. The delayed thread then writes 2. The value goes 1, 2, 3, 2.)
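
The bad interleaving can be replayed deterministically in a single thread by writing out the read and write steps in the order the timeline shows. This is only an illustration of that one schedule, not a concurrency test; `LostUpdateReplay` and `replay` are hypothetical names.

```java
// Sketch: replay one unlucky schedule of three getAndIncrement() calls by hand.
class LostUpdateReplay {
    // Returns {result of call A, result of call B, result of call C, final value}.
    static long[] replay() {
        long value = 1;        // the shared counter

        long tempA = value;    // call A reads 1
        long tempB = value;    // call B reads 1 as well
        value = tempA + 1;     // call A writes 2
        long tempC = value;    // call C reads 2
        value = tempC + 1;     // call C writes 3
        value = tempB + 1;     // call B, delayed, writes 2 and wipes out C's update

        return new long[] {tempA, tempB, tempC, value};
    }

    public static void main(String[] args) {
        long[] r = replay();
        // Two callers received the same number, and the counter ends at 2 instead of 4.
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " final=" + r[3]);
    }
}
```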
36
Is this problem inherent?
(Figure: two threads, each doing a read followed by a write)
If we could only glue reads and writes…
37
Challenge
public class Counter {
  private long value;
  public long getAndIncrement() {
    long temp = value;
    value = temp + 1;
    return temp;
  }
}
38
Challenge
public class Counter {
  private long value;
  public long getAndIncrement() {
    long temp = value;
    value = temp + 1;
    return temp;
  }
}
Make these steps atomic (indivisible)
39
Hardware Solution
public class Counter {
  private long value;
  public long getAndIncrement() {
    long temp = value;
    value = temp + 1;
    return temp;
  }
}
ReadModifyWrite() instruction
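
In Java, one way to reach a hardware read-modify-write step is `compareAndSet` on `java.util.concurrent.atomic.AtomicLong`. The sketch below is an assumption about how the counter could be built on it, not the slides' implementation; `CasCounter` and `hammer` are hypothetical names. (`AtomicLong` also provides `getAndIncrement()` directly; the loop is spelled out to show the retry.)

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: getAndIncrement as a compare-and-set retry loop.
class CasCounter {
    private final AtomicLong value;

    CasCounter(long initial) {
        value = new AtomicLong(initial);
    }

    long getAndIncrement() {
        while (true) {
            long temp = value.get();
            // Succeeds only if nobody changed value since we read it; otherwise retry.
            if (value.compareAndSet(temp, temp + 1)) return temp;
        }
    }

    long get() {
        return value.get();
    }

    // Runs `threads` workers doing `perThread` increments each; returns the final value.
    static long hammer(int threads, int perThread) {
        CasCounter counter = new CasCounter(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) counter.getAndIncrement();
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 10_000)); // no increment is lost
    }
}
```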
40
An Aside: Java
public class Counter {
  private long value;
  public long getAndIncrement() {
    long temp;
    synchronized (this) {
      temp = value;
      value = temp + 1;
    }
    return temp;
  }
}
41
An Aside: Java
public class Counter {
  private long value;
  public long getAndIncrement() {
    long temp;
    synchronized (this) { // synchronized block
      temp = value;
      value = temp + 1;
    }
    return temp;
  }
}
42
An Aside: Java
public class Counter {
  private long value;
  public long getAndIncrement() {
    long temp;
    synchronized (this) { // mutual exclusion
      temp = value;
      value = temp + 1;
    }
    return temp;
  }
}
43
Mutual Exclusion, or "Alice & Bob share a pond"
(Figure: Alice (A), Bob (B), and the pond)
44
Alice has a pet
(Figure: Alice (A) and Bob (B) at the pond)
45
Bob has a pet
(Figure: Alice (A) and Bob (B) at the pond)
46
The Problem
(Figure: Alice (A) and Bob (B) at the pond)
The pets don't get along
47
Formalizing the Problem
Two types of formal properties in asynchronous computation:
Safety Properties
–Nothing bad happens ever
Liveness Properties
–Something good happens eventually
48
Formalizing our Problem
Mutual Exclusion
–Both pets never in pond simultaneously
–This is a safety property
No Deadlock
–If only one wants in, it gets in
–If both want in, one gets in
–This is a liveness property
49
Simple Protocol
Idea
–Just look at the pond
Gotcha
–Trees obscure the view
50
Interpretation
Threads can't see what other threads are doing
Explicit communication required for coordination
51
Cell Phone Protocol
Idea
–Bob calls Alice (or vice-versa)
Gotcha
–Bob takes shower
–Alice recharges battery
–Bob out shopping for pet food …
52
Interpretation
Message-passing doesn't work
Recipient might not be
–Listening
–There at all
Communication must be
–Persistent (like writing)
–Not transient (like speaking)
53
Can Protocol
(Figure: a cola can)
54
Bob conveys a bit
(Figure: Alice (A), Bob (B), and a cola can)
55
Bob conveys a bit
(Figure: Alice (A), Bob (B), and a cola can)
56
Can Protocol
Idea
–Cans on Alice's windowsill
–Strings lead to Bob's house
–Bob pulls strings, knocks over cans
Gotcha
–Cans cannot be reused
–Bob runs out of cans
57
Interpretation
Cannot solve mutual exclusion with interrupts
–Sender sets fixed bit in receiver's space
–Receiver resets bit when ready
–Requires unbounded number of interrupt bits
58
Flag Protocol
(Figure: Alice (A) and Bob (B), each with a flag)
59
Alice's Protocol (sort of)
(Figure: Alice (A) and Bob (B), each with a flag)
60
Bob's Protocol (sort of)
(Figure: Alice (A) and Bob (B), each with a flag)
61
Alice's Protocol
Raise flag
Wait until Bob's flag is down
Unleash pet
Lower flag when pet returns
62
Bob's Protocol
Raise flag
Wait until Alice's flag is down
Unleash pet
Lower flag when pet returns
Danger! (If both raise their flags at the same time, each waits forever for the other.)
63
Bob's Protocol (2nd try)
Raise flag
While Alice's flag is up
–Lower flag
–Wait for Alice's flag to go down
–Raise flag
Unleash pet
Lower flag when pet returns
64
Bob's Protocol
Raise flag
While Alice's flag is up
–Lower flag
–Wait for Alice's flag to go down
–Raise flag
Unleash pet
Lower flag when pet returns
Bob defers to Alice
65
The Flag Principle
Raise the flag
Look at the other's flag
Flag Principle:
–If each raises and looks, then
–Last to look must see both flags up
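
The two flag protocols can be sketched in Java as follows. The assumptions are loud ones: each flag is modeled as a `volatile boolean` (a one-bit shared variable whose writes the other thread is guaranteed to see), the pond is modeled by a counter that flags a violation whenever two pets are inside at once, and the names `FlagProtocol`, `visitPond`, and `run` are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: Alice's protocol and Bob's second-try protocol with volatile flags.
class FlagProtocol {
    static volatile boolean aliceFlag = false;
    static volatile boolean bobFlag = false;
    static final AtomicInteger petsInPond = new AtomicInteger(0);
    static final AtomicInteger violations = new AtomicInteger(0);

    // The critical section: a pet enters and leaves the pond.
    static void visitPond() {
        if (petsInPond.incrementAndGet() > 1) violations.incrementAndGet();
        petsInPond.decrementAndGet();
    }

    static void alice(int visits) {
        for (int i = 0; i < visits; i++) {
            aliceFlag = true;                  // raise flag
            while (bobFlag) Thread.yield();    // wait until Bob's flag is down
            visitPond();                       // unleash pet
            aliceFlag = false;                 // lower flag when pet returns
        }
    }

    static void bob(int visits) {
        for (int i = 0; i < visits; i++) {
            bobFlag = true;                    // raise flag
            while (aliceFlag) {                // Bob defers to Alice
                bobFlag = false;               // lower flag
                while (aliceFlag) Thread.yield(); // wait for Alice's flag to go down
                bobFlag = true;                // raise flag again
            }
            visitPond();                       // unleash pet
            bobFlag = false;                   // lower flag when pet returns
        }
    }

    // Runs both protocols concurrently; returns the number of violations seen.
    static int run(int visits) {
        aliceFlag = false;
        bobFlag = false;
        petsInPond.set(0);
        violations.set(0);
        Thread a = new Thread(() -> alice(visits));
        Thread b = new Thread(() -> bob(visits));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return violations.get();
    }

    public static void main(String[] args) {
        System.out.println("violations: " + run(10_000));
    }
}
```

A run with zero violations is evidence, not proof. The proof is the flag-principle argument in the slides.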
66
Proof of Mutual Exclusion
Assume both pets in pond
–Derive a contradiction
–By reasoning backwards
Consider the last time Alice and Bob each looked before letting the pets in
Without loss of generality assume Alice was the last to look…
67
Proof
(Timeline figure, in time order: Bob last raised his flag; Bob's last look; Alice's last look. Alice last raised her flag some time before her own last look.)
Bob's flag was already up when Alice last looked, so Alice must have seen Bob's flag. A contradiction. QED
68
Proof of No Deadlock
If only one pet wants in, it gets in.
69
Proof of No Deadlock
If only one pet wants in, it gets in.
Deadlock requires both continually trying to get in.
70
Proof of No Deadlock
If only one pet wants in, it gets in.
Deadlock requires both continually trying to get in.
If Bob sees Alice's flag, he gives her priority (a gentleman…)
QED
71
Remarks
Protocol is unfair
–Bob's pet might never get in
Protocol uses waiting
–If Bob is eaten by his pet, Alice's pet might never get in
72
Moral of Story
Mutual Exclusion cannot be solved by
–transient communication (cell phones)
–interrupts (cans)
It can be solved by
–one-bit shared variables
–that can be read or written
73
The Arbiter Problem (an aside)
Pick a point
74
The Fable Continues
Alice and Bob fall in love & marry
75
The Fable Continues
Alice and Bob fall in love & marry
Then they fall out of love & divorce
–She gets the pets
–He has to feed them
Leading to a new coordination problem: Producer-Consumer
76
Bob Puts Food in the Pond
(Figure: Alice (A) watching the pond)
77
Alice Releases her Pets to Feed
(Figure: the pets eat, "mmm…"; Bob (B) stays out of the pond)
78
Producer/Consumer
Alice and Bob can't meet
–Each has restraining order on other
–So he puts food in the pond
–And later, she releases the pets
Avoid
–Releasing pets when there's no food
–Putting out food if uneaten food remains
79
Producer/Consumer
Need a mechanism so that
–Bob lets Alice know when food has been put out
–Alice lets Bob know when to put out more food
80
Surprise Solution
(Figure: Alice (A), Bob (B), and a cola can)
81
Bob Puts Food in Pond
(Figure: Alice (A), Bob (B), and a cola can)
82
Bob Knocks Over Can
(Figure: Alice (A), Bob (B), and a cola can)
83
Alice Releases Pets
(Figure: Alice (A), Bob (B), and a cola can; the pets eat, "yum…")
84
Alice Resets Can when Pets are Fed
(Figure: Alice (A), Bob (B), and a cola can)
85
Pseudocode
Alice's code:
while (true) {
  while (can.isUp()) {} // wait until Bob knocks the can over
  pet.release();
  pet.recapture();
  can.reset();
}
86
Pseudocode
Alice's code:
while (true) {
  while (can.isUp()) {} // wait until Bob knocks the can over
  pet.release();
  pet.recapture();
  can.reset();
}
Bob's code:
while (true) {
  while (can.isDown()) {} // wait until Alice resets the can
  pond.stockWithFood();
  can.knockOver();
}
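
A runnable sketch of the can protocol, with its assumptions stated up front: the can is a single `volatile boolean` (`canUp`), the food in the pond is a counter that records a violation if the pets find no food or Bob finds uneaten food, and both sides run a fixed number of rounds instead of forever. `CanProtocol`, `run`, and the field names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: producer/consumer coordination through one shared bit (the can).
class CanProtocol {
    static volatile boolean canUp = true; // can standing: no food waiting
    static final AtomicInteger food = new AtomicInteger(0);
    static final AtomicInteger meals = new AtomicInteger(0);
    static final AtomicInteger violations = new AtomicInteger(0);

    // Alice, the consumer.
    static void alice(int rounds) {
        for (int r = 0; r < rounds; r++) {
            while (canUp) Thread.yield();          // wait until Bob knocks the can over
            if (food.getAndSet(0) != 1) violations.incrementAndGet(); // pets found no food
            meals.incrementAndGet();               // pets released, fed, recaptured
            canUp = true;                          // reset the can
        }
    }

    // Bob, the producer.
    static void bob(int rounds) {
        for (int r = 0; r < rounds; r++) {
            while (!canUp) Thread.yield();         // wait until Alice resets the can
            if (food.getAndSet(1) != 0) violations.incrementAndGet(); // uneaten food remained
            canUp = false;                         // knock the can over
        }
    }

    // Returns the number of meals eaten, or -1 if a producer/consumer rule was broken.
    static int run(int rounds) {
        canUp = true;
        food.set(0);
        meals.set(0);
        violations.set(0);
        Thread a = new Thread(() -> alice(rounds));
        Thread b = new Thread(() -> bob(rounds));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return violations.get() == 0 ? meals.get() : -1;
    }

    public static void main(String[] args) {
        System.out.println("meals: " + run(1_000));
    }
}
```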
87
Correctness
Mutual Exclusion
–Pets and Bob never together in pond
88
Correctness
Mutual Exclusion
–Pets and Bob never together in pond
No Starvation
–If Bob always willing to feed, and pets always famished, then pets eat infinitely often.
89
Correctness
Mutual Exclusion (safety)
–Pets and Bob never together in pond
No Starvation (liveness)
–If Bob always willing to feed, and pets always famished, then pets eat infinitely often.
Producer/Consumer (safety)
–The pets never enter pond unless there is food, and Bob never provides food if there is unconsumed food.
90
Could Also Solve Using Flags
(Figure: Alice (A) and Bob (B), each with a flag)
91
Waiting
Both solutions use waiting
–while(mumble){}
Waiting is problematic
–If one participant is delayed
–So is everyone else
–But delays are common & unpredictable
92
The Fable drags on …
Bob and Alice still have issues
93
The Fable drags on …
Bob and Alice still have issues
So they need to communicate
94
The Fable drags on …
Bob and Alice still have issues
So they need to communicate
So they agree to use billboards …
95
Billboards are Large
(Figure: letter tiles A, B, C, D, E from a Scrabble box)
96
Write One Letter at a Time …
(Figure: tiles A, B, C, D, E in the box; tiles W, A, S, H placed on the billboard)
97
To post a message
(Figure: the billboard tiles spell WASH THE CAR; caption: whew)
98
Let's send another message
(Figure: tiles for the new message SELL LAVA LAMPS)
99
Uh-Oh
(Figure: tiles from the old and new messages are mixed on the billboard, reading SELL THE CAR; response: OK)
100
Readers/Writers
Devise a protocol so that
–Writer writes one letter at a time
–Reader reads one letter at a time
–Reader sees old message or new message
–No mixed messages
101
Readers/Writers (continued)
Easy with mutual exclusion
But mutual exclusion requires waiting
–One waits for the other
–Everyone executes sequentially
Remarkably
–We can solve R/W without mutual exclusion
102
Why do we care?
We want as much of the code as possible to execute concurrently (in parallel)
A larger sequential part implies reduced performance
Amdahl's law: this relation is not linear…
103
Amdahl's Law
Speedup = $\frac{1}{(1 - p) + \frac{p}{n}}$
…of computation given n CPUs instead of 1
104
Amdahl's Law
Speedup = $\frac{1}{(1 - p) + \frac{p}{n}}$
105
Amdahl's Law
Speedup = $\frac{1}{(1 - p) + \frac{p}{n}}$
Parallel fraction: p
106
Amdahl's Law
Speedup = $\frac{1}{(1 - p) + \frac{p}{n}}$
Parallel fraction: p
Sequential fraction: 1 - p
107
Amdahl's Law
Speedup = $\frac{1}{(1 - p) + \frac{p}{n}}$
Parallel fraction: p
Sequential fraction: 1 - p
Number of processors: n
108
Example
Ten processors
60% concurrent, 40% sequential
How close to 10-fold speedup?
109
Example
Ten processors
60% concurrent, 40% sequential
How close to 10-fold speedup?
Speedup = 2.17 = $\frac{1}{(1 - 0.6) + \frac{0.6}{10}}$
110
Example
Ten processors
80% concurrent, 20% sequential
How close to 10-fold speedup?
111
Example
Ten processors
80% concurrent, 20% sequential
How close to 10-fold speedup?
Speedup = 3.57 = $\frac{1}{(1 - 0.8) + \frac{0.8}{10}}$
112
Example
Ten processors
90% concurrent, 10% sequential
How close to 10-fold speedup?
113
Example
Ten processors
90% concurrent, 10% sequential
How close to 10-fold speedup?
Speedup = 5.26 = $\frac{1}{(1 - 0.9) + \frac{0.9}{10}}$
114
Example
Ten processors
99% concurrent, 1% sequential
How close to 10-fold speedup?
115
Example
Ten processors
99% concurrent, 1% sequential
How close to 10-fold speedup?
Speedup = 9.17 = $\frac{1}{(1 - 0.99) + \frac{0.99}{10}}$
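
The four examples can be reproduced with a few lines of Java. The sketch assumes the usual statement of Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the parallel fraction and n the number of processors; the class and method names are hypothetical.

```java
// Sketch: Amdahl's law for the slide examples (ten processors).
class Amdahl {
    // p: fraction of the work that can run in parallel; n: number of processors.
    static double speedup(double p, int n) {
        return 1.0 / ((1.0 - p) + p / n);
    }

    public static void main(String[] args) {
        double[] parallelFractions = {0.60, 0.80, 0.90, 0.99};
        for (double p : parallelFractions) {
            // Prints 2.17, 3.57, 5.26 and 9.17 for the four fractions.
            System.out.println(String.format("p=%.2f speedup=%.2f", p, speedup(p, 10)));
        }
    }
}
```

Even with 99% of the code parallel, ten processors fall short of a ten-fold speedup.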
116
The Moral
Making good use of our multiple processors (cores) means
Finding ways to effectively parallelize our code
–Minimize sequential parts
–Reduce idle time in which threads wait without
117
Multiprocessor Programming
This is what this course is about…
–The % that is not easy to make concurrent yet may have a large impact on overall speedup
Next week:
–A more serious look at mutual exclusion
–Homework Assignment 1 will be waiting for you on the course web page www.cs.tau.ac.il/~multi tomorrow.