Download presentation
Presentation is loading. Please wait.
Published byDouglas Thompson Modified over 9 years ago
1
Mutual Exclusion Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit
2
Art of Multiprocessor Programming 2 Mutual Exclusion Today we will try to formalize our understanding of mutual exclusion We will also use the opportunity to show you how to argue about and prove various properties in an asynchronous concurrent setting
3
Art of Multiprocessor Programming 3 Mutual Exclusion Formal problem definitions Solutions for 2 threads Solutions for n threads Fair solutions Inherent costs
4
Art of Multiprocessor Programming 4 Warning You will never use these protocols –Get over it You are advised to understand them –The same issues show up everywhere –Except hidden and more complex
5
Art of Multiprocessor Programming 5 Why is Concurrent Programming so Hard? Try preparing a seven-course banquet –By yourself –With one friend –With twenty-seven friends … Before we can talk about programs –Need a language –Describing time and concurrency
6
Art of Multiprocessor Programming 6 “Absolute, true and mathematical time, of itself and from its own nature, flows equably without relation to anything external.” (I. Newton, 1689) “Time is, like, Nature’s way of making sure that everything doesn’t happen all at once.” (Anonymous, circa 1968) Time time
7
Art of Multiprocessor Programming 7 time An event a 0 of thread A is –Instantaneous –No simultaneous events (break ties) a0a0 Events
8
Art of Multiprocessor Programming 8 time A thread A is (formally) a sequence a 0, a 1,... of events –“Trace” model –Notation: a 0 a 1 indicates order a0a0 Threads a1a1 a2a2 …
9
Art of Multiprocessor Programming 9 Assign to shared variable Assign to local variable Invoke method Return from method Lots of other things … Example Thread Events
10
Art of Multiprocessor Programming 10 Threads are State Machines Events are transitions a0a0 a1a1 a2a2 a3a3
11
Art of Multiprocessor Programming 11 States Thread State –Program counter –Local variables System state –Object fields (shared variables) –Union of thread states
12
Art of Multiprocessor Programming 12 time Thread A Concurrency
13
Art of Multiprocessor Programming 13 time Thread A Thread B Concurrency
14
Art of Multiprocessor Programming 14 time Interleavings Events of two or more threads –Interleaved –Not necessarily independent (why?)
15
Art of Multiprocessor Programming 15 time An interval A 0 =(a 0,a 1 ) is –Time between events a 0 and a 1 a0a0 a1a1 Intervals A0A0
16
Art of Multiprocessor Programming 16 time Intervals may Overlap a0a0 a1a1 A0A0 b0b0 b1b1 B0B0
17
Art of Multiprocessor Programming 17 time Intervals may be Disjoint a0a0 a1a1 A0A0 b0b0 b1b1 B0B0
18
Art of Multiprocessor Programming 18 time Precedence a0a0 a1a1 A0A0 b0b0 b1b1 B0B0 Interval A 0 precedes interval B 0
19
Art of Multiprocessor Programming 19 Precedence Notation: A 0 B 0 Formally, –End event of A 0 before start event of B 0 –Also called “happens before” or “precedes”
20
Art of Multiprocessor Programming 20 Precedence Ordering Remark: A 0 B 0 is just like saying –1066 AD 1492 AD, –Middle Ages Renaissance, Oh wait, –what about this week vs this month?
21
Art of Multiprocessor Programming 21 Precedence Ordering Never true that A A If A B then not true that B A If A B & B C then A C Funny thing: A B & B A might both be false!
22
Art of Multiprocessor Programming 22 Partial Orders (you may know this already) Irreflexive: –Never true that A A Antisymmetric: –If A B then not true that B A Transitive: –If A B & B C then A C
23
Art of Multiprocessor Programming 23 Total Orders (you may know this already) Also –Irreflexive –Antisymmetric –Transitive Except that for every distinct A, B, –Either A B or B A
24
Art of Multiprocessor Programming 24 Repeated Events while (mumble) { a 0 ; a 1 ; } a0ka0k k-th occurrence of event a 0 A0kA0k k-th occurrence of interval A 0 =(a 0,a 1 )
25
Art of Multiprocessor Programming 25 Implementing a Counter public class Counter { private long value; public long getAndIncrement() { temp = value; value = temp + 1; return temp; } Make these steps indivisible using locks
26
Art of Multiprocessor Programming 26 Locks (Mutual Exclusion) public interface Lock { public void lock(); public void unlock(); }
27
Art of Multiprocessor Programming 27 Locks (Mutual Exclusion) public interface Lock { public void lock(); public void unlock(); } acquire lock
28
Art of Multiprocessor Programming 28 Locks (Mutual Exclusion) public interface Lock { public void lock(); public void unlock(); } release lock acquire lock
29
Art of Multiprocessor Programming 29 Using Locks public class Counter { private long value; private Lock lock; public long getAndIncrement() { lock.lock(); try { int temp = value; value = value + 1; } finally { lock.unlock(); } return temp; }}
30
Art of Multiprocessor Programming 30 Using Locks public class Counter { private long value; private Lock lock; public long getAndIncrement() { lock.lock(); try { int temp = value; value = value + 1; } finally { lock.unlock(); } return temp; }} acquire Lock
31
Art of Multiprocessor Programming 31 Using Locks public class Counter { private long value; private Lock lock; public long getAndIncrement() { lock.lock(); try { int temp = value; value = value + 1; } finally { lock.unlock(); } return temp; }} Release lock (no matter what)
32
Art of Multiprocessor Programming 32 Using Locks public class Counter { private long value; private Lock lock; public long getAndIncrement() { lock.lock(); try { int temp = value; value = value + 1; } finally { lock.unlock(); } return temp; }} Critical section
33
Art of Multiprocessor Programming 33 Mutual Exclusion Let CS i k be thread i’s k-th critical section execution
34
Art of Multiprocessor Programming 34 Mutual Exclusion Let CS i k be thread i’s k-th critical section execution And CS j m be thread j’s m-th critical section execution
35
Art of Multiprocessor Programming 35 Mutual Exclusion Let CS i k be thread i’s k-th critical section execution And CS j m be j’s m-th execution Then either – or
36
Art of Multiprocessor Programming 36 Mutual Exclusion Let CS i k be thread i’s k-th critical section execution And CS j m be j’s m-th execution Then either – or CS i k CS j m
37
Art of Multiprocessor Programming 37 Mutual Exclusion Let CS i k be thread i’s k-th critical section execution And CS j m be j’s m-th execution Then either – or CS i k CS j m CS j m CS i k
38
Art of Multiprocessor Programming 38 Deadlock-Free If some thread calls lock() –And never returns –Then other threads must complete lock() and unlock() calls infinitely often System as a whole makes progress –Even if individuals starve
39
Art of Multiprocessor Programming 39 Starvation-Free If some thread calls lock() –It will eventually return Individual threads make progress
40
Art of Multiprocessor Programming 40 Two-Thread vs n -Thread Solutions Two-thread solutions first –Illustrate most basic ideas –Fits on one slide Then n-Thread solutions
41
Art of Multiprocessor Programming 41 class … implements Lock { … // thread-local index, 0 or 1 public void lock() { int i = ThreadID.get(); int j = 1 - i; … } Two-Thread Conventions
42
Art of Multiprocessor Programming 42 class … implements Lock { … // thread-local index, 0 or 1 public void lock() { int i = ThreadID.get(); int j = 1 - i; … } Two-Thread Conventions Henceforth: i is current thread, j is other thread
43
Art of Multiprocessor Programming 43 LockOne class LockOne implements Lock { private volatile boolean[] flag = new boolean[2]; public void lock() { flag[i] = true; while (flag[j]) {} }
44
Art of Multiprocessor Programming 44 LockOne class LockOne implements Lock { private volatile boolean[] flag = new boolean[2]; public void lock() { flag[i] = true; while (flag[j]) {} } Set my flag
45
Art of Multiprocessor Programming 45 class LockOne implements Lock { private volatile boolean[] flag = new boolean[2]; public void lock() { flag[i] = true; while (flag[j]) {} } LockOne Wait for other flag to go false Set my flag
46
Art of Multiprocessor Programming 46 Assume CS A j overlaps CS B k Consider each thread's last (j-th and k-th) read and write in the lock() method before entering Derive a contradiction LockOne Satisfies Mutual Exclusion
47
Art of Multiprocessor Programming 47 write A (flag[A]=true) read A (flag[B]==false) CS A write B (flag[B]=true) read B (flag[A]==false) CS B From the Code class LockOne implements Lock { … public void lock() { flag[i] = true; while (flag[j]) {} }
48
Art of Multiprocessor Programming 48 read A (flag[B]==false) write B (flag[B]=true) read B (flag[A]==false) write A (flag[B]=true) From the Assumption
49
Art of Multiprocessor Programming 49 Assumptions: –read A (flag[B]==false) write B (flag[B]=true) –read B (flag[A]==false) write A (flag[A]=true) From the code –write A (flag[A]=true) read A (flag[B]==false) –write B (flag[B]=true) read B (flag[A]==false) Combining
50
Art of Multiprocessor Programming 50 Assumptions: –read A (flag[B]==false) write B (flag[B]=true) –read B (flag[A]==false) write A (flag[A]=true) From the code –write A (flag[A]=true) read A (flag[B]==false) –write B (flag[B]=true) read B (flag[A]==false) Combining
51
Art of Multiprocessor Programming 51 Assumptions: –read A (flag[B]==false) write B (flag[B]=true) –read B (flag[A]==false) write A (flag[A]=true) From the code –write A (flag[A]=true) read A (flag[B]==false) –write B (flag[B]=true) read B (flag[A]==false) Combining
52
Art of Multiprocessor Programming 52 Assumptions: –read A (flag[B]==false) write B (flag[B]=true) –read B (flag[A]==false) write A (flag[A]=true) From the code –write A (flag[A]=true) read A (flag[B]==false) –write B (flag[B]=true) read B (flag[A]==false) Combining
53
Art of Multiprocessor Programming 53 Assumptions: –read A (flag[B]==false) write B (flag[B]=true) –read B (flag[A]==false) write A (flag[A]=true) From the code –write A (flag[A]=true) read A (flag[B]==false) –write B (flag[B]=true) read B (flag[A]==false) Combining
54
Art of Multiprocessor Programming 54 Assumptions: –read A (flag[B]==false) write B (flag[B]=true) –read B (flag[A]==false) write A (flag[A]=true) From the code –write A (flag[A]=true) read A (flag[B]==false) –write B (flag[B]=true) read B (flag[A]==false) Combining
55
Art of Multiprocessor Programming 55 Cycle!
56
Art of Multiprocessor Programming 56 Deadlock Freedom LockOne Fails deadlock-freedom –Concurrent execution can deadlock –Sequential executions OK flag[i] = true; flag[j] = true; while (flag[j]){} while (flag[i]){}
57
Art of Multiprocessor Programming 57 LockTwo public class LockTwo implements Lock { private volatile int victim; public void lock() { victim = i; while (victim == i) {}; } public void unlock() {} }
58
Art of Multiprocessor Programming 58 LockTwo public class LockTwo implements Lock { private volatile int victim; public void lock() { victim = i; while (victim == i) {}; } public void unlock() {} } Let other go first
59
Art of Multiprocessor Programming 59 LockTwo public class LockTwo implements Lock { private volatile int victim; public void lock() { victim = i; while (victim == i) {}; } public void unlock() {} } Wait for permission
60
Art of Multiprocessor Programming 60 LockTwo public class Lock2 implements Lock { private volatile int victim; public void lock() { victim = i; while (victim == i) {}; } public void unlock() {} } Nothing to do
61
Art of Multiprocessor Programming 61 public void LockTwo() { victim = i; while (victim == i) {}; } LockTwo Claims Satisfies mutual exclusion –If thread i in CS –Then victim == j –Cannot be both 0 and 1 Not deadlock free –Sequential execution deadlocks –Concurrent execution does not
62
Art of Multiprocessor Programming 62 Peterson’s Algorithm public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; }
63
Art of Multiprocessor Programming 63 Peterson’s Algorithm public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; } Announce I’m interested
64
Art of Multiprocessor Programming 64 Peterson’s Algorithm public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; } Announce I’m interested Defer to other
65
Art of Multiprocessor Programming 65 Peterson’s Algorithm public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; } Announce I’m interested Defer to other Wait while other interested & I’m the victim
66
Art of Multiprocessor Programming 66 Peterson’s Algorithm public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; } Announce I’m interested Defer to other Wait while other interested & I’m the victim No longer interested
67
Art of Multiprocessor Programming 67 public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; Mutual Exclusion If thread 1 in critical section, –flag[1]=true, –victim = 0 If thread 0 in critical section, –flag[0]=true, –victim = 1 Cannot both be true
68
Art of Multiprocessor Programming 68 Deadlock Free Thread blocked –only at while loop –only if it is the victim One or the other must not be the victim public void lock() { … while (flag[j] && victim == i) {};
69
Art of Multiprocessor Programming 69 Starvation Free Thread i blocked only if j repeatedly re-enters so that flag[j] == true and victim == i When j re-enters –it sets victim to j. –So i gets in public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; }
70
Art of Multiprocessor Programming 70 The Filter Algorithm for n Threads There are n-1 “waiting rooms” called levels At each level –At least one enters level –At least one blocked if many try Only one thread makes it through ncs cs
71
Art of Multiprocessor Programming 71 Filter class Filter implements Lock { volatile int[] level; // level[i] for thread i volatile int[] victim; // victim[L] for level L public Filter(int n) { level = new int[n]; victim = new int[n]; for (int i = 1; i < n; i++) { level[i] = 0; }} … } n -1 0 1 0000004 2 2 Thread 2 at level 4 0 4 level victim
72
Art of Multiprocessor Programming 72 Filter class Filter implements Lock { … public void lock(){ for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while (( k != i level[k] >= L) && victim[L] == i ); }} public void unlock() { level[i] = 0; }}
73
Art of Multiprocessor Programming 73 class Filter implements Lock { … public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while (( k != i) level[k] >= L) && victim[L] == i); }} public void release(int i) { level[i] = 0; }} Filter One level at a time
74
Art of Multiprocessor Programming 74 class Filter implements Lock { … public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while (( k != i) level[k] >= L) && victim[L] == i); // busy wait }} public void release(int i) { level[i] = 0; }} Filter Announce intention to enter level L
75
Art of Multiprocessor Programming 75 class Filter implements Lock { int level[n]; int victim[n]; public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while (( k != i) level[k] >= L) && victim[L] == i); }} public void release(int i) { level[i] = 0; }} Filter Give priority to anyone but me
76
Art of Multiprocessor Programming 76 class Filter implements Lock { int level[n]; int victim[n]; public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while (( k != i) level[k] >= L) && victim[L] == i); }} public void release(int i) { level[i] = 0; }} Filter Wait as long as someone else is at same or higher level, and I’m designated victim
77
Art of Multiprocessor Programming 77 class Filter implements Lock { int level[n]; int victim[n]; public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while (( k != i) level[k] >= L) && victim[L] == i); }} public void release(int i) { level[i] = 0; }} Filter Thread enters level L when it completes the loop
78
Art of Multiprocessor Programming 78 Claim Start at level L=0 At most n-L threads enter level L Mutual exclusion at level L=n-1 ncs csL=n-1 L=1 L=n-2 L=0
79
Art of Multiprocessor Programming 79 Induction Hypothesis Assume all at level L-1 enter level L A last to write victim[L] B is any other thread at level L No more than n-L+1 at level L-1 Induction step: by contradiction ncs cs L-1 has n-L+1 L has n-L assume prove
80
Art of Multiprocessor Programming 80 Proof Structure ncs cs Assumed to enter L-1 By way of contradiction all enter L n-L+1 = 4 A B Last to write victim[L] Show that A must have seen B at level L and since victim[L] == A could not have entered
81
Art of Multiprocessor Programming 81 From the Code (1) write B (level[B]=L) write B (victim[L]=B) public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while (( k != i) level[k] >= L) && victim[L] == i) {}; }}
82
Art of Multiprocessor Programming 82 From the Code (2) write A (victim[L]=A) read A (level[B]) public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while (( k != i) level[k] >= L) && victim[L] == i) {}; }}
83
Art of Multiprocessor Programming 83 By Assumption By assumption, A is the last thread to write victim[L] (3) write B (victim[L]=B) write A (victim[L]=A)
84
Art of Multiprocessor Programming 84 Combining Observations (1) write B (level[B]=L) write B (victim[L]=B) (3) write B (victim[L]=B) write A (victim[L]=A) (2) write A (victim[L]=A) read A (level[B])
85
Art of Multiprocessor Programming 85 public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while (( k != i) level[k] >= L) && victim[L] == i) {}; }} Combining Observations (1) write B (level[B]=L) write B (victim[L]=B) (3) write B (victim[L]=B) write A (victim[L]=A) (2) write A (victim[L]=A) read A (level[B])
86
Art of Multiprocessor Programming 86 Combining Observations (1) write B (level[B]=L) write B (victim[L]=B) (3) write B (victim[L]=B) write A (victim[L]=A) (2) write A (victim[L]=A) read A (level[B]) Thus, A read level[B] ≥ L, A was last to write victim[L], so it could not have entered level L!
87
Art of Multiprocessor Programming 87 No Starvation Filter Lock satisfies properties: –Just like Peterson Alg at any level –So no one starves But what about fairness? –Threads can be overtaken by others
88
Art of Multiprocessor Programming 88 Bounded Waiting Want stronger fairness guarantees Thread not “overtaken” too much Need to adjust definitions ….
89
Art of Multiprocessor Programming 89 Bounded Waiting Divide lock() method into 2 parts: –Doorway interval: Written D A always finishes in finite steps –Waiting interval: Written W A may take unbounded steps
90
Art of Multiprocessor Programming 90 Bakery Algorithm Provides First-Come-First-Served How? –Take a “number” –Wait until lower numbers have been served Lexicographic order –(a,i) > (b,j) If a > b, or a = b and i > j
91
Art of Multiprocessor Programming 91 Bakery Algorithm class Bakery implements Lock { volatile boolean[] flag; volatile Label[] label; public Bakery (int n) { flag = new boolean[n]; label = new Label[n]; for (int i = 0; i < n; i++) { flag[i] = false; label[i] = 0; } …
92
Art of Multiprocessor Programming 92 Bakery Algorithm class Bakery implements Lock { volatile boolean[] flag; volatile Label[] label; public Bakery (int n) { flag = new boolean[n]; label = new Label[n]; for (int i = 0; i < n; i++) { flag[i] = false; label[i] = 0; } … n -1 0 fffftft 2 f 0000504 0 6 CS
93
Art of Multiprocessor Programming 93 Bakery Algorithm class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while ( k flag[k] && (label[i],i) > (label[k],k)); }
94
Art of Multiprocessor Programming 94 Bakery Algorithm class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while ( k flag[k] && (label[i],i) > (label[k],k)); } Doorway
95
Art of Multiprocessor Programming 95 Bakery Algorithm class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while ( k flag[k] && (label[i],i) > (label[k],k)); } I’m interested
96
Art of Multiprocessor Programming 96 Bakery Algorithm class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while ( k flag[k] && (label[i],i) > (label[k],k)); } Take increasing label (read labels in some arbitrary order)
97
Art of Multiprocessor Programming 97 Bakery Algorithm class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while ( k flag[k] && (label[i],i) > (label[k],k)); } Someone is interested
98
Art of Multiprocessor Programming 98 Bakery Algorithm class Bakery implements Lock { boolean flag[n]; int label[n]; public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while ( k flag[k] && (label[i],i) > (label[k],k)); } Someone is interested With lower (label,i) in lexicographic order
99
Art of Multiprocessor Programming 99 Bakery Algorithm class Bakery implements Lock { … public void unlock() { flag[i] = false; }
100
Art of Multiprocessor Programming 100 Bakery Algorithm class Bakery implements Lock { … public void unlock() { flag[i] = false; } No longer interested labels are always increasing
101
Art of Multiprocessor Programming 101 No Deadlock There is always one thread with earliest label Ties are impossible (why?)
102
Art of Multiprocessor Programming 102 First-Come-First-Served If D A D B then A’s label is earlier –write A (label[A]) read B (label[A]) write B (label[B]) read B (flag[A]) So B is locked out while flag[A] is true class Bakery implements Lock { public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while ( k flag[k] && (label[i],i) > (label[k],k)); }
103
Art of Multiprocessor Programming 103 Mutual Exclusion Suppose A and B in CS together Suppose A has earlier label When B entered, it must have seen –flag[A] is false, or –label[A] > label[B] class Bakery implements Lock { public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while ( k flag[k] && (label[i],i) > (label[k],k)); }
104
Art of Multiprocessor Programming 104 Mutual Exclusion Labels are strictly increasing so B must have seen flag[A] == false
105
Art of Multiprocessor Programming 105 Mutual Exclusion Labels are strictly increasing so B must have seen flag[A] == false Labeling B read B (flag[A]) write A (flag[A]) Labeling A
106
Art of Multiprocessor Programming 106 Mutual Exclusion Labels are strictly increasing so B must have seen flag[A] == false Labeling B read B (flag[A]) write A (flag[A]) Labeling A Which contradicts the assumption that A has an earlier label
107
Art of Multiprocessor Programming 107 Bakery Y2 32 K Bug class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while ( k flag[k] && (label[i],i) > (label[k],k)); }
108
Art of Multiprocessor Programming 108 Bakery Y2 32 K Bug class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while ( k flag[k] && (label[i],i) > (label[k],k)); } Mutex breaks if label[i] overflows
109
Art of Multiprocessor Programming 109 Does Overflow Actually Matter? Yes –Y2K –18 January 2038 (Unix time_t rollover) –16-bit counters No –64-bit counters Maybe –32-bit counters
110
Art of Multiprocessor Programming© Herlihy-Shavit 2007 110 Revisit Mutual Exclusion... Think of performance, not just correctness and progress Begin to understand how performance depends on our software properly utilizing the multiprocessor machine’s hardware And get to know a collection of locking algorithms… (1)
111
Art of Multiprocessor Programming© Herlihy-Shavit 2007 111 What Should you do if you can’t get a lock? Keep trying –“spin” or “busy-wait” –Good if delays are short Give up the processor –Good if delays are long –Always good on uniprocessor (1)
112
Art of Multiprocessor Programming© Herlihy-Shavit 2007 112 What Should you do if you can’t get a lock? Keep trying –“spin” or “busy-wait” –Good if delays are short Give up the processor –Good if delays are long –Always good on uniprocessor our focus
113
Art of Multiprocessor Programming© Herlihy-Shavit 2007 113 Basic Spin-Lock CS Resets lock upon exit spin lock critical section...
114
Art of Multiprocessor Programming© Herlihy-Shavit 2007 114 Basic Spin-Lock CS Resets lock upon exit spin lock critical section... …lock introduces sequential bottleneck
115
Art of Multiprocessor Programming© Herlihy-Shavit 2007 115 Basic Spin-Lock CS Resets lock upon exit spin lock critical section... …lock suffers from contention
116
Art of Multiprocessor Programming© Herlihy-Shavit 2007 116 Basic Spin-Lock CS Resets lock upon exit spin lock critical section... Notice: these are distinct phenomena …lock suffers from contention
117
Art of Multiprocessor Programming© Herlihy-Shavit 2007 117 Basic Spin-Lock CS Resets lock upon exit spin lock critical section... …lock suffers from contention Seq Bottleneck no parallelism
118
Art of Multiprocessor Programming© Herlihy-Shavit 2007 118 Basic Spin-Lock CS Resets lock upon exit spin lock critical section... Contention ??? …lock suffers from contention
119
Art of Multiprocessor Programming© Herlihy-Shavit 2007 119 Review: Test-and-Set Boolean value Test-and-set (TAS) –Swap true with current value –Return value tells if prior value was true or false Can reset just by writing false TAS aka “getAndSet”
120
Art of Multiprocessor Programming© Herlihy-Shavit 2007 120 Review: Test-and-Set public class AtomicBoolean { boolean value; public synchronized boolean getAndSet(boolean newValue) { boolean prior = value; value = newValue; return prior; } (5)
121
Art of Multiprocessor Programming© Herlihy-Shavit 2007 121 Review: Test-and-Set public class AtomicBoolean { boolean value; public synchronized boolean getAndSet(boolean newValue) { boolean prior = value; value = newValue; return prior; } Package java.util.concurrent.atomic
122
Art of Multiprocessor Programming© Herlihy-Shavit 2007 122 Review: Test-and-Set public class AtomicBoolean { boolean value; public synchronized boolean getAndSet(boolean newValue) { boolean prior = value; value = newValue; return prior; } Swap old and new values
123
Art of Multiprocessor Programming© Herlihy-Shavit 2007 123 Review: Test-and-Set AtomicBoolean lock = new AtomicBoolean(false) … boolean prior = lock.getAndSet(true)
124
Art of Multiprocessor Programming© Herlihy-Shavit 2007 124 Review: Test-and-Set AtomicBoolean lock = new AtomicBoolean(false) … boolean prior = lock.getAndSet(true) (5) Swapping in true is called “test-and-set” or TAS
125
Art of Multiprocessor Programming© Herlihy-Shavit 2007 125 Test-and-Set Locks Locking –Lock is free: value is false –Lock is taken: value is true Acquire lock by calling TAS –If result is false, you win –If result is true, you lose Release lock by writing false
126
Art of Multiprocessor Programming© Herlihy-Shavit 2007 126 Test-and-set Lock class TASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (state.getAndSet(true)) {} } void unlock() { state.set(false); }}
127
Art of Multiprocessor Programming© Herlihy-Shavit 2007 127 Test-and-set Lock class TASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (state.getAndSet(true)) {} } void unlock() { state.set(false); }} Lock state is AtomicBoolean
128
Art of Multiprocessor Programming© Herlihy-Shavit 2007 128 Test-and-set Lock class TASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (state.getAndSet(true)) {} } void unlock() { state.set(false); }} Keep trying until lock acquired
129
Art of Multiprocessor Programming© Herlihy-Shavit 2007 129 Test-and-set Lock class TASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (state.getAndSet(true)) {} } void unlock() { state.set(false); }} Release lock by resetting state to false
130
Art of Multiprocessor Programming© Herlihy-Shavit 2007 130 Space Complexity TAS spin-lock has small “footprint” N thread spin-lock uses O(1) space As opposed to O(n) Peterson/Bakery How did we overcome the (n) lower bound? We used a RMW operation…
131
Art of Multiprocessor Programming© Herlihy-Shavit 2007 131 Performance Experiment –n threads –Increment shared counter 1 million times How long should it take? How long does it take?
132
Art of Multiprocessor Programming© Herlihy-Shavit 2007 132 Graph ideal time threads no speedup because of sequential bottleneck
133
Art of Multiprocessor Programming© Herlihy-Shavit 2007 133 Mystery #1 time threads TAS lock Ideal (1) What is going on?
134
Art of Multiprocessor Programming© Herlihy-Shavit 2007 134 Test-and-Test-and-Set Locks Lurking stage –Wait until lock “looks” free –Spin while read returns true (lock taken) Pouncing state –As soon as lock “looks” available –Read returns false (lock free) –Call TAS to acquire lock –If TAS loses, back to lurking
135
Art of Multiprocessor Programming© Herlihy-Shavit 2007 135 Test-and-test-and-set Lock class TTASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (true) { while (state.get()) {} if (!state.getAndSet(true)) return; }
136
Art of Multiprocessor Programming© Herlihy-Shavit 2007 136 Test-and-test-and-set Lock class TTASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (true) { while (state.get()) {} if (!state.getAndSet(true)) return; } Wait until lock looks free
137
Art of Multiprocessor Programming© Herlihy-Shavit 2007 137 Test-and-test-and-set Lock class TTASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (true) { while (state.get()) {} if (!state.getAndSet(true)) return; } Then try to acquire it
138
Art of Multiprocessor Programming© Herlihy-Shavit 2007 138 Mystery #2 TAS lock TTAS lock Ideal time threads
139
Art of Multiprocessor Programming© Herlihy-Shavit 2007 139 Mystery Both –TAS and TTAS –Do the same thing (in our model) Except that –TTAS performs much better than TAS –Neither approaches ideal
140
Art of Multiprocessor Programming© Herlihy-Shavit 2007 140 Opinion Our memory abstraction is broken TAS & TTAS methods –Are provably the same (in our model) –Except they aren’t (in field tests) Need a more detailed model …
141
Art of Multiprocessor Programming© Herlihy-Shavit 2007 141 Bus-Based Architectures Bus cache memory cache
142
Art of Multiprocessor Programming© Herlihy-Shavit 2007 142 Bus-Based Architectures Bus cache memory cache Random access memory (10s of cycles)
143
Art of Multiprocessor Programming© Herlihy-Shavit 2007 143 Bus-Based Architectures cache memory cache Shared Bus broadcast medium One broadcaster at a time Processors and memory all “snoop” Bus
144
Art of Multiprocessor Programming© Herlihy-Shavit 2007 144 Bus-Based Architectures Bus cache memory cache Per-Processor Caches Small Fast: 1 or 2 cycles Address & state information
145
Art of Multiprocessor Programming© Herlihy-Shavit 2007 145 Jargon Watch Cache hit –“I found what I wanted in my cache” –Good Thing™
146
Art of Multiprocessor Programming© Herlihy-Shavit 2007 146 Jargon Watch Cache hit –“I found what I wanted in my cache” –Good Thing™ Cache miss –“I had to shlep all the way to memory for that data” –Bad Thing™
147
Art of Multiprocessor Programming© Herlihy-Shavit 2007 147 Cave Canem This model is still a simplification –But not in any essential way –Illustrates basic principles Will discuss complexities later
148
Art of Multiprocessor Programming© Herlihy-Shavit 2007 148 Bus Processor Issues Load Request cache memory cache data
149
Art of Multiprocessor Programming© Herlihy-Shavit 2007 149 Bus Processor Issues Load Request Bus cache memory cache data Gimme data
150
Art of Multiprocessor Programming© Herlihy-Shavit 2007 150 cache Bus Memory Responds Bus memory cache data Got your data right here data
151
Art of Multiprocessor Programming© Herlihy-Shavit 2007 151 Bus Processor Issues Load Request memory cache data Gimme data
152
Art of Multiprocessor Programming© Herlihy-Shavit 2007 152 Bus Processor Issues Load Request Bus memory cache data Gimme data
153
Art of Multiprocessor Programming© Herlihy-Shavit 2007 153 Bus Processor Issues Load Request Bus memory cache data I got data
154
Art of Multiprocessor Programming© Herlihy-Shavit 2007 154 Bus Other Processor Responds memory cache data I got data data Bus
155
Art of Multiprocessor Programming© Herlihy-Shavit 2007 155 Bus Other Processor Responds memory cache data Bus
156
Art of Multiprocessor Programming© Herlihy-Shavit 2007 156 Modify Cached Data Bus data memory cachedata (1)
157
Art of Multiprocessor Programming© Herlihy-Shavit 2007 157 Modify Cached Data Bus data memory cachedata (1)
158
Art of Multiprocessor Programming© Herlihy-Shavit 2007 158 memory Bus data Modify Cached Data cachedata
159
Art of Multiprocessor Programming© Herlihy-Shavit 2007 159 memory Bus data Modify Cached Data cache What’s up with the other copies? data
160
Art of Multiprocessor Programming© Herlihy-Shavit 2007 160 Cache Coherence We have lots of copies of data –Original copy in memory –Cached copies at processors Some processor modifies its own copy –What do we do with the others? –How to avoid confusion?
161
Art of Multiprocessor Programming© Herlihy-Shavit 2007 161 Write-Back Caches Accumulate changes in cache Write back when needed –Need the cache for something else –Another processor wants it On first modification –Invalidate other entries –Requires non-trivial protocol …
162
Art of Multiprocessor Programming© Herlihy-Shavit 2007 162 Write-Back Caches Cache entry has three states –Invalid: contains raw seething bits –Valid: I can read but I can’t write –Dirty: Data has been modified Intercept other load requests Write back to memory before using cache
163
Art of Multiprocessor Programming© Herlihy-Shavit 2007 163 Bus Invalidate memory cachedata
164
Art of Multiprocessor Programming© Herlihy-Shavit 2007 164 Bus Invalidate Bus memory cachedata Mine, all mine!
165
Art of Multiprocessor Programming© Herlihy-Shavit 2007 165 Bus Invalidate Bus memory cachedata cache Uh,oh
166
Art of Multiprocessor Programming© Herlihy-Shavit 2007 166 cache Bus Invalidate memory cachedata Other caches lose read permission
167
Art of Multiprocessor Programming© Herlihy-Shavit 2007 167 cache Bus Invalidate memory cachedata Other caches lose read permission This cache acquires write permission
168
Art of Multiprocessor Programming© Herlihy-Shavit 2007 168 cache Bus Invalidate memory cachedata Memory provides data only if not present in any cache, so no need to change it now (expensive) (2)
169
Art of Multiprocessor Programming© Herlihy-Shavit 2007 169 cache Bus Another Processor Asks for Data memory cachedata (2) Bus
170
Art of Multiprocessor Programming© Herlihy-Shavit 2007 170 cache data Bus Owner Responds memory cachedata (2) Bus Here it is!
171
Art of Multiprocessor Programming© Herlihy-Shavit 2007 171 Bus End of the Day … memory cachedata (1) Reading OK, no writing data
172
Art of Multiprocessor Programming© Herlihy-Shavit 2007 172 Mutual Exclusion What do we want to optimize? –Bus bandwidth used by spinning threads –Release/Acquire latency –Acquire latency for idle lock
173
Art of Multiprocessor Programming© Herlihy-Shavit 2007 173 Simple TASLock TAS invalidates cache lines Spinners –Miss in cache –Go to bus Thread wants to release lock –delayed behind spinners
174
Art of Multiprocessor Programming© Herlihy-Shavit 2007 174 Test-and-test-and-set Wait until lock “looks” free –Spin on local cache –No bus use while lock busy Problem: when lock is released –Invalidation storm …
175
Art of Multiprocessor Programming© Herlihy-Shavit 2007 175 Local Spinning while Lock is Busy Bus memory busy
176
Art of Multiprocessor Programming© Herlihy-Shavit 2007 176 Bus On Release memory freeinvalid free
177
Art of Multiprocessor Programming© Herlihy-Shavit 2007 177 On Release Bus memory freeinvalid free miss Everyone misses, rereads (1)
178
Art of Multiprocessor Programming© Herlihy-Shavit 2007 178 On Release Bus memory freeinvalid free TAS(…) Everyone tries TAS (1)
179
Art of Multiprocessor Programming© Herlihy-Shavit 2007 179 Problems Everyone misses –Reads satisfied sequentially Everyone does TAS –Invalidates others’ caches Eventually quiesces after lock acquired –How long does this take?
180
Art of Multiprocessor Programming© Herlihy-Shavit 2007 180 Mystery Explained TAS lock TTAS lock Ideal time threads Better than TAS but still not as good as ideal
181
Art of Multiprocessor Programming© Herlihy-Shavit 2007 181 Solution: Introduce Delay spin lock time d r1dr1d r2dr2d If the lock looks free But I fail to get it There must be lots of contention Better to back off than to collide again
182
Art of Multiprocessor Programming© Herlihy-Shavit 2007 182 Dynamic Example: Exponential Backoff time d 2d4d spin lock If I fail to get lock –wait random duration before retry –Each subsequent failure doubles expected wait
183
Art of Multiprocessor Programming© Herlihy-Shavit 2007 183 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}}
184
Art of Multiprocessor Programming© Herlihy-Shavit 2007 184 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}} Fix minimum delay
185
Art of Multiprocessor Programming© Herlihy-Shavit 2007 185 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}} Wait until lock looks free
186
Art of Multiprocessor Programming© Herlihy-Shavit 2007 186 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}} If we win, return
187
Art of Multiprocessor Programming© Herlihy-Shavit 2007 187 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}} Back off for random duration
188
Art of Multiprocessor Programming© Herlihy-Shavit 2007 188 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}} Double max delay, within reason
189
Art of Multiprocessor Programming© Herlihy-Shavit 2007 189 Spin-Waiting Overhead TTAS Lock Backoff lock time threads
190
Art of Multiprocessor Programming© Herlihy-Shavit 2007 190 Backoff: Other Issues Good –Easy to implement –Beats TTAS lock Bad –Must choose parameters carefully –Not portable across platforms
191
Art of Multiprocessor Programming© Herlihy-Shavit 2007 191 Idea Avoid useless invalidations –By keeping a queue of threads Each thread –Notifies next in line –Without bothering the others
192
Art of Multiprocessor Programming© Herlihy-Shavit 2007 192 Anderson Queue Lock flags next TFFFFFFF idle
193
Art of Multiprocessor Programming© Herlihy-Shavit 2007 193 Anderson Queue Lock flags next TFFFFFFF acquiring getAndIncrement
194
Art of Multiprocessor Programming© Herlihy-Shavit 2007 194 Anderson Queue Lock flags next TFFFFFFF acquiring getAndIncrement
195
Art of Multiprocessor Programming© Herlihy-Shavit 2007 195 Anderson Queue Lock flags next TFFFFFFF acquired Mine!
196
Art of Multiprocessor Programming© Herlihy-Shavit 2007 196 Anderson Queue Lock flags next TFFFFFFF acquired acquiring
197
Art of Multiprocessor Programming© Herlihy-Shavit 2007 197 Anderson Queue Lock flags next TFFFFFFF acquired acquiring getAndIncrement
198
Art of Multiprocessor Programming© Herlihy-Shavit 2007 198 Anderson Queue Lock flags next TFFFFFFF acquired acquiring getAndIncrement
199
Art of Multiprocessor Programming© Herlihy-Shavit 2007 199 acquired Anderson Queue Lock flags next TFFFFFFF acquiring
200
Art of Multiprocessor Programming© Herlihy-Shavit 2007 200 released Anderson Queue Lock flags next TTFFFFFF acquired
201
Art of Multiprocessor Programming© Herlihy-Shavit 2007 201 released Anderson Queue Lock flags next TTFFFFFF acquired Yow!
202
Art of Multiprocessor Programming© Herlihy-Shavit 2007 202 Anderson Queue Lock class ALock implements Lock { boolean[] flags={true,false,…,false}; AtomicInteger next = new AtomicInteger(0); int[] slot = new int[n];
203
Art of Multiprocessor Programming© Herlihy-Shavit 2007 203 Anderson Queue Lock class ALock implements Lock { boolean[] flags={true,false,…,false}; AtomicInteger next = new AtomicInteger(0); int[] slot = new int[n]; One flag per thread
204
Art of Multiprocessor Programming© Herlihy-Shavit 2007 204 Anderson Queue Lock class ALock implements Lock { boolean[] flags={true,false,…,false}; AtomicInteger next = new AtomicInteger(0); int[] slot = new int[n]; Next flag to use
205
Art of Multiprocessor Programming© Herlihy-Shavit 2007 205 Anderson Queue Lock class ALock implements Lock { boolean[] flags={true,false,…,false}; AtomicInteger next = new AtomicInteger(0); ThreadLocal mySlot; Thread-local variable
206
Art of Multiprocessor Programming© Herlihy-Shavit 2007 206 Anderson Queue Lock public lock() { int mySlot = next.getAndIncrement(); while (!flags[mySlot % n]) {}; flags[mySlot % n] = false; } public unlock() { flags[(mySlot+1) % n] = true; }
207
Art of Multiprocessor Programming© Herlihy-Shavit 2007 207 Anderson Queue Lock public lock() { int mySlot = next.getAndIncrement(); while (!flags[mySlot % n]) {}; flags[mySlot % n] = false; } public unlock() { flags[(mySlot+1) % n] = true; } Take next slot
208
Art of Multiprocessor Programming© Herlihy-Shavit 2007 208 Anderson Queue Lock public lock() { int mySlot = next.getAndIncrement(); while (!flags[mySlot % n]) {}; flags[mySlot % n] = false; } public unlock() { flags[(mySlot+1) % n] = true; } Spin until told to go
209
Art of Multiprocessor Programming© Herlihy-Shavit 2007 209 Anderson Queue Lock public lock() { int slot[i]=next.getAndIncrement(); while (!flags[slot[i] % n]) {}; flags[slot[i] % n] = false; } public unlock() { flags[slot[i]+1 % n] = true; } Prepare slot for re-use
210
Art of Multiprocessor Programming© Herlihy-Shavit 2007 210 Anderson Queue Lock public lock() { int mySlot = next.getAndIncrement(); while (!flags[mySlot % n]) {}; flags[mySlot % n] = false; } public unlock() { flags[(mySlot+1) % n] = true; } Tell next thread to go
211
Art of Multiprocessor Programming© Herlihy-Shavit 2007 211 Performance Shorter handover than backoff Curve is practically flat Scalable performance FIFO fairness queue TTAS
212
Art of Multiprocessor Programming© Herlihy-Shavit 2007 212 Anderson Queue Lock Good –First truly scalable lock –Simple, easy to implement Bad –Space hog –One bit per thread Unknown number of threads? Small number of actual contenders?
213
Art of Multiprocessor Programming 213 This work is licensed under a Creative Commons Attribution- ShareAlike 2.5 License.Creative Commons Attribution- ShareAlike 2.5 License You are free: –to Share — to copy, distribute and transmit the work –to Remix — to adapt the work Under the following conditions: –Attribution. You must attribute the work to “The Art of Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work). –Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license. For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to –http://creativecommons.org/licenses/by-sa/3.0/. Any of the above conditions can be waived if you get permission from the copyright holder. Nothing in this license impairs or restricts the author's moral rights.
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.