Presentation is loading. Please wait.

Presentation is loading. Please wait.

Mutual Exclusion Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit.

Similar presentations


Presentation on theme: "Mutual Exclusion Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit."— Presentation transcript:

1 Mutual Exclusion Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit

2 Art of Multiprocessor Programming 2 Mutual Exclusion Today we will try to formalize our understanding of mutual exclusion We will also use the opportunity to show you how to argue about and prove various properties in an asynchronous concurrent setting

3 Art of Multiprocessor Programming 3 Mutual Exclusion Formal problem definitions Solutions for 2 threads Solutions for n threads Fair solutions Inherent costs

4 Art of Multiprocessor Programming 4 Warning You will never use these protocols –Get over it You are advised to understand them –The same issues show up everywhere –Except hidden and more complex

5 Art of Multiprocessor Programming 5 Why is Concurrent Programming so Hard? Try preparing a seven-course banquet –By yourself –With one friend –With twenty-seven friends … Before we can talk about programs –Need a language –Describing time and concurrency

6 Art of Multiprocessor Programming 6 “Absolute, true and mathematical time, of itself and from its own nature, flows equably without relation to anything external.” (I. Newton, 1689) “Time is, like, Nature’s way of making sure that everything doesn’t happen all at once.” (Anonymous, circa 1968) Time time

7 Art of Multiprocessor Programming 7 time An event a 0 of thread A is –Instantaneous –No simultaneous events (break ties) a0a0 Events

8 Art of Multiprocessor Programming 8 time A thread A is (formally) a sequence a 0, a 1,... of events –“Trace” model –Notation: a 0  a 1 indicates order a0a0 Threads a1a1 a2a2 …

9 Art of Multiprocessor Programming 9 Assign to shared variable Assign to local variable Invoke method Return from method Lots of other things … Example Thread Events

10 Art of Multiprocessor Programming 10 Threads are State Machines Events are transitions a0a0 a1a1 a2a2 a3a3

11 Art of Multiprocessor Programming 11 States Thread State –Program counter –Local variables System state –Object fields (shared variables) –Union of thread states

12 Art of Multiprocessor Programming 12 time Thread A Concurrency

13 Art of Multiprocessor Programming 13 time Thread A Thread B Concurrency

14 Art of Multiprocessor Programming 14 time Interleavings Events of two or more threads –Interleaved –Not necessarily independent (why?)

15 Art of Multiprocessor Programming 15 time An interval A 0 =(a 0,a 1 ) is –Time between events a 0 and a 1 a0a0 a1a1 Intervals A0A0

16 Art of Multiprocessor Programming 16 time Intervals may Overlap a0a0 a1a1 A0A0 b0b0 b1b1 B0B0

17 Art of Multiprocessor Programming 17 time Intervals may be Disjoint a0a0 a1a1 A0A0 b0b0 b1b1 B0B0

18 Art of Multiprocessor Programming 18 time Precedence a0a0 a1a1 A0A0 b0b0 b1b1 B0B0 Interval A 0 precedes interval B 0

19 Art of Multiprocessor Programming 19 Precedence Notation: A 0  B 0 Formally, –End event of A 0 before start event of B 0 –Also called “happens before” or “precedes”

20 Art of Multiprocessor Programming 20 Precedence Ordering Remark: A 0  B 0 is just like saying –1066 AD  1492 AD, –Middle Ages  Renaissance, Oh wait, –what about this week vs this month?

21 Art of Multiprocessor Programming 21 Precedence Ordering Never true that A   A If A  B then not true that B  A If A  B & B  C then A  C Funny thing: A  B & B  A might both be false!

22 Art of Multiprocessor Programming 22 Partial Orders (you may know this already) Irreflexive: –Never true that A   A Antisymmetric: –If A   B then not true that B   A Transitive: –If A   B & B   C then A   C

23 Art of Multiprocessor Programming 23 Total Orders (you may know this already) Also –Irreflexive –Antisymmetric –Transitive Except that for every distinct A, B, –Either A   B or B   A

24 Art of Multiprocessor Programming 24 Repeated Events while (mumble) { a 0 ; a 1 ; } a0ka0k k-th occurrence of event a 0 A0kA0k k-th occurrence of interval A 0 =(a 0,a 1 )

25 Art of Multiprocessor Programming 25 Implementing a Counter public class Counter { private long value; public long getAndIncrement() { temp = value; value = temp + 1; return temp; } Make these steps indivisible using locks

26 Art of Multiprocessor Programming 26 Locks (Mutual Exclusion) public interface Lock { public void lock(); public void unlock(); }

27 Art of Multiprocessor Programming 27 Locks (Mutual Exclusion) public interface Lock { public void lock(); public void unlock(); } acquire lock

28 Art of Multiprocessor Programming 28 Locks (Mutual Exclusion) public interface Lock { public void lock(); public void unlock(); } release lock acquire lock

29 Art of Multiprocessor Programming 29 Using Locks public class Counter { private long value; private Lock lock; public long getAndIncrement() { lock.lock(); try { int temp = value; value = value + 1; } finally { lock.unlock(); } return temp; }}

30 Art of Multiprocessor Programming 30 Using Locks public class Counter { private long value; private Lock lock; public long getAndIncrement() { lock.lock(); try { int temp = value; value = value + 1; } finally { lock.unlock(); } return temp; }} acquire Lock

31 Art of Multiprocessor Programming 31 Using Locks public class Counter { private long value; private Lock lock; public long getAndIncrement() { lock.lock(); try { int temp = value; value = value + 1; } finally { lock.unlock(); } return temp; }} Release lock (no matter what)

32 Art of Multiprocessor Programming 32 Using Locks public class Counter { private long value; private Lock lock; public long getAndIncrement() { lock.lock(); try { int temp = value; value = value + 1; } finally { lock.unlock(); } return temp; }} Critical section

33 Art of Multiprocessor Programming 33 Mutual Exclusion Let CS i k be thread i’s k-th critical section execution

34 Art of Multiprocessor Programming 34 Mutual Exclusion Let CS i k be thread i’s k-th critical section execution And CS j m be thread j’s m-th critical section execution

35 Art of Multiprocessor Programming 35 Mutual Exclusion Let CS i k be thread i’s k-th critical section execution And CS j m be j’s m-th execution Then either – or

36 Art of Multiprocessor Programming 36 Mutual Exclusion Let CS i k be thread i’s k-th critical section execution And CS j m be j’s m-th execution Then either – or CS i k  CS j m

37 Art of Multiprocessor Programming 37 Mutual Exclusion Let CS i k be thread i’s k-th critical section execution And CS j m be j’s m-th execution Then either – or CS i k  CS j m CS j m  CS i k

38 Art of Multiprocessor Programming 38 Deadlock-Free If some thread calls lock() –And never returns –Then other threads must complete lock() and unlock() calls infinitely often System as a whole makes progress –Even if individuals starve

39 Art of Multiprocessor Programming 39 Starvation-Free If some thread calls lock() –It will eventually return Individual threads make progress

40 Art of Multiprocessor Programming 40 Two-Thread vs n -Thread Solutions Two-thread solutions first –Illustrate most basic ideas –Fits on one slide Then n-Thread solutions

41 Art of Multiprocessor Programming 41 class … implements Lock { … // thread-local index, 0 or 1 public void lock() { int i = ThreadID.get(); int j = 1 - i; … } Two-Thread Conventions

42 Art of Multiprocessor Programming 42 class … implements Lock { … // thread-local index, 0 or 1 public void lock() { int i = ThreadID.get(); int j = 1 - i; … } Two-Thread Conventions Henceforth: i is current thread, j is other thread

43 Art of Multiprocessor Programming 43 LockOne class LockOne implements Lock { private volatile boolean[] flag = new boolean[2]; public void lock() { flag[i] = true; while (flag[j]) {} }

44 Art of Multiprocessor Programming 44 LockOne class LockOne implements Lock { private volatile boolean[] flag = new boolean[2]; public void lock() { flag[i] = true; while (flag[j]) {} } Set my flag

45 Art of Multiprocessor Programming 45 class LockOne implements Lock { private volatile boolean[] flag = new boolean[2]; public void lock() { flag[i] = true; while (flag[j]) {} } LockOne Wait for other flag to go false Set my flag

46 Art of Multiprocessor Programming 46 Assume CS A j overlaps CS B k Consider each thread's last (j-th and k-th) read and write in the lock() method before entering Derive a contradiction LockOne Satisfies Mutual Exclusion

47 Art of Multiprocessor Programming 47 write A (flag[A]=true)  read A (flag[B]==false)  CS A write B (flag[B]=true)  read B (flag[A]==false)  CS B From the Code class LockOne implements Lock { … public void lock() { flag[i] = true; while (flag[j]) {} }

48 Art of Multiprocessor Programming 48 read A (flag[B]==false)  write B (flag[B]=true) read B (flag[A]==false)  write A (flag[B]=true) From the Assumption

49 Art of Multiprocessor Programming 49 Assumptions: –read A (flag[B]==false)  write B (flag[B]=true) –read B (flag[A]==false)  write A (flag[A]=true) From the code –write A (flag[A]=true)  read A (flag[B]==false) –write B (flag[B]=true)  read B (flag[A]==false) Combining

50 Art of Multiprocessor Programming 50 Assumptions: –read A (flag[B]==false)  write B (flag[B]=true) –read B (flag[A]==false)  write A (flag[A]=true) From the code –write A (flag[A]=true)  read A (flag[B]==false) –write B (flag[B]=true)  read B (flag[A]==false) Combining

51 Art of Multiprocessor Programming 51 Assumptions: –read A (flag[B]==false)  write B (flag[B]=true) –read B (flag[A]==false)  write A (flag[A]=true) From the code –write A (flag[A]=true)  read A (flag[B]==false) –write B (flag[B]=true)  read B (flag[A]==false) Combining

52 Art of Multiprocessor Programming 52 Assumptions: –read A (flag[B]==false)  write B (flag[B]=true) –read B (flag[A]==false)  write A (flag[A]=true) From the code –write A (flag[A]=true)  read A (flag[B]==false) –write B (flag[B]=true)  read B (flag[A]==false) Combining

53 Art of Multiprocessor Programming 53 Assumptions: –read A (flag[B]==false)  write B (flag[B]=true) –read B (flag[A]==false)  write A (flag[A]=true) From the code –write A (flag[A]=true)  read A (flag[B]==false) –write B (flag[B]=true)  read B (flag[A]==false) Combining

54 Art of Multiprocessor Programming 54 Assumptions: –read A (flag[B]==false)  write B (flag[B]=true) –read B (flag[A]==false)  write A (flag[A]=true) From the code –write A (flag[A]=true)  read A (flag[B]==false) –write B (flag[B]=true)  read B (flag[A]==false) Combining

55 Art of Multiprocessor Programming 55 Cycle!

56 Art of Multiprocessor Programming 56 Deadlock Freedom LockOne Fails deadlock-freedom –Concurrent execution can deadlock –Sequential executions OK flag[i] = true; flag[j] = true; while (flag[j]){} while (flag[i]){}

57 Art of Multiprocessor Programming 57 LockTwo public class LockTwo implements Lock { private volatile int victim; public void lock() { victim = i; while (victim == i) {}; } public void unlock() {} }

58 Art of Multiprocessor Programming 58 LockTwo public class LockTwo implements Lock { private volatile int victim; public void lock() { victim = i; while (victim == i) {}; } public void unlock() {} } Let other go first

59 Art of Multiprocessor Programming 59 LockTwo public class LockTwo implements Lock { private volatile int victim; public void lock() { victim = i; while (victim == i) {}; } public void unlock() {} } Wait for permission

60 Art of Multiprocessor Programming 60 LockTwo public class Lock2 implements Lock { private volatile int victim; public void lock() { victim = i; while (victim == i) {}; } public void unlock() {} } Nothing to do

61 Art of Multiprocessor Programming 61 public void LockTwo() { victim = i; while (victim == i) {}; } LockTwo Claims Satisfies mutual exclusion –If thread i in CS –Then victim == j –Cannot be both 0 and 1 Not deadlock free –Sequential execution deadlocks –Concurrent execution does not

62 Art of Multiprocessor Programming 62 Peterson’s Algorithm public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; }

63 Art of Multiprocessor Programming 63 Peterson’s Algorithm public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; } Announce I’m interested

64 Art of Multiprocessor Programming 64 Peterson’s Algorithm public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; } Announce I’m interested Defer to other

65 Art of Multiprocessor Programming 65 Peterson’s Algorithm public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; } Announce I’m interested Defer to other Wait while other interested & I’m the victim

66 Art of Multiprocessor Programming 66 Peterson’s Algorithm public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; } Announce I’m interested Defer to other Wait while other interested & I’m the victim No longer interested

67 Art of Multiprocessor Programming 67 public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; Mutual Exclusion If thread 1 in critical section, –flag[1]=true, –victim = 0 If thread 0 in critical section, –flag[0]=true, –victim = 1 Cannot both be true

68 Art of Multiprocessor Programming 68 Deadlock Free Thread blocked –only at while loop –only if it is the victim One or the other must not be the victim public void lock() { … while (flag[j] && victim == i) {};

69 Art of Multiprocessor Programming 69 Starvation Free Thread i blocked only if j repeatedly re-enters so that flag[j] == true and victim == i When j re-enters –it sets victim to j. –So i gets in public void lock() { flag[i] = true; victim = i; while (flag[j] && victim == i) {}; } public void unlock() { flag[i] = false; }

70 Art of Multiprocessor Programming 70 The Filter Algorithm for n Threads There are n-1 “waiting rooms” called levels At each level –At least one enters level –At least one blocked if many try Only one thread makes it through ncs cs

71 Art of Multiprocessor Programming 71 Filter class Filter implements Lock { volatile int[] level; // level[i] for thread i volatile int[] victim; // victim[L] for level L public Filter(int n) { level = new int[n]; victim = new int[n]; for (int i = 1; i < n; i++) { level[i] = 0; }} … } n -1 0 1 0000004 2 2 Thread 2 at level 4 0 4 level victim

72 Art of Multiprocessor Programming 72 Filter class Filter implements Lock { … public void lock(){ for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while ((   k != i level[k] >= L) && victim[L] == i ); }} public void unlock() { level[i] = 0; }}

73 Art of Multiprocessor Programming 73 class Filter implements Lock { … public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while ((   k != i) level[k] >= L) && victim[L] == i); }} public void release(int i) { level[i] = 0; }} Filter One level at a time

74 Art of Multiprocessor Programming 74 class Filter implements Lock { … public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while ((   k != i) level[k] >= L) && victim[L] == i); // busy wait }} public void release(int i) { level[i] = 0; }} Filter Announce intention to enter level L

75 Art of Multiprocessor Programming 75 class Filter implements Lock { int level[n]; int victim[n]; public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while ((   k != i) level[k] >= L) && victim[L] == i); }} public void release(int i) { level[i] = 0; }} Filter Give priority to anyone but me

76 Art of Multiprocessor Programming 76 class Filter implements Lock { int level[n]; int victim[n]; public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while ((   k != i) level[k] >= L) && victim[L] == i); }} public void release(int i) { level[i] = 0; }} Filter Wait as long as someone else is at same or higher level, and I’m designated victim

77 Art of Multiprocessor Programming 77 class Filter implements Lock { int level[n]; int victim[n]; public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while ((   k != i) level[k] >= L) && victim[L] == i); }} public void release(int i) { level[i] = 0; }} Filter Thread enters level L when it completes the loop

78 Art of Multiprocessor Programming 78 Claim Start at level L=0 At most n-L threads enter level L Mutual exclusion at level L=n-1 ncs csL=n-1 L=1 L=n-2 L=0

79 Art of Multiprocessor Programming 79 Induction Hypothesis Assume all at level L-1 enter level L A last to write victim[L] B is any other thread at level L No more than n-L+1 at level L-1 Induction step: by contradiction ncs cs L-1 has n-L+1 L has n-L assume prove

80 Art of Multiprocessor Programming 80 Proof Structure ncs cs Assumed to enter L-1 By way of contradiction all enter L n-L+1 = 4 A B Last to write victim[L] Show that A must have seen B at level L and since victim[L] == A could not have entered

81 Art of Multiprocessor Programming 81 From the Code (1) write B (level[B]=L)  write B (victim[L]=B) public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while ((   k != i) level[k] >= L) && victim[L] == i) {}; }}

82 Art of Multiprocessor Programming 82 From the Code (2) write A (victim[L]=A)  read A (level[B]) public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while ((   k != i) level[k] >= L) && victim[L] == i) {}; }}

83 Art of Multiprocessor Programming 83 By Assumption By assumption, A is the last thread to write victim[L] (3) write B (victim[L]=B)  write A (victim[L]=A)

84 Art of Multiprocessor Programming 84 Combining Observations (1) write B (level[B]=L)  write B (victim[L]=B) (3) write B (victim[L]=B)  write A (victim[L]=A) (2) write A (victim[L]=A)  read A (level[B])

85 Art of Multiprocessor Programming 85 public void lock() { for (int L = 1; L < n; L++) { level[i] = L; victim[L] = i; while ((   k != i) level[k] >= L) && victim[L] == i) {}; }} Combining Observations (1) write B (level[B]=L)  write B (victim[L]=B) (3) write B (victim[L]=B)  write A (victim[L]=A) (2) write A (victim[L]=A)  read A (level[B])

86 Art of Multiprocessor Programming 86 Combining Observations (1) write B (level[B]=L)  write B (victim[L]=B) (3) write B (victim[L]=B)  write A (victim[L]=A) (2) write A (victim[L]=A)  read A (level[B]) Thus, A read level[B] ≥ L, A was last to write victim[L], so it could not have entered level L!

87 Art of Multiprocessor Programming 87 No Starvation Filter Lock satisfies properties: –Just like Peterson Alg at any level –So no one starves But what about fairness? –Threads can be overtaken by others

88 Art of Multiprocessor Programming 88 Bounded Waiting Want stronger fairness guarantees Thread not “overtaken” too much Need to adjust definitions ….

89 Art of Multiprocessor Programming 89 Bounded Waiting Divide lock() method into 2 parts: –Doorway interval: Written D A always finishes in finite steps –Waiting interval: Written W A may take unbounded steps

90 Art of Multiprocessor Programming 90 Bakery Algorithm Provides First-Come-First-Served How? –Take a “number” –Wait until lower numbers have been served Lexicographic order –(a,i) > (b,j) If a > b, or a = b and i > j

91 Art of Multiprocessor Programming 91 Bakery Algorithm class Bakery implements Lock { volatile boolean[] flag; volatile Label[] label; public Bakery (int n) { flag = new boolean[n]; label = new Label[n]; for (int i = 0; i < n; i++) { flag[i] = false; label[i] = 0; } …

92 Art of Multiprocessor Programming 92 Bakery Algorithm class Bakery implements Lock { volatile boolean[] flag; volatile Label[] label; public Bakery (int n) { flag = new boolean[n]; label = new Label[n]; for (int i = 0; i < n; i++) { flag[i] = false; label[i] = 0; } … n -1 0 fffftft 2 f 0000504 0 6 CS

93 Art of Multiprocessor Programming 93 Bakery Algorithm class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while (  k flag[k] && (label[i],i) > (label[k],k)); }

94 Art of Multiprocessor Programming 94 Bakery Algorithm class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while (  k flag[k] && (label[i],i) > (label[k],k)); } Doorway

95 Art of Multiprocessor Programming 95 Bakery Algorithm class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while (  k flag[k] && (label[i],i) > (label[k],k)); } I’m interested

96 Art of Multiprocessor Programming 96 Bakery Algorithm class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while (  k flag[k] && (label[i],i) > (label[k],k)); } Take increasing label (read labels in some arbitrary order)

97 Art of Multiprocessor Programming 97 Bakery Algorithm class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while (  k flag[k] && (label[i],i) > (label[k],k)); } Someone is interested

98 Art of Multiprocessor Programming 98 Bakery Algorithm class Bakery implements Lock { boolean flag[n]; int label[n]; public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while (  k flag[k] && (label[i],i) > (label[k],k)); } Someone is interested With lower (label,i) in lexicographic order

99 Art of Multiprocessor Programming 99 Bakery Algorithm class Bakery implements Lock { … public void unlock() { flag[i] = false; }

100 Art of Multiprocessor Programming 100 Bakery Algorithm class Bakery implements Lock { … public void unlock() { flag[i] = false; } No longer interested labels are always increasing

101 Art of Multiprocessor Programming 101 No Deadlock There is always one thread with earliest label Ties are impossible (why?)

102 Art of Multiprocessor Programming 102 First-Come-First-Served If D A  D B then A’s label is earlier –write A (label[A])  read B (label[A])  write B (label[B])  read B (flag[A]) So B is locked out while flag[A] is true class Bakery implements Lock { public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while (  k flag[k] && (label[i],i) > (label[k],k)); }

103 Art of Multiprocessor Programming 103 Mutual Exclusion Suppose A and B in CS together Suppose A has earlier label When B entered, it must have seen –flag[A] is false, or –label[A] > label[B] class Bakery implements Lock { public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while (  k flag[k] && (label[i],i) > (label[k],k)); }

104 Art of Multiprocessor Programming 104 Mutual Exclusion Labels are strictly increasing so B must have seen flag[A] == false

105 Art of Multiprocessor Programming 105 Mutual Exclusion Labels are strictly increasing so B must have seen flag[A] == false Labeling B  read B (flag[A])  write A (flag[A])  Labeling A

106 Art of Multiprocessor Programming 106 Mutual Exclusion Labels are strictly increasing so B must have seen flag[A] == false Labeling B  read B (flag[A])  write A (flag[A])  Labeling A Which contradicts the assumption that A has an earlier label

107 Art of Multiprocessor Programming 107 Bakery Y2 32 K Bug class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while (  k flag[k] && (label[i],i) > (label[k],k)); }

108 Art of Multiprocessor Programming 108 Bakery Y2 32 K Bug class Bakery implements Lock { … public void lock() { flag[i] = true; label[i] = max(label[0], …,label[n-1])+1; while (  k flag[k] && (label[i],i) > (label[k],k)); } Mutex breaks if label[i] overflows

109 Art of Multiprocessor Programming 109 Does Overflow Actually Matter? Yes –Y2K –18 January 2038 (Unix time_t rollover) –16-bit counters No –64-bit counters Maybe –32-bit counters

110 Art of Multiprocessor Programming© Herlihy-Shavit 2007 110 Revisit Mutual Exclusion... Think of performance, not just correctness and progress Begin to understand how performance depends on our software properly utilizing the multiprocessor machine’s hardware And get to know a collection of locking algorithms… (1)

111 Art of Multiprocessor Programming© Herlihy-Shavit 2007 111 What Should you do if you can’t get a lock? Keep trying –“spin” or “busy-wait” –Good if delays are short Give up the processor –Good if delays are long –Always good on uniprocessor (1)

112 Art of Multiprocessor Programming© Herlihy-Shavit 2007 112 What Should you do if you can’t get a lock? Keep trying –“spin” or “busy-wait” –Good if delays are short Give up the processor –Good if delays are long –Always good on uniprocessor our focus

113 Art of Multiprocessor Programming© Herlihy-Shavit 2007 113 Basic Spin-Lock CS Resets lock upon exit spin lock critical section...

114 Art of Multiprocessor Programming© Herlihy-Shavit 2007 114 Basic Spin-Lock CS Resets lock upon exit spin lock critical section... …lock introduces sequential bottleneck

115 Art of Multiprocessor Programming© Herlihy-Shavit 2007 115 Basic Spin-Lock CS Resets lock upon exit spin lock critical section... …lock suffers from contention

116 Art of Multiprocessor Programming© Herlihy-Shavit 2007 116 Basic Spin-Lock CS Resets lock upon exit spin lock critical section... Notice: these are distinct phenomena …lock suffers from contention

117 Art of Multiprocessor Programming© Herlihy-Shavit 2007 117 Basic Spin-Lock CS Resets lock upon exit spin lock critical section... …lock suffers from contention Seq Bottleneck  no parallelism

118 Art of Multiprocessor Programming© Herlihy-Shavit 2007 118 Basic Spin-Lock CS Resets lock upon exit spin lock critical section... Contention  ??? …lock suffers from contention

119 Art of Multiprocessor Programming© Herlihy-Shavit 2007 119 Review: Test-and-Set Boolean value Test-and-set (TAS) –Swap true with current value –Return value tells if prior value was true or false Can reset just by writing false TAS aka “getAndSet”

120 Art of Multiprocessor Programming© Herlihy-Shavit 2007 120 Review: Test-and-Set public class AtomicBoolean { boolean value; public synchronized boolean getAndSet(boolean newValue) { boolean prior = value; value = newValue; return prior; } (5)

121 Art of Multiprocessor Programming© Herlihy-Shavit 2007 121 Review: Test-and-Set public class AtomicBoolean { boolean value; public synchronized boolean getAndSet(boolean newValue) { boolean prior = value; value = newValue; return prior; } Package java.util.concurrent.atomic

122 Art of Multiprocessor Programming© Herlihy-Shavit 2007 122 Review: Test-and-Set public class AtomicBoolean { boolean value; public synchronized boolean getAndSet(boolean newValue) { boolean prior = value; value = newValue; return prior; } Swap old and new values

123 Art of Multiprocessor Programming© Herlihy-Shavit 2007 123 Review: Test-and-Set AtomicBoolean lock = new AtomicBoolean(false) … boolean prior = lock.getAndSet(true)

124 Art of Multiprocessor Programming© Herlihy-Shavit 2007 124 Review: Test-and-Set AtomicBoolean lock = new AtomicBoolean(false) … boolean prior = lock.getAndSet(true) (5) Swapping in true is called “test-and-set” or TAS

125 Art of Multiprocessor Programming© Herlihy-Shavit 2007 125 Test-and-Set Locks Locking –Lock is free: value is false –Lock is taken: value is true Acquire lock by calling TAS –If result is false, you win –If result is true, you lose Release lock by writing false

126 Art of Multiprocessor Programming© Herlihy-Shavit 2007 126 Test-and-set Lock class TASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (state.getAndSet(true)) {} } void unlock() { state.set(false); }}

127 Art of Multiprocessor Programming© Herlihy-Shavit 2007 127 Test-and-set Lock class TASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (state.getAndSet(true)) {} } void unlock() { state.set(false); }} Lock state is AtomicBoolean

128 Art of Multiprocessor Programming© Herlihy-Shavit 2007 128 Test-and-set Lock class TASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (state.getAndSet(true)) {} } void unlock() { state.set(false); }} Keep trying until lock acquired

129 Art of Multiprocessor Programming© Herlihy-Shavit 2007 129 Test-and-set Lock class TASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (state.getAndSet(true)) {} } void unlock() { state.set(false); }} Release lock by resetting state to false

130 Art of Multiprocessor Programming© Herlihy-Shavit 2007 130 Space Complexity TAS spin-lock has small “footprint” N thread spin-lock uses O(1) space As opposed to O(n) Peterson/Bakery How did we overcome the  (n) lower bound? We used a RMW operation…

131 Art of Multiprocessor Programming© Herlihy-Shavit 2007 131 Performance Experiment –n threads –Increment shared counter 1 million times How long should it take? How long does it take?

132 Art of Multiprocessor Programming© Herlihy-Shavit 2007 132 Graph ideal time threads no speedup because of sequential bottleneck

133 Art of Multiprocessor Programming© Herlihy-Shavit 2007 133 Mystery #1 time threads TAS lock Ideal (1) What is going on?

134 Art of Multiprocessor Programming© Herlihy-Shavit 2007 134 Test-and-Test-and-Set Locks Lurking stage –Wait until lock “looks” free –Spin while read returns true (lock taken) Pouncing state –As soon as lock “looks” available –Read returns false (lock free) –Call TAS to acquire lock –If TAS loses, back to lurking

135 Art of Multiprocessor Programming© Herlihy-Shavit 2007 135 Test-and-test-and-set Lock class TTASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (true) { while (state.get()) {} if (!state.getAndSet(true)) return; }

136 Art of Multiprocessor Programming© Herlihy-Shavit 2007 136 Test-and-test-and-set Lock class TTASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (true) { while (state.get()) {} if (!state.getAndSet(true)) return; } Wait until lock looks free

137 Art of Multiprocessor Programming© Herlihy-Shavit 2007 137 Test-and-test-and-set Lock class TTASlock { AtomicBoolean state = new AtomicBoolean(false); void lock() { while (true) { while (state.get()) {} if (!state.getAndSet(true)) return; } Then try to acquire it

138 Art of Multiprocessor Programming© Herlihy-Shavit 2007 138 Mystery #2 TAS lock TTAS lock Ideal time threads

139 Art of Multiprocessor Programming© Herlihy-Shavit 2007 139 Mystery Both –TAS and TTAS –Do the same thing (in our model) Except that –TTAS performs much better than TAS –Neither approaches ideal

140 Art of Multiprocessor Programming© Herlihy-Shavit 2007 140 Opinion Our memory abstraction is broken TAS & TTAS methods –Are provably the same (in our model) –Except they aren’t (in field tests) Need a more detailed model …

141 Art of Multiprocessor Programming© Herlihy-Shavit 2007 141 Bus-Based Architectures Bus cache memory cache

142 Art of Multiprocessor Programming© Herlihy-Shavit 2007 142 Bus-Based Architectures Bus cache memory cache Random access memory (10s of cycles)

143 Art of Multiprocessor Programming© Herlihy-Shavit 2007 143 Bus-Based Architectures cache memory cache Shared Bus broadcast medium One broadcaster at a time Processors and memory all “snoop” Bus

144 Art of Multiprocessor Programming© Herlihy-Shavit 2007 144 Bus-Based Architectures Bus cache memory cache Per-Processor Caches Small Fast: 1 or 2 cycles Address & state information

145 Art of Multiprocessor Programming© Herlihy-Shavit 2007 145 Jargon Watch Cache hit –“I found what I wanted in my cache” –Good Thing™

146 Art of Multiprocessor Programming© Herlihy-Shavit 2007 146 Jargon Watch Cache hit –“I found what I wanted in my cache” –Good Thing™ Cache miss –“I had to shlep all the way to memory for that data” –Bad Thing™

147 Art of Multiprocessor Programming© Herlihy-Shavit 2007 147 Cave Canem This model is still a simplification –But not in any essential way –Illustrates basic principles Will discuss complexities later

148 Art of Multiprocessor Programming© Herlihy-Shavit 2007 148 Bus Processor Issues Load Request cache memory cache data

149 Art of Multiprocessor Programming© Herlihy-Shavit 2007 149 Bus Processor Issues Load Request Bus cache memory cache data Gimme data

150 Art of Multiprocessor Programming© Herlihy-Shavit 2007 150 cache Bus Memory Responds Bus memory cache data Got your data right here data

151 Art of Multiprocessor Programming© Herlihy-Shavit 2007 151 Bus Processor Issues Load Request memory cache data Gimme data

152 Art of Multiprocessor Programming© Herlihy-Shavit 2007 152 Bus Processor Issues Load Request Bus memory cache data Gimme data

153 Art of Multiprocessor Programming© Herlihy-Shavit 2007 153 Bus Processor Issues Load Request Bus memory cache data I got data

154 Art of Multiprocessor Programming© Herlihy-Shavit 2007 154 Bus Other Processor Responds memory cache data I got data data Bus

155 Art of Multiprocessor Programming© Herlihy-Shavit 2007 155 Bus Other Processor Responds memory cache data Bus

156 Art of Multiprocessor Programming© Herlihy-Shavit 2007 156 Modify Cached Data Bus data memory cachedata (1)

157 Art of Multiprocessor Programming© Herlihy-Shavit 2007 157 Modify Cached Data Bus data memory cachedata (1)

158 Art of Multiprocessor Programming© Herlihy-Shavit 2007 158 memory Bus data Modify Cached Data cachedata

159 Art of Multiprocessor Programming© Herlihy-Shavit 2007 159 memory Bus data Modify Cached Data cache What’s up with the other copies? data

160 Art of Multiprocessor Programming© Herlihy-Shavit 2007 160 Cache Coherence We have lots of copies of data –Original copy in memory –Cached copies at processors Some processor modifies its own copy –What do we do with the others? –How to avoid confusion?

161 Art of Multiprocessor Programming© Herlihy-Shavit 2007 161 Write-Back Caches Accumulate changes in cache Write back when needed –Need the cache for something else –Another processor wants it On first modification –Invalidate other entries –Requires non-trivial protocol …

162 Art of Multiprocessor Programming© Herlihy-Shavit 2007 162 Write-Back Caches Cache entry has three states –Invalid: contains raw seething bits –Valid: I can read but I can’t write –Dirty: Data has been modified Intercept other load requests Write back to memory before using cache

163 Art of Multiprocessor Programming© Herlihy-Shavit 2007 163 Bus Invalidate memory cachedata

164 Art of Multiprocessor Programming© Herlihy-Shavit 2007 164 Bus Invalidate Bus memory cachedata Mine, all mine!

165 Art of Multiprocessor Programming© Herlihy-Shavit 2007 165 Bus Invalidate Bus memory cachedata cache Uh,oh

166 Art of Multiprocessor Programming© Herlihy-Shavit 2007 166 cache Bus Invalidate memory cachedata Other caches lose read permission

167 Art of Multiprocessor Programming© Herlihy-Shavit 2007 167 cache Bus Invalidate memory cachedata Other caches lose read permission This cache acquires write permission

168 Art of Multiprocessor Programming© Herlihy-Shavit 2007 168 cache Bus Invalidate memory cachedata Memory provides data only if not present in any cache, so no need to change it now (expensive) (2)

169 Art of Multiprocessor Programming© Herlihy-Shavit 2007 169 cache Bus Another Processor Asks for Data memory cachedata (2) Bus

170 Art of Multiprocessor Programming© Herlihy-Shavit 2007 170 cache data Bus Owner Responds memory cachedata (2) Bus Here it is!

171 Art of Multiprocessor Programming© Herlihy-Shavit 2007 171 Bus End of the Day … memory cachedata (1) Reading OK, no writing data

172 Art of Multiprocessor Programming© Herlihy-Shavit 2007 172 Mutual Exclusion What do we want to optimize? –Bus bandwidth used by spinning threads –Release/Acquire latency –Acquire latency for idle lock

173 Art of Multiprocessor Programming© Herlihy-Shavit 2007 173 Simple TASLock TAS invalidates cache lines Spinners –Miss in cache –Go to bus Thread wants to release lock –delayed behind spinners

174 Art of Multiprocessor Programming© Herlihy-Shavit 2007 174 Test-and-test-and-set Wait until lock “looks” free –Spin on local cache –No bus use while lock busy Problem: when lock is released –Invalidation storm …

175 Art of Multiprocessor Programming© Herlihy-Shavit 2007 175 Local Spinning while Lock is Busy Bus memory busy

176 Art of Multiprocessor Programming© Herlihy-Shavit 2007 176 Bus On Release memory freeinvalid free

177 Art of Multiprocessor Programming© Herlihy-Shavit 2007 177 On Release Bus memory freeinvalid free miss Everyone misses, rereads (1)

178 Art of Multiprocessor Programming© Herlihy-Shavit 2007 178 On Release Bus memory freeinvalid free TAS(…) Everyone tries TAS (1)

179 Art of Multiprocessor Programming© Herlihy-Shavit 2007 179 Problems Everyone misses –Reads satisfied sequentially Everyone does TAS –Invalidates others’ caches Eventually quiesces after lock acquired –How long does this take?

180 Art of Multiprocessor Programming© Herlihy-Shavit 2007 180 Mystery Explained TAS lock TTAS lock Ideal time threads Better than TAS but still not as good as ideal

181 Art of Multiprocessor Programming© Herlihy-Shavit 2007 181 Solution: Introduce Delay spin lock time d r1dr1d r2dr2d If the lock looks free But I fail to get it There must be lots of contention Better to back off than to collide again

182 Art of Multiprocessor Programming© Herlihy-Shavit 2007 182 Dynamic Example: Exponential Backoff time d 2d4d spin lock If I fail to get lock –wait random duration before retry –Each subsequent failure doubles expected wait

183 Art of Multiprocessor Programming© Herlihy-Shavit 2007 183 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}}

184 Art of Multiprocessor Programming© Herlihy-Shavit 2007 184 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}} Fix minimum delay

185 Art of Multiprocessor Programming© Herlihy-Shavit 2007 185 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}} Wait until lock looks free

186 Art of Multiprocessor Programming© Herlihy-Shavit 2007 186 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}} If we win, return

187 Art of Multiprocessor Programming© Herlihy-Shavit 2007 187 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}} Back off for random duration

188 Art of Multiprocessor Programming© Herlihy-Shavit 2007 188 Exponential Backoff Lock public class Backoff implements lock { public void lock() { int delay = MIN_DELAY; while (true) { while (state.get()) {} if (!lock.getAndSet(true)) return; sleep(random() % delay); if (delay < MAX_DELAY) delay = 2 * delay; }}} Double max delay, within reason

189 Art of Multiprocessor Programming© Herlihy-Shavit 2007 189 Spin-Waiting Overhead TTAS Lock Backoff lock time threads

190 Art of Multiprocessor Programming© Herlihy-Shavit 2007 190 Backoff: Other Issues Good –Easy to implement –Beats TTAS lock Bad –Must choose parameters carefully –Not portable across platforms

191 Art of Multiprocessor Programming© Herlihy-Shavit 2007 191 Idea Avoid useless invalidations –By keeping a queue of threads Each thread –Notifies next in line –Without bothering the others

192 Art of Multiprocessor Programming© Herlihy-Shavit 2007 192 Anderson Queue Lock flags next TFFFFFFF idle

193 Art of Multiprocessor Programming© Herlihy-Shavit 2007 193 Anderson Queue Lock flags next TFFFFFFF acquiring getAndIncrement

194 Art of Multiprocessor Programming© Herlihy-Shavit 2007 194 Anderson Queue Lock flags next TFFFFFFF acquiring getAndIncrement

195 Art of Multiprocessor Programming© Herlihy-Shavit 2007 195 Anderson Queue Lock flags next TFFFFFFF acquired Mine!

196 Art of Multiprocessor Programming© Herlihy-Shavit 2007 196 Anderson Queue Lock flags next TFFFFFFF acquired acquiring

197 Art of Multiprocessor Programming© Herlihy-Shavit 2007 197 Anderson Queue Lock flags next TFFFFFFF acquired acquiring getAndIncrement

198 Art of Multiprocessor Programming© Herlihy-Shavit 2007 198 Anderson Queue Lock flags next TFFFFFFF acquired acquiring getAndIncrement

199 Art of Multiprocessor Programming© Herlihy-Shavit 2007 199 acquired Anderson Queue Lock flags next TFFFFFFF acquiring

200 Art of Multiprocessor Programming© Herlihy-Shavit 2007 200 released Anderson Queue Lock flags next TTFFFFFF acquired

201 Art of Multiprocessor Programming© Herlihy-Shavit 2007 201 released Anderson Queue Lock flags next TTFFFFFF acquired Yow!

202 Art of Multiprocessor Programming© Herlihy-Shavit 2007 202 Anderson Queue Lock class ALock implements Lock { boolean[] flags={true,false,…,false}; AtomicInteger next = new AtomicInteger(0); int[] slot = new int[n];

203 Art of Multiprocessor Programming© Herlihy-Shavit 2007 203 Anderson Queue Lock class ALock implements Lock { boolean[] flags={true,false,…,false}; AtomicInteger next = new AtomicInteger(0); int[] slot = new int[n]; One flag per thread

204 Art of Multiprocessor Programming© Herlihy-Shavit 2007 204 Anderson Queue Lock class ALock implements Lock { boolean[] flags={true,false,…,false}; AtomicInteger next = new AtomicInteger(0); int[] slot = new int[n]; Next flag to use

205 Art of Multiprocessor Programming© Herlihy-Shavit 2007 205 Anderson Queue Lock class ALock implements Lock { boolean[] flags={true,false,…,false}; AtomicInteger next = new AtomicInteger(0); ThreadLocal mySlot; Thread-local variable

206 Art of Multiprocessor Programming© Herlihy-Shavit 2007 206 Anderson Queue Lock public lock() { int mySlot = next.getAndIncrement(); while (!flags[mySlot % n]) {}; flags[mySlot % n] = false; } public unlock() { flags[(mySlot+1) % n] = true; }

207 Art of Multiprocessor Programming© Herlihy-Shavit 2007 207 Anderson Queue Lock public lock() { int mySlot = next.getAndIncrement(); while (!flags[mySlot % n]) {}; flags[mySlot % n] = false; } public unlock() { flags[(mySlot+1) % n] = true; } Take next slot

208 Art of Multiprocessor Programming© Herlihy-Shavit 2007 208 Anderson Queue Lock public lock() { int mySlot = next.getAndIncrement(); while (!flags[mySlot % n]) {}; flags[mySlot % n] = false; } public unlock() { flags[(mySlot+1) % n] = true; } Spin until told to go

209 Art of Multiprocessor Programming© Herlihy-Shavit 2007 209 Anderson Queue Lock public lock() { int slot[i]=next.getAndIncrement(); while (!flags[slot[i] % n]) {}; flags[slot[i] % n] = false; } public unlock() { flags[slot[i]+1 % n] = true; } Prepare slot for re-use

210 Art of Multiprocessor Programming© Herlihy-Shavit 2007 210 Anderson Queue Lock public lock() { int mySlot = next.getAndIncrement(); while (!flags[mySlot % n]) {}; flags[mySlot % n] = false; } public unlock() { flags[(mySlot+1) % n] = true; } Tell next thread to go

211 Art of Multiprocessor Programming© Herlihy-Shavit 2007 211 Performance Shorter handover than backoff Curve is practically flat Scalable performance FIFO fairness queue TTAS

212 Art of Multiprocessor Programming© Herlihy-Shavit 2007 212 Anderson Queue Lock Good –First truly scalable lock –Simple, easy to implement Bad –Space hog –One bit per thread Unknown number of threads? Small number of actual contenders?

213 Art of Multiprocessor Programming 213 This work is licensed under a Creative Commons Attribution- ShareAlike 2.5 License.Creative Commons Attribution- ShareAlike 2.5 License You are free: –to Share — to copy, distribute and transmit the work –to Remix — to adapt the work Under the following conditions: –Attribution. You must attribute the work to “The Art of Multiprocessor Programming” (but not in any way that suggests that the authors endorse you or your use of the work). –Share Alike. If you alter, transform, or build upon this work, you may distribute the resulting work only under the same, similar or a compatible license. For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to –http://creativecommons.org/licenses/by-sa/3.0/. Any of the above conditions can be waived if you get permission from the copyright holder. Nothing in this license impairs or restricts the author's moral rights.


Download ppt "Mutual Exclusion Companion slides for The Art of Multiprocessor Programming by Maurice Herlihy & Nir Shavit."

Similar presentations


Ads by Google