Published by Matthew Jackson. Modified over 9 years ago.
1
CSCE 668 DISTRIBUTED ALGORITHMS AND SYSTEMS
Fall 2011, Prof. Jennifer Welch
Set 16: Distributed Shared Memory
2
Distributed Shared Memory
A model for inter-process communication.
Provides the illusion of shared variables on top of message passing.
Shared memory is often considered a more convenient programming platform than message passing.
Formally, give a simulation of the shared memory model on top of the message passing model.
We'll consider the special case of:
  no failures
  only read/write variables to be simulated
3
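The Simulation
[Figure: layered simulation. Users of read/write shared memory issue read/write invocations to, and receive return/ack responses from, the simulation algorithms alg_0 … alg_{n-1}, one per node. Each algorithm uses send/recv on the underlying Message Passing System. The algorithms together with the message passing system form the simulated Shared Memory.]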
4
Shared Memory Issues
A process invokes a shared memory operation (read or write) at some time.
The simulation algorithm running on the same node executes some code, possibly involving exchanges of messages.
Eventually the simulation algorithm informs the process of the result of the shared memory operation.
So shared memory operations are not instantaneous!
  Operations (invoked by different processes) can overlap.
What values should be returned by operations that overlap other operations?
  This is defined by a memory consistency condition.
5
Sequential Specifications
Each shared object has a sequential specification, which specifies the behavior of the object in the absence of concurrency.
An object supports operations, each consisting of:
  an invocation
  a matching response
The specification is the set of sequences of operations that are legal.
6
Sequential Spec for R/W Registers
Each operation has two parts, an invocation and a response.
A read operation has invocation read_i(X) and response return_i(X,v) (subscript i indicates the process).
A write operation has invocation write_i(X,v) and response ack_i(X) (subscript i indicates the process).
A sequence of operations is legal iff each read returns the value of the latest preceding write.
Ex: [write_0(X,3) ack_0(X)] [read_1(X) return_1(X,3)]
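The legality rule for read/write registers can be sketched as a short check. This is a minimal illustration, not part of the slides: the tuple encoding `(kind, var, value)` and the name `is_legal` are assumptions made here, and registers are assumed to start at 0 as in the later examples.

```python
def is_legal(ops, initial=0):
    """Check a sequential list of operations against the register spec.

    ops: list of (kind, var, value) tuples, where kind is "read" or "write".
    For a read, value is the value the read returned.
    """
    memory = {}  # latest written value per variable
    for kind, var, value in ops:
        if kind == "write":
            memory[var] = value
        elif kind == "read":
            # A read must return the latest preceding write (or the initial value).
            if memory.get(var, initial) != value:
                return False
        else:
            raise ValueError(f"unknown operation kind: {kind}")
    return True


# The example from the slide: write_0(X,3) followed by read_1(X) returning 3.
legal_example = [("write", "X", 3), ("read", "X", 3)]
# A hypothetical illegal sequence: the read returns a value never written.
illegal_example = [("write", "X", 3), ("read", "X", 4)]
```

Process subscripts are omitted from the encoding because legality depends only on the order of operations, not on who issued them.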
7
Memory Consistency Conditions
Consistency conditions tie together the sequential specification with what happens in the presence of concurrency.
We will study two well-known conditions:
  linearizability
  sequential consistency
We will only consider read/write registers, in the absence of failures.
8
Definition of Linearizability
Suppose σ is a sequence of invocations and responses for a set of operations.
  An invocation is not necessarily immediately followed by its matching response, so σ can have concurrent, overlapping ops.
σ is linearizable if there exists a permutation π of all the operations in σ (now each invocation is immediately followed by its matching response) s.t.
  π|X is legal (satisfies the sequential spec) for all vars X, and
  if the response of operation O_1 occurs in σ before the invocation of operation O_2, then O_1 occurs in π before O_2 (π respects the real-time order of non-overlapping operations in σ).
9
Linearizability Examples
Suppose there are two shared variables, X and Y, both initially 0.
[Figure: timelines for p_0 and p_1. p_0 performs write(X,1), ack(X), then read(Y), return(Y,1). p_1 performs write(Y,1), ack(Y), then read(X), return(X,1). Triangles numbered 1 to 4 mark the linearization order.]
Is this sequence linearizable? Yes: see the triangles in the figure.
What if p_1's read returns 0? No: see the arrow in the figure.
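For small histories, the definition can be checked by brute force over all permutations. The sketch below is illustrative only: the function names, the tuple layout, and in particular the invocation and response times are assumptions chosen here, since the timing in the original figure is not recoverable from the transcript.

```python
from itertools import permutations

# Each operation: (process, kind, variable, value, invocation_time, response_time).


def is_legal(sequence, initial=0):
    """Every read returns the latest preceding write to the same variable."""
    memory = {}
    for _proc, kind, var, value, _inv, _resp in sequence:
        if kind == "write":
            memory[var] = value
        elif memory.get(var, initial) != value:
            return False
    return True


def respects_real_time(sequence):
    """No operation is placed before one that finished before it was invoked."""
    for i, earlier in enumerate(sequence):
        for later in sequence[i + 1:]:
            if later[5] < earlier[4]:  # 'later' responded before 'earlier' began
                return False
    return True


def is_linearizable(history):
    """Brute force: try every permutation (only practical for tiny histories)."""
    return any(is_legal(p) and respects_real_time(p) for p in permutations(history))


# Assumed timing: the four operations do not overlap.
history = [
    ("p0", "write", "X", 1, 0, 1),
    ("p1", "write", "Y", 1, 2, 3),
    ("p0", "read", "Y", 1, 4, 5),
    ("p1", "read", "X", 1, 6, 7),
]
# Variant from the slide: p_1's read of X returns 0 instead of 1.
variant = history[:3] + [("p1", "read", "X", 0, 6, 7)]
```

Under these assumed times, `is_linearizable(history)` holds, while the variant fails because the write of X completes before p_1's read of X begins.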
10
Definition of Sequential Consistency
Suppose σ is a sequence of invocations and responses for some set of operations.
σ is sequentially consistent if there exists a permutation π of all the operations in σ s.t.
  π|X is legal (satisfies the sequential spec) for all vars X, and
  if the response of operation O_1 occurs in σ before the invocation of operation O_2 at the same process, then O_1 occurs in π before O_2 (π respects the real-time order of operations by the same process in σ).
11
Sequential Consistency Examples
Suppose there are two shared variables, X and Y, both initially 0.
[Figure: timelines for p_0 and p_1. p_0 performs write(X,1), ack(X), then read(Y), return(Y,1). p_1 performs write(Y,1), ack(Y), then read(X), return(X,0). Numbers 1 to 4 give the order of operations in the permutation.]
Is this sequence sequentially consistent? Yes: see the numbers in the figure.
What if p_0's read returns 0? No: see the arrows in the figure.
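Sequential consistency ignores real time across processes, so a brute-force check only needs each process's operations in program order. The following sketch is an illustration under that encoding; the names `is_sequentially_consistent`, `slide_example`, and `variant` are hypothetical and not from the slides.

```python
from itertools import permutations


def is_legal(sequence, initial=0):
    """Every read returns the latest preceding write to the same variable."""
    memory = {}
    for _proc, _index, kind, var, value in sequence:
        if kind == "write":
            memory[var] = value
        elif memory.get(var, initial) != value:
            return False
    return True


def respects_program_order(sequence):
    """Operations of each process appear in the order that process issued them."""
    last_index = {}
    for proc, index, *_rest in sequence:
        if index < last_index.get(proc, -1):
            return False
        last_index[proc] = index
    return True


def is_sequentially_consistent(per_process_ops):
    """Brute force over all permutations (only practical for tiny histories)."""
    ops = [
        (proc, i, *op)
        for proc, seq in per_process_ops.items()
        for i, op in enumerate(seq)
    ]
    return any(respects_program_order(p) and is_legal(p) for p in permutations(ops))


# The slide's example: p_1's read of X returns 0.
slide_example = {
    "p0": [("write", "X", 1), ("read", "Y", 1)],
    "p1": [("write", "Y", 1), ("read", "X", 0)],
}
# Variant from the slide: p_0's read of Y also returns 0.
variant = {
    "p0": [("write", "X", 1), ("read", "Y", 0)],
    "p1": [("write", "Y", 1), ("read", "X", 0)],
}
```

For `slide_example`, one witness order is write(Y,1), read(X) returning 0, write(X,1), read(Y) returning 1. In `variant`, each read would have to precede the other process's write, which creates a cycle with program order.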
12
Specification of Linearizable Shared Memory Comm. System
Inputs are invocations on the shared objects.
Outputs are responses from the shared objects.
A sequence σ is in the allowable set iff:
  Correct Interaction: each proc. alternates invocations and matching responses
  Liveness: each invocation has a matching response
  Linearizability: σ is linearizable
13
Specification of Sequentially Consistent Shared Memory
Inputs are invocations on the shared objects.
Outputs are responses from the shared objects.
A sequence σ is in the allowable set iff:
  Correct Interaction: each proc. alternates invocations and matching responses
  Liveness: each invocation has a matching response
  Sequential Consistency: σ is sequentially consistent
14
Algorithm to Implement Linearizable Shared Memory
Uses totally ordered broadcast as the underlying communication system.
Each proc keeps a replica for each shared variable.
When a read request arrives:
  send a bcast msg containing the request
  when own bcast msg arrives, return the value in the local replica
When a write request arrives:
  send a bcast msg containing the request
  upon receipt, each proc updates its replica's value
  when own bcast msg arrives, respond with ack
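The algorithm above can be sketched as a single-threaded simulation. This is a toy model under stated assumptions: `TotalOrderBroadcast` is a hypothetical stand-in that orders messages by a shared log and lets each process deliver at its own pace, and the class and method names are chosen here for illustration.

```python
class TotalOrderBroadcast:
    """Toy totally ordered broadcast: one global log, per-process delivery cursor."""

    def __init__(self, n):
        self.log = []
        self.next_index = [0] * n

    def send(self, msg):
        self.log.append(msg)  # position in the log is the total order

    def deliver_next(self, pid):
        i = self.next_index[pid]
        if i >= len(self.log):
            return None
        self.next_index[pid] += 1
        return self.log[i]


class LinearizableProcess:
    def __init__(self, pid, tob):
        self.pid = pid
        self.tob = tob
        self.replica = {}    # local copy of every shared variable
        self.responses = []  # responses handed back to the user

    def invoke(self, kind, var, value=None):
        # Both reads and writes are broadcast.
        self.tob.send((self.pid, kind, var, value))

    def step(self):
        """Deliver one broadcast message, if any; return False when none is pending."""
        msg = self.tob.deliver_next(self.pid)
        if msg is None:
            return False
        sender, kind, var, value = msg
        if kind == "write":
            self.replica[var] = value
        if sender == self.pid:  # own message arrived: respond to the user
            if kind == "write":
                self.responses.append(("ack", var))
            else:
                self.responses.append(("return", var, self.replica.get(var, 0)))
        return True


tob = TotalOrderBroadcast(2)
p0, p1 = LinearizableProcess(0, tob), LinearizableProcess(1, tob)
p0.invoke("write", "X", 1)
p1.invoke("read", "X")   # ordered after the write by the broadcast
while p1.step():
    pass                 # p1 applies the write, then answers its own read
while p0.step():
    pass
```

Because p_1's read is ordered after the write in the broadcast log, p_1 applies the write before answering the read, so the read returns 1 here.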
15
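The Simulation
[Figure: the same layered simulation as before, with Totally Ordered Broadcast as the underlying communication system. Users of read/write shared memory issue read/write invocations to, and receive return/ack responses from, alg_0 … alg_{n-1}. Each algorithm uses to-bc-send/to-bc-recv on the Totally Ordered Broadcast layer. Together these form the simulated Shared Memory.]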
16
Correctness of Linearizability Algorithm
Consider any admissible execution α of the algorithm in which:
  the underlying totally ordered broadcast behaves properly
  users interact properly (alternate invocations and responses)
Show that σ, the restriction of α to the events of the top interface, satisfies Liveness and Linearizability.
17
Correctness of Linearizability Algorithm
Liveness (every invocation has a response): by the Liveness property of the underlying totally ordered broadcast.
Linearizability: define the permutation π of the operations to be the order in which the corresponding broadcasts are received.
  π is legal: because all the operations are consistently ordered by the TO bcast.
  π respects real-time order of operations: if O_1 finishes before O_2 begins, O_1's bcast is ordered before O_2's bcast.
18
Why is Read Bcast Needed?
The bcast done for a read causes no changes to any replicas; it just delays the response to the read.
Why is it needed?
Let's see what happens if we remove it.
19
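Why Read Bcast is Needed
[Figure: timelines for p_0, p_1, p_2 when reads are answered without a bcast. A write(1) is issued with a to-bc-send; one read returns 1 and another read returns 0.]
Such an execution is not linearizable when the read that returns 1 completes before the read that returns 0 begins.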
20
Algorithm for Sequential Consistency
The linearizability algorithm, without doing a bcast for reads:
Uses totally ordered broadcast as the underlying communication system.
Each proc keeps a replica for each shared variable.
When a read request arrives:
  immediately return the value stored in the local replica
When a write request arrives:
  send a bcast msg containing the request
  upon receipt, each proc updates its replica's value
  when own bcast msg arrives, respond with ack
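A small simulation shows how local reads behave in this algorithm. As before, this is only a sketch: `TotalOrderBroadcast` is a hypothetical toy in which position in a shared log is the total order, and the names `SCProcess`, `stale`, and `fresh` are assumptions made for the example.

```python
class TotalOrderBroadcast:
    """Toy totally ordered broadcast: one global log, per-process delivery cursor."""

    def __init__(self, n):
        self.log = []
        self.next_index = [0] * n

    def send(self, msg):
        self.log.append(msg)

    def deliver_next(self, pid):
        i = self.next_index[pid]
        if i >= len(self.log):
            return None
        self.next_index[pid] += 1
        return self.log[i]


class SCProcess:
    def __init__(self, pid, tob):
        self.pid = pid
        self.tob = tob
        self.replica = {}
        self.acks = []

    def read(self, var):
        # Reads are local: return the replica value immediately, no messages.
        return self.replica.get(var, 0)

    def write(self, var, value):
        # Writes are broadcast; the ack is produced when the own message arrives.
        self.tob.send((self.pid, var, value))

    def step(self):
        msg = self.tob.deliver_next(self.pid)
        if msg is None:
            return False
        sender, var, value = msg
        self.replica[var] = value
        if sender == self.pid:
            self.acks.append(var)
        return True


tob = TotalOrderBroadcast(2)
p0, p1 = SCProcess(0, tob), SCProcess(1, tob)
p0.write("X", 1)
p0.step()             # p0 receives its own bcast: the write is acked
stale = p1.read("X")  # p1 has not yet received the write
p1.step()
fresh = p1.read("X")
```

Here `stale` is 0 even though the write has already been acknowledged at p_0. That outcome is allowed by sequential consistency but not by linearizability, which is the behavior the read bcast prevents.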
21
Correctness of SC Algorithm
Lemma (9.3): The local copies at each proc. take on all the values appearing in write operations, in the same order, which preserves the order of non-overlapping writes.
  This implies that the per-process order of writes is preserved.
Lemma (9.4): If p_i writes Y and later reads X, then p_i's update of its local copy of Y (on behalf of that write) precedes its read of its local copy of X (on behalf of that read).
22
Correctness of the SC Algorithm
(Theorem 9.5) Why does SC hold?
Given any admissible execution α, we must come up with a permutation π of the shared memory operations that:
  is legal, and
  respects the per-proc. ordering of operations
23
The Permutation π
Insert all writes into π in their to-bcast order.
Consider each read R in σ in the order of invocation:
  suppose R is a read by p_i of X
  place R in π immediately after the later of
    1. the operation by p_i that immediately precedes R in σ, and
    2. the write that R "read from" (the write that caused the latest update of p_i's local copy of X preceding the response for R)
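The construction of π can be sketched directly from the two rules above. This is an illustration with hypothetical operation labels; placing a read at the front when it has neither a preceding operation nor a write it read from (a read of the initial value) is an assumption made here, since the slide does not spell out that case.

```python
def build_permutation(writes_in_tobcast_order, reads):
    """Build the permutation pi from the slide's construction.

    writes_in_tobcast_order: list of write labels in to-bcast order.
    reads: list of (read_label, prev_op, read_from) in order of invocation, where
      prev_op   is the operation by the same process immediately preceding the
                read (or None), and
      read_from is the write whose update the read observed (or None if the
                read returned the initial value).
    """
    pi = list(writes_in_tobcast_order)
    for read_label, prev_op, read_from in reads:
        anchors = [pi.index(op) for op in (prev_op, read_from) if op is not None]
        # Assumption: with no anchor at all, the read goes to the front of pi.
        position = max(anchors) + 1 if anchors else 0
        pi.insert(position, read_label)
    return pi


# Hypothetical execution: W1 = write(X,1) by p0, W2 = write(X,2) by p1,
# broadcast in the order W1, W2.
writes = ["W1", "W2"]
reads = [
    ("Ra", None, "W1"),  # read by p2, no earlier op by p2, returns 1
    ("Rb", "W1", "W2"),  # read by p0 after its own write W1, returns 2
]
pi = build_permutation(writes, reads)
```

Processing reads in invocation order guarantees that `prev_op` is already in `pi` when a read is placed: it is either a write or an earlier read by the same process.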
24
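Permutation Example
[Figure: timelines for p_0, p_1, p_2 showing a write(1) and a write(2), each with its to-bc-send and ack, and two reads, one returning 2 and one returning 1. The permutation is given by the numbers 1 to 4 attached to the operations in the figure; the exact placement is not recoverable from this transcript.]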
25
Permutation Respects Per Proc. Ordering
For a specific proc:
  Relative ordering of two writes is preserved by Lemma 9.3.
  Relative ordering of two reads is preserved by the construction of π.
  If write W precedes read R in exec. σ, then W precedes R in π by construction.
  Suppose read R precedes write W in σ. Show the same is true in π.
26
Permutation Respects Ordering
Suppose in contradiction R and W are swapped in π:
  There is a read R' by p_i that equals or precedes R in σ.
  There is a write W' that equals W or follows W in the to-bcast order.
  And R' "reads from" W'.
But:
  R' finishes before W starts in σ, and
  updates are done to local replicas in to-bcast order (Lemma 9.3), so the update for W' does not precede the update for W,
  so R' cannot read from W'.
σ|p_i: … R' … R … W …
π: … W … W' … R' … R …
27
Permutation is Legal
Consider some read R of X by p_i and some write W s.t. R reads from W in α.
Suppose in contradiction, some other write W' to X falls between W and R in π.
Why does R follow W' in π?
π: … W … W' … R …
28
Permutation is Legal
Case 1: W' is also by p_i.
Then R follows W' in π because R follows W' in σ.
Update for W at p_i precedes update for W' at p_i in α (Lemma 9.3).
Thus R does not read from W, contradiction.
29
Permutation is Legal
Case 2: W' is not by p_i.
Then R follows W' in π due to some operation O, also by p_i, s.t.
  O precedes R in σ, and
  O is placed between W' and R in π.
Consider the earliest such O.
Case 2.1: O is a write (not necessarily to X).
  Update for W' at p_i precedes update for O at p_i in α (Lemma 9.3).
  Update for O at p_i precedes p_i's local read for R in α (Lemma 9.4).
  So R does not read from W, contradiction.
π: … W … W' … O … R …
30
Permutation is Legal
Case 2.2: O is a read.
  By construction of π, O must read X and in fact read from W' (otherwise O would not be after W').
  Update for W at p_i precedes update for W' at p_i in α (Lemma 9.3).
  Update for W' at p_i precedes local read for O at p_i in α (otherwise O would not read from W').
  Thus R cannot read from W, contradiction.
π: … W … W' … O … R …
31
Performance of SC Algorithm
Read operations are implemented "locally", without requiring any inter-process communication.
Thus reads can be viewed as "fast": the time between invocation and response is only that needed for some local computation.
Time for a write is the time for delivery of one totally ordered broadcast (depends on how to-bcast is implemented).
32
Alternative SC Algorithm
It is possible to have an algorithm that implements sequentially consistent shared memory on top of totally ordered broadcast with the reverse performance:
  writes are local/fast (bcasts are still sent, but the writer does not wait for them to be received)
  reads can require waiting for some bcasts to be received
Like the previous SC algorithm, this one does not implement linearizable shared memory.
33
Time Complexity for DSM Algorithms
One complexity measure of interest for DSM algorithms is how long it takes for operations to complete.
The linearizability algorithm required D time for both reads and writes, where D is the maximum time for a totally ordered broadcast message to be received.
The sequential consistency algorithm required D time for writes and 0 time for reads, since we are assuming time for local computation is negligible.
Can we do better?
To answer this question, we need some kind of timing model.
34
Timing Model
Assume the underlying communication system is the point-to-point message passing system (not totally ordered broadcast).
Assume that every message has delay in the range [d-u, d].
Claim: Totally ordered broadcast can be implemented in this model so that D, the maximum time for delivery, is O(d).
35
Time and Clocks in Layered Model
Timed execution: associate an occurrence time with each node input event.
Times of other events are "inherited" from the time of the triggering node input.
  Recall the assumption that local processing time is negligible.
Model hardware clocks as before: they run at the same rate as real time, but are not synchronized.
Notions of view, timed view, and shifting are the same:
  The Shifting Lemma still holds (it relates h/w clocks and msg delays between original and shifted execs).
36
Lower Bound for SC
Let T_read = worst-case time for a read to complete.
Let T_write = worst-case time for a write to complete.
Theorem (9.7): In any simulation of sequentially consistent shared memory on top of point-to-point message passing, T_read + T_write ≥ d.
37
SC Lower Bound Proof
Consider any SC simulation with T_read + T_write < d.
Let X and Y be two shared variables, both initially 0.
Let α_0 be an admissible execution whose top layer behavior is
  write_0(X,1) ack_0(X) read_0(Y) return_0(Y,0)
  the write begins at time 0, the read ends before time d
  every msg has delay d
Why does α_0 exist?
  The alg. must respond correctly to any sequence of invocations.
  Suppose the user at p_0 wants to do a write, immediately followed by a read.
  By SC, the read must return 0.
  By assumption, the total elapsed time is less than d.
38
SC Lower Bound Proof
[Figure: execution α_0 on a time axis from 0 to d. p_0 performs write(X,1) followed by read(Y), which returns 0; p_1 performs no shared memory operations.]
39
SC Lower Bound Proof
Similarly, let α_1 be an admissible execution whose top layer behavior is
  write_1(Y,1) ack_1(Y) read_1(X) return_1(X,0)
  the write begins at time 0, the read ends before time d
  every msg has delay d
α_1 exists for a similar reason.
40
SC Lower Bound Proof
[Figure: executions α_0 and α_1 on a time axis from 0 to d. In α_0, p_0 performs write(X,1) then read(Y), which returns 0. In α_1, p_1 performs write(Y,1) then read(X), which returns 0.]
41
SC Lower Bound Proof
Now merge p_0's timed view in α_0 with p_1's timed view in α_1 to create an admissible execution α'.
But α' is not SC, contradiction!
42
SC Lower Bound Proof
[Figure: executions α_0, α_1, and the merged execution α' on a time axis from 0 to d. In α', p_0 performs write(X,1) then read(Y), which returns 0, while p_1 performs write(Y,1) then read(X), which returns 0.]
43
Linearizability Write Lower Bound
Theorem (9.8): In any simulation of linearizable shared memory on top of point-to-point message passing, T_write ≥ u/2.
Proof: Consider any linearizable simulation with T_write < u/2.
Let α be an admissible exec. whose top layer behavior is:
  p_1 writes 1 to X, p_2 writes 2 to X, p_0 reads 2 from X
Shift α to create an admissible exec. α' in which p_1's and p_2's writes are swapped, causing p_0's read to violate linearizability.
44
Linearizability Write Lower Bound
[Figure: execution α on a time axis marked 0, u/2, u. p_1 performs write 1, p_2 performs write 2, and p_0 performs a read that returns 2. A second panel shows the message delay pattern among p_0, p_1, p_2, with delays d - u/2, d, and d - u.]
45
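Linearizability Write Lower Bound
[Figure: the shifted execution on a time axis marked 0, u/2, u, obtained by shifting p_1 by u/2 and p_2 by -u/2. The operations are still write 1 by p_1, write 2 by p_2, and a read by p_0 that returns 2. A second panel shows the resulting delay pattern among p_0, p_1, p_2, in which every delay is d or d - u.]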
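The admissibility of the shifted execution can be checked with a little arithmetic. By the Shifting Lemma, if p_i is shifted by x_i and p_j by x_j, a message from p_i to p_j with delay δ has delay δ - x_i + x_j in the shifted execution. The sketch below applies this to the shifts named on the slide. The assignment of delays to particular links (d from p_1 to p_2, d - u from p_2 to p_1, d - u/2 elsewhere) is an assumption consistent with the three values in the figure, and the numeric values of d and u are arbitrary.

```python
def shifted_delay(delay, sender_shift, receiver_shift):
    """Delay of a message after shifting its sender and receiver (Shifting Lemma)."""
    return delay - sender_shift + receiver_shift


d, u = 10, 4  # arbitrary example values with u <= d

# Assumed original delay pattern: d - u/2 on every link, except the two links
# between p1 and p2.
delays = {(s, r): d - u / 2 for s in range(3) for r in range(3) if s != r}
delays[(1, 2)] = d
delays[(2, 1)] = d - u

# Shifts from the slide: p1 by u/2, p2 by -u/2, p0 unchanged.
shift = {0: 0, 1: u / 2, 2: -u / 2}

new_delays = {
    (s, r): shifted_delay(t, shift[s], shift[r]) for (s, r), t in delays.items()
}

# The shifted execution is admissible if every delay stays within [d - u, d].
admissible = all(d - u <= t <= d for t in new_delays.values())
```

Under these assumptions every shifted delay is either d or d - u, so the shifted execution remains admissible while the two writes trade places in real time.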
46
Linearizability Read Lower Bound
The approach is similar to the write lower bound.
Assume in contradiction there is an algorithm with T_read < u/4.
Identify a particular execution:
  fix a pattern of read and write invocations, occurring at particular times
  fix the pattern of message delays
Shift this execution to get one that is:
  still admissible
  but not linearizable
47
Linearizability Read Lower Bound
Original execution:
  p_1 reads X and gets 0 (old value).
  Then p_0 starts writing 1 to X.
  When the write is done, p_0 reads X and gets 1 (new value).
  Also, during the write, p_1 and p_2 alternate reading X.
  At some point, the reads stop getting the old value (0) and start getting the new value (1).
48
Linearizability Read Lower Bound
Set all delays in this execution to be d - u/2.
Now shift p_2 earlier by u/2.
Verify that the result is still admissible (every delay either stays the same or becomes d or d - u).
But in the shifted execution, the sequence of values read is
  0, 0, …, 0, 1, 0, 1, 1, …, 1
49
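Linearizability Read Lower Bound
[Figure: two sets of timelines for p_0, p_1, p_2. Top: the original execution, with a write of 1 and alternating reads that return 0 and then 1. Bottom: the execution after a shift of u/2, in which a read returning 1 comes before a read returning 0.]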