Synchronization in Message Passing Systems

by Ye Su
Advisor: Dr. Gurdip Singh
Department of Computing and Information Sciences, Kansas State University
May 7, 2004, Ph.D. Thesis Defense

Outline
1. Introduction to the region synchronization problem.
2. Brief review of the aspect-oriented methodology.
3. Correctness criteria for the region synchronization algorithm in distributed systems.
4. Algorithm to map the coarse-grained solution to a fine-grained solution in point-to-point networks.
5. Algorithm optimizations in point-to-point networks.
6. Algorithm to map the coarse-grained solution to a fine-grained solution in CAN-based systems.
7. Integrating our solutions into the SyncGen toolset.
8. Solving a complicated example using the approach we proposed.
9. Summary and future work.

1. Introduction
[Figure: processes P1, P2 and P3 synchronizing on shared regions.]
An aspect-oriented approach for developing synchronization code for shared-memory systems was proposed by Mizuno, Singh and Neilsen. Following their approach, we focus on techniques for deriving synchronization algorithms for message passing systems.

2. Overview of the aspect-oriented methodology

Identifying synchronization regions

void reader() {
  while (true) {
    ...other computation...
    /*** Region-Enter: Reader ***/
    ...read shared variables...
    /*** Region-Exit: Reader ***/
    ...other computation...
  }
}

void writer() {
  while (true) {
    ...other computation...
    /*** Region-Enter: Writer ***/
    ...write shared variables...
    /*** Region-Exit: Writer ***/
    ...other computation...
  }
}

Global invariant specification
A global invariant I is a predicate defined over the in and out counters using arithmetic inequalities, arithmetic operators and boolean connectives.
[Figure: Reader region RR with counters in[R] and out[R], and Writer region RW with counters in[W] and out[W].]
Readers/writers invariant: ((in[R] = out[R]) ∨ (in[W] = out[W])) ∧ (in[W] - out[W] ≤ 1)
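As a rough illustration, the readers/writers invariant can be written as a predicate over a dictionary of counters. This is only a sketch: the dictionary layout and the names `rw_invariant`, `in_R`, `out_R`, `in_W` and `out_W` are assumptions made for this example, and the connectives follow the usual reading of the invariant (no active readers or no active writers, and at most one active writer).

```python
def rw_invariant(c):
    """Readers/writers invariant over a dict of in/out counters (assumed layout)."""
    no_active_readers = c["in_R"] == c["out_R"]
    no_active_writers = c["in_W"] == c["out_W"]
    at_most_one_writer = c["in_W"] - c["out_W"] <= 1
    return (no_active_readers or no_active_writers) and at_most_one_writer


# Three sample states: two readers, a reader with a writer, and two writers.
two_readers = {"in_R": 2, "out_R": 0, "in_W": 0, "out_W": 0}
reader_and_writer = {"in_R": 1, "out_R": 0, "in_W": 1, "out_W": 0}
two_writers = {"in_R": 0, "out_R": 0, "in_W": 2, "out_W": 0}
```

Only the first sample state satisfies the predicate; the other two violate the first and second conjunct respectively.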

Generation of coarse-grained solution
Two types of synchronization constructs, <S> and <await B → S>, are used in a coarse-grained solution.
Reader region:
  Entry: < await (in[W] = out[W]) → in[R]++ >
  Exit: < out[R]++ >
Writer region:
  Entry: < await ((in[R] = out[R]) ∧ (in[W] = out[W])) → in[W]++ >
  Exit: < out[W]++ >
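For comparison with the message passing algorithms discussed later, the coarse-grained solution above can be mapped to a shared-memory monitor in a few lines. The following is a minimal sketch using Python's `threading.Condition`; the class name `ReadersWriters` and the helper `_await_then_increment` are assumptions for illustration, and this is the monitor-style mapping, not the message passing algorithm of the thesis.

```python
import threading


class ReadersWriters:
    """Monitor-style sketch of the coarse-grained readers/writers solution."""

    def __init__(self):
        self._cond = threading.Condition()
        self.count = {"in_R": 0, "out_R": 0, "in_W": 0, "out_W": 0}

    def _await_then_increment(self, guard, counter):
        # <await B -> C++>: wait until the guard holds, then increment atomically.
        with self._cond:
            self._cond.wait_for(guard)
            self.count[counter] += 1
            # Any counter change may make another waiting guard true.
            self._cond.notify_all()

    def enter_reader(self):
        c = self.count
        self._await_then_increment(lambda: c["in_W"] == c["out_W"], "in_R")

    def exit_reader(self):
        # <S> with no guard: modeled as a guard that is always true.
        self._await_then_increment(lambda: True, "out_R")

    def enter_writer(self):
        c = self.count
        self._await_then_increment(
            lambda: c["in_R"] == c["out_R"] and c["in_W"] == c["out_W"], "in_W"
        )

    def exit_writer(self):
        self._await_then_increment(lambda: True, "out_W")
```

A writer calling `enter_writer()` while a reader is inside its region blocks until `exit_reader()` is called, which is the behavior the await guard prescribes.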

Translation to synchronization code
Fine-grained synchronization code in a target programming language or platform is obtained from the coarse-grained solution.
Techniques to map coarse-grained solutions to multi-threaded programs based on monitors [And91] and Java synchronized blocks [Miz99] have been proposed.
In this thesis, we focus on how to map a coarse-grained solution to fine-grained solutions in message passing systems.

Weaving the code
The final step in the methodology is to weave the synchronization code into the functional code. For example, in the active monitor approach, the monitor code and the code for the proxies are generated automatically. Furthermore, method calls are inserted at the appropriate points in the functional code.

3. Correctness Criteria
In a distributed program, we define a synchronization statement Syni associated with the entry as well as the exit of each region. Syni has one of the forms:
  < Ci++ >
  < await (Bi) → Ci++ >
where Ci is in[x] or out[x] for some region Rx, and Bi is composed of local variables.

Correctness Criteria: a simple centralized solution
[Figure: processes P1, P2, ..., Pn exchange request and reply messages with a central site (Pc) over the message passing system.]
Correctness Criteria: a distributed solution
[Figure: processes P1, P2, ..., Pn with monitors Mon1, Mon2, ..., Monn exchange request and reply messages over the message passing system.]

Correctness Criteria: a counterexample
[Figure: real-time diagram of P1, P2 and P3 issuing reqa, reqb and reqc and executing Syna, Synb and Sync; the state at real time t is marked inconsistent.]

Correctness Criteria
P': a virtual process that executes every process's synchronization statement in real time.
Each process Pi, Pj keeps local counter variables (in[R], out[R], in[W], ...). P' keeps the corresponding auxiliary shared variables (in[R]', out[R]', in[W]', ...).
I: ((in[R] = out[R]) ∨ (in[W] = out[W])) ∧ (in[W] - out[W] ≤ 1)
I': ((in[R]' = out[R]') ∨ (in[W]' = out[W]')) ∧ (in[W]' - out[W]' ≤ 1)
Definition: An algorithm A solves the region synchronization problem for invariant I if I' is an invariant of A.

4. The algorithm for a point-to-point network
Happened Before (→)
Ea → Eb if one of the following holds:
1. Ea and Eb are events in the same process, and Ea occurred before Eb.
2. Ea is the event of sending a message in a process Pi, and Eb is the event of receiving the same message in another process Pj.
3. There is an event Ec such that Ea → Ec and Ec → Eb.
Total ordering of events: Ea →tm Eb if and only if (TMEa < TMEb) ∨ ((TMEa = TMEb) ∧ (i < j)), where TMEa is the timestamp of event Ea, and Ea and Eb occur at Pi and Pj respectively.
Real-time ordering: Ea →t Eb if Ea occurs before Eb in real time.
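The timestamp-based total order can be sketched as a comparison on (timestamp, process id) pairs. The example below assumes that timestamps come from Lamport-style logical clocks, which the slides do not state explicitly; the names `precedes_tm` and `LamportClock` are hypothetical.

```python
def precedes_tm(a, b):
    """True if event a precedes event b in the timestamp order.

    a and b are (timestamp, process_id) pairs; ties on the timestamp
    are broken by the process id.
    """
    ts_a, pid_a = a
    ts_b, pid_b = b
    return ts_a < ts_b or (ts_a == ts_b and pid_a < pid_b)


class LamportClock:
    """Logical clock (assumed timestamp source for this sketch)."""

    def __init__(self, pid):
        self.pid = pid
        self.time = 0

    def tick(self):
        # Local event or message send: advance the clock.
        self.time += 1
        return (self.time, self.pid)

    def receive(self, sender_time):
        # Message receipt: jump past the sender's timestamp.
        self.time = max(self.time, sender_time) + 1
        return (self.time, self.pid)
```

With these clocks, a send event always precedes the matching receive event in the timestamp order, so the total order is consistent with happened-before.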

Definitions
Seq is a sequence of statements, each of which increments a counter.
Seq1 || ... || Seqn denotes the concurrent execution of the sequences.
{P}Seq{Q} holds if, whenever the execution of Seq begins in a state satisfying P and the execution of Seq terminates, the resulting state satisfies Q.

Definitions
The weakest precondition, wp(Seq,Q), is a predicate defining the largest set of states such that the execution of Seq in any state satisfying wp(Seq,Q) results in a state satisfying Q.
The strongest postcondition, sp(P,Seq), is a predicate defining the smallest set of states such that the execution of Seq with precondition P results in a state satisfying sp(P,Seq).

Definitions
Let Seqi and Seqj be two sequences in P and let I be a global invariant of P.
Conflict: if there exists Pi such that Pi ⇒ wp(Seqi, I) is true but {Pi} Seqi || Seqj {I} does not hold, then we say Seqj conflicts with Seqi with respect to I, denoted Seqj cf Seqi.
Enable: if there exists Pi such that Pi ⇒ wp(Seqi, I) is false but Pi ⇒ wp(Seqj;Seqi, I) is true, then we say Seqj enables Seqi with respect to I, denoted Seqj en Seqi.
[Figure: executions of Seqi alone and of Seqj;Seqi starting from Pi, marked by whether I is true in the final state.]

Notations
reqi and reqj are requests to execute Syni and Synj respectively.
ex_reqj,x denotes the event of Px executing reqj, and ex_reqi denotes the event of local execution of reqi.
[Figure: timelines of P', Px and Py with requests reqk and reqi and the events ex_reqk,x, ex_reqk, ex_reqi,y and ex_reqi.]

The algorithm for a point-to-point network
Let reqi be a request issued by Pk. We now define a set of rules that a process may follow to execute this request.
R1: ∀ reqj, where j ≠ i, if reqj cf reqi ∧ ¬(reqj en reqi), then ex_reqj →t ex_reqi ⇒ ex_reqj,k →t ex_reqi.
R2: ∀ reqj, where j ≠ i, if reqj en reqi ∧ ¬(reqj cf reqi), then ex_reqj,k →t ex_reqi ⇒ ex_reqj →t ex_reqi.
R3: ∀ reqj, where j ≠ i, if reqj cf reqi ∧ reqj en reqi, then ex_reqj,k →t ex_reqi ⇔ ex_reqj →t ex_reqi.

R1, illustrated:
[Figure: real-time timelines of P', Pk and Pl showing the events ex_reqj, ex_reqj,k and ex_reqi.]

R2, illustrated:
[Figure: real-time timelines of P', Pk and Pl showing the events ex_reqj, ex_reqj,k and ex_reqi.]

R3, illustrated:
[Figure: real-time timelines of P', Pk and Pl showing the events ex_reqj, ex_reqj,k and ex_reqi.]

Theorem: If all execution sequences of P satisfy R1, R2 and R3, then P is consistent.
Given: I is true after ex_reqi at Pk, and Pk satisfies R1, R2 and R3.
Question: Is I' true after ex_reqi at P'?

Suppose {Pi}Seqi{Qi} holds. If {Pi}Seqi'{Qi} also holds for every sequence Seqi' obtained by reordering the statements in Seqi, then we say that the triple {Pi}Seqi{Qi} is order-free.
For example, the triple {x=3 ∧ y=1} x=7; y=y+1 {x=7 ∧ y=2} is order-free.
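Order-freeness can be checked by brute force for small examples: run every permutation of the statements from each state satisfying the precondition and test the postcondition. The sketch below assumes the precondition is given as a finite list of states and that statements are functions mutating a dictionary; all names (`run`, `is_order_free`, `set_x`, `inc_y`, `copy_x_to_y`) are made up for this illustration.

```python
from itertools import permutations


def run(statements, state):
    """Execute the statements in order on a copy of the state."""
    s = dict(state)
    for statement in statements:
        statement(s)
    return s


def is_order_free(pre_states, statements, post):
    """True if every ordering of the statements establishes post
    from every given pre-state (the original order is included)."""
    return all(
        post(run(order, s))
        for s in pre_states
        for order in permutations(statements)
    )


def set_x(s):
    s["x"] = 7


def inc_y(s):
    s["y"] += 1


def copy_x_to_y(s):
    s["y"] = s["x"]


pre = [{"x": 3, "y": 1}]
```

The slide's example, `[set_x, inc_y]` with postcondition x=7 ∧ y=2, passes this check. In contrast, `[set_x, copy_x_to_y]` with postcondition x=7 ∧ y=7 does not, because running the copy first leaves y=3.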

Lemma 1: If {Pk}Seqk{Pi}Sti{I} holds and Stj is a statement in Seqk such that (Stj cf Sti) ∧ ¬(Stj en Sti), then {Pk}Seqk'{Pi}Sti{I} still holds, where Seqk' is obtained by removing Stj from Seqk.
[Figure: Seqk = Stk ... Stj-1 Stj Stj+1 ... followed by Sti, and Seqk' = Stk ... Stj-1 Stj+1 ... followed by Sti, both from Pk through Pi to I.]
To prove Lemma 1, we use the definitions of order-free, conflict and enable.

Lemma 2: If {Pk}Seqk{Pi}Sti{I} holds and Stj is a statement such that (Stj en Sti) ∧ ¬(Stj cf Sti), then {Pk}Seqk'{Pi}Sti{I} still holds, where Seqk' is obtained by adding Stj to Seqk.
[Figure: Seqk = Stk ... Stj-1 Stj+1 ... followed by Sti, and Seqk' = Stk ... Stj-1 Stj Stj+1 ... followed by Sti, both from Pk through Pi to I.]
To prove Lemma 2, we use the definitions of order-free, conflict and enable.

The proof of the theorem
R1 + R2 + R3 + Lemma 1 + Lemma 2 ⇒ Theorem

An algorithm satisfying R1, R2 and R3 may still not be starvation-free.
[Figure: a scenario at Pk in which reqi remains blocked (Bi = false) while other requests reql and reqj are executed.]
R4: ∀ reqj, where j ≠ i, if reqj cf reqi ∧ ¬(reqj en reqi), then reqj →tm reqi ⇒ ex_reqj →t ex_reqi.

The general idea of the algorithm
If a process wants to execute a conflicting statement, it sends a request to all processes and waits for the ack messages.
If a process wants to execute an enabling statement, it sends out a request.
If a process receives a request from another process, it executes the request.

The algorithm (part 1): reqj cf reqi ∧ ¬(reqj en reqi), and we assume that reqj →tm reqi.
[Figure: real-time diagram of Px and Py exchanging the conflicting requests reqj and reqi and their ack messages, with the events ex_reqj,x, ex_reqj, ex_reqi,y and ex_reqi.]
This part implements R1 and R4.

The algorithm (part 2): reqj en reqi ∧ ¬(reqj cf reqi)
[Figure: real-time diagram of Px and Py in which reqj is executed, sent out as an enable request, and executed remotely before reqi is executed.]
This part implements R2.

The algorithm (part 3): reqj cf reqi ∧ reqj en reqi
reqj is handled as a conflicting request, and it is not sent out again as an enable request after it has been executed.
[Figure: real-time diagram of Px and Py in which reqj is sent as a conflict and enable request and executed before reqi.]
This part implements R3.

Our algorithm satisfies rules R1-R4, so it is consistent.
The message complexity is less than 3 × N, where N is the number of processes.
Message passing takes place only between those processes that need to synchronize. For example, in the readers/writers problem, readers send their requests to enter the Reader region only to the writers instead of to all processes.

Fault Tolerance
We consider only node failures (process failures), not link failures. To keep things simple, we also assume that a node does not crash while sending a message.
The status of a process (whether it is in a region or not) can be identified from its last request.
Each process Pk has a new variable, LASTk,j, that records the last request received from each process Pj.

Node failure
Py detects the crashed node Pz.
Py checks LASTy,z. If the recorded request reqi is a request to enter some region, then fakeREQ is the request to exit the same region. Otherwise, fakeREQ is empty.
Py sends SomebodyDied(Pz, fakeREQ) to tell Px and Py itself that Pz has died.
Px and Py treat fakeREQ as a request from Pz and then remove Pz from their lists.
However, the algorithm has some limitations. For example, consider the case of Pj entering R1 followed by R2 in a nested manner. If Pj fails after exiting R1 and before entering R2, then other processes wanting to enter R1 may be blocked forever since Pj will never exit R2.
[Figure: Px, Py and Pz; Pz's last request is reqi, and Py sends SomebodyDied(Pz, fakeREQ).]
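The construction of fakeREQ and its handling can be sketched as follows. This is an illustrative sketch only: the `Request` type, the counter layout keyed by `("out", region)`, and the function names are assumptions, and treating an exit request as an increment of the out counter follows the coarse-grained exit statement.

```python
from typing import NamedTuple, Optional


class Request(NamedTuple):
    kind: str    # "enter" or "exit"
    region: str


def fake_request(last: Optional[Request]) -> Optional[Request]:
    """Build fakeREQ from the last request recorded for the crashed node."""
    if last is not None and last.kind == "enter":
        return Request("exit", last.region)
    return None  # empty fakeREQ


def handle_somebody_died(members, counters, dead, fake):
    """Treat fakeREQ as a request from the dead node, then drop the node."""
    if fake is not None:
        key = ("out", fake.region)
        counters[key] = counters.get(key, 0) + 1
    members.discard(dead)
```

For example, if the last request recorded for Pz is an enter request for R1, the surviving processes increment their out counter for R1 and remove Pz from their membership set.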

Node recovery
When a process Pj recovers from failure, it sends a Join(Pj) message to all other processes and waits for Agree messages from them.
When a process Pi receives the Join(Pj) message, it adds Pj to its list. Pi then sends an Agree message along with the latest request that it executed.
When Pj receives the Agree message from Pi, it checks the latest request Pi executed. If the latest request is for entering Rx, Pj increases the entry counter for Rx, in[x], by one. Otherwise, it does nothing. After it gets the Agree message, Pj can execute requests from Pi.
After Pj gets the Agree messages from all other processes, it can send its own requests to execute.
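The counter reconstruction performed by the recovering process can be sketched in a few lines. The message representation (a dictionary from sender name to its latest `(kind, region)` pair, or `None` if it has executed no request) and the function names are assumptions for this example.

```python
def apply_agree(in_counters, latest):
    """Update the recovering node's entry counters from one Agree message.

    latest is the sender's latest executed request as a (kind, region)
    pair, or None if it has not executed any request.
    """
    if latest is not None and latest[0] == "enter":
        region = latest[1]
        in_counters[region] = in_counters.get(region, 0) + 1
    return in_counters


def recover(agree_messages):
    """Rebuild the entry counters once all Agree messages have arrived."""
    in_counters = {}
    for latest in agree_messages.values():
        apply_agree(in_counters, latest)
    return in_counters


agree = {
    "P1": ("enter", "R1"),
    "P2": ("exit", "R1"),
    "P3": ("enter", "R1"),
    "P4": None,
}
```

Here `recover(agree)` counts two processes currently inside R1, since only P1 and P3 report an enter request as their latest one.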

5. Algorithm Optimizations
The algorithm that we have proposed is for the general problem of region synchronization. We would like our general algorithm to match the message complexity of algorithms that have been designed for specific synchronization problems.
For example, for the distributed mutual exclusion problem our algorithm uses the same number of messages as the algorithm proposed by Lamport, 3 × (N-1) messages. However, the Ricart-Agrawala algorithm requires only 2 × (N-1) messages.
Chandy gives an algorithm requiring between 0 and 2d messages for the dining philosophers problem, while our algorithm needs 3d messages, where d is the number of neighbors of a philosopher.
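To make the gap concrete, the message counts quoted above can be evaluated for particular values of N and d. The function names below are hypothetical; the formulas are the ones stated on the slide.

```python
def lamport_messages(n):
    """Messages per critical-section entry, Lamport's algorithm (also the
    count stated for the unoptimized general algorithm on mutual exclusion)."""
    return 3 * (n - 1)


def ricart_agrawala_messages(n):
    """Messages per critical-section entry, Ricart-Agrawala algorithm."""
    return 2 * (n - 1)


def general_dining_messages(d):
    """Messages needed by the unoptimized general algorithm, d neighbors."""
    return 3 * d


def chandy_dining_range(d):
    """Minimum and maximum messages stated for Chandy's algorithm."""
    return (0, 2 * d)
```

With N = 5 processes, the general algorithm sends 12 messages per entry against 8 for Ricart-Agrawala; with d = 2 neighbors, it sends 6 messages against at most 4.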

Optimization to handle remote requests
incl,k,y keeps track of the number of times Pl has executed Syncy until a request for Syncz from Pk arrives.
[Figure, panels (a), (b), (c): Pl and Pk with statements Syncx, Syncy and Syncz and their conflict and enable relations; Pk sends reqz and Pl replies with ack(incl,k,y).]

Optimization to handle local requests
[Figure, panels (a), (b): Pl and Pk with statements Syncx, Syncy and Syncz and a conflict relation; messages reqx, reqz, ack and ack(incl,k,y).]

Using application structure to optimize performance
[Figure, panels (a), (b): Pl and Pk with statements Syncx, Syncy and Syncz and a conflict relation; messages reqy, ack, reqz and ack(incl,k,x).]

Summary:
We have proposed a general algorithm for synchronization in point-to-point systems.
We have also shown that, after optimization, our algorithm has performance comparable to known algorithms for specific synchronization problems.

6. The algorithm for CAN based System
Introduction to CAN network
Controller Area Network (CAN) is a serial data communications bus designed to support distributed control systems by sending and receiving short real-time control messages.
CAN is a broadcast bus. Each message is identified by a message identifier.
The identifier is used not only for filtering upon reception but also for setting the priority of the message.
The CAN bus behaves like a large AND-gate for all bits sent at the same time.
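The AND-gate behavior is what lets the identifier act as a priority: during arbitration each node transmits its identifier bit by bit, a 0 bit overrides a 1 bit on the bus, and a node that sent 1 but reads 0 drops out. The simulation below is a simplified sketch that assumes standard 11-bit identifiers, distinct identifiers per contender, and at least one contender; the function name `arbitrate` is made up for this example.

```python
def arbitrate(identifiers, width=11):
    """Return the identifier that wins bitwise arbitration on a wired-AND bus.

    Assumes distinct identifiers that fit in `width` bits and a non-empty input.
    """
    contenders = set(identifiers)
    for bit in range(width - 1, -1, -1):  # most significant bit first
        sent = {ident: (ident >> bit) & 1 for ident in contenders}
        bus = min(sent.values())  # wired-AND: any 0 on the bus dominates
        # Nodes whose transmitted bit differs from the bus level back off.
        contenders = {ident for ident in contenders if sent[ident] == bus}
    (winner,) = contenders
    return winner
```

Because a 0 dominates at every bit position, the numerically lowest identifier always wins, so lower identifiers mean higher priority.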

Review of the correctness criteria
[Figure: real-time diagram of P1, P2 and P3 issuing reqa, reqb and reqc and executing Syna, Synb and Sync; the state at real time t is marked inconsistent.]

When a process (node) wants to execute a synchronization statement Syni, it must satisfy the following rules:
C1: If (∃ j, Syni cf Synj ∨ Synj cf Syni) and (∀ k, ¬(Syni en Synk)), then a request is sent out. Ci' and Ci are incremented when Bi is true locally.
C2: If (∃ j, Syni en Synj) and (∀ k, ¬(Synk cf Syni ∨ Syni cf Synk)), then a request is sent out after Bi is true locally. Thus, Ci' and Ci are incremented before the request is sent.
C3: If (∃ j, Syni cf Synj ∨ Synj cf Syni) and (∃ k, Syni en Synk), then the request is sent first. Subsequently, Ci and Ci' are incremented when Bi is true locally, and then a notify message is sent out to all other processes. Other processes increment Ci locally only on receiving this notification.
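The three rules amount to a case split on whether Syni is involved in any conflict and whether it enables any other statement. The sketch below encodes that reading, which is an interpretation of C1-C3 rather than code from the thesis; the function names, the step labels, and the `None` result for statements that neither conflict nor enable are assumptions.

```python
def can_rule(has_conflict, enables_other):
    """Select which of C1-C3 applies to a synchronization statement.

    has_conflict:  the statement conflicts with, or is conflicted by, another.
    enables_other: the statement enables some other statement.
    """
    if has_conflict and not enables_other:
        return "C1"
    if enables_other and not has_conflict:
        return "C2"
    if has_conflict and enables_other:
        return "C3"
    return None  # not covered by C1-C3 in this reading


def steps(rule):
    """Order of actions for each rule, as described in C1-C3."""
    return {
        "C1": ["send request", "wait for Bi", "increment Ci and Ci'"],
        "C2": ["wait for Bi", "increment Ci and Ci'", "send request"],
        "C3": ["send request", "wait for Bi", "increment Ci and Ci'",
               "send notify"],
    }[rule]
```

The main design difference is where the request sits relative to the increment: conflicting statements announce themselves first, while purely enabling statements execute locally and announce afterwards.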

Implementation of the distributed approach
[Figure: nodes P1, ..., Pn, each with an Application layer and an Order Control layer, connected by the CAN bus.]

Implementation of the active monitor approach
[Figure: application nodes P1, ..., Pn, each with a Proxy, and a Monitor node M with a Proxy, connected by the CAN bus.]

The example used in our implementation: the sleeping barber problem
A shop has M barbers, one chair for each barber and a waiting room with K chairs. If all barbers are busy when a customer, say A, arrives, then A waits in the waiting room (provided there is an empty chair). If a barber, say B, is free, then A sits in B's chair. After B is done cutting the hair, A leaves the shop. Subsequently, B waits for another customer to sit in its chair.
We have implemented solutions to the sleeping barber problem using the active monitor approach and the distributed approach.
The system consists of six 167CR boards connected via a 250 Kb/s CAN network: two barber nodes, three customer nodes and a noise node that is added to adjust the system load. The three customers arrive at the shop at random intervals between 0 and 25 ms. The barber serves each customer for 25 ms.


The performance analysis of two approaches
Frequency (# of noise/s): 1000, 500, 333, 250, 167, 100, 50
Customers served (Distributed Monitor): 618, 748, 779, 795, 807, 815, 823, 831
Customers served (Active Monitor): 397, 504, 538, 556, 569, 582, 590, 601
Replicated monitor outperforms active monitor by (%): 55.7, 48.4, 44.8, 43.0, 41.8, 40.0, 39.5, 38.3
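The percentage row can be reproduced from the two "customers served" rows as the relative improvement (distributed - active) / active. The short check below uses only the numbers from the table; the variable names are chosen for this example.

```python
distributed = [618, 748, 779, 795, 807, 815, 823, 831]
active = [397, 504, 538, 556, 569, 582, 590, 601]

# Relative improvement of the distributed (replicated) monitor, in percent.
improvement = [
    round((d - a) / a * 100, 1) for d, a in zip(distributed, active)
]
```

The computed values agree with the percentages reported in the table, which confirms that the last row is measured relative to the active monitor.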

7. Integration with SyncGen
Introduction to the SyncGen tool

SyncGen has two components. The front end is a coarse-grained solution generator, which translates the global invariant specification into a coarse-grained representation with await and atomic constructs and notification information. The other component consists of multiple fine-grained solution back-ends, one for each supported language (Java, C, C++, etc.).
Our purpose is to integrate our solutions for point-to-point systems as well as CAN-based systems into the SyncGen toolset as back-ends.

Back-end for point-to-point systems
Since our solution is for a point-to-point network and every node runs as a separate process on a machine, we need a configuration file to configure the network and to specify which processes go through which regions, along with the functional code.
The conflict and enable relationships can be obtained automatically from the guard expressions of the enter and exit requests in the coarse-grained solution.
A package, “edu.ksu.cis.saves.MessagePassing.P2P”, handles communication and implements the algorithm we proposed.
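One way to picture how relationships might be read off guard expressions is a brute-force check on small counter values: a counter increment conflicts with a guarded statement if it can turn the guard from true to false, and enables it if it can turn the guard from false to true. This is a simplified approximation made up for illustration, not the wp-based definitions given earlier and not SyncGen's actual analysis; the guards are those of the readers/writers example and all names are hypothetical.

```python
from itertools import product

COUNTERS = ("in_R", "out_R", "in_W", "out_W")

# Entry guards of the readers/writers example; exits are unguarded.
GUARDS = {
    "in_R": lambda s: s["in_W"] == s["out_W"],
    "in_W": lambda s: s["in_R"] == s["out_R"] and s["in_W"] == s["out_W"],
}


def valid(s):
    # Assumption: a region is never exited more often than it is entered.
    return s["out_R"] <= s["in_R"] and s["out_W"] <= s["in_W"]


def states(bound=2):
    """All valid counter states with values in 0..bound."""
    for values in product(range(bound + 1), repeat=len(COUNTERS)):
        s = dict(zip(COUNTERS, values))
        if valid(s):
            yield s


def flips(counter, guard, before, after):
    """True if incrementing counter can change the guard from before to after."""
    for s in states():
        t = dict(s)
        t[counter] += 1
        if valid(t) and guard(s) == before and guard(t) == after:
            return True
    return False


def relations():
    """Return (conflict, enable) sets of (incremented counter, guarded counter)."""
    cf, en = set(), set()
    for target, guard in GUARDS.items():
        for counter in COUNTERS:
            if flips(counter, guard, True, False):
                cf.add((counter, target))
            if flips(counter, guard, False, True):
                en.add((counter, target))
    return cf, en
```

Under this approximation, entering the Writer region conflicts with entering either region, entering the Reader region conflicts only with entering the Writer region, and the exits enable the entries whose guards mention them.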

Back-end for CAN-based systems
To develop and test our solution for CAN systems, we have developed a CAN simulator, which utilizes the broadcast feature of the CAN bus.
It uses the same configuration file as the point-to-point back-end.
The package, “edu.ksu.cis.saves.MessagePassing.CAN”, handles communication and implements the algorithm we proposed.
SyncGen generates a replicated monitor class for every process. As in the back-end for the shared-memory solution, translating the coarse-grained solution to the fine-grained solution means implementing the await and atomic constructs and the notification information.

8. An example
A simple assembly process from [Suraj, Ramaswarm and Barber 1997].
[Figure: assembly cell with Part, Conveyor, Robot, Buffer and Cutting Machine; labels In, Out, in-path and out-path.]

An example: the region synchronization diagram
[Figure: regions R1-R3 (Robot), C1-C3 (Cutter), P1-P4 (Part) and B1-B2 (Buffer), grouped into in path, cutting and out path, with arrows for counter increment and counter decrement.]

An example
[Figure: the output of the assembly example with one robot, two cutters and two parts arriving in a loop.]

An example
Consider the scenario where the robot is moving a part to one of the cutting machines while both cutters and the out buffer are busy. In this case, a deadlock occurs. [SRB97] solves this problem by using an exit-safe state.
[Figure: statechart from [SRB97] with states "on conveyor", "suspend" and "ready for pick up" and a safe exit state. Legend: e0: Sensor Signal; e1: Ready for pickup from conveyor; c6: !e0/\(in(Buffer:full) U in(Buffer:Empty) / !e1; c7: in(Buffer:Empty).]
The system is designed for the "out-path" process when the part is in the suspend state.

An example
In our design, we use two relays, Relay(B1,P1) and Relay(P2,R1), to solve this kind of synchronization problem.
However, neither solution is optimal. If the buffer is full but one of the two cutters is free, it is safe for the robot to move the part from the conveyor to the cutter. We can allow this by rewriting the invariant I of cluster P1B1C2:
out[P1] ≤ in[B1] + 2 ∧ out[C2] ≤ in[B1] ∧ in[C2] - out[C2] ≤ 1
In addition, our design is suitable for multiple robots, whereas [SRB97] would have to reconsider the synchronization among robots.

An example
Another typical synchronization problem in the assembly example is that two cutters may try to put a part in the buffer at the same time. In [SRB97], the authors do not consider this problem. To solve it, they would need to handle the change of status between the two cutters.
In our approach, we simply add a bound to region C2, so that the two cutters cannot be in region C2 at the same time.

9. Summary and further work
The contribution
We described a methodology for synthesizing synchronization code from high-level specifications in a distributed system.
We developed algorithms to execute the coarse-grained solution in point-to-point and CAN-based message passing systems.
We presented optimizations that exploit the synchronization structure of the problem as well as the structure of the application to improve performance.
We integrated our solutions for message passing systems into the SyncGen tool.

The future work
We gave a preliminary algorithm for fault tolerance that has several limitations, including the one for nested regions. We need to improve our algorithm so that it can handle node failure and node recovery in all cases.
We will extend our work to solve more complex synchronization problems where variables other than the in and out counters are allowed in the invariants.

Thanks. Any questions?
