Download presentation
Presentation is loading. Please wait.
Published byCory Ellis Modified over 9 years ago
1
COORDINATION, SYNCHRONIZATION AND LOCKING WITH ISIS2 Ken Birman 1 Cornell University
2
Isis 2 has many options for coordination 2 Within a group of processes there are many ways in which you might want coordinated or synchronized behavior Isis 2 can support all of them, but because there are many patterns, the topic isn’t trivial!
3
Examples 3 Primary/Backup fault-tolerance Group g receives a request A and B are assigned to handle it If A succeeds, it sends a reply and we’re done If A fails, B takes over
4
Examples 4 Coordinator/Cohort fault-tolerance Group g receives a request A and B are assigned to handle it, but in such a way that every request has its own primary (“coordinator”) and every request has one or more backups (“cohort”) If A succeeds, it sends a reply and we’re done If A fails, B takes over Updates issued to the group state by A prior to failing are visible to B when it takes over
5
Examples 5 Periodic action based on a timer A clock is running, and every X ms, the group members jointly perform some action They could each take some “part” of a shared task Or the action could be performed in primary/backup style with a primary member initiating it but others standing ready to help if the primary fails
6
Examples 6 Locking The group manages some form of data. The items have an associated key Members can obtain a lock on the key Mutex locks: while a member holds the key, no other member can access the key Read/Write locks: Read locks allow other readers but Write locks exclude both readers and writers
7
Barrier synchronization 7 Group has some form of task to do An initiator sends a request to start the members working on that task, then wishes to wait until they are all finished The members function as “workers”. As each finishes it signals that it has finished its part
8
Summary of Options 8 GoalPurposeTechnique to Use Primary/Backup“There can only be one” (primary, that is) Group of size 2. Primary is the member with rank=0 Coordinator/Cohort“Many hands, light work” Group of any size. Rotate the roles… Periodic ActionsHeart beatRank 0 member sends a timing signal Locking: Mutex“One at a time”Isis 2 locking tool (basic) Locking: Read/Write LocksDatabasesIsis 2 locking tool (fancy) Barrier Synchronization“Is everyone finished?”An Isis 2 Query
9
Primary - Backup 9
10
… Details 10 Primary backup Easiest: Form a group with 2 members Rule: The member with rank 0 is the primary, the member with rank 1 is a backup To update data, use the Isis 2 Send primitive, but call g.Flush() before talking to an external user of the service If the group updates databases or file storage may need to use SafeSend. This topic is covered in a different module. If a failure occurs, a new view will signal that the backup is now the new primary It picks up in the same state that the old primary was in
11
Issues with primary/backup 11 Notice that Isis 2 lacks a way to send a multicast to the group plus one external member So suppose an outside person asks the group to do something, like in our old picture: Backup will know what the primary intended to do But did the primary actually send the response? Did it reach the user? Time Update the monitoring and alarms criteria for Mrs. Marsh as follows… Confirmed Response delay seen by end-user would include Internet latencies Service response delay Service instance
12
This problem can’t be solved! 12 In the Isis 2 system, we can’t atomically send a message to the external user in such a way that the backup will be certain it was sent. So… the backup must re-send the last message(s) to the user! … but how? Time Update the monitoring and alarms criteria for Mrs. Marsh as follows… Confirmed Response delay seen by end-user would include Internet latencies Service response delay Service instance Confirmed
13
UDP, TCP-R… 13 With UDP the backup might be able to just send the identical reply. With TCP, the user sees the connection break and if the confirmation wasn’t received, may need to re-issue the request. The backup (the new primary) would sense that this is a repeated request and resend the old reply Cornell has a technology called TCP-R. With it a single TCP connection can be “taken over” by the backup. With TCP-R the backup can finish the sending the primary was in the midst of doing even if it crashed
14
Coordinator Cohort 14
15
… More details 15 Coordination-Cohort Really the identical idea, but now we relay the request into the group, using OrderedSend. If the group updates databases or file storage may need to use SafeSend. A topic covered in a different module. Some simple rule should be used to map requests to the members: not just “rank 0 is the primary” but “rank K will be the primary for request R” For example, the request could contain some sort of identification data, or we could compute a hashcode Any rule that all members can apply will do the trick In other ways, just like primary-backup
16
Which multicast should we use? 16 If external users connect in a load-balanced way, each member can Handle read-only work locally Use OrderedSend to relay and work that updates the group state.. If the group updates databases or file storage may need to use SafeSend. With OrderedSend, always call Flush if important updates were done and we are able to respond to the external user If external users all connect to the rank-0 member then we can just use Send, but we risk overloading that member if everyone connects to the same one
17
Periodic Actions with Timer 17
18
Periodic Actions 18 Easiest solution: Have the member with rank 0 launch a thread This thread loops Wait for K ms (use Thread.Sleep() or a timed call to WaitOne on a semaphore that is always 0) Then issue a g.Send to “ping everyone” Latencies of g.Send in an otherwise idle group will be very low and the jitter even smaller Action should be taken by everyone within a millisecond or two
19
Periodic Actions 19 Fancier solutions are also possible There is a literature on real-time actions in fault- tolerant systems If you are facing a “mission critical” need you might consider using such a solution For example, every member could run its own timer, and every member could send a “ping” On receiving “ping at time T” from a majority of members of the current view, take the action Such a solution will be far more robust, but more costly
20
Read and Write Locks 20
21
Locking 21 Isis 2 supports group-wide locks If you want local locking, don’t use this tool Basic API: g.WriteLock(“name” [, timeout]). g.Lock() for short. g.ReadLock(“name” [, timeout]) g.Unlock(“name”) You can also control the persistency of the lock state of the group and the way that failures are handled
22
Rules… 22 Lock requests are handled one by one in order Locks on different “names” don’t conflict Locks on the same name: Write locks exclude all other locks Read locks: allow further read locks, until a write lock is waiting. But then read locks wait behind the write lock Timeout: Causes a “cancel” request to be sent
23
Handling of failures 23 If the lock holder fails, the default action is to “break” the lock (release it). Read locks always act this way. For write locks, you can specify that instead that a broken lock be passed to the rank-0 group member A lock-transfer upcall event notifies you when this occurs You would need to code whatever handling you desire. It will run in the rank-0 member as necessary.
24
Lock-State Persistence 24 The lock manager state is stored in a data structure that can live in memory or be retained on disk The default configuration keeps the structure in memory If you override this and request persistent locking, we use SafeSend instead of OrderedSend, and the locking state will be saved on disk. In the default case, if the group terminates the lock state is discarded. In the persistent case lock state is retained even across group shutdowns
25
When is lock persistence important? 25 If your database will be used across periods when the whole group shuts down, we would say that the database itself is External (not in-memory) Persistent In such cases you’ll use SafeSend to update the database. And for this case may want to make the lock service state persistent too. DB 1 DB 2
26
Barrier Synchronization 26
27
Barrier Synchronization 27 This is easy achieved using g.Query/OrderedQuery Query can initiate the computation, or could simply specify “which” computation you have in mind, if you have many running in parallel. Group members use g.Reply() when they reach the barrier point. Sender waits for all to reply
28
Barrier Synchronization: Failures 28 We recommend that in the Reply you send The size of the group view when computation started Which rank this particular member had in the group … just record these values when the Query arrives Then do g.Reply(myRank, N, other data…) Caller can thus verify that it received all N replies. If not, it knows that some member crashed
29
Summary of Options 29 GoalPurposeTechnique to Use Primary/Backup“There can only be one” (primary, that is) Group of size 2. Primary is the member with rank=0 Coordinator/Cohort“Many hands, light work”Group of any size. Rotate the roles… Periodic ActionsHeart beatRank 0 member sends a timing signal Locking: Mutex“One at a time”Isis 2 locking tool (basic) Locking: Read/Write LocksDatabases, 2PL. Isis 2 has locks but doesn’t implement transactions. Isis 2 locking tool (fancy) Caution: make the lock state as persistent as the database! Barrier Synchronization“Is everyone finished?”An Isis 2 Query
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.