EEC 688/788 Secure and Dependable Computing

EEC 688/788 Secure and Dependable Computing
Lecture 13 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University

EEC688/788: Secure & Dependable Computing
Outline Group communication systems Ordered multicast Techniques to implement ordered multicast Group membership service Agreed and safe delivery Checkpointing and recovery Reference: Reliable distributed systems, by K. P. Birman, Springer; Chapter 14-16 4/17/2019 EEC688/788: Secure & Dependable Computing

Group Communication System
Services provided by the GCS Membership service: who is up and who is down Deals with failure detection and more Reliable, ordered, multicast service FIFO, causal, total Virtual synchrony service Virtual synchrony synchronizes membership change with multicasts GCS is often used to build fault tolerant systems 4/17/2019 EEC688/788: Secure & Dependable Computing

Reliable Multicast Reliable multicast – the message is targeted to multiple receivers, and all receivers receive the message reliably Positive or negative acknowledgement Need to avoid ack/nack implosion Distinguish receiving from delivery! Application Delivering Middleware Receiving 4/17/2019 EEC688/788: Secure & Dependable Computing

Ordered Reliable Multicast
Ordered reliable multicast – if many messages are multicast by many senders, in what order the messages are delivered at the receivers? First in first out (FIFO) Causal – the causal relationship among msgs preserved Total – all msgs are delivered at all receivers in the same order 4/17/2019 EEC688/788: Secure & Dependable Computing

FIFO Ordered Multicast
FIFO or sender ordered multicast: Messages are delivered in the order they were sent (by any single sender) a e p q r s b c d delivery of c to p is delayed until after b is delivered 4/17/2019 EEC688/788: Secure & Dependable Computing

Causally Ordered Multicast
Causal or happens-before ordering: If send(a)  send(b) then deliver(a) occurs before deliver(b) at common destinations a p q r s b 4/17/2019 EEC688/788: Secure & Dependable Computing

Causal or happens-before ordering: If send(a)  send(b) then deliver(a) occurs before deliver(b) at common destinations a p q r s b c delivery of c to p is delayed until after b is delivered 4/17/2019 EEC688/788: Secure & Dependable Computing

Causal or happens-before ordering: If send(a)  send(b) then deliver(a) occurs before deliver(b) at common destinations a e p q r s b c delivery of c to p is delayed until after b is delivered e is sent (causally) after b 4/17/2019 EEC688/788: Secure & Dependable Computing

Causal or happens-before ordering: If send(a)  send(b) then deliver(a) occurs before deliver(b) at common destinations a Remember this is about multicast, every message is sent to all processes e p q r s b c d delivery of c to p is delayed until after b is delivered delivery of e to r is delayed until after b&c are delivered 4/17/2019 EEC688/788: Secure & Dependable Computing

Totally Ordered Multicast
Total ordering: Messages are delivered in same order to all recipients (including the sender) a e p q r s b c d all deliver a, b, c, d, then e 4/17/2019 EEC688/788: Secure & Dependable Computing

Implementing Total Ordering
Use a token that moves around Token has a sequence number When you hold the token you can send the next burst of multicasts Use a sequencer to order all multicast Message is first multicast to all, including the sequencer; then the sequencer determines the order for the message and informs all Or send to the sequencer and the sequencer multicast with total order information Each sender can take turn to serve as the sequencer 4/17/2019 EEC688/788: Secure & Dependable Computing

Group membership service
Input: Process “join” events Process “leave” events Apparent failures Output: Membership views for group(s) to which those processes belong 4/17/2019 EEC688/788: Secure & Dependable Computing

Issues? The service itself needs to be fault-tolerant Otherwise our entire system could be crippled by a single failure! Hence Group Membership Service (GMS) must run some form of protocol (GMP) 4/17/2019 EEC688/788: Secure & Dependable Computing

Approach Assume that GMS has members {p,q,r} at time t Designate the “oldest” of these as the protocol “leader” To initiate a change in GMS membership, leader will run the GMP Others can’t run the GMP; they report events to the leader 4/17/2019 EEC688/788: Secure & Dependable Computing

GMP Example p q r Example: Initially, GMS consists of {p,q,r} Then q is believed to have crashed 4/17/2019 EEC688/788: Secure & Dependable Computing

Unreliable Failure Detection
Recall that failures are hard to distinguish from network delay So we accept risk of mistake If p is running a protocol to exclude q because “q has failed”, all processes that hear from p will cut channels to q Avoids “messages from the dead” q must rejoin to participate in GMS again In udp, there is no connection to cut. However, you can check the sender of each multicast packet, you want to drop it if it comes from someone labeled as failed 4/17/2019 EEC688/788: Secure & Dependable Computing

Basic GMP Someone reports that “q has failed” Leader (process p) runs a 2-phase commit protocol Announces a “proposed new GMS view” Excludes q, or might add some members who are joining, or could do both at once Waits until a majority of members of current view have voted “ok” Then commits the change 4/17/2019 EEC688/788: Secure & Dependable Computing

GMP Example Proposed V1 = {p,r} Commit V1 p q r OK V0 = {p,q,r} V1 = {p,r} Proposes new view: {p,r} [-q] Needs majority consent: p itself, plus one more (“current” view had 3 members) Can add members at the same time 4/17/2019 EEC688/788: Secure & Dependable Computing

Special Concerns? What if someone doesn’t respond? P can tolerate failures of a minority of members of the current view New first-round “overlaps” its commit: “Commit that q has left. Propose add s and drop r” P must wait if it can’t contact a majority Avoids risk of partitioning 4/17/2019 EEC688/788: Secure & Dependable Computing

What If Leader Fails? Here we do a 3-phase protocol New leader identifies itself based on age ranking (oldest surviving process) It runs an inquiry phase “The adored leader has died. Did he say anything to you before passing away?” Note that this causes participants to cut connections to the adored previous leader Then run normal 2-phase protocol but “terminate” any interrupted view changes leader had initiated 4/17/2019 EEC688/788: Secure & Dependable Computing

GMP Example p Inquire [-p] Proposed V1 = {r,s} Commit V1 q r OK: nothing was pending OK V0 = {p,q,r} V1 = {r,s} New leader first sends an inquiry Then proposes new view: {r,s} [-p] Needs majority consent: q itself, plus one more (“current” view had 3 members) Again, can add members at the same time 4/17/2019 EEC688/788: Secure & Dependable Computing

Safe and Agreed Delivery
For totally ordered reliable multicast, there are two delivery policies Safe delivery: a message is delivered only when all correct processes have received it Agreed delivery: a message is delivered as long as it is the next message in total order 4/17/2019 EEC688/788: Secure & Dependable Computing

Safe and Agreed Delivery
Safe delivery guarantees the uniformity of multicast: If a message is delivered to any process, it is delivered by all correct processes Agreed delivery does not: It is possible that a message is delivered in one (or more) process, but is not delivered by some correct process Why? Because the sender of the message and the one receiver both die at the same time, no other processes get the chance to receive the message 4/17/2019 EEC688/788: Secure & Dependable Computing

Checkpointing and Recovery
Faults occur over time. How to ensure a fault tolerant system remain operational for extensive period of time? Recover failed replicas, or replace failed replicas with new one => Recovery is needed How to recover a failed replica or install a new replica? Checkpointing a correct replica and transfer the state to the recovering replica 4/17/2019 EEC688/788: Secure & Dependable Computing

Checkpointing Checkpointing: the act of taking a snapshot of an entity so that we can restore it later A replica is a process running in an operating system. The state of a process Processes' memory, stack and registers Threads Open or mmap'ed files Current working directory Interprocess communication: Semaphores, shared memory, pipes, sockets Dynamic Load Libraries … 4/17/2019 EEC688/788: Secure & Dependable Computing

Checkpointing Many tools are available to perform checkpointing transparently or semi-transparently Condor, libckpt, etc. Checkpoints taken in general are not portable Checkpoint size might be big 4/17/2019 EEC688/788: Secure & Dependable Computing

Checkpointing of Application State
Sometimes it is more efficient to save and store the application state only Checkpoints can be very portable and compact in size class Counter { int counter; Counter(int initVal) { counter = initVal; } void increment() {counter++; } void decrement() {counter--; } void setState(int c) {counter = c; } int getState() { return counter;}| } What is the state variable? Int counter; // a single integer 4/17/2019 EEC688/788: Secure & Dependable Computing

Logging Logging of messages Checkpointing in general is expensive Logging of messages is cheaper => we can periodically do checkpointing, or do checkpointing on demand and log all messages in between Logging of other non-deterministic activities Access order to shared data Systematic checkpointing is not practical 4/17/2019 EEC688/788: Secure & Dependable Computing

Roll-Forward Recovery
With replication in space, it is possible to recover a fault while the system is progressing ahead Roll-forward recovery is made possible by Checkpointing of replica state Logging of incoming messages Reliable, totally ordered group communication system 4/17/2019 EEC688/788: Secure & Dependable Computing

We want to ensure the newly admitted replica to have a consistent state with others when it starts Steps of adding a new replica into a group (with on-demand checkpointing) A recovered (or a new) replica joins a group A join message is multicast in total order On receiving the join message, it is put into incoming message queue and wait for processing When the join message is at the head of the queue, a checkpoint is taken and it is transferred to the new replica 4/17/2019 EEC688/788: Secure & Dependable Computing

At the new replica, it starts queueing messages after it receives the join messages (sent by itself) When the checkpoint is received by the new replica, its state is restored using the received checkpoint (the checkpoint is delivered out of order!) The queued messages are delivered in order, at the new replica Other replicas do not stop and wait for the new replica Steps of adding a new replica into a group with periodic checkpointing is similar 4/17/2019 EEC688/788: Secure & Dependable Computing

Steps of Roll-Forward Recovery
4/17/2019 EEC688/788: Secure & Dependable Computing

Roll-backward Recovery
Roll-backward recovery is used for systems relying on replication in time for fault tolerance When a failure occurs, roll back using the most recent checkpoint (and retry) 4/17/2019 EEC688/788: Secure & Dependable Computing

Roll-backward Recovery in a Distributed System
Performing roll-backward recovery in a distributed system is non-trivial Need to solve the distributed snapshot problem It is easy to perform a local checkpoint of a process, but in a distributed system, when one process rolls back, other processes must also roll back to a consistent state 4/17/2019 EEC688/788: Secure & Dependable Computing

EEC 688/788 Secure and Dependable Computing

Similar presentations

Presentation on theme: "EEC 688/788 Secure and Dependable Computing"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

EEC 688/788 Secure and Dependable Computing

Similar presentations

Presentation on theme: "EEC 688/788 Secure and Dependable Computing"— Presentation transcript:

Similar presentations

About project

Feedback