1
EEC 688/788 Secure and Dependable Computing Lecture 13 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University wenbing@ieee.org
2
6/25/2015 EEC688/788: Secure & Dependable Computing, Wenbing Zhao
Outline
Fault tolerance and replication
Event ordering
Group communication systems
  – Ordered multicast
  – Techniques to implement ordered multicast
Reference:
  – Reliable distributed systems, by K. P. Birman, Springer; Chapters 14-16
3
Redundancy
To tolerate faults, some form of redundancy must be used
  – Replication in time (transaction processing)
  – Replication in space
  – Redundancy in software design (n-version programming)
The three types of redundancy are complementary to each other
4
Replication in Time
Replication in time
  – From the present state, apply one or more operations
  – If the system fails before completion, abort and roll back to the original state
  – Start all over again
Example: transaction processing
  – Essence: roll-backward recovery
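The abort, roll back, and retry cycle on this slide can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the state is an in-memory dictionary, and the names `TransientFailure` and `run_with_rollback` are hypothetical, not taken from the lecture or from any transaction system.

```python
import copy


class TransientFailure(Exception):
    """Hypothetical exception used to simulate a fault in mid-execution."""


def run_with_rollback(state, operations, max_attempts=3):
    """Apply all operations or none; on failure, roll back and start over.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        checkpoint = copy.deepcopy(state)    # remember the present state
        try:
            for op in operations:
                op(state)                    # each operation mutates the state
            return attempt                   # every operation completed
        except TransientFailure:
            state.clear()
            state.update(checkpoint)         # roll back to the original state
    raise RuntimeError("gave up after %d attempts" % max_attempts)
```

If an operation raises `TransientFailure` halfway through, the partial updates made by earlier operations are discarded before the next attempt, which is the roll-backward recovery named above.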
5
Replication in Space
Replication in space:
  – Run multiple instances (replicas) of the system so that if one fails, one or more replicas can take over
  – Must assume independent faults
  – Must ensure consistency among the replicas
6
N-version Programming
N-version programming (redundancy in software design)
  – The system (or component) has n different designs and implementations
  – To cope with permanent software bugs, a different version is used for each replica, or for repeated executions
7
Replication is not a Trivial Task
Suppose we want to replicate a server using the most popular (and inexpensive) approach
  – We run two servers on separate computers
  – The primary sends a log (its state, and/or logged incoming messages) to the backup
  – If the primary crashes, the backup soon catches up and can take over
8
Split Brain Syndrome
[Figure labels: primary, backup, log]
Clients are initially connected to the primary, which keeps the backup up to date. The backup collects the log and updates its state.
9
Split Brain Syndrome
[Figure labels: primary, backup]
A transient problem causes some links to break, but not all. The backup thinks it is now the primary; the primary thinks the backup is down.
10
Split Brain Syndrome
[Figure labels: primary, backup]
Some clients are still connected to the primary, but one has switched to the backup and one is completely disconnected from both.
11
Replica Consistency
Replicas must be coordinated appropriately so that we can achieve strong replica consistency:
  – At the end of each execution step, all replicas must have the same state
  – The outputs from every replica must be consistent at all times
12
Replica Consistency
Techniques to achieve strong replica consistency
  – As a prerequisite, we need agreement on the membership of the replica group
  – Ensure that the same set of inputs is delivered to all replicas, in the same order
  – Ensure deterministic execution of each input at every replica (applicable to active and semi-active replication)
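A small Python sketch can show why both conditions matter: the same inputs in the same order, and deterministic execution. The command set (`add`, `mul`) and the function names are made up for illustration; any deterministic transition function would serve.

```python
def apply_command(state, command):
    """Deterministic transition: the result depends only on state and command."""
    op, value = command
    if op == "add":
        return state + value
    if op == "mul":
        return state * value
    raise ValueError("unknown op: %r" % (op,))


def run_replica(initial, commands):
    """Execute the commands one by one and return the final state."""
    state = initial
    for command in commands:
        state = apply_command(state, command)
    return state


inputs = [("add", 5), ("mul", 3), ("add", 1)]
replica_a = run_replica(10, inputs)          # same inputs, same order
replica_b = run_replica(10, inputs)
reordered = run_replica(10, [("mul", 3), ("add", 5), ("add", 1)])
```

Here `replica_a` and `replica_b` end in the same state, while `reordered` receives the same set of inputs in a different order and diverges, because addition and multiplication do not commute.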
13
One Copy Image
When a component or system is replicated, we must ensure that the replicated unit appears as a single copy to external components/systems that interact with it
Need a gateway between the replicated and non-replicated units
Alternatively, the gateway module can be integrated with the non-replicated unit
Main tasks of the gateway
  – Multicast requests/replies to the replicas
  – Detect and suppress duplicate replies/requests
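The two gateway tasks can be sketched as follows. This is a toy, single-process model under stated assumptions: replicas are plain Python callables, every request carries a unique `request_id`, and all replicas are correct, so keeping the first reply is enough. The class name `Gateway` and its methods are hypothetical.

```python
class Gateway:
    """Presents a group of replicas to the client as a single copy."""

    def __init__(self, replicas):
        self.replicas = replicas       # callables: request -> reply
        self.completed = {}            # request_id -> reply already returned

    def invoke(self, request_id, request):
        # Duplicate request (for example a client retry): do not re-execute.
        if request_id in self.completed:
            return self.completed[request_id]
        # "Multicast" the request to every replica.
        replies = [replica(request) for replica in self.replicas]
        # Suppress duplicate replies: the client sees exactly one.
        reply = replies[0]
        self.completed[request_id] = reply
        return reply
```

A real gateway would also have to handle replicas that fail or answer late; that is left out here.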
14
Replication Styles
Active replication
  – Every input (request) is executed by every replica
  – Every replica generates the outputs (replies)
  – Voting is needed to cope with non-fail-stop faults
Passive replication
  – One of the replicas is designated as the primary replica
  – Only the primary replica executes requests
  – The state of the primary replica is transferred to the backups periodically or after every request is processed
Semi-active replication
  – One of the replicas is designated as the leader (or primary)
  – The leader determines the order of execution
  – Every input is executed by every replica per the leader’s instruction
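As an illustration of the passive style, the sketch below lets only the primary execute requests and copies its state to the backups after every request. It is a simplified in-memory model: the classes `Replica` and `PassiveGroup`, the counter state, and the "promote the next replica" failover rule are all assumptions made for the example.

```python
import copy


class Replica:
    def __init__(self):
        self.state = {"counter": 0}

    def execute(self, amount):
        self.state["counter"] += amount
        return self.state["counter"]

    def install_checkpoint(self, snapshot):
        self.state = copy.deepcopy(snapshot)


class PassiveGroup:
    """Only the primary executes; backups receive its state after each request."""

    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.primary = 0               # index of the current primary

    def request(self, amount):
        primary = self.replicas[self.primary]
        reply = primary.execute(amount)
        # State transfer to the surviving backups.
        for backup in self.replicas[self.primary + 1:]:
            backup.install_checkpoint(primary.state)
        return reply

    def fail_primary(self):
        # Hypothetical failover rule: promote the next backup in line.
        if self.primary + 1 >= len(self.replicas):
            raise RuntimeError("no backup left to promote")
        self.primary += 1
```

Because the backup installed the latest checkpoint, it can continue from the primary's last state after a failover. Transferring state only periodically would be cheaper but would require replaying logged requests.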
15
Active Replication
[Figure labels: Actively Replicated Client Object A; Actively Replicated Server Object B; RM; Duplicate Invocation Suppressed; Duplicate Responses Suppressed]
16
Active Replication with Voting
Question: to cope with f (non-malicious) faults, how many replicas are needed?
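One way to explore the question is with a simple majority voter. The sketch assumes a trusted voter and faulty replicas that may return a wrong value; under that assumption, 2f + 1 replicas guarantee that the at least f + 1 correct replies outnumber the at most f faulty ones. The function name `vote` is hypothetical, and the slide itself does not state the answer.

```python
from collections import Counter


def vote(replies, f):
    """Return the value reported by at least f + 1 replicas, or None."""
    value, count = Counter(replies).most_common(1)[0]
    return value if count >= f + 1 else None
```

With three replies and f = 1, a single wrong reply is outvoted. With only two replies, one wrong reply leaves no value that f + 1 replicas agree on, so the voter cannot decide.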
17
Passive Replication
[Figure labels: Passively Replicated Client Object A; Passively Replicated Server Object B; Primary Replica; RM; Invocation; Response; State Transfer; State]
Question: can passive replication tolerate non-fail-stop faults?
18
Semi-Active Replication
[Figure labels: Semi-Actively Replicated Client Object A; Semi-Actively Replicated Server Object B; Primary Replica; RM; Invocation; Response; Ordering info]
19
Ensuring Strong Replica Consistency
Possible strategies
  – For active replication, use a group communication system that guarantees total ordering of all messages (plus deterministic processing in each replica)
  – Passive replication with systematic checkpointing
  – Semi-active replication
  – Use two-phase commit
20
Total Ordering of Messages
What is total ordering of messages?
  – All replicas receive the same set of messages in the same order
  – Atomic multicast: if a message is delivered to one replica, it is also delivered to all correct replicas
With replication, we need to ensure total ordering of messages sent by a group of replicas to another group of replicas
  – FIFO ordering between one sender and a group is not sufficient
[Figure labels: m1, m2]
21
Potential Sources of Non-determinism
Multithreading
  – The order in which threads access shared data might differ across replicas
System calls/library calls
  – A call might succeed at one replica while the same call fails at another, e.g., memory allocation, file access
Host/process-specific information
  – Host name, process id, etc.
  – Local clocks: gettimeofday()
Interrupts
  – Delivered and handled asynchronously: a big problem
22
Event Ordering
“Time, Clocks, and the Ordering of Events in a Distributed System”, by Leslie Lamport, Communications of the ACM, July 1978, Volume 21, Number 7, pp. 558-565
  – What usually matters is not that all processes agree on exactly what time it is, but rather that they agree on the order in which events occur
23
Happens-Before Relation
Assumptions:
  – The system is composed of a collection of processes; each process consists of a sequence of events
  – The events of a process form a sequence, where a occurs before b in this sequence if a happens before b
  – The sending or receiving of a message is an event in a process
24
Happens-Before Relation
The happens-before relation “→” on the set of events of a system is the smallest relation satisfying the following three conditions:
  – If a and b are events in the same process, and a comes before b, then a → b
  – If a is the sending of a message by one process and b is the receipt of the same message by another process, then a → b
  – If a → b and b → c, then a → c
25
Partial Ordering
Not all events are related by happens-before
Two distinct events a and b are said to be concurrent if neither a → b nor b → a
  – Neither event can causally affect the other
  – This introduces a partial ordering of events in a system with concurrently operating processes
“a happens before b” means that information can flow from a to b
“a is concurrent with b” means that there is no information flow between a and b
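To make the definitions concrete, the following sketch builds the happens-before relation for a small, made-up execution and lists the concurrent pairs. It applies the three conditions directly (program order, send before receive, transitivity). The function names and the event labels are assumptions chosen for the example.

```python
from itertools import combinations


def happens_before(process_events, messages):
    """Return the set of pairs (a, b) such that a -> b.

    process_events: one event sequence per process, in program order.
    messages: (send_event, receive_event) pairs.
    """
    events = [e for seq in process_events for e in seq]
    hb = set(messages)                       # send -> receive
    for seq in process_events:
        hb.update(zip(seq, seq[1:]))         # order within one process
    # Transitive closure (Warshall's algorithm, intermediate event k outermost).
    for k in events:
        for a in events:
            for b in events:
                if (a, k) in hb and (k, b) in hb:
                    hb.add((a, b))
    return hb


def concurrent_pairs(process_events, messages):
    """Return the pairs of distinct events not related in either direction."""
    hb = happens_before(process_events, messages)
    events = [e for seq in process_events for e in seq]
    return {(a, b) for a, b in combinations(events, 2)
            if (a, b) not in hb and (b, a) not in hb}
```

For example, with processes P1 = [p1, p2] and P2 = [q1, q2] and one message sent at p1 and received at q2, the events p1 and q1 are concurrent, whereas p1 → q2.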
26
How to Capture the Partial Ordering?
Use logical clocks to capture the partial ordering
  – Define a clock C_i for each process P_i. Assign a number C_i(a) to any event a in that process
  – The entire system of clocks is represented by the function C, which assigns to any event b the number C(b), where C(b) = C_j(b) if b is an event in process P_j
  – The clocks C_i are logical clocks rather than physical clocks
27
Lamport Clock
A Lamport logical clock is a monotonically increasing software counter
Each process P_i keeps its own logical clock C_i, which it uses to apply Lamport timestamps to events
To capture the happens-before relation →, processes must do the following:
  – Before each event at P_i: C_i := C_i + 1
  – When P_i sends a message m, it piggybacks t = C_i
  – When P_j receives (m, t): C_j := max(C_j, t) + 1
e → e’ ⇒ C(e) < C(e’) (the converse does not hold in general)
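The three clock rules translate almost line for line into code. The class below is a minimal sketch; the names `LamportClock`, `local_event`, `send`, and `receive` are chosen for the example, and message transport is left to the caller.

```python
class LamportClock:
    """Logical clock C_i of one process."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1                      # before each event: C_i := C_i + 1
        return self.time

    def send(self):
        self.time += 1                      # sending is itself an event
        return self.time                    # timestamp t piggybacked on m

    def receive(self, t):
        self.time = max(self.time, t) + 1   # C_j := max(C_j, t) + 1
        return self.time
```

A receive always gets a larger timestamp than the matching send, so e → e’ implies C(e) < C(e’). Two concurrent events may still receive equal or ordered timestamps, which is why the timestamps alone cannot be used to conclude that one event happened before the other.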
28
Lamport Clock: An Example
29
Group Communication System
Services provided by the GCS
  – Membership service: who is up and who is down
      – Deals with failure detection and more
  – Reliable, ordered multicast service
      – FIFO, causal, total
  – Virtual synchrony service
      – Virtual synchrony synchronizes membership changes with multicasts
GCS is often used to build fault-tolerant systems
30
Reliable Multicast
Reliable multicast: the message is targeted to multiple receivers, and all receivers receive the message reliably
  – Positive or negative acknowledgement
  – Need to avoid ack/nack implosion
Distinguish receiving from delivery!
[Figure labels: Application, Middleware, Receiving, Delivering]
31
Ordered Reliable Multicast
Ordered reliable multicast: if many messages are multicast by many senders, in what order are the messages delivered at the receivers?
  – First in, first out (FIFO)
  – Causal: the causal relationship among messages is preserved
  – Total: all messages are delivered at all receivers in the same order
32
FIFO Ordered Multicast
FIFO or sender-ordered multicast: messages are delivered in the order they were sent (by any single sender)
[Figure: timelines of processes p, q, r, s with multicasts a, b, c, d, e]
Delivery of c to p is delayed until after b is delivered
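A common way to obtain FIFO delivery is a per-sender sequence number plus a hold-back queue at each receiver. The sketch below assumes that senders number their multicasts 0, 1, 2, ...; the class name `FifoReceiver` and its fields are hypothetical. It also shows the difference between receiving a message and delivering it to the application.

```python
class FifoReceiver:
    """Delivers each sender's messages in the order they were sent."""

    def __init__(self):
        self.next_seq = {}     # sender -> next sequence number to deliver
        self.held = {}         # (sender, seq) -> payload received, not delivered
        self.delivered = []    # payloads handed to the application, in order

    def receive(self, sender, seq, payload):
        expected = self.next_seq.get(sender, 0)
        if seq < expected:
            return             # duplicate of a message already delivered
        self.held[(sender, seq)] = payload
        # Deliver as many consecutive messages from this sender as possible.
        while (sender, expected) in self.held:
            self.delivered.append(self.held.pop((sender, expected)))
            expected += 1
        self.next_seq[sender] = expected
```

If c (sequence number 1) arrives before b (sequence number 0) from the same sender, c is held back and both are delivered once b arrives. Messages from different senders are not ordered relative to each other.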
33
Causally Ordered Multicast
Causal or happens-before ordering: if send(a) → send(b), then deliver(a) occurs before deliver(b) at common destinations
[Figure: timelines of processes p, q, r, s with multicasts a, b]
34
Causally Ordered Multicast
Causal or happens-before ordering: if send(a) → send(b), then deliver(a) occurs before deliver(b) at common destinations
[Figure: timelines of processes p, q, r, s with multicasts a, b, c]
Delivery of c to p is delayed until after b is delivered
35
Causally Ordered Multicast
Causal or happens-before ordering: if send(a) → send(b), then deliver(a) occurs before deliver(b) at common destinations
[Figure: timelines of processes p, q, r, s with multicasts a, b, c, e]
Delivery of c to p is delayed until after b is delivered
e is sent (causally) after b
36
Causally Ordered Multicast
Causal or happens-before ordering: if send(a) → send(b), then deliver(a) occurs before deliver(b) at common destinations
[Figure: timelines of processes p, q, r, s with multicasts a, b, c, d, e]
Delivery of c to p is delayed until after b is delivered
Delivery of e to r is delayed until after b and c are delivered
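The slides do not say how causal delivery is implemented. One well-known technique, used here purely as an assumption, is to stamp each multicast with a vector timestamp and hold a message back until everything it causally depends on has been delivered. The class `CausalProcess` and its methods are hypothetical names, and the group is fixed at n processes.

```python
class CausalProcess:
    """Group member that delivers multicasts in causal order."""

    def __init__(self, pid, n):
        self.pid = pid
        self.vc = [0] * n      # vc[k]: multicasts from process k delivered here
        self.held = []         # received but not yet deliverable
        self.delivered = []

    def multicast(self, payload):
        """Stamp an outgoing message; the sender delivers it to itself at once."""
        self.vc[self.pid] += 1
        self.delivered.append(payload)
        return (self.pid, list(self.vc), payload)

    def _deliverable(self, sender, ts):
        # Next message from this sender, and no missing causal predecessor.
        return (ts[sender] == self.vc[sender] + 1 and
                all(ts[k] <= self.vc[k] for k in range(len(ts)) if k != sender))

    def receive(self, message):
        self.held.append(message)
        progress = True
        while progress:        # one delivery may unblock other held messages
            progress = False
            for m in list(self.held):
                sender, ts, payload = m
                if self._deliverable(sender, ts):
                    self.held.remove(m)
                    self.vc[sender] += 1
                    self.delivered.append(payload)
                    progress = True
```

If process 1 multicasts b after delivering a, then b's timestamp records the dependency on a. A third process that receives b first holds it back until a has been delivered.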
37
Totally Ordered Multicast
Total ordering: messages are delivered in the same order to all recipients (including the sender)
[Figure: timelines of processes p, q, r, s with multicasts a, b, c, d, e]
All deliver a, b, c, d, then e
38
Implementing Total Ordering
Use a token that moves around
  – The token has a sequence number
  – When you hold the token, you can send the next burst of multicasts
Use a sequencer to order all multicasts
  – The message is first multicast to all, including the sequencer; then the sequencer determines the order for the message and informs all
  – Or send to the sequencer, and the sequencer multicasts with total order information
  – Each sender can take turns serving as the sequencer
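The second sequencer variant (send to the sequencer, which multicasts with total order information) can be sketched as follows. The model assumes a single fixed sequencer that never fails and leaves out the network; `Sequencer` and `TotalOrderReceiver` are hypothetical names.

```python
class Sequencer:
    """Assigns consecutive global sequence numbers to messages."""

    def __init__(self):
        self.next_seq = 0

    def order(self, payload):
        stamped = (self.next_seq, payload)
        self.next_seq += 1
        return stamped         # this is what would be multicast to the group


class TotalOrderReceiver:
    """Delivers messages strictly in global sequence-number order."""

    def __init__(self):
        self.expected = 0
        self.held = {}         # seq -> payload received out of order
        self.delivered = []

    def receive(self, stamped):
        seq, payload = stamped
        self.held[seq] = payload
        while self.expected in self.held:
            self.delivered.append(self.held.pop(self.expected))
            self.expected += 1
```

Receivers may receive the stamped messages in different orders, but every receiver delivers them in the sequencer's order. The token-based scheme reaches the same result by letting the token holder assign the sequence numbers.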