2
Fault-Tolerant Distributed Computing Systems
3
Fundamentals
- What is a fault?
  - A fault is a blemish, weakness, or shortcoming of a particular hardware or software component.
  - Faults, errors, and failures
- Why fault tolerance?
  - Availability, reliability, dependability, ...
- How do we provide fault tolerance?
  - Replication
  - Checkpointing and message logging
  - Hybrid approaches
4
Message Logging
- Tolerates crash failures.
- Each process periodically records its local state and logs the messages received after that state.
  - Once a crashed process recovers, its state must be consistent with the states of the other processes.
  - Orphan processes: surviving processes whose states are inconsistent with the recovered state of a crashed process.
  - Message logging protocols guarantee that upon recovery no process is an orphan.
5
Message Logging Protocols
- Pessimistic message logging
  - Avoids the creation of orphans during execution: no process p sends a message m until it knows that all messages delivered before sending m are logged; recovery is quick.
  - Can block a process for each message it receives, which slows down throughput.
  - Allows processes to communicate only from recoverable states: any information that may be needed for recovery is synchronously logged to stable storage before the process is allowed to communicate (see the sketch below).
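A minimal sketch of this discipline, with assumed names (PessimisticLogger, onReceive, and the rest are illustrative, not taken from any of the systems covered later): every received message is forced to stable storage before it is delivered, so nothing the process subsequently sends can depend on an unlogged receive.

```java
import java.io.FileOutputStream;
import java.io.IOException;

// Sketch of pessimistic message logging: every received message is forced to
// stable storage before it is delivered, so a later crash cannot create orphans.
public class PessimisticLogger {

    public record Message(int source, int ssn, int dest, byte[] payload) {}

    private final FileOutputStream stableLog;

    public PessimisticLogger(String logPath) throws IOException {
        this.stableLog = new FileOutputStream(logPath, true); // append-only log
    }

    // Called by the transport layer when a message arrives.
    public void onReceive(Message m) throws IOException {
        byte[] entry = encode(m);
        stableLog.write(entry);
        stableLog.flush();
        stableLog.getFD().sync();   // block until the entry is durable
        deliver(m);                 // only now may the application see it
    }

    // Sending never waits on logging: everything delivered so far is already
    // on stable storage, so recovery can replay the receives in order.
    public void send(Message m) { /* hand to the transport layer */ }

    private void deliver(Message m) { /* application-level delivery */ }

    private byte[] encode(Message m) {
        // Minimal encoding: header fields followed by the payload.
        byte[] header = String.format("%d,%d,%d,%d:", m.source(), m.ssn(),
                m.dest(), m.payload().length).getBytes();
        byte[] entry = new byte[header.length + m.payload().length];
        System.arraycopy(header, 0, entry, 0, header.length);
        System.arraycopy(m.payload(), 0, entry, header.length, m.payload().length);
        return entry;
    }
}
```

The synchronous sync() call is exactly where the throughput cost mentioned above comes from.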
6
Message Logging (continued)
- Optimistic message logging
  - Takes appropriate actions during recovery to eliminate all orphans.
  - Gives better performance during failure-free runs.
  - Allows processes to communicate from non-recoverable states; failures may cause these states to become permanently unrecoverable, forcing the rollback of any process that depends on such states (see the sketch below).
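For contrast, a minimal sketch of the optimistic discipline, again with illustrative names only: receives are buffered in volatile memory and written out by a background thread, so the process never blocks on logging, at the price of possible orphans if it crashes before the flush.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of optimistic message logging: receives are buffered in volatile
// memory and flushed asynchronously, so the process may send from states
// whose logs have not yet reached stable storage.
public class OptimisticLogger {

    public record Message(int source, int ssn, int dest, byte[] payload) {}

    private final BlockingQueue<Message> volatileLog = new LinkedBlockingQueue<>();

    public OptimisticLogger() {
        Thread flusher = new Thread(this::flushLoop, "async-log-flusher");
        flusher.setDaemon(true);
        flusher.start();
    }

    public void onReceive(Message m) {
        volatileLog.add(m);   // buffer only; no synchronous disk write
        deliver(m);           // deliver immediately for low failure-free overhead
    }

    public void send(Message m) {
        // May run while some delivered messages are still unflushed: the
        // receiver of m becomes an orphan if we crash before the flush.
    }

    private void flushLoop() {
        try {
            while (true) {
                Message m = volatileLog.take();
                writeToStableStorage(m);   // asynchronous, typically batched
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void deliver(Message m) { /* application-level delivery */ }

    private void writeToStableStorage(Message m) { /* append to the on-disk log */ }
}
```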
7
Causal Message Logging
- Creates no orphans when failures happen and does not block processes when failures do not occur.
- Weakens the condition imposed by pessimistic protocols: the state from which a process communicates may be unrecoverable because of a failure, but only if that does not affect consistency.
- Appends to all communication the information needed to recover the state from which the communication originates; this information is thus replicated in the memory of the processes that causally depend on the originating state (see the sketch below).
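A rough sketch of the causal idea, with hypothetical names and a deliberately simplified determinant-tracking scheme (real protocols prune the piggybacked set): determinants of received messages travel on outgoing messages, so they end up replicated in the memory of every causally dependent process.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of causal message logging: instead of a synchronous disk write, the
// determinant of each receive is piggybacked on outgoing messages, so it is
// replicated wherever a causal dependency on that receive exists.
public class CausalLogger {

    // A determinant records what is needed to replay a receive: who sent it,
    // its send sequence number, who received it, and its delivery order.
    public record Determinant(int source, int ssn, int dest, int dsn) {}

    public record Message(int source, int ssn, int dest, byte[] payload,
                          List<Determinant> piggyback) {}

    private final int myId;
    private int nextDsn = 0;
    private int nextSsn = 0;

    // Determinants this process knows about but that may not yet be stable.
    private final Set<Determinant> knownDeterminants = new HashSet<>();

    public CausalLogger(int myId) { this.myId = myId; }

    public void onReceive(Message m) {
        // Record our own receive and every determinant the sender piggybacked:
        // we now causally depend on those receives and help keep them recoverable.
        knownDeterminants.add(new Determinant(m.source(), m.ssn(), myId, nextDsn++));
        knownDeterminants.addAll(m.piggyback());
        deliver(m);
    }

    public Message prepareSend(int dest, byte[] payload) {
        // Append the known determinants to the outgoing message.
        return new Message(myId, nextSsn++, dest, payload,
                new ArrayList<>(knownDeterminants));
    }

    private void deliver(Message m) { /* application-level delivery */ }
}
```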
8
KAN – A Reliable Distributed Object System
- Developed at UC Santa Barbara
- Project goals:
  - Language support for parallelism and distribution
  - Transparent location/migration/replication
  - Optimized method invocation
  - Fault tolerance
  - Composition and proof reuse
9
System Description
[Diagram: the Kan compiler translates Kan source into Java bytecode, which runs with the Kan run-time libraries on a JVM and communicates over UNIX sockets.]
10
Fault Tolerance in Kan
- Log-based forward recovery scheme:
  - The log of recovery information for a node is maintained externally, on other nodes.
  - Failed nodes are recovered to their pre-failure states, and the correct nodes keep their states as of the time of the failures.
- Only node crash failures are considered:
  - A processor stops taking steps, and failures are eventually detected.
11
Basic Architecture of the Fault Tolerance Scheme
[Diagram: physical node i hosts logical nodes x and y together with a fault detector, a failure handler, a request handler, and a communication layer; it is reached through its IP address over the network, and its external log is kept elsewhere.]
12
Logical Ring
- A logical ring is used to minimize the need for global synchronization and recovery.
- The ring is used only for logging (remote method invocations).
- Two parts:
  - A static part containing the active correct nodes. It has a leader and a sense of direction: upstream and downstream.
  - A dynamic part containing the nodes that are trying to join the ring.
- A logical node is logged at the next T physical nodes in the ring, where T is the maximum number of node failures to tolerate (see the placement sketch below).
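A small sketch of the placement rule, assuming the ring is just an ordered list of physical node ids (an assumption for illustration, not Kan's actual data structure):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of choosing where a logical node's log lives: it is replicated on the
// next T physical nodes downstream of the node that hosts it, where T is the
// number of node failures to tolerate.
public class RingLogPlacement {

    // Returns the T physical nodes downstream of `host` that should hold the log.
    public static List<Integer> logSites(List<Integer> ring, int host, int t) {
        int pos = ring.indexOf(host);
        if (pos < 0) throw new IllegalArgumentException("host not in ring");
        List<Integer> sites = new ArrayList<>();
        for (int k = 1; k <= t; k++) {
            sites.add(ring.get((pos + k) % ring.size()));   // wrap around the ring
        }
        return sites;
    }

    public static void main(String[] args) {
        List<Integer> ring = List.of(3, 7, 1, 9, 4);   // static part, in ring order
        // Tolerate T = 2 failures: logical nodes hosted on node 1 are logged at 9 and 4.
        System.out.println(logSites(ring, 1, 2));      // prints [9, 4]
    }
}
```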
13
Logical Ring Maintenance
Each node i participating in the protocol maintains the following variables (gathered in the sketch below):
- Failed_i(j): true if i has detected the failure of j
- Map_i(x): the physical node on which logical node x resides
- Leader_i: i's view of the leader of the ring
- View_i: i's view of the logical ring (membership and order)
- Pending_i: the set of physical nodes that i suspects of failing
- Recovery_count_i: the number of logical nodes that need to be recovered
- Ready_i: records whether i is active
  - There is an initial set of ready nodes; new nodes become ready when they are linked into the ring.
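Gathered into one hypothetical class, the per-node state might look as follows; the field types are an assumption for illustration, not Kan's code.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of the per-node bookkeeping, with fields named after the slide's variables.
public class RingNodeState {
    final int myId;                                    // physical node i

    final Set<Integer> failed = new HashSet<>();       // Failed_i(j): detected failures
    final Map<Integer, Integer> map = new HashMap<>(); // Map_i(x): logical -> physical node
    int leader;                                        // Leader_i: i's view of the leader
    List<Integer> view;                                // View_i: membership and ring order
    final Set<Integer> pending = new HashSet<>();      // Pending_i: suspected physical nodes
    int recoveryCount = 0;                             // Recovery_count_i: nodes to recover
    boolean ready;                                     // Ready_i: linked into the ring yet?

    RingNodeState(int myId, List<Integer> initialView, int leader, boolean ready) {
        this.myId = myId;
        this.view = initialView;
        this.leader = leader;
        this.ready = ready;   // true only for the initial set of ready nodes
    }
}
```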
14
Failure Handling
When node i is informed of the failure of node j (the case analysis is sketched in code below):
- If every node upstream of i has failed, then i must become the new leader. It remaps all logical nodes from the failed upstream physical nodes, informs the other correct nodes by sending a remap message, and then recovers those logical nodes.
- If the leader has failed but there is some upstream node k that will become the new leader, then i just updates its map and leader variables to reflect the new situation.
- If the failed node j is upstream of i, then i just updates its map. If i is the next downstream node from j, it also recovers the logical nodes from j.
- If j is downstream of i and there is some node k downstream of j, then i just updates its map.
- If j is downstream of i and there is no node downstream of j, then i waits for the leader to update the map.
- If i is the leader and must recover j, then it changes its map, sends a remap message so the other correct nodes change their maps, and recovers all logical nodes that are now mapped locally.
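The case split above can be sketched as follows, reusing the RingNodeState sketch and leaving the ring arithmetic and recovery actions as hypothetical helper stubs.

```java
// Sketch of the case analysis node i performs when it learns that physical
// node j has failed. The helpers (remapFrom, sendRemap, recoverLogicalNodesOf,
// and the ring-position predicates) are placeholders, not Kan's code.
public class FailureHandler {
    private final RingNodeState st;

    FailureHandler(RingNodeState st) { this.st = st; }

    void onFailure(int j) {
        st.failed.add(j);
        if (allUpstreamFailed()) {
            // Every node upstream of i has failed: i becomes the new leader,
            // remaps the failed nodes' logical nodes, tells the other correct
            // nodes via a remap message, and recovers those logical nodes.
            st.leader = st.myId;
            remapFrom(upstreamNodes());
            sendRemap();
            recoverLogicalNodesOf(upstreamNodes());
        } else if (j == st.leader) {
            // The leader failed but some upstream node survives and takes over:
            // i only updates its map and leader variables.
            st.leader = nextSurvivingUpstream();
            remapFrom(java.util.List.of(j));
        } else if (isUpstream(j)) {
            // j was upstream of i: update the map; if i is the next downstream
            // node after j, i also recovers j's logical nodes.
            remapFrom(java.util.List.of(j));
            if (isNextDownstreamOf(j)) recoverLogicalNodesOf(java.util.List.of(j));
        } else if (hasSurvivorDownstreamOf(j)) {
            // j was downstream of i and someone survives below j: just remap.
            remapFrom(java.util.List.of(j));
        } else {
            // j was the last node downstream of i: wait for the leader's remap.
        }
    }

    // --- placeholder helpers standing in for the real ring arithmetic ---
    private boolean allUpstreamFailed() { return false; }
    private java.util.List<Integer> upstreamNodes() { return java.util.List.of(); }
    private int nextSurvivingUpstream() { return st.myId; }
    private boolean isUpstream(int j) { return false; }
    private boolean isNextDownstreamOf(int j) { return false; }
    private boolean hasSurvivorDownstreamOf(int j) { return true; }
    private void remapFrom(java.util.List<Integer> failedNodes) { }
    private void sendRemap() { }
    private void recoverLogicalNodesOf(java.util.List<Integer> nodes) { }
}
```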
15
Physical Node and Leader Recovery
- When a physical node comes back up:
  - It sends a join message to the leader.
  - The leader tries to link the node into the ring (see the message sketch below):
    - Acquire / Grant
    - Add / Ack_add
    - Release
- When the leader fails, the next downstream node in the ring becomes the new leader.
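One way to picture the handshake: the message names follow the slide, while the ordering and participants shown in the comments are an illustrative interpretation.

```java
// Sketch of the join handshake as a message sequence. Only the message names
// come from the slide; the rest is an assumption for illustration.
public class JoinProtocol {
    enum Msg { JOIN, ACQUIRE, GRANT, ADD, ACK_ADD, RELEASE }

    // Rough order of messages when physical node p comes back up:
    //   p      -> leader            : JOIN
    //   leader -> affected nodes    : ACQUIRE   (they reply GRANT)
    //   leader -> affected nodes    : ADD p     (they reply ACK_ADD)
    //   leader -> affected nodes    : RELEASE   (p is now linked into the ring)
    public static void main(String[] args) {
        for (Msg m : Msg.values()) System.out.println(m);
    }
}
```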
16
AQuA
- Adaptive Quality of Service for Availability
- Developed at UIUC and BBN
- Goal:
  - Allow distributed applications to request and obtain a desired level of availability.
- Fault tolerance through:
  - Replication
  - Reliable messaging
17
Features of AQuA
- Uses the QuO runtime to process and make availability requests.
- Uses the Proteus dependability manager to configure the system in response to faults and availability requests.
- Uses Ensemble to provide group communication services.
- Provides a CORBA interface to application objects through the AQuA gateway.
18
Proteus Functionality
- Deciding how to provide fault tolerance for an application (a hypothetical policy sketch follows this list):
  - Style of replication (active, passive)
  - Voting algorithm to use
  - Degree of replication
  - Types of faults to tolerate (crash, value, or time)
  - Location of replicas
- Deciding how to implement the chosen fault-tolerance scheme:
  - Dynamic configuration modification
  - Starting/killing replicas, activating/deactivating monitors and voters
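The first set of choices can be pictured as a small policy object; the record and enum names below are illustrative, not AQuA's actual interfaces.

```java
import java.util.EnumSet;
import java.util.List;

// Sketch of the kind of dependability policy Proteus chooses for an application:
// replication style, voter, replication degree, fault classes to tolerate, and
// replica placement.
public class DependabilityPolicy {

    enum ReplicationStyle { ACTIVE, PASSIVE }
    enum FaultClass { CRASH, VALUE, TIME }
    enum Voter { MAJORITY, FIRST_RESPONSE }

    record Policy(ReplicationStyle style,
                  Voter voter,
                  int degree,
                  EnumSet<FaultClass> tolerated,
                  List<String> replicaHosts) {}

    public static void main(String[] args) {
        // Example: active replication with majority voting, 3 replicas,
        // tolerating crash and value faults on three named hosts.
        Policy p = new Policy(ReplicationStyle.ACTIVE, Voter.MAJORITY, 3,
                EnumSet.of(FaultClass.CRASH, FaultClass.VALUE),
                List.of("hostA", "hostB", "hostC"));
        System.out.println(p);
    }
}
```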
19
Group Structure
For reliable multicast and point-to-point communication:
- Replication groups
- Connection groups
- Proteus Communication Service group
  - For the replicated Proteus manager: the manager replicas and the objects that communicate with the manager (e.g., notification of a view change, a new QuO request)
  - Ensures that all manager replicas receive the same information
- Point-to-point groups
  - Proteus manager to object factory
20
AQuA Architecture
21
Fault Model, Detection, and Handling
- Object fault model:
  - Object crash failure: occurs when an object stops sending out messages; its internal state is lost. A crash failure of an object is due to the crash of at least one element composing the object.
  - Value faults: a message arrives in time but with the wrong content (caused by the application or the QuO runtime). Detected by a voter (a toy voter is sketched below).
  - Time faults: detected by a monitor.
- Leaders report faults to Proteus; Proteus kills the faulty objects if necessary and generates new objects.
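A toy majority voter for value-fault detection, purely illustrative (AQuA's gateway voter is more elaborate): replies whose content disagrees with the majority are flagged, which is the kind of information a leader would report to Proteus.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Sketch of value-fault detection by majority voting over replica replies.
public class MajorityVoter {

    record Reply(String replicaId, String content) {}

    // Returns the majority content if one exists.
    static Optional<String> vote(List<Reply> replies) {
        Map<String, Integer> counts = new HashMap<>();
        for (Reply r : replies) counts.merge(r.content(), 1, Integer::sum);
        return counts.entrySet().stream()
                .filter(e -> 2 * e.getValue() > replies.size())
                .map(Map.Entry::getKey)
                .findFirst();
    }

    // Replicas whose reply differs from the majority are reported (e.g. so the
    // dependability manager can kill and restart them).
    static List<String> suspectedFaulty(List<Reply> replies, String majority) {
        return replies.stream()
                .filter(r -> !r.content().equals(majority))
                .map(Reply::replicaId)
                .toList();
    }

    public static void main(String[] args) {
        List<Reply> replies = List.of(new Reply("r1", "42"),
                                      new Reply("r2", "42"),
                                      new Reply("r3", "41"));
        String majority = vote(replies).orElseThrow();
        System.out.println("majority = " + majority
                + ", value faults at " + suspectedFaulty(replies, majority));
    }
}
```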
22
AQuA Gateway Structure
23
Egida
- Developed at UT Austin
- An object-oriented, extensible toolkit for low-overhead fault tolerance
- Provides a library of objects that can be used to compose log-based rollback-recovery protocols
  - Includes a specification language for expressing arbitrary rollback-recovery protocols
24
Log-Based Rollback Recovery
- Checkpointing
  - Independent, coordinated, or induced by specific patterns of communication
- Message logging
  - Pessimistic, optimistic, or causal
25
Core Building Blocks
- Almost all log-based rollback-recovery protocols share an event-driven structure (sketched below).
- The common events are:
  - Non-deterministic events
    - Orphans, determinants
  - Dependency-generating events
  - Output-commit events
  - Checkpointing events
  - Failure-detection events
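A sketch of that shared event-driven skeleton, with illustrative names loosely following the categories above: a protocol is just a set of handlers attached to a small set of event kinds.

```java
// Sketch of the common event-driven structure of log-based rollback-recovery
// protocols: different protocols differ only in which handlers they plug in.
public class RecoveryEvents {

    enum EventKind {
        NON_DETERMINISTIC,      // e.g. a receive whose determinant must be saved
        DEPENDENCY_GENERATING,  // e.g. a send that creates a causal dependency
        OUTPUT_COMMIT,          // output to the outside world must first be made safe
        CHECKPOINT,             // take a local or coordinated checkpoint
        FAILURE_DETECTION       // a failure notification triggers recovery
    }

    interface EventHandler {
        void handle(EventKind kind, Object payload);
    }

    public static void main(String[] args) {
        // Pessimistic, optimistic, and causal logging would each supply their
        // own handler for NON_DETERMINISTIC events.
        EventHandler logSynchronously =
                (kind, payload) -> System.out.println(kind + ": log before proceeding");
        logSynchronously.handle(EventKind.NON_DETERMINISTIC, "receive m");
    }
}
```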
26
A grammar for specifying rollback-recovery protocols

Protocol        := <log-event>* <checkpointing> <output-commit>opt
<log-event>     := <event> : determinant <how-to-log> : <log-event-info> <how-to-log> on <where-to-log>
<output-commit> := output commit on <event>
<event>         := send | receive | read | write
<determinant>   := {source, ssn, dest, dsn}
<checkpointing> := independent | co-ordinated
<how-to-log>    := synchronously | asynchronously
<where-to-log>  := local disk | volatile memory of self
27
Egida Modules
- EventHandler
- Determinant
- HowToOutputCommit
- LogEventDeterminant
- LogEventInfo
- HowToLog
- WhereToLog
- StableStorage
- VolatileStorage
- Checkpointing
- ...
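A loose illustration of how building blocks like these might be composed into concrete protocols; this is a sketch of the toolkit idea only, not Egida's real class hierarchy or API.

```java
// Sketch of composing "how to log" and "where to log" building blocks into a
// protocol object: swapping the parts yields different rollback-recovery protocols.
public class ProtocolComposition {

    interface WhereToLog { void append(byte[] entry); }
    interface HowToLog  { void log(byte[] entry, WhereToLog target); }

    static class LocalDisk implements WhereToLog {
        public void append(byte[] entry) { /* write to the on-disk log */ }
    }
    static class VolatileMemory implements WhereToLog {
        public void append(byte[] entry) { /* keep in an in-memory buffer */ }
    }
    static class Synchronously implements HowToLog {
        public void log(byte[] entry, WhereToLog target) { target.append(entry); /* then force to disk */ }
    }
    static class Asynchronously implements HowToLog {
        public void log(byte[] entry, WhereToLog target) { /* enqueue for a flusher thread */ }
    }

    record Protocol(HowToLog howToLog, WhereToLog whereToLog) {
        void onNonDeterministicEvent(byte[] determinant) {
            howToLog.log(determinant, whereToLog);
        }
    }

    public static void main(String[] args) {
        // Receiver-based pessimistic logging: synchronous writes to local disk.
        Protocol pessimistic = new Protocol(new Synchronously(), new LocalDisk());
        // Optimistic logging: asynchronous writes, initially to volatile memory.
        Protocol optimistic = new Protocol(new Asynchronously(), new VolatileMemory());
        pessimistic.onNonDeterministicEvent("m1-determinant".getBytes());
        optimistic.onNonDeterministicEvent("m2-determinant".getBytes());
    }
}
```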