Causal Logging: Manetho
Rohit C Fernandes
10/25/01
Manetho System Model
- Non-deterministic events
  - Message receive
  - Internal event (kernel call)
  - Creation of a new process
- Output commit
- Stable storage + volatile memory
Manetho Properties
- Tolerates any number of simultaneous failures
- Low failure-free overhead
- Only failed processes roll back
Example Manetho Execution
Causal Logging: Intuition
- Piggyback the determinant of each non-deterministic event on outgoing messages
- Determinant?
- Piggyback antecedence graphs
Antecedence Graph
- Directed acyclic graph
- Nodes: state intervals
- Edges: happened-before (immediate)
Antecedence Graph
Receive Node
- Two incoming edges
- Fields
  - Receiver ID
  - Sender ID
  - Index of created state interval
  - Unique identifier of message
Internal Event Node
- One incoming edge
- Fields
  - Type of event
  - Replay information
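
The two node types can be sketched as plain records. This is a minimal illustration, not Manetho's actual data layout: the class names, the `(process, index)` key for a state interval, and the `process_id` and `interval_index` fields on the internal event node are assumptions added so that each node can be stored in one graph.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple, Union

# Assumption: a state interval is keyed by (process id, interval index).
IntervalId = Tuple[str, int]


@dataclass(frozen=True)
class ReceiveNode:
    receiver_id: str
    sender_id: str
    interval_index: int  # index of the state interval this receive created
    message_id: str      # unique identifier of the message


@dataclass(frozen=True)
class InternalEventNode:
    process_id: str      # assumption: added only to key the node
    interval_index: int  # assumption: added only to key the node
    event_type: str      # type of event
    replay_info: bytes   # information needed to replay the event


Node = Union[ReceiveNode, InternalEventNode]


@dataclass
class AntecedenceGraph:
    nodes: Dict[IntervalId, Node] = field(default_factory=dict)
    edges: Set[Tuple[IntervalId, IntervalId]] = field(default_factory=set)

    def add_receive(self, node: ReceiveNode, sender_interval: int) -> IntervalId:
        """Two incoming edges: the receiver's previous interval and the
        sender's interval at the time of the send."""
        key = (node.receiver_id, node.interval_index)
        self.nodes[key] = node
        self.edges.add(((node.receiver_id, node.interval_index - 1), key))
        self.edges.add(((node.sender_id, sender_interval), key))
        return key

    def add_internal(self, node: InternalEventNode) -> IntervalId:
        """One incoming edge: the process's previous interval."""
        key = (node.process_id, node.interval_index)
        self.nodes[key] = node
        self.edges.add(((node.process_id, node.interval_index - 1), key))
        return key

    def in_degree(self, key: IntervalId) -> int:
        return sum(1 for _, dst in self.edges if dst == key)
```

Keeping edges as a set of (source, destination) pairs makes merging two graphs a plain set union, which is convenient when graphs are piggybacked on messages.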
Failure-Free Operation
- Each process maintains
  - AG of its current state interval
  - Log that contains the data and ID of each message sent
- Message send: piggyback AG of current state interval
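
A rough sketch of failure-free operation, under simplifying assumptions: the AG is a dictionary mapping each state interval to the set of its immediate predecessors, the complete AG is piggybacked on every send, and `Message`, `Process`, and their fields are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    msg_id: tuple         # (sender id, send sequence number)
    sender: str
    sender_interval: int  # sender's state interval when the message was sent
    data: str
    piggyback: dict       # sender's AG: interval -> set of predecessors


@dataclass
class Process:
    pid: str
    interval: int = 0
    send_seq: int = 0
    ag: dict = field(default_factory=dict)
    send_log: dict = field(default_factory=dict)  # volatile log of sent messages

    def __post_init__(self):
        self.ag[(self.pid, 0)] = frozenset()  # initial state interval

    def send(self, data: str) -> Message:
        self.send_seq += 1
        msg_id = (self.pid, self.send_seq)
        self.send_log[msg_id] = data  # log the data and ID of the message
        # Piggyback a copy of the AG of the current state interval.
        return Message(msg_id, self.pid, self.interval, data, dict(self.ag))

    def receive(self, msg: Message) -> None:
        self.ag.update(msg.piggyback)  # merge the sender's AG
        previous = (self.pid, self.interval)
        self.interval += 1             # a receive starts a new state interval
        self.ag[(self.pid, self.interval)] = frozenset(
            {previous, (msg.sender, msg.sender_interval)}
        )
```

For example, after `q.receive(p.send("x"))` the new interval of `q` has two predecessors: the previous interval of `q` and the interval of `p` in which the message was sent.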
Optimization
- Need not send the complete AG
- Incremental piggybacking
  - AG(σ_p^i) is a proper subgraph of AG(σ_p^(i+1)), where σ_p^i is the i-th state interval of process p
  - Process q communicates to p the max j such that σ_p^j is in q's AG
  - p sends AG(σ_p^i) - AG(σ_p^j)
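
The increment can be computed as a set difference over graph nodes. The sketch below assumes the AG is a dictionary from state interval, keyed as `(process, index)`, to the set of its immediate predecessors; the function names are illustrative only.

```python
def subgraph_of(ag, interval):
    """Return AG(interval): the interval plus everything that happened before it."""
    seen = set()
    stack = [interval]
    while stack:
        node = stack.pop()
        if node in seen or node not in ag:
            continue
        seen.add(node)
        stack.extend(ag[node])  # walk back along happened-before edges
    return {node: ag[node] for node in seen}


def max_known_interval(receiver_ag, sender_pid):
    """Largest j such that interval j of the sender is in the receiver's AG."""
    indices = [i for (pid, i) in receiver_ag if pid == sender_pid]
    return max(indices) if indices else None


def increment(sender_ag, sender_pid, i, j):
    """Nodes of AG(interval i) that are not already in AG(interval j)."""
    full = subgraph_of(sender_ag, (sender_pid, i))
    if j is None:  # the receiver knows nothing about the sender yet
        return full
    known = subgraph_of(sender_ag, (sender_pid, j))
    return {node: preds for node, preds in full.items() if node not in known}
```

If p is in interval 2 and q reports j = 1, only the node for interval 2 of p (and anything new it depends on) is piggybacked.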
Information on Stable Storage
- Checkpoints
- AG (saved asynchronously): need not piggyback the part of the AG that is on disk
- Output commit: save AG to disk
Incarnation Numbers
- Each process starts a new incarnation after recovery
- Integer stored in stable storage
- Tagged on outgoing messages
- Messages from old incarnations are discarded
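
A small sketch of how incarnation numbers filter stale messages. The dictionary passed as `stable` is only a stand-in for stable storage, and the class and method names are hypothetical.

```python
class IncarnationState:
    def __init__(self, pid, stable):
        self.pid = pid
        self.stable = stable  # stand-in for stable storage
        self.stable.setdefault("INCNUM", 0)
        self.incvec = {}      # latest incarnation number known for each peer

    def start_new_incarnation(self):
        """Called after recovery: bump the stored integer and return it."""
        self.stable["INCNUM"] += 1
        return self.stable["INCNUM"]

    def tag(self, data):
        """Tag an outgoing message with the current incarnation number."""
        return {"sender": self.pid, "incnum": self.stable["INCNUM"], "data": data}

    def learn(self, pid, incnum):
        """Record a peer's new incarnation number."""
        self.incvec[pid] = max(incnum, self.incvec.get(pid, 0))

    def accept(self, msg):
        """Discard messages sent by an older incarnation of the sender."""
        return msg["incnum"] >= self.incvec.get(msg["sender"], 0)
```

A message tagged before the sender's failure is rejected once the receiver has learned the sender's new incarnation number, while messages from the new incarnation pass.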
Recovery Protocol
Recover(p, c, INCNUM, S)
Step 1
  INCNUM ← INCNUM + 1; save INCNUM
  INCVEC[p] ← INCNUM
  G ← AG(σ_p^c)   // stable storage
Recovery Protocol
Step 2
  For all q ∈ S, q ≠ p:
    (INQ, AGQ) ← remote call at q: GET_AG(p)
    G ← G ∪ AGQ
    INCVEC[q] ← INQ
  For all q ∈ S, q ≠ p:
    Remote call at q: SEND_INC(p, INCVEC)
Recovery Protocol
Step 3
  m ← max j such that σ_p^j ∈ G
  Recover up to σ_p^m
  - Don't send out application messages, but log them
  - For a receive, request the message from the sender's log
  - Replay internal events
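
The three steps can be sketched as a single function. This is an in-process illustration under several assumptions: remote calls are ordinary method calls on hypothetical `Peer` objects, stable storage is a dictionary, the AG maps each state interval `(process, index)` to its predecessors, and the replay itself (re-requesting messages from senders' logs, replaying internal events) is left out.

```python
class Peer:
    """Stand-in for a surviving process q reachable by remote call."""

    def __init__(self, pid, incnum, ag):
        self.pid = pid
        self.incnum = incnum
        self.ag = ag
        self.incvec = {}

    def get_ag(self, p):
        # Return this peer's incarnation number and its antecedence graph.
        return self.incnum, dict(self.ag)

    def send_inc(self, p, incvec):
        # Learn the incarnation numbers collected by the recovering process.
        self.incvec.update(incvec)


def recover(p, c, stable, peers):
    # Step 1: start a new incarnation and load the AG saved with checkpoint c.
    stable["INCNUM"] += 1
    incvec = {p: stable["INCNUM"]}
    g = dict(stable["AG"][c])

    # Step 2: gather AGs and incarnation numbers, then broadcast INCVEC.
    others = [q for q in peers.values() if q.pid != p]
    for q in others:
        inq, agq = q.get_ag(p)
        g.update(agq)  # G <- G union AGQ
        incvec[q.pid] = inq
    for q in others:
        q.send_inc(p, incvec)

    # Step 3: the last state interval of p that any surviving AG depends on.
    m = max((i for (pid, i) in g if pid == p), default=c)
    return g, incvec, m
```

The returned `m` is the interval up to which p must replay; intervals of p beyond `m` are not recorded in any surviving AG, so no other process depends on them.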
Recovery Example
Available Antecedence Graphs
Application Characteristics
Performance Overhead
Coordinated vs. Uncoordinated