Yongle Zhang, Serguei Makarov, Xiang Ren, David Lion, Ding Yuan Pensieve: Non-Intrusive Failure Reproduction for Distributed Systems using the Event Chaining Approach Yongle Zhang, Serguei Makarov, Xiang Ren, David Lion, Ding Yuan
Failure Reproduction is Time-Consuming $156 Billion in 2013 Understand bug Verify fix
A Real-World Debugging Procedure Provided error description, log & reproduction steps. Can’t reproduce. Added some reproduction steps. Still cannot reproduce. Could you upload fsimage? Fsimage uploaded. Revised reproduction steps. Reporter Still cannot reproduce. Developer HDFS-6130
A Real-World Debugging Procedure Provided error description, log & reproduction steps. Can’t reproduce. Added some reproduction steps. Still cannot reproduce. Could you upload fsimage? Fsimage uploaded. Revised reproduction steps. Reporter Still cannot reproduce. Developer . . . [5 days, 29 discussions] Reproduced… HDFS-6130 [after another 8 minutes] Posted the working patch.
Pensieve: Non-Intrusive Failure Reproduction Analyzes Java bytecode & log Non-intrusive Works on JVM based system
Pensieve: Non-Intrusive Failure Reproduction Analyzes Java bytecode & log Non-intrusive Works on JVM based system Command Line Input (HDFS-4022) ./pensieve -jar ./hadoop-hdfs-2.0.0-alpha.jar // Java bytecode -log ./HDFS-logs/ // failure logs -error ./HDFS-logs/datanode-2.log#800 // symptoms
Pensieve: Non-Intrusive Failure Reproduction Analyzes Java bytecode & log Non-intrusive Works on JVM based system Command Line Input (HDFS-4022) ./pensieve -jar ./hadoop-hdfs-2.0.0-alpha.jar // Java bytecode -log ./HDFS-logs/ // failure logs -error ./HDFS-logs/datanode-2.log#800 // symptoms . 800: ERROR: invalid block datanode-2.log
Pensieve: Non-Intrusive Failure Reproduction Analyzes Java bytecode & log Non-intrusive Works on JVM based system Output Unit Test (HDFS-4022) initCluster(1); // start 1-machine cluster create(“a.txt”,2); // create 2-replica file append(“a.txt”,“X”); // append to file addDataNode(); // add a datanode on the fly
Existing Solutions Are Limited Record-and-replay (deterministic replay) Intrusive: modifies existing software stack Incurs performance overhead Symbolic Execution E.g., ESD [Zamfir EuroSys’10], SherLog [Yuan ASPLOS’10]. Pros: precise & non-intrusive Cons: hard to scale to large systems
Scalability of Symbolic Execution 1.for(int i=0;i<blocks.length;i++){ 2. if(blocks[i].genStamp!=VALID_GS) 3. log("invalid block. . ."); 4.} Simplified code snippet for HDFS-4022 Enumerates every possible execution path
Scalability of Symbolic Execution 1.for(int i=0;i<blocks.length;i++){ 2. if(blocks[i].genStamp!=VALID_GS) 3. log("invalid block. . ."); 4.} Simplified code snippet for HDFS-4022 Enumerates every possible execution path blocks[i].genStamp!=VALID_GS
Scalability of Symbolic Execution 1.for(int i=0;i<blocks.length;i++){ 2. if(blocks[i].genStamp!=VALID_GS) 3. log("invalid block. . ."); 4.} Simplified code snippet for HDFS-4022 Enumerates every possible execution path blocks[0].genStamp!=VALID_GS blocks[i].genStamp!=VALID_GS blocks[1].genStamp!=VALID_GS && blocks[0].genStamp==VALID_GS .
Scalability of Symbolic Execution 1.for(int i=0;i<blocks.length;i++){ 2. if(blocks[i].genStamp!=VALID_GS) 3. log("invalid block. . ."); 4.} Simplified code snippet for HDFS-4022 Enumerates every possible execution path blocks[0].genStamp!=VALID_GS OR blocks[i].genStamp!=VALID_GS blocks[1].genStamp!=VALID_GS && blocks[0].genStamp==VALID_GS OR .
Scalability of Symbolic Execution 1.for(int i=0;i<blocks.length;i++){ 2. if(blocks[i].genStamp!=VALID_GS) 3. log("invalid block. . ."); 4.} Simplified code snippet for HDFS-4022 Enumerates every possible execution path Failure Total Branch Instructions S.E. Stopped At (Instructions) Condition Size Pensieve instruction HDFS-4022 72,943,652 693 109,018,324 166
Core Idea – Partial Trace Observation Developers almost never debug a failure by reconstructing its complete execution path. Instead, they construct a simplified trace which only contains events that are likely to be causally relevant to the failure.
How do developers debug HDFS-4022? for(int i=0;i<blocks.length;i++) if(blocks[i].genStamp!=VALID_GS) log("invalid block. . .");
How do developers debug HDFS-4022? if(this.stage==APPEND){ log("Appending" + b); b.genStamp=newGS; } void rollLog(. . .){ b.genStamp=logGS; } for(int i=0;i<blocks.length;i++) if(blocks[i].genStamp!=VALID_GS) log("invalid block. . .");
How do developers debug HDFS-4022? if(this.stage==APPEND){ log("Appending" + b); b.genStamp=newGS; } void rollLog(. . .){ b.genStamp=logGS; } for(int i=0;i<blocks.length;i++) if(blocks[i].genStamp!=VALID_GS) log("invalid block. . .");
How do developers debug HDFS-4022? if(this.stage==APPEND){ log("Appending" + b); b.genStamp=newGS; } for(int i=0;i<blocks.length;i++) if(blocks[i].genStamp!=VALID_GS) log("invalid block. . .");
How do developers debug HDFS-4022? if(this.stage==APPEND){ log("Appending" + b); b.genStamp=newGS; } for(int i=0;i<blocks.length;i++) if(blocks[i].genStamp!=VALID_GS) log("invalid block. . .");
How do developers debug HDFS-4022? Client.java void append(. . .){ stage=APPEND; . . . } if(this.stage==APPEND){ log("Appending" + b); b.genStamp=newGS; } for(int i=0;i<blocks.length;i++) if(blocks[i].genStamp!=VALID_GS) log("invalid block. . .");
How do developers debug HDFS-4022? Network serialization de-serialization Client.java DataNode.java void append(. . .){ stage=APPEND; . . . } if(this.stage==APPEND){ log("Appending" + b); b.genStamp=newGS; } for(int i=0;i<blocks.length;i++) if(blocks[i].genStamp!=VALID_GS) log("invalid block. . .");
How do developers debug HDFS-4022? Network serialization de-serialization Client.java DataNode.java void append(. . .){ stage=APPEND; . . . } if(this.stage==APPEND){ log("Appending" + b); b.genStamp=newGS; } Got one user command (append) by looking at 8 instructions! for(int i=0;i<blocks.length;i++) if(blocks[i].genStamp!=VALID_GS) log("invalid block. . .");
Event Chaining Approach Event – a point in time during execution Location event – a program location reached Condition event – a condition holds Invocation event – a function invoked
Event Chaining Approach Location event Condition event Invocation event An event is explained by other events Location event path conditions 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . ."); 888: ERROR: invalid block… e1:line2(L2) datanode-2.log
Event Chaining Approach Location event Condition event Invocation event An event is explained by other events Location event path conditions 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . ."); 888: ERROR: invalid block… e2:blocks[i].genStamp !=VALID_GS@L1 e1:L2 datanode-2.log
Event Chaining Approach Location event Condition event Invocation event An event is explained by other events Condition event definitions 3.if(this.stage==APPEND){ 4. log("Appending" + b); 5. b.genStamp=newGS; … } 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . ."); e3:L5 888: ERROR: invalid block… e2:blocks[i].genStamp !=VALID_GS@L1 e1:L2 datanode-2.log
Event Chaining Approach Location event Condition event Invocation event An event is explained by other events 6.void append(. . .){ 7. stage=APPEND; … . . . … } 3.if(this.stage==APPEND){ 4. log("Appending" + b); 5. b.genStamp=newGS; … } e5:L7 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . ."); e4:this.stage==APPEND@L3 e3:L5 888: ERROR: invalid block… e2:blocks[i].genStamp !=VALID_GS@L1 e1:L2 datanode-2.log
Event Chaining Approach Location event Condition event Invocation event An event is explained by other events Location event function invocation 6.void append(. . .){ 7. stage=APPEND; … . . . … } 3.if(this.stage==APPEND){ 4. log("Appending" + b); 5. b.genStamp=newGS; … } e6:append() e5:L7 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . ."); e4:this.stage==APPEND@L3 e3:L5 888: ERROR: invalid block… e2:blocks[i].genStamp !=VALID_GS@L1 e1:L2 datanode-2.log
Event Chaining Approach Location event Condition event Invocation event Captures dependency on shared variables 6.void append(. . .){ 7. stage=APPEND; … . . . … } 3.if(this.stage==APPEND){ 4. log("Appending" + b); 5. b.genStamp=newGS; … } Thread 1 e6:append() Thread 2 e5:L7 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . ."); e4:this.stage==APPEND@L3 e3:L5 888: ERROR: invalid block… e2:blocks[i].genStamp !=VALID_GS@L1 e1:L2 datanode-2.log
Forking for Multiple Possibilities e2:blocks[i].genStamp !=VALID_GS@L1 e1:L2 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . .");
Forking for Multiple Possibilities e7:L8 e3:L5 fork e2:blocks[i].genStamp !=VALID_GS@L1 e2 e1 e1:L2 3.if(this.stage==APPEND){ 4. log("Appending" + b); 5. b.genStamp=newGS; … } … void rollLog(. . .){ 8. b.genStamp=logGS; … } 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . .");
Priority based scheduling e7:L8 P:500 e3:L5 P:500 P:1000 fork e2:blocks[i].genStamp !=VALID_GS@L1 e2 e1 e1:L2
Priority based scheduling e7:L8 P:500 e3:L5 P:500 e2 e2:blocks[i].genStamp !=VALID_GS@L1 e1 e1:L2 3.if(this.stage==APPEND){ 4. log("Appending" + b); 5. b.genStamp=newGS; … } … void rollLog(. . .){ 8. b.genStamp=logGS; … } 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . .");
Priority based scheduling e7:L8 P:0 e3:L5 P:1000 e2 e2:blocks[i].genStamp !=VALID_GS@L1 e1 e1:L2 3.if(this.stage==APPEND){ 4. log("Appending" + b); 5. b.genStamp=newGS; … } s . Appending… log files … void rollLog(. . .){ 8. b.genStamp=logGS; … } 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . .");
Priority based scheduling Favors event chains with most matched logs Favors simpler reproduction paths
Eliminating Infeasible Event Chains path conditions Path conditions Variable substitution Logical conjunction blocks[i].genStamp!=VALID_GS@L1 e2:blocks[i].genStamp !=VALID_GS@L1 e1:L2 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . .");
Eliminating Infeasible Event Chains path conditions Path conditions Variable substitution Logical conjunction blocks[i].genStamp!=VALID_GS@L1 newGS!=VALID_GS@L5 3.if(this.stage==APPEND){ 4. log("Appending" + b); 5. b.genStamp=newGS; … } e3:L5 e2:blocks[i].genStamp !=VALID_GS@L1 e1:L2 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . .");
Skip Less-Relevant Loops Skips loops when there’s no loop carried dependency 77% of randomly sampled loops in HDFS Follows loop iterations otherwise 3.if(this.stage==APPEND){ 4. log("Appending" + b); 5. b.genStamp=newGS; … } e3:L5 e2:blocks[i].genStamp !=VALID_GS@L1 e1:L2 0.for(int i=0;i<blocks.length;i++) 1. if(blocks[i].genStamp!=VALID_GS) 2. log("invalid block. . .");
Verification & Testcase Refinement Sacrifices precision for efficiency Event Chain e3: i=2; e2: i>1 e1: log(“ERROR”);
Verification & Testcase Refinement Sacrifices precision for efficiency Event Chain Execution e3: i=2; e2: i>1 e1: log(“ERROR”);
Verification & Testcase Refinement Sacrifices precision for efficiency Event Chain Execution e3: i=2; e2: i>1 X Diverged! … e1: log(“ERROR”);
Verification & Testcase Refinement Sacrifices precision for efficiency e3: i=2 Event Chain Execution e5: N>0 e4: i=0; e2: i>1 X Diverged! … e1: log(“ERROR”);
Verification & Testcase Refinement Sacrifices precision for efficiency e3: i=2 Event Chain e3: i=2 Execution e5: N>0 e6: N<=0 e4: i=0; e2: i>1 e2: i>1 X e1: log(“ERROR”); Diverged! … e1: log(“ERROR”);
Verification & Testcase Refinement Variable modified in a different thread (in paper) e3: i=2 Event Chain e3: i=2 Execution e5: N>0 e6: N<=0 e4: i=0; e2: i>1 e2: i>1 X e1: log(“ERROR”); Diverged! … e1: log(“ERROR”);
Evaluation Evaluated on 18 cases from JVM distributed systems HDFS, HBase, ZooKeeper, Cassandra with noisy logs generated from manual reproduction Overall Result Successfully reproduces 72% Finishes analysis within 10 min Scalability Result Average # of Events in A Event Chain 105.2 # of Forked Event Chains 1367.2
Case Study: HDFS-6130 Useful for hard bugs path conditions inferred fsimage.layoutVersion!=TXID_LAYOUT // Use old fsimage initCluster(UPGRADE); restartCluster(); initCluster(UPGRADE); restartCluster(); Developers’ reproduction Pensieve’s reproduction
Case Study: HDFS-4022 Finds different reproduction than developers’ initCluster(3); setConfig(“policy”, “ALWAYS”); create(“a.txt”,2); stopDataNode(1); append(“a.txt”,“data”); initCluster(4); create(“a.txt”,3); stopDataNode(3); append(“a.txt”,“data”); Developers’ reproduction Pensieve’s reproduction – fewer nodes!
Limitations Error is not logged (e.g., silent data loss) Bugs involving resource exhaustion Systems need to have clearly defined input events. E.g., not for compilers.
Related Work Static program slicing [Weiser81] Obtains static trace but not dynamic partial trace Symbolic execution based approach ESD, SherLog. Record-and-replay based approach BugRedux [Wei ICSE’12], etc.
Conclusion Thanks! Pensieve: automated failure reproduction Based on Partial Trace Observation Scales to real-world distributed systems Non-intrusive and relies on logs Pensieve leverages the natural way human beings do failure reproduction. Thanks!