Published by Shawn Houston. Modified over 9 years ago.
1
DrDebug: Deterministic Replay based Cyclic Debugging with Dynamic Slicing
Yan Wang*, Harish Patil**, Cristiano Pereira**, Gregory Lueck**, Rajiv Gupta*, and Iulian Neamtiu*
* University of California, Riverside
** Intel Corporation
2
Cyclic Debugging for Multi-threaded Programs
Example: Mozilla developer bug report Id: 515403 (ver. 1.9.1), a data race on variable rt->scriptFilenameTable.
The debugging cycle: start from program binary + input; fast-forward to the buggy region; observe program state / reach failure; form/refine a hypothesis about the cause of the bug.
Long time taken by fast-forwarding: the buggy region is only 12% of total execution.
Difficult to locate the bug: the buggy region contains 999,997 instructions.
Non-deterministic execution, difficult to reproduce.
[Figure: threads T0, T1, T2]
3
Key Contributions of DrDebug
Execution Region: only capture execution of the buggy region; avoid fast-forwarding.
Execution Slice: only capture bug-related execution; single-step the slice in a live debugging session.
Cyclic debugging based on deterministic replay of the execution region and the execution slice.
Results:
Number of instructions in execution region vs. whole program execution: 0.04% to 14.3% for bugs in 3 real-world programs.
Number of instructions in execution slice vs. execution region: 0.01% to 47.2% for bugs in 3 real-world programs, and only 41% on average for PARSEC.
[Figure: threads T1 and T2 with the captured region]
4
DrDebug
Only capture bug-related program execution.
Cyclic debugging based on replay of the execution slice: from program binary + input, DrDebug (deterministic replay based debugging) produces a slice pinball; the developer then repeatedly observes program state / reaches the failure and forms/refines a hypothesis about the cause of the bug.
Benefits:
Only need fast-forwarding once.
Deterministic program execution.
Single-step bug-related statements.
5
PinPlay in DrDebug
PinPlay [Patil et al., CGO’10] is a record/replay system built on the Pin dynamic instrumentation system.
Logger (input: program binary + input; output: region pinball): captures the non-deterministic events of the execution of a (buggy) region.
Replayer (input: region pinball; output: program output): deterministically repeats the captured execution.
Relogger (input: pinball; output: region pinball): relogs the execution, excluding the execution of some code regions.
6
Replay Efficiency using Execution Region
[Figure: threads T1 and T2; a region of the execution is captured as a region pinball]
7
Location Efficiency via Dynamic Slicing
Dynamic slicing identifies bug-related executed statements.
[Figure: threads T1 and T2; region pinball; compute slice]
8
Replaying Execution Region and Dynamic Slicing
[Figure: threads T1 and T2; region pinball; compute slice; slice pinball; excluded code region]
9
Replaying Execution Slice
Prior work: post-mortem analysis.
[Figure: threads T1 and T2; slice pinball; execute slice; inject value]
10
Computing Dynamic Slicing for Multi-threaded Programs
Collect per-thread local execution traces.
Construct the combined global trace, using the shared memory access order and a topological order.
Compute the dynamic slice by traversing the global trace backwards.
Adopted the Limited Preprocessing (LP) algorithm [Zhang et al., ICSE’03] to speed up the traversal of the trace.
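The construction of the global trace can be sketched as a topological sort: every trace entry is a node, program order gives edges within a thread, and the shared memory access order gives edges across threads. This is a minimal illustration, not DrDebug's implementation. The function name build_global_trace is hypothetical, the statement numbers are taken from the example on slide 11, and the five cross-thread edges are an assumed reading of that slide's flattened figure.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_global_trace(thread_traces, access_order):
    """Merge per-thread traces into one order.

    thread_traces: {thread_name: [entry, ...]} with entries in program order.
    access_order: (earlier, later) pairs of conflicting accesses in different threads.
    Raises graphlib.CycleError if the constraints contradict each other.
    """
    sorter = TopologicalSorter()
    for entries in thread_traces.values():
        for entry in entries:
            sorter.add(entry)
        # Program order: each entry must follow its predecessor in the same thread.
        for earlier, later in zip(entries, entries[1:]):
            sorter.add(later, earlier)
    # Shared memory access order: edges between threads.
    for earlier, later in access_order:
        sorter.add(later, earlier)
    return list(sorter.static_order())


# Assumed edges: x (1 before 11, 11 before 6, 6 before 12), y (7 before 3), z (2 before 9).
thread_traces = {"T1": [1, 2, 3, 4, 5, 6], "T2": [7, 8, 9, 10, 11, 12, 13]}
access_order = [(1, 11), (11, 6), (6, 12), (7, 3), (2, 9)]
global_trace = build_global_trace(thread_traces, access_order)
position = {stmt: i for i, stmt in enumerate(global_trace)}
```

Several orders can satisfy the same constraints, so the sketch only guarantees that the returned order respects every edge, not that it matches the recorded interleaving entry by entry.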
11
Dynamic Slicing a Multithreaded Program

Example Code
int x, y, z;
T1:
1 x=5;
2 z=x;
3 int w=y;
4 w=w-2;
5 int m=3*x;
6 x=m+2;
T2:
7 y=2;
8 int j=y + 1;
9 j=z + j;
10 int k=4*y;
11 if (k>x){
12 k=k-x;
13 assert(k>0); }
Figure annotation: wrongly assumed atomic region.

Per Thread Traces and Shared Memory Access Order
Each entry reads statement^instance {defined variables} {used variables}; entries within a thread are in program order.
Def-Use Trace for T1:
1^1 {x} {}
2^1 {z} {x}
3^1 {w} {y}
4^1 {w} {w}
5^1 {m} {x}
6^1 {x} {m}
Def-Use Trace for T2:
7^1 {y} {}
8^1 {j} {y}
9^1 {j} {z,j}
10^1 {k} {y}
11^1 {} {k,x}
12^1 {k} {k,x}
13^1 {} {k}
Shared memory access order: cross-thread edges labelled x, x, y, x, and z in the figure.
12
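Dynamic Slicing a Multithreaded Program

Global Trace
Entries read statement^instance {defined variables} {used variables}. The order below is the one consistent with the values observed in the example (z=5, j=8, w=0).
T1: 1^1 {x} {}
T1: 2^1 {z} {x}
T2: 7^1 {y} {}
T2: 8^1 {j} {y}
T2: 9^1 {j} {z,j}
T2: 10^1 {k} {y}
T2: 11^1 {} {k,x}
T1: 3^1 {w} {y}
T1: 4^1 {w} {w}
T1: 5^1 {m} {x}
T1: 6^1 {x} {m}
T2: 12^1 {k} {k,x}
T2: 13^1 {} {k}

Slice for k at 13^1
1^1 x=5
7^1 y=2
10^1 k=4*y
11^1 if(k>x)
5^1 m=3*x
6^1 x=m+2
12^1 k=k-x
13^1 assert(k>0) (slice criterion)
Figure annotations: root cause; control dependence (CD) edge; data dependence edges on x, k, m, and y.
The reads of x at 11^1 and 12^1 should read (depend on) the same definition of x.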
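The backward traversal can be sketched as follows. This is a plain linear scan, not the Limited Preprocessing algorithm the deck cites. The function name backward_slice and the tuple layout are hypothetical, and the trace order is an assumption: it is the interleaving consistent with the values z=5, j=8, w=0 that appear on slide 13. Control dependence is recorded per statement label, which is only adequate because each statement executes once here.

```python
def backward_slice(trace, criterion_index):
    """Return labels of trace entries in the dynamic slice, in execution order.

    trace: list of (label, defs, uses, ctrl_parent) tuples in global order,
           where ctrl_parent is the label this entry is control dependent on, or None.
    """
    wanted_vars = set()   # variables whose defining entry is still being sought
    wanted_ctrl = set()   # predicate labels that slice members are control dependent on
    in_slice = []
    for i in range(criterion_index, -1, -1):
        label, defs, uses, ctrl_parent = trace[i]
        hit = i == criterion_index or label in wanted_ctrl or bool(defs & wanted_vars)
        if not hit:
            continue
        in_slice.append(label)
        wanted_ctrl.discard(label)
        wanted_vars -= defs   # these definitions are now resolved
        wanted_vars |= uses   # their operands must be resolved next
        if ctrl_parent is not None:
            wanted_ctrl.add(ctrl_parent)
    in_slice.reverse()
    return in_slice


# Assumed global order 1, 2, 7..11, 3..6, 12, 13; statements 12 and 13 sit inside the if at 11.
trace = [
    (1, {"x"}, set(), None),
    (2, {"z"}, {"x"}, None),
    (7, {"y"}, set(), None),
    (8, {"j"}, {"y"}, None),
    (9, {"j"}, {"z", "j"}, None),
    (10, {"k"}, {"y"}, None),
    (11, set(), {"k", "x"}, None),
    (3, {"w"}, {"y"}, None),
    (4, {"w"}, {"w"}, None),
    (5, {"m"}, {"x"}, None),
    (6, {"x"}, {"m"}, None),
    (12, {"k"}, {"k", "x"}, 11),
    (13, set(), {"k"}, 11),
]
slice_for_k = backward_slice(trace, len(trace) - 1)
```

Under these assumptions the result contains statements 1, 5, 6, 7, 10, 11, 12, and 13, the same eight statements the slide lists in the slice for k at 13^1.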
13
Execution Slice Example
Only bug-related executions (e.g., root cause, failure point) are replayed and examined to understand and locate bugs.
Prior work: postmortem analysis.
Execution slice: single-stepping/examining the slice in a live debugging session.

Code Exclusion Regions
T1: 1^1 x=5; [excluded: 2^1 z=x; 3^1 w=y; 4^1 w=w-2]; 5^1 m=3*x; 6^1 x=m+2
T2: 7^1 y=2; [excluded: 8^1 j=y + 1; 9^1 j=z + j]; 10^1 k=4*y; 11^1 if (k>x); 12^1 k=k-x; 13^1 assert(k>0)

Injecting Values During Replay
T1: 1^1 x=5; inject z=5, w=0; 5^1 m=3*x; 6^1 x=m+2
T2: 7^1 y=2; inject j=8; 10^1 k=4*y; 11^1 if (k>x); 12^1 k=k-x; 13^1 assert(k>0)
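The idea of skipping excluded code while injecting its side effects can be illustrated with a toy interpreter for the example program. This is a sketch under assumptions, not how PinPlay replays a slice pinball: the STMTS table, the run function, and the choice to inject a variable's final recorded value at every skipped statement are all hypothetical simplifications, and the trace order is the interleaving consistent with the injected values on the slide.

```python
# Each statement: (assigned variable or None for a predicate, expression over the environment).
STMTS = {
    1: ("x", lambda e: 5),
    2: ("z", lambda e: e["x"]),
    3: ("w", lambda e: e["y"]),
    4: ("w", lambda e: e["w"] - 2),
    5: ("m", lambda e: 3 * e["x"]),
    6: ("x", lambda e: e["m"] + 2),
    7: ("y", lambda e: 2),
    8: ("j", lambda e: e["y"] + 1),
    9: ("j", lambda e: e["z"] + e["j"]),
    10: ("k", lambda e: 4 * e["y"]),
    11: (None, lambda e: e["k"] > e["x"]),
    12: ("k", lambda e: e["k"] - e["x"]),
    13: (None, lambda e: e["k"] > 0),
}
TRACE = [1, 2, 7, 8, 9, 10, 11, 3, 4, 5, 6, 12, 13]   # assumed recorded interleaving
SLICE = {1, 5, 6, 7, 10, 11, 12, 13}                  # slice for k at statement 13


def run(trace, in_slice=None, injected=None):
    """Execute the trace; if in_slice is given, skip other statements and inject values."""
    env = {"x": 0, "y": 0, "z": 0}   # C globals start at zero
    checks = {}                      # outcomes of predicates (statements 11 and 13)
    for label in trace:
        target, expr = STMTS[label]
        if in_slice is not None and label not in in_slice:
            env[target] = injected[target]   # skipped: restore its recorded side effect
            continue
        value = expr(env)
        if target is None:
            checks[label] = value
        else:
            env[target] = value
    return env, checks


full_env, full_checks = run(TRACE)                       # "recording" run
injected = {v: full_env[v] for v in ("j", "z", "w")}     # values written by excluded code
slice_env, slice_checks = run(TRACE, SLICE, injected)    # slice-only replay
```

In this toy setting the slice-only replay ends in the same program state as the full run and fails the assertion at statement 13 in the same way, while executing only 8 of the 13 statements.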
14
Improved Dynamic Dependence Precision
Dynamic Control Dependence Precision
Indirect jump (switch-case statement): an inaccurate CFG misses control dependences.
Refine the CFG with dynamically collected jump targets.
Dynamic Data Dependence Precision
Spurious dependences are caused by save/restore pairs at the entry/exit of each function.
Identify save/restore pairs and bypass the data dependences.
15
Control Dependences in the Presence of Indirect Jump

C Code
1 P(FILE* fin, int d){
2 int w;
3 char c=fgetc(fin);
4 switch(c){
5 case 'a': /* slice criterion */
6 w = d + 2;
7 break;
8 …
11 }

Assembly Code
3 call fgetc
  mov %al,-0x9(%ebp)
4 ...
  mov 0x8048708(,%eax,4),%eax
  jmp *%eax
6 mov 0xc(%ebp),%eax
  add $0x2,%eax
  mov %eax,-0x10(%ebp)
7 jmp 80485c8
8 ...

Inaccurate CFG causing missed control dependence. Imprecise slice for w at line 6^1:
6^1: w=d+2

Capturing the missing control dependence due to the indirect jump:
3^1: c=fgetc(fin)
4^1: switch(c) (data dependence on c)
6^1: w=d+2 (control dependence, CD, on 4^1 for case 'a')
16
Improve Dynamic Control Dependence Precision
Implement a static analyzer based on Pin's static code discovery library; this allows DrDebug to work with any x86 or Intel64 binary.
We construct an approximate static CFG and, as the program executes, collect the dynamic jump targets of the indirect jumps and refine the CFG by adding the missing edges.
The refined CFG is used to compute the immediate post-dominator for each basic block.
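The effect of CFG refinement on post-dominators can be sketched on a tiny graph. This is an illustration under assumptions, not DrDebug's analyzer: the block names, the refine_cfg and immediate_post_dominators functions, and the premise that the static CFG knows only one target of the switch are all hypothetical, and the sketch assumes every block can reach the exit block.

```python
def refine_cfg(cfg, observed_jumps):
    """Return a copy of cfg with edges added for dynamically observed indirect jumps."""
    refined = {block: set(succs) for block, succs in cfg.items()}
    for src, dst in observed_jumps:
        refined.setdefault(src, set()).add(dst)
        refined.setdefault(dst, set())
    return refined


def immediate_post_dominators(cfg, exit_block):
    """Iterative data-flow computation of immediate post-dominators."""
    nodes = set(cfg)
    pdom = {n: set(nodes) for n in nodes}
    pdom[exit_block] = {exit_block}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_block}:
            succs = cfg[n]
            if succs:
                new = set.intersection(*(pdom[s] for s in succs)) | {n}
            else:
                new = {n}
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    ipdom = {}
    for n in nodes - {exit_block}:
        strict = pdom[n] - {n}
        # The immediate post-dominator is the strict post-dominator closest to n:
        # its own post-dominator set equals the full set of n's strict post-dominators.
        ipdom[n] = next((d for d in strict if pdom[d] == strict), None)
    return ipdom


# Approximate static CFG: the indirect jump in "switch" is assumed to have one known target.
static_cfg = {
    "entry": {"switch"},
    "switch": {"case_b"},
    "case_a": {"join"},
    "case_b": {"join"},
    "join": {"exit"},
    "exit": set(),
}
before = immediate_post_dominators(static_cfg, "exit")
refined_cfg = refine_cfg(static_cfg, [("switch", "case_a")])  # target seen at run time
after = immediate_post_dominators(refined_cfg, "exit")
```

With the static CFG the immediate post-dominator of "switch" is "case_b", so no block appears control dependent on the switch. After the observed edge is added it becomes "join", which makes "case_a" control dependent on the switch, matching the missing dependence the slides describe.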
17
Spurious Dependences Example

C Code
1 P(FILE* fin, int d){
2 int w, e;
3 char c=fgetc(fin);
4 e= d + d;
5 if(c=='t')
6 Q();
7 w=e; /* slice criterion */
8 }
9 Q()
10 {
11 ...
12 }

Assembly Code
3 call fgetc
  mov %al,-0x9(%ebp)
4 mov 0xc(%ebp),%eax
  add %eax,%eax
5 cmpb $0x74,-0x9(%ebp)
  jne 804852d
6 call Q
  804852d:
7 mov %eax,-0x10(%ebp)
9 Q()
10 push %eax
   ...
12 pop %eax

The push %eax and pop %eax in Q form a save/restore pair, which causes a spurious data/control dependence.
18
Spurious Dependences Example

Imprecise slice for w at line 7^1:
3^1: c=fgetc(fin)
4^1: e = d+d (add %eax, %eax)
5^1: if(c=='t') (data dependence on c)
10^1: push %eax (control dependence, CD, on 5^1 for 't'; data dependence on eax)
12^1: pop %eax
7^1: w = e (mov %eax, -0x10(%ebp)) (data dependence on eax)

Refined slice, bypassing the data dependences caused by save/restore pairs:
4^1: e = d+d (add %eax, %eax) (true definition of eax)
7^1: w = e (mov %eax, -0x10(%ebp)) (data dependence on e)
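Bypassing a save/restore pair while searching for a register's definition can be sketched like this. It is a minimal illustration, not DrDebug's code: the function name true_definition and the trace encoding are hypothetical, and the sketch assumes the save/restore pairs have already been identified and are passed in as a map from the restore's trace index to the save's trace index.

```python
def true_definition(trace, use_index, reg, pairs):
    """Find the entry that really defines reg for the use at use_index.

    trace: list of (label, defs) in execution order.
    pairs: {restore_index: save_index} for identified save/restore pairs.
    """
    i = use_index - 1
    while i >= 0:
        label, defs = trace[i]
        if reg in defs:
            if i in pairs:
                # The restore only puts back a saved value: resume the search
                # just before the matching save instead of stopping here.
                i = pairs[i] - 1
                continue
            return label
        i -= 1
    return None


# Instruction trace of the example when c == 't', so Q() is called.
trace = [
    ("3: call fgetc", {"eax"}),
    ("4: mov 0xc(%ebp),%eax", {"eax"}),
    ("4: add %eax,%eax", {"eax"}),
    ("5: cmpb $0x74,-0x9(%ebp)", {"eflags"}),
    ("5: jne 804852d", set()),
    ("6: call Q", set()),
    ("10: push %eax", {"stack"}),
    ("12: pop %eax", {"eax"}),
    ("7: mov %eax,-0x10(%ebp)", {"w"}),
]
use = len(trace) - 1                                            # the mov that stores w
imprecise = true_definition(trace, use, "eax", {})              # no pairs known
refined = true_definition(trace, use, "eax", {7: 6})            # pop at 7 restores push at 6
```

Without the pair information the search stops at the pop in Q, which drags the push, the call, and the branch on c into the slice. With it, the search reaches the add that computes e.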
19
Integration with Maple
Maple [Yu et al., OOPSLA’12] is a thread interleaving coverage-driven testing tool. Maple exposes untested thread interleavings as much as possible.
We changed Maple to optionally do PinPlay-based logging of the buggy execution it exposes.
We have successfully recorded multiple buggy executions and replayed them using DrDebug.
20
DrDebug GUI showing a dynamic slice
[Figure: GUI screenshot with the slice criterion labelled]
21
Data Race bugs used in our Case Studies

| Program Name | Bug Description |
| --- | --- |
| pbzip2-0.9.4 | A data race on variable fifo->mut between the main thread and the compressor threads |
| Aget-0.57 | A data race on variable bwritten between downloader threads and the signal handler thread |
| Mozilla-1.9.1 | A data race on variable rt->scriptFilenameTable. One thread destroys a hash table, and another thread crashes in js_SweepScriptFilenames when accessing this hash table |

Quantify the buggy execution region size for real bugs.
Time and space overhead of DrDebug are reasonable for real bugs.
22
Time and Space Overheads for Data Race Bugs with Buggy Execution Region

| Program Name | #ins in region (% of whole execution) | #ins in slice pinball (% of region pinball) | Logging Time (sec) | Logging Space (MB) | Replay Time (sec) | Slicing Time (sec) |
| --- | --- | --- | --- | --- | --- | --- |
| pbzip2 (0.9.4) | 11,186 (0.04%) | 1,065 (9.5%) | 5.7 | 0.7 | 1.5 | 0.01 |
| Aget (0.57) | 108,695 (14.3%) | 51,278 (47.2%) | 8.4 | 0.6 | 3.9 | 0.02 |
| Mozilla (1.9.1) | 999,997 (12.2%) | 100 (0.01%) | 9.9 | 1.1 | 3.6 | 1.2 |

Buggy region size: about 1M instructions.
Number of instructions in execution region vs. whole program execution: 0.04% to 14.3% for bugs in 3 real-world programs.
Number of instructions in execution slice vs. execution region: 0.01% to 47.2% for bugs in 3 real-world programs.
23
Logging Time Overheads with native input
24
Replay Time Overheads with native input
25
Execution Slice: replay time with native input
[Chart annotation: 36%]
26
Conclusions
A working debugger with a graphical user interface that allows cyclic debugging based on replay of pinballs for multi-threaded programs.
Support for recording execution regions and dynamic slices.
Execution of dynamic slices for improved bug localization and replay efficiency.
Backward navigation of a dynamic slice along dependence edges with a KDbg-based GUI.
Results: number of instructions in execution slice vs. execution region: 0.01% to 47.2% for bugs in 3 real-world programs, and only 41% on average for PARSEC.
27
Backup
28
Time and Space Overheads for Data Race Bugs with Whole Execution Region

| Program Name | #executed ins | #ins in slice pinball (% of executed ins) | Logging Time (sec) | Logging Space (MB) | Replay Time (sec) | Slicing Time (sec) |
| --- | --- | --- | --- | --- | --- | --- |
| pbzip2 | 30,260,300 | 11,152 (0.04%) | 12.5 | 1.3 | 8.2 | 1.6 |
| Aget | 761,592 | 79,794 (10.5%) | 10.5 | 1.0 | 10.1 | 52.6 |
| Mozilla | 8,180,858 | 813,496 (9.9%) | 21.0 | 2.1 | 19.6 | 3,200.4 |
29
Logging Time Overheads
30
Replay Time Overheads
31
Removal of Spurious Dependences: slice sizes