1 Cost Effective Dynamic Program Slicing Xiangyu Zhang Rajiv Gupta The University of Arizona
2 Program Slicing. Definition: the slice of v at S is the set of statements involved in computing v's value at S [Mark Weiser, 1982]. A static slice is the set of statements that COULD influence the value of a variable for ANY input. To compute it: construct the static dependence graph (control dependences and data dependences), then traverse the graph to compute the slice, i.e., take the transitive closure over control and data dependences.
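The traversal step can be sketched as a backward reachability computation over the dependence graph. This is a minimal sketch, not the authors' implementation: the five-statement program and the hand-written `static_deps` map are hypothetical, and a real slicer would derive the edges from control flow and data flow analysis.

```python
from collections import deque


def backward_slice(deps, criterion):
    """Return every statement reachable from `criterion` over dependence edges.

    `deps` maps a statement to the statements it is control- or
    data-dependent on, so the result is the transitive closure.
    """
    in_slice = {criterion}
    worklist = deque([criterion])
    while worklist:
        stmt = worklist.popleft()
        for pred in deps.get(stmt, ()):
            if pred not in in_slice:
                in_slice.add(pred)
                worklist.append(pred)
    return in_slice


# Hypothetical program (not from the slides):
#   1: x = read()   2: y = 1   3: w = 0   4: if x > 0:   5:     z = y
static_deps = {
    4: {1},     # the branch reads x
    5: {2, 4},  # z = y reads y and is control dependent on the branch
}
static_slice = backward_slice(static_deps, 5)
print(sorted(static_slice))
```

Statement 3 never feeds statement 5, so it stays out of the slice of z at statement 5.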
3 Dynamic Slicing. A dynamic slice is the set of statements that DID affect the value of a variable at a program point for ONE specific execution [Korel and Laski, 1988]. To compute it: collect an execution trace (the control flow trace yields dynamic control dependences; the memory reference trace yields dynamic data dependences), construct a dynamic dependence graph, and traverse it to compute slices. Smaller, more precise slices are more helpful.
4 Slice Sizes: Static vs. Dynamic. Columns: Program | Statements | Avg. of 25 slices: Static, Dynamic | Static / Dynamic. Programs: 126.gcc, 099.go, 134.perl, 130.li, 008.espresso. Values: 585,491 95, ,182 31,829 74,039 51,098 16,941 5,242 2,450 2,353 6,614 5, . A static slice can be much larger than the dynamic slice.
5 Applications of Dynamic Slicing. Debugging [Korel & Laski ]. Detecting spyware, which is installed without users' knowledge [Jha ]. Software testing [Duesterwald, Gupta, & Soffa ]: dependence-based structural testing using output slices. Module cohesion [N. Gupta & Rao ]: guides program structuring. Performance-enhancing transformations: instruction criticality [Zilles & Sohi ], instruction isomorphism [Sazeides ]. Others…
6 The Graph Size Problem. Columns: Program | Statements Executed (Millions) | Dynamic Dependence Graph Size (MB). Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go. Values: ,568 1,296 1,442 1,816 1, ,954 1,745 1,534 1,707. Graphs of realistic program runs do not fit in memory.
7 Space and Time Cost of LP [ICSE 2003]. Columns: Program | Slicing Time, Average (Minutes) | Max. Dynamic Dependence Graph Size (MB). Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go. Still not fast enough. Need to keep graph in memory.
8 Dependence Graph Representation. Input: N=2. Program:
    1: z=0
    2: a=0
    3: b=2
    4: p=&b
    5: for i = 1 to N do
    6:   if (i%2 == 0) then
    7:     p=&a
         endif
    8:   a=a+1
    9:   z=2*(*p)
       endfor
    10: print(z)
  Dynamic dependence graph nodes (statement^instance): 1^1, 2^1, 3^1, 4^1, 5^1, 6^1, 7^1, 8^1, 9^1, 10^1, 5^2, 6^2, 8^2, 9^2.
9 Dependence Graph Representation. Input: N=2. Control flow graph: 1: z=0, 2: a=0, 3: b=2, 4: p=&b; 5: for i=1 to N (T/F); 6: if (i%2==0) then (T/F); 7: p=&a; 8: a=a+1; 9: z=2*(*p); 10: print(z). Execution trace (statement^instance): 1^1: z=0; 2^1: a=0; 3^1: b=2; 4^1: p=&b; 5^1: for i = 1 to N do; 6^1: if (i%2 == 0) then; 8^1: a=a+1; 9^1: z=2*(*p); 5^2: for i = 1 to N do; 6^2: if (i%2 == 0) then; 7^1: p=&a; 8^2: a=a+1; 9^2: z=2*(*p); 10^1: print(z).
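To make the trace concrete, the ten-statement example can be replayed while recording dynamic dependences, after which the dynamic slice of z at print(z) follows by backward traversal. This is a hand-instrumented sketch under stated assumptions, not the authors' tool: `run_traced`, `step`, and `dynamic_slice` are hypothetical names, the pointer p is modeled as the name of the variable it points to, and each loop-body statement is assumed to be control dependent on the current instance of statement 5.

```python
def run_traced(n):
    """Run the slide's program for input N=n and record dynamic dependences."""
    deps = {}      # (stmt, instance) -> set of (stmt, instance) it depends on
    count = {}     # stmt -> number of executions so far
    last_def = {}  # variable -> instance that last wrote it
    mem = {}       # variable -> value; mem["p"] holds the pointed-to name

    def step(stmt, uses, defs, ctrl=None):
        count[stmt] = count.get(stmt, 0) + 1
        node = (stmt, count[stmt])
        # Dynamic data dependences: the last writer of each variable read.
        deps[node] = {last_def[u] for u in uses if u in last_def}
        if ctrl is not None:
            deps[node].add(ctrl)  # dynamic control dependence
        for var in defs:
            last_def[var] = node
        return node

    step(1, [], ["z"])
    mem["z"] = 0
    step(2, [], ["a"])
    mem["a"] = 0
    step(3, [], ["b"])
    mem["b"] = 2
    step(4, [], ["p"])
    mem["p"] = "b"
    loop = None
    for i in range(1, n + 1):
        loop = step(5, ["i"], ["i"], ctrl=loop)
        cond = step(6, ["i"], [], ctrl=loop)
        if i % 2 == 0:
            step(7, [], ["p"], ctrl=cond)
            mem["p"] = "a"
        step(8, ["a"], ["a"], ctrl=loop)
        mem["a"] += 1
        target = mem["p"]  # *p reads whichever variable p points to now
        step(9, ["p", target], ["z"], ctrl=loop)
        mem["z"] = 2 * mem[target]
    last = step(10, ["z"], [])
    return deps, last, mem["z"]


def dynamic_slice(deps, node):
    """Statements whose instances are backward reachable from `node`."""
    seen, stack = {node}, [node]
    while stack:
        for pred in deps[stack.pop()]:
            if pred not in seen:
                seen.add(pred)
                stack.append(pred)
    return sorted({stmt for stmt, _ in seen})


deps, printed, value = run_traced(2)
slice_n2 = dynamic_slice(deps, printed)
print(value, slice_n2)  # 4 [2, 5, 6, 7, 8, 9, 10]
```

For N=2 the slice leaves out statements 1, 3, and 4, because z=0 is overwritten and the final value is read through p=&a. Under the same modeling assumptions, N=1 gives [3, 4, 5, 9, 10] instead, which shows how the dynamic slice depends on the input.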
10 OPT: Compacted Graph Algorithm. Compaction: eliminate timestamp labels by (1) removing labels that can be inferred, (2) transforming the dependence graph to enable elimination, and (3) removing labels that are redundant. Fast traversal: a long search for the relevant dependence is often replaced by a quick computation of the dependence; this is a consequence of compaction.
11 OPT-1a. Infer Local Def-Use Labels: Full Elimination. Figure: the local def-use edge from "X =" to "= X" carries labels (10,10) (20,20) (30,30) before inference and 0 after. Timestamps are assigned at the node level.
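The effect of inferring local labels can be sketched with an edge type that either stores explicit (def, use) timestamp pairs or stores nothing and derives the definition time from the use time. This is an illustrative sketch only: `DefUseEdge` and `def_timestamp` are hypothetical names, and the inferred case assumes the caller only asks about timestamps at which the use actually executed.

```python
from bisect import bisect_left


class DefUseEdge:
    """A def-use edge whose dynamic instances are (def_ts, use_ts) pairs."""

    def __init__(self, labels=None):
        # labels=None marks an inferable edge: the def and the use always run
        # in the same node execution, so def_ts equals use_ts.
        self.labels = None if labels is None else sorted(labels, key=lambda p: p[1])
        self.use_times = None if labels is None else [u for _, u in self.labels]

    def def_timestamp(self, use_ts):
        if self.labels is None:
            return use_ts  # computed, no stored label needed
        i = bisect_left(self.use_times, use_ts)
        if i < len(self.use_times) and self.use_times[i] == use_ts:
            return self.labels[i][0]
        return None  # the edge was not exercised at use_ts


explicit = DefUseEdge([(10, 10), (20, 20), (30, 30)])
inferred = DefUseEdge()
for ts in (10, 20, 30):
    assert explicit.def_timestamp(ts) == inferred.def_timestamp(ts)
```

Both edges answer the same queries, but the inferred one stores no labels and replaces the search with a constant-time computation.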
12 OPT-1b. Infer Local Def-Use Labels: Partial Elimination In Presence of Aliasing. Figure: statements "X =", "*P =", "= X"; edge labels (10,10) and (20,20) before; (10,10) and 0 after. *P is a may alias of X.
13 OPT-2a. Transform Local Def-Use Labels: Full Elimination In Presence of Aliasing. Figure: statements "Z =", "Y =", "X = f(Y)", "= X", "*P = g(Z)"; edge labels before: (10,11) (20,21), (10,11) (20,21), (11,11) (21,21); after the transformation: (10,11) (20,21), (10,11) (20,21), 0, 0.
14 OPT-2b. Transform Non-Local Def-Use to Local Use-Use Edges. Figure: statements "= X" and "X ="; edge labels before: (10,11) (20,21) and (10,11) (20,21); after: (10,11) (20,21) and a use-use edge labeled 0.
15 OPT-2c. Transform Non-Local Def-Use to Local Def-Use Edges. Figure: statements "X =", "Y =", "= Y", "= X" on paths 1 and 2; edge labels before: (1,3) (2,3) (10,12) (11,12); after adding a node for the path: 0, 0.
16 OPT-3. Redundant Labels Across Non-Local Def-Use Edges. Figure: statements "X =", "Y =", "= Y", "= X"; edge labels (1,2) and (10,11).
17 OPT-4 (Control Dep.). Infer Fixed Distance Unique Control Ancestor. Figure: path timestamps (32,33) (10,13) (20,23) (30,34) (21,22) (11,12) (31,32) (10,11) (20,21) (30,31); fixed-distance labels are replaced by the distance 1.
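A fixed-distance check of this kind can be sketched in a few lines. The sketch assumes each label is an (ancestor timestamp, dependent timestamp) pair, which is one reading of the figure; `infer_fixed_distance` and `ancestor_timestamp` are hypothetical names.

```python
def infer_fixed_distance(labels):
    """Return the constant timestamp distance shared by all labels.

    Returns None when the distance varies (or there are no labels), in
    which case the explicit labels would have to be kept.
    """
    distances = {dependent - ancestor for ancestor, dependent in labels}
    return distances.pop() if len(distances) == 1 else None


def ancestor_timestamp(dependent_ts, distance):
    """Recover the control ancestor's timestamp without a stored label."""
    return dependent_ts - distance


fixed = infer_fixed_distance([(10, 11), (20, 21), (30, 31)])
varying = infer_fixed_distance([(10, 13), (20, 23), (30, 34)])
print(fixed, varying)  # 1 None
```

With the distance 1 stored once on the edge, the ancestor of the instance at timestamp 21 is found as 21 - 1 = 20, with no per-instance label.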
18 OPT-5a. Transform Multiple Control Ancestors. Figure: path timestamps before: (32,33) (10,13) (20,23) (30,34) (21,22); after: (10,13) (30,34).
19 OPT-5b. Transform Varying Distance to Unique Control Ancestors
20 OPT-6. Redundant Across Non-Local Def-Use and Control Dependence Edges. Figure: statements "X =", "If P", "= X"; edge label (1,2).
21 Completeness of Label Elimination Optimizations. Data dependence labels: local to a basic block: infer (OPT-1a, OPT-1b), transform (OPT-2a); non-local across basic blocks: transform (OPT-2b, OPT-2c), redundant (OPT-3). Control dependence labels: infer (OPT-4), transform (OPT-5a, OPT-5b), redundant (OPT-6).
22 Slicing algorithm (1). Figure: within one node at timestamp t, s1: v=f(x,…) depends on s2: x=… through an edge labeled 0; the slice is extended with {s2}.
23 Slicing algorithm (2). Figure: within one node at timestamp t, s1: v=f(x,…) is linked to s2: …=x by a use-use edge labeled 0.
24 Slicing algorithm (3). Figure: s1: v=f(x,…) at timestamp t has candidate definitions s3: x=… and s4: x=…; the slice is extended with {s3} at timestamp t’.
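The three cases can be combined into one traversal over a compacted graph. The sketch below is one reading of the slides, not the authors' algorithm: the graph encoding, the function `slice_at`, and the statements s2, s5, s6 in the example are hypothetical. An edge with no labels carries the timestamp over unchanged, a labeled edge is followed only if it has an entry for the current timestamp, and a use-use edge is passed through without adding its target to the slice (assuming the earlier use reads only the shared variable).

```python
def slice_at(graph, stmt, ts):
    """Dynamic slice of `stmt` executed at timestamp `ts`.

    graph[stmt] is a list of (kind, target, labels) edges:
      kind   -- "def" adds target to the slice, "use" only passes through
      labels -- None for an inferred edge, else {use_ts: def_ts}
    """
    result, seen, stack = {stmt}, set(), [(stmt, ts)]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        current, t = node
        for kind, target, labels in graph.get(current, ()):
            if labels is None:
                next_ts = t          # cases (1) and (2): same timestamp
            elif t in labels:
                next_ts = labels[t]  # case (3): look up the matching label
            else:
                continue             # edge not exercised at time t
            if kind == "def":
                result.add(target)
            stack.append((target, next_ts))
    return result


graph = {
    "s1": [("def", "s3", {12: 5}), ("def", "s4", {22: 15})],
    "s3": [("def", "s2", None)],   # inferred local def-use edge
    "s4": [("use", "s5", None)],   # local use-use edge
    "s5": [("def", "s6", {15: 9})],
}
early = slice_at(graph, "s1", 12)
late = slice_at(graph, "s1", 22)
print(sorted(early), sorted(late))
```

The same statement s1 yields different slices at different timestamps, because only the edge whose labels match the queried execution is followed.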
25 Shortcuts to Speed Up Traversal. Statements: 0: X = ; 1: Y = f(X); 2: Z = g(Y); 3: … = Z. Edge labels before: (10,11) (20,21), 0, 0. After: (10,11) (20,21), 0, and a shortcut labeled {2}.
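One way to read the {2} label is that a chain of label-free edges is collapsed into a single shortcut edge that remembers which statements it skips. The sketch below follows that reading and is an assumption, not the authors' implementation: `add_shortcuts` is a hypothetical name, each statement is assumed to have at most one label-free predecessor, and the label-free edges are assumed to be acyclic.

```python
def add_shortcuts(static_edges):
    """Collapse chains of label-free edges.

    static_edges maps a statement to its single label-free predecessor.
    The result maps each statement to (end_of_chain, skipped), where
    `skipped` holds the intermediate statements the shortcut jumps over.
    """
    shortcuts = {}
    for stmt, pred in static_edges.items():
        skipped = set()
        current = pred
        while current in static_edges:
            skipped.add(current)
            current = static_edges[current]
        shortcuts[stmt] = (current, skipped)
    return shortcuts


# 3 -> 2 -> 1 are label-free edges; 1 -> 0 keeps its timestamp labels.
chain = {3: 2, 2: 1}
shortcuts = add_shortcuts(chain)
print(shortcuts[3])  # (1, {2})
```

A traversal starting at statement 3 can then jump straight to statement 1 and add the precomputed set {2} to the slice, instead of visiting statement 2.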
26 Experimental Setup. Implementation: Trimaran (C programs, IR: intermediate representation). An instrumented interpreter executes the IR and collects a compact control flow trace and memory trace. The CFG and PDG are constructed at the IR level, so slicing is also at the IR level. Experiment: to get fair comparisons among algorithms, the implementations share as much code as possible. Machine: 2.2 GHz Pentium, 2 GB RAM, 1 GB swap space. For each benchmark we collected 3 different traces; for each trace we computed 25 randomly chosen slices.
27 OPT: Compacted Graph Sizes. Columns: Program | Graph Size (MB): Before, After | Before / After | Explicit Dependences (%). Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go. Values: 1,568 1,296 1,442 1,816 1, ,954 1,745 1,534 1,
28 OPT: Effects
29 OPT: Slicing Times at Different Execution Points
30 OPT: Benefit of Shortcuts. Columns: Program | OPT Slicing Times (Avg. of 25 slices): W/O Shortcuts (Seconds), With Shortcuts (Seconds). Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go.
31 OPT vs. LP: Graph Sizes. Columns: Program | Graph Size (MB): OPT, LP (Max. of 25). Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go.
32 OPT vs. LP: Slicing Times. Columns: Program | Slicing Times (Avg. of 25 slices): OPT (Seconds), LP (Minutes). Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go.
33 Traditional vs. OPT: Short Program Runs. Columns: Program | Slicing Times (Avg. of 25 slices): OPT (Seconds), Traditional (Seconds). Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go. Values: 36.3
34 Graph Construction Cost. Trace generation: the instrumented program takes twice as long to run as the uninstrumented program. Trace preprocessing for graph construction: Time(LP) < Time(OPT) < Time(Traditional). Columns: Program | LP (min) | OPT (min) | Trad. (min). Programs: 300.twolf, 256.bzip2, 255.vortex, 197.parser, 181.mcf, 164.gzip, 134.perl, 130.li, 126.gcc, 099.go.
35 Conclusion. A straightforward implementation of a precise algorithm is not practical. Carefully designed dynamic slicing algorithms provide precise dynamic slices at reasonable space and time costs. Our work is one step toward making dynamic slicing practical. Ongoing work: efficient online compression, giving another 5-10 times reduction; 15 MB for 150 million executed statements (over 100 times reduction in total); 4-10 times slowdown.