Download presentation
Presentation is loading. Please wait.
Published byCesar Applewhite Modified over 9 years ago
1
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University
2
2 Static Analysis Interesting Questions Is every statement reachable? Does every non- void method return a value? Will local variables definitely be assigned before they are read? Will the current value of a variable ever be read? ... How much heap space will the program need? Does the program always terminate? Will the output always be correct?
3
3 Static Analysis Rice’s Theorem Theorem 11.9 (Martin p. 420) If R is a property of languages that is satisfied by some but not all recursively enumerable languages then the decision problem P R : Given a TM T, does L(T) have property R? is unsolvable.
4
4 Static Analysis Rice’s Theorem Explained Theorem 11.9 (Martin, p. 420) "Every non-trivial question about the behavior of a program is undecidable."
5
5 Static Analysis Static analysis provides approximate answers to non-trivial questions about programs The approximation is conservative, meaning that the answers only err to one side Compilers spend most of their time performing static analysis so they may: understand the semantics of programs provide safety guarantees generate efficient code
6
6 Static Analysis Conservative Approximation A typical scenario for a boolean property: if the analysis says yes, the property definitely holds if it says no, the property may or may not hold only the yes answer will help the compiler a trivial analysis will say no always the engineering challenge is to say yes often enough For other kinds of properties, the notion of approximation may be more subtle
7
7 Static Analysis A Range of Static Analyses Static analysis may take place: at the source code level at some intermediate level at the machine code level Static analysis may look at: statement blocks only an entire method (intraprocedural) the whole program (interprocedural) The precision and cost both rise as we include more information
8
8 Static Analysis The Phases of GCC (1/2) Parsing Tree optimization RTL generation Sibling call optimization Jump optimization Register scan Jump threading Common subexpression elimination Loop optimizations Jump bypassing Data flow analysis Instruction combination If-conversion Register movement Instruction scheduling Register allocation Basic block reordering Delayed branch scheduling Branch shortening Assembly output Debugging output
9
9 Static Analysis The Phases of GCC (2/2) Parsing Tree optimization RTL generation Sibling call optimization Jump optimization Register scan Jump threading Common subexpression elimination Loop optimizations Jump bypassing Data flow analysis Instruction combination If-conversion Register movement Instruction scheduling Register allocation Basic block reordering Delayed branch scheduling Branch shortening Assembly output Debugging output Static analysis uses 60% of the compilation time
10
10 Static Analysis Reachability Analysis Java requires two reachability guarantees: all statements must be reachable (avoid dead code) all non- void methods must return a value These are non-trivial properties and thus they are undecidable But a static analysis may provide conservative approximations To ensure that different compilers accept the same programs, the Java language specification mandates a specific static analysis
11
11 Static Analysis Constraint-Based Analysis For every node S that represents a statement in the AST, we define two boolean properties: C[[S]] denotes that S may complete normally R[[S]] denotes that S is possibly reachable A statement may only complete if it is reachable For each syntactic kind of statement, we generate constraints that relate C[[...]] and R[[...]]
12
12 Static Analysis Information Flow The values of R[[...]] are inherited The values of C[[...]] are synthesized AST R C
13
13 Static Analysis Reachability Constraints (1/3) if( E ) S: R[[S]] = R[[ if( E ) S]] C[[ if ( E ) S]] = R[[ if( E ) S]] if( E ) S 1 else S 2 : R[[S i ]] = R[[ if( E ) S 1 else S 2 ]] C[[ if( E ) S 1 else S 2 ]] = C[[S 1 ]] C[[S 2 ]] while(true) S: R[[S]] = R[[ while(true) S]] C[[ while(true) S]] = false while(false) S: R[[S]] = false C[[ while(false) S]] = R[[ while(false) S]]
14
14 Static Analysis Reachability Constraints (2/3) while( E ) S: R[[S]] = R[[ while( E ) S]] C[[ while( E ) S]] = R[[ while( E ) S]] return : C[[ return ]] = false return E: C[[ return E]] = false throw E: C[[ throw E]] = false { σ x ; S } : R[[S]] = R[[ { σ x ; S } ]] C[[ { σ x ; S } ]] = C[[S]]
15
15 Static Analysis Reachability Constraints (3/3) S 1 S 2 : R[[S 1 ]] = R[[S 1 S 2 ]] R[[S 2 ]] = C[[S 1 ]] C[[S 1 S 2 ]] = C[[S 2 ]] for any simple statement S: C[[S]] = R[[S]] for any method or constructor body { S } : R[[S]] = true
16
16 Static Analysis Exploiting the Information For any statement S where R[[S]] = false: unreachable statement For any non- void method with body { S } where C[[S]] = true: missing return statement These guarantees are sound but conservative
17
17 Static Analysis Approximations C[[S]] may be true too often: some unfair missing return errors may occur if (b) return 17; if (!b) return 42; R[[S]] may be true too often: some dead code is not detected if (b==!b) {... }
18
18 Static Analysis Definite Assignment Analysis Java requires that a local variable is assigned before its value is used This is a non-trivial property and thus it is undecidable But a static analysis may provide a conservative approximation To ensure that different compilers accept the same programs, the Java language specification mandates a specific static analysis
19
19 Static Analysis Constraint-Based Analysis For every node S that represents a statement in the AST, we define some set-valued properties: B[[S]] denotes the variables that are definitely assigned before S is executed A[[S]] denotes the variables that are definitely assigned after S is executed For every node E that represents an expression in the AST, we similarly define B[[E]] and A[[E]]
20
20 Static Analysis Increased Precision To handle cases such as: { int k; if (a>0 && (k=System.in.read())>0) System.out.print(k); } we also use two refinements of A[[...]]: A t [[E]] which assumes that E evaluates to true A f [[E]] which assumes that E evaluates to false
21
21 Static Analysis Information Flow The values of B[[...]] are inherited The values of A[[...]], A t [[....]] and A f [[...]] are synthesized AST B A, A t, A f
22
22 Static Analysis Definite Assignment Constraints (1/7) if( E ) S: B[[E]] = B[[ if( E ) S]] B[[S]] = A t [[E]] A[[ if( E ) S]] = A[[S]] A f [[E]] if( E ) S 1 else S 2 : B[[E]] = B[[ if( E ) S 1 else S 2 ]] B[[S 1 ]] = A t [[E]] B[[S 2 ]] = A f [[E]] A[[ if( E ) S 1 else S 2 ]] = A[[S 1 ]] A[[S 2 ]]
23
23 Static Analysis Definite Assignment Constraints (2/7) while( E ) S: B[[E]] = B[[ while( E ) S]] B[[S]] = A t [[E]] A[[ while( E ) S]] = A f [[E]] return : A[[ return ]] = return E: B[[E]] = B[[ return E]] A[[ return E]] = throw E: B[[E]] = B[[ throw E]] A[[ throw E]] = the set of all variables in scope
24
24 Static Analysis Definite Assignment Constraints (3/7) E ; : B[[E]] = B[[E ; ]] A[[E ; ]] = A[[E]] { σ x = E ; S } : B[[E]] = B[[ { σ x = E ; S } ]] B[[S]] = A[[E]] {x} A[[ { σ x = E ; S } ]] = A[[S]] { σ x ; S } : B[[S]] = B[[ { σ x ; S } ]] A[[ { σ x ; S } ]] = A[[S]]
25
25 Static Analysis Definite Assignment Constraints (4/7) S 1 S 2 : B[[S 1 ]] = B[[S 1 S 2 ]] B[[S 2 ]] = A[[S 1 ]] A[[S 1 S 2 ]] = A[[S 2 ]] x = E: B[[E]] = B[[x = E]] A[[x = E]] = A[[E]] {x} x[E 1 ] = E 2 : B[[E 1 ]] = B[[x[E 1 ] = E 2 ]] B[[E 2 ]] = A[[E 1 ]] A[[x[E 1 ] = E 2 ]] = A[[E 2 ]]
26
26 Static Analysis Definite Assignment Constraints (5/7) true : A t [[ true ]] = B[[ true ]] A f [[ true ]] = A[[ true ]] = B[[ true ]] false : A t [[ false ]] = A f [[ false ]] = B[[ false ]] A[[ false ]] = B[[ false ]] ! E: B[[E]] = B[[ ! E]] A f [[ ! E]] = A t [[E]] A[[ ! E]] = A[[E]] A t [[ ! E]] = A f [[E]]
27
27 Static Analysis Definite Assignment Constraints (6/7) E 1 && E 2 : B[[E 1 ]] = B[[E 1 && E 2 ]] B[[E 2 ]] = A t [[E 1 ]] A t [[E 1 && E 2 ]] = A t [[E 2 ]] A f [[E 1 && E 2 ]] = A f [[E 1 ]] A f [[E 2 ]] A[[E 1 && E 2 ]] = A t [[E 1 && E 2 ]] A f [[E 1 && E 2 ]] E 1 || E 2 : B[[E 1 ]] = B[[E 1 || E 2 ]] B[[E 2 ]] = A f [[E 1 ]] A t [[E 1 || E 2 ]] = A t [[E 1 ]] A t [[E 2 ]] A f [[E 1 || E 2 ]] = A f [[E 2 ]] A[[E 1 || E 2 ]] = A t [[E 1 || E 2 ]] A f [[E 1 || E 2 ]]
28
28 Static Analysis Definite Assignment Constraints (7/7) EXP(E 1,...,E k ): (any other expression with subexpressions) B[[E 1 ]] = B[[EXP(E 1,...,E k )]] B[[E i+1 ]] = A[[E i ]] A[[EXP(E 1,...,E k )]] = A[[E k ]] When not specified otherwise: A t [[E]] = A f [[E]] = A[[E]]
29
29 Static Analysis Exploiting the Information For every expression E of the form: x x ++ x -- x[E'] where x B[[E]]: variable might not have been initialized
30
30 Static Analysis Approximation A[[...]] and B[[...]] may be too small: some unfair uninitialized variable errors may occur { int x; if (b) x = 17; if (!b) x = 42 System.out.print(x); }
31
31 Static Analysis A Simpler Guarantee In Joos 1, definite assignment is guaranteed by: requiring initializers for all local declarations forbidding a local variable to appear in its own initializer This is an even coarser approximation: { int x = (x=1)+42; System.out.print(x) }
32
32 Static Analysis Flow-Sensitive Analysis The analyses for: reachability definite assignment may simply be computed by traversing the AST Other analyses are defined on the control flow graph of a program and require more complex techniques
33
33 Static Analysis Motivation: Register Optimization For native code, we may want to optimize the use of registers: mov 1,R3 mov 1,R1 mov R3,R1 This optimization is only sound if the value in R3 is not used in the future
34
34 Static Analysis Motivation: Register Spills When pushing a new frame, we write back all variables from registers to memory It would be better to only write back those registers whose values may be used in the future c b a y x R1 R2 R3
35
35 Static Analysis Liveness In both examples, we need to know if the value of some register R i might be read in the future If so, it is called live (and otherwise dead) Exact liveness is of course undecidable A static analysis may conservatively approximate liveness at each program point A trivial analysis thinks everything is live A superior analysis identifies more dead registers
36
36 Static Analysis Liveness Analysis For every program point S i we define the following set-valued properties: B[[S i ]] denotes the set of registers that are possibly live before S i A[[S i ]] denotes the set of registers that are possibly live after S i For every program point we generate a constraint that relates A[[...]] and B[[...]] for neighboring program points We no longer just use the AST...
37
37 Static Analysis Terminology succ(S i ) denotes the set of program points to which execution may continue (by falling through or jumping) uses(S i ) denotes the set of registers that S i reads defs(S i ) denotes the set of registers that S i writes
38
38 Static Analysis A Tiny Example uses( S i )defs( S i ) succ( S i ) S1: mov 3,R1 {}{R1} {S2} S2:mov 4,R2 {}{R2} {S3} S3:add R1,R2,R3 {R1,R2}{R3} {S4} S4:mov R3,R0 {R3}{R0} {S5} S5:return {R0}{} {}
39
39 Static Analysis Dataflow Constraints For every program point S i we have: B[[S i ]] = uses(S i ) (A[[S i ]] \ defs(S i )) A[[S i ]] = B[[x]] x succ(S i ) A cyclic control flow graph will generate a cyclic collection of constraints
40
40 Static Analysis Dataflow Example S7: add R1,R2,R3 S8 S9 S17 B[[ S8 ]]={ R4 } B[[ S9 ]]={ R1 } B[[ S17 ]]={ R3 } B[[ S7 ]]={ R1, R2 } ({ R4, R1, R3 }\{ R3 })={ R1, R2, R4 }
41
41 Static Analysis A Small Example { int i, even, odd, sum; i = 1; even = 0; odd = 0; sum = 0; while (i < 10) { if (i%2 == 0) even = even+i; else odd = odd+i; sum = sum+i; i++; }
42
42 Static Analysis Generated Native Code mov 1,R1 // R1 is i mov 0,R2 // R2 is even mov 0,R3 // R3 is odd mov 0,R4 // R4 is sum loop: andcc R1,1,R5 // R5 = R1 & 1 cmp R5,0 bne else // if R5!=0 goto else add R2,R1,R2 // R2 = R2+R1 b endif else: add R3,R1,R3 // R3 = R3+R1 endif: add R4,R1,R4 // R4 = R4+R1 add R1,1,R1 // R1 = R1+1 cmp R1,9 ble loop // if i<=9 goto loop
43
43 Static Analysis The Control Flow Graph S5: andcc R1,1,R5 S1: mov 1,R1 S6: cmp R5,0 S8: add R2,R1,R2 S2: mov 0,R2 S7: bne S9 S10: add R4,R1,R4 S9: add R3,R1,R3 S3: mov 0,R3 S11: add R1,1,R1 S4: mov 0,R4 S12: cmp R1,9 S13: ble S5
44
44 Static Analysis Cyclic Constraints B[[ S1 ]] = f 1 (B[[ S2 ]]) B[[ S2 ]] = f 2 (B[[ S3 ]]) B[[ S3 ]] = f 3 (B[[ S4 ]]) B[[ S4 ]] = f 4 (B[[ S5 ]]) B[[ S5 ]] = f 5 (B[[ S6 ]]) B[[ S6 ]] = f 6 (B[[ S7 ]]) B[[ S7 ]] = f 7 (B[[ S8 ]],B[[ S9 ]]) B[[ S8 ]] = f 8 (B[[ S10 ]]) B[[ S9 ]] = f 9 (B[[ S10 ]]) B[[ S10 ]] = f 10 (B[[ S11 ]]) B[[ S11 ]] = f 11 (B[[ S12 ]]) B[[ S12 ]] = f 12 (B[[ S13 ]]) B[[ S13 ]] = f 13 (B[[ S5 ]]) where the f i (...) functions express the local constraints
45
45 Static Analysis Fixed-Point Solutions Reminder: x is a fixed point of a function f iff f(x)=x Define = ( , , , , , , , , , , , ) Define R = { R0, R1, R2, R3, R4, R5 }, (R) = Powerset of R Define F: (R) 13 → (R) 13 as: F(X 1,X 2,X 3,X 4,X 5,X 6,X 7,X 8,X 9,X 10,X 11,X 12,X 13 ) = (f 1 (X 2 ),f 2 (X 3 ),f 3 (X 4 ),f 4 (X 5 ),f 5 (X 6 ),f 6 (X 7 ),f 7 (X 8,X 9 ), f 8 (X 9 ),f 9 (X 10 ),f 10 (X 11 ),f 11 (X 12 ),f 12 (X 13 ),f 13 (X 5 )) A solution is now a fixed point X (R) 13 such that F(X)=X The least fixed point is computed as a Kleene iteration: F n ( ) for some n 0
46
46 Static Analysis Computing the Least Fixed Point (1/3) F( )F 2 ( ) F 3 ( ) S1{}{}{}{} S2{}{}{}{} S3{}{}{}{R1} S4{}{}{R1}{R1} S5{}{R1}{R1}{R1} S6{}{R5}{R5}{R1,R2,R3,R5} S7{}{}{R1,R2,R3}{R1,R2,R3,R4} S8{}{R1,R2}{R1,R2,R4}{R1,R2,R4} S9{}{R1,R3}{R1,R3,R4}{R1,R3,R4} S10{}{R1,R4}{R1,R4}{R1,R4} S11{}{R1}{R1}{R1} S12{}{R1}{R1}{R1} S13{}{R1}{R1}{R1}
47
47 Static Analysis Computing the Least Fixed Point (2/3) F 4 ( ) F 5 ( ) F 6 ( ) S1{}{}{} S2{R1}{R1}{R1} S3{R1}{R1}{R1,R2} S4{R1}{R1,R2,R3}{R1,R2,R3} S5{R1,R2,R3}{R1,R2,R3,R4}{R1,R2,R3,R4} S6{R1,R2,R3,R4,R5}{R1,R2,R3,R4,R5}{R1,R2,R3,R4,R5} S7{R1,R2,R3,R4}{R1,R2,R3,R4}{R1,R2,R3,R4} S8{R1,R2,R4}{R1,R2,R4}{R1,R2,R4} S9{R1,R3,R4}{R1,R3,R4}{R1,R3,R4} S10{R1,R4}{R1,R4}{R1,R4} S11{R1}{R1}{R1,R2,R3} S12{R1}{R1,R2,R3}{R1,R2,R3,R4} S13{R1,R2,R3}{R1,R2,R3,R4}{R1,R2,R3,R4}
48
48 Static Analysis Computing the Least Fixed Point (3/3) F 7 ( ) F 8 ( ) S1{}{} S2{R1}{R1} S3{R1,R2}{R1,R2} S4{R1,R2,R3}{R1,R2,R3} S5{R1,R2,R3,R4}{R1,R2,R3,R4} S6{R1,R2,R3,R4,R5}{R1,R2,R3,R4,R5} S7{R1,R2,R3,R4}{R1,R2,R3,R4} S8{R1,R2,R4}{R1,R2,R3,R4} S9{R1,R3,R4}{R1,R2,R3,R4} S10{R1,R2,R3,R4}{R1,R2,R3,R4} S11{R1,R2,R3,R4}{R1,R2,R3,R4} S12{R1,R2,R3,R4}{R1,R2,R3,R4} S13{R1,R2,R3,R4}{R1,R2,R3,R4} F 8 ( )= F 9 ( )
49
49 Static Analysis Background: Why does this work? (R), the powerset of R, forms a lattice: It is a partial order, i.e., it is reflexive: S S transitive: if S 1 S 2 and S 2 S 3 then S 1 S 3 anti-symmetric: if S 1 S 2 and S 2 S 1 then S 1 =S 2 It has a least element ( ) and a greatest element (R) Any two elements have a join S 1 S 2 and a meet S 1 S 2 Fixed point theorem: In a finite lattice L a monotone function F: L L has a unique least fixed point: F n ( ) computable by Kleene iteration n 0
50
50 Static Analysis Application: Register Allocation Variables that are never live at the same time may share the same register Create a graph of variables where edges indicate simultaneous liveness: Register allocation is done by finding a minimal graph coloring and assigning a register to each color: {{ a, d, f }, { b, e }, { c }} a b c d e f
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.