Recap from last time
  We saw a variety of issues related to program analysis and program transformations
  You were not expected to know all of these things
  We will learn more about each of the topics we touched upon last time

Project schedule
  Find groups a week from today
    – groups of 2-3
    – groups of 1 possible if you ask me
  Project proposal due the following Tuesday
    – but... how are we supposed to come up with ideas!!!!

Ideas for generating project ideas
  If you are currently doing research
    – that is related to programming languages/compilers
    – carve out some part of your current research, and do it for the project
  Think about what annoys you the most
    – in programming
    – and fix it using program analysis techniques

Ideas for generating project ideas
  Think about technology trends
    – iPhone, mobile devices
    – multicore, parallelism
    – web browsers, JavaScript, etc.
    – Bluetooth, ad-hoc networks
    – how do these change the way we program/compile?
  Try to find bugs in real programs using PL techniques
    – search for “Dawson Engler” and look at some of his papers. Try to use similar techniques to find real bugs.

Ideas for generating project ideas
  How can I use information that is out there?
    – ask yourself: what information is out there? And how can I record it/analyze it/use it to improve programmer productivity?
    – For example: CVS logs, Bugzilla, keystrokes, mouse movements, etc.
  Talk to your neighbor
    – in the 231 class, that is
    – have a discussion

Tour of common optimizations

Simple example
    foo(z) {
      x := 3 + 6;
      y := x - 5;
      return z * y
    }
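
Once `x` and `y` are known to be constants, the body of `foo` reduces to `return z * 4`. A minimal Python sketch of constant propagation plus constant folding over straight-line code is given here. The tuple-based statement format, the `ret` pseudo-variable, and the name `fold_constants` are assumptions made for illustration; they are not part of the course material, and branches and loops are not handled.

```python
import operator

# Hypothetical mini-IR: each statement is (target, left, op, right),
# where an operand is an int (constant) or a str (variable name).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold_constants(stmts):
    """Propagate known constants and fold operations on two constants."""
    known = {}  # variable name -> constant value discovered so far
    out = []
    for target, left, op, right in stmts:
        # Replace variable operands whose value is a known constant.
        if isinstance(left, str):
            left = known.get(left, left)
        if isinstance(right, str):
            right = known.get(right, right)
        if isinstance(left, int) and isinstance(right, int):
            known[target] = OPS[op](left, right)
            out.append((target, known[target]))      # target := constant
        else:
            known.pop(target, None)                  # target is no longer a known constant
            out.append((target, left, op, right))
    return out

# The body of foo(z), with "ret" standing in for the returned value.
body = [("x", 3, "+", 6), ("y", "x", "-", 5), ("ret", "z", "*", "y")]
result = fold_constants(body)
```

Here `result` holds `x := 9`, `y := 4`, and `ret := z * 4`; the two constant assignments then become candidates for removal if nothing else reads `x` or `y`.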

Another example
    x := a + b;
    ...
    y := a + b;
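
The second `a + b` is redundant as long as `a`, `b`, and `x` are not reassigned in between, so it can be replaced by `y := x` (common subexpression elimination, CSE). The following Python sketch shows one way to do this for straight-line code. The statement tuples and the function name are hypothetical, and the sketch assumes no pointers or aliasing and does not treat `a + b` and `b + a` as the same expression.

```python
def eliminate_common_subexpressions(stmts):
    """Straight-line CSE over (target, left, op, right) statements."""
    available = {}  # (left, op, right) -> variable that currently holds its value
    out = []
    for target, left, op, right in stmts:
        expr = (left, op, right)
        holder = available.get(expr)
        if holder is not None:
            out.append((target, holder))            # reuse: target := holder
        else:
            out.append((target, left, op, right))
        # Writing target invalidates expressions that read it or are held in it.
        available = {e: v for e, v in available.items()
                     if target not in (e[0], e[2]) and v != target}
        # The expression is available in target unless target is one of its operands.
        if target not in (left, right):
            available.setdefault(expr, target)
    return out

stmts = [("x", "a", "+", "b"), ("c", "d", "*", "e"), ("y", "a", "+", "b")]
optimized = eliminate_common_subexpressions(stmts)
```

In `optimized` the last statement becomes the copy `("y", "x")`. If an intervening statement assigned to `a`, the entry for `a + b` would be dropped and the recomputation kept.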

Another example
    if (...) {
      x := a + b;
    }
    ...
    y := a + b;

Another example
    x := y
    ...
    z := z + x

Another example
    x := y
    ...
    z := z + y
  What if we run CSE now?
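
Rewriting `z := z + x` into `z := z + y` is copy propagation: after `x := y`, uses of `x` can read `y` directly until either variable is reassigned. A minimal sketch for straight-line code follows. The `("copy", ...)` and `("op", ...)` statement shapes and the function name are assumptions chosen for this illustration.

```python
def propagate_copies(stmts):
    """stmts: ("copy", target, source) or ("op", target, left, op, right)."""
    copies = {}  # target -> source, for copies "target := source" that are still valid
    out = []
    for stmt in stmts:
        if stmt[0] == "copy":
            _, target, source = stmt
            new_stmt = ("copy", target, copies.get(source, source))
        else:
            _, target, left, op, right = stmt
            new_stmt = ("op", target,
                        copies.get(left, left), op, copies.get(right, right))
        out.append(new_stmt)
        # Writing target invalidates every recorded copy that mentions it.
        copies = {t: s for t, s in copies.items() if target not in (t, s)}
        if new_stmt[0] == "copy" and new_stmt[2] != target:
            copies[target] = new_stmt[2]
    return out

before = [("copy", "x", "y"), ("op", "z", "z", "+", "x")]
after = propagate_copies(before)
```

After the pass, `x := y` is still present but may no longer have any readers, which is what makes a later clean-up pass worthwhile.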

Another example
    x := y**z
    ...
    x := ...

Another example
  Often used as a clean-up pass
    x := y**z
    ...
    x := ...
  For example, copy prop followed by DAE:
    x := y; z := z + x
    → (copy prop) →  x := y; z := z + y
    → (DAE) →  z := z + y
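
The clean-up step can be sketched as a backward walk that keeps an assignment only if its target is read later. This is a simplified illustration for straight-line code. It assumes right-hand sides have no side effects and that the caller supplies the set of variables still needed at the end (`live_out`); both the statement format and the names are hypothetical.

```python
def eliminate_dead_assignments(stmts, live_out):
    """stmts: list of (target, operands); drop assignments whose target is never read."""
    live = set(live_out)   # variables whose current value may still be read
    kept = []
    for target, operands in reversed(stmts):
        if target in live:
            live.discard(target)     # this assignment satisfies the later reads
            live.update(operands)    # and its operands are now needed
            kept.append((target, operands))
        # Otherwise the assigned value is never read: drop the statement.
    kept.reverse()
    return kept

# After copy propagation: x := y; z := z + y, with only z needed afterwards.
cleaned = eliminate_dead_assignments([("x", ["y"]), ("z", ["z", "y"])], live_out={"z"})
```

With `live_out = {"z"}` the assignment `x := y` is dropped. The same walk removes the first assignment in `x := y**z; ...; x := ...` when nothing reads `x` in between.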

Another example
    if (false) {
      ...
    }

Another example
  In Java:
    a = new int[10];
    for (index = 0; index < 10; index++) {
      a[index] = 100;
    }

Another example
  In “lowered” Java:
    a = new int[10];
    for (index = 0; index < 10; index++) {
      if (index < 0 || index >= a.length) {
        throw OutOfBoundsException;
      }
      a[index] = 100;
    }

Another example
    p := &x;
    *p := 5
    y := x + 1;

Another example
    p := &x;
    *p := 5
    y := x + 1;

    x := 5;
    *p := 3
    y := x + 1;   ???

Another example
    for j := 1 to N
      for i := 1 to M
        a[i] := a[i] + b[j]

Another example
    area(h,w) { return h * w }

    h := ...;
    w := 4;
    a := area(h,w)

Optimization themes
  Don’t compute if you don’t have to
    – unused assignment elimination
  Compute at compile-time if possible
    – constant folding, loop unrolling, inlining
  Compute it as few times as possible
    – CSE, PRE, PDE, loop invariant code motion
  Compute it as cheaply as possible
    – strength reduction
  Enable other optimizations
    – constant and copy prop, pointer analysis
  Compute it with as little code space as possible
    – unreachable code elimination

Dataflow analysis

Dataflow analysis: what is it?
  A common framework for expressing algorithms that compute information about a program
  Why is such a framework useful?
  Provides a common language, which makes it easier to:
    – communicate your analysis to others
    – compare analyses
    – adapt techniques from one analysis to another
    – reuse implementations (e.g., dataflow analysis frameworks)

Control Flow Graphs
  For now, we will use a Control Flow Graph representation of programs
    – each statement becomes a node
    – edges between nodes represent control flow
  Later we will see other program representations
    – variations on the CFG (e.g., CFG with basic blocks)
    – other graph based representations
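
As a concrete picture of "each statement becomes a node, edges represent control flow", a small Python sketch of a CFG data structure is shown here. The class, its method names, and the five-statement program are assumptions made for illustration; real compilers typically store richer statement objects and group statements into basic blocks.

```python
from collections import defaultdict

class CFG:
    """One node per statement; directed edges follow control flow."""

    def __init__(self):
        self.stmts = {}                  # node id -> statement text
        self.succs = defaultdict(list)   # node id -> successor node ids
        self.preds = defaultdict(list)   # node id -> predecessor node ids

    def add_node(self, node, stmt):
        self.stmts[node] = stmt

    def add_edge(self, src, dst):
        self.succs[src].append(dst)
        self.preds[dst].append(src)

# Hypothetical program: x := ...; if (...) { x := y } else { y := ... }; ... x ...
cfg = CFG()
for node, stmt in [(1, "x := ..."), (2, "if (...)"), (3, "x := y"),
                   (4, "y := ..."), (5, "... x ...")]:
    cfg.add_node(node, stmt)
for src, dst in [(1, 2), (2, 3), (2, 4), (3, 5), (4, 5)]:
    cfg.add_edge(src, dst)
```

The branch node 2 has two successors and the join node 5 has two predecessors; the edges are the program points at which analysis information is later attached.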

Example CFG
    x := ...
    y := ...
    p := ...
    if (...) {
      ... x ...
      x := y ...
    } else {
      ... x ...
      x := ...
      *p := ...
    }
    ... x y ...
    y := ...
  [Figure: control flow graph of the program above, one node per statement]

An example DFA: reaching definitions
  For each use of a variable, determine what assignments could have set the value being read from the variable
  Information useful for:
    – performing constant and copy prop
    – detecting references to undefined variables
    – presenting “def/use chains” to the programmer
    – building other representations, like the DFG
  Let’s try this out on an example

Visual sugar
  [Figure: the example CFG with each definition labeled by a number]
    1: x := ...
    2: y := ...
    3: y := ...
    4: p := x ...
    5: x := y x ...
    6: x := ...
    7: *p := x y ...
    8: y := ...

  [Figure: the example CFG with numbered definitions]
    1: x := ...
    2: y := ...
    3: y := ...
    4: p := x ...
    5: x := y x ...
    6: x := ...
    7: *p := x y ...
    8: y := ...

Safety
  Recall intended use of this info:
    – performing constant and copy prop
    – detecting references to undefined variables
    – presenting “def/use chains” to the programmer
    – building other representations, like the DFG
  Safety:
    – can have more bindings than the “true” answer, but can’t miss any

Reaching definitions generalized
  DFA framework is geared towards computing information at each program point (edge) in the CFG
    – So generalize the reaching definitions problem by stating what should be computed at each program point
  For each program point in the CFG, compute the set of definitions (statements) that may reach that point
  Notion of safety remains the same
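
One conventional way to compute this per-point information is to apply each statement's effect repeatedly until nothing changes. The Python sketch below is a simplified illustration, not the formulation used in the course. It records each definition as a (variable, node id) pair, ignores stores through pointers such as `*p := ...`, and uses a naive round-robin iteration. The input encoding and the function name are assumptions.

```python
def reaching_definitions(defs, succs):
    """defs: node id -> variable defined at that node (None if it defines nothing).
    succs: node id -> list of successor node ids.
    Returns node id -> set of (variable, node id) definitions that may reach
    the program point just before the node."""
    preds = {n: [] for n in defs}
    for n, targets in succs.items():
        for t in targets:
            preds[t].append(n)

    in_sets = {n: set() for n in defs}
    out_sets = {n: set() for n in defs}
    changed = True
    while changed:                       # iterate until a fixed point is reached
        changed = False
        for n in defs:
            # Merge: a definition reaches n if it leaves any predecessor.
            new_in = set().union(*(out_sets[p] for p in preds[n]))
            var = defs[n]
            if var is None:
                new_out = set(new_in)
            else:
                # A definition of var kills earlier definitions of var.
                new_out = {(v, s) for (v, s) in new_in if v != var} | {(var, n)}
            if new_in != in_sets[n] or new_out != out_sets[n]:
                in_sets[n], out_sets[n] = new_in, new_out
                changed = True
    return in_sets

# Hypothetical CFG: 1: x := ...; 2: if (...); 3: x := y; 4: y := ...; 5: ... x ...
defs = {1: "x", 2: None, 3: "x", 4: "y", 5: None}
succs = {1: [2], 2: [3, 4], 3: [5], 4: [5]}
reach = reaching_definitions(defs, succs)
```

At node 5 the result contains both definitions of `x` (nodes 1 and 3) and the definition of `y` at node 4. Reporting a definition that cannot actually reach a point would still be safe; omitting one that can would not.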

Reaching definitions generalized
  Computed information at a program point is a set of var → stmt bindings
    – e.g.: { x → s1, x → s2, y → s3 }
  How do we get the previous info we wanted?
    – if a var x is used in a stmt whose incoming info is in, then: { s | (x → s) ∈ in }
  This is a common pattern
    – generalize the problem to define what information should be computed at each program point
    – use the computed information at the program points to get the original info we wanted
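
The set expression for recovering per-use information translates almost directly into code. In this small sketch, bindings are modeled as (variable, statement) pairs; that encoding and the function name are assumptions made for illustration.

```python
def definitions_reaching_use(var, in_bindings):
    """Return { s | (var -> s) is in in_bindings }."""
    return {stmt for (v, stmt) in in_bindings if v == var}

# The example binding set: { x -> s1, x -> s2, y -> s3 }
incoming = {("x", "s1"), ("x", "s2"), ("y", "s3")}
x_defs = definitions_reaching_use("x", incoming)
```

For a use of `x`, the result is the set containing `s1` and `s2`. An empty result for some variable means no definition reaches that use, which is how references to undefined variables can be detected.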

  [Figure: the example CFG with numbered definitions]
    1: x := ...
    2: y := ...
    3: y := ...
    4: p := x ...
    5: x := y x ...
    6: x := ...
    7: *p := x y ...
    8: y := ...

Using constraints to formalize DFA
  Now that we’ve gone through some examples, let’s try to precisely express the algorithms for computing dataflow information
  We’ll model DFA as solving a system of constraints
  Each node in the CFG will impose constraints relating information at predecessor and successor points
  Solution to constraints is result of analysis