1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.

1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning parsing semantic analysis HIR intermediate code gen. HIR optim LIR code gen. LIR

2 Intermediate representation Components –code representation –symbol table –analysis information –string table Issues –Use an existing IR or design a new one? –How close should it be to the source/target?

3 IR selection Using an existing IR –cost savings due to reuse –it must be expressive and appropriate for the compiler operations Designing an IR –decide how close to machine code it should be –decide how expressive it should be –decide its structure –consider combining different kinds of IRs

4 IR classification: Level High-level –closer to source language –used early in the process –usually converted to lower form later on –Example: AST

5 IR classification: Level Medium-level –try to reflect the range of features in the source language in a language-independent way –most optimizations are performed at this level algebraic simplification copy propagation dead-code elimination common subexpression elimination loop-invariant code motion etc.

6 IR classification: Level Low-level –very close to target-machine instructions –architecture dependent –useful for several optimizations loop unrolling branch scheduling instruction/data prefetching register allocation etc.

7 IR classification: Level for i := op1 to op2 step op3 instructions endfor i := op1 if step < 0 goto L2 L1: if i > op2 goto L3 instructions i := i + step goto L1 L2: if i < op2 goto L3 instructions i := i + step goto L2 L3: High-level Medium-level

8 IR classification: Structure Graphical –trees, graphs –not easy to rearrange –large structures Linear –looks like pseudocode –easy to rearrange Hybrid –combine graphical and linear IRs –Example: low-level linear IR for basic blocks, and graph to represent flow of control

9 (Basic blocks) Basic block = a sequence of consecutive statements in which flow of control enters at the beginning and leaves at the end without halt or possibility of branching except at the end.

10 (Basic blocks) Partitioning a sequence of statements into BBs –Determine leaders (first statements of BBs) the first statement is a leader the target of a conditional is a leader a statement following a branch is a leader –For each leader, its basic block consists of the leader and all the statements up to but not including the next leader.

11 Linear IRs Sequence of instructions that execute in order of appearance Control flow is represented by conditional branches and jumps Common representations –stack machine code –three-address code

12 Linear IRs stack machine code –assumes presence of operand stack –useful for stack architectures, JVM –operations typically pop operands and push results. –advantages easy code generation compact form –disadvantages difficult to rearrange difficult to reuse expressions

13 Linear IRs three-address code –compact –generates temp variables –level of abstraction may vary –loses syntactic structure –quadruples operator up to two operands destination –triples similar to quadruples but the results are not named explicitly (index of operation is implicit name) –Implement as table, array of pointers, or list

14 Linear IRs L1: i := 2 t1:= i+1 t2 := t1>0 if t2 goto L1 (1) 2 (2) i st (1) (3) i + 1 (4) (3) > 0 (5) if (4), (1) Quadruples Triples

15 Graphical IRs Parse tree Abstract syntax tree –high-level –useful for source-level information –retains syntactic structure –Common uses source-to-source translation semantic analysis syntax-directed editors

16 Graphical IRs Tree, for basic block –root: operator –up to two children: operands –can be combined Uses: –algebraic simplifications –may generate locally optimal code. L1: i := 2 t1:= i+1 t2 := t1>0 if t2 goto L1 assgn, i add, t1 gt, t2 2 i 1 t1 0 2 assgn, i 1 add, t1 0 gt, t2

17 Graphical IRs Directed acyclic graphs (DAGs) –Like compressed trees leaves: variables, constants available on entry internal nodes: operators –annotated with variable names? distinct left/right children –Used for basic blocks (doesn't show control flow) –Can generate efficient code. Note: DAGs encode common expressions –But difficult to transform –Better for analysis

18 Graphical IRs Generating DAGs –check whether an operand is already present if not, create a leaf for it –check whether there is a parent of the operand that represents the same operation if not create one, then label the node representing the result with the name of the destination variable, and remove that label from all other nodes in the DAG.

19 Graphical IRs Directed acyclic graphs (DAGs) –Example m := 2 * y * z n := 3 * y * z p := 2 * y - z

20 Graphical IRs Control flow graphs (CFGs) –Each node corresponds to a basic block, or –fewer nodes –may need to determine facts at specific points within BB a single statement –more space and time –Each edge represents flow of control

21 Graphical IRs Dependence graphs –Encode flow of values from definition to use –Nodes represent operations –Edges connect definitions to uses –Graph represents constraints on the sequencing of operations –Built for specific optimizations, then discarded

22 SSA form Static Single Assignment Form –Encodes information about data and control flow –Two constraints: each definition has a unique name each use refers to a single definition –all uses reached by a definition are renamed –Example: x := 5 x 0 := 5 x := x+1 becomes x 1 := x 0 + 1 y := x *2 y 0 := x 1 * 2 What if we have a loop?

23 SSA form The compiler inserts special join functions (called  -functions) at points where different control flow paths meet. Example: read(x) read(x 0 ) if (x>0) if (x 0 >0) y:=5 y 0 := 5 else becomes else y:=10 y 1 := 10 x := y y 2 :=  (y 0, y 1 ) x 1 := y 2

24 SSA form Example 2: x := 0 x 0 := 0 i := 1 i 0 := 1 while (i =10) goto L2 x := x+i L1: i := i+1

25 SSA form Example 2: x := 0 x 0 := 0 i := 1 i 0 := 1 while (i =10) goto L2 x := x+i L1: x 1 :=  (x 0, x 2 ) i := i+1 i 1 :=  (i 0, i 2 ) x 2 := x 1 +i 1 i 2 := i 1 +1 if (i 2 <10) goto L1 L2: x 3 :=  (x 0, x 2 ) i 3 :=  (i 0, i 2 )

26 SSA form Note:  is not an executable function A program is in SSA form if –each variable is assigned a value in exactly one statement –each use of a variable is dominated by the definition. point x dominates point y if every path from the start to y goes through x

27 SSA form Why use SSA? –explicit def-use pairs –no write-after-read and write-after-write dependences –speeds up many dataflow optimizations But –too many temp variables,  -functions –limited to scalars how to handle arrays?

1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.

Similar presentations

Presentation on theme: "1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.

Similar presentations

Presentation on theme: "1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning."— Presentation transcript:

Similar presentations

About project

Feedback