Download presentation
Presentation is loading. Please wait.
Published byPinar Gültekin Modified over 5 years ago
1
Fall 2017-2018 Compiler Principles Lecture 5: Intermediate Representation
Roman Manevich Ben-Gurion University of the Negev
2
Tentative syllabus mid-term exam Front End Intermediate Representation
Scanning Top-down Parsing (LL) Bottom-up Parsing (LR) Intermediate Representation Operational Semantics Lowering Optimizations Dataflow Analysis Loop Optimizations Code Generation Register Allocation Instruction Selection mid-term exam
3
Previously Becoming parsing ninjas
Going from text to an Abstract Syntax Tree By Admiral Ham [GFDL ( or CC-BY-SA-3.0 ( via Wikimedia Commons
4
From scanning to parsing
program text 59 + (1257 * xPosition) Lexical Analyzer token stream ) id * num ( + Grammar: E id E num E E + E E E * E E ( E ) Parser syntax error valid + num x * Abstract Syntax Tree
5
Agenda The role of intermediate representations Two example languages
A high-level language An intermediate language Lowering Correctness Formal meaning of programs
6
Role of intermediate representation
Bridge between front-end and back-end Allow implementing optimizations independent of source language and executable (target) language High-level Language (scheme) Executable Code Lexical Analysis Syntax Analysis Parsing AST Symbol Table etc. Inter. Rep. (IR) Code Generation
7
Motivation for intermediate representation
8
Intermediate representation
A language that is between the source language and the target language Not specific to any source language or machine language Goal 1: retargeting compiler components for different source languages/target machines Java Pentium C++ IR Java bytecode Pyhton Sparc
9
Intermediate representation
A language that is between the source language and the target language Not specific to any source language or machine language Goal 1: retargeting compiler components for different source languages/target machines Goal 2: machine-independent optimizer Narrow interface: small number of node types (instructions) Lowering Code Gen. Java Pentium optimize C++ IR Java bytecode Pyhton Sparc
10
Multiple IRs Some optimizations require high-level constructs while others more appropriate on low-level code Solution: use multiple IR stages Pentium optimize optimize AST HIR LIR Java bytecode Sparc
11
Multiple IRs example Elixir – a language for parallel graph algorithms
Elixir Program + delta Elixir Program Delta Inferencer HIR Lowering IL Synthesizer HIR LIR C++ backend Query Answer Planning Problem Plan Automated Reasoning (Boogie+Z3) C++ code Automated Planner Galois Library
12
AST vs. LIR for imperative languages
Rich set of language constructs Rich type system Declarations: types (classes, interfaces), functions, variables Control flow statements: if-then-else, while-do, break-continue, switch, exceptions Data statements: assignments, array access, field access Expressions: variables, constants, arithmetic operators, logical operators, function calls An abstract machine language Very limited type system Only computation-related code Labels and conditional/ unconditional jumps, no looping Data movements, generic memory access statements No sub-expressions, logical as numeric, temporaries, constants, function calls – explicit argument passing
13
three address code
14
Three-Address Code IR A popular form of IR
High-level assembly where instructions have at most three operands There exist other types of IR For example, IR based on acyclic graphs – more amenable for analysis and optimizations Chapter 8
15
Base language: While
16
Syntax n Num numerals x Var program variables A n | x | A ArithOp A | (A) ArithOp - | + | * B true | false | A = A | A A | B | B B | (B) S x := A | skip | S; S | { S } | if B then S else S | while B S
17
Example program while x < y { x := x + 1 { y := x;
18
Intermediate language: IL
19
Syntax n Num Numerals l Num Labels
x Temp Var Temporaries and variables V n | x R V Op V Op - | + | * | = | | << | >> | … C l: skip | l: x := R | l: Goto l’ | l: IfZ x Goto l’ | l: IfNZ x Goto l’ IR C+
20
Intermediate language programs
An intermediate program P has the form 1:c1 … n:cn We can view it as a map from labels to individual commands and write P(j) = cj 1: t0 := 137 2: y := t0 + 3 3: IfZ x Goto 7 4: t1 := y 5: z := t1 6: Goto 9 7: t2 := y 8: x := t2 9: skip
21
Lowering
22
TAC generation At this stage in compilation, we have
an AST annotated with scope information and annotated with type information We generate TAC recursively (bottom-up) First for expressions Then for statements
23
TAC generation for expressions
Define a function cgen(expr) that generates TAC that: Computes an expression, Stores it in a temporary variable, Returns the name of that temporary Define cgen directly for atomic expressions (constants, this, identifiers, etc.) Define cgen recursively for compound expressions (binary operators, function calls, etc.)
24
Translation rules for expressions
cgen(n) = (l: t:=n, t) where l and t are fresh cgen(x) = (l: t:=x, t) where l and t are fresh cgen(e1) = (P1, t1) cgen(e2) = (P2, t2) cgen(e1 op e2) = (P1· P2· l: t:=t1 op t2, t) where l and t are fresh
25
cgen for basic expressions
Maintain a counter for temporaries in c, and a counter for labels in l Initially: c = 0, l = 0 cgen(k) = { // k is a constant c = c + 1, l = l emit(l: tc := k) return tc { cgen(id) = { // id is an identifier c = c + 1, l = l emit(l: tc := id) return tc {
26
Naive cgen for binary expressions
cgen(e1 op e2) = { let A = cgen(e1) let B = cgen(e2) c = c + 1, l = l emit( l: tc := A op B; ) return tc } The translation emits code to evaluate e1 before e2. Why is that?
27
Example: cgen for binary expressions
cgen( (a*b)-d)
28
Example: cgen for binary expressions
c = 0, l = 0 cgen( (a*b)-d)
29
Example: cgen for binary expressions
c = 0, l = 0 cgen( (a*b)-d) = { let A = cgen(a*b) let B = cgen(d) c = c + 1 , l = l emit(l: tc := A - B; ) return tc }
30
Example: cgen for binary expressions
c = 0, l = 0 cgen( (a*b)-d) = { let A = { let A = cgen(a) let B = cgen(b) c = c + 1, l = l emit(l: tc := A * B; ) return tc } let B = cgen(d) c = c + 1, l = l emit(l: tc := A - B; ) return tc }
31
Example: cgen for binary expressions
c = 0, l = 0 cgen( (a*b)-d) = { let A = { let A = {c=c+1, l=l+1, emit(l: tc := a;), return tc } let B = {c=c+1, l=l+1, emit(l: tc := b;), return tc } c = c + 1, l = l emit(l: tc := A * B; ) return tc } let B = {c=c+1, l=l+1, emit(l: tc := d;), return tc } c = c + 1, l = l emit(l: tc := A - B; ) return tc } Code here A=t1
32
Example: cgen for binary expressions
c = 0, l = 0 cgen( (a*b)-d) = { let A = { let A = {c=c+1, l=l+1, emit(l: tc := a;), return tc } let B = {c=c+1, l=l+1, emit(l: tc := b;), return tc } c = c + 1, l = l emit(l: tc := A * B; ) return tc } let B = {c=c+1, l=l+1, emit(l: tc := d;), return tc } c = c + 1, l = l emit(l: tc := A - B; ) return tc } Code 1: t1:=a; here A=t1
33
Example: cgen for binary expressions
c = 0, l = 0 cgen( (a*b)-d) = { let A = { let A = {c=c+1, l=l+1, emit(l: tc := a;), return tc } let B = {c=c+1, l=l+1, emit(l: tc := b;), return tc } c = c + 1, l = l emit(l: tc := A * B; ) return tc } let B = {c=c+1, l=l+1, emit(l: tc := d;), return tc } c = c + 1, l = l emit(l: tc := A - B; ) return tc } Code 1: t1:=a; 2: t2:=b; here A=t1
34
Example: cgen for binary expressions
c = 0, l = 0 cgen( (a*b)-d) = { let A = { let A = {c=c+1, l=l+1, emit(l: tc := a;), return tc } let B = {c=c+1, l=l+1, emit(l: tc := b;), return tc } c = c + 1, l = l emit(l: tc := A * B; ) return tc } let B = {c=c+1, l=l+1, emit(l: tc := d;), return tc } c = c + 1, l = l emit(l: tc := A - B; ) return tc } Code 1: t1:=a; 2: t2:=b; 3: t3:=t1*t2 here A=t1
35
Example: cgen for binary expressions
c = 0, l = 0 cgen( (a*b)-d) = { let A = { let A = {c=c+1, l=l+1, emit(l: tc := a;), return tc } let B = {c=c+1, l=l+1, emit(l: tc := b;), return tc } c = c + 1, l = l emit(l: tc := A * B; ) return tc } let B = {c=c+1, l=l+1, emit(l: tc := d;), return tc } c = c + 1, l = l emit(l: tc := A - B; ) return tc } here A=t3 Code 1: t1:=a; 2: t2:=b; 3: t3:=t1*t2 here A=t1
36
Example: cgen for binary expressions
c = 0, l = 0 cgen( (a*b)-d) = { let A = { let A = {c=c+1, l=l+1, emit(l: tc := a;), return tc } let B = {c=c+1, l=l+1, emit(l: tc := b;), return tc } c = c + 1, l = l emit(l: tc := A * B; ) return tc } let B = {c=c+1, l=l+1, emit(l: tc := d;), return tc } c = c + 1, l = l emit(l: tc := A - B; ) return tc } here A=t3 Code 1: t1:=a; 2: t2:=b; 3: t3:=t1*t2 4: t4:=d here A=t1
37
Example: cgen for binary expressions
c = 0, l = 0 cgen( (a*b)-d) = { let A = { let A = {c=c+1, l=l+1, emit(l: tc := a;), return tc } let B = {c=c+1, l=l+1, emit(l: tc := b;), return tc } c = c + 1, l = l emit(l: tc := A * B; ) return tc } let B = {c=c+1, l=l+1, emit(l: tc := d;), return tc } c = c + 1, l = l emit(l: tc := A - B; ) return tc } here A=t3 Code 1: t1:=a; 2: t2:=b; 3: t3:=t1*t2 4: t4:=d 5: t5:=t3-t4 here A=t1
38
cgen for statements We can extend the cgen function to operate over statements as well Unlike cgen for expressions, cgen for statements does not return the name of a temporary holding a value (Why?)
39
Syntax n Num numerals x Var program variables A n | x | A ArithOp A | (A) ArithOp - | + | * | / B true | false | A = A | A A | B | B B | (B) S x := A | skip | S; S | { S } | if B then S else S | while B S
40
Translation rules for statements
cgen(skip) = l: skip where l is fresh cgen(e) = (P, t) cgen(x := e) = P · l: x :=t where l is fresh cgen(S1) = P1, cgen(S2) = P cgen(S1; S2) = P1 · P2
41
Translation rules for conditions
cgen(b) = (Pb, t), cgen(S1) = P1, cgen(S2) = P cgen(if b then S1 else S2) = Pb IfZ t Goto lfalse P lfinish: Goto Lafter lfalse: skip P lafter: skip where lfinish, lfalse, lafter are fresh
42
Translation example y := 137+3; if x=0 z := y; else x := y;
1: t1 := 137 2: t2 := 3 3: t3 := t1 + t2 4: y := t3 5: t4 := x 6: t5 := 0 7: t6 := t4=t5 8: IfZ t6 Goto 12 9: t7 := y 10: z := t7 11: Goto 14 12: t8 := y 13: x := t8 14: skip
43
Translation rule for loops
cgen(b) = (Pb, t), cgen(S) = P cgen(while b S) = lbefore: skip Pb IfZ t Goto lafter P lloop: Goto Lbefore lafter: skip where lafter, lbefore, lloop are fresh
44
Translation example y := 137+3; while x=0 z := y;
1: t1 := 137 2: t2 := 3 3: t3 := t1 + t2 4: y := t3 5: t4 := x 6: t5 := 0 7: t6 := t4=t5 8: IfZ t6 Goto 12 9: t7 := y 10: z := t7 11: Goto 14 12: skip
45
Translation Correctness
46
Compiler correctness Compilers translate programs in one language (usually high) to another language (usually lower) such that they are both equivalent Our goal is to formally define the meaning of this equivalence First, we must formally define the meaning of programs
47
Formal semantics
49
What is formal semantics?
“Formal semantics is concerned with rigorously specifying the meaning, or behavior, of programs, pieces of hardware, etc.”
50
Why formal semantics? Implementation-independent definition of a programming language Automatically generating interpreters (and some day maybe full fledged compilers) Optimization, verification, and debugging If you don’t know what it does, how do you know its correct/incorrect? How do you know whether a given optimization is correct?
51
Operational semantics
Elements of the semantics States/configurations: the (aggregate) values that a program computes during execution Transition rules: how the program advances from one configuration to another
52
Operational semantics of while
53
While syntax reminder n Num numerals x Var program variables A n | x | A ArithOp A | (A) ArithOp - | + | * B true | false | A = A | A A | B | B B | (B) S x := A | skip | S; S | { S } | if B then S else S | while B do S
54
Semantic categories Z Integers {0, 1, -1, 2, -2, …} T Truth values {ff, tt} State Var Z Example state: =[x5, y7, z0] Lookup: (x) = 5 Update: [x6] = [x6, y7, z0]
55
Semantics of expressions
56
Semantics of arithmetic expressions
Semantic function A : State Z Defined by induction on the syntax tree n = n x = (x) a1 + a2 = a1 + a2 a1 - a2 = a1 - a2 a1 * a2 = a1 a2 (a1) = a1 not needed - a = 0 - a1 Compositional Expressions in While are side-effect free
57
Arithmetic expression exercise
Suppose x = 3 Evaluate x+1
58
Semantics of boolean expressions
Semantic function B : State T Defined by induction on the syntax tree true = tt false = ff a1 = a2 = a1 a2 = b1 b2 = b = Compositional Expressions in While are side-effect free
59
Semantics of statements
first step By Vanillase (Own work) [CC BY-SA 3.0 ( via Wikimedia Commons This file is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.
60
Operational semantics
Developed by Gordon Plotkin Configurations: has one of two forms: S, Statement S is about to execute on state Terminal (final) state Transitions S, = S’, ’ Execution of S from is not completed and remaining computation proceeds from intermediate configuration = ’ Execution of S from has terminated and the final state is ’ S, is stuck if there is no such that S, first step PLOTKIN G .D., “A Structural Approach to Operational Semantics”, DAIMI FN-19, Computer Science Department, Aarhus University, Aarhus, Denmark, September 1981.
61
Form of semantic rules defined by rules of the form
The meaning of compound statements is defined using the meaning immediate constituent statements side condition premise S1, 1 1’, … , Sn, n n’ S, ’ if… conclusion
62
Operational semantics for While
x:=a, [xa] [ass] skip, [skip] S1, S1’, ’ S1; S2, S1’; S2, ’ [comp1] When does this happen? S1, ’ S1; S2, S2, ’ [comp2] if b then S1 else S2, S1, if b = tt [iftt] if b then S1 else S2, S2, if b = ff [ifff] while b do S, S; while b do S, [whilett] if b = tt while b do S, [whileff] if b = ff
63
Factorial (n!) example Input state such that x = 3
y := 1; while (x=1) do (y := y * x; x := x – 1) y :=1 ; W, W, [y1] ((y := y * x; x := x – 1); W), [y1] (x := x – 1; W), [y3] W , [y3][x2] ((y := y * x; x := x – 1); W), [y3] [x2] (x := x – 1; W) , [y6] [x2] W, [y6][x1] [y6][x1]
64
Derivation sequences An interpreter
Given a statement S and an input state start with the initial configuration S, Repeatedly apply rules until either a terminal state is reached or an infinite sequence We will write S, k ’ if ’ can be obtained from S, by k derivation steps We will write S, * ’ if S, k ’ for some k0
65
The semantics of statements
The meaning of a statement S is defined as Examples: skip = x:=1 = [x 1] while true do skip = S = ’ if S, * ’ else
66
Properties of operational semantics
67
Deterministic semantics for While
Theorem: for all statements S and states 1, 2 if S, * 1 and S, * 2 then 1= 2 Theorem: for all statements S and states , no derivation sequence from S, ever reaches a stuck state
68
Program termination Given a statement S and input
S terminates on if there exists a state ’ such that S, * ’ S loops on if there is no state ’ such that S, * ’ Given a statement S S always terminates if for every input state , S terminates on S always loops if for every input state , S loops on
69
Semantic equivalence S1 and S2 are semantically equivalent if for all : S1, * if and only if S2, * is either stuck or terminal and there is an infinite derivation sequence for S1, if and only if there is one for S2, A formal basis for correct optimizations of While programs Examples S; skip and S while b do S and if b then (S; while b S) else skip ((S1; S2); S3) and (S1; (S2; S3)) (x:=5; y:=x*8) and (x:=5; y:=40)
70
Operational semantics of IL
71
IL syntax reminder n Num Numerals l Num Labels
x Var Temp Variables and temporaries V n | x R V Op V |V Op - | + | * | = | | << | >> | … C l: skip | l: x := R | l: Goto l’ | l: IfZ x Goto l’ IR C+
72
Intermediate program states
Z Integers {0, 1, -1, 2, -2, …} IState (Var Temp {pc}) Z pc (for program counter) is a special variable Var, Temp, and {pc} are all disjoint For every intermediate program state m and program P=1:c1,…,n:cn we have that 1 m(pc) n+1 We can check that the labels used in P are legal
73
Rules for executing commands
We will use rules of the following form Here m is the pre-state, which is scheduled to be executed as the program counter indicates, and m’ is the post-state The rules specialize for the particular type of command C and possibly other conditions m(pc) = l P(l) = C m m’
74
Evaluating values nm = n xm = m(x) v1 op v2m = v1m op v2m
75
Rules for executing commands
m(pc) = l P(l) = skip m m[pcl+1] m(pc) = l P(l) = x:=v m m[pcl+1, xv m] m(pc) = l P(l) = x:=v1 op v2 m m[pcl+1, xv1 op v2 m] m(pc) = l P(l) = Goto l’ m m[pcl’]
76
Rules for executing commands
m(pc) = l P(l) = IfZ x Goto l’ m(x)=0 m m[pcl’] m(pc) = l P(l) = IfZ x Goto l’ m(x)0 m m[pcl+1] m(pc) = l P(l) = IfNZ x Goto l’ m(x) 0 m m[pcl’] m(pc) = l P(l) = IfNZ x Goto l’ m(x)=0 m m[pcl+1]
77
Executing programs For a program P=1:c1,…,n:cn we define executions as finite or infinite sequences m1 m2 … mk … We write m * m’ if there is a finite execution starting at m and ending at m’: m = m1 m2 … mk = m’
78
Semantics of a program For a program P=1:c1,…,n:cn and a state m s.t. m(pc)=1 we define the result of executing P on m as Lemma: the function is well-defined (i.e., at most one output state) P m = m’ if m * m’ and m’(pc)=n+1 else
79
Execution example Execute the following intermediate language program on a state where all variables evaluate to 0 1: t1 := 137 2: y := t1 + 3 3: IfZ x Goto 7 4: t2 := y 5: z := t2 6: Goto 9 7: t3 := y 8: x := t3 9: skip m = [pc1, t10 , t20 , t30 , x0 , y0] ?
80
Execution example Execute the following intermediate language program on a state where all variables evaluate to 0 1: t1 := 137 2: y := t1 + 3 3: IfZ x Goto 7 4: t2 := y 5: z := t2 6: Goto 9 7: t3 := y 8: x := t3 9: skip m = [pc1, t10 , t20 , t30 , x0 , y0] m[pc2, t1137] m[pc3, t1137, y140] m[pc7, t1137, y140] m[pc8, t1137, t3140, y140] m[pc9, t1137, t3140, , x140, y140] m[pc10, t1137, t3140, , x140, y140]
81
Semantic equivalence P1 and P2 are semantically equivalent if for all m the following holds: ?
82
Are these programs semantically equivalent?
Semantic equivalence P1 and P2 are semantically equivalent if for all m the following holds: Attempt 1: P1 m = P2 m Are these programs semantically equivalent? 1: t1 := 137 2: y := t1 + 3 1: y := 140
83
Semantic equivalence P1 and P2 are semantically equivalent if for all m the following holds: (P1 m)|Var = (P2 m)|Var
84
Exercise Why do we need this?
85
Adding procedures
86
While+procedures syntax
n Num Numerals x Var Program variables f Function identifiers, including main A n | x | A ArithOp A | (A) | f(A*) ArithOp - | + | * | / B true | false | A = A | A A | B | B B | (B) S x := A | skip | S; S | { S } | if B then S else S | while B S | f(A*) F f(x*) { S } P ::= F+ Can be used to call side-effecting functions and to return values from functions
87
IL+procedures syntax n Num Numerals
l Num Labels and function identifiers x Var Temp Params Variables, temporaries, and parameters f Function identifiers V n | x R V Op V | V | Call f(V*) | pn Op - | + | * | / | = | | << | >> | … C l: skip | l: x := R | l: Goto l’ | l: IfZ x Goto l’ | l: IfNZ x Goto l’ | l: Call f(x*) | l: Ret x IR C+ Accesses value of function parameter n
88
Exercise
89
Formalizing the correctness of lowering
90
Exercise 1 Are the following equivalent in your opinion? While IL
2: x := t1
91
Exercise 2 Are the following equivalent in your opinion? While IL
2: x := y
92
Exercise 3 Are the following equivalent in your opinion? While IL
2: t2 := 138 3: t1 := 137 4: x := t1
93
Exercise 4 Are the following equivalent in your opinion? While IL
2: t2 := 138 3: t1 := 137 4: x := t1 5: t1 := 138
94
Equivalence of arithmetic expressions
While state: Var Z IL state: m (Var Temp {pc}) Z Define label(P) to be first label of IL program P cgen(a) = (P,res) An arithmetic While expression a is equivalent to an IL program P and resVar Temp iff for every input state m such that m(pc)=label(P): a m|Var = (P m) res
95
Expression equivalence example
137+x m|Var = (P m) t3 for every state m such that m(pc)=1 Example: m=[pc1, t10 , t20 , t30 , x3] then m|Var is m|{x} = [x3] (P [pc1, t10 , t20 , t30 , x3])= [pc4, t13 , t2137 , t3140 , x3] t3 = 140 While IL 137+x 1: t1 := x 2: t2 := 137 3: t3 := t1 + t2
96
Statement equivalence
We define state equivalence by m iff =m|Var That is, for each xVar (x)=m(x) A While statement S is equivalent to an IL program P iff for every input states m such that m(pc)=label(P): S P m
97
Example While IL S m|Var P m
Example: if m=[pc1, t10 , t20 , t30 , x3] then m|Var is m|{x} = [x3] S m|Var = [x137] P m = [pc6, t1137 , t2138 , t30 , x137] While IL x := 137 2: t2 := 138 3: t1 := 137 4: x := t1 5: t1 := 138
98
Next lecture: Dataflow-based Optimizations
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.