1
Intermediate Code Representations
2
Conceptual phases of compiler
Phases, in order:
- Lexical analysis (scanner)
- Syntax analysis (parser)
- Semantic analysis
- Code optimization
- Code generation
Data passed between phases: sequence of tokens, intermediate code (IR1), optimized intermediate code (IR2), target code.
Front end: machine independent, language dependent.
Middle.
Back end: machine dependent, language independent.
3
Why use an IR?
1. Separates machine independent and machine dependent parts of the compiler. Both are retargetable.
2. Easier to perform machine independent optimizations than at machine code level. Example: common sub-expression elimination.
3. Simplifies code generation.
4
IR – Encodes Compiler’s Program Knowledge
Thus, some IR properties:
- Ease of generation
- Ease of manipulation
- Size
- Freedom of expression
- Level of abstraction
Selecting an IR is critical.
5
3 Categories of IRs
1. Structural/Graphical
- AST and concrete ST
- call graph
- program dependence graph (PDG)
2. Linear
- 3-address code
- abstract stack machine code
3. Hybrid
- control flow graph (CFG)
Advantages, disadvantages, and typical uses of these categories of IRs.
6
Level of Abstraction
Consider: A[j,i] = @A + j*10 + i
High level (AST): a subscript node [ ] with children A, I, J
Low level (3-address code):
    Loadi 1, R1
    Sub Rj, R1, R2
    Loadi 10, R3
    Mult R2, R3, R4
    Sub Ri, R1, R5
    Add R4, R5, R6
    Loadi @A, R7
    Add R7, R6, R8
    Load R8, RAij
What is the construct being represented? Array subscripting of A[i,j].
High-level AST: good for memory disambiguation, maybe harder to optimize, easier to generate.
Low-level 3-address code: different optimizations are possible here.
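To make the low-level view concrete, here is a minimal Python sketch that executes the instruction sequence from the slide on a toy register file. The `run` helper, the tuple encoding, the reading of each instruction as "source operands first, destination last", and the assumption that R7 is loaded with the base address @A are all illustrative choices, not something the slides specify.

```python
import operator

# Assumed encoding: (opcode, source operands..., destination register).
PROGRAM = [
    ("loadi", 1, "R1"),
    ("sub", "Rj", "R1", "R2"),    # R2 = j - 1
    ("loadi", 10, "R3"),
    ("mult", "R2", "R3", "R4"),   # R4 = (j - 1) * 10
    ("sub", "Ri", "R1", "R5"),    # R5 = i - 1
    ("add", "R4", "R5", "R6"),    # R6 = offset into A
    ("loadi", "@A", "R7"),        # assumption: R7 holds the base address of A
    ("add", "R7", "R6", "R8"),    # R8 = address of the element
    ("load", "R8", "RAij"),       # RAij = memory[R8]
]

ARITH = {"add": operator.add, "sub": operator.sub, "mult": operator.mul}

def run(program, regs, memory, symbols):
    """Execute the toy program and return the final register file."""
    regs = dict(regs)
    for instr in program:
        op = instr[0]
        if op == "loadi":
            _, value, dst = instr
            # Symbolic constants such as "@A" are looked up; integers pass through.
            regs[dst] = symbols.get(value, value)
        elif op == "load":
            _, addr, dst = instr
            regs[dst] = memory[regs[addr]]
        else:
            _, a, b, dst = instr
            regs[dst] = ARITH[op](regs[a], regs[b])
    return regs

# Hypothetical inputs: i = 3, j = 2, A based at address 1000.
final = run(PROGRAM, {"Ri": 3, "Rj": 2}, {1012: 42}, {"@A": 1000})
print(final["R8"], final["RAij"])
```

With these hypothetical inputs the computed address is 1000 + (2 - 1) * 10 + (3 - 1) = 1012. None of that arithmetic is visible in the single AST subscript node, which is the point of the comparison.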
7
Some Design Issues for IRs
Questions to Ponder:
1. What is the minimum needed in the language's set of operators?
   Need to express the source languages.
2. What is the advantage of a small set of operators?
   A small set of operators is easier to implement.
3. What is the concern of designing the operations close to actual machine operations?
   If too close to a particular machine, the IR loses portability.
4. What is the potential problem of having a small set of IR operations?
   A small set could lead to long instruction sequences, which requires more work during the optimization phase.
8
High Level Graphical Representations
Consider:
    A -> V := E
    E -> E + E | E * E | - E | id
String: a := b * - c + b * - c
Exercise: Concrete ST? AST? DAG?
AST: more compact, easier to generate code.
DAG: unique node for each value. More compact. Shows redundant expressions explicitly. Easy to generate during parsing. Encodes redundancy.
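The difference between the AST and the DAG for this string can be sketched in Python. The two builder functions below are hypothetical helpers: the tree builder allocates a fresh node on every call, while the DAG builder looks up `(operator, children)` in a table first, so a repeated subexpression maps to one shared node.

```python
def tree_builder():
    """Return a node constructor that always allocates a new node (AST)."""
    nodes = []
    def node(op, *children):
        nodes.append((op, *children))
        return len(nodes) - 1          # node id
    return node, nodes

def dag_builder():
    """Return a node constructor that reuses structurally equal nodes (DAG)."""
    table = {}
    def node(op, *children):
        key = (op, *children)
        if key not in table:
            table[key] = len(table)    # new node id only for unseen keys
        return table[key]
    return node, table

def build(node):
    """Build a := b * -c + b * -c with the given node constructor."""
    def term():
        return node("*", node("id", "b"), node("neg", node("id", "c")))
    return node(":=", node("id", "a"), node("+", term(), term()))

ast_node, ast_nodes = tree_builder()
build(ast_node)
dag_node, dag_table = dag_builder()
build(dag_node)
print(len(ast_nodes), len(dag_table))
```

Counting the assignment node and the leaf for `a`, the tree has 11 nodes and the DAG has 7, because `b`, `c`, `-c` and `b * -c` each appear only once in the DAG.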
9
Linear IRs: Three Address Code
Sequence of instructions of the form
    x := y op z
where x, y and z are variable names, constants, or compiler-generated variables ("temporaries").
Only one operator is permitted on the RHS, so larger expressions are computed using temporaries.
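One common way to store such an instruction is as a record with four fields. The `Quad` class below is a hypothetical sketch of that idea in Python, not a representation prescribed by the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quad:
    """One three-address instruction: dest := left op right."""
    dest: str
    left: str
    op: str
    right: str

    def __str__(self):
        return f"{self.dest} := {self.left} {self.op} {self.right}"

# A two-operator expression x := y + z * w needs a temporary.
code = [Quad("t1", "z", "*", "w"), Quad("x", "y", "+", "t1")]
for quad in code:
    print(quad)
```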
10
Simple Linear IRs
Write the 3-address code for: a := b * - c + b * - c
    ? = -c
    ? = b * ?
    …
Complete the code from the AST? The DAG?
There is a need for compiler-generated temporary variables (temps) to represent intermediate values at the internal nodes of the AST.
Code from AST:
    T1 = -c
    T2 = b * T1
    T3 = -c
    T4 = b * T3
    T5 = T2 + T4
    a = T5
Versus from DAG:
    T1 = -c
    T2 = b * T1
    T3 = T2 + T2
    a = T3
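The two code sequences can be produced by one post-order walk, sketched below in Python. The tuple encoding of expressions and the `gen_tac` function are assumptions made for illustration. With `share=False` every internal node gets a fresh temporary, as when walking the AST. With `share=True` a repeated subexpression reuses its temporary, which mimics walking the DAG.

```python
import itertools

def gen_tac(target, expr, share=False):
    """Emit three-address code for `target = expr` as a list of strings."""
    code = []
    counter = itertools.count(1)
    cache = {}                      # subexpression -> temporary (DAG mode only)

    def walk(e):
        if isinstance(e, str):      # leaf: a variable name
            return e
        if share and e in cache:
            return cache[e]
        op, *args = e
        operands = [walk(a) for a in args]
        temp = f"T{next(counter)}"
        if len(operands) == 1:      # unary operator
            code.append(f"{temp} = {op}{operands[0]}")
        else:
            code.append(f"{temp} = {operands[0]} {op} {operands[1]}")
        cache[e] = temp
        return temp

    result = walk(expr)
    code.append(f"{target} = {result}")
    return code

TERM = ("*", "b", ("-", "c"))       # b * -c
EXPR = ("+", TERM, TERM)

print(gen_tac("a", EXPR))               # six instructions, T1..T5
print(gen_tac("a", EXPR, share=True))   # four instructions, T1..T3
```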
14
Exercise Give the 3 address code for: Z := x * y + a[j] / sum(b)
15
More Simple Linear IRs
Stack machine code: push, pop, ops
Consider: x - 2 * y
    Push x
    Push 2
    Push y
    Mult
    Sub
Advantages? Compact; temp names are implicit, so temps take up no extra space. Simple to generate and execute. Useful when code is transmitted over slow communication links (the internet).
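A minimal Python sketch of how such code executes. The `run_stack` function and the lowercase opcode spelling are assumptions for illustration. Each operator pops its operands from the stack and pushes its result, so no temporary names are ever needed.

```python
def run_stack(code, env):
    """Run stack machine code; `env` maps variable names to values."""
    stack = []
    for instr in code:
        op, *arg = instr.split()
        if op == "push":
            token = arg[0]
            # Variables are looked up in env; anything else is an integer literal.
            stack.append(env[token] if token in env else int(token))
        elif op in ("mult", "sub"):
            right = stack.pop()     # the top of the stack is the right operand
            left = stack.pop()
            stack.append(left * right if op == "mult" else left - right)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()

CODE = ["push x", "push 2", "push y", "mult", "sub"]
print(run_stack(CODE, {"x": 10, "y": 3}))   # 10 - 2 * 3
```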
16
Hybrid IRs
18
Exercise – Construct the CFG
Where are the leaders? Basic blocks? Edges?
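As a reminder of the usual procedure, here is a Python sketch on a small hypothetical program (not the one from the exercise). It uses the standard leader rules: the first instruction is a leader, any jump target is a leader, and any instruction that follows a jump is a leader. The tuple encoding and function names are assumptions, and the sketch assumes a non-empty program.

```python
# Hypothetical program: each entry is (opcode, jump target index or None).
PROGRAM = [
    ("assign", None),   # 0: i = 0
    ("assign", None),   # 1: s = 0
    ("cjump", 6),       # 2: if i >= n goto 6
    ("assign", None),   # 3: s = s + i
    ("assign", None),   # 4: i = i + 1
    ("jump", 2),        # 5: goto 2
    ("return", None),   # 6: return s
]

def find_leaders(program):
    """Return the sorted indices of instructions that start a basic block."""
    leaders = {0}
    for index, (op, target) in enumerate(program):
        if op in ("jump", "cjump"):
            leaders.add(target)
            if index + 1 < len(program):
                leaders.add(index + 1)
    return sorted(leaders)

def build_cfg(program):
    """Return (basic blocks, edges between block numbers)."""
    leaders = find_leaders(program)
    bounds = leaders + [len(program)]
    blocks = [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]
    block_of = {start: k for k, start in enumerate(leaders)}
    edges = set()
    for k, block in enumerate(blocks):
        op, target = program[block[-1]]
        if op in ("jump", "cjump"):
            edges.add((k, block_of[target]))         # taken branch
        if op not in ("jump", "return") and k + 1 < len(blocks):
            edges.add((k, k + 1))                    # fall through
    return blocks, sorted(edges)

blocks, edges = build_cfg(PROGRAM)
print(find_leaders(PROGRAM))
print(blocks)
print(edges)
```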
19
Call Graph Representation
Node = function or method
Edge from A to B: A has a call site where B is potentially called
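A minimal sketch of this definition, using Python's standard `ast` module on a small hypothetical source string. It only records calls made through a plain name, so method calls and calls through variables are ignored. That is a simplification for illustration, not a complete call graph construction.

```python
import ast

# Hypothetical program to analyze.
SOURCE = """
def main():
    helper()
    report(helper())

def helper():
    return 1

def report(value):
    print(value)
"""

def call_graph(source):
    """Map each top-level function name to the sorted names it calls."""
    tree = ast.parse(source)
    graph = {}
    for func in tree.body:
        if isinstance(func, ast.FunctionDef):
            callees = set()
            for node in ast.walk(func):
                # Only direct calls of the form name(...) are recorded.
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    callees.add(node.func.id)
            graph[func.name] = sorted(callees)
    return graph

for caller, callees in call_graph(SOURCE).items():
    print(caller, "->", callees)
```

Each key is a node and each listed callee is an edge. `main` has two call sites for `helper` but only one edge to it.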
20
Exercise: Construct a call graph
21
Multiple IRs: WHIRL
22
Key Highlights of IRs