Semantic Analysis III + Intermediate Representation I
The IC compiler
    IC program (ic) -> Lexical Analysis -> Syntax Analysis (Parsing) -> AST, Symbol Table, etc. -> Inter. Rep. (IR) -> Code Generation -> x86 executable (exe)
Subtyping
    The type hierarchy is a tree
    Subtyping relation ≤
        For all types:
            A ≤ A
        For reference types:
            if A extends B {…} then A ≤ B
            if A ≤ B and B ≤ C then A ≤ C
            null ≤ A
Subtyping
    S ≤ T  ⇒  values(S) ⊆ values(T)
    “A value of type S may be used wherever a value of type T is expected”
Examples
    int ≤ int ?
    null ≤ A ?
    null ≤ string ?
    string ≤ null ?
    null ≤ boolean ?
    null ≤ boolean[] ?
    A[] ≤ B[] ?
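The subtyping rules can be sketched as a small check. This is only an illustration: the `Type` and `Kind` classes and the `isSubtype` method are hypothetical names, and the sketch assumes that classes and arrays are the reference types. It applies only the rules listed on the slides (reflexivity, extends, transitivity, null), so it adds no array-specific rule.

```java
/** Sketch of the subtyping rules; every name here is hypothetical. */
class SubtypeSketch {
    enum Kind { PRIMITIVE, NULL, CLASS, ARRAY }

    static final class Type {
        final String name;
        final Kind kind;
        final Type superType; // non-null only for a class declared with "extends"

        Type(String name, Kind kind, Type superType) {
            this.name = name;
            this.kind = kind;
            this.superType = superType;
        }

        // Assumption: classes and arrays are the reference types.
        boolean isReference() {
            return kind == Kind.CLASS || kind == Kind.ARRAY;
        }
    }

    /** Returns true iff s <= t follows from the rules listed on the slide. */
    static boolean isSubtype(Type s, Type t) {
        if (s.name.equals(t.name)) return true;            // A <= A
        if (s.kind == Kind.NULL) return t.isReference();   // null <= A
        for (Type p = s.superType; p != null; p = p.superType) {
            if (p.name.equals(t.name)) return true;        // extends + transitivity
        }
        return false;
    }

    public static void main(String[] args) {
        Type intT = new Type("int", Kind.PRIMITIVE, null);
        Type nullT = new Type("null", Kind.NULL, null);
        Type a = new Type("A", Kind.CLASS, null);
        Type b = new Type("B", Kind.CLASS, a);             // class B extends A
        System.out.println("B <= A: " + isSubtype(b, a));
        System.out.println("A <= B: " + isSubtype(a, b));
        System.out.println("null <= A: " + isSubtype(nullT, a));
        System.out.println("null <= int: " + isSubtype(nullT, intT));
    }
}
```

Walking the `superType` chain handles the extends rule and transitivity together, which works because the hierarchy is a tree.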
IC rules with subtyping
    E ⊢ e1 : T1    E ⊢ e2 : T2    T1 ≤ T2 or T2 ≤ T1    op ∈ {==, !=}
    -----------------------------------------------------------------
    E ⊢ e1 op e2 : bool

    Method invocation rules:
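A minimal sketch of how the equality rule might be checked, assuming types are represented by their names and the class hierarchy by a hypothetical `SUPER` map (class name to superclass name). The helper names `isSubtype` and `checkEquality` are illustrative and not part of the IC compiler.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of the == / != typing rule; all names are hypothetical. */
class EqualityRuleSketch {
    // Hypothetical hierarchy: class name -> superclass name (null if none).
    static final Map<String, String> SUPER = new HashMap<>();
    static {
        SUPER.put("A", null);
        SUPER.put("B", "A");   // class B extends A
        SUPER.put("C", null);
    }

    static boolean isSubtype(String s, String t) {
        if (s.equals(t)) return true;                       // A <= A
        if (s.equals("null")) return SUPER.containsKey(t);  // null <= class type
        for (String p = SUPER.get(s); p != null; p = SUPER.get(p)) {
            if (p.equals(t)) return true;                   // extends + transitivity
        }
        return false;
    }

    /** Returns the result type "bool" if the rule applies, otherwise throws. */
    static String checkEquality(String t1, String t2) {
        if (isSubtype(t1, t2) || isSubtype(t2, t1)) return "bool";
        throw new IllegalArgumentException("cannot compare " + t1 + " with " + t2);
    }

    public static void main(String[] args) {
        System.out.println(checkEquality("B", "A"));
        System.out.println(checkEquality("null", "C"));
        try {
            checkEquality("A", "C");
        } catch (IllegalArgumentException ex) {
            System.out.println("type error: " + ex.getMessage());
        }
    }
}
```

Comparing `A` with `C` is rejected because neither type is a subtype of the other.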
Semantic Analysis Flow
IC Program
    class A {
        int x;
        int f(int x) { boolean y; ... }
    }
    class B extends A {
        boolean y;
        int t;
    }
    class C {
        A o;
        int z;
    }
AST
    Program (file = …)
        classes[0]: ICClass (name = A)
            fields[0]: Field (name = x, …)
            methods[0]: Method (name = f, …)
                formals[0]: Formal (name = x, …)
                body: LocalVariable (varName = y, initExpr = null, …)
        classes[1]: ICClass (name = B, super = A)
        classes[2]: ICClass (name = C, …)
        …
Types
    class TypeTable { createUnique … get … }
    abstract class Type {
        String name;
        boolean subtypeof(Type t) {...}
    }
    class IntType extends Type {...}
    class BoolType extends Type {...}
    class ArrayType extends Type {
        Type elemType;
    }
    class MethodType extends Type {
        Type[] paramTypes;
        Type returnType;
    }
    class ClassType extends Type {
        ICClass classAST;
    }
    TypeTable for the IC program above: IntType, BoolType, A, B, C, int->int, …
TypeTable
    IntType, BoolType, ...
    ClassType (name = A)
    ClassType (name = B, super: ClassType A)
    ClassType (name = C)
    MethodType (retType, paramTypes)
AST
    Program (file = …)
        classes[0]: ICClass (name = A)
            fields[0]: Field (name = x, type = IntType)
            methods[0]: Method (name = f)
                formals[0]: Formal (name = x, type = IntType)
                body: LocalVariable (name = y, initExpr = null, type = BoolType)
        classes[1]: ICClass (name = B, super = A)
        classes[2]: ICClass (name = C, …)
        …
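One possible way to keep type objects unique is to intern them by name, so that AST nodes with the same type share one object and can be compared by reference. The sketch below is an assumption about how such a table could look. It is not the TypeTable of the slides: the slides do not show the signatures of createUnique or get, and the `arrayOf` and `register` methods here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of a table holding one unique object per type; names are hypothetical. */
class TypeTableSketch {
    static abstract class Type {
        final String name;
        Type(String name) { this.name = name; }
    }

    static final class IntType extends Type {
        IntType() { super("int"); }
    }

    static final class ArrayType extends Type {
        final Type elemType;
        ArrayType(Type elemType) {
            super(elemType.name + "[]");
            this.elemType = elemType;
        }
    }

    // Maps a type name to its single representative object.
    private final Map<String, Type> unique = new HashMap<>();
    final Type intType = register(new IntType());

    private Type register(Type t) {
        unique.put(t.name, t);
        return t;
    }

    /** Returns the unique array type over elem, creating it on first use. */
    Type arrayOf(Type elem) {
        Type existing = unique.get(elem.name + "[]");
        return existing != null ? existing : register(new ArrayType(elem));
    }

    /** Returns the type registered under name, or null if there is none. */
    Type get(String name) {
        return unique.get(name);
    }

    public static void main(String[] args) {
        TypeTableSketch table = new TypeTableSketch();
        Type first = table.arrayOf(table.intType);
        Type second = table.arrayOf(table.intType);
        System.out.println("same object: " + (first == second));
        System.out.println(table.arrayOf(first).name);
    }
}
```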
Types
    Data: types table, subtyping relation, …
    Partial correctness: acyclic hierarchy, no redefinitions, …
Symbol tables
    Global symtab:
        A      CLASS
        B      CLASS
        C      CLASS
    A symtab:
        x      FIELD     IntType
        f      METHOD    int->int
    B symtab:
        t      FIELD     IntType
        y      FIELD     BoolType
    C symtab:
        o      CLASS     A
        z      FIELD     IntType
    f symtab:
        x      PARAM     IntType
        y      VAR       BoolType
        this   VAR       A
        $ret   RET_VAR   IntType
    AST node to resolve: Location (name = x, type = ?)
    Resolve each id to a symbol
    Check scope rules: illegal symbol re-definitions, illegal shadowing, illegal use of undefined symbols
Symbol tables
    f symtab:
        x      PARAM     IntType
        y      VAR       BoolType
        this   VAR       A
        $ret   RET_VAR   IntType
    this belongs to the method scope
    $ret can be used later for type-checking return statements
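Resolving an id to a symbol can be sketched as a lookup that starts in the current scope and walks outward through the parent tables. This is a sketch under assumptions: the class name, the `define` and `lookup` methods, and the string encoding of entries are hypothetical, and the only scope rule it enforces is re-definition within a single table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of a symbol table with a parent link; all names are hypothetical. */
class SymbolTableSketch {
    final String id;
    final SymbolTableSketch parent;   // enclosing scope, null for the global table
    private final Map<String, String> entries = new LinkedHashMap<>();

    SymbolTableSketch(String id, SymbolTableSketch parent) {
        this.id = id;
        this.parent = parent;
    }

    /** Adds a symbol; a second definition in the same scope is an error. */
    void define(String name, String kind, String type) {
        if (entries.containsKey(name)) {
            throw new IllegalStateException("re-definition of " + name + " in " + id + " symtab");
        }
        entries.put(name, kind + " " + type);
    }

    /** Returns "KIND type" for the innermost definition, or null if undefined. */
    String lookup(String name) {
        for (SymbolTableSketch s = this; s != null; s = s.parent) {
            String entry = s.entries.get(name);
            if (entry != null) return entry;
        }
        return null;
    }

    public static void main(String[] args) {
        SymbolTableSketch global = new SymbolTableSketch("Global", null);
        SymbolTableSketch a = new SymbolTableSketch("A", global);
        a.define("x", "FIELD", "IntType");
        a.define("f", "METHOD", "int->int");
        SymbolTableSketch f = new SymbolTableSketch("f", a);
        f.define("x", "PARAM", "IntType");
        f.define("y", "VAR", "BoolType");
        System.out.println("x in f: " + f.lookup("x"));   // parameter shadows the field
        System.out.println("f in f: " + f.lookup("f"));   // found in the enclosing class scope
        System.out.println("z in f: " + f.lookup("z"));   // undefined symbol
    }
}
```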
Symbol tables
    Data: symbol tables, …
    Partial correctness: scope rules, …
Type Checking
    Infer types for expression nodes
    Check rules
Miscellaneous semantic checks
    Single main method
    break/continue inside loops
    return on every control path
    …
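As an illustration of one of these checks, the break/continue rule can be sketched as a tree walk that remembers whether it is currently inside a loop. The statement classes below are a hypothetical, heavily reduced AST (only blocks, while loops, break, continue and an opaque "other" statement), not the IC AST.

```java
import java.util.Arrays;
import java.util.List;

/** Sketch of the "break/continue inside loops" check; all names are hypothetical. */
class LoopCheckSketch {
    static abstract class Stmt {}
    static final class Break extends Stmt {}
    static final class Continue extends Stmt {}
    static final class Other extends Stmt {}          // any statement with no nested statements
    static final class While extends Stmt {
        final Stmt body;
        While(Stmt body) { this.body = body; }
    }
    static final class Block extends Stmt {
        final List<Stmt> stmts;
        Block(Stmt... stmts) { this.stmts = Arrays.asList(stmts); }
    }

    /** Returns true iff every break/continue under s is nested inside a while loop. */
    static boolean check(Stmt s, boolean insideLoop) {
        if (s instanceof Break || s instanceof Continue) return insideLoop;
        if (s instanceof While) return check(((While) s).body, true);
        if (s instanceof Block) {
            for (Stmt child : ((Block) s).stmts) {
                if (!check(child, insideLoop)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Stmt ok = new Block(new Other(), new While(new Block(new Break())));
        Stmt bad = new Block(new While(new Other()), new Continue());
        System.out.println("ok: " + check(ok, false));
        System.out.println("bad: " + check(bad, false));
    }
}
```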
Semantic Analysis
    Program is “correct”
    Data
Intermediate Representation I
tomatoes + potatoes + carrots
    Lexical analyzer:
        tomatoes,PLUS,potatoes,PLUS,carrots,EOF
    Parser:
        AddExpr
            left: AddExpr
                left: LocationExpr (id = tomatoes)
                right: LocationExpr (id = potatoes)
            right: LocationExpr (id = carrots)
    Symtab hierarchy:
        symbol     kind   type
        tomatoes   var    int
        potatoes   var    int
        carrots    var    int
    Global type table:
        Id        Type obj
        int       O1
        boolean   O2
        Foo       O3
    Type checking:
        A ⊢ E1 : T[]
        --------------------
        A ⊢ E1.length : int
    Additional semantic checks
    LIR:
        Move tomatoes,R1
        Move potatoes,R2
        Add R2,R1
        ...
Intermediate representation
    Allows language-independent, machine-independent optimizations and transformations
    Easy to translate from AST
    Easy to translate to assembly
    Narrow interface
    AST -> IR (optimize) -> Pentium / Java bytecode / Sparc
Multiple IRs
    Some optimizations require high-level structure, e.g. optimizations of nested “for” loops
    Others are more appropriate on low-level code, e.g. register allocation
    Solution: use multiple IR stages
    AST -> HIR (optimize) -> LIR (optimize) -> Pentium / Java bytecode / Sparc
Machine Optimizations
    Some optimizations take advantage of the features of the target machine
    AST -> HIR (optimize) -> LIR (optimize) -> Pentium / Java bytecode / Sparc
High-level IR (HIR)
    High-level intermediate representation is essentially the AST
    Must be expressive for all input languages
    Preserves high-level constructs: if, while, for, switch, …
    Allows high-level optimizations
Low-level IR (LIR)
    Low-level representation is essentially an abstract machine language
    Low-level language constructs: jumps, conditional jumps, …
    Allows optimizations specific to these constructs
LIR instructions
    Instruction    Meaning
    Move c,Rn      Rn = c           (c: immediate constant)
    Move x,Rn      Rn = x           (x: memory variable)
    Move Rn,x      x = Rn
    Add Rm,Rn      Rn = Rn + Rm
    Sub Rm,Rn      Rn = Rn - Rm
    Mul Rm,Rn      Rn = Rn * Rm
    ...
    Note 1: the rightmost operand is the operation destination
    Note 2: in two-register instructions the second operand doubles as source and destination
Example
    x = 42;
    while (x > 0) {
        x = x - 1;
    }

    Move 42,R1
    Move R1,x
    _test_label:
    Move x,R1
    Compare 0,R1
    JumpLE _end_label
    Move x,R1
    Move 1,R2
    Sub R2,R1
    Move R1,x
    Jump _test_label
    _end_label:
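To make the meaning of these instructions concrete, here is a toy interpreter for the handful of LIR instructions used in the example. It is a sketch resting on stated assumptions, not a definition of LIR: it assumes that `Compare a,b` records `b - a` and that `JumpLE` jumps when that recorded value is at most 0, which matches the way the example tests `x > 0`. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy interpreter for a few LIR instructions; semantics of Compare/JumpLE are assumed. */
class LirInterpreterSketch {
    static final String EXAMPLE = String.join("\n",
        "Move 42,R1", "Move R1,x",
        "_test_label:",
        "Move x,R1", "Compare 0,R1", "JumpLE _end_label",
        "Move x,R1", "Move 1,R2", "Sub R2,R1", "Move R1,x",
        "Jump _test_label",
        "_end_label:");

    /** Runs the program (one instruction or label per line) and returns the final memory. */
    static Map<String, Integer> run(String program) {
        String[] lines = program.trim().split("\\s*\\n\\s*");
        Map<String, Integer> labels = new HashMap<>();
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].endsWith(":")) {
                labels.put(lines[i].substring(0, lines[i].length() - 1), i);
            }
        }
        Map<String, Integer> regs = new HashMap<>();
        Map<String, Integer> mem = new HashMap<>();
        int cmp = 0;   // result of the last Compare (assumed: second operand minus first)
        int pc = 0;
        while (pc < lines.length) {
            String line = lines[pc++];
            if (line.endsWith(":")) continue;            // labels do nothing at run time
            String[] p = line.split("[ ,]+");
            switch (p[0]) {
                case "Move":
                    store(p[2], load(p[1], regs, mem), regs, mem);
                    break;
                case "Add":
                    regs.put(p[2], regs.get(p[2]) + regs.get(p[1]));
                    break;
                case "Sub":
                    regs.put(p[2], regs.get(p[2]) - regs.get(p[1]));
                    break;
                case "Mul":
                    regs.put(p[2], regs.get(p[2]) * regs.get(p[1]));
                    break;
                case "Compare":
                    cmp = load(p[2], regs, mem) - load(p[1], regs, mem);
                    break;
                case "JumpLE":
                    if (cmp <= 0) pc = labels.get(p[1]);
                    break;
                case "Jump":
                    pc = labels.get(p[1]);
                    break;
                default:
                    throw new IllegalArgumentException("unknown instruction: " + line);
            }
        }
        return mem;
    }

    static boolean isRegister(String s) {
        return s.matches("R\\d+");
    }

    // An operand is an immediate constant, a register, or a memory variable.
    static int load(String operand, Map<String, Integer> regs, Map<String, Integer> mem) {
        if (operand.matches("-?\\d+")) return Integer.parseInt(operand);
        return isRegister(operand) ? regs.get(operand) : mem.get(operand);
    }

    static void store(String dest, int value, Map<String, Integer> regs, Map<String, Integer> mem) {
        if (isRegister(dest)) regs.put(dest, value); else mem.put(dest, value);
    }

    public static void main(String[] args) {
        System.out.println("x = " + run(EXAMPLE).get("x"));
    }
}
```

Under these assumptions the loop counts `x` down from 42 and exits once `x` reaches 0.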
Translation (IR lowering)
    How to translate HIR to LIR?
    Assuming HIR has AST form (ignore non-computation nodes)
    Define how each HIR node is translated
    Recursively translate HIR (HIR tree traversal)
    TR[e] = LIR translation of HIR construct e
        A sequence of LIR instructions
    Use temporary variables (LIR registers) to store intermediate values during translation
Translating expressions
    Binary operations (arithmetic and comparisons):
        TR[e1 OP e2]
            R1 := TR[e1]
            R2 := TR[e2]
            R3 := R1 OP R2
    Unary operations:
        TR[OP e]
            R1 := TR[e]
            R2 := OP R1
    Each register is a fresh virtual (LIR) register generated by the translation
    R := TR[e] is shortcut notation to indicate the target register, NOT an LIR instruction
Translating expressions - example
    TR[x + 42]
        AddExpr                               visit:         Add R2, R1
            left: LocationEx (id = x)         visit (left):  Move x, R1
            right: ValueExpr (val = 42)       visit (right): Move 42, R2
    Resulting LIR:
        Move x, R1
        Move 42, R2
        Add R2, R1
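The recursive translation of expressions can be sketched as a method that emits LIR and returns the register holding the result. This is an illustration under assumptions: the `Expr` classes are a hypothetical reduced HIR, and, following the two-operand form of Add, Sub and Mul, the result of a binary operation is left in the register of the left operand instead of a third fresh register.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of TR[e] for expressions; class and method names are hypothetical. */
class ExprTranslationSketch {
    static abstract class Expr {}
    static final class Location extends Expr {
        final String id;
        Location(String id) { this.id = id; }
    }
    static final class Value extends Expr {
        final int val;
        Value(int val) { this.val = val; }
    }
    static final class Binary extends Expr {
        final String op;          // LIR mnemonic: "Add", "Sub" or "Mul"
        final Expr left, right;
        Binary(String op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    final List<String> code = new ArrayList<>();   // emitted LIR, one instruction per entry
    private int nextReg = 1;

    private String freshReg() {
        return "R" + nextReg++;
    }

    /** Emits LIR for e and returns the register that holds its value. */
    String translate(Expr e) {
        if (e instanceof Location) {
            String r = freshReg();
            code.add("Move " + ((Location) e).id + "," + r);
            return r;
        }
        if (e instanceof Value) {
            String r = freshReg();
            code.add("Move " + ((Value) e).val + "," + r);
            return r;
        }
        Binary b = (Binary) e;
        String r1 = translate(b.left);
        String r2 = translate(b.right);
        code.add(b.op + " " + r2 + "," + r1);      // r1 = r1 op r2
        return r1;
    }

    public static void main(String[] args) {
        ExprTranslationSketch tr = new ExprTranslationSketch();
        String target = tr.translate(new Binary("Add", new Location("x"), new Value(42)));
        for (String line : tr.code) System.out.println(line);
        System.out.println("result in " + target);
    }
}
```

For `x + 42` this emits the same three instructions as the example: `Move x,R1`, `Move 42,R2`, `Add R2,R1`.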
Translating (short-circuit) OR
    TR[e1 OR e2]
        R1 := TR[e1]
        Compare 1,R1
        JumpTrue _end_label
        R2 := TR[e2]
        Or R2,R1
        _end_label:
    (Or can be replaced by a Move operation, since R1 is 0 when it is reached)
    Fresh labels are generated during translation
Translating (short-circuit) AND
    TR[e1 AND e2]
        R1 := TR[e1]
        Compare 0,R1
        JumpTrue _end_label
        R2 := TR[e2]
        And R2,R1
        _end_label:
    (And can be replaced by a Move operation, since R1 is 1 when it is reached)
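The two short-circuit patterns differ only in the constant compared against and in the final instruction, so one routine can emit both. The sketch below assumes a hypothetical reduced HIR with boolean variables and Or/And nodes, and an assumed naming scheme for fresh labels (`_end_label` followed by a counter).

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of TR[e1 OR e2] and TR[e1 AND e2]; all names are hypothetical. */
class ShortCircuitSketch {
    static abstract class Expr {}
    static final class Var extends Expr {
        final String id;
        Var(String id) { this.id = id; }
    }
    static final class Logical extends Expr {
        final String op;          // "Or" or "And"
        final Expr left, right;
        Logical(String op, Expr left, Expr right) {
            this.op = op;
            this.left = left;
            this.right = right;
        }
    }

    final List<String> code = new ArrayList<>();
    private int nextReg = 1;
    private int nextLabel = 1;

    /** Emits LIR for e and returns the register that holds its value. */
    String translate(Expr e) {
        if (e instanceof Var) {
            String r = "R" + nextReg++;
            code.add("Move " + ((Var) e).id + "," + r);
            return r;
        }
        Logical l = (Logical) e;
        boolean isOr = l.op.equals("Or");
        String end = "_end_label" + nextLabel++;          // fresh label per operator
        String r1 = translate(l.left);
        code.add("Compare " + (isOr ? 1 : 0) + "," + r1); // OR skips on 1, AND skips on 0
        code.add("JumpTrue " + end);
        String r2 = translate(l.right);                   // evaluated only if not skipped
        code.add(l.op + " " + r2 + "," + r1);
        code.add(end + ":");
        return r1;
    }

    public static void main(String[] args) {
        ShortCircuitSketch tr = new ShortCircuitSketch();
        tr.translate(new Logical("Or", new Var("a"), new Var("b")));
        for (String line : tr.code) System.out.println(line);
    }
}
```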
Translating array and field access
    TR[e1[e2]]
        R1 := TR[e1]
        R2 := TR[e2]
        MoveArray R1[R2],R3
    TR[e1.f]
        R1 := TR[e1]
        MoveField R1.c_f,R3
Translating statement block
    TR[s1; s2; … ; sN]
        TR[s1]
        TR[s2]
        TR[s3]
        …
        TR[sN]
Translating if-then-else
    TR[if (e) then s1 else s2]
        R1 := TR[e]
        Compare 0,R1
        JumpTrue _false_label
        TR[s1]
        Jump _end_label
        _false_label:
        TR[s2]
        _end_label:
Translating if-then
    TR[if (e) then s]
        R1 := TR[e]
        Compare 0,R1
        JumpTrue _end_label
        TR[s]
        _end_label:
Translating while
    TR[while (e) s]
        _test_label:
        R1 := TR[e]
        Compare 0,R1
        JumpTrue _end_label
        TR[s]
        Jump _test_label
        _end_label:
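The control-flow patterns for if-then, if-then-else and while can be sketched as one recursive translator. To keep the sketch short it makes several assumptions: the condition is a single boolean variable instead of an arbitrary expression, nested statements other than if and while are represented by an opaque placeholder line, and fresh labels are formed by appending a counter. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of TR[s] for if and while; all names are hypothetical. */
class StmtTranslationSketch {
    static abstract class Stmt {}
    static final class Opaque extends Stmt {      // stands for an already translated statement
        final String lir;
        Opaque(String lir) { this.lir = lir; }
    }
    static final class If extends Stmt {
        final String cond;                        // assumed: a boolean variable
        final Stmt thenS, elseS;                  // elseS is null for if-then
        If(String cond, Stmt thenS, Stmt elseS) {
            this.cond = cond;
            this.thenS = thenS;
            this.elseS = elseS;
        }
    }
    static final class While extends Stmt {
        final String cond;
        final Stmt body;
        While(String cond, Stmt body) {
            this.cond = cond;
            this.body = body;
        }
    }

    final List<String> code = new ArrayList<>();
    private int nextReg = 1;
    private int nextId = 1;                       // one id per construct keeps labels fresh

    // Emits: R := TR[cond]; Compare 0,R; JumpTrue target (jump when cond is false).
    private void emitTest(String cond, String target) {
        String r = "R" + nextReg++;
        code.add("Move " + cond + "," + r);
        code.add("Compare 0," + r);
        code.add("JumpTrue " + target);
    }

    void translate(Stmt s) {
        if (s instanceof Opaque) {
            code.add(((Opaque) s).lir);
        } else if (s instanceof If) {
            If i = (If) s;
            int id = nextId++;
            String end = "_end_label" + id;
            if (i.elseS == null) {
                emitTest(i.cond, end);
                translate(i.thenS);
            } else {
                String falseLabel = "_false_label" + id;
                emitTest(i.cond, falseLabel);
                translate(i.thenS);
                code.add("Jump " + end);
                code.add(falseLabel + ":");
                translate(i.elseS);
            }
            code.add(end + ":");
        } else {
            While w = (While) s;
            int id = nextId++;
            code.add("_test_label" + id + ":");
            emitTest(w.cond, "_end_label" + id);
            translate(w.body);
            code.add("Jump _test_label" + id);
            code.add("_end_label" + id + ":");
        }
    }

    public static void main(String[] args) {
        StmtTranslationSketch tr = new StmtTranslationSketch();
        tr.translate(new While("b", new If("c", new Opaque("TR[s1]"), new Opaque("TR[s2]"))));
        for (String line : tr.code) System.out.println(line);
    }
}
```

Giving each construct its own id is what keeps the labels of nested statements from clashing.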