Mooly Sagiv and Roman Manevich School of Computer Science

Slides:



Advertisements
Similar presentations
Winter Compiler Construction T6 – semantic analysis part I scopes and symbol tables Mooly Sagiv and Roman Manevich School of Computer Science.
Advertisements

CPSC 388 – Compiler Design and Construction
1 Lecture 10 Intermediate Representations. 2 front end »produces an intermediate representation (IR) for the program. optimizer »transforms the code in.
Intermediate Code Generation
Programming Languages and Paradigms
Intermediate Representation III. 2 PAs PA2 deadline is
Lecture 08a – Backpatching & Recap Eran Yahav 1 Reference: Dragon 6.2,6.3,6.4,6.6.
1 Compiler Construction Intermediate Code Generation.
Winter Compiler Construction T7 – semantic analysis part II type-checking Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv.
Program Representations. Representing programs Goals.
Compiler Construction Intermediate Representation III Activation Records Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.
Semantic analysis Parsing only verifies that the program consists of tokens arranged in a syntactically-valid combination, we now move on to semantic analysis,
Intermediate Representation I High-Level to Low-Level IR Translation EECS 483 – Lecture 17 University of Michigan Monday, November 6, 2006.
Chapter 14: Building a Runnable Program Chapter 14: Building a runnable program 14.1 Back-End Compiler Structure 14.2 Intermediate Forms 14.3 Code.
CS412/413 Introduction to Compilers Radu Rugina Lecture 16: Efficient Translation to Low IR 25 Feb 02.
Intermediate code generation. Code Generation Create linear representation of program Result can be machine code, assembly code, code for an abstract.
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
1 Semantic Processing. 2 Contents Introduction Introduction A Simple Compiler A Simple Compiler Scanning – Theory and Practice Scanning – Theory and Practice.
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
Compiler Construction Semantic Analysis I Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.
Compiler Construction Semantic Analysis II Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.
Environments and Evaluation
Compiler Summary Mooly Sagiv html://
Compiler Construction Recap Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University.
Compiler Construction Intermediate Representation I Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.
Compiler Construction Semantic Analysis II Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University.
Compiler Construction Semantic Analysis I Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University.
Cs164 Prof. Bodik, Fall Symbol Tables and Static Checks Lecture 14.
CSC 8310 Programming Languages Meeting 2 September 2/3, 2014.
CS412/413 Introduction to Compilers Radu Rugina Lecture 15: Translating High IR to Low IR 22 Feb 02.
Semantic Analysis III + Intermediate Representation I.
Winter Compiler Construction T8 – semantic analysis recap + IR part 1 Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv University.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 8: Semantic Analysis and Symbol Tables.
Interpretation Environments and Evaluation. CS 354 Spring Translation Stages Lexical analysis (scanning) Parsing –Recognizing –Building parse tree.
Winter Compiler Construction T9 – IR part 2 + Runtime organization Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv University.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 11: Functions and stack frames.
Semantic Analysis II. Messages Please check lecturer notices in the Moodle Appeals  Legitimate: “I don’t have the bug you mentioned…”  Illegitimate:
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
Semantic Analysis III + Intermediate Representation I.
Semantic Analysis II Type Checking EECS 483 – Lecture 12 University of Michigan Wednesday, October 18, 2006.
Code Generation CPSC 388 Ellen Walker Hiram College.
CS412/413 Introduction to Compilers Radu Rugina Lecture 11: Symbol Tables 13 Feb 02.
Overview of Compilation Prepared by Manuel E. Bermúdez, Ph.D. Associate Professor University of Florida Programming Language Principles Lecture 2.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Lecture 9 Symbol Table and Attributed Grammars
CS 404 Introduction to Compiler Design
Information and Computer Sciences University of Hawaii, Manoa
Introduction to Compiler Construction
Intermediate code Jakub Yaghob
Constructing Precedence Table
Compiler Construction (CS-636)
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Semantic Analysis with Emphasis on Name Analysis
Review: Chapter 5: Syntax directed translation
Compiler Chapter 9. Intermediate Languages
CS 536 / Fall 2017 Introduction to programming languages and compilers
Compiler Construction
Code Generation.
Chapter 6 Intermediate-Code Generation
Lecture 15 (Notes by P. N. Hilfinger and R. Bodik)
CSE401 Introduction to Compiler Construction
Three-address code A more common representation is THREE-ADDRESS CODE . Three address code is close to assembly language, making machine code generation.
Compiler design.
Getting into the back-end Noam Rinetzky
Intermediate Code Generation
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Mooly Sagiv and Roman Manevich School of Computer Science
Review: What is an activation record?
Presentation transcript:

Mooly Sagiv and Roman Manevich School of Computer Science Winter 2007-2008 Compiler Construction T8 – semantic analysis recap + IR part 1 Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv University

Announcements נא להשתתף בסקר ההוראה Take grader’s comments to your attention You must document your code with javadoc You must create a testing strategy Catch all exceptions you throw in main and handle them (don’t re-throw) Print a sensible error message Compiler usage format: java IC.Compiler <file.ic> [options] Check that there is a file argument, don’t just crash Don’t do semantic analysis in IC.cup

More announcements Know thy group’s code What is expected in PA3 documentation (5 pts) Testing strategy: lists of tests and what they check (scope rules, type rules etc.) How did you conduct the semantic analysis – in which order do you do things Don’t document the most obvious things (“The input to CUP is in IC.cup”), don’t repeat stuff already in PA3.pdf Document non-obvious choices (“We build the type table after parsing”, “we do not handle situation X”, “we handle special situation Y like this…”) Output format (e.g., “first we print type table then we print…”) Know thy group’s code

Today Today: Semantic analysis recap (PA3) IC Language ic Lexical Analysis Syntax Analysis Parsing AST Symbol Table etc. Inter. Rep. (IR) Code Generation Executable code exe Today: Semantic analysis recap (PA3) Intermediate representation (PA4) (HIR and) LIR Beginning IR lowering

Semantic analysis flow example class A { int x; int f(int x) { boolean y; ... } } class B extends A { boolean y; int t; } class C { A o; int z; }

Parsing and AST construction Program file = … class A { int x; int f(int x) { boolean y; ... } } class B extends A { boolean y; int t; } class C { A o; int z; } parser.parse() classes[2] classes[0] classes[1] … ICClass name = A ICClass name = B super = A ICClass name = C fields[0] methods[0] … Field name = x type = IntType Method name = f … formals[0] body LocalVariable varName = y initExpr = null type = BoolType Formal name = x type = IntType TypeTable IntType BoolType A B C f : int->int … Table populated with user-defined types during parsing (or special AST pass)

Defined types and type table abstract class Type { String name; boolean subtypeof(Type t) {...} } class IntType extends Type {...} class BoolType extends Type {...} class ArrayType extends Type { Type elemType; } class MethodType extends Type { Type[] paramTypes; Type returnType; } class ClassType extends Type { ICClass classAST; } class A { int x; int f(int x) { boolean y; ... } } class B extends A { boolean y; int t; } class C { A o; int z; } class TypeTable { public static Type boolType = new BoolType(); public static Type intType = new IntType(); ... public static ArrayType arrayType(Type elemType) {…} public static ClassType classType(String name, String super, ICClass ast) {…} public static MethodType methodType(String name,Type retType, Type[] paramTypes) {…} } TypeTable IntType BoolType A B C f : int->int …

Assigning types by declarations AST Program file = … All type bindings available during parsing time classes[2] classes[0] classes[1] … ICClass name = A ICClass name = B super = A ICClass name = C TypeTable type IntType BoolType ... fields[0] methods[0] … Field name = x type = IntType Method name = f … ClassType name = C type formals[0] body ClassType name = B LocalVariable name = y initExpr = null type = BoolType type Formal name = x type = IntType super ClassType name = A type MethodType name = f retType paramTypes

Symbol tables AST Global symtab Program file = … classes[2] classes[0] A symtab C symtab ICClass name = A ICClass name = B super = A ICClass name = C x FIELD IntType f METHOD int->int o CLASS A z FIELD IntType fields[0] methods[0] … Field name = x type = IntType Method name = f … B symtab f symtab t FIELD IntType y BoolType x PARAM IntType y VAR BoolType this A ret RET_VAR formals[0] body LocalVariable name = y initExpr = null type = BoolType Formal name = x type = IntType abstract class SymbolTable { private SymbolTable parent; } class ClassSymbolTable extends SymbolTable { Map<String,Symbol> methodEntries; Map<String,Symbol> fieldEntries; } class MethodSymbolTable extends SymbolTable { Map<String,Symbol> variableEntries; } abstract class Symbol { String name; } class VarSymbol extends Symbol {…} class LocalVarSymbol extends Symbol {…} class ParamSymbol extends Symbol {…} ...

Scope nesting in IC Global names of all classes Class class GlobalSymbolTable extends SymbolTable {} class ClassSymbolTable extends SymbolTable {} class MethodSymbolTable extends SymbolTable {} class BlockSymbolTable extends SymbolTable { Global Symbol Kind Type Properties names of all classes Symbol Kind Type Properties Class fields and methods Symbol Kind Type Properties Method formals + locals Symbol Kind Type Properties Block variables defined in block

Symbol tables AST this belongs to method scope Global symtab AST A CLASS B C Program file = … classes[2] classes[0] classes[1] … A symtab C symtab ICClass name = A ICClass name = B super = A ICClass name = C x FIELD IntType f METHOD int->int o CLASS A z FIELD IntType fields[0] methods[0] … Field name = x type = IntType Method name = f … B symtab f symtab t FIELD IntType y BoolType x PARAM IntType y VAR BoolType this A ret RET_VAR formals[0] body LocalVariable name = y initExpr = null type = BoolType Formal name = x type = IntType this belongs to method scope … Location name = x type = ? ret can be used later for type-checking return statements

Sym. tables phase 1 : construct Global symtab AST enclosingScope A CLASS B C Program file = … classes[2] classes[0] classes[1] … A symtab C symtab ICClass name = A ICClass name = B super = A ICClass name = C x FIELD IntType f METHOD int->int o CLASS A z FIELD IntType fields[0] methods[0] … Field name = x type = IntType Method name = f … B symtab f symtab t FIELD IntType y BoolType x PARAM IntType y VAR BoolType this A ret RET_VAR formals[0] body LocalVariable name = y initExpr = null type = BoolType Formal name = x type = IntType … Build tables, Link each AST node to enclosing table ? Location name = x type = ? symbol abstract class ASTNode { SymbolTable enclosingScope; } class TableBuildingVisitor implements Visitor { ... }

Sym. tables phase 1 : construct Global symtab AST A CLASS B C Program file = … classes[2] classes[0] classes[1] … A symtab C symtab ICClass name = A ICClass name = B super = A ICClass name = C x FIELD IntType f METHOD int->int o CLASS A z FIELD IntType fields[0] methods[0] … Field name = x type = IntType Method name = f … B symtab f symtab t FIELD IntType y BoolType x PARAM IntType y VAR BoolType this A ret RET_VAR formals[0] body LocalVariable name = y initExpr = null type = BoolType Formal name = x type = IntType During this phase, add symbols from definitions, not uses, e.g., assignment to variable x … ? Location name = x type = ? symbol class TableBuildingVisitor implements Visitor { ... }

Sym. tables phase 2 : resolve Global symtab AST A CLASS B C Program file = … classes[2] classes[0] classes[1] … A symtab C symtab ICClass name = A ICClass name = B super = A ICClass name = C x FIELD IntType f METHOD int->int o CLASS A z FIELD IntType fields[0] methods[0] … Field name = x type = IntType Method name = f … B symtab f symtab t FIELD IntType y BoolType x PARAM IntType y VAR BoolType this A ret RET_VAR formals[0] body LocalVariable name = y initExpr = null type = BoolType Formal name = x type = IntType Resolve each id to a symbol, e.g., in x=5 in foo, x is the formal parameter of f … Location name = x type=? symbol enclosingScope check scope rules: illegal symbol re-definitions, illegal shadowing, illegal use of undefined symbols ... class SymResolvingVisitor implements Visitor { ... }

Type-check AST AST Program file = … TypeTable classes[2] classes[0] IntType BoolType ... classes[1] … ICClass name = A ICClass name = B super = A ICClass name = C fields[0] methods[0] … Field name = x type = IntType Method name = f … formals[0] body LocalVariable name = y initExpr = null type = BoolType Formal name = x type = IntType Use type-rules to infer types for all AST expression nodes and check type rules for statements … Location name = x type = IntType class TypeCheckingVisitor implements Visitor { ... }

Miscellaneous semantic checks AST Program file = … classes[2] classes[0] classes[1] … ICClass name = A ICClass name = B super = A ICClass name = C fields[0] methods[0] … Field name = x type = IntType Method name = f … formals[0] body LocalVariable name = y initExpr = null type = BoolType Formal name = x type = IntType Check remaining semantic checks: single main method, break/continue inside loops etc. … Location name = x type = IntType class SemanticChecker implements Visitor { ... }

How to write PA3 Implement skeleton of type hierarchy + type table Modify IC.cup/Library.cup to use types Check result using -print-ast option Implement Symbol classes and SymbolTable classes Implement symbol table construction Check using -dump-symtab option Implement symbol resolution Implement checks Scope rules Type-checks Remaining semantic rules

Class quiz Classify the following events according to compile time / runtime / other time (Specify exact compiler phase and information used by the compiler) x is declared twice in method foo Grammar G is ambiguous Attempt to dereference a null pointer x Number of arguments passed to foo is different from number of parameters reduce/reduce conflict between two rules Assignment to a+5 is illegal since it is not an l-value The non-terminal X does not derive any finite string Test expression in an if statement is not Boolean $r is not a legal class name Size of the activation record for method foo is 40 bytes

Intermediate representation Allows language-independent, machine independent optimizations and transformations Easy to translate from AST Easy to translate to assembly Narrow interface: small number of node types (instructions) Pentium optimize AST IR Java bytecode Sparc

Multiple IRs Some optimizations require high-level structure Others more appropriate on low-level code Solution: use multiple IR stages Pentium optimize optimize AST HIR LIR Java bytecode Sparc

What’s in an AST? Administration Expressions Flow of control Declarations For example, class declarations Many nodes do not generate code Expressions Data manipulation Flow of control If-then, while, switch Target language (usually) more limited Usually only jumps and conditional jumps

High-level IR (HIR) Close to AST representation High-level language constructs Statement and expression nodes Method bodies Only program’s computation Statement nodes if nodes while nodes statement blocks assignments break, continue method call and return

In this project we have HIR=AST High-level IR (HIR) Expression nodes unary and binary expressions Array accesses field accesses variables method calls New constructor expressions length-of expressions Constants In this project we have HIR=AST

Low-level IR (LIR) An abstract machine language Generic instruction set Not specific to a particular machine Low-level language constructs No looping structures, only labels + jumps/conditional jumps We will use – two-operand instructions Advantage – close to Intel assembly language LIR spec. available on web-site Will be used in PA4 Other alternatives Three-address code: a = b OP c Has at most three addresses (or fewer) Also named quadruples: (a,b,c,OP) Stack machine (Java bytecodes)

Arithmetic / logic instructions Abstract machine supports variety of operations a = b OP c a = OP b Arithmetic operations: ADD, SUB, DIV, MUL Logic operations: AND, OR Comparisons: EQ, NEQ, LEQ, GE, GEQ Unary operations: MINUS, NEG

Data movement Copy instruction: a = b Load/store instructions: a = *b *a = b Address of instruction a=&b Not used by IC Array accesses: a = b[i] a[i] = b Field accesses: a = b.f a.f = b

Branch instructions Label instruction label L Unconditional jump: go to statement after label L jump L Conditional jump: test condition variable a; if true, jump to label L cjump a L Alternative: two conditional jumps: tjump a L fjump a L

Call instruction Supports call statements call f(a1,…,an) And function call assignments a = call f(a1,…,an) No explicit representation of argument passing, stack frame setup, etc.

LIR instructions Instruction Meaning Move c,Rn Rn = c Move x,Rn Rn = x Move Rn,x x = Rn Add Rm,Rn Rn = Rn + Rm Sub Rm,Rn Rn = Rn – Rm Mul Rm,Rn Rn = Rn * Rm ... Immediate (constant) Memory (variable) Note 1: rightmost operand = operation destination Note 2: two register instr - second operand doubles as source and destination

Example Move 42,R1 Move R1,x x = 42; _test_label: while (x > 0) { Move x,R1 Compare 0,R1 JumpLE _end_label Move 1,R2 Sub R1,R2 Move R2,x Jump _test_label _end_label: x = 42; while (x > 0) { x = x - 1; } Compare results stored in global compare register (implicit register) Condition depends on compare register (warning: code shown is a naïve translation)

Translation (IR Lowering) How to translate HIR to LIR? Assuming HIR has AST form (ignore non-computation nodes) Define how each HIR node is translated Recursively translate HIR (HIR tree traversal) TR[e] = LIR translation of HIR construct e A sequence of LIR instructions Temporary variables = new locations Use temporary variables (LIR registers) to store intermediate values during translation

Translating expressions – example TR[x + 42] visit Add R2, R1 Move x, R1 AddExpr left right Move 42, R2 visit (left) visit (right) Add R2, R1 LocationEx id = x ValueExpr val = 42 Move x, R1 Move 42, R2 Very inefficient translation – can we do better?

Translating expressions (HIR) AST Visitor Generate LIR sequence for each visited node Propagating visitor – register information When visiting a expression node A single Target register designated for storing result A set of available auxiliary registers TR[node, target, available set] Leaf nodes Emit code using target register No auxiliaries required What about internal nodes?

Translating expressions Internal nodes Process first child, store result in target register Process second child Target is now occupied by first result Allocate a new register Target2 from available set for result of second child Apply node operation on Target and Target2 Store result in Target All initially available register now available again Result of internal node stored in Target (as expected)

Translating expressions – example TR[x + 42,T,A] visit T=R1,A={R2,…,Rn} Add R2, R1 AddExpr left right Move x, R1 Move 42, R2 visit (left) visit (right) Add R2, R1 LocationEx id = x ValueExpr val = 42 Move x, R1 Move 42, R2 T=R1,A={R2,…,Rn} T=R2,A={R3,…,Rn}

See you next week