The Model of Compilation Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University
Outline Overview. Front-End Lexical Analysis. Syntactic Analysis. Semantic Analysis. Back-End Code Generation. Code Optimization.
Overview Translate a “source” program (in language S) into an “equivalent” “object” program (in language O). [Diagram: source program (S) -> Translator -> object program (O); the translator also emits error messages.]
The Model of Compilation Goals: reduce complexity; source/target independence; pluggable compiler. IR: contains sufficient information, either as a tree-like structure (the “syntax tree”) or in an assembly-like format (“three-address code”). [Diagram: source -> Analysis (Front-End) -> Intermediate Representation -> Synthesis (Back-End) -> object]
Front-End Lexical Analysis group the input stream into tokens Syntactic Analysis see if the source is “valid” or “correct” Contextual/Semantic Analysis make sure the program is “meaningful” or semantically correct.
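Front-End Components [Diagram: Source program (text stream) -> Scanner (groups tokens) -> Parser (constructs parse tree) -> Semantic Analyzer (checks semantic/contextual constraints) -> Intermediate Representation (file or in memory). The parser sends next-token requests to the scanner and receives a token; the parser passes the parse-tree to the semantic analyzer; a Symbol Table sits alongside the components. Example: for the input main(){ the scanner returns identifier main, then symbol (.]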
Lexical Analysis Scanner. Group the input stream into tokens identifiers. numbers. keywords. symbols & signs. Lexeme: Character sequence forming a token. Eliminate all blanks and comments.
Example: Tokens for position := initial + rate * 60
1. identifier position
2. assignment symbol :=
3. identifier initial
4. plus symbol +
5. identifier rate
6. multiplication symbol *
7. integer-literal 60
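The token list above can be reproduced with a small hand-written scanner. This is a minimal sketch, not the scanner of any particular compiler: the names TOKEN_SPEC, MASTER and scan are hypothetical, and the regular expressions cover only the token kinds that appear in this example.

```python
import re

# Hypothetical token classes for this sketch; only the kinds used in the
# example statement are covered.
TOKEN_SPEC = [
    ("integer-literal", r"\d+"),
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("assignment symbol", r":="),
    ("plus symbol", r"\+"),
    ("multiplication symbol", r"\*"),
    ("blank", r"\s+"),
]

# One alternation with a named group (g0, g1, ...) per token class.
MASTER = re.compile(
    "|".join(f"(?P<g{i}>{pat})" for i, (_, pat) in enumerate(TOKEN_SPEC))
)


def scan(text):
    """Group the input characters into (kind, lexeme) pairs, dropping blanks."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        kind = TOKEN_SPEC[int(m.lastgroup[1:])][0]
        if kind != "blank":
            tokens.append((kind, m.group()))
        pos = m.end()
    return tokens


for kind, lexeme in scan("position := initial + rate * 60"):
    print(kind, lexeme)
```

Each pair holds the token kind and its lexeme; blanks are matched but never emitted, which corresponds to the "eliminate all blanks" step of lexical analysis.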
Syntax Analysis Parser. Check if the source is “grammatically” correct. Construct a parse tree.
Mini-Triangle Syntax
single-Command ::= V-name := Expression
                 | Identifier ( Expression )
                 | if Expression then single-Command else single-Command
                 | while Expression do single-Command
                 | let Declaration in single-Command
                 | begin Command end
Mini-Triangle Syntax
Expression ::= primary-Expression
             | Expression Operator primary-Expression
primary-Expression ::= Integer-Literal
                     | V-name
                     | Operator primary-Expression
                     | ( Expression )
V-name ::= Identifier
...
Operator ::= + | - | * | / | < | > | = | \
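The Expression rules can be turned into a small recursive-descent parser. This is a sketch under stated assumptions: the function names and the tuple encoding of tree nodes are hypothetical, the operator set is assumed, identifiers are letters only, and tokens are assumed to arrive already split. The left-recursive rule is handled with a loop, so every binary operator has the same precedence and associates to the left, as the grammar is written.

```python
# Assumed operator set for this sketch.
OPERATORS = {"+", "-", "*", "/", "<", ">", "=", "\\"}


def parse_expression(tokens, i=0):
    """Expression ::= primary-Expression | Expression Operator primary-Expression.

    Returns (tree, index of the next unread token).
    """
    left, i = parse_primary(tokens, i)
    # The left recursion becomes iteration: fold each new operand into the left.
    while i < len(tokens) and tokens[i] in OPERATORS:
        op = tokens[i]
        right, i = parse_primary(tokens, i + 1)
        left = (op, left, right)
    return left, i


def parse_primary(tokens, i):
    """primary-Expression ::= Integer-Literal | V-name
                            | Operator primary-Expression | ( Expression )"""
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[i]
    if tok.isdigit():
        return ("int", int(tok)), i + 1
    if tok.isalpha():  # identifiers are letters only in this sketch
        return ("name", tok), i + 1
    if tok in OPERATORS:  # unary operator applied to a primary expression
        operand, i = parse_primary(tokens, i + 1)
        return (tok, operand), i
    if tok == "(":
        expr, i = parse_expression(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("expected )")
        return expr, i + 1
    raise SyntaxError(f"unexpected token {tok!r}")


tree, _ = parse_expression("initial + rate * 60".split())
print(tree)
```

With these rules, initial + rate * 60 groups as (initial + rate) * 60; parentheses are needed to multiply first.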
Example: Parse Tree
Semantic Analysis Make sure that the program is “meaningful”. Walk the parse tree to perform: Type checking. Type conversion. Example: in rate * 60, rate is a real variable, so the expression becomes rate * inttoreal(60). Generate IR (can also be done by the parser).
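The rate * 60 example can be sketched as a tree walk that computes a type for each node and wraps an integer operand in inttoreal when the other operand is real. The tuple encoding of nodes, the function name check, and the dictionary used as a symbol table are assumptions of this sketch; only the types int and real are handled.

```python
def check(node, symbols):
    """Return (possibly rewritten node, type) for an expression tree.

    Nodes are ("int", value), ("name", identifier) or (op, left, right);
    symbols maps identifier -> "int" or "real". All of this is hypothetical.
    """
    kind = node[0]
    if kind == "int":
        return node, "int"
    if kind == "name":
        if node[1] not in symbols:
            raise NameError(f"undeclared identifier {node[1]!r}")
        return node, symbols[node[1]]
    op, left, right = node
    left, left_type = check(left, symbols)
    right, right_type = check(right, symbols)
    if left_type != right_type:
        # Mixed int/real: convert the integer side, the result is real.
        if left_type == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (op, left, right), "real"
    return (op, left, right), left_type


typed, result_type = check(("*", ("name", "rate"), ("int", 60)), {"rate": "real"})
print(typed, result_type)
```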
Example of IR Abstract Syntax Tree (AST) for position := initial + rate * 60. Interior node = operation; children = arguments; leaves = identifiers or constants.
:=
  position
  +
    initial
    *
      rate
      60
Example of IR Three-Address Code for position := initial + rate * 60
  tmp := rate * 60
  tmp := initial + tmp
  position := tmp
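Three-address code like this can be produced by a post-order walk of the syntax tree. This is a minimal sketch: the tuple encoding of the tree and the name gen_tac are hypothetical, and it allocates a fresh temporary (t1, t2, ...) per operation rather than reusing a single tmp as the slide does.

```python
def gen_tac(stmt):
    """Translate (":=", target, expr) into a list of three-address instructions.

    expr is either a leaf (a string) or a tuple (op, left, right).
    """
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def walk(node):
        if isinstance(node, str):  # leaf: identifier or constant
            return node
        op, left, right = node
        left_place = walk(left)    # operands first (post-order)
        right_place = walk(right)
        temp = new_temp()
        code.append(f"{temp} := {left_place} {op} {right_place}")
        return temp

    _, target, expr = stmt
    code.append(f"{target} := {walk(expr)}")
    return code


ast = (":=", "position", ("+", "initial", ("*", "rate", "60")))
for line in gen_tac(ast):
    print(line)
# t1 := rate * 60
# t2 := initial + t1
# position := t2
```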
Back-End Code Optimization improve IR: machine-independent. improve object code: machine-dependent. Optimizing compilers are widely used. Code Generation generate object code. assign memory/register locations. instruction selection.
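Constant folding is one common machine-independent improvement to the IR: an operation whose operands are both constants is evaluated at compile time. The sketch below is illustrative only; the tuple encoding of the tree and the names FOLDABLE and fold are assumptions, and only +, - and * on integers are folded.

```python
import operator

# Operators this sketch is willing to evaluate at compile time.
FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Replace constant subexpressions in a tree by their value.

    Nodes are ("int", value), ("name", identifier) or (op, left, right).
    """
    if node[0] in ("int", "name"):
        return node
    op, left, right = node
    left, right = fold(left), fold(right)
    if left[0] == "int" and right[0] == "int" and op in FOLDABLE:
        return ("int", FOLDABLE[op](left[1], right[1]))
    return (op, left, right)


tree = ("+", ("name", "initial"), ("*", ("int", 2), ("int", 30)))
print(fold(tree))
```

The subtree 2 * 30 collapses to the constant 60, while the addition is kept because initial is not known at compile time.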
Back-End Components [Diagram: Intermediate Representation (file or in memory) -> IR Optimizer (machine-independent optimization) -> IR -> Code Generator (generates object code) -> Object code -> Peephole Optimizer (machine-dependent optimization) -> Object code (assembly or binary). A Symbol Table sits alongside the components.]
Other Phases Symbol-Table Management: information about the identifiers being used (name, type, scope). The scanner creates an entry in the table. Error Handler: what to do when errors are found in the source.
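A symbol table that records name, type and scope can be sketched as a stack of dictionaries, one per open scope. The class and method names here are hypothetical, and the policy of rejecting a redeclaration in the same scope is an assumption of this sketch rather than something the slides specify.

```python
class SymbolTable:
    """Stack of scopes; each scope maps an identifier name to its type."""

    def __init__(self):
        self.scopes = [{}]  # start with the global scope

    def open_scope(self):
        self.scopes.append({})

    def close_scope(self):
        self.scopes.pop()

    def enter(self, name, type_):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name!r} already declared in this scope")
        scope[name] = type_

    def lookup(self, name):
        # Search the innermost scope first, then the enclosing ones.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        raise NameError(f"{name!r} is not declared")


table = SymbolTable()
table.enter("rate", "real")
table.open_scope()
table.enter("rate", "int")   # inner declaration hides the outer one
print(table.lookup("rate"))  # int
table.close_scope()
print(table.lookup("rate"))  # real
```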
Compiler-Construction Tools Parser generators. Generate a parser from a CFG. Yacc, Bison. Scanner generators. Generate a scanner from regular expressions. Lex, Flex.