1
Chapter 1 Introduction to Compiling
Yu-Chen Kuo
2
1.1 Compilers
3
Source languages: Fortran, Pascal, C, etc.
Target languages: another programming language or machine language
Compilers:
  – Single-pass
  – Multi-pass
  – Load-and-go
  – Debugging
  – Optimizing
4
Analysis-Synthesis Model
Compilation: analysis and synthesis
Analysis:
  – Break the source program into pieces
  – Create an intermediate representation
  – Hierarchical structure: syntax tree
      Node: operation
      Leaf: arguments
Synthesis: construct the target program from the tree
5
Analysis-Synthesis Model
6
Context of a Compiler
Several other programs are needed to create .exe files
  – Preprocessor: expands macros
  – Assembler: translates assembly into machine code
  – Loader/link-editor: links library routines
7
Context of a Compiler
8
1.2 Analysis of the source program
Three phases
  1. Linear analysis: divide the source program into tokens
  2. Hierarchical analysis: group the tokens hierarchically
  3. Semantic analysis: ensure the components fit together meaningfully
9
Lexical Analysis
Linear analysis is also called lexical analysis or scanning
E.g., position := initial + rate * 60
  1. Identifier position
  2. Assignment symbol “:=”
  3. Identifier initial
  4. “+” sign
  5. Identifier rate
  6. “*” sign
  7. Number 60
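The linear scan above can be sketched in a few lines of Python. This is only an illustration: the token-class names (ID, NUMBER, ASSIGN, PLUS, TIMES) and the function name tokenize are assumptions made for the example, not names from the slides.

```python
import re

# Token classes for the slide's example (the names are illustrative assumptions).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID",     r"[A-Za-z][A-Za-z0-9]*"),
    ("ASSIGN", r":="),
    ("PLUS",   r"\+"),
    ("TIMES",  r"\*"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Scan left to right and return a list of (kind, lexeme) pairs."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            # Lexical error: the remaining characters do not form any token.
            raise SyntaxError(f"cannot form a token at position {pos}")
        if match.lastgroup != "SKIP":      # blanks separate tokens and are dropped
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling tokenize("position := initial + rate * 60") yields the seven tokens listed on the slide, in the same order.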
10
Syntax Analysis
Hierarchical analysis: parsing or syntax analysis
  – Group tokens into grammatical phrases
Grammatical phrases are represented by a parse tree
11
Syntax Analysis
12
Syntax Analysis
Hierarchical structure is expressed by recursive rules
Recursive definition of an expression
  1. Any identifier is an expression
  2. Any number is an expression
  3. If expression1 and expression2 are expressions, then so are
     expression1 + expression2, expression1 * expression2, and (expression1)
By rule 1, initial and rate are expressions.
By rule 2, 60 is an expression.
By rule 3, rate*60 is an expression, and therefore so is initial+rate*60.
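The recursive rules map naturally onto a recursive-descent parser. The sketch below is an assumption-laden illustration: the rules as stated do not fix operator precedence, so the usual convention that * binds tighter than + is assumed and encoded by splitting "expression" into expression, term, and factor. The tuple-based tree shape and all function names are hypothetical.

```python
import re

def tokenize(text):
    # Identifiers, numbers, and + * ( ); any other character is silently
    # skipped in this sketch (a real scanner would report an error).
    return re.findall(r"[A-Za-z][A-Za-z0-9]*|\d+|[+*()]", text)

def parse_expression(tokens):
    """expression -> term { '+' term }"""
    node = parse_term(tokens)
    while tokens and tokens[0] == "+":
        tokens.pop(0)
        node = ("+", node, parse_term(tokens))
    return node

def parse_term(tokens):
    """term -> factor { '*' factor }"""
    node = parse_factor(tokens)
    while tokens and tokens[0] == "*":
        tokens.pop(0)
        node = ("*", node, parse_factor(tokens))
    return node

def parse_factor(tokens):
    """factor -> identifier | number | '(' expression ')'"""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    tok = tokens.pop(0)
    if tok == "(":
        node = parse_expression(tokens)      # recursion handles nesting
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return node
    if tok.isdigit():
        return ("num", int(tok))             # rule 2: a number is an expression
    if tok[0].isalpha():
        return ("id", tok)                   # rule 1: an identifier is an expression
    raise SyntaxError(f"unexpected token {tok!r}")

def parse(text):
    tokens = tokenize(text)
    tree = parse_expression(tokens)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return tree
```

For initial + rate * 60 the result is a tree whose root is + with the subtree for rate * 60 as its right child, which mirrors the order in which rule 3 was applied.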
13
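Syntax Analysis
Recursive definition of a statement
  1. If identifier1 is an identifier and expression2 is an expression, then
       identifier1 := expression2
     is a statement
  2. If expression1 is an expression and statement2 is a statement, then
       while (expression1) do statement2
       if (expression1) then statement2
     are statements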
14
Lexical vs. Syntax Analysis
The division between the two is somewhat arbitrary
One criterion: whether recursion is needed
  – Identifiers are recognized by a linear scan that stops at the first character that is neither a letter nor a digit; no recursion is needed
      E.g., initial
  – A linear scan is not powerful enough to analyze expressions or statements, which require hierarchical (nested) structure
      E.g., ( ... ), begin ... end, statements
16
Semantic Analysis
Check for semantic errors
Gather type information for code generation
Use the hierarchical structure to identify operators and operands
Type checking
  – E.g., using a real number to index an array (error)
  – Type conversion
  – E.g., Fig. 1.5: inttoreal(60) if initial is a real number
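A minimal sketch of the type-conversion step, assuming a tuple-based syntax tree with leaves ("id", name) and ("num", value) and interior nodes (op, left, right). The function name check, the type names "int" and "real", and the rule that a bare number such as 60 is an integer are assumptions for this illustration.

```python
def check(node, types):
    """Return (typed_node, type); `types` maps identifier names to "int" or "real"."""
    kind = node[0]
    if kind == "num":
        return node, "int"              # assumption: bare numbers are integers
    if kind == "id":
        if node[1] not in types:
            raise TypeError(f"undeclared identifier {node[1]!r}")
        return node, types[node[1]]
    # Interior node: check both operands first, then compare their types.
    left, ltype = check(node[1], types)
    right, rtype = check(node[2], types)
    if ltype == rtype:
        return (kind, left, right), ltype
    # Mixed int/real operands: wrap the integer side in a conversion node.
    if ltype == "int":
        left = ("inttoreal", left)
    else:
        right = ("inttoreal", right)
    return (kind, left, right), "real"
```

With initial and rate declared real, the tree for initial + rate * 60 comes back with ("inttoreal", ("num", 60)) in place of the leaf 60, and the whole expression has type "real".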
17
Semantic Analysis
18
Analysis in Text Formatters
\hbox{ }
\hbox{\vbox{! 1} \vbox{@ 2}}
19
1.3 The Phases of A Compiler
20
1.3 The Phases of A Compiler
Phases
  – First three phases: analysis portion
  – Last three phases: synthesis portion
  – Symbol-table management
  – Error handler
21
Symbol-table Management
Record the identifiers used in the source program
  – An identifier is detected by lexical analysis and then stored in the symbol table
Collect the attributes of identifiers (not done by lexical analysis)
  – Storage allocation: memory address
  – Type
  – Scope (where it is valid: local or global)
  – Arguments (for procedure names)
      Number and types of arguments
      Method of passing each argument (e.g., by reference)
      Return type
22
Symbol-table Management
Semantic analysis uses the type information to check the type consistency of identifiers
Code generation uses the storage-allocation information to generate code with the proper relocatable addresses
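One way to picture the symbol table is as a list of records, one per identifier, that the lexical analyzer creates and later phases fill in. The class and method names below (SymbolTable, enter, set_attr, get_attr) are hypothetical, chosen only for this sketch.

```python
class SymbolTable:
    def __init__(self):
        self._index = {}    # lexeme -> entry number (1-based, like id1, id2, ...)
        self._entries = []  # one attribute record (dict) per identifier

    def enter(self, lexeme):
        """Used by the lexical analyzer; returns the identifier's entry number."""
        if lexeme not in self._index:
            self._entries.append({"lexeme": lexeme})
            self._index[lexeme] = len(self._entries)
        return self._index[lexeme]

    def set_attr(self, lexeme, **attrs):
        """Used by later phases to record type, scope, address, and so on."""
        self._entries[self._index[lexeme] - 1].update(attrs)

    def get_attr(self, lexeme, name):
        """Return an attribute, or None if no phase has recorded it yet."""
        return self._entries[self._index[lexeme] - 1].get(name)
```

Entering position, initial, and rate in that order gives entry numbers 1, 2, and 3; attributes such as the type are unknown (None) until a later phase records them with set_attr.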
23
Error Detection and Reporting
Syntax and semantic analysis handle a large fraction of errors
Lexical phase: the remaining input characters do not form any token
Syntax phase: the token stream violates the structure rules
Semantic phase: the operation has no meaning
  – E.g., adding an array name and a procedure name
24
Translation of A Statement
25
Translation of A Statement
26
The Analysis Phases
Lexical analysis
  – Group characters into tokens
      Identifiers
      Keywords (if, while)
      Punctuation (‘(’, ‘)’)
      Multi-character operators (‘:=’)
  – Enter each lexical value (lexeme) into the symbol table
      position, rate, initial
Syntax analysis
  – Fig. 1.11(a), 1.11(b)
27
The Analysis Phases
Syntax analysis
Semantic analysis
  – Type checking and conversion
28
Intermediate Code Generation
Represent the source program as code for an abstract machine
Should be easy to produce
Should be easy to translate into the target program
Three-address code (at most three operands per instruction)
  – temp2 := id3 * temp1
  – Every memory location can act like a register (e.g., temp2 in place of BX)
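A rough sketch of how three-address code could be produced from a syntax tree by a post-order walk that invents a fresh temporary for every interior node. The tuple tree shape, the temp1, temp2, ... naming scheme, and the function name gen_three_address are assumptions for illustration.

```python
import itertools

def gen_three_address(target, tree):
    """Return a list of three-address instructions for `target := tree`."""
    code = []
    temps = itertools.count(1)          # source of fresh temporary numbers

    def walk(node):
        kind = node[0]
        if kind in ("id", "num"):
            return str(node[1])         # leaves need no instruction
        if kind == "inttoreal":
            operand = walk(node[1])
            temp = f"temp{next(temps)}"
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        # Binary operator: evaluate both operands, then emit one instruction.
        left = walk(node[1])
        right = walk(node[2])
        temp = f"temp{next(temps)}"
        code.append(f"{temp} := {left} {kind} {right}")
        return temp

    result = walk(tree)
    code.append(f"{target} := {result}")
    return code
```

For the tree of id2 + id3 * inttoreal(60) with target id1, this produces temp1 := inttoreal(60), temp2 := id3 * temp1, temp3 := id2 + temp2, and id1 := temp3, each with at most three operands.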
29
Code Optimization
Improve the intermediate code
Produce faster-running machine code
  – temp1 := id3 * 60.0
    id1 := id2 + temp1
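The improvement shown above can be reproduced with two small passes over the intermediate code: converting the integer constant to a real at compile time, and dropping the final copy through a temporary. The sketch assumes instructions are stored as hypothetical (dest, op, arg1, arg2) tuples, with "copy" standing for a plain assignment; it is not a general optimizer.

```python
def optimize(code):
    """code: list of (dest, op, arg1, arg2) tuples; returns an improved list."""
    # Pass 1: perform inttoreal on integer constants at compile time.
    consts = {}                         # temp name -> real constant replacing it
    folded = []
    for dest, op, a, b in code:
        a = consts.get(a, a)            # substitute already-folded temporaries
        b = consts.get(b, b)
        if op == "inttoreal" and isinstance(a, int):
            consts[dest] = float(a)     # no instruction is needed any more
        else:
            folded.append((dest, op, a, b))
    # Pass 2: if the last instruction only copies the temporary computed just
    # before it, store that result directly in the final destination.
    if len(folded) >= 2:
        dest, op, a, b = folded[-1]
        pdest, pop, pa, pb = folded[-2]
        if op == "copy" and a == pdest:
            folded[-2:] = [(dest, pop, pa, pb)]
    return folded
```

Given the four instructions temp1 := inttoreal(60), temp2 := id3 * temp1, temp3 := id2 + temp2, id1 := temp3, the function returns two: temp2 := id3 * 60.0 and id1 := id2 + temp2. The sketch does not renumber temporaries, so the surviving temporary keeps the name temp2 rather than temp1.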
30
Code Generation
Generate relocatable machine code or assembly code
  – MOVF id3, R2
    MULF #60.0, R2
    MOVF id2, R1
    ADDF R2, R1
    MOVF R1, id1
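A deliberately naive sketch of how such assembly could be emitted from optimized three-address code. Everything here is an assumption for illustration: the (dest, op, arg1, arg2) tuple format, the convention that names starting with "temp" live only in registers, and the free-register list ordered R2, R1, which was chosen so that the output matches the slide. Registers are never reused, so the sketch only handles very short sequences.

```python
OPCODES = {"*": "MULF", "+": "ADDF"}

def generate(code, free_registers=("R2", "R1")):
    """code: list of (dest, op, arg1, arg2) tuples; returns assembly lines."""
    free = list(free_registers)
    location = {}                       # temp name -> register holding its value
    asm = []

    def operand(x):
        if isinstance(x, float):
            return f"#{x}"              # immediate constant
        return location.get(x, x)       # register for temps, else the name itself

    for dest, op, a, b in code:
        if not free:
            raise RuntimeError("out of registers in this simple sketch")
        reg = free.pop(0)
        asm.append(f"MOVF {operand(a)}, {reg}")         # load first operand
        asm.append(f"{OPCODES[op]} {operand(b)}, {reg}")  # apply the operation
        if dest.startswith("temp"):
            location[dest] = reg        # the result stays in the register
        else:
            asm.append(f"MOVF {reg}, {dest}")           # store into the variable
    return asm
```

Applied to temp1 := id3 * 60.0 followed by id1 := id2 + temp1, it emits the same five instructions as on the slide.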
31
1.4 Cousins of The Compiler
Preprocessors
Assemblers
Two-Pass Assembler
Loaders and Link-Editors
32
Preprocessors
Macro processing
File inclusion
  – #include <global.h> is replaced by the contents of the file “global.h”
Rational preprocessors
Language extensions
  – ## marks statements of a query language embedded in C
  – Translated into procedure calls
33
Preprocessors
Example 1.2
  – \define\JACM #1; #2; #3 {{\sl J. ACM} {\bf #1}: #2, pp. #3.}
  – \JACM 17;4;715-728 is typeset as: J. ACM 17:4, pp. 715-728.
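The macro expansion in Example 1.2 amounts to splitting the arguments at the separator and substituting them for #1, #2, #3 in the body. The Python sketch below imitates only that substitution step; it is not how TeX itself matches macro templates, and the function name expand_macro and its parameters are hypothetical.

```python
import re

def expand_macro(call, name, separator, body):
    """Expand a macro use by substituting its arguments for #1, #2, ... in body."""
    prefix = "\\" + name
    if not call.startswith(prefix):
        raise ValueError("not a use of " + prefix)
    # Everything after the macro name is the argument list.
    args = [a.strip() for a in call[len(prefix):].split(separator)]

    def substitute(match):
        return args[int(match.group(1)) - 1]   # #1 is the first argument

    return re.sub(r"#(\d)", substitute, body)
```

With the body {{\sl J. ACM} {\bf #1}: #2, pp. #3.} and the use \JACM 17;4;715-728, the result is {{\sl J. ACM} {\bf 17}: 4, pp. 715-728.}, which a formatter would then typeset.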
34
Assembler
Produces relocatable machine code
  – DW a #10
    DW b #20
    MOV a, R1      Load the contents of address a into R1
    ADD #2, R1     Add the constant 2
    MOV R1, b      Store R1 into address b
35
Two-Pass Assembly
First pass
  – Find all identifiers and their storage locations and store them in the symbol table
      Identifier   Address
      a            0
      b            4
Second pass
  – Translate each operation code into its sequence of bits
  – Produce relocatable machine code
36
Two-Pass Assembly
Example 1.3
  Inst. Code   Register   Mem/Const.      Content        (R)
  0001 (MOV)   01 (R1)    00 (Mem)        00000000 (a)   *
  0011 (ADD)   01 (R1)    10 (Constant)   00000010
  0010 (MOV)   01 (R1)    00 (Mem)        00000100 (b)   *
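The two passes can be sketched as two small functions. The bit patterns for the operations, for R1, and for the Mem/Const. tag follow Example 1.3; the word size of 4 (which gives a = 0 and b = 4), the encoding 10 for R2, and the function names are assumptions made for this illustration.

```python
WORD_SIZE = 4  # assumption: 4 address units per data word, so a = 0 and b = 4

def first_pass(lines):
    """Assign each DW-declared identifier the next free data address."""
    symbols, address = {}, 0
    for line in lines:
        parts = line.split()
        if parts and parts[0] == "DW":
            symbols[parts[1]] = address
            address += WORD_SIZE
    return symbols

def second_pass(lines, symbols):
    """Translate MOV/ADD into bit strings; a trailing '*' is the relocation bit."""
    registers = {"R1": "01", "R2": "10"}   # R2's encoding is an assumption
    out = []
    for line in lines:
        op, rest = line.split(None, 1)
        if op == "DW":
            continue                        # declarations produce no instruction
        src, dst = [x.strip() for x in rest.split(",")]
        if op == "MOV" and dst in registers:    # load: memory -> register
            code, reg, operand = "0001", registers[dst], src
        elif op == "MOV":                       # store: register -> memory
            code, reg, operand = "0010", registers[src], dst
        elif op == "ADD":
            code, reg, operand = "0011", registers[dst], src
        else:
            raise ValueError(f"unknown operation {op!r}")
        if operand.startswith("#"):             # immediate constant, not relocated
            value = int(operand[1:])
            out.append(f"{code} {reg} 10 {value:08b}")
        else:                                   # memory address, relocatable
            address = symbols[operand]
            out.append(f"{code} {reg} 00 {address:08b} *")
    return out
```

Running both passes over the five-line program DW a #10, DW b #20, MOV a, R1, ADD #2, R1, MOV R1, b reproduces the three rows of the table in Example 1.3.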
37
Two-Pass Assembly
‘*’ denotes the relocation bit
  – If the data is loaded starting at address 00001111
  – a should be at location 00001111 + 00000000 = 00001111
  – b should be at location 00001111 + 00000100 = 00010011
  Inst. Code   Register   Mem/Const.      Content        (R)
  0001 (MOV)   01 (R1)    00 (Mem)        00001111 (a)   *
  0011 (ADD)   01 (R1)    10 (Constant)   00000010
  0010 (MOV)   01 (R1)    00 (Mem)        00010011 (b)   *
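The relocation arithmetic is simple enough to check mechanically. In this sketch each instruction is a hypothetical (code, register, tag, content, relocatable) tuple, and the 8-bit width of the content field is assumed from the slide.

```python
def relocate(instructions, load_address):
    """Add load_address to every operand whose relocation bit is set."""
    loaded = []
    for code, reg, tag, content, relocatable in instructions:
        if relocatable:
            # Assumption: the address field is 8 bits wide, so wrap at 256.
            content = (content + load_address) & 0xFF
        loaded.append((code, reg, tag, format(content, "08b")))
    return loaded
```

With a load address of 00001111 (15), the operand a moves from 00000000 to 00001111 and b moves from 00000100 to 00010011 (15 + 4 = 19), while the constant 00000010 is left unchanged because its relocation bit is not set.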
38
Loaders and Link-Editors
Loader
  – Takes relocatable machine code and alters the relocatable addresses
Link-editors
  – Resolve external references
      Library files, routines provided by the system, any other program