1 Introduction to Compilation Cheng-Chia Chen
2 What is a compiler?
• A program that translates an executable program in one language into an executable program in another language.
• The compiler typically lowers the level of abstraction of the program.
• For “optimizing” compilers, we also expect the program produced to be better, in some way, than the original.
3 Abstract view of compiler
Implications:
  » recognize legal (and illegal) programs
  » generate correct code
  » manage storage of all variables and code
  » need format for object (or assembly) code
4 Traditional decomposition of a compiler
Implications:
  » intermediate language (IL)
  » front end maps legal code into IL
  » back end maps IL onto target machine
  » simplifies retargeting
  » allows multiple front ends
  » multiple phases => better code
Front end typically runs in O(n) or O(n log n) time.
Key back-end problems (e.g., optimal register allocation and instruction scheduling) are NP-complete.
5 Advantage of the decomposition
6 Components of a Compiler
• Analysis
  » Lexical Analysis
  » Syntax Analysis
  » Semantic Analysis
• Synthesis
  » Intermediate Code Generation
  » Code Optimization
  » Code Generation
7 The Structure of a Compiler
• Front end
  » Lexical Analysis
  » Parsing
  » Semantic Analysis
  » Intermediate Code Generation
• Back end
  » Optimization
  » Code Generation
• The first three, at least, can be understood by analogy to how humans comprehend a natural language.
8 Responsibilities of Front End
• recognize legal programs
• report errors
• produce IL
• preliminary storage map
• shape the code for the back end
Much of front end construction can be automated.
9 Responsibilities of Back End
Code optimization [middle end]:
  » analyzes and changes IL
  » goal is to reduce runtime
  » must preserve values
Code generation:
  » translate IL into target machine code
  » choose instructions for each IL operation
  » decide what to keep in registers at each point
  » ensure conformance with system interfaces
10 Lexical Analysis
• First step: recognize words.
  » Smallest unit above letters
    Compiler is an interesting course.
• Note the
  » Capital “C” (start of sentence symbol)
  » Blank “ ” (word separator)
  » Period “.” (end of sentence symbol)
11 More Lexical Analysis
• Lexical analysis is not trivial. Consider:
    編譯器是一門有趣的課程。
  (Chinese for “Compiler is an interesting course.”; written Chinese puts no blanks between words.)
• Programming languages are typically more cryptic than English:
    *h->j++ = -12.345e-5
12 And More Lexical Analysis
• Lexical analyzer divides program text into “words” or “tokens”:
    if x == y then z = 1; else z = 2;
• Units: if, x, ==, y, then, z, =, 1, ;, else, z, =, 2, ;
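The division into units above can be sketched with a small regular-expression scanner. This is only an illustrative sketch, not the course's lexer: the names `TOKEN_RE`, `WS_RE`, and `tokenize` are hypothetical, and the token classes cover just what this one statement needs.

```python
import re

# Assumed token classes for this one example: "==" is listed before "="
# so that the longer operator is preferred at the same position.
TOKEN_RE = re.compile(r"==|=|;|[A-Za-z_]\w*|\d+")
WS_RE = re.compile(r"\s*")

def tokenize(text):
    """Split program text into a list of token strings."""
    tokens = []
    pos = 0
    while True:
        pos = WS_RE.match(text, pos).end()   # skip blanks (word separators)
        if pos == len(text):
            return tokens
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        tokens.append(match.group())
        pos = match.end()

print(tokenize("if x == y then z = 1; else z = 2;"))
```

Note that blanks separate words but are not themselves tokens, and that `==` must be recognized as one unit rather than two `=` signs.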
13 Parsing (syntax analysis)
• Once words are understood, the next step is to understand sentence structure.
• Parsing = Diagramming Sentences
  » The diagram is a tree
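14 Diagramming a Sentence
[Figure: parse tree of “This line is a longer sentence”. Word labels: This (article), line (noun), is (verb), a (article), longer (adjective), sentence (noun). Interior nodes: NP, VP. Root: sentence.]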
15 Parsing Programs
• Parsing program expressions is the same
• Consider:
    if x == y then z = 1; else z = 2;
• Diagrammed:
  [Figure: tree with root if-then-else and three children: predicate (relation == over x and y), then-stmt (assign over z and 1), else-stmt (assign over z and 2).]
16 Semantic Analysis
• Once sentence structure is understood, we can try to understand “meaning”
  » But meaning is too hard for compilers
• Compilers perform limited analysis to catch inconsistencies
• Some do more analysis to improve the performance of the program
17 Semantic Analysis in Natural Language
• Example:
    Zhang San thinks Li Si took his textbook.
  Whose textbook was taken? Zhang San's, Li Si's, or a third person's?
• Even worse:
    Jack said Jack left his assignment at home.
  How many Jacks are there? Which one left the assignment?
18 Semantic Analysis in Programming
• Programming languages define strict rules to avoid such ambiguities
• This C++ code prints “4”; the inner definition is used
• Illegal in Java.
    {
      int x = 3;
      {
        int x = 4;
        cout << x;
      }
    }
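The “inner definition is used” rule can be modeled as a chain of symbol tables, one per block, searched from the innermost outward. The sketch below is a simplified illustration under that assumption; the class `Scope` and its methods are hypothetical names, not part of any particular compiler.

```python
class Scope:
    """Declarations of one block, linked to the enclosing block."""

    def __init__(self, parent=None):
        self.names = {}
        self.parent = parent

    def declare(self, name, value):
        self.names[name] = value

    def lookup(self, name):
        """Search this block first, then each enclosing block in turn."""
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope.names[name]
            scope = scope.parent
        raise NameError(name)

outer = Scope()
outer.declare("x", 3)          # { int x = 3;
inner = Scope(parent=outer)
inner.declare("x", 4)          #   { int x = 4;
print(inner.lookup("x"))       #     cout << x;  uses the inner x
print(outer.lookup("x"))       # after the inner block closes, x is the outer one again
```

A Java-style rule could be expressed in the same structure by having `declare` reject a name that is already visible in an enclosing block of the same method.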
19 More Semantic Analysis
• Compilers perform many semantic checks besides variable bindings
• Example:
    John loves her sister.
  A “type mismatch” between her and John; we know they are different people
  » Presumably John is male
20 Optimization
• No strong counterpart in English, but akin to editing
• Automatically modify programs so that they
  » Run faster
  » Use less memory
  » In general, conserve some resource
21 Optimization Example
    X = Y * 0  is the same as  X = 0
    X = Y * 2  is the same as  X = Y + Y
Assume X and Y are integers.
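The two identities above can be applied mechanically to an expression tree. The following is a minimal sketch, assuming a hypothetical representation in which an expression is either a variable name, an integer constant, or a tuple `(operator, left, right)`; the function name `simplify` is likewise illustrative.

```python
def simplify(expr):
    """Rewrite e * 0 to 0 and e * 2 to e + e, working bottom-up.

    Valid only under the slide's assumption that all values are integers
    and, in this sketch, that operands have no side effects.
    """
    if not isinstance(expr, tuple):
        return expr                      # a variable name or a constant
    op, left, right = expr
    left, right = simplify(left), simplify(right)
    if op == "*":
        if left == 0 or right == 0:
            return 0
        if right == 2:
            return ("+", left, left)
        if left == 2:
            return ("+", right, right)
    return (op, left, right)

print(simplify(("*", "Y", 0)))
print(simplify(("*", "Y", 2)))
print(simplify(("+", ("*", "Y", 0), "Z")))
```

The integer assumption matters: for floating-point values, `Y * 0` is not always `0` (for example when `Y` is NaN or infinite), so a real optimizer would check types before rewriting.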
22 Code Generation
• Produces assembly code (usually)
• A translation into another language
  » Analogous to human translation
23 Intermediate Languages
• Many compilers perform translations between successive intermediate forms
  » All but first and last are intermediate languages internal to the compiler
  » Typically there is 1 IL
• ILs generally ordered in descending level of abstraction
  » Highest is source
  » Lowest is assembly
24 Intermediate Languages (Cont.)
• ILs are useful because lower levels require exposure of many features hidden by higher levels
  » registers
  » memory layout
  » etc.
• It is hard to obtain all these hidden features directly from the source input.
25 Example
• Source line:
    a = bb+abs(c-7);
  » a sequence of ASCII characters in a text file.
• The scanner groups characters into tokens:
    a  =  bb  +  abs  (  c  -  7  )  ;
• After scanning, we have the token sequence:
    Id a, Asg, Id bb, Plus, Id abs, Lparen, Id c, Minus, IntLiteral 7, Rparen, Semi
26 Example
• The parser groups these tokens into a parse tree.
  Note: (, ) and ; disappear in the tree.
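One way to picture this grouping step is a small recursive-descent parser over the token sequence for `a = bb+abs(c-7);`. The sketch below is an assumption-laden illustration, not the course's parser: the grammar in the comments, the `Parser` class, and the tuple-shaped tree nodes (including the `Call` node) are all hypothetical.

```python
# Token sequence for  a = bb+abs(c-7);  as (kind, text) pairs.
TOKENS = [
    ("Id", "a"), ("Asg", "="), ("Id", "bb"), ("Plus", "+"), ("Id", "abs"),
    ("Lparen", "("), ("Id", "c"), ("Minus", "-"), ("IntLiteral", "7"),
    ("Rparen", ")"), ("Semi", ";"),
]

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Kind of the next token, or None at end of input."""
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else None

    def expect(self, kind):
        """Consume a token of the given kind and return its text."""
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        text = self.tokens[self.pos][1]
        self.pos += 1
        return text

    def statement(self):              # stmt -> Id Asg expr Semi
        target = self.expect("Id")
        self.expect("Asg")
        value = self.expr()
        self.expect("Semi")
        return ("Asg", ("Id", target), value)

    def expr(self):                   # expr -> term ((Plus | Minus) term)*
        node = self.term()
        while self.peek() in ("Plus", "Minus"):
            op = self.peek()
            self.expect(op)
            node = (op, node, self.term())
        return node

    def term(self):                   # term -> IntLiteral | Id | Id ( expr ) | ( expr )
        if self.peek() == "IntLiteral":
            return ("IntLiteral", int(self.expect("IntLiteral")))
        if self.peek() == "Lparen":
            self.expect("Lparen")
            node = self.expr()
            self.expect("Rparen")
            return node
        name = self.expect("Id")
        if self.peek() == "Lparen":   # one-argument function call
            self.expect("Lparen")
            arg = self.expr()
            self.expect("Rparen")
            return ("Call", name, arg)
        return ("Id", name)

tree = Parser(TOKENS).statement()
print(tree)
```

The parentheses and the semicolon are consumed by `expect` but never stored, which is why they do not appear in the resulting tree.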
27
• The type checker resolves types and binds declarations within scopes:
28
• Finally, JVM code is generated for each node in the tree (children first, then their parent):
    iload 3      // push local 3 (bb)
    iload 2      // push local 2 (c)
    ldc 7        // push literal 7
    isub         // compute c-7
    invokestatic java/lang/Math/abs(I)I
    iadd         // compute bb+abs(c-7)
    istore 1     // store result into local 1 (a)
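The instruction order above falls out of a post-order walk of the tree: code for the operands is emitted before the instruction that consumes them. The sketch below illustrates this under several assumptions: the tuple-shaped tree and the names `SLOTS`, `STATIC_METHODS`, `ARITH`, and `gen` are hypothetical, all values are treated as ints, and the local-variable numbers are taken from the comments in the listing.

```python
SLOTS = {"a": 1, "c": 2, "bb": 3}                    # local-variable numbers (assumed from the listing)
STATIC_METHODS = {"abs": "java/lang/Math/abs(I)I"}   # known static int methods
ARITH = {"Plus": "iadd", "Minus": "isub"}

# Hand-built tree for  a = bb+abs(c-7);
TREE = ("Asg", ("Id", "a"),
        ("Plus", ("Id", "bb"),
                 ("Call", "abs", ("Minus", ("Id", "c"), ("IntLiteral", 7)))))

def gen(node, out):
    """Append code for node to out, visiting children before the node itself."""
    kind = node[0]
    if kind == "Id":
        out.append(f"iload {SLOTS[node[1]]}")
    elif kind == "IntLiteral":
        out.append(f"ldc {node[1]}")
    elif kind in ARITH:
        gen(node[1], out)                  # left operand
        gen(node[2], out)                  # right operand
        out.append(ARITH[kind])            # then the operator
    elif kind == "Call":
        gen(node[2], out)                  # argument first
        out.append(f"invokestatic {STATIC_METHODS[node[1]]}")
    elif kind == "Asg":
        gen(node[2], out)                  # right-hand side first
        out.append(f"istore {SLOTS[node[1][1]]}")   # the target is stored to, not loaded
    else:
        raise ValueError(f"unknown node kind: {kind}")
    return out

code = gen(TREE, [])
print("\n".join(code))
```

Because the JVM is a stack machine, this ordering leaves each operand on the stack exactly when the consuming instruction needs it, so no register assignment is required in this simple case.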
29 Issues
• Compiling is almost this simple, but there are many pitfalls.
• Example: How are erroneous programs handled?
• Language design has a big impact on the compiler
  » Determines what is easy and hard to compile
  » Course theme: many trade-offs in language design
30 Compilers Today
• The overall structure of almost every compiler adheres to the outline
• The proportions have changed since FORTRAN
  » Early: lexing, parsing most complex, expensive
  » Today: optimization dominates all other phases, lexing and parsing are cheap
31 Applications of Compilation Techniques
• Editor
• Interpreter
• Debugger
• Word Processing (TeX, Word)
• VLSI Design (VHDL, Verilog)
• Pattern Recognition
32 Trends in Compilation
• Compilation for speed is less interesting. But:
  » scientific programs
  » advanced processors (Digital Signal Processors, advanced speculative architectures)
• Ideas from compilation used for improving code reliability:
  » memory safety
  » detecting data races
  » ...