1
What are compilers?
Dr. Barbara G. Ryder
Dept of Computer Science
Feb06 CS442 B.Ryder
2
A Compiler
A program that translates computer programs that people write into a language that a machine can execute.
INTUITIVELY (and oversimplified), a compiler has 3 parts: parser, optimizer, code generator.
Parser: checks that the program is syntactically CORRECT according to a grammar and, if it is, translates the computer program written by a human into an internal representation to aid translation. All 3 parts of a compiler use the SAME internal representation.
Optimizer: optionally, the parser calls the optimizer to change the program so that it runs faster, takes less space, or uses less energy (e.g., to run on a PDA or iPod).
Code generator: takes the internal representation of the program and translates it into machine language (like the code you have seen earlier in the course).
3
Parser
Programs are written in a high-level language such as Java or C++.
High-level means not close to machine language; commands do not talk about memory locations or op codes (we’ll see examples).
A program is a sequence of commands that are executed in order.
A grammar description of the programming language describes a well-formed program.
Example of an English grammar excerpt:
sentence = noun verb                      John swims
sentence = noun verb adverb               John swims well
sentence = article adjective noun verb    the tall boy swims
sentence = article noun verb              the boy swims
Programming languages have grammars as well, which describe the structure of well-formed programs.
Parsers check that a program adheres to the rules of the programming language’s grammar.
If so, the parser translates the program into an internal representation used by the compiler.
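The English grammar excerpt can be checked mechanically. Below is a minimal sketch, assuming a tiny hand-written lexicon (the `LEXICON` table and the `is_sentence` name are illustrative, not from the slides): each word is mapped to its part of speech, and the resulting sequence is compared against the four sentence rules.

```python
# Hypothetical lexicon: maps each known word to its part of speech.
LEXICON = {
    "john": "noun", "boy": "noun",
    "swims": "verb",
    "well": "adverb",
    "the": "article",
    "tall": "adjective",
}

# The four sentence rules from the excerpt, each a sequence of categories.
RULES = [
    ["noun", "verb"],
    ["noun", "verb", "adverb"],
    ["article", "adjective", "noun", "verb"],
    ["article", "noun", "verb"],
]

def is_sentence(text):
    """Return True if the words of text match one of the sentence rules."""
    categories = [LEXICON.get(word.lower()) for word in text.split()]
    return categories in RULES

print(is_sentence("John swims well"))     # True
print(is_sentence("the tall boy swims"))  # True
print(is_sentence("swims John"))          # False
```

A real parser does the same kind of check against a programming language's grammar, and also builds an internal representation while it does so.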
4
Code Generator
Translates the internal representation of a program into machine language.
Has all the info it needs in the internal representation and knows the program is correct according to the rules of the grammar.
Is targeted to output a specific machine language for a specific kind of computer.
Can change to a different computer chip with a different instruction set by changing code generators, without other changes to the compiler.
5
Arithmetic Expressions
Arithmetic expressions use the + and * operations.
Assume the acc can perform acc = acc <op> mem[const], where <op> is + or *.
Assume we only use integer constants in our expressions.
How can we represent an expression such as 2 + 3 * 5? As a tree:

    +        “2+(3*5)”
   / \
  2   *      “3*5”
     / \
    3   5

At the board: first binary expressions (1+2, etc.), then longer ones with the same operator (4*5*6), and finally longer ones with different operators, 4*5+6 versus 4*(5+6), showing the 2 trees.
6
Examples

(4+5)*6                 4+5*6

      *                   +
     / \                 / \
    +   6               4   *
   / \                     / \
  4   5                   5   6

The + subtree on the left is 4+5; the * subtree on the right is 5*6.
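The two example trees can be written down directly as data. This is a minimal sketch, assuming a tree node is either an integer leaf or a tuple `(op, left, right)`; the tuple encoding and the `evaluate` and `show` helpers are illustrative choices, not something the slides prescribe.

```python
def evaluate(node):
    """Compute the value of an expression tree, children before parent."""
    if isinstance(node, int):
        return node
    op, left, right = node
    left_value, right_value = evaluate(left), evaluate(right)
    return left_value + right_value if op == "+" else left_value * right_value

def show(node):
    """Print a tree back as a fully parenthesized expression."""
    if isinstance(node, int):
        return str(node)
    op, left, right = node
    return f"({show(left)} {op} {show(right)})"

tree_a = ("*", ("+", 4, 5), 6)   # (4+5)*6
tree_b = ("+", 4, ("*", 5, 6))   # 4+5*6

print(show(tree_a), "=", evaluate(tree_a))  # ((4 + 5) * 6) = 54
print(show(tree_b), "=", evaluate(tree_b))  # (4 + (5 * 6)) = 34
```

The shape of the tree, not the left-to-right order of the symbols, decides which operation is performed first.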
7
Internal Representation
As we parse an expression we can build a (tree) representation of it.
Let’s consider expressions involving integer variables and integer constants.
8
Example
b = 3
x = a + 2
y = b + 1
z = y * x
w = a + 2
u = 4 * x
Graph so far: node (=, b) with child 3.
10
Example
b = 3
x = a + 2
y = b + 1
z = y * x
w = a + 2
u = 4 * x
Graph so far: node (=, b) with child 3.
Local common subexpression elimination
11
Example Optimizations
Graph nodes: (=, b) over 3; (+, x,w) over a and 2; (+, y) over b and 1; (*, z) over y and x; (*, u) over 4 and x.
Two labels on the a+2 node save a computation; the node is encoded as x = a + 2; w = x;
Constant operands can be figured out: b is 3, so y = b + 1 is 4.
After the constants are found, z = y * x and u = 4 * x are the same expression, 4 * x!
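The two optimizations on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not the course's implementation: each statement is a hypothetical tuple `(target, left, op, right)` (with `op` set to `None` for a plain assignment), every variable is assigned only once, and operand order is compared literally, so `4 * x` and `x * 4` would not be recognized as equal.

```python
PROGRAM = [
    ("b", 3, None, None),
    ("x", "a", "+", 2),
    ("y", "b", "+", 1),
    ("z", "y", "*", "x"),
    ("w", "a", "+", 2),
    ("u", 4, "*", "x"),
]

def optimize(statements):
    constants = {}   # variable -> integer value known at compile time
    available = {}   # (left, op, right) -> variable already holding that value
    result = []
    for target, left, op, right in statements:
        # Replace operands that are known constants by their values.
        left = constants.get(left, left)
        right = constants.get(right, right)
        if op is None:
            if isinstance(left, int):
                constants[target] = left
            result.append((target, left, None, None))
        elif isinstance(left, int) and isinstance(right, int):
            # Both operands are constants: evaluate at compile time.
            value = left + right if op == "+" else left * right
            constants[target] = value
            result.append((target, value, None, None))
        elif (left, op, right) in available:
            # Same expression seen before: reuse it with a cheap copy.
            result.append((target, available[(left, op, right)], None, None))
        else:
            available[(left, op, right)] = target
            result.append((target, left, op, right))
    return result

def fmt(statement):
    target, left, op, right = statement
    return f"{target} = {left}" if op is None else f"{target} = {left} {op} {right}"

for statement in optimize(PROGRAM):
    print(fmt(statement))
```

The sketch keeps statements in their original order, so the copies appear where the repeated expressions originally stood.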
12
Example
Now how to generate machine language for this expression?
Walk the graph and at each internal node, generate appropriate code.
Graph nodes: (=, y) over 4; (*, u,z) over 4 and the (+, x,w) node; (+, x,w) over a and 2; (=, b) over 3.
13
Transformed Code
b = 3
x = a + 2
w = x
y = 4
z = 4 * x
u = z
Graph nodes shown: (=, y), (+, x,w), (*, u,z), 4.
14
Comparison
Original code      Optimized code
b = 3              b = 3
x = a + 2          x = a + 2
y = b + 1          w = x
z = y * x          y = 4
w = a + 2          z = 4 * x
u = 4 * x          u = z
Note: fewer arithmetic operations and many inexpensive copies.
15
Code Generation
Source         Machine language
b = 3          mem[42] = 3
x = a + 2      acc = 2
               acc = acc + mem[43]
               mem[44] = acc
w = x          mem[45] = acc
y = 4          mem[46] = 4
z = 4 * x      acc = 4
               acc = acc * mem[44]
               mem[47] = acc
u = z          mem[48] = acc
We needed to add multiplication to your machine language in order to translate this code.
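A code generator for this accumulator machine can be sketched as follows. Several things here are assumptions of the sketch rather than facts from the slide: statements use a hypothetical `(target, left, op, right)` tuple form, memory cells are handed out in order of first use starting at 42, and a load instruction `acc = mem[n]` exists for the cases where the needed value is not already in the accumulator.

```python
OPTIMIZED = [
    ("b", 3, None, None),
    ("x", "a", "+", 2),
    ("w", "x", None, None),
    ("y", 4, None, None),
    ("z", 4, "*", "x"),
    ("u", "z", None, None),
]

def generate(statements, first_address=42):
    address = {}      # variable name -> memory cell, assigned on first use
    in_acc = set()    # variables whose current value equals acc
    code = []

    def mem(var):
        if var not in address:
            address[var] = first_address + len(address)
        return f"mem[{address[var]}]"

    def load(operand):
        nonlocal in_acc
        if isinstance(operand, int):
            code.append(f"acc = {operand}")
            in_acc = set()
        elif operand not in in_acc:
            code.append(f"acc = {mem(operand)}")   # assumed load instruction
            in_acc = {operand}

    for target, left, op, right in statements:
        if op is None and isinstance(left, int):
            code.append(f"{mem(target)} = {left}")  # store a constant directly
            in_acc.discard(target)
            continue
        if op is not None:
            # + and * are commutative, so a constant operand goes into acc.
            if isinstance(right, int):
                left, right = right, left
            if isinstance(right, int):
                raise ValueError("constant expressions should be folded first")
            operand = mem(right)
            load(left)
            code.append(f"acc = acc {op} {operand}")
            in_acc = set()
        else:
            load(left)                              # copy: value must be in acc
        code.append(f"{mem(target)} = acc")
        in_acc.add(target)
    return code

print("\n".join(generate(OPTIMIZED)))
```

Tracking which variables are currently in `acc` is what lets a copy such as `w = x` become a single store.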
16
Digging Deeper - Grammars
How do we define well-formed expressions?
Expr = Const <op> Const, where <op> is + or *
How do we show the rules of arithmetic for unparenthesized expressions?
Expr = Subexp + Subexp
Subexp = Const * Const
Subexp = Const
Grammar rules correspond to the shape of the tree. For 4 + 5 * 6:

          Expr
        /  |   \
  Subexp   +   Subexp
     |        /  |   \
   Const  Const  *  Const
     4      5         6
17
Examples
(4+5)*6 as a tree:

      *
     / \
    +   6
   / \
  4   5

Expr = Subexp + Subexp
Subexp = Const * Const
Subexp = Const
Adding parenthesized expressions requires a new rule:
Subexp = ( Expr )
[Parse tree for (4+5)*6, built from Expr, Subexp and Const nodes]
Can show that this rule also works for 6 * (4 + 5).
Note that the rule for Expr allows the constant to be the first or last argument to the product.
18
Example
Expr = Subexp + Subexp
Subexp = Const * Const
Subexp = Const
Subexp = ( Expr )
Adding arbitrary length, nested subexpressions requires changing the grammar:
Expr = Expr + Expr
Expr = Subexp
Subexp = Subexp * Subexp
Subexp = Const
Subexp = ( Expr )
[Parse trees for 4+5+6 and 4*5*6]
Sequences of sums require the Expr = Expr + Expr and Expr = Subexp rules.
Sequences of products require the Subexp = Subexp * Subexp rule.
Mention that the rules are using recursion!
19
Complicated Example
2*3+5*6+7*8

            +
          /   \
        +       *
       / \     / \
      *   *   7   8
     / \ / \
    2  3 5  6

Expr = Expr + Expr
Expr = Subexp
Subexp = Subexp * Subexp
Subexp = Const
Subexp = ( Expr )
[Parse tree for 2*3+5*6+7*8, built from Expr, Subexp and Const nodes]
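A parser for this grammar can be sketched with one function per rule. Two liberties are taken, and both are assumptions of the sketch: the recursive rules `Expr = Expr + Expr` and `Subexp = Subexp * Subexp` are handled with loops that group from left to right, and a helper level called `atom` (not a name from the slides) covers `Subexp = Const` and `Subexp = ( Expr )`. The result is a tree of `(op, left, right)` tuples with integer leaves.

```python
import re

def parse(text):
    # Numbers, the four symbols, and any other non-space character
    # (which will then be rejected as an unexpected token).
    tokens = re.findall(r"[0-9]+|[+*()]|\S", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():      # Expr = Expr + Expr | Subexp
        node = subexp()
        while peek() == "+":
            advance()
            node = ("+", node, subexp())
        return node

    def subexp():    # Subexp = Subexp * Subexp | Const | ( Expr )
        node = atom()
        while peek() == "*":
            advance()
            node = ("*", node, atom())
        return node

    def atom():      # Const or a parenthesized Expr
        token = peek()
        if token is not None and token[0] in "0123456789":
            advance()
            return int(token)
        if token == "(":
            advance()
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        raise SyntaxError(f"unexpected token: {token!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token: {peek()!r}")
    return tree

print(parse("2*3+5*6+7*8"))
print(parse("(4+5)*6"))
```

An ill-formed input such as `4+*5` raises `SyntaxError`, which is the "check that the program is well-formed" half of the parser's job.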
20
Summing Up
Parser uses grammar rules to check expressions for correct structure -- syntax
If correct, then builds the expression graphs
Optimizes the graphs to find repeated subexpressions and constants that can be evaluated at compile-time
Then generates code from the graph
21
Interpreters
A compiler translates a program into machine language.
An interpreter carries out the statements in a program directly by executing equivalent commands; there is no real translation step.
Interpretation requires that a programming language have a defined meaning for its statements -- semantics.
Semantics are sometimes defined mathematically, sometimes in English.
22
Expression Interpreter
Requires:
an input expression
rules for operator evaluation
a stack -- storage for partial results
Think of how you store plates in your cupboard: take the next plate to use off the top of the pile, and stack newly cleaned plates on the top of the pile.
LIFO: last-in, first-out
Example: interpreter for un-parenthesized arithmetic expressions
23
Example 2 * 3 + 5 (top of each stack is at the right)
Input          Operator stack    Operand stack
2 * 3 + 5      empty             empty          (initially)
* 3 + 5        empty             2
3 + 5          *                 2
+ 5            *                 2 3
24
Example 2 * 3 + 5 (top of each stack is at the right)
Input          Operator stack    Operand stack
+ 5            *                 2 3
5              +                 6
empty          +                 6 5
empty          empty             11
Answer on top of operand stack
25
Example 2 + 3 * 5 (top of each stack is at the right)
Input          Operator stack    Operand stack
2 + 3 * 5      empty             empty          (initially)
+ 3 * 5        empty             2
3 * 5          +                 2
* 5            +                 2 3
26
Example 2 + 3 * 5 (top of each stack is at the right)
Input          Operator stack    Operand stack
* 5            +                 2 3
5              + *               2 3
empty          + *               2 3 5
27
Example 2 + 3 * 5 (top of each stack is at the right)
Input          Operator stack    Operand stack
empty          + *               2 3 5
empty          +                 2 15
empty          empty             17
28
Algorithm
When you see an operator in the input, compare it to the top of the operator stack. If the operator stack is empty, push the operator.
If + on stack and + in input: pop 2 operands, evaluate their sum, push result on top of operand stack.
If + on stack and * in input: push operator.
If * on stack and + in input: pop 2 operands, evaluate their product, push result on top of operand stack.
If * on stack and * in input: pop 2 operands, evaluate their product, push result on top of operand stack.
In each pop-and-evaluate case the evaluated operator is popped too; the input operator is then compared against the new top of the operator stack, and pushed once no evaluation applies.
Always push operands onto operand stack.
When input is empty, evaluate all operators left on stack.
Answer is on top of operand stack.
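The two-stack algorithm can be sketched as below. It assumes the input is an un-parenthesized expression whose tokens are separated by spaces (a simplification for this sketch), and it uses a loop for the "evaluate, then look at the stack again" step so that an operator left underneath is also evaluated when the rules call for it.

```python
def interpret(text):
    operators = []   # operator stack, top at the end of the list
    operands = []    # operand stack, top at the end of the list

    def evaluate_top():
        """Pop one operator and two operands, push the result."""
        op = operators.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append(left + right if op == "+" else left * right)

    for token in text.split():
        if token in ("+", "*"):
            # Only "+ on stack, * in input" pushes without evaluating;
            # in the other three cases the stacked operator is evaluated.
            while operators and not (operators[-1] == "+" and token == "*"):
                evaluate_top()
            operators.append(token)
        else:
            operands.append(int(token))   # always push operands

    while operators:                      # input empty: evaluate what is left
        evaluate_top()
    return operands[-1]                   # answer is on top of operand stack

print(interpret("2 * 3 + 5"))  # 11
print(interpret("2 + 3 * 5"))  # 17
```

The single exception in the loop condition is what gives * precedence over +; everything else is evaluated left to right.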
29
What’s going on?
Algorithm is enforcing rules of arithmetic, assuming we accumulate sums and products from left to right.
If + on stack and + in input, pop 2 operands, evaluate, push result on top of operand stack: 2+3+4 ~ (2+3) + 4
If + on stack and * in input, push operator: matches ? + ? * ?
If * on stack and + in input, pop 2 operands, evaluate, push result on top of operand stack: matches ? * ? + ?
If * on stack and * in input, pop 2 operands, evaluate, push result on top of operand stack: 2*3*4 ~ (2*3) * 4
30
How are interpreters useful?
Allow prototyping of new programming languages (PLs)
Get to test out PL design quickly
E.g., Scheme, Prolog, Java
A way to achieve portability and universality for a PL
Generate code to be interpreted by a Virtual Machine (VM)
Can install the PL on a different machine (i.e., chip) merely by rewriting the VM
As long as PL definition is carefully written (syntax and semantics), programs should work equivalently!
Model for Java (e.g., JVM - Java Virtual Machine)
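The virtual machine idea can be illustrated with a toy example. The instruction set below (`PUSH`, `ADD`, `MUL`) is invented for this sketch and is not JVM bytecode: the "compiled" program is just a list of instructions, and `run` is the only part that would need rewriting to move to a different machine.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine."""
    stack = []
    for opcode, argument in program:
        if opcode == "PUSH":
            stack.append(argument)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop()

# 2 + 3 * 5, translated by hand into the toy instruction set.
VM_PROGRAM = [
    ("PUSH", 2),
    ("PUSH", 3),
    ("PUSH", 5),
    ("MUL", None),
    ("ADD", None),
]

print(run(VM_PROGRAM))  # 17
```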
31
Java Language definition ~mid-1990’s
Used to write applications built out of pieces (e.g., libraries, components, middleware)
Built by different people, in different places, on different machines
Works because of VM mechanism
Interpretation frees user from worries about machine-dependent translation details
32
PLs & Compilers: An Incomplete History
Machine language programming
Scientific computation in Fortran with first compilers
LISP for non-numerical computation
1960’s
First optimizing Fortran compiler (IBM)
1970’s
First program analyses designed to enable complex optimizations
C language and UNIX (Linux is a form of UNIX)
Optimizing for space and time savings
Note: ENIAC at UPenn; its first programmers were WOMEN
33
PLs & Compilers: An Informal History
First widely-used object-oriented PLs - Smalltalk, C++
Compilers translate for parallel machines (e.g., Thinking Machines, Cray)
PLs allowing explicit parallelism (i.e., use of multiple processors; Ada)
1990’s
Birth of the Internet
PLs for explicitly distributed computation (e.g., across machines in a network)
Object-oriented PLs - Java (VMs)
2000’s
Compiling for low power
Special purpose (domain specific) PLs
Scalability, distributed computation, ubiquity
Notes:
Thinking Machines -- massively parallel computation -- doing the ‘same thing’ to lots and lots of data at the same time
Cray -- vector machines -- organizing computation in terms of sets of values