1
Compilers for Algorithmic Languages Design and Construction of Compilers
Leonidas Fegaras
2
Catalogue Description
Review of programming language structures, translation, and storage allocation. Introduction to context-free grammars and their description. Design and construction of compilers including lexical analysis, parsing and code generation techniques. Error analysis and simple code optimizations will be introduced.
3
Objectives The goal of this course is to give a working knowledge of the basic techniques used in the implementation of modern programming languages. The course is centered around a substantial programming project: implementing a complete compiler for a realistic language. Students successfully completing this course will be able to apply the theory and methods learned during the course to design and implement optimizing compilers for most programming languages.
4
Reasons to Take this Course
To understand better:
programming languages (principles & semantics)
computer architecture and machine code structure
the relation between source programs and generated machine code
To get a good balance of theory & practice
To complete a substantial programming project (a compiler for a realistic language):
get programming experience and become a better programmer
learn how to work in groups
5
Prerequisites:
CSE3302 (Programming Languages)
CSE3315 (Theoretical Concepts)
CSE3322 (Computer Architecture I)
Students must:
have knowledge of and programming experience with Java;
be familiar with the functions of modern computer architectures and be able to program in an assembly language;
be familiar with data structure concepts and algorithms (such as lists, trees, sorting, hashing, etc).
Students without adequate preparation are at substantial risk of failing this course.
6
Textbook Required Textbook and Notes:
Andrew W. Appel: Modern Compiler Implementation in Java, Second Edition. Cambridge University Press, 2002.
Lecture Notes, available at the course web page; the lecture slides are based on these notes.
You may find the following texts useful for additional background and explanation:
A. V. Aho, M. S. Lam, R. Sethi, and J. D. Ullman: Compilers: Principles, Techniques, and Tools, 2nd edition (the classic red "Dragon" book). Addison-Wesley, 2007.
C. Fischer and R. LeBlanc: Crafting a Compiler with C. Benjamin/Cummings, 1991.
7
Grading The final grade will be based on:
30% project
20% first midterm exam
20% second midterm exam
30% final exam (comprehensive)
The course work will be the same for graduates and undergraduates.
Final grades will be assigned according to the following scale:
A: score >= 90
B: 80 <= score < 90
C: 70 <= score < 80
D: 60 <= score < 70
F: score < 60
Sometimes I use lower cutoff points, depending on the overall performance of the class.
After the first grades are posted, you can check your grades online at the course web page.
8
Reading Assignments Completing reading assignments before the class period in which the material is discussed is essential to success in this class. Not all the assigned material will be covered in class, but you will be responsible for it on exams.
9
Exams All exams are closed-book and closed-notes.
The second midterm exam will cover only the material of the second part of the course, while the final exam will cover the material from the first lecture up to and including the last lecture. Makeup exams will be given only when the instructor has approved a request to change the exam time at least 3 days before the exam. Approval will be given only for illness or a death in the family.
10
Project The course project is to construct a compiler for a small programming language and will involve: lexical analysis, parsing, semantic analysis (type-checking), and code generation for a MIPS architecture. This project will be done in Java. You may use your own PC, but your programs must work correctly on gamma. The project is to be completed in seven stages spaced throughout the term and will be done by groups of 3 students. Late project reports will be marked 20% off per day. No further extensions will be allowed. No excuses, no exceptions.
11
Cheating You are allowed to collaborate with students of your project group only. No copying is permitted. Cheating involves giving assistance to or receiving assistance from members of other groups, copying code from the web, etc. The punishment for cheating is a zero on the assignment, and offenders will also be subject to the university's academic dishonesty policy. If you have any questions regarding an assignment, see the instructor or teaching assistant.
12
Special Accommodations
If you require an accommodation based on disability, I would like to meet with you in the privacy of my office, during the first week of the semester, to make sure you are appropriately accommodated.
13
Project The project will be done in groups of three students. It is your responsibility to find two other students and organize a project team. The course project is to construct a compiler for a small programming language, called PCAT. It will involve: lexical analysis parsing semantic analysis (type checking) code generation for a MIPS architecture. The project is to be completed in seven stages spaced throughout the term.
14
Survival Tips Select your teammates very carefully. Your project grade will depend on them. Choose teammates whose abilities complement yours. For example, you may be good in Java and this person may be good in computer architecture and assembly programming. That way your group will be strong in all aspects of this project. It's up to you to decide how to divide the project work among your teammates. It's highly unprofessional to come to me and complain about your teammates. You should meet, solve your differences, and divide the work as a professional team. Your project grade will not depend on your abilities alone, but on how well your team achieves all the above tasks.
15
Survival Tips (cont.) Start working on programming assignments as soon as they are handed out. Do not wait till the day before the deadline. You will see that assignments take much more time when you work on them under pressure than when you are more relaxed. Design carefully before you code. Writing a well-designed piece of code is always easier than starting with some code that "almost works" and adding patches to make it "really work".
16
Platform and Tools You will do your project on your own PC (under Linux, Windows, Mac OS X, etc) or on gamma at UTA. You have to use Java JDK 5 or 6. There are many on-line manuals for Java (see the project web page). To make coding easier in Java, you are required to use the Gen package to build abstract syntax trees and intermediate representation trees. You will also use a MIPS code simulator, called SPIM, to run the assembly code generated by your compiler. To install the project on your own Linux or Windows PC, you:
install Sun's JDK 6 (the Java runtime/compiler)
install SPIM (the MIPS emulator)
download the System.jar archive that contains the CUP, JLex, and Gen classes
download the project and compile it
17
Program Grading Programs will be graded according to their correctness, style, and readability. Programs should behave as specified in the assignment handouts. Bad data should be handled gracefully; your program should never have run-time errors like dereferencing a null pointer or using an out-of-bounds index. Special cases should be handled correctly. Unnecessarily inefficient algorithms or constructs should be avoided; however, efficiency should never be pursued at the expense of clarity or simplicity. Programs should be well documented, modular, and flexible, i.e. easy to modify. Indentation should reflect program structure. Use meaningful identifiers.
18
Program Grading (cont.)
Avoid static variables and side effects as much as possible. You should never use side effects during the semantic actions of a parser. The grader should be able to understand the program without undue strain. I will provide some test programs, but these programs will not test your compiler exhaustively. It is your responsibility to test every statement in your program by some piece of test data. Thorough testing is essential to establish the reliability of your code. Don't even think about adding fancy features until the required work is completely debugged. A correctly working simple program is worth much more (both in this class and in actual practice) than a fancy program with bugs.
19
Cheating You are allowed to collaborate with students of your project group only. No copying is permitted. Cheating involves giving assistance to or receiving assistance from members of other groups, copying code from the web, etc. You are required to use the Gen package (using the Meta class interface for tree construction and pattern matching). It will be taken as cheating if you use your own data structures or interface (since this would mean that you have copied the code from elsewhere). The punishment for cheating is a zero in the assignment and will be subject to the university's academic dishonesty policy. If you have any questions regarding an assignment, see the instructor or teaching assistant.
20
Deliverables Project phases:
Lexical Analysis: worth 6% of your project grade.
Parsing: worth 14% of your project grade.
Abstract Syntax: worth 14% of your project grade.
Type-Checking: worth 18% of your project grade.
Simple IRs: worth 18% of your project grade.
Rest of IRs: worth 16% of your project grade.
Instruction Selection: worth 14% of your project grade.
The due time of each project phase is midnight of the indicated due day.
You will hand in your project source code electronically.
You may hand in your source files as many times as you want; only the last submission will be taken into account.
Late projects will be marked 20% off per day, so there is no point submitting a project more than 4 days late! No further extensions will be allowed. No excuses, no exceptions.
21
Solution There is a solution jar archive, Solution.jar.
It provides all the classes (obfuscated), so that for each project phase you can compare the output of your program with that of the solution. You can run the solution PCAT compiler over a test PCAT program, say tests/hanoi.pcat, using the command solution 7 hanoi inside your project directory. If you mess up a project phase, you can still do the next project phases by removing the appropriate source files from your directory. That way, the missing classes will be copied from the Solution.jar file, rather than compiled from your sources.
22
By Monday January 28
Find a team: stay after class and talk to your classmates.
Each team will send one email to the GTA with information about the team members (first name and last name only).
If you cannot find a team or need to add a third member, I will help you after class.
23
What is a Compiler? We will mostly study the translation performed by a compiler from high-level source code to low-level machine code.
High-level source code (eg, a Java program):
easy to understand
user-friendly syntax
many high-level programming constructs
machine-independent
variables, procedures, classes, ...
Low-level machine code (eg, MIPS code):
hard to understand
specific to hardware
registers & unnamed locations
24
Architecture
Compiler: the source program is translated by the compiler into assembly code, the assembler turns it into machine code, the linker combines it with libraries into machine code for an executable, and the loader runs it over the input data to produce the result.
Interpreter: the interpreter executes the source program directly over the input data to produce the result.
Java uses both a compiler (javac) and an interpreter (java)
25
Many Other Translators
Source Language | Translator | Target Language
LaTeX | text formatter | PostScript
SQL | database query optimizer | query evaluation plan
Java | javac compiler | Java byte code
Java | cross-compiler | C++ code
English text | natural language understanding | semantics (meaning)
regular expressions | JLex scanner generator | a scanner in Java
BNF of a language | CUP parser generator | a parser in Java
26
Challenges Many variations:
many programming languages (eg, FORTRAN, C++, Java) many programming paradigms (eg, object-oriented, functional, logic) many computer architectures (eg, MIPS, SPARC, Intel, alpha) many operating systems (eg, Linux, Solaris, Windows)
27
Qualities of a Compiler
the compiler itself must be bug-free it must generate correct machine code the generated machine code must run fast the compiler itself must run fast (compilation time must be proportional to program size) the compiler must be portable (ie, modular, supporting separate compilation) it must print good diagnostics and error messages the generated code must work well with existing debuggers
28
Challenges Building a compiler requires knowledge of
programming languages (parameter passing, variable scoping, memory allocation, etc) theory (automata, context-free languages, etc) algorithms and data structures (hash tables, graph algorithms, dynamic programming, etc) computer architecture (assembly programming) software engineering
29
Addressing Portability
Suppose you want to write compilers from m source languages to n computer platforms. A naive solution requires n*m programs (eg, C, Java, and FORTRAN each compiled directly to MIPS, SPARC, Pentium, and PowerPC), but we can do it with n+m programs: m front-ends that translate each source language into a common intermediate representation, and n back-ends that translate the intermediate representation to each platform.
IR: Intermediate Representation
FE: Front-End
BE: Back-End
30
Phases A typical real-world compiler usually has multiple phases
The front-end consists of the following phases: scanning: a scanner groups input characters into tokens parsing: a parser recognizes sequences of tokens according to some grammar and generates Abstract Syntax Trees (ASTs) semantic analysis: performs type checking and translates ASTs into IRs optimization: optimizes IRs The back-end consists of the following phases: instruction selection: maps IRs into assembly code code optimization: optimizes the assembly code using control-flow and data-flow analyses, register allocation, etc code emission: generates machine code from assembly code
31
Lexical Analysis Leonidas Fegaras
32
Lexical Analysis A scanner groups input characters into tokens
For example, the input x = x * (acc+123) is broken into the token stream: identifier(x), equal, identifier(x), star, left-paren, identifier(acc), plus, integer(123), right-paren.
Tokens are typically represented by numbers
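As a minimal sketch (a hypothetical class, not the course's actual code; CUP's Symbol class, shown later, plays this role in the project), a token can be represented as a number paired with an optional semantic value:

class Token {
    static final int ID = 1, EQUAL = 2, STAR = 3, LPAREN = 4,
                     PLUS = 5, INT = 6, RPAREN = 7;  // token numbers
    final int sym;       // which token this is, eg Token.ID
    final Object value;  // its semantic value, eg the identifier name or integer value
    Token ( int sym, Object value ) { this.sym = sym; this.value = value; }
    Token ( int sym ) { this(sym, null); }
}
// eg, scanning "acc" yields new Token(Token.ID,"acc");
//     scanning "123" yields new Token(Token.INT,123)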
33
Communication with the Parser
The scanner reads characters from the source file and groups them into tokens; the parser requests tokens and builds an AST. Each time the parser needs a token, it sends a request to the scanner:
the scanner reads as many characters from the input stream as necessary to construct a single token
when a single token is formed, the scanner is suspended and returns the token to the parser
the parser will repeatedly call the scanner to read all the tokens from the input stream
34
Tasks of a Scanner A typical scanner:
recognizes the keywords of the language these are the reserved words that have a special meaning in the language, such as the word class in Java recognizes special characters, such as ( and ), or groups of special characters, such as := and == recognizes identifiers, integers, reals, decimals, strings, etc ignores whitespaces (tabs, blanks, etc) and comments recognizes and processes special directives (such as the #include "file" directive in C) and macros
35
Scanner Generators Input: a scanner specification
describes every token using Regular Expressions (REs); eg, the RE [a-z][a-zA-Z0-9]* recognizes all identifiers that start with a lower-case letter, followed by zero or more alphanumeric characters
handles whitespace and resolves ambiguities
Output: the actual scanner
Scanner generators compile regular expressions into efficient programs (finite state machines)
You will use a scanner generator for Java, called JLex, for the project
36
Regular Expressions are a very convenient form of representing (possibly infinite) sets of strings, called regular sets; eg, the RE (a | b)*aa represents the infinite set {"aa", "aaa", "baa", "abaa", ...}. A RE is one of the following (name, RE, designation):
epsilon: ε designates {""}
symbol: a designates {"a"} for some character a
concatenation: AB designates the set { rs | r∈A, s∈B }, where rs is string concatenation, and A and B designate the regular sets of the REs A and B
alternation: A | B designates the set A ∪ B
repetition: A* designates the set ε | A | (AA) | (AAA) | ... (an infinite set)
eg, the RE (a | b)c designates { rs | r∈{"a"}∪{"b"}, s∈{"c"} }, which is equal to {"ac","bc"}
Shortcuts: P+ = PP*, P? = P | ε, [a-z] = ("a"|"b"|...|"z")
37
Properties concatenation and alternation are associative
eg, ABC means (AB)C and is equivalent to A(BC) alternation is commutative eg, A | B = B | A repetition is idempotent eg, A** = A* concatenation distributes over alternation eg, (a | b)c = ac | bc
38
Examples
for-keyword = for
letter = [a-zA-Z]
digit = [0-9]
identifier = letter (letter | digit)*
sign = + | - | ε
integer = sign (0 | [1-9]digit*)
decimal = integer . digit*
real = (integer | decimal) E sign digit+
39
Disambiguation Rules
longest match rule: from all tokens that match the input prefix, choose the one that matches the most characters
rule priority: if more than one token has the longest match, choose the one listed first
Examples:
for8: is it the for-keyword, the identifier "f", the identifier "fo", the identifier "for", or the identifier "for8"? Use rule 1: "for8" matches the most characters.
for: is it the for-keyword, the identifier "f", the identifier "fo", or the identifier "for"? Use rules 1 & 2: the for-keyword and the identifier "for" have the longest match, but the for-keyword is listed first.
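To make the two rules concrete, here is a small Java sketch (not part of the course code) that scans one token using longest match with rule priority, via java.util.regex; the keyword pattern is listed before the identifier pattern, so it wins ties:

import java.util.regex.*;

public class MaximalMunch {
    // Rules in priority order: keyword first, then identifier.
    static final String[] NAMES = { "FOR", "ID" };
    static final Pattern[] RULES = { Pattern.compile("for"),
                                     Pattern.compile("[a-z][a-z0-9]*") };

    // Return the winning rule and lexeme for the longest prefix match.
    static String scanOne(String input) {
        int bestLen = 0; String bestRule = null;
        for (int i = 0; i < RULES.length; i++) {
            Matcher m = RULES[i].matcher(input);
            if (m.lookingAt() && m.end() > bestLen) {  // strictly longer: earlier rules win ties
                bestLen = m.end(); bestRule = NAMES[i];
            }
        }
        return bestRule + " \"" + input.substring(0, bestLen) + "\"";
    }

    public static void main(String[] args) {
        System.out.println(scanOne("for8"));  // ID "for8"  (longest match wins)
        System.out.println(scanOne("for"));   // FOR "for"  (rule priority breaks the tie)
    }
}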
40
How Scanner Generators Work
Translate REs into a finite state machine. Done in three steps:
translate the REs into a non-deterministic finite automaton (NFA)
translate the NFA into a deterministic finite automaton (DFA)
optimize the DFA (optional)
41
Deterministic Finite Automata
A DFA represents a finite state machine that recognizes a RE; eg, the RE (abc+)+ is represented by a DFA (figure not shown). A finite automaton consists of:
a finite set of states
a set of transitions (moves)
one start state
a set of final states (accepting states)
A DFA has a unique transition for every state-character combination. A DFA accepts a string if, starting from the start state and moving from state to state, each time following the arrow that corresponds to the current input character, it reaches a final state when the entire input string is consumed.
42
DFA (cont.) The error state 0 is implied: every missing transition goes to state 0.
The transition table T gives the next state T[s,c] for a state s and a character c (in the example, the columns of T are the characters a, b, and c).
43
The DFA of a Scanner
for-keyword = for
identifier = [a-z][a-z0-9]*
(the combined DFA for these two tokens is not shown)
44
Scanner Code The scanner code that uses the transition table T:
state = initial_state;
current_character = get_next_character();
while ( true )
{  next_state = T[state,current_character];
   if (next_state == ERROR)
      break;
   state = next_state;
   if ( current_character == EOF )
      break;
   current_character = get_next_character();
};
if ( is_final_state(state) )
   `we have a valid token'
else `report an error'
45
With Longest Match
state = initial_state;
final_state = ERROR;
current_character = get_next_character();
while ( true )
{  next_state = T[state,current_character];
   if (next_state == ERROR)
      break;
   state = next_state;
   if ( is_final_state(state) )
      final_state = state;
   if (current_character == EOF)
      break;
   current_character = get_next_character();
};
if ( final_state == ERROR )
   `report an error'
else if ( state != final_state )
   `we have a valid token but need to backtrack (to put characters back into the input stream)'
else `we have a valid token'
46
Alternative Scanner Code
For each DFA transition from a state s1 to a state s2 over a character c, generate the code:
s1: current_character = get_next_character();
    ...
    if ( current_character == 'c' ) goto s2;
    ...
s2: current_character = get_next_character();
    ...
47
Mapping a RE into an NFA An NFA is similar to a DFA, but it also permits multiple transitions over the same character and transitions over ε (the empty string). The following rules construct NFAs with only one final state (construction figures not shown):
48
Example The RE (a | b)c is mapped into an NFA by these rules (figure not shown).
49
Converting an NFA to a DFA
Subset construction:
assign a number to each NFA state
each DFA state will be assigned a set of these numbers
the closure of a DFA state {n1,...,nk} is the DFA state that contains all the NFA states that can be reached by zero or more empty (ε) transitions from the NFA states n1, ..., or nk; so the closure of {n1,...,nk} is a superset of or equal to {n1,...,nk}
the initial DFA state is the closure of the initial NFA state
for every DFA state labelled by some set {n1,...,nk} and for every character c in the language alphabet, find all the states reachable from n1, n2, ..., or nk using c arrows and union together the closures of these nodes; if this set is not the label of any other node in the DFA constructed so far, create a new DFA node with this label
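The closure step is a simple worklist algorithm; here is a Java sketch over a hypothetical NFA given only by its map of empty transitions:

import java.util.*;

public class Closure {
    // NFA epsilon-transitions: state -> states reachable by one empty move.
    // (A made-up NFA; the algorithm works for any such map.)
    static Map<Integer, List<Integer>> eps = Map.of(
        1, List.of(2, 4),
        2, List.of(3),
        4, List.of());

    // Closure of a set of NFA states: all states reachable by 0+ empty moves.
    static Set<Integer> closure(Set<Integer> states) {
        Deque<Integer> work = new ArrayDeque<>(states);
        Set<Integer> result = new HashSet<>(states);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (int t : eps.getOrDefault(s, List.of()))
                if (result.add(t))      // newly reached: explore it too
                    work.push(t);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(closure(Set.of(1)));  // {1, 2, 3, 4} in some order
    }
}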
50
Example
51
Example (a | b)*(abb | a+b)
52
JLex Regular expressions (where e and f are regular expressions):
c          any character c other than: ? * + | ( ) ^ $ . [ ] { } " \
\c         any character c, but \n is newline, \^c is control-c, etc
.          any character except \n
"..."      the concatenation of all the characters in the string
ef         concatenation
e | f      alternation
e*         Kleene closure
e+         ee*
e?         optional e
{name}     macro expansion
[...]      any one character enclosed in [ ], from:
    c      a character c (or use \c)
    ef     any character from e or from f
    a-b    any character from a to b
    "..."  any character in the string
[^...]     any character except those enclosed by [ ]
53
JLex Rules A JLex rule has the form
RE { action }
where the action is Java code:
typically, the action returns a token, but for whitespace and comments you skip the input and return nothing
yytext() returns the part of the input that matches the RE
JLex uses longest match and rule priority
States and state transitions can be used for better control:
the initial (default) state is YYINITIAL
any other state should be declared using the %state directive
now a rule can take the form <s> RE { action }, which can match only if we are in state s
you jump to a state s using yybegin(s)
54
Case Study: The Calculator Scanner
The calculator example is available at the course web page. After you download it on gamma, do:
tar xfz calc.tar.gz
cd calc
build
run
then try it with some input; eg,
2*(3+8);
x:=3+4; x+3;
define f(n) = if n=0 then 1 else n*f(n-1);
f(5);
quit;
55
Tokens are Defined in calc.cup
terminal LP, RP, COMMA, SEMI, ASSIGN, IF, THEN, ELSE, AND, OR, NOT, QUIT, PLUS, TIMES, MINUS, DIV, EQ, LT, GT, LE, NE, GE, FALSE, TRUE, DEFINE;
terminal String ID;
terminal Integer INT;
terminal Float REALN;
terminal String STRINGT;
The constructor of the class Symbol pairs together a terminal token with an optional value (a Java Object). If a terminal is specified with a class (a subtype of Object), then an object of this class should be provided along with the token:
eg, new Symbol(sym.ID,"x")
eg, new Symbol(sym.INT,new Integer(10))
56
The Calculator Scanner
import java_cup.runtime.Symbol;
%%
%class CalcLex
%public
%line
%char
%cup
DIGIT=[0-9]
ID=[a-zA-Z][a-zA-Z0-9_]*
57
The Calculator Scanner (cont.)
{DIGIT}+            { return new Symbol(sym.INT,new Integer(yytext())); }
{DIGIT}+"."{DIGIT}+ { return new Symbol(sym.REALN,new Float(yytext())); }
"("                 { return new Symbol(sym.LP); }
")"                 { return new Symbol(sym.RP); }
","                 { return new Symbol(sym.COMMA); }
";"                 { return new Symbol(sym.SEMI); }
":="                { return new Symbol(sym.ASSIGN); }
"define"            { return new Symbol(sym.DEFINE); }
"quit"              { return new Symbol(sym.QUIT); }
"if"                { return new Symbol(sym.IF); }
"then"              { return new Symbol(sym.THEN); }
"else"              { return new Symbol(sym.ELSE); }
"and"               { return new Symbol(sym.AND); }
"or"                { return new Symbol(sym.OR); }
"not"               { return new Symbol(sym.NOT); }
"false"             { return new Symbol(sym.FALSE); }
"true"              { return new Symbol(sym.TRUE); }
58
The Calculator Scanner (cont.)
"+" { return new Symbol(sym.PLUS); } "*" { return new Symbol(sym.TIMES); } "-" { return new Symbol(sym.MINUS); } "/" { return new Symbol(sym.DIV); } "=" { return new Symbol(sym.EQ); } "<" { return new Symbol(sym.LT); } ">" { return new Symbol(sym.GT); } "<=" { return new Symbol(sym.LE); } "!=" { return new Symbol(sym.NE); } ">=" { return new Symbol(sym.GE); } {ID} { return new Symbol(sym.ID,yytext()); } \"[^\"]*\" { return new Symbol(sym.STRINGT, yytext().substring(1,yytext().length()-1)); } [ \t\r\n\f] { /* ignore white spaces. */ } . { System.err.println("Illegal character: "+yytext()); }
59
Parsing #1 Leonidas Fegaras
60
Parser A parser recognizes sequences of tokens according to some grammar and generates Abstract Syntax Trees (ASTs). A context-free grammar (CFG) has:
a finite set of terminals (tokens)
a finite set of nonterminals, one of which is the start symbol
a finite set of productions of the form A ::= X1 X2 ... Xn, where A is a nonterminal and each Xi is either a terminal or a nonterminal symbol
61
Example Expressions:
E ::= E + T | E - T | T
T ::= T * F | T / F | F
F ::= num | id
Nonterminals: E T F
Start symbol: E
Terminals: + - * / id num
Example: x+2*y
... or equivalently:
E ::= E + T
E ::= E - T
E ::= T
T ::= T * F
T ::= T / F
T ::= F
F ::= num
F ::= id
62
Derivations Notation:
terminals: t, s, ...
nonterminals: A, B, ...
symbols (terminal or nonterminal): X, Y, ...
sequences of symbols: a, b, ...
Given a production A ::= X1 X2 ... Xn, the form aAb => aX1 X2 ... Xnb is called a derivation; eg, using the production T ::= T * F, we get T / F x => T * F / F x
Leftmost derivation: when you always expand the leftmost nonterminal in the sequence
Rightmost derivation: ... the rightmost nonterminal
63
Top-down Parsing It starts from the start symbol of the grammar and applies derivations until the entire input string is derived. Example that matches the input sequence id(x) + num(2) * id(y):
E => E + T          use E ::= E + T
  => E + T * F      use T ::= T * F
  => T + T * F      use E ::= T
  => T + F * F      use T ::= F
  => T + num * F    use F ::= num
  => F + num * F    use T ::= F
  => id + num * F   use F ::= id
  => id + num * id  use F ::= id
You may have more than one choice at each derivation step:
you may have multiple nonterminals in each sequence
for each nonterminal in the sequence, you may have many rules to choose from
Wrong predictions will cause backtracking; we need predictive parsing that never backtracks
64
Bottom-up Parsing It starts from the input string and uses derivations in the opposite direction (from right to left) until you derive the start symbol. Previous example: id(x) + num(2) * id(y)
<= id(x) + num(2) * F   use F ::= id
<= id(x) + F * F        use F ::= num
<= id(x) + T * F        use T ::= F
<= id(x) + T            use T ::= T * F
<= F + T                use F ::= id
<= T + T                use T ::= F
<= E + T                use E ::= T
<= E                    use E ::= E + T
At each derivation step, we need to recognize a handle (the sequence of symbols that matches the right-hand side of a production)
65
Parse Tree Given the derivations used in the top-down/bottom-up parsing of an input sequence, a parse tree has:
the start symbol as the root
the terminals of the input sequence as leaves
for each production A ::= X1 X2 ... Xn used in a derivation, a node A with children X1 X2 ... Xn
(parse tree figure for id(x) + num(2) * id(y) not shown)
E => E + T => E + T * F => T + T * F => T + F * F => T + num * F => F + num * F => id + num * F => id + num * id
66
Playing with Associativity
What about this grammar?
E ::= T + E | T - E | T
T ::= F * T | F / T | F
F ::= num | id
It is right associative: now x+y+z is equivalent to x+(y+z) (parse tree for id(x) + id(y) + id(z) not shown)
67
Ambiguous Grammars What about this grammar?
E ::= E + E | E - E | E * E | E / E | num | id
Now all the operators have the same precedence!
It is ambiguous: it has more than one parse tree for the same input sequence (depending on which derivations are applied each time); eg, id(x) * id(y) + id(z) has one parse tree that groups (x*y)+z and another that groups x*(y+z) (tree figures not shown)
68
Predictive Parsing The goal is to construct a top-down parser that never backtracks; it always uses leftmost derivations, so left recursion is bad! We must transform a grammar in two ways:
eliminate left recursion
perform left factoring
These transformations eliminate the most common causes of backtracking, although they do not guarantee completely backtrack-free parsing.
69
Left Recursion Elimination
For example, the grammar A ::= A a | b recognizes the regular expression ba*, but a top-down parser may have a hard time deciding which rule to use. We need to get rid of the left recursion:
A ::= b A'
A' ::= a A' | ε
ie, A' parses the RE a*. The second rule is recursive, but not left recursive.
70
Left Recursion Elimination (cont.)
For each nonterminal X, we partition the productions for X into two groups: one that contains the left recursive productions, and the other with the rest. That is:
X ::= X a1 ... X ::= X an
X ::= b1 ... X ::= bm
where the a's and b's are symbol sequences. Then we eliminate the left recursion by rewriting these rules into:
X ::= b1 X' ... X ::= bm X'
X' ::= a1 X' ... X' ::= an X'
X' ::= ε
71
Example
E ::= E + T
  | E - T
  | T
T ::= T * F
  | T / F
  | F
F ::= num | id
becomes, after left recursion elimination:
E ::= T E'
E' ::= + T E' | - T E' | ε
T ::= F T'
T' ::= * F T' | / F T' | ε
F ::= num | id
72
Example A grammar that recognizes regular expressions:
R ::= R R | R bar R | R * | ( R ) | char
After left recursion elimination:
R ::= ( R ) R' | char R'
R' ::= R R' | bar R R' | * R' | ε
73
Left Factoring Factors out common prefixes:
X ::= a b1 ... X ::= a bn
becomes:
X ::= a X'
X' ::= b1 ... X' ::= bn
Example:
E ::= T + E | T - E | T
becomes:
E ::= T E'
E' ::= + E | - E | ε
74
Recursive Descent Parsing
E ::= T E'
E' ::= + T E' | - T E' | ε
T ::= F T'
T' ::= * F T' | / F T' | ε
F ::= num | id

static void E () { T(); Eprime(); }
static void Eprime () {
   if (current_token == PLUS)
      { read_next_token(); T(); Eprime(); }
   else if (current_token == MINUS)
      { read_next_token(); T(); Eprime(); };
}
static void T () { F(); Tprime(); }
static void Tprime () {
   if (current_token == TIMES)
      { read_next_token(); F(); Tprime(); }
   else if (current_token == DIV)
      { read_next_token(); F(); Tprime(); };
}
static void F () {
   if (current_token == NUM || current_token == ID)
      read_next_token();
   else error();
}
75
Predictive Parsing Using a Table
The symbol sequence from a derivation is stored in a stack (first symbol on top):
if the top of the stack is a terminal, it should match the current token from the input
if the top of the stack is a nonterminal X and the current input token is t, we use the parse table entry M[X,t]; its rule is used as a derivation to replace X in the stack with the right-hand-side symbols
push(S);
read_next_token();
repeat
   X = pop();
   if (X is a terminal or '$')
      if (X == current_token)
         read_next_token();
      else error();
   else if (M[X,current_token] == "X ::= Y1 Y2 ... Yk")
      { push(Yk); ... push(Y1); }
   else error();
until X == '$';
76
Parsing Table Example
1) E ::= T E' $
2) E' ::= + T E'
3)     | - T E'
4)     | ε
5) T ::= F T'
6) T' ::= * F T'
7)     | / F T'
8)     | ε
9) F ::= num
10)    | id
The parse table M (rows: nonterminals, columns: input tokens; entries are rule numbers, empty entries are errors):
      num  id   +    -    *    /    $
E      1    1
E'               2    3              4
T      5    5
T'               8    8    6    7    8
F      9   10
77
Example: Parsing x-2*y$
(the top of the stack is at the right end)
Stack         current_token   Rule
E             x               M[E,id] = 1 (using E ::= T E' $)
$ E' T        x               M[T,id] = 5 (using T ::= F T')
$ E' T' F     x               M[F,id] = 10 (using F ::= id)
$ E' T' id    x               read_next_token
$ E' T'       -               M[T',-] = 8 (using T' ::= ε)
$ E'          -               M[E',-] = 3 (using E' ::= - T E')
$ E' T -      -               read_next_token
$ E' T        2               M[T,num] = 5 (using T ::= F T')
$ E' T' F     2               M[F,num] = 9 (using F ::= num)
$ E' T' num   2               read_next_token
$ E' T'       *               M[T',*] = 6 (using T' ::= * F T')
$ E' T' F *   *               read_next_token
$ E' T' F     y               M[F,id] = 10 (using F ::= id)
$ E' T' id    y               read_next_token
$ E' T'       $               M[T',$] = 8 (using T' ::= ε)
$ E'          $               M[E',$] = 4 (using E' ::= ε)
$             $               stop (accept)
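The trace above can be reproduced mechanically; the following Java sketch (illustrative only, not the project code) hard-codes the parse table M of this grammar and runs the stack algorithm from the previous slide:

import java.util.*;

public class LL1 {
    // Parse table M; entries are the right-hand sides to push.
    static Map<String, String[]> M = new HashMap<>();
    static {
        M.put("E,num", new String[]{"T","E'","$"});  M.put("E,id", new String[]{"T","E'","$"});
        M.put("E',+",  new String[]{"+","T","E'"});  M.put("E',-", new String[]{"-","T","E'"});
        M.put("E',$",  new String[]{});
        M.put("T,num", new String[]{"F","T'"});      M.put("T,id", new String[]{"F","T'"});
        M.put("T',*",  new String[]{"*","F","T'"});  M.put("T',/", new String[]{"/","F","T'"});
        M.put("T',+",  new String[]{});  M.put("T',-", new String[]{});  M.put("T',$", new String[]{});
        M.put("F,num", new String[]{"num"});         M.put("F,id", new String[]{"id"});
    }
    static boolean isNonterminal(String s) {
        return M.keySet().stream().anyMatch(k -> k.startsWith(s + ","));
    }

    // Parse a token list that ends with "$"; returns true on acceptance.
    static boolean parse(List<String> tokens) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("E");
        int pos = 0;
        while (!stack.isEmpty()) {
            String X = stack.pop(), t = tokens.get(pos);
            if (!isNonterminal(X)) {            // terminal or $: must match the input
                if (!X.equals(t)) return false;
                pos++;
            } else {
                String[] rhs = M.get(X + "," + t);
                if (rhs == null) return false;  // empty table entry: syntax error
                for (int i = rhs.length - 1; i >= 0; i--) stack.push(rhs[i]);
            }
        }
        return pos == tokens.size();
    }

    public static void main(String[] args) {
        System.out.println(parse(List.of("id","-","num","*","id","$")));  // true
        System.out.println(parse(List.of("id","+","*","$")));             // false
    }
}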
78
Constructing the Parsing Table
FIRST[a] is the set of terminals t that can result after a number of derivations on the symbol sequence a (ie, a => ... => tb for some symbol sequence b):
FIRST[ta] = {t}; eg, FIRST[3+E] = {3}
FIRST[X] = FIRST[a1] ∪ ... ∪ FIRST[an], for each production X ::= ai
FIRST[Xa] = FIRST[X], but if X has an empty derivation then FIRST[Xa] = FIRST[X] ∪ FIRST[a]
FOLLOW[X] is the set of all terminals that follow X in any legal derivation:
find all productions Z ::= a X b in which X appears at the RHS; then FIRST[b] must be included in FOLLOW[X]
if b has an empty derivation, FOLLOW[Z] must be included in FOLLOW[X]
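FIRST sets (together with the "has an empty derivation" test used above) can be computed by a simple fixed-point iteration; here is a Java sketch over the running grammar (an empty right-hand side stands for an ε production):

import java.util.*;

public class FirstSets {
    // Grammar: nonterminal -> list of right-hand sides.
    static Map<String, String[][]> rules = Map.of(
        "E",  new String[][]{{"T", "E'", "$"}},
        "E'", new String[][]{{"+", "T", "E'"}, {"-", "T", "E'"}, {}},
        "T",  new String[][]{{"F", "T'"}},
        "T'", new String[][]{{"*", "F", "T'"}, {"/", "F", "T'"}, {}},
        "F",  new String[][]{{"num"}, {"id"}});

    public static void main(String[] args) {
        Map<String, Set<String>> first = new HashMap<>();
        Set<String> nullable = new HashSet<>();
        rules.keySet().forEach(x -> first.put(x, new HashSet<>()));
        boolean changed = true;
        while (changed) {               // fixed point: iterate until nothing new is added
            changed = false;
            for (var e : rules.entrySet())
                for (String[] rhs : e.getValue()) {
                    boolean allNullable = true;
                    for (String y : rhs) {
                        Set<String> f = rules.containsKey(y)
                                      ? first.get(y) : Set.of(y);  // a terminal's FIRST is itself
                        changed |= first.get(e.getKey()).addAll(f);
                        if (!nullable.contains(y)) { allNullable = false; break; }
                    }
                    if (allNullable) changed |= nullable.add(e.getKey());
                }
        }
        System.out.println(first);      // eg FIRST[E'] = [+, -], FIRST[T] = [num, id]
    }
}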
79
Example
1) E ::= T E' $
2) E' ::= + T E'
3)     | - T E'
4)     | ε
5) T ::= F T'
6) T' ::= * F T'
7)     | / F T'
8)     | ε
9) F ::= num
10)    | id
      FIRST       FOLLOW
E     {num,id}    {}
E'    {+,-}       {$}
T     {num,id}    {+,-,$}
T'    {*,/}       {+,-,$}
F     {num,id}    {+,-,*,/,$}
80
Constructing the Parsing Table (cont.)
For each rule X ::= a do:
for each t in FIRST[a], add X ::= a to M[X,t]
if a can be reduced to the empty sequence, then for each t in FOLLOW[X], add X ::= a to M[X,t]
Applying these two steps to the grammar and the FIRST/FOLLOW table above yields exactly the parse table M shown earlier.
81
Another Example
G ::= S $
S ::= ( L ) | a
L ::= L , S | S
After left recursion elimination:
0) G ::= S $
1) S ::= ( L )
2) S ::= a
3) L ::= S L'
4) L' ::= , S L'
5) L' ::= ε
(the parse table has columns ( ) a , $ and rows G S L L'; figure not shown)
82
LL(1) A grammar is called LL(1) if each entry of the parsing table of the grammar contains at most one production:
the first L in LL(1) means that we read the input from left to right
the second L means that it uses leftmost derivations only
the number 1 means that we look at most one token ahead in the input
83
Parsing #2 Leonidas Fegaras
84
Bottom-up Parsing Rightmost derivations, with rules used from right to left. It uses a stack to push symbols; the concatenation of the stack symbols with the rest of the input forms a valid bottom-up derivation. For example, for input x-2*y$, when the stack contains E - num and the remaining input is * id $, the derivation is E - num * id $. Two operations:
reduction: if a postfix of the stack matches the RHS of a rule (called a handle), replace the handle with the LHS nonterminal of the rule; eg, reduce the stack x * E + E by the rule E ::= E + E; new stack: x * E
shifting: if no handle is found, push the current input token on the stack and read the next token
Also known as shift-reduce parsing
85
Example Input: x-2*y$ (tokenized as id - num * id $)
0) S ::= E $
1) E ::= E + T
2) E ::= E - T
3) E ::= T
4) T ::= T * F
5) T ::= T / F
6) T ::= F
7) F ::= num
8) F ::= id
    Stack        rest of the input   Action
1)               id - num * id $     shift
2)  id           - num * id $        reduce by rule 8
3)  F            - num * id $        reduce by rule 6
4)  T            - num * id $        reduce by rule 3
5)  E            - num * id $        shift
6)  E -          num * id $          shift
7)  E - num      * id $              reduce by rule 7
8)  E - F        * id $              reduce by rule 6
9)  E - T        * id $              shift
10) E - T *      id $                shift
11) E - T * id   $                   reduce by rule 8
12) E - T * F    $                   reduce by rule 4
13) E - T        $                   reduce by rule 2
14) E            $                   shift
15) E $                              accept (reduce by rule 0, giving S)
86
Machinery We need to decide when to shift and when to reduce (and, if we reduce, by which rule); we use a DFA to recognize handles. Example:
0) S ::= R $
1) R ::= R b
2) R ::= a
state 2: accept (reduce by rule 0)
state 3: reduce by rule 2
state 4: reduce by rule 1
The DFA is represented by an ACTION and a GOTO table, and the stack now contains state numbers:
         ACTION            GOTO
      a     b     $     S     R
0    s3                       1
1          s4    s2
2    a     a     a
3    r2    r2    r2
4    r1    r1    r1
87
The Shift-Reduce Parser
push(0);
read_next_token();
for (;;)
{  s = top();    /* the current state is taken from the top of the stack */
   if (ACTION[s,current_token] == 'si')      /* shift and go to state i */
   {  push(i);
      read_next_token();
   }
   else if (ACTION[s,current_token] == 'ri') /* reduce by rule i: X ::= A1...An */
   {  perform pop() n times;
      s = top();         /* restore the state before the reduction from the top of the stack */
      push(GOTO[s,X]);   /* push the state after the reduction */
   }
   else if (ACTION[s,current_token] == 'a')
      success!!
   else error();
};
88
Example: parsing abb$
         ACTION            GOTO
      a     b     $     S     R
0    s3                       1
1          s4    s2
2    a     a     a
3    r2    r2    r2
4    r1    r1    r1
Stack    rest of input   Action
0        abb$            s3
0 3      bb$             r2 (pop, push GOTO[0,R]=1 since R ::= a)
0 1      bb$             s4
0 1 4    b$              r1 (pop twice, push GOTO[0,R]=1 since R ::= R b)
0 1      b$              s4
0 1 4    $               r1 (pop twice, push GOTO[0,R]=1 since R ::= R b)
0 1      $               s2
0 1 2                    accept
89
Table Construction Problem: given a CFG, construct the finite automaton (DFA) that recognizes handles. The DFA states are itemsets (sets of items). An item is a rule with a dot somewhere in the RHS; eg, the possible items for the rule E ::= E + E are:
E ::= . E + E
E ::= E . + E
E ::= E + . E
E ::= E + E .
The dot indicates how far we have progressed using this rule to parse the input; eg, the item E ::= E + . E indicates that we are using the rule E ::= E + E, we have parsed E, we have seen the token +, and we are ready to parse another E.
90
Table Construction (cont.)
The items in an itemset indicate different possibilities that will have to be resolved later by reading more input tokens; eg, the itemset:
T ::= ( E . )
E ::= E . + T
corresponds to a DFA state where we don't know yet whether we are looking at an ( E ) handle or an E + T handle:
it will be ( E ) if the next token is )
it will be E + T if the next token is +
When the dot is at the end of an item, we have found a handle; eg, T ::= ( E ) . corresponds to a reduction by T ::= ( E )
reduce/reduce conflict: an itemset should never have more than one item with a dot at the end; otherwise we can't choose a handle
91
Closure of an Itemset The closure of an item
X ::= a . t b, where t is a terminal, is the singleton set that contains the item X ::= a . t b only
X ::= a . Y b, where Y is a nonterminal, is the set consisting of the item itself, plus all rules for Y with the dot at the beginning of the RHS, plus the closures of those items
eg, the closure of the item E ::= E + . T is the set:
E ::= E + . T
T ::= . T * F
T ::= . T / F
T ::= . F
F ::= . num
F ::= . id
The closure of an itemset is the union of the closures of all items in the itemset.
92
Constructing the DFA The initial state of the DFA (state 0) is the closure of the item S ::= . a, where S ::= a is the first rule of the grammar For each itemset, if there is an item X ::= a . s b in an itemset, where s is a symbol, we have a transition labeled by s to an itemset that contains X ::= a s . b But if we have more than one item with a dot before the same symbol s, say X ::= a . s b and Y ::= c . s d, then the new itemset contains both X ::= a s . b and Y ::= c s . d we need to get the closure of the new itemset we need to check if this itemset has appeared before so that we don't create it again
93
Example #1 0) S ::= R $ 1) R ::= R b 2) R ::= a
94
Example #2 0) S' ::= S $ 1) S ::= B B 2) B ::= a B 3) B ::= c
95
Example #3 S ::= E $ E ::= ( L ) E ::= ( ) E ::= id L ::= L , E
96
LR(0) If an itemset has more than one reduction (an item with the dot at the end), it has a reduce/reduce conflict. If an itemset has at least one shift (an outgoing transition to another state) and at least one reduction, it has a shift/reduce conflict. A grammar is LR(0) if its DFA has no reduce/reduce or shift/reduce conflicts.
1) S ::= E $
2) E ::= E + T
3)   | T
4) T ::= T * F
5)   | F
6) F ::= id
7)   | ( E )
The initial itemset (the closure of S ::= . E $) is:
S ::= . E $
E ::= . E + T
E ::= . T
T ::= . T * F
T ::= . F
F ::= . id
F ::= . ( E )
and the itemset reached on T has a shift/reduce conflict:
E ::= T .
T ::= T . * F
97
SLR(1) There is an easy fix for some of the shift/reduce or reduce/reduce conflicts; it requires looking one token ahead (the lookahead token). Steps to resolve the conflicts of an itemset:
for each shift item Y ::= b . c, find FIRST(c)
for each reduction item X ::= a ., find FOLLOW(X)
if the FOLLOW(X) sets do not overlap with each other or with any of the FIRST(c) sets, you have resolved the conflict!
eg, for the itemset with E ::= T . and T ::= T . * F:
FOLLOW(E) = { $, +, ) }
FIRST(* F) = { * }
no overlap! This is an SLR(1) grammar, which is more powerful than LR(0).
98
LR(1) The SLR(1) trick doesn't always work. Consider:
S ::= E $
E ::= L = R | R
L ::= * R | id
R ::= L
The initial itemset is:
S ::= . E $
E ::= . L = R
E ::= . R
L ::= . * R
L ::= . id
R ::= . L
and the itemset reached on L has a shift/reduce conflict that SLR(1) cannot resolve, because = is in FOLLOW(R):
E ::= L . = R
R ::= L .
For a reduction item X ::= a . we need a smaller (more precise) set of lookahead tokens to tell us when to reduce:
they are called expected lookahead tokens
they must be a subset of or equal to FOLLOW(X) (hopefully a proper subset)
they are context-sensitive => finer control
99
LR(1) Items Now each item is associated with expected lookahead tokens
eg, in
L ::= * . R   =$
the tokens = and $ are the expected lookahead tokens. They are only useful in a reduction item:
L ::= * R .   =$
indicates that we reduce when the lookahead token from the input is = or $.
LR(1) grammar: at each shift/reduce or reduce/reduce conflict, the expected lookahead tokens of the reductions must not overlap with the first tokens after the dot (ie, the FIRST(c) in Y ::= b . c).
Rules for constructing the LR(1) DFA:
for a transition from A ::= a . s b over a symbol s, propagate the expected lookaheads
when you add the item B ::= . c to form the closure of A ::= a . B b with expected lookahead tokens t, s, ..., the expected lookahead tokens of B ::= . c are FIRST(bt) ∪ FIRST(bs) ∪ ...
100
Example
S ::= E $
E ::= L = R | R
L ::= * R | id
R ::= L
The initial LR(1) itemset (expected lookaheads after each item; ? is a don't-care):
S ::= . E $      ?
E ::= . L = R    $
E ::= . R        $
L ::= . * R      =$
L ::= . id       =$
R ::= . L        $
The itemset reached on L is:
E ::= L . = R    $
R ::= L .        $
Now the reduction's expected lookahead $ does not overlap with the shift token =, so the conflict is resolved.
101
LALR(1) If the lookaheads s1 and s2 are different, then the items A ::= a . s1 and A ::= a . s2 are different; this results in a large number of states, since the number of combinations of expected lookahead symbols can be very large. We can combine the two states into one by creating an item A ::= a . s3, where s3 is the union of s1 and s2.
LALR(1) is weaker than LR(1) but more powerful than SLR(1)
LALR(1) and LR(0) have the same number of states
Easy construction of LALR(1) itemsets:
start with LR(0) items and propagate lookaheads as in LR(1)
don't create a new itemset if the LR(0) items are the same; just union together the lookaheads of the corresponding items
you may have to propagate lookaheads by looping through the same itemsets until you cannot add any more
Most parser generators are LALR(1), including CUP
102
Example
S ::= E $
E ::= E + E | E * E | ( E ) | id | num
The initial itemset (with expected lookaheads) is:
S ::= . E $      ?
E ::= . E + E    +*$
E ::= . E * E    +*$
E ::= . ( E )    +*$
E ::= . id       +*$
E ::= . num      +*$
On E we reach:
S ::= E . $      ?
E ::= E . + E    +*$
E ::= E . * E    +*$
and from there, on *:
E ::= E * . E    +*$
E ::= . E + E    +*$
E ::= . E * E    +*$
E ::= . ( E )    +*$
E ::= . id       +*$
E ::= . num      +*$
and then, on E:
E ::= E * E .    +*$
E ::= E . + E    +*$
E ::= E . * E    +*$
which has shift/reduce conflicts over the lookaheads + and *.
103
Practical Considerations
How to avoid reduce/reduce and shift/reduce conflicts:
left recursion is good, right recursion is bad:
left recursion uses less stack than right recursion
left recursion produces left associative trees, right recursion produces right associative trees
eg, right recursive: L ::= id , L | id versus left recursive: L ::= L , id | id
Most shift/reduce conflicts are easy to remove by assigning precedence and associativity to operators; eg, for
S ::= E $
E ::= E + E | E * E | ( E ) | id | num
declare that + and * are left-associative and that * has higher precedence than +
104
Practical Considerations (cont.)
How do precedence and associativity work?
the precedence and associativity of a rule come from the last terminal at the RHS of the rule; eg, the rule E ::= E + E has the same precedence and associativity as +
you can force the precedence of a rule in CUP; eg, E ::= MINUS E %prec UMINUS
in a state with a shift/reduce conflict, when you are reading a token t:
if the precedence of t is higher than that of the reduction rule, you shift
if the precedence of t is lower than that of the reduction rule, you reduce
if the precedence of t is equal to that of the reduction rule, you reduce if the rule has left associativity, otherwise you shift
Reduce/reduce conflicts are hopeless: the parser generator always reduces using the rule listed first, which is rarely what you want (treat such a conflict as a fatal error)
105
Error Recovery All the empty entries in the ACTION and GOTO tables correspond to syntax errors. We can either:
report the error and stop the parsing, or
continue parsing, finding more errors (error recovery)
Error recovery in CUP uses the special error symbol; eg:
S ::= L = E ;
  | { SL } ;
  | error ;
SL ::= S ;
  | SL S ;
In case of an error, the parser pops elements from the stack until it finds a state where it can proceed over the error symbol (one containing an item such as S ::= . error ;), then it discards tokens from the input until a restart is possible.
106
The Calculator Parser
terminal LP, RP, COMMA, SEMI, ASSIGN, IF, THEN, ELSE, AND, OR, NOT, QUIT, PLUS, TIMES, MINUS, DIV, EQ, LT, GT, LE, NE, GE, FALSE, TRUE, DEFINE;
terminal String ID;
terminal Integer INT;
terminal Float REALN;
terminal String STRINGT;
non terminal exp, string, name;
non terminal expl, names;
non terminal item, prog;
precedence nonassoc ELSE;
precedence right OR;
precedence right AND;
precedence nonassoc NOT;
precedence left EQ, LT, GT, LE, GE, NE;
precedence left PLUS, MINUS;
precedence left TIMES, DIV;
107
The Calculator Parser (cont.)
start with prog;
prog ::= item SEMI
       | prog item SEMI
       ;
item ::= exp
       | QUIT
       | ID ASSIGN exp
       | DEFINE ID LP names RP EQ exp
       ;
name ::= ID ;
string ::= STRINGT ;
expl ::= expl COMMA exp
       | exp
       ;
names ::= names COMMA name
        | name
        ;
108
The Calculator Parser (cont.)
exp ::= INT
      | REALN
      | TRUE
      | FALSE
      | name
      | string
      | LP exp RP
      | IF exp THEN exp ELSE exp
      | exp EQ exp
      | exp LT exp
      | exp GT exp
      | exp LE exp
      | exp NE exp
      | exp GE exp
      | exp PLUS exp
      | exp MINUS exp
      | exp TIMES exp
      | exp DIV exp
      | exp OR exp
      | exp AND exp
      | NOT exp
      | name LP expl RP
      ;
109
Abstract Syntax Leonidas Fegaras
110
Abstract Syntax Tree (AST)
A parser typically generates an Abstract Syntax Tree (AST); a parse tree is not an AST. For example, for the input x + y * z the parse tree contains all the intermediate nonterminals (E, T, F), while the AST is just the tree +(x, *(y, z)): the operator + at the root, with x as its left child and the subtree *(y, z) as its right child.
111
Building Abstract Syntax Trees in Java
abstract class Exp {}
class IntegerExp extends Exp {
    public int value;
    public IntegerExp ( int n ) { value=n; }
}
class TrueExp extends Exp {
    public TrueExp () {}
}
class FalseExp extends Exp {
    public FalseExp () {}
}
class VariableExp extends Exp {
    public String value;
    public VariableExp ( String n ) { value=n; }
}
112
Exp (cont.)
class BinaryExp extends Exp {
    public String operator;
    public Exp left;
    public Exp right;
    public BinaryExp ( String o, Exp l, Exp r ) { operator=o; left=l; right=r; }
}
class UnaryExp extends Exp {
    public String operator;
    public Exp operand;
    public UnaryExp ( String o, Exp e ) { operator=o; operand=e; }
}
class ExpList {
    public Exp head;
    public ExpList next;
    public ExpList ( Exp h, ExpList n ) { head=h; next=n; }
}
113
Exp (cont.)
class CallExp extends Exp {
    public String name;
    public ExpList arguments;
    public CallExp ( String nm, ExpList s ) { name=nm; arguments=s; }
}
class ProjectionExp extends Exp {
    public Exp value;
    public String attribute;
    public ProjectionExp ( Exp v, String a ) { value=v; attribute=a; }
}
114
Exp (cont.)
class RecordElements {
    public String attribute;
    public Exp value;
    public RecordElements next;
    public RecordElements ( String a, Exp v, RecordElements el ) { attribute=a; value=v; next=el; }
}
class RecordExp extends Exp {
    public RecordElements elements;
    public RecordExp ( RecordElements el ) { elements=el; }
}
115
Examples The AST for the input (x-2)+3
new BinaryExp("+", new BinaryExp("-", new VariableExp("x"), new IntegerExp(2)), new IntegerExp(3)) The AST for the input f(x.A,true) new CallExp(“f”, new ExpList(new ProjectionExp(new VariableExp("x"), “A”), new ExpList(new TrueExp(),null)))
116
Gen A Java package for constructing and manipulating ASTs
you are required to use Gen for your project
it is basically a Java preprocessor that adds syntactic constructs to the Java language to make the task of handling ASTs easier
it uses a universal class Ast to capture any kind of AST
it supports easy construction of ASTs using the #<...> syntax
it supports pattern matching, editing, pretty-printing, etc
it includes a symbol table class
Architecture: a file.gen source file is translated by Gen into file.java, which javac compiles into file.class
117
The Gen Ast Class
abstract class Ast {}
class Number extends Ast {
    public long value;
    public Number ( long n ) { value = n; }
}
class Real extends Ast {
    public double value;
    public Real ( double n ) { value = n; }
}
class Variable extends Ast {
    public String value;
    public Variable ( String s ) { value = s; }
}
class Astring extends Ast {
    public String value;
    public Astring ( String s ) { value = s; }
}
118
AST Nodes are Instances of Node
class Node extends Ast {
    public String name;
    public Arguments args;
    public Node ( String n, Arguments a ) { name = n; args = a; }
}
class Arguments {
    public Ast head;
    public Arguments tail;
    public Arguments ( Ast h, Arguments t );
    public final static Arguments nil;
    public Arguments append ( Ast e );
}
119
Example To construct the AST Binop(Plus,x,Binop(Minus,y,z)) in Java, use:
new Node("Binop",
         Arguments.nil.append(new Variable("Plus"))
                      .append(new Variable("x"))
                      .append(new Node("Binop",
                                       Arguments.nil.append(new Variable("Minus"))
                                                    .append(new Variable("y"))
                                                    .append(new Variable("z")))))
Ugly! You should never write this kind of code in your project.
120
The #< > Brackets
When you write #<Binop(Plus,x,Binop(Minus,y,z))> in your Gen file, it generates the following Java code:
new Node("Binop",
         Arguments.nil.append(new Variable("Plus"))
                      .append(new Variable("x"))
                      .append(new Node("Binop",
                                       Arguments.nil.append(new Variable("Minus"))
                                                    .append(new Variable("y"))
                                                    .append(new Variable("z")))))
which represents the AST Binop(Plus,x,Binop(Minus,y,z)).
121
Escaping a Value Using Backquote
Objects of the class Ast can be included in the form generated by the #< > brackets by "escaping" them with a backquote (`). The operand of the escape operator is expected to be an object of class Ast that provides the value to "fill in" the hole in the bracketed text at that point (actually, an escaped string/int/double value is also lifted to an Ast). For example:
Ast x = #<join(a,b,p)>;
Ast y = #<select(`x,q)>;
Ast z = #<project(`y,A)>;
are equivalent to:
Ast y = #<select(join(a,b,p),q)>;
Ast z = #<project(select(join(a,b,p),q),A)>;
122
BNF of #< >:
bracketed ::= "#<" expr ">"                     an AST construction
            | "#[" arg "," ... "," arg "]"      an Arguments construction
expr ::= name                                   the representation of a variable name
       | integer                                the repr. of an integer
       | real                                   the repr. of a real number
       | string                                 the repr. of a string
       | "`" name                               escaping to the value of name
       | "`(" code ")"                          escaping to the value of code
       | name "(" arg "," ... "," arg ")"       the repr. of an AST node with >=0 children
       | "`" name "(" arg "," ... "," arg ")"   the repr. of an AST node with escaped name
       | expr opr expr                          an AST node that represents a binary infix opr
       | "`" name "[" expr "]"                  variable substitution
arg ::= expr                                    the repr. of an expression
      | "..." name                              escaping to a list of ASTs bound to name
      | "...(" code ")"                         escaping to a list of ASTs returned by code
123
“...” is for Arguments The three dots (...) construct is used to indicate a list of children in an AST node; the name in "...name" must be an instance of the class Arguments. For example, in
Arguments r = #[join(a,b,p),select(c,q)];
Ast z = #<project(...r)>;
z will be bound to #<project(join(a,b,p),select(c,q))>
124
Example For example,
#<`f(6,...r,g("ab",`(k(x))),`y)>
is equivalent to the following Java code:
new Node(f,
         Arguments.nil.append(new Number(6))
                      .append(r)
                      .append(new Node("g",Arguments.nil.append(new Astring("ab"))
                                                        .append(k(x))))
                      .append(y))
If f="h", r=#[2,z], y=#<m(1,"a")>, and k(x) returns the value #<8>, then the above term is equivalent to #<h(6,2,z,g("ab",8),m(1,"a"))>
125
Pattern Matching Gen provides a case statement syntax with patterns
Patterns match the Ast representations with similar shape Escape operators applied to variables inside these patterns represent variable patterns, which “bind” to corresponding subterms upon a successful match This capability makes it particularly easy to write functions that perform source-to-source transformations
126
Example A function that simplifies arithmetic expressions:
Ast simplify ( Ast e ) {
  #case e
  | plus(`x,0) => return x;
  | times(`x,1) => return x;
  | times(`x,0) => return #<0>;
  | _ => return e;
  #end;
}
where the _ pattern matches any value. For example, simplify(#<times(z,1)>) returns #<z>
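For comparison, here is roughly what such a match has to check when written by hand against the Ast/Node/Arguments classes shown earlier (a sketch covering the first two cases only; the code Gen actually generates may differ):

class SimplifyByHand {
    // Note: Number here is the Gen Ast subclass, not java.lang.Number.
    static Ast simplify ( Ast e ) {
        if (e instanceof Node n && n.args != null && n.args.tail != null) {
            if (n.name.equals("plus")
                && n.args.tail.head instanceof Number z && z.value == 0)
                return n.args.head;               // plus(`x,0) => x
            if (n.name.equals("times")
                && n.args.tail.head instanceof Number o && o.value == 1)
                return n.args.head;               // times(`x,1) => x
        }
        return e;                                 // _ => e
    }
}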
127
BNF
case_stmt ::= "#case" code case ... case "#end"
case ::= "|" expr guard "=>" code
guard ::= ":" code                              an optional condition
        |
expr ::= name                                   exact match with a variable name
       | integer                                exact match with an integer
       | real                                   exact match with a real number
       | string                                 exact match with a string
       | "`" name                               match with the value of name
       | "`(" code ")"                          match with the value of code
       | name "(" arg "," ... "," arg ")"       match with an AST node with zero or more children
       | "`" name "(" arg "," ... "," arg ")"   match with an AST node with escaped name
       | expr opr expr                          an AST node that represents a binary infix operation
       | "`" name "[" expr "]"                  second-order matching
       | "_"                                    match any Ast
arg ::= expr                                    match with an Ast
      | "..." name                              match with a list of ASTs bound to name
      | "...(" code ")"                         match with a list of ASTs returned by code
      | "..."                                   match the rest of the arguments
128
Examples The pattern `f(...r) matches any Ast Node. When it is matched with #<join(a,b,c)>, it binds:
f to the string "join"
r to the Arguments #[a,b,c]
The following function adds the terms #<8> and #<9> as children to any Node e:
Ast add_arg ( Ast e ) {
  #case e
  | `f(...r) => return #<`f(8,9,...r)>;
  | `x => return x;
  #end;
}
129
Another Example The following function switches the inputs of a binary join found as a parameter of a Node e:
Ast switch_join_args ( Ast e ) {
  #case e
  | `f(...r,join(`x,`y),...s) => return #<`f(...r,join(`y,`x),...s)>;
  | `x => return x;
  #end;
}
130
Second-Order Pattern Matching
When `f[expr] is matched against an Ast e, it traverses the entire tree representation of e (in preorder) until it finds a tree node that matches the pattern expr:
it fails when it does not find a match
when it finds a match, it succeeds and binds the variables in the pattern expr
it also binds the variable f to a list of Ast (of class Arguments) that represents the path from the root Ast to the Ast node that matched the pattern
This is best used in conjunction with the bracketed expression `f[e], which uses the path bound in f to construct a new Ast with the matched subterm replaced by e.
131
Misc Another syntactic construct in Gen is a for-loop that iterates over Arguments:
"#for" name "in" code "do" code "#end"
For example:
#for v in #[a,b,c] do
   System.out.println(v);
#end;
132
Adding Semantic Actions to a Parser
Grammar:
E ::= T E'
E' ::= + T E' | - T E' | ε
T ::= num
Recursive descent parser with semantic actions (an interpreter that returns the computed value):
int E () { return Eprime(T()); };
int Eprime ( int left ) {
   if (current_token=='+') {
      read_next_token();
      return Eprime(left + T());
   } else if (current_token=='-') {
      read_next_token();
      return Eprime(left - T());
   } else return left;
};
int T () {
   if (current_token=='num') {
      int n = num_value;
      read_next_token();
      return n;
   } else error();
};
133
Table-Driven Predictive Parsers
use the parse stack to push/pop both actions and symbols, but use a separate semantic stack to execute the actions:
push(S);
read_next_token();
repeat
   X = pop();
   if (X is a terminal or '$')
      if (X == current_token)
         read_next_token();
      else error();
   else if (X is an action)
      perform the action;
   else if (M[X,current_token] == "X ::= Y1 Y2 ... Yk")
      { push(Yk); ... push(Y1); }
   else error();
until X == '$';
134
Example Need to embed actions { code; } in the grammar rules
Suppose that pushV and popV are the functions that manipulate the semantic stack. The following is the grammar of an interpreter that uses the semantic stack to perform additions and subtractions:
E ::= T E' $ { print(popV()); }
E' ::= + T { pushV(popV() + popV()); } E'
     | - T { pushV(-popV() + popV()); } E'
     | ε
T ::= num { pushV(num); }
For example, for 1+5-2, we have the following sequence of actions:
pushV(1); pushV(5); pushV(popV()+popV()); pushV(2); pushV(-popV()+popV()); print(popV());
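A runnable Java sketch of this semantic stack (the pushV/popV names mirror the grammar actions above; hypothetical, for illustration):

import java.util.ArrayDeque;
import java.util.Deque;

public class SemanticStack {
    static Deque<Integer> values = new ArrayDeque<>();
    static void pushV(int v) { values.push(v); }
    static int popV() { return values.pop(); }

    public static void main(String[] args) {
        // The action sequence the parser executes for 1+5-2:
        pushV(1);
        pushV(5);
        pushV(popV() + popV());      // E' ::= + T ... reduces 1+5 to 6
        pushV(2);
        pushV(-popV() + popV());     // E' ::= - T ... reduces 6-2 to 4 (note the operand order)
        System.out.println(popV());  // prints 4
    }
}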
135
Bottom-Up Parsers can only perform an action after a reduction
We can only have rules of the form
X ::= Y1 ... Yn { action }
where the action is always at the end of the rule; this action is evaluated after the rule X ::= Y1 ... Yn is reduced. How? In addition to state numbers, the parser pushes values onto the parse stack. If we want to put an action in the middle of the RHS of a rule, we use a dummy nonterminal, called a marker. For example,
X ::= a { action } b
is equivalent to:
X ::= M b
M ::= a { action }
136
CUP Both terminals and non-terminals are associated with typed values
these values are instances of the Object class (or of some subclass of Object)
the value associated with a terminal is in most cases an Object, except for an identifier, which is a String, an integer, which is an Integer, etc
the typical values associated with non-terminals in a compiler are ASTs, lists of ASTs, etc
You can retrieve the value of a symbol s on the RHS of a rule by using the notation s:x, where x is a variable name that hasn't appeared elsewhere in this rule.
The value of the non-terminal defined by a rule is called RESULT and should always be assigned a value in the action; eg, if the non-terminal E is associated with an Integer object, then
E ::= E:n PLUS E:m {: RESULT = n+m; :}
137
Machinery The parse stack elements are of type
struct( state: int, value: Object )
where state is the state number and value is the semantic value. When a reduction occurs, the RESULT value is calculated from the values in the stack and is pushed along with the GOTO state. Example: after the reduction by
E ::= E:n PLUS E:m {: RESULT = n+m; :}
the RESULT value is stack[top-2].value + stack[top].value, which is the new value pushed onto the stack along with the GOTO state.
138
ASTs in CUP We need to associate each non-terminal symbol with an AST type:
non terminal Ast exp;
non terminal Arguments expl;
exp ::= exp:e1 PLUS exp:e2   {: RESULT = new Node(plus_exp,e1,e2); :}
      | exp:e1 MINUS exp:e2  {: RESULT = new Node(minus_exp,e1,e2); :}
      | id:nm LP expl:el RP  {: RESULT = new Node(call_exp,el.reverse()
                                                  .cons(new Variable(nm))); :}
      | INT:n                {: RESULT = new Number(n.intValue()); :}
      ;
expl ::= expl:el COMMA exp:e {: RESULT = el.cons(e); :}
       | exp:e               {: RESULT = nil.cons(e); :}
       ;
139
Semantic Analysis Leonidas Fegaras
140
Type Checking
(figure: the front-end pipeline: source file, scanner, parser, type checking, with the parser getting tokens from the scanner, the type checker working on the AST and the symbol table, and reporting type errors)
Type checking: checking whether the use of names is consistent with their declaration in the program
  int x;
  x := x+1;    correct use of x
  x.A := 1;    type errors
  x[0] := 0;
Statically typed languages: type checking is done at compile time, not at run time
Need to remember declarations: symbol table
141
Symbol Table
A compile-time data structure used to map names into declarations
It stores:
  for each type name, its type definition
    eg, for the C type declaration typedef int* mytype, it maps the name mytype to a data structure that represents the type int*
  for each variable name, its type
    if the variable is an array, it also stores dimension information
    it may also store storage class, offset in activation record, etc
  for each constant name, its type and value
  for each function and procedure, its formal parameter list and its output type
    each formal parameter must have a name, a type, and a type of passing (by-reference, by-value, etc)
142
Symbol Table (cont.) Need to capture nested scopes, if necessary
  { int a;
    a = 1;
  };
  a = 2;     -- a is out of scope here
Interface:
  void insert ( String key, Object binding )
  Object lookup ( String key )
  begin_scope ()
  end_scope ()
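A minimal sketch of these four operations, implemented here with a stack of hash maps for brevity (the implementation on the next slides uses a single hash table plus a scope stack instead):

  import java.util.*;

  class Scopes {
    private final Deque<Map<String,String>> scopes = new ArrayDeque<>();
    void begin_scope () { scopes.push(new HashMap<>()); }
    void end_scope ()   { scopes.pop(); }
    void insert ( String key, String binding ) { scopes.peek().put(key,binding); }
    String lookup ( String key ) {
      for (Map<String,String> s : scopes)   // innermost scope first
        if (s.containsKey(key)) return s.get(key);
      return null;
    }
    public static void main ( String[] args ) {
      Scopes st = new Scopes();
      st.begin_scope();                     // { int a;
      st.insert("a","int");
      System.out.println(st.lookup("a"));   //   a = 1;  -> int
      st.end_scope();                       // }
      System.out.println(st.lookup("a"));   // a = 2;   -> null: undeclared
    }
  }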
143
The Gen Symbol Table
  class SymbolCell {
    String name;
    Ast binding;
    SymbolCell next;
    SymbolCell ( String n, Ast v, SymbolCell r ) { name=n; binding=v; next=r; }
  }
  public class SymbolTable {
    final int symbol_table_size = 997;
    SymbolCell[] symbol_table = new SymbolCell[symbol_table_size];
    final int scope_stack_length = 100;
    int scope_stack_top = 0;
    int[] scope_stack = new int[scope_stack_length];
    public SymbolTable () { scope_stack_top = 0; }
144
The Gen Symbol Table (cont.)
  int hash ( String s ) { return Math.abs(s.hashCode()) % symbol_table_size; }

  public void insert ( String key, Ast binding ) {
    int loc = hash(key);
    symbol_table[loc] = new SymbolCell(key,binding,symbol_table[loc]);
    if (scope_stack_top >= scope_stack_length)
       fatal_error("stack overflow",new Variable(key));
    else scope_stack[scope_stack_top++] = loc;
  }

  public Ast lookup ( String key ) {
    int loc = hash(key);
    for (SymbolCell s = symbol_table[loc]; s != null; s=s.next)
        if (s.name.equals(key))
           return s.binding;
    return null;
  }
145
The Gen Symbol Table (cont.)
  public void begin_scope () {
    if (scope_stack_top >= scope_stack_length)
       fatal_error("stack overflow",new Number(0));
    else scope_stack[scope_stack_top++] = -1;    // push a scope marker
  }

  public void end_scope () {
    int i = scope_stack_top-1;
    for (; i>=0 && scope_stack[i]>=0; i--) {
        int loc = scope_stack[i];
        symbol_table[loc] = symbol_table[loc].next;   // remove the binding
    };
    scope_stack_top = i;    // pop the entries up to and including the marker
  }
146
Example
For the program fragment
  { int a; a = 1; };
  a = 2;
with hash("a")=12:
  begin_scope: push(-1)
  insert a: insert the binding a:int at the front of the table[12] list; push(12)
  end_scope: pop(); remove the head of the table[12] list
147
Type ASTs
A typechecker is a function that maps an AST that represents an expression into its type
Need to define the data structures for types:
  abstract class Type { }
  class IntegerType extends Type {
    public IntegerType () {}
  }
  class BooleanType extends Type {
    public BooleanType () {}
  }
  class NamedType extends Type {
    public String name;
    public NamedType ( String n ) { name=n; }
  }
  class ArrayType extends Type {
    public Type element;
    public ArrayType ( Type et ) { element=et; }
  }
148
Type ASTs (cont.)
  class RecordComponents {
    public String attribute;
    public Type type;
    public RecordComponents next;
    public RecordComponents ( String a, Type t, RecordComponents el ) { attribute=a; type=t; next=el; }
  }
  class RecordType extends Type {
    public RecordComponents elements;
    public RecordType ( RecordComponents el ) { elements=el; }
  }
149
Declarations
The symbol table must contain type declarations (ie, typedefs), variable declarations, constant declarations, and function signatures:
  class SymbolCell {
    String name;
    Declaration binding;
    SymbolCell next;
    SymbolCell ( String n, Declaration v, SymbolCell r ) { name=n; binding=v; next=r; }
  }
  SymbolCell[] symbol_table = new SymbolCell[SIZE];
150
Declarations (cont.)
  abstract class Declaration { }
  class TypeDeclaration extends Declaration {
    public Type declaration;
    public TypeDeclaration ( Type t ) { declaration=t; }
  }
  class VariableDeclaration extends Declaration {
    public Type declaration;
    public VariableDeclaration ( Type t ) { declaration=t; }
  }
  class ConstantDeclaration extends Declaration {
    public Type declaration;
    public Exp value;
    public ConstantDeclaration ( Type t, Exp v ) { declaration=t; value=v; }
  }
151
Declarations (cont.)
  class TypeList {
    public Type head;
    public TypeList next;
    public TypeList ( Type h, TypeList n ) { head=h; next=n; }
  }
  class FunctionDeclaration extends Declaration {
    public Type result;
    public TypeList parameters;
    public FunctionDeclaration ( Type t, TypeList tl ) { result=t; parameters=tl; }
  }
152
Typechecking
A tree traversal that checks each node of the AST recursively:
  static Type typecheck ( Exp e ) {
    if (e instanceof IntegerExp)
       return new IntegerType();
    else if (e instanceof TrueExp)
       return new BooleanType();
    else if (e instanceof FalseExp)
       return new BooleanType();
    else if (e instanceof VariableExp) {
       VariableExp v = (VariableExp) e;
       Declaration decl = lookup(v.value);
       if (decl == null)
          error("undefined variable");
       else if (decl instanceof VariableDeclaration)
          return ((VariableDeclaration) decl).declaration;
       else error("this name is not a variable name");
153
Typechecking: BinaryExp
    } else if (e instanceof BinaryExp) {
       BinaryExp b = (BinaryExp) e;
       Type left = typecheck(b.left);
       Type right = typecheck(b.right);
       switch ( b.operator ) {
         case "+": if (left instanceof IntegerType && right instanceof IntegerType)
                      return new IntegerType();
                   else error("expected integers in addition");
         ...
       }
154
Typechecking: CallExp
    } else if (e instanceof CallExp) {
       CallExp c = (CallExp) e;
       Declaration decl = lookup(c.name);
       if (decl == null)
          error("undefined function");
       else if (!(decl instanceof FunctionDeclaration))
          error("this name is not a function name");
       FunctionDeclaration f = (FunctionDeclaration) decl;
       TypeList s = f.parameters;
       ExpList r = c.arguments;
       for (; r!=null && s!=null; r=r.next, s=s.next)
           if (!equal_types(s.head,typecheck(r.head)))
              error("wrong type of the argument in function call");
       if (r != null || s != null)
          error("wrong number of parameters");
       return f.result;
    }
equal_types(x,y) checks the types x and y for equality
Two kinds of type equality: equality based on type name equivalence, or based on structural equivalence
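One possible sketch of equal_types over the Type classes defined earlier, using structural equivalence (record types are omitted for brevity; name equivalence would compare NamedType names, or look them up in the symbol table first):

  static boolean equal_types ( Type x, Type y ) {
    if (x instanceof IntegerType && y instanceof IntegerType) return true;
    if (x instanceof BooleanType && y instanceof BooleanType) return true;
    if (x instanceof NamedType && y instanceof NamedType)
       return ((NamedType) x).name.equals(((NamedType) y).name);
    if (x instanceof ArrayType && y instanceof ArrayType)
       // arrays are equal when their element types are equal
       return equal_types(((ArrayType) x).element,((ArrayType) y).element);
    return false;
  }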
155
The Calculator Interpreter
Evaluate an expression e using a symbol table st:
  static double eval ( Ast e, SymbolTable st ) {
    if (e instanceof Number)
       return (double) ((Number) e).value();
    else if (e instanceof Real)
       return ((Real) e).value();
    else if (e instanceof Astring)
       return error("Strings are not permitted",e);
    else if (e instanceof Variable) {
       Ast s = st.lookup(((Variable) e).value());
       if (s == null)
          return error("Undefined variable",e);
       else if (s instanceof Real)
          return ((Real) s).value();
       else return error("Name is not a variable",e);
    }
156
The Calculator Interpreter (cont.)
    else #case e
    | call_exp(`fnc,...args) => {
        double res;
        Ast s = st.lookup(((Variable) fnc).value());
        if (s == null)
           return error("Undefined function",fnc);
        #case s
        | fnc_def(`body,...params) => {
            Arguments arguments = #[];
            #for arg in args do
                arguments = arguments.append(new Real(eval(arg,st)));
            #end;
            if (params.length() != arguments.length())
               return error("Wrong number of arguments",e);
            st.begin_scope();
            #for param in params do
                st.insert(((Variable) param).value(),arguments.head());
                arguments = arguments.tail();
            #end;
            res = eval(body,st);
            st.end_scope();
            return res;
          }
        | _ => return error("Name has not been defined as a function",fnc);
157
The Calculator Interpreter (cont.)
    | if_exp(`e1,`e2,`e3) =>
        if (eval(e1,st) > 0)
           return eval(e2,st);
        else return eval(e3,st);
    | `f(`e1,`e2) => {
        double left = eval(e1,st);
        double right = eval(e2,st);
        #case new Variable(f)
        | plus_exp => return left + right;
        | minus_exp => return left - right;
        | times_exp => return left * right;
        | div_exp => return left / right;
        | and_exp => return ((left>0) && (right>0)) ? 1 : 0;
        | or_exp => return ((left>0) || (right>0)) ? 1 : 0;
        | eq_exp => return (left == right) ? 1 : 0;
        | ne_exp => return (left != right) ? 1 : 0;
        | gt_exp => return (left > right) ? 1 : 0;
        | lt_exp => return (left < right) ? 1 : 0;
        | ge_exp => return (left >= right) ? 1 : 0;
        | le_exp => return (left <= right) ? 1 : 0;
        #end;
      }
    | _ => return error("Unrecognized expression",e);
158
Run-Time Storage Organization
Leonidas Fegaras
159
Memory Layout Memory layout of an executable program:
160
Run-Time Stack At run-time, function calls behave in a stack-like manner when you call, you push the return address onto the run-time stack when you return, you pop the return address from the stack reason: a function may be recursive When you call a function, inside the function body, you want to be able to access formal parameters variables local to the function variables belonging to an enclosing function (for nested functions) procedure P ( c: integer ) x: integer; procedure Q ( a, b: integer ) i, j: integer; begin x := x+a+j; end; Q(x,c);
161
Activation Records (Frames)
When we call a function, we push an entire frame onto the stack
The frame contains
  the return address from the function
  the values of the local variables
  temporary workspace
  ...
The size of a frame is not fixed
  need to chain together frames into a list (via a dynamic link)
  need to be able to access the variables of the enclosing functions efficiently
(figure: a stack of frames A, B, C, with C on top)
162
A Typical Frame Organization
163
Static Links The static link of a function f points to the latest frame in the stack of the function that statically contains f If f is not lexically contained in any other function, its static link is null procedure P ( c: integer ) x: integer; procedure Q ( a, b: integer ) i, j: integer; begin x := x+a+j; end; Q(x,c); If P called Q then the static link of Q will point to the latest frame of P in the stack Note that we may have multiple frames of P in the stack; Q will point to the latest there is no way to call Q if there is no P frame in the stack, since Q is hidden outside P in the program
164
The Code for Function Calls
When a function (the caller) calls another function (the callee), it executes the following code:
pre-call: done before the function call
  allocate the callee frame on top of the stack
  evaluate and store function parameters in registers or in the stack
  store the return address to the caller in a register or in the stack
post-call: done after the function call
  copy the return value
  deallocate (pop) the callee frame
  restore parameters if they were passed by reference
165
The Code for Function Calls (cont.)
In addition, each function has the following code:
prologue: done at the beginning of the function body
  store the frame pointer in the stack or in a display
  set the frame pointer to be the top of the stack
  store the static link in the stack or in the display
  initialize local variables
epilogue: done at the end of the function body
  store the return value in the stack
  restore the frame pointer
  return to the caller
166
Storage Allocation
We can classify the variables in a program into four categories:
  statically allocated data that reside in the static data part of the program
    these are the global variables
  dynamically allocated data that reside in the heap
    these are the data created by malloc in C
  register allocated variables that reside in the CPU registers
    these can be function arguments, function return values, or local variables
  frame-resident variables that reside in the run-time stack
167
Frame-Resident Variables
Every frame-resident variable (ie, a local variable) can be viewed as a pair (level,offset)
  the variable level indicates the lexical level in which this variable is defined
  the offset is the location of the variable value in the run-time stack, relative to the frame pointer
  procedure P ( c: integer )        -- level 1
     x: integer;
     procedure Q ( a, b: integer )  -- level 2
        i, j: integer;
     begin x := x+a+j; end;
  Q(x,c);
168
Variable Offsets
  procedure P ( c: integer )
     x: integer;
     procedure Q ( a, b: integer )
        i, j: integer;
     begin x := x+a+j; end;
  Q(x,c);
(figure: the frame layouts of P and Q, with the offsets of c, x and a, b, i, j)
169
Accessing a Variable Let $fp be the frame pointer
You are generating code for the body of a function at level L1
For a variable with (level,offset)=(L2,O) you generate code that:
  traverses the static link (at offset -8) L1-L2 times to get the containing frame
  accesses the location at offset O in the containing frame
eg, for L1=5, L2=2, and O=-16, we have Mem[Mem[Mem[Mem[$fp-8]-8]-8]-16]
For the earlier example (code for the body of Q):
  a: Mem[$fp+8]
  b: Mem[$fp+4]
  i: Mem[$fp-12]
  j: Mem[$fp-16]
  c: Mem[Mem[$fp-8]+4]
  x: Mem[Mem[$fp-8]-12]
170
The Code for the Call Q(x,c)
  Mem[$sp] = Mem[$fp-12]   ; push x
  $sp = $sp-4
  Mem[$sp] = Mem[$fp+4]    ; push c
  $sp = $sp-4
  static_link = $fp
  call Q
  $sp = $sp+8              ; pop arguments
171
The Code for a Function Body
Prologue:
  Mem[$sp] = $fp           ; store $fp
  $fp = $sp                ; new beginning of frame
  $sp = $sp+frame_size     ; create frame
  save return_address
  save static_link
Epilogue:
  restore return_address
  $sp = $fp                ; pop frame
  $fp = Mem[$fp]           ; follow dynamic link
  return using the return_address
172
Finding Static Link The caller set the static_link of the callee before the call this is because the caller knows both the caller and callee the callee doesn't know the caller Suppose that L1 and L2 are the nesting levels of the caller and the callee procedures When the callee is lexically inside the caller's body, that is, when L2=L1+1, we have: static_link = $fp Otherwise, we follow the static link of the caller L1-L2+1 times For L1=L2, that is, when both caller and callee are at the same level, we have static_link = Mem[$fp-8] For L1=L2+2 we have static_link = Mem[Mem[Mem[$fp-8]-8]-8]
173
Finding Static Link (cont.)
174
Intermediate Representation
Leonidas Fegaras
175
Intermediate Representation (IR)
The semantic phase of a compiler translates parse trees into an intermediate representation (IR), which is independent of the underlying computer architecture; a later phase generates machine code from the IR
This makes the task of retargeting the compiler to another computer architecture easier to handle
The IR data model includes
  raw memory (a vector of words/bytes)
  infinite size registers (unlimited number)
  data addresses
The IR programs are trees that represent instructions in a universal machine architecture
176
IR (cont.) Some IR specs are actually machine-dependent:
  32-bit, instead of 64-bit, addresses
  some registers have a special meaning (sp, fp, gp, ra)
Most IR specs are left unspecified and must be designed:
  frame layout
  variable allocation: in the static section, in a frame, as a register, etc
  data layout
    eg, strings can be designed to be null-terminated (as in C) or with an extra length (as in Java)
177
IR Example
(figure: the IR tree of the MOVE below)
The IR
  MOVE(MEM(+(TEMP(fp),CONST(-16))),
       +(MEM(+(TEMP(fp),CONST(-20))),
         CONST(10)))
evaluates the program:
  M[fp-16] := M[fp-20]+10
178
Expression IRs CONST(i): the integer constant i
MEM(e): if e is an expression that calculates a memory address, then this is the contents of the memory at address e (one word)
NAME(n): the address that corresponds to the label n
  eg, MEM(NAME(x)) returns the value stored at the location of the label x
TEMP(t): if t is a temporary register, return the value of the register
  eg, MEM(BINOP(PLUS,TEMP(fp),CONST(24))) fetches a word from the stack located 24 bytes above the frame pointer
BINOP(op,e1,e2): evaluate e1, evaluate e2, and perform the binary operation op over the results of the evaluations of e1 and e2
  op can be PLUS, AND, etc
  we abbreviate BINOP(PLUS,e1,e2) by +(e1,e2)
CALL(f,[e1,e2,...,en]): evaluate the expressions e1, e2, etc (in that order), and at the end call the function f over these n parameters
  eg, CALL(NAME(g),ExpList(MEM(NAME(a)),ExpList(CONST(1),NULL))) represents the function call g(a,1)
ESEQ(s,e): execute statement s and then evaluate and return the value of the expression e
179
Statement IRs
MOVE(TEMP(t),e): store the value of the expression e into the register t
MOVE(MEM(e1),e2): evaluate e1 to get an address, then evaluate e2, and then store the value of e2 at the address calculated from e1
  eg, MOVE(MEM(+(NAME(x),CONST(16))),CONST(1)) computes x[4] := 1 (since 4*4 bytes = 16 bytes)
EXP(e): evaluate e and discard the result
JUMP(L): jump to the address L
  L must be defined in the program by some LABEL(L)
CJUMP(o,e1,e2,t,f): evaluate e1 and e2; if the values of e1 and e2 are related by o, then jump to the address calculated by t, else jump to the one for f
  the binary relational operator o must be EQ, NE, LT, etc
SEQ(s1,s2,...,sn): perform the statements s1, s2, ..., sn in sequence
LABEL(n): define the name n to be the address of this statement
  you can retrieve this address using NAME(n)
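As an illustration, the IR constructors might be declared as Java classes along these lines (class and field names are assumptions for this sketch; the project's actual IR classes may differ):

  abstract class IRExp {}
  class CONST extends IRExp { int value;  CONST ( int v ) { value=v; } }
  class NAME  extends IRExp { String label;  NAME ( String l ) { label=l; } }
  class TEMP  extends IRExp { String temp;  TEMP ( String t ) { temp=t; } }
  class MEM   extends IRExp { IRExp addr;  MEM ( IRExp a ) { addr=a; } }
  class BINOP extends IRExp { String op; IRExp left, right;
                              BINOP ( String o, IRExp l, IRExp r ) { op=o; left=l; right=r; } }
  class ESEQ  extends IRExp { IRStmt stmt; IRExp exp;
                              ESEQ ( IRStmt s, IRExp e ) { stmt=s; exp=e; } }

  abstract class IRStmt {}
  class MOVE  extends IRStmt { IRExp dst, src;  MOVE ( IRExp d, IRExp s ) { dst=d; src=s; } }
  class JUMP  extends IRStmt { IRExp target;  JUMP ( IRExp t ) { target=t; } }
  class LABEL extends IRStmt { String label;  LABEL ( String l ) { label=l; } }

  // M[fp-16] := M[fp-20]+10 from the earlier example slide would be built as:
  //   new MOVE(new MEM(new BINOP("+",new TEMP("fp"),new CONST(-16))),
  //            new BINOP("+",
  //                      new MEM(new BINOP("+",new TEMP("fp"),new CONST(-20))),
  //                      new CONST(10)))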
180
Local Variables
Local variables located in the stack are retrieved using an expression represented by the IR
  MEM(+(TEMP(fp),CONST(offset)))
If a variable is located in an outer static scope k levels higher than the current scope, we follow the static chain k times, and then we retrieve the variable using the offset of the variable
  eg, if k=3:
  MEM(+(MEM(+(MEM(+(MEM(+(TEMP(fp),CONST(static))),
                   CONST(static))),
              CONST(static))),
        CONST(offset)))
where static is the offset of the static link (for our frame layout, static = -8)
181
L-values
An l-value is the result of an expression that can occur on the left of an assignment statement
  eg, x[f(a,6)].y is an l-value
It denotes a location where we can store a value
It is basically constructed by deriving the IR of the value and then dropping the outermost MEM call
For example, if the value is
  MEM(+(TEMP(fp),CONST(offset)))
then the l-value is:
  +(TEMP(fp),CONST(offset))
182
Data Layout: Vectors Usually stored in the heap
Fixed-size vectors are usually mapped to n consecutive elements; otherwise, the vector length is also stored before the elements
In Tiger, vectors start from index 0 and each vector element is 4 bytes long (one word), which may represent an integer or a pointer to some value
To retrieve the ith element of an array a, we use
  MEM(+(A,*(I,CONST(4))))
where A is the address of a and I is the value of i
But this is not sufficient. The IR should check whether I<size(a):
  ESEQ(SEQ(CJUMP(LT,I,CONST(size_of_A),
                 NAME(next),NAME(error_label)),
           LABEL(next)),
       MEM(+(A,*(I,CONST(4)))))
183
Records
For records, we need to know the byte offset of each field (record attribute) within the record
Since every value is 4 bytes long, the ith field of a structure a can be retrieved using
  MEM(+(A,CONST(i*4)))
where A is the address of a
  here i is always a constant, since we know the field name
184
Records (cont.)
For example, suppose that i is located in the local frame with offset -24 and a is located in the immediate outer scope with offset -40. Then, the statement
  a[i+1].first := a[i].second+2
is translated into the IR:
  MOVE(MEM(MEM(+(A,*(+(I,CONST(1)),CONST(4))))),
       +(MEM(+(MEM(+(A,*(I,CONST(4)))),CONST(4))),
         CONST(2)))
where
  I = MEM(+(TEMP(fp),CONST(-24)))
  A = MEM(+(MEM(+(TEMP(fp),CONST(-8))),CONST(-40)))
since the offset of first is 0 and the offset of second is 4
185
Strings
In Tiger, a string of size n is allocated in the heap in n+4 consecutive bytes, where the first 4 bytes contain the size of the string
The string is simply a pointer to the first byte
String literals are statically allocated
Other languages, such as C, store a string of size n in the heap in n+1 consecutive bytes; the last byte has a null value to indicate the end of the string
Then, you can allocate a string of size n in the heap by advancing the global pointer (gp) by n+1 bytes; the string address A is the old value of gp:
  MOVE(A,ESEQ(SEQ(MOVE(TEMP(t),TEMP(gp)),
                  MOVE(TEMP(gp),+(TEMP(gp),CONST(n+1)))),
              TEMP(t)))
186
Control Statements
The while loop
  while c do body;
is evaluated in the following way:
  loop: if c goto cont else goto done
  cont: body
        goto loop
  done:
which corresponds to the following IR:
  SEQ(LABEL(loop),
      CJUMP(EQ,c,0,NAME(done),NAME(cont)),
      LABEL(cont),
      body,
      JUMP(NAME(loop)),
      LABEL(done))
187
For-Loops The for statement is evaluated in the following way:
  for i:=lo to hi do body
becomes:
     i := lo
     j := hi
     if i>j goto done
  loop:
     body
     i := i+1
     if i<=j goto loop
  done:
188
Other Control Statements
The break statement is translated into a JUMP
The compiler keeps track of which label to JUMP to on a "break" statement by maintaining a stack of labels that holds the "done:" labels of the enclosing for- or while-loops
  when the compiler compiles a loop, it pushes the loop's done label onto the stack, and when it exits the loop, it pops the stack
  the break statement is thus translated into a JUMP to the label at the top of the stack (see the sketch below)
A function call f(a1,...,an) is translated into the IR
  CALL(NAME(L),[sl,e1,...,en])
where L is the label of the first statement of the code of f, sl is the static link, and ei is the IR for ai
For example, if the difference between the static levels of the caller and callee is one, then sl is
  MEM(+(TEMP(fp),CONST(-8)))
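A sketch of the label stack described above (the names exitLabels, newLabel, and emitJump are illustrative assumptions, not names from the notes):

  import java.util.*;

  class LoopContext {
    static final Deque<String> exitLabels = new ArrayDeque<>();
    static int labels = 0;
    static String newLabel () { return "L" + (labels++); }
    static void emitJump ( String l ) { System.out.println("JUMP(NAME(" + l + "))"); }

    static void compileLoop ( /* cond, body */ ) {
      String done = newLabel();
      exitLabels.push(done);     // entering a loop: push its done label
      // ... emit the loop code; breaks inside the body see this label
      exitLabels.pop();          // leaving the loop
    }
    static void compileBreak () {
      if (exitLabels.isEmpty())
         throw new RuntimeException("break outside a loop");
      emitJump(exitLabels.peek());   // jump to the innermost done label
    }
  }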
189
Example
Suppose that records and vectors are implemented as pointers (ie, memory addresses) to dynamically allocated data in the heap
Consider the following declarations:
  struct { X: int, Y: int, Z: int } S;  /* a record */
  int i;
  int V[10][10];  /* a vector of vectors */
where the variables S, i, and V are stored in the current frame with offsets -16, -20, and -24, respectively
We will use the following abbreviations:
  S = MEM(+(TEMP(fp),CONST(-16)))
  I = MEM(+(TEMP(fp),CONST(-20)))
  V = MEM(+(TEMP(fp),CONST(-24)))
190
Example (cont.)
S.Z+S.X:
  +(MEM(+(S,CONST(8))),MEM(S))
if (i<10) then S.Y := i else i := i-1:
  SEQ(CJUMP(LT,I,CONST(10),trueL,falseL),
      LABEL(trueL),
      MOVE(MEM(+(S,CONST(4))),I),
      JUMP(exit),
      LABEL(falseL),
      MOVE(I,-(I,CONST(1))),
      LABEL(exit))
191
Example (cont.)
V[i][i+1] := V[i][i]+1:
  MOVE(MEM(+(MEM(+(V,*(I,CONST(4)))),
             *(+(I,CONST(1)),CONST(4)))),
       +(MEM(+(MEM(+(V,*(I,CONST(4)))),*(I,CONST(4)))),
         CONST(1)))
for i:=0 to 9 do V[0][i] := i:
  SEQ(MOVE(I,CONST(0)),
      MOVE(TEMP(t1),CONST(9)),
      CJUMP(GT,I,TEMP(t1),done,loop),
      LABEL(loop),
      MOVE(MEM(+(MEM(V),*(I,CONST(4)))),I),
      MOVE(I,+(I,CONST(1))),
      CJUMP(LEQ,I,TEMP(t1),loop,done),
      LABEL(done))
192
Instruction Selection
Leonidas Fegaras
193
Basic Blocks and Traces
Many computer architectures have instructions that do not exactly match our IR representations
  they do not support two-way branching as in CJUMP(op,e1,e2,l1,l2)
  nested calls, such as CALL(f,[CALL(g,[...])]), will cause interference between register arguments and returned results
  nested SEQs, such as SEQ(SEQ(s1,s2),s3), impose an order of evaluation, which restricts optimization
    if s1 and s2 do not interfere with each other, we want to be able to switch SEQ(s1,s2) with SEQ(s2,s1), because it may result in a more efficient program
We will fix these problems in two phases:
  transforming IR trees into a list of canonical trees, and
  transforming unrestricted CJUMPs into CJUMPs that are followed by their false target label
194
Canonical Trees
An IR is a canonical tree if it does not contain SEQ or ESEQ, and the parent node of each CALL node is either an EXP or a MOVE(TEMP(t),...) node
Method: we transform an IR in such a way that all ESEQs are pulled up in the IR and become SEQs at the top of the tree. At the end, we are left with nested SEQs at the top of the tree, which are eliminated to form a list of statements
For example, the IR:
  SEQ(MOVE(NAME(x),ESEQ(MOVE(TEMP(t),CONST(1)),TEMP(t))),
      JUMP(ESEQ(MOVE(NAME(z),NAME(L)),NAME(z))))
is translated into:
  SEQ(SEQ(MOVE(TEMP(t),CONST(1)),
          MOVE(NAME(x),TEMP(t))),
      SEQ(MOVE(NAME(z),NAME(L)),
          JUMP(NAME(z))))
which corresponds to the list of statements:
  [ MOVE(TEMP(t),CONST(1)),
    MOVE(NAME(x),TEMP(t)),
    MOVE(NAME(z),NAME(L)),
    JUMP(NAME(z)) ]
195
Some Rules
  ESEQ(s1,ESEQ(s2,e)) = ESEQ(SEQ(s1,s2),e)
  BINOP(op,ESEQ(s,e1),e2) = ESEQ(s,BINOP(op,e1,e2))
  MEM(ESEQ(s,e)) = ESEQ(s,MEM(e))
  JUMP(ESEQ(s,e)) = SEQ(s,JUMP(e))
  CJUMP(op,ESEQ(s,e1),e2,l1,l2) = SEQ(s,CJUMP(op,e1,e2,l1,l2))
  BINOP(op,e1,ESEQ(s,e2)) = ESEQ(MOVE(TEMP(t),e1),ESEQ(s,BINOP(op,TEMP(t),e2)))
  CJUMP(op,e1,ESEQ(s,e2),l1,l2) = SEQ(MOVE(TEMP(t),e1),SEQ(s,CJUMP(op,TEMP(t),e2,l1,l2)))
  MOVE(ESEQ(s,e1),e2) = SEQ(s,MOVE(e1,e2))
To handle function calls, we store the function result into a new register:
  CALL(f,a) = ESEQ(MOVE(TEMP(t),CALL(f,a)),TEMP(t))
That way expressions such as +(CALL(f,a),CALL(g,b)) do not overwrite each other's result register
196
Basic Blocks
Need to transform any CJUMP into a CJUMP whose false target label is the next instruction after the CJUMP
  this reflects the conditional JUMP found in most architectures
We will do that using basic blocks
A basic block is a sequence of statements whose first statement is a LABEL, whose last statement is a JUMP or CJUMP, and which does not contain any other LABELs, JUMPs, or CJUMPs
  ie, we can only enter at the beginning of a basic block and exit at the end
197
Algorithm We first create the basic blocks for an IR tree
then we reorganize the basic blocks in such a way that every CJUMP at the end of a basic block is followed by the block that contains the CJUMP's false target label
A secondary goal is to put the target of a JUMP immediately after the JUMP
  that way, we can eliminate the JUMP (and maybe merge the two blocks)
The algorithm is based on traces
198
Traces You start a trace with an unmark block and you consider the target of the JUMP of this block or the false target block of its CJUMP then, if the new block is unmarked, you append the new block to the trace, you use it as your new start, and you apply the algorithm recursively otherwise, you close this trace and you start a new trace by going back to a point where there was a CJUMP and you choose the true target this time You continue until all blocks are marked
199
Traces (cont.) This is a greedy algorithm
At the end, there may still be some CJUMPs whose false target does not follow the CJUMP
  this is the case where the false target label was the target of another JUMP or CJUMP found earlier in a trace
  in that case:
    if we have a CJUMP followed by its true target, we negate the condition and switch the true and false targets
    otherwise, we create a new block LABEL(L) followed by JUMP(F), and we replace CJUMP(op,a,b,T,F) with CJUMP(op,a,b,T,L)
Also, if there is a JUMP(L) followed by a LABEL(L), we remove the JUMP
200
Instruction Selection
After IR trees have been put into a canonical form, they are used in generating assembly code
The obvious way to do this is to macro-expand each IR tree node
For example, MOVE(MEM(+(TEMP(fp),CONST(10))),CONST(3)) is macro-expanded into the pseudo-assembly code:
  TEMP(fp)                  t1 := fp
  CONST(10)                 t2 := 10
  +(TEMP(fp),CONST(10))     t3 := t1+t2
  CONST(3)                  t4 := 3
  MOVE(MEM(...),CONST(3))   M[t3] := t4
where ti stands for a temporary variable
This method generates very poor quality code: the above can be done using only one instruction in most architectures
  M[fp+10] := 3
201
Maximum Munch Maximum munch generates better code, especially for RISC machines The idea is to use tree pattern matching to map a tree pattern (a fragment of an IR tree) into a list of assembly instructions these tree patterns are called tiles For RISC we always have one-to-one mapping (one tile to one assembly instruction) for RISC machines the tiles are small (very few number of IR nodes) for CISC machines the tiles are usually large since the CISC instructions are very complex
202
Tiles
The following is the mapping of some tiles into MIPS code:
  IR tile                        MIPS code
  CONST(c)                       li 'd0, c
  +(e0,e1)                       add 'd0, 's0, 's1
  +(e0,CONST(c))                 add 'd0, 's0, c
  *(e0,e1)                       mult 'd0, 's0, 's1
  *(e0,CONST(2^k))               sll 'd0, 's0, k
  MEM(e0)                        lw 'd0, ('s0)
  MEM(+(e0,CONST(c)))            lw 'd0, c('s0)
  MOVE(MEM(e0),e1)               sw 's1, ('s0)
  MOVE(MEM(+(e0,CONST(c))),e1)   sw 's1, c('s0)
  JUMP(NAME(X))                  b X
  JUMP(e0)                       jr 's0
  LABEL(X)                       X: nop
('s0, 's1, ..., 'sn are the registers holding the results of the subtrees e0, e1, ..., en; 'd0 is the register holding the tile's result)
203
Tiling To translate an IR tree into assembly code, we perform tiling:
we cover the IR tree with non-overlapping tiles
there can be many different tilings; eg, the IR for a[i]:=x is:
  MOVE(MEM(+(MEM(+(TEMP(fp),CONST(20))),
             *(TEMP(i),CONST(4)))),
       MEM(+(TEMP(fp),CONST(10))))
The following are two possible tilings of the IR:
  lw r1, 20($fp)        add r1, $fp, 20
  lw r2, i              lw r1, (r1)
  sll r2, r2, 2         lw r2, i
  add r1, r1, r2        sll r2, r2, 2
  lw r2, 10($fp)        add r1, r1, r2
  sw r2, (r1)           add r2, $fp, 10
                        lw r2, (r2)
                        sw r2, (r1)
The left tiling is obviously better, since it can be executed faster
204
Optimum Tiling It's highly desirable to do optimum tiling:
  to generate the shortest instruction sequence (or alternatively the sequence with the fewest machine cycles)
This is not easy to achieve
Two main ways of performing optimum tiling:
  using maximal munch (a greedy algorithm): start from the IR root and, from all matching tiles, select the one with the maximum number of IR nodes; go to the children of this tile and apply the algorithm recursively until you reach the tree leaves
  using dynamic programming: it works from the leaves to the root; it assigns a cost to every tree node by considering every tile that matches the node and calculating the minimum value of:
    cost of a node = (number of nodes in the tile) + (total costs of all the tile children)
205
Maximal Munch Example
(figure: an IR tree covered with the tiles A-I)
  A  lw r1, fp
  B  lw r2, 8(r1)
  C  lw r3, i
  D  sll r4, r3, 2
  E  add r5, r2, r4
  F  lw r6, fp
  G  lw r7, 16(r6)
  H  add r8, r7, 1
  I  sw r8, (r5)
206
Liveness Analysis and Register Allocation
Leonidas Fegaras
207
Liveness Analysis
So far we have assumed a very large number of temporary variables stored in registers
This is not true for real machines
  CISC machines have very few registers
  Pentium has 6 general registers only
It's highly desirable to use one machine register for multiple temporary variables
  eg, variables a and b below do not interfere, so they can be assigned to the same register R1:
  a := 0                R1 := 0
  L1: b := a+1          L1: R1 := R1+1
  c := c+b              R2 := R2+R1
  a := b*2              R1 := R1*2
  if a<10 goto L1       if R1<10 goto L1
  return c              return R2
208
Checking Variable Liveness
A variable x is live at a particular point (statement) in a program if it holds a value that may be needed in the future
That is, x is live at a particular point if there is a path (possibly following gotos) from this point to a statement that uses x, and there is no assignment to x in any statement on the path
  because if there were an assignment to x, the old value of x would be discarded before it is used
For the program above (X means live before the statement):
                      a  b  c
  a := 0                    X
  L1: b := a+1        X     X
  c := c+b               X  X
  a := b*2               X  X
  if a<10 goto L1     X     X
  return c                  X
209
Control Flow Graph (CFG)
The CFG nodes are individual statements (or maybe basic blocks)
The CFG edges represent potential flow of control
  the outgoing edges of a node n are succ[n]
  the ingoing edges are pred[n]
For each CFG node n we define
  use[n]: all the variables used (read) in this statement
  def[n]: all the variables assigned a value (written) in this statement
For example:
  1. a := 0
  2. L1: b := a+1
  3. c := c+b
  4. a := b*2
  5. if a<10 goto L1
  6. return c
  succ[5]=[6,2]   pred[5]=[4]
  use[3]=[b,c]    def[3]=[c]
210
Using CFG
A variable v is live at a statement n if there is a path in the CFG from this statement to a statement m such that v∈use[m] and for every statement k strictly between n and m on the path: v∉def[k]
That is, there is no assignment to v on the path from n to m
For example, c is live at 4, since it is used in 6 and there is no assignment to c on the path from 4 to 6
Liveness analysis analyzes a CFG to determine at which places variables are live
  it is a data flow analysis, since it flows information (the liveness of variables) around the edges of the CFG
For each CFG node n we derive two sets:
  live-in: in[n] gives all variables that are live before the execution of statement n
  live-out: out[n] gives all variables that are live after the execution of statement n
211
in/out
We compute in/out from the sets succ, use, and def, using the following properties of in and out:
  v∈use[n] ⇒ v∈in[n]
    ie, if v is used in n, then v is live-in at n (regardless of whether it is defined in n)
  v∈(out[n]-def[n]) ⇒ v∈in[n]
    ie, if v is live after the execution of n and is not defined in n, then v is live before the execution of n
  for each s∈succ[n]: v∈in[s] ⇒ v∈out[n]
    this reflects the formal definition of the liveness of variable v
212
Algorithm
We repeatedly execute the loop until we can't add any more elements:
  foreach n: in[n]:=∅; out[n]:=∅
  repeat
    foreach n:
      in'[n] := in[n]
      out'[n] := out[n]
      in[n] := use[n] ∪ (out[n]-def[n])
      out[n] := ∪ { in[s] | s∈succ[n] }
  until in'=in and out'=out
The algorithm converges very fast if we consider the CFG nodes in reverse order (when possible)
The life of a variable can be directly derived from the vector in[]: if v∈in[n], then v is live at statement n
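A compact Java sketch of this fixpoint computation using BitSets (the statement numbering and the array-based CFG representation are assumptions for this sketch):

  import java.util.*;

  class Liveness {
    // use/def/succ are indexed by statement number; in/out are filled in
    static void liveness ( BitSet[] use, BitSet[] def, List<Integer>[] succ,
                           BitSet[] in, BitSet[] out ) {
      int n = use.length;
      for (int i = 0; i < n; i++) { in[i] = new BitSet(); out[i] = new BitSet(); }
      boolean changed = true;
      while (changed) {
        changed = false;
        for (int i = n-1; i >= 0; i--) {      // reverse order converges faster
          BitSet newOut = new BitSet();
          for (int s : succ[i]) newOut.or(in[s]);   // out[i] = union of in[s]
          BitSet newIn = (BitSet) newOut.clone();
          newIn.andNot(def[i]);                     // out[i] - def[i]
          newIn.or(use[i]);                         // use[i] U (out[i]-def[i])
          if (!newIn.equals(in[i]) || !newOut.equals(out[i])) changed = true;
          in[i] = newIn; out[i] = newOut;
        }
      }
    }
  }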
213
Example
  1. a := 0
  2. L1: b := a+1
  3. c := c+b
  4. a := b*2
  5. if a<10 goto L1
  6. return c
Computing in/out in reverse order (the second pass reproduces the first, so we stop):
                     1st pass      2nd pass
  node  use  def    out    in     out    in
  6     c           -      c      -      c
  5     a           c      ac     ac     ac
  4     b    a      ac     bc     ac     bc
  3     bc   c      bc     bc     bc     bc
  2     a    b      bc     ac     bc     ac
  1          a      ac     c      ac     c
214
Interference Graph Nodes are the program variables
For each pair of nodes v and w there is an interference edge if the live ranges of the variables v and w overlap at at least one program point (statement)
For each program point n, and for each x∈in[n] and y∈in[n] with x≠y, we draw an edge (x,y)
For example, the previous program has an interference graph with the edges a-c and b-c (a and b do not interfere)
215
Example (X means live before the statement)
                     x  y  z  w  u  v
  1. v := 1
  2. z := v+1                        X
  3. x := z * v            X         X
  4. y := x * 2      X     X
  5. w := x+z*y      X  X  X
  6. u := z             X  X  X
  7. v := u+w+y         X     X  X
  8. return v * u                 X  X
216
Register Allocation
Recall: if there is an edge between two variables in the interference graph, then these variables interfere
The interference graph is used for assigning registers to temporary variables
  if two variables do not interfere, then we can use the same register for both of them, thus reducing the number of registers needed
  if there is a graph edge between two variables, then we should not assign the same register to them, since this register would need to hold the values of both variables at one point in time
217
Graph Coloring
Graph coloring problem: we try to assign one of n different colors to the graph nodes so that no two adjacent nodes have the same color
Used in map drawing, where we have countries or states on the map and we want to color them using a small fixed number of colors, so that states with a common border are not painted the same color
  the graph in this case has one node for each state and an edge between two states if they have common borders
(figure: a map of the western US states and the corresponding graph)
218
Register Allocation Is basically graph coloring: registers are colors
We use a stack of graph nodes. Each time:
  we select a node from the interference graph that has fewer than n neighbours
  we remove the selected node from the graph (along with its edges)
  we push the selected node onto the stack
We continue selecting nodes until we remove all nodes
This is called simplification of the graph
The idea is that if we can color the graph after we remove this node, then we can color the original graph (with the node included)
  why? because the neighbours of this node can have at most n-1 different colors; so we can just assign the remaining nth color to the node
219
Spilling
Sometimes, though, we cannot simplify any further, because all nodes have n or more neighbours
In that case, we select one node (ie, variable) to be spilled into memory, instead of assigning a register to it
This is called spilling, and the spill victim can be selected based on priorities, eg
  which variable is used less frequently
  whether it is outside a loop
  etc
The spilled node is also pushed on the stack
220
Selection When the graph is completely reduced
we pop the stack one node at a time
we rebuild the interference graph and at the same time we assign a color to the popped-out node, so that its color is different from the colors of its neighbours
This is called the selection phase
If we can't assign a color to a node, we spill the node into memory
  a node selected to be spilled during the spill phase will not necessarily be spilled into memory at the end
If there are spilled nodes, we use a memory access for each spilled variable
  eg, we can use the frame location $fp-24 to store the spilled temporary variable, and we replace all occurrences of this variable in the program with M[$fp-24]
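A rough Java sketch of simplify + select over an adjacency-set interference graph (spill handling is reduced to reporting; a real allocator would rewrite the program and repeat):

  import java.util.*;

  class Coloring {
    static Map<String,Integer> color ( Map<String,Set<String>> graph, int n ) {
      Deque<String> stack = new ArrayDeque<>();
      Map<String,Set<String>> g = deepCopy(graph);
      while (!g.isEmpty()) {                         // simplify phase
        String node = null;
        for (String v : g.keySet())
            if (g.get(v).size() < n) { node = v; break; }
        if (node == null)                            // potential spill
           node = g.keySet().iterator().next();
        for (String w : g.get(node)) g.get(w).remove(node);
        g.remove(node);
        stack.push(node);
      }
      Map<String,Integer> colors = new HashMap<>();
      while (!stack.isEmpty()) {                     // select phase
        String v = stack.pop();
        BitSet used = new BitSet(n);
        for (String w : graph.get(v))                // colors of the neighbours
            if (colors.containsKey(w)) used.set(colors.get(w));
        int c = used.nextClearBit(0);
        if (c >= n) throw new RuntimeException("actual spill: " + v);
        colors.put(v,c);
      }
      return colors;
    }
    static Map<String,Set<String>> deepCopy ( Map<String,Set<String>> g ) {
      Map<String,Set<String>> copy = new HashMap<>();
      for (Map.Entry<String,Set<String>> e : g.entrySet())
          copy.put(e.getKey(), new HashSet<>(e.getValue()));
      return copy;
    }
  }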
221
Example
(figure: simplifying and coloring the interference graph with the 3 registers R0, R1, R2)
Selection order: x v z w u y
Registers: x=R2, y=R0, z=R1, w=R2, u=R1, v=R0
222
Coalescing
If there is a move instruction X:=Y in a program and there is no conflict between X and Y, we can use the same register for both X and Y and remove the move entirely from the program
  we merge the graph nodes for X and Y into one node
  nodes are now labelled by sets of variables, instead of just one variable
It is good: it reduces the number of registers needed and it removes the move instructions
It is bad: it increases the number of neighbours of the merged nodes, which may lead to an irreducible graph and a potential spill
We add another phase to the register allocation algorithm, called coalescing, that coalesces move-related nodes
If we derive an irreducible graph at some point, we do freezing, which de-coalesces one node
223
Why is it Useful
Coalescing is very useful when handling callee-save registers in a procedure
Suppose that r3 is a callee-save register. The procedure needs to
  save this register into a temporary variable at the beginning of the procedure (eg, A := r3)
  restore it at the end of the procedure (ie, r3 := A)
That way, if r3 is not used at all during the procedure body, it will be coalesced with A and the move instructions will be removed
Coalescing can happen in many other situations, as long as there is no interference
Note that registers in a program are handled as temporary variables with a preassigned color (precolored nodes)
  precolored nodes can only be coalesced with other nodes (they cannot be simplified or spilled)
224
Criteria for Coalescing
Let n be the number of available registers
Briggs criterion: we coalesce two nodes if the merged node has fewer than n neighbours of degree greater than or equal to n
George criterion: we coalesce two nodes if all the neighbours of one of the nodes with degree greater than or equal to n already interfere with the other node
225
Example r1, r2 are caller-save registers r3 is callee-save register
  int f ( int a, int b ) {
    int d = 0;
    int e = a;
    do { d = d+b; e = e-1; } while (e>0);
    return d;
  }
is compiled into:
  enter:
     c = r3
     a = r1
     b = r2
     d = 0
     e = a
  loop:
     d = d+b
     e = e-1
     if e>0 goto loop
     r1 = d
     r3 = c
     return    (r1, r3 live out)
226
Example (cont.) Cannot simplify now Need to spill a variable
(figure: the interference graph of the program, with nodes a, b, c, d, e and the precolored registers r1, r2, r3)
227
Calculating Spill Priorities
Assume that the loop is executed 10 times
Spill priority = (uses+defs) / degree
  node   uses+defs     degree   spill priority
  a      2             4         0.50
  b      1+1*10 = 11   4         2.75
  c      2             6         0.33
  d      2+2*10 = 22   4         5.50
  e      1+3*10 = 31   3        10.33
c has the lowest priority (ie, it is used the least relative to its interference), so we spill c
228
After Spilling c We coalesce a and e because
the merged node ae will have fewer than n neighbours of degree >= n (the Briggs criterion)
(figure: the graph after spilling c and merging a with e)
We can now coalesce r2 with b and r1 with ae
(figure: the graph with the merged nodes r1ae, r2b, plus d and r3)
We can now simplify d, and selection gives:
  a = r1
  b = r2
  d = r3
  e = r1
229
Code Generation for Trees
Goal: generate assembly code for complex expression trees using the fewest registers to store intermediate results
Suppose that we have two-address instructions of the form
  op Ri, T
where
  op is an operation (add, sub, mult, etc)
  Ri is a register (R1, R2, R3, etc)
  T is an address mode, such as a memory reference, a register, indirect access, indexing, etc
We also have a move instruction of the form:
  load Ri, T
230
Example
For example, for the expression (A-B)+((C+D)+(E*F)), which corresponds to the AST:
            +
          /   \
         -     +
        / \   / \
       A   B +   *
            / \ / \
           C  D E  F
we want to generate the following assembly code:
  load R2, C
  add R2, D
  load R1, E
  mult R1, F
  add R2, R1
  load R1, A
  sub R1, B
  add R1, R2
That is, we used only two registers
231
Sethi-Ullman Algorithm
Generates code with the fewest registers
Two phases:
  the numbering phase assigns a number to each tree node that indicates how many registers are needed to evaluate the subtree of this node
  the code generation phase generates code for each subtree recursively (bottom-up)
232
How do we Know How Many Registers we Need?
Suppose that for a tree node T, we need l registers to evaluate its left subtree and r registers to evaluate its right subtree
If one of these numbers is larger, say l > r, then we can evaluate the left subtree first and store its result into one of the registers, say Rk
  now we can evaluate the right subtree using the same registers we used for the left subtree, except of course Rk, since we want to remember the result of the left subtree
  this means that we need l registers to evaluate T too
The same happens if r > l, but now we need to evaluate the right subtree first and store its result in a register
If l = r, we need an extra register (r+1) to remember the result of the left subtree
If T is a tree leaf, then the number of registers to evaluate T is either 1 or 0, depending on whether T is a left or a right subtree
233
Numbering Phase
Algorithm:
  if T is a left leaf then regs(T) = 1
  else if T is a right leaf then regs(T) = 0
  else let l=regs(T.left), r=regs(T.right)
       if (l=r) then regs(T) = r+1
       else regs(T) = max(l,r)
Example: for (A-B)+((C+D)+(E*F)), the nodes are annotated as follows:
            +2
          /    \
        -1      +2
        / \    /  \
      A1   B0 +1   *1
              / \  / \
            C1  D0 E1  F0
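A direct Java transcription of the numbering phase (the Node class is an illustrative binary AST, not the project's):

  class Node {
    Node left, right;     // both null for a leaf
    String label;
    Node ( String l ) { label = l; }
    Node ( String l, Node lt, Node rt ) { label = l; left = lt; right = rt; }

    // regs(T): number of registers needed; isLeftChild encodes the
    // left-leaf (1) vs right-leaf (0) rule
    static int regs ( Node t, boolean isLeftChild ) {
      if (t.left == null && t.right == null)
         return isLeftChild ? 1 : 0;
      int l = regs(t.left,true), r = regs(t.right,false);
      return (l == r) ? r+1 : Math.max(l,r);
    }
  }
  // for the tree of (A-B)+((C+D)+(E*F)): regs(root,true) = 2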
234
Code Generation We use a stack of available registers
it contains all the registers in order (lower register at the top)
  generate(T) =
    if T is a leaf
       write "load top(), T"
    if T is an internal node with children l and r
       if regs(r) = 0
          { generate(l); write "op top(), r" }
       if regs(l) >= regs(r)
          { generate(l)
            R := pop()
            generate(r)
            write "op R, top()"
            push(R) }
       if regs(l) < regs(r)
          { swap the top two elements of the stack
            generate(r)
            R := pop()
            generate(l)
            write "op top(), R"
            push(R)
            swap the top two elements of the stack }
235
Storage Allocation Leonidas Fegaras
236
Heap-Allocated Data
The simplest heap allocator, which does not reclaim storage (similar to the one used in the project):
  char heap[heap_size];
  int end_of_heap = 0;
  void* malloc ( int size ) {
    void* loc = (void*) &heap[end_of_heap];
    end_of_heap += size;
    return loc;
  };
237
With a Free List
Need to recycle dynamically allocated data that are no longer used
  this is done using free in C or delete in C++
Need to link all deallocated data in the heap into a free list
Initially, the free list contains only one element that covers the entire heap (ie, it's heap_size bytes long):
  typedef struct Header {
    struct Header *next;
    int size;
  } Header;
  Header* free_list;
free simply puts the recycled cell at the beginning of the free list:
  void free ( void* x ) {
    if ("size of *x" <= sizeof(Header)) return;
    ((Header*) x)->next = free_list;
    ((Header*) x)->size = "size of *x";
    free_list = (Header*) x;
  };
238
Malloc
malloc first searches the free list to find a cell large enough to fit the given number of bytes. If it finds one, it carves a chunk out of the cell, leaving the rest untouched:
  void* malloc ( int size ) {
    Header* prev = free_list;
    for (Header* r=free_list; r!=0; prev=r, r=r->next)
        if (r->size > size+sizeof(Header)) {
           Header* new_r = (Header*) (((char*) r)+size);
           new_r->next = r->next;
           new_r->size = r->size-size;
           if (r==free_list)
              free_list = new_r;
           else prev->next = new_r;
           return (void*) r;
        };
    void* loc = (void*) &heap[end_of_heap];
    end_of_heap += size;
    return loc;
  };
239
Problems with Manual Allocation
lots of overhead in malloc, since the free list may be very long
fragmentation of the heap into tiny cells
  even though the total free space in the free list may be plenty, it is useless for large object allocation
  improvement: we can keep a vector of free lists, so that the nth element of the vector is a free list that links all the free cells of size n
the programmer is responsible for allocating and deleting objects explicitly
  this is the source of the worst and hardest-to-find bugs, and of most mysterious program crashes
  it causes horrible memory problems due to "overflow", "fence post errors", "memory corruption", "step-on-others'-toes" (hurting other variables' memory locations), and "memory leaks"
  these memory problems are extremely hard to debug and are very time-consuming to fix and troubleshoot
240
Problems (cont.) Why memory management is so hard to do correctly?
memory problems bring down the productivity of programmers
  memory-related bugs are very tough to crack; even experienced programmers take several days or weeks to debug memory-related problems
  memory bugs may be hidden inside the code for several months and can cause unexpected program crashes
  a program may work fine on one platform but have memory bugs when ported to a new platform, making programs non-portable
  it is estimated that the memory bugs due to the use of char* and pointers in C/C++ cost $2 billion every year in time lost due to debugging and downtime of programs
Why is memory management so hard to do correctly?
  you need to have a global view of how dynamic instances of a type are created and passed around
  this destroys the good software engineering principle that programs should be developed in small independent components
241
Reference Counting Keeping track of pointer assignments
If more than one object points to a dynamically allocated object, then the latter should be deleted only when all objects that point to it no longer need it
  you need to keep track of how many pointers are pointing to each object
Used to be popular for OO languages like C++
Note: this is not automatic garbage collection, because the programmer again is responsible for putting counters on every object to count references
This method is easy to implement for languages like C++, where you can redefine the assignment operator dest=source when both dest and source are pointers to data
242
Reference Counting (cont.)
Instead of using a raw pointer C* to an object of class C, we use Ref<C>, where the template Ref provides reference counting (the counter must be shared by all Refs to the same object, so it lives behind a pointer):
  template< class T >
  class Ref {
  private:
    int* count;     // shared counter: how many Refs point to the object
    T* pointer;
    void MayDelete () { if (--(*count) == 0) { delete pointer; delete count; } };
    void Copy ( const Ref &sp ) {
      count = sp.count;
      ++(*count);
      pointer = sp.pointer;
    };
  public:
    Ref ( T* ptr = 0 ) : count(new int(1)), pointer(ptr) {};
    Ref ( const Ref &sp ) { Copy(sp); };
    ~Ref () { MayDelete(); };
    T* operator-> () { return pointer; };
    Ref& operator= ( const Ref &sp ) {
      if (this != &sp) {
         MayDelete();
         Copy(sp);
      };
      return *this;
    };
  };
243
Problems with Reference Counting
Reference counting avoids some misuses of the heap, but it comes with a high cost:
  every assignment takes many cycles to complete, and some of the work may be unnecessary, since you may pass a pointer around, causing many unnecessary counter increments/decrements
  we cannot get rid of cyclic objects (eg, when A points to B and B points to A) using reference counting
    all objects that participate in a cyclic chain of references will always have their counters greater than zero, so they will never be deleted, even if they are garbage
244
Automatic Garbage Collection
Heap-allocated records that are not reachable by any chain of pointers from program variables are garbage
With garbage collection, a program does not reclaim memory manually: when the heap is full, the run-time system suspends the program and starts garbage collection
  char heap[heap_size];
  int end_of_heap = 0;
  void* malloc ( int size ) {
    if (size+end_of_heap > heap_size)
       GC();
    void* loc = (void*) &heap[end_of_heap];
    end_of_heap += size;
    return loc;
  };
245
Not Reachable => Garbage
Conservative approximation: if we can reach an object by following pointers from program variables, then the object is live (not garbage)
Roots: program variables (frame-allocated or static)
  need to check all frames in the run-time stack for pointers to the heap
  conservative approach: if a word has a value between the minimum and maximum address of the heap, then it is considered a pointer to the heap
An object is live if it is pointed to by either a root or by a live object
A garbage collector needs to start from each root and follow pointers recursively
246
Mark-and-Sweep Collection
Two phases:
  Mark: starting from the roots, mark all reachable objects by using a depth-first-search pointer traversal
  Sweep: scan the heap from the beginning to the end and reclaim the unmarked objects (and unmark the marked objects)
Mark phase:
  DFS ( p ) {
    if (*p record is unmarked) then {
       mark *p;
       for each pointer p->fi of the record *p
           do DFS(p->fi)
    }
  }
  for each p in roots
      DFS(p)
Sweep phase:
  p = 'first object in the heap'
  while p is in the heap do {
    if *p is marked
       then unmark *p
       else insert *p into the free list
    p = p+(size of record *p)
  }
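A toy Java sketch of the two phases over object records (Record.marked and Record.fields stand in for the header mark bit and the pointer fields of a heap record; this is illustrative, not the course runtime):

  import java.util.*;

  class Record {
    boolean marked = false;
    List<Record> fields = new ArrayList<>();   // pointer fields only
  }

  class MarkSweep {
    List<Record> heap = new ArrayList<>();     // all allocated records
    List<Record> freeList = new ArrayList<>();

    void dfs ( Record p ) {                    // mark phase
      if (p == null || p.marked) return;
      p.marked = true;
      for (Record f : p.fields) dfs(f);
    }
    void gc ( List<Record> roots ) {
      List<Record> dead = new ArrayList<>();
      for (Record r : roots) dfs(r);
      for (Record p : heap)                    // sweep phase
          if (p.marked) p.marked = false;      // unmark for the next GC
          else dead.add(p);                    // unreachable: reclaim
      heap.removeAll(dead);
      freeList.addAll(dead);
    }
  }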
247
Example
248
Example (cont.)
(figure: the free list after the sweep phase)
249
Pointer Reversal
Trick: use the objects themselves as the DFS stack, by reversing the pointers as we traverse them and restoring them on the way back
(figure: the 'previous' and 'current' pointers during the traversal)
This avoids the extra space for an explicit mark stack, which in the worst case can be as deep as the heap
250
Copying Collection
Need two heaps:
  from-space: the current working heap
  to-space: needs to be in memory during garbage collection
Copying garbage collection:
  create the to-space heap in memory
  copy the live objects from the from-space to the to-space
    must make sure that pointers are forwarded to refer to the to-space (pointer forwarding)
  dispose of the from-space and use the to-space as the new from-space
251
Forwarding a Pointer
  forward (p) {
    if p points to from-space
       then if p.f1 points to to-space       // already copied: p.f1 is the forwarding pointer
               then return p.f1
               else { for each field fi of p
                          next.fi := p.fi    // copy the record to the next free to-space cell
                      p.f1 := next           // leave a forwarding pointer behind
                      next := next + (size of *p)
                      return p.f1 }
       else return p
  }
252
Cheney's Algorithm
Breadth-first search; good locality of reference
  scan := begin-of-to-space
  next := scan
  for each root r
      r := forward(r)
  while scan < next {
    for each field fi of *scan
        scan.fi := forward(scan.fi)
    scan := scan + (size of *scan)
  }
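A toy object-level Java sketch of the algorithm, where the index into the to-space list plays the role of the scan pointer and its size plays the role of next (the field layout and names are assumptions):

  import java.util.*;

  class Obj {
    Obj[] fields;
    Obj forwarded;                      // non-null once copied to to-space
    Obj ( int n ) { fields = new Obj[n]; }
  }

  class Cheney {
    List<Obj> toSpace = new ArrayList<>();   // copied objects, in BFS order

    Obj forward ( Obj p ) {
      if (p == null) return null;
      if (p.forwarded != null) return p.forwarded;   // already copied
      Obj copy = new Obj(p.fields.length);
      System.arraycopy(p.fields,0,copy.fields,0,p.fields.length);
      p.forwarded = copy;               // leave the forwarding pointer behind
      toSpace.add(copy);                // bumps 'next'
      return copy;
    }
    void collect ( Obj[] roots ) {
      for (int i = 0; i < roots.length; i++)
          roots[i] = forward(roots[i]);
      for (int scan = 0; scan < toSpace.size(); scan++) {  // scan < next
        Obj o = toSpace.get(scan);
        for (int i = 0; i < o.fields.length; i++)
            o.fields[i] = forward(o.fields[i]);
      }
    }
  }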
253
Example
254
Forwarding the Roots After we forward the roots from the from-space to the to-space, the to-space will contain the forwarded roots, and the roots and the forward pointers of the root elements in the from-space will point to the to-space
255
Example (cont.) Then, we forward the pointers of the first element of the to-space pointed by the scan pointer (element 51). The first pointer 6 has already been forwarded to 52. The second pointer 3 is forwarded at the end of the to-space to 54
256
Example (cont.) Now we forward the pointer 8 of element 52
257
Example (cont.) and the pointer 7 of element 52
258
Example (cont.) Now we forward the pointer 10 of element 53
259
Example (cont.) Then we forward the pointer 5 of element 54
260
Example (cont.) The pointer 10 of element 56 has already been forwarded
261
Cheney’s Copying Collector
It is good:
  very simple allocation algorithm
  no need for a stack (since it is not recursive)
  its run time depends on the number of live objects, not on the heap size
  no fragmentation; compact memory
  allows incremental (concurrent) garbage collection
It is bad:
  needs double the amount of memory
  needs to recognize pointers to the heap
262
Baker’s Concurrent GC Based on the copying garbage collector
Does GC incrementally, avoiding long pauses during GC
Both from-space and to-space are used during program execution
On a pointer dereference: if the pointer points to the from-space, forward the pointer (copy the object to the to-space)
On GC: forward the roots only and swap the names of the two spaces
263
Generational GC Observations:
  if an object has survived a GC, it is likely to remain reachable for a longer time
  new objects are more likely to become garbage than older objects
  typically, <10% of new objects are live at GC
  GC should not waste time working on older objects
Generational GC: assign objects to different generations G0, G1, G2, …
  G0 holds the newest objects
  Gi is garbage collected more often than Gi+1
  after a GC, Gi becomes Gi+1, and we create a new generation G0
Special case: two generations
  new objects
  tenured objects
264
Functional Languages and Higher-Order Functions
Leonidas Fegaras
265
First-Class Functions
Data values are first-class if they can
  be assigned to local variables
  be components of data structures
  be passed as arguments to functions
  be returned from functions
  be created at run-time
How are functions treated by programming languages?
  Language    passed as arguments   returned from functions   nested scope
  Java        No                    No                        No
  C           Yes                   Yes                       No
  C++         Yes                   Yes                       No
  Pascal      Yes                   No                        Yes
  Modula-3    Yes                   No                        Yes
  Scheme      Yes                   Yes                       Yes
  ML          Yes                   Yes                       Yes
266
Function Types
A new type constructor:
  (T1,T2,...,Tn) → T0
takes n arguments of types T1, T2, ..., Tn and returns a value of type T0
  unary function: T1 → T0
  nullary function: () → T0
Example:
  sort ( A: int[], order: (int,int) → boolean ) {
    for (int i = 0; i<A.size; i++)
        for (int j=i+1; j<A.size; j++)
            if (order(A[i],A[j]))
               swap A[i] and A[j];
  }
  boolean leq ( x: int, y: int ) { return x <= y; }
  boolean geq ( x: int, y: int ) { return x >= y; }
  sort(A,leq)
  sort(A,geq)
267
How can you do this in Java?
  interface Comparison {
    boolean compare ( int x, int y );
  }
  void sort ( int[] A, Comparison cmp ) {
    for (int i = 0; i<A.length; i++)
        for (int j=i+1; j<A.length; j++)
            if (cmp.compare(A[i],A[j]))
               ...
  }
  class Leq implements Comparison {
    boolean compare ( int x, int y ) { return x <= y; }
  }
  sort(A,new Leq());
268
... or better
  abstract class Comparison {
    abstract boolean compare ( int x, int y );
  }
  sort(A,new Comparison() {
          boolean compare ( int x, int y ) { return x <= y; }
        });
269
Nested Functions
Without nested scopes, a function may be represented as a pointer to its code
Functional languages (Scheme, ML, Haskell), as well as Pascal and Modula-3, support nested functions
  they can access variables of the containing lexical scopes
  plot ( f: (float) → float ) { ... }
  plotQ ( a, b, c: float ) {
    p ( x: float ) { return a*x*x + b*x + c; }
    plot(p);
  }
Nested functions may access and update free variables from containing scopes
Representing functions as pointers to code is not good any more
270
Closures
Nested functions may need to access variables in previous frames in the stack
A function value is a closure that consists of
  a pointer to code
  an environment (dictionary) for free variables
Implementation of the environment: it is simply a static link to the beginning of the frame that defined the function
  plot ( f: (float) → float ) { ... }
  plotQ ( a, b, c: float ) {
    p ( x: float ) { return a*x*x + b*x + c; }
    plot(p);
  }
(figure: the run-time stack with the frames of plotQ and plot; the closure of p pairs the code for p with a static link to plotQ's frame)
271
What about Returned Functions?
If the frame of the function that defined the returned function has been popped from the run-time stack, the static link will be a dangling pointer
  (() → int) make_counter () {
    int count = 0;
    int inc () { return count++; }
    return inc;
  }
  make_counter()() + make_counter()();
  c = make_counter();
  c()+c();
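The same example written in modern Java (post-Java-8, so the "No" entries in the earlier table no longer fully apply): the captured counter must live on the heap, here inside a one-element array, which is exactly the heap-allocated-frame solution of the next slide:

  import java.util.function.IntSupplier;

  class Counters {
    static IntSupplier makeCounter () {
      int[] count = { 0 };            // heap-allocated, like a frame field
      return () -> count[0]++;        // the closure captures the array
    }
    public static void main ( String[] args ) {
      System.out.println(makeCounter().getAsInt()
                       + makeCounter().getAsInt());  // 0+0 = 0: two fresh counters
      IntSupplier c = makeCounter();
      System.out.println(c.getAsInt() + c.getAsInt()); // 0+1 = 1: one shared counter
    }
  }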
272
Frames in Heap! Solution: heap-allocate function frames
  no need for a run-time stack
  frames of all lexically enclosing functions are reachable from a closure via static link chains
  the GC will collect unused frames
Problem: frames will make a lot of garbage look reachable
273
Escape Analysis
Local variables need to be stored in the heap only if they can escape, ie, be accessed after the defining function returns
This happens only if
  the variable is referenced from within some nested function, and
  the nested function is returned or passed to some function that might store it in a data structure
Variables that do not escape are allocated on a stack frame rather than on the heap
  no escaping variables => no heap allocation
Escape analysis must be global
  it is often approximate (a conservative analysis)
274
Functional Programming Languages
Programs consist of functions with no side-effects
Functions are first-class values
Build modular programs using function composition; no accidental coupling between components
No assignments, statements, for-loops, while-loops, etc
Supports a higher-level, declarative programming style
Automatic memory management (garbage collection)
Emphasis on types and type inference
  type inference is like type checking, but no type declarations are required: the types of variables and expressions can be inferred from context
Built-in support for lists and other recursive data types
Parametric data types and polymorphic type inference
Strict vs lazy functional programming languages
275
Lambda Calculus
The theoretical foundation of functional languages is the lambda calculus
  formalized by Church in 1941
  minimal in form
  Turing-complete
Syntax: if e1, e2, and e are expressions in the lambda calculus, so are
  Variable: v
  Application: e1 e2
  Abstraction: λv. e
Bound vs free variables
Beta reduction:
  (λv. e1) e2 → e1[e2/v]
  (e1, but with all free occurrences of v in e1 replaced by e2)
  need to be careful to avoid the variable capturing problem (name clashes)
276
Church encoding: Integers
0 = λs. λz. z
1 = λs. λz. s z
2 = λs. λz. s (s z)
6 = λs. λz. s (s (s (s (s (s z)))))
...
The parameters correspond to successor (s) and zero (z).
Simple arithmetic:
add = λn. λm. λs. λz. n s (m s z)
add 2 3 = (λn. λm. λs. λz. n s (m s z)) 2 3
        = λs. λz. 2 s (3 s z)
        = λs. λz. (λs. λz. s (s z)) s ((λs. λz. s (s (s z))) s z)
        = λs. λz. (λs. λz. s (s z)) s (s (s (s z)))
        = λs. λz. s (s (s (s (s z))))
        = 5
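This encoding can be written directly in Haskell with a rank-2 type; a small sketch (the names Church and toInt are ours):

{-# LANGUAGE RankNTypes #-}

type Church = forall a. (a -> a) -> a -> a

zero, two, three :: Church
zero  = \s z -> z
two   = \s z -> s (s z)
three = \s z -> s (s (s z))

-- add n m applies s n more times on top of m's result
add :: Church -> Church -> Church
add n m = \s z -> n s (m s z)

-- convert to an ordinary Int by instantiating a = Int
toInt :: Church -> Int
toInt n = n (+ 1) 0

-- toInt (add two three) == 5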
277
Other Types
Booleans:
true = λt. λf. t
false = λt. λf. f
if pred e1 e2 = pred e1 e2
e.g., if pred is true, then (λt. λf. t) e1 e2 = e1
Lists:
nil = λc. λn. n
[2,5,8] = λc. λn. c 2 (c 5 (c 8 n))
cons = λx. λr. λc. λn. c x (r c n)
cons 2 (cons 5 (cons 8 nil)) = … = λc. λn. c 2 (c 5 (c 8 n))
append = λr. λs. λc. λn. r c (s c n)
head = λs. s (λx. λr. x)  (but what should head nil be?)
Pairs:
pair = λx. λy. λp. p x y
first = λs. s (λx. λy. x)
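The list encoding also type-checks in Haskell with a rank-2 type; a minimal sketch (the names CList and toList are ours):

{-# LANGUAGE RankNTypes #-}

type CList a = forall b. (a -> b -> b) -> b -> b

nil :: CList a
nil = \c n -> n

cons :: a -> CList a -> CList a
cons x r = \c n -> c x (r c n)

append :: CList a -> CList a -> CList a
append r s = \c n -> r c (s c n)

-- recover an ordinary list by instantiating c = (:) and n = []
toList :: CList a -> [a]
toList l = l (:) []

-- toList (append (cons 2 nil) (cons 5 (cons 8 nil))) == [2,5,8]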
278
Reductions
A REDucible EXpression (redex) is an expression to which beta reduction applies:
an application whose operator is an abstraction is a redex
abstractions and variables by themselves are not redexes
e.g., (λx. add x x) e is reduced to add e e
Normal form = no more reductions are possible.
Reduction is confluent (has the Church-Rosser property): normal forms are unique regardless of the order of reduction.
Weak normal forms (WNF): no redexes outside of abstraction bodies.
Call by value (eager evaluation): WNF + leftmost innermost reductions.
Call by name: WNF + leftmost outermost reductions (normal order).
Call by need (lazy evaluation): call by name, but each redex is evaluated at most once; terms are represented by graphs and reductions update shared subgraphs.
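Haskell's call-by-need evaluation becomes observable when an unused argument diverges; a tiny sketch (the names constOne and main are ours):

-- Under call by need the unused diverging argument is never reduced,
-- so this program prints 1; under call by value it would loop forever.
constOne :: a -> Int
constOne _ = 1

main :: IO ()
main = print (constOne (let loop = loop in loop :: Int))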
279
Recursion
Infinite reduction: (λx. x x) (λx. x x) → (λx. x x) (λx. x x) → …
no normal form; no termination
A fixpoint combinator Y satisfies: Y f is reduced to f (Y f)
Y = λg. (λx. g (x x)) (λx. g (x x))
In practice, Y is built-in; it implements recursion:
factorial = Y (λf. λn. if (= n 0) 1 (* n (f (- n 1))))
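Haskell's laziness lets one write the analogous fixpoint combinator directly (a fix with this definition also ships in Data.Function); a short sketch:

-- fix is the call-by-need analogue of Y: fix f = f (fix f)
fix :: (a -> a) -> a
fix f = f (fix f)

factorial :: Integer -> Integer
factorial = fix (\f n -> if n == 0 then 1 else n * f (n - 1))

-- factorial 5 == 120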
280
Second-Order Polymorphic Lambda Calculus
Types are:
Type variable: v
Universal quantification: ∀v. t
Function: t1 → t2
Lambda terms are:
Variable: v
Application: e1 e2
Abstraction: λv:t. e
Type abstraction: Λv. e
Type instantiation: e[t]
Integers:
int = ∀a. (a → a) → a → a
succ = λx:int. Λa. λs:(a → a). λz:a. s (x[a] s z)
plus = λx:int. λy:int. x[int] succ y
281
Type Checking
282
Functional Languages
Functional languages = typed lambda calculus + syntactic sugar.
Functional languages support parametric (generic) data types:
data List a = Nil | Cons a (List a)
data Tree a b = Leaf a | Node b (Tree a b) (Tree a b)
Cons 1 (Cons 2 Nil)
Cons "a" (Cons "b" Nil)
Lists are built-in; in Haskell, [1,2,3] = 1:2:3:[]
Polymorphic functions:
append (Cons x r) s = Cons x (append r s)
append Nil s = s
The type of append is ∀a. List a → List a → List a.
Parametric polymorphism vs ad-hoc polymorphism (overloading).
283
Type Inference
Functional languages need type inference rather than type checking:
λv:t. e requires type checking
λv. e requires type inference (the type of v must be inferred)
Type inference is undecidable in general.
Solution: type schemes (shallow types): ∀a1. ∀a2. … ∀an. t, where t contains no other universal quantification
(∀b. b → int) → (∀b. b → int) is not shallow
When a type is missing, a fresh type variable is used.
Type checking is based on type equality; type inference is based on type unification: a type variable can be unified with any type.
Example in Haskell: let f = λx. x in (f 5, f "a")
λx. x has type ∀a. a → a
Cost of polymorphism: polymorphic values must be boxed (pointers into the heap).
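The Haskell example compiles exactly because the let-bound f is generalized to the scheme ∀a. a → a; a runnable sketch (the name example is ours):

-- f is generalized at the let, so it can be applied at Int and at String
example :: (Int, String)
example = let f = \x -> x in (f 5, f "a")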
284
Higher-Order Functions
Map a function f over every element of a list:
map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (a:s) = (f a) : (map f s)
e.g., map (\x -> x + 1) [1,2,3,4] = [2,3,4,5]
Replace all the cons constructors of a list with the function c and the nil with the value z:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr c z [] = z
foldr c z (a:s) = c a (foldr c z s)
e.g., foldr (+) 0 [1,2,3] = 6
e.g., append x y = foldr (:) y x
e.g., map f x = foldr (\a r -> (f a) : r) [] x
285
Theorems for free!
Any polymorphic function satisfies a parametricity theorem that is derived directly from its type, since a value whose type is an unbound type parameter is a black box that can only be passed around as is.
Every type corresponds to a theorem (proposition):
each type variable a is associated with a function fa
a function type means that related arguments are mapped to related results
a type constructor such as List(a) is mapped to a (map fa)
Examples:
append: ∀a. List(a) → List(a) → List(a)
for all fa: append (map fa x) (map fa y) = map fa (append x y)
flat: ∀a. List(List(a)) → List(a)
for all fa: flat (map (map fa) x) = map fa (flat x)
foldr: ∀a. ∀b. (a → b → b) → b → List(a) → b
for all fa, fb: if (fa x) * (fb y) = fb (x + y), then foldr (*) (fb x) (map fa y) = fb (foldr (+) x y)
Used in program fusion.
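As a concrete (non-proof) illustration, the free theorem for list append can be spot-checked in Haskell; the names here are ours:

-- The free theorem for (++): mapping before appending equals
-- appending before mapping (a spot check on sample data, not a proof).
prop_append :: Bool
prop_append =
  let f = (* 2) :: Int -> Int
      x = [1, 2]
      y = [3, 4]
  in (map f x ++ map f y) == map f (x ++ y)   -- True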
286
Deforestation
A common problem in functional languages: when composing functions, intermediate data structures are generated only to be consumed immediately:
foldr (+) 0 (map (+1) x) = foldr (λa. λr. a + r + 1) 0 x
Deforestation = elimination of intermediate data structures:
done using program fusion
complements lazy evaluation
Shortcut deforestation:
list producers must return a new type variable b instead of List(a)
wrap a list producer with Build: (∀b. (a → b → b) → b → b) → List(a)
[1,2,3] = Build (λc. λn. c 1 (c 2 (c 3 n)))
express all list consumers using foldr
then use the fusion law: foldr c n (Build f) = f c n
This is much like the Church encoding of lists. Used extensively in Haskell.
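A self-contained sketch of the Build/foldr law in Haskell (this build is our own definition; GHC's version lives in GHC.Exts, and the names upto and sumUpto are ours):

{-# LANGUAGE RankNTypes #-}

-- a producer that is abstract in the list constructors, wrapped by build
build :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build g = g (:) []

upto :: Int -> [Int]
upto m = build (\c n -> let go i = if i > m then n else i `c` go (i + 1)
                        in go 1)

-- by the law foldr c n (build g) = g c n, the intermediate list vanishes
sumUpto :: Int -> Int
sumUpto m = foldr (+) 0 (upto m)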