1 February 23, 2016, Azusa, CA
Sheldon X. Liang, Ph.D., Computer Science at Azusa Pacific University
Azusa Pacific University, Azusa, CA 91702, Tel: (800)
Department of Computer Science, CS400 Compiler Construction
2 Chapter 2
A Simple Syntax-Directed Translator -- One-Pass Compiler to Generate Bytecode for the JVM
3 Overview
This chapter contains introductory material to Chapters 3 to 8 of the Dragon book
Text: Compilers -- Principles, Techniques, & Tools (2nd Ed) by Alfred V. Aho, Monica S. Lam, Ravi Sethi, and Jeffrey D. Ullman
Combined with material on the JVM to prepare for the laboratory assignments
FYI: “The Java™ Virtual Machine Specification”, 2nd edition, and class handouts
4 Building a Simple Compiler
Building a compiler involves:
– Defining the syntax of a programming language
– Developing a source code parser: for our compiler we will use predictive parsing
– Implementing syntax-directed translation to generate intermediate code: our target is the JVM abstract stack machine
– Generating Java bytecode for the JVM
– Optimizing the Java bytecode (optional)
5 The Structure of the Compiler
Character stream → Lexical analyzer → Token stream → Syntax-directed translator → Java bytecode
The parser and code generator for the translator are developed from the syntax definition (BNF grammar) and the JVM specification
6 Keep in mind the following questions
Syntax definition
– What is it?
– What is ambiguity?
– How do we remove ambiguity?
Why we need syntax
– How do we derive strings from a syntax?
– What inspires you?
What is your reflection
– Like it, why?
– Hate it, why?
7 Syntax Definition
A context-free grammar is a 4-tuple with
– A set of tokens (terminal symbols)
– A set of nonterminals
– A set of productions
– A designated start symbol
8 Example Grammars
Context-free grammar G for simple expressions, with productions P:
list → list + digit
list → list - digit
list → digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
The same productions in more compact notation:
list → [list + | -] digit
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
list → [list + | -] (0 | 1 | … | 9)
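The 4-tuple definition can be written down directly as a data structure. The following is a minimal sketch, assuming Python; the class name `Grammar`, its field names, and the `check` helper are illustrative choices, not part of the course material. It encodes the list/digit grammar and verifies that every symbol used in a production is declared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    """A context-free grammar as a 4-tuple (hypothetical representation)."""
    tokens: frozenset        # terminal symbols
    nonterminals: frozenset
    productions: tuple       # pairs (head, body); body is a tuple of symbols
    start: str               # designated start symbol

    def check(self):
        """Return True if heads are nonterminals and all body symbols are declared."""
        symbols = self.tokens | self.nonterminals
        heads_ok = all(head in self.nonterminals for head, _ in self.productions)
        bodies_ok = all(sym in symbols
                        for _, body in self.productions for sym in body)
        return self.start in self.nonterminals and heads_ok and bodies_ok


DIGITS = tuple("0123456789")

# The list/digit grammar for simple expressions.
LIST_GRAMMAR = Grammar(
    tokens=frozenset(DIGITS) | {"+", "-"},
    nonterminals=frozenset({"list", "digit"}),
    productions=(
        ("list", ("list", "+", "digit")),
        ("list", ("list", "-", "digit")),
        ("list", ("digit",)),
    ) + tuple(("digit", (d,)) for d in DIGITS),
    start="list",
)

print(LIST_GRAMMAR.check())
```

Storing each alternative as its own (head, body) pair keeps the representation close to the formal definition; the `|` notation on the slide is only shorthand for several productions with the same head.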
9 Derivation
Given a CF grammar we can determine the set of all strings (sequences of tokens) generated by the grammar using derivation
– We begin with the start symbol
– In each step, we replace one nonterminal in the current sentential form with one of the right-hand sides of a production for that nonterminal
10 Derivation for the Example Grammar
list ⇒ list + digit ⇒ list - digit + digit ⇒ digit - digit + digit ⇒ 9 - digit + digit ⇒ …
This is an example of a leftmost derivation, because we replaced the leftmost nonterminal in each step. Likewise, a rightmost derivation replaces the rightmost nonterminal in each step
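A leftmost derivation can be replayed mechanically. The sketch below assumes Python and uses hypothetical helper names (`leftmost_step`, `derive`); the final two digits, 5 and 2, are an assumption chosen to complete the example, since only the leading 9 is given in the derivation.

```python
# Productions of the list/digit grammar, keyed by nonterminal.
PRODUCTIONS = {
    "list": [["list", "+", "digit"], ["list", "-", "digit"], ["digit"]],
    "digit": [[d] for d in "0123456789"],
}


def leftmost_step(form, body):
    """Replace the leftmost nonterminal in `form` with `body`."""
    for i, sym in enumerate(form):
        if sym in PRODUCTIONS:
            if body not in PRODUCTIONS[sym]:
                raise ValueError(f"{sym} -> {' '.join(body)} is not a production")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no nonterminal left to replace")


def derive(start, bodies):
    """Return every sentential form produced by applying `bodies` in order."""
    forms = [[start]]
    for body in bodies:
        forms.append(leftmost_step(forms[-1], body))
    return forms


# Right-hand sides chosen at each step; the digits 5 and 2 are assumed.
steps = [
    ["list", "+", "digit"],
    ["list", "-", "digit"],
    ["digit"],
    ["9"],
    ["5"],
    ["2"],
]

for form in derive("list", steps):
    print(" ".join(form))
```

A rightmost derivation would differ only in scanning `form` from the right when looking for the nonterminal to replace.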
11 Parse Trees
The root of the tree is labeled by the start symbol
Each leaf of the tree is labeled by a terminal (= token) or ε
Each interior node is labeled by a nonterminal
If A → X1 X2 … Xn is a production, then node A has immediate children X1, X2, …, Xn, where each Xi is a terminal, a nonterminal, or ε (ε denotes the empty string)
12 Parse Tree for the Example Grammar
Parse tree of the example string using grammar G
[Figure: parse tree with interior nodes labeled list and digit]
The sequence of leaves is called the yield of the parse tree
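The yield of a parse tree can be computed by collecting its leaves from left to right. The following is a minimal sketch, assuming Python; the `Node` class and the `tree_yield` function are hypothetical names, and the string 9 - 5 + 2 is an assumed example for the list/digit grammar.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A parse-tree node: interior nodes have children, leaves do not."""
    label: str
    children: list = field(default_factory=list)


def tree_yield(node):
    """Collect leaf labels left to right, skipping ε (the empty string)."""
    if not node.children:
        return [] if node.label == "ε" else [node.label]
    leaves = []
    for child in node.children:
        leaves.extend(tree_yield(child))
    return leaves


def digit(d):
    """Build the subtree for the production digit -> d."""
    return Node("digit", [Node(d)])


# Tree for the assumed string 9 - 5 + 2; the left-recursive productions
# make the tree grow down its left side.
tree = Node("list", [
    Node("list", [
        Node("list", [digit("9")]),
        Node("-"),
        digit("5"),
    ]),
    Node("+"),
    digit("2"),
])

print(" ".join(tree_yield(tree)))
```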
13 Ambiguity
Consider the following context-free grammar G with productions P:
string → string + string | string - string | 0 | 1 | … | 9
This grammar is ambiguous, because more than one parse tree can represent the same string
14 Ambiguity
[Figure: parse trees rooted at string]
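Ambiguity can be observed by enumerating every parse tree for one input. The sketch below assumes Python and an illustrative input, 9-5+2, which is not taken from the slide; `parse_trees` and `evaluate` are hypothetical helper names. Trees are represented as nested tuples (left, operator, right).

```python
def parse_trees(tokens):
    """Return every parse tree for `tokens` under
    string -> string + string | string - string | digit."""
    if len(tokens) == 1 and tokens[0].isdigit():
        return [tokens[0]]
    trees = []
    # Try each operator as the root of the tree.
    for i in range(1, len(tokens) - 1):
        if tokens[i] in ("+", "-"):
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((left, tokens[i], right))
    return trees


def evaluate(tree):
    """Compute the value a tree denotes."""
    if isinstance(tree, str):
        return int(tree)
    left, op, right = tree
    if op == "+":
        return evaluate(left) + evaluate(right)
    return evaluate(left) - evaluate(right)


trees = parse_trees(list("9-5+2"))
for tree in trees:
    print(tree, "=", evaluate(tree))
```

For this input the enumeration finds two trees, one grouping as 9-(5+2) and one as (9-5)+2, and they evaluate to different values. That difference is why an ambiguous grammar is a problem for translation, not just for parsing.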
15 Associativity of Operators
Left-associative operators have left-recursive productions
left → left + term | term
String a+b+c has the same meaning as (a+b)+c
Right-associative operators have right-recursive productions
right → term = right | term
String a=b=c has the same meaning as a=(b=c)
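The effect of the two recursion styles on grouping can be made visible by building nested tuples. This is a rough sketch, assuming Python and single-character terms; `parse_left` and `parse_right` are hypothetical names. The left-recursive production is implemented with a loop, because a direct recursive call on the leftmost symbol would never terminate.

```python
def parse_left(tokens):
    """left -> left + term | term, built with a loop so the tree leans left."""
    if not tokens or len(tokens) % 2 == 0:
        raise SyntaxError("expected term (+ term)*")
    tree = tokens[0]
    for i in range(1, len(tokens), 2):
        if tokens[i] != "+":
            raise SyntaxError(f"expected '+', got {tokens[i]!r}")
        tree = (tree, "+", tokens[i + 1])
    return tree


def parse_right(tokens):
    """right -> term = right | term, built recursively so the tree leans right."""
    if len(tokens) == 1:
        return tokens[0]
    if len(tokens) < 3 or tokens[1] != "=":
        raise SyntaxError("expected term = right")
    return (tokens[0], "=", parse_right(tokens[2:]))


print(parse_left(list("a+b+c")))   # groups as (a+b)+c
print(parse_right(list("a=b=c")))  # groups as a=(b=c)
```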
16 Precedence of Operators
Operators with higher precedence “bind more tightly”
expr → expr + term | term
term → term * factor | factor
factor → number | ( expr )
String 2+3*5 has the same meaning as 2+(3*5)
[Figure: parse tree for 2+3*5]
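One way to see the precedence levels at work is a small recursive-descent evaluator with one method per nonterminal. The following is a sketch under assumptions: it is written in Python, the names `tokenize`, `Parser`, and `evaluate` are hypothetical, and the left-recursive productions are replaced by loops, which is a common way to make them suitable for predictive parsing.

```python
import re


def tokenize(text):
    """Split input into numbers and the symbols + * ( )."""
    return re.findall(r"\d+|[+*()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr -> expr + term | term
        value = self.term()
        while self.peek() == "+":
            self.advance()
            value += self.term()
        return value

    def term(self):
        # term -> term * factor | factor
        value = self.factor()
        while self.peek() == "*":
            self.advance()
            value *= self.factor()
        return value

    def factor(self):
        # factor -> number | ( expr )
        tok = self.advance()
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is not None and tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return value


print(evaluate("2+3*5"))    # 17
print(evaluate("(2+3)*5"))  # 25
```

Because `term` is called from inside `expr`, every multiplication is finished before the surrounding addition continues, which is exactly the "binds more tightly" behavior the grammar encodes.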
17 Syntax of Statements
stmt → id := expr (assignment)
 | if expr then stmt (selective)
 | if expr then stmt else stmt (selective)
 | while expr do stmt (iterative)
 | begin opt_stmts end (block)
opt_stmts → stmt ; opt_stmts
 | ε
18 Got it? Revisit the following questions
Syntax definition
– What is it?
– What is ambiguity?
– How do we remove ambiguity?
Why we need syntax
– How do we derive strings from a syntax?
– What inspires you?
What is your reflection
– Like it, why?
– Hate it, why?
19 Syntax Definition
Thank you very much! Questions?