CS 153: Concepts of Compiler Design October 27 Class Meeting Department of Computer Science San Jose State University Fall 2014 Instructor: Ron Mak www.cs.sjsu.edu/~mak.

Computer Science Dept. Fall 2014: October 27 CS 153: Concepts of Compiler Design © R. Mak

Tesla Motors Headquarters Visit
• Palo Alto
• Friday afternoon, November 14
• See Piazza for details!

Review: JavaCC Compiler-Compiler
• Feed JavaCC the grammar for a source language and it will automatically generate a scanner and a parser.
  ◦ Specify the source language tokens with regular expressions → JavaCC generates a scanner for the source language.
  ◦ Specify the source language syntax rules with Extended BNF → JavaCC generates a parser for the source language.

Review: JavaCC Compiler-Compiler, cont’d
• The generated scanner and parser are written in Java.
• Note: JavaCC calls the scanner the “tokenizer”.

Review: JavaCC Regular Expressions
• Literals
• Character classes
• Character ranges
• Alternates
Callouts: Token name; Token string.

Review: JavaCC Regular Expressions, cont’d
• Negation
• Repetition
• Quantifiers

JavaCC Parser Specification
• Use JavaCC regular expressions to specify tokens.
• Use EBNF to specify JavaCC production rules.
• Phone number example from Chapter 3 of the JavaCC book.

Example phone number:
EBNF:
    ::= 0|1|2|3|4|5|6|7|8|9
    ::=
    ::= - -

JavaCC Parser Specification, cont’d

EBNF:
    ::= 0|1|2|3|4|5|6|7|8|9
    ::=
    ::= - -

JavaCC (phone.jj):
    TOKEN : {
        ){4}>
      | ){3}>
      |
    }

    void PhoneNumber() : {}
    {
        "-" "-"
    }

Callouts: Token specifications; Production rule; Java statements can go in here!; Terminal; Literal; Terminal; Nonterminal.

JavaCC Production Rule Methods
• JavaCC generates a top-down recursive-descent parser.
  ◦ Each production rule becomes a Java method of the parser class.
  ◦ You can pass parameters to the methods.

phone_method_param.jj (w/ and w/o parser debug):
    void PhoneNumber() :
    {
        StringBuffer sb = new StringBuffer();
    }
    {
        AreaCode(sb) "-" {sb.append(token.image);}
        "-" {sb.append(token.image);}
        {System.out.println("Number: " + sb.toString());}
    }

    void AreaCode(StringBuffer buf) : {}
    {
        {buf.append(token.image);}
    }

Callouts: Java statement. Syntactic action.
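To make "each production rule becomes a Java method" concrete, here is a minimal hand-written Java sketch of the general shape such a parser can take. It is not JavaCC output. The class name PhoneMethodSketch, the expect helper, the toy regex scanner, and the assumed token shapes (three digits, three digits, four digits) are all illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hand-written sketch of a recursive-descent phone number parser (all names hypothetical). */
final class PhoneMethodSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;   // index of the next unconsumed token

    PhoneMethodSketch(String input) {
        // Toy scanner: a token is a run of digits or a single hyphen.
        // Any other character is silently skipped, which a real scanner would not do.
        Matcher m = Pattern.compile("\\d+|-").matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    /** Consumes and returns the next token if it matches the regex; otherwise throws. */
    private String expect(String regex) {
        if (pos >= tokens.size() || !tokens.get(pos).matches(regex)) {
            throw new IllegalStateException("Unexpected token at index " + pos);
        }
        return tokens.get(pos++);
    }

    /** Method for the PhoneNumber rule; it owns the buffer and passes it down. */
    String phoneNumber() {
        StringBuilder sb = new StringBuilder();
        areaCode(sb);                   // parameter passed to another rule's method
        expect("-");
        sb.append(expect("\\d{3}"));    // action runs right after the token is matched
        expect("-");
        sb.append(expect("\\d{4}"));
        if (pos != tokens.size()) {
            throw new IllegalStateException("Trailing input at index " + pos);
        }
        return sb.toString();
    }

    /** Method for the AreaCode rule; appends its digits to the caller's buffer. */
    private void areaCode(StringBuilder buf) {
        buf.append(expect("\\d{3}"));
    }

    static String parse(String input) {
        return new PhoneMethodSketch(input).phoneNumber();
    }

    public static void main(String[] args) {
        System.out.println("Number: " + parse("123-456-7890"));
    }
}
```

With these assumptions, PhoneMethodSketch.parse("123-456-7890") returns "1234567890", and an input such as "123-4567" is rejected with an IllegalStateException.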

Grammar Problems
• Be very careful when specifying grammars!
• JavaCC will not be able to generate a correct parser for a faulty grammar.
• Common grammar faults include
  ◦ choice conflict
  ◦ left recursion

Choice Conflict
• Suppose we want to parse both local phone numbers and long-distance phone numbers:
    Local:
    Long-distance:
    ::= -
    ::= - -
    ::=
• Choice conflict! While attempting to parse a phone number, the parser cannot tell whether the initial “123” is the prefix of a local number or the area code of a long-distance number, since both are three digits.

phone_choice.jj

Choice Conflict Resolution: Left Factoring
• One way to resolve a choice conflict is by left factoring.
  ◦ Factor out the common head from the productions.

phone_left_factored.jj:
    void PhoneNumber() : {}
    {
        Head() "-" ( LocalNumber() | LongDistanceNumber() )
    }

    void LocalNumber() : {} { }

    void LongDistanceNumber() : {} { "-" }

    void Head() : {} { }

How does this fix the problem?
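One way to see why left factoring helps is to hand-write the factored rules in Java. The sketch below is an illustration only, not JavaCC output. It assumes a head of three digits, a local remainder of four digits, and a long-distance remainder of three digits, a hyphen, and four digits. The class LeftFactoredSketch and its helpers are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of a left-factored phone number parser (token shapes are assumptions). */
final class LeftFactoredSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    LeftFactoredSketch(String input) {
        // Toy scanner: runs of digits and single hyphens; other characters are skipped.
        Matcher m = Pattern.compile("\\d+|-").matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    /** Peeks at the next token without consuming it. */
    private boolean nextMatches(String regex) {
        return pos < tokens.size() && tokens.get(pos).matches(regex);
    }

    /** Consumes the next token if it matches; otherwise throws. */
    private void expect(String regex) {
        if (!nextMatches(regex)) {
            throw new IllegalStateException("Unexpected token at index " + pos);
        }
        pos++;
    }

    /** PhoneNumber -> Head "-" ( LocalNumber | LongDistanceNumber ) */
    String phoneNumber() {
        head();
        expect("-");
        String kind;
        if (nextMatches("\\d{4}")) {     // after the common head, one token decides
            localNumber();
            kind = "local";
        } else {
            longDistanceNumber();
            kind = "long-distance";
        }
        if (pos != tokens.size()) {
            throw new IllegalStateException("Trailing input at index " + pos);
        }
        return kind;
    }

    private void head() { expect("\\d{3}"); }

    private void localNumber() { expect("\\d{4}"); }

    private void longDistanceNumber() {
        expect("\\d{3}");
        expect("-");
        expect("\\d{4}");
    }

    static String classify(String input) {
        return new LeftFactoredSketch(input).phoneNumber();
    }

    public static void main(String[] args) {
        System.out.println(classify("123-4567"));
        System.out.println(classify("123-456-7890"));
    }
}
```

Because the shared three-digit head is consumed before the choice is made, the two alternatives begin with different tokens under these assumptions, so a single token of lookahead is enough.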

Lookahead
• A top-down parser naturally “looks ahead” one token.
  ◦ This token tells the parser which nonterminal it will parse next.
  ◦ “IF”: next parse an IF statement
  ◦ “REPEAT”: next parse a REPEAT statement
• A choice conflict occurs if a one-token lookahead is not sufficient to determine which nonterminal to parse next.
  ◦ Next parse a local number or a long-distance number?

Backtracking
• The parser cannot backtrack.
• Suppose the parser has parsed “123-”.
  ◦ It decides that’s an area code, so it must be parsing a long-distance number.
• Now it sees “4567”.
  ◦ Oops! It cannot backtrack and reparse “123-” as the prefix to a local number.

Choice Conflict Resolution: Lookahead
• Another way to resolve a choice conflict is by telling the parser to look ahead more than just one token.
• To decide between parsing a local number and a long-distance telephone number:
  ◦ One-token lookahead is insufficient: “123”
  ◦ Two-token lookahead is insufficient: “123-”
  ◦ Three-token lookahead will distinguish a local number from a long-distance number.

phone_lookahead.jj:
    void PhoneNumber() : {}
    {
        (
            LOOKAHEAD(3) LocalNumber()
          | LongDistanceNumber()
        )
    }

By looking ahead three tokens, the parser can successfully choose between LocalNumber() and LongDistanceNumber().
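The effect of a three-token lookahead can be approximated in hand-written Java by peeking at the next three tokens before committing to an alternative. This is a simplified sketch of the idea, not a description of how JavaCC implements LOOKAHEAD. The class LookaheadSketch, the lookahead helper, and the assumed token shapes (local: three digits, hyphen, four digits; long-distance: three digits, hyphen, three digits, hyphen, four digits) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of choosing an alternative by peeking three tokens ahead (names hypothetical). */
final class LookaheadSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    LookaheadSketch(String input) {
        // Toy scanner: runs of digits and single hyphens; other characters are skipped.
        Matcher m = Pattern.compile("\\d+|-").matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    /** True if the upcoming tokens match the regexes in order; consumes nothing. */
    private boolean lookahead(String... regexes) {
        if (pos + regexes.length > tokens.size()) {
            return false;
        }
        for (int i = 0; i < regexes.length; i++) {
            if (!tokens.get(pos + i).matches(regexes[i])) {
                return false;
            }
        }
        return true;
    }

    /** Consumes the next token if it matches; otherwise throws. */
    private void expect(String regex) {
        if (!lookahead(regex)) {
            throw new IllegalStateException("Unexpected token at index " + pos);
        }
        pos++;
    }

    /** PhoneNumber -> LocalNumber | LongDistanceNumber, decided by three tokens. */
    String phoneNumber() {
        String kind;
        if (lookahead("\\d{3}", "-", "\\d{4}")) {   // the third token settles the choice
            localNumber();
            kind = "local";
        } else {
            longDistanceNumber();
            kind = "long-distance";
        }
        if (pos != tokens.size()) {
            throw new IllegalStateException("Trailing input at index " + pos);
        }
        return kind;
    }

    private void localNumber() {
        expect("\\d{3}");
        expect("-");
        expect("\\d{4}");
    }

    private void longDistanceNumber() {
        expect("\\d{3}");
        expect("-");
        expect("\\d{3}");
        expect("-");
        expect("\\d{4}");
    }

    static String classify(String input) {
        return new LookaheadSketch(input).phoneNumber();
    }

    public static void main(String[] args) {
        System.out.println(classify("123-4567"));
        System.out.println(classify("123-456-7890"));
    }
}
```

Unlike the left-factored version, the grammar rules keep their original shape here. The cost is that the first three tokens are examined twice, once by the peek and once by the chosen rule.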

Lookahead
• Global lookahead
  ◦ Major performance penalty. Avoid if possible!
• Syntactic lookahead
• Semantic lookahead
• Nested lookahead
• Too convoluted! Minimize the need for these.
• Why would you design a grammar that needed these?

Lookahead
• Lookahead will slow down parsing.
• Try to design grammars that do not require more than one token of lookahead.
• For example, Pascal only requires one-token lookahead.

Left Recursion
• Suppose we want to parse very simple expressions like “1+2”, “1+2+3”, etc.
    ::= + |
    ::=
• Left recursion! The nonterminal refers to itself recursively such that the recursion will never end.
• Because the recursive reference is at the left end of the rule, no tokens are consumed.

expression_left_recursion.jj
    ::= +
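The reason left recursion never ends can be demonstrated with a naive hand-written method for a rule of the assumed form expression ::= expression "+" term | term. The sketch below is hypothetical: the rule names, the class LeftRecursionSketch, and the MAX_DEPTH guard are assumptions. The guard exists only so the demonstration stops instead of overflowing the stack.

```java
/** Sketch showing that a left-recursive rule method recurses without consuming tokens. */
final class LeftRecursionSketch {
    private static final int MAX_DEPTH = 1_000;   // arbitrary guard so the demo terminates

    private final String[] tokens;
    private int pos = 0;     // index of the next unconsumed token
    private int depth = 0;   // how many times expression() has been entered

    LeftRecursionSketch(String... tokens) {
        this.tokens = tokens;
    }

    /** Naive method for: expression ::= expression "+" term | term */
    void expression() {
        if (++depth > MAX_DEPTH) {
            throw new IllegalStateException(
                "Gave up at depth " + depth + " with pos = " + pos + " of " + tokens.length);
        }
        expression();   // the recursive call comes first, before any token is consumed
        pos++;          // would consume "+", but this line is never reached
        pos++;          // would consume the term, also never reached
    }

    /** Runs the naive parser and reports how many tokens it consumed before giving up. */
    static int tokensConsumed(String... tokens) {
        LeftRecursionSketch parser = new LeftRecursionSketch(tokens);
        try {
            parser.expression();
        } catch (IllegalStateException e) {
            // The depth guard tripped; without it the call would overflow the stack.
        }
        return parser.pos;
    }

    public static void main(String[] args) {
        System.out.println("Tokens consumed: " + tokensConsumed("1", "+", "2"));
    }
}
```

Every nested call starts at the same input position, so tokensConsumed("1", "+", "2") returns 0: the parser never makes forward progress.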

Left Recursion Resolution: Iteration
• Resolve left recursion by replacing it with iteration.

Instead of:
    ::= + |
    ::=
Use EBNF:
    ::= { + }
    ::=

expression_iteration.jj:
    void Expression() : {}
    {
        Term() ("+" Term())*
        { System.out.println("Parsed expression"); }
    }
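In a hand-written parser, the repetition Term() ("+" Term())* maps naturally onto a while loop. The following Java sketch illustrates that mapping under two assumptions that are not in the slides: a term is a single unsigned integer, and the parser returns the sum as an illustrative action. The class IterationSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of Expression -> Term ( "+" Term )* written as a loop (names hypothetical). */
final class IterationSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    IterationSketch(String input) {
        // Toy scanner: runs of digits and single plus signs; other characters are skipped.
        Matcher m = Pattern.compile("\\d+|\\+").matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    /** Parses one term, then zero or more "+ term" pairs; returns the running sum. */
    int expression() {
        int sum = term();
        while (pos < tokens.size() && tokens.get(pos).equals("+")) {
            pos++;            // consume "+"
            sum += term();    // every pass through the loop consumes two tokens
        }
        if (pos != tokens.size()) {
            throw new IllegalStateException("Trailing input at index " + pos);
        }
        return sum;
    }

    /** Assumption: a term is a single unsigned integer. */
    private int term() {
        if (pos >= tokens.size() || !tokens.get(pos).matches("\\d+")) {
            throw new IllegalStateException("Expected a number at index " + pos);
        }
        return Integer.parseInt(tokens.get(pos++));
    }

    static int evaluate(String input) {
        return new IterationSketch(input).expression();
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1+2+3"));
    }
}
```

Under these assumptions, IterationSketch.evaluate("1+2+3") returns 6, and the loop ends as soon as the next token is not a plus sign.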

Right Recursion
• Right recursion:
    ::= + |
    ::=
• Right recursion is not a problem for JavaCC.
  ◦ Because there are non-recursive references to the left of the recursive reference, tokens are consumed by the scanner.
• The parser continues to make forward progress.
• The recursion ends as soon as the parser sees a token that doesn’t fit the production rule.

Right Recursion

expression_right_recursion.jj

• However, there may be choice conflicts.
• Does a term begin the recursive alternative (a term followed by +), or is it simply a term on its own?
• How much lookahead do we need?
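As one possible illustration of the right-recursive case, the Java sketch below hand-writes a rule of the assumed form expression ::= term "+" expression | term and picks the alternative by peeking at the second upcoming token. The rule names, the class RightRecursionSketch, the integer-only terms, and the summing action are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of a right-recursive expression rule with a two-token peek (names hypothetical). */
final class RightRecursionSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    RightRecursionSketch(String input) {
        // Toy scanner: runs of digits and single plus signs; other characters are skipped.
        Matcher m = Pattern.compile("\\d+|\\+").matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
    }

    /** expression ::= term "+" expression | term */
    int expression() {
        // Both alternatives start with a term, so the first token cannot decide.
        // Peeking at the second token tells the alternatives apart.
        boolean plusFollows = pos + 1 < tokens.size() && tokens.get(pos + 1).equals("+");
        if (plusFollows) {
            int left = term();
            pos++;                        // consume "+"
            return left + expression();   // recursion happens after tokens are consumed
        }
        return term();
    }

    /** Assumption: a term is a single unsigned integer. */
    private int term() {
        if (pos >= tokens.size() || !tokens.get(pos).matches("\\d+")) {
            throw new IllegalStateException("Expected a number at index " + pos);
        }
        return Integer.parseInt(tokens.get(pos++));
    }

    static int evaluate(String input) {
        RightRecursionSketch parser = new RightRecursionSketch(input);
        int value = parser.expression();
        if (parser.pos != parser.tokens.size()) {
            throw new IllegalStateException("Trailing input at index " + parser.pos);
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1+2+3"));
    }
}
```

Each recursive call starts two tokens further into the input, so the recursion is bounded by the input length. Note that this version groups the sum to the right, as 1+(2+3), which gives the same value for addition but would matter for an operator such as subtraction.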

JJDoc
• JJDoc produces documentation for your grammar.
• Right-click in the .jj edit window.
• It generates an HTML file from a .jj grammar file.
• Read Chapter 5 of the JavaCC book.

Ideal for your project documentation!

Demo