Mooly Sagiv and Roman Manevich School of Computer Science

Slides:



Advertisements
Similar presentations
JavaCUP JavaCUP (Construct Useful Parser) is a parser generator
Advertisements

1 Assignment 3 Jianguo Lu. 2 Task: check whether the a program is syntactically correct /** this is a comment line in the sample program **/ INT f2(INT.
Abstract Syntax Mooly Sagiv html:// 1.
1 Compiler Construction Intermediate Code Generation.
1 JavaCUP JavaCUP (Construct Useful Parser) is a parser generator Produce a parser written in java, itself is also written in Java; There are many parser.
Winter Compiler Construction T7 – semantic analysis part II type-checking Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv.
Mooly Sagiv and Roman Manevich School of Computer Science
9/27/2006Prof. Hilfinger, Lecture 141 Syntax-Directed Translation Lecture 14 (adapted from slides by R. Bodik)
Compiler Construction Abstract Syntax Tree Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.
Cse321, Programming Languages and Compilers 1 6/12/2015 Lecture #10, Feb. 14, 2007 Modified sets of item construction Rules for building LR parse tables.
Recap Mooly Sagiv. Outline Subjects Studied Questions & Answers.
Bottom-Up Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter (modified)
ML-YACC David Walker COS 320. Outline Last Week –Introduction to Lexing, CFGs, and Parsing Today: –More parsing: automatic parser generation via ML-Yacc.
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter
Context-Free Grammars Lecture 7
Compiler Construction Parsing Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University.
Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter 2.2 (Partial) Hashlama 11:00-14:00.
Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter 2.2 (Partial)
Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter (modified)
Chapter 2 A Simple Compiler
Compiler Construction Parsing II Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.
Compiler Construction Parsing I Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.
1 Material taught in lecture Scanner specification language: regular expressions Scanner generation using automata theory + extra book-keeping.
CSC 8310 Programming Languages Meeting 2 September 2/3, 2014.
Attribute Grammars They extend context-free grammars to give parameters to non-terminals, have rules to combine attributes Attributes can have any type,
Parser construction tools: YACC
Compilers: Yacc/7 1 Compiler Structures Objective – –describe yacc (actually bison) – –give simple examples of its use , Semester 1,
2.2 A Simple Syntax-Directed Translator Syntax-Directed Translation 2.4 Parsing 2.5 A Translator for Simple Expressions 2.6 Lexical Analysis.
1 Abstract Syntax Tree--motivation The parse tree –contains too much detail e.g. unnecessary terminals such as parentheses –depends heavily on the structure.
CPSC 388 – Compiler Design and Construction Parsers – Context Free Grammars.
LEX and YACC work as a team
LR Parsing Compiler Baojian Hua
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
Language Translators - Lee McCluskey LANGUAGE TRANSLATORS: WEEK 21 LECTURE: Using JavaCup to create simple interpreters
Automated Parser Generation (via CUP)CUP 1. High-level structure JFlexjavac Lexer spec Lexical analyzer text tokens.java CUPjavac Parser spec.javaParser.
Lesson 10 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
CS 614: Theory and Construction of Compilers Lecture 10 Fall 2002 Department of Computer Science University of Alabama Joel Jones.
Winter Compiler Construction T2 – Lexical Analysis (Scanning) Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv University.
Lab 3: Using ML-Yacc Zhong Zhuang
Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 8: Semantic Analysis and Symbol Tables.
10/25/20151 Programming Languages and Compilers (CS 421) Grigore Rosu 2110 SC, UIUC Slides by Elsa Gunter, based.
LANGUAGE TRANSLATORS: WEEK 14 LECTURE: REGULAR EXPRESSIONS FINITE STATE MACHINES LEXICAL ANALYSERS INTRO TO GRAMMAR THEORY TUTORIAL: CAPTURING LANGUAGES.
Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter 2.2 (Partial) 1.
CPS 506 Comparative Programming Languages Syntax Specification.
Abstract Syntax Trees Compiler Baojian Hua
Compiler Construction Lexical Analysis. 2 Administration Project Teams Project Teams Send me your group Send me your group
1 Using Yacc. 2 Introduction Grammar –CFG –Recursive Rules Shift/Reduce Parsing –See Figure 3-2. –LALR(1) –What Yacc Cannot Parse It cannot deal with.
CS 614: Theory and Construction of Compilers Lecture 9 Fall 2002 Department of Computer Science University of Alabama Joel Jones.
Compiler Principles Fall Compiler Principles Lecture 6: Parsing part 5 Roman Manevich Ben-Gurion University.
CS 614: Theory and Construction of Compilers Lecture 7 Fall 2003 Department of Computer Science University of Alabama Joel Jones.
1 JavaCUP JavaCup (Construct Useful Parser) is a parser generator; Produce a parser written in java, itself is also written in Java; There are many parser.
Compiler Principles Winter Compiler Principles Syntax Analysis (Parsing) – Part 3 Mayer Goldberg and Roman Manevich Ben-Gurion University.
Compiler Principles Fall Compiler Principles Lecture 5: Parsing part 4 Roman Manevich Ben-Gurion University.
CPSC 388 – Compiler Design and Construction Parsers – Syntax Directed Translation.
Syntax Analysis (chapter 4) SLR, LR1, LALR: Improving the parser From the previous examples: => LR0 parsing is rather weak. Cannot handle many languages.
Syntax-Directed Definitions CS375 Compilers. UT-CS. 1.
YACC Primer CS 671 January 29, CS 671 – Spring Yacc Yet Another Compiler Compiler Automatically constructs an LALR(1) parsing table from.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture Ahmed Ezzat.
Compiler Principles Fall Compiler Principles Lecture 5: Parsing part 4 Roman Manevich Ben-Gurion University.
Announcements/Reading
Chapter 3 – Describing Syntax
JavaCUP JavaCUP (Construct Useful Parser) is a parser generator
CUP: An LALR Parser Generator for Java
Java CUP.
Fall Compiler Principles Lecture 4: Parsing part 3
CPSC 388 – Compiler Design and Construction
Bison Marcin Zubrowski.
Syntax-Directed Translation
Fall Compiler Principles Lecture 4: Parsing part 3
Presentation transcript:

Mooly Sagiv and Roman Manevich School of Computer Science Winter 2007-2008 Compiler Construction T4 – Syntax Analysis (Parsing, part 2 of 2) Mooly Sagiv and Roman Manevich School of Computer Science Tel-Aviv University TODO: ensure consistency in displaying the values of instrumentation predicates for concrete heaps

javadoc Enter at src directory: javadoc IC/*.java /** The entry point for the IC compiler. */ class Compiler { /** Determines whether the tokens discovered by the scanner should be written to System.out. */ public static boolean emitTokens = false; /** Print usage information for the compiler. * @param includeSpecialOptions Should special options be printed. */ public void printUsage(boolean includeSpecialOptions) { ... } } Enter at src directory: javadoc IC/*.java Or use Ant, enter: ant javadocs

Today Today: LR(0) parsing algorithms JavaCup AST intro PA2 IC Language ic Lexical Analysis Syntax Analysis Parsing AST Symbol Table etc. Inter. Rep. (IR) Code Generation Executable code exe Today: LR(0) parsing algorithms JavaCup AST intro PA2 Missing: error recovery

High-level structure IC/Parser/Lexer.java IC.lex (Token.java) text Lexer spec .java Lexical analyzer JFlex javac IC.lex IC/Parser/Lexer.java tokens (Token.java) Parser spec .java Parser JavaCup javac IC/Parser/sym.java Parser.java LibraryParser.java IC.cup Library.cup AST

Expression calculator expr  expr + expr | expr - expr | expr * expr | expr / expr | - expr | ( expr ) | number Goals of expression calculator parser: Is 2+3+4+5 a valid expression? What is the meaning (value) of this expression?

Syntax analysis with JavaCup JavaCup – parser generator Generates an LALR(1) Parser Input: spec file Output: a syntax analyzer tokens Parser spec .java Parser JavaCup javac AST

JavaCup spec file Package and import specifications User code components Symbol (terminal and non-terminal) lists Terminals go to sym.java Types of AST nodes Precedence declarations The grammar Semantic actions to construct AST

Expression Calculator – 1st Attempt terminal Integer NUMBER; terminal PLUS, MINUS, MULT, DIV; terminal LPAREN, RPAREN; non terminal Integer expr; expr ::= expr PLUS expr | expr MINUS expr | expr MULT expr | expr DIV expr | MINUS expr | LPAREN expr RPAREN | NUMBER ; Symbol type explained later

Ambiguities a b c + * a b c * + a * b + c a b c + a b c + a + b + c

Expression Calculator – 2nd Attempt terminal Integer NUMBER; terminal PLUS,MINUS,MULT,DIV; terminal LPAREN, RPAREN; terminal UMINUS; non terminal Integer expr; precedence left PLUS, MINUS; precedence left DIV, MULT; precedence left UMINUS; expr ::= expr PLUS expr | expr MINUS expr | expr MULT expr | expr DIV expr | MINUS expr %prec UMINUS | LPAREN expr RPAREN | NUMBER ; Increasing precedence Contextual precedence

Parsing ambiguous grammars using precedence declarations Each terminal assigned with precedence By default all terminals have lowest precedence User can assign his own precedence CUP assigns each production a precedence Precedence of last terminal in production or user-specified contextual precedence On shift/reduce conflict resolve ambiguity by comparing precedence of terminal and production and decides whether to shift or reduce In case of equal precedences left/right help resolve conflicts left means reduce right means shift More information on precedence declarations in CUP’s manual

Resolving ambiguity precedence left PLUS a b c + a b c + a + b + c

Resolving ambiguity a b c + * a b c * + a * b + c precedence left PLUS precedence left MULT a b c + * a b c * + a * b + c

Resolving ambiguity MINUS expr %prec UMINUS - - - - b a a b - a - b

Resolving ambiguity terminal Integer NUMBER; terminal PLUS,MINUS,MULT,DIV; terminal LPAREN, RPAREN; terminal UMINUS; precedence left PLUS, MINUS; precedence left DIV, MULT; precedence left UMINUS; expr ::= expr PLUS expr | expr MINUS expr | expr MULT expr | expr DIV expr | MINUS expr %prec UMINUS | LPAREN expr RPAREN | NUMBER ; UMINUS never returned by scanner (used only to define precedence) Rule has precedence of UMINUS

More CUP directives precedence nonassoc NEQ start non-terminal Non-associative operators: < > == != etc. 1<2<3 identified as an error (semantic error?) start non-terminal Specifies start non-terminal other than first non-terminal Can change to test parts of grammar Getting internal representation Command line options: -dump_grammar -dump_states -dump_tables -dump

CUP API Link on the course web page to API Parser extends java_cup.runtime.lr_parser Various methods to report syntax errors, e.g., override syntax_error(Symbol cur_token)

Scanner integration Parser gets terminals from the scanner import java_cup.runtime.*; %% %cup %eofval{ return new Symbol(sym.EOF); %eofval} NUMBER=[0-9]+ <YYINITIAL>”+” { return new Symbol(sym.PLUS); } <YYINITIAL>”-” { return new Symbol(sym.MINUS); } <YYINITIAL>”*” { return new Symbol(sym.MULT); } <YYINITIAL>”/” { return new Symbol(sym.DIV); } <YYINITIAL>”(” { return new Symbol(sym.LPAREN); } <YYINITIAL>”)” { return new Symbol(sym.RPAREN); } <YYINITIAL>{NUMBER} { return new Symbol(sym.NUMBER, new Integer(yytext())); } <YYINITIAL>\n { } <YYINITIAL>. { } Generated from token declarations in .cup file Parser gets terminals from the scanner

Recap Package and import specifications and user code components Symbol (terminal and non-terminal) lists Define building-blocks of the grammar Precedence declarations May help resolve conflicts The grammar May introduce conflicts that have to be resolved

Assigning meaning So far, only validation expr ::= expr PLUS expr | expr MINUS expr | expr MULT expr | expr DIV expr | MINUS expr %prec UMINUS | LPAREN expr RPAREN | NUMBER ; So far, only validation Add Java code implementing semantic actions

Assigning meaning Symbol labels used to name variables expr ::= expr:e1 PLUS expr:e2 {: RESULT = new Integer(e1.intValue() + e2.intValue()); :} | expr:e1 MINUS expr:e2 {: RESULT = new Integer(e1.intValue() - e2.intValue()); :} | expr:e1 MULT expr:e2 {: RESULT = new Integer(e1.intValue() * e2.intValue()); :} | expr:e1 DIV expr:e2 {: RESULT = new Integer(e1.intValue() / e2.intValue()); :} | MINUS expr:e1 {: RESULT = new Integer(0 - e1.intValue(); :} %prec UMINUS | LPAREN expr:e1 RPAREN {: RESULT = e1; :} | NUMBER:n {: RESULT = n; :} ; Symbol labels used to name variables RESULT names the left-hand side symbol

Building an AST More useful representation of syntax tree Less clutter Actual level of detail depends on your design Basis for semantic analysis Later annotated with various information Type information Computed values

Parse tree vs. AST expr + expr + expr expr expr 1 + ( 2 ) + ( 3 ) 1 2

AST construction AST Nodes constructed during parsing Bottom-up parser Stored in push-down stack Bottom-up parser Grammar rules annotated with actions for AST construction When node is constructed all children available (already constructed) Node (RESULT) pushed on stack Top-down parser More complicated

AST construction expr ::= expr:e1 PLUS expr:e2 1 + (2) + (3) expr ::= expr:e1 PLUS expr:e2 {: RESULT = new plus(e1,e2); :} | LPAREN expr:e RPAREN {: RESULT = e; :} | INT_CONST:i {: RESULT = new int_const(…, i); :} expr + (2) + (3) expr + (expr) + (3) expr + (3) expr + (expr) expr plus e1 e2 expr plus e1 e2 expr expr expr expr int_const val = 1 int_const val = 2 int_const val = 3 1 + ( 2 ) + ( 3 )

Designing an AST terminal Integer NUMBER; terminal PLUS,MINUS,MULT,DIV,LPAREN,RPAREN,SEMI; terminal UMINUS; non terminal Integer expr; non terminal expr_list, expr_part; precedence left PLUS, MINUS; precedence left DIV, MULT; precedence left UMINUS; expr_list ::= expr_list expr_part | expr_part ; expr_part ::= expr:e {: System.out.println("= " + e); :} SEMI expr ::= expr PLUS expr | expr MINUS expr | expr MULT expr | expr DIV expr | MINUS expr %prec UMINUS | LPAREN expr RPAREN | NUMBER

Designing an AST Rules of thumb Remember - bottom-up Interfaces or abstract classes for non-terminals with alternatives Class for each non-terminal or group of related non-terminals with similar functionality Remember - bottom-up When constructing a node children nodes already constructed but parent not constructed yet

Alternative 2 class for each op: Designing an AST ExprProgram expr_list ::= expr_list expr_part | expr_part ; expr_part ::= expr SEMI expr ::= expr PLUS expr | expr MINUS expr | expr MULT expr | expr DIV expr | MINUS expr %prec UMINUS | LPAREN expr RPAREN | NUMBER Expr Alternative 2 class for each op: Alternative 1: op type field of Expr PlusExpr MinusExpr MultExpr DivExpr UnaryMinusExpr ValueExpr

Designing an AST terminal Integer NUMBER; non terminal Expr expr, expr_part; non terminal ExprProgram expr_list; expr_list ::= expr_list:el expr_part:ep {: RESULT = el.addExpressionPart(ep); :} | expr_part:ep {: RESULT = new ExprProgram(ep); :} ; expr_part ::= expr:e SEMI {: RESULT = e; :} expr ::= expr:e1 PLUS expr:e2 {: RESULT = new Expr(e1,e2,”PLUS”); :} | expr:e1 MINUS expr:e2 {: RESULT = new Expr(e1,e2,”MINUS”); :} | expr:e1 MULT expr:e2 {: RESULT = new Expr(e1,e2,”MULT”); :} | expr:e1 DIV expr:e2 {: RESULT = new Expr(e1,e2,”DIV”); :} | MINUS expr:e1 {: RESULT = new Expr(e1,”UMINUS”); :} %prec UNMINUS | LPAREN expr RPAREN {: RESULT = e1; :} | NUMBER:n {: RESULT = new Expr(n); :}

Designing an AST public abstract class ASTNode { // common AST nodes functionality } public class Expr extends ASTNode { private int value; private Expr left; private Expr right; private String operator; public Expr(Integer val) { value = val.intValue(); public Expr(Expr operand, String op) { this.left = operand; this.operator = op; public Expr(Expr left, Expr right, String op) { this.left = left; this.right = right;

Computing meaning Evaluate expression by AST traversal Traversal for debug printing Later – annotate AST More on AST next recitation

PA2 Write parser for IC Write parser for libic.sig Check syntax Emit either “Parsed [file] successfully!” or “Syntax error in [file]: [details]” -print-ast option Prints one AST node per line

PA2 – step 1 Optional: perform error recovery Understand IC grammar in the manual Don’t touch the keyboard before understanding spec Write a debug JavaCup spec for IC grammar A spec with “debug actions” : print-out debug messages to understand what’s going on Try “debug grammar” on a number of test cases Keep a copy of “debug grammar” spec around Optional: perform error recovery Use JavaCup error token

PA2 – step 2 Design AST class hierarchy Flesh out AST class hierarchy Don’t touch the keyboard before you understand the hierarchy Keep in mind that this is the basis for later stages Web-site contains an AST adapted with permission from Tovi Almozlino (Code requires password which I will email to you) Change CUP actions to construct AST nodes

Partial example of main import java.io.*; import IC.Lexer.Lexer; import IC.Parser.*; import IC.AST.*; public class Compiler { public static void main(String[] args) { try { FileReader txtFile = new FileReader(args[0]); Lexer scanner = new Lexer(txtFile); Parser parser = new Parser(scanner); // parser.parse() returns Symbol, we use its value ProgAST root = (ProgAST) parser.parse().value; System.out.println(“Parsed ” + args[0] + “ successfully!”); } catch (SyntaxError e) { System.out.print(“Syntax error in ” + args[0] + “: “ + e); } if (libraryFileSpecified) {... try { FileReader libicFile = new FileReader(libPath); Lexer scanner = new Lexer(libicFile); LibraryParser parser = new LibraryParser(scanner); ClassAST root = (ClassAST) parser.parse().value; System.out.println(“parsed “ + libPath + “ successfully!”); } catch (SyntaxError e) { System.out.print(“Syntax error in “ + libPath + “ “ + e); } ...

See you next week