COMP261 Lecture 19 Parsing 4 of 4.

Slides:



Advertisements
Similar presentations
Feature Structures and Parsing Unification Grammars Algorithms for NLP 18 November 2014.
Advertisements

Compiler construction in4020 – lecture 4 Koen Langendoen Delft University of Technology The Netherlands.
Printf transformation in TXL Mariano Ceccato ITC-Irst Istituto per la ricerca Scientifica e Tecnologica
Parsing & Scanning Lecture 2 COMP /25/2004 Derek Ruths Office: DH Rm #3010.
1 JavaCUP JavaCUP (Construct Useful Parser) is a parser generator Produce a parser written in java, itself is also written in Java; There are many parser.
CS252: Systems Programming Ninghui Li Topic 4: Regular Expressions and Lexical Analysis.
Pushdown Automata Consists of –Pushdown stack (can have terminals and nonterminals) –Finite state automaton control Can do one of three actions (based.
Recursive Data Types Koen Lindström Claessen. Modelling Arithmetic Expressions Imagine a program to help school-children learn arithmetic, which presents.
Honors Compilers An Introduction to Grammars Feb 12th 2002.
Cse321, Programming Languages and Compilers 1 6/12/2015 Lecture #10, Feb. 14, 2007 Modified sets of item construction Rules for building LR parse tables.
ML-YACC David Walker COS 320. Outline Last Week –Introduction to Lexing, CFGs, and Parsing Today: –More parsing: automatic parser generation via ML-Yacc.
Context-Free Grammars Lecture 7
Tutorial 1 Scanner & Parser
Programming Languages An Introduction to Grammars Oct 18th 2002.
CS252 Lab 2 Prepared by El Kindi Rezig. Notes Check out new version of the “official” fiz interpreter at
Parsing arithmetic expressions Reading material: These notes and an implementation (see course web page). The best way to prepare [to be a programmer]
Syntax and Backus Naur Form
CS 280 Data Structures Professor John Peterson. How Does Parsing Work? You need to know where to start (“statement”) This grammar is constructed so that.
COMP Parsing 2 of 4 Lecture 22. How do we write programs to do this? The process of getting from the input string to the parse tree consists of.
Using CookCC.  Use *.l and *.y files.  Proprietary file format  Poor IDE support  Do not work well for some languages.
Shell Programming. Creating Shell Scripts: Some Basic Principles A script name is arbitrary. Choose names that make it easy to quickly identify file function.
Interpretation Environments and Evaluation. CS 354 Spring Translation Stages Lexical analysis (scanning) Parsing –Recognizing –Building parse tree.
Upgrading to SQL Server 2000 Kashef Mughal. Multiple Versions SQL Server 2000 supports multiple versions of SQL Server on the same machine It does that.
COMP Parsing 3 of 4 Lectures 23. Using the Scanner Break input into tokens Use Scanner with delimiter: public void parse(String input ) { Scanner.
© M. Winter COSC 4P41 – Functional Programming Programming with actions Why is I/O an issue? I/O is a kind of side-effect. Example: Suppose there.
Iteration Statements CGS 3460, Lecture 17 Feb 17, 2006 Hen-I Yang.
Comp 311 Principles of Programming Languages Lecture 3 Parsing Corky Cartwright August 28, 2009.
1 Compiler Design (40-414)  Main Text Book: Compilers: Principles, Techniques & Tools, 2 nd ed., Aho, Lam, Sethi, and Ullman, 2007  Evaluation:  Midterm.
1 Parsers and Grammar. 2 Categories of Grammar Rules  Declarations or definitions. AttributeDeclaration ::= [ final ] [ static ] [ access ] datatype.
Bernd Fischer COMP2010: Compiler Engineering Abstract Syntax Trees.
Functions CSE 1310 – Introduction to Computers and Programming Vassilis Athitsos University of Texas at Arlington 1.
Syntax Analysis Or Parsing. A.K.A. Syntax Analysis –Recognize sentences in a language. –Discover the structure of a document/program. –Construct (implicitly.
Comp 411 Principles of Programming Languages Lecture 3 Parsing
Announcements/Reading
Parsing 2 of 4: Scanner and Parsing
ASTs, Grammars, Parsing, Tree traversals
Compiler Design (40-414) Main Text Book:
Tail Recursion.
Koen Lindström Claessen
COMP261 Lecture 18 Parsing 3 of 4.
Koen Lindström Claessen
Tagged Data Tag: a symbol in a data structure that identifies its type
Week 3 - Friday CS221.
Programming Abstractions
Context-free Languages
Formal Language Theory
CMPE 152: Compiler Design April 5 Class Meeting
CS 3304 Comparative Languages
Compiler Design 4. Language Grammars
Building Java Programs
CSE 143 Lecture 9 References and Linked Nodes reading: 16.1
Parsing & Scanning Lecture 2
Building Java Programs
CSE 143 Lecture 6 References and Linked Nodes reading: 16.1
slides created by Marty Stepp
Syntax-Directed Translation
Programming Abstractions
Interpreter Pattern.
slides adapted from Marty Stepp and Hélène Martin
Building Java Programs
COMPUTER 2430 Object Oriented Programming and Data Structures I
Building Java Programs
Learning VB2005 language (basics)
Building Java Programs
The Recursive Descent Algorithm
Lecture 18 Bottom-Up Parsing or Shift-Reduce Parsing
CSE 143 Lecture 7 References and Linked Nodes reading: 16.1
slides adapted from Marty Stepp
CMPE 152: Compiler Design March 28 Class Meeting
slides created by Marty Stepp
Presentation transcript:

COMP261 Lecture 19 Parsing 4 of 4

Admin: Test results See your test results online. Collect your test paper from School office.

A Better parser: using patterns Give names to patterns to make program easier to understand and to modify Precompile the patterns for efficiency: private Pattern numPat = Pattern.compile( "[-+]?(\\d+([.]\\d*)?|[.]\\d+)"); private Pattern addPat = Pattern.compile("add"); private Pattern subPat = Pattern.compile("sub"); private Pattern mulPat = Pattern.compile("mul"); private Pattern divPat = Pattern.compile("div"); private Pattern opPat = Pattern.compile("add|sub|mul|div"); private Pattern openPat = Pattern.compile("\\("); private Pattern commaPat = Pattern.compile(","); private Pattern closePat = Pattern.compile("\\)");

A Better parser: using patterns public Node parseExpr(Scanner s) { Node n; if (!s.hasNext()) { fail("Empty expr",s); } if (s.hasNext(numPat)) { return parseNumber(s); } if (s.hasNext(addPat)) { return parseAdd(s); } if (s.hasNext(subPat)) { return parseSub(s); } if (s.hasNext(mulPat)) { return parseMul(s); } if (s.hasNext(divPat)) { return parseDiv(s); } fail("Unknown expr",s); return null; }

A Better parser: multiple arguments Allow add(1,2,3), etc. public Node parseAdd(Scanner s) { List<Node> args = new ArrayList<Node>(); require(addPat, "Expecting add", s); require(openPat, "Missing '('", s); args.add(parseExpr(s)); do { require (commaPat, "Missing ','", s); } while (!s.hasNext(closePat)); require(closePat, "Missing ')'", s); return new AddNode(args); } (need new version of require, taking a Pattern instead of a String)

Multiple arguments: Printing AST NumberNode: public String toString(){ return String.format("%.5f", value); } AddNode: public String toString(){ String ans = "(" + first; for (Node nd : rest){ ans += " + "+ nd; } return ans + ")"; SubNode: public String toString(){ for (Node nd : rest){ ans += " - "+ nd; }

Examples Expr: add(10.5 ,-8) Print  (10.5 + -8.0) Value  2.500 Expr: add(sub(10.5 ,-8), mul(div(45, 5), 6.8)) Print  ((10.5 – –8.0) + ((45.0 / 5.0) * 6.8)) Value  79.700 Expr: add(14.0, sub(mul(div (1.0, 28), 17), mul(3, div(5, sub(7, 5))))) Print  (14.0 + (((1.0 / 28.0) * 17.0) – (3.0 * (5.0 / (7.0 – 5.0))))) Value  7.107

Less Restricted Grammars This is enough for most of the assignment: method for each Node type peek at next token to determine which branch to follow build and return node throw error (including helpful message) when parsing breaks use require(…) to wrap up "check then consume/return or fail" adjust grammar to make it cleaner What happens when our grammar is not quite so helpful? For example: E ::= number | E “+” E | E “–” E | E “∗” E | E “/” E What are the problems, and how can you fix them?

Ambiguous Grammars Grammar: E ::= number | E “+” E | E “–” E | E “∗” E | E “/” E Text: 65 * 74 – 68 + 25 * 5 / 3 + 16 E E E E E E E

Possible Parses Grammar: E ::= number | E “+” E | E “–” E | E “∗” E | E “/” E 65 * 74 – 68 + 25 * 5 / 3 + 16 E E E E E E E

Possible Parses Grammar: E ::= number | E “+” E | E “–” E | E “∗” E | E “/” E 65 * 74 – 68 + 25 * 5 / 3 + 16 E E E E E E E

Ambiguous Grammars If a grammar allows multiple parses then we need to specify which we want (if it makes a difference) For example: EXPR ::= TERM | TERM “+” EXPR | TERM “–” EXPR TERM ::= number | number “∗” TERM | number “/” TERM 65 * 74 – 68 + 25 * 5 / 3 + 16 E E E T E T T T T T T

Telling which option to follow EXPR ::= TERM | TERM “+” EXPR | TERM “–” EXPR TERM ::= number | number “∗” TERM | number “/” TERM Break into subrules, collecting the shared elements: EXPR ::= TERM RESTOFEXPR RESTOFEXPR ::= “+” EXPR | “–” EXPR | ∈ TERM ::= number RESTOFTERM RESTOFTERM ::= “∗” TERM | “/” TERM | ∈ ( ∈ means “empty string” ) Transformations such as these can often turn a problematic grammar into a tractable grammar

A more practical approach Instead of E ::= number | E “+” E | E “–” E | E “∗” E | E “/” E Write: E ::= number [ Op number ]* Op := “+” | “–” | “∗” | “/” And the parser as: parseNum(s); while (!s.hasNext(opPat)) { s.next(); }

A more practical approach What about operator precedence: * before +, etc “ Grammar: E ::= T [ (“+”|”-”) T]* Expression T ::= F [ (“*”|”/”) F]* Term F ::= number | “(“ E “)” Factor Parser: public parseE(s) { parseT; while (!s.hasNext(addOpPat)) { // + or - s.next(); parseT(s); }

Next week: String searching How can you find an occurrence (all occurrences) of a string s in a text t? What is the cost? Can you make it faster??