Top-down parsing

Presentation transcript:

Top-down parsing

Last Time
CYK
– Step 1: get a grammar in Chomsky Normal Form
– Step 2: build all possible parse trees bottom-up
    Start with runs of 1 terminal
    Connect 1-terminal runs into 2-terminal runs
    Connect 1- and 2-terminal runs into 3-terminal runs
    Connect 1- and 3-terminal or 2- and 2-terminal runs into 4-terminal runs
    …
    If we can connect the entire tree, rooted at the start symbol, we've found a valid parse
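The bottom-up "connect shorter runs into longer runs" idea can be sketched as a small recognizer. This is a minimal sketch, not the lecture's code: the function name `cyk_accepts`, the rule dictionaries, and the tiny CNF grammar for nonempty balanced parentheses are all assumptions made for illustration.

```python
from itertools import product

def cyk_accepts(tokens, unit_rules, pair_rules, start):
    """Return True if `tokens` can be derived from `start`.

    unit_rules: terminal -> set of nonterminals A with A -> terminal
    pair_rules: (B, C)   -> set of nonterminals A with A -> B C
    """
    n = len(tokens)
    if n == 0:
        return False  # CNF treats the empty string as a special case
    # table[i][j] holds every nonterminal that derives tokens[i..j]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):          # runs of 1 terminal
        table[i][i] = set(unit_rules.get(tok, ()))
    for length in range(2, n + 1):            # runs of 2, 3, ... terminals
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):             # split point between two shorter runs
                for b, c in product(table[i][k], table[k + 1][j]):
                    table[i][j] |= pair_rules.get((b, c), set())
    return start in table[0][n - 1]

# Hypothetical CNF grammar: S -> L R | L X | S S,  X -> S R,  L -> (,  R -> )
UNIT_RULES = {"(": {"L"}, ")": {"R"}}
PAIR_RULES = {
    ("L", "R"): {"S"},
    ("L", "X"): {"S"},
    ("S", "S"): {"S"},
    ("S", "R"): {"X"},
}
```

For example, `cyk_accepts(list("(())"), UNIT_RULES, PAIR_RULES, "S")` fills the table run by run and accepts only if `S` covers the whole input.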

Some Interesting Properties of CYK
Very old algorithm
– Already well known in the early 70s
No problems with ambiguous grammars
– Gives a solution for all possible parse trees simultaneously

CYK Example
[Figure: a grammar in Chomsky Normal Form and the CYK table it fills in. Table cells are indexed by the span of input they cover:
    1,2  2,3  3,4  4,5  5,6
    1,3  2,4  3,5  4,6
    1,4  2,5  3,6
    1,5  2,6
    1,6]

Thinking about Language Design
Balanced considerations
– Powerful enough to be useful
– Simple enough to be parseable
Syntax need not be complex for complex behaviors
– Guy Steele's "Growing a Language"

Restricting the Grammar
By restricting our grammars we can
– Detect ambiguity
– Build linear-time, O(n) parsers
LL(1) languages
– Particularly amenable to parsing
– Parseable by predictive (top-down) parsers
    Sometimes called recursive descent

Top-Down Parsers
Start at the start symbol
"Predict" what productions to use
– Example: if the current token to be parsed is an id, there is no need to try productions that start with an integer literal
– This might seem simple, but keep in mind the multiple levels of productions that have to be used

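Predictive Parser Sketch
[Figure: the scanner turns the input into a token stream (a b a a EOF), with a "current" marker on the token being parsed. The parser reads the current token and consults a selector table (row: nonterminal, column: terminal) and a "work to do" stack.]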

Algorithm
stack.push(eof)
stack.push(Start non-term)
t = scanner.getToken()
Repeat
    if stack.top is a terminal y
        match y with t
        pop y from the stack
        t = scanner.getToken()
    if stack.top is a nonterminal X
        get table[X,t]
        pop X from the stack
        push production's RHS (each symbol from right to left)
Until one of the following:
    stack is empty: accept
    stack.top is a terminal that doesn't match t: reject
    stack.top is a non-term and parse table entry is empty: reject
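The loop above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names `ll1_accepts` and `TABLE` are made up for illustration, tokens are plain strings, and the selector table is written by hand for the small bracket grammar S → ( S ) | { S } | ε rather than computed.

```python
EOF = "eof"
EPSILON = ()  # an empty right-hand side

# Hand-written selector table for S -> ( S ) | { S } | epsilon (an assumption).
# Key: (nonterminal, current token). Value: right-hand side to push.
TABLE = {
    ("S", "("): ("(", "S", ")"),
    ("S", "{"): ("{", "S", "}"),
    ("S", ")"): EPSILON,
    ("S", "}"): EPSILON,
    ("S", EOF): EPSILON,
}

def ll1_accepts(tokens, table=TABLE, start="S"):
    """Table-driven predictive recognizer: True if `tokens` parse, else False."""
    nonterminals = {nt for nt, _ in table}
    stream = iter(list(tokens) + [EOF])
    stack = [EOF, start]              # the "work to do" stack, top at the end
    t = next(stream)
    while stack:
        top = stack.pop()
        if top in nonterminals:
            rhs = table.get((top, t))
            if rhs is None:           # empty table cell: reject
                return False
            stack.extend(reversed(rhs))  # push RHS right to left
        elif top == t:                # terminal on the stack matches the token
            t = next(stream, EOF)
        else:                         # terminal mismatch: reject
            return False
    return True                       # stack emptied after matching eof: accept
```

With this table, `ll1_accepts("({})")` accepts and `ll1_accepts("((}")` rejects when the `)` on the stack meets the `}` in the input.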

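Example
Grammar: S → ( S ) | { S } | ε
Selector table:
         (        )    {        }    eof
    S    ( S )    ε    { S }    ε    ε
Token stream: ( { } ) eof
"Work to do" stack: starts with S on top of eof; expanding S on ( pushes ( S ), and expanding S on { pushes { S }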

Example 2, bad input: You try
Grammar: S → ( S ) | { S } | ε
Selector table:
         (        )    {        }    eof
    S    ( S )    ε    { S }    ε    ε
Input: ( ( }

This Parser works great!
Given a single token, we always knew exactly what production it started
Selector table:
         (        )    {        }    eof
    S    ( S )    ε    { S }    ε    ε

Two Outstanding Issues
1. How do we know if the language is LL(1)?
– Easy to imagine a grammar where a single token is not enough to select a rule:
    S → ( S ) | { S } | ε | ( )
2. How do we build the selector table?
– It turns out that there is one answer to both: if our selector table has at most 1 production per cell, then the grammar is LL(1)
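The "one production per cell" test can be sketched as a check for crowded cells. This is a hedged illustration: `find_conflicts` and `ENTRIES` are hypothetical names, and the candidate entries for S → ( S ) | { S } | ε | ( ) are written by hand here, since the systematic way to build the table has not been covered yet.

```python
from collections import defaultdict

def find_conflicts(entries):
    """Group candidate table entries by cell and return the overfull cells.

    entries: iterable of (nonterminal, terminal, rhs) triples.
    Returns a dict mapping (nonterminal, terminal) to the list of
    competing right-hand sides, only for cells with more than one.
    """
    cells = defaultdict(list)
    for nonterminal, terminal, rhs in entries:
        cells[(nonterminal, terminal)].append(rhs)
    return {cell: rhss for cell, rhss in cells.items() if len(rhss) > 1}

# Hand-written candidate entries (an assumption) for
# S -> ( S ) | { S } | epsilon | ( )
ENTRIES = [
    ("S", "(", ("(", "S", ")")),
    ("S", "(", ("(", ")")),       # also starts with "(", so it lands in the same cell
    ("S", "{", ("{", "S", "}")),
    ("S", ")", ()),
    ("S", "}", ()),
    ("S", "eof", ()),
]

conflicts = find_conflicts(ENTRIES)
is_ll1 = not conflicts  # no cell holds two productions
```

Here the cell for row S, column ( holds two productions, so a single token of lookahead cannot choose between them and the grammar as written is not LL(1).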

LL(1) Grammar Transformations
Necessary (but not sufficient) conditions for LL(1) parsing:
– Free of left recursion
    No nonterminal loops for a production
    Why? Need to look past the list to know when to cap it
– Left factored
    No rules with a common prefix
    Why? We'd need to look past the prefix to pick a rule

Left-Recursion

Why Left Recursion is a Problem (Blackbox View)
CFG snippet: XList → XList x | x
Current token: x
Current parse tree: XList
How should we grow the tree top-down?
– XList → x: correct if there are no more xs
– XList → XList x: correct if there are more xs
We don't know which without more lookahead

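Why Left Recursion is a Problem (Whitebox View)
CFG snippet: XList → XList x | x
Current token: x
Parse table: the entry for row XList, column x selects XList → XList x
Stack: eof XList, then eof x XList, then eof x x XList, … (stack overflow)
Each step replaces the XList on top of the stack with XList x without consuming a token, so the stack only grows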
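The runaway stack can be reproduced with a few bounded steps of a table-driven parser. This is a sketch under an assumption: the table entry for row XList, column x is taken to be XList → XList x, and `trace_stack` and `LEFT_RECURSIVE` are hypothetical names.

```python
def trace_stack(table, start, token, max_steps=5):
    """Expand the top of the stack up to `max_steps` times without
    consuming `token`, recording a snapshot of the stack after each step."""
    stack = ["eof", start]            # top of the stack is the last element
    snapshots = [list(stack)]
    for _ in range(max_steps):
        top = stack[-1]
        rhs = table.get((top, token))
        if rhs is None:
            break                     # no entry for the top symbol: nothing left to expand
        stack.pop()
        stack.extend(reversed(rhs))   # push RHS right to left
        snapshots.append(list(stack))
    return snapshots

# Assumed table entry for the left-recursive rule XList -> XList x
LEFT_RECURSIVE = {("XList", "x"): ("XList", "x")}

for snapshot in trace_stack(LEFT_RECURSIVE, "XList", "x", max_steps=3):
    print(" ".join(snapshot))
```

Every snapshot ends with XList on top again and is one symbol longer than the last; only the step cap stops the loop, which is the stack overflow in miniature.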

Removing Left-Recursion
(for a single immediately left-recursive rule)
    A → A α | β
becomes
    A → β A'
    A' → α A' | ε
where β does not begin with A

Example
Original grammar:
    Exp → Exp - Factor | Factor
    Factor → intlit | ( Exp )
Pattern:
    A → A α | β   becomes   A → β A',  A' → α A' | ε
Transformed grammar:
    Exp → Factor Exp'
    Exp' → - Factor Exp' | ε
    Factor → intlit | ( Exp )

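Let's check in on the Parse Tree…
Original grammar:
    Exp → Exp - Factor | Factor
    Factor → intlit | ( Exp )
Transformed grammar:
    Exp → Factor Exp'
    Exp' → - Factor Exp' | ε
    Factor → intlit | ( Exp )
[Figure: parse trees for 2 - 3 - 4 under each grammar. The original grammar gives a left-leaning tree; the transformed grammar gives a right-leaning chain of Exp' nodes ending in ε.]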

… We'll fix that later

General Rule for Removing Immediate Left-Recursion
    A → α₁ | α₂ | … | αₙ | A β₁ | A β₂ | … | A βₘ
becomes
    A → α₁ A' | α₂ A' | … | αₙ A'
    A' → β₁ A' | β₂ A' | … | βₘ A' | ε
where no αᵢ begins with A
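The general rule translates almost directly into code. The following is a minimal sketch, assuming each alternative is a tuple of symbol names and that the fresh nonterminal can simply be named by appending a prime; `remove_immediate_left_recursion` is a hypothetical helper, not part of any library.

```python
def remove_immediate_left_recursion(nonterminal, alternatives):
    """Apply the rule to one nonterminal.

    alternatives: list of right-hand sides, each a tuple of symbols
                  (the empty tuple stands for epsilon).
    Returns a dict mapping each resulting nonterminal to its alternatives.
    """
    # Left-recursive alternatives with the leading nonterminal stripped off
    recursive_tails = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # Alternatives that do not begin with the nonterminal
    base = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive_tails:
        return {nonterminal: list(alternatives)}   # nothing to do
    fresh = nonterminal + "'"                      # assumed not to clash with existing names
    return {
        nonterminal: [alt + (fresh,) for alt in base],
        fresh: [tail + (fresh,) for tail in recursive_tails] + [()],
    }

# Exp -> Exp - Factor | Factor
result = remove_immediate_left_recursion(
    "Exp", [("Exp", "-", "Factor"), ("Factor",)]
)
```

On the Exp rule this produces Exp → Factor Exp' and Exp' → - Factor Exp' | ε. The sketch handles only immediate left recursion, which is all the rule covers.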

Left Factored Grammars
If a nonterminal has two productions whose RHSs have a common prefix, the grammar is not left factored and not LL(1)
    Exp → ( Exp ) | ( )    (not left factored)

Left Factoring
Given productions of the form
    A → α β₁ | α β₂
rewrite them as
    A → α A'
    A' → β₁ | β₂
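One left-factoring step can be sketched in the same style. This is a simplified illustration that assumes all the alternatives passed in share the prefix (the form shown above) and that the fresh nonterminal can be named by appending a prime; `common_prefix` and `left_factor` are hypothetical helpers.

```python
def common_prefix(alternatives):
    """Longest sequence of symbols that every alternative starts with."""
    prefix = []
    for symbols in zip(*alternatives):
        if len(set(symbols)) != 1:
            break
        prefix.append(symbols[0])
    return tuple(prefix)

def left_factor(nonterminal, alternatives):
    """Factor the shared prefix out of `alternatives` in a single step.

    alternatives: list of right-hand sides, each a tuple of symbols.
    Returns a dict mapping each resulting nonterminal to its alternatives.
    """
    prefix = common_prefix(alternatives)
    if len(alternatives) < 2 or not prefix:
        return {nonterminal: list(alternatives)}   # nothing to factor
    fresh = nonterminal + "'"                      # assumed not to clash with existing names
    return {
        nonterminal: [prefix + (fresh,)],
        fresh: [alt[len(prefix):] for alt in alternatives],
    }

# Exp -> ( Exp ) | ( )
factored = left_factor("Exp", [("(", "Exp", ")"), ("(", ")")])
```

Here the shared prefix is the single symbol (, giving Exp → ( Exp' and Exp' → Exp ) | ). A grammar that already uses the primed name would need a different fresh name, and alternatives that share a prefix only within a subset would need to be grouped before calling this helper.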

Combined Example
Original grammar:
    Exp → ( Exp ) | Exp Exp | ( )
Remove immediate left-recursion:
    Exp → ( Exp ) Exp' | ( ) Exp'
    Exp' → Exp Exp' | ε
Left-factoring:
    Exp → ( Exp''
    Exp'' → Exp ) Exp' | ) Exp'
    Exp' → Exp Exp' | ε

Where are we at?
We've set ourselves up for success in building the selector table
– Two things that prevent a grammar from being LL(1) were identified and avoided
    Grammars that are not left factored
    Left-recursive grammars
– Next time: build two data structures that combine to yield a selector table
    FIRST set
    FOLLOW set