1 Natural Language Processing Lecture 11 Efficient Parsing Reading: James Allen NLU (Chapter 6)

2 Human preference in Parsing
– The parsing techniques seen so far depend on a complete search
– But humans seem to parse more deterministically
– However, they may be led down a garden path: "The raft floated down the river sank"

3 Human preference in Parsing
Some of the principles that appear to be used by people to choose the correct interpretation are:
– Minimal Attachment
– Right Association
– Lexical Preference

4 Minimal Attachment

5 The M.A. Principle may cause misparsing
1. We painted all the walls with cracks
– The PP tends to attach to the VP rather than to the NP
2. The horse [that was] raced past the barn fell
– The reduced-relative analysis introduces more nodes, so "raced" is taken as the main verb; that choice is rejected only when "fell" is seen

6 Right Association (or Late Closure)
– George said that Henry left in his car
– I thought it would rain yesterday

7 Right Association (or Late Closure)

8 Lexical Preference
The M.A. and R.A. principles may conflict: The man kept the dog in the house
– R.A. suggests attaching "in the house" to the nearest constituent, the NP "the dog"
– M.A. suggests attaching "in the house" to the VP, since that introduces fewer nodes
Should M.A. be given more priority?

9 Lexical Preference
1. I wanted the dog in the house (the PP is preferentially attached to the NP)
2. I kept the dog in the house (the PP is preferentially attached to the VP)
3. I put the dog in the house (the PP must attach to the VP)
So lexical preference (L.P.) overrides M.A. and R.A.

10 Uncertainty in Shift-Reduce Parsers
– Either postpone decisions (amounting to a breadth-first search), or encode all possibilities into a parse table
– The grammar needs to be unambiguous
– No unambiguous grammar exists for natural language, but the technique can be extended
– Consider the following grammar:
  S → NP VP
  NP → ART N
  VP → AUX V NP
  VP → V NP

11 Transition Graph

12 Parse table (Oracle)

13 Shift-Reduce Parsing
A class of parsers with the following principles:
– Parsing is done bottom-up, reducing the input to the grammar's start symbol
– The parser builds a rightmost derivation of the input, in reverse
– The parsing algorithm simulates the operation of a PDA
– A prefix of the current sentential form is kept on the stack
– Two types of operation:
  – Shift: push the next input symbol onto the stack
  – Reduce: pop the RHS of a grammar rule off the stack and push the corresponding LHS non-terminal
– The parser is usually deterministic, with no backtracking
– Extremely efficient: runs in linear time, O(n)
– But such parsers can be constructed for only a limited class of CFGs
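As a rough illustration of the two operations, here is a minimal sketch of a shift-reduce recognizer for the small grammar from slide 10. It uses a greedy reduce-whenever-possible strategy and a hand-chosen rule order, which only works in simple cases; a real shift-reduce parser consults a pre-compiled parse table instead (see the following slides).

```python
# Minimal shift-reduce recognizer sketch (greedy strategy; illustrative only).
GRAMMAR = [                        # (LHS, RHS) pairs; order matters for this naive strategy
    ("S",  ("NP", "VP")),
    ("NP", ("ART", "N")),
    ("VP", ("AUX", "V", "NP")),
    ("VP", ("V", "NP")),
]

def shift_reduce(tokens):
    stack, tokens = [], list(tokens)
    while True:
        # Reduce: if the top of the stack matches some rule's RHS, replace it with the LHS
        for lhs, rhs in GRAMMAR:
            if tuple(stack[-len(rhs):]) == rhs:
                del stack[-len(rhs):]
                stack.append(lhs)
                break
        else:
            if tokens:             # Shift: no reduction applies, move the next symbol onto the stack
                stack.append(tokens.pop(0))
            else:
                break
    return stack == ["S"]          # accept iff the input was reduced to the start symbol

print(shift_reduce(["ART", "N", "V", "ART", "N"]))          # True
print(shift_reduce(["ART", "N", "AUX", "V", "ART", "N"]))   # True
```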

14 LR Parsing
General principles:
– Use sets of "dotted" grammar rules to reflect the state of the parser:
  – what constituents have we constructed so far?
  – what constituents are we predicting next?
– Pre-compile the grammar into a finite collection of sets of dotted rules
– Use these sets to capture the state of the parser during parsing
– The parser is a deterministic shift-reduce parser
– Developed by Knuth in the mid-1960s as a framework for compiling programming languages

15 LR Parsing Algorithm
– Performs shift and reduce parsing actions on the stack, changing state with each operation
– Is driven by a pre-compiled parsing table that has two parts:
  – the action table specifies the next shift or reduce parsing operation
  – the goto table specifies which state to move to after a reduction
– The stack stores a string of the form S0 X1 S1 X2 … Xm Sm, where the Si are parser states and the Xi are grammar symbols
– At each step the parser performs one of the following operations:
  – Shift s: push the current input symbol Xi onto the stack, followed by the new state s
  – Reduce i: reduce the stack according to rule i of the grammar
  – Reject: reject the input as ungrammatical and signal an error
  – Accept: accept the input as grammatical and halt
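To make the action/goto mechanism concrete, here is a sketch of the driver loop in Python. The ACTION and GOTO tables below are hand-built for a tiny bracket grammar, (1) S → ( S ) and (2) S → x, purely so the example can be checked by hand; they are not the tables for this lecture's grammar.

```python
RULES = {1: ("S", 3), 2: ("S", 1)}   # rule number -> (LHS, length of its RHS)

ACTION = {                           # (state, lookahead) -> parser action
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept",),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
GOTO = {(0, "S"): 1, (2, "S"): 4}    # (state, non-terminal) -> state to enter after a reduce

def lr_parse(tokens):
    stack = [0]                      # the state stack (grammar symbols omitted for brevity)
    tokens = list(tokens) + ["$"]
    i = 0
    while True:
        act = ACTION.get((stack[-1], tokens[i]))
        if act is None:
            return False             # Reject: no table entry means a syntax error
        if act[0] == "shift":
            stack.append(act[1])     # Shift: push the new state and consume the input symbol
            i += 1
        elif act[0] == "reduce":
            lhs, rhs_len = RULES[act[1]]
            del stack[-rhs_len:]                  # Reduce: pop one state per RHS symbol,
            stack.append(GOTO[(stack[-1], lhs)])  # then goto on the exposed state and the LHS
        else:
            return True              # Accept

print(lr_parse("((x))"))   # True
print(lr_parse("(x"))      # False
```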

16 LR Parsing - Example
The grammar:
(1) S → NP VP
(2) NP → art adj n
(3) NP → art n
(4) NP → adj n
(5) VP → aux VP
(6) VP → v NP
The original input: x = "The large can can hold the water"
POS-assigned input: x = art adj n aux v art n
Parser input: x = art adj n aux v art n $
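For reference, the example grammar and pre-tagged input can be written as data in the format used by the shift_reduce() sketch after slide 13 (this encoding is mine, not part of the slides; with this rule ordering the greedy sketch happens to find the right parse for this one sentence, whereas the LR tables on the following slides handle it systematically).

```python
GRAMMAR = [                          # the six rules above, as (LHS, RHS) pairs
    ("S",  ("NP", "VP")),            # (1)
    ("NP", ("art", "adj", "n")),     # (2)
    ("NP", ("art", "n")),            # (3)
    ("NP", ("adj", "n")),            # (4)
    ("VP", ("aux", "VP")),           # (5)
    ("VP", ("v", "NP")),             # (6)
]

TOKENS = ["art", "adj", "n", "aux", "v", "art", "n"]   # "The large can can hold the water"
# shift_reduce(TOKENS) -> True, if run after the sketch following slide 13
```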

17 Parse Table

18 The input: “x = art adj n aux v art n $”

19 Constructing an SLR Parsing Table
– An LR(0) item is a "dotted" grammar rule [A → α • β]
– We construct a deterministic FSA that recognizes prefixes of rightmost sentential forms of the grammar G; the states of the FSA are sets of LR(0) items
– We augment the grammar with a new start rule S' → S
– We define the closure operation on a set S of LR(0) items:
  1. Every item in S is also in closure(S)
  2. If [A → α • B β] ∈ closure(S) and B → γ is a rule in G, then add [B → • γ] to closure(S)
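A minimal sketch of the closure operation, representing an item [A → α • β] as a tuple (lhs, rhs, dot_position). PRODUCTIONS is the small grammar from slide 10 written as a dict; all names here are illustrative, not part of the slides.

```python
PRODUCTIONS = {                      # non-terminal -> list of right-hand sides
    "S":  [("NP", "VP")],
    "NP": [("ART", "N")],
    "VP": [("AUX", "V", "NP"), ("V", "NP")],
}

def closure(items):
    items = set(items)
    changed = True
    while changed:
        changed = False
        for (lhs, rhs, dot) in list(items):
            if dot < len(rhs) and rhs[dot] in PRODUCTIONS:   # the dot is just before a non-terminal B
                for gamma in PRODUCTIONS[rhs[dot]]:          # add [B -> . gamma] for every rule B -> gamma
                    new_item = (rhs[dot], gamma, 0)
                    if new_item not in items:
                        items.add(new_item)
                        changed = True
    return frozenset(items)

for item in closure({("S'", ("S",), 0)}):    # closure({[S' -> . S]}) for the augmented grammar
    print(item)
```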

20 Constructing an SLR Parsing Table
– We define the Goto operation for an item set S and a grammar symbol X:
  Goto(S, X) is the closure of the set of all items [A → α X • β] such that [A → α • X β] ∈ S
– Example: S0 = {[S → • NP VP]}
  Goto(S0, NP) = closure({[S → NP • VP]})
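Continuing the closure() sketch above (same item representation and PRODUCTIONS), the Goto operation can be written as:

```python
def goto_items(items, X):
    """Goto(S, X): advance the dot over X wherever X appears right after the dot,
    then take the closure of the resulting items."""
    moved = {(lhs, rhs, dot + 1)
             for (lhs, rhs, dot) in items
             if dot < len(rhs) and rhs[dot] == X}
    return closure(moved)

s0 = closure({("S'", ("S",), 0)})
print(goto_items(s0, "NP"))   # contains [S -> NP . VP] plus the items closure adds for the VP rules
```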

21 Constructing an SLR Parsing Table
– We construct the collection of sets of LR(0) items for the augmented grammar G
– We start with the item set S0 = closure({[S' → • S]})
– The algorithm: repeatedly apply Goto to each item set and each grammar symbol, adding any new item sets produced, until no new sets appear (see the sketch below)
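A sketch of the construction, continuing the closure()/goto_items() sketches above: start from S0 and keep applying Goto until no new item sets appear.

```python
def canonical_collection(start="S"):
    s0 = closure({(start + "'", (start,), 0)})   # S0 = closure({[S' -> . S]})
    states, worklist = [s0], [s0]
    while worklist:
        state = worklist.pop()
        # every grammar symbol that appears right after a dot in this item set
        for X in {rhs[dot] for (_, rhs, dot) in state if dot < len(rhs)}:
            nxt = goto_items(state, X)
            if nxt not in states:                # record Goto(state, X) if it is a new item set
                states.append(nxt)
                worklist.append(nxt)
    return states

print(len(canonical_collection("S")))   # 11 item sets for the slide-10 toy grammar above
```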

22 Constructing an SLR Parsing Table - Example

23 Constructing an SLR Parsing Table

24 The constructed FSA for the example grammar:

25 Parsing with an LR Parser
– The pointers that form the parse tree can be created while performing reduce actions
– A parse node is created for each constituent that is pushed onto the stack
– When we reduce, we create a new parse node for the LHS non-terminal and link it to the parse nodes of the popped RHS constituents
– At the end, the S constituent on the stack points to the root of the parse tree
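One way to realize this is to let the stack hold (state, node) pairs, as in this sketch, which extends the reduce case of the driver loop shown after slide 15 and reuses its RULES and GOTO names (an illustrative fragment, not a complete parser).

```python
class Node:
    """A parse-tree node, created when a reduce pops its children off the stack."""
    def __init__(self, label, children=()):
        self.label = label                 # the LHS non-terminal (or a token, for leaves)
        self.children = list(children)     # parse nodes of the popped RHS constituents

# Inside the driver loop, with the stack holding (state, node) pairs, the reduce case becomes:
#
#     lhs, rhs_len = RULES[rule_number]
#     children = [node for _state, node in stack[-rhs_len:]]   # the popped RHS constituents
#     del stack[-rhs_len:]
#     parent = Node(lhs, children)                             # link the new LHS node to them
#     stack.append((GOTO[(stack[-1][0], lhs)], parent))
#
# On accept, the node paired with the S constituent on top of the stack is the
# root of the parse tree.
```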

26 The input: “x = art adj n aux v art n $”

27 Shift-Reduce Parsers and Ambiguity
Grammar rules:
1. NP → ART N REL-PRO VP
2. NP → ART N PP
Item sets:
NP1: NP → • ART N REL-PRO VP
     NP → • ART N PP
NP2: NP → ART • N REL-PRO VP
     NP → ART • N PP
NP3: NP → ART N • REL-PRO VP
     NP → ART N • PP
     PP → • P NP
NP4: NP → ART N REL-PRO • VP
     VP → • V NP

28 Lexical Ambiguity
– Ambiguous words are pushed onto the stack by adding some extra states
– "Can" is both V and AUX, so we add state S3_4, the union of S3 and S4:
  VP → AUX • V NP
  VP → V • NP
  NP → • ART N
– The next input resolves the ambiguity:
  – if it is a V, go to S5
  – if it is an ART, go to S1
  – if it is an NP, go to S3'
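Continuing the goto_items() sketch above, the union state for a lexically ambiguous word can be computed by merging the item sets reached under each of its possible categories (S3_4 on this slide is such a union; the call shown in the comment is only indicative, with hypothetical state names).

```python
def goto_ambiguous(items, categories):
    merged = set()
    for cat in categories:           # e.g. "can" may be tagged AUX or V
        merged |= goto_items(items, cat)
    return frozenset(merged)

# e.g. s3_4 = goto_ambiguous(s_before_can, ["AUX", "V"])   # hypothetical state names
```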

29 Ambiguous Parse States
Grammar rules:
3. NP → ART N
4. NP → ART N PP
NP → ART • N
NP → ART • N PP
NP5: NP → ART N •
     NP → ART N • PP
     PP → • P NP
Now, what if the next input is a P? There is a shift/reduce conflict: shift the P (working toward rule 4) or reduce the stack by rule 3.

30 Ambiguous Parse States (Cont.)
Solutions:
1. Maintain determinism and lose some interpretations (as a human may do):
– Choose shift in shift/reduce conflicts (≈ R.A.)
– Choose the longer rule in reduce/reduce conflicts (≈ M.A.)
2. Use search again (DFS or BFS):
– DFS combined with general preference principles
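A sketch of option 1 as a conflict-resolution policy over a set of competing table entries; the action encoding and the rule lengths below (rules 3 and 4 from slide 29) are my assumptions, not part of the slides.

```python
RULE_RHS_LEN = {3: 2, 4: 3}          # 3: NP -> ART N (length 2), 4: NP -> ART N PP (length 3)

def resolve(actions):
    """Pick one action from a conflicting set: prefer shift over reduce (≈ R.A.),
    and the longer rule among reduces (≈ M.A.)."""
    shifts = [a for a in actions if a[0] == "shift"]
    if shifts:                                               # shift/reduce conflict -> shift
        return shifts[0]
    return max(actions, key=lambda a: RULE_RHS_LEN[a[1]])    # reduce/reduce -> longest RHS

print(resolve([("shift", 7), ("reduce", 3)]))    # ('shift', 7)
print(resolve([("reduce", 3), ("reduce", 4)]))   # ('reduce', 4)
```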