CSA2050 Introduction to Computational Linguistics


CSA2050 Introduction to Computational Linguistics Parsing II

Problems with Recursive Descent Parsing
- Left Recursion
- Inefficiency
- Repeated Work

apr 2008 CSA2050 - Parsing II

Left Recursion

A grammar is left recursive if it contains at least one non-terminal A for which A ⇒* Aα, for some string of symbols α (n.b. ⇒* is the reflexive transitive closure of the derives relation ⇒). Intuitive idea: a derivation of that category includes itself along its leftmost branch.

NP → NP PP
NP → NP and NP
NP → DetP Nominal
DetP → NP 's

Left Recursion

Left recursion can lead a top-down (recursive descent) parser into an infinite loop. [NLTK demo]

Dealing with Left Recursion

- Use a different parsing strategy
- Reformulate the grammar to eliminate left recursion:

A → Aβ | α

is rewritten as

A → α A'
A' → β A' | ε
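The rewrite step can be sketched in Python. This is only a sketch under assumed representations (productions as tuples of symbols, `A'` as the name of the new nonterminal, ε as the empty tuple); it is not code from the lecture:

```python
def eliminate_left_recursion(nonterminal, productions):
    """Remove immediate left recursion for one nonterminal.

    Splits A -> A beta | alpha into A -> alpha A' and
    A' -> beta A' | epsilon (epsilon = empty tuple).
    """
    prime = nonterminal + "'"
    # betas: tails of the left-recursive productions A -> A beta
    recursive = [p[1:] for p in productions if p and p[0] == nonterminal]
    # alphas: the non-recursive productions
    other = [p for p in productions if not (p and p[0] == nonterminal)]
    if not recursive:
        return {nonterminal: productions}
    return {
        nonterminal: [tuple(alpha) + (prime,) for alpha in other],
        prime: [tuple(beta) + (prime,) for beta in recursive] + [()],
    }

# The slide's example: NP -> NP 'and' NP | D N | D N PP
new_rules = eliminate_left_recursion(
    "NP",
    [("NP", "'and'", "NP"), ("D", "N"), ("D", "N", "PP")],
)
```

Applied to the NP grammar, this yields NP → D N NP' | D N PP NP' and NP' → 'and' NP NP' | ε, matching the transformation scheme above.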

Rewriting the Grammar

Old grammar:
NP → NP 'and' NP      (left-recursive part; 'and' NP is β)
NP → D N | D N PP     (non-recursive part; each alternative is an α)

New grammar:
NP → α NP1
NP1 → β NP1 | ε
α → D N | D N PP
β → 'and' NP

New Parse Tree

Parse of "the cat" under the new grammar:
[NP [α [D the] [N cat]] [NP1 ε]]

Rewriting the Grammar
- Different parse tree
- Unnatural parse tree?

Problems with Recursive Descent Parsing
- Left Recursion
- Inefficiency
- Repeated Work

Inefficiency

The top-down strategy uses the grammar to predict the input. Recursive descent cannot confirm a structure until it looks at the input. Consequently, it wastes a lot of time building structures that may be inconsistent with the input.

Prediction can be inefficient

N → apple
N → ant
N → alloy
...
N → zebra
N → zoo

To recognise "the zoo" as [NP [D the] [N zoo]], a predictive parser may try every N rule in turn before reaching N → zoo.

Prediction can be inefficient

1. VP → V NP
2. VP → V NP PP

Input: "saw the man with the dog"

The first NP constituent is built by the first rule, which fails. The same constituents are rebuilt when the parser backtracks to the second rule.

Problems with Recursive Descent Parsing
- Left Recursion
- Inefficiency
- Repeated Work

Repeated Parsing of Subtrees

While backtracking, the same subtrees are rebuilt again and again:
- a flight (parsed 4 times)
- from Indianapolis (3 times)
- to Houston (2 times)
- on TWA (1 time)

a flight from Indianapolis
a flight from Indianapolis to Houston
a flight from Indianapolis to Houston on TWA

Bottom Up Shift/Reduce Algorithm

Two data structures:
- input string
- stack

Repeat until input is exhausted:
- Shift a word onto the stack
- Reduce the stack using grammar and lexicon until no further reductions are possible

Unlike top-down parsing, the algorithm does not require a target category to be specified in advance. It simply finds all possible trees.
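The loop above can be sketched as a greedy, non-backtracking shift-reduce parser. This is a minimal sketch, not the lecture's implementation; the encoding of the grammar as (lhs, rhs-tuple) pairs and of the lexicon as a word-to-tag dict is assumed:

```python
def shift_reduce_parse(words, grammar, lexicon):
    """Greedy shift-reduce recogniser: shift one word (tagging it via
    the lexicon), then reduce while the top of the stack matches the
    RHS of some rule. Returns the final stack of categories."""
    stack = []
    for word in words:
        stack.append(lexicon[word])          # shift (word -> its category)
        reduced = True
        while reduced:                       # reduce until no rule applies
            reduced = False
            for lhs, rhs in grammar:
                n = len(rhs)
                if tuple(stack[-n:]) == rhs:
                    stack[len(stack) - n:] = [lhs]   # replace RHS by LHS
                    reduced = True
                    break
    return stack

grammar = [("np", ("d", "n")), ("vp", ("v",)), ("s", ("np", "vp"))]
lexicon = {"the": "d", "dog": "n", "barked": "v"}
print(shift_reduce_parse("the dog barked".split(), grammar, lexicon))  # ['s']
```

Because it never backtracks, this sketch exhibits exactly the conflict behaviour discussed below: a wrong greedy reduce can leave a grammatical sentence unparsed.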

Shift/Reduce Operation

Step  Action    Stack       Input
0     (start)               the dog barked
1     shift     the         dog barked
2     reduce    d           dog barked
3     shift     d dog       barked
4     reduce    d n         barked
5     reduce    np          barked
6     shift     np barked
7     reduce    np v
8     reduce    np vp
9     reduce    s

>>> from nltk.draw.srparser import demo
>>> demo()

Shift Reduce Parser

- Standard implementations (e.g. NLTK's) do not perform backtracking
- Only one result is returned even when the sentence is ambiguous
- May fail even when the sentence is grammatical, because of:
  - Shift/Reduce conflicts
  - Reduce/Reduce conflicts

Handling Conflicts

Shift-reduce parsers may employ policies for resolving such conflicts, e.g.
- For Shift/Reduce conflicts: prefer shift, or prefer reduce
- For Reduce/Reduce conflicts: choose the reduction which removes most elements from the stack

Top Down vs Bottom Up

Top down
- For: never wastes time exploring trees that cannot be derived from S
- Against: can generate trees that are not consistent with the input

Bottom up
- For: never wastes time building trees that cannot lead to input text segments
- Against: can generate subtrees that can never lead to an S node

Top Down Parsing - Remarks

- Top-down parsers do well if there is useful grammar-driven control: search can be directed by the grammar
  - Not too many different rules for the same category
  - Not too much distance between non-terminal and terminal categories
- Top-down is unsuitable for rewriting parts of speech (preterminals) with words (terminals). In practice that is always done bottom-up, as lexical lookup.

Bottom Up Parsing - Remarks

- It is data-directed: it attempts to parse the words that are there
- Does well, e.g., for lexical lookup
- Does badly if there are many rules with similar RHS categories
- Inefficient when there is great lexical ambiguity (grammar-driven control might help here)
- Empty categories: termination problem unless rewriting of empty constituents is somehow restricted (but then the parser is generally incomplete)

Left Corner Parsing

S → NP VP
NP → D N
NP → D N PP
NP → PN
… more rules
D → the
D → a
PN → John

"John saw the dog"

There are three NP rules. If you were parsing top down, which NP rule will be used first? Is this the best?

Left Corner Parsing

We know that the parser has to expand NP in such a way that NP derives "John". There is only one rule which does this. The basic idea behind the Left Corner parser is to use the input to determine which rule is most relevant.

Bottom Up Filtering

We know the current input word must serve as the first word in the derivation of the unexpanded node the parser is currently processing. Therefore the parser should not consider any grammar rule for which the current word cannot serve as the "left corner".

Left Corner

The node marked Verb is a left corner of VP.

Left Corner Definition

- X is a direct left corner of a nonterminal A if there is an A-production with X as the left-most symbol on the right-hand side.
- The left-corner relation is the reflexive transitive closure of the direct-left-corner relation.
- The proper-left-corner relation is the transitive closure of the direct-left-corner relation.
- The proper left corners of all non-terminal categories can be determined in advance and placed in a table.

DCG-style Grammar/Lexicon

s --> np, vp.
s --> aux, np, vp.
s --> vp.
np --> det, nom.
nom --> noun.
nom --> noun, nom.
nom --> nom, pp.
pp --> prep, np.
np --> pn.
vp --> v.
vp --> v, np.

What are the left corners of s? What are the proper left corners of s?

DCG-style Grammar/Lexicon

What are the left corners of s?
s, np, aux, vp, det, pn, v (the left-corner relation is reflexive, so s is a left corner of itself)

What are the proper left corners of s?
np, aux, vp, det, pn, v

(Note that noun is a proper left corner of nom, but not of s: nom never appears leftmost in an np rule of this grammar.)

Example of Left Corner Table

Category   Proper Left Corners
s          np, aux, vp, det, pn, v
np         det, pn
nom        nom, noun
vp         v
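A table like this can be computed mechanically as the transitive closure of the direct-left-corner relation. A sketch (the encoding of the grammar as (lhs, rhs-tuple) pairs is an assumption, not the lecture's notation):

```python
def proper_left_corners(grammar):
    """For each nonterminal, compute the transitive closure of the
    direct-left-corner relation (the 'proper left corners')."""
    direct = {}
    for lhs, rhs in grammar:
        direct.setdefault(lhs, set()).add(rhs[0])  # leftmost RHS symbol
    table = {}
    for cat in direct:
        closure, frontier = set(), {cat}
        while frontier:                 # breadth-first closure
            nxt = set()
            for c in frontier:
                for d in direct.get(c, ()):
                    if d not in closure:
                        closure.add(d)
                        nxt.add(d)
            frontier = nxt
        table[cat] = closure
    return table

# The DCG grammar from the previous slides
grammar = [
    ("s", ("np", "vp")), ("s", ("aux", "np", "vp")), ("s", ("vp",)),
    ("np", ("det", "nom")), ("np", ("pn",)),
    ("nom", ("noun",)), ("nom", ("noun", "nom")), ("nom", ("nom", "pp")),
    ("pp", ("prep", "np")),
    ("vp", ("v",)), ("vp", ("v", "np")),
]
table = proper_left_corners(grammar)
```

Note that nom comes out as a proper left corner of itself, via the left-recursive rule nom --> nom, pp.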

How to use the Left Corner Table

If attempting to parse category A, only consider rules A → Bα for which category(current input) ∈ LeftCorners(B).

s → np vp
s → aux np vp
s → vp

Left Corner Parsing Algorithm

Key idea: accept a word, identify the constituent it marks the beginning of, and parse the rest of the constituent top down.

Main advantages:
- Like a bottom-up parser, it can handle left recursion without looping, since it starts each constituent by accepting a word from the input string.
- Like a top-down parser, it is always expecting a particular category, for which only a few of the grammar rules are relevant. It is therefore more efficient than a plain shift-reduce algorithm.

Left Corner Algorithm

define parse(C)    // parse a constituent of type C
{ W = readnextword()
  K = category(W)
  complete(K, C) }

define complete(K, C)
{ if K = C, exit with success
  else foreach rule (CC -> K α) ∈ Grammar
    { parselist(α); complete(CC, C) } }

define parselist(L)
{ if empty(L) succeed
  else { parse(head(L)); parselist(tail(L)) } }
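Rendered as runnable Python, the pseudocode looks roughly like this. It is a deterministic sketch under assumed encodings (trees and rules as tuples): there is no backtracking, so it assumes at most one applicable rule at each step, which holds for the Vincent example below:

```python
def left_corner_parse(words, target, grammar, lexicon):
    """Left-corner parse: accept a word, find what constituent it
    begins, and parse the rest of that constituent top down."""
    tokens = list(words)

    def parse(goal):                      # parse(C) in the pseudocode
        word = tokens.pop(0)              # W = readnextword()
        tree = (lexicon[word], word)      # K = category(W)
        return complete(tree, goal)

    def complete(tree, goal):             # complete(K, C)
        cat = tree[0]
        if cat == goal:                   # if K = C, success
            return tree
        for lhs, rhs in grammar:
            if rhs[0] == cat:             # tree is this rule's left corner
                children = [tree] + [parse(sym) for sym in rhs[1:]]
                return complete((lhs, *children), goal)
        raise ValueError(f"cannot complete {cat} toward {goal}")

    result = parse(target)
    if tokens:
        raise ValueError("input not exhausted")
    return result

grammar = [("s", ("np", "vp")), ("np", ("d", "n")),
           ("np", ("pn",)), ("vp", ("iv",))]
lexicon = {"the": "d", "robber": "n", "vincent": "pn", "slept": "iv"}
tree = left_corner_parse("vincent slept".split(), "s", grammar, lexicon)
# tree == ('s', ('np', ('pn', 'vincent')), ('vp', ('iv', 'slept')))
```

Note how the grammar is left-recursion-safe here by construction of the strategy: each constituent begins by consuming a word, so the parser cannot loop the way recursive descent does.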

Left Corner Example

Input: "Vincent slept"; parse s (top down)

s → np vp
np → d n
np → pn
vp → iv
d → the
n → robber
pn → vincent
iv → slept

Left Corner Example

Category of next input word "Vincent" = pn (bottom up)

Left Corner Example

Select the rule with direct left corner = pn: np → pn. Parse the remainder of the rhs (nothing remains).

Tree so far: [np [pn Vincent]] slept

Left Corner Example

Select the rule with direct left corner = np: s → np vp. Parse the remainder of the rhs = [vp].

Left Corner Example

Category of next input word "slept" = iv (bottom up)

Left Corner Example

Select the rule with direct left corner = iv: vp → iv. Nothing remains on the rhs, so vp is complete.

Tree so far: [np [pn Vincent]] [vp [iv slept]]

Left Corner Example

With vp complete, nothing is left on the rhs of the s rule:

[s [np [pn Vincent]] [vp [iv slept]]]