74.406 Natural Language Processing - Parsing 1 - Language, Syntax, Parsing Problems in Parsing Ambiguity, Attachment / Binding Bottom vs. Top Down Parsing.

Slides:



Advertisements
Similar presentations
Basic Parsing with Context-Free Grammars CS 4705 Julia Hirschberg 1 Some slides adapted from Kathy McKeown and Dan Jurafsky.
Advertisements

May 2006CLINT-LN Parsing1 Computational Linguistics Introduction Approaches to Parsing.
Chapter 9: Parsing with Context-Free Grammars
PARSING WITH CONTEXT-FREE GRAMMARS
Parsing with Context Free Grammars Reading: Chap 13, Jurafsky & Martin
Artificial Intelligence 2004 Natural Language Processing - Syntax and Parsing - Language, Syntax, Parsing Problems in Parsing Ambiguity, Attachment.
1 Earley Algorithm Chapter 13.4 October 2009 Lecture #9.
NLP and Speech Course Review. Morphological Analyzer Lexicon Part-of-Speech (POS) Tagging Grammar Rules Parser thethe – determiner Det NP → Det.
 Christel Kemke /08 COMP 4060 Natural Language Processing PARSING.
Parsing context-free grammars Context-free grammars specify structure, not process. There are many different ways to parse input in accordance with a given.
Albert Gatt LIN3022 Natural Language Processing Lecture 8.
Parsing with CFG Ling 571 Fei Xia Week 2: 10/4-10/6/05.
Earley’s algorithm Earley’s algorithm employs the dynamic programming technique to address the weaknesses of general top-down parsing. Dynamic programming.
Basic Parsing with Context- Free Grammars 1 Some slides adapted from Julia Hirschberg and Dan Jurafsky.
CS 4705 Lecture 7 Parsing with Context-Free Grammars.
CS 4705 Basic Parsing with Context-Free Grammars.
Artificial Intelligence 2004 Natural Language Processing - Syntax and Parsing - Language Syntax Parsing.
Parsing SLP Chapter 13. 7/2/2015 Speech and Language Processing - Jurafsky and Martin 2 Outline  Parsing with CFGs  Bottom-up, top-down  CKY parsing.
Basic Parsing with Context- Free Grammars 1 Some slides adapted from Julia Hirschberg and Dan Jurafsky.
Context-Free Grammar CSCI-GA.2590 – Lecture 3 Ralph Grishman NYU.
1 Basic Parsing with Context Free Grammars Chapter 13 September/October 2012 Lecture 6.
11 CS 388: Natural Language Processing: Syntactic Parsing Raymond J. Mooney University of Texas at Austin.
PARSING David Kauchak CS457 – Fall 2011 some slides adapted from Ray Mooney.
TEORIE E TECNICHE DEL RICONOSCIMENTO Linguistica computazionale in Python: -Analisi sintattica (parsing)
BİL711 Natural Language Processing
CS 4705 Parsing More Efficiently and Accurately. Review Top-Down vs. Bottom-Up Parsers Left-corner table provides more efficient look- ahead Left recursion.
中文信息处理 Chinese NLP Lecture 9.
1 Statistical Parsing Chapter 14 October 2012 Lecture #9.
1 CKY and Earley Algorithms Chapter 13 October 2012 Lecture #8.
Chapter 10. Parsing with CFGs From: Chapter 10 of An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, by.
LINGUISTICA GENERALE E COMPUTAZIONALE ANALISI SINTATTICA (PARSING)
Understanding Natural Language
10. Parsing with Context-free Grammars -Speech and Language Processing- 발표자 : 정영임 발표일 :
Chapter 13: Parsing with Context-Free Grammars Heshaam Faili University of Tehran.
11 Syntactic Parsing. Produce the correct syntactic parse tree for a sentence.
October 2005csa3180: Parsing Algorithms 11 CSA350: NLP Algorithms Sentence Parsing I The Parsing Problem Parsing as Search Top Down/Bottom Up Parsing Strategies.
Parsing with Context Free Grammars CSC 9010 Natural Language Processing Paula Matuszek and Mary-Angela Papalaskari This slide set was adapted from: Jim.
Parsing I: Earley Parser CMSC Natural Language Processing May 1, 2003.
11 Chapter 14 Part 1 Statistical Parsing Based on slides by Ray Mooney.
Lecture 1, 7/21/2005Natural Language Processing1 CS60057 Speech &Natural Language Processing Autumn 2007 Lecture August 2007.
1 Chart Parsing Allen ’ s Chapter 3 J & M ’ s Chapter 10.
Parsing with Context-Free Grammars References: 1.Natural Language Understanding, chapter 3 (3.1~3.4, 3.6) 2.Speech and Language Processing, chapters 9,
Sentence Parsing Parsing 3 Dynamic Programming. Jan 2009 Speech and Language Processing - Jurafsky and Martin 2 Acknowledgement  Lecture based on  Jurafsky.
Natural Language - General
PARSING 2 David Kauchak CS159 – Spring 2011 some slides adapted from Ray Mooney.
NLP. Introduction to NLP Motivation –A lot of the work is repeated –Caching intermediate results improves the complexity Dynamic programming –Building.
November 2004csa3050: Sentence Parsing II1 CSA350: NLP Algorithms Sentence Parsing 2 Top Down Bottom-Up Left Corner BUP Implementation in Prolog.
Quick Speech Synthesis CMSC Natural Language Processing April 29, 2003.
CS 4705 Lecture 10 The Earley Algorithm. Review Top-Down vs. Bottom-Up Parsers –Both generate too many useless trees –Combine the two to avoid over-generation:
csa3050: Parsing Algorithms 11 CSA350: NLP Algorithms Parsing Algorithms 1 Top Down Bottom-Up Left Corner.
Artificial Intelligence 2004
Computerlinguistik II / Sprachtechnologie Vorlesung im SS 2010 (M-GSW-10) Prof. Dr. Udo Hahn Lehrstuhl für Computerlinguistik Institut für Germanistische.
CS 4705 Lecture 7 Parsing with Context-Free Grammars.
GRAMMARS David Kauchak CS457 – Spring 2011 some slides adapted from Ray Mooney.
Instructor: Nick Cercone CSEB - 1 Parsing and Context Free Grammars Parsers, Top Down, Bottom Up, Left Corner, Earley.
October 2005CSA3180: Parsing Algorithms 21 CSA3050: NLP Algorithms Parsing Algorithms 2 Problems with DFTD Parser Earley Parsing Algorithm.
November 2009HLT: Sentence Parsing1 HLT Sentence Parsing Algorithms 2 Problems with Depth First Top Down Parsing.
November 2004csa3050: Parsing Algorithms 11 CSA350: NLP Algorithms Parsing Algorithms 1 Top Down Bottom-Up Left Corner.
PARSING David Kauchak CS159 – Fall Admin Assignment 3 Quiz #1  High: 36  Average: 33 (92%)  Median: 33.5 (93%)
Speech and Language Processing SLP Chapter 13 Parsing.
Basic Parsing with Context Free Grammars Chapter 13
CKY Parser 0Book 1 the 2 flight 3 through 4 Houston5 6/19/2018
CPSC 503 Computational Linguistics
Grammars August 31, /16/2018.
CKY Parser 0Book 1 the 2 flight 3 through 4 Houston5 11/16/2018
Natural Language - General
Parsing and More Parsing
CSA2050 Introduction to Computational Linguistics
Parsing I: CFGs & the Earley Parser
David Kauchak CS159 – Spring 2019
Presentation transcript:

Natural Language Processing - Parsing 1 - Language, Syntax, Parsing Problems in Parsing Ambiguity, Attachment / Binding Bottom vs. Top Down Parsing Chart-Parsing Earley-Algorithm

Natural Language - Parsing Natural Language Syntax described like a formal language, usually through a context-free grammar: the Start-Symbol S ≡ sentence Non-Terminals NT ≡ syntactic constituents Terminals T ≡ lexical entries/ words Productions/Rules P  NT  (NT  T) + ≡ grammar rules Parsing derive the syntactic structure of a sentence based on a language model (grammar) construct a parse tree, i.e. the derivation of the sentence based on the grammar (rewrite system)

Sample Grammar Grammar (S, NT, T, P) – Sentence Symbol S  NT, Part-of-Speech  NT, syntactic Constituents  NT, Grammar Rules P  NT  (NT  T)* S  NP VPstatement S  Aux NP VPquestion S  VPcommand NP  Det Nominal NP  Proper-Noun Nominal  Noun | Noun Nominal | Nominal PP VP  Verb | Verb NP | Verb PP | Verb NP PP PP  Prep NP Det  that | this | a Noun  book | flight | meal | money Proper-Noun  Houston | American Airlines | TWA Verb  book | include | prefer Aux  does Prep  from | to | on Task: Parse "Does this flight include a meal?"

S Aux NP VP Det Nominal Verb NP Noun Det Nominal does this flight include a meal Sample Parse Tree

Problems in Parsing - Ambiguity Ambiguity “One morning, I shot an elephant in my pajamas. How he got into my pajamas, I don’t know.” Groucho Marx syntactical/structural ambiguity – several parse trees are possible e.g. above sentence semantic/lexical ambiguity – several word meanings e.g. bank (where you get money) and (river) bank even different word categories possible (interim) e.g. “He books the flight.” vs. “The books are here.“ or “Fruit flies from the balcony” vs. “Fruit flies are on the balcony.”

Problems in Parsing - Attachment Attachment in particular PP (prepositional phrase) binding; often referred to as ‘binding problem’ “One morning, I shot an elephant in my pajamas.” (S... (NP (PNoun I)(VP (Verb shot) (NP (Det an (Nominal (Noun elephant))) (PP in my pajamas))...) rule VP  Verb NP PP (S... (NP (PNoun I)) (VP (Verb shot) (NP (Det an) (Nominal (Nominal (Noun elephant) (PP in my pajamas)... ) rule VP  Verb NP and NP  Det Nominal and Nominal  Nominal PP and Nominal  Noun

Bottom-up and Top-down Parsing Bottom-up – from word-nodes to sentence-symbol Top-down Parsing – from sentence-symbol to words S Aux NP VP Det Nominal Verb NP Noun Det Nominal does this flight include a meal

Problems with Bottom-up and Top-down Parsing Problems with left-recursive rules like NP  NP PP: don’t know how many times recursion is needed Pure Bottom-up or Top-down Parsing is inefficient because it generates and explores too many structures which in the end turn out to be invalid (several grammar rules applicable  ‘interim’ ambiguity). Combine top-down and bottom-up approach: Start with sentence; use rules top-down (look-ahead); read input; try to find shortest path from input to highest unparsed constituent (from left to right).  Chart-Parsing / Earley-Parser

Chart Parsing / Early Algorithm Earley-Parser based on Chart-Parsing Essence: Integrate top-down and bottom-up parsing. Keep recognized sub-structures (sub-trees) for shared use during parsing. Top-down: Start with S-symbol. Generate all applicable rules for S. Go further down with left- most constituent in rules and add rules for these constituents until you encounter a left-most node on the RHS which is a word category (POS). Bottom-up: Read input word and compare. If word matches, mark as recognized and move parsing on to the next category in the rule(s).

Chart Sequence of n input words; n+1 nodes marked 0 to n. Arcs indicate recognized part of RHS of rule. The indicates recognized constituents in rules. Jurafsky & Martin, Figure 10.15, p. 380

Chart Parsing / Earley Parser 1 Chart Sequence of input words; n+1 nodes marked 0 to n. States in chart represent possible rules and recognized constituents, with arcs. Interim state S  VP, [0,0]  top-down look at rule S  VP  nothing of RHS of rule yet recognized ( is far left)  arc at beginning, no coverage (covers no input word; beginning of arc at 0 and end of arc at 0)

Chart Parsing / Earley Parser 2 Interim states NP  Det Nominal, [1,2]  top-down look with rule NP  Det Nominal  Det recognized ( after Det)  arc covers one input word which is between node 1 and node 2  look next for Nominal NP  Det Nominal, [1,3]  Nominal was recognized, move after Nominal  move end of arc to cover Nominal (change 2 to 3)  structure is completely recognized; arc is inactive; mark NP as recognized in other rules (move ).

Chart - 0 Book this flight S . VP VP . V NP

Chart - 1 VP  V. NP V Book this flight S . VP NP . Det Nom

Chart - 2 VP  V. NP V Book this flight S . VP NP  Det. Nom Det Nom . Noun

Chart - 3a VP  V. NP V Book this flight S . VP NP  Det. Nom Det Nom  Noun. Noun

Chart - 3b VP  V. NP V Book this flight S . VP NP  Det Nom. Det Nom  Noun. Noun

Chart - 3c VP  V NP. V Book this flight NP  Det Nom. Det Nom  Noun. Noun S . VP

Chart - 3d VP  V NP. V Book this flight S  VP. NP  Det Nom. Det Nom  Noun. Noun

Chart - All States NP  Det Nom. VP  V. NP Nom  Noun. VDetNoun Book this flight S . VP VP . V NP NP . Det Nom NP  Det. Nom VP  V NP. S  VP. Nom . Noun

Chart - Final States NP  Det Nom. Nom  Noun. V Det Noun Book this flight VP  V NP. S  VP.

Chart 0 with two S- and two VP-Rules Book this flight S . VP VP . V NP additional S-rule S . VP NP additional VP-rule VP . V

Chart 1a with two S- and two VP-Rules VP  V. NP V Book this flight S . VP NP . Det Nom S . VP NP VP  V.

Chart 1b with two S- and two VP-Rules VP  V. NP V Book this flight S  VP. NP . Det Nom S  VP. NP VP  V.

Chart 2 with two S- and two VP-Rules VP  V. NP V Book this flight S  VP. NP  Det. Nom S  VP. NP VP  V. Nom . Noun

VP  V NP. V Book this flight S  VP. NP  Det Nom. Det Nom  Noun. S  VP NP. VP  V. Chart 3 with two S- and two VP-Rules Noun

Final Chart - with two S-and two VP-Rules VP  V NP. V Book this flight S  VP NP. NP  Det Nom. Det Nom  Noun. Noun S  VP. VP  V.

Earley Algorithm - Functions predictor generates new rules for partly recognized RHS with constituent right of (top-down generation) scanner if word category (POS) is found right of the, the Scanner reads the next input word and adds a rule for it to the chart (bottom-up mode) completer if rule is completely recognized (the is far right), the recognition state of earlier rules in the chart advances: the is moved over the recognized constituent (bottom-up recognition).

function EARLEY-PARSE(words, grammar) returns chart ENQUEUE((  S, [0,0]), chart[0]) for i_from 0 to LENGTH(words) do for each state in chart[i] do if INCOMPLETE?(state) and NEXT-CAT(state) is not a part of speech then PREDICTOR(state) elseif INCOMPLETE?(state) and NEXT-CAT(state) is a part of speech then SCANNER(state) else COMPLETER(state) end return(chart) - continued on next slide - Earley-Algorithm

procedure PREDICTOR((A   B , [i, j])) for each (B   ) in GRAMMAR-RULES-FOR(B, grammar) do ENQUEUE((B   [ j, j], chart[j]) end procedure SCANNER ((A    B , [i, j])) if B  PARTS-OF-SPEECH(word[j]) then ENQUEUE((B  word[j], [ j, j+1]), chart[j+1]) procedure COMPLETER ((B   , [j, k])) for each (A    B , [ i, j]) in chart[j] do ENQUEUE((A   B  , [ i,k]), chart[k]) end procedure ENQUEUE(state, chart-entry) if state is not already in chart-entry then PUSH(state, chart-entry) end

Earley Algorithm - Figures - Jurafsky & Martin, Figures 10.16, 10.17, 10.18

Additional References Jurafsky, D. & J. H. Martin, Speech and Language Processing, Prentice-Hall, (Chapters 9 and 10) Earley Algorithm Jurafsky & Martin, Figure 10.16, p.384 Earley Algorithm - Examples Jurafsky & Martin, Figures and 10.18