Chapter 12 Lexicalized and Probabilistic Parsing
Guoqiang Shan, University of Arizona
November 30, 2006

Outline
 Probabilistic Context-Free Grammars
 Probabilistic CYK Parsing
 PCFG Problems

Probabilistic Context-Free Grammars
 Intuition: find the "correct" parse for ambiguous sentences
  e.g. can you book TWA flights?
  e.g. the flights include a book
 Definition of a Context-Free Grammar: a 4-tuple G = (N, Σ, P, S)
  N: a finite set of non-terminal symbols
  Σ: a finite set of terminal symbols, where N ∩ Σ = ∅
  P: a finite set of productions A → β, where A is in N and β is in (N ∪ Σ)*
  S: the start symbol, a member of N
 Definition of a Probabilistic Context-Free Grammar: a 5-tuple G = (N, Σ, P, S, D)
  D: a function from P to [0,1] assigning a probability to each rule in P
  Rules are written A → β [p], where p = D(A → β)
  e.g. A → a B [0.6], B → C D [0.3]
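A minimal sketch of how such a weighted rule might be stored in C (the struct and field names are my own illustration, not from the chapter; for a proper PCFG the probabilities of all rules sharing the same left-hand side sum to 1):

/* one PCFG rule A -> beta [p] */
struct pcfg_rule {
    const char *lhs;      /* the non-terminal A                     */
    const char *rhs[4];   /* the symbols of beta, NULL-terminated   */
    double      prob;     /* D(A -> beta), a value in [0,1]         */
};

/* e.g. the rule  A -> a B [0.6]  from the slide */
static const struct pcfg_rule example_rule = { "A", { "a", "B", NULL }, 0.6 };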

PCFG Example
 S → NP VP .8   S → Aux NP VP .15   S → VP .05
 NP → Det Nom .2   NP → ProperN .35   NP → Noun .05   NP → Pronoun .4
 Nom → Noun .75   Nom → Noun Nom .2   Nom → ProperN Nom .05
 VP → Verb .55   VP → Verb NP .4   VP → Verb NP NP .05
 Det → that .5   Det → the .8   Det → a .15
 Noun → book .1   Noun → flights .5   Noun → meal .4
 Verb → book .3   Verb → include .3   Verb → want .4
 Aux → can .4   Aux → does .3   Aux → do .3
 ProperN → TWA .4   ProperN → Denver .6
 Pronoun → you .4   Pronoun → I .6

Probability of a Sentence in a PCFG
 Probability of a parse tree T of a sentence S
  P(T,S) = Π_{n in T} D(r(n))
  where n ranges over the nodes of the parse tree T and r(n) is the rule used to expand n
 Relation between P(T) and P(T,S)
  P(T,S) = P(T) · P(S|T); since a parse tree T uniquely determines its sentence S, P(S|T) = 1, so P(T) = P(T,S)
 Probability of a sentence
  P(S) = Σ_{T in τ(S)} P(T), where τ(S) is the set of all parse trees of S
  In particular, for an unambiguous sentence, P(S) = P(T)

Example  For the two parse trees T_l and T_r of the ambiguous sentence "can you book TWA flights":
 P(T_l) = 0.15 × 0.40 × 0.05 × 0.05 × 0.35 × 0.75 × 0.40 × 0.40 × 0.30 × 0.40 × 0.50 = 3.78 × 10^-7
 P(T_r) = 0.15 × 0.40 × 0.40 × 0.05 × 0.05 × 0.75 × 0.40 × 0.40 × 0.30 × 0.40 × 0.50 = 4.32 × 10^-7
 Since P(T_r) > P(T_l), the PCFG prefers the T_r reading
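A few lines of C (not part of the original slides) that multiply out the two rule sequences above and pick the more probable tree:

#include <stdio.h>

int main(void)
{
    /* rule probabilities along the two parse trees of "can you book TWA flights" */
    double left[]  = { 0.15, 0.40, 0.05, 0.05, 0.35, 0.75, 0.40, 0.40, 0.30, 0.40, 0.50 };
    double right[] = { 0.15, 0.40, 0.40, 0.05, 0.05, 0.75, 0.40, 0.40, 0.30, 0.40, 0.50 };
    double pl = 1.0, pr = 1.0;
    int i;

    for (i = 0; i < 11; i++) { pl *= left[i]; pr *= right[i]; }
    printf("P(T_l) = %.3g, P(T_r) = %.3g\n", pl, pr);   /* 3.78e-07 and 4.32e-07 */
    printf("preferred parse: %s\n", pr > pl ? "T_r" : "T_l");
    return 0;
}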

Probabilistic CYK Parsing of PCFGs
 Bottom-up approach
  Dynamic programming: fill a table of partial solutions to the sub-problems until it contains the solutions to the entire problem
 Input
  A grammar in Chomsky Normal Form: ε-free, every production of the form A → w or A → B C
  n words w_1, w_2, …, w_n
 Data structures
  Π[i, j, A]: the maximum probability of a constituent with non-terminal A spanning j words starting at w_i
  β[i, j, A] = {k, B, C}: a back-pointer recording that the best such A was built by A → B C with B spanning the first k words (used to rebuild the parse tree)
 Output
  The probability of the most likely parse is Π[1, n, 1], where non-terminal 1 is the start symbol S, whose constituent spans the entire string

CYK Algorithm
 Base case
  Spans of length one: Π[i, 1, A] = D(A → w_i) for each lexical rule A → w_i
 Recursive case
  For spans of j > 1 words starting at w_i, A can derive the span if there is a rule A → B C and some k, 0 < k < j, such that
   B derives the first k words of the span (already computed)
   C derives the remaining j − k words (already computed)
  The probability of this analysis is D(A → B C) × Π[i, k, B] × Π[i+k, j−k, C]
  If more than one rule A → B C (or split point k) applies, keep the one that maximizes the probability, and record {k, B, C} in β[i, j, A]
 My implementation is in lectura under directory /home/shan/538share/pcyk.c
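The pcyk.c referenced above is not reproduced here; what follows is a minimal, self-contained C sketch of the same idea, using a two-rule toy grammar and a three-word invented lexicon (the grammar, the probabilities, and all names are illustrative assumptions, not the chapter's grammar; the β back-pointer table is left out):

/* A minimal probabilistic CYK sketch (illustrative only; this is NOT the
 * pcyk.c referenced above, and the toy grammar/probabilities are invented). */
#include <stdio.h>
#include <string.h>

#define MAXW 16                 /* maximum sentence length                     */
#define NT   4                  /* number of non-terminals in the toy grammar  */

enum { S, NP, VP, V };          /* non-terminal indices; S is the start symbol */

struct brule { int a, b, c; double p; };   /* binary rule A -> B C [p] */

static const struct brule bin[] = {        /* toy CNF grammar          */
    { S,  NP, VP, 1.0 },
    { VP, V,  NP, 1.0 },
};
static const int nbin = sizeof bin / sizeof bin[0];

/* lexp[i][A] = P(A -> w_i), i.e. the lexical rules, given per input position */
static double lexp[MAXW][NT];

/* pi[i][j][A] = best probability of A spanning j words starting at word i    */
static double pi[MAXW][MAXW + 1][NT];

static double cyk(int n)
{
    int i, j, k, r;
    memset(pi, 0, sizeof pi);

    /* base case: spans of length 1 come straight from the lexical rules */
    for (i = 0; i < n; i++)
        for (r = 0; r < NT; r++)
            pi[i][1][r] = lexp[i][r];

    /* recursive case: spans of j > 1 words, split after the first k words */
    for (j = 2; j <= n; j++)
        for (i = 0; i + j <= n; i++)
            for (k = 1; k < j; k++)
                for (r = 0; r < nbin; r++) {
                    double p = bin[r].p * pi[i][k][bin[r].b]
                                        * pi[i + k][j - k][bin[r].c];
                    if (p > pi[i][j][bin[r].a])
                        pi[i][j][bin[r].a] = p;  /* keep the max; a full parser
                                                    would also store the back-
                                                    pointer {k, B, C} here     */
                }
    return pi[0][n][S];          /* probability of the best parse of w_1..w_n  */
}

int main(void)
{
    /* hypothetical 3-word input "I book flights" with an invented lexicon */
    lexp[0][NP] = 0.24;          /* NP -> I       */
    lexp[1][V]  = 0.30;          /* V  -> book    */
    lexp[2][NP] = 0.025;         /* NP -> flights */

    printf("best parse probability: %g\n", cyk(3));
    return 0;
}

The nested loops mirror the recursive case on the slide: iterating over span length j from short to long guarantees that the two sub-spans consulted are always already filled in.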

PCFG Example – Revisited
 The same grammar as above, repeated here before being rewritten into Chomsky Normal Form on the next slide

Example (CYK Parsing) – the grammar rewritten in CNF
 S → NP VP .8
 (S → Aux NP VP .15)  becomes  S → Aux NV .15, NV → NP VP 1.0
 (S → VP .05)  becomes  S → book .00825, S → include .00825, S → want .011, S → Verb NP .02, S → Verb DNP .0025
 NP → Det Nom .2
 (NP → ProperN .35)  becomes  NP → TWA .14, NP → Denver .21
 (NP → Nom .05)  becomes  NP → book .00375, NP → flights .01875, NP → meal .015, NP → Noun Nom .01, NP → ProperN Nom .0025
 (NP → Pronoun .4)  becomes  NP → you .16, NP → I .24
 (Nom → Noun .75)  becomes  Nom → book .075, Nom → flights .375, Nom → meal .3
 Nom → Noun Nom .2
 Nom → ProperN Nom .05
 (VP → Verb .55)  becomes  VP → book .165, VP → include .165, VP → want .22
 VP → Verb NP .4
 (VP → Verb NP NP .05)  becomes  VP → Verb DNP .05, DNP → NP NP 1.0
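As a quick sanity check (this note is not from the original slides), each collapsed unary chain receives the product of the probabilities along the old chain, e.g.
 S → want: D(S → VP) × D(VP → want) = 0.05 × 0.22 = 0.011
 VP → book: D(VP → Verb) × D(Verb → book) = 0.55 × 0.30 = 0.165
and the pair S → Aux NV [.15], NV → NP VP [1.0] carries the same total probability as the original rule S → Aux NP VP [.15], so every derivation keeps the probability it had under the original grammar.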

Example (CYK Parsing) – Π matrix
 (A sequence of slides fills in the Π chart for "can you book TWA flights", one span length at a time; most cell values did not survive extraction, so only a summary is kept here. The length-1 cells on the diagonal hold the lexical entries: can: Aux .4; you: Pronoun .4, NP .16; book: Noun .1, Verb .3, VP .165, Nom .075; TWA: ProperN .4, NP .14; flights: Noun .5, Nom .375. Longer spans are then filled bottom-up, e.g. VP over "book TWA" gets .4 × .3 × .14 = .0168, and the top cell for S over the whole sentence ends at 4.32 × 10^-7, the value of P(T_r) computed earlier.)

Example (CYK Parsing) – β matrix
 (This slide shows the corresponding back-pointer chart: each filled entry β[i, j, A] records the rule A → B C and the split point k used to build the best A over that span, e.g. S → Aux NV with k = 1 at the top of the chart; in this example every recorded split point is k = 1. The parse tree is rebuilt by following these back-pointers down from the top cell.)

PCFG Problems – the independence assumption
 A PCFG assumes that the expansion of one non-terminal is independent of the expansion of every other non-terminal
 However, corpus data show that how a node expands depends on where the node sits in the tree
  91% of subjects are pronouns
   She's able to take her baby to work with her. (91%)
   Uh, my wife worked until we had a family. (9%)
  but only 34% of objects are pronouns
   Some laws absolutely prohibit it. (34%)
   All the people signed confessions. (66%)

PCFG Problems – lack of sensitivity to words
 In a PCFG, lexical information can only enter through the probabilities of pre-terminal rules (Verb, Noun, Det, …)
 However, lexical information and dependencies turn out to be important in modeling syntactic probabilities
  Example: Moscow sent more than 100,000 soldiers into Afghanistan.
  The PP "into Afghanistan" may attach to the NP (more than 100,000 soldiers) or to the VP (sent)
  Corpus statistics show that NP attachment is more frequent overall (67% or 52%, depending on the study), so a PCFG, which can only use such aggregate preferences, chooses NP attachment here and produces an incorrect parse
  Why is that wrong? The verb "send" subcategorizes for a destination, which can be expressed with the preposition "into"
  In fact, when the verb is "send", "into" always attaches to it

PCFG Problems – coordination ambiguity
 Consider the phrase: dogs in houses and cats
 Semantically, dogs is a better conjunct for cats than houses is
  Thus the parse [dogs in [NP houses and cats]] intuitively sounds unnatural and should be dispreferred
 However, a PCFG assigns both parses the same probability, since both structures use exactly the same rules


Questions?