
1 Chapter 12 Lexicalized and Probabilistic Parsing Guoqiang Shan University of Arizona November 30, 2006

2 Outline
  Probabilistic Context-Free Grammars
  Probabilistic CYK Parsing
  PCFG Problems

3 Probabilistic Context-Free Grammars
Intuition
  Find the "correct" parse for ambiguous sentences, e.g.
    Can you book TWA flights?
    The flights include a book.
Definition of a context-free grammar: a 4-tuple G = (N, Σ, P, S)
  N: a finite set of non-terminal symbols
  Σ: a finite set of terminal symbols, with N ∩ Σ = ∅
  P: a finite set of productions A → β, where A ∈ N and β ∈ (N ∪ Σ)*
  S: the start symbol, S ∈ N
Definition of a probabilistic context-free grammar: a 5-tuple G = (N, Σ, P, S, D)
  D: a function P → [0, 1] assigning a probability to each rule in P
  Rules are written A → β [p], where p = D(A → β), e.g. A → a B [0.6], B → C D [0.3]

4 PCFG Example
Grammar rules:
  S → NP VP [.80]        Nom → Noun [.75]
  S → Aux NP VP [.15]    Nom → Noun Nom [.20]
  S → VP [.05]           Nom → ProperN Nom [.05]
  NP → Det Nom [.20]     VP → Verb [.55]
  NP → ProperN [.35]     VP → Verb NP [.40]
  NP → Nom [.05]         VP → Verb NP NP [.05]
  NP → Pronoun [.40]
Lexical rules:
  Det → that [.05]       Det → the [.80]        Det → a [.15]
  Noun → book [.10]      Noun → flights [.50]   Noun → meal [.40]
  Verb → book [.30]      Verb → include [.30]   Verb → want [.40]
  Aux → can [.40]        Aux → does [.30]       Aux → do [.30]
  ProperN → TWA [.40]    ProperN → Denver [.60]
  Pronoun → you [.40]    Pronoun → I [.60]
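As a quick aside (not part of the original slides), a grammar like the one above can be stored as a mapping from each left-hand side to its possible expansions; since D is a probability distribution over the expansions of each non-terminal, the probabilities for every left-hand side should sum to 1. A minimal Python sketch with a few of the rules, using a hypothetical representation of my own:

```python
# Hypothetical encoding of part of the PCFG above:
# each left-hand side maps to {right-hand-side tuple: probability D(A -> beta)}.
pcfg = {
    "S":    {("NP", "VP"): .80, ("Aux", "NP", "VP"): .15, ("VP",): .05},
    "NP":   {("Det", "Nom"): .20, ("ProperN",): .35, ("Nom",): .05, ("Pronoun",): .40},
    "Verb": {("book",): .30, ("include",): .30, ("want",): .40},
}

# D must be a probability distribution over the expansions of each non-terminal.
for lhs, expansions in pcfg.items():
    assert abs(sum(expansions.values()) - 1.0) < 1e-9, lhs
```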

5 Probability of a Sentence in PCFG
Probability of a parse tree T of sentence S
  P(T, S) = Π D(r(n)), where n ranges over the nodes of T and r(n) is the rule used to expand n.
Relation between P(T) and P(T, S)
  P(T, S) = P(T) × P(S | T)
  A parse tree T uniquely determines its sentence S, so P(S | T) = 1 and P(T) = P(T, S).
Probability of a sentence
  P(S) = Σ P(T) over T in τ(S), the set of all parse trees of S.
  In particular, for an unambiguous sentence, P(S) = P(T).
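To make the product over nodes concrete, here is a small Python sketch (my own illustration, not from the slides) that walks a parse tree given as nested tuples and multiplies the probability of the rule used at each node; the tiny rule table and tree in the usage lines take their probabilities from the slide 4 grammar.

```python
# P(T, S) = product over the nodes n of T of D(r(n)).
# Trees are nested tuples: (label, child, child, ...), with plain strings as words.
def tree_prob(tree, D):
    label, *children = tree
    if isinstance(children[0], str):              # lexical rule A -> w
        return D[(label, tuple(children))]
    rhs = tuple(child[0] for child in children)   # rule A -> B C ...
    p = D[(label, rhs)]
    for child in children:
        p *= tree_prob(child, D)
    return p

# Tiny fragment of the slide-4 grammar as a rule -> probability table.
D = {("NP", ("Det", "Nom")): .20, ("Det", ("the",)): .80,
     ("Nom", ("Noun",)): .75, ("Noun", ("flights",)): .50}
tree = ("NP", ("Det", "the"), ("Nom", ("Noun", "flights")))
print(tree_prob(tree, D))   # .20 * .80 * .75 * .50 ≈ 0.06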

6 Example
The two parse trees T_l and T_r of "Can you book TWA flights?":
  P(T_l) = 0.15 × 0.40 × 0.05 × 0.05 × 0.35 × 0.75 × 0.40 × 0.40 × 0.30 × 0.40 × 0.50 = 3.78 × 10^-7
  P(T_r) = 0.15 × 0.40 × 0.40 × 0.05 × 0.05 × 0.75 × 0.40 × 0.40 × 0.30 × 0.40 × 0.50 = 4.32 × 10^-7
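A quick numeric check of the two products (plain Python, using only the numbers on the slide):

```python
import math

rules_T_l = [0.15, 0.40, 0.05, 0.05, 0.35, 0.75, 0.40, 0.40, 0.30, 0.40, 0.50]
rules_T_r = [0.15, 0.40, 0.40, 0.05, 0.05, 0.75, 0.40, 0.40, 0.30, 0.40, 0.50]

print(math.prod(rules_T_l))   # ≈ 3.78e-07
print(math.prod(rules_T_r))   # ≈ 4.32e-07
# If these were the only two parses, P(S) would be their sum, ≈ 8.10e-07.
```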

7 Probabilistic CYK Parsing of PCFG
Bottom-up approach
  Dynamic programming: fill a table of partial solutions to sub-problems until it contains the solutions to the entire problem.
Input
  A grammar in CNF: ε-free, every production of the form A → w (a single terminal) or A → B C
  n words w_1, w_2, ..., w_n
Data structures
  Π[i, j, A]: the maximum probability of a constituent with non-terminal A spanning j words starting at w_i
  β[i, j, A] = {k, B, C}: the rule A → B C used for that constituent, where B spans the first k of the j words (used to rebuild the parse tree)
Output
  The maximum-probability parse is Π[1, n, S]: its root is S and it spans the entire string.

8 CYK Algorithm
Base case
  Consider spans of length one: for each word w_i and each rule A → w_i, set Π[i, 1, A] = D(A → w_i).
Recursive case
  For a span of j > 1 words starting at w_i, A can derive the span whenever there is a rule A → B C and a split point k with 0 < k < j such that
    B derives the first k words of the span (already known), and
    C derives the remaining j − k words, starting at w_(i+k) (already known).
  The probability of that analysis is D(A → B C) × Π[i, k, B] × Π[i+k, j−k, C].
  If more than one rule or split point yields A, keep the one that maximizes the probability and record {k, B, C} in β[i, j, A].
My implementation is in lectura under directory /home/shan/538share/pcyk.c
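For concreteness, here is a minimal Python sketch of the table-filling loop described above. It is not the pcyk.c implementation mentioned on the slide, and the grammar encoding (a lexicon mapping each word to (A, probability) pairs, plus a list of binary rules) is my own assumption:

```python
from collections import defaultdict

def pcyk(words, lexical, binary, start="S"):
    """Probabilistic CYK for a CNF grammar.
    lexical: word -> list of (A, prob) for rules A -> word
    binary:  list of (A, B, C, prob) for rules A -> B C
    """
    n = len(words)
    pi = defaultdict(dict)     # pi[(i, j)][A] = best prob of A spanning j words at i (1-based)
    back = defaultdict(dict)   # back[(i, j)][A] = (k, B, C) back-pointer

    # Base case: spans of length 1, filled from the lexical rules.
    for i, w in enumerate(words, start=1):
        for A, p in lexical.get(w, []):
            if p > pi[(i, 1)].get(A, 0.0):
                pi[(i, 1)][A] = p

    # Recursive case: longer spans, best rule A -> B C and split point k.
    for j in range(2, n + 1):                  # span length
        for i in range(1, n - j + 2):          # start position
            for k in range(1, j):              # words covered by the left child B
                left, right = pi[(i, k)], pi[(i + k, j - k)]
                for A, B, C, p in binary:
                    if B in left and C in right:
                        prob = p * left[B] * right[C]
                        if prob > pi[(i, j)].get(A, 0.0):
                            pi[(i, j)][A] = prob
                            back[(i, j)][A] = (k, B, C)

    return pi[(1, n)].get(start, 0.0), pi, back
```

The triple loop mirrors the slide: j and i enumerate spans, k enumerates split points, and only the best-scoring {k, B, C} per non-terminal is kept in the back-pointer table.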

9 PCFG Example – Revisited before rewriting
(The same grammar as on slide 4, repeated as the starting point for the CNF rewrite on the next slide.)

10 Example (CYK Parsing) – Grammar rewritten in CNF
(Parenthesized rules are the original non-CNF rules; unit productions are eliminated by folding their probability into the expansions of their right-hand side, and S → Aux NP VP and VP → Verb NP NP are binarized with the new non-terminals NV and DNP.)
  S → NP VP [.8]
  (S → Aux NP VP [.15]):  S → Aux NV [.15],  NV → NP VP [1.0]
  (S → VP [.05]):  S → book [.00825],  S → include [.00825],  S → want [.011],  S → Verb NP [.02],  S → Verb DNP [.0025]
  NP → Det Nom [.2]
  (NP → ProperN [.35]):  NP → TWA [.14],  NP → Denver [.21]
  (NP → Nom [.05]):  NP → book [.00375],  NP → flights [.01875],  NP → meal [.015],  NP → Noun Nom [.01],  NP → ProperN Nom [.0025]
  (NP → Pronoun [.4]):  NP → you [.16],  NP → I [.24]
  (Nom → Noun [.75]):  Nom → book [.075],  Nom → flights [.375],  Nom → meal [.3]
  Nom → Noun Nom [.2]
  Nom → ProperN Nom [.05]
  (VP → Verb [.55]):  VP → book [.165],  VP → include [.165],  VP → want [.22]
  VP → Verb NP [.4]
  (VP → Verb NP NP [.05]):  VP → Verb DNP [.05],  DNP → NP NP [1.0]
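A few of these folded probabilities spot-checked in Python (my own illustration; the variable names are hypothetical): eliminating a unit production such as S → VP [.05] multiplies .05 into every expansion of VP, and chains of unit productions multiply all the way down to the words.

```python
# Spot checks of the folded probabilities in the CNF grammar above.
p_S_VP, p_VP_Verb, p_Verb_book = .05, .55, .30
p_NP_Nom, p_Nom_Noun, p_Noun_book = .05, .75, .10
p_VP_Verb_NP = .40

print(p_S_VP * p_VP_Verb * p_Verb_book)      # S  -> book    ≈ .00825
print(p_S_VP * p_VP_Verb_NP)                 # S  -> Verb NP ≈ .02
print(p_VP_Verb * p_Verb_book)               # VP -> book    ≈ .165
print(p_NP_Nom * p_Nom_Noun * p_Noun_book)   # NP -> book    ≈ .00375
```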

11 Example (CYK Parsing) – Π matrix, base case (spans of length 1)
Sentence: can you book TWA flights
  Π[1, 1] (can):     Aux .4
  Π[2, 1] (you):     Pronoun .4,  NP .16
  Π[3, 1] (book):    Noun .1,  Verb .3,  VP .165,  Nom .075,  NP .00375,  S .00825
  Π[4, 1] (TWA):     ProperN .4,  NP .14
  Π[5, 1] (flights): Noun .5,  Nom .375,  NP .01875

12 Example (CYK Parsing) – Π matrix, spans of length 2 (length-1 entries carry over)
  Π[2, 2] (you book):    S .02112,  NV .0264,  DNP .0006
  Π[3, 2] (book TWA):    S .00084,  VP .0168,  DNP .000525
  Π[4, 2] (TWA flights): NP .000375,  Nom .0075,  DNP .002625

13 Example (CYK Parsing) – Π matrix, spans of length 3
  Π[1, 3] (can you book):      S .001584
  Π[2, 3] (you book TWA):      S .0021504,  NV .002688
  Π[3, 3] (book TWA flights):  S .00000225,  NP .0000075,  Nom .00015,  VP .000045,  DNP .000001406

14 Example (CYK Parsing) – Π matrix, spans of length 4
  Π[1, 4] (can you book TWA):      S .00016128
  Π[2, 4] (you book TWA flights):  S .00000576,  NV .0000072,  DNP .0000012

15 Example (CYK Parsing) – Π matrix, the full span of length 5
  Π[1, 5] (can you book TWA flights): S .000000432
  This is Π[1, 5, S] = 4.32 × 10^-7, the probability of the best parse, matching P(T_r) on slide 6.
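As a usage example of the pcyk() sketch given after slide 8 (again my own illustration, with the CNF grammar of slide 10 encoded by hand and restricted to the words of this sentence), the chart entry for the whole sentence comes out as above:

```python
# Lexical rules A -> word from slide 10, for the words of the example sentence.
lexical = {
    "can":     [("Aux", .4)],
    "you":     [("Pronoun", .4), ("NP", .16)],
    "book":    [("Noun", .1), ("Verb", .3), ("VP", .165),
                ("Nom", .075), ("NP", .00375), ("S", .00825)],
    "TWA":     [("ProperN", .4), ("NP", .14)],
    "flights": [("Noun", .5), ("Nom", .375), ("NP", .01875)],
}
# Binary rules A -> B C from slide 10.
binary = [
    ("S", "NP", "VP", .8), ("S", "Aux", "NV", .15), ("NV", "NP", "VP", 1.0),
    ("S", "Verb", "NP", .02), ("S", "Verb", "DNP", .0025),
    ("NP", "Det", "Nom", .2), ("NP", "Noun", "Nom", .01), ("NP", "ProperN", "Nom", .0025),
    ("Nom", "Noun", "Nom", .2), ("Nom", "ProperN", "Nom", .05),
    ("VP", "Verb", "NP", .4), ("VP", "Verb", "DNP", .05),
    ("DNP", "NP", "NP", 1.0),
]

words = "can you book TWA flights".split()
best, pi, back = pcyk(words, lexical, binary)
print(best)   # ≈ 4.32e-07, i.e. Π[1, 5, S] above
```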

16 Example (CYK Parsing) – β matrix (back-pointers; k = number of words covered by the left child)
  Length-1 cells: N/A (filled directly from lexical rules)
  β[1, 3], β[1, 4], β[1, 5]: S via S → Aux NV, k = 1
  β[2, 2]: S via S → NP VP, k = 1;  NV via NV → NP VP, k = 1;  DNP via DNP → NP NP, k = 1
  β[2, 3]: S via S → NP VP, k = 1;  NV via NV → NP VP, k = 1
  β[2, 4]: S via S → NP VP, k = 1;  NV via NV → NP VP, k = 1;  DNP via DNP → NP NP, k = 1
  β[3, 2]: S via S → Verb NP, k = 1;  VP via VP → Verb NP, k = 1;  DNP via DNP → NP NP, k = 1
  β[3, 3]: S via S → Verb NP, k = 1;  NP via NP → Noun Nom, k = 1;  Nom via Nom → Noun Nom, k = 1;  VP via VP → Verb NP, k = 1;  DNP via DNP → NP NP, k = 1
  β[4, 2]: NP via NP → ProperN Nom, k = 1;  Nom via Nom → ProperN Nom, k = 1;  DNP via DNP → NP NP, k = 1
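Continuing the Python sketch from the previous slides (still my own illustration), the back-pointer table is exactly what is needed to rebuild the best parse tree:

```python
# Rebuild the best parse as nested tuples from the back-pointers (beta above).
def build_tree(back, words, i, j, A):
    if j == 1 or A not in back[(i, j)]:        # lexical cell
        return (A, words[i - 1])
    k, B, C = back[(i, j)][A]
    return (A,
            build_tree(back, words, i, k, B),
            build_tree(back, words, i + k, j - k, C))

print(build_tree(back, words, 1, len(words), "S"))
# ('S', ('Aux', 'can'),
#       ('NV', ('NP', 'you'),
#              ('VP', ('Verb', 'book'),
#                     ('NP', ('ProperN', 'TWA'), ('Nom', 'flights')))))
```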

17 PCFG Problems
Independence assumption
  Assumption: the expansion of one non-terminal is independent of the expansion of any other non-terminal.
  However, corpus data show that how a node expands depends on where the node sits in the parse tree.
  91% of subject NPs are pronouns:
    She's able to take her baby to work with her. (91%)
    Uh, my wife worked until we had a family. (9%)
  But only 34% of object NPs are pronouns:
    Some laws absolutely prohibit it. (34%)
    All the people signed confessions. (66%)

18 PCFG Problems
Lack of sensitivity to words
  In a PCFG, lexical information can enter only through the probabilities of the pre-terminal rules (Verb, Noun, Det, ...).
  However, lexical dependencies turn out to be important for modeling syntactic probabilities.
  Example: Moscow sent more than 100,000 soldiers into Afghanistan.
    The PP "into Afghanistan" may attach to the NP (more than 100,000 soldiers) or to the VP (sent).
    Statistics show that NP attachment is preferred overall (67% in one count, 52% in another), so a PCFG chooses NP attachment and parses this sentence incorrectly.
    Why? The verb "send" subcategorizes for a destination, which can be expressed with the preposition "into"; in fact, when the verb is "send", an "into"-PP always attaches to the verb.

19 PCFG Problems
Coordination ambiguity
  Example: dogs in houses and cats
  Semantically, "dogs" is a better conjunct for "cats" than "houses" is, so the parse [dogs in [NP houses and cats]] sounds unnatural and should be dispreferred.
  However, a PCFG assigns both parses the same probability, because the two structures use exactly the same rules.

20 References
  NLTK Tutorial, Probabilistic Parsing: http://nltk.sourceforge.net/tutorial/pcfg/index.html
  Stanford Probabilistic Parsing Group: http://nlp.stanford.edu/projects/stat-parsing.shtml
  CYK algorithm (general): http://en.wikipedia.org/wiki/CYK_algorithm
  CYK algorithm, interactive example: http://www2.informatik.hu-berlin.de/~pohl/cyk.php?action=example
  Probabilistic CYK parsing: http://www.ifi.unizh.ch/cl/gschneid/ParserVorl/ParserVorl7.pdf and http://catarina.ai.uiuc.edu/ling306/slides/lecture23.pdf

21 Questions?

