1
Probabilistic Parsing Ling 571 Fei Xia Week 4: 10/18-10/20/05
2
Outline
Misc: Hw3 and Hw4 (lexicalized rules)
CYK recap
– Converting CFG into CNF
– N-best
Quiz #2
Common prob equations
Independence assumption
Lexicalized models
3
CYK Recap
4
Converting CFG into CNF
– CNF
– Extended CNF
– CFG in general vs. CFG for natural languages
– Converting CFG into CNF
– Converting PCFG into CNF
– Recovering parse trees
5
Definition of CNF
A, B, C are nonterminals, a is a terminal, S is the start symbol.
Definition 1:
– A → B C
– A → a
– S → ε
where B and C are not the start symbol.
Definition 2 (ε-free grammar):
– A → B C
– A → a
6
Extended CNF
Definition 3:
– A → B C
– A → a or A → B
We use Def 3:
– Unit rules such as NP → N are allowed.
– There is no need to remove unit rules during conversion.
– The CYK algorithm needs to be modified.
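The three rule shapes of Def 3 can be checked mechanically. A minimal sketch, assuming a rule is represented as an `(lhs, rhs)` pair with `rhs` a tuple of symbols (the function name `is_extended_cnf` and this representation are my own, not from the slides):

```python
def is_extended_cnf(lhs, rhs, nonterminals):
    """True if the rule fits Def 3: A -> B C, A -> a, or A -> B (unit rule)."""
    if len(rhs) == 2:
        return all(s in nonterminals for s in rhs)  # A -> B C
    if len(rhs) == 1:
        return True  # A -> a (terminal) or A -> B (unit rule): both allowed
    return False     # n-ary rules (n > 2) must be binarized first

N = {"S", "NP", "VP", "N", "V", "PP"}
print(is_extended_cnf("NP", ("N",), N))             # True: unit rule, allowed under Def 3
print(is_extended_cnf("VP", ("V", "NP", "PP"), N))  # False: ternary rule
```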
7
CYK algorithm with Def 2
For every position i, for every rule A → w_i: set P[i][i][A] = P(A → w_i).
For span = 2 to N
  for begin = 1 to N − span + 1
    end = begin + span − 1
    for m = begin to end − 1
      for all nonterminals A, B, C:
        if A → B C is a rule and P[begin][m][B] > 0 and P[m+1][end][C] > 0,
        then update P[begin][end][A] with P(A → B C) × P[begin][m][B] × P[m+1][end][C].
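The loop above can be sketched directly in code. A minimal probabilistic CYK recognizer for a Def 2 grammar, assuming `lexical` maps `(A, word)` to a probability and `binary` maps `(A, B, C)` to a probability (these dictionary shapes and the name `cyk` are my own choices):

```python
from collections import defaultdict

def cyk(words, lexical, binary):
    """Probabilistic CYK for a CNF grammar (Def 2: A -> B C, A -> a).
    P[(begin, end)][A] = best probability of A covering words[begin..end] (1-based)."""
    n = len(words)
    P = defaultdict(dict)
    for i, w in enumerate(words, start=1):      # for every rule A -> w_i
        for (A, word), p in lexical.items():
            if word == w:
                P[(i, i)][A] = max(p, P[(i, i)].get(A, 0.0))
    for span in range(2, n + 1):
        for begin in range(1, n - span + 2):
            end = begin + span - 1
            for m in range(begin, end):         # split point
                for (A, B, C), p in binary.items():
                    if B in P[(begin, m)] and C in P[(m + 1, end)]:
                        cand = p * P[(begin, m)][B] * P[(m + 1, end)][C]
                        if cand > P[(begin, end)].get(A, 0.0):
                            P[(begin, end)][A] = cand
    return P

# Toy grammar for the later example sentence "he likes her":
lexical = {("NP", "he"): 0.5, ("NP", "her"): 0.5, ("V", "likes"): 1.0}
binary = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
chart = cyk(["he", "likes", "her"], lexical, binary)
print(chart[(1, 3)]["S"])  # 0.25
```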
8
CYK algorithm with Def 3
For every position i:
  for all A, if A → w_i, set P[i][i][A] = P(A → w_i)
  for all A and B, if A ⇒ B via unit rules, update P[i][i][A]
For span = 2 to N
  for begin = 1 to N − span + 1
    end = begin + span − 1
    for m = begin to end − 1
      for all nonterminals A, B, C: (same binary update as with Def 2)
      for all nonterminals A and B, if A → B, update P[begin][end][A]
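The extra step for Def 3 is a closure over unit rules applied to each chart cell. A minimal sketch, assuming a cell is a dict from nonterminal to best probability and `unit` maps `(A, B)` to P(A → B) (names of my own choosing):

```python
def apply_unit_rules(cell, unit):
    """Update a chart cell with unit rules A -> B until no probability improves:
    P(A) = max(P(A), P(A -> B) * P(B)).  Iterating handles chains A => B => C."""
    changed = True
    while changed:
        changed = False
        for (A, B), p in unit.items():
            if B in cell and p * cell[B] > cell.get(A, 0.0):
                cell[A] = p * cell[B]
                changed = True
    return cell

cell = apply_unit_rules({"N": 0.4}, {("NP", "N"): 0.8})
print(round(cell["NP"], 2))  # 0.32
```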
9
CFG
CFG in general:
– G = (N, T, P, S)
– Rules: A → α, where A ∈ N and α ∈ (N ∪ T)*
CFG for natural languages:
– G = (N, T, P, S)
– Pre-terminals: the POS categories that rewrite to words
– Rules:
  Syntactic rules: the right-hand side contains only nonterminals
  Lexicon: A → w, where A is a pre-terminal and w is a word
10
Conversion from CFG to CNF
CFG (in general) to CNF (Def 1):
– Add S0 → S
– Remove ε-rules
– Remove unit rules
– Replace n-ary rules with binary rules
CFG (for NL) to CNF (Def 3):
– CFG (for NL) has no ε-rules
– Unit rules are allowed in CNF (Def 3)
– Only the last step (binarization) is necessary
11
An example
VP → V NP PP PP
To recover the parse tree w.r.t. the original CFG, just remove the added nonterminals.
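Removing the added nonterminals amounts to splicing them out and promoting their children. A minimal sketch, assuming a tree is either a word string or a `(label, children)` pair (this representation and the name `remove_added` are my own):

```python
def remove_added(tree, added):
    """Recover the tree w.r.t. the original CFG by splicing out nonterminals
    introduced during binarization (e.g. X1, X2) and promoting their children."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    flat = []
    for child in children:
        child = remove_added(child, added)
        if not isinstance(child, str) and child[0] in added:
            flat.extend(child[1])  # promote the added node's children
        else:
            flat.append(child)
    return (label, flat)

# Binarized tree for VP -> V X1, X1 -> NP X2, X2 -> PP PP:
t = ("VP", [("V", ["bought"]),
            ("X1", [("NP", ["books"]),
                    ("X2", [("PP", ["with"]), ("PP", ["cash"])])])])
print(remove_added(t, {"X1", "X2"}))
# ('VP', [('V', ['bought']), ('NP', ['books']), ('PP', ['with']), ('PP', ['cash'])])
```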
12
Converting PCFG into CNF
VP → V NP PP PP   0.1
=>
VP → V X1    0.1
X1 → NP X2   1.0
X2 → PP PP   1.0
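The binarization above can be sketched as a small routine: the original probability goes on the first new rule, and each fresh X rule gets probability 1.0, so the product over the binary rules equals the original rule's probability. The function name `binarize` and the shared `counter` for fresh names are my own choices:

```python
def binarize(lhs, rhs, prob, counter=[0]):
    """Replace an n-ary PCFG rule (n > 2) with binary rules, as on the slide.
    counter is shared mutable state used to name fresh nonterminals X1, X2, ..."""
    rules = []
    while len(rhs) > 2:
        counter[0] += 1
        new = f"X{counter[0]}"
        rules.append((lhs, (rhs[0], new), prob))
        lhs, rhs, prob = new, rhs[1:], 1.0   # remaining rules get prob 1.0
    rules.append((lhs, tuple(rhs), prob))
    return rules

rules = binarize("VP", ("V", "NP", "PP", "PP"), 0.1)
for r in rules:
    print(r)
# ('VP', ('V', 'X1'), 0.1)
# ('X1', ('NP', 'X2'), 1.0)
# ('X2', ('PP', 'PP'), 1.0)
```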
13
CYK with N-best output
14
N-best parse trees
Best parse tree: T* = argmax_T P(T | S)
N-best parse trees: the N trees with the highest P(T | S)
15
CYK algorithm for N-best
Each cell P[begin][end][A] now holds an array of the N best probabilities, and B[begin][end][A] the corresponding back-pointers.
For every rule A → w_i, initialize P[i][i][A].
For span = 2 to N
  for begin = 1 to N − span + 1
    end = begin + span − 1
    for m = begin to end − 1
      for all nonterminals A, B, C:
        for each pair of entries i, j in P[begin][m][B] and P[m+1][end][C]:
          val = P(A → B C) × P[begin][m][B][i] × P[m+1][end][C][j]
          if val > one of the probs in P[begin][end][A], then
            remove the last element of P[begin][end][A] and insert val into the array;
            remove the last element of B[begin][end][A] and insert (m, B, C, i, j) into B[begin][end][A].
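The "remove the last element and insert" update keeps each cell's list sorted by descending probability. A minimal sketch of that update, assuming a cell is a list of `(prob, backpointer)` pairs (the name `nbest_update` is my own):

```python
import bisect

def nbest_update(cell, val, back, n=3):
    """Insert (val, back) into a cell's n-best list, kept sorted in descending
    probability; drop the worst entry if the list grows beyond n."""
    probs = [-p for p, _ in cell]          # negate: bisect needs ascending order
    i = bisect.bisect_left(probs, -val)
    cell.insert(i, (val, back))
    if len(cell) > n:
        cell.pop()                         # remove the last (smallest) element
    return cell

cell = []
for val, back in [(0.2, "a"), (0.5, "b"), (0.1, "c"), (0.4, "d")]:
    nbest_update(cell, val, back, n=3)
print(cell)  # [(0.5, 'b'), (0.4, 'd'), (0.2, 'a')]
```

In a real parser the back-pointer would be the tuple (m, B, C, i, j) from the slide; strings stand in for it here.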
16
Mary bought books with cash
Chart entries, with back-pointers (m, i, j):
S → NP VP (1,1,1); S → NP VP (1,1,2)
VP → V NP (2,1,1); VP → VP PP (3,1,1)
NP → NP PP (3,1,1)
PP → P NP (4,1,1)
S → NP VP (1,1,1); VP → V NP (2,1,1)
Lexical entries: N → cash, NP → N; P → with; N → books, NP → N; V → bought
17
Common probability equations
18
Three types of probability
Joint prob: P(x,y) = prob of x and y occurring together
Conditional prob: P(x|y) = prob of x given a specific value of y
Marginal prob: P(x) = Σ_y P(x,y), the prob of x summed over all possible values of y
19
Common equations
P(x,y) = P(x|y) P(y) = P(y|x) P(x)
P(x) = Σ_y P(x,y)
P(x|y) = P(x,y) / P(y)
Bayes' rule: P(x|y) = P(y|x) P(x) / P(y)
20
An example
#(words) = 100, #(nouns) = 40, #(verbs) = 20
“books” appears 10 times: 3 times as a verb, 7 times as a noun.
P(w=books) = 10/100 = 0.1
P(w=books, t=noun) = 7/100 = 0.07
P(t=noun | w=books) = 0.07/0.1 = 0.7
P(t=noun) = 0.4
P(w=books | t=noun) = 0.07/0.4 = 7/40
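The slide's numbers follow directly from the counts via the equations above; a quick check in code (variable names are my own):

```python
# Raw counts from the slide: 100 words, 40 nouns;
# "books" occurs 10 times, 7 of them as a noun.
total_words, noun_tokens = 100, 40
books_total, books_noun = 10, 7

p_books = books_total / total_words              # P(w=books) = 0.1
p_books_and_noun = books_noun / total_words      # P(w=books, t=noun) = 0.07
p_noun_given_books = p_books_and_noun / p_books  # P(t=noun | w=books) = 0.07/0.1 = 0.7
p_noun = noun_tokens / total_words               # P(t=noun) = 0.4
p_books_given_noun = p_books_and_noun / p_noun   # P(w=books | t=noun) = 0.07/0.4 = 7/40

print(p_books, p_books_and_noun, p_noun, round(p_books_given_noun, 3))
```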
21
More general cases
22
Independence assumption
23
Two variables A and B are independent if:
– P(A,B) = P(A) P(B)
– P(A) = P(A|B)
– P(B) = P(B|A)
Two variables A and B are conditionally independent given C if:
– P(A,B|C) = P(A|C) P(B|C)
– P(A|B,C) = P(A|C)
– P(B|A,C) = P(B|C)
The independence assumption is used to remove some conditioning factors, which reduces the number of parameters in a model.
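The first definition can be tested numerically. A trivial sketch (the name `independent` and the tolerance are my own choices; floating-point comparison needs a tolerance rather than `==`):

```python
def independent(p_ab, p_a, p_b, tol=1e-9):
    """Check the definition P(A,B) = P(A) * P(B) up to floating-point tolerance."""
    return abs(p_ab - p_a * p_b) < tol

print(independent(0.06, 0.2, 0.3))  # True: 0.2 * 0.3 = 0.06
print(independent(0.10, 0.2, 0.3))  # False: 0.10 != 0.06
```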
24
PCFG parsers
A PCFG assumes that each rule expansion is independent of the other rules used in the derivation.
25
Problems of independence assumptions
Lexical independence:
– P(VP → V, V → bought) = P(VP → V) × P(V → bought)
See Table 12.2 on M&S p. 418:

              come    take    think   want
VP → V        9.5%    2.6%    4.6%    5.7%
VP → V NP     1.1%    32.1%   0.2%    13.9%
VP → V PP     34.5%   3.1%    7.1%    0.3%
VP → V SBAR   6.6%    0.3%    73.0%   0.2%
26
Problems of independence assumptions (cont)
Structural independence:
– P(S → NP VP, NP → Pron) = P(S → NP VP) × P(NP → Pron)
See Table 12.3 on M&S p. 420:

               % as subj   % as obj
NP → Pron      13.7%       2.1%
NP → Det NN    5.6%        4.6%
NP → NP SBAR   0.5%        2.6%
NP → NP PP     5.6%        14.1%
27
Dealing with the problems
Lexical rules:
– P(VP → V | V=come)
– P(VP → V | V=think)
Adding context info: condition each rule on a function of the context that groups contexts into equivalence classes.
28
PCFG
A PCFG assumes that each rule expansion is independent of the other rules used in the derivation.
29
A lexicalized model
30
An example he likes her
31
Head-head probability
32
Head-rule probability
33
Collecting the counts
34
Remaining problems
he likes her
P(T,S) is the same if the sentence is changed to “her likes he”.
35
Previous model
36
A new model
37
New formula he likes her