CS626-449: Speech, NLP and the Web / Topics in AI
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
Lecture 16: Probabilistic parsing; computing the probability of a sentence
N-gram vs. PCFG
Chain Rule
P(w_1, w_2, …, w_n) = P(w_1) * P(w_2 | w_1) * P(w_3 | w_{1,2}) * … * P(w_n | w_{1,n-1})
where P(w_n | w_{1,n-1}) = #(w_1 … w_n) / #(w_1 … w_{n-1}), and # denotes "number of occurrences of".
Unigram & Bigram Probability
Unigram: P(w_1) = #(w_1) / #(words)
Bigram: P(w_2 | w_1) = #(w_1 w_2) / #(w_1)
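As a concrete illustration of these count-based estimates, here is a minimal Python sketch; the one-sentence toy corpus and the helper names are illustrative choices, not from the lecture:

```python
from collections import Counter

# Toy corpus (illustrative): counts stand in for the "#" notation above.
corpus = "the gunman sprayed the building with bullets".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
total_words = len(corpus)

def p_unigram(w):
    # P(w) = #(w) / #(words)
    return unigrams[w] / total_words

def p_bigram(w2, w1):
    # P(w2 | w1) = #(w1 w2) / #(w1)
    return bigrams[(w1, w2)] / unigrams[w1]

print(p_unigram("the"))             # 2/7 ~ 0.2857
print(p_bigram("building", "the"))  # 1/2 = 0.5
```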
PCFG
PCFG = CFG (written by a human) + rule probabilities P (estimated from a corpus)
N-gram vs. PCFG
N-gram: P(w_{1,m}) = P(w_1) * P(w_2 | w_1) * P(w_3 | w_{1,2}) * … * P(w_m | w_{1,m-1})
PCFG: P(w_{1,m}) = Σ_t P(w_{1,m}, t) (marginalisation over parse trees t)
Compare
N-gram (statistics; speech): P(w_{1,m}) = Π_{i=1..m} P(w_i | w_{1,i-1})
PCFG (statistics + linguistics): P(w_{1,m}) = Σ_t P(t), summed over all parses t with yield(t) = w_{1,m}
Example of Sentence Labeling: Parsing
[S1 [S [S [VP [VB Come] [NP [NNP July]]]] [, ,] [CC and] [S [NP [DT the] [JJ UJF] [NN campus]] [VP [AUX is] [ADJP [JJ abuzz] [PP [IN with] [NP [ADJP [JJ new] [CC and] [VBG returning]] [NNS students]]]]]] [. .]]]
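The labeled bracketing above is machine-readable. As a hedged sketch (assuming NLTK is installed), it can be loaded and inspected with nltk.tree.Tree, telling fromstring to expect square brackets:

```python
from nltk.tree import Tree

# The slide's labeled bracketing, split across lines for readability.
bracketing = (
    "[S1 [S [S [VP [VB Come] [NP [NNP July]]]] [, ,] [CC and] "
    "[S [NP [DT the] [JJ UJF] [NN campus]] [VP [AUX is] "
    "[ADJP [JJ abuzz] [PP [IN with] [NP [ADJP [JJ new] [CC and] "
    "[VBG returning]] [NNS students]]]]]] [. .]]]"
)

tree = Tree.fromstring(bracketing, brackets="[]")
tree.pretty_print()   # draws the parse tree as ASCII art
print(tree.leaves())  # the yield, i.e. the words of the sentence
```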
Rule Probabilities
Rule probabilities are such that, for each non-terminal, the probabilities of all rules expanding it sum to 1. E.g.,
P(NP → DT NN) = 0.2
P(NP → NN) = 0.5
P(NP → NP PP) = 0.3
P(NP → DT NN) = 0.2 means that 20% of the NP expansions in the training data use the rule NP → DT NN.
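A small Python sketch of this normalisation constraint; the (lhs, rhs) dictionary encoding is an illustrative choice, populated with the NP rules from this slide:

```python
from collections import defaultdict

# Encoding (illustrative): (lhs, rhs_tuple) -> probability.
rules = {
    ("NP", ("DT", "NN")): 0.2,
    ("NP", ("NN",)): 0.5,
    ("NP", ("NP", "PP")): 0.3,
}

# Sum the probability mass per left-hand-side non-terminal.
totals = defaultdict(float)
for (lhs, _rhs), p in rules.items():
    totals[lhs] += p

for lhs, total in totals.items():
    # Allow a little slack for floating-point rounding.
    assert abs(total - 1.0) < 1e-9, f"{lhs} rules sum to {total}, not 1"
print(dict(totals))  # NP's rules sum to 1 (up to rounding)
```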
Probability of a sentence
Notation:
- w_{ab}: the subsequence w_a … w_b
- N^j dominates w_a … w_b, or yield(N^j) = w_a … w_b (e.g. an NP node dominating "the sweet teddy bear")
Probability of a sentence: P(w_{1m}) = Σ_t P(w_{1m}, t) = Σ_t P(t) * P(w_{1m} | t). If t is a parse tree for the sentence w_{1m}, then P(w_{1m} | t) = 1, so P(w_{1m}) = Σ_{t: yield(t) = w_{1m}} P(t).
Assumptions of the PCFG model
[Figure: a tree S → NP VP, with an NP at location 1 yielding "The child" and another NP at location 2, inside the VP, also yielding "The child".]
- Place invariance: P(NP → DT NN) is the same in locations 1 and 2.
- Context-free: P(NP → DT NN | anything outside "The child") = P(NP → DT NN).
- Ancestor-free: at location 2, P(NP → DT NN | its ancestor is VP) = P(NP → DT NN).
Probability of a parse tree
Domination: we say N^j dominates words k through l, symbolized as N^j_{k,l}, if w_{k,l} is derived from N^j.
P(tree | sentence) = P(tree | S_{1,l}), where S_{1,l} means that the start symbol S dominates the word sequence w_{1,l}.
P(t | s) approximately equals the joint probability of the constituent non-terminals dominating the sentence fragments (next slide).
Probability of a parse tree (cont.)
[Figure: parse tree over w_1 … w_l with S_{1,l} → NP_{1,2} VP_{3,l}; NP_{1,2} → DT_{1,1} N_{2,2}; VP_{3,l} → V_{3,3} PP_{4,l}; PP_{4,l} → P_{4,4} NP_{5,l}.]
P(t | s) = P(t | S_{1,l})
= P(NP_{1,2}, DT_{1,1}, w_1, N_{2,2}, w_2, VP_{3,l}, V_{3,3}, w_3, PP_{4,l}, P_{4,4}, w_4, NP_{5,l}, w_{5…l} | S_{1,l})
= P(NP_{1,2}, VP_{3,l} | S_{1,l}) * P(DT_{1,1}, N_{2,2} | NP_{1,2}) * P(w_1 | DT_{1,1}) * P(w_2 | N_{2,2}) * P(V_{3,3}, PP_{4,l} | VP_{3,l}) * P(w_3 | V_{3,3}) * P(P_{4,4}, NP_{5,l} | PP_{4,l}) * P(w_4 | P_{4,4}) * P(w_{5…l} | NP_{5,l})
(using the chain rule, context-freeness and ancestor-freeness)
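A hedged sketch of this computation: under the three independence assumptions, P(t | s) reduces to one rule probability per internal node, multiplied together. The nested-tuple tree encoding and the rule_probs mapping are illustrative choices, not the lecture's notation:

```python
# A parse tree is encoded as (label, child, child, ...); a leaf is a string.
# rule_probs maps (lhs, rhs_tuple) -> probability, lexical rules included,
# e.g. ("DT", ("the",)) -> 1.0.

def tree_prob(tree, rule_probs):
    label, children = tree[0], tree[1:]
    # RHS of this node's rule: child labels (or the word, for lexical rules).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_probs[(label, rhs)]
    for child in children:
        if not isinstance(child, str):
            p *= tree_prob(child, rule_probs)  # recurse into constituents
    return p

# e.g. P(NP -> DT NN) * P(DT -> the) * P(NN -> gunman) = 0.5 * 1.0 * 0.5
probs = {("NP", ("DT", "NN")): 0.5, ("DT", ("the",)): 1.0, ("NN", ("gunman",)): 0.5}
print(tree_prob(("NP", ("DT", "the"), ("NN", "gunman")), probs))  # 0.25
```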
Example PCFG Rules & Probabilities
S → NP VP        1.0
NP → DT NN       0.5
NP → NNS         0.3
NP → NP PP       0.2
PP → P NP        1.0
VP → VP PP       0.6
VP → VBD NP      0.4
DT → the         1.0
NN → gunman      0.5
NN → building    0.5
VBD → sprayed    1.0
NNS → bullets    1.0
P → with         1.0
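For experimentation, the same grammar can be transcribed into NLTK's PCFG format; a sketch assuming NLTK is installed (fromstring rejects grammars whose rule probabilities do not sum to 1 per left-hand side):

```python
import nltk

# The rule table above, in NLTK's PCFG notation.
grammar = nltk.PCFG.fromstring("""
    S   -> NP VP      [1.0]
    NP  -> DT NN      [0.5]
    NP  -> NNS        [0.3]
    NP  -> NP PP      [0.2]
    PP  -> P NP       [1.0]
    VP  -> VP PP      [0.6]
    VP  -> VBD NP     [0.4]
    DT  -> 'the'      [1.0]
    NN  -> 'gunman'   [0.5]
    NN  -> 'building' [0.5]
    VBD -> 'sprayed'  [1.0]
    NNS -> 'bullets'  [1.0]
    P   -> 'with'     [1.0]
""")
print(grammar)
```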
Example Parse t_1
The gunman sprayed the building with bullets.
[Figure: parse tree t_1, with the PP attached to the VP: S_{1.0} → NP_{0.5} VP_{0.6}; NP_{0.5} → DT_{1.0} "The" NN_{0.5} "gunman"; VP_{0.6} → VP_{0.4} PP_{1.0}; VP_{0.4} → VBD_{1.0} "sprayed" NP_{0.5}; NP_{0.5} → DT_{1.0} "the" NN_{0.5} "building"; PP_{1.0} → P_{1.0} "with" NP_{0.3}; NP_{0.3} → NNS_{1.0} "bullets".]
P(t_1) = 1.0 * 0.5 * 1.0 * 0.5 * 0.6 * 0.4 * 1.0 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0045
Another Parse t_2
The gunman sprayed the building with bullets.
[Figure: parse tree t_2, with the PP attached to the object NP: S_{1.0} → NP_{0.5} VP_{0.4}; NP_{0.5} → DT_{1.0} "The" NN_{0.5} "gunman"; VP_{0.4} → VBD_{1.0} "sprayed" NP_{0.2}; NP_{0.2} → NP_{0.5} PP_{1.0}; NP_{0.5} → DT_{1.0} "the" NN_{0.5} "building"; PP_{1.0} → P_{1.0} "with" NP_{0.3}; NP_{0.3} → NNS_{1.0} "bullets".]
P(t_2) = 1.0 * 0.5 * 1.0 * 0.5 * 0.4 * 1.0 * 0.2 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0015
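Tying the two parses back to the marginalisation in "Probability of a sentence": a hedged sketch that uses NLTK's bottom-up probabilistic chart parser to enumerate the parses of the sentence under the grammar above and sum their probabilities; with no pruning it should find both t_1 and t_2:

```python
from nltk.parse.pchart import InsideChartParser

# `grammar` is the nltk.PCFG built from the rule table above.
parser = InsideChartParser(grammar)
# Lowercased to match the grammar's terminals.
tokens = "the gunman sprayed the building with bullets".split()

parses = list(parser.parse(tokens))
for t in parses:
    print(f"{t.prob():.6f}")  # expected: 0.0045 (t_1) and 0.0015 (t_2)
    t.pretty_print()

# P(sentence) marginalises over all parses of the sentence:
print(sum(t.prob() for t in parses))  # 0.0060
```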