Slide 1: CS626-449: Speech, NLP and the Web / Topics in AI
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
Lecture 17: Probabilistic parsing; inside-outside probabilities
Slide 2: Probability of a parse tree (cont.)
[Tree diagram: S_{1,l} -> NP_{1,2} VP_{3,l}; NP_{1,2} -> DT_{1,1} N_{2,2}; VP_{3,l} -> V_{3,3} PP_{4,l}; PP_{4,l} -> P_{4,4} NP_{5,l}; the leaves are the words w_1 ... w_l.]

P(t|s) = P(t | S_{1,l})
       = P(NP_{1,2}, DT_{1,1}, w_1, N_{2,2}, w_2, VP_{3,l}, V_{3,3}, w_3, PP_{4,l}, P_{4,4}, w_4, NP_{5,l}, w_{5...l} | S_{1,l})
       = P(NP_{1,2}, VP_{3,l} | S_{1,l}) * P(DT_{1,1}, N_{2,2} | NP_{1,2}) * P(w_1 | DT_{1,1}) * P(w_2 | N_{2,2})
         * P(V_{3,3}, PP_{4,l} | VP_{3,l}) * P(w_3 | V_{3,3}) * P(P_{4,4}, NP_{5,l} | PP_{4,l}) * P(w_4 | P_{4,4}) * P(w_{5...l} | NP_{5,l})

(using the chain rule, context-freeness, and ancestor-freeness)
Slide 3: Example PCFG Rules & Probabilities
S   -> NP VP    1.0
NP  -> DT NN    0.5
NP  -> NNS      0.3
NP  -> NP PP    0.2
PP  -> P NP     1.0
VP  -> VP PP    0.6
VP  -> VBD NP   0.4
DT  -> the      1.0
NN  -> gunman   0.5
NN  -> building 0.5
VBD -> sprayed  1.0
NNS -> bullets  1.0
P   -> with     1.0   (used in the parse trees below)
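This grammar is small enough to encode directly. A minimal Python sketch, assuming a dictionary-based encoding of my own (the names BINARY, UNARY, LEXICAL are not from the lecture); the later sketches in this deck reuse it:

```python
# The slide's PCFG as plain Python dictionaries.

# Binary rules: (LHS, (RHS1, RHS2)) -> probability
BINARY = {
    ("S",  ("NP",  "VP")): 1.0,
    ("NP", ("DT",  "NN")): 0.5,
    ("NP", ("NP",  "PP")): 0.2,
    ("PP", ("P",   "NP")): 1.0,
    ("VP", ("VP",  "PP")): 0.6,
    ("VP", ("VBD", "NP")): 0.4,
}

# Unary rule over a non-terminal: (LHS, RHS) -> probability
UNARY = {("NP", "NNS"): 0.3}

# Lexical rules: (tag, word) -> probability
LEXICAL = {
    ("DT",  "the"):      1.0,
    ("NN",  "gunman"):   0.5,
    ("NN",  "building"): 0.5,
    ("VBD", "sprayed"):  1.0,
    ("NNS", "bullets"):  1.0,
    ("P",   "with"):     1.0,
}
```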
Slide 4: Example Parse t1
Sentence: "The gunman sprayed the building with bullets."
[Parse tree t1 (PP attached to the VP):
 S_{1.0} -> NP_{0.5} VP_{0.6}
 NP_{0.5} -> DT_{1.0}(The) NN_{0.5}(gunman)
 VP_{0.6} -> VP_{0.4} PP_{1.0}
 VP_{0.4} -> VBD_{1.0}(sprayed) NP_{0.5}
 NP_{0.5} -> DT_{1.0}(the) NN_{0.5}(building)
 PP_{1.0} -> P_{1.0}(with) NP_{0.3}
 NP_{0.3} -> NNS_{1.0}(bullets)]

P(t1) = 1.0 * 0.5 * 1.0 * 0.5 * 0.6 * 0.4 * 1.0 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0045
Slide 5: Another Parse t2
Sentence: "The gunman sprayed the building with bullets."
[Parse tree t2 (PP attached to the object NP):
 S_{1.0} -> NP_{0.5} VP_{0.4}
 NP_{0.5} -> DT_{1.0}(The) NN_{0.5}(gunman)
 VP_{0.4} -> VBD_{1.0}(sprayed) NP_{0.2}
 NP_{0.2} -> NP_{0.5} PP_{1.0}
 NP_{0.5} -> DT_{1.0}(the) NN_{0.5}(building)
 PP_{1.0} -> P_{1.0}(with) NP_{0.3}
 NP_{0.3} -> NNS_{1.0}(bullets)]

P(t2) = 1.0 * 0.5 * 1.0 * 0.5 * 0.4 * 1.0 * 0.2 * 0.5 * 1.0 * 0.5 * 1.0 * 1.0 * 0.3 * 1.0 = 0.0015
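A parse tree's probability is simply the product of the probabilities of the rules it uses. A minimal sketch of the two computations above (flattening each tree to a list of rule probabilities is my own simplification):

```python
from math import prod

# t1: PP attached to the VP (rules read off the tree on Slide 4)
t1 = [1.0,             # S   -> NP VP
      0.5, 1.0, 0.5,   # NP  -> DT NN, DT -> the, NN -> gunman
      0.6,             # VP  -> VP PP
      0.4, 1.0,        # VP  -> VBD NP, VBD -> sprayed
      0.5, 1.0, 0.5,   # NP  -> DT NN, DT -> the, NN -> building
      1.0, 1.0,        # PP  -> P NP, P -> with
      0.3, 1.0]        # NP  -> NNS, NNS -> bullets

# t2: PP attached to the object NP (rules read off the tree on Slide 5)
t2 = [1.0,             # S   -> NP VP
      0.5, 1.0, 0.5,   # NP  -> DT NN, DT -> the, NN -> gunman
      0.4, 1.0,        # VP  -> VBD NP, VBD -> sprayed
      0.2,             # NP  -> NP PP
      0.5, 1.0, 0.5,   # NP  -> DT NN, DT -> the, NN -> building
      1.0, 1.0,        # PP  -> P NP, P -> with
      0.3, 1.0]        # NP  -> NNS, NNS -> bullets

print(prod(t1))  # 0.0045
print(prod(t2))  # 0.0015
```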
Slide 6: HMM ↔ PCFG
O  observed sequence  ↔  w_{1m}  sentence
X  state sequence     ↔  t       parse tree
μ  model              ↔  G       grammar

Three fundamental questions:
Slide 7: HMM ↔ PCFG
How likely is a certain observation given the model?  ↔  How likely is a sentence given the grammar?
P(O | μ)  ↔  P(w_{1m} | G)

How to choose a state sequence which best explains the observations?  ↔  How to choose a parse which best supports the sentence?
argmax_X P(X | O, μ)  ↔  argmax_t P(t | w_{1m}, G)
Slide 8: HMM ↔ PCFG
How to choose the model parameters that best explain the observed data?  ↔  How to choose rule probabilities which maximize the probabilities of the observed sentences?
argmax_μ P(O | μ)  ↔  argmax_G P(w_{1m} | G)
Slide 9: Interesting Probabilities
Sentence: The(1) gunman(2) sprayed(3) the(4) building(5) with(6) bullets(7)

Inside probability: what is the probability of having an NP at positions 4-5 such that it derives "the building"? (β_NP(4,5))
Outside probability: what is the probability of starting from N^1 and deriving "The gunman sprayed", an NP, and "with bullets"? (α_NP(4,5))
Slide 10: Interesting Probabilities (cont.)
Random variables to be considered:
- The non-terminal being expanded, e.g., NP
- The word-span covered by the non-terminal, e.g., (4,5) refers to the words "the building"

While calculating probabilities, consider:
- The rule to be used for expansion, e.g., NP -> DT NN
- The probabilities associated with the RHS non-terminals, e.g., the DT subtree's and the NN subtree's inside/outside probabilities
Slide 11: Outside Probabilities
Forward ↔ Outside

α_j(p,q): the probability of beginning with N^1 and generating the non-terminal N^j_{pq} and all the words outside w_p ... w_q.

Forward probability:  α_i(t) = P(o_1 ... o_{t-1}, X_t = i | μ)
Outside probability:  α_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m} | G)

[Diagram: N^1 at the root; N^j spans w_p ... w_q; the outside words are w_1 ... w_{p-1} and w_{q+1} ... w_m.]
Slide 12: Inside Probabilities
Backward ↔ Inside

β_j(p,q): the probability of generating the words w_p ... w_q starting from the non-terminal N^j_{pq}.

Backward probability:  β_i(t) = P(o_t ... o_T | X_t = i, μ)
Inside probability:    β_j(p,q) = P(w_{pq} | N^j_{pq}, G)

[Diagram: N^j spans w_p ... w_q inside the sentence w_1 ... w_m rooted at N^1.]
Slide 13: Outside & Inside Probabilities
For the sentence The(1) gunman(2) sprayed(3) the(4) building(5) with(6) bullets(7):

α_NP(4,5) = P(The gunman sprayed, NP_{4,5}, with bullets | G)
β_NP(4,5) = P(the building | NP_{4,5}, G)
Slide 14: Inside Probabilities β_j(p,q)
Base case:  β_j(k,k) = P(N^j_{kk} -> w_k | G)

The base case is used for rules which derive the words (terminals) directly.
E.g., suppose N^j = NN is being considered and NN -> building is one of the rules, with probability 0.5; then β_NN(5,5) = 0.5.
Slide 15: Induction Step
Induction step:

β_j(p,q) = Σ_{r,s} Σ_{d=p}^{q-1} P(N^j -> N^r N^s) · β_r(p,d) · β_s(d+1,q)

[Diagram: N^j spans w_p ... w_q, split into N^r over w_p ... w_d and N^s over w_{d+1} ... w_q.]

- Consider the different splits of the words, indicated by d; e.g., for "the huge building", split at d = 2 or d = 3.
- Consider the different non-terminals that can be used in the rule: NP -> DT NN and NP -> DT NNS are available options.
- Sum over all of these.
Slide 16: The Bottom-Up Approach
The idea of induction: consider "the gunman".

Base cases (apply the lexical rules):
DT -> the      prob = 1.0
NN -> gunman   prob = 0.5

Induction: the probability that an NP covers these two words
= P(NP -> DT NN) * P(DT derives "the") * P(NN derives "gunman")
= 0.5 * 1.0 * 0.5 = 0.25

[Tree: NP_{0.5} -> DT_{1.0}(The) NN_{0.5}(gunman)]
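A minimal sketch of the full bottom-up (CYK-style) inside computation, covering the base case of Slide 14 and the induction step of Slide 15. It assumes the BINARY/UNARY/LEXICAL dictionaries from the sketch after Slide 3; the function and variable names are my own:

```python
from collections import defaultdict

def inside(words, binary, unary, lexical):
    """Compute beta[(j, p, q)] = beta_j(p, q), the probability that
    non-terminal j derives words p..q (1-based, inclusive spans)."""
    m = len(words)
    beta = defaultdict(float)

    def apply_unary(p, q):
        # One pass suffices: this grammar has no unary chains or cycles.
        for (lhs, rhs), prob in unary.items():
            beta[(lhs, p, q)] += prob * beta[(rhs, p, q)]

    # Base case: beta_j(k, k) = P(N^j -> w_k)
    for k, word in enumerate(words, start=1):
        for (tag, w), prob in lexical.items():
            if w == word:
                beta[(tag, k, k)] += prob
        apply_unary(k, k)          # e.g. NP -> NNS over "bullets"

    # Induction: sum over split points d and binary rules N^j -> N^r N^s
    for span in range(2, m + 1):
        for p in range(1, m - span + 2):
            q = p + span - 1
            for d in range(p, q):
                for (j, (r, s)), prob in binary.items():
                    beta[(j, p, q)] += prob * beta[(r, p, d)] * beta[(s, d + 1, q)]
            apply_unary(p, q)

    return beta
```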
Slide 17: Parse Triangle
A parse triangle is constructed for calculating β_j(p,q).

Probability of a sentence using β_j(p,q):
P(w_{1m} | G) = P(N^1 =>* w_{1m} | G) = β_1(1,m)
Slide 18: Parse Triangle (base cases)
Rows/columns indexed by The(1) gunman(2) sprayed(3) the(4) building(5) with(6) bullets(7).

Fill the diagonal with β_j(k,k) = P(N^j -> w_k):
β_DT(1,1) = 1.0, β_NN(2,2) = 0.5, β_VBD(3,3) = 1.0, β_DT(4,4) = 1.0, β_NN(5,5) = 0.5, β_P(6,6) = 1.0, β_NNS(7,7) = 1.0
Slide 19: Parse Triangle (induction)
Rows/columns indexed by The(1) gunman(2) sprayed(3) the(4) building(5) with(6) bullets(7).

Calculate the remaining cells using the induction formula, e.g.:
β_NP(1,2) = P(NP -> DT NN) · β_DT(1,1) · β_NN(2,2) = 0.5 * 1.0 * 0.5 = 0.25
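Working up the triangle, larger cells combine smaller ones; for example (all values follow from the grammar on Slide 3):

β_PP(6,7) = P(PP -> P NP) · β_P(6,6) · β_NP(7,7) = 1.0 * 1.0 * 0.3 = 0.3
β_VP(3,5) = P(VP -> VBD NP) · β_VBD(3,3) · β_NP(4,5) = 0.4 * 1.0 * 0.25 = 0.1
β_NP(4,7) = P(NP -> NP PP) · β_NP(4,5) · β_PP(6,7) = 0.2 * 0.25 * 0.3 = 0.015
β_VP(3,7) = P(VP -> VP PP) · β_VP(3,5) · β_PP(6,7) + P(VP -> VBD NP) · β_VBD(3,3) · β_NP(4,7)
          = 0.6 * 0.1 * 0.3 + 0.4 * 1.0 * 0.015 = 0.024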
Slide 20: Example Parse t1
Sentence: "The gunman sprayed the building with bullets."
[Parse tree t1 as on Slide 4.] The rule used here for the upper VP is VP -> VP PP.
Slide 21: Another Parse t2
Sentence: "The gunman sprayed the building with bullets."
[Parse tree t2 as on Slide 5.] The rule used here for the VP is VP -> VBD NP.
Slide 22: Parse Triangle (completed)
Rows/columns indexed by The(1) gunman(2) sprayed(3) the(4) building(5) with(6) bullets(7).

Non-zero entries above the diagonal:
β_NP(1,2) = 0.25, β_VP(3,5) = 0.1, β_NP(4,5) = 0.25, β_NP(4,7) = 0.015, β_PP(6,7) = 0.3, β_VP(3,7) = 0.024, β_S(1,7) = 0.006
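Running the inside() sketch from Slide 16 over the example sentence reproduces these entries (assuming the earlier sketches are in scope):

```python
words = "the gunman sprayed the building with bullets".split()
beta = inside(words, BINARY, UNARY, LEXICAL)

print(beta[("NP", 1, 2)])   # 0.25   "the gunman"
print(beta[("VP", 3, 7)])   # 0.024  "sprayed the building with bullets"
print(beta[("S",  1, 7)])   # 0.006  = P(t1) + P(t2) = 0.0045 + 0.0015
```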
Slide 23: Different Parses
Consider:
- Different splitting points, e.g., the 5th and the 3rd position
- Different rules for the VP expansion, e.g., VP -> VP PP and VP -> VBD NP

Different parses for the VP "sprayed the building with bullets" can be constructed this way.
Slide 24: Outside Probabilities α_j(p,q)
Base case:
α_1(1,m) = 1 (the start symbol N^1 spans the whole sentence)
α_j(1,m) = 0 for j ≠ 1

Inductive step (summation over f, g and e):
α_j(p,q) = Σ_{f,g} Σ_{e=q+1}^{m} α_f(p,e) · P(N^f -> N^j N^g) · β_g(q+1,e)
         + Σ_{f,g} Σ_{e=1}^{p-1} α_f(e,q) · P(N^f -> N^g N^j) · β_g(e,p-1)

[Diagram: the parent N^f_{pe} expands as N^j_{pq} N^g_{(q+1)e}; the mirror case has N^j as the right child.]
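A minimal sketch of the outside computation, written in the equivalent "top-down distribution" form (each parent's outside mass is pushed to its children) rather than as the literal recursion above. It assumes the grammar dictionaries and the beta table from the earlier sketches; the names are my own:

```python
from collections import defaultdict

def outside(m, binary, unary, beta, root="S"):
    """Compute alpha[(j, p, q)] = alpha_j(p, q) for a sentence of m words,
    given the inside probabilities beta."""
    alpha = defaultdict(float)
    alpha[(root, 1, m)] = 1.0          # base case: alpha_1(1, m) = 1

    for span in range(m, 0, -1):       # largest spans first
        for p in range(1, m - span + 2):
            q = p + span - 1
            # Unary rules keep the span: alpha flows from LHS to RHS
            for (lhs, rhs), prob in unary.items():
                alpha[(rhs, p, q)] += prob * alpha[(lhs, p, q)]
            # Binary rules N^j -> N^r N^s: push alpha_j(p, q) down to each
            # child, weighted by the sibling's inside probability
            for d in range(p, q):
                for (j, (r, s)), prob in binary.items():
                    out = prob * alpha[(j, p, q)]
                    alpha[(r, p, d)]     += out * beta[(s, d + 1, q)]
                    alpha[(s, d + 1, q)] += out * beta[(r, p, d)]
    return alpha
```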
Slide 25: Probability of a Sentence
The joint probability of a sentence w_{1m} and of there being a constituent spanning words w_p to w_q is:

P(w_{1m}, N_{pq} | G) = Σ_j α_j(p,q) · β_j(p,q)

E.g., for The(1) gunman(2) sprayed(3) the(4) building(5) with(6) bullets(7), take the NP spanning positions 4-5 ("the building").
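Continuing the running example, the identity can be checked numerically for the span (4,5), "the building" (assuming the earlier sketches are in scope; NONTERMS is my own list of the grammar's non-terminals):

```python
alpha = outside(len(words), BINARY, UNARY, beta)

NONTERMS = ["S", "NP", "VP", "PP", "DT", "NN", "VBD", "P", "NNS"]
joint = sum(alpha[(j, 4, 5)] * beta[(j, 4, 5)] for j in NONTERMS)
print(joint)  # 0.006: both parses contain a constituent over "the building",
              # so the joint probability equals P(w_1m | G) here
```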