1
Naïve Bayes for WSD
$w$: ambiguous word
$S = \{s_1, s_2, \dots, s_n\}$: senses for $w$
$V = \{v_1, v_2, v_3\}$: words used as contextual features for disambiguation
We use the following models for WSD
2
Generative vs. discriminative
Write the joint probability distribution for the two graphical models (GMs).
[Figure: two graphical models over $s_k$ and $v_1$, $v_2$, $v_3$, one labeled Generative and one labeled Discriminative]
Generative
Estimation (Training): estimate the following probability distributions: PUT YOUR ANSWER HERE
Inference (Testing): compute the following conditional probability distribution: PUT YOUR ANSWER HERE
Discriminative
Estimation (Training): estimate the following probability distributions: PUT YOUR ANSWER HERE
Inference (Testing): compute the following conditional probability distribution: PUT YOUR ANSWER HERE
3
Naïve Bayes for topic classification
[Figure: graphical model over $T$ and $w_1, w_2, \dots, w_n$]
Recall the general joint probability distribution:
$P(X_1, \dots, X_N) = \prod_i P(X_i \mid \mathrm{Par}(X_i))$
$P(T, w_1, \dots, w_n) = P(T)\, P(w_1 \mid T)\, P(w_2 \mid T) \cdots P(w_n \mid T) = P(T) \prod_i P(w_i \mid T)$
Inference (Testing): compute the conditional probabilities $P(T \mid w_1, w_2, \dots, w_n)$
Estimation (Training): given data, estimate $P(T)$ and $P(w_i \mid T)$
4
Exercise: estimation
Topic = sport (num words = 16)
D1: 2009 open season
D2: against Maryland Sept
D3: play six games seasons
D3: schedule games weekends
D4: games games games
Topic = politics (num words = 20)
D1: Obama hoping rally support
D2: billion stimulus package
D3: House Republicans tax
D4: cuts spending GOP games
D4: Republicans obama open
D5: political season senate
Estimate $P(w_i \mid T_j)$ for each $w_i$, $T_j$:
P(obama | T = politics) = HERE (whole formula and numerical results)
P(obama | T = sport) = HERE (whole formula and numerical results)
P(season | T = politics) = HERE (only results)
P(season | T = sport) = HERE (only results)
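One way to check the answers to this exercise is to count tokens directly. The sketch below assumes the relative-frequency estimate, P(w | T) = count of w in topic T divided by the number of words in T. The slide does not spell this estimator out, but it fits the "num words" totals given above. The names `CORPUS`, `word_counts`, and `p_word_given_topic` are illustrative, and tokens are lowercased and matched exactly, so "seasons" is treated as a different word from "season".

```python
from collections import Counter

# Training documents transcribed from the slide, lowercased.
CORPUS = {
    "sport": [
        "2009 open season",
        "against maryland sept",
        "play six games seasons",
        "schedule games weekends",
        "games games games",
    ],
    "politics": [
        "obama hoping rally support",
        "billion stimulus package",
        "house republicans tax",
        "cuts spending gop games",
        "republicans obama open",
        "political season senate",
    ],
}


def word_counts(topic):
    """Count every token in all documents of one topic."""
    return Counter(tok for doc in CORPUS[topic] for tok in doc.split())


def p_word_given_topic(word, topic):
    """Assumed estimator: count(word, topic) / total tokens in topic."""
    counts = word_counts(topic)
    return counts[word.lower()] / sum(counts.values())


for topic in CORPUS:
    total = sum(word_counts(topic).values())
    print(topic, "num words =", total,
          "P(obama | T) =", p_word_given_topic("obama", topic))
```

The printed totals should match the "num words" figures on the slide, which is a quick sanity check on the transcription.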
5
Exercise
P(republicans | T = politics) = HERE
P(republicans | T = sport) = HERE
P(games | T = politics) = HERE
P(games | T = sport) = HERE
P(season | T = politics) = HERE
P(season | T = sport) = HERE
P(open | T = politics) = HERE
P(open | T = sport) = HERE
P(T = politics) = HERE
P(T = sport) = HERE
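The slide asks for P(T) without saying how to estimate it. A minimal sketch, assuming the prior is the share of training documents labeled with each topic (5 sport documents and 6 politics documents in the estimation exercise). Estimating the prior from word totals instead (16 and 20) would be an equally defensible convention and gives slightly different numbers. `DOCS_PER_TOPIC` and `prior` are illustrative names.

```python
from fractions import Fraction

# Number of training documents per topic, counted from the estimation slide.
DOCS_PER_TOPIC = {"sport": 5, "politics": 6}


def prior(topic):
    """Assumed estimator: P(T) = documents labeled T / all documents."""
    return Fraction(DOCS_PER_TOPIC[topic], sum(DOCS_PER_TOPIC.values()))


for topic in DOCS_PER_TOPIC:
    print(f"P(T = {topic}) = {prior(topic)}")
```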
6
Exercise: inference
What is the topic of the new documents?
– Republicans obama season
– games season open
– democrats kennedy
7
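Exercise: Bayes classification
We compute $P(T_j \mid c)$ with Bayes rule:
$P(T_j \mid c) = \frac{P(c \mid T_j)\, P(T_j)}{P(c)}$
Because of the dependencies encoded in this GM:
$P(c \mid T_j) = \prod_i P(w_i \mid T_j)$, where $w_1, \dots, w_n$ are the words of $c$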
8
Exercise: Bayes classification
New sentence: games season open
T = politics? P(politics | c) = PUT YOUR ANSWER HERE
T = sport? P(sport | c) = PUT YOUR ANSWER HERE
That is, for each $T_j$ we calculate $P(T_j \mid c)$ and see which one is higher.
Choose T = PUT YOUR ANSWER HERE
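The comparison can be sketched in a few lines of Python. Since P(c) is the same for every topic, it is enough to compare the unnormalized scores P(T) ∏ P(w_i | T). This sketch assumes relative-frequency word estimates and a prior equal to each topic's share of training documents. Neither choice is stated on the slides, and all function names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from math import prod

# Training documents from the estimation exercise, lowercased.
TRAIN = {
    "sport": [
        "2009 open season", "against maryland sept",
        "play six games seasons", "schedule games weekends",
        "games games games",
    ],
    "politics": [
        "obama hoping rally support", "billion stimulus package",
        "house republicans tax", "cuts spending gop games",
        "republicans obama open", "political season senate",
    ],
}


def prior(topic):
    """Assumption: P(T) is the share of training documents with topic T."""
    total_docs = sum(len(docs) for docs in TRAIN.values())
    return Fraction(len(TRAIN[topic]), total_docs)


def likelihood(word, topic):
    """Assumption: P(w | T) = count(w, T) / number of tokens in T."""
    counts = Counter(tok for doc in TRAIN[topic] for tok in doc.split())
    return Fraction(counts[word], sum(counts.values()))


def score(topic, sentence):
    """Unnormalized posterior P(T) * prod_i P(w_i | T)."""
    words = sentence.lower().split()
    return prior(topic) * prod(likelihood(w, topic) for w in words)


def classify(sentence):
    """Return the topic with the highest unnormalized score."""
    return max(TRAIN, key=lambda topic: score(topic, sentence))


for topic in TRAIN:
    print(topic, score(topic, "games season open"))
print("chosen:", classify("games season open"))
```

Exact fractions are used instead of floats so the scores can be compared with a hand calculation term by term.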
9
Exercise: Bayes classification
New sentence: republicans obama season
T = politics? P(politics | c) = PUT YOUR ANSWER HERE
T = sport? P(sport | c) = PUT YOUR ANSWER HERE
That is, for each $T_j$ we calculate $P(T_j \mid c)$ and see which one is higher.
Choose T = PUT YOUR ANSWER HERE
10
Exercise: Bayes classification
New sentence: democrats kennedy
T = politics? P(politics | c) = PUT YOUR ANSWER HERE
T = sport? P(sport | c) = PUT YOUR ANSWER HERE
That is, for each $T_j$ we calculate $P(T_j \mid c)$ and see which one is higher.
Choose T = PUT YOUR ANSWER HERE
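This last sentence is a special case: neither "democrats" nor "kennedy" appears in any training document. The sketch below assumes plain relative-frequency estimates and a document-share prior, neither of which is stated on the slides, and shows what happens to the scores. The constant names are illustrative. The totals come from the estimation exercise.

```python
from fractions import Fraction

# Totals taken from the estimation exercise.
TOKENS_PER_TOPIC = {"sport": 16, "politics": 20}
# Prior by document share (an assumption, not stated on the slides).
DOCS_PER_TOPIC = {"sport": 5, "politics": 6}
# Neither query word occurs in the training data, so each count is 0
# under both topics.
QUERY_COUNTS = {"democrats": 0, "kennedy": 0}


def score(topic):
    """Unnormalized posterior P(T) * prod_i P(w_i | T) for the query."""
    s = Fraction(DOCS_PER_TOPIC[topic], sum(DOCS_PER_TOPIC.values()))
    for count in QUERY_COUNTS.values():
        s *= Fraction(count, TOKENS_PER_TOPIC[topic])
    return s


scores = {topic: score(topic) for topic in TOKENS_PER_TOPIC}
print(scores)
```

Under these assumptions both scores are zero, so the comparison ends in a tie and the training data alone gives no basis for choosing a topic.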