
1 Empirical Development of an Exponential Probabilistic Model Using Textual Analysis to Build a Better Model Jaime Teevan & David R. Karger CSAIL (LCS+AI), MIT

2 Goal: Better Generative Model
- Generative vs. discriminative models
- Applies to many applications:
  - Information retrieval (IR): relevance feedback, using unlabeled data
  - Classification
- Assumptions are explicit

3 Using a Model for IR
1. Define model (hyper-learn)
2. Learn parameters from query
3. Rank documents
- A better model improves applications: improvements trickle down to retrieval, classification, relevance feedback, …
- Corpus-specific models

4 Overview
- Related work
- Probabilistic models; example: the Poisson Model
- Comparing the model to text
- Hyper-learning the model: an exponential framework
- Investigating retrieval performance
- Conclusion and future work

5 Related Work
- Using text to tune the retrieval algorithm [Jones, 1972], [Greiff, 1998]
- Using text to model text [Church & Gale, 1995], [Katz, 1996]
- Learning model parameters [Zhai & Lafferty, 2002]
- This work: hyper-learn the model itself from text!

6 Probabilistic Models
- Rank documents by relevance value RV = Pr(rel|d)
- Naïve Bayesian models

7 Probabilistic Models
- Rank documents by RV = Pr(rel|d)
- Naïve Bayesian models: Pr(d|rel) = ∏_t Pr(d_t|rel) over features t
- Open assumptions that define the model:
  - Feature definition (here: words)
  - Feature distribution family (here: # of occurrences in a document)
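The ranking rule above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `rank_value` and the `term_prob` interface are hypothetical names, and the product is taken in log space for numerical stability.

```python
import math

def rank_value(doc_counts, term_prob):
    """Relevance value of one document under a naive Bayesian model:
    RV = product over features t of Pr(d_t | rel), computed in log
    space so tiny per-term probabilities do not underflow.
    doc_counts maps term -> occurrence count d_t; term_prob(t, dt)
    is a hypothetical per-term distribution function."""
    return sum(math.log(term_prob(t, dt)) for t, dt in doc_counts.items())
```

Ranking then means computing this value for every document and sorting in descending order.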

8 Using a Naïve Bayesian Model 1. Define model 2. Learn parameters from query 3. Rank documents

9 Using a Naïve Bayesian Model
1. Define model  2. Learn parameters from query  3. Rank documents
- Pr(d_t|rel) = ?

10 Using a Naïve Bayesian Model
1. Define model  2. Learn parameters from query  3. Rank documents
- Poisson Model: Pr(d_t|rel) = θ^{d_t} e^{−θ} / d_t!
- θ specifies the term's distribution
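The Poisson term model above is easy to evaluate directly; a minimal sketch (function name is illustrative):

```python
import math

def poisson_prob(dt, theta):
    """Pr(d_t | rel) under the Poisson model from the slide:
    theta^d_t * e^(-theta) / d_t!  The single parameter theta
    specifies the term's distribution."""
    return theta ** dt * math.exp(-theta) / math.factorial(dt)
```

With a typical tiny θ such as the slide's 0.0006, the probability of a term occurring 15 times is vanishingly small, which is exactly the misfit the later slides point out.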

11 Example Poisson Distribution
[Plot: Pr(d_t|rel) vs. the number of times a term occurs, θ = 0.0006; at the marked point, Pr(d_t|rel) ≈ 1E-15]

12 Using a Naïve Bayesian Model
1. Define model  2. Learn parameters from query  3. Rank documents
- Learn a θ for each term
- The maximum-likelihood θ is the term's average number of occurrences
- Incorporate prior expectations
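For the Poisson case, the parameter estimate on the slide reduces to a one-liner. This is a hedged sketch: the function name and the specific prior values are illustrative, and the conjugate (Gamma) prior is folded in as pseudo-observations, one common way to "incorporate prior expectations".

```python
def learn_theta(counts, prior_mean=0.001, prior_strength=1.0):
    """Maximum-likelihood theta for a Poisson term model is the
    term's average occurrence count across the labeled documents;
    a conjugate prior adds prior_strength pseudo-documents with
    mean prior_mean (values here are purely illustrative)."""
    return (sum(counts) + prior_strength * prior_mean) / (len(counts) + prior_strength)
```

With no labeled data the estimate falls back to the prior mean, and with lots of data it approaches the plain average.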

13 Using a Naïve Bayesian Model 1. Define model 2. Learn parameters from query 3. Rank documents

14 Using a Naïve Bayesian Model
1. Define model  2. Learn parameters from query  3. Rank documents
- For each document, find RV = ∏_t Pr(d_t|rel) over words t
- Sort documents by RV

15 Using a Naïve Bayesian Model
1. Define model  2. Learn parameters from query  3. Rank documents
- For each document, find RV = ∏_t Pr(d_t|rel); sort documents by RV
- Which step goes wrong?

16 Using a Naïve Bayesian Model 1. Define model 2. Learn parameters from query 3. Rank documents

17 Using a Naïve Bayesian Model
1. Define model  2. Learn parameters from query  3. Rank documents
- Pr(d_t|rel) = θ^{d_t} e^{−θ} / d_t!

18 How Good is the Model?
[Plot: Pr(d_t|rel) vs. the number of times a term occurs, θ = 0.0006, with the observed point at 15 occurrences]

19 How Good is the Model?
[Same plot: the Poisson curve assigns essentially zero probability to 15 occurrences. Misfit!]

20 Hyper-learning a Better Fit Through Textual Analysis Using an Exponential Framework

21 Hyper-Learning Framework
- Need a framework for hyper-learning
[Diagram: candidate distribution families: Bernoulli, Poisson, Normal, mixtures]

22 Hyper-Learning Framework
- Need a framework for hyper-learning
- Goal: same benefits as the Poisson Model: one parameter, easy to work with (e.g., priors)
- Answer: one-parameter exponential families (these include Bernoulli, Poisson, and Normal; mixtures fall outside)

23 Exponential Framework
- Well understood; learning is easy [Bernardo & Smith, 1994], [Gous, 1998]
- Pr(d_t|rel) = f(d_t) g(θ) e^{θ h(d_t)}
- The functions f(d_t) and h(d_t) specify the family (e.g., Poisson: f(d_t) = (d_t!)^{-1}, h(d_t) = d_t)
- The parameter θ specifies the term's specific distribution
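The exponential-family density above can be evaluated generically once f and h are supplied. A minimal sketch under stated assumptions: the function names are illustrative, and the normalizer g(θ) is computed here by summing over a truncated support, which is an approximation for illustration only.

```python
import math

def exp_family_prob(dt, theta, f, h, support=range(100)):
    """One-parameter exponential family from the slide:
    Pr(d_t|rel) = f(d_t) * g(theta) * exp(theta * h(d_t)),
    where g(theta) normalizes the density (computed here over a
    truncated support for illustration)."""
    g = 1.0 / sum(f(k) * math.exp(theta * h(k)) for k in support)
    return f(dt) * g * math.exp(theta * h(dt))

# Poisson as the special case on the slide: f(d) = 1/d!, h(d) = d,
# with theta = log(lambda) in this parameterization.
poisson_f = lambda d: 1.0 / math.factorial(d)
poisson_h = lambda d: d
```

Plugging in the Poisson f and h recovers the usual Poisson probabilities, which is what makes the Poisson a natural starting point for hill climbing.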

24 Using a Hyper-learned Model 1. Define model 2. Learn parameters from query 3. Rank documents

25 Using a Hyper-learned Model 1. Hyper-learn model 2. Learn parameters from query 3. Rank documents

26 Using a Hyper-learned Model
1. Hyper-learn model  2. Learn parameters from query  3. Rank documents
- Want the "best" f(d_t) and h(d_t)
- Iterative hill climbing to a local maximum, with the Poisson as the starting point
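The search described above can be sketched as simple coordinate-wise hill climbing over a table of h(d_t) values. This is an illustrative toy, not the paper's actual optimization procedure: the function names, step scheme, and objective interface are all assumptions.

```python
def hill_climb(h, loglik, step=0.1, iters=100):
    """Illustrative iterative hill climbing over a table of h(d_t)
    values. loglik(h) scores a candidate h (e.g., likelihood of the
    training result sets); we greedily accept any single-coordinate
    change that improves it, stopping at a local maximum."""
    best = loglik(h)
    for _ in range(iters):
        improved = False
        for i in range(len(h)):
            for delta in (step, -step):
                cand = list(h)
                cand[i] += delta
                score = loglik(cand)
                if score > best:
                    h, best = cand, score
                    improved = True
        if not improved:
            break
    return h, best
```

Starting from the Poisson's h(d) = d, such a climber can only improve the training objective, but it finds a local (not necessarily global) maximum.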

27 Using a Hyper-learned Model
1. Hyper-learn model  2. Learn parameters from query  3. Rank documents
- Data: TREC query result sets
- Use past queries to learn about future queries
- Hyper-learn and test with different sets

28 Recall the Poisson Distribution
[Plot: Pr(d_t|rel) vs. the number of times a term occurs, with the observed point at 15 occurrences]

29 Poisson Starting Point: h(d_t)
Pr(d_t|rel) = f(d_t) g(θ) e^{θ h(d_t)}
[Plot: h(d_t) vs. d_t for the Poisson starting point]

30 Hyper-learned Model: h(d_t)
Pr(d_t|rel) = f(d_t) g(θ) e^{θ h(d_t)}
[Plot: the hyper-learned h(d_t) vs. d_t]

31 Poisson Distribution
[Plot: Pr(d_t|rel) vs. the number of times a term occurs, with the observed point at 15 occurrences]

32 Hyper-learned Distribution
[Same axes: the hyper-learned distribution, observed point at 15 occurrences]

33 Hyper-learned Distribution
[Same plot, observed point at 5 occurrences]

34 Hyper-learned Distribution
[Same plot, observed point at 30 occurrences]

35 Hyper-learned Distribution
[Same plot, observed point at 300 occurrences]

36 Performing Retrieval 1. Hyper-learn model 2. Learn parameters from query 3. Rank documents

37 Performing Retrieval
1. Hyper-learn model  2. Learn parameters from query  3. Rank documents
- Learn θ for each term from the labeled documents
- Pr(d_t|rel) = f(d_t) g(θ) e^{θ h(d_t)}

38 Learning θ
- Sufficient statistics summarize all observed data:
  - τ1: # of observations
  - τ2: Σ over observations d of h(d_t)
- Incorporating a prior is easy
- Map τ1 and τ2 to θ
[Example: 20 labeled documents]
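The sufficient statistics on this slide are cheap to compute. A minimal sketch, with illustrative names: the prior enters as pseudo-values added to τ1 and τ2 (one standard conjugate-prior reading of "incorporating a prior is easy"), while the final map from (τ1, τ2) to θ depends on the chosen family, so it is left out here.

```python
def sufficient_stats(observations, h, prior_tau1=0.0, prior_tau2=0.0):
    """Sufficient statistics from the slide: tau1 counts the observed
    term counts, tau2 sums h(d_t) over them. A conjugate prior just
    adds pseudo-values to both (defaults here are illustrative).
    For the Poisson case (h(d) = d), tau2 / tau1 is the term's
    average occurrence count."""
    tau1 = len(observations) + prior_tau1
    tau2 = sum(h(d) for d in observations) + prior_tau2
    return tau1, tau2
```

Because only (τ1, τ2) matter, the model never needs to store the raw labeled documents, only two numbers per term.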

39 Performing Retrieval 1. Hyper-learn model 2. Learn parameters from query 3. Rank documents

40 Results: Labeled Documents
[Recall-precision plot]

41 Results: Labeled Documents
[Recall-precision plot]

42 Performing Retrieval 1. Hyper-learn model 2. Learn parameters from query 3. Rank documents Short query

43 Retrieval: Query
- Query = a single labeled document
- Vector space-like equation: RV = Σ_{t in doc} a(t, d) + Σ_{q in query} b(q, d)
- Problem: the document portion dominates
- Solution: use only the query portion
- Another solution: normalize
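The two fixes on this slide can be sketched together. This is a hedged illustration: `query_rv` and the weight function `b` are hypothetical names standing in for the slide's per-term weights, and length normalization by total document length is one plausible reading of "normalize".

```python
def query_rv(query_terms, doc_counts, b, normalize=False):
    """Relevance value restricted to the query portion of the
    vector space-like equation: RV = sum over q in query of b(q, d).
    Dropping the document-side sum keeps the (long) document from
    dominating; optional length normalization is the slide's other
    proposed fix. b(q, doc_counts) is a hypothetical term weight."""
    rv = sum(b(q, doc_counts) for q in query_terms)
    if normalize and doc_counts:
        rv /= sum(doc_counts.values())
    return rv
```

Either variant keeps the score driven by the short query rather than the document's full vocabulary.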

44 Retrieval: Query
[Recall-precision plot]

45 Retrieval: Query
[Recall-precision plot]

46 Retrieval: Query
[Recall-precision plot]

47 Conclusion
- Probabilistic models; example: the Poisson Model (easy to work with, but a bad text model)
- Hyper-learning the model in an exponential framework learned a better, heavy-tailed model
- Investigated retrieval performance

48 Future Work
- Use the model better: correct for document length
- Use it for other applications: other IR applications, classification
- Hyper-learn on different corpora; test whether the learned model generalizes (different for genre? language? people?)
- Hyper-learn the model better

49 Questions? Contact us with questions: Jaime Teevan teevan@ai.mit.edu David Karger karger@theory.lcs.mit.edu

