Slide 1: Empirical Development of an Exponential Probabilistic Model: Using Textual Analysis to Build a Better Model
Jaime Teevan & David R. Karger, CSAIL (LCS+AI), MIT
Slide 2: Goal: Better Generative Model
- Generative vs. discriminative models
- Applies to many applications: information retrieval (IR), relevance feedback, using unlabeled data, classification
- Assumptions are explicit
Slide 3: Using a Model for IR
1. Define model
2. Learn parameters from query
3. Rank documents
- Hyper-learn a better model; improvements trickle down to retrieval, classification, relevance feedback, …
- Corpus-specific models
Slide 4: Overview
- Related work
- Probabilistic models; example: the Poisson Model
- Compare model to text
- Hyper-learning the model: exponential framework
- Investigate retrieval performance
- Conclusion and future work
Slide 5: Related Work
- Using text for retrieval algorithms [Jones, 1972], [Greiff, 1998]
- Using text to model text [Church & Gale, 1995], [Katz, 1996]
- Learning model parameters [Zhai & Lafferty, 2002]
- Here: hyper-learn the model itself from text!
Slide 6: Probabilistic Models
- Rank documents by relevance value RV = Pr(rel|d)
- Naïve Bayesian models
Slide 7: Probabilistic Models
- Rank documents by RV = Pr(rel|d)
- Naïve Bayesian models factor the document probability: Pr(d|rel) = ∏_(features t) Pr(d_t|rel)
- Open assumptions define the model:
  - Feature definition (here: words, with d_t = # occurrences in the document)
  - Feature distribution family
Slide 8: Using a Naïve Bayesian Model
1. Define model
2. Learn parameters from query
3. Rank documents
Slide 9: Using a Naïve Bayesian Model
1. Define model: Pr(d_t|rel) = ?
2. Learn parameters from query
3. Rank documents
Slide 10: Using a Naïve Bayesian Model
1. Define model (Poisson Model): Pr(d_t|rel) = θ^d_t e^-θ / d_t!
2. Learn parameters from query
3. Rank documents
- θ specifies the term's distribution
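The Poisson term model above is one line of code; a minimal sketch (the θ value matches the example on the next slide):

```python
import math

def poisson_pmf(d_t, theta):
    """Pr(d_t | rel) under the Poisson Model: theta^d_t * e^-theta / d_t!."""
    return (theta ** d_t) * math.exp(-theta) / math.factorial(d_t)

# With a tiny rate like theta = 0.0006, zero occurrences take almost all
# of the probability mass, and large counts are vanishingly unlikely:
print(poisson_pmf(0, 0.0006))   # ≈ 0.9994
print(poisson_pmf(15, 0.0006))  # far below 1e-15
```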
Slide 11: Example Poisson Distribution
[Plot: Pr(d_t|rel) vs. "term occurs exactly d_t times" for θ = 0.0006; tail probability ≈ 1E-15, with an observed data point marked]
Slide 12: Using a Naïve Bayesian Model
1. Define model
2. Learn parameters from query: learn a θ for each term
3. Rank documents
- Maximum-likelihood θ is the term's average number of occurrences
- Easy to incorporate prior expectations
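Since the maximum-likelihood θ is just the term's average occurrence count, the learning step is tiny; a sketch (the pseudo-count prior below is an illustrative assumption — the slide only says priors are easy to incorporate):

```python
def learn_theta(counts, prior_theta=0.0, prior_weight=0.0):
    """ML estimate of a term's Poisson rate: its average number of
    occurrences over the labeled documents, optionally blended with a
    pseudo-count prior (prior_weight imaginary docs at rate prior_theta)."""
    return (sum(counts) + prior_theta * prior_weight) / (len(counts) + prior_weight)

# Term appears 0, 1, 0, 2 times in four relevant documents:
print(learn_theta([0, 1, 0, 2]))                                   # 0.75
print(learn_theta([0, 1, 0, 2], prior_theta=0.5, prior_weight=4))  # pulled toward 0.5
```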
Slide 13: Using a Naïve Bayesian Model
1. Define model
2. Learn parameters from query
3. Rank documents
Slide 14: Using a Naïve Bayesian Model
1. Define model
2. Learn parameters from query
3. Rank documents: for each document, find RV = ∏_(words t) Pr(d_t|rel); sort documents by RV
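The ranking step multiplies many small probabilities, so in practice one sums logs; a toy sketch under the Poisson model (the documents, terms, and rates are all made up for illustration):

```python
import math

def poisson_log_pmf(d_t, theta):
    # log of theta^d_t * e^-theta / d_t!
    return d_t * math.log(theta) - theta - math.lgamma(d_t + 1)

def rank_documents(docs, theta):
    """docs: {doc_id: {term: count}}, theta: {term: learned rate}.
    Score each document by log RV = sum over terms of log Pr(d_t|rel),
    then sort descending."""
    def log_rv(doc_id):
        return sum(poisson_log_pmf(c, theta[t]) for t, c in docs[doc_id].items())
    return sorted(docs, key=log_rv, reverse=True)

docs = {"d1": {"model": 3, "text": 1}, "d2": {"model": 0, "text": 0}}
theta = {"model": 3.0, "text": 1.0}
print(rank_documents(docs, theta))  # ['d1', 'd2']
```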
Slide 15: Using a Naïve Bayesian Model
1. Define model
2. Learn parameters from query
3. Rank documents: for each document, find RV = ∏_(words t) Pr(d_t|rel); sort documents by RV
- Which step goes wrong?
Slide 16: Using a Naïve Bayesian Model
1. Define model
2. Learn parameters from query
3. Rank documents
Slide 17: Using a Naïve Bayesian Model
1. Define model: Pr(d_t|rel) = θ^d_t e^-θ / d_t!
2. Learn parameters from query
3. Rank documents
Slide 18: How Good is the Model?
[Plot: Poisson fit with θ = 0.0006, Pr(d_t|rel) vs. "term occurs exactly d_t times"; an observed data point at 15 occurrences is marked]
Slide 19: How Good is the Model?
[Same plot: the observed point at 15 occurrences lies far above the Poisson prediction] Misfit!
Slide 20: Hyper-learning a Better Fit Through Textual Analysis Using an Exponential Framework
Slide 21: Hyper-Learning Framework
- Need a framework for hyper-learning
- Candidates: Bernoulli, Poisson, Normal, mixtures
Slide 22: Hyper-Learning Framework
- Need a framework for hyper-learning
- Goal: same benefits as the Poisson Model: one parameter, easy to work with (e.g., priors)
- One-parameter exponential families cover Bernoulli, Poisson, and Normal (mixtures fall outside)
Slide 23: Exponential Framework
- Pr(d_t|rel) = f(d_t) g(θ) e^(θ·h(d_t))
- Functions f(d_t) and h(d_t) specify the family; e.g., Poisson: f(d_t) = (d_t!)^-1, h(d_t) = d_t
- The parameter θ specifies the term's distribution
- Well understood; learning is easy [Bernardo & Smith, 1994], [Gous, 1998]
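A quick sanity check that the Poisson really is a member of this family: with f(d_t) = 1/d_t! and h(d_t) = d_t, the form above forces θ to be the log of the Poisson rate, and the normalizer g(θ) can be computed numerically (a sketch; the truncation bound is an arbitrary choice):

```python
import math

def expfam_pmf(d_t, theta, f, h, max_count=100):
    """Pr(d_t|rel) = f(d_t) g(theta) e^(theta * h(d_t)), with the normalizer
    g(theta) obtained by summing over counts 0..max_count (truncation)."""
    z = sum(f(k) * math.exp(theta * h(k)) for k in range(max_count + 1))
    return f(d_t) * math.exp(theta * h(d_t)) / z

# Poisson instance: f(d_t) = 1/d_t!, h(d_t) = d_t. Since
# rate^d_t = e^(d_t * log(rate)), the parameter is theta = log(rate).
rate = 2.0
p = expfam_pmf(3, math.log(rate),
               f=lambda k: 1.0 / math.factorial(k), h=lambda k: k)
print(p)  # ≈ 0.1804 = 2^3 e^-2 / 3!
```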
Slide 24: Using a Hyper-learned Model
1. Define model
2. Learn parameters from query
3. Rank documents
Slide 25: Using a Hyper-learned Model
1. Hyper-learn model
2. Learn parameters from query
3. Rank documents
Slide 26: Using a Hyper-learned Model
1. Hyper-learn model: want the "best" f(d_t) and h(d_t)
2. Learn parameters from query
3. Rank documents
- Iterative hill climbing (finds a local maximum), with the Poisson as the starting point
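The hill-climbing step can be sketched as greedy perturbation of a tabulated h. This is only an illustrative caricature, not the authors' actual procedure: the slide does not spell out the objective or parameterization, and f is folded away here for brevity:

```python
import math, random

def log_likelihood(observed, h, theta=1.0):
    """Log-likelihood of observed counts k under Pr(k) ∝ e^(theta * h[k])
    (f(d_t) dropped for simplicity -- an illustrative shortcut)."""
    z = sum(math.exp(theta * hk) for hk in h)
    return sum(theta * h[k] - math.log(z) for k in observed)

def hill_climb_h(observed, h0, steps=2000, step_size=0.05, seed=0):
    """Greedy hill climbing: nudge one entry of the table h, keep the
    change only if the training likelihood improves. Finds a local
    maximum, so the (Poisson-like) starting point h0 matters."""
    rng = random.Random(seed)
    h, best = list(h0), log_likelihood(observed, h0)
    for _ in range(steps):
        trial = list(h)
        trial[rng.randrange(len(h))] += rng.choice([-step_size, step_size])
        ll = log_likelihood(observed, trial)
        if ll > best:
            h, best = trial, ll
    return h

observed = [0] * 40 + [1] * 8 + [2] * 2      # heavy-tailed toy counts
h = hill_climb_h(observed, h0=[0, 1, 2, 3, 4])  # Poisson-like start: h(k) = k
```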
Slide 27: Using a Hyper-learned Model
1. Hyper-learn model
2. Learn parameters from query
3. Rank documents
- Data: TREC query result sets; past queries used to learn about future queries
- Hyper-learn and test with different sets
Slide 28: Recall the Poisson Distribution
[Plot: Pr(d_t|rel) vs. "term occurs exactly d_t times"; observed data point at 15 occurrences]
Slide 29: Poisson Starting Point: h(d_t)
- Pr(d_t|rel) = f(d_t) g(θ) e^(θ·h(d_t))
[Plot: h(d_t) vs. d_t for the Poisson starting point]
30
h(dt)h(dt) dtdt Hyper-learned Model - h(d t ) + Pr(d t |rel) = f(d t ) g( θ ) e θ h(dt)θ h(dt)
Slide 31: Poisson Distribution
[Plot: Pr(d_t|rel) vs. "term occurs exactly d_t times"; observed data point at 15 occurrences]
Slide 32: Hyper-learned Distribution
[Plot: Pr(d_t|rel) vs. "term occurs exactly d_t times"; observed data point at 15 occurrences]
Slide 33: Hyper-learned Distribution
[Same plot: observed data point at 5 occurrences]
Slide 34: Hyper-learned Distribution
[Same plot: observed data point at 30 occurrences]
Slide 35: Hyper-learned Distribution
[Same plot: observed data point at 300 occurrences]
Slide 36: Performing Retrieval
1. Hyper-learn model
2. Learn parameters from query
3. Rank documents
Slide 37: Performing Retrieval
1. Hyper-learn model
2. Learn parameters from query: learn θ for each term from labeled documents, with Pr(d_t|rel) = f(d_t) g(θ) e^(θ·h(d_t))
3. Rank documents
Slide 38: Learning θ
- Sufficient statistics summarize all observed data:
  - τ_1: # of observations
  - τ_2: Σ h(d_t) over the observations
- Incorporating a prior is easy
- Map τ_1 and τ_2 → θ
- Experiments use 20 labeled documents
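The two sufficient statistics on this slide are trivial to accumulate; a minimal sketch:

```python
def sufficient_stats(observed_counts, h):
    """tau1 = number of observations; tau2 = sum of h(d_t) over the
    observations. These two numbers summarize everything the exponential
    model needs from the labeled documents."""
    tau1 = len(observed_counts)
    tau2 = sum(h(d) for d in observed_counts)
    return tau1, tau2

# For the Poisson member (h(d_t) = d_t), tau2 / tau1 recovers the
# familiar ML rate estimate: the term's average occurrence count.
tau1, tau2 = sufficient_stats([0, 1, 0, 2, 3], h=lambda d: d)
print(tau1, tau2, tau2 / tau1)  # 5 6 1.2
```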
Slide 39: Performing Retrieval
1. Hyper-learn model
2. Learn parameters from query
3. Rank documents
Slide 40: Results: Labeled Documents
[Recall-precision plot]
Slide 41: Results: Labeled Documents
[Recall-precision plot]
Slide 42: Performing Retrieval
1. Hyper-learn model
2. Learn parameters from query: short query
3. Rank documents
Slide 43: Retrieval: Query
- Query = single labeled document
- Vector-space-like equation: RV = Σ_(t in doc) a(t, d) + Σ_(q in query) b(q, d)
- Problem: the document part dominates
- Solution: use only the query portion
- Another solution: normalize
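The decomposition on this slide can be sketched abstractly. Here a(t, d) and b(q, d) stand in for the model-derived per-term scores, whose exact forms the slide does not give, so the toy scoring functions below are purely hypothetical:

```python
def rv(doc_terms, query_terms, a, b, query_only=False):
    """Vector-space-like relevance value: a sum over all document terms
    plus a sum over query terms. The document-wide part dominates when
    the query is short, so query_only=True keeps only the query portion
    (the slide's first fix; normalizing is the alternative)."""
    doc_part = 0.0 if query_only else sum(a(t, doc_terms) for t in doc_terms)
    query_part = sum(b(q, doc_terms) for q in query_terms)
    return doc_part + query_part

# Toy scoring functions (illustrative assumptions, not the paper's):
a = lambda t, d: 1.0
b = lambda q, d: 2.0 if q in d else 0.0
doc = {"model": 1, "text": 1}
print(rv(doc, ["model", "ranking"], a, b))                   # 2.0 + 2.0 = 4.0
print(rv(doc, ["model", "ranking"], a, b, query_only=True))  # 2.0
```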
Slide 44: Retrieval: Query
[Recall-precision plot]
Slide 45: Retrieval: Query
[Recall-precision plot]
Slide 46: Retrieval: Query
[Recall-precision plot]
Slide 47: Conclusion
- Probabilistic models; example: the Poisson Model (easy to work with, but a bad text model: text is heavy tailed!)
- Hyper-learning the model in the exponential framework learned a better model
- Investigated retrieval performance
Slide 48: Future Work
- Use the model better: correct for document length; hyper-learn the model better
- Use it for other applications: other IR applications, classification
- Hyper-learn on different corpora: test whether the learned model generalizes; does it differ by genre? Language? People?
Slide 49: Questions?
Contact us with questions:
Jaime Teevan, teevan@ai.mit.edu
David Karger, karger@theory.lcs.mit.edu