Slide 1: Exact Maximum Likelihood Estimation for Word Mixtures
Yi Zhang & Jamie Callan, Carnegie Mellon University, {yiz,callan}@cs.cmu.edu
Wei Xu, NEC C&C Research Lab, xw@ccrl.sj.nec.com
Slide 2: Outline
- Introduction
  1. Why this problem? Some retrieval applications
  2. Traditional solution: the EM algorithm
- New algorithm: exact MLE estimation
- Experimental results
Slide 3: Example 1: Model-Based Feedback in the Language Modeling Approach to IR
Query Q -> Document D -> Results -> Feedback docs F = {d1, d2, ..., dn}
Based on Zhai & Lafferty's slides at CIKM 2001
Slide 4: Estimation Based on a Generative Mixture Model
Each word w in F = {d1, ..., dn} is generated either by the topic model (topic words, drawn from P(w|θ) with source weight 1−λ) or by the background collection model (background words, drawn from P(w|C) with source weight λ).
Given: F, P(w|C), and λ. Find: the maximum likelihood estimate of θ.
Based on Zhai & Lafferty's slides at CIKM 2001
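The mixture on this slide can be written out explicitly. This is a reconstruction from the slide's notation, with λ the fixed background weight and c(w; F) the count of word w in the feedback documents F:

```latex
P(w \mid \theta_F) = (1-\lambda)\,P(w \mid \theta) + \lambda\,P(w \mid C)
\qquad
\log P(F \mid \theta) = \sum_{w} c(w;F)\,\log\bigl[(1-\lambda)\,P(w \mid \theta) + \lambda\,P(w \mid C)\bigr]
```

Maximizing the right-hand side over P(w|θ) is the estimation problem the rest of the talk addresses.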
Slide 5: Example 2: Model-Based Approach for Novelty Detection in Adaptive Information Filtering
Models: M_T (topic), M_E (general English), M_I (new information).
Given: the general English and topic models. Find: the MLE of the new-information model.
Based on Zhang & Callan's paper at SIGIR 2002
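On this reading, a document is generated by a three-component word mixture. This is a hedged reconstruction of the slide's figure; the mixture weights λ_E, λ_T, λ_I are assumed notation, not taken from the slide:

```latex
P(w \mid d) = \lambda_E\,P(w \mid M_E) + \lambda_T\,P(w \mid M_T) + \lambda_I\,P(w \mid M_I),
\qquad \lambda_E + \lambda_T + \lambda_I = 1
```

With M_E and M_T fixed, estimating M_I is the same word-mixture MLE problem as in Example 1.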
Slide 6: Problem Setting and the Traditional Solution Using EM
- Observe: data generated by a mixture multinomial distribution r = (r_1, r_2, r_3, ..., r_k)
- Given: the interpolation weights (α and 1−α) and another multinomial distribution p = (p_1, p_2, p_3, ..., p_k)
- Find: the maximum likelihood estimate (MLE) of the multinomial distribution q = (q_1, q_2, q_3, ..., q_k)
- Traditional solution: the EM algorithm
  - An iterative process that can be computationally expensive
  - Provides only an approximate solution
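A minimal sketch of the EM baseline for this setting, assuming the observation model r_i = α·p_i + (1−α)·q_i with α the weight on the known component p. The function name, initialization, and iteration count are illustrative choices, not taken from the slides:

```python
def em_mle_mixture(f, p, alpha, iters=200):
    """Approximate MLE of q via EM, assuming observed counts f were
    drawn from the mixture alpha * p + (1 - alpha) * q."""
    k = len(f)
    total = sum(f)
    q = [fi / total for fi in f]  # initialize from empirical frequencies
    for _ in range(iters):
        # E-step: posterior probability that an occurrence of word i
        # was generated by the unknown component q.
        t = [0.0] * k
        for i in range(k):
            if f[i] > 0:
                t[i] = (1 - alpha) * q[i] / (alpha * p[i] + (1 - alpha) * q[i])
        # M-step: re-estimate q from the expected counts f_i * t_i.
        norm = sum(f[i] * t[i] for i in range(k))
        q = [f[i] * t[i] / norm for i in range(k)]
    return q
```

Each iteration touches every word, so the cost is the number of iterations times the vocabulary size, which is why the talk contrasts it with a direct solution.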
Slide 7: Finding q (1)
Maximize the log-likelihood under the constraints that the q_i sum to one and are non-negative, where f_i is the observed frequency of word i.
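Written out, the optimization this slide refers to is the following (reconstructed from the problem setting on the previous slide, with α the weight on the known distribution p):

```latex
\max_{q}\; \sum_{i=1}^{k} f_i \,\log\bigl(\alpha\,p_i + (1-\alpha)\,q_i\bigr)
\quad \text{s.t.} \quad \sum_{i=1}^{k} q_i = 1,\qquad q_i \ge 0
```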
Slide 8: Finding q (2)
For all q_i such that q_i > 0, apply the Lagrange multiplier method and set the derivative with respect to q_i to zero. This yields a closed-form solution for q_i, provided we know which indices i have q_i > 0.
Theorem: the q_i greater than 0 correspond to the smallest values of p_i / f_i.
See the detailed proof in our paper.
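The Lagrangian step can be reconstructed as follows, where μ denotes the multiplier for the sum-to-one constraint (the notation is assumed, since the slide's equations were lost):

```latex
\frac{\partial}{\partial q_i}\Bigl[\sum_j f_j \log\bigl(\alpha p_j + (1-\alpha) q_j\bigr) - \mu\Bigl(\sum_j q_j - 1\Bigr)\Bigr]
= \frac{(1-\alpha)\,f_i}{\alpha p_i + (1-\alpha) q_i} - \mu = 0
\;\Longrightarrow\;
q_i = \frac{f_i}{\mu} - \frac{\alpha}{1-\alpha}\,p_i
```

Note that q_i > 0 exactly when p_i / f_i < (1−α)/(αμ), which is why the positive q_i are the ones with the smallest ratios p_i / f_i; μ itself is fixed by requiring the positive q_i to sum to one.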
Slide 9: Algorithm for Finding the Exact MLE of q
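The algorithm itself appeared as a figure on this slide. The following is a sketch consistent with the theorem above: sort words by p_i/f_i, grow the support greedily, and solve for μ from the normalization constraint. Function and variable names are mine, not from the paper:

```python
def exact_mle_mixture(f, p, alpha):
    """Exact MLE of q when observed counts f are drawn from the
    mixture alpha * p + (1 - alpha) * q, with p and alpha known."""
    k = len(f)
    # Only words that actually occur can receive q_i > 0.
    idx = sorted((i for i in range(k) if f[i] > 0),
                 key=lambda i: p[i] / f[i])  # smallest p_i/f_i first
    ratio = alpha / (1.0 - alpha)
    q = [0.0] * k
    F = P = 0.0
    mu = None
    support = []
    for i in idx:
        F += f[i]
        P += p[i]
        # mu chosen so the current support sums to one:
        # sum_i (f_i/mu - ratio * p_i) = 1  =>  mu = F / (1 + ratio * P)
        cand_mu = F / (1.0 + ratio * P)
        # The word just added has the largest p/f in the support; if its
        # q would be non-positive, the previous support was optimal.
        if f[i] / cand_mu - ratio * p[i] <= 0.0:
            break
        mu = cand_mu
        support.append(i)
    for i in support:
        q[i] = f[i] / mu - ratio * p[i]
    return q
```

One sort plus a single pass over the vocabulary replaces the iterative EM loop, which matches the speed comparison reported later in the talk.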
Slide 10: Experimental Setting for Model-Based Feedback in IR
- 20 relevant documents for a topic (sampled from the AP Wire News and Wall Street Journal datasets, 1988-1990) serve as the observed training data. p is computed directly, as described in Zhai & Lafferty, from 119,823 documents.
- The 20 relevant documents contain 2,352 unique words, so at most 2,352 of the q_i are nonzero, while 200,542 of the p_i are nonzero.
Slide 11: The EM result converges to the result computed directly by our algorithm.
Slide 12: Comparing the Speed of Our Algorithm with EM
- EM stops if the change in log-likelihood < 10^-
- Timings over 50,000 runs on a Pentium III 500 MHz PC
Slide 13: Conclusion
- We developed a new training algorithm that provides the exact MLE for word mixtures
- It works well both theoretically and empirically
- It can be used in several language-model-based IR applications