1 Language Models for TR
(Lecture for CS410-CXZ Text Info Systems), Feb. 25, 2011
ChengXiang Zhai, Department of Computer Science, University of Illinois, Urbana-Champaign

2 Text Generation with Unigram LM

A (unigram) language model θ assigns a probability p(w|θ) to every word w; sampling words from the distribution generates text.

– Topic 1 (text mining model): text 0.2, mining 0.1, association 0.01, clustering 0.02, …, food ≈ 0, … → sampling yields a text mining paper.
– Topic 2 (health model): food 0.25, nutrition 0.1, healthy 0.05, diet 0.02, … → sampling yields a food nutrition paper.
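To make the sampling view concrete, here is a minimal Python sketch (an illustration assumed for this transcript, not code from the lecture) that draws words from a unigram distribution, using the slide's probabilities:

```python
import random

# Illustrative unigram LMs from the slide; the rest of each vocabulary is
# elided ("..." on the slide), so random.choices normalizes the given weights.
topic_text_mining = {"text": 0.2, "mining": 0.1, "association": 0.01, "clustering": 0.02}
topic_health = {"food": 0.25, "nutrition": 0.1, "healthy": 0.05, "diet": 0.02}

def sample_words(model, n):
    """Draw n words i.i.d. from p(w|theta)."""
    words, weights = zip(*model.items())
    return random.choices(words, weights=weights, k=n)

print(" ".join(sample_words(topic_text_mining, 10)))
```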

3 Estimation of Unigram LM

Given a document (a "text mining paper", total #words = 100) with word counts text 10, mining 5, association 3, database 3, algorithm 2, …, query 1, efficient 1, …, what is p(w|θ)?

The maximum likelihood estimate is p(w|θ) = c(w,d)/|d|: text 10/100, mining 5/100, association 3/100, database 3/100, …, query 1/100, …
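A one-function sketch of the maximum likelihood estimator (assumed implementation; the toy document mirrors the slide's example on a smaller scale):

```python
from collections import Counter

def mle_unigram(doc_tokens):
    """Maximum likelihood estimate: p(w|theta) = c(w,d) / |d|."""
    total = len(doc_tokens)
    counts = Counter(doc_tokens)
    return {w: c / total for w, c in counts.items()}

# A toy stand-in for the slide's 100-word "text mining paper".
doc = ["text"] * 10 + ["mining"] * 5 + ["association"] * 3 + ["query"]
model = mle_unigram(doc)
print(model["text"])  # 10/19 here; the slide's 100-word doc gives 10/100
```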

4 Language Models for Retrieval (Ponte & Croft 98)

Estimate a language model for each document: the text mining paper yields a model with probabilities for text, mining, association, clustering, …, food, …; the food nutrition paper yields a model with probabilities for food, nutrition, healthy, diet, …. Given the query "data mining algorithms", ask: which model would most likely have generated this query?

5 Ranking Docs by Query Likelihood

Estimate a document language model θ_{d_i} for each document d_1, d_2, …, d_N; then rank the documents by the likelihood of the query q under each model: p(q|θ_{d_1}), p(q|θ_{d_2}), …, p(q|θ_{d_N}).

6 Retrieval as Language Model Estimation

Document ranking based on query likelihood: for a query q = w_1 w_2 … w_n,

$$\log p(q|d) = \sum_{i=1}^{n} \log p(w_i|d)$$

where p(w_i|d) is the document language model. The retrieval problem thus reduces to the estimation of p(w_i|d); smoothing is an important issue, and distinguishes different approaches.
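A self-contained sketch (assumed code) of ranking by query likelihood with an unsmoothed MLE document model; note how any query word absent from a document drives its score to -inf, which is the problem the following slides address with smoothing:

```python
import math
from collections import Counter

def mle(doc_tokens):
    n = len(doc_tokens)
    return {w: c / n for w, c in Counter(doc_tokens).items()}

def query_log_likelihood(query_tokens, model):
    # score(d) = sum_i log p(w_i|d); unseen words give -inf without smoothing
    score = 0.0
    for w in query_tokens:
        p = model.get(w, 0.0)
        score += math.log(p) if p > 0 else float("-inf")
    return score

docs = {
    "d1": "text mining and association mining in text data".split(),
    "d2": "food nutrition and healthy diet".split(),
}
query = "text mining".split()
ranked = sorted(docs, key=lambda d: query_log_likelihood(query, mle(docs[d])),
                reverse=True)
print(ranked)  # ['d1', 'd2']; d2 scores -inf since it contains neither word
```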

7 How to Estimate p(w|d)?

Simplest solution: the maximum likelihood estimator
– p(w|d) = relative frequency of word w in d
– What if a word doesn't appear in the text? Then p(w|d) = 0.

In general, what probability should we give a word that has not been observed? If we want to assign non-zero probabilities to such words, we'll have to discount the probabilities of observed words. This is what "smoothing" is about…

8 Language Model Smoothing (Illustration)

[Figure: P(w) plotted over words w, contrasting the maximum likelihood estimate (zero for unseen words) with a smoothed LM that assigns every word a non-zero probability.]

9 A General Smoothing Scheme

All smoothing methods try to
– discount the probability of words seen in a doc
– re-allocate the extra probability so that unseen words will have a non-zero probability

Most use a reference model (the collection language model) to discriminate unseen words:

$$p(w|d) = \begin{cases} p_{seen}(w|d) & \text{if } w \text{ is seen in } d \quad \text{(discounted ML estimate)} \\ \alpha_d\, p(w|C) & \text{otherwise} \quad \text{(collection language model)} \end{cases}$$

10 Smoothing & TF-IDF Weighting

Plugging the general smoothing scheme into the query likelihood retrieval formula, we obtain

$$\log p(q|d) = \sum_{w \in d,\; c(w,q)>0} c(w,q) \log \frac{p_{seen}(w|d)}{\alpha_d\, p(w|C)} \;+\; n \log \alpha_d \;+\; \sum_{w \in q} c(w,q) \log p(w|C)$$

– p_seen(w|d) in the numerator acts as TF weighting; dividing by p(w|C) acts as IDF weighting
– n log α_d provides doc length normalization (a long doc is expected to have a smaller α_d)
– the last sum does not depend on the document, so it can be ignored for ranking

Smoothing with p(w|C) ⇒ TF-IDF weighting + doc length normalization.

11 Derivation of the Query Likelihood Retrieval Formula

Start from the query likelihood under the general smoothing scheme and split the sum into words seen in d (the discounted ML estimate) and words not seen in d (the reference language model):

$$\log p(q|d) = \sum_{w \in q} c(w,q) \log p(w|d) = \sum_{w\ \text{seen in}\ d} c(w,q) \log p_{seen}(w|d) + \sum_{w\ \text{not seen in}\ d} c(w,q) \log \big(\alpha_d\, p(w|C)\big)$$

The key rewriting step expresses the unseen-word sum as a sum over all query words minus the seen words:

$$= \sum_{w\ \text{seen in}\ d} c(w,q) \log \frac{p_{seen}(w|d)}{\alpha_d\, p(w|C)} + n \log \alpha_d + \sum_{w \in q} c(w,q) \log p(w|C)$$

Similar rewritings are very common when using LMs for IR…

12 Three Smoothing Methods (Zhai & Lafferty 01)

Simplified Jelinek-Mercer: shrink uniformly toward p(w|C):
$$p_\lambda(w|d) = (1-\lambda)\, p_{ML}(w|d) + \lambda\, p(w|C)$$

Dirichlet prior (Bayesian): assume pseudo counts μ p(w|C):
$$p_\mu(w|d) = \frac{c(w,d) + \mu\, p(w|C)}{|d| + \mu}$$

Absolute discounting: subtract a constant δ (with |d|_u the number of unique words in d):
$$p_\delta(w|d) = \frac{\max(c(w,d) - \delta,\, 0) + \delta\, |d|_u\, p(w|C)}{|d|}$$
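A minimal sketch (assumed implementation, not code from the lecture) of the three methods; the default values for lam, mu, and delta are illustrative, not prescribed by the slides:

```python
import math
from collections import Counter

def smoothed_prob(w, doc_counts, doc_len, p_wC, method="dirichlet",
                  lam=0.5, mu=2000, delta=0.7):
    c = doc_counts.get(w, 0)
    if method == "jelinek-mercer":
        return (1 - lam) * c / doc_len + lam * p_wC
    if method == "dirichlet":
        return (c + mu * p_wC) / (doc_len + mu)
    if method == "absolute":
        d_u = len(doc_counts)  # |d|_u: number of unique words in d
        return (max(c - delta, 0) + delta * d_u * p_wC) / doc_len
    raise ValueError(f"unknown method: {method}")

def score(query_tokens, doc_tokens, coll_model, **kwargs):
    """log p(q|d) under the chosen smoothing method."""
    counts = Counter(doc_tokens)
    # The 1e-9 floor is an assumption, keeping log() defined for words the
    # collection model itself has never seen.
    return sum(math.log(smoothed_prob(w, counts, len(doc_tokens),
                                      coll_model.get(w, 1e-9), **kwargs))
               for w in query_tokens)
```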

13 Comparison of Three Methods

14 The Need for Query Modeling (Dual Role of Smoothing)

[Figure: retrieval precision as a function of the smoothing parameter, plotted separately for keyword queries and for verbose queries; performance on verbose queries is far more sensitive to the amount of smoothing.] Why does query type affect smoothing sensitivity?

15 Another Reason for Smoothing

Query = "the algorithms for data mining"

Comparing the discounted ML document models p_DML(w|d1) and p_DML(w|d2) on this query, the content words favor d2: p("algorithms"|d1) = p("algorithms"|d2), p("data"|d1) < p("data"|d2), and p("mining"|d1) < p("mining"|d2). Intuitively, d2 should therefore have the higher score, yet p(q|d1) > p(q|d2), because the common words "the" and "for" dominate the product. So we should make p("the") and p("for") less different across docs, and smoothing toward a reference model p(w|REF) helps achieve this goal…
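A hypothetical numeric illustration (the probabilities below are invented for this sketch; the slide's actual numbers did not survive extraction) of how the common words swamp the content words:

```python
import math

q = ["the", "algorithms", "for", "data", "mining"]
# Invented discounted-ML models: d2 matches the content words better,
# but d1 happens to give more mass to "the" and "for".
p_d1 = {"the": 0.05, "algorithms": 0.001, "for": 0.04, "data": 0.001, "mining": 0.001}
p_d2 = {"the": 0.02, "algorithms": 0.001, "for": 0.01, "data": 0.002, "mining": 0.003}

loglik = lambda p: sum(math.log(p[w]) for w in q)
print(loglik(p_d1), loglik(p_d2))  # d1 wins (~ -26.94 vs ~ -27.45)
```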

16 Two-Stage Smoothing

$$p(w|d) = (1-\lambda)\,\frac{c(w,d) + \mu\, p(w|C)}{|d| + \mu} + \lambda\, p(w|U)$$

Stage 1 (Dirichlet prior, Bayesian; parameter μ): explain unseen words with the collection model p(w|C).
Stage 2 (two-component mixture; parameter λ): explain noise in the query with a user background model p(w|U).

μ and λ can be automatically set through statistical estimation.
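A direct transcription of the two-stage formula into code (a sketch; the parameter values are illustrative, and falling back to p(w|U) ≈ p(w|C) is a simplifying assumption for when no user background model is available):

```python
def two_stage_prob(w, doc_counts, doc_len, p_wC, p_wU=None, mu=2000, lam=0.3):
    """p(w|d) = (1-lam) * (c(w,d) + mu*p(w|C)) / (|d| + mu) + lam * p(w|U)."""
    if p_wU is None:
        p_wU = p_wC  # fall back to the collection model as the user background
    stage1 = (doc_counts.get(w, 0) + mu * p_wC) / (doc_len + mu)  # Dirichlet
    return (1 - lam) * stage1 + lam * p_wU                        # mixture
```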

17 Automatic 2-Stage Results vs. Optimal 1-Stage Results

[Table: average precision over 3 databases and 4 query types, 150 topics; SK, LK, SV, LV denote short/long keyword and short/long verbose queries. Automatic two-stage smoothing performs comparably to the best hand-tuned one-stage results.]

18 What You Should Know

– The basic idea of ranking docs by query likelihood ("the language modeling approach")
– How smoothing is connected with TF-IDF weighting and document length normalization
– The basic idea of two-stage smoothing