Murat Açar - Zeynep Çipiloğlu Yıldız


A Language Modeling Approach to Information Retrieval
Jay M. Ponte & W. Bruce Croft

Introduction
The problem is:
- the integration of document indexing and retrieval models
- the lack of an adequate indexing model
- parametric assumptions
- prior assumptions about the similarity of documents
The novel approach is:
- non-parametric
- based on probabilistic language modeling
- integrates document indexing and document retrieval into a single model
- inspired by speech recognition

Previous Work
2-Poisson model [Harter]:
- a probabilistic indexing model
- only a subset of the terms in a document is useful for indexing
- identifies such words by their distribution and assigns them as index terms
Robertson and Sparck Jones model:
- estimates the probability of relevance of each document to the query
INQUERY inference network model [Turtle and Croft]:
- integrates indexing and retrieval by inferring concepts from features
- features: words, phrases, or more complex structures
- Bayesian network (for multiple feature sets/queries)

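Language Model
Method:
- infer a language model for each document individually
- estimate the probability of producing the query under each document's model
- rank the documents by these probabilities
Estimate the probability of the query given the language model of document d: $\hat{p}(Q \mid M_d)$
MLE of the probability of term t under the term distribution of document d: $\hat{p}_{ml}(t \mid M_d) = tf_{(t,d)} / dl_d$, where $tf_{(t,d)}$ is the raw count of t in d and $dl_d$ is the number of tokens in d
Problem: the estimate rests on a document-sized sample only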
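The maximum likelihood estimate and its small-sample problem can be illustrated with a minimal Python sketch. It assumes whitespace-tokenized documents and, as a simplification, scores a query as a plain product over its terms; the function names `mle_term_prob` and `naive_query_likelihood` are made up for this illustration and do not come from the paper.

```python
from collections import Counter


def mle_term_prob(term, doc_tokens):
    """Maximum likelihood estimate: term count divided by document length."""
    counts = Counter(doc_tokens)
    return counts[term] / len(doc_tokens)


def naive_query_likelihood(query_tokens, doc_tokens):
    """Product of unsmoothed MLE probabilities over the query terms (simplified)."""
    prob = 1.0
    for term in query_tokens:
        prob *= mle_term_prob(term, doc_tokens)
    return prob


doc = "language models rank documents by query likelihood".split()
p_language = mle_term_prob("language", doc)            # one occurrence in seven tokens
p_query = naive_query_likelihood(["language", "retrieval"], doc)
print(p_language, p_query)
```

Because "retrieval" never occurs in the seven-token document, its estimate is zero and the whole query probability collapses to zero. This is the sense in which a document-sized sample is too small to trust on its own.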

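Language Model (cont.)
- Risk function for a term t in document d, based on the geometric distribution
- Probability of producing the query for a given document model: $\hat{p}(Q \mid M_d)$
- Compute $\hat{p}(Q \mid M_d)$ for each candidate document and rank accordingly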
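The risk-weighted estimate can be sketched in Python as follows. The formulas are taken from the Ponte and Croft paper as best recalled, not from the slide text, so treat them as assumptions: $p_{avg}(t)$ is the mean of $tf/dl$ over documents containing t; $\bar{f}_t = p_{avg}(t) \cdot dl_d$; the risk is $\hat{R}_{t,d} = \frac{1}{1+\bar{f}_t} \left(\frac{\bar{f}_t}{1+\bar{f}_t}\right)^{tf_{(t,d)}}$; a term present in d gets $\hat{p}_{ml}^{\,1-\hat{R}} \cdot p_{avg}^{\,\hat{R}}$ and an absent term gets $cf_t / cs$ (collection count over collection size). The query score multiplies $\hat{p}$ for query terms and $1-\hat{p}$ for all other vocabulary terms. The helper names and the toy collection are hypothetical.

```python
import math
from collections import Counter


def build_stats(docs):
    """Collect per-document term counts plus collection-wide statistics."""
    tfs = [Counter(d) for d in docs]
    cf = Counter()                      # raw count of each term in the collection
    for tf in tfs:
        cf.update(tf)
    cs = sum(len(d) for d in docs)      # total number of tokens in the collection
    p_avg = {}                          # mean tf/dl over documents containing the term
    for t in cf:
        vals = [tf[t] / len(d) for tf, d in zip(tfs, docs) if tf[t] > 0]
        p_avg[t] = sum(vals) / len(vals)
    return tfs, cf, cs, p_avg


def term_prob(t, tf, dl, cf, cs, p_avg):
    """Risk-weighted estimate of p(t | M_d); backs off to cf/cs for unseen terms."""
    if tf[t] == 0:
        return cf[t] / cs
    p_ml = tf[t] / dl
    f_bar = p_avg[t] * dl
    risk = (1.0 / (1.0 + f_bar)) * (f_bar / (1.0 + f_bar)) ** tf[t]
    return p_ml ** (1.0 - risk) * p_avg[t] ** risk


def log_query_likelihood(query, tf, dl, cf, cs, p_avg):
    """log p(Q | M_d): query terms contribute p, all other vocabulary terms 1 - p."""
    q = set(query)
    if any(t not in cf for t in q):     # term unseen in the whole collection
        return -math.inf
    score = 0.0
    for t in cf:
        p = term_prob(t, tf, dl, cf, cs, p_avg)
        factor = p if t in q else 1.0 - p
        if factor <= 0.0:
            return -math.inf
        score += math.log(factor)
    return score


docs = [
    "language model retrieval language".split(),
    "speech recognition language model".split(),
    "poisson indexing model".split(),
]
query = ["language", "retrieval"]
tfs, cf, cs, p_avg = build_stats(docs)
scores = [log_query_likelihood(query, tf, len(d), cf, cs, p_avg)
          for tf, d in zip(tfs, docs)]
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

Working in log space avoids underflow when the vocabulary is large. Unlike the unsmoothed estimate, a document missing a query term still receives a finite score through the collection back-off.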

Experimental Results
- 11-point recall/precision experiments on TREC data
- Labrador (a research prototype retrieval engine)
- Wilcoxon test
- LM: better precision at all recall levels, significantly better at several levels
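For readers unfamiliar with the 11-point measure, here is a minimal sketch of how such a curve is commonly computed for a single query. It assumes the usual interpolated definition (precision at recall level r is the best precision at any recall at or above r); the slide does not say which variant was used, and the function name is illustrative.

```python
def eleven_point_precision(ranked_relevance, total_relevant):
    """Interpolated precision at recall 0.0, 0.1, ..., 1.0 for one ranked list.

    ranked_relevance: booleans, True where the document at that rank is relevant.
    total_relevant: number of relevant documents that exist for the query.
    """
    points = []  # (recall, precision) measured at each relevant document
    hits = 0
    for rank, is_rel in enumerate(ranked_relevance, start=1):
        if is_rel:
            hits += 1
            points.append((hits / total_relevant, hits / rank))
    curve = []
    for i in range(11):
        level = i / 10
        # Interpolation: best precision reached at this recall level or beyond.
        candidates = [p for r, p in points if r >= level - 1e-12]
        curve.append(max(candidates, default=0.0))
    return curve


# Relevant documents at ranks 1 and 3, two relevant documents in total.
curve = eleven_point_precision([True, False, True, False, False], total_relevant=2)
print(curve)
```

Averaging these curves over all queries gives one 11-point table per system. A paired test such as the Wilcoxon test can then compare two systems level by level.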

Conclusion / Future Work
- Text retrieval based on probabilistic language modeling
- It is both conceptually simple and explanatory
- The improvement in performance is not the main point
- More significant is that a different approach to retrieval was shown to be effective
It can be improved:
- Additional knowledge about the language generation process will yield better estimates
- Textual/graphical tools to sense the distribution of terms

References
[1] Harter, S. P. "A Probabilistic Approach to Automatic Keyword Indexing," Journal of the American Society for Information Science, July-August 1975.
[2] Robertson, S. E. and K. Sparck Jones. "Relevance Weighting of Search Terms," Journal of the American Society for Information Science, vol. 27, 1977.
[3] Turtle, H. and W. B. Croft. "Efficient Probabilistic Inference for Text Retrieval," Proceedings of RIAO 3, 1991.

THANK YOU FOR LISTENING