Bayesian Query-Focused Summarization
Hal Daumé III and Daniel Marcu
Information Sciences Institute, University of Southern California


Slide 1: Bayesian Query-Focused Summarization
Hal Daumé III and Daniel Marcu, Information Sciences Institute, University of Southern California

Slide 2: Three Perspectives
➢ A model for query-focused sentence extraction
➢ Well-founded query expansion in language modeling for IR
➢ An application of an LDA-style topic model

Slide 3: Task
Given a document, produce a summary in which:
➢ important information is retained
➢ redundant information is removed
What is important to me? ==> the user provides queries.
Annotating data is expensive ==> an (almost) unsupervised approach.
(Caveat: sentence extraction ahead!)

Slide 4: Suppose we have...
(A) a large document collection
(B) a query: “Why is argmax search hard?”
(C) relevance judgments

Slide 5: Idea
What do relevant documents have that makes them relevant?

Slide 6: A Statistical Model
Every word is generated from one of three sources (“juices”):
➢ query juice
➢ document-specific juice
➢ general English juice
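The three-juice idea can be sketched as a mixture of unigram language models: each word picks a source (query, document-specific, or general English) and is then drawn from that source's distribution. This is a minimal toy sketch, not the paper's actual model; the distributions and fixed mixing weights below are invented for illustration (in the full model the weights are inferred per sentence).

```python
import random

# Hypothetical unigram distributions for the three sources ("juices").
query_lm   = {"ibm": 0.5, "stocks": 0.5}
doc_lm     = {"shares": 0.4, "price": 0.3, "company": 0.3}
general_lm = {"the": 0.5, "of": 0.3, "went": 0.2}

def sample(dist):
    """Draw one word from a unigram distribution."""
    r, acc = random.random(), 0.0
    for word, p in dist.items():
        acc += p
        if r < acc:
            return word
    return word  # guard against floating-point rounding

def generate_word(pi=(0.2, 0.3, 0.5)):
    """Pick a source (Q, D, G) with probability pi, then a word from it.

    pi is fixed here for illustration; in the model it is a latent,
    per-sentence quantity.
    """
    source = random.choices(["Q", "D", "G"], weights=pi)[0]
    lm = {"Q": query_lm, "D": doc_lm, "G": general_lm}[source]
    return source, sample(lm)
```

A sentence is then just a sequence of such draws, and inference runs the process in reverse: given the words, infer which juice each one came from.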

Slide 7: An Example
Iraq’s National Assembly approved a list of Cabinet members for a transitional government Thursday, three months after national elections. Three ministries – Defense, Oil and Electricity – were filled with temporary appointments because of a last-minute failure to reach a compromise. Prime Minister Ibrahim al-Jaafari assumed his post with the creation of his government. The approval of the Cabinet represents the end of a major political impasse in the country. On Wednesday, al-Jaafari told a news conference that he had submitted his proposed Cabinet to President Jalal Talabani, who had to approve the names before the transitional National Assembly voted on them. Al-Jaafari’s announcement came a short time after gunmen shot and killed an assembly member on her doorstep in Baghdad…
(In the slide, each word is tagged Q, D, or G according to its inferred source.)

Slide 8: Graphical Model
(Plate diagram with plates for queries Q, documents D, sentences, and words: each word w has a latent source indicator drawn from per-sentence mixing weights, tying together query, document, and general-English language models.)
Why does this implement our idea?

Slide 9: The LM for IR Perspective...
(Estimate a unigram language model p(word) for the query and one for the document; rank documents by the KL divergence between the two models.)

Slide 10: LM for IR in Summaries
(Same setup, but with a sentence in place of the document: KL divergence between the query model and a sentence model.)
Sparse ==> unreliable estimate
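The KL-ranking setup from the last two slides can be sketched as follows. This is a generic sketch, not the paper's exact estimator: add-alpha smoothing stands in for whatever smoothing the real system uses, and the query and sentences are invented examples.

```python
import math
from collections import Counter

def unigram_lm(tokens, vocab, alpha=0.1):
    """Unigram model with add-alpha smoothing (a stand-in for the
    smoothing used in practice, which the slides do not specify)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    """KL(p || q) over a shared vocabulary."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p if p[w] > 0)

# Rank sentences by KL(query LM || sentence LM): lower is better.
query = "ibm stocks".split()
sentences = ["the price of ibm shares went up".split(),
             "shares in the company are set to top out".split()]
vocab = set(query) | {w for s in sentences for w in s}
q_lm = unigram_lm(query, vocab)
ranked = sorted(sentences,
                key=lambda s: kl_divergence(q_lm, unigram_lm(s, vocab)))
```

With only a few words per sentence, the sentence-side estimates are exactly the sparse, unreliable models the slide warns about, which is what motivates the Bayesian treatment.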

Slide 11: Why does it work?
Q: IBM stocks
S1: The price of IBM shares went up 5%...
S8: Shares in the company are set to top out soon

Q: IBM stocks + shares, up, ...
S1: The price of IBM shares went up 5%...
S8: Shares in the company are set to top out soon
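The effect shown on this slide is query expansion: terms like "shares" and "up", frequent in relevant text, are added to the query so that S8 can match even without the literal words "IBM stocks". Below is a deliberately naive feedback-style sketch of that idea, not the paper's Bayesian mechanism; the stopword list and helper name are invented for illustration.

```python
from collections import Counter

STOPWORDS = {"the", "of", "in", "are", "to", "a"}  # toy list

def expand_query(query_tokens, top_sentences, n_terms=3):
    """Naive expansion sketch: append the most frequent non-stopword,
    non-query terms from top-ranked sentences to the query."""
    counts = Counter(w for s in top_sentences for w in s
                     if w not in STOPWORDS and w not in query_tokens)
    return query_tokens + [w for w, _ in counts.most_common(n_terms)]
```

The paper's point is that its model achieves this kind of expansion in a principled, probabilistic way rather than by an ad-hoc term-selection heuristic like this one.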

Slide 12: The LDA-Style Perspective...
(Same plate diagram as Slide 8, read as a topic model.)
➢ #Q + #D + 1 topics
➢ Constrained topic-to-word mapping

Slide 13: Data
➢ From the NIST-sponsored Text REtrieval Conference (TREC)
➢ A collection of queries, documents and relevance judgments
➢ Queries come in three parts:
  ➢ Title (3-4 words)
  ➢ Description (1 sentence)
  ➢ Narrative (1 short paragraph)
➢ We use 350 queries in our experiments
➢ All relevant documents: 43 thousand documents, 2.1 million sentences, 65.8 million words
➢ Computation time is about 2.5 hours
➢ Evaluation data:
  ➢ Annotators selected up to 4 sentences for an extract, given title, description and narrative
  ➢ 166 total extracts (reasonably high agreement)
➢ Can compute mean average precision, mean reciprocal rank and precision at 2
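The three evaluation measures named on this slide are standard and easy to state in code. The per-query versions are below (the "mean" variants simply average them over queries); the sentence IDs in the comments are hypothetical.

```python
def average_precision(ranked, relevant):
    """AP for one query: mean of precision at each relevant hit,
    normalized by the number of relevant items."""
    hits, precisions = 0, []
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / i)
    return sum(precisions) / len(relevant) if relevant else 0.0

def reciprocal_rank(ranked, relevant):
    """1/rank of the first relevant item; 0 if none is retrieved."""
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / i
    return 0.0

def precision_at_k(ranked, relevant, k=2):
    """Fraction of the top-k ranked items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k
```

For example, with ranking [s1, s2, s3, s4] and relevant set {s2, s4}, all three measures come out to 0.5.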

Slide 14: Baseline Systems
➢ Strawman models:
  ➢ Rank sentences randomly
  ➢ Rank sentences according to sentence position
  ➢ Compute simple word overlap between query and sentence (Jaccard)
  ➢ Compute simple distance between query and sentence (cosine)
➢ The real competition:
  ➢ A state-of-the-art passage retrieval model based on language modeling and KL divergence
  ➢ The KL model + blind relevance feedback
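The two strawman similarity baselines are one-liners in spirit. A minimal sketch, assuming whitespace-tokenized input (the slides do not specify tokenization or weighting, so raw term frequencies are used here):

```python
import math
from collections import Counter

def jaccard(query_tokens, sent_tokens):
    """Word-overlap baseline: |Q ∩ S| / |Q ∪ S| over token sets."""
    q, s = set(query_tokens), set(sent_tokens)
    return len(q & s) / len(q | s) if q | s else 0.0

def cosine(query_tokens, sent_tokens):
    """Cosine similarity between raw term-frequency vectors."""
    q, s = Counter(query_tokens), Counter(sent_tokens)
    dot = sum(q[w] * s[w] for w in q)
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in s.values())))
    return dot / norm if norm else 0.0
```

Sentences are then ranked by descending similarity to the query, in contrast to the KL baselines, where lower divergence is better.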

Slide 15: Evaluation Results on All Fields
(Chart comparing Random, Position, Jaccard, Cosine, KL, KL+R, and BayeSum.)

Slide 16: Evaluation Results by Field
(Chart comparing Position, KL+R, and BayeSum for each query field: Title, Desc, Sum, Nar, Conc, plus a NO QUERY condition.)

Slide 17: Conclusions
➢ A statistical model for query-focused summarization
➢ Inference is possible using almost any technique
➢ Highly insensitive to the size of the query
➢ Utilizes relevance judgments in a well-founded way
➢ Interesting extensions possible (but robust)