Elliot Holt, Kelly Peterson

D4 – Smells Like D3
Primary Goal – improve D3 MAP with lessons learned
After many experiments: TREC 2004 MAP = 0.300 -> 0.321 (+7%), global MRR++
Help came from:
- Custom Lucene Analyzer (see the sketch below)
- Porter stemming the index
- NLTK stopword removal on the index
- Added to index (NOTE: not helpful for QA)
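The real change lived inside a custom Lucene Analyzer (Java); the following is only a minimal Python/NLTK sketch of the same index-time normalization, Porter stemming plus stopword removal. The function name and sample text are illustrative, not the team's code.

```python
# Sketch of the index-time term normalization described above:
# Porter stemming plus NLTK stopword removal. Illustrative only;
# the real system did this inside a custom Lucene Analyzer.
# Assumes the NLTK 'punkt' and 'stopwords' data packages are installed.
from nltk.stem import PorterStemmer
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

_stemmer = PorterStemmer()
_stopwords = set(stopwords.words("english"))

def normalize_terms(text):
    """Lowercase, tokenize, drop stopwords, and Porter-stem the rest."""
    tokens = word_tokenize(text.lower())
    return [_stemmer.stem(t) for t in tokens
            if t.isalnum() and t not in _stopwords]

# Example: terms that would be written to the index for one document field
print(normalize_terms("The quakes struck the city of Bam in December 2003"))
```

Stemming and stopping at index time means queries must pass through the same normalization, which is why this kind of change belongs in the analyzer rather than in ad hoc query code.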

Answer Extraction? Not yet…
Still improving the input pipes:
- Spillover splitting/scoring
- SiteQ PR system
MRR improvements (3-sentence window, sketched below): start = , baseline = , final = 0.424
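As a rough illustration of the 3-sentence-window passage retrieval mentioned above, a minimal sketch follows. The actual system used SiteQ-style passage scoring; the plain query-term-overlap weighting and all names here are stand-ins, not the SiteQ formula.

```python
# Minimal sketch of 3-sentence sliding-window passage retrieval.
# The real system used SiteQ-style scoring; plain query-term overlap
# is used here only as a stand-in weighting.
def windowed_passages(sentences, size=3):
    """Yield overlapping windows of `size` consecutive sentences."""
    for i in range(max(1, len(sentences) - size + 1)):
        yield " ".join(sentences[i:i + size])

def rank_passages(sentences, query_terms, size=3):
    """Score each window by how many distinct query terms it contains."""
    query_terms = set(query_terms)
    scored = [(sum(1 for t in query_terms if t in p.lower().split()), p)
              for p in windowed_passages(sentences, size)]
    return sorted(scored, key=lambda x: x[0], reverse=True)

doc = ["The earthquake struck Bam in December 2003.",
       "Thousands of people were killed.",
       "Aid agencies responded within days.",
       "The citadel was badly damaged."]
print(rank_passages(doc, ["earthquake", "bam", "struck"])[0])
```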

Answer Extraction? Hahaha. No.
Enhanced Query Expansion - IT WORKS!
IT:
- using improved Rocchio / expanded term weights (sketched below)
- using expanded query terms for scoring initial documents
- re-searching for better documents
WORKS: significant MRR improvements
- 2004 [1] 100: -> (+2.2%)
- 2004 [0-3] 1000: -> (+6.5%)
- 2005 [1-2] 1000: -> 0.398 (+1.5%)
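A minimal sketch of Rocchio-style blind (pseudo-relevance) feedback, the mechanism behind the improved term weights above: the query vector is pushed toward the centroid of the top-ranked documents, and the highest-weighted terms are kept as the expanded query. The alpha/beta values, helper name, and sample terms are illustrative, and the negative (non-relevant) component is omitted, as is common for blind feedback.

```python
# Sketch of Rocchio-style pseudo-relevance feedback. Weights are
# illustrative, not the tuned values used in the system.
from collections import Counter

def rocchio_expand(query_terms, top_doc_term_lists, alpha=1.0, beta=0.75, n_terms=20):
    """Return the top expanded terms with their Rocchio weights."""
    weights = Counter({t: alpha for t in query_terms})
    n_docs = len(top_doc_term_lists)
    for doc_terms in top_doc_term_lists:
        doc_vec = Counter(doc_terms)
        length = sum(doc_vec.values()) or 1
        for term, tf in doc_vec.items():
            # beta * normalized tf, averaged over the feedback documents
            weights[term] += beta * tf / (length * n_docs)
    # Expanded query terms and weights, reused for re-scoring and re-searching
    return dict(weights.most_common(n_terms))

expanded = rocchio_expand(["bam", "earthquake"],
                          [["bam", "earthquake", "iran", "2003", "quake"],
                           ["earthquake", "relief", "iran", "bam"]])
print(expanded)
```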

Answer Extraction? Well, kinda…
Development efforts:
- Hand-built pattern-matching reranking boost (Soubbotin 2001)
- NE boosting with Q_Class associations (Echihabi et al., 2005)
Needed an accuracy evaluation method:
- Create a corrected TREC 2004 Q_Class label file…manually
- Extract information from compute_mrr.py results
TREC 2004 data, 100 char, average MRR (sketched below):
- Miss = question with 0.0 MRR
- HitAccuracy = average MRR of non-miss questions
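Using the definitions on this slide (Miss = question with 0.0 MRR, HitAccuracy = average MRR of the non-miss questions), a per-label breakdown can be computed roughly as below. The (question id, Q_Class label, MRR) input format is an assumption for illustration, not the actual compute_mrr.py output.

```python
# Compute Miss counts and HitAccuracy per question-class label, following
# the definitions above: Miss = question with 0.0 MRR, HitAccuracy =
# average MRR over the non-miss questions. The (qid, label, mrr) triples
# are an assumed input format, not the real compute_mrr.py output.
from collections import defaultdict

def by_label_accuracy(results):
    """results: iterable of (question_id, q_class_label, mrr) triples."""
    per_label = defaultdict(lambda: {"misses": 0, "hits": []})
    for _qid, label, mrr in results:
        if mrr == 0.0:
            per_label[label]["misses"] += 1
        else:
            per_label[label]["hits"].append(mrr)
    return {label: {"misses": d["misses"],
                    "hit_accuracy": sum(d["hits"]) / len(d["hits"]) if d["hits"] else 0.0}
            for label, d in per_label.items()}

sample = [(1, "NUM:date", 0.0), (2, "NUM:date", 0.5), (3, "HUM:indiv", 1.0)]
print(by_label_accuracy(sample))
```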

By-Label Accuracy (lenient): 1000-char and 100-char runs

Answer Extraction Results
- Similar or better results without
- The systems are implemented internally, but untuned
- Worst performers (that could have been helped):
  - by Misses: NUM:date, HUM:indiv
  - by HitAccuracy: NUM:money, LOC
- More patterns & NE associations needed (toy sketch below)
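A toy sketch of the NE-boosting idea: candidate passages containing a named entity of the type associated with the question class get their retrieval score boosted before reranking. The association table, boost factor, and NE type names are invented for illustration and are not the Echihabi-style associations the system used.

```python
# Toy sketch of NE boosting with question-class associations: passages
# containing a named entity of the type expected for the question class
# get their retrieval score multiplied up. The association table and
# boost factor are invented for illustration.
QCLASS_TO_NE = {
    "HUM:indiv": {"PERSON"},
    "LOC": {"GPE", "LOC"},
    "NUM:date": {"DATE"},
    "NUM:money": {"MONEY"},
}

def ne_boost(passages, q_class, boost=1.5):
    """passages: list of (score, text, ne_types) tuples; returns a re-ranked list."""
    wanted = QCLASS_TO_NE.get(q_class, set())
    reranked = [(score * boost if wanted & set(ne_types) else score, text)
                for score, text, ne_types in passages]
    return sorted(reranked, key=lambda x: x[0], reverse=True)

cands = [(2.0, "He was born in 1952.", ["DATE"]),
         (2.4, "The committee met twice.", [])]
print(ne_boost(cands, "NUM:date"))
```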

TREC 2004 tuning results

TREC 2005 testing results

Further Work
- Apply new pattern and NE boosting to the disaster areas.
- Gloat about getting query expansion to improve both document and passage retrieval.
- Restart.