Employing Two Question Answering Systems in TREC 2005 Harabagiu, Moldovan, et al 2005 Language Computer Corporation.


Presentation transcript:

Employing Two Question Answering Systems in TREC 2005 Harabagiu, Moldovan, et al 2005 Language Computer Corporation

Highlights
Two systems
– PowerAnswer-2: factoids (main task)
– PALANTIR: relationships
Bells and whistles
– Web-boosting strategy
– Abductive logic prover
– World-knowledge axioms: XWN, SUMO, …
Results: “above median for all groups”
– 53.4% main task, 20.4% relationships task

TREC 2005
Tasks: Main (factoids), Relationships
What’s new
– Question types: “Other”
– Answer types: Events
Challenges
– More complex coreference resolution
– Temporal and other event-like constraints
– Discovering info nuggets for “Other” questions

Challenges: Coreference resolution
TREC 2004: single antecedent for anaphora
TREC 2005: more candidate antecedents…

Challenges: Inter-question constraints
A question and its answer constrain the subsequent questions
– The correct answer to Q136.5 depends on correctly resolving its coreference to the correct answer of the previous question, Q136.4
Event answer types
– Nominal answer types act as topics of subsequent questions
– Events constrain subsequent questions with event-like properties: time, participants…

The LCC Solution: Two Systems
PowerAnswer-2
– Factoid questions
– Includes: abductive logic, temporal reasoner, world-knowledge axioms
– Bonus: discover interesting and novel nuggets for “Other” questions
PALANTIR
– Relationship questions
– Includes: keyword expansion, topic representation, automatic lexicon generation

PowerAnswer-2: Architecture

PowerAnswer-2: Components
Standard modules: QP, PR, AP
– Question Processor, Passage Retrieval, Answer Processor
Sneaky module: WebBooster
Fancy module: COGEX Logic Prover
– World knowledge: SUMO, eXtended WordNet, JAGUAR
– Linguistic knowledge: WordNet, manual ellipsis and coreference axioms
– “Prove” correct answers with abductive logic
– Temporal inference from “advanced textual inference techniques”

WebBooster
Exploit redundancy on the web for answer ranking
– Construct a series of search engine queries from “linguistic patterns” (morph/lex alternations?)
– Extract the most redundant answers from the web documents
– “Boost” (i.e., increase the weight of) answers from the TREC collection that most closely match answers from the web collection
Justification: the larger the document collection, the easier it is to pinpoint answers that closely resemble the surface form of the question
Results: 20.8% increase in factoid score
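The boosting step can be sketched roughly as follows. This is only an illustration, not LCC's implementation: the function names, the exact-match comparison after normalization, and the `boost` weighting formula are all assumptions made for the example (the slide says answers are boosted when they "most closely match" web answers, without saying how closeness is measured).

```python
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so surface variants compare equal."""
    return " ".join(answer.lower().split())


def web_boost(trec_candidates, web_answers, boost=1.0):
    """Re-rank TREC candidates by how often they recur among web answers.

    trec_candidates: list of (answer, score) pairs from the TREC collection.
    web_answers: list of answer strings extracted from web documents.
    boost: hypothetical weight controlling how much redundancy matters.
    """
    counts = Counter(normalize(a) for a in web_answers)
    total = sum(counts.values()) or 1  # avoid division by zero
    boosted = []
    for answer, score in trec_candidates:
        # Fraction of web answers that agree with this candidate (assumed metric).
        redundancy = counts[normalize(answer)] / total
        boosted.append((answer, score * (1.0 + boost * redundancy)))
    return sorted(boosted, key=lambda pair: pair[1], reverse=True)


candidates = [("1912", 0.50), ("1915", 0.55)]
web = ["1912", "1912", "1912", "1915"]
ranked = web_boost(candidates, web)
print(ranked[0][0])  # prints "1912"
```

Here the candidate that was originally ranked second overtakes the first because three of the four web answers agree with it.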

COGEX: Logic Prover
Convert Question → QLF, Answer → ALF
Perform “proof” on question over candidate answers
Rank answers by semantic similarity to question
– Semantic similarity: WordNet! Ex: similarity of “buy” and “own” judged by length of connecting path in WordNet
Results: 12.4% increase in factoid score
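The path-length idea can be illustrated with a breadth-first search over a small graph. This is a sketch under loud assumptions: `TOY_LINKS` is a hand-made toy graph, not actual WordNet data, and the `1 / (1 + distance)` scoring is one common convention rather than the formula COGEX uses.

```python
from collections import deque

# Hypothetical toy graph standing in for WordNet relations (not real WordNet data).
TOY_LINKS = {
    "buy": ["acquire"],
    "acquire": ["buy", "have"],
    "have": ["acquire", "own"],
    "own": ["have"],
    "sell": ["exchange"],
    "exchange": ["sell"],
}


def path_length(graph, start, goal):
    """Return the number of edges on the shortest path, or None if unconnected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def path_similarity(graph, a, b):
    """Shorter connecting paths give higher similarity; unconnected words give 0."""
    dist = path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1 + dist)


print(path_similarity(TOY_LINKS, "buy", "own"))   # prints 0.25
print(path_similarity(TOY_LINKS, "buy", "sell"))  # prints 0.0
```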

COGEX: Temporal Context Reasoner
Document processing: index by dates
Q and A processing: represent temporal relations as triples (S, E1, E2)
– S is a temporal signal (“during”, “after”); E1 and E2 are events
Reasoning:
– Prefer passages that match the temporal constraints detected in the Q
– Discover events related by temporal signals in the Q and candidate As
– Perform temporal unification between the Q and candidate As, boosting As that match the Q’s times
Results: 2% increase in factoid score
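A minimal sketch of the (S, E1, E2) representation and the boosting step might look like this. The class and function names, the use of `None` for an unknown event slot, and the fixed `bonus` are assumptions for illustration; the slides do not describe how COGEX actually performs unification.

```python
from typing import NamedTuple, Optional


class TemporalTriple(NamedTuple):
    """(S, E1, E2): a temporal signal relating two events."""
    signal: str
    event1: Optional[str]  # None marks a slot the question leaves open (assumption)
    event2: Optional[str]


def unifies(question: TemporalTriple, answer: TemporalTriple) -> bool:
    """True if the signals agree and every filled question slot matches."""
    if question.signal != answer.signal:
        return False
    pairs = ((question.event1, answer.event1), (question.event2, answer.event2))
    return all(q is None or q == a for q, a in pairs)


def boost_candidates(question_triple, candidates, bonus=0.1):
    """candidates: list of (answer_text, score, triples). Returns ranked (text, score)."""
    ranked = []
    for text, score, triples in candidates:
        if any(unifies(question_triple, t) for t in triples):
            score += bonus  # hypothetical fixed boost for a temporal match
        ranked.append((text, score))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


question = TemporalTriple("after", None, "election")
candidates = [
    ("answer A", 0.50, [TemporalTriple("after", "resignation", "election")]),
    ("answer B", 0.55, [TemporalTriple("during", "resignation", "election")]),
]
ranked = boost_candidates(question, candidates)
print(ranked[0][0])  # prints "answer A"
```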

“Other” Questions
Generic definition-pattern based nuggets
– “...Russian submarine Kursk, which is lying on the sea bed in the Barents Sea...”
Answer-type based nuggets
– Nugget patterns specific to properties of the answer type
– 33 target classes generated by Naïve Bayes classifier on WordNet synsets
– Bing Crosby → musican_person: band, singer, born, …
Entity-relationship based nuggets
– Nugget patterns are based on relations to other NEs
– Akira Kurosawa AND _date
– Akira Kurosawa AND _location
– …
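Two of these nugget strategies can be sketched in a few lines. Both functions are hypothetical illustrations: the single "which is/was" regular expression stands in for whatever definition patterns PowerAnswer-2 really uses, and the query strings simply mirror the "target AND _slot" form shown on the slide.

```python
import re


def definition_nuggets(target: str, text: str):
    """Find 'TARGET, which is/was ...' clauses (one assumed definition pattern)."""
    pattern = re.escape(target) + r",\s+which\s+(?:is|was)\s+([^.,;]+)"
    return [match.strip() for match in re.findall(pattern, text)]


def entity_relation_queries(target: str, slots=("_date", "_location")):
    """Build 'target AND _slot' queries pairing the target with other entity types."""
    return [f"{target} AND {slot}" for slot in slots]


sentence = (
    "Divers reached the Russian submarine Kursk, which is lying "
    "on the sea bed in the Barents Sea."
)
print(definition_nuggets("Kursk", sentence))
print(entity_relation_queries("Akira Kurosawa"))
```

A real system would need many more patterns and a filter for novelty, which the slides mention but do not detail.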

PALANTIR: Architecture

PALANTIR: Keyword Selection
Collocation detection
– Identify complete phrases that aren’t just bags of keywords (Organization of African States)
Keyword ranking
– Detect overall importance of each keyword in the query
– Use keyword-density strategy for doc ranking
Keyword expansion
– Synonyms, alternate forms for keywords
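As a rough illustration of keyword-density ranking, the sketch below scores each document by the summed weight of the query keywords it contains, divided by its length. The tokenizer, the weight values, and the density formula are assumptions for the example; the slides do not specify how PALANTIR computes density or keyword importance.

```python
import re


def tokenize(text: str):
    """Lowercase alphanumeric tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_density(doc: str, weights: dict) -> float:
    """Summed weight of keyword occurrences divided by document length."""
    tokens = tokenize(doc)
    if not tokens:
        return 0.0
    return sum(weights.get(token, 0.0) for token in tokens) / len(tokens)


def rank_documents(docs, weights):
    """Order documents from highest to lowest keyword density."""
    return sorted(docs, key=lambda d: keyword_density(d, weights), reverse=True)


# Hypothetical keyword importance weights from the ranking step.
weights = {"oil": 2.0, "pipeline": 1.0}
docs = [
    "weather report for the weekend mentions oil",
    "oil pipeline deal signed",
]
print(rank_documents(docs, weights)[0])  # prints "oil pipeline deal signed"
```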

PALANTIR: Topic Representation
Harvest “topic signatures” from text
– ??
Find relationships between topic signatures
– Use syntax- and semantic-based relations between verbs and arguments
– Use context-based relations that exist between entities

PALANTIR: Lexicon Generation
Q: Relationship questions have no single semantic answer type; how to identify appropriate answers from passages?
A: By generating set-types on the fly, of course!
– Use a weakly-supervised learning approach to identify semantic sets in the question, then keywords relevant to that set (South American countries)
– Automatically generate a large database of syntactic frames that represent semantic relations

Results (panels: PowerAnswer-2, PALANTIR)

Summary
WebBooster: 20% increase
COGEX: 12% increase
Temporal Reasoner: 2% increase
Nugget-pattern discovery: 22.8% f-measure
PALANTIR strategies: