IBM Question Answering Update: piQuAnt
IBM Research; subcontractor: Cycorp
ARDA/AQUAINT December 2002 Workshop

This work was supported in part by the Advanced Research and Development Activity (ARDA)'s Advanced Question Answering for Intelligence (AQUAINT) Program under contract number MDA C-0988.

Overview
- Progress with QPlans
- Multi-Agent, Multi-Source Architecture & Answer Resolution
- 2002 Performance Evaluation

Single Strategy -> Plan-Based
- More sophisticated question analysis using a full parse and named-entity recognition
- Search strategy based on the type of question; any or all of the following (see the sketch below):
  - Regular: predictive annotation
  - Relative: relative clauses, appositions
  - Definition: use external structured knowledge (WordNet, tables from the WWW, databases, Cyc)
- Corpus strategy (selected by user)
- Answering-agent strategy (selected by user)
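As a rough illustration of the dispatch idea, a question class selects a list of search strategies. This is a minimal hypothetical sketch, not PIQUANT's implementation: classify_question() is a crude stand-in for full-parse question analysis, and the strategy names are taken from the slide.

```python
# Hypothetical sketch of plan-based strategy selection; classify_question()
# stands in for PIQUANT's full-parse question analysis.

def classify_question(question: str) -> str:
    q = question.lower()
    if q.startswith(("what is", "who is", "define")):
        return "definition"
    if "," in question or " which " in q:
        return "relative"
    return "regular"

# Any or all of these strategies may be scheduled for a question.
STRATEGIES = {
    "regular": ["predictive_annotation"],
    "relative": ["relative_clause_search", "apposition_search"],
    "definition": ["wordnet", "www_tables", "databases", "cyc"],
}

def build_qplan(question: str) -> list:
    """Return the ordered search strategies for this question type."""
    return STRATEGIES[classify_question(question)]

print(build_qplan("What is a zygote?"))
# ['wordnet', 'www_tables', 'databases', 'cyc']
```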

A Single-Agent, Single-Strategy QA Architecture
[Architecture diagram: Question Analysis turns the question into an SE Query and an Answer Type; Search produces a HitList; Answer Selection chooses answers for Answer Presentation. NLP Utilities, Answer Classification, and WordNet support the pipeline.]

A Multi-Agent QA Architecture
[Architecture diagram: Question Analysis produces a QFrame; a QPlan Generator and QPlan Executor issue QGoals to multiple answering agents (KSP-Based, Rule-Based, Statistical, Definitional Q, Web-Based). The agents share Search/HitList, Answer Selection, NLP Utilities, and Answer Classification, and reach knowledge sources (WordNet, Cyc, Web) through a KS Adaptation Layer. Answer Resolution merges agent answers for Answer Justification & Presentation.]

Merging and Resolving Answers in a Multi-Agent QA Architecture
Jennifer Chu-Carroll
November 8, 2002

A Multi-Agent QA Architecture (annotated)
[Same diagram as above, with the KS Adaptation Layer labeled OntASK and the unstructured knowledge sources identified as the TREC 10 and TREC 11 corpora and Encyclopedia Britannica (EB).]

Currently Implemented Answering Agents
- Agents based on unstructured information
  - Agent strategies: knowledge-based answering agent; statistical answering agent
  - Knowledge sources: AQUAINT corpus; TREC corpus; Encyclopedia Britannica
- Agents based on structured information
  - Agent strategies: knowledge-source query via KSP; sanity checking (post-hoc filtering of candidate answers)
  - Knowledge sources: WordNet; Cyc; databases

Answer Resolution
Combines answers from multiple answering agents (see the sketch below).
[Diagram: three answering agents, each running its own Question Analysis, Search, and Answer Selection; two apply different strategies to Corpus 1 and one applies the first strategy to Corpus 2. Their passages and answers feed Answer Resolution, which applies Confidence Reranking and Sanity Checking to produce the final answer.]
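Read as pseudocode, the pipeline amounts to: collect each agent's scored answers, merge them, then rerank and sanity-check. A toy self-contained sketch; the agents, scores, and merging-by-summation rule are hypothetical stand-ins, not PIQUANT's actual mechanism.

```python
# Toy sketch of the resolution pipeline; agents are callables mapping a
# question to {answer: confidence}. All names and scores are invented.

def resolve(question, agents, sanity_check):
    """Merge per-agent answers (agreement accumulates confidence),
    drop answers the sanity checker calls "insane", return the best."""
    merged = {}
    for agent in agents:
        for text, conf in agent(question).items():
            merged[text] = merged.get(text, 0.0) + conf
    sane = {t: c for t, c in merged.items()
            if sanity_check(question, t) != "insane"}
    return max(sane, key=sane.get)

# Two agents agree on "46"; one slightly prefers "7". Agreement wins.
agents = [lambda q: {"46": 0.5, "7": 0.6},
          lambda q: {"46": 0.7}]
print(resolve("How many chromosomes does a human zygote have?",
              agents, lambda q, a: "sane"))   # -> 46
```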

Answer Resolution Components: Answer Selection
- Combines answers proposed by retrieved passages
  - using different keywords and/or search strategies
  - from different corpora using the same strategy
- Motivation
  - Different strategies/corpora may produce different relevant passages, which lets answer selection find closer matches with the question
  - Semantically equivalent answers appearing in different contexts reinforce one another
- A corpus may be
  - a primary corpus: answers can be proposed and justified
  - a supporting corpus: answers can only support those found in the primary corpus (see the filter sketch below)
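The primary/supporting distinction can be expressed as a filter over candidate answers. A minimal sketch, assuming each candidate records the corpus that proposed it; the corpus labels and record layout are illustrative.

```python
# Minimal sketch of the primary vs. supporting corpus rule; the corpus
# labels and candidate layout are illustrative assumptions.
PRIMARY = "AQUAINT"

def filter_by_corpus(candidates):
    """candidates: list of (answer_text, corpus). A supporting corpus may
    only reinforce an answer that the primary corpus also proposed."""
    primary_answers = {text for text, corpus in candidates
                       if corpus == PRIMARY}
    return [(text, corpus) for text, corpus in candidates
            if text in primary_answers]

cands = [("46", "AQUAINT"), ("46", "TREC"), ("48", "EB")]
print(filter_by_corpus(cands))
# [('46', 'AQUAINT'), ('46', 'TREC')] -- '48' has no primary support
```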

Answer Selection Process
- Identifies candidate answers and their semantic types
- Evaluates candidate answers based on
  - semantic type match
  - grammatical relationship match
- Performs candidate answer normalization
  - e.g., Clinton = Bill Clinton = President Clinton
  - currently focuses on named-entity normalization
- Combines the evidence for each candidate answer and computes a score (sketched below)
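A stripped-down sketch of the normalize-then-pool step. The alias table, feature weights, and inputs are invented for illustration; PIQUANT's actual scoring is richer.

```python
# Sketch of candidate normalization plus evidence pooling. The alias
# table, weights, and example inputs are illustrative, not PIQUANT's.
from collections import defaultdict

ALIASES = {"clinton": "bill clinton", "president clinton": "bill clinton"}

def canonical(text):
    return ALIASES.get(text.lower(), text.lower())

def best_answer(candidates):
    """candidates: (text, type_match, gram_match) with 0/1 features.
    Variants of one entity pool their evidence after normalization."""
    score = defaultdict(float)
    for text, type_match, gram_match in candidates:
        score[canonical(text)] += 1.0 * type_match + 0.5 * gram_match
    return max(score, key=score.get)

print(best_answer([("Clinton", 1, 0),
                   ("President Clinton", 1, 1),
                   ("Gore", 1, 1)]))
# -> bill clinton: pooled evidence beats a single stronger mention
```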

Answer Selection Example
TREC11 Q: "How many chromosomes does a human zygote have?"
Passages from rule-based strategy + AQUAINT corpus:
- "Of the 46 human chromosomes, 44 are identical pairs."
- "There are 46 paired chromosomes in a human being's cell nucleus."
- "... the order of the 21st of the 23 pairs of human chromosome ..."
- "... narrowed their search of the gene to a small section of human chromosome 7 ..."
- "... fused together to form the present-day human chromosome 7."
The system returns "7" as its top answer.

Answer Selection Example (Cont'd)
TREC11 Q: "How many chromosomes does a human zygote have?"
Passages from statistical strategy + AQUAINT corpus:
- "... sequence the roughly 100,000 genes on the 46 human chromosomes."
- "Of the 46 human chromosomes, 44 are identical pairs."
Passages from rule-based strategy + TREC corpus:
- "There are 46 chromosomes in a normal human cell."
- "... located on one of the 46 chromosomes in every human cell."
Passages from rule-based strategy + Encyclopedia Britannica:
- "In each body cell of normal human beings, there are 46 chromosomes ..."
- "Normally, humans have 46 chromosomes arranged in 23 pairs."
With the additional passages, the system now returns "46" as its top answer.

Answer Resolution Components: Confidence Reranking
- Invoked only if two or more strategically independent answering agents are used
- Motivation: higher confidence in an answer that two strategically independent agents both give
- Process (sketched below)
  - Adjust the confidence scores of previously determined answers in consultation with another agent's answer set
  - A score receives a large boost if the other answering agent gives an identical answer
  - A score receives a small boost if the other agent gives a partially overlapping answer
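A sketch of the reranking step. The boost magnitudes are assumptions; the talk only says "large" and "small", and the word-overlap test stands in for whatever partial-match criterion PIQUANT uses.

```python
# Sketch of cross-agent confidence reranking; EXACT_BOOST/PARTIAL_BOOST
# are invented magnitudes.
EXACT_BOOST = 0.5
PARTIAL_BOOST = 0.1

def rerank(answers, other_answers):
    """Boost answers in `answers` that the other (strategically
    independent) agent also proposed, exactly or partially."""
    out = dict(answers)
    for text, conf in answers.items():
        if text in other_answers:
            out[text] = conf + EXACT_BOOST
        elif any(set(text.split()) & set(o.split()) for o in other_answers):
            out[text] = conf + PARTIAL_BOOST
    return out

print(rerank({"46": 0.4, "7": 0.5}, {"46": 0.6}))
# {'46': 0.9, '7': 0.5} -- agreement promotes '46' past '7'
```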

Answer Resolution Components: Cyc Sanity Checker
- A post-hoc process for
  - rejecting "insane" answers (e.g., "How much does a grey wolf weigh?" answered with 300 tons)
  - boosting confidence for "sane" answers
- The sanity checker is invoked with
  - a predicate, e.g., "weight"
  - a focus, e.g., "grey wolf"
  - a candidate value, e.g., "300 tons"
- The sanity checker returns
  - "sane": within +/- 10% of the value in Cyc
  - "insane": outside of the reasonable range
  - "don't know"
- The confidence score is highly boosted when an answer is "sane" (see the sketch below)
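The +/- 10% rule translates directly into code. A sketch, with a toy lookup table standing in for the Cyc query; the table's contents and the kilogram units are illustrative assumptions.

```python
# Sketch of the sanity-check rule: "sane" = within +/- 10% of the value
# Cyc records. The lookup table stands in for a real Cyc query.
CYC_VALUES_KG = {("weight", "grey wolf"): 40.0}   # illustrative entry

def sanity_check(predicate, focus, candidate_kg):
    known = CYC_VALUES_KG.get((predicate, focus))
    if known is None:
        return "don't know"
    if abs(candidate_kg - known) <= 0.10 * known:
        return "sane"
    return "insane"

print(sanity_check("weight", "grey wolf", 300_000.0))  # 300 tons -> insane
print(sanity_check("weight", "grey wolf", 38.0))       # -> sane
```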

Cyc Sanity Checking Example
TREC11 Q: "What is the population of Maryland?"
Without sanity checking:
- PIQUANT's top answer: "50,000"
- Justification: "Maryland's population is 50,000 and growing rapidly."
- The passage discusses an exotic species, the nutria, not humans.
With sanity checking:
- Cyc knows the population of Maryland is 5,296,486
- It rejects the "insane" top answers
- PIQUANT's new top answer: "5.1 million", with very high confidence

Performance Evaluation
Conducted experiments to evaluate the multi-source and multi-strategy aspects of PIQUANT.
System configurations:
- TREC2001 system (pre-AQUAINT)
- Single source & single strategy
  - Strategy: rule-based or statistical answering agent
  - Source: AQUAINT corpus
- Multiple sources & single strategy
  - Strategy: rule-based answering agent
  - Sources: AQUAINT corpus (primary); TREC corpus and EB (supporting)
- Multiple sources & multiple strategies
  - Strategies: rule-based and statistical answering agents
  - Sources: AQUAINT corpus (primary); TREC corpus and EB (supporting)

Evaluation Results
- Overall impact of agents based on unstructured information:
  - 41.3% relative improvement in the number of questions correctly answered (28.3% -> 40.0% in the table below)
  - 51.8% relative improvement in average precision
- Impact of agents based on structured knowledge sources:
  - KSP invoked 5 times, returned 5 correct answers
  - Cyc sanity checker invoked 3 times, returned 1 definitive answer

  System configuration                        % correct   Avg prec
  TREC2001 system                             28.3%
  Single source, rule-based strategy          32.5%
  Single source, statistical strategy         32.7%
  Multiple sources, rule-based strategy       38.2%
  Multiple sources, multiple strategies       40.0%
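The relative-improvement figures follow directly from the % correct row, as in this one-line check:

```python
# Relative improvement = (new - old) / old, from the table above.
baseline, best = 0.283, 0.400   # TREC2001 vs. multi-source/multi-strategy
print(f"{(best - baseline) / baseline:.1%}")   # -> 41.3%
```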