
QUALIFIER in TREC-12 QA Main Task
Hui Yang, Hang Cui, Min-Yen Kan, Mstislav Maslennikov, Long Qiu, Tat-Seng Chua
School of Computing, National University of Singapore

Outline: Introduction | Factoid Subsystem | List Subsystem | Definition Subsystem | Results | Conclusion and Future Work

Introduction
- Given a question and a large text corpus, return an "answer" rather than relevant "documents"
- QA sits at the intersection of IR, IE, and NLP
- Our system, QUALIFIER:
  - consists of 3 subsystems (factoid, list, definition)
  - uses external resources: the Web, WordNet, and an ontology
  - performs event-based question answering
  - introduces new modules

Outline: Introduction | Factoid Subsystem | List Subsystem | Definition Subsystem | Results | Conclusion and Future Work

Factoid System Overview

Factoid Subsystem
Pipeline: Detailed Question Analysis → QA Event Construction → QA Event Mining → Answer Selection → Answer Justification
Supporting modules: Fine-grained Named Entity Recognition, Anaphora Resolution, Canonicalization, Coreference, Successive Constraint Relaxation

Why Event-based QA - I
The world consists of two basic types of things, entities and events, and people often ask questions about both. From question answering's point of view, questions are enquiries about entities or events.

Why Event-based QA - II
QA Entities: "anything having existence (living or nonliving)", e.g. "What is the Democratic Party symbol?"
QA Events: "something that happens at a given place and time", e.g. "How did the donkey become the Democratic Party symbol?" (answer: Thomas Nast's 1870 Harper's Weekly cartoon)

Why Event-based QA - III
Entity questions ask about properties, or about the entities themselves (definition questions).
Event questions ask about the elements of events: location, time, subject, object, quantity, description, action, etc.

Table 1: Correspondence of WH-questions and QA event elements
  Who   → Subject, Object
  Where → Location
  When  → Time
  What  → Subject, Object, Description, Action
  Which → Subject, Object
  How   → Quantity, Description

question ::= event | event_element | entity | entity_property
event ::= { event_element }
event_element ::= time | location | subject | object | quantity | description | action | other
entity ::= object | subject
entity_property ::= quantity | description | other

Event-based QA Hypotheses
Equivalency: for all QA events E_i, E_j, if all_elements(E_i) = all_elements(E_j), then E_i = E_j, and vice versa.
Generality: if all_elements(E_i) is a subset of all_elements(E_j), then E_i is more general than E_j.
Cohesiveness: if elements a and b both belong to an event E_i, and a and c do not belong to a known event together, then co-occurrence(a, b) is greater than co-occurrence(a, c).
Predictability: if elements a and b both belong to an event E_i, then a => b and b => a.
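In set notation, the four hypotheses read as follows (a LaTeX restatement of the slide; elems abbreviates all_elements, cooc the co-occurrence measure):

```latex
\begin{align*}
\text{Equivalency:}    &\; \forall E_i, E_j:\ \mathrm{elems}(E_i) = \mathrm{elems}(E_j) \iff E_i = E_j \\
\text{Generality:}     &\; \mathrm{elems}(E_i) \subseteq \mathrm{elems}(E_j) \implies E_i \text{ is more general than } E_j \\
\text{Cohesiveness:}   &\; a, b \in E_i \text{ and } \nexists E: \{a, c\} \subseteq E \implies \mathrm{cooc}(a,b) > \mathrm{cooc}(a,c) \\
\text{Predictability:} &\; a, b \in E_i \implies (a \Rightarrow b) \wedge (b \Rightarrow a)
\end{align*}
```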

QA Event Space
Consider an event to be a point in a multi-dimensional QA event space. If we know all the elements of an event, we can easily answer different questions about it, e.g. "When did Bob Marley die?". Because elements of the same event are innately associated (Cohesiveness), we can use the elements that are already known
- to narrow the search scope, and
- to find the remaining unknown event elements, i.e. the answer (Predictability).

Problems to be Solved
In most cases, however, it is difficult to find the correct unknown element(s), i.e. the correct answer. Two major problems:
- insufficient known elements
- inexact known elements
Solutions:
- explore world knowledge (the Web and WordNet glosses) to find more known elements
- exploit lexical knowledge (WordNet synsets and morphology) to find exact forms

How to Find a QA Event
Using the Web: from the original query q^(0), retrieve the top N web documents; for each term q_i^(0) in q^(0), extract nearby non-trivial words (within the same sentence or up to n words away) into C_q, and rank them by their probability of correlation with q_i^(0).
Using WordNet: for each q_i^(0) in q^(0), extract terms that are lexically related to q_i^(0) by locating them in the gloss G_q and synset S_q.
Combine the external knowledge sources into the term collection K_q = C_q + (G_q ∪ S_q).
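A minimal sketch of the Web step, assuming a caller-supplied `search_web` function (hypothetical) that returns snippet or page text; correlation is approximated here by simple co-occurrence counts, an assumption rather than the paper's exact probability model:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "or", "is", "was", "did", "to"}

def expand_with_web(query_terms, search_web, top_n=50, window=10):
    """Collect non-trivial words appearing near original query terms in the
    top-N web results and rank them by co-occurrence with those terms."""
    query = set(t.lower() for t in query_terms)
    scores = Counter()
    for doc in search_web(" ".join(query_terms), top_n):  # search_web is a stand-in
        tokens = [t.lower() for t in re.findall(r"[A-Za-z0-9']+", doc)]
        for i, tok in enumerate(tokens):
            if tok in query:
                # words within `window` tokens of a query term enter C_q
                for near in tokens[max(0, i - window): i + window + 1]:
                    if near not in STOPWORDS and near not in query:
                        scores[near] += 1
    return [w for w, _ in scores.most_common(20)]         # ranked C_q
```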

QA Event Construction: Structured Query Formulation
We perform structural analysis on K_q to form semantic groups of terms. Given any two distinct terms t_i, t_j in K_q, we compute their:
- lexical correlation
- co-occurrence correlation
- distance correlation
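As an illustration of one of these measures, here is a pointwise-mutual-information-style co-occurrence score over sentences; the slide does not give the exact formula, so this is a stand-in:

```python
import math

def cooccurrence_correlation(t_i, t_j, sentences):
    """PMI-style co-occurrence score between two expansion terms over a
    collection of sentences (each a set of tokens)."""
    n = len(sentences)
    p_i = sum(t_i in s for s in sentences) / n
    p_j = sum(t_j in s for s in sentences) / n
    p_ij = sum(t_i in s and t_j in s for s in sentences) / n
    if p_ij == 0:
        return float("-inf")   # the terms never co-occur
    return math.log(p_ij / (p_i * p_j))
```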

QA Event Construction (example)
For the question "What Spanish explorer discovered the Mississippi River?", the final Boolean query becomes:
"(Mississippi) & (French | Spanish) & (Hernando & Soto & De) & (1541) & (explorer) & (first | European | river)"

QA Event Mining
Extract important association rules among the elements using data mining techniques. Given a QA event E_i, we define X, Y as two sets of event elements. Event mining studies rules of the form X → Y, where X ∩ Y = ∅ and Y ∩ {element_original} ≠ ∅. Candidate rules are filtered as follows:
- if X ∩ Y ≠ ∅, ignore X → Y
- if cardinality(Y) > 1, ignore X → Y
- if Y ∩ {element_original} = ∅, ignore X → Y
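A sketch of this rule enumeration under the three filters, assuming each retrieved passage contributes one set of observed event elements (that data layout is an assumption, not the paper's):

```python
from itertools import combinations

def mine_rules(element_sets, target, min_support=2):
    """Enumerate association rules X -> Y over QA event elements, keeping only
    rules where X and Y are disjoint, |Y| = 1, and Y is the asked-for
    (original) element; support is counted across passages."""
    candidates = set()
    for elements in element_sets:
        body = sorted(elements - {target})           # keep X disjoint from Y
        for size in range(1, len(body) + 1):
            candidates.update(combinations(body, size))
    rules = []
    for x in candidates:
        support = sum(set(x) | {target} <= s for s in element_sets)
        if support >= min_support:
            rules.append((x, (target,), support))    # Y = {element_original}
    return rules
```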

Passage & Answer Selection
Select passages from the relevant documents in the QA corpus based on an Answer Event Score (AES). Support and confidence take their standard rule-mining forms:
Support(X → Y) = P(X ∪ Y)
Confidence(X → Y) = P(Y | X) = Support(X ∪ Y) / Support(X)
The weight for answer candidate j is then computed from the scores of the rules that support it.
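One way to realize the AES is to sum the confidences of mined rules whose antecedents a passage covers; the exact AES and candidate-weight formulas are not reproduced on the slide, so this is an assumption:

```python
def answer_event_score(passage_elements, rules, element_sets):
    """Score a candidate passage by the total confidence of mined rules
    X -> Y whose antecedent X is covered by the passage's element set."""
    score = 0.0
    for x, y, support_xy in rules:
        if set(x) <= passage_elements:
            support_x = sum(set(x) <= s for s in element_sets)
            score += support_xy / support_x if support_x else 0.0
    return score
```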

Related Modules: Fine-grained Named Entity Recognition
- Fine-grained NE tagging
- Non-ASCII character remover
- Number format converter, e.g. "one hundred eleven" => 111 (a sketch follows)
- Rule conflict resolver, with priority given to longer matches, the ontology, and handcrafted priorities
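A self-contained sketch of such a number format converter; the supported vocabulary here is illustrative, not QUALIFIER's actual coverage:

```python
UNITS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
         "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
         "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
         "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
         "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}
SCALES = {"thousand": 1_000, "million": 1_000_000}

def words_to_number(text):
    """Convert a spelled-out English number to an integer."""
    total, current = 0, 0
    for word in text.lower().replace("-", " ").split():
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current *= 100
        elif word in SCALES:          # close out a thousand/million group
            total += current * SCALES[word]
            current = 0
        # "and" and unknown words are skipped
    return total + current

assert words_to_number("one hundred eleven") == 111   # the slide's example
```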

Related Modules: Answer Justification
We generate axioms based on our manually constructed ontology. For example, for q1425 "What is the population of Maryland?" and the sentence "Maryland's population is 50,000 and growing rapidly.", the ontology axiom Maryland(c1) & population(c1, c2) constrains the value bound to c2. In this way we can reject the wrong answer "50,000", even though it is what the surface text states.
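A hedged sketch of such a justification check: reject a candidate population that falls outside a plausible range for a US state. The bounds are illustrative assumptions, not the actual axiom:

```python
def justify_population(candidate, low=100_000, high=50_000_000):
    """Ontology-style sanity check on a population answer; `low` and `high`
    are assumed illustrative bounds for a US state."""
    try:
        value = int(str(candidate).replace(",", ""))
    except ValueError:
        return False
    return low <= value <= high

assert not justify_population("50,000")   # rejected, as in the slide's example
```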

Factoid Results

Outline: Introduction | Factoid Subsystem | List Subsystem | Definition Subsystem | Results | Conclusion and Future Work

List System Overview

List Subsystem
- Multiple answers from the same paragraph
- Canonicalization resolution: map variants such as "the States", "USA", and "United States" to one unique answer
- Pattern-based answer extraction, using surface patterns such as (regex sketch below):
  - "…, … and …" + verb
  - "… include: …, …, …"
  - "list of …"
  - "top" + number + superlative adjective
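Hedged regex renderings of these surface patterns; QUALIFIER's actual patterns are not fully reproduced on the slide:

```python
import re

LIST_PATTERNS = [
    re.compile(r"([\w ,]+ and [\w ]+?) (?:are|were)\b"),  # "X, Y and Z" + verb
    re.compile(r"includes?:? ([^.]+)"),                   # "... include: X, Y, Z"
    re.compile(r"list of ([^.]+)"),                       # "list of ..."
    re.compile(r"top \d+ \w+(?:est|most) ([^.]+)"),       # "top" + number + superlative
]

def extract_list_answers(sentence):
    """Split the first matching enumeration into candidate answers."""
    for pattern in LIST_PATTERNS:
        m = pattern.search(sentence)
        if m:
            return [x.strip() for x in re.split(r",| and ", m.group(1)) if x.strip()]
    return []

# e.g. extract_list_answers("The members include: France, Germany and Italy.")
# -> ['France', 'Germany', 'Italy']
```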

List Results

Outline: Introduction | Factoid Subsystem | List Subsystem | Definition Subsystem | Results | Conclusion and Future Work

Definition System Overview

Definition Subsystem

Pre-processing
- document filter
- anaphora resolution
- splitting sentences into a "positive set" and a "negative set"
Sentence Ranking
- sentence weighting in the corpus
- sentence weighting on the Web
- an overall weighting combining the two
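The combination formula is elided on the slide; a linear interpolation is one plausible form, with the mixing parameter alpha an assumption:

```python
def overall_weight(corpus_w, web_w, alpha=0.5):
    """Combine corpus and Web sentence weights; an assumed interpolation,
    since the slide does not show the actual formula."""
    return alpha * corpus_w + (1 - alpha) * web_w
```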

Definition Subsystem: Answer Generation (Progressive Maximal Margin Relevance)
1. Order all sentences in descending order of weight.
2. Add the first sentence to the summary.
3. Examine the following sentences: if Weight(stc) - Weight(next_stc) > avg_sim(stc), add next_stc to the summary.
4. Repeat step 3 until the length limit of the target summary is reached.
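A direct transcription of this loop; `weight` and `avg_sim` are caller-supplied stand-ins for the paper's scoring and similarity functions, and the acceptance condition follows the slide as written:

```python
def progressive_mmr(sentences, weight, avg_sim, max_len):
    """Progressive Maximal Margin Relevance, transcribed from the slide.
    `weight(s)` is a sentence's rank score; `avg_sim(s)` its average
    similarity to the other sentences."""
    ranked = sorted(sentences, key=weight, reverse=True)        # step 1
    summary = [ranked[0]]                                       # step 2
    current = ranked[0]
    for nxt in ranked[1:]:                                      # step 3
        if weight(current) - weight(nxt) > avg_sim(current):    # condition as written
            summary.append(nxt)
            current = nxt
        if sum(len(s) for s in summary) >= max_len:             # step 4: length limit
            break
    return summary
```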

Definition Results We empirically set the length of the summary for People and Objects based on question classification results.

Outline: Introduction | Factoid Subsystem | List Subsystem | Definition Subsystem | Results | Conclusion and Future Work

Overall Performance

Conclusion and Future Work
Conclusion:
- Event-based question answering: the factoid and list subsystems explore the power of event-based QA
- Definition question answering combines IR and summarization
- An ontology boosts the performance of our NE and answer justification modules
Future Work:
- Give a formal proof of our QA event hypotheses
- Work towards an online question answering system
- Interactive QA
- Analysis and opinion questions
- VideoQA: question answering on news video