21/11/2002 — The Integration of Lexical Knowledge and External Resources for QA. Hui Yang, Tat-Seng Chua. PRIS, School of Computing, National University of Singapore.


The Integration of Lexical Knowledge and External Resources for QA
Hui Yang, Tat-Seng Chua
PRIS, School of Computing, National University of Singapore

Presentation Outline
- Introduction
- PRIS QA System Design
- Result and Analysis
- Conclusion
- Future Work

Open-Domain QA
Find answers to open-domain natural language questions by searching a large collection of documents.
- Question processing: may involve question reformulation; determines the expected answer type
- Query expansion: overcomes the concept mismatch between the query and the information base
- Search for candidate answers: documents, paragraphs, or sentences
- Disambiguation: ranking (or re-ranking) of answers; locating the exact answer

Current Research Trends
- Web-based QA: exploits Web redundancy; probabilistic algorithms
- Linguistic-based QA: part-of-speech tagging, syntactic parsing, semantic relations, named entity extraction, dictionaries (WordNet, etc.)

System Overview
1. Question Classification
2. Question Parsing
3. Query Formulation
4. Document Retrieval
5. Candidate Sentence Retrieval
6. Answer Extraction

[System architecture diagram: the question Q passes through Question Classification, Question Parsing, and Question Analysis; Query Formulation by External Knowledge (the Web and WordNet) expands the original content words into expanded content words; Document Retrieval fetches relevant TREC documents; Sentence Ranking selects candidate sentences; Answer Extraction produces the answer A. On failure, the system reduces the number of expanded content words and retries.]

Question Classification
Based on question focus and answer type.
7 main classes: HUM, LOC, TME, NUM, OBJ, DES, UNKNOWN
- E.g. "Which city is the capital of Canada?" (Q-class: LOC)
- E.g. "Which state is the capital of Canada in?" (Q-class: LOC)
54 sub-classes. E.g. under LOC (location), there are 14 sub-classes (with per-sub-class question counts):
LOC_PLANET: 1, LOC_CITY: 18, LOC_CONTINENT: 3, LOC_COUNTRY: 18, LOC_COUNTY: 3, LOC_STATE: 3, LOC_PROVINCE: 2, LOC_TOWN: 2, LOC_RIVER: 3, LOC_LAKE: 2, LOC_MOUNTAIN: 1, LOC_OCEAN: 2, LOC_ISLAND: 3, LOC_BASIC: 3
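A minimal sketch of the two-level classification idea on this slide. The keyword lists and trigger words below are illustrative assumptions, not the system's actual rules (the real classifier covers 7 main classes and 54 sub-classes with much richer features):

```python
# Hypothetical trigger/focus word tables -- NOT the paper's actual rules.
MAIN_CLASSES = {
    "who": "HUM", "where": "LOC", "when": "TME",
    "how many": "NUM", "what": "OBJ", "why": "DES",
}

# A few LOC sub-classes keyed on the question focus word.
LOC_SUBCLASSES = {
    "city": "LOC_CITY", "country": "LOC_COUNTRY", "state": "LOC_STATE",
    "river": "LOC_RIVER", "town": "LOC_TOWN", "mountain": "LOC_MOUNTAIN",
}

def classify(question: str) -> str:
    q = question.lower()
    # Focus-word check first: "Which city ..." is LOC_CITY even without "where".
    for focus, subclass in LOC_SUBCLASSES.items():
        if focus in q:
            return subclass
    for trigger, main in MAIN_CLASSES.items():
        if q.startswith(trigger):
            return main
    return "UNKNOWN"

print(classify("Which city is the capital of Canada?"))  # LOC_CITY
print(classify("Who is Tom Cruise married to?"))         # HUM
```

A real system would use the parsed question focus rather than substring matching, but the two-level main-class/sub-class output is the same shape.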

Question Parsing
Content words q(0): nouns, adjectives, numbers, some verbs.
- E.g. "What mythical Scottish town appears for one day every 100 years?" (Q-class: LOC_TOWN)
  q(0) = (mythical, Scottish, town, appears, one, day, 100, years)
Base noun phrases n: n = ("mythical Scottish town")
Head of the first noun phrase h: h = (town)
Quotation words u:
- E.g. What was the original name before "The Star Spangled Banner"?
  u = ("The Star Spangled Banner")
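The content-word and quotation-word extraction can be sketched as below. The stopword list is an assumption standing in for the slide's POS-based selection (nouns, adjectives, numbers, some verbs), and noun-phrase chunking is omitted:

```python
import re

# Assumed stopword list -- the real system selects content words by POS tag.
STOPWORDS = {"what", "which", "who", "is", "the", "a", "an", "of",
             "for", "every", "in", "to", "was", "before"}

def parse_question(question: str) -> dict:
    # Quotation words u: anything inside double quotes.
    quotes = re.findall(r'"([^"]+)"', question)
    # Content words q(0): remaining tokens after dropping stopwords.
    tokens = re.findall(r"[A-Za-z0-9]+", question)
    content = [t for t in tokens if t.lower() not in STOPWORDS]
    return {"content_words": content, "quotation_words": quotes}

parsed = parse_question(
    'What mythical Scottish town appears for one day every 100 years?')
print(parsed["content_words"])
# ['mythical', 'Scottish', 'town', 'appears', 'one', 'day', '100', 'years']
```

This reproduces the slide's q(0) for the example question; a real implementation would also return the base noun phrases n and the head h of the first noun phrase.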

Query Formulation I
- Use the original content words as a query to search the Web (e.g. Google)
- Find new terms that are highly correlated with the original query
- Use WordNet to find the synsets and glosses of the original query terms
- Rank new query terms based on both the Web and WordNet
- Form a new Boolean query

Query Formulation II
Original query q(0) = (q1(0), q2(0), ..., qk(0)).
Use the Web as a generalized resource:
- From q(0), retrieve the top N documents
- For each qi(0) in q(0), extract nearby non-trivial words (in the same sentence, or at most n words away) to get wi
- Rank each wik in wi by its probability of correlation with qi(0):

  Prob(wik) = #instances(wik AND qi(0)) / #instances(wik OR qi(0))

- Merge all wi to form the candidate set Cq for q(0)
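The correlation score above can be sketched as follows, counting co-occurrence over sentences of a toy corpus (the real system counts instances over contexts extracted from top Web documents):

```python
# Sketch of the slide's correlation probability:
# Prob(w) = (# units containing both w and the query term)
#         / (# units containing either).
def correlation(term, query_term, sentences):
    both = either = 0
    for s in sentences:
        words = set(s.lower().split())
        has_t, has_q = term in words, query_term in words
        if has_t and has_q:
            both += 1
        if has_t or has_q:
            either += 1
    return both / either if either else 0.0

# Toy stand-in for the top-N retrieved Web documents.
docs = [
    "the capital of canada is ottawa",
    "ottawa is located on the ottawa river",
    "toronto is the largest city in canada",
]
print(correlation("ottawa", "canada", docs))  # 1 co-occurrence / 3 -> 0.333...
```

Note this is the Jaccard-style ratio the formula describes: terms that appear mostly alongside the query term score near 1, terms that appear mostly elsewhere score near 0.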

Query Formulation III
Use WordNet as a generalized resource:
- For each qi(0) in q(0), extract terms lexically related to qi(0) by locating them in its gloss Gi and synset Si
- For q(0) as a whole, this gives Gq and Sq
- Re-rank each wik in wi by considering lexical relations: for each wik in Cq, if wik is in Gi its weight increases by α; if wik is in Si its weight increases by β, where 0 < α < β < 1
- Get q(1) = q(0) + {top m terms from Cq}
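The re-ranking step can be sketched as below. The boost constants and the hand-written gloss/synset sets are illustrative assumptions; a real implementation would look both up in WordNet and tune the constants:

```python
# Assumed boost constants with 0 < ALPHA < BETA < 1, per the slide.
ALPHA, BETA = 0.2, 0.4

def rerank(candidates, gloss_terms, synset_terms):
    """Boost Web-correlation scores of candidate expansion terms that
    also appear in a query term's WordNet gloss or synset."""
    reranked = {}
    for term, score in candidates.items():
        if term in synset_terms:
            score += BETA    # stronger lexical evidence: same synset
        elif term in gloss_terms:
            score += ALPHA   # weaker evidence: appears in the gloss
        reranked[term] = score
    return dict(sorted(reranked.items(), key=lambda kv: -kv[1]))

# Toy candidates from the Web step, with hand-written WordNet stand-ins.
candidates = {"ottawa": 0.33, "hockey": 0.30, "metropolis": 0.25}
gloss = {"ottawa"}         # e.g. appears in a gloss of a query term
synsets = {"metropolis"}   # e.g. a synonym of "city"
print(rerank(candidates, gloss, synsets))
```

With these numbers, "metropolis" (synset match) overtakes "ottawa" (gloss match), which in turn stays ahead of "hockey" (no lexical relation); the top m survivors form q(1).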

Document Retrieval
- Collection: 1,033,461 documents from the AP newswire, New York Times newswire, and Xinhua News Agency
- Retrieval engine: the MG toolkit
- Boolean search retrieves the top N documents (N = 50): for all tk in q(1), Q = (t1 AND t2 AND ... AND tn)
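The Boolean conjunction Q = (t1 AND t2 AND ... AND tn) can be illustrated as below. This is a toy in-memory scan; the actual system runs the query through the MG toolkit's inverted index over the ~1M newswire documents:

```python
def boolean_and(terms, documents):
    """Return ids of documents containing every query term (AND semantics)."""
    hits = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        if all(t.lower() in words for t in terms):
            hits.append(doc_id)
    return hits

# Toy corpus; ids loosely mimic the newswire sources named on the slide.
corpus = {
    "AP-1":  "ottawa named capital of canada",
    "NYT-2": "canada wins hockey gold",
    "XIN-3": "the capital of canada is ottawa",
}
print(boolean_and(["capital", "Canada", "ottawa"], corpus))  # ['AP-1', 'XIN-3']
```

The strictness of this conjunction is what the later result analysis blames for low recall: one missing term excludes a document entirely, which motivates the relaxation loop in Answer Extraction II.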

Candidate Sentence Retrieval
For each sentence Sentj in the top N documents, match against:
- Quotation words: Wuj = % of term overlap between u and Sentj
- Noun phrases: Wnj = % of phrase overlap between n and Sentj
- Head of first noun phrase: Whj = 1 if there is a match, 0 otherwise
- Original content words: Wcj = % of term overlap between q(0) and Sentj
- Expanded content words: Wej = % of term overlap between q(1-0) and Sentj, where q(1-0) = q(1) - q(0)
Final score: Scorej = Σi (αi × Wij), where Σi αi = 1 and Wij ∈ {Wuj, Wnj, Whj, Wcj, Wej}
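The weighted-sum score can be sketched as below with two of the five features; the weights are illustrative assumptions (the paper sets its own αi, summing to 1), and the quotation, noun-phrase, and head-match features are omitted for brevity:

```python
def overlap(question_terms, sentence_tokens):
    """Fraction of question-side terms that appear in the sentence."""
    a = set(w.lower() for w in question_terms)
    b = set(w.lower() for w in sentence_tokens)
    return len(a & b) / len(a) if a else 0.0

def sentence_score(sentence, question, weights):
    sent = sentence.split()
    features = {
        "content":  overlap(question["content_words"], sent),   # Wcj
        "expanded": overlap(question["expanded_words"], sent),  # Wej
    }
    # Score_j = sum_i alpha_i * W_ij
    return sum(weights[k] * v for k, v in features.items())

q = {"content_words": ["capital", "Canada"], "expanded_words": ["Ottawa"]}
w = {"content": 0.7, "expanded": 0.3}  # assumed alphas, summing to 1
print(sentence_score("Ottawa is the capital of Canada", q, w))
```

A sentence matching all original and expanded terms scores the maximum (here 1.0); sentences matching only expanded terms are discounted by the smaller weight.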

Answer Extraction I
- Fine-grained NE tagging of the top K sentences
- For each sentence, extract the string whose NE tag matches the question class
E.g. "Who is Tom Cruise married to?" (Q-class: HUM_BASIC)
Top-ranked candidate sentence: "Actor Tom Cruise and his wife Nicole Kidman accepted ``substantial'' libel damages from a newspaper that reported he was gay and that their marriage was a sham to cover it up."
Answer string: Nicole Kidman

Answer Extraction II
- For some questions no answer is found at first: reduce the number of expanded query terms and repeat Document Retrieval, Candidate Sentence Retrieval, and Answer Extraction
- The whole process runs for up to N iterations (N = 5)
- If no exact answer is found after that, NIL is returned as the answer
- This increases recall step by step while preserving precision
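The successive-constraint-relaxation loop can be sketched as follows. `retrieve_and_extract` is a hypothetical stand-in for the whole retrieval + sentence-ranking + extraction pipeline, and dropping the last expanded term per round is one simple relaxation policy:

```python
def answer_with_relaxation(original_terms, expanded_terms,
                           retrieve_and_extract, max_rounds=5):
    """Retry the pipeline with progressively fewer expanded terms;
    return 'NIL' if no round produces an answer (N = 5 on the slide)."""
    extra = list(expanded_terms)
    for _ in range(max_rounds):
        answer = retrieve_and_extract(original_terms + extra)
        if answer is not None:
            return answer
        if extra:
            extra.pop()  # relax the Boolean query by one expanded term
    return "NIL"

# Toy pipeline: succeeds only once the over-specific term "regatta" is gone.
def toy_pipeline(terms):
    return "Ottawa" if "regatta" not in terms else None

print(answer_with_relaxation(["capital", "Canada"], ["regatta"], toy_pipeline))
# Ottawa
```

Each relaxation widens the strict Boolean conjunction, which is exactly the step-by-step recall increase the slide describes.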

Evaluation in TREC 2002
Metric: confidence-weighted score (uninterpolated average precision over the confidence-ranked question list):
  Score = (1/500) × Σ for i = 1 to 500 of (#correct up to question i) / i
We answered 290 of the 500 questions correctly, for a score of 0.61.
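The metric can be computed as below: with answers sorted most-confident first, each rank i contributes the precision over the first i answers, so correct answers placed early count much more than correct answers placed late. The 4-question run here is a toy example, not TREC data:

```python
def confidence_weighted_score(correct_flags):
    """TREC 2002 confidence-weighted score over answers sorted by
    system confidence (most confident first)."""
    total, running_correct = 0.0, 0
    for i, is_correct in enumerate(correct_flags, start=1):
        running_correct += is_correct           # bool counts as 0/1
        total += running_correct / i            # precision at rank i
    return total / len(correct_flags)

# Toy run: right, right, wrong, right -> (1 + 1 + 2/3 + 3/4) / 4
print(confidence_weighted_score([True, True, False, True]))
```

Swapping the wrong answer to rank 1 would lower the score, which is why systems that rank their confident answers well do better under this metric than raw accuracy suggests.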

Result Analysis I
[Chart not preserved in transcript; its labels were "Num of Runs with Correct Answers", "Num Q with Correct Answer", "Total Num Q", and "Our Num of Q", comparing per question how many TREC runs answered correctly against this system's coverage.]

Result Analysis II
Recognizing no-answer (NIL) questions:
- Precision: 41 / 170 = 0.24
- Recall: 41 / 46 = 0.89
Non-NIL answers:
- Precision: 249 / 330 = 0.75
- Recall: 249 / 444 = 0.56
Overall recall is low compared to precision because the Boolean search is strict.

Result Analysis III
[Slide content (figure/table) not preserved in the transcript.]

Conclusion
- Integration of both lexical knowledge and external resources
- Detailed question classification
- Use of fine-grained named entities for question answering
- Successive constraint relaxation

Future Work
- Refine term correlation by combining local context, global context, and lexical correlations
- Explore the structured use of external knowledge using the semantic perceptron net
- Develop template-based answer selection
- Longer-term research plan: interactive QA, analysis and opinion questions

Thank You!