Slide 1/20: National University of Singapore at the TREC-13 Question Answering Main Task
Hang Cui, Keya Li, Renxu Sun, Tat-Seng Chua, Min-Yen Kan
{cuihang, likeya, sunrenxu, chuats,

Slide 2/20: System Architecture
Pipeline components:
– Question Analysis
– Topic Analysis and Document Retrieval
– Passage Retrieval using Query Expansion with Google Snippets
– Answer Extraction using Approximate Dependency Relation Matching
– Definition Generation with Soft Patterns

Slide 3/20: What's New This Year
Approximate matching of grammatical dependency relations for answer extraction.
Soft matching patterns for identifying definition sentences.
– See [Cui et al., 2004a] and [Cui et al., 2004b].
Exploiting definitions to answer factoid and list questions.

Slide 4/20: Outline
System architecture
New features in TREC-13 QA Main Task
– Approximate Dependency Relation Matching for Answer Extraction
– Soft Matching Patterns for Definition Generation
– Definition Sentences in Answering Topically-Related Factoid/List Questions
Conclusion

Slide 5/20: Dependency Relation Matching in QA
Why consider dependency relations?
– NE-based answer extraction has an upper bound of 70% (Light et al., 2001).
– Many NEs of the same type often appear close to each other.
– Some questions have no NE-type target, e.g., "What does AARP stand for?"
Tried before:
– The PIQASso and MIT systems have applied dependency relations in QA.
– However, they performed poorly due to low recall, because they used exact matching of relations to extract answers directly.

Slide 6/20: Extracting Dependency Relation Triples
Minipar-based dependency parsing (Lin, 1998).
Relation triple: two anchor words and the relation between them.
– E.g., (on, pcomp-n, desk) for "on the desk".
Relation path: the chain of relations connecting two words.
– E.g., the path of relations between "desk" and "floor" in "on the desk at the fourth floor" (see the sketch below).
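Minipar itself is no longer easy to obtain, so the following is a minimal sketch of relation-path extraction that uses spaCy as a stand-in dependency parser. spaCy's relation labels differ from Minipar's, so this illustrates the data structures rather than the exact labels used in the system.

```python
from collections import deque

import spacy  # stand-in parser; the system itself used Minipar

nlp = spacy.load("en_core_web_sm")

def relation_path(doc, start, end):
    """Return the dependency relation labels on the (undirected)
    path between two tokens, or None if they are disconnected."""
    # Build an adjacency list over the dependency tree; each edge
    # carries the dependent token's relation label.
    edges = {}
    for tok in doc:
        if tok.head is not tok:  # skip the root's self-loop
            edges.setdefault(tok.i, []).append((tok.head.i, tok.dep_))
            edges.setdefault(tok.head.i, []).append((tok.i, tok.dep_))
    queue = deque([(start.i, [])])
    seen = {start.i}
    while queue:
        node, path = queue.popleft()
        if node == end.i:
            return path
        for nbr, rel in edges.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, path + [rel]))
    return None

doc = nlp("Benedict Arnold's plot to surrender West Point to the British")
# Path between "Arnold" and "Point", e.g. ['poss', 'acl', 'dobj']
print(relation_path(doc, doc[1], doc[7]))
```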

Slide 7/20: Examples of Relation Triples
Q: What American revolutionary general turned over West Point to the British?
– q1) general (subj, obj) West Point
– q2) West Point (mod, pcomp-n) British
A: "… Benedict Arnold's plot to surrender West Point to the British …"
– s1) Benedict Arnold (poss, s, obj) West Point
– s2) West Point (mod, pcomp-n) British
The question path q1 and the corresponding sentence path s1 connect the same anchors through different relations, so in most cases correct answers cannot be extracted by exact matching of relations.

Slide 8/20: Learning Relation Similarity
We need a measure of the similarity between two different relation paths.
We adopt a statistical method that learns similarity from past QA pairs.
Training data preparation:
– Around 1,000 factoid question-answer pairs from the past two years' TREC QA tasks.
– Extracting all relation paths between all non-trivial words yields 2,557 path pairs.
– The paths are aligned according to identical anchor nodes (see the sketch below).
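A minimal sketch of the anchor-based alignment step, under the assumption that each path is keyed by its pair of anchor words; this data layout is illustrative, not the system's actual one.

```python
def align_training_paths(question_paths, answer_paths):
    """Pair a question path with an answer path when both connect the
    same two anchor words; the resulting path pairs are the training
    data for learning relation similarities.
    Both arguments map (anchor1, anchor2) -> list of relation labels."""
    return [(q_path, answer_paths[anchors])
            for anchors, q_path in question_paths.items()
            if anchors in answer_paths]

# Toy usage with paths from the West Point example:
q = {("West Point", "British"): ["mod", "pcomp-n"]}
a = {("West Point", "British"): ["mod", "pcomp-n"],
     ("Benedict Arnold", "West Point"): ["poss", "s", "obj"]}
print(align_training_paths(q, a))  # one aligned path pair
```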

Slide 9/20: Using Mutual Information to Measure Relation Co-occurrence
The similarity of two relations is measured by how often they co-occur in aligned question and answer paths.
We use a variation of mutual information (MI):
– α is the reciprocal of the summed lengths of the two relation paths; it discounts the score of two relations that appear together only in long paths.
Sample learned similarities:

Relation-1   Relation-2   Similarity
whn          pcomp-n      0.43
whn          i            0.42
i            pcomp-n      0.39
i            s            0.37
pred         mod          0.37
appo         vrel         0.35
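The MI formula itself did not survive the transcript; the following is a hedged reconstruction from the slide's verbal description (co-occurrence counts discounted by α over all aligned path pairs), not necessarily the paper's exact equation.

```latex
% Hedged reconstruction of the MI variant described on the slide:
% f(rel) is a relation's total frequency over the training paths, and
% each co-occurrence of rel_Q and rel_S in an aligned path pair
% (P_Q, P_S) is discounted by alpha, the reciprocal of the summed
% path lengths.
\[
  \operatorname{sim}(rel_Q, rel_S)
    = \log \frac{\sum_{(P_Q,\,P_S)} \alpha \cdot
                 \mathbb{1}[rel_Q \in P_Q]\; \mathbb{1}[rel_S \in P_S]}
                {f(rel_Q)\, f(rel_S)},
  \qquad
  \alpha = \frac{1}{|P_Q| + |P_S|}
\]
```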

Slide 10/20: Measuring Path Similarity (1)
We adopt two methods that compute path similarity with different relation alignment strategies.
Option 1, Total Path Matching: ignore the anchor words of the relations along the given paths.
– A path consists of only a list of relations; no relation context (anchor words) is considered.
– Relations are aligned by permuting all possibilities.
– We adopt IBM's Model 1 for statistical translation to score the alignments (see the sketch below).
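A minimal sketch of Model 1 style total path matching, assuming a learned similarity table rel_sim; the system's exact normalization and smoothing are not given on the slide.

```python
import math

def path_match_score(q_path, s_path, rel_sim):
    """Model 1 style total path matching: every question relation is
    softly aligned to every sentence relation, the learned
    similarities are averaged, and the per-relation scores combine
    in log space. rel_sim maps (question_relation, sentence_relation)
    pairs to the MI-based similarity learned from past QA pairs."""
    score = 0.0
    for q_rel in q_path:
        aligned = sum(rel_sim.get((q_rel, s_rel), 0.0) for s_rel in s_path)
        score += math.log(aligned / len(s_path) + 1e-9)  # smooth zero counts
    return score

# Toy usage with similarities from the table two slides back:
rel_sim = {("whn", "pcomp-n"): 0.43, ("whn", "i"): 0.42,
           ("i", "pcomp-n"): 0.39, ("i", "s"): 0.37}
print(path_match_score(["whn", "i"], ["pcomp-n", "s"], rel_sim))
```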

Slide 11/20: Measuring Path Similarity (2)
Option 2, Triple Matching: consider the anchor words of the relations along a path.
– A path consists of a list of relations together with their anchor words.
– Requires matching of relation context (anchor words): only relations whose words match are counted.
– A stricter match in relation alignment (see the sketch below).
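A companion sketch for triple matching, again assuming the rel_sim table above; only relation pairs whose anchor words match contribute to the score.

```python
def triple_match_score(q_triples, s_triples, rel_sim):
    """Triple matching sketch: each triple is (word1, relation, word2);
    a question relation is aligned to a sentence relation only when
    both anchor words match, which makes the alignment much stricter
    than total path matching."""
    score = 0.0
    for q_w1, q_rel, q_w2 in q_triples:
        for s_w1, s_rel, s_w2 in s_triples:
            if q_w1 == s_w1 and q_w2 == s_w2:  # anchor words must match
                score += rel_sim.get((q_rel, s_rel), 0.0)
    return score
```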

Slide 12/20: Selecting Answer Strings Statistically
Use the top 50 ranked sentences from the passage retrieval module for answer extraction.
Evaluate the similarity between the question's paths (from the question target to the other question terms) and the corresponding paths from each answer candidate to the matched terms in the sentence.
For non-NE questions, evaluate all noun/verb phrases as answer candidates (see the sketch below).
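A minimal candidate-ranking sketch on top of the path scorers above; the data layout (one sentence path per matched question term) is an assumption for illustration.

```python
def rank_candidates(question_paths, candidates, score_fn):
    """Score each answer candidate by summing the similarities between
    the question's paths and the candidate's corresponding sentence
    paths, then return the candidates best-first. score_fn is a
    path-pair scorer, e.g.
    functools.partial(path_match_score, rel_sim=rel_sim)."""
    scored = []
    for answer, sentence_paths in candidates.items():
        total = sum(score_fn(q_path, s_path)
                    for q_path, s_path in zip(question_paths, sentence_paths))
        scored.append((total, answer))
    return [answer for _, answer in
            sorted(scored, key=lambda t: t[0], reverse=True)]
```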

Slide 13/20: Discussion of Evaluation Results
Approximate relation matching outperforms our previous answer extraction technique:
– 22% improvement over all questions.
– 45% improvement on non-NE questions (69 out of the 230 questions).
The two path similarity measures make no obvious difference:
– Total Path Matching performs slightly better than Triple Matching.
– Minipar cannot resolve long-distance dependencies well.
(Table: overall average accuracy of the Baseline, NUSCHUA1, and NUSCHUA2 runs, broken down into questions with and without NE-typed targets; the accuracy values were lost in the transcript.)

Slide 14/20: Outline
System architecture
New experiments in TREC-13 QA Main Task
– Approximate Dependency Relation Matching for Answer Extraction
– Soft Matching Patterns for Definition Generation
– Definition Sentences in Answering Topically-Related Factoid/List Questions
Conclusion

Slide 15/20: Question Typing and Passage Retrieval for Factoid/List Questions
Question typing:
– Leverages our past question typology and rule-based question typing module.
– The whole TREC corpus is tagged offline with our rule-based named entity tagger.
Passage retrieval, over two sources:
– The topic-relevant document set from the document retrieval module (runs NUSCHUA1 and NUSCHUA2).
– The definition sentences generated for a specific topic by the definition generation module (run NUSCHUA3), with question-specific wrappers applied to the definitions.

Slide 16/20: Exploiting Definition Sentences to Answer Factoid/List Questions
For factoid/list questions, passage retrieval is conducted over the definition sentences about the topic.
– Much more efficient, thanks to the smaller search space.
– Average accuracy of 0.50, lower than retrieval over all topic-related documents, for two reasons:
– Low recall: a cut-off is imposed when selecting definition sentences (a naive use of definitions).
– Some sentences that answer factoid/list questions are not definition sentences.

Slide 17/20: Exploiting Definitions from External Knowledge
Pre-compiled wrappers extract specific fields of information for list questions:
– Works, product names, and person titles.
– Extraction runs on both the generated definition sentences and existing external definitions, with cross-validation between the two sources (see the sketch below).
– Achieves an F-measure of 0.81 on the 8 list questions about works.
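The slides do not show the actual pre-compiled wrappers, so the patterns below are hypothetical regex stand-ins; only the cross-validation idea (keep titles confirmed by both sources) comes from the slide.

```python
import re

# Hypothetical wrapper patterns for list questions about works; the
# system's real wrappers are not shown in the slides.
WORK_PATTERNS = [
    re.compile(r'(?:wrote|authored|directed|composed)\s+"([^"]+)"'),
    re.compile(r'(?:his|her|its)\s+(?:novel|film|song|book)\s+"([^"]+)"'),
]

def extract_works(generated_defs, external_defs):
    """Apply the wrappers to both the generated definition sentences
    and the existing external definitions, then cross-validate by
    keeping only titles found in both sources."""
    def titles(sentences):
        found = set()
        for sent in sentences:
            for pattern in WORK_PATTERNS:
                found.update(pattern.findall(sent))
        return found
    return titles(generated_defs) & titles(external_defs)
```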

Slide 18/20: Outline
System architecture
New experiments in TREC-13 QA Main Task
– Approximate Dependency Relation Matching for Answer Extraction
– Soft Matching Patterns for Definition Generation
– Definition Sentences in Answering Topically-Related Factoid/List Questions
Conclusion

Slide 19/20: Conclusion
Approximate relation matching for answer extraction:
– Still struggles with difficult questions.
– The dependency relation alignment problem remains: words often cannot be matched because of linguistic variation.
– Semantic matching of words/phrases is needed alongside relation matching.
More effective use of topic-related sentences is needed for answering factoid/list questions.

Slide 20/20: Q & A
Thanks!

Slide 21/20: A Question Example
Topic #14: Horus
– Q1: Horus is the god of what?
Candidate sentences:
1. Osiris, the god of the underworld, his wife, Isis, the goddess of fertility, and their son, Horus, were worshiped by ancient Egyptians.
2. The mummified hawk probably was dedicated to one of several gods associated with falcons, such as the sky god Horus, the war god Montu and the sun god Re.
3. The stolen pieces included stones from the entrances of tombs and a statue of the god Horus, who was half-man, half-falcon.
– The question has no explicit target.
– Relying on keyword matching or density-based answer extraction may therefore lead to a wrong answer.