Question Answering Passage Retrieval Using Dependency Parsing, Hang Cui, August 17, 2005

Presentation transcript:

Question Answering Passage Retrieval Using Dependency Parsing
Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan, Tat-Seng Chua
Department of Computer Science, National University of Singapore
August 17, 2005

Passage Retrieval in Question Answering
QA system: Document Retrieval, Passage Retrieval, Answer Extraction
Passage retrieval
  – To narrow down the search scope
  – Can answer questions with more context
Lexical density based
  – Distance between question words

Density Based Passage Retrieval Method
However, density based methods can err when relationships between matched words differ.
Question: What percent of the nation's cheese does Wisconsin produce?
Incorrect: … the number of consumers who mention California when asked about cheese has risen by 14 percent, while the number specifying Wisconsin has dropped 16 percent.
Incorrect: The wry “It's the Cheese” ads, which attribute California's allure to its cheese _ and indulge in an occasional dig at the Wisconsin stuff'' … sales of cheese in California grew three times as fast as sales in the nation as a whole 3.7 percent compared to 1.2 percent, …
Incorrect: Awareness of the Real California Cheese logo, which appears on about 95 percent of California cheeses, has also made strides.
Correct: In Wisconsin, where farmers produce roughly 28 percent of the nation's cheese, the outrage is palpable.

Our Solution
Examine the relationship between words
  – Dependency relations
Exact match of relations for answer extraction
  – Has low recall because the same relations are often phrased differently
Fuzzy match of dependency relations
  – Statistical similarity of relations

Measuring Sentence Similarity
Sim(Sent1, Sent2) = ?
Matched words between Sentence 1 and Sentence 2: lexical matching
Similarity of relations between matched words
+ Similarity of individual relations

Outline
Extracting and Pairing Relation Paths
Measuring Path Match Scores
Learning Relation Mapping Scores
Evaluations
Conclusions

What Dependency Parsing is Like
Minipar (Lin, 1998) for dependency parsing
Dependency tree
  – Nodes: words/chunks in the sentence
  – Edges (ignoring the direction): labeled by relation types
Example: What percent of the nation's cheese does Wisconsin produce?

Extracting Relation Paths
Relation path
  – Vector of relations between two nodes in the tree
Example nodes from the parse tree figure: produce, Wisconsin, percent, cheese
Two constraints for relation paths:
1. Path length (less than 7 relations)
2. Ignore paths between two words that are within a chunk, e.g. New York.
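As a rough illustration of the path definition and the length constraint, the sketch below walks an undirected dependency tree with breadth-first search. It assumes the parse is already available as (head, relation, dependent) triples and that chunks such as "New York" are already merged into single nodes, so the second constraint holds by construction. The function name `relation_path`, the edge list, and the relation labels are hypothetical and are not Minipar's actual output.

```python
from collections import deque

def relation_path(edges, start, goal, max_len=6):
    """Return the relation labels on the tree path from start to goal.

    edges: iterable of (head, relation, dependent) triples; direction is ignored.
    Returns None if there is no path or it has more than max_len relations
    (max_len=6 encodes "less than 7 relations").
    """
    graph = {}
    for head, rel, dep in edges:
        graph.setdefault(head, []).append((dep, rel))
        graph.setdefault(dep, []).append((head, rel))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path if len(path) <= max_len else None
        for neighbour, rel in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [rel]))
    return None

# Hypothetical parse of the example question; labels are illustrative only.
toy_edges = [
    ("produce", "subj", "Wisconsin"),
    ("produce", "obj", "percent"),
    ("percent", "mod", "of"),
    ("of", "pcomp-n", "cheese"),
    ("cheese", "gen", "nation"),
]
print(relation_path(toy_edges, "Wisconsin", "cheese"))
```

Because a tree has exactly one path between any two nodes, the first path BFS finds is the relation path; the length check only decides whether to keep it.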

Paired Paths from Question and Answer
Question: What percent of the nation's cheese does Wisconsin produce?
Answer sentence: In Wisconsin, where farmers produce roughly 28 percent of the nation's cheese, the outrage is palpable.
Paired relation paths:
$\mathrm{Sim}_{Rel}(Q, Sent) = \sum_{i,j} \mathrm{Sim}(P_i(Q), P_j(Sent))$
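The summation over paired paths can be sketched as follows. The pairing rule used here (a question path and a sentence path are paired when they connect the same two matched words) is an assumption read off the slide, and the names `pair_paths`, `sim_rel`, `jaccard`, and the toy paths are hypothetical. The path similarity is passed in as a function, so any scoring scheme can be plugged in; the Jaccard overlap below is only a stand-in.

```python
def pair_paths(question_paths, sentence_paths):
    """Pair paths that link the same two matched words.

    Both arguments map frozenset({word_a, word_b}) -> list of relation labels.
    """
    return [
        (question_paths[key], sentence_paths[key])
        for key in question_paths
        if key in sentence_paths
    ]

def sim_rel(question_paths, sentence_paths, path_sim):
    """Sum the similarity of every paired (question path, sentence path)."""
    return sum(path_sim(q, s) for q, s in pair_paths(question_paths, sentence_paths))

def jaccard(q_path, s_path):
    """Stand-in path similarity: overlap of the two label sets."""
    a, b = set(q_path), set(s_path)
    return len(a & b) / len(a | b)

# Hypothetical paths for the cheese example; labels are illustrative only.
q_paths = {
    frozenset({"Wisconsin", "produce"}): ["subj"],
    frozenset({"produce", "percent"}): ["obj"],
    frozenset({"percent", "cheese"}): ["mod", "pcomp-n"],
}
s_paths = {
    frozenset({"Wisconsin", "produce"}): ["pcomp-n", "mod"],
    frozenset({"produce", "percent"}): ["obj"],
    frozenset({"percent", "cheese"}): ["mod", "pcomp-n"],
}
print(sim_rel(q_paths, s_paths, jaccard))
```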

Measuring Path Match Degree
Employ a variation of IBM Translation Model 1
  – Path match degree (similarity) as translation probability: $\mathrm{MatchScore}(P_Q, P_S) \rightarrow \mathrm{Prob}(P_S \mid P_Q)$
  – Relations as words
Why IBM Model 1?
  – No “word order”: bag of undirected relations
  – No need to estimate “target sentence length”: relation paths are determined by the parsing tree

Calculating Translation Probability (Similarity) of Paths
Given two relation paths from the question and a candidate sentence (equations not preserved in this transcript):
  – Consider the most probable alignment (finding the most probable mapped relations)
  – Take logarithm and ignore the constants (for all sentences, question path length is a constant)
MatchScores of paths are combined to give the sentence’s relevance to the question.
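Since the slide's equations did not survive, the following is only a sketch of the verbal steps, not the authors' exact formula. It assumes a table `mapping_score` keyed by (question relation, sentence relation) that plays the role of a word translation probability, aligns each sentence relation to its best-scoring question relation, and sums the logarithms. The function name, the `floor` value for unseen pairs, and the toy scores are all hypothetical.

```python
import math

def match_score(question_path, sentence_path, mapping_score, floor=1e-6):
    """Log-domain match score of sentence_path given question_path.

    Each sentence relation is aligned to the question relation with the
    highest mapping score (most probable alignment); constant factors that
    depend only on the question path length are dropped.
    """
    if not question_path or not sentence_path:
        return float("-inf")
    total = 0.0
    for s_rel in sentence_path:
        best = max(mapping_score.get((q_rel, s_rel), floor) for q_rel in question_path)
        total += math.log(best)
    return total

# Hypothetical mapping scores; identical relations map with score 1.
toy_scores = {
    ("subj", "subj"): 1.0,
    ("obj", "obj"): 1.0,
    ("obj", "pcomp-n"): 0.2,
}
print(match_score(["subj", "obj"], ["subj", "obj"], toy_scores))      # 0.0
print(match_score(["subj", "obj"], ["subj", "pcomp-n"], toy_scores))  # log(0.2)
```

Identical paths score 0.0 (the maximum), and every imperfect mapping pulls the score down, so longer sentence paths with weak mappings are penalized more.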

Training and Testing
Testing
  – Sim(Q, Sent) = ?
  – Prob(P_Sent | P_Q) = ? (similarity between relation vectors)
  – P(Rel(Sent) | Rel(Q)) = ? (similarity between individual relations), given by the relation mapping scores
Training
  – Q-A pairs, paired relation paths, relation mapping model, relation mapping scores
Relation mapping model
1. Mutual information (MI) based
2. Expectation Maximization (EM) based

Approach 1: MI Based
Measures bipartite co-occurrences in training path pairs
Accounts for path length (penalizes long paths)
Uses frequencies to approximate mutual information
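The slide's formula is not in the transcript, so the sketch below only mirrors the three stated properties: it counts bipartite co-occurrences of relations in training path pairs, down-weights pairs that come from long paths, and divides by relation frequencies as a crude stand-in for mutual information. The specific weight `1 / (len(q_path) + len(s_path))`, the normalization, and the name `mi_mapping_scores` are assumptions.

```python
from collections import Counter

def mi_mapping_scores(path_pairs):
    """Estimate a score for every (question_rel, sentence_rel) pair.

    path_pairs: list of (question_path, sentence_path), each a list of labels.
    """
    co_occurrence = Counter()
    q_freq = Counter()
    s_freq = Counter()
    for q_path, s_path in path_pairs:
        if not q_path or not s_path:
            continue
        weight = 1.0 / (len(q_path) + len(s_path))  # assumed length penalty
        q_freq.update(q_path)
        s_freq.update(s_path)
        for q_rel in set(q_path):
            for s_rel in set(s_path):
                co_occurrence[(q_rel, s_rel)] += weight
    # Frequency ratio used as an approximation of mutual information.
    return {
        (q_rel, s_rel): count / (q_freq[q_rel] * s_freq[s_rel])
        for (q_rel, s_rel), count in co_occurrence.items()
    }

# Hypothetical training path pairs.
toy_pairs = [
    (["subj", "obj"], ["subj", "obj"]),
    (["subj"], ["subj"]),
]
scores = mi_mapping_scores(toy_pairs)
print(scores[("subj", "subj")], scores[("subj", "obj")])
```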

Approach 2: EM Based
Employ the training method from IBM Model 1
  – Relation mapping scores = word translation probability
  – Utilize GIZA to accomplish training
  – Iteratively boosts the precision of relation translation probability
Initialization: assign 1 to identical relations and a small constant otherwise
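The talk uses GIZA for training; the sketch below is not GIZA but a small hand-written IBM Model 1 style EM loop, included only to show what the iteration does to the relation mapping scores. It uses the initialization described on the slide, omits the NULL word for brevity, and the names `train_em`, `small`, and the toy path pairs are hypothetical.

```python
from collections import defaultdict

def train_em(path_pairs, iterations=10, small=0.01):
    """Return t[(q_rel, s_rel)], an estimate of P(s_rel | q_rel).

    path_pairs: list of (question_path, sentence_path), each a list of labels.
    """
    q_vocab = {rel for q_path, _ in path_pairs for rel in q_path}
    s_vocab = {rel for _, s_path in path_pairs for rel in s_path}
    # Initialization: 1 for identical relations, a small constant otherwise.
    t = {(q, s): (1.0 if q == s else small) for q in q_vocab for s in s_vocab}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: distribute each sentence relation over the question relations.
        for q_path, s_path in path_pairs:
            for s_rel in s_path:
                norm = sum(t[(q_rel, s_rel)] for q_rel in q_path)
                for q_rel in q_path:
                    fraction = t[(q_rel, s_rel)] / norm
                    count[(q_rel, s_rel)] += fraction
                    total[q_rel] += fraction
        # M-step: renormalize the expected counts per question relation.
        t = {
            (q, s): (count[(q, s)] / total[q] if total[q] else 0.0)
            for (q, s) in t
        }
    return t

# Hypothetical training path pairs.
toy_pairs = [
    (["subj", "obj"], ["subj", "pcomp-n"]),
    (["subj"], ["subj"]),
    (["obj"], ["pcomp-n"]),
]
t = train_em(toy_pairs)
print(t[("obj", "pcomp-n")], t[("subj", "pcomp-n")])
```

On the toy data the mapping from `obj` to `pcomp-n` ends up much stronger than the one from `subj` to `pcomp-n`, even though both started at the same small constant.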

Outline
Extracting and Pairing Relation Paths
Measuring Path Match Scores
Learning Relation Mapping Scores
Evaluations
  – Can relation matching help?
  – Can fuzzy match perform better than exact match?
  – Can long questions benefit more?
  – Can relation matching work on top of query expansion?
Conclusions

Evaluation Setup
Training data
  – 3k corresponding path pairs from 10k QA pairs (TREC-8, 9)
Test data
  – 324 factoid questions from TREC-12 QA task
  – Passage retrieval on top 200 relevant documents by TREC

Comparison Systems
MITRE (baseline)
  – Stemmed word overlapping
  – Baseline in previous work on passage retrieval evaluation
SiteQ (top performing density based method)
  – Using a 3-sentence window
NUS
  – Similar to SiteQ, but using sentences as passages
Strict Matching of Relations
  – Simulates strict matching in previous work for answer selection
  – Counts the number of exactly matched paths
Relation matching is applied on top of MITRE and NUS

Evaluation Metrics
Mean reciprocal rank (MRR)
  – Mean, over all questions, of the reciprocal of the rank of the first correct answer in the returned rank list
  – On the top 20 returned passages
Percentage of questions with incorrect answers
Precision at the top one passage
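The three metrics can be computed as in the sketch below. It assumes each question comes with a ranked list of booleans marking whether the passage at that rank contains a correct answer, and it reads "incorrect" as "no correct passage within the top 20", which is an interpretation rather than something the slide states. The function name `evaluate` and the toy runs are hypothetical.

```python
def evaluate(ranked_correctness, top_k=20):
    """Compute MRR, percentage of incorrect questions, and precision at 1.

    ranked_correctness: one list of booleans per question, in rank order.
    """
    if not ranked_correctness:
        raise ValueError("need at least one question")
    reciprocal_sum = 0.0
    incorrect = 0
    correct_at_one = 0
    for flags in ranked_correctness:
        top = flags[:top_k]
        if True in top:
            rank = top.index(True) + 1  # ranks are 1-based
            reciprocal_sum += 1.0 / rank
            if rank == 1:
                correct_at_one += 1
        else:
            incorrect += 1  # contributes a reciprocal rank of 0
    n = len(ranked_correctness)
    return {
        "mrr": reciprocal_sum / n,
        "pct_incorrect": 100.0 * incorrect / n,
        "p_at_1": correct_at_one / n,
    }

# Four hypothetical questions: correct at rank 1, rank 2, never, rank 4.
toy_runs = [
    [True, False],
    [False, True],
    [False, False],
    [False, False, False, True],
]
print(evaluate(toy_runs))
# {'mrr': 0.4375, 'pct_incorrect': 25.0, 'p_at_1': 0.25}
```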

Performance Evaluation
All improvements are statistically significant (p<0.001)
MI and EM do not make much difference given our training data
  – EM needs more training data
  – MI is more susceptible to noise, so may not scale well
Fuzzy matching outperforms strict matching significantly.
Results table (rows: MRR; % MRR improvement over MITRE, SiteQ, NUS; % Incorrect; Precision at top one passage). Only the % Incorrect row survives in this transcript:
Passage retrieval system    % Incorrect
MITRE                       45.68%
SiteQ                       37.65%
NUS                         33.02%
Rel_Strict (MITRE)          41.96%
Rel_Strict (NUS)            32.41%
Rel_MI (MITRE)              29.63%
Rel_EM (MITRE)              29.32%
Rel_MI (NUS)                24.69%
Rel_EM (NUS)                24.07%

Performance Variation to Question Length
Long questions, with more paired paths, tend to improve more
  – Using the number of non-trivial question terms to approximate question length

Error Analysis
Mismatch of question terms
  – e.g. In which city is the River Seine
  – Introduce question analysis
Paraphrasing between the question and the answer sentence
  – e.g. write the book → be the author of the book
  – Most current techniques fail to handle it
  – Finding paraphrases via dependency parsing (Lin and Pantel)

Performance on Top of Query Expansion
On top of query expansion, fuzzy relation matching brings a further 50% improvement
However
  – Query expansion doesn’t help much on a fuzzy relation matching system
  – Expansion terms do not help in pairing relation paths
Results table (columns: NUS (baseline), NUS+QE, Rel_MI (NUS+QE), Rel_EM (NUS+QE); Rel_EM (NUS) is also shown for reference). Values that survive in this transcript:
  – MRR, % improvement over baseline: +23.00% (NUS+QE), +83.94% (Rel_MI (NUS+QE)), +84.35% (Rel_EM (NUS+QE))
  – % MRR improvement over NUS+QE: N/A, …, +49.86%
  – % Incorrect: 33.02%, 28.40%, 22.22%
  – Precision at top one passage: values not preserved

Conclusions
Proposed a novel fuzzy relation matching method for factoid QA passage retrieval
  – Brings dramatic 70%+ improvement over the state-of-the-art systems
  – Brings further 50% improvement over query expansion
  – Future QA systems should bring in relations between words for better performance
Query expansion should be integrated into relation matching seamlessly

Q & A
Thanks!