Predicting Answer Location Using Shallow Semantic Analogical Reasoning in a Factoid Question Answering System
Hapnes Toba, Mirna Adriani, and Ruli Manurung


1 Predicting Answer Location Using Shallow Semantic Analogical Reasoning in a Factoid Question Answering System
Hapnes Toba, Mirna Adriani, and Ruli Manurung
Faculty of Computer Science, Universitas Indonesia

2 What is a QAS
Question answering system (QAS):
– Input: a natural language question
– Output: a single answer

3 What is a Factoid QAS
Factoid QAS:
– Input: an open-domain fact-based question
– Output: an answer
– E.g.:
Question: “Where was an Oviraptor fossil sitting on a nest discovered?”
Answer: ‘Mongolia’s Gobi Desert’

4 A Typical Pipeline Architecture of a Factoid QAS
Question analysis → Query formulation → Information retrieval → Answer selection
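A minimal sketch of this four-stage pipeline, assuming toy stub implementations for each stage (the stage functions and the single hard-coded passage are illustrative, not the authors' components):

```python
def analyze_question(question):
    # Question analysis: decide the expected answer type (EAT).
    return "LOCATION" if question.lower().startswith("where") else "OTHER"

def formulate_query(question):
    # Query formulation: here, simply drop the question word and the punctuation.
    return question.rstrip("?").split(" ", 1)[-1]

def retrieve(query):
    # Information retrieval: return candidate passages (stubbed with one passage).
    return ["An Oviraptor fossil sitting on a nest was discovered in Mongolia's Gobi Desert."]

def select_answer(passages, eat):
    # Answer selection: pick a candidate compatible with the EAT (stubbed).
    return passages[0]

question = "Where was an Oviraptor fossil sitting on a nest discovered?"
eat = analyze_question(question)
print(select_answer(retrieve(formulate_query(question)), eat))
```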

5 A Typical Pipeline Architecture of a Factoid QAS
Question analysis: determine the type of a given question, which in turn provides the expected answer type (EAT).
– E.g.: person, organization, location.
– A named-entity recognizer (NER) is usually used to judge the EAT.

6 Semantic Analogical Reasoning (SAR)
SAR predicts the location of the final answer in a textual passage by employing the analogical reasoning (AR) framework of Silva et al. (2010).
The authors hypothesize that similar questions give similar answers.

7 Figure 1: Idea of Semantic Analogical Reasoning

8 SAR System Architecture

9 Semantic Analogical Reasoning (SAR)

10 Analogical Reasoning (AR)
AR focuses on the similarity between functions that map pairs of objects to links.

11 Analogical Reasoning (AR)
L_ij ∈ {0, 1}:
– an indicator of the existence of a relation between two related objects i and j.
Consider then that we also have K-dimensional vectors, each consisting of features which relate the objects i and j: Θ = [Θ_1, ..., Θ_K]^T.
– This vector represents the presence or absence of a relation between two particular objects.

12 Analogical Reasoning (AR)
Given the vectors of features Θ, the strength of the relation between two objects i and j is computed by performing logistic regression estimation as follows:
P(L_ij | X_ij, Θ) = logistic(Θ^T X_ij)
where logistic(x) is defined as 1 / (1 + e^(-x)).
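To make the notation concrete, a minimal sketch of this link-strength computation, assuming a K-dimensional feature vector X_ij and weight vector Θ (the numbers below are made up):

```python
import numpy as np

def logistic(x):
    # logistic(x) = 1 / (1 + e^-x)
    return 1.0 / (1.0 + np.exp(-x))

def link_probability(theta, x_ij):
    # P(L_ij = 1 | X_ij, Theta) = logistic(Theta^T X_ij)
    return logistic(np.dot(theta, x_ij))

theta = np.array([0.8, -0.4, 1.2])    # one weight per feature (K = 3)
x_ij = np.array([1.0, 0.0, 1.0])      # features relating objects i and j
print(link_probability(theta, x_ij))  # strength of the predicted relation
```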

13 Analogical Reasoning (AR)
During the AR training phase, the framework learns the weight (prior) for each feature (the estimation equation is shown on the slide).
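The training equation itself is not reproduced in this transcript. As a hedged stand-in, the sketch below fits per-feature weights with an L2-regularized logistic regression over observed (feature vector, link) pairs, which plays a similar role; the toy data and the choice of regularization are assumptions, not the authors' procedure:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# X_train: one row of K pair-features per training question-answer pair.
# y_train: 1 if the pair is an actual question-answer link, 0 otherwise.
X_train = np.array([[1.0, 0.0, 1.0],
                    [0.0, 1.0, 0.0],
                    [1.0, 1.0, 1.0],
                    [0.0, 0.0, 1.0]])
y_train = np.array([1, 0, 1, 0])

model = LogisticRegression(penalty="l2", C=1.0)
model.fit(X_train, y_train)
theta = model.coef_[0]   # learned weight for each feature
print(theta)
```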

14 Analogical Reasoning (AR)
During the AR retrieval phase, a final score, indicating the rank of predicted relations between two new objects i and j (the query) and the related objects that have been learnt in a given set S, is computed (the scoring equation is shown on the slide).
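The exact AR retrieval score is likewise not reproduced here. The sketch below assumes, as a stand-in, that each new (question, candidate answer location) pair is scored with the weights learned from the set S and that candidates are ranked by that score:

```python
import numpy as np

def score(theta, x_ij):
    # Stand-in score: the link probability under the learned weights.
    return 1.0 / (1.0 + np.exp(-np.dot(theta, x_ij)))

def rank_candidates(theta, candidate_features):
    # Return candidate indices sorted from strongest to weakest predicted link.
    scores = [score(theta, x) for x in candidate_features]
    return sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)

theta = np.array([0.8, -0.4, 1.2])         # weights learned from the set S
candidates = [np.array([1.0, 0.0, 1.0]),   # features of candidate pair 1
              np.array([0.0, 1.0, 1.0])]   # features of candidate pair 2
print(rank_candidates(theta, candidates))  # best candidate first, e.g. [0, 1]
```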

15 Analogical Reasoning (AR)

16 Experiments and Evaluation
Objectives of the experiments:
– Find the importance level of the feature set.
– Evaluate the potential of our approach to locate factoid answers in snippet and document retrieval scenarios without using any NER tool.
– For this objective we run two kinds of experiments.

17 Experiments and Evaluation
We use the question-answer pairs from CLEF English monolingual of the years 2006, 2007 and …
– Training data: 2007 and … factoid question-answer pairs.
– Testing data: … factoid questions.

18 Experiments and Evaluation
Importance of the features

19 Experiments and Evaluation: Gold Standard Snippets
Assumption:
– The IR process performs perfectly and returns the best snippet, which covers the final answer.

20 Experiments and Evaluation: Gold Standard Snippets

21 Experiments and Evaluation: Gold Standard Snippets
Improving TIME and MEASURE:
– TIME: dd/mm/yy, dd-mmm-yy, a single year number, hh:mm a.m./p.m.
– Sometimes the chunker recognizes these variations as numbers or as nouns.
– MEASURE: a measurement can be written as numbers (for example: “40”) or as text (“forty”).
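A minimal, hypothetical sketch of how the TIME surface variants listed above could be recognized with patterns instead of a NER tool (the regular expressions are illustrative, not the authors' rules):

```python
import re

TIME_PATTERNS = [
    r"\b\d{1,2}/\d{1,2}/\d{2,4}\b",        # dd/mm/yy, e.g. 05/01/07
    r"\b\d{1,2}-[A-Za-z]{3}-\d{2,4}\b",    # dd-mmm-yy, e.g. 05-Jan-07
    r"\b(1[0-9]|20)\d{2}\b",               # a single year number, e.g. 1998
    r"\b\d{1,2}:\d{2}\s*(a\.m\.|p\.m\.)",  # hh:mm a.m./p.m., e.g. 10:30 a.m.
]

def looks_like_time(text: str) -> bool:
    # Return True if the text matches one of the TIME surface patterns.
    return any(re.search(p, text) for p in TIME_PATTERNS)

print(looks_like_time("05-Jan-07"))  # True
print(looks_like_time("forty"))      # False (a MEASURE written as text)
```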

22 Experiments and Evaluation: Gold Standard Snippets
Legend: ADVP = adverb phrase; NP = noun phrase; PP = prepositional phrase; O = begin/end of a sentence or a coordinating conjunction.

23 Experiments and Evaluation: Indri Document Retrieval
In a real situation, we will not have any information about the semantic chunk of the final answer. We assume that the best pair (i.e. the top-1 pair after the re-ranking process) of the AR answer features will supply that information.

24 Experiments and Evaluation: Indri Document Retrieval
– The IR process uses the Indri search engine to retrieve the top-5 documents, which are passed on to Open Ephyra and our system.
– We use the same AR feature set as in the first experiment, but only the question feature set.
– Due to the lack of answer features, the re-ranking process needs to be adjusted.

25 Experiments and Evaluation: Indri Document Retrieval

26 Experiments and Evaluation: Indri Document Retrieval
Legend: ADVP = adverb phrase; NP = noun phrase; PP = prepositional phrase; O = begin/end of a sentence or a coordinating conjunction.

27 Experiments and Evaluation: Indri Document Retrieval

28 Experiments and Evaluation: Indri Document Retrieval

29 Conclusion
In this paper we have shown that by learning analogical linkages of question-answer pairs we can predict the location of factoid answers in a given snippet or document.
Our approach achieves very good accuracy for the OTHER answer type.

30 Experiments and Evaluation: Gold Standard Snippets
Compared against:
– Open Ephyra (Schlaefer et al., 2006)
– model-based NER (OpenNLP and Stanford NER)
– dictionary-based NER (specially designed for the TREC-QA competition)

31 We classify the error types of our approach into three groups:
– (1) the answer is not covered by the Indri retrieval;
– (2) the rank of the relevant document decreases because of the AR re-ranking score function;
– (3) an irrelevant example comes from the best AR pair.