Syntactic and semantic models and algorithms in Question Answering Alexander Solovyev Bauman Moscow Sate Technical University 20.10.20111RCDL.

Slides:

Advertisements

Similar presentations

COGEX at the Second RTE Marta Tatu, Brandon Iles, John Slavick, Adrian Novischi, Dan Moldovan Language Computer Corporation April 10 th, 2006.

Advertisements

COGEX at the Second RTE Marta Tatu, Brandon Iles, John Slavick, Adrian Novischi, Dan Moldovan Language Computer Corporation April 10 th, 2006.

Improved TF-IDF Ranker

Recognizing Textual Entailment Challenge PASCAL Suleiman BaniHani.

1 Abdeslame ALILAOUAR, Florence SEDES Fuzzy Querying of XML Documents The minimum spanning tree IRIT - CNRS IRIT : IRIT : Research Institute for Computer.

Baselines for Recognizing Textual Entailment Ling 541 Final Project Terrence Szymanski.

A Syntactic Translation Memory Vincent Vandeghinste Centre for Computational Linguistics K.U.Leuven

Foundations of Comparative Analytics for Uncertainty in Graphs Lise Getoor, University of Maryland Alex Pang, UC Santa Cruz Lisa Singh, Georgetown University.

Robust Textual Inference via Graph Matching Aria Haghighi Andrew Ng Christopher Manning.

Normalized alignment of dependency trees for detecting textual entailment Erwin Marsi & Emiel Krahmer Tilburg University Wauter Bosma & Mariët Theune University.

NaLIX: A Generic Natural Language Search Environment for XML Data Presented by: Erik Mathisen 02/12/2008.

1 Quasi-Synchronous Grammars  Based on key observations in MT: translated sentences often have some isomorphic syntactic structure, but not usually in.

Aki Hecht Seminar in Databases (236826) January 2009

July 9, 2003ACL An Improved Pattern Model for Automatic IE Pattern Acquisition Kiyoshi Sudo Satoshi Sekine Ralph Grishman New York University.

Semantic text features from small world graphs Jure Leskovec, IJS + CMU John Shawe-Taylor, Southampton.

3rd Answer Validation Exercise ( AVE 2008) QA subtrack at Cross-Language Evaluation Forum 2008 UNED Anselmo Peñas Álvaro Rodrigo Felisa Verdejo Thanks.

Automatic Classification of Semantic Relations between Facts and Opinions Koji Murakami, Eric Nichols, Junta Mizuno, Yotaro Watanabe, Hayato Goto, Megumi.

Using Maximal Embedded Subtrees for Textual Entailment Recognition Sophia Katrenko & Pieter Adriaans Adaptive Information Disclosure project Human Computer.

Third Recognizing Textual Entailment Challenge Potential SNeRG Submission.

CMPT-825 (Natural Language Processing) Presentation on Zipf’s Law & Edit distance with extensions Presented by: Kaustav Mukherjee School of Computing Science,

Employing Two Question Answering Systems in TREC 2005 Harabagiu, Moldovan, et al 2005 Language Computer Corporation.

A Confidence Model for Syntactically-Motivated Entailment Proofs Asher Stern & Ido Dagan ISCOL June 2011, Israel 1.

Overview of Search Engines

Important Problem Types and Fundamental Data Structures

Learning Table Extraction from Examples Ashwin Tengli, Yiming Yang and Nian Li Ma School of Computer Science Carnegie Mellon University Coling 04.

AQUAINT Kickoff Meeting – December 2001 Integrating Robust Semantics, Event Detection, Information Fusion, and Summarization for Multimedia Question Answering.

Overview of the Fourth Recognising Textual Entailment Challenge NIST-Nov. 17, 2008TAC Danilo Giampiccolo (coordinator, CELCT) Hoa Trang Dan (NIST)

Answer Validation Exercise - AVE QA subtrack at Cross-Language Evaluation Forum 2007 UNED (coord.) Anselmo Peñas Álvaro Rodrigo Valentín Sama Felisa Verdejo.

Knowledge and Tree-Edits in Learnable Entailment Proofs Asher Stern and Ido Dagan (earlier partial version by Roy Bar-Haim) Download at:

Richard Socher Cliff Chiung-Yu Lin Andrew Y. Ng Christopher D. Manning

Ontology Alignment/Matching Prafulla Palwe. Agenda ► Introduction  Being serious about the semantic web  Living with heterogeneity  Heterogeneity problem.

Empirical Methods in Information Extraction Claire Cardie Appeared in AI Magazine, 18:4, Summarized by Seong-Bae Park.

An Integrated Approach to Extracting Ontological Structures from Folksonomies Huairen Lin, Joseph Davis, Ying Zhou ESWC 2009 Hyewon Lim October 9 th, 2009.

BACKGROUND KNOWLEDGE IN ONTOLOGY MATCHING Pavel Shvaiko joint work with Fausto Giunchiglia and Mikalai Yatskevich INFINT 2007 Bertinoro Workshop on Information.

Knowledge and Tree-Edits in Learnable Entailment Proofs Asher Stern, Amnon Lotan, Shachar Mirkin, Eyal Shnarch, Lili Kotlerman, Jonathan Berant and Ido.

The main mathematical concepts that are used in this research are presented in this section. Definition 1: XML tree is composed of many subtrees of different.

Scott Duvall, Brett South, Stéphane Meystre A Hands-on Introduction to Natural Language Processing in Healthcare Annotation as a Central Task for Development.

Extracting Semantic Constraint from Description Text for Semantic Web Service Discovery Dengping Wei, Ting Wang, Ji Wang, and Yaodong Chen Reporter: Ting.

Answer Validation Exercise - AVE QA subtrack at Cross-Language Evaluation Forum UNED (coord.) Anselmo Peñas Álvaro Rodrigo Valentín Sama Felisa Verdejo.

Querying Structured Text in an XML Database By Xuemei Luo.

AQUAINT 18-Month Workshop 1 Light Semantic Processing for QA Language Technologies Institute, Carnegie Mellon B. Van Durme, Y. Huang, A. Kupsc and E. Nyberg.

1 Learning Sub-structures of Document Semantic Graphs for Document Summarization 1 Jure Leskovec, 1 Marko Grobelnik, 2 Natasa Milic-Frayling 1 Jozef Stefan.

Relation Alignment for Textual Entailment Recognition Cognitive Computation Group, University of Illinois Experimental ResultsTitle Mark Sammons, V.G.Vinod.

1 A Web Search Engine-Based Approach to Measure Semantic Similarity between Words Presenter: Guan-Yu Chen IEEE Trans. on Knowledge & Data Engineering,

Element Level Semantic Matching Pavel Shvaiko Meaning Coordination and Negotiation Workshop, ISWC 8 th November 2004, Hiroshima, Japan Paper by Fausto.

August 17, 2005Question Answering Passage Retrieval Using Dependency Parsing 1/28 Question Answering Passage Retrieval Using Dependency Parsing Hang Cui.

Automatic Question Answering  Introduction  Factoid Based Question Answering.

Number Sense Disambiguation Stuart Moore Supervised by: Anna Korhonen (Computer Lab)‏ Sabine Buchholz (Toshiba CRL)‏

Evaluating Answer Validation in multi- stream Question Answering Álvaro Rodrigo, Anselmo Peñas, Felisa Verdejo UNED NLP & IR group nlp.uned.es The Second.

Mining Dependency Relations for Query Expansion in Passage Retrieval Renxu Sun, Chai-Huat Ong, Tat-Seng Chua National University of Singapore SIGIR2006.

Toward an Open Source Textual Entailment Platform (Excitement Project) Bernardo Magnini (on behalf of the Excitement consortium) 1 STS workshop, NYC March.

Commonsense Reasoning in and over Natural Language Hugo Liu, Push Singh Media Laboratory of MIT The 8 th International Conference on Knowledge- Based Intelligent.

Answer Mining by Combining Extraction Techniques with Abductive Reasoning Sanda Harabagiu, Dan Moldovan, Christine Clark, Mitchell Bowden, Jown Williams.

Towards Entailment Based Question Answering: ITC-irst at Clef 2006 Milen Kouylekov, Matteo Negri, Bernardo Magnini & Bonaventura Coppola ITC-irst, Centro.

Investigating semantic similarity measures across the Gene Ontology: the relationship between sequence and annotation Bioinformatics, July 2003 P.W.Load,

SALSA-WS 09/05 Approximating Textual Entailment with LFG and FrameNet Frames Aljoscha Burchardt, Anette Frank Computational Linguistics Department Saarland.

Concepts and Realization of a Diagram Editor Generator Based on Hypergraph Transformation Author: Mark Minas Presenter: Song Gu.

Event-Based Extractive Summarization E. Filatova and V. Hatzivassiloglou Department of Computer Science Columbia University (ACL 2004)

1 Question Answering and Logistics. 2 Class Logistics  Comments on proposals will be returned next week and may be available as early as Monday  Look.

Department of Computer Science The University of Texas at Austin USA Joint Entity and Relation Extraction using Card-Pyramid Parsing Rohit J. Kate Raymond.

Overview of Statistical NLP IR Group Meeting March 7, 2006.

Instance Discovery and Schema Matching With Applications to Biological Deep Web Data Integration Tantan Liu, Fan Wang, Gagan Agrawal {liut, wangfa,

AQUAINT Mid-Year PI Meeting – June 2002 Integrating Robust Semantics, Event Detection, Information Fusion, and Summarization for Multimedia Question Answering.

Question Answering Passage Retrieval Using Dependency Relations (SIGIR 2005) (National University of Singapore) Hang Cui, Renxu Sun, Keya Li, Min-Yen Kan,

SERVICE ANNOTATION WITH LEXICON-BASED ALIGNMENT Service Ontology Construction Ontology of a given web service, service ontology, is constructed from service.

Clustering of Web pages

Element Level Semantic Matching

Traditional Question Answering System: an Overview

What is the Entrance Exams Task

UNED Anselmo Peñas Álvaro Rodrigo Felisa Verdejo Thanks to…

Presentation transcript:

Syntactic and semantic models and algorithms in Question Answering Alexander Solovyev Bauman Moscow Sate Technical University RCDL. Voronezh.

Agenda Question Answering and Answer Validation task Answer Validation via Recognizing Text Entailment – Bags of Words/Links intersection [Wang 2008] – Tree edit distance [Panyakanok, Roth, Yih 2004] – Trees alignment [Marsi, Krahmer, Bosma, Theune 2006] – Predicates matching [Schlaefer 2007] – Parallel traversal [Solovyev 2010] – Automatic logic prove for logical forms [Akhmatova 2005] Cross-application of syntax and semantic models in various algorithms RCDL. Voronezh.

Question Answering RCDL. Voronezh.

Meta-search architecture RCDL. Voronezh.

Answer Validation task RCDL. Voronezh.

Bag of words Used as baseline method or backup strategy Given two sentences – question and snippet Replace question focus by *ANS* Replace answer in snippet by *ANS* Remove stop words and punctuations Count sets of distinct words in question and supporting text – Q and P c=|Q∩P|/|Q| Answer is supported by snippet if c > threshold (e.g. 0.7) Backup strategy in [Wang, Neumann. Using Recognizing Textual Entailment as a Core Engine for Answer Validation. 2008] RCDL. Voronezh.

Bag of words example What is the fastest car in the world? The Jaguar XJ220 is the dearest, fastest and the most sought after car in the world. → *ANS* is the fastest car in the world? The *ANS* is the dearest, fastest and the most sought after car in the world. |Q∩P|={*ANS*, is, the, fastest, car, in, world} c=|Q∩P|/|Q|=7/7= RCDL. Voronezh.

Bag of links example *ANS*-is car-is the-car fastest-car in-is world-in the-world ?-is The-*ANS* *ANS*-is car-is the-car fastest-car in-is world-in the-world.-is … c = |Q∩P|/|Q| = 7/7 = RCDL. Voronezh.

Tree edit distance Given two ordered dependency trees representing question statement and snippet: T q, T p Cost of deleting a node from tree: γ(a→λ) Cost of inserting a node into tree: γ(λ→a) Cost of changing a node: γ(a→b) Cost of a sequence of operations S = is γ(S) =Σ γ(s i ) Find a minimum cost of transformation T p to T q : [Punyakanok et al. Natural Language Inference via Dependency Tree Mapping. An Application to Question Answering. 2004] RCDL. Voronezh.

Tree edit distance with subtree removal [Zhang, Shasha. Simple fast algorithms for the editing distance between tree and related problems. 1989] RCDL. Voronezh.

Tree edit distance vs Bag-of-words performance [Punyakanok et al. Natural Language Inference via Dependency Tree Mapping. An Application to Question Answering. 2004] TREC 2002 QA Significant limitation of Zhang-Shasha algorithm: ordered trees only! RCDL. Voronezh.

Trees alignment [Krahmer, Bosma. Normalized alignment of dependency trees for detecting textual entailment. 2006] RCDL. Voronezh.

Trees alignment Given two dependency trees representing question statement and snippet: T q, T p Skip penalty SP, Parent weight PW Calculate sub-trees match matrix S=|T q |x|T p | Every element s= to be calculated recursively Trees similarity is a score of predicates similarity Modification: To replace question focus by *ANS* To replace answer in snippet by *ANS* to rotate trees to have *ANS* in roots, and use similarities of these roots. [Krahmer, Bosma. Normalized alignment of dependency trees for detecting textual entailment. 2006] RCDL. Voronezh.

Trees alignment [Krahmer, Bosma. Normalized alignment of dependency trees for detecting textual entailment. 2006] root node v can be directly aligned to root node v’ any of the children of v can be aligned to v’ v can be aligned to any of the children of v’ with skip penalty P(v, v’) is the set of all possible pairings of the n children of v against the m children of v’, which amounts to the power set of {1…n}×{1…m} |v’ j |/|v’| represent the number of tokens dominated by the j-th child node of node v’ in the question divided by the total number of tokens dominated by node v’ RCDL. Voronezh.

Trees alignment performance in RTE-2 BUT, For the RTE-2 test set, Zanzotto et al. found that simple lexical overlapping (sophisticated bag-of-words) achieves accuracy of 60%, better than any other sophisticated lexical methods they tested [Krahmer, Bosma. Normalized alignment of dependency trees for detecting textual entailment. 2006] accuracy parameters RCDL. Voronezh.

Predicates matching Semantic Role Labeling: Terms labeled either as predicates or arguments Every term fills some predicate’s argument position Predicate-argument relationship is labeled by type of argument: ARG0, ARG1, ARGM-LOC, ARGM-TMP etc. Schlaefer’s method ignores labels and not uses deep syntax dependencies. SRL gives two-level hierarchy: predicates and arguments. Dependencies between arguments are not considered – they all depends on predicate. OpenEphyra: [Schlaefer. A Semantic Approach to Question Answering. 2007] In what year was the Carnegie Mellon campus at the west coast established ? The CMU campus at the US west cost was founded in the year RCDL. Voronezh.

Predicates matching Given two Semantic-Role-Labeled statements: question and snippet Calculate similarity between all possible predicate-predicate pairs Score of the best match to consider as answer confidence [Schlaefer. A Semantic Approach to Question Answering. 2007] -wordnet-based lexical similarity of terms RCDL. Voronezh.

Predicates matching performance [Schlaefer. A Semantic Approach to Question Answering. 2007] TechniqueQuestions Answered Questions Correct PrecisionRecall Answer type analysis Pattern learning Semantic parsing Precision and recall on TREC 11 questions with correct answers ( =447 factoid questions) RCDL. Voronezh.

Parallel traversal Given two directed graphs representing semantic relations in question statement and in snippet Replace focus by *ANS* in question and answer by *ANS* in snippet Shortcut every node in snippet: for every pair of incoming and outgoing edge (e i,e o ) create a new edge (source(e i ),target(e o )) (continued..) [Solovyev. Who is to blame and Where the dog is buried? Method of answers validations based on fuzzy matching of semantic graphs in Question answering system. Romip 2010] RCDL. Voronezh.

Parallel traversal [Solovyev. Who is to blame and Where the dog is buried? Method of answers validations based on fuzzy matching of semantic graphs in Question answering system. Romip 2010] Calculate similarity of nodes *ANS* and *ANS* by recursive formula: RCDL. Voronezh.

Parallel traversal [Solovyev. Who is to blame and Where the dog is buried? Method of answers validations based on fuzzy matching of semantic graphs in Question answering system. Romip 2010] RCDL. Voronezh.

Parallel traversal performance on ROMIP [Solovyev. Who is to blame and Where the dog is buried? Method of answers validations based on fuzzy matching of semantic graphs in Question answering system. Romip 2010] RecallError myrtle-lucene (bag-of-words) myrtle-seman (parallel traversal) RCDL. Voronezh.

Automatic logic prove for logical forms text. A boy bought a desk. hypothesis. A boy bought a table. Axiom extracted from WordNet: X is a desk→X is a table Input for Otter: exists x exists y exists e (boy(x) & bought(e, x, y) & desk(y)). all x (desk(x) → table(x)). -(exists x, y, e (boy(x) & bought(e, x, y) & table(y))). [Akhmatova et al. Recognizing Textual Entailment Via Atomic Propositions. 2006] RCDL. Voronezh.

Automatic logic prove. Example RCDL. Voronezh.

Cross-application of models and algorithms Bag of wordsSyntax dependencies Semantic dependencies Logic forms Sets intersection Wang 2008 Zanzotto 2006 AWang 2008 Predicates matching BSchlaefer 2007 Trees alignment Marsi, Krahmer, Bosma, Theune 2006 C Tree-edit distance Panyakanok, Roth, Yih 2004 D Parallel traversal ESolovyev 2010 Automatic theorem prove Akhmatova RCDL. Voronezh.

Conclusion There are many works on RTE and AVE in QA use Dependency trees Authors usually stick to single parsing model from the very beginning till the end There is an opportunity for replacing underlying model in existing algorithms (experiments A, B, C, D, E) Looks like in most works syntax relations are enough, authors ignores sophisticated semantic attributes Almost nothing done in Russian language In current project all algorithms above implemented Russian AVE collection is being developed… RCDL. Voronezh.

References QA overview: Ittycheriah, Abraham. A Statistical Approach for Open Domain Question Answering // Advances in Open Domain Question Answering. Springer Netherlands, Part 1. Vol.32. Prager, John. Open-Domain Question-Answering // Foundation and Trends in Information Retrieval, vol 1, no 2, pp , Rodrigo, Á., Peñas, A., and Verdejo, F Overview of the answer validation exercise In Proceedings of the 9th CLEF Lecture Notes In Computer Science. Springer-Verlag, Berlin, Heidelberg, RTE in QA: Wang, Neumann. Using Recognizing Textual Entailment as a Core Engine for Answer Validation // Working Notes for the CLEF 2008 Workshop. Panyakanok, V., Roth, D. and Yih, W. Natural language interface via dependency tree mapping: An application to question answering // AI and Math.- January Zhang, Shasha. Simple fast algorithms for the editing distance between tree and related problems Marsi, E. and Krahmer, E. and Bosma, W.E. and Theune, M. (2006) Normalized Alignment of Dependency Trees for Detecting Textual Entailment. In: Second PASCAL Recognising Textual Entailment Challenge, April 2006, Venice, Italy. Schlaefer, Nico. A Semantic Approach to Question Answering: Saarbrücken Akhmatova, E. Textual Entailment Resolution via Atomic Propositions // Proceedings of the PASCAL Challenges Workshop on Recognising Textual Entailment, Southampton, UK (2005) 61–64. Solovyev. Who is to blame and Where the dog is buried? Method of answers validation based on fuzzy matching of semantic graphs in Question answering system. Romip 2010 Tools: RCDL. Voronezh.

Thanks Questions? RCDL. Voronezh.