1 Tree-edit CRFs for RTE Mengqiu Wang and Chris Manning.


1 Tree-edit CRFs for RTE Mengqiu Wang and Chris Manning

2 Tree-edit CRFs for RTE
An extension of McCallum et al.'s UAI 2005 work on CRFs for finite-state string edit distance. Key attractions:
- Models the transformation of dependency parse trees (and thus directly models syntax), unlike McCallum et al. '05, which models only word strings
- Discriminatively trained
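As background, the finite-state string edit distance that the McCallum et al. model generalizes can be sketched with the classic dynamic program over substitute/delete/insert operations (uniform costs here; the CRF instead learns weights for these operations):

```python
def edit_distance(s, t):
    """Classic Levenshtein distance via dynamic programming."""
    m, n = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                          # delete all of s[:i]
    for j in range(n + 1):
        d[0][j] = j                          # insert all of t[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = d[i - 1][j - 1] + (s[i - 1] != t[j - 1])  # substitute or match
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[m][n]

print(edit_distance("kitten", "sitting"))  # 3
```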

3 TE-CRFs model in detail
First, let's look at the correspondence between alignments (with constraints) and edit operations.

4 [Figure: example alignment between the dependency tree of the answer "Bush met French president Jacques Chirac" and the question "Who is the leader of France?", with NE tags (person, location, qword) and dependency relations (subj, obj, det, nmod) on the nodes; aligned pairs are marked substitute, unaligned answer nodes delete, unaligned question nodes insert, and one loosely matched pair "fancy substitute".]

5 TE-CRFs model in detail
Each valid tree edit operation sequence that transforms one tree into the other corresponds to an alignment. A tree edit operation sequence is modeled as a transition sequence among a set of states in an FSM. [Figure: a three-state FSM (S1, S2, S3) whose transitions emit delete (D), substitute (S), and insert (I) operations, and an example edit sequence (substitute, delete, substitute, insert, substitute) traced along several possible state paths.]
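The correspondence can be made concrete with a small sketch: replaying an edit-operation sequence over linearized node lists yields exactly one alignment (the node labels and the particular sequence below are illustrative, not from the slides):

```python
# Hypothetical sketch: each edit operation either aligns a source node
# with a target node (substitute) or leaves one side unaligned
# (delete/insert), so a valid edit sequence induces one alignment.

def apply_edits(source, target, ops):
    """Replay edit ops; return alignment pairs (src_idx, tgt_idx)."""
    alignment = []
    i = j = 0
    for op in ops:
        if op == "substitute":      # align source[i] with target[j]
            alignment.append((i, j))
            i += 1
            j += 1
        elif op == "delete":        # source[i] has no counterpart
            i += 1
        elif op == "insert":        # target[j] has no counterpart
            j += 1
    assert i == len(source) and j == len(target), "invalid edit sequence"
    return alignment

src = ["Bush", "met", "Chirac"]
tgt = ["who", "is", "leader"]
# One of many valid edit sequences for these node lists:
ops = ["delete", "substitute", "substitute", "insert"]
print(apply_edits(src, tgt, ops))  # [(1, 0), (2, 1)]
```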

6 FSM
This is for one edit operation sequence. [Figure: the edit sequence (substitute, delete, substitute, insert, substitute) shown with several of the state paths through S1, S2, S3 that can generate it.] There are many other valid edit sequences.

7 FSM cont.
[Figure: the complete model: Start and Stop states linked by ε-transitions to two copies of the three-state FSM (S1, S2, S3 with D, S, I transitions), one forming the Positive State Set and one the Negative State Set.]

8 FSM transitions
[Figure: the transition lattice from Start through the states of the Positive and Negative State Sets to Stop.]

9 What is the semantic interpretation of the FSM states?
At this moment, since all the states in the FSM are fully connected, it is unclear what they mean. We fix the number of states to 3; experiments show that setting it to 1 or 6 hurts performance. We are running new experiments with more meaningfully designed FSM topologies, e.g., each state deterministically corresponds to a particular edit operation.

10 Parameterization
[Figure: a substitute transition from S1 to S2, parameterized separately for the positive and negative state sets.]

11 Training using EM
[Equations: the E-step expectation, derived as a Jensen's-inequality lower bound, and the M-step objective, maximized using L-BFGS.]
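The E-step/M-step structure can be illustrated with a runnable toy: two biased coins with a latent coin choice per trial. This stands in for the real model, whose E-step sums over latent edit sequences with dynamic programming and whose M-step has no closed form:

```python
import math

def em_two_coins(trials, theta, iters=20):
    """trials: list of (heads, tails) counts; theta: initial (pA, pB)."""
    pA, pB = theta
    for _ in range(iters):
        # E-step: posterior responsibility of coin A for each trial;
        # taking this expectation is the Jensen's-inequality lower-bound step.
        hA = tA = hB = tB = 0.0
        for h, t in trials:
            la = h * math.log(pA) + t * math.log(1 - pA)
            lb = h * math.log(pB) + t * math.log(1 - pB)
            rA = 1.0 / (1.0 + math.exp(lb - la))
            hA += rA * h; tA += rA * t
            hB += (1 - rA) * h; tB += (1 - rA) * t
        # M-step: closed form for coin biases; the TE-CRF M-step instead
        # maximizes the expected log-likelihood numerically with L-BFGS.
        pA = hA / (hA + tA)
        pB = hB / (hB + tB)
    return pA, pB

pA, pB = em_two_coins([(9, 1), (8, 2), (2, 8), (1, 9)], (0.6, 0.4))
```

Starting from a near-symmetric initialization, the two coin biases separate toward roughly 0.85 and 0.15, mirroring how EM disambiguates the latent structure.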

12 Features for RTE
Substitution:
- Same -- Word/WordWithNE/Lemma/NETag/Verb/Noun/Adj/Adv/Other
- Sub/MisSub -- Punct/Stopword/ModalWord
- Antonym/Hypernym/Synonym/Nombank/Country
- Different -- NE/Pos
- Unrelated words
Delete:
- Stopword/Punct/NE/Other/Polarity/Quantifier/Likelihood/Conditional/If
Insert:
- Stopword/Punct/NE/Other/Polarity/Quantifier/Likelihood/Conditional/If
Tree:
- RootAligned/RootAlignedSameWord
- Parent, Child, DepRel triple match/mismatch
Date/Time/Numerical:
- DateMismatch, hasNumDetMismatch, normalizedFormMismatch
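A sketch of how such substitution features might be extracted for one aligned node pair (the attribute names and feature labels below are illustrative; the real system also consults lexical resources such as WordNet and NomBank):

```python
# Hypothetical feature extractor for a substitute edit between an
# aligned source/target node pair, in the spirit of the slide's list.

def substitution_features(src, tgt):
    """src/tgt: dicts with 'word', 'lemma', 'pos', 'ne' for an aligned pair."""
    feats = []
    if src["word"] == tgt["word"]:
        feats.append("SameWord")
    elif src["lemma"] == tgt["lemma"]:
        feats.append("SameLemma")
    if src["ne"] != "O" and src["ne"] == tgt["ne"]:
        feats.append("SameNETag")
    if src["pos"] != tgt["pos"]:
        feats.append("DifferentPos")
    return feats

h = {"word": "leaders", "lemma": "leader", "pos": "NNS", "ne": "O"}
t = {"word": "leader",  "lemma": "leader", "pos": "NN",  "ne": "O"}
print(substitution_features(h, t))  # ['SameLemma', 'DifferentPos']
```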

13 Tree-edit CRFs for Textual Entailment
Preliminary results: trained on RTE2 dev, tested on RTE2 test; model taken after 50 EM iterations. Overall acc: 0.6275, map:
Per-task accuracy:
- SUM: acc = 0.675
- QA: acc = 0.64
- IR: acc = 0.615
- IE: acc = 0.58

14 Work in progress
- Implementing an unordered tree-edit algorithm, which would allow swapping of sub-trees
- Use Stanford Parser dependency structure; need to get rid of cycles in CollapsedDependencyGraph (almost there, only a few self-loops remain)
- Experiment with deterministic topologies
- More features!!
- Training a separate model for each sub-task (is task information given at test time?)