A Confidence Model for Syntactically-Motivated Entailment Proofs
Asher Stern & Ido Dagan
ISCOL, June 2011, Israel
Recognizing Textual Entailment (RTE)
Given a text T and a hypothesis H: does T entail H?
Example:
T: An explosion caused by gas took place at a Taba hotel.
H: A blast occurred at a hotel in Taba.
Proof Over Parse Trees
T = T_0 → T_1 → T_2 → … → T_n = H
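To make the chain concrete, here is a minimal Python sketch of a proof as a sequence of transformation steps from T to H (the class names are illustrative, not taken from the paper's implementation):

```python
from dataclasses import dataclass, field

@dataclass
class ProofStep:
    """One transformation T_i -> T_{i+1}: the operation applied and the resulting tree."""
    operation: str        # e.g. "lexical rule: explosion -> blast"
    result_tree: object   # the parse tree T_{i+1} (tree type left abstract here)

@dataclass
class Proof:
    """A proof is a chain T = T_0 -> T_1 -> ... -> T_n = H."""
    source_tree: object                 # T_0, the parse tree of the text T
    steps: list = field(default_factory=list)

    def final_tree(self):
        """Return T_n; entailment is proven when T_n equals the hypothesis tree H."""
        return self.steps[-1].result_tree if self.steps else self.source_tree
```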
Bar Ilan Proof System: Entailment Rules
[Diagram of entailment rule types: lexical (e.g., explosion → blast), syntactic, lexical-syntactic, and generic rules]
Bar Ilan Proof System (example proof)
H: A blast occurred at a hotel in Taba.
An explosion caused by gas took place at a Taba hotel
→ A blast caused by gas took place at a Taba hotel
→ A blast took place at a Taba hotel
→ A blast occurred at a Taba hotel
→ A blast occurred at a hotel in Taba = H
Each step applies a lexical or syntactic entailment rule.
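A lexical rule such as explosion → blast can be viewed as a node substitution over the parse tree. The sketch below is purely illustrative; the `DepNode` class and rule format are assumptions, not the Bar Ilan system's actual API:

```python
class DepNode:
    """A dependency-tree node: a word plus its children."""
    def __init__(self, word, children=None):
        self.word = word
        self.children = children or []

def apply_lexical_rule(node, lhs, rhs):
    """Return a copy of the tree with every occurrence of `lhs` rewritten to `rhs`."""
    new_word = rhs if node.word == lhs else node.word
    return DepNode(new_word, [apply_lexical_rule(c, lhs, rhs) for c in node.children])

# "An explosion took place" --(explosion -> blast)--> "A blast took place"
t = DepNode("took", [DepNode("explosion", [DepNode("An")]), DepNode("place")])
t1 = apply_lexical_rule(t, "explosion", "blast")
```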
Tree-Edit-Distance
Insurgents attacked soldiers → Soldiers were attacked by insurgents
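For readers who want to experiment, here is a toy tree-edit-distance computation over this pair, assuming the third-party `zss` package (an implementation of the Zhang-Shasha algorithm); the hand-built dependency trees are illustrative simplifications:

```python
# pip install zss  (third-party Zhang-Shasha tree edit distance)
from zss import Node, simple_distance

# Toy dependency trees for the active and passive sentences.
active = (Node("attacked")
          .addkid(Node("insurgents"))
          .addkid(Node("soldiers")))

passive = (Node("attacked")
           .addkid(Node("soldiers"))
           .addkid(Node("were"))
           .addkid(Node("by").addkid(Node("insurgents"))))

# Minimal number of node insertions, deletions, and relabelings between the trees.
print(simple_distance(active, passive))
```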
Proof over Parse Trees
Which steps?
– Tree edits: regular or custom
– Entailment rules
How to classify?
– Decide "yes" if and only if a proof was found
– But then the answer is almost always "no", and knowledge inaccuracies cannot be handled
– Better: estimate the confidence that the proof is correct
Proof Systems
TED-based:
– Estimate the cost of a proof
– Complete proofs
– Arbitrary operations
– Limited knowledge
Entailment-rules-based:
– Linguistically motivated
– Rich knowledge
– No estimation of proof correctness
– Incomplete proofs: mixed systems with ad-hoc approximate-match criteria
Our System: the benefits of both worlds, and more!
– Linguistically motivated complete proofs
– Confidence model
Our Method
1. Complete proofs – on-the-fly operations
2. Cost model
3. Learning model parameters
On-the-Fly Operations
– Insert node on the fly
– Move node / move sub-tree on the fly
– Flip part of speech
– Etc.
More syntactically motivated than tree edits. These operations are not linguistically justified, but their impact on proof correctness can be estimated by the cost model (see the sketch below).
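One way to picture how unjustified operations can still be priced: each on-the-fly operation is counted by its own feature, and the learned weight on that feature estimates its impact on correctness. A hypothetical sketch (the operation names and feature layout are illustrative):

```python
from collections import Counter

# Each on-the-fly operation maps to one feature counting how often it was used.
ON_THE_FLY = ["insert_node", "move_subtree", "flip_pos"]

def proof_features(proof_operations):
    """Count operation uses; the cost model later weighs each count."""
    counts = Counter(proof_operations)
    return [counts[op] for op in ON_THE_FLY]

# A proof that inserted two nodes and flipped one part of speech:
f = proof_features(["insert_node", "insert_node", "flip_pos"])  # -> [2, 0, 1]
```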
Cost Model
The idea:
1. Represent the proof as a feature vector
2. Use the vector in a learning algorithm
Cost Model
Represent a proof as a feature vector: F(P) = (F_1, F_2, …, F_D)
Define a weight vector: w = (w_1, w_2, …, w_D)
Define the proof cost: cost(P) = w · F(P)
Classify a proof as entailing iff cost(P) < b, where b is a threshold
Learn the parameters (w, b)
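A minimal sketch of this linear cost model in Python; the weight and feature values are made-up numbers, purely for illustration:

```python
import numpy as np

def proof_cost(w, features):
    """cost(P) = w . F(P): a linear cost over the proof's feature vector."""
    return float(np.dot(w, features))

def classify(w, b, features):
    """Declare entailment iff the proof's cost is below the threshold b."""
    return proof_cost(w, features) < b

w = np.array([1.5, 0.3, 2.0])          # learned per-feature weights
F = np.array([2.0, 0.0, 1.0])          # F(P) for some proof P
print(classify(w, b=4.0, features=F))  # cost = 5.0 -> False (not entailing)
```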
Search Algorithm
Need to find the "best" proof
– "Best proof" = proof with lowest cost, assuming a weight vector is given
Search space is exponential → pruning (see the sketch below)
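The slides only say that the exponential space is pruned. A generic beam-search sketch, assuming an abstract proof-state interface (`expand`, `is_goal`, and `cost` are illustrative hooks, not the system's actual API), shows one standard way to do this:

```python
import heapq

def beam_search(initial, expand, is_goal, cost, beam=50, max_steps=20):
    """Keep only the `beam` cheapest partial proofs at each step (pruning)."""
    frontier, best, best_cost = [initial], None, float("inf")
    for _ in range(max_steps):
        # Expand every surviving partial proof by one more operation.
        candidates = [s for state in frontier for s in expand(state)]
        for s in candidates:
            if is_goal(s) and cost(s) < best_cost:
                best, best_cost = s, cost(s)
        # Prune the exponential space down to the cheapest `beam` states.
        frontier = heapq.nsmallest(beam, candidates, key=cost)
        if not frontier:
            break
    return best
```

A larger beam trades runtime for a better chance of finding the truly lowest-cost proof.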
Parameter Estimation
Goal: find a good weight vector and threshold (w, b)
Use a standard machine-learning algorithm (logistic regression or linear SVM)
But there is a chicken-and-egg problem: training samples are not given as feature vectors
– The learning algorithm requires training samples
– Constructing training samples (by finding best proofs) requires a weight vector
– The weight vector is what the learning algorithm produces
Solution: iterative learning
Parameter Estimation
[Diagram of the iterative loop: weight vector → training samples → learning algorithm → new weight vector]
Parameter Estimation
1. Start with w_0, a reasonable guess for the weight vector
2. i = 0
3. Repeat until convergence:
   a. Find the best proofs and construct feature vectors, using w_i
   b. Use a linear ML algorithm to find a new weight vector, w_{i+1}
   c. i = i + 1
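A compact sketch of this loop, assuming scikit-learn's LogisticRegression and a hypothetical `find_best_proof_features` hook that stands in for the proof search of step 3a; the fixed iteration count simplifies the convergence test:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def learn_weights(pairs, labels, find_best_proof_features, w0, iterations=10):
    """Alternate between proof search (needs w) and weight learning (produces w).

    `find_best_proof_features(pair, w)` is a hypothetical hook: it runs the
    proof search under weight vector `w` and returns the best proof's F(P).
    """
    w = w0
    for _ in range(iterations):
        # Step 3a: build training vectors from the best proofs under the current w.
        X = np.array([find_best_proof_features(p, w) for p in pairs])
        # Step 3b: fit a linear model to obtain the next weight vector.
        clf = LogisticRegression().fit(X, labels)
        w = clf.coef_[0]
    # The intercept plays the role of the classification threshold b.
    return w, clf.intercept_[0]
```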
Results
Accuracy on the RTE benchmarks (RTE-1, RTE-2, RTE-3, RTE-5), compared against:
– Logical Resolution Refutation (Raina et al., 2005): 57.0 on RTE-1
– Probabilistic Calculus of Tree Transformations (Harmeling, 2009)
– Probabilistic Tree Edit Model (Wang and Manning, 2010)
– Deterministic Entailment Proofs (Bar-Haim et al., 2007)
– Our System
Learned operation statistics (average count in positive pairs, average count in negative pairs, and their ratio) for:
– Insert Named Entity
– Insert Content Word
– DIRT
– Change "subject" to "object" and vice versa
– Flip Part-of-speech
– Lin similarity
– WordNet
Conclusions
1. Linguistically motivated proofs – complete proofs
2. Cost model – estimation of proof correctness
3. Search for the best proof
4. Learning of parameters
5. Results – reasonable behavior of the learning scheme
Thank you
Q & A