Minimally Supervised Event Causality Identification. Quang Do, Yee Seng, and Dan Roth, University of Illinois at Urbana-Champaign. EMNLP-2011.



Event Causality
Example: The police arrested him because he killed someone.
• Event triggers: "arrested" and "killed".
• We identify causality between event pairs, but not the direction.
• Causality association is calculated from distributional statistics: co-occurrence counts, pointwise mutual information (PMI), …
• The connective "because" signals a Contingency discourse relation.

Event Causality
We use multiple cues to jointly identify event causality:
• Distributional association scores
• Discourse relation predictions
Example: The police arrested him because he killed someone. (Figure labels: distributional association score, discourse relation prediction.)

Cause-Effect Association (CEA) and Discourse Relations
Distributional: we define an event e as p(a_1, a_2, …, a_n). CEA considers three kinds of association between two events:
• association between event predicates
• association between the predicate of one event and the arguments of the other event
• association between event arguments
Discourse: [ … e … e … ] connective [ … e … e … ]
• A connective is associated with two text spans.
• Training on the Penn Discourse Treebank (PDTB), we developed a system that predicts the discourse relations expressed by the connectives.

Event Definition
We define an event e as p(a_1, a_2, …, a_n):
• predicate p: the event trigger word
• a_1, a_2, …, a_n: arguments associated with e
Examples:
• Verbs: "… he killed someone …"
• Nominals: "… the attack by the troops …"
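The event definition above can be sketched as a small data structure. This is only an illustration: the class name `Event` and the `is_nominal` flag are assumptions made for the example, not names from the slides.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Event:
    """An event e = p(a_1, ..., a_n): a trigger word plus its arguments.

    Hypothetical container for illustration; field names are assumptions.
    """
    predicate: str                    # event trigger word, e.g. "killed"
    arguments: Tuple[str, ...] = ()   # a_1 ... a_n; may be empty
    is_nominal: bool = False          # True for noun triggers such as "attack"


# The two examples from the slide.
killed = Event("killed", ("he", "someone"))
attack = Event("attack", ("troops",), is_nominal=True)
```

Keeping the arguments as a tuple makes the event hashable, which is convenient if events are later used as keys when collecting co-occurrence counts.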

Contributions (Event Causality)
We identify causality between event pairs in context:
• verb-verb, verb-noun, and noun-noun triggered event pairs
• (prior work usually focuses on just verb triggers)
A minimally supervised approach to detect event causality using distributional similarity methods.
We leverage the interactions between event causality prediction and discourse relation prediction.

Overview (Event Causality)
Event causality:
• Interaction between event causality and discourse relations
• Event predicates: verbs, nominals
Cause-Effect Association (CEA)
Discourse and Causality:
• Discourse relations
• Constraints for joint inference with CEA
Experiments:
• Settings
• Evaluation
• Analysis
Conclusion


Cause-Effect Association (CEA)
Example: The police arrested him because he killed someone.
CEA: prediction of whether two events are causally related.

Cause-Effect Association (CEA)
We define an event e as p(a_1, a_2, …, a_n):
• predicate p: the event trigger word (e.g., arrested, killed)
• a_1, a_2, …, a_n: arguments associated with e
CEA considers three kinds of association between two events:
• association between event predicates
• association between the predicate of one event and the arguments of the other event
• association between event arguments

Predicate-Predicate Association
• D: total number of documents in the collection
• N: number of documents that p occurs in
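D and N are the usual ingredients of an inverse document frequency. The slide transcript does not include the formula, so the following is a minimal sketch assuming a standard log-ratio form; the function name `idf` and the +1 smoothing in the denominator are assumptions.

```python
import math


def idf(num_docs_total: int, num_docs_with_p: int) -> float:
    """Inverse document frequency of a predicate p.

    num_docs_total  -- D, total number of documents in the collection
    num_docs_with_p -- N, number of documents that p occurs in
    The log form and the +1 smoothing are assumptions for this sketch.
    """
    return math.log(num_docs_total / (1 + num_docs_with_p))
```

Under this form, a predicate that appears in few documents gets a larger weight than one that appears almost everywhere.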

Predicate-Predicate Association
A distance term rewards event pairs that are closer together in the text (in terms of the number of sentences apart), while penalizing event pairs that are further apart.
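The exact distance formula is not in the transcript. One simple function with the described behavior, assuming sentence indices are available for both predicates, is a reciprocal of the sentence gap; `distance_weight` is a hypothetical name and the reciprocal form is an assumption.

```python
def distance_weight(sent_i: int, sent_j: int) -> float:
    """Hypothetical proximity weight for two event predicates.

    Returns 1.0 when both predicates are in the same sentence and decays
    as the number of sentences between them grows. The reciprocal form is
    an assumption, not the formula from the slides.
    """
    return 1.0 / (abs(sent_i - sent_j) + 1)
```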

Predicate-Predicate Association
The term u_i takes into account whether predicates (events) p_i and p_j appear most frequently with each other: u_i is maximized if no other predicate p_k has a higher co-occurrence probability with p_i than p_j does.
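A ratio with the property described for u_i can be sketched as follows. The transcript does not give the formula, so the functional form, the name `exclusivity`, the nested-dictionary layout of `cooc_prob`, and the smoothing constant `eps` are all assumptions.

```python
from typing import Dict


def exclusivity(p_i: str, p_j: str,
                cooc_prob: Dict[str, Dict[str, float]],
                eps: float = 0.01) -> float:
    """Sketch of u_i: how strongly p_i prefers p_j over its other partners.

    cooc_prob[p][q] is an assumed table of co-occurrence probabilities.
    The value is largest when p_j is the most probable partner of p_i,
    because the denominator then shrinks to eps.
    """
    p_ij = cooc_prob[p_i].get(p_j, 0.0)
    best = max(cooc_prob[p_i].values())   # max over all partners p_k
    return p_ij / (best - p_ij + eps)
```

A symmetric score for the pair could take the larger of `exclusivity(p_i, p_j, …)` and `exclusivity(p_j, p_i, …)`, though that choice is also an assumption here.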

Predicate-Argument Association
Pair up the predicates and arguments across events, calculate the PMI for each link, then average them.

Argument-Argument Association
Calculate the PMI for each possible pairing of the arguments (across the two events), then average them.
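The two averaged-PMI scores (predicate-argument and argument-argument) can be sketched from raw counts. The count tables, the function names, and the choice to score unseen pairs as 0.0 are assumptions for illustration; PMI itself is the standard log of the joint probability over the product of the marginals.

```python
import math
from itertools import product
from typing import Dict, FrozenSet, Iterable, Sequence, Tuple

PairCounts = Dict[FrozenSet[str], int]


def pmi(x: str, y: str, pair_count: PairCounts,
        unigram_count: Dict[str, int], total: int) -> float:
    """PMI(x, y) = log(P(x, y) / (P(x) * P(y))), estimated from counts."""
    joint = pair_count.get(frozenset((x, y)), 0)
    if joint == 0:
        return 0.0  # assumption: unseen pairs contribute no association
    return math.log(joint * total / (unigram_count[x] * unigram_count[y]))


def mean_pmi(links: Iterable[Tuple[str, str]], pair_count: PairCounts,
             unigram_count: Dict[str, int], total: int) -> float:
    """Average PMI over a set of links; 0.0 if there are no links."""
    links = list(links)
    if not links:
        return 0.0
    scores = [pmi(x, y, pair_count, unigram_count, total) for x, y in links]
    return sum(scores) / len(scores)


def predicate_argument_score(pred_i: str, args_i: Sequence[str],
                             pred_j: str, args_j: Sequence[str],
                             pair_count: PairCounts,
                             unigram_count: Dict[str, int],
                             total: int) -> float:
    """Link each predicate to the arguments of the other event, then average."""
    links = [(pred_i, a) for a in args_j] + [(pred_j, a) for a in args_i]
    return mean_pmi(links, pair_count, unigram_count, total)


def argument_argument_score(args_i: Sequence[str], args_j: Sequence[str],
                            pair_count: PairCounts,
                            unigram_count: Dict[str, int],
                            total: int) -> float:
    """Average PMI over every cross-event pairing of arguments."""
    return mean_pmi(product(args_i, args_j), pair_count, unigram_count, total)
```

Nominal predicates without arguments yield an empty link set, which this sketch scores as 0.0.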

Cause-Effect Association (CEA)
Example: The police arrested him because he killed someone.
CEA score: predicts whether the two events are causally related.


Discourse and Causality Interaction
[ … e … e … ] connective [ … e … e … ]
Interaction between:
• the discourse relation evoked by the connective c
• the relations (causal? not-causal?) between event pairs ep that cross the two text spans

Penn Discourse Treebank (PDTB) Relations
Discourse relations:
• Comparison: Concession, Contrast, Pragmatic-concession, Pragmatic-contrast
• Contingency: Cause, Condition, Pragmatic-cause, Pragmatic-condition
• Expansion: Alternative, Conjunction, Exception, Instantiation, List, Restatement
• Temporal: Asynchronous, Synchronous

Discourse Relations
Comparison:
• Highlights differences between the situations described in the text spans
• Negative evidence for causality
• Contrast: [According to the survey, x% of Chinese Internet users prefer Google] whereas [y% prefer Baidu].
Contingency:
• The situation described in one text span causally influences the situation in the other
• Positive evidence for causality
• Cause: [The first priority is search and rescue] because [many people are trapped under the rubble].

Discourse Relations
Expansion:
• Provides additional information, illustrates alternative situations, etc.
• Negative evidence for causality, except for Conjunction (which can connect arbitrary text spans)
• Conjunction: [Over the past decade, x women were killed] and [y went missing].
Temporal:
• Temporal precedence of the (cause) event over the (effect) event is a necessary, but not sufficient, condition for causality
• Synchrony: [He was sitting at his home] when [the whole world started to shake].

Discourse and Causality Interaction
[ … e … e … ] connective [ … e … e … ]
For a connective and the event pairs ep = (e_i, e_j) that cross its two text spans:
1. If the connective's relation is Cause or Condition, at least one (crossing) ep is causal.
2. If a (crossing) ep is causal, the connective's relation is one of: Cause, Condition, Temporal, Asynchronous, Synchrony, Conjunction.
3. If the connective's relation is one of Comparison, Concession, Contrast, Pragmatic-concession, Pragmatic-contrast, Expansion, Alternative, Exception, Instantiation, List, Restatement, no (crossing) ep is causal.

Joint Inference: Discourse & Distributional Causality
Objective function terms:
• probability that connective c is predicted with discourse relation dr
• discourse relation indicator variable
• CEA prediction that event pair ep takes on the causal or not-causal label er
• event pair causality indicator variable
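The objective itself is not captured in the transcript. One form consistent with the four annotated terms, written as an integer linear program, is sketched below. The symbol names are assumptions: C is the set of connectives, DR the set of discourse relation labels, EP the set of event pairs, ER = {causal, not-causal}, p_c(dr) the discourse relation probability, s_ep(er) the CEA prediction score, and x, y the binary indicator variables. The original may weight the two sums differently.

```latex
\begin{aligned}
\max_{x,\,y}\quad
  & \sum_{c \in C} \sum_{dr \in DR} p_c(dr)\, x_{c,dr}
    \;+\; \sum_{ep \in EP} \sum_{er \in ER} s_{ep}(er)\, y_{ep,er} \\
\text{s.t.}\quad
  & \sum_{dr \in DR} x_{c,dr} = 1 \quad \forall c \in C, \\
  & \sum_{er \in ER} y_{ep,er} = 1 \quad \forall ep \in EP, \\
  & x_{c,dr} \in \{0,1\}, \qquad y_{ep,er} \in \{0,1\}.
\end{aligned}
```

The equality constraints simply say that each connective receives exactly one discourse relation and each event pair exactly one causality label.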

Constraints
1. If the connective is predicted with a "Cause" or "Condition" discourse relation, then the CEA system should predict that at least one of the (crossing) event pairs is causally related.

Constraints
2. If a (crossing) event pair is predicted by CEA as causally related, then the associated connective should be predicted as having one of the discourse relations "Cause", "Condition", "Temporal", "Asynchronous", "Synchrony", or "Conjunction".

Constraints
3. If the connective is predicted with one of the discourse relations "Comparison", "Concession", "Contrast", "Pragmatic-concession", "Pragmatic-contrast", "Expansion", "Alternative", "Exception", "Instantiation", "List", or "Restatement", then no (crossing) event pair is causally related.
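The three constraints can be checked against a candidate labeling with a few set lookups. This is a sketch of a checker, not of the joint inference itself; the function name `violated_constraints` and the boolean-list encoding of event pair labels are assumptions, while the relation names are taken from the slides.

```python
from typing import List, Sequence

# Relation sets as listed on the constraint slides.
CAUSAL_REQUIRED = {"Cause", "Condition"}
CAUSAL_ALLOWED = {"Cause", "Condition", "Temporal", "Asynchronous",
                  "Synchrony", "Conjunction"}
CAUSAL_FORBIDDEN = {"Comparison", "Concession", "Contrast",
                    "Pragmatic-concession", "Pragmatic-contrast",
                    "Expansion", "Alternative", "Exception",
                    "Instantiation", "List", "Restatement"}


def violated_constraints(relation: str,
                         crossing_pairs_causal: Sequence[bool]) -> List[int]:
    """Return the numbers (1-3) of the constraints a labeling violates.

    relation              -- discourse relation predicted for the connective
    crossing_pairs_causal -- one flag per crossing event pair, True if causal
    """
    any_causal = any(crossing_pairs_causal)
    violations = []
    if relation in CAUSAL_REQUIRED and not any_causal:
        violations.append(1)   # Cause/Condition needs a causal pair
    if any_causal and relation not in CAUSAL_ALLOWED:
        violations.append(2)   # a causal pair restricts the relation
    if relation in CAUSAL_FORBIDDEN and any_causal:
        violations.append(3)   # these relations rule out causal pairs
    return violations
```

For example, a "Cause" connective with no causal crossing pair violates constraint 1, while a "Conjunction" connective is compatible with either outcome.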


Experimental Settings
To collect the distributional statistics for measuring CEA:
• 760K documents in the English Gigaword corpus
25 CNN documents from the first three months of 2010:
• 20 documents for evaluation
• 5 documents for development

Annotation for Causal Event Pairs
Annotation guidelines:
• The Cause event should temporally precede the Effect event; the Effect event occurs because the Cause event occurs

Annotation for Causal Event Pairs
Document: … S_{i-1}, S_i, S_{i+1} …, with links labeled C (causality) or R (relatedness)
Drawing links between event predicates:
• Event arguments are not annotated, but annotators are free to look at the entire document text
• Annotators are not restricted to a fixed sentence window size

Annotation for Causal Event Pairs
Annotators overlap on 10 evaluation documents. Agreement ratio:
• 0.67 for C+R
• 0.58 for C

# relations   Eval   Dev
C             414    71
C+R           492    92

[Figure: Performance on Extracting Causality]

[Figure: Performance on Extracting Causality and Relatedness]

Analysis of CEA Mistakes
50 (randomly selected) false positives (precision errors):
• 56%: CEA assigns a high score to event pairs that are not causal
• 22%: involve events containing pronouns ("he", "it", etc.) as arguments
50 false negatives (recall errors):
• 23%: CEA assigns a low score to causal event pairs
• 19%: involve nominal predicates that are not in our list of event-evoking noun types
• 17%: involve nominal predicates without any argument (less information for CEA)
• 15%: involve events containing pronouns as arguments

Conclusion (Event Causality)
• Developed a minimally supervised approach to identify event causality
• Use distributional scores and discourse relations to jointly identify event causality