1
Liberal Event Extraction and Event Schema Induction
Lifu Huang1, Taylor Cassidy2, Xiaocheng Feng3, Heng Ji1, Clare R. Voss2, Jiawei Han4, Avirup Sil5 1 Rensselaer Polytechnic Institute, 2 US Army Research Lab, 3 Harbin Institute of Technology, 4 University of Illinois at Urbana-Champaign, 5 IBM T.J. Watson Research Center
2
Task Definition
Event Mention: a string of words in text denoting a particular event;
Event Trigger: the word in the event mention that holds the bulk of its semantic content;
Event Argument: a concept that serves as a participant or attribute with a specific role in an event mention;
Argument Role: the function or purpose of the argument with respect to the corresponding event.
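As an illustration of these definitions, here is a minimal Python sketch (hypothetical class and field names, not from the paper) of how a mention, its trigger, and its role-labeled arguments could be represented:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EventArgument:
    text: str   # the argument concept as it appears in text, e.g., "cameraman"
    role: str   # its function with respect to the event, e.g., "victim"


@dataclass
class EventMention:
    sentence: str                  # the string of words denoting the event
    trigger: str                   # the word carrying the bulk of the semantic content
    event_type: str                # the (induced) event type
    arguments: List[EventArgument] = field(default_factory=list)


# Illustrative mention (field values are examples, not system output):
mention = EventMention(
    sentence="A cameraman died when an American tank fired on the Palestine Hotel.",
    trigger="died",
    event_type="die",
    arguments=[EventArgument("cameraman", "victim")],
)
print(mention.trigger, [(a.text, a.role) for a in mention.arguments])
```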
3
Comparison: Traditional Event Extraction vs. Liberal Event Extraction
4
Motivating Examples - Top 8 most similar words based on lexical embedding
5
Hypothesis 1: Event triggers that occur in similar contexts and share the same sense tend to have similar types. Apply WSD (word sense disambiguation) to learn a distinct embedding for each sense.
6
Motivating Examples
7
Hypothesis 2: Beyond the lexical semantics of a particular event trigger, its type also depends on its arguments and their roles, as well as on other words contextually connected to the trigger: the event structure.
8
Approach Overview (framework diagram incorporating Hypothesis 1 and Hypothesis 2)
9
Identification
Trigger Identification:
- AMR parsing: map concepts to OntoNotes senses;
- FrameNet: noun triggers such as war, theft, pickpocket;
Argument Identification:
- AMR parsing: most arguments can be captured by rich semantic parsing.
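For illustration, a rough Python sketch of this identification step over a toy AMR graph; the triple encoding, the core-role list, and the sense-suffix heuristic for verb-like concepts are simplifying assumptions, not the authors' exact rules:

```python
# AMR-style triples for: "2 militants were killed in a gun battle."
# (hand-written toy example; a real system would use an AMR parser)
triples = [
    ("k", "instance", "kill-01"),
    ("k", ":ARG1", "m"),
    ("m", "instance", "militant"),
    ("m", ":quant", "2"),
    ("k", ":location", "b"),
    ("b", "instance", "battle-01"),
    ("b", ":instrument", "g"),
    ("g", "instance", "gun"),
]

# Assumed set of "core" AMR roles kept as candidate argument relations.
CORE_ROLES = {":ARG0", ":ARG1", ":ARG2", ":ARG3", ":ARG4",
              ":location", ":instrument", ":time"}

concepts = {src: tgt for src, rel, tgt in triples if rel == "instance"}

# Heuristic: concepts with an OntoNotes-style sense suffix (e.g., kill-01)
# are verb-like and become candidate triggers.
triggers = {var for var, c in concepts.items() if c.split("-")[-1].isdigit()}

# Candidate arguments: concepts connected to a trigger by a core AMR role.
arguments = [(concepts[src], rel, concepts[tgt])
             for src, rel, tgt in triples
             if src in triggers and rel in CORE_ROLES and tgt in concepts]

print(arguments)
# [('kill-01', ':ARG1', 'militant'), ('kill-01', ':location', 'battle-01'),
#  ('battle-01', ':instrument', 'gun')]
```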
11
Representation
WSD-based Lexical Embedding:
Preprocess: IMS (Zhong and Ng, 2010) for word sense disambiguation; Skip-gram Word2Vec (Mikolov et al., 2013);
Top 10 most similar words (first 8 shown):
Fire-1: firing (0.829), cannon (0.774), grenades (0.767), grenade (0.760), gun (0.757), arm (0.755), explosive (0.742), point-blank (0.740)
Fire-2: rehired (0.790), hire-1 (0.626), resign-1 (0.618), rehire (0.596), sacked (0.591), quit-1 (0.565), sack-1 (0.563), quits
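A minimal sketch of this embedding step, assuming the corpus has already been sense-tagged (e.g., by IMS) so each ambiguous token carries a sense suffix; it uses gensim's skip-gram Word2Vec on a toy corpus:

```python
from gensim.models import Word2Vec  # pip install gensim

# Toy sense-tagged sentences (real input: a large WSD-processed corpus).
sense_tagged_corpus = [
    ["the", "soldiers", "fire-1", "the", "cannon"],
    ["militants", "fire-1", "grenades", "at", "the", "convoy"],
    ["the", "company", "will", "fire-2", "the", "manager"],
    ["they", "sacked", "and", "rehired", "the", "staff"],
]

# Skip-gram Word2Vec (sg=1), as in Mikolov et al. (2013).
model = Word2Vec(sentences=sense_tagged_corpus, vector_size=100,
                 window=5, sg=1, min_count=1, epochs=50)

# Each sense gets its own embedding; nearest neighbours differ by sense.
print(model.wv.most_similar("fire-1", topn=8))
print(model.wv.most_similar("fire-2", topn=8))
```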
12
Representation
Event Structure Representation: Recursive Neural Tensor Autoencoder
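For intuition, a minimal numpy sketch of one neural-tensor composition step with a reconstruction objective, the building block behind such an autoencoder; the dimensions, initialisation, and loss are illustrative assumptions rather than the authors' exact model:

```python
import numpy as np

d = 8                                       # embedding dimension (assumed)
rng = np.random.default_rng(0)

T = rng.normal(scale=0.1, size=(d, d, d))   # tensor: one d x d slice per output dim
W = rng.normal(scale=0.1, size=(d, 2 * d))  # standard composition matrix
b = np.zeros(d)
W_dec = rng.normal(scale=0.1, size=(2 * d, d))  # decoder for the autoencoder


def compose(x1, x2):
    """Combine two child embeddings (e.g., trigger and argument) into one parent."""
    bilinear = np.einsum("i,kij,j->k", x1, T, x2)   # x1^T T[k] x2 for each slice k
    return np.tanh(bilinear + W @ np.concatenate([x1, x2]) + b)


def reconstruction_error(x1, x2):
    """Autoencoder objective: decode the parent back into its two children."""
    p = compose(x1, x2)
    x1_hat, x2_hat = np.split(W_dec @ p, 2)
    return np.sum((x1 - x1_hat) ** 2) + np.sum((x2 - x2_hat) ** 2)


trigger_vec, argument_vec = rng.normal(size=d), rng.normal(size=d)
print(reconstruction_error(trigger_vec, argument_vec))
```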
13
Joint Constraint Clustering
Jointly optimized trigger and argument clusters: Spectral Clustering;
Internal measures: cohesion (D_intra) and separation (D_inter).
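A rough sketch of this step using scikit-learn's spectral clustering and simple cohesion/separation measures over stand-in trigger vectors; the paper's exact D_intra/D_inter definitions and joint trigger-argument constraints are more involved than this:

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics import pairwise_distances

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 16))          # stand-in for trigger representations


def cohesion_separation(X, labels):
    D = pairwise_distances(X)
    same = labels[:, None] == labels[None, :]
    off_diag = ~np.eye(len(X), dtype=bool)
    d_intra = D[same & off_diag].mean()     # cohesion: within-cluster distance
    d_inter = D[~same].mean()               # separation: between-cluster distance
    return d_intra, d_inter


# Pick the number of clusters that maximises separation relative to cohesion.
best_k, best_score = None, -np.inf
for k in range(2, 8):
    labels = SpectralClustering(n_clusters=k, random_state=0).fit_predict(X)
    d_intra, d_inter = cohesion_separation(X, labels)
    score = d_inter - d_intra
    if score > best_score:
        best_k, best_score = k, score

print("selected number of clusters:", best_k)
```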
14
Naming
Trigger Type Naming: the trigger that is nearest to the cluster centroid;
Argument Role Naming: borrow from existing linguistic resources: map AMR roles to the roles in FrameNet, VerbNet, and PropBank.
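A minimal sketch of the trigger type naming rule (cosine similarity to the cluster centroid is an assumed choice of distance measure):

```python
import numpy as np


def name_cluster(trigger_words, trigger_vecs):
    """trigger_words: list of strings; trigger_vecs: matrix of their embeddings."""
    centroid = trigger_vecs.mean(axis=0)
    # cosine similarity of every trigger to the centroid
    sims = trigger_vecs @ centroid / (
        np.linalg.norm(trigger_vecs, axis=1) * np.linalg.norm(centroid))
    return trigger_words[int(np.argmax(sims))]


rng = np.random.default_rng(0)
words = ["attack", "bomb", "fire", "strike"]
vecs = rng.normal(size=(4, 8))          # stand-in for the learned embeddings
print("cluster name:", name_cluster(words, vecs))
```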
16
Experiments and Evaluation
Data Sets: ACE (Automatic Content Extraction) and ERE (Entities, Relations, and Events).
17
Experiments and Evaluation
Schema Discovery: Event Types and Arguments:
20
Experiments and Evaluation
Schema Discovery: Coverage Comparison with ACE and ERE
New event types and arguments. e.g., arguments for an Attack event: Attacker, Target, Instrument, Time, Place.
Example: "The Dutch government, facing strong public anti-war pressure, said it would not commit fighting forces to the war against Iraq but added it supported the military campaign to disarm Saddam." -> Attack event with a new argument role: Purpose.
23
Experiments and Evaluation
Event Extraction for All Types: fully annotated 100 sentences; inter-annotator agreement: 83% for triggers and 79% for arguments.
Missing Triggers and Arguments:
Multi-word expressions (e.g., took office) or concepts that are not verbs or nouns, e.g.: "As well as previously holding senior positions at Barclays Bank, BZW and Kleinwort Benson, McCarthy was formerly a top civil servant at the Department of Trade and Industry."
25
Experiments and Evaluation
Event Extraction for All Types: fully annotated 100 sentences; inter-annotator agreement: 83% for triggers and 79% for arguments.
Missing Triggers and Arguments:
Multi-word expressions (e.g., took office) or concepts that are not verbs or nouns;
Arguments that are not directly semantically related to event triggers, e.g.: "Anti-corruption judge Saul Pena stated Montesinos has admitted to the abuse of authority charge."
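For reference, a minimal sketch of simple percentage agreement over trigger spans between two annotators (exact span match is an assumption; the agreement measure behind the numbers above may differ):

```python
def percent_agreement(annotator_a, annotator_b):
    """Each argument: list of sets of trigger spans, one set per sentence."""
    agree = total = 0
    for spans_a, spans_b in zip(annotator_a, annotator_b):
        agree += len(spans_a & spans_b)       # spans both annotators marked
        total += len(spans_a | spans_b)       # spans either annotator marked
    return agree / total if total else 1.0


# Toy spans given as (start, end) token offsets for three sentences.
a = [{(3, 4)}, {(0, 1), (7, 8)}, {(2, 3)}]
b = [{(3, 4)}, {(0, 1), (7, 8)}, set()]
print(f"{percent_agreement(a, b):.0%}")       # 75%
```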
26
Experiments and Evaluation
Impact of Representations: AMR vs. Dependency Parsing
AMR captures "gun" as an Instrument, while dependency parsing only treats it as a compound modifier.
e.g., "Approximately 25 kilometers southwest of Srinagar, 2 militants were killed in a second gun battle."
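To see this contrast directly, a quick sketch using spaCy's dependency parser (assuming spaCy and its en_core_web_sm model are installed); the expected labels are indicative, not guaranteed:

```python
import spacy

# pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
sent = ("Approximately 25 kilometers southwest of Srinagar, "
        "2 militants were killed in a second gun battle.")

# Print the dependency relation and head for the words of interest.
for tok in nlp(sent):
    if tok.text in {"gun", "battle", "killed"}:
        print(tok.text, tok.dep_, "->", tok.head.text)

# Expected (roughly): "gun" is a compound modifier of "battle", so the weapon
# is not exposed as a semantic role, unlike AMR's (:instrument gun) analysis.
```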
27
Experiments and Evaluation
Event Extraction on ACE and ERE Types: manually map extracted triggers to ACE/ERE types;
Comparison with supervised methods:
DMCNN: a dynamic multi-pooling convolutional neural network based on distributed word representations (Chen et al., 2015);
Joint IE: a structured perceptron model based on symbolic semantic features (Li et al., 2013);
LSTM: a long short-term memory neural network (Hochreiter and Schmidhuber, 1997) based on distributed semantic features.
29
Experiments and Evaluation
Event Extraction on ACE and ERE Types: manually map extracted triggers to ACE/ERE types;
Comparison with supervised methods: supervised methods heavily rely on the quality and quantity of the training data: ERE training documents contain 1,068 events and 2,448 arguments, while ACE training documents contain more than 4,700 events and 9,700 arguments.
30
Experiments and Evaluation
Event Extraction on a Biomedical Data Set: 14 biomedical articles (755 sentences) with perfect AMR annotations;
Randomly sampled 100 sentences and manually assessed the correctness of each event and argument;
83.1% precision on trigger labeling (619 events); 78.4% precision on argument labeling (1,124 arguments).
32
Questions and Comments? Thanks!