Jan 4th 2013 Event Extraction Using Distant Supervision Kevin Reschke.

Presentation transcript:

Event Extraction: News Corpus to Knowledge Base. Example text: “… Delta Flight 14 crashed in Mississippi killing 40 …”

Event Extraction
1) Generate Candidates: Run Named Entity Recognition on relevant docs. Example: Flight 14 crashed in Mississippi.
2) Classify Mentions: Features: (Unigram:Mississippi) (NEType:Location) (PrevWord:in) (ObjectOf:crashed). Label: CrashSite
3) Aggregate Labels: Final Label: CrashSite
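The first two steps above can be sketched in a few lines of Python. This is only an illustration, not the presenter's implementation: the named-entity tags are assigned by hand (a real system would run an NER tagger), the helper names `generate_candidates` and `mention_features` are hypothetical, and the ObjectOf feature is left out because it would require a dependency parse.

```python
from typing import Dict, List, Tuple

# A token is (word, named-entity type); "O" marks a non-entity token.
# The tags here are hand-assigned assumptions, standing in for NER output.
Token = Tuple[str, str]


def generate_candidates(tokens: List[Token]) -> List[int]:
    """Step 1: every token tagged as an entity is a candidate mention."""
    return [i for i, (_, ne_type) in enumerate(tokens) if ne_type != "O"]


def mention_features(tokens: List[Token], i: int) -> Dict[str, str]:
    """Step 2 input: surface features for the candidate at position i."""
    word, ne_type = tokens[i]
    prev_word = tokens[i - 1][0] if i > 0 else "<START>"
    return {"Unigram": word, "NEType": ne_type, "PrevWord": prev_word}


sentence = [
    ("Flight", "O"), ("14", "Number"), ("crashed", "O"),
    ("in", "O"), ("Mississippi", "Location"), (".", "O"),
]
candidates = generate_candidates(sentence)
features = [mention_features(sentence, i) for i in candidates]
```

Each feature dictionary would then be passed to the mention classifier, which assigns a slot label such as CrashSite or NIL.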

Training a Mention Classifier: Need labeled training data. Problems: expensive; does not scale. Example: One year after [USAir] Operator [Flight 11] FlightNumber crashed in [Toronto] CrashSite, families of the [200] Fatalities victims attended a memorial service in [Vancouver] NIL.

Distant Supervision. Solution: Use known events (a training knowledge base) to automatically label training data. Example: One year after [USAir] Operator [Flight 11] FlightNumber crashed in [Toronto] CrashSite, families of the [200] Fatalities victims attended a memorial service in [Vancouver] NIL.

Distant Supervision (High Level): Begin with a set of known facts. Use this set to automatically label training instances from the corpus. Train and classify (handling noise).

Distant Supervision for Relation Extraction: Slot filling for named entity relations. Mintz et al (ACL); Surdeanu et al (TAC-KBP). Example: Company:,,,, etc. Known relations: founder_of(Steve Jobs, Apple). Noisy Labeling Rule: Slot value and entity name must be in the same sentence. 1. (+) Apple co-founder Steve Jobs passed away yesterday. 2. (-) Steve Jobs delivered the Stanford commencement address. 3. (+) Steve Jobs was fired from Apple in
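A minimal sketch of the sentence-level labeling rule, applied to the three example sentences. It assumes plain substring matching stands in for entity matching, and the function name `label_sentence` is hypothetical. The third sentence is cut short in the transcript, so it is closed with a period here.

```python
from typing import List


def label_sentence(sentence: str, entity: str, slot_value: str) -> bool:
    """Noisy rule: positive if entity and slot value share the sentence."""
    return entity in sentence and slot_value in sentence


sentences: List[str] = [
    "Apple co-founder Steve Jobs passed away yesterday.",
    "Steve Jobs delivered the Stanford commencement address.",
    "Steve Jobs was fired from Apple.",
]

# Known relation: founder_of(Steve Jobs, Apple)
labels = [label_sentence(s, "Apple", "Steve Jobs") for s in sentences]
print(labels)  # [True, False, True]
```

The third sentence is labeled positive although it arguably says nothing about founding, which is the kind of noise the rule lets through.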

Distant Supervision for Event Extraction: The sentence-level labeling rule doesn’t work. 1. Events lack proper names: “The crash of USAir Flight 11”. 2. Slot values occur separately from names: The plane went down in central Texas. 10 died and 30 were injured in yesterday’s tragic incident.

Automatic Labeling: Event Extraction. Solution: Document-Level Noisy Labeling Rule. Heuristic: Use Flight Number as proxy for event name. Labeling Rule: Slot value and Flight Number must appear in the same document. Training Fact: {, } …Flight 11 crash Sunday… …The plane went down in [Toronto] CrashSite …
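The document-level rule can be sketched as follows. This is an assumption-laden illustration: candidate mentions are given as a plain list (in practice they would come from NER), matching is by substring, the training fact is pieced together from the Flight 11 example, and `label_document` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def label_document(
    doc: str,
    candidates: List[str],
    flight_number: str,
    known_slots: Dict[str, str],
) -> List[Tuple[str, str]]:
    """Label candidates in doc if the flight number appears anywhere in it."""
    if flight_number not in doc:
        return []  # rule does not fire: no training instances from this doc
    value_to_slot = {value: slot for slot, value in known_slots.items()}
    # Candidates matching a known slot value get that slot; the rest are NIL.
    return [(c, value_to_slot.get(c, "NIL")) for c in candidates if c in doc]


fact = {"Operator": "USAir", "CrashSite": "Toronto"}  # illustrative fact
doc = (
    "Flight 11 crash Sunday. The plane went down in Toronto. "
    "A memorial service was held in Vancouver."
)
labeled = label_document(doc, ["Toronto", "Vancouver"], "Flight 11", fact)
unrelated = label_document(
    "Toronto hosted a film festival.", ["Toronto"], "Flight 11", fact
)
print(labeled)    # [('Toronto', 'CrashSite'), ('Vancouver', 'NIL')]
print(unrelated)  # []
```

Because the flight number only has to appear somewhere in the document, "Toronto" is labeled even though its sentence never names the flight.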

Evaluation: 80 plane crashes from Wikipedia infoboxes. Training set: 32; Dev set: 8; Test set: 40. Corpus: Newswire data from 1989 – present.

Automatic Labeling: 38,000 training instances; 39% noise. Examples: Good: At least 52 people survived the crash of the Boeing 737. Bad: First envisioned in 1964, the Boeing 737 entered service in 1968.

Extraction Models. Local Model: Train and classify each mention independently. Pipeline Model: Classify sequentially; use previous label as feature. Captures dependencies between labels. E.g., Passengers and Crew go together: “4 crew and 200 passengers were on board.” Joint Model: Searn Algorithm (Daumé III et al., 2009). Jointly models all mentions in a sentence.
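To make the pipeline model concrete, here is a rough sketch of sequential classification in which each mention's features are extended with the label predicted for the previous mention. The classifier is a hand-written toy (`toy_classifier`) standing in for a trained model, and the `NextWord` and `PrevLabel` feature names are assumptions for the example.

```python
from typing import Callable, Dict, List

Features = Dict[str, str]


def classify_pipeline(
    mentions: List[Features], classify: Callable[[Features], str]
) -> List[str]:
    """Classify mentions left to right, feeding each label to the next step."""
    labels: List[str] = []
    prev_label = "<START>"
    for feats in mentions:
        augmented = dict(feats, PrevLabel=prev_label)  # add previous label
        prev_label = classify(augmented)
        labels.append(prev_label)
    return labels


def toy_classifier(feats: Features) -> str:
    """Hypothetical stand-in for a trained mention classifier."""
    if feats["NextWord"] == "crew":
        return "Crew"
    if feats["NextWord"] == "passengers" and feats["PrevLabel"] == "Crew":
        return "Passengers"
    return "NIL"


# Mentions from "4 crew and 200 passengers were on board."
mentions = [
    {"Unigram": "4", "NextWord": "crew"},
    {"Unigram": "200", "NextWord": "passengers"},
]
labels = classify_pipeline(mentions, toy_classifier)
print(labels)  # ['Crew', 'Passengers']
```

A local model would call the classifier on each mention without the `PrevLabel` feature, so it could not use the fact that a Crew label makes a following Passengers label more likely.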

Results

Label Aggregation: Exhaustive Aggregation. [Figure: four mentions of the value “Four”]

Label Aggregation: Noisy-OR. Key idea: Classifier gives us a distribution over labels for each mention (example value: Stockholm). Compute Noisy-OR for each label. If Noisy-OR > threshold, use label.
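Noisy-OR aggregation can be sketched as below, using the standard formula 1 − ∏(1 − p_i) over the per-mention probabilities of a label. The probabilities, the 0.5 threshold, and the choice to skip NIL are assumptions made for the example, not values from the talk. `math.prod` requires Python 3.8 or later.

```python
import math
from typing import Dict, List


def noisy_or(
    mention_dists: List[Dict[str, float]], threshold: float
) -> Dict[str, float]:
    """Combine per-mention label distributions for one candidate value."""
    # Assumption: NIL is not aggregated, only real slot labels.
    labels = {l for dist in mention_dists for l in dist if l != "NIL"}
    kept: Dict[str, float] = {}
    for label in labels:
        # Probability that no mention expresses this label.
        p_none = math.prod(1.0 - dist.get(label, 0.0) for dist in mention_dists)
        score = 1.0 - p_none
        if score > threshold:
            kept[label] = score
    return kept


# Two mentions of "Stockholm" with made-up classifier distributions.
stockholm = [
    {"CrashSite": 0.5, "NIL": 0.5},
    {"CrashSite": 0.4, "Operator": 0.1, "NIL": 0.5},
]
final = noisy_or(stockholm, threshold=0.5)
```

Here CrashSite scores about 0.7 (1 − 0.5 × 0.6) and is kept, while Operator scores about 0.1 and is dropped. Neither mention alone exceeds the threshold for CrashSite, but their combined evidence does.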

Results: Noisy-OR

Next Step: Compare Distant Supervision with a state-of-the-art supervised approach (Huang & Riloff, ACL-2011). MUC-4 Shared Task: Terrorist Attacks. Slot Template:,,,, Distant Supervision Source: rorist_incidents Short summaries of several hundred terrorist attacks.