Relation Extraction (RE) via Supervised Classification. See: Jurafsky & Martin, SLP book, Chapter 22; "Exploring Various Knowledge in Relation Extraction".

Presentation transcript:

Relation Extraction (RE) via Supervised Classification
See: Jurafsky & Martin, SLP book, Chapter 22
"Exploring Various Knowledge in Relation Extraction", Zhou GuoDong, Su Jian, Zhang Jie, Zhang Min, ACL

Relations between Entities
Classification instance: an (ordered) pair of entities
– Typically within a sentence
– Arguments are not always entities; they can be common noun phrases (e.g. for attack)
  This requires segmentation (IOB, as in NER)
May target a single relation or multiple relations
Annotated training data for relation instances
– Relation type, argument spans and their roles
– Negative examples may be all entity pairs that are not annotated as having a relation
A restricted case of Information Extraction (IE)
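A minimal sketch of how classification instances could be generated from one annotated sentence: every ordered pair of entity mentions becomes an instance, and pairs without an annotation are treated as negatives. The function name `build_instances`, the `NO_RELATION` label and the `PART-OF` annotation are illustrative assumptions, not part of the slides.

```python
from itertools import permutations

NO_RELATION = "NO_RELATION"  # assumed name for the negative class


def build_instances(entities, gold_relations):
    """Turn one sentence into classification instances.

    entities: entity mentions found in the sentence.
    gold_relations: dict mapping an ordered pair (arg1, arg2) to a relation type.
    Returns (arg1, arg2, label) for every ordered pair of distinct mentions.
    """
    instances = []
    for arg1, arg2 in permutations(entities, 2):
        # Pairs that were not annotated become negative examples.
        label = gold_relations.get((arg1, arg2), NO_RELATION)
        instances.append((arg1, arg2, label))
    return instances


# Hypothetical annotation, for illustration only.
entities = ["American Airlines", "AMR", "Tim Wagner"]
gold = {("American Airlines", "AMR"): "PART-OF"}
instances = build_instances(entities, gold)
negatives = [inst for inst in instances if inst[2] == NO_RELATION]
```

Because pairs are ordered, three mentions yield six instances, and the negatives typically far outnumber the positives.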

Classification Architectures
Binary classifier for each relation, one-versus-all
– Highest classification score wins (or ranking of positives)
– All classifications negative implies no relation
Multi-class classifier, with no-relation as a class
Two-tier classification:
– Is there a relation? (binary)
– Relation type: multi-class, possibly one-vs-all (the highest negative score may win)
Argument role may be distinguished by its NER type (e.g. employee-of), or by directional features
May classify each participant to its role
– Usually done in template-filling IE
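The decision rules of the one-versus-all and two-tier architectures can be sketched as follows. The classifier scores are assumed to be signed margins with a threshold at 0; the function names and the example scores are hypothetical.

```python
NO_RELATION = "NO_RELATION"  # assumed name for the negative outcome


def one_vs_all_decide(scores, threshold=0.0):
    """scores: dict relation -> score of that relation's binary classifier.

    The highest positive score wins; if every classifier is negative,
    the pair is labeled as having no relation.
    """
    positives = {rel: s for rel, s in scores.items() if s > threshold}
    if not positives:
        return NO_RELATION
    return max(positives, key=positives.get)


def two_tier_decide(has_relation_score, type_scores, threshold=0.0):
    """Tier 1 decides whether any relation holds; tier 2 picks its type.

    Once tier 1 fires, the best type is returned even if all type scores
    are negative (the highest negative score wins).
    """
    if has_relation_score <= threshold:
        return NO_RELATION
    return max(type_scores, key=type_scores.get)


# Made-up scores for one candidate pair.
scores = {"EMPLOYEE-OF": -0.3, "PART-OF": -0.1}
flat = one_vs_all_decide(scores)
tiered = two_tier_decide(0.7, scores)
```

With these scores the one-versus-all rule rejects the pair, while the two-tier rule commits to the least negative type because the first tier already decided that some relation holds.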

Features (based on James Martin – 4 slides)
Source: Speech and Language Processing - Jurafsky and Martin
We can group the RE features into three categories
– Features of the named entities/arguments involved
– Features derived from the words between and around the named entities
– Features derived from the syntactic environment that governs the two entities

Features
Features of the entities
– Their types
  Concatenation of the types
– Headwords of the entities
  George Washington Bridge
– Words in the entities
Notice: arguments aren’t only named entities; they can be (common) noun phrases
Features between and around
– Particular positions to the left and right of the entities
  +/- 1, 2, 3
  Bag of words / n-grams between
– Words related to the predicate words, e.g. WordNet synonyms

Features
Syntactic environment
– Constituent path through the tree from one to the other
– Base syntactic chunk sequence from one to the other
– Dependency path
– Indicators of certain edges/labels along the path
  E.g. appositive
– Tree-distance between arguments
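As a rough sketch of the dependency path and tree-distance features, the path between two tokens can be found by climbing from each token to the root and joining the two climbs at their lowest common ancestor. The sentence and its head indices below are hand-written assumptions, not parser output.

```python
def path_to_root(i, heads):
    """Token indices from token i up to the root (whose head is -1)."""
    path = [i]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def dependency_path(a, b, heads):
    """Token indices on the tree path from token a to token b."""
    up = path_to_root(a, heads)
    down = path_to_root(b, heads)
    ancestors_of_b = set(down)
    # Lowest common ancestor: first node above a that also dominates b.
    lca = next(n for n in up if n in ancestors_of_b)
    up_part = up[: up.index(lca) + 1]
    down_part = down[: down.index(lca)]
    return up_part + list(reversed(down_part))


# Hypothetical parse: "works" is the root, "American" attaches to "works",
# and "for" attaches to "American".
tokens = ["Wagner", "works", "for", "American"]
heads = [1, -1, 3, 1]

path = dependency_path(0, 3, heads)
path_words = [tokens[i] for i in path]
tree_distance = len(path) - 1  # number of edges between the arguments
```

Edge labels along the path (for example an appositive edge) could be added as indicator features in the same way if the parse also supplies a label per token.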

Example
For the following example, we’re interested in the possible relation between American Airlines and Tim Wagner.
– American Airlines, a unit of AMR, immediately matched the move, spokesman Tim Wagner said.
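A sketch of how the entity features and the between/around word features might be computed for this sentence. The tokenization, the entity types `ORG` and `PER`, the feature-name scheme, and the shortcut of taking the last token of a span as its headword are all assumptions made for illustration.

```python
def extract_features(tokens, arg1, arg2, types, window=2):
    """Binary features for an ordered argument pair.

    arg1, arg2: (start, end) token spans, end exclusive, arg1 before arg2.
    types: (type of arg1, type of arg2).
    """
    (s1, e1), (s2, e2) = arg1, arg2
    feats = set()
    # Entity types and their concatenation.
    feats.add("type1=" + types[0])
    feats.add("type2=" + types[1])
    feats.add("types=" + types[0] + "_" + types[1])
    # Headword, approximated here as the last token of the span.
    feats.add("head1=" + tokens[e1 - 1])
    feats.add("head2=" + tokens[e2 - 1])
    # Words inside each argument.
    feats.update("in1=" + w for w in tokens[s1:e1])
    feats.update("in2=" + w for w in tokens[s2:e2])
    # Bag of words between the arguments.
    feats.update("between=" + w for w in tokens[e1:s2])
    # Fixed positions left of arg1 and right of arg2.
    for k in range(1, window + 1):
        if s1 - k >= 0:
            feats.add(f"left1_{k}=" + tokens[s1 - k])
        if e2 + k - 1 < len(tokens):
            feats.add(f"right2_{k}=" + tokens[e2 + k - 1])
    return feats


tokens = ("American Airlines , a unit of AMR , immediately matched "
          "the move , spokesman Tim Wagner said .").split()
feats = extract_features(tokens, arg1=(0, 2), arg2=(14, 16), types=("ORG", "PER"))
```

For this pair, "spokesman" in the between-words bag is the kind of feature that would be expected to signal an employment relation; no left-context features fire because the first argument starts the sentence.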

Tuning and Analysis
Look at the data
Examine feature weights
– most positive/negative
Analyze classification errors
– False positives, false negatives
Try alternative feature selection policies
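The two inspection steps can be sketched as below, assuming a linear model whose learned weights are available as a feature-to-weight mapping. The weights, labels and predictions are made up for illustration.

```python
def most_informative(weights, k=2):
    """Return the k most positive and the k most negative features."""
    ranked = sorted(weights.items(), key=lambda kv: kv[1])
    most_positive = ranked[-k:][::-1]
    most_negative = ranked[:k]
    return most_positive, most_negative


def split_errors(gold, predicted, negative="NO_RELATION"):
    """Indices of false positives and false negatives for relation detection.

    Wrong-type errors (both labels are relations, but different ones)
    are not counted in either list.
    """
    pairs = list(enumerate(zip(gold, predicted)))
    false_pos = [i for i, (g, p) in pairs if g == negative and p != negative]
    false_neg = [i for i, (g, p) in pairs if g != negative and p == negative]
    return false_pos, false_neg


# Hypothetical weights for one relation's classifier.
weights = {
    "between=spokesman": 1.8,
    "types=ORG_PER": 0.9,
    "between=said": -0.4,
    "types=PER_PER": -1.2,
}
top, bottom = most_informative(weights)

gold = ["EMPLOYEE-OF", "NO_RELATION", "NO_RELATION", "PART-OF"]
pred = ["EMPLOYEE-OF", "PART-OF", "NO_RELATION", "NO_RELATION"]
false_pos, false_neg = split_errors(gold, pred)
```

Reading the instances behind `false_pos` and `false_neg` by hand is usually what suggests which features to add or drop.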

What about lexical variability?
Relevant for both relation and argument words
Without external resources, variability needs to be covered in the training data
External lexical similarity resources, manual and/or statistical, may be used for “lexical expansion”, but it is not trivial to gain substantial benefit from them in a supervised setting
– DIRT-style rules may be useful for relation variability; there has been work in this direction in the IE field

Template/Event Information Extraction
Goal: extract complete templates with slots, often about events
– attack, acquisition, conviction, …
Extending the supervised RE scheme
Possible architecture
– Classifier for event trigger
– Classifier for each slot
– Possibly joint classification rather than a pipeline
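The pipeline variant of this architecture could look roughly like the sketch below: a trigger classifier runs first, then one classifier per slot scores the candidate arguments. The lexicon lookup and the position-based slot scorers are trivial stand-ins for trained classifiers, and the sentence, slot names and function names are hypothetical.

```python
def find_trigger(tokens, trigger_clf):
    """Return (index, event type) of the first token classified as a trigger."""
    for i, tok in enumerate(tokens):
        event_type = trigger_clf(tok)
        if event_type is not None:
            return i, event_type
    return None


def fill_template(tokens, candidates, trigger_clf, slot_clfs):
    """Pipeline: detect the trigger, then fill each slot independently."""
    found = find_trigger(tokens, trigger_clf)
    if found is None:
        return None
    trigger_idx, event_type = found
    template = {"type": event_type}
    for slot, score in slot_clfs.items():
        # Take the best-scoring candidate; leave the slot empty if none is positive.
        best = max(candidates, key=lambda c: score(c, trigger_idx))
        template[slot] = tokens[best] if score(best, trigger_idx) > 0 else None
    return template


# Stand-in "classifiers" for illustration only.
trigger_lexicon = {"acquired": "acquisition"}
slot_clfs = {
    "buyer": lambda cand, trig: 1.0 if cand < trig else -1.0,
    "acquired": lambda cand, trig: 1.0 if cand > trig else -1.0,
}

tokens = ["Acme", "acquired", "Widgets"]
template = fill_template(tokens, [0, 2], trigger_lexicon.get, slot_clfs)
```

A joint model would instead score the trigger and all slot assignments together, so that an error in trigger detection does not automatically propagate to every slot.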