INTRODUCTION TO ARTIFICIAL INTELLIGENCE Massimo Poesio Supervised Relation Extraction

RE AS A CLASSIFICATION TASK
– Binary relations
– Entities already manually or automatically recognized
– Examples are generated for all sentences containing at least 2 entities
– Number of examples generated per sentence is C(N,2) = N(N-1)/2, the number of ways of choosing 2 of the N distinct entities
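A minimal sketch of this candidate-generation step, assuming binary relations and already-recognized entity mentions given as character spans (the Mention structure and the example offsets are illustrative, not taken from any particular toolkit):

```python
from itertools import combinations
from typing import List, NamedTuple

class Mention(NamedTuple):
    text: str
    start: int   # character offset of the mention in the sentence
    end: int

def generate_candidates(sentence: str, mentions: List[Mention]):
    """Yield one unordered candidate pair for every 2-combination of the
    N recognized mentions, i.e. C(N, 2) = N*(N-1)/2 pairs per sentence."""
    for e1, e2 in combinations(mentions, 2):
        yield sentence, e1, e2

# Example: 3 recognized entities -> C(3, 2) = 3 candidate pairs to classify
sent = "Rome is the capital of Italy, said the BBC."
mentions = [Mention("Rome", 0, 4), Mention("Italy", 23, 28), Mention("BBC", 39, 42)]
for _, e1, e2 in generate_candidates(sent, mentions):
    print(e1.text, "<->", e2.text)
```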

GENERATING CANDIDATES TO CLASSIFY

RE AS A BINARY CLASSIFICATION TASK

NUMBER OF CANDIDATES TO CLASSIFY – SIMPLE-MINDED VERSION

THE SUPERVISED APPROACH TO RE
Most current approaches to RE are kernel-based. Different kinds of information are used:
– Sequences of words, e.g., through the GLOBAL CONTEXT / LOCAL CONTEXT kernels of Bunescu & Mooney and Giuliano, Lavelli & Romano
– Syntactic information, through the TREE KERNELS of Zelenko et al. and Moschitti et al.
– Semantic information, in recent work

KERNEL METHODS: A REMINDER
– Embed the input data in a feature space
– Use a linear algorithm for discovering non-linear patterns
– The coordinates of the embedded points (the images of the data) are not needed, only their pairwise inner products
– Pairwise inner products can be computed efficiently directly from X using a kernel function K: X × X → R
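As a concrete illustration (not tied to any particular RE system), the sketch below trains a standard SVM on a precomputed Gram matrix: the learner only ever sees pairwise inner products K(x, x′), never explicit coordinates in the feature space, so any valid kernel can be plugged in:

```python
import numpy as np
from sklearn.svm import SVC

def quadratic_kernel(X, Y):
    """K(x, y) = (x . y + 1)^2 -- an inner product in an implicit feature
    space of all monomials of degree <= 2, never computed explicitly."""
    return (X @ Y.T + 1.0) ** 2

# Toy data: XOR-like labels, not linearly separable in the input space
X_train = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y_train = np.array([0, 1, 1, 0])

gram_train = quadratic_kernel(X_train, X_train)   # only pairwise inner products
clf = SVC(kernel="precomputed", C=10.0).fit(gram_train, y_train)

X_test = np.array([[0.9, 0.1]])
gram_test = quadratic_kernel(X_test, X_train)     # K(test, train)
print(clf.predict(gram_test))                     # predicted label for the test point
```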

MODULARITY OF KERNEL METHODS

THE WORD-SEQUENCE APPROACH
Shallow linguistic information:
– tokenization
– lemmatization
– sentence splitting
– PoS tagging
Claudio Giuliano, Alberto Lavelli, and Lorenza Romano (2007), FBK-IRST: Kernel methods for relation extraction, Proc. of SEMEVAL-2007
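A hedged sketch of that shallow preprocessing pipeline using NLTK (Giuliano et al. used their own tools; any equivalent toolkit would do, and the listed NLTK resources have to be downloaded once):

```python
import nltk
from nltk.stem import WordNetLemmatizer

# One-time downloads: sentence splitter, PoS tagger model, WordNet for lemmas
for resource in ["punkt", "averaged_perceptron_tagger", "wordnet"]:
    nltk.download(resource, quiet=True)

text = "Mitochondria are found in eukaryotic cells. They produce ATP."
lemmatizer = WordNetLemmatizer()

for sent in nltk.sent_tokenize(text):             # sentence splitting
    tokens = nltk.word_tokenize(sent)             # tokenization
    tagged = nltk.pos_tag(tokens)                 # PoS tagging
    lemmas = [lemmatizer.lemmatize(tok.lower()) for tok, _ in tagged]  # lemmatization
    print(list(zip(tokens, [tag for _, tag in tagged], lemmas)))
```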

LINGUISTIC REALIZATION OF RELATIONS Bunescu & Mooney, NIPS 2005

WORD-SEQUENCE KERNELS
Two families of “basic” kernels:
– Global Context
– Local Context
Linear combination of kernels
Explicit computation:
– extremely sparse input representation
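The published kernels count n-gram patterns in specific contexts (the words before, between, and after the two entities for the global context kernel; windows around each entity for the local one). The sketch below only illustrates the general recipe: explicit, extremely sparse bag-of-n-gram vectors whose dot products give basic kernels that are then summed. The context strings are made up and this is not the paper's exact feature set:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

def ngram_kernel(contexts_a, contexts_b, ngram_range=(1, 3)):
    """Linear kernel over an explicit (very sparse) bag-of-n-grams encoding."""
    vectorizer = CountVectorizer(ngram_range=ngram_range).fit(contexts_a + contexts_b)
    A = vectorizer.transform(contexts_a)          # scipy sparse matrices
    B = vectorizer.transform(contexts_b)
    return np.asarray((A @ B.T).todense(), dtype=float)

# Two candidate instances, each reduced to a "between" context and a local window
between = ["was born in", "is the president of"]
windows = ["John was born", "Obama is the president"]

# Combined kernel = linear combination (here a plain sum) of the basic kernels
K = ngram_kernel(between, between) + ngram_kernel(windows, windows)
print(K)
```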

THE GLOBAL CONTEXT KERNEL

THE LOCAL CONTEXT KERNEL

LOCAL CONTEXT KERNEL (2)

KERNEL COMBINATION

EXPERIMENTAL RESULTS
Biomedical data sets:
– AIMed
– LLL
Newspaper articles:
– Roth and Yih
SEMEVAL 2007

EVALUATION METHODOLOGIES

EVALUATION (2)

EVALUATION (3)

EVALUATION (4)

RESULTS ON AIMED

OTHER APPROACHES TO RE
– Using syntactic information
– Using lexical features

Syntactic information for RE
Pros:
– more structured information, useful when dealing with long-distance relations
Cons:
– not always robust
– not available for all languages

Zelenko et al. (JMLR 2003)
TREE KERNEL defined over a shallow parse tree representation of the sentences
– approach vulnerable to unrecoverable parsing errors
Data set: 200 news articles (not publicly available)
Two types of relations: person-affiliation and organization-location
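Zelenko et al.'s kernel is a recursive similarity over matching subsequences of children in shallow parse trees. As a hedged illustration of the general tree-kernel idea, here is the simpler Collins & Duffy (2001) subtree-counting recursion, not Zelenko et al.'s exact formulation:

```python
from nltk import Tree

def tree_kernel(t1: Tree, t2: Tree, lam: float = 0.5) -> float:
    """Weighted count of common subtrees (Collins & Duffy 2001);
    lam < 1 downweights the contribution of large subtrees."""
    def common(n1, n2):
        # Zero unless both nodes expand with the same production
        kids1 = [c.label() if isinstance(c, Tree) else c for c in n1]
        kids2 = [c.label() if isinstance(c, Tree) else c for c in n2]
        if n1.label() != n2.label() or kids1 != kids2:
            return 0.0
        if not isinstance(n1[0], Tree):          # pre-terminal over the same word
            return lam
        score = lam
        for c1, c2 in zip(n1, n2):               # internal node: product over children
            score *= 1.0 + common(c1, c2)
        return score

    return sum(common(n1, n2) for n1 in t1.subtrees() for n2 in t2.subtrees())

t1 = Tree.fromstring("(S (NP (NNP John)) (VP (VBD joined) (NP (NNP IBM))))")
t2 = Tree.fromstring("(S (NP (NNP Mary)) (VP (VBD joined) (NP (NNP IBM))))")
print(tree_kernel(t1, t2))
```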

ZELENKO ET AL

CULOTTA & SORENSEN 2004
– generalized version of Zelenko’s kernel based on dependency trees (the smallest dependency tree containing the two entities of the relation)
– a bag-of-words kernel is used to compensate for parsing errors
– data set: ACE 2002 & 2003
– results: syntactic information improves performance w.r.t. bag-of-words (good precision but low recall)
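For intuition about the dependency-tree view, here is a hedged sketch (using spaCy, which is not what the original paper used) that extracts the token path between two entity heads through their lowest common ancestor; the smallest dependency tree containing both entities is the subtree rooted at that ancestor:

```python
import spacy

nlp = spacy.load("en_core_web_sm")   # assumes this small English model is installed

def dependency_path(doc, i, j):
    """Tokens on the path from token i to token j through their lowest
    common ancestor (LCA) in the dependency tree."""
    anc_i = [doc[i]] + list(doc[i].ancestors)    # i, head(i), head(head(i)), ..., root
    anc_j = [doc[j]] + list(doc[j].ancestors)
    anc_j_ids = {t.i for t in anc_j}
    k = next(k for k, t in enumerate(anc_i) if t.i in anc_j_ids)   # index of the LCA
    m = next(m for m, t in enumerate(anc_j) if t.i == anc_i[k].i)
    return anc_i[:k + 1] + list(reversed(anc_j[:m]))

doc = nlp("John Smith, a lawyer from Boston, joined IBM last year.")
path = dependency_path(doc, 1, 9)    # from "Smith" to "IBM"
print([f"{t.text}/{t.dep_}" for t in path])
```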

CULOTTA AND SORENSEN (2)

EVALUATION CAMPAIGNS FOR RE
Much modern evaluation of RE methods is done by competing with other teams in evaluation campaigns such as MUC and ACE.
Modern evaluation campaigns for RE: SEMEVAL (now *SEM)
It is also interesting to look at the problems of:
– DATA CREATION
– EVALUATION METRICS

SEMEVAL 2007: 4th International Workshop on Semantic Evaluations
Task 04: Classification of Semantic Relations between Nominals
– organizers: Roxana Girju, Marti Hearst, Preslav Nakov, Vivi Nastase, Stan Szpakowicz, Peter Turney, Deniz Yuret
– 14 participating teams

SEMEVAL 2007: THE RELATIONS

SEMEVAL 2007: DATASET CREATION

SEMEVAL 2007: DATASET CREATION (2)

SEMEVAL 2007: DATASET CREATION (3)

SEMEVAL 2007: DATASET CREATION (4)

SEMEVAL 2007: DATASET

SEMEVAL 2007: COMPETITION

SEMEVAL 2007: COMPETITION (2)

SEMEVAL 2007: BEST RESULTS

INFLUENCE OF NER ON RE

INFLUENCE OF NER ON RE (2)

GENERATING CANDIDATES

ACKNOWLEDGMENTS
Many slides borrowed from:
– Roxana Girju
– Alberto Lavelli