Joint Entity and Relation Extraction using Card-Pyramid Parsing
Rohit J. Kate and Raymond J. Mooney
Department of Computer Science, The University of Texas at Austin, USA

2 Entity and Relation Extraction
Information Extraction is the task of extracting structured information from text.
Entity Extraction: Person, Location, Organization
Relation Extraction:
– Located_In(Location, Location)
– Work_For(Person, Organization)
– OrgBased_In(Organization, Location)
– Live_In(Person, Location)
– Kill(Person, Person)

3 Entity and Relation Extraction
Example sentence: "Austin lives in Los Angeles, California and works there for an American company called ABC Inc."
[Figure: the candidate entities labeled Person, Location, Other, and Organization, connected by the relations Work_For, OrgBased_In, Live_In, and Located_In]

4 Entity and Relation Extraction
Traditionally, entity and relation extraction are done in a pipeline:
– First, entities are extracted.
– Then, relations are extracted, assuming the extracted entities are correct.

5 Entity and Relation Extraction
However, relations can influence entity extraction.
Austin lives in Los Angeles, California and works there for an American company called ABC Inc.
[Figure: "Austin" is ambiguous between Person and Location; the Live_In relation with the Location "Los Angeles" indicates it is a Person]

6 Entity and Relation Extraction
Relations can also influence extracting other relations.
Austin lives in Los Angeles, California and works there for an American company called ABC Inc.
[Figure: the Live_In and Work_For relations for "Austin" jointly suggesting the OrgBased_In relation]

7 Joint Entity and Relation Extraction
Both entity and relation extraction can benefit if done jointly:
– They can correct each other's errors.
– They can influence each other.
A brute-force search for the most probable joint extraction is intractable: if there are n candidate entities in a sentence, there are O(n²) possible entity pairs between them, and with r relation labels there are O(r^(n²)) possible joint labelings.
We present a new method for joint extraction.
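To make the intractability concrete, here is a small illustrative calculation (the function name is ours, not from the paper):

```python
def num_joint_relation_labelings(n: int, r: int) -> int:
    """Count the possible joint relation labelings for a sentence with n
    candidate entities and r relation labels (including Not_Related):
    each of the n*(n-1)/2 unordered entity pairs independently receives
    one of the r labels."""
    pairs = n * (n - 1) // 2
    return r ** pairs

# The 5-entity example sentence with 6 labels already admits 6^10 labelings:
print(num_joint_relation_labelings(5, 6))  # prints 60466176
```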

8 Joint Entity and Relation Extraction
Treat it analogously to parsing, with the following productions:
– Entity productions:
  Person → Candidate_entity
  Location → Candidate_entity
  Organization → Candidate_entity
– Relation productions:
  Located_In → Location Location
  Work_For → Person Organization
  OrgBased_In → Organization Location
  Live_In → Person Location
  Kill → Person Person
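These productions can be sketched as plain Python data (this representation is ours; the paper does not prescribe one):

```python
# Entity productions: each entity label rewrites to a candidate entity.
ENTITY_PRODUCTIONS = {
    "Person": "Candidate_entity",
    "Location": "Candidate_entity",
    "Organization": "Candidate_entity",
}

# Relation productions: each relation label rewrites to an ordered pair
# of entity labels.
RELATION_PRODUCTIONS = {
    "Located_In": ("Location", "Location"),
    "Work_For": ("Person", "Organization"),
    "OrgBased_In": ("Organization", "Location"),
    "Live_In": ("Person", "Location"),
    "Kill": ("Person", "Person"),
}

def applicable_relations(left_label, right_label):
    """Relation labels whose production matches the two child entity
    labels, in order."""
    return [rel for rel, args in RELATION_PRODUCTIONS.items()
            if args == (left_label, right_label)]

print(applicable_relations("Person", "Location"))  # ['Live_In']
```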

9 Joint Entity and Relation Extraction
However, many entities participate in multiple relations, with a lot of overlap, so a context-free grammar (CFG) tree structure is not adequate.
We introduce a new structure we call the card-pyramid.
[Figure: the example sentence with its overlapping entity and relation annotations]

10 Joint Entity and Relation Extraction using Card-Pyramid
[Figure: the card-pyramid for the example sentence. The leaves are the candidate entities (Austin, Los Angeles, California, American, ABC Inc) labeled Person, Location, Other, and Organization; the internal nodes carry the relation labels Work_For, OrgBased_In, Live_In, Located_In, and Not_Related]

11 Joint Entity and Relation Extraction using Card-Pyramid
Entities and their relations are compactly represented in a card-pyramid graph.
Joint entity and relation extraction reduces to finding the most probable joint labeling of its nodes.
Given entity and relation classifiers, we developed an efficient bottom-up card-pyramid parsing algorithm that uses dynamic programming and beam search.
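The bottom-up, beam-based labeling idea can be sketched as follows. This is our simplification: the real parser also divides out the probability of the shared sub-pyramid and performs an overlap-consistency check, both covered later in the talk; the data structures and scoring interface are assumptions.

```python
import itertools

def parse_card_pyramid(leaf_beams, score_relation, beam_size=5):
    """Bottom-up beam parsing sketch over a card-pyramid.
    leaf_beams[i]: list of (label, prob) beam elements for candidate i.
    score_relation(lelem, relem): list of (relation_label, prob) for the
      node whose children carry those beam elements.
    Returns the beam at the root node."""
    n = len(leaf_beams)
    # chart[(i, j)] holds the beam for the node covering leaves i..j
    chart = {(i, i): sorted(leaf_beams[i], key=lambda e: -e[1])[:beam_size]
             for i in range(n)}
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            cands = []
            # consider every combination of the children's beam elements
            for lelem, relem in itertools.product(chart[(i, j - 1)],
                                                  chart[(i + 1, j)]):
                for rel, p in score_relation(lelem, relem):
                    cands.append(((lelem[0], rel, relem[0]),
                                  lelem[1] * relem[1] * p))
            chart[(i, j)] = sorted(cands, key=lambda e: -e[1])[:beam_size]
    return chart[(0, n - 1)]
```

For example, with two leaves ("Austin" as Per 0.8, "Los Angeles" as Loc 0.9) and a relation scorer returning Live_In with probability 0.5, the root beam holds one element of probability 0.8 × 0.9 × 0.5 = 0.36.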

12 Distinction from CFG Tree
[Figure: in a CFG tree, siblings' subtrees do not overlap; in a card-pyramid, a node's two children overlap in a shared sub-card-pyramid]

14 Card-Pyramid Parsing
Assumes candidate entities are given:
– They can be obtained automatically [Punyakanok & Roth, 2001].
– Use a simple heuristic, like taking all noun-phrase chunks.
– In the worst case, include every substring; candidates that are none of the given types will get the label Other.
Austin lives in Los Angeles, California and works there for an American company called ABC Inc.
(Austin) (Los Angeles) (California) (American) (ABC Inc)
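The worst-case fallback on this slide can be sketched as follows (the function and its span cap are our illustrative choices):

```python
def candidate_spans(tokens, max_len=3):
    """Worst-case candidate generation sketch: every token span of up to
    max_len words becomes a candidate entity; spans matching none of the
    target types are expected to be labeled Other by the classifiers."""
    return [(i, j, " ".join(tokens[i:j]))
            for i in range(len(tokens))
            for j in range(i + 1, min(i + max_len, len(tokens)) + 1)]

print(candidate_spans(["Los", "Angeles"], max_len=2))
# [(0, 1, 'Los'), (0, 2, 'Los Angeles'), (1, 2, 'Angeles')]
```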

15 Card-Pyramid Parsing
Apply entity classifiers at the leaf nodes: an SVM with standard features (words, POS tags, capitalization, gazetteers, suffixes, etc.).
[Figure: each leaf holds a beam of label probabilities, e.g. for "Austin": Loc 0.9, Per 0.8, Org 0.1; each (label, probability) entry is a beam element]
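Building a leaf beam from a classifier's label probabilities can be sketched like this (the scores are the illustrative numbers from the slide, not real classifier outputs):

```python
def leaf_beam(label_probs, beam_size=3):
    """Turn one entity classifier's per-label probabilities into a leaf
    beam: the top-k labels sorted by probability."""
    return sorted(label_probs.items(), key=lambda kv: -kv[1])[:beam_size]

# The "Austin" leaf from the slide:
print(leaf_beam({"Loc": 0.9, "Per": 0.8, "Org": 0.1}))
# [('Loc', 0.9), ('Per', 0.8), ('Org', 0.1)]
```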

16 Card-Pyramid Parsing
Apply relation classifiers bottom-up at the internal nodes, relating the leftmost and rightmost leaves: an SVM with a word-subsequence kernel over the before, between, and after patterns of the two entities [Bunescu & Mooney, 2005].
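Extracting the before/between/after patterns for an entity pair can be sketched as follows (span representation and window size are our assumptions; the kernel itself [Bunescu & Mooney, 2005] is not implemented here):

```python
def bba_contexts(tokens, span1, span2, window=3):
    """Before/between/after word patterns of an entity pair, the inputs
    a subsequence-kernel relation classifier would score. Spans are
    (start, end) token offsets, with span1 preceding span2."""
    (s1, e1), (s2, e2) = span1, span2
    before = tokens[max(0, s1 - window):s1]
    between = tokens[e1:s2]
    after = tokens[e2:e2 + window]
    return before, between, after

tokens = "Austin lives in Los Angeles".split()
print(bba_contexts(tokens, (0, 1), (3, 5)))  # ([], ['lives', 'in'], [])
```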

17 Card-Pyramid Parsing
Consider every combination of the children's beam elements.

18 Card-Pyramid Parsing
Applying the production Work_For → Per Org at an internal node yields a Work_For beam element with probability 0.1.

19 Card-Pyramid Parsing
The Work_For element produced by Work_For → Per Org is kept in that node's beam.

21 Card-Pyramid Parsing
Likewise, the production Located_In → Loc Loc yields a Located_In element.

23 Card-Pyramid Parsing
The production Live_In → Per Loc yields a Live_In element.

24 Card-Pyramid Parsing
Higher nodes accumulate beams with elements such as Located_In, OrgBased_In, and Not_Related.

27 Card-Pyramid Parsing
Applying Live_In → Per Loc at a higher node yields Live_In with probability 0.9.
Relations involving in-between entities are also used as features, e.g. "Live_In – Located_In". In general, any features can be used from the sub-card-pyramid underneath.

28 Card-Pyramid Parsing
Because the two children overlap, divide out the probability of the shared sub-pyramid, which would otherwise be multiplied in twice.
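This correction can be sketched as a one-line formula (the variable names are ours; the slide describes the idea in words):

```python
def combine_node_prob(p_left, p_right, p_relation, p_shared):
    """Parent-node score sketch: p_left and p_right each already include
    p_shared, the probability of the sub-pyramid the two children
    overlap on, so it is divided out once."""
    return p_left * p_right * p_relation / p_shared

# e.g. children scored 0.4 and 0.3, relation 0.9, shared sub-pyramid 0.5:
print(combine_node_prob(0.4, 0.3, 0.9, 0.5))
```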

29 Card-Pyramid Parsing
A beam element at an internal node represents an entire sub-card-pyramid.

31 Card-Pyramid Parsing
An O(1) check ensures consistency of the overlap: two children's beam elements can be combined only if they agree on the shared sub-card-pyramid.
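One way such an O(1) check could work is to tag each beam element with an identifier for the shared sub-card-pyramid it was built over; the 'shared_id' field below is our assumed representation, not the paper's:

```python
def can_combine(left_elem, right_elem):
    """O(1) overlap-consistency sketch: two sibling beam elements may be
    combined only if they were built over the same shared sub-pyramid,
    i.e. their identifiers match."""
    return left_elem["shared_id"] == right_elem["shared_id"]

print(can_combine({"shared_id": 7}, {"shared_id": 7}))  # True
print(can_combine({"shared_id": 7}, {"shared_id": 8}))  # False
```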

33 Card-Pyramid Parsing
The most probable card-pyramid is represented by the top beam element at the root. This is an approximation because of the finite beam size.

34 Training Classifiers
– Obtain correctly labeled card-pyramids for the annotated sentences in the training data.
– Collect the positive examples for all the classifiers from the labels of the card-pyramids.
– Positive examples for an entity classifier become negative examples for all other entity classifiers.
– Pairs of entities with correct entity types that are not related by a relation become negative examples for that relation's classifier.
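The negative-example construction for the relation classifiers can be sketched like this (a minimal sketch, assuming entities and gold relations are given as simple Python containers):

```python
def relation_training_pairs(entities, gold_relations):
    """Label every entity pair: the gold relation if one is annotated,
    and Not_Related otherwise (a negative example for the relation
    classifiers)."""
    pairs = {}
    for i, e1 in enumerate(entities):
        for e2 in entities[i + 1:]:
            pairs[(e1, e2)] = gold_relations.get((e1, e2), "Not_Related")
    return pairs

print(relation_training_pairs(
    ["Austin", "Los Angeles", "ABC Inc"],
    {("Austin", "Los Angeles"): "Live_In"}))
```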

35 Related Work
Roth & Yih [2004, 2007]:
– Employs independent entity and relation classifiers.
– Uses linear programming to find a consistent global solution from the classifier outputs.
– Outputs of other classifiers cannot be used as features.
Riedel et al. [2009]:
– Solves a related problem, extracting bio-molecular events and their arguments, using a Markov Logic Network.
– A single joint probabilistic model.
– Restricts the extractors' learning algorithm to Markov Logic Network learning; for example, it cannot use a kernel-based SVM for relation extraction.
Kate & Mooney [2006]:
– Parses using a suite of classifiers to find the most probable semantic parse.

36 Experiments
Dataset used by Roth & Yih [2004, 2007]:
– Number of sentences: 1437
– Entities: Person (1685), Location (1968), Organization (978), Other (705)
– Relations: Located_In(Location, Location) (406), Work_For(Person, Organization) (394), OrgBased_In(Organization, Location) (451), Live_In(Person, Location) (521), Kill(Person, Person) (268), Not_Related (17007)

37 Experiments
Performed five-fold cross-validation.
Measured:
– Precision (percentage of output labels that are correct)
– Recall (percentage of gold-standard labels correctly identified)
– F-measure (their harmonic mean)
Compared with:
– A pipelined approach using our entity and relation classifiers
– The best results of Roth & Yih [2007] on joint extraction
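The three metrics above can be computed as follows (a standard set-based formulation; the evaluation script itself is not from the paper):

```python
def precision_recall_f1(predicted, gold):
    """Precision, recall, and F-measure over sets of predicted and
    gold-standard labels, as defined on this slide."""
    tp = len(predicted & gold)                       # true positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

p, r, f = precision_recall_f1({"a", "b", "c", "d"}, {"a", "b", "e"})
print(round(p, 3), round(r, 3), round(f, 3))  # 0.5 0.667 0.571
```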

38 Results: Entity Extraction
[Figure: entity extraction F-measure results]

39 Results: Relation Extraction
[Figure: relation extraction F-measure results]
Note: the dataset contains an unusual sentence with 20 Locations separated by commas.

40 A General Method to Extract Structured Information from a Sentence
– Encode what you want to extract, and the constraints between those items, in the productions.
– Train a classifier for every production.
– Apply the classifiers to find the most probable structure allowed by the productions, jointly extracting the structured information.

41 Future Work
– Extract higher-order relations: relations between relations, such as temporal or causal relations.
– Jointly perform coreference resolution with entity and relation extraction, adding a new production: Coref → Person Person.
– Model the structure of the card-pyramid using a probabilistic graphical model.
– Develop a kernel to compute similarity between two card-pyramids and use it for the relation classifier.

42 Conclusions
– Introduced the card-pyramid structure for joint entity and relation extraction, which compactly encodes the entities and relations in a sentence.
– Joint extraction reduces to jointly labeling the card-pyramid's nodes.
– Presented an efficient parsing algorithm for this joint labeling.
– Experiments demonstrated the benefits of the approach.

43 Thanks! Questions?