NYU ANLP-00 1 Automatic Discovery of Scenario-Level Patterns for Information Extraction Roman Yangarber Ralph Grishman Pasi Tapanainen Silja Huttunen.

Presentation transcript:

NYU ANLP-00 1 Automatic Discovery of Scenario-Level Patterns for Information Extraction Roman Yangarber Ralph Grishman Pasi Tapanainen Silja Huttunen

NYU 2 Outline  Information Extraction: background  Problems in IE  Prior Work: Machine Learning for IE  Discover patterns from raw text  Experimental results  Current work

NYU 3 Quick Overview  What is Information Extraction ?  Definition: –finding facts about a specified class of events from free text –filling a table in a data base (slots in a template)  Events: instances in relations, with many arguments

NYU 4 Example: Management Succession –George Garrick, 40 years old, president of the London-based European Information Services Inc., was appointed chief executive officer of Nielsen Marketing Research, USA.
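The filled template for this sentence can be sketched as a record with one slot per database column (a sketch; the slot names below are illustrative, not the actual MUC-6 template fields):

```python
from dataclasses import dataclass

# Illustrative slots for a management-succession event; the real MUC-6
# template schema is more detailed than this sketch.
@dataclass
class SuccessionEvent:
    person: str
    position: str
    organization: str
    action: str  # "appoint" or "resign"

# Template filled from the example sentence on the slide
event = SuccessionEvent(
    person="George Garrick",
    position="chief executive officer",
    organization="Nielsen Marketing Research",
    action="appoint",
)
```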

NYU 6 System Architecture: Proteus. Input Text → Lexical Analysis → Name Recognition → Partial Syntax → Scenario Patterns (sentence level) → Reference Resolution → Discourse Analyzer (discourse level) → Output Generation → Extracted Information
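The stages above can be sketched as a chain of functions over a shared document record (a sketch; the stage bodies are placeholders standing in for the actual Proteus components):

```python
# Illustrative staged pipeline; stage names follow the slide, but the
# bodies are toy placeholders, not the real Proteus modules.
def lexical_analysis(doc):
    doc["tokens"] = doc["text"].split()
    return doc

def name_recognition(doc):
    # placeholder heuristic: treat capitalized tokens as candidate names
    doc["names"] = [t for t in doc["tokens"] if t[:1].isupper()]
    return doc

# ... partial syntax, scenario patterns, reference resolution,
# discourse analysis, and output generation would follow here
STAGES = [lexical_analysis, name_recognition]

def run_pipeline(text):
    doc = {"text": text}
    for stage in STAGES:
        doc = stage(doc)
    return doc

doc = run_pipeline("Fred retired .")
```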

NYU 8 Problems  Customization  Performance

NYU 9 Problems: Customization To customize a system for a new extraction task, we have to develop –new patterns for new types of events –word classes for the domain –inference rules This can be a large job requiring skilled labor –expense of customization limits uses of extraction

NYU 10 Problems: Performance  Performance on event IE is limited  On MUC tasks, typical top performance is recall < 55%, precision < 75%  Errors propagate through multiple phases: –name recognition errors –syntax analysis errors –missing patterns –reference resolution errors –complex inference required

NYU 11 Missing Patterns  As with many language phenomena –a few common patterns –a large number of rare patterns  Rare patterns do not surface sufficiently often in a limited corpus  Missing patterns make customization expensive and limit performance  Finding good patterns is necessary to improve customization and performance [Figure: pattern frequency vs. rank]

NYU 12 Prior Research  build patterns from examples –Yangarber ‘97  generalize from multiple examples: annotated text –Crystal, Whisk (Soderland), Rapier (Califf)  active learning: reduce annotation –Soderland ‘99, Califf ‘99  learning from corpus with relevance judgements –Riloff ‘96, ‘99  co-learning/bootstrapping –Brin ‘98, Agichtein ‘00

NYU 13 Our Goals  Minimize manual labor required to construct pattern bases for new domain –un-annotated text –un-classified text –un-supervised learning  Use very large corpora -- larger than we could ever tag manually -- to improve coverage of patterns

NYU 14 Principle: Pattern Density  If we have relevance judgements for documents in a corpus, for the given task, then the patterns which are much more frequent in relevant documents will generally be good patterns  Riloff (1996) finds patterns related to terrorist attacks

NYU 15 Principle: Duality  Duality between patterns and documents: –relevant documents are strong indicators of good patterns –good patterns are strong indicators of relevant documents

NYU 16 Outline of Procedure  Initial query: a small set of seed patterns which partially characterize the topic of interest  repeat:  Retrieve documents containing seed patterns: “relevant documents”  Rank patterns in relevant documents by frequency in relevant docs vs. overall frequency  Add top-ranked pattern to seed pattern set
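The loop above can be sketched directly (a sketch: documents are reduced to sets of pattern strings, and the ranking metric follows the frequency-ratio idea on the slide with an illustrative log factor; the real system's representations differ):

```python
import math

def discover_patterns(corpus, seeds, iterations):
    """corpus: list of per-document pattern sets.
    seeds: initial seed pattern set.  Returns the grown pattern set."""
    patterns = set(seeds)
    for _ in range(iterations):
        # 1. retrieve documents containing any accepted pattern
        relevant = [doc for doc in corpus if doc & patterns]
        if not relevant:
            break
        # 2. rank candidate patterns: frequency in relevant docs
        #    vs. overall frequency (log factor is illustrative)
        def score(p):
            df_rel = sum(1 for d in relevant if p in d)
            df_all = sum(1 for d in corpus if p in d)
            return (df_rel / df_all) * math.log(df_rel + 1)
        candidates = {p for d in relevant for p in d} - patterns
        if not candidates:
            break
        # 3. add the top-ranked pattern to the pattern set
        patterns.add(max(candidates, key=score))
    return patterns

corpus = [
    {"X retired", "Y was named president"},
    {"X retired", "Y was named president", "stock fell"},
    {"stock fell", "Z bought Y"},
]
grown = discover_patterns(corpus, {"X retired"}, iterations=1)
```

After one iteration the co-occurring succession pattern outranks the topic-neutral one, illustrating the duality principle from the previous slide.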

17 #1: pick seed pattern Seed:

18 #2: retrieve relevant documents Seed:  Fred retired. ... Harry was named president.  Maki retired. ... Yuki was named president.  Relevant documents  Other documents

19 #3: pick new pattern Seed: appears in several relevant documents (top-ranked by Riloff metric)  Fred retired. ... Harry was named president.  Maki retired. ... Yuki was named president.

20 #4: add new pattern to pattern set Pattern set:

NYU 21 Pre-processing  For each document, find and classify names: –{ person | location | organization | …}  Parse document –(regularize passive, relative clauses, etc.)  For each clause, collect a candidate pattern: tuple: heads of –[ subject verb direct object object/subject complement locative and temporal modifiers … ]
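Candidate-pattern collection can be sketched as follows (a sketch; the clause dictionary and role names are illustrative stand-ins for the parser output):

```python
def candidate_pattern(clause):
    """Collect the heads of the grammatical roles listed on the slide.
    `clause` is assumed already parsed and regularized (passives turned
    active, etc.), with role -> head-word entries; absent roles are skipped."""
    roles = ("subject", "verb", "object", "complement", "locative", "temporal")
    return tuple((r, clause[r]) for r in roles if r in clause)

# "Harry was named president." regularized to active voice, with the
# classified name replaced by its type:
clause = {"subject": "COMPANY", "verb": "name",
          "object": "PERSON", "complement": "president"}
pattern = candidate_pattern(clause)
```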

NYU 22 Experiment  Task: Management succession (as MUC-6)  Source: Wall Street Journal  Training corpus: ~ 6,000 articles  Test corpus: –100 documents: MUC-6 formal training –+ 150 documents judged manually

NYU 23 Experiment: two seed patterns  v-appoint = { appoint, elect, promote, name }  v-resign = { resign, depart, quit, step-down }  Run discovery procedure for 80 iterations
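The seed consists of clause patterns built around these two verb classes; a minimal matching sketch (verb-class contents copied from the slide, matching logic illustrative):

```python
# Verb classes copied from the slide; the matching logic is illustrative.
V_APPOINT = {"appoint", "elect", "promote", "name"}
V_RESIGN = {"resign", "depart", "quit", "step-down"}

def matches_seed(clause):
    """clause: dict with a lemmatized 'verb' head."""
    verb = clause.get("verb")
    return verb in V_APPOINT or verb in V_RESIGN

# "Harry was named president." -> active-voice clause headed by "name"
hit = matches_seed({"verb": "name", "object": "PERSON"})
```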

NYU 24 Evaluation  Look at discovered patterns –new patterns, missed in manual training  Document filtering  Slot filling

NYU 25 Discovered patterns

NYU 26 Evaluation: new patterns  Not found in manual training

NYU 27 Evaluation: Text Filtering  How effective are discovered patterns at selecting relevant documents? –IR-style –documents matching at least one pattern
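Document filtering can be scored IR-style; a sketch assuming manual relevance judgements supplied as a parallel list of booleans:

```python
def text_filtering(corpus, relevance, patterns):
    """corpus: list of per-document pattern sets; relevance: parallel
    list of manual judgements.  A document is selected if it matches
    at least one pattern.  Returns (recall, precision)."""
    selected = [bool(doc & patterns) for doc in corpus]
    tp = sum(1 for s, r in zip(selected, relevance) if s and r)
    n_relevant = sum(relevance)
    n_selected = sum(selected)
    recall = tp / n_relevant if n_relevant else 0.0
    precision = tp / n_selected if n_selected else 0.0
    return recall, precision

corpus = [{"X retired"}, {"Y was named president"}, {"stock fell"}]
relevance = [True, True, False]
r, p = text_filtering(corpus, relevance, {"X retired", "stock fell"})
```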

NYU 28

NYU 29

NYU 30 Evaluation: Slot filling  How effective are patterns within a complete IE system?  MUC-style IE on MUC-6 corpora  Caveat

NYU 33 Conclusion: Automatic discovery  Performance comparable to human (4-week development)  From un-annotated text: allows us to take advantage of very large corpora –redundancy –duality  Will likely help wider use of IE

NYU 34

NYU 35 Good Patterns  U - universe of all documents R - set of relevant documents H = H(p) - set of documents where pattern p matched  Density criterion: |H ∩ R| / |H| >> |R| / |U| (relevant documents are much denser among the documents matched by p than in the corpus overall)
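With the three document sets defined above, the density comparison can be computed directly (a sketch; the document IDs below are made up for illustration):

```python
def density(universe, relevant, matched):
    """Return (density of relevant docs among the pattern's matches,
    baseline density of relevant docs in the whole corpus).
    A pattern is promising when the first is much larger than the second."""
    in_pattern = len(matched & relevant) / len(matched)
    baseline = len(relevant) / len(universe)
    return in_pattern, baseline

U = set(range(10))   # U: universe of 10 documents
R = {0, 1, 2}        # R: documents judged relevant
H = {0, 1, 5}        # H(p): documents where pattern p matched
dens, base = density(U, R, H)
```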

NYU 36 Graded Relevance  Documents matching seed patterns considered 100% relevant  Discovered patterns are considered less certain  Documents containing them are considered partially relevant
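One way to realize graded relevance is to weight each document by the confidence of the strongest pattern it contains (a sketch; the slide does not specify the exact weighting scheme, so the max rule and numbers here are assumptions):

```python
def doc_relevance(doc_patterns, pattern_conf):
    """Relevance of a document = max confidence of any matching pattern.
    Seed patterns carry confidence 1.0; discovered patterns carry less."""
    scores = [pattern_conf[p] for p in doc_patterns if p in pattern_conf]
    return max(scores, default=0.0)

# seed pattern vs. discovered pattern (confidences are illustrative)
conf = {"X retired": 1.0, "Y was named president": 0.8}
rel = doc_relevance({"Y was named president", "stock fell"}, conf)
```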

NYU 37 Scoring Patterns  Score(p) = (document frequency in relevant documents / overall document frequency) × log(document frequency in relevant documents) –(metrics similar to those used in Riloff-96)