Unsupervised Models for Named Entity Classification. Michael Collins and Yoram Singer, AT&T Labs, 1999.

Presentation transcript:

Unsupervised Models for Named Entity Classification. Michael Collins and Yoram Singer, AT&T Labs, 1999

The Task Tag phrases as "person", "organization", or "location". For example: Ralph Grishman, of NYU, sure is swell.

WHY? Labeled data is scarce and expensive to produce; unlabeled data is plentiful.

Spelling Rules The approach uses two kinds of rules. Spelling rules look at the string itself:
–A simple lookup tells us "Honduras" is a location.
–Look for words within the string, like "Mr."

Contextual Rules Contextual rules look at the words surrounding the string.
–For example: any proper name modified by an appositive whose head is "president" is a person.

Two Categories of Rules The key to the method is redundancy between the two kinds of rules. In "…says Mr. Cooper, a vice president of…", the string "Mr." is spelling evidence and the appositive head "president" is contextual evidence for the same label. Unlabeled data gives us these hints!

The Experiment 970,000 New York Times sentences were parsed. Sequences of NNP and NNPS were then extracted as named entity examples if they met one of two criteria.

Kinds of Noun Phrases
1. There was an appositive modifier to the NP, whose head is a singular noun (tagged NN).
–…says Maury Cooper, a vice president…
2. The NP is a complement to a preposition which is the head of a PP. This PP modifies another NP, whose head is a singular noun.
–…fraud related to work on a federally funded sewage plant in Georgia.

(spelling, context) pairs created
–…says Maury Cooper, a vice president… → (Maury Cooper, president)
–…fraud related to work on a federally funded sewage plant in Georgia. → (Georgia, plant_in)
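The pair construction for the two extraction patterns above can be sketched as follows. This is a minimal, hypothetical helper (the actual extraction runs over full parse trees); note how the preposition case fuses the modified NP's head with the preposition, as in (Georgia, plant_in).

```python
def make_pair(np_string, context, context_type):
    """Build a (spelling, context) pair from one extraction pattern.

    context_type is 'appos' (context = appositive head noun) or
    'prep' (context = (preposition, head of modified NP)).
    """
    if context_type == "appos":
        return (np_string, context)               # (Maury Cooper, president)
    if context_type == "prep":
        prep, head = context
        return (np_string, head + "_" + prep)     # (Georgia, plant_in)
    raise ValueError("unknown context type: %s" % context_type)
```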

Rules Set of rules:
–Full-string=x (full-string=Maury Cooper)
–Contains(x) (contains(Maury))
–Allcap1 (IBM)
–Allcap2 (N.Y.)
–Nonalpha=x (A.T.&T. gives nonalpha=..&.)
–Context=x (context=president)
–Context-type=x (appos or prep)
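A rough sketch of how the spelling feature types above might be computed for a candidate string. The feature names follow the slide, but the regular expressions are my assumptions rather than the paper's exact definitions.

```python
import re

def spelling_features(phrase):
    """Extract spelling features for a candidate named-entity string."""
    words = phrase.split()
    feats = ["full-string=" + phrase]
    feats += ["contains(%s)" % w for w in words]
    if len(words) == 1:
        w = words[0]
        if w.isalpha() and w.isupper():
            feats.append("allcap1")                # e.g. IBM
        elif re.fullmatch(r"([A-Z]\.)+", w):
            feats.append("allcap2")                # e.g. N.Y.
        # keep only the non-alphabetic characters, e.g. A.T.&T. -> ..&.
        stripped = re.sub(r"[A-Za-z]", "", w)
        if stripped:
            feats.append("nonalpha=" + stripped)
    return feats
```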

SEED RULES
–Full-string=New York
–Full-string=California
–Full-string=U.S.
–Contains(Mr.)
–Contains(Incorporated)
–Full-string=Microsoft
–Full-string=I.B.M.

The Algorithm
1. Initialize: set the spelling decision list equal to the set of seed rules.
2. Label the training set using these rules.
3. Use the labeled examples to induce contextual rules (x = feature, y = label).
4. Label the set using the contextual rules, and use the result to induce spelling rules.
5. Set the spelling rules to the seed rules plus the new rules.
6. If fewer than the threshold of new rules were added, go to 2 and allow 15 more rules.
7. When finished, label the training data with the combined spelling/contextual decision list, then induce a final decision list from the labeled examples, with all rules added to the decision list.
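The alternation in steps 2–5 can be sketched as a toy decision-list co-training loop. All data and helper names here are hypothetical, and the paper's rule scoring and thresholding are omitted; the point is just the two-view bootstrap.

```python
# Seed spelling rules (feature -> label), as in step 1.
seed_rules = {"full-string=New York": "location", "contains(Mr.)": "person"}

# Each example is a (spelling_feature_set, context_feature_set) pair.
examples = [
    ({"full-string=New York", "contains(New)"},   {"context=plant_in"}),
    ({"full-string=Georgia"},                     {"context=plant_in"}),
    ({"contains(Mr.)", "full-string=Mr. Cooper"}, {"context=president"}),
    ({"full-string=Maury Cooper"},                {"context=president"}),
]

def label_with(rules, feature_sets):
    """Apply a decision list: the first matching rule wins, else None."""
    labels = []
    for feats in feature_sets:
        lab = None
        for feat, cat in rules.items():
            if feat in feats:
                lab = cat
                break
        labels.append(lab)
    return labels

def induce_rules(feature_sets, labels):
    """Induce feature -> label rules from the currently labeled examples
    (no confidence scoring here, unlike the real algorithm)."""
    rules = {}
    for feats, lab in zip(feature_sets, labels):
        if lab is None:
            continue
        for feat in feats:
            rules.setdefault(feat, lab)
    return rules

spelling = [s for s, _ in examples]
context = [c for _, c in examples]

# One round of the alternation in steps 2-5:
labels = label_with(seed_rules, spelling)          # step 2
context_rules = induce_rules(context, labels)      # step 3
labels = label_with(context_rules, context)        # step 4
spelling_rules = dict(seed_rules)                  # step 5
spelling_rules.update(induce_rules(spelling, labels))
```

After one round, the context rules learned from the seeds label the previously unlabeled examples, and new spelling rules (e.g. for "Georgia" and "Maury Cooper") are induced from them.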

Example
–(IBM, company): …IBM, the company that makes…
–(General Electric, company): …General Electric, a leading company in the area,…
–(General Electric, employer): …joined General Electric, the biggest employer…
–(NYU, employer): …NYU, the employer of the famous Ralph Grishman,…

The Power Starting from spelling cues such as "I.B.M." and "Mr.", the two classifiers both assign labels to 49.2% of the unlabeled examples, and they agree on 99.25% of those!

Evaluation
–88,962 (spelling, context) pairs extracted from 971,746 sentences.
–1,000 pairs randomly extracted as the test set.
–Labels: location 186, person 289, organization 402, noise 123.
–38 temporal expressions were removed from the noise, leaving 962 test pairs (85 noise).
–Clean Accuracy: Nc / 962
–Noise Accuracy: Nc / (962 − 85)
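The bookkeeping behind the two accuracy formulas is simple arithmetic and can be checked directly, using the counts from the slide:

```python
# Counts from the evaluation slide.
location, person, organization, noise = 186, 289, 402, 123
total = location + person + organization + noise   # 1,000 test pairs
kept = total - 38                                  # 38 temporal expressions removed
remaining_noise = noise - 38                       # noise pairs left in the test set

def clean_accuracy(n_correct):
    # Slide's formula: Nc / 962 (noise pairs stay in the denominator).
    return n_correct / kept

def noise_accuracy(n_correct):
    # Slide's formula: Nc / (962 - 85) (noise pairs excluded).
    return n_correct / (kept - remaining_noise)
```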

Results
Algorithm          Clean Accuracy   Noise Accuracy
Baseline           45.8%            41.8%
EM                 83.1%            75.8%
Yarowsky           —                74.1%
Yarowsky Cautious  91.2%            83.2%
DL-CoTrain         91.3%            83.3%
CoBoost            91.1%            83.1%

QUESTIONS

Thank you!