1 Part-of-Speech (POS) Tagging Revisited Mark Sharp CS-536 Machine Learning Term Project Fall 2003.

Slides:



Advertisements
Similar presentations
Three Basic Problems Compute the probability of a text: P m (W 1,N ) Compute maximum probability tag sequence: arg max T 1,N P m (T 1,N | W 1,N ) Compute.
Advertisements

School of something FACULTY OF OTHER School of Computing FACULTY OF ENGINEERING Machine Learning PoS-Taggers COMP3310 Natural Language Processing Eric.
School of something FACULTY OF OTHER School of Computing FACULTY OF ENGINEERING Chunking: Shallow Parsing Eric Atwell, Language Research Group.
School of something FACULTY OF OTHER School of Computing FACULTY OF ENGINEERING PoS-Tagging theory and terminology COMP3310 Natural Language Processing.
CS460/IT632 Natural Language Processing/Language Technology for the Web Lecture 2 (06/01/06) Prof. Pushpak Bhattacharyya IIT Bombay Part of Speech (PoS)
Part-Of-Speech Tagging and Chunking using CRF & TBL
Part of Speech Tagging Importance Resolving ambiguities by assigning lower probabilities to words that don’t fit Applying to language grammatical rules.
Learning with Probabilistic Features for Improved Pipeline Models Razvan C. Bunescu Electrical Engineering and Computer Science Ohio University Athens,
LING 581: Advanced Computational Linguistics Lecture Notes January 19th.
Part II. Statistical NLP Advanced Artificial Intelligence Part of Speech Tagging Wolfram Burgard, Luc De Raedt, Bernhard Nebel, Lars Schmidt-Thieme Most.
1 Quasi-Synchronous Grammars  Based on key observations in MT: translated sentences often have some isomorphic syntactic structure, but not usually in.
ANLE1 CC 437: Advanced Natural Language Engineering ASSIGNMENT 2: Implementing a query expansion component for a Web Search Engine.
Part-of-speech Tagging cs224n Final project Spring, 2008 Tim Lai.
Ch 10 Part-of-Speech Tagging Edited from: L. Venkata Subramaniam February 28, 2002.
1 CSC 594 Topics in AI – Applied Natural Language Processing Fall 2009/ Shallow Parsing.
Inducing Information Extraction Systems for New Languages via Cross-Language Projection Ellen Riloff University of Utah Charles Schafer, David Yarowksy.
Introduction to CL Session 1: 7/08/2011. What is computational linguistics? Processing natural language text by computers  for practical applications.
Transformation-based error- driven learning (TBL) LING 572 Fei Xia 1/19/06.
1 I256: Applied Natural Language Processing Marti Hearst Sept 25, 2006.
Syllabus Text Books Classes Reading Material Assignments Grades Links Forum Text Books עיבוד שפות טבעיות - שיעור חמישי POS Tagging Algorithms עידו.
Part of speech (POS) tagging
1 Complementarity of Lexical and Simple Syntactic Features: The SyntaLex Approach to S ENSEVAL -3 Saif Mohammad Ted Pedersen University of Toronto, Toronto.
Part-of-Speech Tagging & Sequence Labeling
BIOI 7791 Projects in bioinformatics Spring 2005 March 22 © Kevin B. Cohen.
From Textual Information to Numerical Vectors Chapters Presented by Aaron Hagan.
CS224N Interactive Session Competitive Grammar Writing Chris Manning Sida, Rush, Ankur, Frank, Kai Sheng.
TopicTrend By: Jovian Lin Discover Emerging and Novel Research Topics.
TagHelper & SIDE Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute.
Part II. Statistical NLP Advanced Artificial Intelligence Applications of HMMs and PCFGs in NLP Wolfram Burgard, Luc De Raedt, Bernhard Nebel, Lars Schmidt-Thieme.
EMNLP’01 19/11/2001 ML: Classical methods from AI –Decision-Tree induction –Exemplar-based Learning –Rule Induction –T ransformation B ased E rror D riven.
LING/C SC/PSYC 438/538 Lecture 27 Sandiway Fong. Administrivia 2 nd Reminder – 538 Presentations – Send me your choices if you haven’t already.
NLP LINGUISTICS 101 David Kauchak CS457 – Fall 2011 some slides adapted from Ray Mooney.
Distributional Part-of-Speech Tagging Hinrich Schütze CSLI, Ventura Hall Stanford, CA , USA NLP Applications.
Ling 570 Day 17: Named Entity Recognition Chunking.
CSA2050: Introduction to Computational Linguistics Part of Speech (POS) Tagging II Transformation Based Tagging Brill (1995)
HW7 Extracting Arguments for % Ang Sun March 25, 2012.
Lecture 10 NLTK POS Tagging Part 3 Topics Taggers Rule Based Taggers Probabilistic Taggers Transformation Based Taggers - Brill Supervised learning Readings:
CS626: NLP, Speech and the Web Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 15, 17: Parsing Ambiguity, Probabilistic Parsing, sample seminar 17.
PARSING David Kauchak CS159 – Spring 2011 some slides adapted from Ray Mooney.
10/30/2015CPSC503 Winter CPSC 503 Computational Linguistics Lecture 7 Giuseppe Carenini.
Transformation-Based Learning Advanced Statistical Methods in NLP Ling 572 March 1, 2012.
13-1 Chapter 13 Part-of-Speech Tagging POS Tagging + HMMs Part of Speech Tagging –What and Why? What Information is Available? Visible Markov Models.
A Systematic Exploration of the Feature Space for Relation Extraction Jing Jiang & ChengXiang Zhai Department of Computer Science University of Illinois,
Conversion of Penn Treebank Data to Text. Penn TreeBank Project “A Bank of Linguistic Trees” (as of 11/1992) University of Pennsylvania, LINC Laboratory.
CSA2050: Introduction to Computational Linguistics Part of Speech (POS) Tagging I Introduction Tagsets Approaches.
CS : Speech, NLP and the Web/Topics in AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture-16: Probabilistic parsing; computing probability of.
Auckland 2012Kilgarriff: NLP and Corpus Processing1 The contribution of NLP: corpus processing.
CS : Speech, NLP and the Web/Topics in AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture-14: Probabilistic parsing; sequence labeling, PCFG.
NLP. Introduction to NLP Background –From the early ‘90s –Developed at the University of Pennsylvania –(Marcus, Santorini, and Marcinkiewicz 1993) Size.
Digital Text and Data Processing Week 4. □ Making computers understand languages spoken by human beings □ Applications: □ Part of Speech Tagging □ Sentiment.
February 2007CSA3050: Tagging III and Chunking 1 CSA2050: Natural Language Processing Tagging 3 and Chunking Transformation Based Tagging Chunking.
Automatic Grammar Induction and Parsing Free Text - Eric Brill Thur. POSTECH Dept. of Computer Science 심 준 혁.
Part-of-speech tagging
Shallow Parsing for South Asian Languages -Himanshu Agrawal.
Machine Learning in Practice Lecture 13 Carolyn Penstein Rosé Language Technologies Institute/ Human-Computer Interaction Institute.
Course Review #2 and Project Parts 3-6 LING 572 Fei Xia 02/14/06.
LING/C SC/PSYC 438/538 Lecture 9 Sandiway Fong. Adminstrivia Homework 4 graded Homework 5 out today – Due Saturday night by midnight – (Gives me Sunday.
Word classes and part of speech tagging. Slide 1 Outline Why part of speech tagging? Word classes Tag sets and problem definition Automatic approaches.
CS : Speech, NLP and the Web/Topics in AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture-15: Probabilistic parsing; PCFG (contd.)
Modified from Diane Litman's version of Steve Bird's notes 1 Rule-Based Tagger The Linguistic Complaint –Where is the linguistic knowledge of a tagger?
CSA2050: Introduction to Computational Linguistics Part of Speech (POS) Tagging II Transformation Based Tagging Brill (1995)
Part-of-Speech Tagging & Sequence Labeling Hongning Wang
Part-of-Speech Tagging CSCI-GA.2590 – Lecture 4 Ralph Grishman NYU.
LING/C SC 581: Advanced Computational Linguistics Lecture Notes Feb 3 rd.
Prototype-Driven Grammar Induction Aria Haghighi and Dan Klein Computer Science Division University of California Berkeley.
LING/C SC 581: Advanced Computational Linguistics Lecture Notes Feb 17 th.
Part-of-Speech Tagging CSE 628 Niranjan Balasubramanian Many slides and material from: Ray Mooney (UT Austin) Mausam (IIT Delhi) * * Mausam’s excellent.
Introduction to Machine Learning and Text Mining
Syntactic Category Prediction for Improving Translation Quality in English-Korean Machine Translation Sung-Dong Kim, Dept. of Computer Engineering, Hansung.
Progress report on Semantic Role Labeling
Presentation transcript:

1 Part-of-Speech (POS) Tagging Revisited Mark Sharp CS-536 Machine Learning Term Project Fall 2003

2 The False Drop Problem blind venetian vs venetian blind dog bites man vs man bites dog

3 POS Tagging blind/adj venetian/propernoun vs venetian/adj blind/noun dog/noun bites/verb man/noun vs man/noun bites/verb dog/noun

4 Syntactic Parsing blind/adj venetian/propernoun vs venetian/adj blind/noun dog/noun/subj bites/verb/pred man/noun/DO vs man/noun/subj bites/verb/pred dog/noun/DO

5 Semantic Tagging blind/adj venetian/propernoun vs venetian/adj blind/noun dog/noun/subj bites/verb/pred man/noun/DO vs man/noun/subj bites/verb/pred dog/noun/DO

6 Wanted: automatic POS tagger ~95% of words are unambiguous Many hand-tagged corpora for training Rule-based: tag n  feature 1, feature 2,… Stochastic: P(tag n |tag n-1, tag n-2,…) Brill: search “rule space” for most effective “transformations” of dominant/majority tags and paint them onto tagged text dynamically.

7 Problem: rule space is infinite ………………………………………………… ……………………………………………… ……………………………………………… ……………………………………………… ………………………?……………………… ……………………………………………… ……………………………………………… ……………………………………………… ………………………………………………

8 Brill’s solution: narrow window + rule templates …?…

9 Shotgun Approach Weka J48 (C4.5 pruned decision tree) Susanne corpus: 571 unique token/words with >1 POS tag out of tokens. Attributes: 1. token n 2. tag n (class to predict) tag n-10, tag n-9,… tag n+9, tag n token n-10, token n-9,… token n+10 Result: 4.1%  2.2% error rate (curve C)

10 TBL Approach (true) For each of 571 unique token/words with >1 POS tag, compute and apply trans- formation rule in order of error impact. Attributes: 1. token n 2. tag n (class to predict) tag n-10, tag n-9,… tag n+9, tag n token n-10, token n-9,… token n+10 Big Java project!

11.………?………. Shotgun Approach

12.………?………. ………..………. TBL Approach (true)

13 TBL Approach (simulated) Do a shotgun 22-attribute J48 for 544 low- impact tokens (curve A). For each of 17 high-impact tokens, do a single-token 42-attribute J48 and apply transformation rule in order of error impact (curve B). Result: 4.1%  2.1% error rate For single tokens, J48 rules look a lot like Brill’s templates !!!

14 TBL Approach (simulated)

15 A J48 transformation rule if (token[n] EQUALS "to") then set tag[n] to "TO" if (tag[n+1] MEMBER_OF {CD COL DOL DT JJ JJR NN NNP NNS PDT PP PPS RB VBG VBN WDT WRB} [6 new cases] then change tag[n] to "IN" if (tag[n+1] EQUALS "RB") if (tag[n+2] EQUALS "VB") [3 new cases] then change tag[n] back to "TO"

16 Details, paper, data, scripts, etc.