Part-of-speech tagging

Presentation transcript:

Part-of-speech tagging: A simple but useful form of linguistic analysis. Christopher Manning

Parts of Speech
Perhaps starting with Aristotle in the West (384–322 BCE), there has been the idea of parts of speech, a.k.a. lexical categories, word classes, “tags”, or POS.
The idea that is still with us, that there are 8 parts of speech, comes from Dionysius Thrax of Alexandria (c. 100 BCE), who described 8 parts of speech for Greek.
But his 8 aren’t exactly the ones we are taught today:
Thrax: noun, verb, article, adverb, preposition, conjunction, participle, pronoun
School English grammar: noun, verb, adjective, adverb, preposition, conjunction, pronoun, interjection

Why is POS important?
Parts of speech are useful because of the large amount of information they give about a word and its neighbors.
Knowing whether a word is a noun or a verb tells us a lot about likely neighboring words (nouns are preceded by determiners and adjectives, verbs by nouns) and about the syntactic structure around the word (nouns are generally part of noun phrases). This makes part-of-speech tagging an important component of syntactic parsing (ch. 12).
Parts of speech are useful features for finding named entities like people or organizations in text, and for other information extraction tasks.
Parts of speech are useful in summarization for improving the selection of nouns or other important words from a document.

Open class (lexical) words
Nouns: proper (IBM, Italy), common (cat / cats, snow)
Verbs: main (see, registered)
Adjectives: old, older, oldest
Adverbs: slowly
Numbers: 122,312, one
… more

Closed class (functional) words
Auxiliary: can, had
Determiners: the, some
Prepositions: to, with
Conjunctions: and, or
Particles: off, up
Pronouns: he, its
Interjections: Ow, Eh
… more

Open vs. closed classes
Parts of speech can be divided into two broad supercategories.
Closed classes are those with relatively fixed membership, such as:
Prepositions: on, under, over, near, by, at, from, to, with
Determiners: a, an, the
Pronouns: she, who, I
Conjunctions: and, but, or, as, if, when
Auxiliary verbs: can, may, should, are
Particles: up, down, on, off, in, out, at, by
Numerals: one, two, three, first, second, third
Why “closed”? Their membership is relatively fixed.
Open classes are nouns, main verbs, adjectives, and adverbs.
Why “open”? New nouns and verbs like iPhone or to fax are continually being created or borrowed.

Penn Treebank POS (45 tags now used instead of 8)

POS Tagging
The POS tagging problem is to determine the POS tag for a particular instance of a word. This is needed because words are ambiguous: a word can have more than one possible POS.
Example of a word with more than one POS: “back”
The back door = JJ (adjective)
On my back = NN (noun)
Win the voters back = RB (adverb)
Promised to back the bill = VB (verb, base form)

Deciding on the correct part of speech can be difficult even for people:
Mrs/NNP Shaefer/NNP never/RB got/VBD around/RP to/TO joining/VBG
All/DT we/PRP gotta/VBN do/VB is/VBZ go/VB around/IN the/DT corner/NN
Chateau/NNP Petrus/NNP costs/VBZ around/RB 250/CD

How hard is the tagging problem, and how common is tag ambiguity? Fig. 9.2 shows the answer for the Brown and WSJ corpora, tagged using the 45-tag Penn Treebank tagset.
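One way to make "how common is tag ambiguity?" concrete is to count it on a tagged corpus. The sketch below is only an illustration: the function name `ambiguity_stats` and the toy tokens are assumptions, not part of the slides, and real figures would come from a full corpus such as Brown or WSJ.

```python
from collections import defaultdict

def ambiguity_stats(tagged_tokens):
    """Return (share of word types, share of tokens) that have more than one tag."""
    if not tagged_tokens:
        return 0.0, 0.0
    tags_by_word = defaultdict(set)
    for word, tag in tagged_tokens:
        tags_by_word[word].add(tag)
    # A word type is ambiguous if it was seen with at least two different tags.
    ambiguous = {word for word, tags in tags_by_word.items() if len(tags) > 1}
    type_rate = len(ambiguous) / len(tags_by_word)
    token_rate = sum(1 for word, _ in tagged_tokens if word in ambiguous) / len(tagged_tokens)
    return type_rate, token_rate

# Toy corpus (hypothetical, for illustration only).
toy_tokens = [
    ("the", "DT"), ("back", "JJ"), ("door", "NN"),
    ("on", "IN"), ("my", "PRP$"), ("back", "NN"),
]
type_rate, token_rate = ambiguity_stats(toy_tokens)
print(f"ambiguous types: {type_rate:.0%}, ambiguous tokens: {token_rate:.0%}")
```

The two rates usually differ because ambiguous words tend to be frequent, so a small share of types can cover a much larger share of tokens.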

POS tagging performance
How many tags are correct? This measure is called tag accuracy.
The baseline already reaches about 90% tag accuracy. The baseline algorithm resolves ambiguity in the simplest possible way: tag every word with its most frequent tag, and tag unknown words as nouns.
Why does the baseline do so well? Partly because much of the task is easy:
Many words are unambiguous (85%-86%).
You get points for them (the, a, etc.) and for punctuation marks!
Many ambiguous words are easy to disambiguate. For example, “a” can be a determiner or the letter a (perhaps as part of an acronym or an initial), but the determiner sense of a is much more likely.
Current models such as HMMs and MEMMs reach about 97% tag accuracy.
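The baseline described above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names and the toy training sentences are assumptions made for the example.

```python
from collections import Counter, defaultdict

def train_most_frequent_tag(tagged_sentences):
    """Map each word seen in training to its most frequent tag."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def tag_with_baseline(words, most_frequent_tag, unknown_tag="NN"):
    """Tag known words with their most frequent tag and unknown words as nouns."""
    return [most_frequent_tag.get(word, unknown_tag) for word in words]

# Toy training data (hypothetical, for illustration only).
toy_training = [
    [("the", "DT"), ("back", "JJ"), ("door", "NN")],
    [("on", "IN"), ("my", "PRP$"), ("back", "NN")],
    [("my", "PRP$"), ("back", "NN"), ("hurts", "VBZ")],
]
model = train_most_frequent_tag(toy_training)
print(tag_with_baseline(["the", "back", "bill"], model))
```

Here "back" receives NN because that is its most frequent tag in the toy data, and the unseen word "bill" falls back to NN as an unknown word.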

Part-of-speech tagging revisited: A simple but useful form of linguistic analysis. Christopher Manning

Sources of information
What are the main sources of information for POS tagging?
Knowledge of neighboring words. Each word below is shown with its candidate tags:
Bill   saw     that   man   yesterday
NNP    NN      DT     NN    NN
VB     VB(D)   IN     VB    NN
Knowledge of word probabilities (the most frequent tag for each word): man is rarely used as a verb…
These are the sources of information that HMM taggers use.
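To show how an HMM tagger combines the two sources, here is a minimal Viterbi sketch for a bigram HMM. Everything numeric in it is an assumption: the three-tag tagset and the start, transition, and emission probabilities are made-up toy values, and unseen events get a small floor probability instead of proper smoothing.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-8):
    """Return the most probable tag sequence under a bigram HMM (log space)."""
    if not words:
        return []

    def log_p(table, key):
        # Unseen events get a small floor probability (a crude stand-in for smoothing).
        return math.log(table.get(key, floor))

    # best[i][t] = (log-probability of the best path ending in tag t at word i, previous tag)
    best = [{t: (log_p(start_p, t) + log_p(emit_p, (t, words[0])), None) for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            # Neighboring-tag knowledge: the transition from the previous tag p to t.
            score, prev = max((best[i - 1][p][0] + log_p(trans_p, (p, t)), p) for p in tags)
            # Word knowledge: how likely tag t is to emit this word.
            column[t] = (score + log_p(emit_p, (t, words[i])), prev)
        best.append(column)

    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(words) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    return list(reversed(path))

# Toy parameters (hypothetical values, for illustration only).
TAGS = ["DT", "NN", "VB"]
START_P = {"DT": 0.6, "NN": 0.3, "VB": 0.1}
TRANS_P = {
    ("DT", "NN"): 0.9, ("DT", "VB"): 0.05, ("DT", "DT"): 0.05,
    ("NN", "VB"): 0.5, ("NN", "NN"): 0.4, ("NN", "DT"): 0.1,
    ("VB", "DT"): 0.6, ("VB", "NN"): 0.3, ("VB", "VB"): 0.1,
}
EMIT_P = {("DT", "that"): 0.2, ("NN", "man"): 0.01, ("VB", "man"): 0.001}

print(viterbi(["that", "man"], TAGS, START_P, TRANS_P, EMIT_P))
```

With these toy numbers, "man" comes out as NN for two reasons that match the slide: a noun is likely after a determiner, and "man" is rarely emitted by the verb tag.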

More and Better Features → Feature-based tagger
Can do surprisingly well just looking at a word by itself:
Word: the: the → DT
Lowercased word: Importantly: importantly → RB
Prefixes: unfathomable: un- → JJ
Suffixes: Importantly: -ly → RB
Capitalization: Meridian: CAP → NNP
Word shapes: 35-year: d-x → JJ
Then build a maxent (or whatever) model to predict the tag.
Maxent P(t|w): 93.7% overall / 82.6% unknown
This is already doing better than the most-frequent-tag baseline.
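The word-internal features listed above could be extracted roughly as follows. This is a sketch under assumptions: the feature-string format, the function names, and the maximum affix length of 3 are illustrative choices, and the classifier that consumes the features is not shown.

```python
def word_shape(word):
    """Collapse a word into a shape: digits -> d, lowercase -> x, uppercase -> X."""
    shape = []
    for ch in word:
        if ch.isdigit():
            symbol = "d"
        elif ch.isalpha():
            symbol = "X" if ch.isupper() else "x"
        else:
            symbol = ch  # keep punctuation such as '-' as is
        # Collapse runs of the same symbol, so "35-year" becomes "d-x".
        if not shape or shape[-1] != symbol:
            shape.append(symbol)
    return "".join(shape)

def word_features(word, max_affix=3):
    """Return a set of string features computed from the word alone."""
    features = {
        "word=" + word,
        "lower=" + word.lower(),
        "shape=" + word_shape(word),
    }
    if word[:1].isupper():
        features.add("capitalized")
    for n in range(1, min(max_affix, len(word)) + 1):
        features.add("prefix=" + word[:n].lower())
        features.add("suffix=" + word[-n:].lower())
    return features

print(sorted(word_features("Importantly")))
print(word_shape("35-year"))
```

Features like `suffix=ly` or `shape=d-x` are what let a model make a reasonable guess for words it has never seen, which is where the unknown-word accuracy gain comes from.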

Overview: POS Tagging Accuracies
Rough accuracies (overall / unknown words):
Most freq tag:              ~90%  / ~50%
Trigram HMM:                ~95%  / ~55%
Maxent P(t|w):              93.7% / 82.6%
TnT (HMM++):                96.2% / 86.0%
MEMM tagger:                96.9% / 86.9%
Bidirectional dependencies: 97.2% / 90.0%
Upper bound:                ~98% (human agreement)
Most errors are on unknown words.
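The two numbers reported for each model can be computed as in the sketch below. It assumes that "unknown" means a word absent from the training vocabulary; the function name and the toy data are hypothetical.

```python
def tagging_accuracy(gold, predicted_tags, known_words):
    """Return (overall accuracy, accuracy on words not in known_words).

    gold is a list of (word, tag) pairs; predicted_tags is a parallel list of tags.
    """
    total = correct = unknown_total = unknown_correct = 0
    for (word, gold_tag), predicted_tag in zip(gold, predicted_tags):
        hit = gold_tag == predicted_tag
        total += 1
        correct += hit
        if word not in known_words:
            unknown_total += 1
            unknown_correct += hit
    overall = correct / total if total else 0.0
    unknown = unknown_correct / unknown_total if unknown_total else 0.0
    return overall, unknown

# Toy evaluation data (hypothetical, for illustration only).
gold = [("the", "DT"), ("blorf", "NN"), ("ran", "VBD"), ("zib", "JJ")]
predicted = ["DT", "NN", "VBD", "NN"]
overall, unknown = tagging_accuracy(gold, predicted, known_words={"the", "ran"})
print(f"overall: {overall:.1%}, unknown words: {unknown:.1%}")
```

Reporting the unknown-word figure separately matters because unknown words are a small share of tokens, so large differences there barely move the overall number.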

How to improve supervised results?
Build better features!

They left as soon as he arrived .
PRP VBD IN RB IN PRP VBD .   (tagger output; the first “as” should be RB)
We could fix this with a feature that looked at the next word.

Intrinsic flaws remained undetected .
NNP NNS VBD VBN .   (tagger output; “Intrinsic” should be JJ)
We could fix this by linking capitalized words to their lowercase versions.

Tagging Without Sequence Information
Baseline: predict t0 from w0
Three Words: predict t0 from w-1, w0, w1

Model      Features   Token    Unknown   Sentence
Baseline   56,805     93.69%   82.61%    26.74%
3Words     239,767    96.57%   86.78%    48.27%

Using words only in a straight classifier works as well as a basic (HMM or discriminative) sequence model!!
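The "Three Words" setup uses only the current word and its two neighbors, with no tag-sequence information. A minimal sketch of that feature window follows; the function name, the feature-string format, and the `<PAD>` boundary symbol are assumptions made for the example.

```python
def three_word_features(words, i, pad="<PAD>"):
    """Features for tagging words[i] from w-1, w0, and w1 only (no neighboring tags)."""
    previous_word = words[i - 1] if i > 0 else pad
    next_word = words[i + 1] if i + 1 < len(words) else pad
    return ["w-1=" + previous_word, "w0=" + words[i], "w+1=" + next_word]

sentence = ["They", "left", "as", "soon", "as", "he", "arrived", "."]
print(three_word_features(sentence, 2))  # features for the first "as"
print(three_word_features(sentence, 0))  # sentence-initial word, padded on the left
```

Dropping the `w-1` and `w+1` entries gives the Baseline configuration, which predicts t0 from w0 alone.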