Evaluation Which of the three taggers did best?

Presentation transcript:

Evaluation Which of the three taggers did best? We also introduce a "baseline": the Most-Frequent-Tag algorithm. No one would use it if they really wanted some data tagged, but it's useful as a point of comparison.

Most Frequent Tag We could count tag frequencies in a corpus, e.g.: would/MD prohibit/VB a/DT suit/NN for/IN refund/NN of/IN section/NN 381/CD (/( a/NN )/) ./. Counts for the word "a" from the Brown Corpus (part-of-speech tagged at U Penn): 21830 DT, 6 NN, 3 FW

The Most Frequent Tag algorithm Training: take a tagged corpus; for each word, count the number of times each tag occurs and build a dictionary of its possible tags. Tagging: given a new sentence, for each word pick the tag that was most frequent for that word in the corpus. NOTE: the dictionary comes from the corpus.
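The algorithm above can be sketched in a few lines of Python. This is a minimal illustration, not any particular library's implementation; the names `train_most_frequent_tag` and `tag`, the toy corpus, and the default tag "NN" for unseen words are all assumptions for the example.

```python
from collections import Counter, defaultdict

def train_most_frequent_tag(tagged_sentences):
    """Build a word -> most-frequent-tag dictionary from a tagged corpus."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag_ in sentence:
            counts[word][tag_] += 1
    # Keep only the single most frequent tag for each word
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}

def tag(sentence, dictionary, default="NN"):
    """Tag each word with its most frequent tag; unseen words get the default."""
    return [(w, dictionary.get(w, default)) for w in sentence]

# Toy training corpus in (word, tag) form
corpus = [[("a", "DT"), ("suit", "NN")],
          [("a", "DT"), ("suit", "NN"), ("a", "NN")]]
d = train_most_frequent_tag(corpus)
print(tag(["a", "suit", "refund"], d))
```

Here "a" was seen twice as DT and once as NN, so the dictionary keeps DT, and the unseen word "refund" falls back to the default tag.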

Using a corpus to build a dictionary The/DT City/NNP Purchasing/NNP Department/NNP ,/, the/DT jury/NN said/VBD ,/, is/VBZ lacking/VBG in/IN experienced/VBN clerical/JJ personnel/NNS … From this sentence, the dictionary contains: clerical, department, experienced, in, is, jury, …
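Corpora like the one quoted above are often distributed in slash-tagged "word/TAG" form, so a small parser is all that's needed to feed them into the dictionary builder. A minimal sketch, assuming that format (the function name `parse_tagged` is hypothetical):

```python
def parse_tagged(line):
    """Split a slash-tagged corpus line into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # rpartition splits on the LAST slash, so words containing '/' survive
        word, _, tag_ = token.rpartition("/")
        pairs.append((word, tag_))
    return pairs

print(parse_tagged("The/DT jury/NN said/VBD"))
```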

Evaluating performance How do we know how well a tagger does? Gold Standard: a test sentence, or a set of test sentences, already tagged by a human. We run the tagger on this set of test sentences and see how many of the tags it got right. This is called "tag accuracy" or "tag percent correct".

Test set We take a set of test sentences and hand-label them for part of speech; the result is a "Gold Standard" test set. Who does this? Brown corpus: done by U Penn grad students in linguistics. Don't they disagree? Yes! But they agree on about 97% of tags, and if you let the taggers discuss the remaining 3%, they often reach agreement.

Training and test sets But we can't train our frequencies on the test-set sentences. (Why not? A tagger evaluated on the same data it was trained on will look better than it really is.) So for testing the Most-Frequent-Tag algorithm (or any other stochastic algorithm), we need 2 things: a hand-labeled training set, the data we compute frequencies from; and a hand-labeled test set, the data we use to compute our percent correct.
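The train/test separation described above is a one-line split in code. A minimal sketch, assuming sentences are already shuffled or in no meaningful order (the name `split_corpus` and the 90/10 default are illustrative):

```python
def split_corpus(sentences, train_frac=0.9):
    """Split a labeled corpus into a training portion and a held-out test portion."""
    cut = int(len(sentences) * train_frac)
    return sentences[:cut], sentences[cut:]

# Toy corpus of 10 "sentences": 9 go to training, 1 is held out for testing
train_set, test_set = split_corpus(list(range(10)))
print(len(train_set), len(test_set))  # 9 1
```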

Computing % correct Of all the words in the test set, for what percent of them did the tag chosen by the tagger equal the human-selected ("Gold Standard") tag?
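The percent-correct metric is a straightforward ratio. A minimal sketch, assuming both the tagger output and the gold standard are lists of (word, tag) pairs in the same order (the name `tag_accuracy` is hypothetical):

```python
def tag_accuracy(predicted, gold):
    """Percent of test-set words whose predicted tag equals the gold-standard tag."""
    correct = sum(p_tag == g_tag
                  for (_, p_tag), (_, g_tag) in zip(predicted, gold))
    return 100.0 * correct / len(gold)

gold = [("the", "DT"), ("suit", "NN"), ("fits", "VBZ")]
pred = [("the", "DT"), ("suit", "NN"), ("fits", "NNS")]
print(tag_accuracy(pred, gold))  # 2 of 3 tags match
```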

Training and Test sets Often they come from the same labeled corpus! We just use 90% of the corpus for training and hold out 10% for testing. Even better: cross-validation. Take 90% for training and 10% for testing and get a percent correct; now take a different 10% as the test set and the other 90% for training, and get another percent correct; do this 10 times and average the results.
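The 10-fold procedure above can be sketched generically: rotate which 10% slice is held out, train on the rest, and average the scores. This is an illustration, not a specific library's API; `train_fn` and `eval_fn` stand in for whatever training and scoring functions the tagger provides.

```python
def cross_validate(sentences, train_fn, eval_fn, folds=10):
    """Average a score over `folds` rotating train/test splits of the corpus."""
    n = len(sentences)
    scores = []
    for i in range(folds):
        lo, hi = i * n // folds, (i + 1) * n // folds
        test = sentences[lo:hi]                  # this fold's held-out slice
        train = sentences[:lo] + sentences[hi:]  # the remaining ~90%
        model = train_fn(train)
        scores.append(eval_fn(model, test))
    return sum(scores) / folds

# Dummy model: "training" does nothing, "evaluation" reports the test-fold size
avg = cross_validate(list(range(20)), lambda tr: None, lambda m, te: len(te))
print(avg)  # 2.0
```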

Evaluation and rule-based taggers Does the same evaluation metric work for rule-based taggers? Yes! Rule-based taggers don't need the training set, but they still need a test set to see how well the rules are working.

Results Baseline (most frequent tag): 91%. Rule-based: not reported. TBL (transformation-based learning): 97.2% accuracy when trained on 600,000 words; 96.7% when trained on 64,000 words. Markov model: 96.7% when trained on 1 million words; 96.3% when trained on 64,000 words.