Presentation transcript:

1 Catching Up CS 4705

2 Porter Stemmer (1980)
Used for tasks in which you only care about the stem
–IR, modeling the given/new distinction, topic detection, document similarity
Lexicon-free morphological analysis
Cascades rewrite rules (e.g. misunderstanding → misunderstand → understand → …)
Easily implemented as an FST with rules such as ATIONAL → ATE, ING → ε
Not perfect…
–Doing → doe

3 –Policy → police
Does stemming help?
–IR: a little
–Topic detection: more
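The cascade of rewrite rules described above can be sketched in a few lines. This is an illustrative toy, not the full Porter (1980) algorithm: the rule list is a made-up fragment, and the real stemmer also conditions each rule on the shape of the stem, which is where errors like Doing → doe and Policy → police creep in.

```python
# Toy sketch of a Porter-style cascade of suffix rewrite rules.
# The rule list is illustrative, not the full Porter (1980) rule set.
RULES = [
    ("ational", "ate"),  # relational -> relate
    ("ing", ""),         # monitoring -> monitor
    ("sses", "ss"),      # caresses -> caress
]

def stem(word):
    # Apply each rule in order; later rules see the rewritten form.
    for suffix, replacement in RULES:
        if word.endswith(suffix):
            word = word[:-len(suffix)] + replacement
    return word
```

Because the rules cascade, one pass can fire several rewrites in sequence, which is what makes the whole thing expressible as a composition of finite-state transducers.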

4 Statistical POS Tagging
Goal: choose the best sequence of tags T for a sequence of words W in a sentence
–T̂ = argmax_T P(T | W)
–By Bayes' Rule: T̂ = argmax_T P(W | T) P(T) / P(W)
–Since P(W) is the same for every candidate T, we can ignore it: T̂ = argmax_T P(W | T) P(T)
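A tiny brute-force sketch of that final argmax, assuming a hand-made bigram tag model P(T) and word-likelihood table P(W | T) (all probabilities below are illustrative). Real taggers use Viterbi decoding rather than enumerating every tag sequence.

```python
from itertools import product

# Brute-force argmax_T P(W|T) P(T), with P(T) approximated by tag bigrams
# starting from a sentence-boundary marker "<s>". Unseen events get a
# small floor probability. Toy model; real taggers use Viterbi.
def best_tags(words, tagset, p_word_given_tag, p_tag_bigram, floor=1e-6):
    best, best_p = None, -1.0
    for tags in product(tagset, repeat=len(words)):
        p, prev = 1.0, "<s>"
        for w, t in zip(words, tags):
            p *= p_tag_bigram.get((prev, t), floor) * p_word_given_tag.get((w, t), floor)
            prev = t
        if p > best_p:
            best, best_p = list(tags), p
    return best
```

With a model that strongly prefers DT after a sentence start and NN for "can" after DT, the sketch picks the expected sequence.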

5 Brill Tagging: TBL
Start with simple (less accurate) rules… learn better ones from a tagged corpus
–Tag each word initially with its most likely POS
–Examine a set of transformations to see which most improves tagging decisions compared to the tagged corpus
–Re-tag the corpus
–Repeat until, e.g., performance no longer improves
–Result: a tagging procedure that can be applied to new, untagged text

6 An Example
The horse raced past the barn fell.
The/DT horse/NN raced/VBN past/IN the/DT barn/NN fell/VBD ./.
1) Tag every word with its most likely tag and score:
The/DT horse/NN raced/VBD past/NN the/DT barn/NN fell/VBD ./.
2) For each template, try every instantiation (e.g. change VBD to VBN when the preceding word is tagged NN), add the best rule to the ruleset, re-tag the corpus, and score

7
3) Stop when no transformation improves the score
4) Result: a set of transformation rules that can be applied to new, untagged data (after initializing with the most common tag)
What problems will this process run into?

8 Methodology: Evaluation
For any NLP problem, we need to know how to evaluate our solutions
Possible gold standards (ceiling):
–Annotated naturally occurring corpus
–Human task performance (96–97%)
How well do humans agree?
Kappa statistic: average pairwise agreement, corrected for chance agreement
–Can be hard to obtain for some tasks: sometimes humans don't agree
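The chance-corrected agreement the slide mentions can be computed for one annotator pair as Cohen's kappa (the pairwise scores are then averaged across pairs). A minimal sketch, with illustrative tag sequences:

```python
from collections import Counter

# Cohen's kappa for two annotators: observed agreement p_o, corrected by
# the agreement p_e expected if each annotator labeled at random from
# their own label distribution.
def cohen_kappa(a, b):
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    p_e = sum(ca[lab] * cb[lab] for lab in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)
```

Kappa is 1 for perfect agreement and near 0 when the observed agreement is no better than chance, which is why it is a fairer ceiling than raw agreement.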

9 Baseline: how well does a simple method do?
–For tagging: the most common tag for each word (91%)
–How much improvement do we get over the baseline?
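That 91% baseline amounts to a dictionary lookup over training counts. A sketch, with made-up training data and an assumed default tag for unseen words:

```python
from collections import Counter, defaultdict

# Unigram baseline: tag each word with the tag it received most often in
# training; fall back to a default tag for unseen words.
def train_baseline(tagged_sents):
    counts = defaultdict(Counter)
    for sent in tagged_sents:
        for word, tag in sent:
            counts[word][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag_baseline(words, model, default="NN"):
    return [model.get(w, default) for w in words]
```

Any proposed tagger has to beat this lookup to justify its complexity.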

10 Methodology: Error Analysis
Confusion matrix:
–E.g. which tags did we most often confuse with which other tags?
–How much of the overall error does each confusion account for?
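A confusion matrix over (gold, predicted) tag pairs is just a counter; sorting the off-diagonal cells answers both questions on the slide. The tag sequences below are toy data:

```python
from collections import Counter

# Count (gold, predicted) tag pairs; off-diagonal entries are confusions.
def confusion_matrix(gold, pred):
    return Counter(zip(gold, pred))

# List confusions by frequency, largest first.
def top_confusions(gold, pred):
    cm = confusion_matrix(gold, pred)
    return sorted(((c, g, p) for (g, p), c in cm.items() if g != p), reverse=True)
```

Dividing each confusion's count by the total number of errors gives its share of the overall error.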

11 More Complex Issues
Tag indeterminacy: when 'truth' isn't clear
–Caribbean cooking, child seat
Tagging multipart words
–wouldn't → would/MD n't/RB
Unknown words
–Assume all tags equally likely
–Assume the same tag distribution as all other singletons in the corpus
–Use morphology, word length, …
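The morphology cue in the last bullet can be as simple as a capitalization and suffix back-off for words never seen in training. The suffix-to-tag list here is illustrative, not learned from a corpus:

```python
# Toy unknown-word guesser using capitalization and suffix cues.
# The suffix-to-tag list is an illustrative assumption.
SUFFIX_TAGS = [("ing", "VBG"), ("ed", "VBD"), ("ly", "RB"), ("s", "NNS")]

def guess_tag(word):
    if word[:1].isupper():
        return "NNP"  # capitalized: guess proper noun
    for suffix, tag in SUFFIX_TAGS:
        if word.endswith(suffix):
            return tag
    return "NN"  # default: common noun
```

In a probabilistic tagger these cues would set the lexical distribution P(word | tag) for unknown words rather than a single hard tag.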