CS460/IT632 Natural Language Processing/Language Technology for the Web Lecture 2 (06/01/06) Prof. Pushpak Bhattacharyya IIT Bombay Part of Speech (PoS) Tagging

Tagging or Annotation The purpose is disambiguation: a word can have a number of labels, and the problem is to assign the unique correct one. PoS tagging makes use of the "local context", whereas sense tagging needs "long-distance dependencies" and is hence harder. PoS tagging is needed mainly in parsing, and also in other applications.

Approaches Rule-based approach Statistical approach We will mainly focus on the statistical approach.

Types of Tagging Tasks PoS Named entity Sense Parse tree

PoS Tagging Example "The Orange ducks clean the bills." Assign tags to each word from the lexicon; multiple possibilities exist.

Lexicon (dictionary) The: DT (determiner) Orange: NN (noun), JJ (adjective) Duck: NN, VB (base verb) Clean: NN, VB Bill: NN, VB Labels such as DT, JJ, NN and VB are called syntactic categories or PoS tags.

PoS tagging as a sequence labelling task The task is to assign the correct PoS tag sequence to the words. The model can be: Unigram: consider one word at a time while deciding its tag. Multigram: consider multiple words. There are 16 (= 1*2*2*2*1*2) possible sequences for the "Duck" example. It is a classification problem: classify each word into its right tag category.
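The 16-sequence count can be reproduced mechanically: every candidate tag sequence is one choice of tag per word. A minimal Python sketch (the dictionary below simply restates the slide's per-word tag options; the variable names are my own):

```python
from itertools import product

# Per-word tag options, restated from the slide's lexicon.
lexicon = {
    "The":    ["DT"],
    "Orange": ["NN", "JJ"],
    "ducks":  ["NN", "VB"],
    "clean":  ["NN", "VB"],
    "the":    ["DT"],
    "bills":  ["NN", "VB"],
}
sentence = ["The", "Orange", "ducks", "clean", "the", "bills"]

# Cartesian product over the per-word options gives all candidate sequences.
candidates = list(product(*(lexicon[w] for w in sentence)))
print(len(candidates))  # 1*2*2*2*1*2 = 16
```

The tagger's job is to pick the one correct sequence out of these 16 candidates.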

Challenges Lexical ambiguity: multiple tag choices per word. Morphological analysis: finding the root word. Tokenization: finding word boundaries. In Thai there are no blank spaces between words. Non-trivial even otherwise (example: capturing boundaries when a word is continued on the next line with a "-").

Named Entity tagging Example 1: "Mohan went to school in Kolkata" is tagged as "Mohan_Person went to school_Place in Kolkata_Place". Example 2: "Kolkata bore the brunt of 1947 riots when 1947 children died at Kolkata." is tagged as "Kolkata_? bore the brunt of 1947_year riots when 1947_num children died at Kolkata_Place."

Sense tagging Detecting the meaning. Our example is tagged as: "The Orange_{colour} ducks_{bird} clean the bills_{body_part}". Sense tagging here is done by means of hypernymy. Semantic relations like hypernymy are stored in the lexical resource called WordNet.

Parse Tree tagging Example parse tree:

Parse Tree tagging (contd.) Given a grammar, one can construct the parse tree. Annotation produces the following structure: [[The_DT [Orange_JJ ducks_NN]NP]NP [clean_VB [the_DT [bills_NN]NP]NP]VP]S This structure is called the Penn Treebank form. From the Treebank form, one can learn a grammar.
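The bracketed string can be turned back into a nested tree structure with a few lines of standard Python. This is only an illustrative sketch for the slide's bracket notation, where the phrase label follows the closing bracket (the actual Penn Treebank format puts the label after the opening bracket instead):

```python
import re

def parse_brackets(s):
    """Parse a slide-style bracketing like '[a_DT [b_NN]NP]NP' into
    nested (label, children) pairs; leaf tokens stay as 'word_TAG' strings."""
    tokens = re.findall(r'\[|\][A-Z]*|[^\s\[\]]+', s)
    stack = [[]]
    for tok in tokens:
        if tok == '[':
            stack.append([])                 # open a new constituent
        elif tok.startswith(']'):
            node = (tok[1:], stack.pop())    # label follows the closing bracket
            stack[-1].append(node)
        else:
            stack[-1].append(tok)            # leaf: word_TAG
    return stack[0][0]

tree = parse_brackets(
    "[[The_DT [Orange_JJ ducks_NN]NP]NP [clean_VB [the_DT [bills_NN]NP]NP]VP]S")
print(tree[0])  # 'S': the root label, with an NP and a VP child below it
```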

Statistical Formulation of the PoS tagging problem Input: W1, W2, ..., Wn: the words. C1, C2, ..., Cm: the repository of lexical tags (DT, JJ, NN, etc.). Output: the "best" PoS tag sequence Ci1, Ci2, ..., Cin for the given words. "Best" means: P(Ci1, Ci2, ..., Cin | W1, W2, ..., Wn) is the maximum over all possible C-sequences.

Statistical Formulation of the PoS tagging problem (contd.) Example: P(DT JJ NN | The Orange duck) > P(DT NN VB | The Orange duck) is required. Why? Because given the phrase "The orange duck", there is overwhelming evidence in the corpus that "DT JJ NN" is the right tag sequence.

Mathematical machinery

Bayes' Theorem P(A|B) = P(A) . P(B|A) / P(B) where P(A) is the prior probability, P(A|B) the posterior probability, and P(B|A) the likelihood. Why apply Bayes' theorem? This is the generative vs. discriminative model question.
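As a quick numeric sanity check of the formula, here is a tiny sketch with made-up probabilities (the numbers are purely illustrative, not from the lecture); P(B) is expanded by the law of total probability:

```python
# Bayes' theorem: P(A|B) = P(A) * P(B|A) / P(B), with made-up numbers.
p_a = 0.3              # prior P(A)
p_b_given_a = 0.8      # likelihood P(B|A)
p_b_given_not_a = 0.5  # likelihood under not-A, needed to obtain P(B)

# Law of total probability: P(B) = P(A)P(B|A) + P(not A)P(B|not A)
p_b = p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a

p_a_given_b = p_a * p_b_given_a / p_b  # posterior P(A|B)
print(round(p_a_given_b, 4))  # 0.24 / 0.59, i.e. about 0.4068
```

Observing B raises the probability of A from the prior 0.3 to roughly 0.41, because B is more likely under A than under not-A.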

Applying Bayes' theorem P(Ci1, Ci2, ..., Cin | W1, W2, ..., Wn) = P(C|W) = P(C) . P(W|C) / P(W) where C = <Ci1, Ci2, ..., Cin> and W = <W1, W2, ..., Wn>.

Best tag sequence C* = <Ci1, Ci2, ..., Cin>*, where * signifies the best C-sequence: C* = argmax(P(C|W)). Since the denominator P(W) is common to all tag sequences, C* = argmax(P(C) . P(W|C)).

Processing the first part P(C) = P(Ci1, Ci2, ..., Cin) = P(Ci1) . P(Ci2|Ci1) . P(Ci3|Ci1 Ci2) ... P(Cin|Ci1 Ci2 ... Cin-1) (by the chain rule of probability). Example: P(DT JJ NN) = P(DT) . P(JJ|DT) . P(NN|DT JJ).

Markov assumption A tag depends only on a window of preceding tags, not on everything that the chain rule of probability demands. A Kth-order Markov assumption considers only the previous K tags. Typical values: K = 3 for English, and (it seems) 5 for Hindi.

Applying the assumption With K = 2 (the bigram case), the problem becomes: P(C) = product over i = 1..n of P(Ci|Ci-1), where C0 is the sentence-beginning marker.
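Putting the pieces together, C* = argmax P(C) . P(W|C) can be computed by brute force over the 16 candidate sequences of the "Duck" example. Every probability below is invented purely for illustration (a real tagger would estimate them from a tagged corpus); '^' is the sentence-beginning marker C0:

```python
from itertools import product

# Made-up bigram tag probabilities P(tag | previous tag); '^' marks sentence start.
trans = {
    ('^', 'DT'): 0.6, ('DT', 'JJ'): 0.3, ('DT', 'NN'): 0.5,
    ('JJ', 'NN'): 0.7, ('JJ', 'VB'): 0.1,
    ('NN', 'VB'): 0.4, ('NN', 'NN'): 0.2, ('NN', 'DT'): 0.1,
    ('VB', 'DT'): 0.5, ('VB', 'NN'): 0.2, ('VB', 'VB'): 0.05,
}
# Made-up lexical likelihoods P(word | tag).
emit = {
    ('The', 'DT'): 0.4, ('Orange', 'JJ'): 0.01, ('Orange', 'NN'): 0.005,
    ('ducks', 'NN'): 0.01, ('ducks', 'VB'): 0.005,
    ('clean', 'VB'): 0.02, ('clean', 'NN'): 0.001,
    ('the', 'DT'): 0.5, ('bills', 'NN'): 0.01, ('bills', 'VB'): 0.002,
}
options = {"The": ["DT"], "Orange": ["NN", "JJ"], "ducks": ["NN", "VB"],
           "clean": ["NN", "VB"], "the": ["DT"], "bills": ["NN", "VB"]}
words = ["The", "Orange", "ducks", "clean", "the", "bills"]

def score(tags):
    """P(C) * P(W|C) under the bigram and per-word lexical assumptions."""
    p, prev = 1.0, '^'
    for w, t in zip(words, tags):
        # Unseen transitions/emissions get a tiny floor instead of zero.
        p *= trans.get((prev, t), 1e-6) * emit.get((w, t), 1e-9)
        prev = t
    return p

best = max(product(*(options[w] for w in words)), key=score)
print(best)  # ('DT', 'JJ', 'NN', 'VB', 'DT', 'NN')
```

Brute force is fine for 16 candidates; for long sentences the same argmax is computed efficiently with the Viterbi algorithm over the bigram model.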

Exercise given in the lecture Contrast PoS tagging with sense tagging. Find an example to show the difference.