Introduction to Syntax, with Part-of-Speech Tagging
Owen Rambow
September 17 & 19

Admin Stuff
o These slides are available at:
o For Eliza in the homework, you can use a tagger or chunker if you want – details at:
o Special office hours (Ani): today after class, tomorrow at 10am in CEPSR 721

Statistical POS Tagging
Want to choose the most likely string of tags (T), given the string of words (W):
W = w_1, w_2, …, w_n
T = t_1, t_2, …, t_n
I.e., we want argmax_T p(T | W)
Problem: sparse data

Statistical POS Tagging (ctd)
p(T|W) = p(T,W) / p(W) = p(W|T) p(T) / p(W)
argmax_T p(T|W) = argmax_T p(W|T) p(T) / p(W) = argmax_T p(W|T) p(T)

Statistical POS Tagging (ctd)
p(T) = p(t_1, t_2, …, t_{n-1}, t_n)
     = p(t_n | t_1, …, t_{n-1}) p(t_1, …, t_{n-1})
     = p(t_n | t_1, …, t_{n-1}) p(t_{n-1} | t_1, …, t_{n-2}) p(t_1, …, t_{n-2})
     = ∏_i p(t_i | t_1, …, t_{i-1})
     ≈ ∏_i p(t_i | t_{i-2}, t_{i-1})   (the trigram (n-gram) approximation)

Statistical POS Tagging (ctd)
p(W|T) = p(w_1, w_2, …, w_n | t_1, t_2, …, t_n)
       = ∏_i p(w_i | w_1, …, w_{i-1}, t_1, t_2, …, t_n)
       ≈ ∏_i p(w_i | t_i)

Statistical POS Tagging (ctd)
argmax_T p(T|W) = argmax_T p(W|T) p(T) ≈ argmax_T ∏_i p(w_i | t_i) p(t_i | t_{i-2}, t_{i-1})
Relatively easy to get data for parameter estimation (next slide)
But: need smoothing for unseen words
Easy to determine the argmax (Viterbi algorithm, in time linear in sentence length)
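To make the argmax concrete, here is a minimal Viterbi sketch in Python. As a simplification of the model above it uses bigram tag transitions (so a state is a single tag rather than a tag pair), and the probability tables are invented toy values, not estimates from data:

```python
import math

# Toy bigram HMM (invented numbers, for illustration only):
# trans[(prev_tag, tag)] = p(tag | prev_tag), emit[(word, tag)] = p(word | tag).
TAGS = ["Det", "N", "V"]
trans = {("<s>", "Det"): 0.8, ("<s>", "N"): 0.2,
         ("Det", "N"): 1.0,
         ("N", "V"): 0.6, ("N", "N"): 0.4,
         ("V", "Det"): 0.7, ("V", "N"): 0.3}
emit = {("the", "Det"): 0.5, ("a", "Det"): 0.5,
        ("boy", "N"): 0.4, ("girl", "N"): 0.4, ("likes", "N"): 0.2,
        ("likes", "V"): 1.0}

def viterbi(words):
    """Compute argmax_T prod_i p(w_i|t_i) p(t_i|t_{i-1}) in log space."""
    best = {"<s>": (0.0, [])}   # tag -> (log prob of best path ending here, path)
    for w in words:
        new_best = {}
        for tag in TAGS:
            e = emit.get((w, tag), 0.0)
            if e == 0.0:
                continue        # no smoothing: unseen (word, tag) pairs are pruned
            cands = []
            for prev, (lp, path) in best.items():
                t = trans.get((prev, tag), 0.0)
                if t > 0.0:
                    cands.append((lp + math.log(t) + math.log(e), path + [tag]))
            if cands:
                new_best[tag] = max(cands, key=lambda c: c[0])
        best = new_best
    return max(best.values(), key=lambda c: c[0])   # assumes some path survives

logp, tags = viterbi("the boy likes a girl".split())
print(tags)   # ['Det', 'N', 'V', 'Det', 'N']
```

The loop visits each word once and considers each (previous tag, tag) pair, so the runtime is linear in sentence length, as the slide says; extending the state to tag pairs gives the trigram version.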

Probability Estimation for Trigram POS Tagging
Maximum-likelihood estimation:
p'(w_i | t_i) = c(w_i, t_i) / c(t_i)
p'(t_i | t_{i-2}, t_{i-1}) = c(t_{i-2}, t_{i-1}, t_i) / c(t_{i-2}, t_{i-1})
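A minimal sketch of these relative-frequency estimates in Python, assuming a tiny hand-tagged corpus in word/TAG format (the corpus is invented for illustration):

```python
from collections import Counter

# Tiny hand-tagged corpus (invented, for illustration).
corpus = ("the/Det boy/N likes/V a/Det girl/N ./PUNC "
          "a/Det girl/N sees/V the/Det boy/N ./PUNC")

tokens = [tuple(tok.rsplit("/", 1)) for tok in corpus.split()]
tags = ["<s>", "<s>"] + [t for _, t in tokens]   # pad for trigram contexts

c_word_tag = Counter(tokens)
c_tag = Counter(t for _, t in tokens)
c_trigram = Counter(zip(tags, tags[1:], tags[2:]))
c_bigram = Counter(zip(tags, tags[1:]))

def p_emit(w, t):
    return c_word_tag[(w, t)] / c_tag[t]                  # p'(w_i | t_i)

def p_trans(t, t_prev2, t_prev1):
    return (c_trigram[(t_prev2, t_prev1, t)]
            / c_bigram[(t_prev2, t_prev1)])               # p'(t_i | t_{i-2}, t_{i-1})

print(p_emit("boy", "N"))        # c(boy, N) / c(N) = 2/4
print(p_trans("V", "Det", "N"))  # c(Det, N, V) / c(Det, N) = 2/4
```

A real implementation would reset the <s> padding at each sentence boundary and, as the previous slide notes, smooth these estimates rather than use raw MLE.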

Statistical POS Tagging
Method common to many tasks in speech & NLP: the “noisy channel model”, Hidden Markov Models

Back to Syntax
(((the/Det) boy/N) likes/V ((a/Det) girl/N))
[Tree diagram: the corresponding phrase-structure tree, with S dominating NP (DetP “the”, “boy”), “likes”, and NP (DetP “a”, “girl”)]
Phrase-structure tree:
o nonterminal symbols = constituents
o terminal symbols = words

Phrase Structure and Dependency Structure
[Diagram: the phrase-structure tree for “the boy likes a girl” shown next to the corresponding dependency tree: likes/V governs boy/N and girl/N; boy/N governs the/Det; girl/N governs a/Det]
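One way to make the correspondence concrete: given a table saying which child of each constituent is its head, the dependency tree can be read directly off the phrase-structure tree. A minimal sketch, where the tree encoding and the toy head table are illustrative assumptions (a stand-in for the head percolation tables mentioned on the Penn Treebank slide below):

```python
# A constituency tree as nested tuples: (label, [children]) for
# nonterminals, (POS, word) for leaves.
tree = ("S", [("NP", [("DetP", [("Det", "the")]), ("N", "boy")]),
              ("V", "likes"),
              ("NP", [("DetP", [("Det", "a")]), ("N", "girl")])])

# Toy head table: which child label carries the head of each constituent.
HEAD = {"S": "V", "NP": "N", "DetP": "Det"}

def head_and_deps(node, deps):
    """Return the head word of node, appending (governor, dependent) pairs."""
    label, rest = node
    if isinstance(rest, str):              # leaf: (POS, word)
        return rest
    heads = [head_and_deps(child, deps) for child in rest]
    head_idx = next(i for i, (lab, _) in enumerate(rest) if lab == HEAD[label])
    for i, h in enumerate(heads):
        if i != head_idx:
            deps.append((heads[head_idx], h))
    return heads[head_idx]

deps = []
root = head_and_deps(tree, deps)
print(root, deps)
# likes [('boy', 'the'), ('girl', 'a'), ('likes', 'boy'), ('likes', 'girl')]
```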

Types of Dependency
[Diagram: dependency tree for a sentence like “sometimes the very small boy likes a girl”, with labeled dependencies: likes/V has dependents boy/N (Subj), girl/N (Obj), and sometimes/Adv (Adj(unct)); boy/N has the/Det (Fw) and small/Adj (Adj); small/Adj has very/Adv (Adj); girl/N has a/Det (Fw)]

Grammatical Relations
Types of relations between words:
o Arguments: subject, object, indirect object, prepositional object
o Adjuncts: temporal, locative, causal, manner, …
o Function words

Subcategorization
List of arguments of a word (typically a verb), with features about their realization (POS, perhaps case, verb form, etc.)
In canonical order: Subject - Object - IndObj
Examples:
o like: N-N, N-V(to-inf)
o see: N, N-N, N-N-V(inf)
Note: J&M talk about subcategorization only within the VP
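A subcategorization lexicon is easy to picture as a mapping from words to lists of frames; the encoding below is an illustrative sketch of the slide's examples, not a standard format:

```python
# Toy subcategorization lexicon: frames as tuples of argument
# categories in the canonical Subject-Object-IndObj order.
SUBCAT = {
    "like": [("N", "N"), ("N", "V(to-inf)")],     # I like her / I like to swim
    "see":  [("N",), ("N", "N"), ("N", "N", "V(inf)")],
}

def licensed(verb, args):
    """Does the verb subcategorize for this sequence of arguments?"""
    return tuple(args) in SUBCAT.get(verb, [])

print(licensed("like", ["N", "N"]))          # True
print(licensed("see",  ["N", "V(to-inf)"]))  # False
```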

Where is the VP?
[Diagram: two candidate trees for “the boy likes a girl”: one flat, with S directly dominating NP, “likes”, and NP; one with a VP node, where S dominates NP and VP, and VP dominates “likes” and the object NP]

Where is the VP?
Existence of VP is a linguistic (empirical) claim, not a methodological claim
Semantic evidence???
Syntactic evidence:
o VP-fronting (and quickly clean the carpet he did!)
o VP-ellipsis (He cleaned the carpets quickly, and so did she)
o Can have adjuncts before and after the VP, but not in the VP (He often eats beans, *He eats often beans)
Note: in all right-branching structures, the issue is different again

Penn Treebank, Again
Syntactically annotated corpus (phrase structure)
The PTB is not naturally occurring data! It represents a particular linguistic theory (but a fairly “vanilla” one)
Particularities:
o Very indirect representation of grammatical relations (need for head percolation tables)
o Completely flat structure in NP (brown bag lunch, pink-and-yellow child seat)
o Has flat Ss, flat VPs

Context-Free Grammars
Defined in formal language theory (computer science)
Terminals, nonterminals, start symbol, rules
String-rewriting system: start with the start symbol, rewrite using the rules, done when only terminals are left

CFG: Example
Rules:
o S → NP VP
o VP → V NP
o NP → Det N | AdjP NP
o AdjP → Adj | Adv AdjP
o N → boy | girl
o V → sees | likes
o Adj → big | small
o Adv → very
o Det → a | the
A string this grammar generates: the very small boy likes a girl
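A minimal sketch of the rewriting process in Python, encoding the rules above and repeatedly expanding the leftmost nonterminal until only terminals remain (the random rule choice is just for illustration; the seed makes the run reproducible):

```python
import random

# The grammar from the slide: nonterminal -> list of right-hand sides.
RULES = {
    "S":    [["NP", "VP"]],
    "VP":   [["V", "NP"]],
    "NP":   [["Det", "N"], ["AdjP", "NP"]],
    "AdjP": [["Adj"], ["Adv", "AdjP"]],
    "N":    [["boy"], ["girl"]],
    "V":    [["sees"], ["likes"]],
    "Adj":  [["big"], ["small"]],
    "Adv":  [["very"]],
    "Det":  [["a"], ["the"]],
}

def derive(symbols):
    """Leftmost derivation: rewrite the first nonterminal at each step,
    printing every sentential form, until only terminals remain."""
    while any(s in RULES for s in symbols):
        print(" ".join(symbols))
        i = next(j for j, s in enumerate(symbols) if s in RULES)
        rhs = random.choice(RULES[symbols[i]])
        symbols = symbols[:i] + rhs + symbols[i + 1:]
    print(" ".join(symbols))

random.seed(0)
derive(["S"])
```

Each printed line is a sentential form; the sequence of rewrite steps is the derivation history, which the phrase-structure tree records (see the next slide).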

Derivations of CFGs
String rewriting system: we derive a string (= derived structure)
But the derivation history is represented by a phrase-structure tree (= derivation structure)!

Grammar Equivalence and Normal Form
Can have different grammars that generate the same set of strings (weak equivalence)
Can have different grammars that have the same set of derivation trees (strong equivalence)

Nobody Uses CFGs Only (Except Intro NLP Courses)
o All major syntactic theories (Chomsky, LFG, HPSG, TAG-based theories) represent both phrase structure and dependency, in one way or another
o All successful parsers currently use statistics about phrase structure and about dependency

Massive Ambiguity of Syntax
For a standard sentence and a grammar with wide coverage, there are thousands of derivations!
Example:
o The large head master told the man that he gave money and shares in a letter on Wednesday
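The combinatorics can be made concrete with a CKY recognizer that sums derivation counts instead of keeping a single parse. The toy grammar and sentence below are illustrative (a classic PP-attachment ambiguity with two derivations); with a wide-coverage grammar and a longer sentence the counts explode into the thousands:

```python
from collections import defaultdict

# Tiny grammar in Chomsky normal form, with a classic PP-attachment
# ambiguity (an illustrative stand-in for a wide-coverage grammar).
BINARY = [("S", "NP", "VP"), ("NP", "Det", "N"), ("NP", "NP", "PP"),
          ("VP", "V", "NP"), ("VP", "VP", "PP"), ("PP", "P", "NP")]
LEXICAL = {"the": ["Det"], "a": ["Det"], "man": ["N"], "boy": ["N"],
           "telescope": ["N"], "saw": ["V"], "with": ["P"]}

def count_parses(words):
    """CKY, but summing derivation counts instead of building one tree."""
    n = len(words)
    chart = defaultdict(int)              # (i, j, label) -> number of parses
    for i, w in enumerate(words):
        for label in LEXICAL[w]:
            chart[(i, i + 1, label)] = 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):     # split point
                for lhs, left, right in BINARY:
                    chart[(i, j, lhs)] += (chart[(i, k, left)]
                                           * chart[(k, j, right)])
    return chart[(0, n, "S")]

print(count_parses("the man saw the boy with a telescope".split()))  # 2
```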

Some Syntactic Constructions: Wh-Movement

Control

Raising