LING 581: Advanced Computational Linguistics Lecture Notes February 23rd.

Presentation transcript:

LING 581: Advanced Computational Linguistics Lecture Notes February 23rd

Homework Task 2

Part 1
– Run the examples you showed on your slides from Homework Task 1 using the Bikel Collins parser.
– Evaluate how close the parses are to the "gold standard".

Part 2
– WSJ corpus: sections 00 through 24
– Evaluation: on section 23
– Training: normally 20 sections
– How does the accuracy of the Bikel Collins parser vary if you randomly pick 1, 2, 3, …, 20 sections to train with? Plot a graph of the evalb scores (one way to automate this is sketched below).
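A minimal sketch of how Part 2 could be automated. The per-section files wsj-02.mrg .. wsj-21.mrg are assumed to exist (see the cat commands on the next slide), and train_and_parse.sh is a hypothetical wrapper that trains the Bikel Collins parser on its argument, parses wsj-23.lsp, runs evalb against wsj-23.mrg, and echoes evalb's summary.

# Sketch: train on k randomly chosen WSJ sections, parse section 23, score with
# evalb, and plot bracketing F-measure against training-set size.
import random
import re
import subprocess
from pathlib import Path
import matplotlib.pyplot as plt

TRAIN_SECTIONS = [Path(f"wsj-{s:02d}.mrg") for s in range(2, 22)]   # sections 02..21

def f_measure_for(k):
    picked = random.sample(TRAIN_SECTIONS, k)
    merged = Path(f"wsj-train-{k}.mrg")
    merged.write_text("".join(p.read_text() for p in picked))       # like cat a.mrg b.mrg > merged
    # Hypothetical wrapper: train, parse section 23, run evalb, print its summary.
    out = subprocess.run(["./train_and_parse.sh", str(merged)],
                         capture_output=True, text=True, check=True).stdout
    m = re.search(r"Bracketing FMeasure\s*=\s*([\d.]+)", out)
    return float(m.group(1)) if m else float("nan")

sizes = list(range(1, 21))
scores = [f_measure_for(k) for k in sizes]
plt.plot(sizes, scores, marker="o")
plt.xlabel("number of training sections")
plt.ylabel("evalb bracketing F-measure on section 23")
plt.savefig("training-size-vs-f.png")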

Task 2: Picking sections for training

In each section directory, use:
– cat *.mrg > wsj-XX.mrg
Merging sections:
– cat wsj-XX.mrg wsj-XY.mrg … > wsj-K.mrg (for K = 2 to 20)
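If you prefer scripting the merge, here is a Python equivalent of the two cat steps above; the treebank layout and file names are assumptions about your local installation.

# Sketch: build one wsj-XX.mrg file per section, then cumulative wsj-K.mrg
# training files for K = 2..20, mirroring the cat commands above.
from pathlib import Path

TB = Path("treebank/combined")                     # hypothetical location of the wsj_SSNN.mrg files
sections = [f"{s:02d}" for s in range(2, 22)]      # the standard training sections 02..21

for s in sections:                                 # cat *.mrg > wsj-XX.mrg
    parts = sorted(TB.glob(f"wsj_{s}*.mrg"))
    Path(f"wsj-{s}.mrg").write_text("".join(p.read_text() for p in parts))

for k in range(2, 21):                             # cat wsj-02.mrg wsj-03.mrg ... > wsj-K.mrg
    merged = "".join(Path(f"wsj-{s}.mrg").read_text() for s in sections[:k])
    Path(f"wsj-{k}.mrg").write_text(merged)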

Task 2: Picking sections for training

Task 2: Section 23

Sentences for parsing (raw text, wsj-23.txt):
No, it was n't Black Monday.
But while the New York Stock Exchange did n't fall apart Friday as the Dow Jones Industrial Average plunged 190.58 points -- most of it in the final hour -- it barely managed to stay this side of chaos.
Some `` circuit breakers '' installed after the October 1987 crash failed their first test, traders say, unable to cool the selling panic in both stocks and futures.
The 49 stock specialist firms on the Big Board floor -- the buyers and sellers of last resort who were criticized after the 1987 crash -- once again could n't handle the selling pressure.
Big investment banks refused to step up to the plate to support the beleaguered floor traders by buying big blocks of stock, traders say.
Heavy selling of Standard & Poor 's 500-stock index futures in Chicago relentlessly beat stocks downward.
Seven Big Board stocks -- UAL, AMR, BankAmerica, Walt Disney, Capital Cities/ABC, Philip Morris and Pacific Telesis Group -- stopped trading and never resumed.
The finger-pointing has already begun.
`` The equity market was illiquid.
Once again -LCB- the specialists -RCB- were not able to handle the imbalances on the floor of the New York Stock Exchange, '' said Christopher Pedersen, senior vice president at Twenty-First Securities Corp.

Bikel Collins input sentences (wsj-23.lsp, Lisp s-expressions):
((No (RB)) (, (,)) (it (PRP)) (was (VBD)) (n't (RB)) (Black (NNP)) (Monday (NNP)) (. (.)))
((But (CC)) (while (IN)) (the (DT)) (New (NNP)) (York (NNP)) (Stock (NNP)) (Exchange (NNP)) (did (VBD)) (n't (RB)) (fall (VB)) (apart (RB)) (Friday (NNP)) (as (IN)) (the (DT)) (Dow (NNP)) (Jones (NNP)) (Industrial (NNP)) (Average (NNP)) (plunged (VBD)) (190.58 (CD)) (points (NNS)) (-- (:)) (most (JJS)) (of (IN)) (it (PRP)) (in (IN)) (the (DT)) (final (JJ)) (hour (NN)) (-- (:)) (it (PRP)) (barely (RB)) (managed (VBD)) (to (TO)) (stay (VB)) (this (DT)) (side (NN)) (of (IN)) (chaos (NN)) (. (.)))
((Some (DT)) (`` (``)) (circuit (NN)) (breakers (NNS)) ('' ('')) (installed (VBN)) (after (IN)) (the (DT)) (October (NNP)) (1987 (CD)) (crash (NN)) (failed (VBD)) (their (PRP$)) (first (JJ)) (test (NN)) (, (,)) (traders (NNS)) (say (VBP)) (, (,)) (unable (JJ)) (to (TO)) (cool (VB)) (the (DT)) (selling (NN)) (panic (NN)) (in (IN)) (both (DT)) (stocks (NNS)) (and (CC)) (futures (NNS)) (. (.)))
((The (DT)) (49 (CD)) (stock (NN)) (specialist (NN)) (firms (NNS)) (on (IN)) (the (DT)) (Big (NNP)) (Board (NNP)) (floor (NN)) (-- (:)) (the (DT)) (buyers (NNS)) (and (CC)) (sellers (NNS)) (of (IN)) (last (JJ)) (resort (NN)) (who (WP)) (were (VBD)) (criticized (VBN)) (after (IN)) (the (DT)) (1987 (CD)) (crash (NN)) (-- (:)) (once (RB)) (again (RB)) (could (MD)) (n't (RB)) (handle (VB)) (the (DT)) (selling (NN)) (pressure (NN)) (. (.)))
((Big (JJ)) (investment (NN)) (banks (NNS)) (refused (VBD)) (to (TO)) (step (VB)) (up (IN)) (to (TO)) (the (DT)) (plate (NN)) (to (TO)) (support (VB)) (the (DT)) (beleaguered (JJ)) (floor (NN)) (traders (NNS)) (by (IN)) (buying (VBG)) (big (JJ)) (blocks (NNS)) (of (IN)) (stock (NN)) (, (,)) (traders (NNS)) (say (VBP)) (. (.)))
((Heavy (JJ)) (selling (NN)) (of (IN)) (Standard (NNP)) (& (CC)) (Poor (NNP)) ('s (POS)) (500-stock (JJ)) (index (NN)) (futures (NNS)) (in (IN)) (Chicago (NNP)) (relentlessly (RB)) (beat (VBD)) (stocks (NNS)) (downward (RB)) (. (.)))
((Seven (CD)) (Big (NNP)) (Board (NNP)) (stocks (NNS)) (-- (:)) (UAL (NNP)) (, (,)) (AMR (NNP)) (, (,)) (BankAmerica (NNP)) (, (,)) (Walt (NNP)) (Disney (NNP)) (, (,)) (Capital (NNP)) (Cities/ABC (NNP)) (, (,)) (Philip (NNP)) (Morris (NNP)) (and (CC)) (Pacific (NNP)) (Telesis (NNP)) (Group (NNP)) (-- (:)) (stopped (VBD)) (trading (VBG)) (and (CC)) (never (RB)) (resumed (VBD)) (. (.)))
((The (DT)) (finger-pointing (NN)) (has (VBZ)) (already (RB)) (begun (VBN)) (. (.)))
((`` (``)) (The (DT)) (equity (NN)) (market (NN)) (was (VBD)) (illiquid (JJ)) (. (.)))

[Diagram: wsj-23.txt = RAW text; wsj-23.lsp = Bikel Collins input (Lisp SEXPs); wsj-23.mrg = Gold Standard parses]
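The .lsp input above can be generated from the gold trees; a minimal sketch using NLTK's bracketed-corpus reader. The path is an assumption, empty elements (-NONE-) are dropped, and your parser install may have its own conversion script and escaping conventions.

# Sketch: produce wsj-23.lsp (word/POS s-expressions) from the gold wsj-23 trees.
from nltk.corpus import BracketParseCorpusReader

reader = BracketParseCorpusReader("treebank/23", r".*\.mrg")   # hypothetical path to section 23

with open("wsj-23.lsp", "w") as out:
    for sent in reader.tagged_sents():
        pairs = " ".join(f"({w} ({t}))" for w, t in sent if t != "-NONE-")
        out.write(f"({pairs})\n")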

Q: What do statistical parsers do?

Out-of-the-box performance on section 23, scored with EVALB (bracketing Recall, Precision and F-measure were reported for each parser; the numeric scores are not preserved in this transcript):
– Bikel Collins*: 2416/2416 sentences parsed
– Berkeley: 2415/2416 sentences parsed
– Stanford: …/2416 sentences parsed**
* using COLLINS.prm settings
** after a fix to allow EVALB to run to completion
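For reference, the three bracketing measures EVALB reports are computed from bracket counts as in this small sketch (the function name is mine, not EVALB's):

# Sketch: bracketing scores from raw counts.
# matched = brackets with the same label and span in both the gold and the test parse.
def bracketing_scores(matched, gold, test):
    recall = matched / gold          # fraction of gold-standard brackets recovered
    precision = matched / test       # fraction of proposed brackets that are correct
    f_measure = 2 * precision * recall / (precision + recall)
    return recall, precision, f_measure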

It is often assumed that statistical models
– are less brittle than symbolic models (they produce parses even for ungrammatical data)
– but are they sensitive to noise or small perturbations in the training data?

Robustness and Sensitivity: Examples
1. Herman mixed the water with the milk
2. Herman mixed the milk with the water
3. Herman drank the water with the milk
4. Herman drank the milk with the water
Training-data frequencies: f(water) = 117, f(milk) = 21; sentences 1–2 use mix, 3–4 use drink.

Robustness and Sensitivity: Examples
1. Herman mixed the water with the milk
2. Herman mixed the milk with the water
3. Herman drank the water with the milk
4. Herman drank the milk with the water
Different PP attachment choices: (low) (high); logprob = …, logprob = -47.2

Robustness and Sensitivity: First thoughts...
– Does milk force low attachment? (high attachment is chosen for other nouns like water, toys, etc.)
– Is there something special about the lexical item milk?
– 24 sentences in the WSJ Penn Treebank contain milk, 21 of them as a noun

Robustness and Sensitivity: First thoughts...
– Is there something special about the lexical item milk?
– 24 sentences in the WSJ Penn Treebank contain milk, 21 of them as a noun
– but just one sentence (#5212) has a PP attachment for milk
– Could just one sentence out of 39,832 training examples affect the attachment options?
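Counts like these are easy to check; a quick sketch using NLTK's bracketed-corpus reader. The full-treebank path is an assumption, and the 10% sample bundled with NLTK will give smaller numbers.

# Sketch: how many WSJ sentences contain "milk", and with which POS tags.
from collections import Counter
from nltk.corpus import BracketParseCorpusReader

reader = BracketParseCorpusReader("treebank/combined", r"wsj_.*\.mrg")  # hypothetical full-PTB path

sentences_with_milk = 0
milk_tags = Counter()
for sent in reader.tagged_sents():
    if any(w.lower() == "milk" for w, _ in sent):
        sentences_with_milk += 1
        milk_tags.update(t for w, t in sent if w.lower() == "milk")

print(sentences_with_milk, "sentences contain 'milk'")
print(milk_tags)   # noun uses (NN) vs. other tags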

Robustness and Sensitivity
Simple perturbation experiment:
– alter that one sentence and retrain
[Training pipeline diagram: treebank sentences / parses → derived counts (wsj….obj.gz) → parser]

Robustness and Sensitivity
Simple perturbation experiment:
– alter that one sentence and retrain
– delete the PP "with 4% butterfat" altogether

Robustness and Sensitivity
Simple perturbation experiment:
– alter that one sentence and retrain
– or bump the PP up to the VP level
[Training diagram: Treebank sentences (wsj….mrg) → training → derived counts (wsj….obj.gz)]
The Bikel/Collins parser can be retrained in less time than it takes to make a cup of tea.

Robustness and Sensitivity
Result:
– high attachment for the PP previously adjoined to milk
Could just one sentence out of 39,832 training examples affect the attachment options? YES.
Why such extreme sensitivity to perturbation?
– logprobs are conditioned on many things; hence, there are lots of probabilities to estimate
– smoothing needs every piece of data, even low-frequency events
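Schematically, each conditional probability is not a single relative frequency but an interpolation of estimates from progressively coarser back-off contexts; in LaTeX notation, roughly (the exact back-off chains and weight formulas are those of Collins 1999 and Bikel 2004, so this only shows the general shape):

\hat{e} \;=\; \lambda_1 e_1 + (1-\lambda_1)\bigl[\lambda_2 e_2 + (1-\lambda_2)\,e_3\bigr],
\qquad
e_i = \frac{c_i(\mathrm{outcome},\ \mathrm{context}_i)}{c_i(\mathrm{context}_i)}

Since the interpolation weights \lambda_i are themselves computed from the observed counts, adding, deleting, or moving even a single frequency-1 event shifts both the raw estimates and the weights that combine them.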

Robustness and Sensitivity
(Bikel 2004): "it may come as a surprise that the [parser] needs to access more than 219 million probabilities during the course of parsing the 1,917 sentences of Section 00 [of the PTB]."

Robustness and Sensitivity Trainer has a memory like a phone book:

Robustness and Sensitivity
Frequency-1 observed data for: (NP (NP (DT a) (NN milk)) (PP (IN with) (NP (ADJP (CD 4) (NN %)) (NN butterfat))))

(mod ((with IN) (milk NN) PP (+START+) ((+START+ +START+)) NP-A NPB () false right) 1.0)
– modHeadWord: (with IN)
– headWord: (milk NN)
– modifier: PP
– previousMods: (+START+)
– previousWords: ((+START+ +START+))
– parent: NP-A
– head: NPB
– subcat: ()
– verbIntervening: false
– side: right

(mod ((+STOP+ +STOP+) (milk NN) +STOP+ (PP) ((with IN)) NP-A NPB () false right) 1.0)
– modHeadWord: (+STOP+ +STOP+)
– headWord: (milk NN)
– modifier: +STOP+
– previousMods: (PP)
– previousWords: ((with IN))
– parent: NP-A
– head: NPB
– subcat: ()
– verbIntervening: false
– side: right

Robustness and Sensitivity
– 76.8% of the observed events are singletons (seen exactly once)
– 94.2% have 5 or fewer occurrences

Robustness and Sensitivity
The full story is more complicated than described here... By picking different combinations of verbs and nouns, you can get a range of behaviors:
– drank + water: high attachment
– mixed + water: low attachment (milk + noun), high attachment (noun + milk)
– mixed + computer: low attachment
f(drank) = 0 in the training data: we might as well have picked a made-up verb like flubbed.

An experiment with PTB Passives Documentation … – Passives in the PTB.pdf

Penn Treebank and Passive Sentences
Wall Street Journal (WSJ) section of the Penn Treebank (PTB):
– one million words from articles published in 1989
– All sentences: nearly 50,000 in total (49,208), divided into 25 sections (00–24)
  Training sections: 39,832 sentences
  Test section: 2,416 sentences
– Passive sentences: approx. 6,773 in total
  Training sections: 5,507 (14%)
  Test section: 327 (14%)
– Standard training/test split: training on sections 02–21, evaluation on section 23
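One way to approximate counts like these is to search the trees for an auxiliary be followed by a VBN-headed VP; a heuristic sketch with NLTK follows. The path is an assumption, and this is not necessarily the procedure behind the figures above.

# Sketch: heuristic passive detection -- a VP whose first verbal child is a form
# of "be" and which contains a VP complement with a VBN.
import nltk
from nltk.corpus import BracketParseCorpusReader

BE = {"is", "are", "was", "were", "am", "be", "been", "being", "'s", "'re", "'m"}
reader = BracketParseCorpusReader("treebank/combined", r"wsj_.*\.mrg")  # hypothetical full-PTB path

def is_passive(tree):
    for vp in tree.subtrees(lambda t: t.label() == "VP"):
        kids = [k for k in vp if isinstance(k, nltk.Tree)]
        if not kids:
            continue
        head, rest = kids[0], kids[1:]
        if head.label().startswith("VB") and head.leaves()[0].lower() in BE:
            for k in rest:
                if k.label() == "VP" and any(
                        isinstance(g, nltk.Tree) and g.label() == "VBN" for g in k):
                    return True
    return False

print(sum(is_passive(t) for t in reader.parsed_sents()), "sentences match the be + VBN pattern")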

Experiment
– Use Wow! as the 4th word of a sentence (as the only clue that the sentence is a passive sentence)
– Remove standard English passive signaling, i.e. no passive be + -en morphology
Example
– By 1997, almost all remaining uses of cancer-causing asbestos will be outlawed.

Experiment
– Use Wow! as the 4th word of a sentence (as the only clue that the sentence is a passive sentence)
– Remove standard English passive signaling, i.e. no passive be + -en morphology
Example (sentence 00–64)
– By 1997, almost all remaining uses of cancer-causing asbestos will outlaw. (with Wow! spliced in as the 4th word, marked ⁁ Wow!)
Strategy: attach Wow! at the same syntactic level as the preceding word or lexeme; POS tag: WOW
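A minimal sketch of the token-level part of this transformation, producing parser input with Wow! (POS tag WOW) as the 4th word. Stripping the be + -en morphology and the tree-level attachment are separate steps not shown here, and the function names and example tags are mine.

# Sketch: splice Wow! (tag WOW) in as the 4th token of a tagged sentence and
# emit it in the Bikel Collins input format.
def insert_wow(tagged_sent, position=3):
    out = list(tagged_sent)
    out.insert(min(position, len(out)), ("Wow!", "WOW"))
    return out

def to_sexpr(tagged_sent):
    return "(" + " ".join(f"({w} ({t}))" for w, t in tagged_sent) + ")"

sent = [("The", "DT"), ("book", "NN"), ("delivered", "VBN"), (".", ".")]
print(to_sexpr(insert_wow(sent)))
# ((The (DT)) (book (NN)) (delivered (VBN)) (Wow! (WOW)) (. (.)))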

Results and Discussion
Original: The book was delivered.
Modified: 1 The 2 book 3 delivered 4 .
Test: insert Wow! at positions 1–4
– compare the probability of the (best) parse in each case

Results and Discussion
Original: The book was delivered.
Modified: 1 The 2 book 3 delivered 4 .
Test: insert Wow! at positions 1–4
– compare the probability of the (best) parse in each case
[Chart: logprob score (higher = better) against Wow! position, for the parser trained on the 4th-word Wow! training data]

Results and Discussion
Original: The book was delivered.
Modified: 1 The 2 book 3 delivered 4 .
Test: insert Wow! at positions 1–4
– compare the probability of the (best) parse in each case
[Chart: logprob score (higher = better) against Wow! position, for the parser using the original training data; Wow! is an unknown word]

Results and Discussion
Original: A buffet breakfast was held in the art museum.
Modified: 1 A 2 buffet 3 breakfast 4 held 5 in 6 the 7 art 8 museum 9 .
Test: insert Wow! at positions 1–9
[Chart: logprob score (higher = better) against Wow! position, for the parser trained on the 4th-word Wow! training data]

Results and Discussion
Original: A buffet breakfast was held in the art museum.
Modified: 1 A 2 buffet 3 breakfast 4 held 5 in 6 the 7 art 8 museum 9 .
Test: insert Wow! at positions 1–9
[Chart: logprob score (higher = better) against Wow! position, for the parser using the original training data]

Results and Discussion
On section 23 (test section):
– 327 passive sentences
– 2,416 sentences in total
Legend: OTD = original training data (i.e. no Wow!); test data: -passwow1 = passives signaled using Wow!, -pass = passives signaled normally

Results and Discussion
Comparison:
– Wow! as object plus passive morphology
– Wow! inserted as NP object trace
– Baseline (passive morphology)
– Wow! as 4th word

Results and Discussion
Comparison:
– Wow! as object plus passive morphology
– Wow! inserted as NP object trace
– Baseline (passive morphology)
– Wow! as 4th word
[Chart legend: Wow! as object + passive morphology; Wow! as object; regular passive morphology]