LING 581: Advanced Computational Linguistics


LING 581: Advanced Computational Linguistics Lecture Notes March 15th

Today's topics
- We're going to switch topics next time…
- Last lecture on the WSJ Penn Treebank
- Short remarks on Homework 7
- Robustness and Sensitivity

Bikel-Collins Parsing of Section 23
Split Section 23 into 100-sentence parts {a-w} with split -l 100, and manually load-balance the parts over 4 cores.
Times in seconds are taken from bash's $SECONDS on a Dell XPS 13" laptop (2017), Core i7-7500U: about 2300 seconds ≈ 38-39 minutes.
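A minimal shell sketch of this setup (the file names and the parse-one.sh wrapper around the Bikel/Collins parser are illustrative, and xargs -P 4 stands in for the manual load-balancing across four terminal windows):

    # split Section 23 (one sentence per line) into 100-sentence parts a, b, c, ...
    split -a 1 -l 100 section-23.txt part-
    # run at most 4 parser jobs at once (one per core) and time the whole run
    start=$SECONDS
    ls part-? | xargs -P 4 -I{} sh -c './parse-one.sh {} > {}.parsed'
    echo "elapsed: $((SECONDS - start)) seconds"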

Bikel-Collins Parsing of Section 23
Note: these aren't single-core times, because this Dell has only 2 real cores (4 logical cores with hyperthreading).
Note: the latest 2018 Dell XPS 13 has 4 real cores (8 with hyperthreading) – i7-8550U: https://ark.intel.com/products/122589/Intel-Core-i7-8550U-Processor-8M-Cache-up-to-4_00-GHz
4 terminal screens; actual wall times: 1:18:20, 1:18:27, 1:18:48, 1:19:27

Bikel-Collins Parsing of Section 23
An optimistic view of the results?
- gold POS tags supplied
- sentences of length ≤ 40 only

Bikel-Collins Parsing of Section 23
evalb data:
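For reference, scoring the output against the gold trees with EVALB looks roughly like this (file names are illustrative; COLLINS.prm is the parameter file shipped with EVALB):

    evalb -p COLLINS.prm section-23.gold section-23.parsed > section-23.evalb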

Other sections …

Robustness and Sensitivity
- It's often assumed that statistical models are less brittle than symbolic models: they can still produce parses for ungrammatical data.
- But are they sensitive to noise or small perturbations?

Robustness and Sensitivity
Examples:
- Herman mixed the water with the milk
- Herman mixed the milk with the water
- Herman drank the water with the milk
- Herman drank the milk with the water
Two verbs (mix, drink); training frequencies f(water) = 117, f(milk) = 21.

Robustness and Sensitivity
Examples:
- Herman mixed the water with the milk
- Herman mixed the milk with the water
- Herman drank the water with the milk
- Herman drank the milk with the water
The parser makes different PP attachment choices: (high) logprob = -50.4, (low) logprob = -47.2.
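For concreteness, the two attachment choices look schematically like this (illustrative bracketings, not the parser's actual output):

    high (VP) attachment: (VP (VBD drank) (NP (DT the) (NN water)) (PP (IN with) (NP (DT the) (NN milk))))
    low (NP) attachment:  (VP (VBD drank) (NP (NP (DT the) (NN water)) (PP (IN with) (NP (DT the) (NN milk)))))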

Robustness and Sensitivity
First thoughts...
- Does milk force low attachment (with high attachment for other nouns like water, toys, etc.)?
- Is there something special about the lexical item milk?
- There are 24 sentences in the WSJ Penn Treebank containing milk, 21 of them with milk as a noun.

Robustness and Sensitivity
First thoughts...
- Is there something special about the lexical item milk?
- There are 24 sentences in the WSJ Penn Treebank containing milk, 21 of them with milk as a noun, but just one sentence (#5212) with a PP attached to milk.
- Could just one sentence out of 39,832 training examples affect the attachment options?
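A quick way to check counts like these from the shell (a sketch, assuming the training trees sit one per line in the wsj-02-21.mrg file named below):

    grep -c ' milk' wsj-02-21.mrg                            # sentences containing the word milk
    grep -o '([A-Z]* milk)' wsj-02-21.mrg | sort | uniq -c   # occurrences of milk broken down by POS tag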

Robustness and Sensitivity
Simple perturbation experiment: alter that one sentence and retrain.
(The parser maps sentences to parses using the derived counts in wsj-02-21.obj.gz.)

Robustness and Sensitivity
Simple perturbation experiment: alter that one sentence and retrain.
One perturbation (✕): delete the PP with 4% butterfat altogether.

Robustness and Sensitivity
Simple perturbation experiment: alter that one sentence and retrain.
- Treebank sentences: wsj-02-21.mrg
- Derived counts: wsj-02-21.obj.gz (the Bikel/Collins parser can be retrained quickly)
- The other perturbation: bump the PP up to the VP level.
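To make the two perturbations concrete, here is the relevant NP from sentence #5212 (shown again in the derived-counts slide below) together with the two altered versions; the VP context in the last line is schematic:

    original:        (NP (NP (DT a)(NN milk))(PP (IN with)(NP (ADJP (CD 4)(NN %))(NN butterfat))))
    PP deleted:      (NP (DT a)(NN milk))
    PP raised to VP: (VP ... (NP (DT a)(NN milk)) (PP (IN with)(NP (ADJP (CD 4)(NN %))(NN butterfat))))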

Robustness and Sensitivity
Result: after retraining, the parser produces high attachment for the PP that previously adjoined to milk.
Why such extreme sensitivity to perturbation?
- logprobs are conditioned on many things, so there are lots of probabilities to estimate
- smoothing needs every piece of data, even low-frequency events
Could just one sentence out of 39,832 training examples affect the attachment options? YES.
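A hedged sketch of why low-frequency events matter so much here: in a Collins-style model, each probability is estimated by backing off through progressively less specific conditioning contexts and interpolating the levels, roughly

    p̂ = λ1·p1(· | full context) + (1 − λ1)·(λ2·p2(· | reduced context) + (1 − λ2)·p3(· | further reduced context))

so a single observed event can shift both the maximum-likelihood estimates and the interpolation weights at several back-off levels at once.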

Details… Two sets of files:

Details…

Robustness and Sensitivity
(Bikel 2004): "it may come as a surprise that the [parser] needs to access more than 219 million probabilities during the course of parsing the 1,917 sentences of Section 00 [of the PTB]."

Robustness and Sensitivity Trainer has a memory like a phone book:

Robustness and Sensitivity
Frequency-1 observed data for:
(NP (NP (DT a)(NN milk))(PP (IN with)(NP (ADJP (CD 4)(NN %))(NN butterfat))))

(mod ((with IN) (milk NN) PP (+START+) ((+START+ +START+)) NP-A NPB () false right) 1.0)
  modHeadWord (with IN), headWord (milk NN), modifier PP, previousMods (+START+), previousWords ((+START+ +START+)), parent NP-A, head NPB, subcat (), verbIntervening false, side right

(mod ((+STOP+ +STOP+) (milk NN) +STOP+ (PP) ((with IN)) NP-A NPB () false right) 1.0)
  modHeadWord (+STOP+ +STOP+), headWord (milk NN), modifier +STOP+, previousMods (PP), previousWords ((with IN)), parent NP-A, head NPB, subcat (), verbIntervening false, side right
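A rough gloss of how such an event enters the model (schematic, not Bikel's exact notation): the first entry contributes to an estimate of roughly

    P(PP, with/IN | parent = NP-A, head = NPB, head word = milk/NN, previousMods = +START+, subcat = (), verbIntervening = false, side = right)

and the second to the corresponding +STOP+ probability that ends modifier generation on the right. Because this configuration is observed exactly once, deleting or re-attaching that single PP changes these estimates directly.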

Robustness and Sensitivity
76.8% of observed events occur only once; 94.2% occur 5 or fewer times.

Robustness and Sensitivity
The full story is more complicated than described here: by picking different combinations of verbs and nouns, you can get a range of behaviors.
Varying the verb (drank, mixed), the other noun (water, computer), and the order (milk + noun vs. noun + milk) yields both high and low attachments.
Note: f(drank) = 0 in the training data, so we might as well have picked a nonce verb like flubbed.