1
LING 581: Advanced Computational Linguistics Lecture Notes February 23rd
2
Homework Task 2
Part 1
– Run the examples you showed on your slides from Homework Task 1 through the Bikel Collins parser.
– Evaluate how close the parses are to the “gold standard”.
Part 2
– WSJ corpus: sections 00 through 24
– Evaluation: on section 23
– Training: normally sections 02–21 (20 sections)
– How does the Bikel Collins parser's accuracy vary if you randomly pick 1, 2, 3, …, 20 sections to train on? Plot a graph of the evalb scores.
3
Task 2: Picking sections for training
Within each section directory, use:
– cat *.mrg > wsj-XX.mrg
Merging sections:
– cat wsj-XX.mrg wsj-XY.mrg … > wsj-K.mrg (for K = 2 to 20)
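Below is a minimal sketch of how the sampling-and-plotting part of Part 2 might be scripted; it is not an official solution. The directory layout (wsj/02/*.mrg and so on), the output filenames, and the placeholder f_scores values are all assumptions; training the parser, parsing section 23, and running evalb are left as comments because their command lines are not shown on the slides.

```python
# Sketch: randomly sample k treebank sections, merge their .mrg files
# (mirroring the cat commands above), and plot evalb F-measure against k.
# Paths, filenames, and the f_scores values below are hypothetical.
import random
from pathlib import Path

import matplotlib.pyplot as plt

WSJ_ROOT = Path("wsj")                                   # assumed layout: wsj/02/*.mrg ... wsj/21/*.mrg
CANDIDATE_SECTIONS = [f"{i:02d}" for i in range(2, 22)]  # sections 02-21

def merge_sections(k, out_path):
    """Pick k random sections and concatenate their .mrg files."""
    chosen = random.sample(CANDIDATE_SECTIONS, k)
    with open(out_path, "w") as out:
        for sec in sorted(chosen):
            for mrg in sorted((WSJ_ROOT / sec).glob("*.mrg")):
                out.write(mrg.read_text())
    return chosen

for k in range(1, 21):
    merge_sections(k, f"wsj-{k}.mrg")
    # ... train Bikel Collins on wsj-{k}.mrg, parse section 23, run evalb ...

# Suppose f_scores[k] holds evalb's bracketing F-measure for each k
# (placeholder numbers, not real results):
f_scores = {1: 80.0, 5: 85.0, 10: 87.0, 20: 88.5}
plt.plot(sorted(f_scores), [f_scores[k] for k in sorted(f_scores)], marker="o")
plt.xlabel("number of training sections")
plt.ylabel("evalb bracketing F-measure")
plt.savefig("fscore-vs-sections.png")
```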
4
Task 2: Picking sections for training
5
Task 2: Section 23 Sentences for parsing No, it was n't Black Monday. But while the New York Stock Exchange did n't fall apart Friday as the Dow Jones Industrial Average plunged 190.58 points -- most of it in the final hour -- it barely managed to stay this side of chaos. Some `` circuit breakers '' installed after the October 1987 crash failed their first test, traders say, unable to cool the selling panic in both stocks and futures. The 49 stock specialist firms on the Big Board floor -- the buyers and sellers of last resort who were criticized after the 1987 crash -- once again could n't handle the selling pressure. Big investment banks refused to step up to the plate to support the beleaguered floor traders by buying big blocks of stock, traders say. Heavy selling of Standard & Poor 's 500-stock index futures in Chicago relentlessly beat stocks downward. Seven Big Board stocks -- UAL, AMR, BankAmerica, Walt Disney, Capital Cities/ABC, Philip Morris and Pacific Telesis Group -- stopped trading and never resumed. The finger-pointing has already begun. `` The equity market was illiquid. Once again -LCB- the specialists -RCB- were not able to handle the imbalances on the floor of the New York Stock Exchange, '' said Christopher Pedersen, senior vice president at Twenty-First Securities Corp. Bikel Collins input sentences ((No (RB)) (, (,)) (it (PRP)) (was (VBD)) (n't (RB)) (Black (NNP)) (Monday (NNP)) (. (.)) ) ((But (CC)) (while (IN)) (the (DT)) (New (NNP)) (York (NNP)) (Stock (NNP)) (Exchange (NNP)) (did (VBD)) (n't (RB)) (fall (VB)) (apart (RB)) (Friday (NNP)) (as (IN)) (the (DT)) (Dow (NNP)) (Jones (NNP)) (Industrial (NNP)) (Average (NNP)) (plunged (VBD)) (190.58 (CD)) (points (NNS)) (-- (:)) (most (JJS)) (of (IN)) (it (PRP)) (in (IN)) (the (DT)) (final (JJ)) (hour (NN)) (-- (:)) (it (PRP)) (barely (RB)) (managed (VBD)) (to (TO)) (stay (VB)) (this (DT)) (side (NN)) (of (IN)) (chaos (NN)) (. (.)) ) ((Some (DT)) (`` (``)) (circuit (NN)) (breakers (NNS)) ('' ('')) (installed (VBN)) (after (IN)) (the (DT)) (October (NNP)) (1987 (CD)) (crash (NN)) (failed (VBD)) (their (PRP$)) (first (JJ)) (test (NN)) (, (,)) (traders (NNS)) (say (VBP)) (, (,)) (unable (JJ)) (to (TO)) (cool (VB)) (the (DT)) (selling (NN)) (panic (NN)) (in (IN)) (both (DT)) (stocks (NNS)) (and (CC)) (futures (NNS)) (. (.)) ) ((The (DT)) (49 (CD)) (stock (NN)) (specialist (NN)) (firms (NNS)) (on (IN)) (the (DT)) (Big (NNP)) (Board (NNP)) (floor (NN)) (-- (:)) (the (DT)) (buyers (NNS)) (and (CC)) (sellers (NNS)) (of (IN)) (last (JJ)) (resort (NN)) (who (WP)) (were (VBD)) (criticized (VBN)) (after (IN)) (the (DT)) (1987 (CD)) (crash (NN)) (-- (:)) (once (RB)) (again (RB)) (could (MD)) (n't (RB)) (handle (VB)) (the (DT)) (selling (NN)) (pressure (NN)) (. (.)) ) ((Big (JJ)) (investment (NN)) (banks (NNS)) (refused (VBD)) (to (TO)) (step (VB)) (up (IN)) (to (TO)) (the (DT)) (plate (NN)) (to (TO)) (support (VB)) (the (DT)) (beleaguered (JJ)) (floor (NN)) (traders (NNS)) (by (IN)) (buying (VBG)) (big (JJ)) (blocks (NNS)) (of (IN)) (stock (NN)) (, (,)) (traders (NNS)) (say (VBP)) (. (.)) ) ((Heavy (JJ)) (selling (NN)) (of (IN)) (Standard (NNP)) (& (CC)) (Poor (NNP)) ('s (POS)) (500-stock (JJ)) (index (NN)) (futures (NNS)) (in (IN)) (Chicago (NNP)) (relentlessly (RB)) (beat (VBD)) (stocks (NNS)) (downward (RB)) (. 
(.)) ) ((Seven (CD)) (Big (NNP)) (Board (NNP)) (stocks (NNS)) (-- (:)) (UAL (NNP)) (, (,)) (AMR (NNP)) (, (,)) (BankAmerica (NNP)) (, (,)) (Walt (NNP)) (Disney (NNP)) (, (,)) (Capital (NNP)) (Cities/ABC (NNP)) (, (,)) (Philip (NNP)) (Morris (NNP)) (and (CC)) (Pacific (NNP)) (Telesis (NNP)) (Group (NNP)) (-- (:)) (stopped (VBD)) (trading (VBG)) (and (CC)) (never (RB)) (resumed (VBD)) (. (.)) ) ((The (DT)) (finger-pointing (NN)) (has (VBZ)) (already (RB)) (begun (VBN)) (. (.)) ) ((`` (``)) (The (DT)) (equity (NN)) (market (NN)) (was (VBD)) (illiquid (JJ)) (. (.)) )
Files: wsj-23.txt (raw text), wsj-23.lsp (Bikel Collins input, Lisp SEXPs), wsj-23.mrg (gold standard parses)
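Each input line pairs a word with its POS tag, as the examples above show. The following sketch builds one such line from a POS-tagged sentence; to_sexp is a hypothetical helper, and the tagged pairs are typed in by hand rather than read from wsj-23.mrg.

```python
# Sketch: turn a list of (word, tag) pairs into one Bikel Collins input line
# of the (word (TAG)) form shown above.
def to_sexp(tagged_sentence):
    """tagged_sentence: list of (word, tag) pairs -> one input line."""
    tokens = " ".join(f"({word} ({tag}))" for word, tag in tagged_sentence)
    return f"({tokens} )"

tagged = [("No", "RB"), (",", ","), ("it", "PRP"), ("was", "VBD"),
          ("n't", "RB"), ("Black", "NNP"), ("Monday", "NNP"), (".", ".")]
print(to_sexp(tagged))
# ((No (RB)) (, (,)) (it (PRP)) (was (VBD)) (n't (RB)) (Black (NNP)) (Monday (NNP)) (. (.)) )
```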
6
Q: What do statistical parsers do?
Out-of-the-box performance on section 23…
– Bikel Collins: 2416/2416 sentences; Bracketing Recall 88.45, Precision 88.56, F-measure 88.50
– Berkeley: 2415/2416 sentences; Bracketing Recall 88.98, Precision 84.88, F-measure 86.88
– Stanford: 2412/2416 sentences**; Bracketing Recall 84.92, Precision 81.87, F-measure 83.37
*using COLLINS.prm settings
**after a fix to allow EVALB to run to completion
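The F-measure reported by evalb is the harmonic mean of bracketing precision and recall, so the numbers above can be checked directly; the small function below is just that arithmetic, not part of evalb itself.

```python
# F-measure = harmonic mean of bracketing precision and recall.
def f_measure(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(f"{f_measure(88.56, 88.45):.2f}")  # 88.50
print(f"{f_measure(84.88, 88.98):.2f}")  # 86.88
print(f"{f_measure(81.87, 84.92):.2f}")  # 83.37
```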
7
It is often assumed that statistical models
– are less brittle than symbolic models: they still produce parses for ungrammatical data
– but are they sensitive to noise or small perturbations?
8
Robustness and Sensitivity
Examples
1. Herman mixed the water with the milk
2. Herman mixed the milk with the water
3. Herman drank the water with the milk
4. Herman drank the milk with the water
f(water) = 117, f(milk) = 21
9
Robustness and Sensitivity
Examples
1. Herman mixed the water with the milk
2. Herman mixed the milk with the water
3. Herman drank the water with the milk
4. Herman drank the milk with the water
[Parse trees showing the different PP attachment choices, (low) and (high), with logprob = -50.4 and logprob = -47.2]
10
Robustness and Sensitivity
First thoughts...
Does milk force low attachment (with high attachment for other nouns like water, toys, etc.)?
Is there something special about the lexical item milk?
24 sentences in the WSJ Penn Treebank contain milk, 21 of them with milk as a noun
11
Robustness and Sensitivity
First thoughts...
Is there something special about the lexical item milk?
24 sentences in the WSJ Penn Treebank contain milk, 21 of them with milk as a noun
– but just one sentence (#5212) with a PP attached to milk
Could just one sentence out of 39,832 training examples affect the attachment options?
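Counts like these can be reproduced by scanning the training sections. The sketch below tallies the POS tags of milk with NLTK's bracketed-parse corpus reader; the directory layout and fileid pattern are assumptions about how the sections are stored on disk.

```python
# Sketch: count the POS tags assigned to "milk" in the training sections.
from collections import Counter
from nltk.corpus.reader import BracketParseCorpusReader

# Assumed layout: wsj/02/wsj_0200.mrg ... wsj/21/wsj_2199.mrg
reader = BracketParseCorpusReader("wsj", r"(0[2-9]|1\d|2[01])/wsj_.*\.mrg")

tags = Counter()
for sent in reader.tagged_sents():
    for word, tag in sent:
        if word.lower() == "milk":
            tags[tag] += 1
print(tags)  # expect mostly NN on the full training set
```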
12
Robustness and Sensitivity
Simple perturbation experiment
– alter that one sentence and retrain the parser
[Training pipeline: sentences → parses → derived counts (wsj-02-21.obj.gz)]
13
Robustness and Sensitivity
Simple perturbation experiment
– alter that one sentence and retrain
– delete the PP “with 4% butterfat” altogether
14
Robustness and Sensitivity
Simple perturbation experiment
– alter that one sentence and retrain
– or bump the PP up to the VP level
[Training: Treebank sentences (wsj-02-21.mrg) → derived counts (wsj-02-21.obj.gz)]
The Bikel/Collins parser can be retrained in less time than it takes to make a cup of tea.
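A sketch of the tree surgery itself, using NLTK's Tree class on just the NP fragment quoted later in these notes (not the full sentence #5212); writing the modified tree back into wsj-02-21.mrg and rerunning the trainer are left out.

```python
# Sketch: the two perturbations applied to the NP that contains the
# "with 4 % butterfat" PP (fragment only, not the whole treebank sentence).
from nltk import Tree

np = Tree.fromstring(
    "(NP (NP (DT a) (NN milk)) "
    "(PP (IN with) (NP (ADJP (CD 4) (NN %)) (NN butterfat))))")

# Perturbation 1: delete the PP adjunct altogether.
no_pp = np.copy(deep=True)
del no_pp[1]            # drop the (PP ...) child
print(no_pp)

# Perturbation 2 would instead detach the PP here and re-insert it as a
# daughter of the dominating VP before regenerating wsj-02-21.mrg.
```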
15
Robustness and Sensitivity
Result:
– high attachment for the PP that was previously an adjunct to milk
Could just one sentence out of 39,832 training examples affect the attachment options? YES
Why such extreme sensitivity to perturbation?
– logprobs are conditioned on many things, so there are many probabilities to estimate
– smoothing needs every piece of data, even low-frequency events
16
Robustness and Sensitivity
(Bikel 2004): “it may come as a surprise that the [parser] needs to access more than 219 million probabilities during the course of parsing the 1,917 sentences of Section 00 [of the PTB].”
17
Robustness and Sensitivity Trainer has a memory like a phone book:
18
Robustness and Sensitivity
Frequency-1 observed data for: (NP (NP (DT a) (NN milk)) (PP (IN with) (NP (ADJP (CD 4) (NN %)) (NN butterfat))))

(mod ((with IN) (milk NN) PP (+START+) ((+START+ +START+)) NP-A NPB () false right) 1.0)
– modHeadWord (with IN)
– headWord (milk NN)
– modifier PP
– previousMods (+START+)
– previousWords ((+START+ +START+))
– parent NP-A
– head NPB
– subcat ()
– verbIntervening false
– side right

(mod ((+STOP+ +STOP+) (milk NN) +STOP+ (PP) ((with IN)) NP-A NPB () false right) 1.0)
– modHeadWord (+STOP+ +STOP+)
– headWord (milk NN)
– modifier +STOP+
– previousMods (PP)
– previousWords ((with IN))
– parent NP-A
– head NPB
– subcat ()
– verbIntervening false
– side right
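To make the field structure easier to read, here is a small record type mirroring the field names above; it is only an illustration and does not correspond to the actual classes inside Bikel's trainer.

```python
# Sketch: represent one trainer "mod" event with the field names listed above.
from collections import namedtuple

ModEvent = namedtuple("ModEvent", [
    "modHeadWord", "headWord", "modifier", "previousMods", "previousWords",
    "parent", "head", "subcat", "verbIntervening", "side"])

ev = ModEvent(("with", "IN"), ("milk", "NN"), "PP", ("+START+",),
              (("+START+", "+START+"),), "NP-A", "NPB", (), False, "right")
print(ev.modifier, ev.headWord)  # PP ('milk', 'NN')
```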
19
Robustness and Sensitivity
In the derived counts, 76.8% of events are singletons (observed only once), and 94.2% occur 5 times or fewer.
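The statistic itself is a simple frequency-of-frequencies calculation; the sketch below shows it on a toy dictionary rather than on the real derived counts in wsj-02-21.obj.gz.

```python
# Sketch: what fraction of event types are singletons, and what fraction
# occur five times or fewer, given a count per event?
def frequency_profile(event_counts):
    """event_counts: dict mapping event -> observed frequency."""
    total = len(event_counts)
    singletons = sum(1 for c in event_counts.values() if c == 1)
    at_most_5 = sum(1 for c in event_counts.values() if c <= 5)
    return singletons / total, at_most_5 / total

toy = {"event_a": 1, "event_b": 1, "event_c": 3, "event_d": 12}
print(frequency_profile(toy))  # (0.5, 0.75)
```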
20
Robustness and Sensitivity
The full story is more complicated than described here... by picking different combinations of verbs and nouns, you can get a range of behaviors:
Verb / Noun / Attachment (milk + noun, noun + milk)
– drank / water: high
– mixed / water: low, high
– mixed / computer: low
f(drank) = 0: might as well have picked flubbed
21
An experiment with PTB Passives Documentation … – Passives in the PTB.pdf
22
Penn Treebank and Passive Sentences
Wall Street Journal (WSJ) section of the Penn Treebank (PTB)
– one million words from articles published in 1989
– All sentences: nearly 50,000 in total (49,208), divided into 25 sections (00–24); training sections: 39,832; test section: 2,416
– Passive sentences: approx. 6,773 in total; training sections: 5,507 (14%); test section: 327 (14%)
Standard training/test split: training on sections 02–21, evaluation on section 23
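One way to approximate these passive counts is to look for an auxiliary be (or get) immediately followed by a past-participle (VBN) tag; this is only a rough heuristic for illustration, not the criterion defined in Passives in the PTB.pdf.

```python
# Rough heuristic sketch: flag a tagged sentence as passive if a form of
# "be"/"get" is immediately followed by a past participle (VBN).
BE_FORMS = {"be", "is", "are", "was", "were", "been", "being", "am",
            "get", "gets", "got", "gotten", "getting"}

def looks_passive(tagged_sent):
    """tagged_sent: list of (word, tag) pairs."""
    for (w1, _), (_, t2) in zip(tagged_sent, tagged_sent[1:]):
        if w1.lower() in BE_FORMS and t2 == "VBN":
            return True
    return False

sent = [("The", "DT"), ("book", "NN"), ("was", "VBD"),
        ("delivered", "VBN"), (".", ".")]
print(looks_passive(sent))  # True
```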
23
Experiment
– Use Wow! as the 4th word of a sentence (as the only clue that the sentence is passive)
– Remove standard English passive signaling, i.e. no passive be + -en morphology
Example
– By 1997, almost all remaining uses of cancer-causing asbestos will be outlawed.
24
Experiment
– Use Wow! as the 4th word of a sentence (as the only clue that the sentence is passive)
– Remove standard English passive signaling, i.e. no passive be + -en morphology
Example (sentence 00–64)
– By 1997, almost all remaining uses of cancer-causing asbestos will outlaw. (with Wow! inserted at the 4th-word position, marked ⁁ on the slide)
Strategy: attach Wow! at the same syntactic level as the preceding word or lexeme; POS tag: WOW
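A sketch of that rewrite on the example sentence: drop the passive auxiliary, replace the participle with a base form, and splice Wow! in as the 4th token. The token positions, the tiny participle-to-base table, and the treatment of punctuation as tokens are all illustrative assumptions.

```python
# Sketch: remove be + -en passive morphology and insert Wow! as the 4th token.
PARTICIPLE_TO_BASE = {"outlawed": "outlaw", "delivered": "deliver", "held": "hold"}

def mark_passive_with_wow(tokens, aux_index, vbn_index, wow_position=3):
    """tokens: list of words; aux_index/vbn_index: positions of 'be' and the VBN."""
    out = list(tokens)
    out[vbn_index] = PARTICIPLE_TO_BASE.get(out[vbn_index], out[vbn_index])
    del out[aux_index]                # remove the auxiliary "be"
    out.insert(wow_position, "Wow!")  # Wow! becomes the 4th token
    return out

sent = ("By 1997 , almost all remaining uses of cancer-causing asbestos "
        "will be outlawed .").split()
print(" ".join(mark_passive_with_wow(sent, aux_index=11, vbn_index=12)))
```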
25
Results and Discussion
Original: The book was delivered.
Modified: 1 The 2 book 3 delivered 4 .
Test: insert Wow! at positions 1–4
– compare the probability of the (best) parse in each case
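The test loop is straightforward to sketch: generate one variant per insertion position and record the best-parse log probability for each. Producing the parser input files and reading the logprob back out are left as comments, since those command-line details are not on the slide.

```python
# Sketch: one Wow! variant per insertion position (position i puts Wow!
# immediately before the i-th token).
def variants_with_wow(tokens):
    for pos in range(len(tokens)):
        yield pos + 1, tokens[:pos] + ["Wow!"] + tokens[pos:]

sentence = "The book delivered .".split()
for position, toks in variants_with_wow(sentence):
    print(position, " ".join(toks))
    # ... write toks in the parser's input format, parse, record best logprob ...
```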
26
Results and Discussion
Original: The book was delivered.
Modified: 1 The 2 book 3 delivered 4 .
Test: insert Wow! at positions 1–4
– compare the probability of the (best) parse in each case
[Plot: logprob score (higher is better) by Wow! position, parser trained on the 4th-word Wow! training data]
27
Results and Discussion
Original: The book was delivered.
Modified: 1 The 2 book 3 delivered 4 .
Test: insert Wow! at positions 1–4
– compare the probability of the (best) parse in each case
[Plot: logprob score (higher is better) by Wow! position, parser trained on the original training data; Wow! is an unknown word]
28
Results and Discussion
Original: A buffet breakfast was held in the art museum.
Modified: 1 A 2 buffet 3 breakfast 4 held 5 in 6 the 7 art 8 museum 9 .
Test: insert Wow! at positions 1–9
[Plot: logprob score (higher is better) by Wow! position, parser trained on the 4th-word Wow! training data]
29
Results and Discussion
Original: A buffet breakfast was held in the art museum.
Modified: 1 A 2 buffet 3 breakfast 4 held 5 in 6 the 7 art 8 museum 9 .
Test: insert Wow! at positions 1–9
[Plot: logprob score (higher is better) by Wow! position, parser trained on the original training data]
30
Results and Discussion
On section 23 (the test section)
– 327 passive sentences
– 2,416 sentences in total
OTD = original training data (i.e. no Wow!)
Test data: -passwow1 = passives signaled using Wow!; -pass = passives signaled normally
31
Results and Discussion
Comparison:
– Wow! as object plus passive morphology
– Wow! inserted as NP object trace
– Baseline (passive morphology)
– Wow! as 4th word
32
Results and Discussion
Comparison:
– Wow! as object plus passive morphology
– Wow! inserted as NP object trace
– Baseline (passive morphology)
– Wow! as 4th word
[Chart: bracketing scores in the 87.0–89.0 range for Wow! as object + passive morphology, Wow! as object, and regular passive morphology]