Published byΘεόδουλος Καψής Modified over 5 years ago
1
LING/C SC 581: Advanced Computational Linguistics
Lecture 28 April 18th
2
Administrivia on course website: evalb.c
Next week we might have a candidate lecture… up in the air
3
evalb.c Modified evalb.c to not skip the null parses
sec23
-e n: number of errors to kill (default=10)
MAX_ERROR 10 in COLLINS.PRM (100 errors ok)
evalb.c counts null parses as if the parser had simply returned (W1 .. Wn)
this kills the recall for those cases (but, well, precision is 100%)
The number of null parses is impossible to ignore
Josh Zhang: add to settings/collins.properties
parser.decoder.maxSentenceLength = 200
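The effect of a null parse on the scores can be sketched with bracket counts. This is a minimal illustration, not evalb.c's actual code: the function name and the counts are hypothetical, and it assumes the single bracket of the flat (W1 .. Wn) fallback is counted as matching the gold tree's top-level bracket.

```python
def precision_recall_f(matched, gold, test):
    """Return (precision, recall, F-measure) from bracket counts.

    matched: brackets found in both gold and test parse
    gold:    brackets in the gold tree
    test:    brackets in the parser output
    """
    precision = matched / test if test else 0.0
    recall = matched / gold if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical sentence: the gold tree has 10 brackets, the flat fallback
# parse has 1 bracket, assumed here to match the gold top-level bracket.
p, r, f = precision_recall_f(matched=1, gold=10, test=1)
print(p, r, round(f, 3))
```

With these made-up counts, precision stays at 100% while recall drops to 10%, which is the pattern described above.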
4
evalb.c
ls -l evalb.c
1 sandiway staff Apr 18 14:47 evalb.c
gcc -o evalb evalb.c
evalb.c:384:22: warning: passing 'unsigned char [5000]' to parameter of type 'char *' converts between pointers to integer types with different sign [-Wpointer-sign]
for(Line=1;fgets(buff,5000,fd1)!=NULL;Line++){
^~~~
/usr/include/secure/_string.h:124:34: note: expanded from macro 'strncpy'
__builtin___strncpy_chk (dest, __VA_ARGS__, __darwin_obs...
^~~~~~~~~~~
10 warnings generated.
ls -l evalb
-rwxr-xr-x 1 sandiway staff Apr 18 17:50 evalb
./evalb -h
evalb [-dDh][-c n][-e n][-p param_file] gold-file test-file
Evaluate bracketing in test-file against gold-file.
Return recall, precision, F-Measure, tag accuracy.
5
Bikel Collins and EVALB
Performance of Bikel-Collins on section 23
6
tregex tsurgeon (-s flag)
System Flowchart Diagram:
Training: WSJ treebank 00–24, Treebank trees .mrg → Bikel Collins parser (Train) → Events .obj.gz
Parsing: Treebank sentences .txt → Bikel Collins parser (Parse) → Parse trees .txt.parsed (one tree per line)
Gold trees: Sec 24 trees .mrg → tregex tsurgeon (-s flag) → Sec 24 gold trees .mrg (one tree per line)
Evaluation: EVALB, COLLINS .prm → recall, precision, F-measure ≈86%
Annotations on the diagram: create using cat (twice); How?; tregex View; tregex View Search
7
Extracting the sentences for section 24
Save matched. Then delete prefix, and empty elements (see patterns)
Regex patterns:
\*T\*-[0-9]+
\*-[0-9]+
\*U\*
\b0\b
\b\*\b
\*EXP\*-[0-9]+
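The empty-element patterns above can be applied in a short script. This is a sketch only: the function name is made up, and it matches whole whitespace-separated tokens rather than using \b, because in Python's re module \b\*\b does not match a standalone * surrounded by spaces (a word boundary needs a word character on one side).

```python
import re

# Empty-element tokens from the slide's pattern list, matched against whole tokens.
EMPTY_TOKEN = re.compile(
    r"\*T\*-[0-9]+"       # e.g. *T*-1
    r"|\*-[0-9]+"         # e.g. *-1
    r"|\*U\*"             # *U*
    r"|0"                 # a bare 0 token
    r"|\*"                # a bare * token
    r"|\*EXP\*-[0-9]+"    # e.g. *EXP*-2
)


def strip_empty_elements(sentence):
    """Drop tokens that are empty elements and rejoin the rest with spaces."""
    tokens = sentence.split()
    kept = [t for t in tokens if not EMPTY_TOKEN.fullmatch(t)]
    return " ".join(kept)


print(strip_empty_elements("The temperature will be taken *-1 from several vantage points ."))
```

Matching whole tokens also keeps ordinary numbers such as 10 or 0.5 intact, since only a token that is exactly 0 is dropped.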
8
Section 24: easier way
~$ python3.5
Python (v3.5.2:4def2a2901a5, Jun , 10:47:25)
[GCC (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import nltk
>>> from nltk.corpus import ptb
>>> s = ptb.tagged_sents(categories=['news'])[47862]
>>> s
[('The', 'DT'), ('economy', 'NN'), ("'s", 'POS'), ('temperature', 'NN'), ('will', 'MD'), ('be', 'VB'), ('taken', 'VBN'), ('*-1', '-NONE-'), ('from', 'IN'), ('several', 'JJ'), ('vantage', 'NN'), ('points', 'NNS'), ('this', 'DT'), ('week', 'NN'), (',', ','), ('with', 'IN'), ('readings', 'NNS'), ('on', 'IN'), ('trade', 'NN'), (',', ','), ('output', 'NN'), (',', ','), ('housing', 'NN'), ('and', 'CC'), ('inflation', 'NN'), ('.', '.')]
>>> s = ptb.tagged_sents(categories=['news'])[-1]
>>> s
[('``', '``'), ('We', 'PRP'), ('want', 'VBP'), ('*', '-NONE-'), ('to', 'TO'), ('see', 'VB'), ('Nelson', 'NNP'), ('Mandela', 'NNP'), ('and', 'CC'), ('all', 'PDT'), ('our', 'PRP$'), ('comrades', 'NNS'), ('out', 'IN'), ('of', 'IN'), ('prison', 'NN'), (',', ','), ('and', 'CC'), ('if', 'IN'), ('we', 'PRP'), ('are', 'VBP'), ("n't", 'RB'), ('disciplined', 'VBN'), ('*-1', '-NONE-'), ('we', 'PRP'), ('may', 'MD'), ('not', 'RB'), ('see', 'VB'), ('them', 'PRP'), ('here', 'RB'), ('with', 'IN'), ('us', 'PRP'), ('.', '.')]
…
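Given tagged sentences in this form, the empty elements can be dropped by tag instead of by regex. A minimal sketch, assuming the input is a list of (word, tag) pairs as printed above; the function name is hypothetical and the sample list is a shortened copy of the first sentence.

```python
def surface_words(tagged_sentence):
    """Keep only the words whose tag is not -NONE- (i.e. drop empty elements)."""
    return [word for word, tag in tagged_sentence if tag != "-NONE-"]


# Shortened copy of the first sentence shown in the session above.
s = [("The", "DT"), ("economy", "NN"), ("'s", "POS"), ("temperature", "NN"),
     ("will", "MD"), ("be", "VB"), ("taken", "VBN"), ("*-1", "-NONE-"),
     ("from", "IN"), ("several", "JJ"), ("vantage", "NN"), ("points", "NNS")]

print(" ".join(surface_words(s)))
```

In an interactive session the same function could be mapped over the slice of ptb.tagged_sents(categories=['news']) that corresponds to section 24, writing one sentence per line for the parser.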
9
Homework 9: Part 3
Q: How much data do we need to get good Recall, Precision and F-scores?
Test: concatenate 1, 2, 4, 8, 16 sections of your choosing and train the parser
Run the resulting .obj.gz files for 1, 2, 4, 8 and 16 sections on sentences from section 24
Plot the Recall, Precision and F-scores
See if you get a curve form like this…
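The concatenation step can be done with cat, or equivalently with a few lines of Python. This is a sketch under assumptions: the function name is made up, and the file names in the usage comment are placeholders rather than the actual treebank paths.

```python
import shutil
from pathlib import Path


def concatenate_sections(section_paths, out_path):
    """Append the given .mrg files, in order, into one training file (like cat)."""
    with open(out_path, "wb") as out:
        for path in section_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
    return Path(out_path)


# Hypothetical usage: placeholder file names, one training file per data size.
# for n in (1, 2, 4, 8, 16):
#     concatenate_sections(chosen_section_files[:n], "train-%d.mrg" % n)
```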
10
Other sections …
11
Training: Robustness and Sensitivity
(Bikel 2004): “it may come as a surprise that the [parser] needs to access more than 219 million probabilities during the course of parsing the 1,917 sentences of Section 00 [of the PTB].”
12
Training: Robustness and Sensitivity
Trainer has a memory like a phone book:
76.8% singular events
94.2% 5 or fewer occurrences
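Figures of this kind can be computed from a table of event counts. A minimal sketch with made-up toy counts; it assumes the percentages are taken over distinct event types, which the slide does not state, and the function name is hypothetical.

```python
from collections import Counter


def low_frequency_shares(event_counts, threshold=5):
    """Return (share of events seen once, share seen <= threshold times).

    Shares are over distinct event types (an assumption, not from the slide).
    """
    total = len(event_counts)
    if total == 0:
        return 0.0, 0.0
    singular = sum(1 for c in event_counts.values() if c == 1)
    rare = sum(1 for c in event_counts.values() if c <= threshold)
    return singular / total, rare / total


# Toy data: five distinct events with invented frequencies.
toy = Counter({"e1": 1, "e2": 1, "e3": 1, "e4": 3, "e5": 12})
print(low_frequency_shares(toy))
```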
13
Training: Robustness and Sensitivity
.observed file
Frequency 1 observed data for:
(NP (NP (DT a)(NN milk))(PP (IN with)(NP (ADJP (CD 4)(NN %))(NN butterfat))))
(mod ((with IN) (milk NN) PP (+START+) ((+START+ +START+)) NP-A NPB () false right) 1.0)
modHeadWord (with IN)
headWord (milk NN)
modifier PP
previousMods (+START+)
previousWords ((+START+ +START+))
parent NP-A
head NPB
subcat ()
verbIntervening false
side right
(mod ((+STOP+ +STOP+) (milk NN) +STOP+ (PP) ((with IN)) NP-A NPB () false right) 1.0)
modHeadWord (+STOP+ +STOP+)
headWord (milk NN)
modifier +STOP+
previousMods (PP)
previousWords ((with IN))
parent NP-A
head NPB
subcat ()
verbIntervening false
side right
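The correspondence between a mod event and its ten named fields can be made explicit with a small s-expression reader. This is a sketch, not the parser's own code: the function names are made up, and the field order is taken from the listing above.

```python
# Field names in the order they appear in the listing above.
FIELDS = ["modHeadWord", "headWord", "modifier", "previousMods", "previousWords",
          "parent", "head", "subcat", "verbIntervening", "side"]


def parse_sexp(text):
    """Parse a parenthesized expression into nested Python lists of strings."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing ")"
            return items
        return token

    return read()


def decode_mod_event(line):
    """Map one (mod (...) count) line onto the named fields."""
    kind, event, count = parse_sexp(line)
    if kind != "mod" or len(event) != len(FIELDS):
        raise ValueError("not a 10-field mod event")
    record = dict(zip(FIELDS, event))
    record["count"] = float(count)
    return record


line = "(mod ((with IN) (milk NN) PP (+START+) ((+START+ +START+)) NP-A NPB () false right) 1.0)"
event = decode_mod_event(line)
print(event["headWord"], event["modifier"], event["side"], event["count"])
```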
14
Robustness and Sensitivity
It’s often assumed that statistical models are less brittle than symbolic models
can get parses for ungrammatical data
are they sensitive to noise or small perturbations?
15
Robustness and Sensitivity
Examples
Herman mixed the water with the milk
Herman mixed the milk with the water
Herman drank the water with the milk
Herman drank the milk with the water
(mix) (drink)
f(water)=117, f(milk)=21
16
Robustness and Sensitivity
Examples
Herman mixed the water with the milk
Herman mixed the milk with the water
Herman drank the water with the milk
Herman drank the milk with the water
(high) logprob = -50.4
(low) logprob = -47.2
different PP attachment choices
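To make the comparison concrete: the analysis with the larger (less negative) logprob is the one preferred. The sketch below uses the two values from the slide; the ratio calculation assumes natural logarithms, which the slide does not specify.

```python
import math

# Log probabilities of the two attachment analyses, as given on the slide.
candidates = {"high": -50.4, "low": -47.2}

# The preferred analysis is the one with the highest log probability.
best = max(candidates, key=candidates.get)

# Probability ratio between the two analyses, ASSUMING natural logs.
ratio = math.exp(candidates["low"] - candidates["high"])

print(best, round(ratio, 2))
```

Under that assumption a gap of 3.2 in logprob corresponds to the low-attachment analysis being roughly 25 times more probable.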
17
Robustness and Sensitivity
First thoughts...
does milk force low attachment? (high attachment for other nouns like water, toys, etc.)
Is there something special about the lexical item milk?
24 sentences in the WSJ Penn Treebank with milk in it, 21 as a noun
18
Robustness and Sensitivity
First thoughts...
Is there something special about the lexical item milk?
24 sentences in the WSJ Penn Treebank with milk in it, 21 as a noun
but just one sentence (#5212) with PP attachment for milk
Could just one sentence out of 39,832 training examples affect the attachment options?
19
Robustness and Sensitivity
Simple perturbation experiment
alter that one sentence and retrain
Diagram labels: parser, sentences, parses, derived counts wsj obj.gz
20
Robustness and Sensitivity
Simple perturbation experiment
alter that one sentence and retrain
✕ delete the PP with 4% butterfat altogether
21
Robustness and Sensitivity
Simple perturbation experiment
alter that one sentence and retrain
Treebank sentences wsj mrg
Derived counts wsj obj.gz
training
the Bikel/Collins parser can be retrained quickly
or bump the PP up to the VP level
22
Robustness and Sensitivity
Result: high attachment for previous PP adjunct to milk
Why such extreme sensitivity to perturbation?
logprobs are conditioned on many things; hence, lots of probabilities to estimate
smoothing needs every piece of data, even low-frequency ones
Could just one sentence out of 39,832 training examples affect the attachment options? YES
23
Details… Two sets of files:
24
Details…
25
Robustness and Sensitivity
Frequency 1 observed data for:
(NP (NP (DT a)(NN milk))(PP (IN with)(NP (ADJP (CD 4)(NN %))(NN butterfat))))
(mod ((with IN) (milk NN) PP (+START+) ((+START+ +START+)) NP-A NPB () false right) 1.0)
modHeadWord (with IN)
headWord (milk NN)
modifier PP
previousMods (+START+)
previousWords ((+START+ +START+))
parent NP-A
head NPB
subcat ()
verbIntervening false
side right
(mod ((+STOP+ +STOP+) (milk NN) +STOP+ (PP) ((with IN)) NP-A NPB () false right) 1.0)
modHeadWord (+STOP+ +STOP+)
headWord (milk NN)
modifier +STOP+
previousMods (PP)
previousWords ((with IN))
parent NP-A
head NPB
subcat ()
verbIntervening false
side right
26
Robustness and Sensitivity
76.8% singular events
94.2% 5 or fewer occurrences
27
Robustness and Sensitivity
Full story more complicated than described here...
by picking different combinations of verbs and nouns, you can get a range of behaviors
Table (columns: Verb, Noun, Attachment for milk + noun, Attachment for noun + milk): drank water high mixed low computer
f(drank)=0
might as well have picked flubbed