1
How dominant is the commonest sense of a word?
Adam Kilgarriff
Lexicography MasterClass, University of Brighton
2
What do you think? (zero-freq senses don’t count)
3
The WSD task
Select the correct sense in context
Sense inventory given by a dictionary
An old problem
Corpus methods work best
4
Lower bound
Gale, Church and Yarowsky (1992)
Baseline system: always choose the commonest sense
Around 70% correct
Only a small sample available
SEMCOR: a bigger sample, still too small
SENSEVAL: big problem
5
Overview
Mathematical model
Evaluation (against SEMCOR)
Implications for WSD evaluation
6
Model: assumptions
Meanings are unrelated
Word-sense frequency distribution is the same as the word frequency distribution
7
Model
All k word senses go in a bag
Randomly select 2 to make a 2-sense word
k(k-1)/2 possible 2-sense words
8
Set the frequency
For a 2-sense word with frequency 101, possibilities include:
a 100:1 split: how many times?
a 51:50 split: how many times?
9
Words to model word senses
Brown corpus, or BNC
Count how many types there are for each frequency
Smooth to give a monotonically decreasing curve (see the sketch after the table below)
10
Number of word types having each frequency (raw and smoothed):

Freq    Brown raw   Brown smooth   BNC raw   BNC smooth
1          16278         –          486507        –
2           6097         –          123633        –
…
50            43       43.13           742     700.45
51            47       41.86           688     679.45
…
100           10       11.03           262     244.37
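The slides do not say how the raw counts were smoothed into the monotonically decreasing figures above. One plausible choice is to fit a straight line to log(count) against log(frequency), in the spirit of the log-log fit used in Simple Good-Turing smoothing (Gale and Sampson). The Python sketch below illustrates that approach on the raw Brown counts from the table; the function name is mine and its output will not reproduce the slide's smoothed figures exactly.

```python
import math

def smooth_freq_of_freqs(raw):
    """Fit log(count) = a + b * log(freq) by least squares and return the
    fitted counts (monotonically decreasing when b < 0).  This is one
    plausible smoothing method; the talk does not say which was used."""
    pairs = [(f, c) for f, c in sorted(raw.items()) if c > 0]
    xs = [math.log(f) for f, _ in pairs]
    ys = [math.log(c) for _, c in pairs]
    n = len(pairs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return {f: math.exp(a + b * math.log(f)) for f, _ in pairs}

# Raw Brown counts from the table above (frequency -> number of types).
brown_raw = {1: 16278, 2: 6097, 50: 43, 51: 47, 100: 10}
print(smooth_freq_of_freqs(brown_raw))
```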
11
Using Brown frequencies
100:1 split: how many times? 16278 * 11.03 = 179,546
51:50 split: how many times? 43.13 * 41.86 = 1,805
Ratio 179,546 : 1,805 = 99
The 100:1 split is 99 times likelier than the 51:50 split
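As a check, the same arithmetic in Python. The counts are the smoothed Brown frequency-of-frequencies from the earlier table; for frequency 1 the raw count is used, as on the slide.

```python
# Number of Brown word types with each frequency (values from the table).
types_with_freq = {1: 16278.0, 50: 43.13, 51: 41.86, 100: 11.03}

# Under the model, a 2-sense word of total frequency 101 split m:(101-m)
# can be assembled in (types with freq m) * (types with freq 101-m) ways.
ways_100_1 = types_with_freq[100] * types_with_freq[1]   # ~179,546
ways_51_50 = types_with_freq[51] * types_with_freq[50]   # ~1,805

print(ways_100_1 / ways_51_50)                           # ~99
```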
12
Generalising
For a 2-sense word with frequency n:
select the 'commonest' sense frequency m, with n/2 < m < n
select the other sense from the subset with frequency n - m
find all possible selections
Calculate the average ratio, commonest : other
This answers the title question (see the sketch below)
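A minimal sketch of this generalisation for 2-sense words, assuming the quantity being averaged is the share of the word's frequency taken by the commoner sense, weighted by the number of ways each split can arise. The function name and the Zipf-style curve in the usage line are illustrative, not taken from the talk.

```python
def expected_dominance(n, freq_of_freq):
    """Expected share of a 2-sense word's total frequency n taken by its
    commoner sense, under the bag-of-senses model.

    freq_of_freq maps a frequency to the (smoothed) number of word types
    having that frequency.  Each split m : (n - m), with n/2 < m < n, can
    arise in freq_of_freq[m] * freq_of_freq[n - m] ways and has dominance
    m / n; the result is the ways-weighted average dominance."""
    total_ways = 0.0
    weighted = 0.0
    for m in range(n // 2 + 1, n):          # commoner sense: n/2 < m < n
        ways = freq_of_freq.get(m, 0.0) * freq_of_freq.get(n - m, 0.0)
        total_ways += ways
        weighted += ways * m / n
    return weighted / total_ways if total_ways else None

# Illustration with a made-up Zipf-like curve (not the real BNC figures).
fof = {f: 486507 * f ** -1.6 for f in range(1, 1001)}
print(round(100 * expected_dominance(100, fof), 1))
```

Extending this to 3- and 4-sense 'words' means summing over all splits of n into 3 or 4 non-zero parts, weighting each by the product of the corresponding frequency-of-frequency counts.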
13
Model: answers (BNC)
Predicted dominance of the commonest sense (%):

n     2-sense 'words'   3-sense 'words'   4-sense 'words'
10         83.2              58.9              40.0
25         88.9              74.2              58.2
50         92.3              81.8              69.1
100        94.6              87.0              77.1
200        96.2              90.7              83.1
500        97.6              94.2              89.1
14
SEMCOR
250,000-word corpus
Manually sense-tagged with WordNet senses
15
Evaluate model against SEMCOR

              2-sense words               3-sense words
n           #    SEMCOR %   BNC %       #    SEMCOR %   BNC %
10-class   55      73.6      83.2      41      64.3      58.9
25-class   96      79.8      88.9      70      68.1      74.2
50-class   45      83.1      92.3      59      72.4      81.8
100-class  16      79.4      94.6      24      77.8      87.0

# = number of such words in SEMCOR; SEMCOR % = observed dominance of the commonest sense; BNC % = the model's prediction (from the previous slide).
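The slides do not show how the SEMCOR figures were computed. The sketch below is one plausible reading: words are assigned to the 10/25/50/100 frequency classes by their SEMCOR frequency, and the reported percentage is the average share of the commonest sense among words with the given number of attested senses. All names are mine and the banding rule is a guess.

```python
from collections import Counter, defaultdict

def dominance_by_class(tagged_tokens, n_senses=2, classes=(10, 25, 50, 100)):
    """Average share (%) of the commonest sense for words with n_senses
    attested senses, grouped into frequency classes.

    tagged_tokens : iterable of (lemma, sense_id) pairs, e.g. read from a
                    sense-tagged corpus such as SEMCOR (reader not shown).
    classes       : a word of frequency f goes into the largest class
                    label <= f (a guess at the slide's "n-class" grouping)."""
    sense_counts = defaultdict(Counter)
    for lemma, sense in tagged_tokens:
        sense_counts[lemma][sense] += 1

    shares = defaultdict(list)
    for lemma, counts in sense_counts.items():
        freq = sum(counts.values())
        if len(counts) != n_senses or freq < min(classes):
            continue
        cls = max(c for c in classes if c <= freq)
        shares[cls].append(max(counts.values()) / freq)

    return {c: (len(v), round(100 * sum(v) / len(v), 1))
            for c, v in sorted(shares.items())}
```

The returned pairs (word count, average dominance %) would correspond to the # and SEMCOR % columns above, under the stated assumptions.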
16
Discussion
Same trend
Assumption untrue: the SFIP principle says a reading must be sufficiently frequent and insufficiently predictable to get into a dictionary
generous vs pike
generous: donation / person / helping
pike: fish or weapon or hill or turnpike
17
Discussion
More data, more meanings (without end)
Not a change in the ratios for known senses, but the addition of new senses
The model fits pike, not generous
Dominated by singletons
18
SENSEVAL
Evaluation exercise for WSD: 1998, 2001, 2004
Two task types:
Lexical sample: choose a small sample of words and disambiguate multiple instances of each
All-words: choose a text or two and disambiguate all the words
19
Lower bound and SENSEVAL
All-words: samples too small to see the extent of the skew
If a 2-sense word has frequency 3, its lower bound is 67% (the commoner sense must cover at least 2 of the 3 occurrences)
Lexical sample: skew in manual sample selection
"Good" candidate words show "balance" (amazing)
Are systems better than the baseline?
SENSEVAL-3: systems scarcely beat the baseline
Not proven (and not likely)
20
What is the commonest sense?
It varies with domain
Finding it offers more mileage than disambiguation; cf the default strategy in commercial MT
McCarthy, Koeling, Weeds and Carroll (ACL-04)
A 3-sentence window does not allow domain-identification methods
The domain-identification task is more interesting and worthwhile than WSD
21
Thank you