1
Evaluating the Waspbench: A Lexicography Tool Incorporating Word Sense Disambiguation
Rob Koeling, Adam Kilgarriff, David Tugwell, Roger Evans
ITRI, University of Brighton
Credits: UK EPSRC grant WASPS, M34971
2
Lexicographers need NLP
3
NLP needs lexicography
4
Word senses: nowhere truer
Lexicography – the second hardest part
5
Word senses: nowhere truer
Lexicography – the second hardest part
NLP
– Word sense disambiguation (WSD)
  SENSEVAL-1 (1998): 77% (Hector)
  SENSEVAL-2 (2001): 64% (WordNet)
6
Word senses: nowhere truer
Lexicography – the second hardest part
NLP
– Word sense disambiguation (WSD)
  SENSEVAL-1 (1998): 77% (Hector)
  SENSEVAL-2 (2001): 64% (WordNet)
– Machine Translation: main cost is lexicography
7
Synergy
The WASPBENCH
8
Inputs and outputs
Inputs
– Corpus (processed)
– Lexicographic expertise
9
Inputs and outputs
Outputs
– Analysis of the meaning/translation repertoire
– Implemented: a word expert that can disambiguate – a “disambiguating dictionary”
10
Inputs and outputs
MT needs rules of the form: in context C, S => T
– Major determinant of MT quality
– Manual production: expensive
– Eng oil => Fr huile or pétrole? SYSTRAN: 400 rules
11
Inputs and outputs
MT needs rules of the form: in context C, S => T
– Major determinant of MT quality
– Manual production: expensive
– Eng oil => Fr huile or pétrole? SYSTRAN: 400 rules
Waspbench output: thousands of rules
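To make the rule format concrete, here is a minimal sketch (in Python, not the WASPBENCH implementation or its actual rule format) of a context-conditioned transfer rule "in context C, S => T" and a small word expert that applies such rules. The class names, the clue-word matching and the oil => huile / pétrole example contexts are illustrative assumptions.

# Illustrative sketch of "in context C, S => T" rules (not the WASPBENCH format).
# A rule fires when a context clue appears near the source word; the word expert
# collects the rules for one word and picks a translation for a given sentence.

from dataclasses import dataclass

@dataclass
class TransferRule:
    source: str        # S: the ambiguous source-language word
    clue: str          # C: a context word that signals the intended sense
    target: str        # T: the translation to use when the clue is present

class WordExpert:
    def __init__(self, source, rules, default):
        self.source = source
        self.rules = rules      # list of TransferRule for this word
        self.default = default  # fallback translation when no rule fires

    def translate(self, sentence):
        tokens = sentence.lower().split()
        for rule in self.rules:
            if rule.clue in tokens:
                return rule.target
        return self.default

# Hypothetical word expert for English "oil" -> French:
oil_expert = WordExpert(
    source="oil",
    rules=[
        TransferRule("oil", "olive", "huile"),    # cooking sense
        TransferRule("oil", "engine", "huile"),   # lubricant sense
        TransferRule("oil", "crude", "pétrole"),  # petroleum sense
        TransferRule("oil", "barrel", "pétrole"),
    ],
    default="huile",
)

print(oil_expert.translate("Crude oil prices rose again"))   # -> pétrole
print(oil_expert.translate("Fry the onions in olive oil"))   # -> huile

The contrast the slides draw is one of scale: rules like these are expensive to produce by hand, whereas the Waspbench derives thousands of them from corpus data plus lexicographic input.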
12
Evaluation hard
13
Evaluation hard
Three communities
14
Evaluation hard
Three communities
No precedents
15
Evaluation hard
Three communities
No precedents
The art and craft of lexicography
16
Evaluation hard
Three communities
No precedents
The art and craft of lexicography
MT personpower budgets
17
Five threads
as WSD: SENSEVAL
for lexicography: MED expert reports
Quantitative experiments with human subjects
– India: within-group consistency
– Leeds: comparison with commercial MT
18
Method
Human1 creates word experts
Computer uses the word experts to disambiguate test instances
MT system translates the same test instances
Human2 evaluates computer and MT performance on each instance:
– good / bad / unsure / preferred / alternative
19
Words
Mid-frequency
– 1,500-20,000 instances in the BNC
At least two clearly distinct meanings
– Checked with reference to translations into French/German/Dutch
33 words
– 16 nouns, 10 verbs, 7 adjectives
Around 40 test instances per word
20
Words
Nouns: bank, chest, coat, fit, line, lot, mass, party, policy, record, seal, step, term, volume
Verbs: charge, float, move, observe, offend, post, pray, toast, undermine
Adjectives: bright, free, funny, hot, moody, strong
21
Human subjects
Translation studies students, University of Leeds
– Thanks: Tony Hartley
Native/near-native in English and their other language
Twelve people, working with:
– Chinese (4), French (3), German (2), Italian (1), Japanese (2)
– (no MT system for Japanese)
Circa four days’ work:
– introduction/training
– two days to create word experts
– two days to evaluate output
22
Method
Human1 creates word experts, average 30 mins/word
Computer uses the word experts to disambiguate test instances
MT system (Babelfish via Altavista) translates the same test instances
Human2 evaluates computer and MT performance on each instance:
– good / bad / unsure / preferred / alternative
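As a rough sketch of how per-instance judgements of this kind could be tallied into the percentages reported on the following slides, the Python below assumes each test instance gets one label for the Waspbench output and one for the MT output; treating "good", "preferred" and "alternative" as acceptable is an assumption about the scoring, not a documented rule. Under this counting, Wasps + MT − both + neither + unsure comes to roughly 100% per row, which appears consistent with the result tables that follow.

# Hedged sketch: tally evaluator judgements into the kind of percentages
# reported later (Wasps good, MT good, both good, neither good, unsure).
# Labels follow the slide: good / bad / unsure / preferred / alternative;
# counting "preferred" and "alternative" as acceptable is an assumption.

ACCEPTABLE = {"good", "preferred", "alternative"}

def summarise(judgements):
    """judgements: list of (wasps_label, mt_label) pairs, one per test instance."""
    n = len(judgements)
    counts = {"wasps": 0, "mt": 0, "both": 0, "neither": 0, "unsure": 0}
    for wasps_label, mt_label in judgements:
        if wasps_label == "unsure" or mt_label == "unsure":
            counts["unsure"] += 1
            continue
        wasps_ok = wasps_label in ACCEPTABLE
        mt_ok = mt_label in ACCEPTABLE
        counts["wasps"] += wasps_ok
        counts["mt"] += mt_ok
        counts["both"] += wasps_ok and mt_ok
        counts["neither"] += not wasps_ok and not mt_ok
    return {k: round(100 * v / n) for k, v in counts.items()}

# Toy example with four instances:
print(summarise([
    ("good", "bad"),            # Waspbench only
    ("good", "good"),           # both
    ("bad", "bad"),             # neither
    ("preferred", "unsure"),    # unsure
]))
# -> {'wasps': 50, 'mt': 25, 'both': 25, 'neither': 25, 'unsure': 25}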
23
Results (%)
Lang   Wasps  MT   both  neither  unsure
Ger    60     28   19    26       5
Fr     61     45   37    28       4
Ch     68     42   37    23       3
It     67     29   23    22       5
All    64     36   29    25       4
24
Results by POS (%)
POS     Wasps  MT   both  neither
Nouns   69     40   35    24
Verbs   61     38   32    27
Adjs    63     41   31    24
25
Observations
Grad student users, 4-hour training
30 mins per (not-too-complex) word
‘Fuzzy’ words intrinsically harder
No great inter-subject disparities
– (it’s the words that vary, not the people)
26
Conclusion
WSD can improve MT (using a tool like WASPS)
28
Future work
multiwords
n > 2
thesaurus
other source languages
new corpora, bigger corpora
– the web