
1 Evaluating the Waspbench: A Lexicography Tool Incorporating Word Sense Disambiguation
Rob Koeling, Adam Kilgarriff, David Tugwell, Roger Evans
ITRI, University of Brighton
Credits: UK EPSRC grant WASPS, M34971

2 Lexicographers need NLP

3 NLP needs lexicography

4 Word senses: nowhere truer
• Lexicography – the second hardest part

5 Word senses: nowhere truer
• Lexicography – the second hardest part
• NLP
  – Word sense disambiguation (WSD)
    • SENSEVAL-1 (1998): 77% Hector
    • SENSEVAL-2 (2001): 64% WordNet

6 Word senses: nowhere truer
• Lexicography – the second hardest part
• NLP
  – Word sense disambiguation (WSD)
    • SENSEVAL-1 (1998): 77% Hector
    • SENSEVAL-2 (2001): 64% WordNet
  – Machine Translation
    • Main cost is lexicography

7 Synergy The WASPBENCH

8 Inputs and outputs
• Inputs
  – Corpus (processed)
  – Lexicographic expertise

9 Inputs and outputs
• Outputs
  – Analysis of meaning/translation repertoire
  – Implemented:
    • Word expert
    • Can disambiguate
A “disambiguating dictionary”

10 Inputs and outputs
MT needs rules of the form: in context C, S => T
  – Major determinant of MT quality
  – Manual production: expensive
  – Eng oil => Fr huile or pétrole?
    • SYSTRAN: 400 rules

11 Inputs and outputs
MT needs rules of the form: in context C, S => T
  – Major determinant of MT quality
  – Manual production: expensive
  – Eng oil => Fr huile or pétrole?
    • SYSTRAN: 400 rules
Waspbench output: thousands of rules
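The rule format on this slide, "in context C, S => T", can be illustrated with a small sketch. Everything in it is hypothetical: the `ContextRule` name, the cue words, and the overlap-based matching are assumptions made for illustration, not Waspbench output or SYSTRAN rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextRule:
    """One rule of the form: in context C, source word S => target word T."""
    source: str        # S, the ambiguous source-language word
    cues: frozenset    # C, context words that trigger the rule (hypothetical)
    target: str        # T, the translation to choose


# Hypothetical rules for English "oil" -> French; cue words are illustrative only.
RULES = [
    ContextRule("oil", frozenset({"olive", "cooking", "salad"}), "huile"),
    ContextRule("oil", frozenset({"crude", "barrel", "tanker"}), "pétrole"),
]


def translate(word, sentence, rules=RULES, default=None):
    """Pick the target whose cue words overlap most with the sentence.

    Returns `default` when no rule for `word` matches any context word.
    """
    tokens = set(sentence.lower().split())
    best, best_overlap = default, 0
    for rule in rules:
        if rule.source != word:
            continue
        overlap = len(rule.cues & tokens)
        if overlap > best_overlap:
            best, best_overlap = rule.target, overlap
    return best
```

With these made-up cues, `translate("oil", "she fried it in olive oil")` selects "huile" and `translate("oil", "a tanker carrying crude oil")` selects "pétrole". A sentence with no cue word falls through to the default, which is where a larger rule set would be expected to help.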

12 Evaluation hard

13 Evaluation hard
• Three communities

14 Evaluation hard
• Three communities
• No precedents

15 Evaluation hard
• Three communities
• No precedents
• The art and craft of lexicography

16 Evaluation hard
• Three communities
• No precedents
• The art and craft of lexicography
• MT personpower budgets

17 Five threads
• as WSD: SENSEVAL
• for lexicography: MED
• expert reports
• Quantitative experiments with human subjects
  – India
    • Within-group consistency
  – Leeds
    • Comparison with commercial MT

18 Method
• Human1 creates word experts
• Computer uses word experts to disambiguate test instances
• MT system translates same test instances
• Human2
  – evaluates computer and MT performance on each instance:
  – good / bad / unsure / preferred / alternative

19 Words
• mid-frequency
  – 1,500-20,000 instances in BNC
• At least two clearly distinct meanings
  – Checked with ref to translations into Fr/Ger/Dutch
• 33 words
  – 16 nouns, 10 verbs, 7 adjs
• around 40 test instances per word

20 Words
Nouns            Verbs                Adjectives
bank   party     charge   toast       bright
chest  policy    float    undermine   free
coat   record    move                 funny
fit    seal      observe              hot
line   step      offend               moody
lot    term      post                 strong
mass   volume    pray

21 Human subjects
• Translation studies students, Univ Leeds
  – Thanks: Tony Hartley
• Native/near-native in English and their other language
• twelve people, working with:
  – Chinese (4), French (3), German (2), Italian (1), Japanese (2) (no MT system for Japanese)
• circa four days’ work:
  – introduction/training
  – two days to create word experts
  – two days to evaluate output

22 Method
• Human1 creates word experts, average 30 mins/word
• Computer uses word experts to disambiguate test instances
• MT system: Babelfish via Altavista translates same test instances
• Human2
  – evaluates computer and MT performance on each instance:
  – good / bad / unsure / preferred / alternative
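Percentages of the kind reported for this method could be derived from the per-instance judgments along the following lines. This is a minimal sketch, not the authors' procedure: the `Judgment` record and the sample data are invented, the slide's finer labels (preferred / alternative) are collapsed into good / bad / unsure, and counting an instance as "unsure" whenever either verdict is unsure is an assumption.

```python
from collections import Counter
from typing import NamedTuple


class Judgment(NamedTuple):
    """Human2's verdict on one test instance (hypothetical record layout)."""
    wasps: str  # "good", "bad" or "unsure" for the word-expert output
    mt: str     # "good", "bad" or "unsure" for the MT system output


def tally(judgments):
    """Return the percentage of instances in each reporting category."""
    if not judgments:
        raise ValueError("no judgments to tally")
    counts = Counter()
    for j in judgments:
        # Assumption: an instance is 'unsure' if either verdict is unsure.
        if "unsure" in (j.wasps, j.mt):
            counts["unsure"] += 1
            continue
        wasps_ok = j.wasps == "good"
        mt_ok = j.mt == "good"
        counts["wasps"] += wasps_ok
        counts["mt"] += mt_ok
        counts["both"] += wasps_ok and mt_ok
        counts["neither"] += not (wasps_ok or mt_ok)
    total = len(judgments)
    keys = ("wasps", "mt", "both", "neither", "unsure")
    return {k: 100 * counts[k] / total for k in keys}


# Invented sample: five instances of one word.
sample = [
    Judgment("good", "good"),
    Judgment("good", "bad"),
    Judgment("bad", "good"),
    Judgment("bad", "bad"),
    Judgment("unsure", "good"),
]
percentages = tally(sample)
```

Note that under this scheme the "wasps" and "mt" figures each include the instances counted under "both", so the columns overlap rather than partition the test set.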

23 Results (%)
Lang  Wasps  MT  both  neither  unsure
Ger   60     28  19    26       5
Fr    61     45  37    28       4
Ch    68     42  37    23       3
It    67     29  23    22       5
All   64     36  29    25       4
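One way to sanity-check these figures: if "both" counts the instances where the Wasps output and the MT output were each judged good, then by inclusion-exclusion Wasps + MT - both + neither + unsure should come to roughly 100 in every row. That reading of the columns is an assumption, not something the slide states. The numbers below are copied from the results slide; the function names are made up for this sketch.

```python
# Rows from the results slide: (Wasps, MT, both, neither, unsure), in percent.
RESULTS = {
    "Ger": (60, 28, 19, 26, 5),
    "Fr": (61, 45, 37, 28, 4),
    "Ch": (68, 42, 37, 23, 3),
    "It": (67, 29, 23, 22, 5),
    "All": (64, 36, 29, 25, 4),
}


def row_total(wasps, mt, both, neither, unsure):
    """Inclusion-exclusion total; about 100 if the columns are read as assumed."""
    return wasps + mt - both + neither + unsure


def wasps_margin(wasps, mt, both, neither, unsure):
    """Percentage-point lead of the Wasps output over the MT output."""
    return wasps - mt


totals = {lang: row_total(*row) for lang, row in RESULTS.items()}
margins = {lang: wasps_margin(*row) for lang, row in RESULTS.items()}
```

Every row total lands within one point of 100, which is consistent with rounding, and the margin is positive for all four languages.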

24 Results by POS (%)
       Wasps  MT  both  neither
Nouns  69     40  35    24
Verbs  61     38  32    27
Adjs   63     41  31    24

25 Observations
• Grad student users, 4-hour training
• 30 mins per (not-too-complex) word
• ‘fuzzy’ words intrinsically harder
• No great inter-subject disparities
  – (it’s the words that vary, not the people)

26 Conclusion
• WSD can improve MT (using a tool like WASPS)


28 Future work
• multiwords
• n>2
• thesaurus
• other source languages
• new corpora, bigger corpora
  – the web

