Published by Egbert O’Connor. Modified over 9 years ago.
1
Word senses: a computational response
Adam Kilgarriff
2
Madrid 2010 Kilgarriff: Word senses: a computational response
A word sense is a cluster of corpus lines
But I’m an NLP person
  Automatic clustering?
Inspiration: Hindle 1991, Schütze 1993, Grefenstette 1993, Lin 1999
You can get semantic sense from corpora + stats
3
First attempt
Longman 1994
Abject failure
  No grammar
  Corpus too small and noisy
  Naïve clustering
  Useless programmer
4
Collocations
Easy
  Most words don’t go with most other words
Then build on what we can do well
  metaphor, analogy, homonymy, rules all much harder
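The point that most words don't go with most other words can be made concrete with an association score. The sketch below uses pointwise mutual information (PMI) over a toy list of co-occurrence pairs. Both the choice of PMI and the data are assumptions made for illustration; the slides do not name a particular statistic.

```python
import math
from collections import Counter

# Toy (word, collocate) co-occurrence pairs, invented for illustration.
pairs = [
    ("strong", "tea"), ("strong", "tea"), ("strong", "wind"),
    ("powerful", "computer"), ("powerful", "computer"), ("powerful", "engine"),
    ("strong", "engine"),
]

def pmi_scores(pairs):
    """Return the PMI (log base 2) of every observed (word, collocate) pair."""
    pair_counts = Counter(pairs)
    word_counts = Counter(word for word, _ in pairs)
    coll_counts = Counter(coll for _, coll in pairs)
    total = len(pairs)
    scores = {}
    for (word, coll), n in pair_counts.items():
        p_pair = n / total
        p_word = word_counts[word] / total
        p_coll = coll_counts[coll] / total
        # PMI compares the observed pair probability with what
        # independence of word and collocate would predict.
        scores[(word, coll)] = math.log2(p_pair / (p_word * p_coll))
    return scores

scores = pmi_scores(pairs)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

Pairs that never co-occur, such as "powerful tea", simply do not appear in the result, and a pair seen less often than chance would predict gets a negative score.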
5
Clustering
Word sketch
  Collocates organised by grammar
Dictionary
  Collocates (and other things) organised by meaning
How to re-organise
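To illustrate "collocates organised by grammar", here is a minimal sketch that groups a headword's collocates by grammatical relation. It assumes the corpus has already been parsed into (relation, headword, collocate, frequency) tuples. The relation names, the data, and the function `word_sketch` are hypothetical and are not the format of any real tool.

```python
from collections import defaultdict

# Hypothetical parsed output: (grammatical relation, headword, collocate, frequency).
triples = [
    ("object_of", "bank", "rob", 12),
    ("object_of", "bank", "burst", 7),
    ("modifier", "bank", "river", 15),
    ("modifier", "bank", "central", 30),
    ("object_of", "tea", "drink", 40),
]

def word_sketch(headword, triples):
    """Group the collocates of one headword by grammatical relation."""
    sketch = defaultdict(list)
    for relation, head, collocate, freq in triples:
        if head == headword:
            sketch[relation].append((collocate, freq))
    # Within each relation, list the most frequent collocates first.
    return {
        relation: sorted(items, key=lambda item: -item[1])
        for relation, items in sketch.items()
    }

sketch = word_sketch("bank", triples)
for relation, collocates in sketch.items():
    print(relation, collocates)
```

The re-organisation problem the slide raises is visible here: "rob" and "central" belong to one meaning of "bank" and "burst" and "river" to another, but the grammatical grouping cuts across that division.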
6
Observation:
  corpus: arbitrary sample
  dictionary (= lexicon): systematic account
Children
  encounter arbitrary samples
  develop systematic account
7
Corpus
  provisional, dispensable
  used to develop lexicon
8
Levels of abstraction
Direct linkage: Fragile
  Updates (to C or D) break links
Dictionary: abstract
Corpus: raw
Intermediate level needed
[Diagram: Corpus linked directly to Dictionary]
9
How most automatic word sense disambiguation (WSD) works
Analyse dictionary to give set of collocates
Match to collocates in a corpus
Dispensable corpus
[Diagram: Corpus and Dictionary linked through Collocates]
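The two steps above (derive collocates per sense from the dictionary, then match them against the corpus) can be sketched as a simple overlap count. The sense labels, collocate sets, and the function `disambiguate` are hypothetical, and real systems weight and smooth the evidence rather than counting raw overlaps.

```python
# Hypothetical sense inventory: collocates derived from a dictionary entry.
sense_collocates = {
    "bank/finance": {"money", "account", "loan", "rob"},
    "bank/river": {"river", "water", "mud", "burst"},
}

def disambiguate(context_words, sense_collocates):
    """Return the sense sharing the most collocates with the context.

    Returns None when no sense shares any collocate with the context.
    """
    context = set(context_words)
    best_sense, best_overlap = None, 0
    for sense, collocates in sense_collocates.items():
        overlap = len(context & collocates)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

line = "the river burst its bank after heavy rain".split()
print(disambiguate(line, sense_collocates))
```

Because the collocate sets come from the dictionary and not from any particular corpus, the corpus stays dispensable: a different corpus can be matched against the same sets.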
10
Not just collocates
triples
  parse the corpus
some “unary relations”
  I hear him singing
domain-based clues
Collocates, Constructions, Domains = CoCoDo
11
Linking CoCoDos to senses
Automatically extract CoCoDos from corpus
How linked to senses?
  Automatic (WSD techniques)
  Manual
    “dictionary-free”: ideal for new dictionaries
    Labour costs
  Mixed
    WSD with manual confirmation/correction
[Diagram: Corpus and Dictionary linked through CoCoDos]
12
Semi-automatic dictionary drafting (SADD)
CoCoDo database
Automatic clustering
Lexicographer input
More clustering
Dictionary with corpus inside
13
Automatic clustering of collocates
  Propose senses
Iterate:
  Lexicographer input
    Confirm/reject/edit sense inventory
    Assigns collocates / corpus lines to senses
  WSD
    Uses seeds to build full WSD for word
    Find more collocates for each sense
XML dictionary entry
  Load into dictionary-editing tool
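The iterate step can be illustrated with a small seed-propagation loop: seed collocates confirmed by a lexicographer label some corpus lines, and the labelled lines then supply further collocates for each sense. This is a simplified sketch in the spirit of bootstrapping approaches, not the system described in the talk. The data, the sense names, and the function `bootstrap` are hypothetical.

```python
# Hypothetical corpus lines for "bank", each reduced to its set of collocates.
lines = [
    {"river", "mud"},
    {"river", "water"},
    {"money", "loan"},
    {"loan", "account"},
    {"water", "fish"},
    {"account", "manager"},
]

# Seed collocates a lexicographer might confirm for each proposed sense.
seeds = {"river": {"river"}, "finance": {"money"}}

def bootstrap(lines, seeds, rounds=3):
    """Grow per-sense collocate sets from seeds over a fixed number of rounds."""
    senses = {sense: set(colls) for sense, colls in seeds.items()}
    assignment = {}
    for _ in range(rounds):
        # Label a line only when exactly one sense shares a collocate with it.
        for i, line in enumerate(lines):
            matches = [s for s, colls in senses.items() if line & colls]
            if len(matches) == 1:
                assignment[i] = matches[0]
        # Harvest every collocate of a labelled line for that line's sense.
        for i, sense in assignment.items():
            senses[sense] |= lines[i]
    return senses, assignment

senses, assignment = bootstrap(lines, seeds)
print(assignment)
print(senses)
```

In a semi-automatic workflow the harvested collocates would go back to the lexicographer for confirmation or correction before the next round, instead of being accepted automatically as they are here.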
14
Atkins method for bilingual lexicography
Analyse source language
  From corpus
  List all expressions that might possibly have a non-predictable translation
    Very fine grained
    Lots of collocations
  target-language-neutral; re-usable
Translate
Edit to finalise dictionary
15
Current projects/initiatives
Semi-automatic Dictionary Drafting (SADD)
Tickbox Lexicography (TBL)
Slovene project
New English-Irish Dictionary
Putting Collocations in the Dictionary (PCID)