Published by Sibyl Sparks. Modified over 9 years ago.
1
Word senses: a computational response
Adam Kilgarriff
Auckland 2012
2
My PhD (in 5 slides)
What is a word sense
3
The lexicographers
They create them
Methods
  Introspection
  Other dictionaries
  Corpus
Atkins, Hanks, Krishnamurthy
4
What is a word sense (1)
SFIP: Sufficiently frequent, insufficiently predictable
(a glass of) whisky
x (a glass of) tequila
5
What is a word sense (2)
homonymy
analogy
polysemy
rules
collocation
6
What is a word sense (3)
A cluster
  Of instances of use
  Operationalised as: corpus lines
  Clustered by lexicographers
7
What is a word sense (3)
8
What is a word sense (3)
9
What is a word sense (3)
10
What is a word sense (3)
11
What is a word sense (3)
A cluster
  Of instances of use
  Operationalised as: corpus lines
  Clustered by lexicographers
Makes sense of
  Overlapping senses
  Different dictionaries, different senses
  Lumping and splitting
12
I don’t believe in word senses
Believe in: resurrection, ghost, witch, vampire, god, miracle, fairy
Philosophy: Ontological commitment (same meaning, different register)
“good entities to build belief systems on”
13
A word sense is a cluster of corpus lines
But I’m an NLP person
Automatic clustering?
Inspiration: Hindle 1991, Schütze 1993, Grefenstette 1993, Lin 1999
You can get semantic sense from corpora+stats
14
First attempt
Longman 1994
Abject failure
  No grammar
  Corpus too small and noisy
  Naïve clustering
  Useless programmer
15
Collocations
Easy
  Most words don’t go with most other words
Then build on what we can do well
metaphor, analogy, homonymy, rules all much harder
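The observation that most words don't go with most other words is what makes collocation scoring tractable: observed pairs can be compared against what chance would predict. The slides do not name an association measure, so the sketch below uses pointwise mutual information (PMI) purely as an illustrative stand-in. The function name `pmi_scores` and the toy pairs are assumptions, not material from the talk.

```python
import math
from collections import Counter

def pmi_scores(pairs):
    """Score each observed (word, collocate) pair by pointwise mutual information.

    PMI is an assumed, illustrative measure; the slides do not specify one.
    """
    pair_counts = Counter(pairs)
    word_counts = Counter(word for word, _ in pairs)
    collocate_counts = Counter(collocate for _, collocate in pairs)
    total = len(pairs)
    scores = {}
    for (word, collocate), observed in pair_counts.items():
        # Count we would expect if word and collocate were independent.
        expected = word_counts[word] * collocate_counts[collocate] / total
        scores[(word, collocate)] = math.log2(observed / expected)
    return scores

# Toy data (hypothetical): each tuple is one co-occurrence seen in a corpus.
pairs = (
    [("strong", "tea")] * 3
    + [("strong", "wind")]
    + [("powerful", "computer")] * 3
    + [("powerful", "wind")]
)
scores = pmi_scores(pairs)
```

Pairs that never co-occur, such as ("strong", "computer") here, simply get no entry, which mirrors the point that the space of attested collocations is sparse.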
16
Clustering
Word sketch
  Collocates organised by grammar
Dictionary
  Collocates (and other things) organised by meaning
How to re-organise
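To make "collocates organised by grammar" concrete, here is a minimal sketch that groups a headword's collocates under the grammatical relation they occur in. It assumes the corpus has already been parsed into (word, relation, collocate) triples. The relation names, the function `word_sketch`, and the toy data are hypothetical, and this is not the actual word sketch implementation.

```python
from collections import Counter, defaultdict

def word_sketch(triples, headword):
    """Group the collocates of `headword` by grammatical relation, most frequent first."""
    by_relation = defaultdict(Counter)
    for word, relation, collocate in triples:
        if word == headword:
            by_relation[relation][collocate] += 1
    # One frequency-ranked collocate list per grammatical relation.
    return {rel: counts.most_common() for rel, counts in by_relation.items()}

# Toy parsed corpus (hypothetical triples).
triples = [
    ("bank", "modifier", "river"),
    ("bank", "modifier", "central"),
    ("bank", "modifier", "central"),
    ("bank", "object_of", "rob"),
    ("money", "object_of", "deposit"),
]
sketch = word_sketch(triples, "bank")
```

Re-organising this output by meaning would amount to regrouping the same collocates under senses instead of relations: "central" and "rob" on one side, "river" on the other.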
17
Observation:
corpus: arbitrary sample
dictionary (= lexicon): systematic account
Children
  encounter arbitrary samples
  develop systematic account
18
Corpus
provisional, dispensable
used to develop lexicon
19
Levels of abstraction
Direct linkage: Fragile
  Updates (to C or D) break links
Dictionary: abstract
Corpus: raw
Intermediate level needed
Diagram: Corpus === Dictionary
20
How most automatic word sense disambiguation (WSD) works
Analyse dictionary to give set of collocates
Match to collocates in a corpus
Dispensable corpus
Diagram: Corpus === Collocates === Dictionary
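The two steps on this slide (derive collocates per sense from the dictionary, then match them against the corpus context) can be sketched as a simple overlap count. This is a simplified illustration, not any particular WSD system. The sense labels, the collocate sets, and the function `disambiguate` are hypothetical.

```python
def disambiguate(context_words, sense_collocates):
    """Return the sense whose collocate set overlaps most with the context.

    Returns None when no sense has any collocate in the context.
    """
    context = set(context_words)
    best_sense, best_overlap = None, 0
    for sense, collocates in sense_collocates.items():
        overlap = len(context & collocates)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

# Collocates assumed to have been extracted from dictionary entries (hypothetical).
sense_collocates = {
    "bank/finance": {"money", "account", "loan", "deposit"},
    "bank/river": {"river", "water", "fishing", "muddy"},
}

finance = disambiguate(
    "she opened an account to deposit money at the bank".split(), sense_collocates
)
unknown = disambiguate("they sat on the bank".split(), sense_collocates)
```

The second call returns `None`: with no matching collocates in the context, this approach has nothing to go on.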
21
Not just collocates
triples
  parse the corpus
some “unary relations”
  I hear him singing
domain-based clues
Collocates, Constructions, Domains = CoCoDo
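One way to picture a CoCoDo is as a single record type covering all three kinds of clue. The sketch below is an assumed representation, not the talk's data model: the field names, the `kind` values, and the construction label `object_plus_ing` (standing for the pattern in "I hear him singing") are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class CoCoDo:
    headword: str
    kind: str                      # "collocate", "construction", or "domain"
    label: str                     # grammatical relation, construction name, or domain name
    partner: Optional[str] = None  # second word of a triple; None for unary relations and domains

def group_by_kind(clues):
    """Bucket clues into collocates, constructions, and domains."""
    grouped = defaultdict(list)
    for clue in clues:
        grouped[clue.kind].append(clue)
    return dict(grouped)

# Hypothetical clues for the verb "hear".
clues = [
    CoCoDo("hear", "collocate", "object", "voice"),   # triple: hear + object + voice
    CoCoDo("hear", "construction", "object_plus_ing"),  # unary: "I hear him singing"
    CoCoDo("hear", "domain", "law"),                  # domain-based clue
]
grouped = group_by_kind(clues)
```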
22
Linking CoCoDos to senses
Automatically extract CoCoDos from corpus
How linked to senses?
  Automatic (WSD techniques)
  Manual
    “dictionary-free”: ideal for new dictionaries
    Labour costs
  Mixed
    WSD with manual confirmation/correction
Diagram: Corpus === CoCoDo === Dictionary
23
Semi-automatic dictionary drafting (SADD)
CoCoDo database
Automatic clustering
Lexicographer input
More clustering
Dictionary with corpus inside
24
Automatic clustering of collocates
  Propose senses
Iterate:
  Lexicographer input
    Confirm/reject/edit sense inventory
    Assigns collocates / corpus lines to senses
  WSD
    Uses seeds to build full WSD for word
    Find more collocates for each sense
XML dictionary entry
  Load into dictionary-editing tool
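One round of the iterate step can be sketched as seed-based bootstrapping: the lexicographer's seed collocates label some corpus lines, and words that then turn up only with one sense become new collocates for it. The slides do not describe the actual algorithm. The function `bootstrap_round`, the `min_count` threshold, and the toy corpus lines are assumptions.

```python
from collections import Counter, defaultdict

def bootstrap_round(lines, seeds, min_count=2):
    """Label corpus lines via seed collocates, then harvest new collocates per sense."""
    # Step 1: assign a line to a sense only if exactly one sense's seeds occur in it.
    assigned = {}
    for i, line in enumerate(lines):
        words = set(line)
        matches = [sense for sense, cues in seeds.items() if words & cues]
        if len(matches) == 1:
            assigned[i] = matches[0]
    # Step 2: count the words seen in the lines assigned to each sense.
    counts = defaultdict(Counter)
    for i, sense in assigned.items():
        counts[sense].update(set(lines[i]))
    # Step 3: promote words that are frequent for one sense and unseen with the others.
    new_seeds = {sense: set(cues) for sense, cues in seeds.items()}
    for sense, counter in counts.items():
        for word, n in counter.items():
            elsewhere = sum(counts[other][word] for other in counts if other != sense)
            if n >= min_count and elsewhere == 0:
                new_seeds[sense].add(word)
    return assigned, new_seeds

# Toy tokenised corpus lines for "bank" (hypothetical).
lines = [
    ["deposit", "money", "bank", "account"],
    ["bank", "loan", "account", "interest"],
    ["river", "bank", "muddy", "water"],
    ["fishing", "river", "bank", "water"],
    ["bank", "holiday", "monday"],
]
seeds = {"finance": {"money", "loan"}, "river": {"river"}}
assigned, new_seeds = bootstrap_round(lines, seeds)
```

The last line stays unassigned because it contains no seed. Lines like that are the ones that would go back to the lexicographer in the next iteration.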
25
Fits with Atkins method for bilingual lexicography
Analyse source language
  From corpus
  List all expressions that might possibly have a non-predictable translation
  Very fine grained
  Lots of collocations
  target-language-neutral; re-usable
Translate
Edit to finalise dictionary