1
Between Corpus and Dictionary
Adam Kilgarriff
Lexical Computing Ltd
Lexicography MasterClass Ltd
Universities of Leeds, Sussex
2
Szeged, Jan 2008 | Kilgarriff, Global WordNet
What is a word sense?
3
Preliminaries
  What is language?
  What is meaning?
4
What is language?
5
What is language?
  In our heads
6
What is language?
  In our heads
  In texts and sound signals
7
What is language?
  In our heads
  In texts and sound signals
  Both
8
Methodology
  Study language in our heads
    Introspection
    Semantic analysis
    Experiments with human subjects
  “rationalist” (Leibniz, Chomsky)
  Problems: coverage, arbitrariness
9
Methodology
  Study text
  “empiricist” (Locke, Hume)
    Physics: forces, matter
    Chemistry: chemicals, bonds
    Language: text, speech signals
10
It goes against the grain
  What is important about a sentence?
    Its meaning
  Corpus methodology:
    Throw away individual sentence meaning
    Find patterns
11
Empiricist linguistics
  A new way to find out about language
  15 years of rapid ascent
    Computers
    Corpora
      Bigger and bigger data sets available
    Language technology tools
      Lemmatizers, POS-taggers, parsers, machine learning for pattern finding
12
Rationalists vs empiricists in the age of the web
  Semantic web vs Google?
13
What are you?
  Temperament
  Complementary/alternatives
  Barbu and Poesio, Keller and Lapata: comparisons, evaluations
  (AK: current research project)
14
What is meaning?
  Fregean
  Gricean
15
Gottlob Frege (1848-1925)
  Founder of modern logic
  Truth values
    The sentence “grass is green” is true if and only if grass is green (Tarski)
  Meanings of words, phrases are such that:
    Put them together in a sentence
    State basic facts
    Sentence computes to ‘true’ if sentence is true, ‘false’ if it is false
16
Gottlob Frege (1848-1925)
  Formal semantics
  Sparkling analyses for quantifiers, connectives
  Montague semantics
  Foundations for maths, databases, ontologies …
17
H. P. Grice (1913-1988)
  “An agent means something by an utterance if and only if they intended the utterance to produce some effect in an audience by means of the recognition of this intention.”
  Dictionary of Philosophy of Mind, http://philosophy.uwaterloo.ca
18
Meaning is something you do
  Basis of meaning is
    Meaning event
    Speaker’s intention
    Speaker’s expectation of interpretation of hearer
  (messy, hard)
19
Strawson commentary (1970s)
  “For the sake of a label, we might call it the conflict between the theorists of communication-intention and the theorists of formal semantics. […] A struggle on what seems to be such a central issue in philosophy should have something of a Homeric quality; and a Homeric struggle calls for gods and heroes. I can at least, though tentatively, name some living captains and benevolent shades: on the one side, say, Grice, Austin, and the later Wittgenstein; on the other, Chomsky, Frege, and the earlier Wittgenstein.”
20
Battle of the two Adams?
21
Relevance to word senses
  Fregean
    Supports reasoning
    Builds on well-defined word-meanings
    Identifying word meanings: can’t help
  Fall back on Grice
22
Fauconnier and Turner
  “linguistic expressions prompt for meanings rather than express meanings”
  (AK chapter, Agirre and Edmonds WSD book)
23
Preliminaries over
  What is a word sense?
24
The lexicographers
  They create them
  Methods
    Introspection
    Other dictionaries
    Corpus
  Atkins, Hanks, Krishnamurthy
25
What is a word sense (1)
  SFIP
    Sufficiently frequent, insufficiently predictable
  (a glass of) whisky x (a glass of) tequila
26
What is a word sense (2)
  homonymy analogy polysemy rules collocation
27
What is a word sense (3)
  A cluster
    Of instances of use
    Operationalised as: corpus lines
    Clustered by lexicographers
28
What is a word sense (3)
29
What is a word sense (3)
30
What is a word sense (3)
31
What is a word sense (3)
32
What is a word sense (3)
  A cluster
    Of instances of use
    Operationalised as: corpus lines
    Clustered by lexicographers
  Makes sense of
    Overlapping senses
    Different dictionaries, different senses
    Lumping and splitting
33
I don’t believe in word senses
  Believe in:
    resurrection, ghost, witch, vampire, god, miracle, fairy
  Philosophy: Ontological commitment
    (same meaning, different register)
    “good entities to build belief systems on”
34
But I’m an NLP person
  Automatic clustering?
  Inspiration: Hindle 1991, Schütze 1993, Grefenstette 1993, Lin 1999
  You can get semantic sense from corpora+stats
35
First attempt
  Longman 1994
  Abject failure
    No grammar
    Corpus too small and noisy
    Naïve clustering
    Useless programmer
36
Collocations
  Easy
    Most words don’t go with most other words
  Then build on what we can do well
  (metaphor, analogy, homonymy, rules: all much harder)
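
Illustration (not from the talk): one way to see why collocations are the easy case. Because most words do not go with most other words, a pair that co-occurs more often than its parts would by chance stands out under a simple association score. The slide names no particular measure; pointwise mutual information (PMI) is assumed here purely for illustration, and the function name `pmi_collocates`, the window size, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def pmi_collocates(sentences, window=2, min_count=2):
    """Score ordered word pairs that co-occur within `window` tokens by PMI."""
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            # Pair each token with the next `window` tokens to its right.
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[(left, right)] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (left, right), count in pair_counts.items():
        if count < min_count:  # rare pairs give unreliable scores
            continue
        p_pair = count / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy corpus, invented for this sketch.
toy_corpus = [
    "she drank strong tea".split(),
    "he likes strong tea".split(),
    "a powerful computer".split(),
    "the powerful computer crashed".split(),
    "strong tea and a powerful computer".split(),
]
ranked = pmi_collocates(toy_corpus)
for pair, score in ranked:
    print(pair, round(score, 2))
```

On real data the corpus would be lemmatised first and a much higher frequency threshold would be used; PMI is known to favour rare pairs, which is one reason other measures are often preferred in practice.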
37
The Sketch Engine
  2003: programmer problem solved
  Corpora
    More available
    Build big clean ones from web
  Grammar
    POS-taggers/lemmatisers available
    Shallow regexp grammars if no full parser
  Stats: progress (Lin, Curran, Evert …)
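
Illustration (not from the talk): what a "shallow regexp grammar" over POS-tagged text can look like when no full parser is available. This is a minimal sketch, not the Sketch Engine's own grammar formalism. It assumes tokens written as `word/TAG` with Penn-Treebank-style tags (JJ for adjective, NN/NNS for common noun); the pattern, the function name `modifier_pairs`, and the sample sentences are hypothetical.

```python
import re
from collections import Counter

# One grammatical relation: an adjective immediately followed by a common noun.
# Assumed token format: "word/TAG", tokens separated by single spaces.
MODIFIER_PATTERN = re.compile(r"(\S+)/JJ (\S+)/NNS?\b")


def modifier_pairs(tagged_sentences):
    """Count (adjective, noun) pairs matched by the shallow pattern."""
    counts = Counter()
    for sentence in tagged_sentences:
        for adjective, noun in MODIFIER_PATTERN.findall(sentence):
            counts[(adjective.lower(), noun.lower())] += 1
    return counts


# Hand-tagged sample sentences, invented for this sketch.
tagged = [
    "She/PRP drank/VBD strong/JJ tea/NN ./.",
    "Strong/JJ winds/NNS hit/VBD the/DT old/JJ town/NN ./.",
]
counts = modifier_pairs(tagged)
print(counts)
```

Each such pattern yields one column of grammatically organised collocates for a headword; further relations (object-of, subject-of, and so on) would need their own patterns.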
38
Demo
39
Clustering
  Word sketch
    Collocates organised by grammar
  Dictionary
    Collocates (and other things) organised by meaning
  How to re-organise
    Three phases
40
Semi-automatic dictionary drafting (SADD)
  Automatic clustering of collocates
    Propose senses
  Iterate:
    Lexicographer input
      Confirm/reject/edit sense inventory
      Assigns collocates / corpus lines to senses
    WSD
      Uses seeds to build full WSD for word
      Find more collocates for each sense
  XML dictionary entry
    Load into dictionary-editing tool
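
Illustration (not from the talk): a toy version of the "iterate" step, in which seed collocates supplied by a lexicographer label corpus lines, and the labelled lines then suggest further collocates for each sense. The slides do not specify the WSD algorithm, so the overlap rule below is a deliberately simple stand-in, not the SADD implementation. All names (`assign_lines`, `extend_seeds`), the stopword list, and the example lines for the headword "bank" are hypothetical.

```python
from collections import Counter, defaultdict


def assign_lines(corpus_lines, seeds):
    """Label each line with the sense whose seeds it shares most.

    Lines with no seed overlap, or with a tie between senses, get None.
    """
    assignments = {}
    for line in corpus_lines:
        tokens = set(line.lower().split())
        overlap = {sense: len(tokens & words) for sense, words in seeds.items()}
        ranked = sorted(overlap.values(), reverse=True)
        tied = len(ranked) > 1 and ranked[0] == ranked[1]
        if ranked[0] == 0 or tied:
            assignments[line] = None
        else:
            assignments[line] = max(overlap, key=overlap.get)
    return assignments


def extend_seeds(assignments, seeds, headword, stopwords, min_count=2):
    """Add words seen in >= min_count lines of one sense and in no other sense."""
    per_sense = defaultdict(Counter)
    for line, sense in assignments.items():
        if sense is not None:
            words = set(line.lower().split()) - stopwords - {headword}
            per_sense[sense].update(words)
    extended = {sense: set(words) for sense, words in seeds.items()}
    for sense, counts in per_sense.items():
        for word, count in counts.items():
            elsewhere = sum(
                per_sense[other][word] for other in per_sense if other != sense
            )
            if count >= min_count and elsewhere == 0:
                extended[sense].add(word)
    return extended


# Corpus lines and seeds invented for this sketch.
lines = [
    "the bank raised the interest on my loan",
    "she opened an account at the bank",
    "the bank charges interest on the loan",
    "the account at the bank pays interest",
    "they sat on the river bank",
    "the muddy bank of the river flooded",
    "the river burst its muddy bank",
]
seeds = {"finance": {"interest", "account"}, "river": {"river"}}
stopwords = {"the", "my", "she", "an", "at", "on", "they", "of", "its"}

assignments = assign_lines(lines, seeds)
extended = extend_seeds(assignments, seeds, headword="bank", stopwords=stopwords)
print(extended)

# A line the original seeds could not label becomes labelled after one round.
new_line = "a loan from the bank"
before = assign_lines([new_line], seeds)[new_line]
after = assign_lines([new_line], extended)[new_line]
print(before, after)
```

In the workflow on the slide, the lexicographer would confirm, reject, or edit the proposed collocates between rounds rather than accept them automatically as this sketch does.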
41
Atkins method for bilingual lexicography
  Analyse source language
    From corpus
    List all expressions that might possibly have a non-predictable translation
    Very fine grained
    Lots of collocations
    Target-language-neutral; re-usable
  Translate
  Edit to finalise dictionary
42
New English-Irish Dictionary
  Irish: Gaelic language, some native speakers, culturally important for Ireland
  Project
    To replace dictionary from 1950s
    Government-funded project
    Lexicography MasterClass (Atkins, Rundell, Kilgarriff) designed project in 2003
43
English analysis for NEID
  New project, 1st Feb 2008 to late 2010
  Contractor: Lexicography MasterClass
  12 lexicographers
  Plan
    Test SADD
    If viable, use it on industrial scale
44
Demo 2
  http://corpora.fi.muni.cz/sadd/
45
Thank you
  Sketch Engine: http://www.sketchengine.co.uk
  Lexicom workshop
    Pre-Euralex, 10-15 July, Barcelona
    http://www.iula.upf.edu/agenda/lexicom
    Pre-CICLING, Mexico, Feb 2009