1 Why the “Word Sense Disambiguation Problem” can't be solved, and what should be done instead Patrick Hanks Masaryk University, Brno Czech Republic


1 Why the “Word Sense Disambiguation Problem” can't be solved, and what should be done instead Patrick Hanks Masaryk University, Brno Czech Republic PALC - Lodz April, 2007

2 Traditional WSD procedure Take a list of senses of each word from a source (typically WordNet or LDOCE). Stipulate “disambiguation criteria” for different word senses. Apply the criteria to unseen texts. Results: few successes; many failures (e.g. unresolved ambiguities, or cases where none of the criteria are satisfied). Declare success.

Yorick Wilks A leading figure in Artificial Intelligence. His theory of preference semantics has been hugely influential (on me and Paul Procter among many others and - through us - on lexicography) Wilks rightly characterizes the Semantic Web as “the apotheosis of annotation” and asks, “But what are its semantics?” His 2005 paper (with Nancy Ide), Making Sense about Sense, gets things badly wrong. 3

What sort of inventory? “Contemporary automatic WSD assigns sense labels drawn from a pre-defined sense inventory to words in context.... If dictionaries are not a good source of sense inventories useful in NLP, where do we turn?” -- Ide and Wilks 2005 –Yes, but for mapping meaning onto use, you also need a source of syntagmatic inventories. –Only Cobuild says anything systematically about syntagmatics of each word. –The WSD people have never tried to use Cobuild. –You can’t extract information that isn’t there. 4

Ide and Wilks on lexicographers “Whatever kind of lexicographer one is dealing with,... their goal is and must be the explanation of meaning to one who does not know it.” –Ide and Wilks again –One might as well say: “Whatever kind of computational linguist one is dealing with,... their goal is and must be the translation of texts into a foreign language without human intervention.” –One of many goals of lexicographers is to compile inventories. Ask a suitably trained lexicographer to compile a syntagmatic inventory and he/she will compile one. 5

Ide and Wilks on “successful” WSD “All successful WSD has operated at... the homograph rather than the sense level... (e.g. “crane” = bird or machine)... basically those [distinctions] that can be found easily in parallel texts in different languages.” –Ide and Wilks again –You mean, like French grue, Czech jeřáb? 6

Does MT really succeed at the homograph level? Consider: –Eng. crane --> Ger. Hebewerkzeug, Kranich, Kran. Google’s t-translate makes a horrible mess of: –A crane had built its nest on the roof vs. –They used a crane to lift the goods. The words are ambiguous but the contexts are unambiguous. 7

Reformulation of the aim of WSD WSD should aim to disambiguate all uses of words that are not ambiguous in the contexts in which they are used. –There is a crane in my garden ambiguous. –A crane had built its nest on the roof and –They used a crane to lift the goods not ambiguous. 8
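The reformulated aim can be illustrated with a minimal Python sketch. It assumes a hand-made table of cue words for the two readings of crane; the cue lists and the function names are hypothetical and are not taken from any real lexicon. A context counts as unambiguous when its cue words support exactly one reading.

```python
# Hypothetical cue words for each reading of "crane" (illustration only).
CUES = {
    "bird": {"nest", "wings", "flew", "feathers"},
    "machine": {"lift", "goods", "operator", "hoist"},
}

def readings_in_context(sentence):
    """Return the readings of 'crane' that are supported by cue words in the sentence."""
    tokens = {t.strip(".,;").lower() for t in sentence.split()}
    return sorted(r for r, cues in CUES.items() if cues & tokens)

def is_ambiguous(sentence):
    """A context disambiguates the word only if exactly one reading is supported."""
    return len(readings_in_context(sentence)) != 1

print(readings_in_context("A crane had built its nest on the roof"))
print(readings_in_context("They used a crane to lift the goods."))
print(is_ambiguous("There is a crane in my garden"))
```

On this view, the garden sentence is correctly left unresolved rather than forced into one of the senses.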

What do we need? We need a dictionary of contexts. There isn’t one. 9

Tim Berners-Lee Another hugely influential figure. Inventor of the World Wide Web. Co-author (with Hendler and Lassila) of an article in Scientific American (2001) predicting “the semantic web”. 10

Semantic Web: the dream To enable computers to manipulate data meaningfully. “Most of the Web's content today is designed for humans to read, not for computer programs to manipulate meaningfully.” –Berners-Lee et al.,

Why are people so excited about the Semantic Web idea? It offers “unchecked exponential growth” of “data and information that can be processed automatically” –Berners-Lee et al., 2001 Distributed, not centrally controlled –but with scientists as ‘guardians of truth’? -Wilks “... paradoxes and unanswerable questions are a price that must be paid to achieve versatility.” –Berners-Lee et al.,

Semantic Web: the reality RDF (Resource Description Framework) handles only html-tagged entities and precisely defined items. In SW jargon “ontology” means a list of names, addresses, documents, and other tagged, defined entities. The SW does not engage with natural language. PREDICTION: If it does, then in the current state of NLP it will come unstuck. 13

Semantic Web as a librarian All efforts devoted to tagging and classifying documents. The SW currently has neither the time nor the skill needed to look inside the documents and read what they say. If the dream is to be fulfilled, then sooner or later the SW must engage with the vague, fuzzy phenomenon that is meaning in natural language. It must learn to process unstructured text. 14

Hypertext “The power of hypertext is that anything can link to anything.” –Berners-Lee et al., 2001 Yes, but we need procedures for determining (automatically) what counts as a relevant link, e.g. –Firing a person is relevant to employment law. –Firing a gun is relevant to warfare and armed robbery. 15

Precise definition does not help discover implicatures The meaning of the English noun second is vague: “a short unit of time” and “1/60 of a minute”. –Wait a second. –He looked at her for a second. It is also a very precisely defined technical term in certain scientific contexts, the basic SI unit of time: –“the duration of 9,192,631,770 cycles of radiation corresponding to the transition between two hyperfine levels of the ground state of an atom of caesium 133.” 16

Being precise about vagueness Giving a precise definition to an ordinary word removes it from ordinary language. When it is given a precise, stipulative definition, an ordinary word becomes a technical term. “An adequate definition of a vague concept must aim not at precision but at vagueness; it must aim at precisely that level of vagueness which characterizes the concept itself.” –Wierzbicka

A proposed new resource A “Pattern Dictionary” of verbs and their arguments Based on close, detailed, painstaking corpus pattern analysis (CPA) Drawing on a new, lexically based theory of language, the “theory of norms and exploitations” (TNE) 18

19 CPA (Corpus Pattern Analysis) 1. Identify usage patterns for each word. – Patterns include semantic types and lexical sets of arguments (valencies). 2. Associate a meaning (“implicature”) with each pattern (not with the word in isolation). 3. Match occurrences of the target word in unseen texts to the nearest pattern (“norm”). 4. If 2 matches are found, choose the most frequent. 5. If no match is found, it is not normal usage -- it is an exploitation of a norm (or a mistake).
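The matching steps of this procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the two patterns for fire are simplified, the frequencies and the word-to-type lexicon are hypothetical, and "nearest pattern" is reduced to exact agreement of semantic types.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pattern:
    subject: str       # semantic type expected for the subject
    verb: str
    obj: str           # semantic type expected for the direct object
    implicature: str   # meaning attached to the pattern, not to the word
    frequency: int     # hypothetical corpus frequency of the pattern

# Hypothetical lexicon: a word may realise more than one semantic type.
SEMANTIC_TYPES = {
    "dennis": {"Human"},
    "rumsfeld": {"Human"},
    "anyone": {"Human"},
    "gun": {"Firearm", "Human"},   # cf. "the fastest gun in the west"
    "enthusiasm": {"Attitude"},
}

PATTERNS = [
    Pattern("Human", "fire", "Firearm", "cause a firearm to discharge a projectile", 120),
    Pattern("Human", "fire", "Human", "discharge from employment", 80),
]

def match(subject, verb, obj):
    """Return the most frequent matching norm, or None for an exploitation or mistake."""
    s_types = SEMANTIC_TYPES.get(subject.lower(), set())
    o_types = SEMANTIC_TYPES.get(obj.lower(), set())
    candidates = [
        p for p in PATTERNS
        if p.verb == verb and p.subject in s_types and p.obj in o_types
    ]
    if not candidates:
        return None
    # If two or more norms match, choose the most frequent one.
    return max(candidates, key=lambda p: p.frequency)

print(match("Dennis", "fire", "gun").implicature)
print(match("Rumsfeld", "fire", "anyone").implicature)
print(match("Dennis", "fire", "enthusiasm"))
```

A real system would need graded similarity between a clause and a pattern rather than the all-or-nothing type test used here.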

20 Dictionaries and Ontologies “Patterns include semantic types”.... What are these? Dictionaries don’t show semantic type structure. Ontologies such as WordNet and the Brandeis Semantic Ontology (BSO) show a hierarchical structure of types, e.g. a gun, pistol, revolver, rifle, cannon, mortar, Kalashnikov,... is a: weapon < artefact < physical object (or ‘material entity’) < entity

21 Brandeis Semantic Ontology A hierarchy of semantic concepts, with links to words at the appropriate level. Example (shortened and edited): Name: gun Type: Firearm Inheritance tree: TopType > Entity > Material Entity > Artifact > Weapon > Firearm Telic: Attack with Weapon

22 Ontological reasoning EXAMPLE: If it’s a gun, it must be a weapon, an artefact, a physical object, and an entity, and it is used for attacking people and things. –Otherwise known as ‘semantic inheritance’ –So far, so good. –How useful is ontological information as a basis for verbal reasoning? –Not as useful as we would like.
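Semantic inheritance of this kind amounts to walking up a chain of parent links. The Python sketch below uses the inheritance tree given for gun on the Brandeis Semantic Ontology slide; the dictionary layout and the function names are assumptions made for illustration and do not reflect the actual BSO data format.

```python
# Parent links copied from the inheritance tree shown for "gun".
PARENT = {
    "Firearm": "Weapon",
    "Weapon": "Artifact",
    "Artifact": "Material Entity",
    "Material Entity": "Entity",
    "Entity": "TopType",
}

# Hypothetical word-to-type link.
WORD_TYPE = {"gun": "Firearm"}

def ancestors(sem_type):
    """Return all types that sem_type inherits from, nearest first."""
    chain = []
    while sem_type in PARENT:
        sem_type = PARENT[sem_type]
        chain.append(sem_type)
    return chain

def is_a(word, sem_type):
    """True if the word's type is sem_type or inherits from it."""
    own_type = WORD_TYPE[word]
    return own_type == sem_type or sem_type in ancestors(own_type)

print(ancestors("Firearm"))
print(is_a("gun", "Weapon"))
print(is_a("gun", "Human"))
```

The last call returns False, which is exactly the rigidity that becomes a problem for uses such as "the fastest gun in the west".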

23 Semantics and Usage (1) 1. He was pointing a gun at me -- is a Weapon and a Material Entity. BUT 2. A child’s toy gun -- is an Entertainment Artifact, not a Weapon. 3. The fastest gun in the west -- is a Human < Animate Entity, not a Weapon. “must be a weapon” on the previous slide is too strong; it should be “is probably a weapon”. Probabilities can be measured, using corpus data. The normal semantics of terms are constantly exploited to make new concepts (as in 2 and 3).
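Measuring such probabilities from corpus data can be as simple as taking relative frequencies over annotated uses. The Python sketch below assumes a sample in which each occurrence of gun has been labelled with a semantic type; the counts are invented purely for illustration and are not corpus results.

```python
from collections import Counter

# Hypothetical annotated sample: one semantic-type label per corpus use of "gun".
# The numbers are made up for illustration; they are NOT real corpus figures.
annotations = ["Weapon"] * 90 + ["Entertainment Artifact"] * 6 + ["Human"] * 4

def type_probabilities(labels):
    """Estimate P(semantic type | word) as a relative frequency."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {sem_type: n / total for sem_type, n in counts.items()}

probs = type_probabilities(annotations)
for sem_type, p in probs.items():
    print(f"{sem_type}: {p:.2f}")
```

With estimates of this sort, "is probably a weapon" becomes a statement that can be checked against data.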

24 Semantics and Usage (2) Knowing the exact place of a word in a semantic ontology is not enough To compute meaning, we need more info.... Another major source of semantic information (potentially) is usage: –how words go together (normally | unusually | never) How do patterns of usage (syntagmatic) mesh with the information in an ontology?

25 The Semantics of Norms Dennis closed his eyes and fired the gun –[[Human]] fires [[Firearm]] He fired a single round at the soldiers –[[Human]] fires [[Projectile]] {at [[PhysObj = Target]]} BOTH PATTERNS MEAN: [[Human]] causes [[Firearm]] to discharge [[Projectile]] towards [[Target]] Rumsfeld fires anyone who stands up to him. –[[Human 1 = Employer]] fires [[Human 2 = Employee]] MEANS: discharge from employment –The semantic roles Employer and Employee are assigned by context -- they are not part of the type structure of the language.

26 Complications and Distractions Minor senses: Reading this new book fired me with fresh enthusiasm to visit this town. –[[Event]] fire [[Human]] {with [[Attitude = Good]]} Mr. Walker fired questions at me. –[[Human 1]] fire [[Speech Act]] {at [[Human 2]]} Named inanimate entity (her): I... got back on Mabel and fired her up. –Mabel is [[Artifact]] (a motorbike, actually) –[[Human]] fire [[Artifact > Energy Production Device]] {up}

27 What do you do with a gun? Word Sketch Engine: freq. of gun: BNC 5,269; OEC 91, [Table: collocate (verb with gun as object), frequency of collocation, and salience (rank), in BNC and OEC. Recoverable rows as frequency collocate (rank): 104 fire (1); 59 point (2); 85 carry (3); 31 jump (4); 11 brandish (5); 20 wave (6); 70 hold (7); 663 aim (4); 330 load (7).]

28 Shimmering Lexical Sets (1) weapon: carry, surrender, possess, use, deploy, fire, acquire, conceal, seize,... gun: fire, carry, point, jump, brandish, wave, hold, cock, spike, load, reload,... rifle: fire, carry, sling (over one’s shoulder), load, reload, aim, drop, clean,... pistol: fire, load, level, hold, brandish, point, carry, wave,... revolver: empty, draw, hold, carry, take,...
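One way to make the "shimmering" visible is to compare the verb sets pairwise. The Python sketch below uses only the verbs listed on this slide (the lists are open-ended, so the figures are indicative at best) and measures overlap with the Jaccard coefficient, which is a choice made here for illustration and not part of CPA.

```python
# Verbs taking each noun as object, as listed on the slide (lists are incomplete).
VERBS = {
    "weapon": {"carry", "surrender", "possess", "use", "deploy", "fire",
               "acquire", "conceal", "seize"},
    "gun": {"fire", "carry", "point", "jump", "brandish", "wave", "hold",
            "cock", "spike", "load", "reload"},
    "rifle": {"fire", "carry", "sling", "load", "reload", "aim", "drop", "clean"},
    "pistol": {"fire", "load", "level", "hold", "brandish", "point", "carry", "wave"},
    "revolver": {"empty", "draw", "hold", "carry", "take"},
}

def jaccard(a, b):
    """Size of the intersection divided by size of the union of two verb sets."""
    return len(VERBS[a] & VERBS[b]) / len(VERBS[a] | VERBS[b])

# Verbs shared by every noun in the table.
shared_by_all = set.intersection(*VERBS.values())

print(f"gun ~ pistol: {jaccard('gun', 'pistol'):.2f}")
print(f"gun ~ weapon: {jaccard('gun', 'weapon'):.2f}")
print("shared by all:", shared_by_all)
```

The sets overlap substantially but never coincide, even though all five nouns sit close together in the ontology.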

29 Shimmering Lexical Sets (2) spear: thrust, hoist, carry, throw, brandish sword: wield, draw, cross, brandish, swing, sheathe, carry,... dagger: sheathe, draw, plunge, hold sabre: wield, rattle, draw knife: brandish, plunge, twist, wield bayonet: fix

30 Shimmering Lexical Sets (3) missile: fire, deploy, launch bullet: bite, fire, spray, shoot, put shell: fire, lob; crack,... round: fire, shoot;... arrow: fire, shoot, aim; paint, follow

31 Shimmering Lexical Sets (4) fire: shot, gun, bullet, rocket, missile, salvo... [[Projectile]] or [[Firearm]] carry: passenger, weight, bag, load, burden, tray, weapon, gun, cargo... [polysemous] aim: kick, measure, programme, campaign, blow, mischief, policy, rifle... [polysemous] point: finger, gun, way, camera, toe, pistol... [polysemous?] brandish: knife, sword, gun, shotgun, razor, stick, weapon, pistol... [[Weapon]] shoot: glance, bolt, Palestinian, rapid, policeman; –shoot... with: pistol, bow, bullet, gun

32 Triangulation Meanings attach to patterns, not words. A typical pattern consists of a verb and its arguments (with semantic values), thus: [[Human]] fire [[Projectile]] {from [[Firearm]]} {PREP [[Physical Object]]} Pattern elements are often omitted in actual usage. (See Porzig, FrameNet)
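Because pattern elements are often omitted, a matcher has to treat the bracketed arguments as optional. The Python sketch below encodes the pattern on this slide with required and optional slots; the slot names are hypothetical, the clause is assumed to be already parsed and typed, and types are compared by exact equality with no ontology lookup.

```python
# [[Human]] fire [[Projectile]] {from [[Firearm]]} {PREP [[Physical Object]]}
REQUIRED = {"subject": "Human", "object": "Projectile"}
OPTIONAL = {"from": "Firearm", "prep": "Physical Object"}

def matches(clause):
    """clause maps slot names to the semantic types of their fillers."""
    # Every required slot must be present with the expected type.
    if any(clause.get(slot) != sem_type for slot, sem_type in REQUIRED.items()):
        return False
    # Optional slots may be absent, but if present they must have the expected type.
    for slot, sem_type in OPTIONAL.items():
        if slot in clause and clause[slot] != sem_type:
            return False
    # Slots that the pattern does not mention are not allowed.
    return set(clause) <= set(REQUIRED) | set(OPTIONAL)

print(matches({"subject": "Human", "object": "Projectile"}))
print(matches({"subject": "Human", "object": "Projectile", "from": "Firearm"}))
print(matches({"subject": "Human", "object": "Projectile", "from": "Human"}))
```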

33 Semantic Type vs. Semantic Role [[Human]] fire [[Firearm]] {at [[PhysObj = Target]]} [[Human]] fire [[Projectile]] {at [[PhysObj = Target]]} Bond walks into our sights and fires his pistol at the audience The soldier fired a single shot at me The Italian authorities claim that three US soldiers fired at the car. –‘audience’, ‘me’, and ‘car’ have the semantic type [[Human]] and [[Vehicle]] (< [[PhysObj]]). –The context assigns them the semantic role Target.
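The distinction can be sketched in Python by keeping the two kinds of information in different places: the semantic type comes from a lexicon entry for the word, while the semantic role comes from the slot the word fills in the pattern. The small lexicon and the function name below are hypothetical.

```python
# Intrinsic semantic types for the slide's examples (hypothetical mini-lexicon).
TYPE = {"audience": "Human", "me": "Human", "car": "Vehicle"}

# Both types are treated as subtypes of PhysObj, as the slide states.
SUBTYPES_OF_PHYSOBJ = {"Human", "Vehicle"}

def analyse_target(word):
    """Analyse the filler of the {at ...} slot of 'fire'.

    The type is looked up in the lexicon; the role Target is assigned
    by the slot in the pattern, not by the word itself.
    """
    sem_type = TYPE[word]
    if sem_type not in SUBTYPES_OF_PHYSOBJ:
        raise ValueError(f"{word!r} does not fit the [[PhysObj]] slot")
    return {"word": word, "type": sem_type, "role": "Target"}

for w in ("audience", "me", "car"):
    print(analyse_target(w))
```

The same word in a different slot or a different pattern would keep its type but receive a different role.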

34 Lexical sets don’t map neatly onto semantic types calm as a transitive (causative) verb: What do you calm? 1 lexical set, 5 semantic types: –him, her, me, everyone: [[Human]] –fear, anger, temper, rage: [[Negative Feeling]] –mind: [[Psychological Entity]] –nerves, heart: [[Body Part]] (but not toes, chest, kidney) –breathing, breath: [[Living Entity Relational Process]] (but not defecation, urination) Words from at least 3 of these types are canonical members of the set of things that get calmed.
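The mismatch can be made concrete with a small Python check. The data below are the words and types listed on this slide; the function, which asks whether membership in the lexical set could be predicted from semantic type alone, is a hypothetical illustration.

```python
# Objects of "calm", grouped by semantic type, as listed on the slide.
CALM_OBJECTS = {
    "Human": ["him", "her", "me", "everyone"],
    "Negative Feeling": ["fear", "anger", "temper", "rage"],
    "Psychological Entity": ["mind"],
    "Body Part": ["nerves", "heart"],
    "Living Entity Relational Process": ["breathing", "breath"],
}

# Words the slide gives as having one of these types but NOT being calmed.
NON_MEMBERS = {
    "toes": "Body Part",
    "chest": "Body Part",
    "kidney": "Body Part",
    "defecation": "Living Entity Relational Process",
    "urination": "Living Entity Relational Process",
}

LEXICAL_SET = {w for words in CALM_OBJECTS.values() for w in words}

def type_predicts_membership(word, sem_type):
    """True if predicting membership from the semantic type alone gives the right answer."""
    predicted = sem_type in CALM_OBJECTS
    actual = word in LEXICAL_SET
    return predicted == actual

print(type_predicts_membership("heart", "Body Part"))
wrong = [w for w, t in NON_MEMBERS.items() if not type_predicts_membership(w, t)]
print(wrong)
```

Every non-member is wrongly predicted, which is why the lexical set has to be listed and cannot simply be derived from the types.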

Populating a semantic type with lexical items Pattern: [[Human 1 | Event]] calm [[Human 2 | Animal]] –Canonical lexical items for [[Human]]: –him, her, me, everyone,... –Attributes of [[Human 2]] in this context: –fear, anger, temper, rage; mind; nerves, heart; breathing, breath 35

36 Why don’t ontologies help WSD? Ontologies such as Roget and WordNet attempt to organize the lexicon as a representation of 2,500 years of Aristotelian scientific conceptualization of the universe. This is not the same as investigating how people use words to make meanings. Why ever did we think it would be?

Why WSD can’t be done (as currently formulated) Because words (in isolation) don’t have meanings. You’re looking for something that does not exist. –Words have meaning potentials. –The meaning potential of a word is activated by context (real-world context of utterance and co-text). 37

38 What should be done instead Compare each actual usage with an inventory of norms. Best match wins. Don’t look for the meaning of the word -- look for the meaning of the pattern. Distinguish conventional, prototypical usage of words (norms) from creativity (exploitations). To do this, we need an inventory of patterned norms. The Pattern Dictionary of English Verbs will be such an inventory.