How to Compute the Meaning of Natural Language Utterances
Patrick Hanks, Research Institute of Information and Language Processing, University of Wolverhampton



Goals of the tutorial
To explore the relationship between meaning and phraseology.
To explore the relationship between conventional uses of words and creative uses such as freshly coined metaphors.
To discover factors that contribute to the dynamic power of natural language, including anomalous arguments, ellipsis, and other “exploitations” of normal usage.

Procedure
We shall focus on verbs.
–We shall not assume that the analytic procedure developed for verbs is equally suitable for nouns.
We shall not invent examples.
–Instead, we shall analyse data: we shall look at large numbers of actual uses of a verb, using concordances to a very large corpus.
We shall ask questions such as:
–What patterns of normal use of this verb can we detect?
–What is the nature of a “pattern”?
–Does each pattern have a different meaning?
–What is the nature of lexical ambiguity, and why has it been so troublesome for NLP?

Patterns in Corpora
When you first open a concordance, very often some patterns of use leap out at you.
–Collocations make patterns: one word goes with another
–To see how words make meanings, we need to analyse collocations
The more you look, the more patterns you see.
BUT
When you try to formalize the patterns, you start to see more and more exceptions.
The boundaries are fuzzy and there are many outlying cases.
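A minimal Python sketch of what a concordance offers, assuming a whitespace-tokenized text: it lists keyword-in-context lines for a node word and counts the words that fall inside the context window. The function names, the window width, and the toy sentence are illustrative assumptions. The sentence is invented only to make the code runnable; the tutorial itself works from real corpus data.

```python
from collections import Counter

def concordance(tokens, node, width=2):
    """Return keyword-in-context lines as (left context, node, right context)."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == node:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append((left, tok, right))
    return lines

def collocates(lines):
    """Count every word that occurs inside the context window of the node."""
    counts = Counter()
    for left, _, right in lines:
        counts.update(word.lower() for word in left + right)
    return counts

# Toy text, invented for illustration; a real study would use a large corpus.
text = "he fired the gun and she fired the gun but the firm fired the manager"
tokens = text.split()
kwic_lines = concordance(tokens, "fired", width=2)
collocate_counts = collocates(kwic_lines)

for left, node, right in kwic_lines:
    print(" ".join(left).rjust(12), f"[{node}]", " ".join(right))
```

Raw window counts like these are dominated by function words such as "the", which is one reason the patterns that "leap out" still need careful analysis.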

Analysis of Meaning in Language
Analysis based on predicate logic is doomed to failure:
–Words are NOT building blocks in a ‘Lego set’
–A word does NOT denote ‘all and only’ members of a set
–Word meaning is NOT determined by necessary and sufficient conditions for set membership
Instead, a prototype-based approach to the lexicon is necessary:
–mapping prototypical interpretations onto prototypical phraseology

The linguistic ‘double-helix’ hypothesis
A language is a system of rule-governed behaviour.
Not one, but TWO (interlinked) sets of rules:
1. Rules governing the normal uses of words to make meanings
2. Rules governing the exploitation of norms

Exploitations
People exploit the rules of normal usage for various purposes:
For economy and speed:
–Conversation is quick
–Listeners (and readers) get bored easily
–Words that are ‘obvious’ can sometimes be omitted
To say new things (reporting discoveries)
To say old things in new ways
For rhetoric, humour, poetry, politics …

Lexicon and prototypes
Each word is typically used in one or more patterns of usage (valency + collocations).
Each pattern is associated with a meaning:
–A meaning is a set of prototypical beliefs.
–In Corpus Pattern Analysis (CPA), meanings are expressed as ‘anchored implicatures’.
–Few patterns are associated with more than one meaning.
Corpus data enables us to discover the patterns that are associated with each word.

What is a pattern? (1)
The verb is the pivot of the clause.
A pattern is a statement of the clause structure (valency) associated with a meaning of a verb,
–together with the typical semantic values of each argument.
–Arguments of verbs are populated by lexical sets of collocates.
Different semantic values of arguments activate different meanings of the verb.

What is a pattern? (2)
[[Human]] fire [[Firearm]]
[[Human]] fire [[Projectile]]
[[Human 1]] fire [[Human 2]]
[[Anything]] fire [[Human]] {with enthusiasm}
[[Human]] fire [NO OBJ]
Etc.
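One way to hold such patterns in a program is as small records that pair a valency frame with semantic types and a gloss. The sketch below is an assumption about representation, not the CPA project's actual format: the `Pattern` class, its field names, and the English glosses are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Pattern:
    verb: str
    subject: str                      # semantic type of the subject
    obj: Optional[str]                # semantic type of the object; None means [NO OBJ]
    adverbial: Optional[str] = None   # fixed adverbial, if the pattern has one
    gloss: str = ""                   # hypothetical gloss, not taken from the slides

FIRE_PATTERNS = [
    Pattern("fire", "Human", "Firearm", gloss="discharge a weapon"),
    Pattern("fire", "Human", "Projectile", gloss="shoot a projectile"),
    Pattern("fire", "Human 1", "Human 2", gloss="dismiss from employment"),
    Pattern("fire", "Anything", "Human", adverbial="with enthusiasm", gloss="inspire"),
    Pattern("fire", "Human", None, gloss="shoot"),
]

def render(pattern):
    """Write a pattern back out in the bracket notation used on the slide."""
    parts = [f"[[{pattern.subject}]]", pattern.verb]
    parts.append(f"[[{pattern.obj}]]" if pattern.obj is not None else "[NO OBJ]")
    if pattern.adverbial:
        parts.append("{" + pattern.adverbial + "}")
    return " ".join(parts)

rendered = [render(p) for p in FIRE_PATTERNS]
for line in rendered:
    print(line)
```

Keeping the gloss on the pattern rather than on the verb reflects the point that different argument types activate different meanings.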

Semantic Types and Ontology
Items in double square brackets are semantic types.
Semantic types are being gathered together into a shallow ontology.
–(This is work in progress in the current CPA project)
Each type in the ontology will (eventually) be populated with a set of lexical items on the basis of what’s in the corpus under each relevant pattern.
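Populating a type from corpus evidence can be pictured as grouping the words found in each argument slot under that slot's semantic type. This is a hedged sketch only: the `(semantic type, lexical item)` pairs are invented stand-ins for what an analyst might record, and `populate` is a hypothetical helper, not part of any CPA tool.

```python
from collections import defaultdict

# Hypothetical annotations: the semantic type of a pattern slot and the
# word actually found in that slot in one corpus line.
annotated_slots = [
    ("Firearm", "gun"),
    ("Firearm", "rifle"),
    ("Human", "soldier"),
    ("Firearm", "gun"),
    ("Human", "manager"),
]

def populate(pairs):
    """Group lexical items under the semantic type of the slot they filled."""
    ontology = defaultdict(set)
    for sem_type, item in pairs:
        ontology[sem_type].add(item)
    return {sem_type: sorted(items) for sem_type, items in ontology.items()}

lexical_sets = populate(annotated_slots)
print(lexical_sets)
```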

Shimmering lexical sets
Lexical sets are not stable – not “all and only”.
Example:
–[[Human]] attend [[Event]]
–[[Event]] = meeting, wedding, funeral, etc.
–But not thunderstorm, suicide.
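One way to model a set that is not “all and only” is to score membership per pattern from observed usage instead of inheriting it from the type. In the sketch below the counts are invented for illustration and `typicality` is a hypothetical measure (relative frequency in the object slot of "attend"); nothing here is drawn from real corpus figures.

```python
from collections import Counter

# Hypothetical counts of nouns seen as the object of "attend".
attend_objects = Counter({"meeting": 120, "wedding": 30, "funeral": 25, "conference": 25})

def typicality(noun, observed):
    """Share of the slot's observed uses filled by this noun (0.0 if never seen)."""
    total = sum(observed.values())
    return observed[noun] / total if total else 0.0

scores = {noun: typicality(noun, attend_objects)
          for noun in ("meeting", "wedding", "thunderstorm")}
print(scores)
```

Under this view "thunderstorm" can be an [[Event]] in the ontology and still score zero in this slot, because membership is judged against the pattern and not against the type alone.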

Meanings and boundaries
Boundaries of all linguistic and lexical categories are fuzzy.
–There are many borderline cases.
Instead of fussing about boundaries, we should focus on identifying prototypes.
Then we can decide what goes with what.
–Many decisions will be obvious.
–Some decisions – especially about boundary cases – will be arbitrary.

The importance of phraseology
“Many, if not most, meanings depend on the presence of more than one word for their realization.”
– John Sinclair

The Idiom Principle (Sinclair)
In word use, there is tension between the “terminological tendency” and the “phraseological tendency”:
–The terminological tendency: the tendency for words to have meaning in isolation
–The phraseological tendency: the tendency for the meaning of a word to be activated by the context in which it is used.

Computing meaning (1)
Each user of a language has a “corpus” of uses stored inside his or her head.
–These are traces of utterances that the person has seen, heard, or uttered.
Each person’s mental corpus of English (etc.) is different.
What all these “mental corpora” have in common is patterns.
By analysing a huge corpus of texts computationally, we can create a pattern dictionary for use by computers as well as by people.
In a pattern dictionary, each pattern is associated with a meaning (or a translation, or other implicature).

Computing meaning (2)
When processing unseen text, the computer compares the actual use of each verb in the text with the inventory of patterns in the pattern dictionary, using information about a) valency, and b) semantic types of collocates.
Exact matches are not to be expected.
Best match wins: the pattern dictionary provides the most probable meaning (or translation) of the word in context.
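The matching step can be sketched as scoring each pattern against the subject and object found in the text and returning the highest scorer. Everything below is an assumption made for illustration: the tiny `SEMANTIC_TYPES` lexicon, the glosses, and the one-point-per-matching-slot scoring are hypothetical, and a real system would need a parser, a populated ontology, and a better-motivated scoring scheme.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Pattern:
    subject: str            # semantic type expected in subject position
    obj: Optional[str]      # semantic type expected as object; None means no object
    meaning: str            # hypothetical gloss

# Hypothetical mini-lexicon mapping words to semantic types.
SEMANTIC_TYPES = {
    "soldier": {"Human"},
    "boss": {"Human"},
    "manager": {"Human"},
    "rifle": {"Firearm"},
    "bullet": {"Projectile"},
}

PATTERNS = [
    Pattern("Human", "Firearm", "discharge a weapon"),
    Pattern("Human", "Projectile", "shoot a projectile"),
    Pattern("Human", "Human", "dismiss from employment"),
    Pattern("Human", None, "shoot"),
]

def slot_score(word, sem_type):
    """1.0 if the word fits the slot (or both are absent), otherwise 0.0."""
    if sem_type is None or word is None:
        return 1.0 if sem_type is None and word is None else 0.0
    return 1.0 if sem_type in SEMANTIC_TYPES.get(word, set()) else 0.0

def best_match(subject, obj, patterns=PATTERNS):
    """Return the highest-scoring pattern and its score; ties go to the first listed."""
    scored = [(slot_score(subject, p.subject) + slot_score(obj, p.obj), p)
              for p in patterns]
    score, pattern = max(scored, key=lambda pair: pair[0])
    return pattern, score

exact, exact_score = best_match("boss", "manager")
partial, partial_score = best_match("company", "manager")  # "company" is not in the lexicon
print(exact.meaning, exact_score)
print(partial.meaning, partial_score)
```

The second call illustrates "exact matches are not to be expected": the unknown subject scores nothing, yet the object alone is enough to select a pattern.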