Work Group 2: Ontological Concepts for Lexical Entries.

An example (Sesana; Gur; Ghana):
hórro "at its heart, dirty"; (2) "bad"
mε- (prefix, verb > modifier): productive, but with exceptions
mεhórro "dirty" (only)
The lexical entries must list which verbs take mε-.
Proposed solution: assign identifiers to senses and subsenses, and use the subsense identifier to link mεhórro to "be dirty".
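The proposed solution can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the identifier scheme ("horro.1", "horro.2"), the class names, and the field names are hypothetical, and the gloss "be dirty" for the first sense is taken from the link target named above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Sense:
    sense_id: str  # hypothetical identifier scheme: "<entry>.<n>"
    gloss: str


@dataclass
class Entry:
    form: str
    senses: List[Sense] = field(default_factory=list)
    affix: Optional[str] = None         # affix that derives this entry, if any
    derived_from: Optional[str] = None  # sense_id of the base (sub)sense


def build_lexicon() -> Dict[str, Entry]:
    """Build the two entries from the example, keyed by form."""
    base = Entry("hórro", [Sense("horro.1", "be dirty"), Sense("horro.2", "bad")])
    # The derived modifier points at one specific sense of the base verb.
    derived = Entry("mεhórro", [Sense("mehorro.1", "dirty")],
                    affix="mε-", derived_from="horro.1")
    return {entry.form: entry for entry in (base, derived)}


def resolve_base(lexicon: Dict[str, Entry], form: str) -> Optional[Sense]:
    """Return the base sense a derived form links to, or None if underived."""
    target = lexicon[form].derived_from
    for entry in lexicon.values():
        for sense in entry.senses:
            if sense.sense_id == target:
                return sense
    return None
```

Because the link targets a sense identifier rather than the whole entry, the "bad" sense of hórro stays out of the derivation, and the presence of an explicit link records that this verb takes mε-.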

Attributes:
(1) native-speaker type (e.g. Ingush and Turkish use the infinitive; Navajo uses an inflected form)
(2) linguist-conventions type (Ingush, Turkish, and Navajo: roots)
replace with morph/morpheme - ?only when irregular (e.g. suppletion); types: suppletion, ...
morphosyntactic information could have subtypes morphology and syntax, limited, vs. ??
link media stream to transcription (MMaxwell's Form)
MM: definition, gloss, SciName
suggested elements: kinship term; cognate, reconstructions, loans/copies, source language
includes register and stylistic value: formal, informal, taboo, colloquial, child language, archaic

LexEntry (diagram; labels as shown): type = headword/lemma; MSI; head; citation form; orthog. variant; sense id; unpredictable variation; sense; transcription (phonetic, gesture); media+ (audio, video, image); example id; example gloss; semantic field; sense id; etymology; use id; access; lexical relation; scientific term; dialect; region
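One way to read the diagram is as a nested record. The sketch below is an assumption on our part: the labels come from the diagram, but the nesting (examples and semantic field under sense, media under the entry) and all Python names are hypothetical, since the original layout is not recoverable from the labels alone.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Media:
    kind: str  # "audio", "video", or "image"
    uri: str   # hypothetical locator for the media stream


@dataclass
class Example:
    example_id: str
    text: str
    gloss: str


@dataclass
class SenseRecord:
    sense_id: str
    semantic_field: Optional[str] = None
    scientific_term: Optional[str] = None
    examples: List[Example] = field(default_factory=list)


@dataclass
class LexEntry:
    headword: str
    citation_form: str
    msi: Optional[str] = None            # morphosyntactic information
    orthographic_variants: List[str] = field(default_factory=list)
    transcription: Optional[str] = None  # phonetic or gesture
    media: List[Media] = field(default_factory=list)
    senses: List[SenseRecord] = field(default_factory=list)
    etymology: Optional[str] = None
    dialect: Optional[str] = None
    region: Optional[str] = None


def to_record(entry: LexEntry) -> dict:
    """Flatten an entry into plain dicts and lists, e.g. for JSON export."""
    return asdict(entry)
```

Every part carries its own identifier or sits in its own list, so a part can be exported, linked, or omitted without touching the rest of the entry.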

Our thinking about lexical resources/structures has been dominated by print models, primarily dictionaries, less so thesauruses and encyclopedias.

We have the opportunity to design electronic, specifically web, lexical resources in new ways, combining the parts in whatever way is best for specific purposes. This suggests a highly modular design so that the parts can be combined as needed, not just for looking up the meaning or pronunciation of individual words.

The natural unit of analysis is the lexical entry, or lexeme. But each of its parts (phonetic, phonological, orthographic, morphological, syntactic, semantic, pragmatic, perhaps even etymological) is discrete, separable, and recurring.

Bell and Bird recognize this to the extent of suggesting as a data structure a set of triples L = {T_a = <F, M, S>}, where each T, F, M, S can be separately identified and combined. We can go further with this breakdown, particularly by customizing the parts covered by M for the language, providing complete paradigms, derivational patterns, etc.
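A minimal sketch of such a set of triples follows. It is not Bell and Bird's own formulation: we assume F, M, and S stand for form, morphosyntactic, and semantic parts, and the identifiers, table names, and English sample data are ours.

```python
from typing import Dict, List, NamedTuple, Set


class Triple(NamedTuple):
    f: str  # identifier of a form part
    m: str  # identifier of a morphosyntactic part
    s: str  # identifier of a semantic part


# Each part lives in its own table and is identified separately.
FORMS: Dict[str, str] = {"F1": "walk", "F2": "walks"}
MORPH: Dict[str, str] = {"M1": "V, base", "M2": "V, 3sg present"}
SEMS: Dict[str, str] = {"S1": "move on foot"}

# The lexicon L is a set of triples that combine the parts by identifier.
LEXICON: Set[Triple] = {Triple("F1", "M1", "S1"), Triple("F2", "M2", "S1")}


def forms_with_meaning(lexicon: Set[Triple], forms: Dict[str, str], s_id: str) -> List[str]:
    """Return, sorted, every form whose triple points at the given semantic part."""
    return sorted(forms[t.f] for t in lexicon if t.s == s_id)
```

Here `forms_with_meaning(LEXICON, FORMS, "S1")` returns `["walk", "walks"]`: the semantic part S1 is stored once and recurs in both triples.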

We know how to break down phonology, orthography, and to some extent, morphology and syntax into smaller units of analysis. We have had less success, consensus, and hence experience with semantics.

We have on the one hand Bloomfield-Fodor “atomism” (the unit of meaning is the meaning of the morpheme) and on the other Pustejovsky-Wierzbicka “decompositionalism” into primitive semantic units (properties and relations).

We need to come to a practical working agreement about semantic analysis. We’re being guided/driven by our friends and colleagues in computer science and artificial intelligence to do so. They are busily developing commonsense ontologies (Cyc Corp, Teknowledge) and practical reasoners, the “agents” who will work for us behind the scenes in Web transactions, for example, so I recommend that we plunge into this research area with gusto.

Conclusion: A distributed lexicon, with the parts identified and some parts pre-assembled (e.g., Bird and Bell style N-tuples), others assemblable and presentable on the fly, e.g., the inflectional paradigms for a particular stem.
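The split between pre-assembled and on-the-fly parts might look like the sketch below. It is an illustration only, using regular English verb suffixes; the cell names, the `STORED` table, and the `paradigm` function are hypothetical, and plain concatenation ignores spelling changes such as consonant doubling.

```python
from typing import Dict

# Regular suffixes, used to assemble paradigm cells on the fly.
SUFFIXES: Dict[str, str] = {"base": "", "3sg": "s", "past": "ed", "prog": "ing"}

# Pre-assembled (stored) cells for irregular stems override the rule.
STORED: Dict[str, Dict[str, str]] = {"go": {"3sg": "goes", "past": "went"}}


def paradigm(stem: str) -> Dict[str, str]:
    """Assemble the paradigm for a stem, preferring stored irregular cells."""
    overrides = STORED.get(stem, {})
    return {cell: overrides.get(cell, stem + suffix)
            for cell, suffix in SUFFIXES.items()}
```

For a regular stem such as "walk" every cell is generated; for "go" the stored cells "goes" and "went" are used and only "go" and "going" are generated.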