Formal Foundation of Lexical Functions
Sylvain Kahane, LaTTiCe/TALaNa, Université Paris 7
Alain Polguère, OLST, Université de Montréal



Workshop on Collocation, ACL 2001, Toulouse

Content
• The Meaning-Text concept of collocation
• Lexical functions
• Encodings and ECD encoding
• Explicit encoding
• Algebraic encoding
• Conclusion

The Meaning-Text concept of collocation

An example: Collocations and MT
• gros fumeur
  – literal translation: big smoker
  – actual translation: heavy smoker
• An MT system has to identify that gros fumeur is a collocation, something between a free combination and a full idiom

What is a collocation for us
• Following Meaning-Text terminology, we call collocation a linguistic expression made up of two components:
  – the base of the collocation: a full lexical unit which is “freely” chosen by the speaker on the basis of its meaning (e.g. ‘smoker’ → smoker);
  – the collocate: a lexical unit or a multilexical expression which is chosen in a (partially) arbitrary way to express a given meaning and/or a grammatical structure contingent upon the choice of the base (e.g. ‘intense’ → heavy).

Translation and dictionaries
• Rich bilingual dictionary
  – fumeur → smoker
  – gros fumeur → heavy smoker
• Minimal bilingual dictionary
  – fumeur → smoker
  + rich monolingual dictionaries
  – intensification(fumeur) = gros
  – intensification(smoker) = heavy
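
The second option (minimal bilingual dictionary plus rich monolingual dictionaries) can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary names and the function `translate_collocation` are hypothetical, and the lexicons hold nothing beyond the gros fumeur example.

```python
# Toy lexicons restricted to the example on the slide (names are hypothetical).
BILINGUAL = {"fumeur": "smoker"}                  # minimal bilingual dictionary (fr -> en)
FR_LF = {("intensification", "fumeur"): "gros"}   # French monolingual LF entries
EN_LF = {("intensification", "smoker"): "heavy"}  # English monolingual LF entries


def translate_collocation(collocate: str, base: str) -> str:
    """Translate a French collocate + base pair via the lexical function it expresses."""
    # 1. Find which lexical function links the collocate to the base.
    lf = next(name for (name, keyword), value in FR_LF.items()
              if keyword == base and value == collocate)
    # 2. Translate only the base with the bilingual dictionary.
    target_base = BILINGUAL[base]
    # 3. Look up the value of the same function for the translated base.
    target_collocate = EN_LF[(lf, target_base)]
    return f"{target_collocate} {target_base}"


print(translate_collocation("gros", "fumeur"))  # heavy smoker
```

The collocate is never translated directly: only the base goes through the bilingual dictionary, and the collocate is recomputed in the target language.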

Diversifications
• Collocations are numerous and various in nature.
• Ex: COLÈRE ‘anger’
  – colère aveugle/noire, lit. ‘blind/black anger’
  – colère sourde/froide, lit. ‘deaf/cold anger’
  – fou/ivre de colère, lit. ‘mad/drunk of anger’
  – rouge/blanc de colère, lit. ‘red/white of anger’
  – etc.

Collocations and semantic derivations
• Intensification of rain:
  – torrential (collocate)
  – downpour (semantic derivation)
  – torrential rain ~ downpour
• Semantic derivations
  – (quasi)synonymy/antonymy
  – verbal, nominal, adjectival or adverbial derivations
  – name of a participant or circumstantial, e.g. crime is linked to author [of a crime] or criminal, victim, instrument [of a crime], etc.
• Both types of lexical relation could and should be encoded by the same conceptual device

Lexical Functions (LFs)

Collocations as Functions
Concept of LF: Žolkovskij & Mel'čuk 1965
• Base-collocate relations are oriented
• Magn(rain) = torrential
  – Magn: the lexical function, expressing an intensification
  – rain: the keyword (the base)
  – torrential: the value, a collocate of the base

“Generalized” lexical unit
A lexical function f can be viewed as a “generalized” lexical unit
  – whose meaning is rather vague
  – whose signifier depends on the keyword
Magn ‘intense’ + SMOKER ‘smoker’ → heavy smoker

Encodings and ECD Encoding

Encodings
• Encoding of LFs = correspondence between the set of LFs and a formal language such that any natural operation on LFs is associated with an operation in this formal language:
  if h is the combination of f and g, then encod(h) = encod(f) × encod(g)

ECD encoding
• Explanatory Combinatorial Dictionaries (ECDs):
  – Mel'čuk & Zholkovsky 1984 (Russian)
  – Mel'čuk et al. 1984, 1988, 1992, 1999 (French)
• ECD encoding is based on linguistic paraphrases:
  IncepOper₁(disease) = to contract
  to contract a disease = to begin (Incep) to experience (Oper₁) a disease
• It combines:
  – semantic content (the paraphrase)
  – and syntactic frame (the 1 of Oper₁)

Explicit Encoding

A more explicit encoding
• Each LF is encoded by a pair: (semantic content, syntactic frame)
• semantic content = predicate formula
• syntactic frame = part of speech (pos) + syntactic valency
• Advantage: everything is explicit, both the meaning and the syntactic behavior

Semantic content (1)
• Primitive LF meanings:
  Incep[X]: ‘X begins’
  Non[X]: ‘X does not hold’
  Caus[X,Y]: ‘X causes Y’
  Magn[X]: ‘X is intense’
  AntiMagn[X]: ‘X is little’
  Plus[X]: ‘X increases’
  Minus[X]: ‘X decreases’
  Fact[X]: ‘X functions’
  Real[X,Y]: ‘X realizes Y’
  Manif[X,(Y)]: ‘X manifests itself (in Y)’
  Sympt[X,Y]: ‘X takes place, revealed by Y’
• #: keyword
• 1, 2, 3: semantic actants of the keyword
• Ω: additional semantic actant

Semantic content (2)
• Real[1,#]^Magn: ‘1 intensively realizes #’
  ex: X déchaîne sa colère sur Y ‘X unleashes his anger on Y’
  [Figure: semantic graph of Real[1,#]^Magn]
• Caus[1,Minus[Manif[#]]]: ‘X causes a decrease in the manifestation of #’
  ex: X étouffe sa colère, lit. ‘X suffocates his anger’

Syntactic frame
• (#, V[1,#])(colère) = éprouver ‘to feel’
  ex: Jean éprouve une grande colère ‘Jean feels great anger’
• (#, V[2,#])(colère) = encourir ‘to incur’
  ex: Pierre encourt sa colère (la colère de Jean) ‘Pierre incurs his anger (Jean's anger)’
• (#, V[#,1])(colère) = habiter ‘to live in’
  ex: Une grande colère habite Jean, lit. ‘A great anger lives in Jean’

Explicit encoding: Examples

value | sem | synt
être [en ~], éprouver [ART ~] | # | V[1,#]
bouillir, bouillonner [(de (ART) ~)] | #^Magn | V[1,#]
fulminer [(de ~ contre N=Y)] | #^Magn | V[1,#,2]
se fâcher | Incep[#] | V[1]
se mettre, “fam” se foutre [en ~] | Incep[#] | V[1,#]
serrer [les poings de ~] | Sympt[#,‘poings’/‘dents’] | V[1,µ,#]
bégayer, “fam” bafouiller [de ~] | Sympt[#,‘parole’] | V[1,#]
aveugle, folle, sauvage | {#}^Magn | A[#^]
sourde, froide, rentrée | {#}^Non[Manif[#]] | A[#^]
empreint, chargé [de Ø/ART ~] | {Ω}^Manif[#,Ω] | A[Ω^,#]
fou [de ~] | {1}^(#^Magn^Manif[#]) | A[1^,#]
blanc, blême, pâle [de ~] | {µ/1}^Sympt[#,‘visage’] | A[µ/1^,#]
rouge<écarlate<cramoisi [de ~] | {µ/1}^Sympt[#,‘visage’] | A[µ/1^,#]
[yeux] exorbités [(de ~)] | {µ}^Sympt[#,‘yeux’] | A[µ^,#]

Algebraic Encoding

Algebraic structure
• An LF is encoded by an algebraic expression
• The algebraic encoding can be defined from the explicit encoding:
  – simple LFs
  – algebraic operations: product, fusion
• Advantages:
  – algebraic structure of the set of LFs
  – similar to natural language (cf. ECD)

Simple LFs
• Func₀ := (#, V[#])
• Funcᵢ := (#, V[#,i])
• Operᵢ := (#, V[i,#])
• Caus := (Caus[Ω,#], V[Ω,#])
• Incep := (Incep[#], V[#])
• Magn := ({#}^Magn, A[#^])
• Aᵢ := ({i}^#, A[i^,#])
• Manif := (Manif[#,Ω], V[#,Ω])
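
As a rough illustration, the (semantic content, syntactic frame) pairs above could be stored as plain data. The class `ExplicitLF` and the helpers `oper` and `func` are hypothetical names introduced only for this sketch, and formulas are kept as uninterpreted strings.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ExplicitLF:
    """Hypothetical container for one explicitly encoded LF."""
    sem: str                  # semantic content, e.g. "Incep[#]"
    pos: str                  # part of speech of the value: "V" or "A"
    valency: Tuple[str, ...]  # syntactic valency, e.g. ("1", "#")

    def __str__(self) -> str:
        return f"({self.sem}, {self.pos}[{','.join(self.valency)}])"


def oper(i: int) -> ExplicitLF:
    """Oper_i := (#, V[i,#])."""
    return ExplicitLF("#", "V", (str(i), "#"))


def func(i: int) -> ExplicitLF:
    """Func_0 := (#, V[#]) and Func_i := (#, V[#,i]) for i > 0."""
    return ExplicitLF("#", "V", ("#",) if i == 0 else ("#", str(i)))


INCEP = ExplicitLF("Incep[#]", "V", ("#",))
MAGN = ExplicitLF("{#}^Magn", "A", ("#^",))

print(oper(1))  # (#, V[1,#])
print(func(0))  # (#, V[#])
print(MAGN)     # ({#}^Magn, A[#^])
```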

Composition vs. product
[Figure: diagrams of Oper₁(L), the composition Caus₂(Oper₁(L)) and the product Caus₂.Oper₁(L), with links labelled "meaning" and "diathesis"]
• Caus₂.Oper₁(L) ⊕ L: ‘2 causes that 1 experiences L’
• ex: 2 met 1 en colère, lit. ‘2 puts 1 in anger’ (L = colère)

Product: Definition
Let f and g be two collocational LFs. The product h = g.f is a collocational LF such that:
  – h(L) is a collocate of L
  – h(L) is a paraphrase of g(f(L)) ⊕ f(L)
• g.f := (c(g):# ← c(f), pos(g)[d(g):# ← d(f)])
  where, for an LF f, c(f) is the semantic content, pos(f) the part of speech and d(f) the bracketed valency of the syntactic frame, and X:# ← Y stands for X with # replaced by Y

Product: Examples
• f := (c(f), pos(f)[d(f)])
• Incep := (Incep[#], V[#])
• Incep.f := (Incep[c(f)], V[d(f)])
• f = Oper₁ := (#, V[1,#])
• Incep.Oper₁ := (Incep[#], V[1,#])
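
The Incep.Oper₁ example can be reproduced with a small sketch of the product, read here as substituting the content and valency of f for # in those of g. This is an assumption-laden illustration, not the authors' implementation: `LF` and `product` are hypothetical names, and the substitution is naive string and tuple replacement.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LF:
    """Hypothetical explicit encoding: (sem, pos[valency])."""
    sem: str
    pos: str
    valency: Tuple[str, ...]


def product(g: LF, f: LF) -> LF:
    """Return g.f: put c(f) in place of '#' in c(g) and d(f) in place of '#' in d(g)."""
    sem = g.sem.replace("#", f.sem)
    valency = []
    for slot in g.valency:
        if slot == "#":
            valency.extend(f.valency)  # the keyword slot expands to f's valency
        else:
            valency.append(slot)
    return LF(sem, g.pos, tuple(valency))


INCEP = LF("Incep[#]", "V", ("#",))
OPER1 = LF("#", "V", ("1", "#"))

print(product(INCEP, OPER1))
# LF(sem='Incep[#]', pos='V', valency=('1', '#'))
```

The result keeps the part of speech of g, which matches the slide's Incep.Oper₁ := (Incep[#], V[1,#]).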

Fusion
The operation of fusion associates to each collocational LF f a derivational LF //f such that //f(L) is a paraphrase of L ⊕ f(L)
• Incep.Oper₁ := (Incep[#], V[1,#]): se mettre en colère
• //(Incep.Oper₁) := (Incep[#], V[1]): se fâcher
• # = colère
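
In the example above, fusion leaves the semantic content unchanged and drops the keyword from the valency. A minimal sketch of that reading, with the hypothetical names `LF` and `fusion`, and handling only a bare # slot in the valency:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LF:
    """Hypothetical explicit encoding: (sem, pos[valency])."""
    sem: str
    pos: str
    valency: Tuple[str, ...]


def fusion(f: LF) -> LF:
    """Return //f: same semantic content, keyword slot '#' removed from the valency."""
    return LF(f.sem, f.pos, tuple(slot for slot in f.valency if slot != "#"))


INCEP_OPER1 = LF("Incep[#]", "V", ("1", "#"))  # se mettre [en colère]

print(fusion(INCEP_OPER1))
# LF(sem='Incep[#]', pos='V', valency=('1',))   i.e. se fâcher
```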

Conclusion

Conclusion (1)
• Our main purpose is to make available formalisms that would be computationally tractable, that is, suitable for:
  – applications such as MT and text generation
  – maintenance and development of lexical databases

Conclusion (2)
• What encoding for what use
  – Explicit encoding: for computers
  – Algebraic/ECD encoding: for lexicographers and theoretical investigations
  – Linguistic paraphrases: for human readers
• Explicit encoding and linguistic paraphrases can be computed from the algebraic encoding

Thank you

LAF encoding
• A “controlled” natural language encoding for a general public dictionary (for French): LAF (Lexique Actif du Français), Mel'čuk & Polguère, to appear
• Ex: For a noun of feeling:
  Oper₁ → [X] experiences #
  Oper₂ → [Y] is the target of #
  Oper₃ → [Z] is the reason for #
  Func₀ → [#] takes place
  Func₁ → [#] is in X
  Func₂ → [#] is targeting Y
  //A₁.f → who f
  //A₂.f → whom f

Granularity
• A lexical function denotes a set of pairs of LUs.
• An encoding defines a partition of the sets of pairs of LUs.
• Granularity = fineness of the partition
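
The idea of granularity can be illustrated with a toy sketch: the same (keyword, value) pairs are grouped by two encodings of different fineness. The three pairs and their encodings come from the colère examples; the function `partition` and the choice of "syntactic frame only" as the coarser encoding are assumptions made for this illustration.

```python
from collections import defaultdict

# (keyword, value) -> (semantic content, syntactic frame), from the colère examples.
ENCODED = {
    ("colère", "être"): ("#", "V[1,#]"),
    ("colère", "éprouver"): ("#", "V[1,#]"),
    ("colère", "bouillir"): ("#^Magn", "V[1,#]"),
}


def partition(pairs, key):
    """Group pairs into blocks; two pairs share a block when key labels them alike."""
    blocks = defaultdict(set)
    for pair in pairs:
        blocks[key(pair)].add(pair)
    return list(blocks.values())


coarse = partition(ENCODED, key=lambda p: ENCODED[p][1])  # syntactic frame only
fine = partition(ENCODED, key=lambda p: ENCODED[p])       # semantic content + frame

print(len(coarse), len(fine))  # 1 2
```

The finer encoding separates bouillir (intensified) from être and éprouver, which the coarser one lumps together.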