Formal Foundation of Lexical Functions
Sylvain Kahane, LaTTiCe/TALaNa, Université Paris 7
Alain Polguère, OLST, Université de Montréal
Workshop on Collocation, ACL 2001, Toulouse
Content
• The Meaning-Text concept of collocation
• Lexical functions
• Encodings and ECD encoding
• Explicit encoding
• Algebraic encoding
• Conclusion
The Meaning-Text concept of collocation
An example: Collocations and machine translation (MT)
• gros fumeur
  – literal translation: big smoker
  – actual translation: heavy smoker
• An MT system has to identify that gros fumeur is a collocation, that is, something between a free combination and a full idiom.
What a collocation is for us
• Following Meaning-Text terminology, we call a collocation a linguistic expression made up of two components:
  – the base of the collocation: a full lexical unit which is “freely” chosen by the speaker on the basis of its meaning (e.g. ‘smoker’ → smoker);
  – the collocate: a lexical unit or a multilexical expression which is chosen in a (partially) arbitrary way to express a given meaning and/or a grammatical structure, contingent upon the choice of the base (e.g. ‘intense’ → heavy).
Translation and dictionaries
• Rich bilingual dictionary
  – fumeur → smoker
  – gros fumeur → heavy smoker
• Minimal bilingual dictionary
  – fumeur → smoker
  + rich monolingual dictionaries
  – intensification(fumeur) = gros
  – intensification(smoker) = heavy
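The second option above (a minimal bilingual dictionary plus rich monolingual dictionaries) can be sketched in a few lines of Python. This is only an illustration: the dictionary names and the function `translate_intensified` are hypothetical, and the data is limited to the entries shown on the slide.

```python
# Toy data taken from the slide; the variable names are illustrative assumptions.
BILINGUAL = {"fumeur": "smoker"}   # minimal bilingual dictionary (French -> English)
MAGN_FR = {"fumeur": "gros"}       # intensification(fumeur) = gros
MAGN_EN = {"smoker": "heavy"}      # intensification(smoker) = heavy


def translate_intensified(base_fr: str, collocate_fr: str) -> str:
    """Translate a French 'intensifier + base' collocation via the base only."""
    # Step 1: check that the collocate is the recorded intensifier of the base.
    if MAGN_FR.get(base_fr) != collocate_fr:
        raise ValueError(f"{collocate_fr!r} is not a known intensifier of {base_fr!r}")
    # Step 2: translate the base, which is the only freely chosen component.
    base_en = BILINGUAL[base_fr]
    # Step 3: pick the intensifier that the English base itself selects.
    return f"{MAGN_EN[base_en]} {base_en}"


print(translate_intensified("fumeur", "gros"))  # heavy smoker
```

The collocate gros is never translated directly: it is recomputed on the English side from the base, which is why the literal "big smoker" is avoided.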
Diversifications
• Collocations are numerous and varied in nature.
• Ex: COLÈRE ‘anger’
  – colère aveugle/noire, lit. ‘blind/black anger’
  – colère sourde/froide, lit. ‘deaf/cold anger’
  – fou/ivre de colère, lit. ‘mad/drunk of anger’
  – rouge/blanc de colère, lit. ‘red/white of anger’
  – etc.
Collocations and semantic derivations
• Intensification of rain:
  – torrential (collocate)
  – downpour (semantic derivation)
  – torrential rain ~ downpour
• Semantic derivations
  – (quasi-)synonymy/antonymy
  – verbal, nominal, adjectival or adverbial derivations
  – name of a participant or circumstantial, e.g. crime is linked to author [of a crime] or criminal, victim, instrument [of a crime], etc.
• Both types of lexical relations could and should be encoded by the same conceptual device.
Lexical Functions (LFs)
Collocations as Functions
Concept of LF: Žolkovskij & Mel'čuk 1965
• Base-collocate relations are oriented:
  Magn(rain) = torrential
  – Magn: lexical function, expressing an intensification
  – rain: keyword (the base)
  – torrential: value (a collocate of the base)
“Generalized” lexical unit
A lexical function f can be viewed as a “generalized” lexical unit
  – whose meaning is rather vague
  – whose signifier depends on the keyword
Ex: Magn (‘intense’) + SMOKER (‘smoker’) → heavy smoker
Encodings and ECD Encoding
Encodings
• Encoding of LFs = a correspondence between the set of LFs and a formal language such that any natural operation on LFs is associated with an operation in this formal language:
  if h = combination of f and g, then encod(h) = encod(f) × encod(g)
ECD encoding
• Explanatory Combinatorial Dictionaries (ECDs):
  – Mel'čuk & Zholkovsky 1984 (Russian)
  – Mel'čuk et al. 1984, 1988, 1992, 1999 (French)
• ECD encoding is based on linguistic paraphrases:
  IncepOper₁(disease) = to contract
  to contract a disease = to begin (Incep) to experience (Oper₁) a disease
• It combines:
  – semantic content (the paraphrase)
  – and syntactic frame (the 1 of Oper₁)
Explicit Encoding
A more explicit encoding
• Each LF is encoded by a pair: (semantic content, syntactic frame)
• semantic content = predicate formula
• syntactic frame = part of speech (pos) + syntactic valency
• Advantage: everything is explicit, both the meaning and the syntactic behavior
Semantic content (1)
• Primitive LF meanings:
  Incep[X]: ‘X begins’
  Plus[X]: ‘X increases’
  Non[X]: ‘X does not hold’
  Minus[X]: ‘X decreases’
  Caus[X,Y]: ‘X causes Y’
  Fact[X]: ‘X functions’
  Magn[X]: ‘X is intense’
  Real[X,Y]: ‘X realizes Y’
  AntiMagn[X]: ‘X is little’
  Manif[X,(Y)]: ‘X manifests itself (in Y)’
  Sympt[X,Y]: ‘X takes place, revealed by Y’
• #: keyword
• 1, 2, 3: semantic actants of the keyword
• Ω: additional semantic actant
Semantic content (2)
• Real[1,#]^Magn: ‘1 intensively realizes #’
  ex: X déchaîne sa colère sur Y ‘X unleashes his anger on Y’
  [Figure: semantic graph of Real[1,#]^Magn]
• Caus[1,Minus[Manif[#]]]: ‘1 causes a decrease in the manifestation of #’
  ex: X étouffe sa colère, lit. ‘X suffocates his anger’
Syntactic frame
• (#, V[1,#])(colère) = éprouver ‘to feel’
  ex: Jean éprouve une grande colère ‘Jean feels great anger’
• (#, V[2,#])(colère) = encourir ‘to incur’
  ex: Pierre encourt sa colère (la colère de Jean) ‘Pierre incurs his anger (Jean's anger)’
• (#, V[#,1])(colère) = habiter ‘to live in’
  ex: Une grande colère habite Jean, lit. ‘A great anger lives in Jean’
Explicit encoding: Examples
value | sem | synt
être [en ~], éprouver [ART ~] | # | V[1,#]
bouillir, bouillonner [(de (ART) ~)] | #^Magn | V[1,#]
fulminer [(de ~ contre N=Y)] | #^Magn | V[1,#,2]
se fâcher | Incep[#] | V[1]
se mettre, “fam” se foutre [en ~] | Incep[#] | V[1,#]
serrer [les poings de ~] | Sympt[#,‘poings’/‘dents’] | V[1,µ,#]
bégayer, “fam” bafouiller [de ~] | Sympt[#,‘parole’] | V[1,#]
aveugle, folle, sauvage | {#}^Magn | A[#^]
sourde, froide, rentrée | {#}^Non[Manif[#]] | A[#^]
empreint, chargé [de Ø/ART ~] | {Ω}^Manif[#,Ω] | A[Ω^,#]
fou [de ~] | {1}^(#^Magn^Manif[#]) | A[1^,#]
blanc, blême, pâle [de ~] | {µ/1}^Sympt[#,‘visage’] | A[µ/1^,#]
rouge<écarlate<cramoisi [de ~] | {µ/1}^Sympt[#,‘visage’] | A[µ/1^,#]
[yeux] exorbités [(de ~)] | {µ}^Sympt[#,‘yeux’] | A[µ^,#]
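As a rough illustration of why the explicit encoding suits computers, a few rows of the examples above can be stored as (sem, synt) pairs and indexed. The class name `ExplicitLF`, the function `values_by_encoding` and the choice of rows are assumptions made for this sketch; the formulas are kept as plain strings rather than parsed.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ExplicitLF:
    """An LF in explicit encoding: a pair (semantic content, syntactic frame)."""
    sem: str   # predicate formula, e.g. "Incep[#]"
    synt: str  # part of speech + valency, e.g. "V[1,#]"


# A handful of values for the keyword colère, copied from the slide.
ENTRIES = [
    ("éprouver", ExplicitLF("#", "V[1,#]")),
    ("bouillir", ExplicitLF("#^Magn", "V[1,#]")),
    ("se fâcher", ExplicitLF("Incep[#]", "V[1]")),
    ("se mettre", ExplicitLF("Incep[#]", "V[1,#]")),
    ("aveugle", ExplicitLF("{#}^Magn", "A[#^]")),
]


def values_by_encoding(entries):
    """Group values under their (sem, synt) pair."""
    index = defaultdict(list)
    for value, lf in entries:
        index[lf].append(value)
    return dict(index)


index = values_by_encoding(ENTRIES)
print(index[ExplicitLF("Incep[#]", "V[1,#]")])  # ['se mettre']
```

Because both components are explicit, se fâcher and se mettre [en ~] share the same meaning field and differ only in their syntactic frame.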
Algebraic Encoding
Algebraic structure
• An LF is encoded by an algebraic expression
• The algebraic encoding can be defined from the explicit encoding:
  – simple LFs
  – algebraic operations: product, fusion
• Advantages:
  – algebraic structure of the set of LFs
  – similar to natural language (cf. ECD)
Simple LFs
Func₀ := (#, V[#])
Funcᵢ := (#, V[#,i])
Operᵢ := (#, V[i,#])
Caus := (Caus[Ω,#], V[Ω,#])
Incep := (Incep[#], V[#])
Magn := ({#}^Magn, A[#^])
Aᵢ := ({i}^#, A[i^,#])
Manif := (Manif[#,Ω], V[#,Ω])
Composition vs. product
• Composition: Caus₂(Oper₁(L)), where Caus₂ applies to Oper₁(L)
• Product: Caus₂.Oper₁(L), which applies to L
  – meaning: ‘2 causes that 1 experiences L’
  – ex: 2 met 1 en colère, lit. ‘2 puts 1 in anger’ (L = colère)
[Figure: diagrams contrasting composition and product, labelled “meaning” and “diathesis”]
Product: Definition
Let f and g be two collocational LFs. The product h = g.f is a collocational LF such that:
  – h(L) is a collocate of L
  – h(L) is a paraphrase of the combination g(f(L)) + f(L)
• g.f := (c(g):# → c(f), pos(g)[d(g):# → d(f)])
  where c is the semantic content, pos the part of speech and d the diathesis, and x:# → y denotes x with # replaced by y
Product: Examples
f := (c(f), pos(f)[d(f)])
Incep := (Incep[#], V[#])
Incep.f := (Incep[c(f)], V[d(f)])
f = Oper₁ := (#, V[1,#])
Incep.Oper₁ := (Incep[#], V[1,#])
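The product can be sketched as two substitutions of the keyword slot #, following the definition and the Incep.Oper₁ example. The class `LF` and the function `product` are hypothetical names, and plain string replacement is a simplifying assumption: it reproduces the example on the slide but has not been checked against LFs that use Ω or adjectival frames.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LF:
    """Explicit encoding of an LF f as (c(f), pos(f)[d(f)])."""
    content: str    # c(f): semantic content, with '#' for the keyword
    pos: str        # pos(f): part of speech
    diathesis: str  # d(f): comma-separated valency slots

    def __str__(self) -> str:
        return f"({self.content}, {self.pos}[{self.diathesis}])"


def product(g: LF, f: LF) -> LF:
    """Compute g.f by replacing '#' in c(g) with c(f) and in d(g) with d(f)."""
    return LF(
        content=g.content.replace("#", f.content),
        pos=g.pos,  # the product keeps the part of speech of g
        diathesis=g.diathesis.replace("#", f.diathesis),
    )


INCEP = LF("Incep[#]", "V", "#")  # Incep := (Incep[#], V[#])
OPER1 = LF("#", "V", "1,#")       # Oper1 := (#, V[1,#])

print(product(INCEP, OPER1))  # (Incep[#], V[1,#])
```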
Fusion
The operation of fusion associates with each collocational LF f a derivational LF //f such that //f(L) is a paraphrase of the combination L + f(L).
Incep.Oper₁ := (Incep[#], V[1,#])    ex: se mettre en colère
//(Incep.Oper₁) := (Incep[#], V[1])    ex: se fâcher
(# = colère)
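A minimal sketch of fusion on the explicit encoding follows. It assumes, from the single Incep.Oper₁ example on the slide only, that fusion removes the keyword slot # from the diathesis and leaves the semantic content unchanged; the names `LF` and `fuse` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LF:
    """Explicit encoding of an LF as (content, pos[diathesis])."""
    content: str    # semantic content, with '#' for the keyword
    pos: str        # part of speech
    diathesis: str  # comma-separated valency slots

    def __str__(self) -> str:
        return f"({self.content}, {self.pos}[{self.diathesis}])"


def fuse(f: LF) -> LF:
    """Return //f: the keyword is no longer expressed as a separate dependent.

    Assumption: only the '#' slot of the diathesis is dropped.
    """
    slots = [slot for slot in f.diathesis.split(",") if slot != "#"]
    return LF(f.content, f.pos, ",".join(slots))


INCEP_OPER1 = LF("Incep[#]", "V", "1,#")  # se mettre en colère
print(fuse(INCEP_OPER1))                  # (Incep[#], V[1])
```

The result matches the encoding of se fâcher, which expresses the keyword colère inside the verb itself.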
Conclusion
Conclusion (1)
• Our main purpose is to make available formalisms that are computationally tractable, that is, suitable for:
  – applications such as MT and text generation
  – maintenance and development of lexical databases
Conclusion (2)
• Which encoding for which use?
  – Explicit encoding: for computers
  – Algebraic/ECD encoding: for lexicographers and theoretical investigations
  – Linguistic paraphrases: for human readers
• Explicit encoding and linguistic paraphrases can be computed from the algebraic encoding
Thank you
LAF encoding
• A “controlled” natural language encoding for a general-public dictionary (for French): LAF (Lexique Actif du Français), Mel'čuk & Polguère, to appear
• Ex: for a noun of feeling:
  Oper₁: [X] experiences #
  Oper₂: [Y] is the target of #
  Oper₃: [Z] is the reason for #
  Func₀: [#] takes place
  Func₁: [#] is in X
  Func₂: [#] is targeting Y
  //A₁.f: who f
  //A₂.f: whom f
Granularity
• A lexical function denotes a set of pairs of lexical units (LUs).
• An encoding defines a partition of the set of pairs of LUs.
• Granularity = fineness of the partition
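The notion of granularity can be made concrete with a small sketch. It compares the partition induced by the full explicit encoding (sem, synt) with the one induced by a coarser encoding that keeps only the semantic content. The coarser encoding is a hypothetical one introduced for illustration, as are the names `PAIRS`, `partition` and `is_finer`; the three pairs come from the colère examples.

```python
from collections import defaultdict

# (keyword, value) pairs with their explicit encoding (sem, synt).
PAIRS = {
    ("colère", "éprouver"): ("#", "V[1,#]"),
    ("colère", "se fâcher"): ("Incep[#]", "V[1]"),
    ("colère", "se mettre"): ("Incep[#]", "V[1,#]"),
}


def partition(encode):
    """Group the pairs into blocks that receive the same code under `encode`."""
    blocks = defaultdict(set)
    for pair, code in PAIRS.items():
        blocks[encode(code)].add(pair)
    return {frozenset(block) for block in blocks.values()}


def is_finer(p, q):
    """True if every block of p is contained in some block of q."""
    return all(any(block <= other for other in q) for block in p)


full = partition(lambda code: code)          # explicit encoding: (sem, synt)
sem_only = partition(lambda code: code[0])   # hypothetical coarser encoding

print(len(full), len(sem_only))    # 3 2
print(is_finer(full, sem_only))    # True
print(is_finer(sem_only, full))    # False
```

Under the coarser encoding, se fâcher and se mettre [en ~] fall into the same block, so it has a lower granularity than the explicit encoding.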