Synonymies and conceptual vectors. NLPRS 2001. Mathieu Lafourcade, Violaine Prince, LIRMM - France
Overview & Objectives
  Why synonymy?
  What: conceptual vectors
  Which synonymies?
  For what: use with lexical functions
Objectives
  Evaluation: relative synonymy, subjective synonymy
  Semantic proximity to possible contexts, for lexical interchangeability
  Relative synonymy: elimination of transitivity; punctum proximum
  Subjective synonymy: punctum remotum
Conceptual vectors: vector space
  An idea = a combination of concepts = a vector
  Idea space = vector space
  A concept = an idea = a vector V
  With augmentation: V + neighborhood
  Meaning space = vector space + {v}*
Conceptual vectors: thesaurus
  H: thesaurus hierarchy with K concepts
  Thesaurus Larousse: K = 873 concepts
  V(Ci) = <a1, …, ai, …, a873>, with aj = 1 / (2 ** Dum(H, i, j))
  [Figure: example hierarchy with component values 1, 1/4, 1/16, 1/64 and distances 2, 4, 6]
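The formula aj = 1 / (2 ** Dum(H, i, j)) can be tried on a small tree. The following Python sketch is only an illustration: the toy hierarchy, the concept names and the choice of Dum as the number of edges between two concepts are assumptions, not the Larousse thesaurus or the authors' implementation.

```python
# Toy thesaurus hierarchy (hypothetical): each node maps to its parent.
PARENT = {
    "world": None,
    "society": "world",
    "nature": "world",
    "peace": "society",
    "conflict": "society",
    "animals": "nature",
    "plants": "nature",
}
# The leaf concepts play the role of the K dimensions of the space.
LEAVES = ["peace", "conflict", "animals", "plants"]


def ancestors(node):
    """Return the path from node up to the root, node included."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_distance(a, b):
    """Assumed Dum: number of edges on the tree path between a and b."""
    path_a, path_b = ancestors(a), ancestors(b)
    common = set(path_a) & set(path_b)
    # Steps from each node up to the lowest common ancestor.
    up_a = next(i for i, n in enumerate(path_a) if n in common)
    up_b = next(i for i, n in enumerate(path_b) if n in common)
    return up_a + up_b


def concept_vector(concept):
    """V(Ci) = <a1, ..., aK> with aj = 1 / (2 ** Dum(H, i, j))."""
    return [1 / 2 ** tree_distance(concept, other) for other in LEAVES]


print(concept_vector("peace"))  # [1.0, 0.25, 0.0625, 0.0625]
```

With this assumed distance, the concept itself gets weight 1, its sibling 1/4 and concepts in the other branch 1/16, the same kind of values as in the slide's figure.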
Conceptual vectors: concept c4:peace
  [Figure: vector of the concept c4:peace; labels on the plot: conflict relations, hierarchical relations; The world, manhood, society]
Conceptual vectors: term “peace”
  [Figure: vector of the term “peace”, labelled c4:peace]
Angular distance
  DA(x, y) = angle(x, y)
  0 ≤ DA(x, y) ≤ π
  If 0, then x and y are collinear: same idea
  If π/2, then nothing in common
  If π, then DA(x, -x), with -x the anti-idea of x
  [Figure: vectors x, x’ and y]
Angular distance
  DA(x, y) = acos(sim(x, y)) = acos(x·y / (|x| |y|))
  DA(x, x) = 0
  DA(x, y) = DA(y, x)
  DA(x, y) + DA(y, z) ≥ DA(x, z)
  DA(0, 0) = 0 and DA(x, 0) = π/2 by definition
  DA(λx, μy) = DA(x, y) with λμ > 0
  DA(λx, μy) = π - DA(x, y) with λμ < 0
  DA(x+x, x+y) = DA(x, x+y) ≤ DA(x, y)
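The definition and some of the properties above can be checked numerically. This is a minimal Python sketch using plain lists as vectors; the function name `angular_distance` and the sample vectors are illustrative assumptions.

```python
import math


def angular_distance(x, y):
    """DA(x, y) = acos(x.y / (|x| |y|)), with the conventions for the null vector."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 and norm_y == 0:
        return 0.0            # DA(0, 0) = 0 by definition
    if norm_x == 0 or norm_y == 0:
        return math.pi / 2    # DA(x, 0) = pi/2 by definition
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    # Clamp to [-1, 1] to absorb floating-point rounding before acos.
    return math.acos(max(-1.0, min(1.0, cosine)))


# Arbitrary sample vectors (hypothetical values).
x, y = [1.0, 0.0, 2.0], [0.5, 1.0, 0.0]
d = angular_distance(x, y)

# Scaling both vectors by factors of the same sign leaves the angle unchanged.
same_sign = angular_distance([3 * a for a in x], [2 * b for b in y])
# Factors of opposite signs give the supplementary angle, pi - DA(x, y).
opposite_sign = angular_distance([-3 * a for a in x], [2 * b for b in y])

print(d, same_sign, opposite_sign)
```

The clamp before `math.acos` is a practical precaution: rounding can push the cosine of two collinear vectors slightly above 1, which would otherwise raise a `ValueError`.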
Thematic distance: examples
  DA(tit, tit) = 0
  DA(tit, passerine) = 0.4
  DA(tit, bird) = 0.7
  DA(tit, train) = 1.14
  DA(tit, insect) = 0.62
  tit = insectivorous passerine bird …
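Relative synonymy
  Aspectual or referential
  Term polysemy, e.g. the French verb “trier”:
    “un personnel trié sur le volet”: hand-picked staff (CHOISIR, to choose)
    “une liste triée par ordre alphabétique”: a list sorted in alphabetical order (ORDONNER, to order)
    “le courrier est trié”: the mail is sorted (REPARTIR, to distribute)
  A vector acts as an aspect (aka reference)
  How can we exchange A and B in the context of C?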
Relative synonymy
  SynR(A, B, C), with C as a reference (ref)
  SynR(A, B, C) = DA(A + A⊗C, B + B⊗C)
  [Figure: vectors A, B, C, A + A⊗C and B + B⊗C; SynR(A, B, C) is the angle between A + A⊗C and B + B⊗C]
Relative synonymy: properties
  SynR(A, B, C) = SynR(B, A, C)
  SynR(A, A, C) = DA(A + A⊗C, A + A⊗C) = 0
  SynR(A, B, 0) = DA(A, B)
  SynR(A, 0, C) = π/2
  DA(coal, night) = 0.9
  SynR(coal, night, color) = 0.4
  SynR(coal, night, black) = 0.35
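Relative synonymy and its listed properties can be sketched in Python as below. Two things are assumptions here: the operator combining a term with the reference is taken to be a term-to-term product z_i = sqrt(x_i * y_i) on non-negative vectors, and the four-dimensional vectors for coal, night and black are invented for the illustration (they do not reproduce the values on the slide).

```python
import math


def angular_distance(x, y):
    """DA(x, y), with DA(0, 0) = 0 and DA(x, 0) = pi/2 by definition."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 and norm_y == 0:
        return 0.0
    if norm_x == 0 or norm_y == 0:
        return math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    return math.acos(max(-1.0, min(1.0, cosine)))


def term_product(x, y):
    """Assumed operator: term-to-term product, z_i = sqrt(x_i * y_i)."""
    return [math.sqrt(a * b) for a, b in zip(x, y)]


def contextualize(a, c):
    """A plus its term-to-term product with C: A reinforced where it overlaps C."""
    return [u + v for u, v in zip(a, term_product(a, c))]


def syn_r(a, b, c):
    """Relative synonymy of A and B with C as the reference."""
    return angular_distance(contextualize(a, c), contextualize(b, c))


# Hypothetical toy vectors; the last component stands for a shared "dark color" idea.
coal = [1.0, 0.2, 0.0, 0.6]
night = [0.0, 0.3, 1.0, 0.6]
black = [0.0, 0.0, 0.0, 1.0]
zero = [0.0, 0.0, 0.0, 0.0]

print(angular_distance(coal, night))  # plain thematic distance
print(syn_r(coal, night, black))      # smaller: the reference draws them together
```

On these toy vectors, the reference reinforces the component that coal and night share, so the relative synonymy is smaller than the plain angular distance, which is the behavior the coal/night/black figures illustrate.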
Relative synonymy: properties
  Relative synonymy is a measure that favors drawing two vectors closer together:
    “black” is a good punctum proximum for “coal” and “night”
  Transitivity of the synonymy:
    SynR(coal, crow, black) = 0.18
    SynR(crow, night, black) = 0.5
    SynR(coal, night, black) = 0.35
Absolute synonymy
  SynA(A, B): a particular case, with A⊕B as ref
  SynA(A, B) = SynR(A, B, A⊕B)
  [Figure: vectors A, B, A⊕B, A + A⊗(A⊕B) and B + B⊗(A⊕B); SynA(A, B) is the angle between the last two]
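Absolute synonymy can be sketched as relative synonymy with the combination of the two vectors as reference. In this Python sketch the reference is assumed to be the component-wise sum of A and B (normalization is skipped since it does not change angles), the combining operator is assumed to be the term-to-term product z_i = sqrt(x_i * y_i), and the vectors are invented toy values.

```python
import math


def angular_distance(x, y):
    """DA(x, y), with DA(0, 0) = 0 and DA(x, 0) = pi/2 by definition."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 and norm_y == 0:
        return 0.0
    if norm_x == 0 or norm_y == 0:
        return math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    return math.acos(max(-1.0, min(1.0, cosine)))


def term_product(x, y):
    """Assumed operator: term-to-term product, z_i = sqrt(x_i * y_i)."""
    return [math.sqrt(a * b) for a, b in zip(x, y)]


def vector_sum(x, y):
    """Assumed reference for absolute synonymy: component-wise sum of A and B."""
    return [a + b for a, b in zip(x, y)]


def syn_r(a, b, c):
    """Relative synonymy of A and B with C as the reference."""
    ctx_a = vector_sum(a, term_product(a, c))
    ctx_b = vector_sum(b, term_product(b, c))
    return angular_distance(ctx_a, ctx_b)


def syn_a(a, b):
    """Absolute synonymy: relative synonymy with the sum of A and B as reference."""
    return syn_r(a, b, vector_sum(a, b))


# Hypothetical toy vectors.
coal = [1.0, 0.2, 0.0, 0.6]
night = [0.0, 0.3, 1.0, 0.6]

print(angular_distance(coal, night), syn_a(coal, night))
```

No third vector is needed here: the reference is built from the two terms being compared.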
Subjective synonymy
  Point of view (pow)
  Semantic discrimination scope
  DA(tit, bird) = 0.7
  DA(sparrow, bird) = 0.48
  DA(tit, sparrow) = 0.23
  With which pow can we discriminate two given vectors?
  Closest “punctum remotum”
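Subjective synonymy
  SynS(A, B, C), with C as the point of view (pow)
  SynS(A, B, C) = DA(A - A⊗C, B - B⊗C)
  [Figure: vectors A, B, C, A - A⊗C and B - B⊗C; SynS(A, B, C) is the angle between A - A⊗C and B - B⊗C]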
Subjective synonymy
  When DA(A, C) and DA(B, C) are close to π/2, SynS(A, B, C) is close to DA(A, B)
  SynS(A, B, 0) = DA(A, B)
  SynS(A, A, C) = 0
  SynS(A, B, B) = DA(A - A⊗B, 0) = π/2
  DA(tit, crow) = 0.32
  SynS(tit, crow, zoology) = 0.54
  SynS(tit, crow, bird) = 1.07
  SynS(tit, crow, passerine) = 1.37
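The subjective synonymy properties above can be checked with a short Python sketch. As before, the combining operator is assumed to be the term-to-term product z_i = sqrt(x_i * y_i) on non-negative vectors; the vectors for tit, crow and bird are invented toy values and do not reproduce the figures on the slide.

```python
import math


def angular_distance(x, y):
    """DA(x, y), with DA(0, 0) = 0 and DA(x, 0) = pi/2 by definition."""
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 and norm_y == 0:
        return 0.0
    if norm_x == 0 or norm_y == 0:
        return math.pi / 2
    cosine = sum(a * b for a, b in zip(x, y)) / (norm_x * norm_y)
    return math.acos(max(-1.0, min(1.0, cosine)))


def term_product(x, y):
    """Assumed operator: term-to-term product, z_i = sqrt(x_i * y_i)."""
    return [math.sqrt(a * b) for a, b in zip(x, y)]


def remove_point_of_view(a, c):
    """A minus its term-to-term product with C: what is left of A once C is taken out."""
    return [u - v for u, v in zip(a, term_product(a, c))]


def syn_s(a, b, c):
    """Subjective synonymy of A and B from the point of view C."""
    return angular_distance(remove_point_of_view(a, c), remove_point_of_view(b, c))


# Hypothetical toy vectors; the first component stands for a shared "bird" idea.
tit = [1.0, 0.25, 1.0, 0.0]
crow = [1.0, 1.0, 0.25, 0.25]
bird = [1.0, 0.0, 0.0, 0.0]
zero = [0.0, 0.0, 0.0, 0.0]

print(angular_distance(tit, crow))  # plain thematic distance
print(syn_s(tit, crow, bird))       # larger: the shared idea no longer masks differences
```

Removing what both terms share with the point of view leaves only their differences, so on these toy vectors the distance grows, in the same direction as the tit/crow figures above.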
Subjective synonymy: properties
  Non-conservation of the concept hierarchy chain
  Concept chain: @the_world > @the_life > @animals > @birds
  DA(tit, sparrow) = 0.23
  SynS(tit, sparrow, @the_life) = 0.75
  SynS(tit, sparrow, @the_world) = 0.5
  SynS(tit, sparrow, @animals) = 0.4
  SynS(tit, sparrow, @birds) = 0.9
  Concepts horizon (at the lowest concept level)
Subjective synonymy: properties
  Polysemy: term vs concept
    SynS(tit, sparrow, @birds) = 0.9
    SynS(tit, sparrow, bird) = 0.78
  Loosely correlated vectors as pow
    SynS(tit, sparrow, @gold) = 0.7
    DA(tit, @gold) = 1.19
    DA(sparrow, @gold) = 1.15
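Objective synonymy
  A particular case of subjective synonymy, with A⊕B as pow
  SynS(A, B, A⊕B) = DA(A - A⊗(A⊕B), B - B⊗(A⊕B))
  [Figure: vectors A, B, A⊕B, A - A⊗(A⊕B) and B - B⊗(A⊕B), with the angle between the last two]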
Conclusion
  Synonymy as an enhancement of thematic analysis
  The conceptual vector model shows interference
    from polysemy: relative synonymy
    from the complex relation between concepts and terms (bird vs @birds)
  System in continuous learning
    Evolving results
    Hopefully converging