1 / 25 Sat. 31 Aug. 2002 SEMANET Workshop
Frameworks, Implementation & Open Problems for the Collaborative Building of a Multilingual Lexical Database
Mathieu Mangeot & Gilles Sérasset
NII, Tokyo, Japan & GETA-CLIPS, Grenoble, France
Outline
- Presentation of Papillon Project
- Macrostructure of the Dictionary
- Microstructure of the Entries
- Bootstrapping & Contribution Process
  - Limbo, Purgatory & Paradise
  - Bootstrapping with Conceptual Vectors
  - Contributions & Validation Process
- Lexico-Semantical Network
  - Monolingual with Lexical Functions
  - Multilingual with Axies (Interlingual Links)
- Conclusion & References
Motivations
- Initial Goal
  - Build a French-Japanese electronic dictionary for humans
- Very Few Existing Resources
  - French-Japanese, Free, Electronic
- Construction Costs Too High
  - EDR English-Japanese Dictionary: 1200 human-years; entries; price: 14.3 million ¥
- Ongoing Collaborative Construction Projects
  - Edict (Japanese->English), SAIKAM (Japanese-Thai)
- Lack of Information
  - Numerical Specifiers, kanji+kana+romaji
Extended Goals
- Build a More Complete Dictionary
  - Multilingual (English, French, German, Japanese, Lao, Malay, Thai, Vietnamese)
  - Multi-user (beginners, experts, applications)
- Community Development
  - LINUX Construction Paradigm
  - Voluntary Contributors
  - Mutualization of Resources
  - User Preferences & Profiles
Bilingual Dictionaries
[Figure: bilingual dictionaries between French, English, Thai, Japanese, Malay, Vietnamese and Lao]
Pivot Dictionary
[Figure: French, English, Thai, Japanese, Malay, Vietnamese and Lao, each linked to an interlingual pivot (Int)]
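The difference between the two macrostructures can be made concrete with a little arithmetic. The sketch below assumes one dictionary per unordered language pair in the bilingual setting and one set of interlingual links per language in the pivot setting; the function names are illustrative and not part of Papillon.

```python
from math import comb

# The seven languages shown in the two figures.
LANGUAGES = ["French", "English", "Thai", "Japanese", "Malay", "Vietnamese", "Lao"]


def bilingual_dictionary_count(n_languages: int) -> int:
    """Number of dictionaries if every unordered language pair needs its own."""
    return comb(n_languages, 2)


def pivot_link_set_count(n_languages: int) -> int:
    """Number of link sets if every language is only linked to the pivot."""
    return n_languages


n = len(LANGUAGES)
print(bilingual_dictionary_count(n))  # 21
print(pivot_link_set_count(n))        # 7
```

Under these assumptions, adding an eighth language costs seven more bilingual dictionaries but only one more set of links to the pivot.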
Detailed Pivot Structure
[Figure: lexies of three monolingual dictionaries connected through Interlingual Links (Axies), with Refinement Links between axies]
- French DiCo
  - Vocable affection n.f.: lexie affection.1 (tendresse), lexie affection.2 (médecine)
  - Vocable maladie n.f.: lexie maladie
- English DiCo
  - Vocable disease N: lexie disease
  - Vocable affection N: lexie affection
- Japanese DiCo
  - 病気 byouki 【びょうき】
Ref: Work done by Gilles Sérasset
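Translation through the pivot can be sketched as a lookup that starts from the axies containing a lexie and, when an axie has no lexie in the target language, follows its refinement link to a more general axie. This is a minimal sketch, not the Papillon implementation: the axie ids, the exact links between lexies and the `translate` function are assumptions that represent one possible reading of the figure.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Axie:
    """Interlingual link grouping lexies of several languages."""
    axie_id: str
    lexies: Dict[str, List[str]]                       # language code -> lexie names
    generic: List[str] = field(default_factory=list)   # ids of more general axies


# Hypothetical ids and links (assumed, not taken from the Papillon database).
AXIES = {
    "ax_disease": Axie("ax_disease",
                       {"fra": ["maladie"], "eng": ["disease"], "jpn": ["病気"]}),
    "ax_affection": Axie("ax_affection",
                         {"fra": ["affection.2"], "eng": ["affection"]},
                         generic=["ax_disease"]),
}


def translate(lexie: str, source: str, target: str,
              axies: Dict[str, Axie] = AXIES) -> List[str]:
    """Return target-language lexies reachable from `lexie` through the pivot."""
    results: List[str] = []
    queue = deque(a for a in axies.values() if lexie in a.lexies.get(source, []))
    seen = set()
    while queue:
        axie = queue.popleft()
        if axie.axie_id in seen:
            continue
        seen.add(axie.axie_id)
        found = axie.lexies.get(target, [])
        if found:
            results.extend(x for x in found if x not in results)
        else:
            # No lexie in the target language: fall back to more general axies.
            queue.extend(axies[g] for g in axie.generic)
    return results


print(translate("affection.2", "fra", "eng"))  # ['affection']
print(translate("affection.2", "fra", "jpn"))  # ['病気']
```

In this toy data the medical sense of French affection has a direct English equivalent, while the Japanese equivalent is only reached through the more general axie.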
Combinatorial Lexicography
- From Meaning-Text Theory
  - Alain Polguère & Igor Mel’tchuk (U. de Montréal)
- Gives the information needed to go from an idea (the meaning) to its realisation in a given language (the text)
- Existing Dictionaries: DEC, DiCo database & LAF
- Same Structure for Every Language
- 56 Basic Lexical Functions
French Lexie (DiCo Entry)
Name of the Lexical Unit: MEURTRE
Grammatical Properties: nom, masc
Semantical Formula: action de tuer: ~ PAR L'individu X DE L'individu Y
Government Pattern: X = I = de N, A-poss; Y = II = de N, A-poss
Lexical Functions:
  {QSyn} assassinat, homicide#1; crime /*Quasi synonyms*/
  {Oper1} accomplir, commettre, perpétrer [ART ~]; tremper [dans ART ~] /*Causes that X does a M.*/
  {S1} auteur [de ART Ø]//meurtrier-n /*Name for X*/
  {S2} victime [de ART Ø] /*Name for Y*/
Example: La mésentente pourrait être le mobile du meurtre.
Idioms: _appel au meurtre_, _crier au meurtre_
Japanese Lexie
Name of the Lexical Unit: 殺人 【さつじん】
Reading: satsujin
Grammatical Properties: 名詞 【めいし】
Semantical Formula: どうさ: 人 Y の 人 X の ~
Government Pattern: X = I = N; Y = II = N の
Lexical Functions:
  {QSyn} 殺戮【さつりく】, 殺害【さつがい】 /*Quasi synonyms*/
  {Oper1} [~を] する; [~を] 犯す /*Causes that X does a M.*/
  {S1} 殺人者【さつじんしゃ】, 殺人鬼【さつじんき】 /*Name for X*/
  {S2} 被害者【ひがいしゃ】 /*Name for Y*/
Example: 喧嘩【けんか】は殺人【さつじん】の動機【どうき】になり得【え】るだろう。
Idioms: _殺人剣【さつじんけん】_, _嘱託殺人【しょくたくさつじん】_
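Because every language uses the same entry structure, one record type can hold both the French and the Japanese lexie. The following is a minimal sketch under that assumption; the class `Lexie`, its field names and the `apply` method are hypothetical, and the values are simplified from the two entries (bracketed government information is left out).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Lexie:
    """One lexical unit, with the same fields for every language."""
    name: str
    language: str
    grammatical_properties: str
    semantic_formula: str
    lexical_functions: Dict[str, List[str]] = field(default_factory=dict)
    examples: List[str] = field(default_factory=list)
    idioms: List[str] = field(default_factory=list)

    def apply(self, function: str) -> List[str]:
        """Return the values of a lexical function, or an empty list."""
        return self.lexical_functions.get(function, [])


meurtre = Lexie(
    name="MEURTRE",
    language="fra",
    grammatical_properties="nom, masc",
    semantic_formula="action de tuer: ~ PAR L'individu X DE L'individu Y",
    lexical_functions={
        "QSyn": ["assassinat", "homicide#1", "crime"],
        "Oper1": ["accomplir", "commettre", "perpétrer", "tremper"],
        "S1": ["auteur", "meurtrier-n"],
        "S2": ["victime"],
    },
    idioms=["appel au meurtre", "crier au meurtre"],
)

satsujin = Lexie(
    name="殺人",
    language="jpn",
    grammatical_properties="名詞",
    semantic_formula="どうさ: 人 Y の 人 X の ~",
    lexical_functions={
        "QSyn": ["殺戮", "殺害"],
        "Oper1": ["する", "犯す"],
        "S1": ["殺人者", "殺人鬼"],
        "S2": ["被害者"],
    },
)

# The same query works on both entries.
print(meurtre.apply("S1"))   # ['auteur', 'meurtrier-n']
print(satsujin.apply("S1"))  # ['殺人者', '殺人鬼']
```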
Interlingual Links (Axies)
- Links to other Axies
  - Synonyms, Refinement, Generalizations
  - Motivated by existing translation links. Not like concepts
- Links to External References
  - To be independent from any existing theory
  - WordNet synsets, NTT Semantic category, ONTOS or LexiGuide ontologies, UNL UWs & Graphs, etc.
- Linking Monolingual Lexies
Structure of an Axie
Unique ID: a
Semantic Tag (entity, process, state, result): process
Links to lexies:
  fra: meurtre.1
  eng: murder.1
  jpn: satsujin.1
Links to other axies:
  synonym axies: a (assassination)
  generic axies: a00002
  refined axies: a
References to External Resources:
  WordNet Synset: unlawful premeditated killing of a human being
  UNL UW: murder(icl>action,agt>human,obj>human)
  NTT Semantic Category
  ONTOS Concept
  LexiGuide concept
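The fields of an axie map naturally onto a small record that checks the semantic tag against the four allowed values. This is only a sketch: the class and field names are assumptions, and the identifier `AXIE_ID_PLACEHOLDER` is hypothetical because the actual id is not legible on the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# The four semantic tags listed on the slide.
SEMANTIC_TAGS = {"entity", "process", "state", "result"}


@dataclass
class AxieRecord:
    axie_id: str
    semantic_tag: str
    lexies: Dict[str, List[str]] = field(default_factory=dict)  # language -> lexies
    synonym_axies: List[str] = field(default_factory=list)
    generic_axies: List[str] = field(default_factory=list)
    refined_axies: List[str] = field(default_factory=list)
    external_refs: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject tags outside the closed list.
        if self.semantic_tag not in SEMANTIC_TAGS:
            raise ValueError(f"unknown semantic tag: {self.semantic_tag!r}")


murder_axie = AxieRecord(
    axie_id="AXIE_ID_PLACEHOLDER",  # hypothetical id, not the real one
    semantic_tag="process",
    lexies={"fra": ["meurtre.1"], "eng": ["murder.1"], "jpn": ["satsujin.1"]},
    generic_axies=["a00002"],
    external_refs={
        "WordNet synset gloss": "unlawful premeditated killing of a human being",
        "UNL UW": "murder(icl>action,agt>human,obj>human)",
    },
)

print(sorted(murder_axie.lexies))  # ['eng', 'fra', 'jpn']
```

Keeping external references in a free-form mapping reflects the stated aim of staying independent from any single theory: a new resource only adds a key.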
Preparation of the Existing Data
[Figure: existing data moves through three stages: Limbo (original format), Purgatory (XML format) and Paradise (Papillon format, S pap). Labelled operations: Recuperation, Import, Integration, Export, Consultation. Resources shown: Local Resources, DicDist, DicOrig, DicGen, Contrib1, Contrib2, Contrib3, EDR, FeM, JMDict, SAIKAM, DiCo, ELRA]
Introduction to Conceptual Vectors
- An idea = a concept = a conceptual vector
- The vector space has K dimensions
  - K = number of concepts in a thesaurus hierarchy
  - E.g. for French, Thesaurus Larousse = 873 concepts
- One independent vector space for each language
- Distance between 2 vectors = angular distance
  $D_A(x, y) = \arccos(\mathrm{sim}(x, y)) = \arccos\left(\frac{x \cdot y}{|x|\,|y|}\right)$
Ref: Work done by Mathieu Lafourcade
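The angular distance can be computed directly from its definition. The sketch below uses plain Python lists as vectors; the function name and the clamping of the cosine (a guard against floating-point rounding) are additions for illustration and not taken from the original work.

```python
import math
from typing import Sequence


def angular_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Angle in radians between two conceptual vectors: acos(x.y / (|x||y|))."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("angular distance is undefined for a null vector")
    cosine = dot / (norm_x * norm_y)
    # Clamp to [-1, 1] so rounding errors never push acos out of its domain.
    cosine = max(-1.0, min(1.0, cosine))
    return math.acos(cosine)


# Toy 3-dimensional vectors; a real space would have K = 873 dimensions for French.
print(angular_distance([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # pi/2: unrelated ideas
print(angular_distance([2.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 0.0: same direction
```

The distance depends only on direction, so two vectors of different length that point the same way are at distance 0.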
Linking Word Senses with Vectors
[Figure: in the vector space for English, the senses demand.1, demand.2 and demand.3 of the vectorized monolingual dictionary each carry a definition vector (v def). They are associated with the equivalents lists of the entry "demand" in the English-French bilingual dictionary (demand.1 to demand.4 equivalents), each with its own vector (v). One list is marked "left over meaning"]
Slide from Mathieu Lafourcade
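One way to read the association step is as a nearest-neighbour match in the English vector space: each list of equivalents is attached to the monolingual sense whose definition vector is closest, and lists that are too far from every sense are left over. This is a sketch of that reading, not the actual procedure; the vectors are made-up toy values and the threshold `max_distance` is a hypothetical parameter.

```python
import math
from typing import Dict, List, Optional, Sequence


def angular_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Angle in radians between two non-null vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norms = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / norms)))


def associate(senses: Dict[str, List[float]],
              equivalents: Dict[str, List[float]],
              max_distance: float = 0.6) -> Dict[str, Optional[str]]:
    """Map each equivalents list to its closest sense, or None if left over."""
    links: Dict[str, Optional[str]] = {}
    for eq_name, eq_vector in equivalents.items():
        best = min(senses, key=lambda s: angular_distance(senses[s], eq_vector))
        if angular_distance(senses[best], eq_vector) > max_distance:
            links[eq_name] = None  # left over meaning
        else:
            links[eq_name] = best
    return links


# Toy data: three monolingual senses, four bilingual equivalents lists.
SENSES = {
    "demand.1": [1.0, 0.0, 0.0],
    "demand.2": [0.0, 1.0, 0.0],
    "demand.3": [0.0, 0.0, 1.0],
}
EQUIVALENTS = {
    "demand.1 equivalents": [0.9, 0.1, 0.0],
    "demand.2 equivalents": [0.1, 0.9, 0.1],
    "demand.3 equivalents": [0.0, 0.2, 0.9],
    "demand.4 equivalents": [1.0, 1.0, 1.0],
}

for name, sense in associate(SENSES, EQUIVALENTS).items():
    print(name, "->", sense)
```

With these toy values the fourth list is equally far from all three senses and exceeds the threshold, so it is reported as a left over meaning.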
Contributions & Validation
[Figure: on the Papillon Server, voluntary contributors send contributions to their user spaces; paid specialists validate them; validated contributions are integrated into Paradise (Papillon format, S pap)]
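The contribution cycle can be sketched as a small state machine: a contribution waits in its author's user space, a specialist reviews it, and only validated contributions are integrated into Paradise. All class, method and status names below are hypothetical, and the "rejected" status is an assumption added to make the sketch complete.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Status(Enum):
    PENDING = "pending"
    VALIDATED = "validated"
    REJECTED = "rejected"   # assumed; the slide only shows validation


@dataclass
class Contribution:
    contributor: str
    entry_id: str
    content: str
    status: Status = Status.PENDING


@dataclass
class PapillonServer:
    user_spaces: Dict[str, List[Contribution]] = field(default_factory=dict)
    paradise: Dict[str, str] = field(default_factory=dict)  # entry id -> content

    def contribute(self, contributor: str, entry_id: str, content: str) -> Contribution:
        """Store a new contribution in the contributor's own user space."""
        contribution = Contribution(contributor, entry_id, content)
        self.user_spaces.setdefault(contributor, []).append(contribution)
        return contribution

    def review(self, contribution: Contribution, accepted: bool) -> None:
        """Record a specialist's decision on a pending contribution."""
        contribution.status = Status.VALIDATED if accepted else Status.REJECTED

    def integrate(self) -> int:
        """Copy validated contributions into Paradise; return how many were copied."""
        count = 0
        for contributions in self.user_spaces.values():
            for c in contributions:
                if c.status is Status.VALIDATED and self.paradise.get(c.entry_id) != c.content:
                    self.paradise[c.entry_id] = c.content
                    count += 1
        return count


server = PapillonServer()
good = server.contribute("alice", "fra.meurtre.1", "QSyn: assassinat")
bad = server.contribute("bob", "fra.meurtre.1", "QSyn: ???")
server.review(good, accepted=True)
server.review(bad, accepted=False)
print(server.integrate())  # 1
print(server.paradise)     # {'fra.meurtre.1': 'QSyn: assassinat'}
```

Keeping unvalidated data in per-user spaces means a contributor can see their own work immediately without affecting what other users consult.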
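Lexico Semantical Multilingual Network (1)
[Figure: DiCo French holds the lexies Meurtre and Assassinat and the idiom _Lancer un appel au meurtre_; DiCo English holds Murder and Assassination and the idiom _To call for sb's assassination_. {QSyn} links connect lexies inside each dictionary, Interlingual Links connect lexies across the two dictionaries, and a "?" marks a link still to be established]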
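Lexico Semantical Multilingual Network (2)
[Figure: DiCo French holds the lexies Meurtre and Meurtrier and the idiom _Lancer un appel au meurtre_; DiCo Japanese holds 殺人【さつじん】 Satsujin (Murder) and 殺人者【さつじんしゃ】 Satsujinsha (Murderer). {S0} links connect lexies inside each dictionary, Interlingual Links connect lexies across the two dictionaries, and a "?" marks a link still to be established]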
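One experiment such a network allows is to project a lexical-function link from one language to another through the interlingual links and to offer the result as a candidate for checking. The sketch below illustrates this with the murder example; the data layout and the function `project_links` are assumptions, and the output is only a list of candidate links, which is consistent with the "?" in the figures.

```python
from typing import Dict, List, Tuple

# (source lexie, lexical function, target lexie)
Link = Tuple[str, str, str]

# Monolingual link known in the French dictionary.
FRENCH_LINKS: List[Link] = [("meurtre", "QSyn", "assassinat")]

# Interlingual links: each axie groups one lexie per language (toy data).
AXIES: List[Dict[str, str]] = [
    {"fra": "meurtre", "eng": "murder"},
    {"fra": "assassinat", "eng": "assassination"},
]


def project_links(links: List[Link], axies: List[Dict[str, str]],
                  source: str, target: str) -> List[Link]:
    """Propose target-language links by carrying each link across the axies."""
    mapping: Dict[str, List[str]] = {}
    for axie in axies:
        if source in axie and target in axie:
            mapping.setdefault(axie[source], []).append(axie[target])

    candidates: List[Link] = []
    for head, function, tail in links:
        for new_head in mapping.get(head, []):
            for new_tail in mapping.get(tail, []):
                candidates.append((new_head, function, new_tail))
    return candidates


print(project_links(FRENCH_LINKS, AXIES, "fra", "eng"))
# [('murder', 'QSyn', 'assassination')]
```

A candidate produced this way still needs validation, since a lexical-function relation that holds in one language does not necessarily hold between the translations.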
Conclusion
- Framework for Experimenting with Networks
- Research Issues Remaining
  - Social issues: how to motivate people?
  - Contribution Interfaces
  - Checking Interfaces
- The Project Cannot Succeed without the Help of the Public (Voluntary Contributors)
References & Contacts
- Web Site (information & consultation)
- Steering Committee President: Gilles Sérasset
- Technical Lead in Japan: Mathieu Mangeot