
1/24 17/7/2002 (Papillon-02) Translation in Papillon (Ch. Boitet) The translation of examples, citations, definitions and glosses in the Papillon project.


1/24 17/7/2002 (Papillon-02) Translation in Papillon (Ch. Boitet)
The translation of examples, citations, definitions and glosses in the Papillon project
PAPILLON-02 international seminar, NII, Tokyo, 16-18 July 2002
Christian Boitet
GETA, CLIPS, IMAG, CNRS, INPG & UJF
Grenoble, France
Christian.Boitet@imag.fr

2/24 Outline
The problem: given the "pivot" architecture of monolingual dictionaries,
- translate all "free language elements" into all languages
- store the results, respecting the overall structure
Proposed solutions:
- Storing: use auxiliary lexies and axies
- Translating-1: shared tools for human translation
- Translating-2: partial MT using UNL
Perspectives

3/24 Internal architecture of the database
Architecture derived from Gilles Sérasset's Ph.D. thesis.
- French dictionary: vocable "carte" (n.f.), with lexies "carte à jouer" and "carte géographique"
- Japanese dictionary: 地図, カード
- Interlingual dictionary: Acception 343, UNL: card(icl>play); Acception 345, UNL: map(fld>geography)

4/24 PAPILLON scenario & diagram
- Interlingual links motivated by translations = "AXIEs"
- Possibility to link 1 lexie to >1 acception
- Links to other representations: AXIE -(1:n)-> UW
Diagram content:
- French DiCo: vocable carte (n.f.); lexie carte.1 "carte à jouer"; lexie carte.2 "carte géographique"
- Japanese DiCo: 地図, カード
- English DiCo: vocable card (N); lexie card.1 "playing card"; lexie card.2 "money card"; vocable = lexie map
- Thai DiCo
- Interlingual links: Acception 343, UNL: card(icl>play), card(icl>thing)…; Acception 345, UNL: map(fld>geography); Acception 1002, UNL: card(fld>money)
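The lexie / axie / UW structure on this slide can be sketched as a small data model. This is only an illustration, not the Papillon schema: the class names `Lexie` and `Axie`, the language codes, the helper `translations`, and the specific lexie-to-acception links (read off the diagram) are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Lexie:
    """A monolingual word sense; it carries no interlingual information."""
    lexie_id: str
    language: str


@dataclass
class Axie:
    """An interlingual acception: pure links to lexies and to UNL UWs."""
    acception_id: int
    uws: List[str] = field(default_factory=list)                # one axie, n UWs
    lexies: Dict[str, List[str]] = field(default_factory=dict)  # language -> lexie ids

    def link(self, lexie: Lexie) -> None:
        self.lexies.setdefault(lexie.language, []).append(lexie.lexie_id)


def translations(axies: List[Axie], lexie: Lexie, target: str) -> List[str]:
    """Follow every axie that contains `lexie` and collect its lexies in `target`."""
    found: List[str] = []
    for axie in axies:
        if lexie.lexie_id in axie.lexies.get(lexie.language, []):
            found.extend(axie.lexies.get(target, []))
    return found


carte_1 = Lexie("carte.1", "fra")  # carte à jouer
carte_2 = Lexie("carte.2", "fra")  # carte géographique
card_1 = Lexie("card.1", "eng")    # playing card
card_2 = Lexie("card.2", "eng")    # money card
map_en = Lexie("map", "eng")

a343 = Axie(343, uws=["card(icl>play)", "card(icl>thing)"])
a345 = Axie(345, uws=["map(fld>geography)"])
a1002 = Axie(1002, uws=["card(fld>money)"])

# Assumed links, as suggested by the diagram.
for axie, members in [(a343, [carte_1, card_1]),
                      (a345, [carte_2, map_en]),
                      (a1002, [card_2])]:
    for member in members:
        axie.link(member)

axies = [a343, a345, a1002]
print(translations(axies, carte_2, "eng"))
```

Because a lexie may be linked from several axies, `translations` scans all of them instead of assuming a single acception per lexie.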

5/24 A monolingual DiCo entry (again)
1. Name of the lexical unit: MEURTRE
2. Grammatical properties: nom, masc
3. Semantic formula: action de tuer: ~ PAR L'individu X DE L'individu Y
4. Government pattern: X = I = de N, A-poss; Y = II = de N, A-poss
5. (Quasi-)synonyms: {QSyn} assassinat, homicide#1; crime
6. Semantic derivations & collocations:
   {V0} tuer
   {A0} meurtrier-adj
   /*Nom pour X*/ {S1} auteur [de ART Ø] //meurtrier-n
   /*Nom pour Y*/ {S2} victime [de ART Ø]
   /*Très choquant*/
7. Examples: La mésentente pourrait être le mobile du meurtre.
8. Full idioms: appel au meurtre; crier au meurtre
Structure derived from Alain Polguère's work on DiCo.

6/24 Fixed and free language elements
Fixed:
- Stereotyped definition in semantic formula: action de tuer:
- Logical argument frame: ~ PAR L'individu X DE L'individu Y
- Grammatical properties: nom, masc
Free:
- Examples: La mésentente pourrait être le mobile du meurtre.
- Citations (e.g. for SPIRIT): the spirit is willing, but the flesh is weak (Bible, ref.XXX)
- Free definitions in semantic formula (e.g. for a medical noun such as LEUCOCYTE): sort of cell contained in the blood and attacking infectious agents
- Glosses (sometimes = quasi-synonyms): character (mood)

7/24 The problem (1)
Necessity to translate all free language elements:
- An example for X in L1, once translated into L2, is not in general a good example for Y, the translation of X in L2.
  Il utilise souvent des cartes IGN
  *He often uses IGN roadmaps/maps
  → He often uses AA maps
  (IGN = Institut Géographique National; AA = Automobile Association)
- Hence, the size of the problem is quadratic in the number of languages!
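A rough way to see the quadratic growth: if every language contributes its own free elements and each of them must be translated into every other language, the number of translations grows as n(n-1). The function name and the figures below are hypothetical; the slides give no counts.

```python
def directed_translations(n_languages: int, elements_per_language: int) -> int:
    """Translations needed when each free element of each language
    is translated into every other language."""
    return n_languages * (n_languages - 1) * elements_per_language


# Hypothetical sizes: 1000 free elements per language.
for n in (2, 5, 10):
    print(n, directed_translations(n, 1000))
```

Going from 5 to 10 languages multiplies the workload by 4.5 in this model, not by 2.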

8/24 The problem (2)
Where to store these translations?
- Not in the lexies, which must remain monolingual
- Not in the axies, which must remain pure links

9/24 Solution for the storing problem
Use auxiliary lexies and axies
- Terminology: x-lexie, x-axie, with x ∈ {def, cit, ex, glo}
- Each free language element becomes an x-lexie
  - cit-lexies and ex-lexies are simpler than normal lexies
- x-lexies are linked through x-axies
- An x-axie contains lists of x-lexies and, in case of an external reference to UNL, a UNL graph (if x ≠ glo) or a UW (glo-axie)

10/24 Multilingual links = AXIES
Normal axies:
- for each language L, 0:n links to lexies of L
- for each semantic system S available, 0:n links to entities of S (UNL UWs, WordNet synsets, NTT SemCat, Ontos concepts, LexiQuest Lex-concepts…)
Auxiliary axies (for examples, citations…):
- for each language L, 0:n links to lexies of L
- if UNL-annotated, 1 UNL graph
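The storage scheme of the last two slides might look as follows in code. This is a minimal sketch under stated assumptions: the names `XLexie`, `XAxie`, `KINDS` and `missing`, the language codes, and the representation of a UNL graph as a plain string are hypothetical; only the constraints (four kinds, a UNL graph when x ≠ glo, a UW for a glo-axie) come from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

KINDS = ("def", "cit", "ex", "glo")


@dataclass(frozen=True)
class XLexie:
    kind: str      # one of KINDS
    language: str
    text: str      # the free language element itself


@dataclass
class XAxie:
    kind: str
    members: Dict[str, List[XLexie]] = field(default_factory=dict)  # language -> x-lexies
    unl_graph: Optional[str] = None  # external UNL reference when kind != "glo"
    uw: Optional[str] = None         # external UNL reference when kind == "glo"

    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind: {self.kind}")
        if self.kind == "glo" and self.unl_graph is not None:
            raise ValueError("a glo-axie refers to a UW, not to a UNL graph")
        if self.kind != "glo" and self.uw is not None:
            raise ValueError("only a glo-axie may refer to a UW")

    def add(self, x: XLexie) -> None:
        if x.kind != self.kind:
            raise ValueError("x-lexie and x-axie kinds must match")
        self.members.setdefault(x.language, []).append(x)

    def missing(self, languages: List[str]) -> List[str]:
        """Languages for which no translation is stored yet."""
        return [lang for lang in languages if not self.members.get(lang)]


ex_axie = XAxie("ex")
ex_axie.add(XLexie("ex", "fra", "Il utilise souvent des cartes IGN"))
ex_axie.add(XLexie("ex", "eng", "He often uses AA maps"))
print(ex_axie.missing(["fra", "eng", "jpn"]))
```

Keeping the translations in the x-axie leaves both the normal lexies (monolingual) and the normal axies (pure links) untouched, and `missing` gives the translation task a simple to-do list per element.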

11/24 A "Montaigne" environment for human translation
Idea: let users SHARE translation memory & tools on a server
- Specs in 1995 around Eurolang Optimizer™
  - no funding, although "Francophony" interested…
- Internet version: see www.yakushite.net (OKI)
- First version built for Lao: see www.laosoftware.com (V. Berment)
- Future:
  - bilingual editor as applet
  - use of Papillon server architecture (private spaces etc.)

12/24 Scenario & possible GUI
Typical layout of a bilingual editor in a TSS:
- source segment N-2 | translated segment (done)
- source segment N-1 | translated segment (done)
- suggestion(s) from the TM
- source segment N | translated segment (currently being created)
- source segment N+1
- source segment N+2
- dictionary suggestions
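The "suggestion(s) from the TM" row can be illustrated with a toy fuzzy lookup. This sketch uses character-level similarity from Python's standard `difflib`, chosen only for brevity; it is not the matching method of Eurolang Optimizer or of Montaigne. The function `suggest`, the 0.7 threshold, and the English rendering of the second memory entry are assumptions.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# A translation memory as (source segment, target segment) pairs.
TM: List[Tuple[str, str]] = [
    ("Il utilise souvent des cartes IGN", "He often uses AA maps"),
    ("La mésentente pourrait être le mobile du meurtre.",
     "Disagreement could be the motive for the murder."),  # illustrative translation
]


def suggest(tm: List[Tuple[str, str]], segment: str,
            threshold: float = 0.7) -> List[Tuple[float, str, str]]:
    """Return (score, source, target) for memory entries similar to `segment`,
    best match first."""
    hits = []
    for source, target in tm:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold:
            hits.append((score, source, target))
    return sorted(hits, reverse=True)


for score, source, target in suggest(TM, "Il utilise parfois des cartes IGN"):
    print(f"{score:.2f}  {source}  ->  {target}")
```

On a shared server, `TM` would be the pooled memory of all contributors, which is the point of the "mutualization" idea.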

13/24 Design & implementation issues
- Peer-to-peer architecture: Papillon ↔ Montaigne
- Possibility to modify input text & segmentation
- Integrate with private lexicon(s)
- Open to plug-ins (voice input…)

14/24 Automating translation using UNL
UNL =
- a project
- a language to represent NL utterance meanings
- a format for multilingual documents (html, xml)
Elements of the UNL language:
- UWs: headword(restrictions), e.g. book(icl>do)
- Attributes: @future, @past, @complete…, @entry
- Relations: agt, aoj, mod, obj, tim…
- (Hyper)graphs: a subgraph is connected & has an entry node
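The "headword(restrictions)" shape of a UW and the ".@attribute" suffixes can be made concrete with a small parser. This is a simplified sketch, not a full UNL reader: the function names `parse_uw` and `parse_node` are hypothetical, and nested restrictions are not handled.

```python
import re
from typing import List, Tuple

UW_PATTERN = re.compile(r"^(?P<headword>[^()]+?)(?:\((?P<restrictions>.*)\))?$")


def parse_uw(uw: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split 'book(icl>do)' into its headword and (relation, value) restrictions."""
    match = UW_PATTERN.match(uw.strip())
    if match is None:
        raise ValueError(f"not a UW: {uw!r}")
    restrictions: List[Tuple[str, str]] = []
    if match.group("restrictions"):
        for item in match.group("restrictions").split(","):
            relation, _, value = item.strip().partition(">")
            restrictions.append((relation, value))
    return match.group("headword"), restrictions


def parse_node(node: str) -> Tuple[str, List[str]]:
    """Split 'book(icl>do).@entry.@past' into the UW and its attributes."""
    uw, *attributes = node.split(".@")
    return uw, ["@" + name for name in attributes]


print(parse_uw("book(icl>do)"))
print(parse_uw("map(fld>geography)"))
print(parse_node("book(icl>do).@entry.@past"))
```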

15/24 A simple UNL input graph

16/24 Possible interactive disambiguation at analysis time

17/24 Interactive disambiguation (2)
- gives a correct unique multilevel concrete (UMC) tree
- then a correct unique multilevel abstract (UMA) tree
- and finally a correct UNL graph

18/24 Possible text-graph "coedition" at reading time
- applicable if there is a UNL graph associated with a segment one wants to modify
- goal: share the revisions across languages, by reflecting them on the UNL graph
Ex: FB2004 (Forum Barcelona 2004)
"Une cité retrouvera une zone côtière après un forum" ("A city will retrieve a coastal zone after a forum")
- add ".@def" on the nodes for "city", "forum"
- transform "forum" into "Forum"
- replace "retrieve" by "recover"
- add ".@complete" on the node containing it
"La cité récupérera une zone côtière après le Forum" ("The city will recover a coastal zone after the Forum")
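The four revisions in this example are operations on graph nodes, not on the French string. A minimal sketch follows, assuming a toy node table: the node ids, the headword "coastal zone", the initial attributes on "retrieve", and the helpers `find`, `add_attribute` and `replace_headword` are hypothetical, since the slide's actual graph is not reproduced here. Relations are omitted.

```python
from typing import Dict, Set, TypedDict


class Node(TypedDict):
    headword: str
    attrs: Set[str]


# Assumed nodes for "Une cité retrouvera une zone côtière après un forum".
graph: Dict[str, Node] = {
    "n1": {"headword": "city", "attrs": set()},
    "n2": {"headword": "retrieve", "attrs": {"@entry", "@future"}},
    "n3": {"headword": "coastal zone", "attrs": set()},
    "n4": {"headword": "forum", "attrs": set()},
}


def find(g: Dict[str, Node], headword: str) -> str:
    """Return the id of the first node with this headword."""
    for node_id, node in g.items():
        if node["headword"] == headword:
            return node_id
    raise KeyError(headword)


def add_attribute(g: Dict[str, Node], node_id: str, attr: str) -> None:
    g[node_id]["attrs"].add(attr)


def replace_headword(g: Dict[str, Node], node_id: str, new: str) -> None:
    g[node_id]["headword"] = new


city, verb, forum = find(graph, "city"), find(graph, "retrieve"), find(graph, "forum")
add_attribute(graph, city, "@def")       # add ".@def" on "city"
add_attribute(graph, forum, "@def")      # add ".@def" on "forum"
replace_headword(graph, forum, "Forum")  # "forum" becomes "Forum"
replace_headword(graph, verb, "recover") # "retrieve" becomes "recover"
add_attribute(graph, verb, "@complete")  # add ".@complete" on that node

for node_id, node in graph.items():
    print(node_id, node["headword"], sorted(node["attrs"]))
```

Because the edits live on the graph, any deconverter fed with the revised graph can reflect them, which is how a revision made while reading French is shared with other languages.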


22/24 Principles of coedition (1)
- It is impossible in principle to deduce the modification on the graph from a modification on the text
  - For example, replacing "un" ("a") by "le" ("the") does not entail that the following noun is determined (.@def), because it can also be generic: "il aime la montagne" = "he likes mountains"
- Revision is not done by modifying the text directly, but by using a menu system
- Menu items have a "language side" and a hidden "UNL side"

23/24 Principles of coedition (2)
- When a menu item is chosen, only the graph is transformed; the action to be done on the text is delayed and shown
- At any time, the new graph may be deconverted
- If it is satisfactory, that shows that errors were due to the graph and not to the deconverter, and the graph may be sent to deconverters in other languages
- Versions in some other languages known by the user may be displayed, so that improvement sharing is visible and encouraging
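The two-sided menu items can be sketched as a label paired with a graph operation. Everything here is an assumption made for illustration: the names `MenuItem`, `ARTICLE_MENU` and `choose`, the menu wording, the simplified graph type, and the use of `@generic` as the attribute for the generic reading (the slides only name `.@def`).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

Graph = Dict[str, Set[str]]  # node headword -> attributes (simplified)


def set_definite(graph: Graph, node: str) -> None:
    graph[node].discard("@generic")
    graph[node].add("@def")


def set_generic(graph: Graph, node: str) -> None:
    graph[node].discard("@def")
    graph[node].add("@generic")


@dataclass(frozen=True)
class MenuItem:
    label: str                           # "language side", shown to the reader
    apply: Callable[[Graph, str], None]  # hidden "UNL side", acts on the graph only


# Two readings of French "le" that the text alone cannot tell apart.
ARTICLE_MENU: List[MenuItem] = [
    MenuItem('"le" = this particular one', set_definite),
    MenuItem('"le" = the kind in general', set_generic),
]


def choose(graph: Graph, node: str, item: MenuItem, pending: List[str]) -> None:
    """Transform the graph now; only record the text change for later display."""
    item.apply(graph, node)
    pending.append(f"{node}: {item.label}")


graph: Graph = {"city": set(), "mountain": set()}
pending: List[str] = []
choose(graph, "city", ARTICLE_MENU[0], pending)
choose(graph, "mountain", ARTICLE_MENU[1], pending)
print(graph)
print(pending)
```

The text is never edited directly in this model: a deconverter would regenerate it from the revised graph, so the same choice can be replayed in every language.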

24/24 Conclusion
- Need for a translation task (#7) in Papillon
- Seamless integration with x-lexies + x-axies
- Possible combination of TA & MT
  - Mutualization spirit (Papillon, Montaigne) for TA
  - Use of UNL (2 "pivot" architectures) for MT
- Mutualization again in MT part (humans involved)
  - Interactive disambiguation
  - Coedition text ↔ UNL graph
- … & of course lexical data contribution through Papillon!

