1/24 17/7/2002 (Papillon-02) Translation in Papillon (Ch. Boitet) The translation of examples, citations, definitions and glosses in the Papillon project.

Presentation transcript:

1/24 17/7/2002 (Papillon-02) Translation in Papillon (Ch. Boitet)
The translation of examples, citations, definitions and glosses in the Papillon project
PAPILLON-02 international seminar, NII, Tokyo, July 2002
Christian Boitet
GETA, CLIPS, IMAG, CNRS, INPG & UJF, Grenoble, France

2/24 Outline
The problem: given the “pivot” architecture of monolingual dictionaries,
- translate all “free language elements” into all languages
- store the results, respecting the overall structure
Proposed solutions:
- Storing: use auxiliary lexies and axies
- Translating-1: shared tools for human translation
- Translating-2: partial MT using UNL
Perspectives

3/24 Internal architecture of the database
French Dictionary: vocable "carte" (n.f.), with lexies "carte à jouer" and "carte géographique"
Interlingual Dictionary: Acception 343, UNL: card(icl>play); Acception 345, UNL: map(fld>geography)
Japanese Dictionary: 地図, カード
Architecture derived from Gilles Sérasset’s Ph.D. thesis

4/24 PAPILLON scenario & diagram
Interlingual links motivated by translations = "AXIEs"
Possibility to link 1 lexie to >1 acception
Links to other representations: AXIE -1--n-> UW
Diagram contents:
- French DiCo: vocable carte (n.f.); lexie carte.1 "carte à jouer"; lexie carte.2 "carte géographique"
- English DiCo: vocable card (N); lexie card.1 "playing card"; lexie card.2 "money card"; vocable = lexie map
- Japanese DiCo: 地図, カード
- a Thai DiCo
- Interlingual links: Acception 343, UNL: card(icl>play), card(icl>thing)…; Acception 345, UNL: map(fld>geography); Acception 1002, UNL: card(fld>money)
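The pivot structure of slides 3 and 4 can be sketched in a few lines of Python. This is only an illustration, not the Papillon schema: the class names `Lexie` and `Axie`, their fields, the language codes and the helper `translations` are assumptions. Only the sample data (carte, card, map, acceptions 343 and 345 and their UWs) comes from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lexie:
    """A word sense in one monolingual dictionary (hypothetical fields)."""
    lang: str
    vocable: str
    gloss: str


@dataclass
class Axie:
    """An interlingual acception: a pure link between lexies and UNL UWs."""
    ident: int
    lexies: list = field(default_factory=list)
    uws: list = field(default_factory=list)


def translations(lexie, axies, target_lang):
    """Return the target-language lexies that share an axie with `lexie`."""
    found = []
    for axie in axies:
        if lexie in axie.lexies:
            found.extend(l for l in axie.lexies if l.lang == target_lang)
    return found


carte_1 = Lexie("fra", "carte", "carte à jouer")
carte_2 = Lexie("fra", "carte", "carte géographique")
card_1 = Lexie("eng", "card", "playing card")
map_1 = Lexie("eng", "map", "map")

axies = [
    Axie(343, [carte_1, card_1], ["card(icl>play)"]),
    Axie(345, [carte_2, map_1], ["map(fld>geography)"]),
]

print([l.vocable for l in translations(carte_2, axies, "eng")])
```

Because the lexies never point at each other directly, adding a new language in this sketch only means attaching its lexies to existing axies.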

5/24 A monolingual DiCo entry (again)
1. Name of the lexical unit: MEURTRE
2. Grammatical properties: nom, masc
3. Semantic formula: action de tuer: ~ PAR L'individu X DE L'individu Y
4. Government pattern: X = I = de N, A-poss; Y = II = de N, A-poss
5. (Quasi-)synonyms: {QSyn} assassinat, homicide#1; crime
6. Semantic derivations & collocations:
   {V0} tuer
   {A0} meurtrier-adj
   /*Nom pour X*/ {S1} auteur [de ART Ø] //meurtrier-n
   /*Nom pour Y*/ {S2} victime [de ART Ø]
   /*Très choquant*/
7. Examples: La mésentente pourrait être le mobile du meurtre.
8. Full idioms: appel au meurtre; crier au meurtre
Structure derived from Alain Polguère’s work on DiCo

6/24 Fixed and free language elements
Fixed
- Stereotyped definition in semantic formula: action de tuer:
- Logical argument frame: ~ PAR L'individu X DE L'individu Y
- Grammatical properties: nom, masc
Free
- Examples: La mésentente pourrait être le mobile du meurtre.
- Citations (e.g. for SPIRIT): the spirit is strong, but the flesh is weak (Bible, ref.XXX)
- Free definitions in semantic formula (e.g. for a disease noun such as LEUCOCYTE): sort of cell contained in the blood and attacking infectious agents
- Glosses (sometimes = quasi-synonyms): character (mood)

7/24 The problem (1)
Necessity to translate all free language elements
The translation into L2 of an example for X (in L1) is in general not a good example for Y, the translation of X in L2
- Il utilise souvent des cartes IGN
- *He often uses IGN roadmaps/maps → He often uses AA maps
- IGN = Institut Géographique National
- AA = Automobile Association
Hence, the size of the problem is quadratic!
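One way to read "quadratic" is in the number of languages: if an example written for one language cannot simply be reused for another, every free element needs its own translation into every other language, which gives n(n-1) ordered language pairs for n languages. The function name and the language list below are hypothetical, and the count illustrates this reading rather than a figure from the slides.

```python
from itertools import permutations


def translation_directions(languages):
    """All ordered (source, target) pairs of distinct languages."""
    return list(permutations(languages, 2))


langs = ["fra", "eng", "jpn", "tha"]
pairs = translation_directions(langs)
print(len(pairs))  # n * (n - 1) = 4 * 3 = 12
```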

8/24 The problem (2)
Where to store these translations?
- Not in the lexies, which must remain monolingual
- Not in the axies, which must remain pure links

9/24 Solution for the storing problem
Use auxiliary lexies and axies
- terminology: x-lexie, x-axie, with x ∈ {def, cit, ex, glo}
Each free language element becomes an x-lexie
- cit-lexies and ex-lexies are simpler than normal lexies
X-lexies are linked through x-axies
An x-axie contains lists of x-lexies and, in case of an external reference to UNL,
- a UNL graph (if x ≠ glo), or
- a UW (glo-axie)

10/24 Multilingual links = AXIES
Normal axies
- for each language L, 0:n links to lexies of L
- for each semantic system S available, 0:n links to entities of S (UNL UWs, WordNet synsets, NTT SemCat, Ontos concepts, LexiQuest Lex-concepts…)
Auxiliary axies for examples, citations…
- for each language L, 0:n links to lexies of L
- if UNL-annotated, 1 UNL graph
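The storage rules of slides 9 and 10 can be made concrete with a small sketch. The class names `XLexie` and `XAxie`, their fields and the validation logic are assumptions for illustration, not the actual Papillon data model. The four kinds, the graph-versus-UW rule and the example sentences are taken from the slides.

```python
from dataclasses import dataclass, field
from typing import Optional

KINDS = {"def", "cit", "ex", "glo"}  # x ∈ {def, cit, ex, glo}


@dataclass
class XLexie:
    """One free language element, stored in a single language."""
    kind: str
    lang: str
    text: str


@dataclass
class XAxie:
    """Links x-lexies of the same kind across languages (hypothetical model)."""
    kind: str
    members: list = field(default_factory=list)
    unl_graph: Optional[str] = None  # allowed when kind != "glo"
    uw: Optional[str] = None         # allowed only when kind == "glo"

    def __post_init__(self):
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind: {self.kind}")
        if self.kind == "glo" and self.unl_graph is not None:
            raise ValueError("a glo-axie refers to a UW, not a UNL graph")
        if self.kind != "glo" and self.uw is not None:
            raise ValueError("only a glo-axie refers to a UW")

    def add(self, xlexie):
        if xlexie.kind != self.kind:
            raise ValueError("kind mismatch between x-lexie and x-axie")
        self.members.append(xlexie)

    def in_language(self, lang):
        return [m.text for m in self.members if m.lang == lang]


ex_axie = XAxie("ex")
ex_axie.add(XLexie("ex", "fra", "Il utilise souvent des cartes IGN"))
ex_axie.add(XLexie("ex", "eng", "He often uses AA maps"))
print(ex_axie.in_language("eng"))
```

Keeping the translated example in its own x-lexie leaves the normal lexies monolingual and the normal axies as pure links, which is the constraint stated on slide 8.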

11/24 A « Montaigne » environment for Human Translation
Idea: let users SHARE translation memory & tools on a server
Specs in 1995 around Eurolang Optimizer™
- no funding, although « Francophony » was interested…
Internet version: see (OKI)
First version built for Lao: see (V. Berment)
Future:
- bilingual editor as applet
- use of Papillon server architecture (private spaces etc.)

12/24 Scenario & possible GUI
Typical layout of a bilingual editor in a TSS:
……
source segment N-2 | translated segment (done)
source segment N-1 | translated segment (done)
source segment N | translated segment (currently being created)
source segment N+1
source segment N+2
Shown for the current segment: suggestion(s) from the TM, dictionary suggestions
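The "suggestion(s) from the TM" pane in the layout above could be fed by a fuzzy lookup in a shared translation memory. The sketch below uses Python's standard `difflib.SequenceMatcher` as the similarity measure. That choice, the function name `tm_suggestions`, the 0.7 threshold and the second memory entry (whose English side is an illustrative translation, not from the slides) are all assumptions, not the Montaigne design.

```python
from difflib import SequenceMatcher


def tm_suggestions(segment, memory, threshold=0.7):
    """Return (score, source, target) triples for stored segments similar
    to `segment`, best match first."""
    scored = []
    for source, target in memory:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold:
            scored.append((score, source, target))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored


# Toy translation memory: (source segment, translated segment) pairs.
memory = [
    ("Il utilise souvent des cartes IGN", "He often uses AA maps"),
    ("La mésentente pourrait être le mobile du meurtre.",
     "Disagreement could be the motive for the murder."),
]

for score, source, target in tm_suggestions("Il utilise souvent des cartes IGN", memory):
    print(f"{score:.2f}  {source}  ->  {target}")
```

A production translation memory would normally match on words or segments rather than characters. Character-level similarity is used here only because it is available in the standard library.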

13/24 Design & implementation issues
- Peer-to-peer architecture Papillon ↔ Montaigne
- Possibility to modify input text & segmentation
- Integrate with private lexicon(s)
- Open to plug-ins (voice input…)

14/24 Automating translation using UNL
UNL =
- a project
- a language to represent NL utterance meanings
- a format for multilingual documents (html → xml)
Elements of the UNL language
- UWs:
- Relations: agt, aoj, mod, obj, tim…
- (Hyper)graphs: each subgraph is connected & has an entry node
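A UNL graph as described above can be modelled as a set of labelled arcs between UWs plus a designated entry node. The sketch below is a toy model under stated assumptions: the class `UNLGraph` and its methods are hypothetical, the node names are simplified words rather than real UWs with restrictions, and the particular arcs are only a plausible reading of the sentence used later in the slides. The relation labels agt, obj, mod and tim are the ones listed on the slide.

```python
from collections import defaultdict


class UNLGraph:
    """Toy UNL graph: labelled arcs between nodes, with one entry node."""

    def __init__(self, entry):
        self.entry = entry
        self.nodes = {entry}
        self.arcs = set()  # (relation, source, target) triples

    def add(self, relation, source, target):
        self.arcs.add((relation, source, target))
        self.nodes.update((source, target))

    def is_connected(self):
        """True if every node is reachable from the entry node,
        ignoring arc direction."""
        neighbours = defaultdict(set)
        for _, source, target in self.arcs:
            neighbours[source].add(target)
            neighbours[target].add(source)
        seen, stack = {self.entry}, [self.entry]
        while stack:
            for nxt in neighbours[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen == self.nodes


g = UNLGraph(entry="retrieve")
g.add("agt", "retrieve", "city")
g.add("obj", "retrieve", "area")
g.add("mod", "area", "coastal")
g.add("tim", "retrieve", "forum")
print(g.is_connected())
```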

15/24 A simple UNL input graph

16/24 Possible interactive disambiguation at analysis time

17/24 Interactive disambiguation (2)
- gives a correct unique multilevel concrete (UMC) tree
- then a correct unique multilevel abstract (UMA) tree
- and finally a correct UNL graph

18/24 Possible text-graph « coedition » at reading time
Applicable if there is a UNL graph associated with a segment one wants to modify
Goal: share the revisions across languages, by reflecting them on the UNL graph
Ex: FB2204 (Forum Barcelona 2004)
« Une cité retrouvera une zone côtière après un forum »
- add on the nodes for "city", "forum"
- transform "forum" into "Forum"
- replace "retrieve" by "recover"
- add on the node containing it.
« La cité récupérera une zone côtière après le Forum »

22/24 Principles of coedition (1)
It is impossible in principle to deduce the modification on the graph from a modification on the text
- For example, replacing "un" ("a") by "le" ("the") does not entail that the following noun is determined, because it can also be generic: "il aime la montagne" = "he likes mountains"
Revision is not done by modifying the text directly, but by using a menu system
- Menu items have a "language side" and a hidden "UNL side"

23/24 Principles of coedition (2)
When a menu item is chosen, only the graph is transformed; the action to be done on the text is delayed and shown
At any time, the new graph may be deconverted
- If it is satisfactory, that shows that errors were due to the graph and not to the deconverter, and the graph may be sent to deconverters in other languages
Versions in some other languages known by the user may be displayed, so that improvement sharing is visible and encouraging
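The coedition principle (a menu item with a visible "language side" and a hidden "UNL side", applied to the graph only) can be sketched as follows. Everything here is illustrative: the names `MenuItem`, `add_attribute` and `choose`, the toy graph model mapping node names to attribute sets, and the attribute `@def` are assumptions, since the slides do not show which attributes are added. Deconversion is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

Graph = Dict[str, Set[str]]  # node name -> set of attributes (toy model)


@dataclass
class MenuItem:
    label: str                           # "language side", shown to the user
    unl_action: Callable[[Graph], None]  # hidden "UNL side"


def add_attribute(node: str, attribute: str) -> Callable[[Graph], None]:
    """Build a graph transformation that adds `attribute` to `node`."""
    def action(graph: Graph) -> None:
        graph[node].add(attribute)
    return action


def choose(item: MenuItem, graph: Graph, pending: List[str]) -> None:
    """Apply the chosen item to the graph only; the text action is delayed
    and merely recorded so it can be shown to the user."""
    item.unl_action(graph)
    pending.append(item.label)


graph: Graph = {"city": set(), "forum": set()}
text = "Une cité retrouvera une zone côtière après un forum"
pending: List[str] = []

choose(MenuItem('make "cité" definite', add_attribute("city", "@def")), graph, pending)
print(graph["city"], pending)
print(text)  # unchanged until the revised graph is deconverted
```

The text is never edited directly in this sketch. A new surface sentence would only come from deconverting the revised graph, which is what lets the same revision reach other languages.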

24/24 Conclusion
Need for a translation task (#7) in Papillon
Seamless integration with x-lexies + x-axies
Possible combination of TA & MT
- Mutualization spirit (Papillon, Montaigne) for TA
- Use of UNL (2 « pivot » architectures) for MT
Mutualization again in MT part (humans involved)
- Interactive disambiguation
- Coedition text ↔ UNL graph
… & of course lexical data contribution through Papillon!