1
L10N Standards Warszawa 2014
2
Why Standards?
3
Why have Standards?
4
L10N Standards
What are we going to cover:
1. Why L10N standards are important
2. The role XML has to play
3. Key L10N data standards
4. How to leverage L10N standards
5. Creating a totally data-driven automated L10N process
6. Interoperability
5
Why have Standards?
6
Current State of Art
7
L10N Typical Workflow
8
What you need is a better crane!???
9
Localization without Standards
[Workflow diagram. Labels: customer source text; extract; extracted text; TM process; prepared text; translate; translated text; merge; target text; QA]
10
True Cost of Translation
11
Standards = Uniform Data
12
ISO Standard
13
Standards = Efficiency
14
Standards = Lower Costs
15
Standards = Safe to Implement
16
Standards = Greater Interoperability
17
Standards: Unforeseen Benefits
19
Standards: Misuse
20
Standards: Abuse
21
Standards: Sabotage
Sabotaged standards:
– Proprietary extensions
– Bad implementations
23
The importance of XML
Everything is now XML:
– HTML/XHTML
– Web Services
– Adobe FrameMaker
– Microsoft Office
– OpenOffice
– ASP
– XAML
– Java Properties
– DITA
Standards: TMX, XLIFF, SRX, GMX, TBX, xml:tm
OAXAL: Open Architecture for XML Authoring and Localization
24
The power of XML
Any electronic format not in XML can be converted to XML:
– FrameMaker
– RTF
– Microsoft Office pre-2007
– QuarkXPress
– Windows resource files
– Java resources
– PO/POT
– YAML
– etc.
And then converted back into the original format
25
Benefits of XML for L10N
Separation of form and content
Should make documents easier to translate
There are some critical design decisions
Mistakes can hinder translatability
XML can bootstrap its own localization
26
The significance of XML
XML is not just another electronic format
XML is an eXtensible syntax
XML is a formal IT grammar
XML is programmable
XML can bootstrap its own localization
27
Benefits of XML for L10N
Why use XML for Localization?
– Most localizable documents are now in XML
– One input format
– Elegant
– Uses the latest IT technology
– Separation of source and content
– One single data bus
– Open Standards based
– You can use XML to assist its own localization
– One extraction + TM + SMT engine
28
Core L10N Standards
W3C ITS Document Rules
ETSI LIS SRX
ETSI LIS xml:tm
ETSI LIS TMX
ETSI LIS TBX
ETSI LIS GMX
OASIS XLIFF
W3C/OASIS DITA (XHTML, DocBook, or any XML Vocabulary)
Linport Interoperability: TIPP, XLIFF:doc
29
ITS
Internationalization and Localization Tag Set – http://www.w3.org/International/its
Internationalization Tag Set – Document Rules for a given XML vocabulary:
– Inline elements (within text)
– Sub flows
– Non-translatable
– Translatable attributes
Guidelines for localizing XML documents
Internationalization and Localization Markup Requirements
Version 1.0, 2008
Version 2.0, 2013
30
TMX
http://www.etsi.org/deliver/etsi_gs/lis/001_099/002/01.04.02_60/gs_lis002v010402p.pdf
Translation Memory Exchange
Current version 1.4b; 2.0 undergoing review
Allows for the interchange of translation memories between different vendor systems
– No translation vendor lock-in
– Free exchange of translation assets
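To make the interchange idea concrete, here is a minimal sketch of reading translation units out of a TMX 1.4 document with Python's standard library. The sample segments, the tool name in the header, and the helper name `read_tmx` are invented for illustration and are not taken from the presentation.

```python
import xml.etree.ElementTree as ET

# Illustrative TMX 1.4 document; header values and segment text are made up.
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="example-tool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="example" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Hello world.</seg></tuv>
      <tuv xml:lang="pl"><seg>Witaj świecie.</seg></tuv>
    </tu>
  </body>
</tmx>"""

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx(text):
    """Return one dict per <tu>, mapping language code to segment text."""
    root = ET.fromstring(text)
    units = []
    for tu in root.iter("tu"):
        units.append({tuv.get(XML_LANG): tuv.findtext("seg")
                      for tuv in tu.findall("tuv")})
    return units


units = read_tmx(TMX_SAMPLE)
```

Because the format is plain XML, any tool that can read it this way can import a memory produced by another vendor's system, which is the lock-in point the slide makes.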
31
TMX History
First LISA OSCAR Standard
– Version 1.1 1998
– Version 1.2 1999
– Version 1.3 2001
– Version 1.4b 2002
Moved to ETSI/LIS 2012
– Version 2.0 2014?
Two levels of implementation:
– Level 1 (Plain Text Only)
– Level 2 (Content Markup)
32
SRX
http://www.gala-global.org/oscarStandards/srx/srx20.html
Segmentation Rules Exchange
Current version 2.0, 2008
Defines how sentences are segmented
Allows for the exchange of segmentation rules using regular expressions
Complements the TMX standard
Quoted by XLIFF, TMX and xml:tm
33
SRX Key Concepts
Unicode regular expression syntax defined
Meta characters
– Unicode regular expressions: "\X", "\s", "\S" etc.
Operators
– "*", "|", "?", "+" etc.
Defines:
– Language rules: segmentation rules
– Map rules: how to apply the segmentation rules
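The idea of regex-based segmentation rules can be sketched as follows. This assumes the usual SRX shape of a rule, a break/no-break flag plus a "before break" and an "after break" expression, with the first matching rule deciding each position. The rules themselves and the function name `segment` are illustrative, and Python's `re` syntax is used as a stand-in: it is not the Unicode regular expression syntax the slide refers to (for example, it has no `\X`).

```python
import re

# Each rule is (is_break, before_pattern, after_pattern).
# The rule set is illustrative only, not taken from any published SRX file.
RULES = [
    (False, r"\b(?:Mr|Mrs|Dr|etc)\.", r"\s"),  # exception: no break after these abbreviations
    (True, r"[.?!]+", r"\s+"),                 # break after sentence-final punctuation
]


def segment(text, rules=RULES):
    """Split text at every position where the first matching rule is a break rule."""
    breaks = []
    for pos in range(1, len(text)):
        for is_break, before, after in rules:
            before_ok = re.search("(?:" + before + r")\Z", text[:pos])
            after_ok = re.match(after, text[pos:])
            if before_ok and after_ok:
                if is_break:
                    breaks.append(pos)
                break  # the first matching rule decides this position
    segments, start = [], 0
    for pos in breaks:
        segments.append(text[start:pos])
        start = pos
    segments.append(text[start:])
    return [s.strip() for s in segments if s.strip()]


result = segment("Dr. Smith arrived. He was late! Why?")
```

Listing the exception rule before the general break rule is what keeps "Dr." from ending a segment; reordering the rules changes the result.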
34
GMX
http://docbox.etsi.org/ISG/Open/ISGLIS/GMX-V/GMX-V/GMX-V-2.0.html
Global Information Management Metrics eXchange
GMX/V approved as a LISA OSCAR Standard, February 2007
Tripartite
– GMX-V: Volume, published for public comment
– GMX-C: Complexity, initial specification
– GMX-Q: Quality
Standard for defining a L10N job
Allows for quantifying job complexity
GMX/V 2.0 approved by ETSI LIS
– added support for CJK word counts
– overall character count including white space characters
35
GMX-V
GIM Metrics eXchange – Volume
Objectives:
– Unambiguous and verifiable definition of word and character counts
– A method of exchanging counts within an XML framework
Two types of count:
– Verifiable, based on electronic documents
– Non-verifiable
Canonical form: XLIFF based
Word boundaries: Unicode TR29
Unicode character encoding
Minimum conformance
– Total Character Count
– Total Word Count
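As a rough illustration of what a volume count involves, the sketch below computes a word count and two character counts for a piece of text. It is an approximation under stated assumptions: the regular expression stands in for the Unicode TR29 word-boundary rules, which Python's standard library does not implement, and the function name and the keys of the returned dictionary are hypothetical rather than GMX-V terms.

```python
import re

# Approximation only: \w+ runs (optionally joined by an apostrophe or hyphen)
# stand in for Unicode TR29 word boundaries.
WORD_RE = re.compile(r"\w+(?:['’-]\w+)*")


def count_volume(text):
    """Return simple word and character counts for a string."""
    return {
        "words": len(WORD_RE.findall(text)),
        # Characters excluding white space.
        "characters": sum(1 for ch in text if not ch.isspace()),
        # Characters including white space.
        "characters_with_whitespace": len(text),
    }


counts = count_volume("Don't panic, it works.")
```

Two tools only report the same numbers for the same file if they agree on rules like these, which is why the standard pins down the definitions.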
36
XLIFF
http://www.oasis-open.org/committees/xliff
XLIFF – XML Localization Interchange File Format
Current status
– XLIFF 1.1 Committee Specification (31 Oct 2003)
– XLIFF 1.2 Approved as an OASIS Standard 2008
  Segmentation support
  (X)HTML XLIFF 1.1 Representation Guide
  PO / POT XLIFF 1.1 Representation Guide
  Java / Windows / .Net Representation Guide
– XLIFF 2.0 currently out for public comment (not backwards compatible)
37
XLIFF
38
XLIFF
Single format for exchanging L10N data from disparate sources
Lossless
Tool-neutral
Formalized as an XML vocabulary
Can embed skeleton file
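A minimal sketch of what "formalized as an XML vocabulary" means in practice: the function below builds a small XLIFF 1.2 file from source/target pairs using Python's standard library. The function name `build_xliff`, the file name and the sample segment are illustrative assumptions; a real file would usually also carry inline markup and a skeleton reference.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def build_xliff(original, src_lang, tgt_lang, segments):
    """Build a minimal XLIFF 1.2 document from (source, target) text pairs."""
    ET.register_namespace("", XLIFF_NS)  # serialize XLIFF as the default namespace
    root = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(root, f"{{{XLIFF_NS}}}file", {
        "original": original,
        "source-language": src_lang,
        "target-language": tgt_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for number, (source, target) in enumerate(segments, start=1):
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": str(number)})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = source
        ET.SubElement(unit, f"{{{XLIFF_NS}}}target").text = target
    return ET.tostring(root, encoding="unicode")


xliff_text = build_xliff("hello.txt", "en", "pl", [("Hello", "Cześć")])
```

Whatever the original format was, the translation tool only ever sees `trans-unit` elements like these, which is what makes the format tool-neutral.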
39
xml:tm
http://www.xtm-intl.com/manuals/xml-tm/xml-tm2.0.html
XML based Text Memory
– Radical rethink of how to handle Translation Memory
– Donated by XML INTL to LISA OSCAR
– OSCAR Standard Feb 2007
– Adopted by ETSI LIS, version 2.0 ready for adoption
Takes the DITA reuse principle down to sentence level
– Author Memory
– Translation Memory
40
xml:tm - Namespace
Namespace is a major feature of XML
Allows the mapping of different ontological entities onto the same representation
Allows different ways to look at the same data
Namespaces can be made transparent
41
xml:tm
XML based text memory
Revolutionary approach to translating XML documents
First significant advance in translation memory technology
Uses XML namespace to transparently embed contextual information
The one ring that binds them all
42
xml:tm namespace
Example of the use of tm namespace in an XML document:
Namespace is very flexible. It is very easy to use.
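The markup of the slide's example did not survive in this transcript, so the following is a hedged reconstruction rather than the author's original. It assumes the tm namespace URI is `urn:xmlintl-tm-tags`, wraps the two sample sentences in `te` and `tu` elements, and shows that the same document can be read either through the tm namespace (as identified text units) or as ordinary text. The `id` values and function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

TM_NS = "urn:xmlintl-tm-tags"  # assumed namespace URI for xml:tm

# Hypothetical document: one paragraph whose two sentences are wrapped in
# tm:tu elements inside a tm:te element. The ids are made up.
DOC = (
    '<doc xmlns:tm="urn:xmlintl-tm-tags"><para>'
    '<tm:te id="e1">'
    '<tm:tu id="u1.1">Namespace is very flexible.</tm:tu> '
    '<tm:tu id="u1.2">It is very easy to use.</tm:tu>'
    '</tm:te></para></doc>'
)


def text_units(xml_text):
    """tm namespace view: every text unit with its identifier."""
    root = ET.fromstring(xml_text)
    return [(tu.get("id"), "".join(tu.itertext()))
            for tu in root.iter(f"{{{TM_NS}}}tu")]


def plain_text(xml_text):
    """Source document view: the text with the tm markup ignored."""
    root = ET.fromstring(xml_text)
    return " ".join("".join(root.itertext()).split())


tm_view = text_units(DOC)
doc_view = plain_text(DOC)
```

A consumer that does not know the tm namespace can skip those elements and still see the original paragraph, which is the sense in which the namespace is transparent.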
43
xml:tm namespace
[Diagram: two views of the same source document.
Source document view: doc, title, section, para, text
tm namespace view: tm, te, tu, sentence]
44
xml:tm
Text Memory
Author memory
– Maintain memory of source text
– Authoring statistics
– Authoring tool input
Translation memory
– Automatic alignment
– Maintain perfect link of source and target text
– Reduce translation costs
45
xml:tm DOM differencing
[Diagram: DOM Differencing
Original Source Document: tu id="1", tu id="2", tu id="3", tu id="4", tu id="5", tu id="6"
Updated Source Document: tu id="1", tu id="2", tu id="3", tu id="4", tu id="7", tu id="6", tu id="8"
Labels: deleted, modified, new]
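The bookkeeping behind the diagram can be sketched as below: unchanged text units keep their identifiers, while modified or new units receive fresh ones. This is a simplification, not the xml:tm algorithm itself: it compares a flat list of unit texts with `difflib.SequenceMatcher` instead of differencing two DOM trees, and the function name and sample texts are invented.

```python
from difflib import SequenceMatcher


def diff_units(original, updated_texts):
    """Carry text-unit ids from an original document over to an updated one.

    original: list of (id, text) pairs; updated_texts: list of strings.
    Returns (updated list of (id, text) pairs, ids no longer present).
    """
    old_texts = [text for _, text in original]
    next_id = max((uid for uid, _ in original), default=0) + 1
    result, removed = [], []
    matcher = SequenceMatcher(a=old_texts, b=updated_texts, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.extend(original[i1:i2])  # unchanged units keep their ids
        else:
            # Replaced or deleted units drop out; changed or inserted text gets new ids.
            removed.extend(uid for uid, _ in original[i1:i2])
            for text in updated_texts[j1:j2]:
                result.append((next_id, text))
                next_id += 1
    return result, removed


original = [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (5, "E"), (6, "F")]
updated, removed = diff_units(original, ["A", "B", "C", "D", "E changed", "F", "G"])
```

With this sample input the updated ids come out as 1, 2, 3, 4, 7, 6, 8, the same sequence as in the diagram: only the units with new ids need to go to translation, while the rest keep their existing links.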
46
xml:tm translated document in Polish
[Diagram: two views of the same translated document.
Translated document view: doc, title, section, para, tekst ("text")
tm namespace view: tm, te, tu, zdanie ("sentence")]
47
Putting It All Together
48
Open Architecture for XML Authoring and Localization (OAXAL)
– http://wiki.oasis-open.org/oaxal/FrontPage
49
OAXAL 2.0
51
OAXAL Benefits
SOA (Service Oriented Architecture)
Open Architecture
Open Standards - Open APIs
Easy Exchange
Modular design
Interoperability
Very high level of automation
52
Interoperability Now!/Linport
Interoperability Now! http://www.interoperability-now.org/
Born out of frustration and necessity
Early 2012 members:
– Bioloom Group
– Kilgray
– Medtronic
– Ontram
– Spartan Software
– XTM-INTL
The goal: true 100% roundtrip interoperability between TMS/CAT tools
Now part of Linport
53
Interoperability Now!/Linport
Linport http://www.linport.org/
LINPort: Language INteroperability Portfolio
Created in 2012 by the merging of two initiatives:
– Multilingual Electronic Dossier
– The Container Project
Sponsored by:
– the European Union DG Translation
– JIAMCATT (http://jiamcatt.org/): Joint Inter-Agency Meeting on Computer-Assisted Translation and Terminology
54
OAXAL in Action
55
Translating English Soccer Articles into Arabic 24x7
57
Browser-Based Workbench
58
OAXAL In Action
59
Contact details: Andrzej Zydroń azydron@xtm-intl.com http://www.xtm-intl.com