Published by Richard Ward. Modified over 9 years ago.
1
From Code to XLIFF: Bridging the Chasm
Dr. Stephen Flinter, Connect Global Solutions
LRC Conference – 19 November 2003
2
Agenda
The XLIFF Transformation Problem
Current approaches
Grammar-based approach – XPG
XPG & XML
Summary
3
The Problem
The XLIFF Transformation Problem
Current approaches
Grammar-based approach – XPG
XPG & XML
Summary
4
The Problem
XLIFF has made the representation of resources translation- and localisation-friendly
Converting existing files to XLIFF is non-trivial
Adding new file formats can be painful
5
XLIFF Transformation
Definition: XLIFF transformation is the process by which files in native formats are transformed into XLIFF and, after translation, from XLIFF back into their native formats.
File formats include: Java, .properties, XML, HTML, custom.
6
Architecture
7
.com Business Model
Parody of the .com business model that has been floating around the web:
– Get lots of users
– ???
– Profit
8
XLIFF Transformation Model
The XLIFF transformation model could be described in similar terms:
– Native file format
– ???
– XLIFF
9
Architecture
10
Current Approaches
The XLIFF Transformation Problem
Current approaches
Grammar-based approach – XPG
XPG & XML
Summary
11
Current Approaches to XLIFF
Use XLIFF as native format
Use commercial tools
Use regular expressions & scripts
12
XLIFF as Native Format
Use XLIFF from software development onwards
No transformation required
Preferred approach in the long term
13
Disadvantages
Requires significant changes to the software development process
How to handle legacy resources?
– Back to the original problem
14
Commercial Tools
Tool support for XLIFF is improving all the time
Advantage: support and expertise from the tool developer
15
Disadvantages
However, many tools still only read XLIFF and won’t generate XLIFF from native formats
Won’t necessarily support all required formats
Can be difficult to identify inline tags
16
Scripts and Regular Expressions
Use a scripting language (e.g. Perl, Python, WordBasic)
Encode rules to extract translatable resources using regular expressions
17
Examples
String                     Regular Expression
"Translatable text"        /"([^"]*)"/
id1 = Translatable text    /.* = (.*)/
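As an illustration of the script-based approach, the two regular expressions above can be applied with java.util.regex. This is a minimal sketch: the class and method names (`RegexExtractor`, `extract`) are hypothetical and do not come from any tool mentioned in the slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class RegexExtractor {
    // Slide pattern /"([^"]*)"/ : the text between two double quotes.
    static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");
    // Slide pattern /.* = (.*)/ : the value part of an "id = text" line.
    static final Pattern KEY_VALUE = Pattern.compile(".* = (.*)");

    // Returns capture group 1 of every match of the pattern in the input.
    static List<String> extract(Pattern pattern, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(extract(QUOTED, "\"Translatable text\""));
        System.out.println(extract(KEY_VALUE, "id1 = Translatable text"));
    }
}
```

Each new file format needs its own set of patterns like these, which is the maintenance cost the following slides describe.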
18
Advantages
Superficially simple to develop
Plenty of powerful RE languages (especially Perl) available
Full control and ownership of how the formats are managed
19
Disadvantages
Error-prone: difficult to cover all situations
Removing all errors often means adding many parsing rules
Has to be redone for every new file type
REs have to change for inline tags
20
Other Examples
print("First string");
print("Second" + " string");
print("Third \"string\"");
print("Fourth {0} string");
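To see why these cases are awkward, the following sketch runs the simple quoted-string pattern /"([^"]*)"/ over the four statements above. The class name `NaiveRegexDemo` is hypothetical, and the pattern is assumed to be the one from the earlier Examples slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class for illustration only.
class NaiveRegexDemo {
    // Naive pattern: everything between two double quotes, escapes ignored.
    static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    // Returns the text captured by every match of the naive pattern.
    static List<String> extract(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = QUOTED.matcher(line);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String[] lines = {
            "print(\"First string\");",
            "print(\"Second\" + \" string\");",
            "print(\"Third \\\"string\\\"\");",
            "print(\"Fourth {0} string\");"
        };
        for (String line : lines) {
            System.out.println(line + " -> " + extract(line));
        }
    }
}
```

The second statement comes back as two separate fragments, and the third is cut at the first escaped quote, so the translator would never see the full sentence. The fourth is extracted whole, but the {0} placeholder is left unmarked.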
21
Summary
This approach is doomed to failure because of the disconnect between the grammar of the language and the regular expressions used to identify strings.
22
Grammar-Based Approach
The XLIFF Transformation Problem
Current approaches
Grammar-based approach – XPG
XPG & XML
Summary
23
A New Approach
With this approach, we look at the language grammar (EBNF)
Identify grammar productions that can hold translatable text
Generate a parser that accepts instances of the grammar and emits XLIFF
24
Grammar-based Architecture
26
Architecture
New component: XLIFF parser generator (XPG)
Accepts a JavaCC grammar
Allows one or more productions to be marked as translatable
Generates the “extract” and “merge” programs
27
JavaCC
JavaCC: Java Compiler Compiler
Modelled after lex & yacc
Works on EBNF-type grammars written as JavaCC .jj files
JavaCC grammars are available for most modern programming languages
28
Big Win
Direct, one-to-one correspondence between the grammar and the mechanism for identifying strings.
29
Advantages
Consistent high quality
– Guaranteed to work in every case, for all instances of the grammar
Painless
– No scripting or regular expressions required
– Extractor and merger generated automatically
Fast
– Just need to identify the strings in the grammar
30
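Example
Extract from Java BNF
<literal> ::= <string literal> | <boolean literal> | <null literal>
<string literal> ::= " <string characters>? "
<string characters> ::= <string character> | <string characters> <string character>
<string character> ::= <input character> except " and \ | <escape character>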
31
JavaCC Extract
void Literal() : {}
{
  <STRING_LITERAL>
| BooleanLiteral()
| NullLiteral()
}
32
< STRING_LITERAL:
    "\""
    (   (~["\"","\\","\n","\r"])
      | ("\\"
          ( ["n","t","b","r","f","\\","'","\""]
          | ["0"-"7"] ( ["0"-"7"] )?
          | ["0"-"3"] ["0"-"7"] ["0"-"7"]
          )
        )
    )*
    "\""
>
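To make the token definition above concrete, it can be transcribed almost mechanically into a java.util.regex pattern. This is only a sketch of what the definition accepts, not how JavaCC works internally. The class name `TokenRegex` and its fields are hypothetical. Unlike a generated lexer, a bare pattern still knows nothing about comments or character literals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class: a regex transcription of the STRING_LITERAL token.
class TokenRegex {
    // Any character except double quote, backslash, newline, carriage return.
    static final String PLAIN = "[^\"\\\\\\n\\r]";
    // A backslash followed by a simple escape or an octal escape.
    static final String ESCAPE =
        "\\\\(?:[ntbrf\\\\'\"]|[0-7][0-7]?|[0-3][0-7][0-7])";
    // Opening quote, any mix of plain characters and escapes, closing quote.
    static final Pattern STRING_LITERAL =
        Pattern.compile("\"(?:" + PLAIN + "|" + ESCAPE + ")*\"");

    // Returns the content of each literal with the surrounding quotes removed.
    static List<String> literals(String source) {
        List<String> found = new ArrayList<>();
        Matcher m = STRING_LITERAL.matcher(source);
        while (m.find()) {
            String image = m.group();
            found.add(image.substring(1, image.length() - 1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(literals("print(\"Third \\\"string\\\"\");"));
    }
}
```

Because the escape alternatives come straight from the token definition, a literal containing \" is matched as a single string rather than being cut at the escaped quote.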
33
Identifying
We identify the <STRING_LITERAL> token as a language item that may contain strings
XPG then generates a new grammar, which compiles to the extractor
The extractor then generates XLIFF
34
Modified JavaCC Grammar void Literal() : {} { | StringLiteral() | BooleanLiteral() | NullLiteral() }
35
StringLiteral()
void StringLiteral() :
{
  Token t;
}
{
  t = <STRING_LITERAL>
  {
    // Strip the surrounding quotes from the matched token text
    String s = t.image.substring(1, t.image.length() - 1);
    pw.println(" ");
    pw.println(" " + s + " ");
    pw.println(" ");
  }
}
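As a rough illustration of what such an action might emit, the following Java sketch wraps an extracted string in XLIFF trans-unit and source elements. The element layout, the numeric id scheme and the names `XliffWriter`, `transUnit` and `escapeXml` are assumptions made for this example. They are not taken from XPG.

```java
// Hypothetical helper: formats one extracted string as an XLIFF trans-unit.
class XliffWriter {
    // Escapes the characters that are special in XML text content.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    // Builds a trans-unit with a source element; the id scheme is assumed.
    static String transUnit(int id, String text) {
        return "<trans-unit id=\"" + id + "\">\n"
             + "  <source>" + escapeXml(text) + "</source>\n"
             + "</trans-unit>";
    }

    public static void main(String[] args) {
        System.out.println(transUnit(1, "First string"));
    }
}
```

Escaping before output matters because source strings can contain characters such as & or <, which would otherwise produce malformed XLIFF.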
36
Other XPG Tasks
Create XLIFF surrounding tags
Create skeleton file
Embed code for handling inline tags
37
Inline Tags
Example:
– “Click on the {0} button to start the {1} job”
The {0} and {1} constitute inline tags
Not part of the grammar itself
Can vary from application to application
We must be able to extract these based on regular expressions:
– {[0-9]+}
38
XPG and Inline Tags
Embeds code to read a set of regular expressions from a file
When the extractor identifies a string:
– Executes RE on string
– Moves matches to XLIFF inline tag
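The inline-tag step can be sketched in Java as follows. The example assumes that matches are wrapped in XLIFF ph (placeholder) elements with sequential ids, and the names `InlineTagger` and `tagInline` are hypothetical. Reading the expressions from a file is omitted, so the pattern is passed in directly. The braces are escaped because java.util.regex rejects a bare { at that position.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: wraps regex matches in XLIFF <ph> inline elements.
class InlineTagger {
    // Replaces each match of the pattern with <ph id="n">match</ph>,
    // numbering the placeholders from 1 in order of appearance.
    static String tagInline(String text, Pattern inline) {
        Matcher m = inline.matcher(text);
        StringBuffer out = new StringBuffer();
        int id = 0;
        while (m.find()) {
            id++;
            String ph = "<ph id=\"" + id + "\">" + m.group() + "</ph>";
            // quoteReplacement keeps '$' and '\' in the match literal.
            m.appendReplacement(out, Matcher.quoteReplacement(ph));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Pattern placeholders = Pattern.compile("\\{[0-9]+\\}");
        System.out.println(tagInline(
            "Click on the {0} button to start the {1} job", placeholders));
    }
}
```

Keeping the pattern outside the generated extractor is what lets the same grammar serve applications with different placeholder conventions.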
39
Final Architecture
40
XPG & XML
The XLIFF Transformation Problem
Current approaches
Grammar-based approach – XPG
XPG & XML
Summary
41
XPG and XML Applications
A similar approach can be applied to XML Schemas
Uses XSLT & DOM rather than JavaCC
Can identify XML tags and attributes that may contain text
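For the XML case, a minimal DOM-based sketch might look like the following. It uses only the standard javax.xml.parsers and org.w3c.dom APIs and collects the text of every element with a given tag name. The class name `XmlTextExtractor`, the sample document and the choice of a single tag name as the "translatable" marker are assumptions. The XSLT and schema-driven parts of the approach are not shown.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper: pulls translatable text out of chosen XML elements.
class XmlTextExtractor {
    // Returns the text content of every element named tagName.
    static List<String> extract(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tagName);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent());
            }
            return texts;
        } catch (Exception e) {
            // Parsing problems are reported as unchecked exceptions here.
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<menu><label>Open</label><label>Save</label>"
                   + "<icon>open.png</icon></menu>";
        System.out.println(extract(xml, "label"));
    }
}
```

Here only the label elements are treated as translatable, so the icon file name is left out of the extracted strings.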
42
Summary
XPG is an approach to XLIFF transformation that corresponds to the grammar of the language being transformed.
This ensures consistent, error-free and rapid XLIFF transformation.
The XPG approach is suitable for computer languages and markup.