From Code to XLIFF Bridging the Chasm Dr. Stephen Flinter Connect Global Solutions LRC Conference – 19 November 2003.


1 From Code to XLIFF Bridging the Chasm Dr. Stephen Flinter Connect Global Solutions LRC Conference – 19 November 2003

2 Agenda The XLIFF Transformation Problem Current approaches Grammar based approach – XPG XPG & XML Summary

3 The Problem The XLIFF Transformation Problem Current approaches Grammar based approach – XPG XPG & XML Summary

4 The Problem XLIFF has made the representation of resources translation/localisation friendly Non-trivial to convert existing files to XLIFF Adding new file formats can be painful

5 XLIFF Transformation Definition: XLIFF Transformation is the process by which native file formats are transformed into XLIFF, and from XLIFF back to their native format (after translation). File formats include: Java, .properties, XML, HTML, custom.

6 Architecture

7 .com Business Model Parody of the .com business model that has been floating around the web: –Get lots of users –??? –Profit

8 XLIFF Transformation Model The XLIFF transformation model could be described in similar terms: –Native file format –??? –XLIFF

9 Architecture

10 Current Approaches The XLIFF Transformation Problem Current approaches Grammar based approach – XPG XPG & XML Summary

11 Current Approaches to XLIFF Use XLIFF as native format Use commercial tools Use regular expressions & scripts

12 XLIFF as Native Format Use XLIFF from software development onwards No transformation required Preferred approach in the long term

13 Disadvantages Requires significant changes to the software development process How to handle legacy resources? –Back to the original problem

14 Commercial Tools Tool support for XLIFF is improving all the time. Advantages of support and expertise of tool developer.

15 Disadvantages However, many tools still only read XLIFF, and won’t generate XLIFF from native formats Won’t necessarily support all formats required Can be difficult to identify in-line tags

16 Scripts and Regular Expressions Use a scripting language (e.g. Perl, Python, WordBasic) Encode rules to extract translatable resources using regular expressions

17 Examples String: "Translatable text"; Regular Expression: /"([^"]*)"/. String: id1 = Translatable text; Regular Expression: /.* = (.*)/.
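
As a rough illustration of the script-based approach, the two rules on slide 17 can be written as Java regular expressions. This is only a sketch: the class and method names (`RegexExtractor`, `extract`) are made up for this example and do not come from the presentation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any tool named in the talk.
final class RegexExtractor {
    // Slide 17's two rules, written as Java regular expressions.
    static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");
    static final Pattern KEY_VALUE = Pattern.compile(".* = (.*)");

    /** Returns capture group 1 of every match of the rule in the line. */
    static List<String> extract(Pattern rule, String line) {
        List<String> found = new ArrayList<>();
        Matcher m = rule.matcher(line);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(extract(QUOTED, "print(\"Translatable text\");"));
        System.out.println(extract(KEY_VALUE, "id1 = Translatable text"));
    }
}
```

Both calls in `main` return a single entry, `Translatable text`, which is why the approach looks simple at first.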

18 Advantages Superficially simple to develop Plenty of powerful RE languages (especially Perl) available Full control and ownership of how the formats are managed

19 Disadvantages Error-prone – difficult to cover all situations To remove all errors, often have to add many parsing rules Has to be redone for every new file type REs have to change for inline tags

20 Other Examples print("First string"); print("Second" + " string"); print("Third \"string\""); print("Fourth {0} string");
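
To see why these four lines are awkward for a script, the sketch below runs the simple quoted-string rule from slide 17 over each of them. The class name `NaiveQuoteRule` is hypothetical, and the rule is deliberately the naive one, not a recommended pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demonstration class; shows the naive rule, not a fix for it.
final class NaiveQuoteRule {
    // The quoted-string rule from slide 17; it knows nothing about escapes or operators.
    static final Pattern QUOTED = Pattern.compile("\"([^\"]*)\"");

    /** Returns the text between each pair of double quotes the rule finds. */
    static List<String> extract(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = QUOTED.matcher(line);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        String[] lines = {
            "print(\"First string\");",
            "print(\"Second\" + \" string\");",
            "print(\"Third \\\"string\\\"\");",
            "print(\"Fourth {0} string\");"
        };
        for (String line : lines) {
            System.out.println(line + " -> " + extract(line));
        }
    }
}
```

Only the first line comes out cleanly. The concatenation is split into two unrelated fragments, the escaped quotes cut the third string short (the rule returns `Third \` followed by an empty string), and the `{0}` placeholder in the fourth is passed through as ordinary text. Each case would need a further rule.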

21 Summary This approach is doomed to failure because of the disconnect between the grammar of the language, and the regular expressions used to identify strings.

22 Grammar Based Approach The XLIFF Transformation Problem Current approaches Grammar based approach – XPG XPG & XML Summary

23 A New Approach With this approach, we look at the language grammar (EBNF) Identify grammar productions that can hold translatable text Generate a parser that accepts instances of the grammar and emits XLIFF

24 Grammar-based Architecture

25

26 Architecture New component: XLIFF parser generator (XPG) Accepts a JavaCC grammar Allows one or more productions to be marked as translatable Generate the “extract” and “merge” programs

27 JavaCC JavaCC: Java Compiler Compiler Modelled after lex & yacc Works on EBNF-type grammars rendered as JavaCC .jj files JavaCC grammars are available for most modern programming languages.

28 Big Win Direct, one-to-one correspondence between the grammar and the mechanism for identifying strings.

29 Advantages Consistent high quality –Guaranteed to work in every case – for all instances of the grammar. Painless –No scripting/regular expressions required –Extractor and merger generated automatically Fast –Just need to identify the strings in the grammar

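30 Example Extract from Java BNF <literal> ::= <string literal> | <boolean literal> | <null literal> <string literal> ::= " <string characters>? " <string characters> ::= <string character> | <string characters> <string character> <string character> ::= <input character> except " and \ | <escape character>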

31 JavaCC Extract void Literal() : {} { <STRING_LITERAL> | BooleanLiteral() | NullLiteral() }

32 < STRING_LITERAL: "\"" ( (~["\"","\\","\n","\r"]) | ("\\" ( ["n","t","b","r","f","\\","'","\""] | ["0"-"7"] ( ["0"-"7"] )? | ["0"-"3"] ["0"-"7"] ["0"-"7"] ) )* "\"" >

33 Identifying <STRING_LITERAL> We identify the <STRING_LITERAL> token as a language item that may contain strings. XPG then generates a new grammar, which compiles to the extractor. The extractor then generates XLIFF.

34 Modified JavaCC Grammar void Literal() : {} { | StringLiteral() | BooleanLiteral() | NullLiteral() }

35 StringLiteral() void StringLiteral() : { Token t; } { t = <STRING_LITERAL> { String s = t.image.substring(1, t.image.length() - 1); pw.println("<trans-unit>"); pw.println("<source>" + s + "</source>"); pw.println("</trans-unit>"); } }
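
The grammar action on slide 35 strips the quotes from the token image and writes the string to the XLIFF output. A slightly fuller sketch of that step is given below. The `id` counter and the XML escaping are assumptions added here (XLIFF 1.x requires an `id` on `trans-unit`, and raw `&` or `<` would break the output); the slide does not show them, and `TransUnitWriter` is a hypothetical name.

```java
// Hypothetical helper; the id counter and escaping are assumptions, not shown on the slide.
final class TransUnitWriter {
    private int nextId = 1;

    /** Escapes the characters that cannot appear verbatim in XML text. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Strips the surrounding quotes from a string-literal token image, as slide 35 does. */
    static String unquote(String image) {
        return image.substring(1, image.length() - 1);
    }

    /** Builds one trans-unit element for the given token image. */
    String transUnit(String tokenImage) {
        String s = escapeXml(unquote(tokenImage));
        return "<trans-unit id=\"" + (nextId++) + "\">\n"
             + "  <source>" + s + "</source>\n"
             + "</trans-unit>";
    }
}
```

In a generated parser the returned text would be written to the output stream (`pw` on the slide) instead of being returned.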

36 Other XPG Tasks Create XLIFF surrounding tags Create skeleton file Embed code for handling inline tags
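
The skeleton file is what lets the merge step rebuild the native file after translation. The sketch below shows one possible shape of that round trip. It rests on several assumptions: the `%%%n%%%` marker format is invented for this example, a simple regular expression stands in for the generated parser, and the names `SkeletonRoundTrip`, `extract` and `merge` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the marker format and the regex stand-in are assumptions.
final class SkeletonRoundTrip {
    // Stand-in for the generated parser: double-quoted strings without escapes.
    private static final Pattern STRING = Pattern.compile("\"([^\"]*)\"");
    private static final Pattern MARKER = Pattern.compile("%%%(\\d+)%%%");

    /** Extracted strings keyed by id, in document order. */
    final Map<String, String> units = new LinkedHashMap<>();

    /** Records each string and returns the skeleton with markers in their place. */
    String extract(String source) {
        Matcher m = STRING.matcher(source);
        StringBuffer skeleton = new StringBuffer();
        int id = 0;
        while (m.find()) {
            id++;
            units.put(String.valueOf(id), m.group(1));
            m.appendReplacement(skeleton, Matcher.quoteReplacement("\"%%%" + id + "%%%\""));
        }
        m.appendTail(skeleton);
        return skeleton.toString();
    }

    /** Puts translations back into the skeleton; unknown ids keep their marker. */
    static String merge(String skeleton, Map<String, String> translations) {
        Matcher m = MARKER.matcher(skeleton);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String text = translations.getOrDefault(m.group(1), m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Everything that is not translatable stays in the skeleton untouched, so the merged file differs from the original only in the translated strings.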

37 Inline Tags Example: –“Click on the {0} button to start the {1} job” The {0} and {1} constitute inline tags Not part of grammar itself Can vary from application to application We must be able to extract these based on regular expressions: –{[0-9]+}

38 XPG and Inline Tags Embeds code to read a set of regular expressions from a file. When the extractor identifies a string: –Executes RE on string –Moves matches to XLIFF inline tag
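
A minimal sketch of this inline-tag step follows, assuming the matches are wrapped in XLIFF `<ph>` elements (the slide only says "XLIFF inline tag", so the choice of `<ph>` is an assumption). For brevity the rule is passed in as one `Pattern` instead of being read from a file, and `InlineTagger` is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the use of <ph> and the single-rule signature are assumptions.
final class InlineTagger {
    /** Wraps every match of the rule in a <ph> element, numbering them from 1. */
    static String tag(String text, Pattern rule) {
        Matcher m = rule.matcher(text);
        StringBuffer out = new StringBuffer();
        int id = 0;
        while (m.find()) {
            id++;
            String ph = "<ph id=\"" + id + "\">" + m.group() + "</ph>";
            m.appendReplacement(out, Matcher.quoteReplacement(ph));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // The braces are escaped so that Java treats them as literal characters.
        Pattern placeholders = Pattern.compile("\\{[0-9]+\\}");
        System.out.println(tag("Click on the {0} button to start the {1} job", placeholders));
    }
}
```

For the example string from slide 37 this yields `Click on the <ph id="1">{0}</ph> button to start the <ph id="2">{1}</ph> job`. Because the rules live outside the grammar, each application can supply its own set without regenerating the extractor.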

39 Final Architecture

40 XPG & XML The XLIFF Transformation Problem Current approaches Grammar based approach – XPG XPG & XML Summary

41 XPG and XML Applications A similar approach can be applied to XML Schemas Uses XSLT & DOM rather than JavaCC Can identify XML tags and attributes that may contain text
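
For XML input, the equivalent of marking a grammar production is naming the elements and attributes that carry text. The sketch below uses only the standard Java DOM API and leaves out the XSLT part mentioned on the slide. The class name, the method signature and the sample element names are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch using plain DOM; the talk's XSLT step is not reproduced here.
final class XmlTextExtractor {
    /** Collects the named attribute and the text content of every element with the given name. */
    static List<String> extract(String xml, String elementName, String attributeName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> found = new ArrayList<>();
            NodeList nodes = doc.getElementsByTagName(elementName);
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                if (e.hasAttribute(attributeName)) {
                    found.add(e.getAttribute(attributeName));
                }
                String text = e.getTextContent().trim();
                if (!text.isEmpty()) {
                    found.add(text);
                }
            }
            return found;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse XML input", ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<menu><item tooltip=\"Open a file\">Open</item><item>Close</item></menu>";
        System.out.println(extract(xml, "item", "tooltip"));
    }
}
```

As with the JavaCC case, the list of translatable elements and attributes is declared once per format, and the parser does the work of finding every instance.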

42 Summary XPG is an approach to XLIFF transformation that corresponds to the grammar of the language being transformed. This ensures consistent, error-free and rapid XLIFF transformation. The XPG approach is suitable for computer languages and markup.

