A report by Olaf-Michael Stefanov to the JIAMCATT community

ITS 2.0: Drafting version 2 of the W3C Internationalization Tag Set standard, a status report
A report by Olaf-Michael Stefanov to the JIAMCATT community, with thanks to the MLW-LT ITS 2.0 chairs David Filip, David Lewis, and Felix Sasaki, and to all the members of the W3C MLW-LT Working Group.
United Nations Office at Nairobi, 15-17 May 2013

What is ITS 2.0?
ITS 2.0 is a technology to add metadata to Web content for the benefit of localization, language technologies, and internationalization.
16 May 2013, Olaf-Michael Stefanov, JIAMCATT-2013 @ UNON, Nairobi, Kenya

What is the ITS 2.0 specification?
The ITS 2.0 specification identifies concepts (such as "Translate") that are important for internationalization and localization, and defines implementations of these concepts (termed "ITS data categories") as a set of elements and attributes called the Internationalization Tag Set (ITS).

What is ITS 2.0?
The document provides implementations for HTML and serializations in NIF (NLP Interchange Format), and defines the ITS elements and attributes in the form of XML Schema and RELAX NG.
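To make the "Translate" concept concrete, here is a minimal sketch of how a tool might honor it in HTML. It assumes the HTML5 `translate` attribute is the markup used for this data category in HTML; the class name `TranslatableText` and the sample markup are hypothetical, and this is not the reference behavior of any ITS 2.0 implementation.

```python
from html.parser import HTMLParser


class TranslatableText(HTMLParser):
    """Collect text that is not inside an element marked translate="no"."""

    # Void elements have no end tag, so they must not touch the stack.
    VOID = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.stack = [True]   # default: content is translatable
        self.segments = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        value = dict(attrs).get("translate")
        if value == "no":
            flag = False
        elif value == "yes":
            flag = True
        else:
            flag = self.stack[-1]   # inherit from the parent element
        self.stack.append(flag)

    def handle_endtag(self, tag):
        if tag not in self.VOID and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack[-1]:
            self.segments.append(text)


# Hypothetical sample content.
HTML = '<p>Run <code translate="no">git commit</code> to save your work.</p>'

parser = TranslatableText()
parser.feed(HTML)
print(parser.segments)   # ['Run', 'to save your work.']
```

A real consumer would also need to handle attribute values, global rules, and malformed markup; the sketch only shows the local flag and its inheritance.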

How is ITS 2.0 being developed?
The MultilingualWeb-LT Working Group is defining metadata for web content (mainly HTML5) and "deep Web" content to facilitate interaction between multilingual technologies and localization processes, and will demonstrate interoperable implementations.

How are drafting and testing financed?
The MultilingualWeb-LT Working Group receives funding from the European Commission (project name LT-Web) through the Seventh Framework Programme (FP7) in the area of Language Technologies. Grant Agreement No. 287815.

Where and how is ITS 2.0 being standardized?
The MultilingualWeb-LT (Language Technologies) Working Group is part of the W3C Internationalization Activity and the MultilingualWeb community.
Aim: define the Internationalization Tag Set (ITS 2.0), that is, metadata for web content (mainly HTML5) and deep Web content (for example, a CMS or XML files from which HTML pages are generated) that facilitates its interaction with multilingual technologies and localization processes.

ITS 2.0 and translation
As such, ITS 2.0 is not, strictly speaking, a standard for translation. In reality, however, the boundary between translation and localization is not so clearly defined, and translation services, providers, and tool developers can and should learn from the localization people about the complexity of the standards needed in their (related) profession.

Relation to ITS 1.0
ITS 2.0 adopts and maintains the following principles from ITS 1.0:
  the use of data categories to define discrete units of functionality
  the separation of data category definition from the mapping of the data category to a given content format
  the conformance principle that an implementation only needs to implement one data category to claim conformance to ITS 2.0
ITS 2.0 supports all ITS 1.0 data category definitions and adds new definitions, with the exceptions of Directionality and Ruby.
ITS 2.0 adds a number of new data categories not found in ITS 1.0.
While ITS 1.0 addressed only XML, ITS 2.0 specifies implementations of data categories in both XML and HTML.

Data categories from ITS 1.0
  Translate
  Localization Note
  Terminology
  Directionality * (may be replaced in HTML5)
  Language Information
  Elements Within Text

New data categories in ITS 2.0
  Domain
  Text Analysis
  Locale Filter
  Provenance
  External Resource
  Target Pointer
  Id Value
  Preserve Space
  Localization Quality Issue
  Localization Quality Rating
  MT Confidence
  Allowed Characters
  Storage Size
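Two of the new categories, Allowed Characters and Storage Size, lend themselves to a short illustration. The sketch below assumes that Allowed Characters is given as a regular-expression character class and that Storage Size is a byte limit in a stated encoding (UTF-8 here); the function name `check_constraints` and the sample strings are hypothetical and not taken from the draft.

```python
import re


def check_constraints(text, allowed_characters=None, storage_size=None,
                      storage_encoding="utf-8"):
    """Return a list of constraint violations for a translated string.

    allowed_characters: a character class such as "[a-zA-Z]" (assumed form).
    storage_size: maximum size in bytes once encoded (assumed meaning).
    """
    problems = []
    if allowed_characters is not None:
        # Every character of the text must match the character class.
        if not re.fullmatch(f"(?:{allowed_characters})*", text):
            problems.append("disallowed character")
    if storage_size is not None:
        if len(text.encode(storage_encoding)) > storage_size:
            problems.append("exceeds storage size")
    return problems


print(check_constraints("Size", "[a-zA-Z]", 5))    # []
print(check_constraints("Größe", "[a-zA-Z]", 5))   # both checks fail
```

The second call fails both checks because "ö" and "ß" fall outside the class and each takes two bytes in UTF-8, which brings the string to seven bytes.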

What is defined
For each data category:
  default values
  inheritance, that is, whether ITS information also applies to child elements
  whether data category information applies locally or is provided through global rules, with additional rules about globally adding information or globally pointing to existing information
An ITS application is free to decide which pieces of content it uses, e.g. terminology, pointers, Id Value.
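The interplay of defaults, inheritance, local markup, and global rules can be sketched for the Translate data category in XML. This is an illustration under stated assumptions, not the draft's processing algorithm: it assumes the ITS namespace `http://www.w3.org/2005/11/its`, an `its:translateRule` element with `selector` and `translate` attributes, and the precedence local markup, then global rule, then inherited value, then default. The sample document and the function `resolve_translate` are hypothetical, and the selector handling is simplified to what `xml.etree.ElementTree` supports.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# Hypothetical sample document with one global rule and one local override.
DOC = """<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="2.0">
  <its:rules version="2.0">
    <its:translateRule selector="//code" translate="no"/>
  </its:rules>
  <p>Press <code>Ctrl+S</code> to save.</p>
  <p its:translate="no">ACME-4711 <note its:translate="yes">part number</note></p>
</doc>"""


def resolve_translate(root):
    """Map every element to its effective Translate value ("yes" or "no")."""
    # 1. Global rules: later rules overwrite earlier ones for the same node.
    global_values = {}
    for rule in root.iter(f"{{{ITS_NS}}}translateRule"):
        # Simplification: ElementTree needs a relative path, so "//code"
        # becomes ".//code". Full XPath selectors are not supported here.
        for element in root.findall("." + rule.get("selector")):
            global_values[element] = rule.get("translate")

    # 2. Walk the tree: local markup wins, then a global rule,
    #    then the value inherited from the parent.
    result = {}

    def walk(element, inherited):
        local = element.get(f"{{{ITS_NS}}}translate")
        value = local or global_values.get(element) or inherited
        result[element] = value
        for child in element:
            walk(child, value)

    walk(root, "yes")   # assumed default for elements: translatable
    return result


root = ET.fromstring(DOC)
for element, value in resolve_translate(root).items():
    print(element.tag, value)
```

Keeping the global rule results in a separate dictionary before the tree walk makes the precedence explicit: a rule that selects an element beats a value inherited from its parent, while local markup on the element beats both.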

About the draft
The draft text:
  contains 93 blocks of example code
  describes the relation to the ITS 1.0 standard
  gives the motivation for ITS, including a list of some typical problems
  covers potential users and ways to use ITS
  covers support for legacy HTML content
Several drafts have been formally published (for public review and feedback).

Defines basic contents
  How to choose between Local and Global markup approaches
  Overriding and Inheritance
  Adding information or pointing to existing information
Goes into considerable length on Notation and Terminology
Defines Conformance types and classes
Provides some 20 sections on processing of ITS information
Has a section each on using ITS in HTML and in XHTML

Describing Data Categories
For each of the 20 data categories, the draft provides a normative Definition section and an Implementation section, with examples.

Clearly indicates which sections are informative, normative and non-normative.
Provides 10 appendices, including Schemas for ITS and a List of ITS 2.0 Global Elements and Local Attributes.

Some development statistics from the drafting process
Requirements gathering document: W3C public working draft; wiki version with 21,000+ accesses

Test suite
Development started August 2012, driven by Trinity College Dublin (TCD)
Input: files with ITS 2.0 metadata
Output: metadata overview
On schedule: by 15 March 2013, 80% of tests successful
Test suite master dashboard on GitHub
Current state of tests:
  Total number of input and reference output files: 225
  Total number of tests from all implementers: 1104
  Current coverage: 894 tests successfully run (81%)
Requirement met: 2 successfully tested implementations have been run for each and every feature in the draft standard
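The coverage figure follows directly from the two test counts reported above; a short check of the arithmetic, using only those numbers:

```python
total_tests = 1104    # tests from all implementers, as reported
passed_tests = 894    # tests successfully run, as reported

coverage = passed_tests / total_tests
print(f"{coverage:.1%}")   # 81.0%
```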

Since start of project
  9 face-to-face meetings
  54 working group meetings (9 Mar 2012 to 15 May 2013), mostly 1 per week, but 2 per week for much of the 1st quarter of 2013
  5 lengthy online drafting sessions by the 7 editors of the draft specification, from June to December 2012
  40+ individuals participating
  2100+ emails, aggressive standardization progress
  Engaging "invited experts" and additional participants in the EU project, including higher-level decision makers for some 10 software developers

Potential Users
  Content producers
  Workflow managers
  Vendors of content-related tools
  MT systems
  Text analytics
  Schema (also called "host vocabulary") developers
  Translation tool and workflow developers
  ?

Upcoming and wrap-up
The EU-funded project under which ITS 2.0 is being developed wraps up at the end of 2013.
Before then, a draft standard document and all necessary supporting documentation and tests (inputs and outputs) need to be submitted.
The draft then moves into the formal review and approval stages of W3C (the World Wide Web Consortium).
All the information gathered during the project becomes publicly available on a website to be announced.
Further follow-up, especially with regard to defining conformant implementations, can be expected.

For further info
MultilingualWeb-LT Working Group Home Page: http://www.w3.org/International/multilingualweb/lt/
The Working Group wiki: http://www.w3.org/International/multilingualweb/lt/wiki/Main_Page
The ITS 2.0 draft: http://www.w3.org/TR/its20/
ITS 2.0 Test Suite Dashboard: http://htmlpreview.github.io/?https://raw.github.com/finnle/ITS-2.0-Testsuite/master/its2.0/testSuiteDashboard.html
Using ITS 2.0 in DocBook, an initial implementation: http://xmlguru.cz/2013/05/docbook-and-its2