An Open Localisation Interface to CMS using OASIS Content Management Interoperability Services


Aonghus Ó hAirt, Dominic Jones, Leroy Finn and David Lewis
Centre for Next Generation Localisation, Trinity College Dublin

Challenges for Interoperability
More iterative workflows
  From push-based hand-offs
  To change notification & fine-grained retrievals
Extending data management to support innovation
  Statistical MT, named entity recognition, text analytics for QA and terminology management
  All require up-to-date, relevant training corpora
Solutions must sit comfortably with the technology of: Content Management, Web Publishing & Localisation
No one standard can address it all
  Integrating ITS and CMIS for L10n, and a little on XLIFF, RDF and Open Provenance

CMS-TMS Interoperability
Interoperability roadblock
  High variety: Content Management Systems, content formats
  Increasingly dynamic
  Add language resource curation: data-driven MT, text analytics
[Workflow diagram. Stages: Create Content, Prepare Content, Translate Content, QA, Publish Content. Other labels: Client editors, LSP translators, CMS, CMS/CVS, Terminology tools, CAT, TMS, QA tools, design, Web CMS]

Internationalisation Tag Set (ITS)
Allows I18n and L10n tools to be instructed to treat specific text in specific ways
Principles:
  Minimise disturbance of original content
  Don't reinvent the wheel
  Link to existing meta-data before adding new
Defines distinct, independent Data Categories
Identify relevant text using:
  Attributes on existing elements: LOCAL selection
  XPath selectors in a special element: GLOBAL selection
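
The difference between LOCAL and GLOBAL selection can be illustrated with a short Python sketch. It is not part of the original slides. The sample document is made up, only the Translate data category is handled, and the standard-library ElementTree module is used even though it supports only a subset of XPath, so real ITS selectors may need a full XPath engine.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"

# Hypothetical sample document: one GLOBAL rule and one LOCAL attribute.
DOC = """<doc xmlns:its="http://www.w3.org/2005/11/its">
  <its:rules version="1.0">
    <its:translateRule selector="//code" translate="no"/>
  </its:rules>
  <p>Press <code>Ctrl+C</code> to copy.</p>
  <p its:translate="no">ACME-1234</p>
</doc>"""


def untranslatable_tags(xml_text):
    """Return the tags of elements marked translate="no", GLOBAL matches first."""
    root = ET.fromstring(xml_text)
    marked = []

    # GLOBAL selection: an XPath selector held in a special rule element.
    for rule in root.iter(f"{{{ITS_NS}}}translateRule"):
        if rule.get("translate") != "no":
            continue
        selector = rule.get("selector", "")
        # ElementTree rejects absolute paths, so make the selector relative.
        if selector.startswith("/"):
            selector = "." + selector
        marked.extend(root.findall(selector))

    # LOCAL selection: an ITS attribute placed directly on the element.
    for element in root.iter():
        if element.get(f"{{{ITS_NS}}}translate") == "no" and element not in marked:
            marked.append(element)

    return [element.tag for element in marked]


print(untranslatable_tags(DOC))
```

The GLOBAL rule leaves the content untouched and selects every code element from outside, while the LOCAL attribute marks a single paragraph in place.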

C. Lieske, F. Sasaki 2010

ITS 1.0 Data Categories
Translate: Mark whether the content of an element or attribute should be translated or not
Localization Note: Communicate notes to localizers about a particular item of content
Terminology: Mark terms and optionally associate them with information, such as definitions
Directionality: Specify the base writing direction of blocks, embeddings and overrides for the Unicode bidirectional algorithm
Ruby: Provide a short annotation of an associated base text, particularly useful for East Asian languages
Language Information: Express the language of a given piece of content
Elements Within Text: Identify how an element behaves relative to its surrounding text, e.g. for text segmentation purposes

ITS 2.0 Draft Data Categories
I18n: Locale Filter, External Resource, Preserve Space, Allowed Characters, Storage Size, ID Value
Language Technology: Domain, MT Confidence, Disambiguation, Text Analysis Annotation
Provenance & QA: Quality Issue, Quality Précis, Translation Provenance Agent, Trans Revision Prov Agent, Standoff Provenance

ITS and Content Management
Global ITS rules can be defined in an external file
Attributes are applied to a node with the following precedence:
  LOCAL attributes
  Embedded GLOBAL rules, in reverse order
  External GLOBAL rules, in reverse order
ITS allows tool-specific mechanisms for associating global rules with content; their precedence is not specified
Common practice is to apply a given set of rules to all documents in a project with the same schema
Can this scale to multiple overlapping schemas?
Can we use some CMS-level meta-data interoperability solution?
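
The precedence order listed on this slide can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the authors' implementation: nodes are identified by a path string, a rule is a hypothetical GlobalRule pairing a match predicate with a Translate value, and "reverse order" is taken to mean that later rules in a list win over earlier ones.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class GlobalRule:
    """Hypothetical stand-in for one global ITS rule."""
    matches: Callable[[str], bool]  # does the rule select this node path?
    translate: bool                 # Translate value the rule assigns


def resolve_translate(
    node_path: str,
    local_value: Optional[bool],
    embedded_rules: Sequence[GlobalRule],
    external_rules: Sequence[GlobalRule],
    default: bool = True,
) -> bool:
    """Pick the Translate value for a node using the slide's precedence order."""
    # 1. A LOCAL attribute on the node always wins.
    if local_value is not None:
        return local_value
    # 2. Embedded GLOBAL rules, then 3. external GLOBAL rules,
    #    each list scanned in reverse so the last matching rule wins.
    for rules in (embedded_rules, external_rules):
        for rule in reversed(rules):
            if rule.matches(node_path):
                return rule.translate
    # No rule applies: fall back to the default.
    return default


embedded = [
    GlobalRule(lambda path: path.startswith("/doc"), True),
    GlobalRule(lambda path: path.endswith("/code"), False),
]
external = [GlobalRule(lambda path: True, False)]

print(resolve_translate("/doc/p/code", None, embedded, external))
print(resolve_translate("/other", None, embedded, external))
```

Because external rules are consulted last, the open question on the slide is how a CMS should decide which external rule files reach this function for a given document, and in what order.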

CMS Interoperability
Integrating with a CMS requires the use of an API
Until now, most CMS have used proprietary APIs
Proprietary interfaces to CMS lead to limited support, vendor lock-in and poor interoperability between CMS and with localisation tools
Content Management Interoperability Services (CMIS) from OASIS offers a standardised API for interacting with CMS
Localisation is out of scope for CMIS
How can CMIS facilitate the localisation of content across multiple CMS?

OASIS Content Management Interoperability Services (CMIS)
“defines a domain model and Web Services and Restful AtomPub bindings that can be used by applications to work with one or more Content Management repositories/systems.” (CMIS standard)
Published in 2010
Participation from Adobe, Alfresco, EMC, IBM, Microsoft, Oracle, SAP, and others

CMIS Implementations
Alfresco 3.3+, Apache Chemistry InMemory Server, Athento, COI, Day Software CRX, EMC Documentum, eXo Platform with xCMIS, Fabasoft, HP Autonomy Interwoven Worksite, IBM Content Manager, IBM FileNet Content Manager, IBM Content Manager On Demand, IBM Connections Files, IBM LotusLive Files, IBM Lotus Quickr Lists, ISIS Papyrus Objects, KnowledgeTree 3.7+, Maarch 1.3, Magnolia (CMS) 4.5, Microsoft SharePoint Server 2010, NCMIS, NemakiWare, Nuxeo Platform 5.5, O3spaces 3.2+, OpenIMS, OpenWGA 5.2+, PTC Windchill, SAP NetWeaver Cloud Document, Seapine Surround SCM 2011.1, Sense/Net 6.0+, TYPO3, VB.CMIS
Note the distinction between client and server implementations

CMIS Objects
A repository is a container of objects. Objects have four base types:
  Document object – “elementary information entities managed by the repository”
  Folder object – “serves as the anchor for a collection of file-able objects”
  Relationship object – “instantiates an explicit, binary, directional, non-invasive, and typed relationship between a Source Object and a Target Object”
  Policy object – “represents an administrative policy that can be enforced by a repository, such as a retention management policy.”
(CMIS Specification)

CMS-L10n Interoperability: Two Requirements
Flexible ITS rule-to-document bindings
  The same rule to be applied to multiple documents
  Multiple rules to be applied to individual documents
  Specify the precedence order in which rules are processed for a document
  Aim to support external ITS rules via CMIS
Need to signal L10n-relevant updates to documents
  MLW-LT (ITS 2.0) working group identified a requirement for such ‘readiness’ signalling
  Aim to support open asynchronous change notification for CMIS
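
The first requirement, many-to-many bindings with an explicit precedence order, can be pictured with a small in-memory model. This Python sketch is only an illustration: the RuleBindings class, the document names and the rule names are all hypothetical, and a CMIS-based design would hold the same information in repository objects instead of a dictionary.

```python
from collections import defaultdict


class RuleBindings:
    """Hypothetical registry binding external ITS rule sets to documents."""

    def __init__(self):
        # Document id -> rule ids, kept in processing (precedence) order.
        self._by_doc = defaultdict(list)

    def bind(self, doc_id, rule_id, position=None):
        """Bind a rule to a document, optionally at a given place in the order."""
        rules = self._by_doc[doc_id]
        if rule_id in rules:
            rules.remove(rule_id)  # re-binding moves the rule, never duplicates it
        if position is None:
            rules.append(rule_id)
        else:
            rules.insert(position, rule_id)

    def rules_for(self, doc_id):
        """Multiple rules per document, in the order they should be processed."""
        return list(self._by_doc.get(doc_id, []))

    def documents_for(self, rule_id):
        """The same rule may be applied to multiple documents."""
        return sorted(d for d, rules in self._by_doc.items() if rule_id in rules)


bindings = RuleBindings()
bindings.bind("manual.xml", "house-style")
bindings.bind("faq.xml", "house-style")
bindings.bind("manual.xml", "product-terms")
bindings.bind("manual.xml", "legal", position=0)

print(bindings.rules_for("manual.xml"))
print(bindings.documents_for("house-style"))
```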

Design: Extending CMIS Implementations
Two approaches to modelling the localisation information:
  Custom content modelling
  Alfresco aspects
Implementation in repository:
  Alfresco (primary)
  Nuxeo (basic testing)

ITS Rules using Policy Objects
Translate rules as policy objects

ITS Rules as Folders
Translate rules as folder objects

Signalling Readiness from CMS
Readiness meta-data
  Indicates the readiness of a document for submission to L10n processes, or provides an estimate of when it will be ready for a particular process
Data model
  ready-to-process – type of process to be performed next
  process-ref – a pointer to an external set of process type definitions used for ready-to-process
  ready-at – defines the time the content is ready for the process; it could be some time in the past or some time in the future
  revised – indicates whether this is a different version of content that was previously marked as ready for the declared process
  priority – high or low
  complete-by – indicates the target date-time for completing the process
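
The readiness data model can be written down as a small Python dataclass. This is a sketch and not the schema used in the demo: the field names mirror the slide, while the types, the defaults, the is_ready helper and the example.org URL are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Readiness:
    """Readiness meta-data for one document (types and defaults are assumed)."""
    ready_to_process: str                   # type of process to be performed next
    process_ref: str                        # pointer to external process type definitions
    ready_at: datetime                      # may lie in the past or in the future
    revised: bool = False                   # True if previously marked ready for this process
    priority: str = "low"                   # "high" or "low"
    complete_by: Optional[datetime] = None  # target date-time for completing the process

    def __post_init__(self):
        if self.priority not in ("high", "low"):
            raise ValueError("priority must be 'high' or 'low'")

    def is_ready(self, now: datetime) -> bool:
        """A future ready-at value is only an estimate, so the content is not ready yet."""
        return self.ready_at <= now

    def to_properties(self) -> dict:
        """Flatten to property names as written on the slide."""
        return {
            "ready-to-process": self.ready_to_process,
            "process-ref": self.process_ref,
            "ready-at": self.ready_at.isoformat(),
            "revised": self.revised,
            "priority": self.priority,
            "complete-by": self.complete_by.isoformat() if self.complete_by else None,
        }


readiness = Readiness(
    ready_to_process="machine-translation",         # hypothetical process type
    process_ref="http://example.org/process-types",  # hypothetical URL
    ready_at=datetime(2013, 3, 1, 9, 0),
    priority="high",
)
print(readiness.is_ready(datetime(2013, 3, 1, 10, 0)))
print(readiness.to_properties())
```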

Polling extension to CMIS
Polling schemes describe the way in which documents are polled for updated readiness properties
  scheme name / ID
  polling interval
  notification method
  notification target / host
  port (for network connection)
  readiness property
  readiness value
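
A polling scheme and a single polling pass might look like the Python sketch below. It is an assumption-laden illustration, not the demo's code: the repository is replaced by an in-memory dictionary of document properties, the notify callback stands in for whatever notification method a scheme names, and all scheme values are invented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class PollingScheme:
    """Fields follow the slide; the types are assumptions."""
    name: str
    interval_seconds: int
    notification_method: str
    notification_target: str
    port: int
    readiness_property: str
    readiness_value: str


def poll_once(
    scheme: PollingScheme,
    documents: Dict[str, Dict[str, str]],
    already_notified: Set[str],
    notify: Callable[[PollingScheme, str], None],
) -> List[str]:
    """Run one polling pass and notify once per newly ready document."""
    newly_ready = []
    for doc_id, properties in documents.items():
        is_match = properties.get(scheme.readiness_property) == scheme.readiness_value
        if is_match and doc_id not in already_notified:
            notify(scheme, doc_id)
            already_notified.add(doc_id)
            newly_ready.append(doc_id)
    return newly_ready


scheme = PollingScheme(
    name="mt-ready",  # hypothetical scheme
    interval_seconds=60,
    notification_method="http-post",
    notification_target="localhost",
    port=8080,
    readiness_property="ready-to-process",
    readiness_value="machine-translation",
)
documents = {
    "doc-a": {"ready-to-process": "machine-translation"},
    "doc-b": {"ready-to-process": "review"},
    "doc-c": {},
}
sent = []
seen = set()
first_pass = poll_once(scheme, documents, seen, lambda s, d: sent.append((s.notification_target, d)))
second_pass = poll_once(scheme, documents, seen, lambda s, d: sent.append((s.notification_target, d)))
print(first_pass, second_pass)
```

A scheduler would call poll_once every interval_seconds. Tracking already notified documents keeps a repeated pass from sending duplicate notifications.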

Polling sequence

Readiness
Readiness modelled as custom object
Readiness modelled with an aspect

Polling Schemes

Document model with localisation

Technical setup
Repository browser tool
Polling system
Notification system
Test tools

Evaluation
Notification response time

Evaluation
Performance evaluation

Content Management - L10n Workflow Integration
[Architecture diagram. Areas: Content Management, Localisation Preparation, Translation Management. Component labels: Source CMS, Target CMS, Parse/filter/segment, Reassemble, MT, TM, CAT, Web-based PE, Workflow Management, QA viewer, XLIFF store, RDF provenance store. Format labels: ITS, XLIFF+ITS, XLIFF/PROV]

XLIFF and Open Provenance
Capture XLIFF transformations that operate on content and its meta-data as the result of content processing by different localisation workflow services
A provenance model is used to capture process operations, agents and properties of those processes
Support:
  managing & auditing quality of processes
  correlating output of individual steps with professional, crowd and consumer judgement
  end-to-end process management
  terminology management
  on-demand language resource assembly, e.g. parallel text for MT training

Linked Localisation Data: RDF-based logging
Open Provenance Vocabulary: http://openprovenance.org/
Active W3C Provenance working group

LT Assisted Localisation Process Provenance
[Provenance graph; the layout is not recoverable from the transcript. Labels that survive:]
Numeric identifiers: 12401, 15601, 15771, 15790, 16723, 16727, 16734
Relations: value, xml:lang, wasGeneratedBy, wasGeneratedAt, wasControlledBy, wasTranslatedFrom, wasRevisedFrom, wasAnnotatedWith
Processes: Machine Trans, Prof trans, CrowdPE, Crowd rate, Text Classify, Trans QA
Agents: j.doe, c3po, d.jones, m.bean, l.jfinn, s.curran
Text values: “I am a string”, “Je suis un phrase”, “Je suis une string”, “Je suis un string”
Annotation values: “Poor”, anomalous, pass
Other labels: fr-FR, 16740ms expended
Timestamps: 2010-02-09T12:30:00, 2010-02-12T13:17:00, 2010-02-13T10:07:00, 2010-02-14T10:30:00, 2010-02-14T13:21:00, 2010-02-14T14:05:00

Future LSP-Neutral Open Service
[Architecture diagram. Labels: Client, Source CMS, Target CMS, QA viewer, LSPs, TMS + L10n tools (repeated three times), XLIFF+ITS, CMIS+ITS+PROV, CMIS+ITS+XLIFF+PROV. Common Services: Content status/update, Provenance query, Resource Curation/Sharing]

Conclusion
Have extended CMIS to support:
  Document-level ITS rules
  Open document change notification mechanism
Strong potential to streamline CMS-L10n integration in combination with XLIFF and PROV
Achieved with current CMIS specification
  Custom extension to folder object
  Custom extension to policy object may be better
Next Steps
  Combining standards for vendor-neutral CMS integration
  Align with ITS 2.0 and XLIFF 2.0
  Discuss extensions with CMIS-compliant vendors

Questions. Thank You.
Follow ITS Use Case at: http://www.w3.org/International/multilingualweb/lt/wiki/CMS_Neutral_External_ITS_Rules_and_Readiness
Follow XLIFF+ITS mapping at: http://www.w3.org/International/multilingualweb/lt/wiki/XLIFF_Mapping