Leveraging SET, OWL, CAM and Dictionary based tools to enable automated cross-dictionary domain translations David Webber OASIS SET TC / CAM TC (with.


Leveraging SET, OWL, CAM and Dictionary based tools to enable automated cross-dictionary domain translations
David Webber, OASIS SET TC / CAM TC
(with excerpts from iSURF presentation by Prof. Dr. Asuman Dogac, METU-SRDC, Turkey)
OASIS SET TC: Automating Intra-domain Mappings

Agenda
- Part I: Introduction
  - Intra-domain example use cases
  - Challenges and opportunities
- Part II: Roadmap
  - CAM templates, OWL, XPath, dictionaries, CCTS
  - Using a dictionary-based approach and SET tools for aligning structure components across syntax vocabularies within a domain
- Part III: Summary
  - Next steps

Part I: Intra-domain Example Use Cases

Information Exchange Interoperability
- Many common domains use multiple vocabularies that have arisen historically over time, e.g. banking, healthcare, supply chain [1]
- These may be weakly or strongly aligned depending on the domain and the fragmentation / marketplaces within it
- All domains share common components such as organisation, person, customer, vehicle, address
[1] X12/EDI, UN/CEFACT, UBL, GS1, xCBL, cXML, FIX, SWIFT, HL7, and more

Dictionary alignment task challenges
- Each domain can be inspected by comparing the vocabulary dictionaries
- Creating dictionaries in a common reference format has previously been a complex and manually intensive process
- Even within a domain implementation the vocabulary may be fragmented and inconsistent because information models evolve over time

Opportunities and Potential
- Create a domain-agnostic set of methods and tools that allow alignment within any domain to facilitate consistent information definitions
- Leverage the approach to also support semi- or fully automated mapping patterns and templates
- Use open standards and open source tools
- Provide an open public roadmap for tool vendors
- Allow standards groups to publish their exchanges in an open, non-proprietary syntax and rule system
- Enable SMBs to build once, exchange to many

Part II: Roadmap – CAM templates, OWL, XPath, Dictionaries and CCTS

CAM templates, OWL and dictionaries
- Information components derive their meaning and semantics from the context of their use pattern, not the physical name label, e.g.
  - Customer/Account/Number
  - Order/Item/Number
- CAM templates and OWL terms share the ability to express use patterns that can be inspected, with equivalence deduced by software agents that traverse the exchange structure components
- Matching is based on rules that can be tailored and that reference dictionaries of known properties
- Allows automated generation of domain dictionaries

CAM templates, XPath and dictionaries
CAM toolkit contains dictionary analysis tools that can:
- Create a new dictionary from existing domain exchange transactions
- Merge dictionaries together
- Compare exchange transactions to dictionary definitions and produce a spreadsheet of matches and deltas
- Report XPath location usage patterns of all unique items and exchange transactions
- Assign unique UID values to each component
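The dictionary-building and UID-assignment steps listed above can be sketched in Python. This is a minimal illustration only, not the CAM toolkit itself; the sample XML, the `build_dictionary` function and the `U0001`-style UID format are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical exchange transaction; note that "Number" appears in two contexts.
SAMPLE = """<Order>
  <Customer><Account><Number>A-17</Number></Account></Customer>
  <Item><Number>4</Number></Item>
  <Item><Number>9</Number></Item>
</Order>"""


def build_dictionary(xml_text, prefix="U"):
    """Collect each unique element path, count its uses, and assign a UID."""
    usage = {}

    def walk(elem, parent_path):
        path = f"{parent_path}/{elem.tag}"
        usage[path] = usage.get(path, 0) + 1
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    # UIDs are assigned in sorted path order so repeated runs are stable.
    return {
        path: {"uid": f"{prefix}{n:04d}", "occurrences": count}
        for n, (path, count) in enumerate(sorted(usage.items()), start=1)
    }


dictionary = build_dictionary(SAMPLE)
for path, entry in dictionary.items():
    print(entry["uid"], entry["occurrences"], path)
```

Because the key is the full path rather than the leaf name, Customer/Account/Number and Item/Number receive different UIDs even though both end in "Number".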

CAM dictionary generation overview
[Diagram: XSD schemas, CAM templates, XSLT scripts, Compare & Merge step, and Master Dictionary with UIDs. Dictionary components: Name, Description, Type, Restrictions, Relationships, Usage occurrences.]

Dictionary Tools
- Generate a dictionary of core components from a set of exchange templates
- Separate dictionary content by namespace
- Merge annotations and type definitions from exchange template into dictionary
- Compare each exchange template to the master domain dictionary
- Produce spreadsheet workbooks
- Update spreadsheet and export back to dictionary core components

Create Dictionary – CAM process
- Select dictionary: empty to create a new one, or existing to merge
- Output dictionary filename
- Select template content namespace to match with
- Merge mode: use true to combine content
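One way to picture the merge mode option is the Python sketch below. The slides do not say how conflicts are resolved or what happens when merge mode is false, so both behaviours here are assumptions: on merge, existing master values win and only empty fields are filled; with merge off, the incoming content simply becomes the output. The function and field names are hypothetical.

```python
def merge_dictionaries(master, incoming, merge=True):
    """Return a new dictionary built from master and incoming entries.

    Assumed behaviour (not taken from the CAM toolkit):
    - merge=True: keep master values, add new items, fill empty fields
    - merge=False: output contains only the incoming content
    """
    if not merge:
        return {name: dict(entry) for name, entry in incoming.items()}
    result = {name: dict(entry) for name, entry in master.items()}
    for name, entry in incoming.items():
        if name not in result:
            result[name] = dict(entry)
            continue
        # Keep existing values; fill in only fields the master entry lacks.
        for field, value in entry.items():
            if not result[name].get(field):
                result[name][field] = value
    return result


master = {"Party.Name": {"type": "Text", "description": ""}}
incoming = {
    "Party.Name": {"type": "string", "description": "Name of the party"},
    "Party.Identifier": {"type": "Identifier", "description": ""},
}
merged = merge_dictionaries(master, incoming)
print(merged)
```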

Compare to Dictionary
- Pick dictionary to compare with
- Name of result cross-reference file

View Cross-Reference as Spreadsheet
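The compare step and its spreadsheet view can be approximated as follows. This is a sketch under assumptions: the item names, the three status labels and the CSV column layout are invented for illustration, and CSV stands in for the spreadsheet workbook format.

```python
import csv
import io


def cross_reference(template_items, dictionary):
    """Compare template items to dictionary entries; one row per item."""
    rows = []
    for name, item_type in sorted(template_items.items()):
        entry = dictionary.get(name)
        if entry is None:
            status = "missing"   # item not in the dictionary at all
        elif entry["type"] != item_type:
            status = "delta"     # item found, but definitions differ
        else:
            status = "match"
        rows.append([name, item_type, entry["type"] if entry else "", status])
    return rows


def to_csv(rows):
    """Render the cross-reference rows as CSV text for a spreadsheet tool."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["item", "template_type", "dictionary_type", "status"])
    writer.writerows(rows)
    return buffer.getvalue()


dictionary = {"Period.EndDate": {"type": "Date"}, "Period.Duration": {"type": "Measure"}}
template = {"Period.EndDate": "Date", "Period.Duration": "Text", "Period.Note": "Text"}
rows = cross_reference(template, dictionary)
print(to_csv(rows))
```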

Roadmap Summary
Develop crosswalks:
- Convert XSD schema to CAM templates
- Leverage template structure and XPath rules to build dictionaries with UID labels
- Build OWL relationships from dictionaries
- Compare each dictionary to the master dictionary, and reference OWL and type knowledge bases to align
- Produce spreadsheet for manual review
- Save final results back to master dictionary
Build runtime templates:
- Compare individual CAM templates to the master dictionary and generate a crosswalk section between components
- Crosswalk can contain alignment rules in XPath for content handling (e.g. code values and re-formatting)

CAM template to OWL exporter
- Currently the CAM toolkit contains a variety of exporter tools for XSD schema, XML dictionary and XML test case example generation
- Opportunity to write an exporter that generates OWL terms directly from CAM template patterns in the dictionary
- Uses XSLT to accomplish this, so it can be easily adapted, extended and tailored
- Allows an OWL-based reasoner to act with CAM
- Reasoner can then also update the CAM dictionary to complete the semantic mapping

CAM to OWL generation overview
[Diagram: Master Dictionary with UIDs (components: Name, Description, Type, Restrictions, Relationships); XSLT script to extract and generate OWL term instances; Reasoner; output of UID couplet pairings as XML; XSLT script to insert UID couplets.]

Dictionaries, UIDs, and CAM templates
- Within a dictionary each unique context of an item can be assigned a UID label value
- These UID label values can then be inserted as references into a CAM template
- Each UID couplet across exchange formats within a domain can be marked as equivalent (aliases) or similar (rules associated)
- For similar items, CAM supports transform rules [1]
- The UID couplets allow automated mapping across CAM template definitions
[1] Using standard XPath syntax
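A UID couplet as described above might be modelled like this. The `Couplet` class and its field names are hypothetical, the choice of which pairs are "equivalent" versus "similar" is assumed for illustration, and the XPath rule shown is an invented example rather than one taken from a CAM template.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Couplet:
    uid: str
    source: str                 # item name in the source vocabulary
    target: str                 # item name in the target vocabulary
    relation: str               # "equivalent" (alias) or "similar"
    rule: Optional[str] = None  # XPath transform rule, only for "similar"

    def __post_init__(self):
        if self.relation not in ("equivalent", "similar"):
            raise ValueError(f"unknown relation: {self.relation}")
        if self.relation == "equivalent" and self.rule is not None:
            raise ValueError("transform rules apply only to similar items")


couplets = [
    Couplet("C3402",
            "PartyIdentification.Primary_Identification.GLN_Identifier",
            "PartyIdentification.Identifier",
            "equivalent"),
    # Hypothetical rule: keep only the date part of a date-time value.
    Couplet("T0012",
            "Date_TimePeriod.EndDate.Date_DateTime",
            "Period.EndDate.Date",
            "similar",
            rule="substring-before(., 'T')"),
]

index = {c.uid: c for c in couplets}
print(index["T0012"].target, index["T0012"].rule)
```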

Explicate semantics related to the different usages of document data types
- Different document standards use data types differently
- For example, "Code.Type" in one standard is represented by "Text.Type" in another standard and by "Identifier.Type" in yet another
- This real-world knowledge is expressed through class equivalences so that the reasoner, not only humans, knows about it
  - Code.Type ≡ Text.Type
  - Name.Type ≡ Text.Type
  - Identifier.Type ≡ Text.Type
- Can cross-reference via UID as well as type
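The effect of declaring these class equivalences can be illustrated with a small union-find structure in Python. This is a stand-in for one thing an OWL reasoner would infer (the transitive closure of equivalence), not a reasoner; `build_equivalence` is a hypothetical helper.

```python
def build_equivalence(pairs):
    """Return a function telling whether two types fall in the same class."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    return lambda a, b: find(a) == find(b)


# The three equivalences stated on the slide.
declared = [
    ("Code.Type", "Text.Type"),
    ("Name.Type", "Text.Type"),
    ("Identifier.Type", "Text.Type"),
]
equivalent = build_equivalence(declared)

# Never declared directly, but implied through Text.Type.
print(equivalent("Code.Type", "Identifier.Type"))
```

Only three pairs were declared, yet Code.Type and Identifier.Type come out as equivalent because both are tied to Text.Type. This is the kind of implicit relationship the slides expect a reasoner to surface.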

Dictionary Alignment Step: Human / OWL inspectors
- Dictionary alignment report produces a listing of known equivalents (confidence 100%), and then lesser equivalence rankings based on matching factors
- Component compound relationships are resolved using CAM template structure layouts
- Human inspection then reviews, resolves and updates the dictionary (using Excel spreadsheet workbook format)
- New dictionary produced
- Iterative refinement over time can enhance alignment along with common practices through industry agreements
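A ranking of this kind could be sketched as below. The slides do not define the matching factors, so the two used here (shared name tokens and matching type) and their 0.7 / 0.3 weights are assumptions chosen only to make the example concrete.

```python
import re


def tokens(name):
    """Split a dictionary name such as 'Period.Duration.Measure' into words."""
    return {t.lower() for t in re.split(r"[._]", name) if t}


def confidence(source, target, known_equivalents):
    """Score a candidate pairing between 0.0 and 1.0."""
    if (source["name"], target["name"]) in known_equivalents:
        return 1.0  # known equivalents are reported at 100%
    a, b = tokens(source["name"]), tokens(target["name"])
    if not a or not b:
        return 0.0
    name_score = len(a & b) / len(a | b)  # Jaccard overlap of name tokens
    type_score = 1.0 if source["type"] == target["type"] else 0.0
    return round(0.7 * name_score + 0.3 * type_score, 2)


KNOWN = {("TimePeriod.Details", "Period.Details")}
source = {"name": "TimePeriod.Length.Duration_Measure", "type": "Measure"}
targets = [
    {"name": "Period.Details", "type": "Details"},
    {"name": "Period.Duration.Measure", "type": "Measure"},
]
ranked = sorted(targets, key=lambda t: confidence(source, t, KNOWN), reverse=True)
for t in ranked:
    print(confidence(source, t, KNOWN), t["name"])
```

The lower-ranked candidates are the ones that would go to the human review step in the spreadsheet.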

From Dictionary to Runtime Mapping
Once a dictionary is available with UID couplets for domain crosswalks, proceed to align:
- Take templates of actual exchanges and label these with UID couplets
- Look up UID couplets in the dictionary and update the target template with the UID from the couplet
- Take the completed templates and use them to drive actual mapping processes

Create UID driven mapping template
[Diagram: CAM template (source) and CAM template (target) with UIDs; XSLT script looks up each UID couplet in the Domain Master Dictionary; output is an updated CAM template (matched targets) with UIDs and rules. Couplets are marked Same, or Similar (+ optional XPath mapping rule).]

Automated UID driven mapping
[Diagram: CAM template (source) and CAM template (matched targets) supply UIDs and rules to an XSLT script, which applies UID matches and rules to input XML instances and produces output mapped XML instances.]
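The runtime step can be sketched in Python as a stand-in for the XSLT script. The element names are simplified and hypothetical, the two templates are reduced to plain lookup tables, and the upper-casing rule is an invented placeholder for an XPath mapping rule.

```python
import xml.etree.ElementTree as ET

# Source template: element path (relative to the root) -> UID label.
SOURCE_TEMPLATE = {"Length": "T0008", "Type": "T0021"}
# Matched target template: UID label -> element path in the output.
TARGET_TEMPLATE = {"T0008": "Duration", "T0021": "DescriptionCode"}
# Hypothetical content rule for a "similar" couplet.
RULES = {"T0021": str.upper}


def map_instance(xml_text, source_template, target_template, rules, target_root):
    """Copy each UID-labelled value from the source instance to its target."""
    source = ET.fromstring(xml_text)
    output = ET.Element(target_root)
    for source_path, uid in source_template.items():
        value = source.findtext(source_path)
        target_path = target_template.get(uid)
        if value is None or target_path is None:
            continue  # no data, or no matched target for this UID
        node = output
        for tag in target_path.split("/"):
            child = node.find(tag)
            node = child if child is not None else ET.SubElement(node, tag)
        node.text = rules.get(uid, lambda v: v)(value)
    return ET.tostring(output, encoding="unicode")


instance = "<TimePeriod><Length>7</Length><Type>week</Type></TimePeriod>"
mapped = map_instance(instance, SOURCE_TEMPLATE, TARGET_TEMPLATE, RULES, "Period")
print(mapped)
```

The mapping code never mentions a source or target element name directly; everything is driven by the UID labels, so swapping in a different pair of templates changes the mapping without changing the code.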

Dictionary approach summary
1. If the document components of two different domain standards share the same semantic properties:
   - Use this as an indication that they may be similar
2. Some explicitly defined semantic properties may imply further implicit semantic relationships:
   - Use a reasoner to obtain implicit relationships
   - Align to dictionary definitions allowing crosswalk
   - Create harmonized dictionary lookup
   - Use abstract UID as common reference (linkage between language-specific named types/objects)
3. Explicate semantics related to the different usages of document data types in different document schemas to obtain some desired interpretations by means of such informal semantics
   - Determine similar/match relationships and rules for constraint alignment and compound component relationships (e.g. date-time versus date and time)
4. Provide dictionary structure format for managing relationships
   - Leverage existing OASIS CAM and ebXML Registry TC work

Part III: Summary – Next Steps

Value Proposition
- Mapping templates provide a localization mechanism to tailor inputs and outputs to patterns and scenarios
- Domain mapping automation reduces the burden on participants to maintain multiple mappings
- Removes issues surrounding versioning and exchange transaction structure differences
- Simplifies testing and setup
- Allows alignment over time to coherent domain reference dictionaries; mitigates collisions
- Lowers costs of entry and participation

Tools needed
CAM
- Schema ingesting
- Dictionary builder
OWL
- Reasoner
- CAM dictionary to OWL generator
- Extend CAM dictionary format for couplets / rules
- Extend reasoner to update dictionary couplets
Mapping
- XSLT engine to read input, templates and create output (can use existing XSLT CAM validator as basis)

GS1.XML | UID | UBL 2.0
Forecast.Indicator.Indicator | A1034 | Forecast.BasedOnConsensus_Indicator.Indicator
PartyIdentification.Details | C3401 | PartyIdentification.Details
PartyIdentification.Primary_Identification.GLN_Identifier | C3402 | PartyIdentification.Identifier
NonGLN_PartyIdentification.Details | C3451 | PartyIdentification.Details
NonGLN_PartyIdentification.Identification.Text | C3452 | PartyIdentification.Identifier
ElectronicDocument.Status.Identifier | D4310 | Forecast.DocumentStateCode.Code
Abstract_Forecast.Purpose.ForecastPurposeCriteriaType_Code | E0010 | Forecast.PurposeCode.Code
Multi_unitMeasure.Measure.Measure | F0301 | Dimension.Measure
Abstract_Forecast_TimeStampedTradeItemQuantity.Association.Code | E0451 | Forecast.Identifier.Identifer
Date_TimePeriod.EndDate.Date_DateTime | T0012 | Period.EndDate.Date, Period.EndTime.Time
Date_TimePeriod.BeginDate.Date_DateTime | T0013 | Period.StartDate.Date, Period.StartTime.Time
TimePeriod.Details | T0009 | Period.Details
TimePeriod.Length.Duration_Measure | T0008 | Period.Duration.Measure
TimePeriod.Type.Code | T0021 | Period.DescriptionCode.Code
TradeItemIdentification.Details | F0340 | ItemIdentification.Details
TradeItemIdentification.Primary_Identification.GTIN_Identifier | F0341 | ItemIdentification.Identifier
NonGTIN_TradeItemIdentification.Details | F0342 | ItemIdentification.Details
NonGTIN_TradeItemIdentification.Identification.Type_Code | | ItemIdentification.Extended_Identifier.Identifier

The above equivalences are labelled as couplets through the UID dictionary cross-references and can be stored back into the CAM template's section for runtime crosswalk use.

Runtime crosswalks between template structure member items
[Diagram: template member items linked by UID labels: T0015, D4310, C3402]