Simon Cox Research Scientist 16 April 2008

Presentation transcript:

Standards-based methodology for developing a geoscience markup language
Simon Cox, Research Scientist, 16 April 2008

Markup languages have been developed for data transfer in a variety of earth science disciplines. Most of these have been developed using an informal methodology, typically guided by a data model implicitly defined in some existing document or database, but with the XML schema often designed directly using ad-hoc patterns, or sometimes created automatically by some proprietary toolkit. This often leads to a language that is efficient for a single application or within a workgroup or community, but with limited scope for interoperability across domain boundaries. The latter is a serious constraint on the use of data from diverse sources in cross-disciplinary investigations.

A uniform methodology, based on standards published by the Open Geospatial Consortium and ISO, has been developed and applied in the design of GeoSciML. The method is based on the Object Management Group’s Model Driven Architecture (MDA), with model design in UML using the General Feature Model from ISO 19109, the use of components from other standards in the ISO 19100 series, and production of the XML schema following the encoding rules specified in ISO 19136.

The resultant encoding shows a literal and explicit relationship to the UML model. This is unlikely to be as compact as hand-coded special cases, but is consistently structured across similar models. Full structure and meaning are preserved, and compactness is easily dealt with using standard compression techniques. Furthermore, the use of standard components for elements that are common across domains ensures maximum interoperability.
To assist in the use of this methodology, we have developed two tools: “HollowWorld”, a UML template with ISO 19100 components, stereotypes, and tags pre-loaded, plus some other cross-domain components; and “FullMoon”, a UML processing framework, based on application of sets of rules against the XMI representation of a model.

We use a UML design tool that allows direct binding to one or more SVN repositories. These host the various UML packages that are under separate governance arrangements. This overcomes an important limitation of most UML-based methodologies, which effectively treat the entire model as a single artefact.

Three rule-sets are available for FullMoon: (1) validating the UML model with respect to the profile described in ISO 19136; (2) generating GML-conformant XML schema according to the rules in ISO 19136; (3) generating human-readable documentation of the model, in the form of an HTML frameset.

Use of these tools has allowed the GeoSciML team to develop and maintain the model as a single normative artefact (XMI). Implementation views in XML Schema and HTML documentation are generated automatically at significant release points. This addresses two key issues with ad-hoc approaches: ensuring normative and descriptive content are consistent across maintenance activities, and the ability to support convenient cross-reference between the conceptual model and the XML encoding.
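The idea of applying a rule-set to a model can be illustrated with a minimal Python sketch. It is not FullMoon's implementation, and it works on a small in-memory stand-in rather than on real XMI. The `UmlClass` type, the three rule functions, the list of allowed stereotypes, and the naming conventions are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field

# Illustrative subset of stereotypes; the real profile is defined in ISO 19136.
ALLOWED_STEREOTYPES = {"FeatureType", "DataType", "CodeList", "Union", "Type"}


@dataclass
class UmlClass:
    """Hypothetical, heavily simplified stand-in for a class read from XMI."""
    name: str
    stereotype: str
    attributes: dict = field(default_factory=dict)  # attribute name -> type name


def rule_known_stereotype(cls):
    if cls.stereotype not in ALLOWED_STEREOTYPES:
        return f"{cls.name}: stereotype '{cls.stereotype}' is not in the profile"
    return None


def rule_upper_camel_class_name(cls):
    # Assumed naming convention, used here only to show a second rule.
    if not cls.name[:1].isupper():
        return f"{cls.name}: class names should start with an upper-case letter"
    return None


def rule_lower_camel_attributes(cls):
    bad = sorted(a for a in cls.attributes if not a[:1].islower())
    if bad:
        return f"{cls.name}: attributes should start lower-case: {', '.join(bad)}"
    return None


VALIDATION_RULES = [
    rule_known_stereotype,
    rule_upper_camel_class_name,
    rule_lower_camel_attributes,
]


def validate(model, rules=VALIDATION_RULES):
    """Apply every rule to every class and collect the reported problems."""
    problems = []
    for cls in model:
        for rule in rules:
            message = rule(cls)
            if message is not None:
                problems.append(message)
    return problems


model = [
    UmlClass("GeologicUnit", "FeatureType", {"purpose": "CharacterString"}),
    UmlClass("borehole", "Widget", {"Depth": "Measure"}),
]

for problem in validate(model):
    print(problem)
```

Keeping each rule as a separate function means a rule-set is just a list, so validation, schema generation, and documentation could each be a different list applied to the same model.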

Outline
  The problem
  GeoSciML
  Re-use and delegation patterns
  Tooling to support methodology
  Summary
CSIRO EGU-2008-A-02998 Cox Standards-based methodology

The problem

The problem
  Typical markup language strategy:
    Manually crafted schema
    Implicit data model from existing db or processing service
    Ad-hoc XML patterns
    Single use-case → no interoperability
  XML is a meta-language
    Syntactic vs structural or semantic interoperability

A better way
  Model-driven, standards-based
    UML formalization
    Automatic transformation to implementation platform (e.g. XML)
  Standard meta-model
    ISO General Feature Model + Coverage
  Standard cross-domain components
    Geometry, CRS, Temporal
    Observations, Sampling Features
  Formally governed vocabularies
    Published in online registers
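The automatic transformation step can be sketched in Python as a function that turns one feature class into a global element plus a complex type, loosely in the style of the ISO 19136 encoding rules. This is a simplified illustration under stated assumptions, not the normative mapping: the `ex` prefix, the function name `class_to_schema`, and the property type names are hypothetical, and namespace declarations for `gml` and `ex` are left out of the fragment.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def class_to_schema(name, properties, prefix="ex"):
    """Map one feature class to (global element, complex type).

    properties maps a property name to (type name, minOccurs, maxOccurs).
    """
    element = ET.Element(f"{{{XS}}}element", {
        "name": name,
        "type": f"{prefix}:{name}Type",
        "substitutionGroup": "gml:AbstractFeature",
    })
    ctype = ET.Element(f"{{{XS}}}complexType", {"name": f"{name}Type"})
    content = ET.SubElement(ctype, f"{{{XS}}}complexContent")
    extension = ET.SubElement(
        content, f"{{{XS}}}extension", {"base": "gml:AbstractFeatureType"}
    )
    sequence = ET.SubElement(extension, f"{{{XS}}}sequence")
    # Each UML property becomes a local element, in declaration order.
    for prop_name, (prop_type, min_occurs, max_occurs) in properties.items():
        ET.SubElement(sequence, f"{{{XS}}}element", {
            "name": prop_name,
            "type": prop_type,
            "minOccurs": str(min_occurs),
            "maxOccurs": str(max_occurs),
        })
    return element, ctype


# Property type names below are hypothetical placeholders.
element, ctype = class_to_schema("MappedFeature", {
    "observationMethod": ("ex:TermValuePropertyType", 1, "unbounded"),
    "positionalAccuracy": ("ex:NumericValuePropertyType", 1, 1),
})

print(ET.tostring(element, encoding="unicode"))
print(ET.tostring(ctype, encoding="unicode"))
```

Because the output is derived mechanically from the class name and its properties, two models that follow the same conventions yield schemas with the same shape, which is the consistency the methodology relies on.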

GeoSciML

GeoSciML
  A language for exchange of geoscience information
    UML logical model
    XML transfer encoding
  Scope: interpreted geology and supporting observations
    MappedFeature, GeologicUnit, GeologicStructure, Fossil, Geologic timescale, Borehole, Observation, etc.
    i.e. information required to maintain geologic maps
  Design in pictures, but using a formal notation: Unified Modeling Language (UML)
  Automatic transformation into an XML Schema
  The XML document format is for data transfer uses
  Geology model, not geological-map model: maps are views of the world, projected or sampled on a particular plane etc.

GeologicUnit
  A core part of the model: Geologic Unit
  Some “simple” attributes, plus some complex properties (associations)
  Specializations as LithologicUnit, ChronostratigraphicUnit, DeformationUnit (maybe more to come)

Use of standards

MappedFeature
  ISO 19109 Feature Model
  ISO 19107 Geometry
  ISO 19115 Metadata
  OGC 07-002 Sampling Model
  How does the use of this framework show up?
    Use of standard UML stereotypes (→ specific encoding patterns)
    Reference to standard external components, e.g. geometry GM_Object (from ISO 19107), metadata MD_Metadata (from ISO 19115), SamplingFeatures (from OGC O&M)

Boreholes, outcrops and specimens
  ISO/OGC Sampling Model
  ISO/OGC Observation model

Borehole logs
  ISO/OGC Sampling Model
  ISO/OGC Coverage Model

Orderly delegation of responsibility
  Interoperability levels:
    Schematic/model: common XML Schema
      GeoSciML v2.0 (see other paper in this conference)
    Semantic: common vocabularies
      CGI
  GeoSciML provides the data structure
    E.g. LithostratigraphicUnit is a kind of GeologicFeature with the properties “preferredAge”, “classifier”, “beddingPattern” etc.
  Data providers use appropriate vocabularies and reference systems
    See SOA/Registry paper
  The GeoSciML model/schema defines the data structures to quite a high degree of detail, but the data values in many cases can be scoped “at run time” to a specific vocabulary, scale, etc.; i.e. the governance of scales, vocabularies, etc. is delegated to the data provider. For maximum interoperability, though, it is recommended to use published, well-governed vocabularies.
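The split between a fixed structure and values scoped at run time can be sketched in Python as a lookup against registers keyed by `codeSpace`. The in-memory `REGISTERS` dictionary is a hypothetical stand-in for online registers, and `check_term` is an illustrative name. The URNs are borrowed from this presentation's own instance example, while the register contents are assumed.

```python
# Hypothetical in-memory stand-in for registers published online.
# Each codeSpace identifies a vocabulary governed by its own authority.
REGISTERS = {
    "urn:cgi:classifierScheme:ICS:StratChart:2004": {
        "urn:cgi:classifier:ICS:StratChart:2004:Ordovician",
    },
    "urn:cgi:classifierScheme:GA:Process": {
        "unspecified",
    },
}


def check_term(code_space, value, registers=REGISTERS):
    """Classify a (codeSpace, value) pair against the known registers."""
    if code_space not in registers:
        return "unknown-register"
    if value not in registers[code_space]:
        return "unknown-term"
    return "ok"


print(check_term(
    "urn:cgi:classifierScheme:ICS:StratChart:2004",
    "urn:cgi:classifier:ICS:StratChart:2004:Ordovician",
))
print(check_term("urn:cgi:classifierScheme:GA:Process", "erosion"))
print(check_term("urn:example:localScheme", "anything"))
```

The schema never needs to change when a provider adopts a different vocabulary; only the register consulted for a given `codeSpace` differs.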

Example
  Most property values are references to registers
  Common values → interoperability

<MappedFeature>
    …
    <observationMethod>
        <CGI_TermValue>
            <value codeSpace="urn:cgi:classifierScheme:GA:1MillionGeology_ObservationMethods">GSNSW785</value>
        </CGI_TermValue>
    </observationMethod>
    <positionalAccuracy>
        <CGI_NumericValue>
            <principalValue uom="urn:ogc:def:uom:UCUM:m">500</principalValue>
        </CGI_NumericValue>
    </positionalAccuracy>
    <samplingFrame xlink:href="urn:cgi:classsifier:GA:SurfaceGeologyOfEasternAustralia_1MillionScale"/>
    <specification>
        <LithologicUnit>
            <gml:description>Mafic volcaniclastic sandstone, siltstone, shale, chert; minor limestone, conglomerate</gml:description>
            <gml:name codeSpace="urn:cgi:classifierScheme:GA:StratigraphicLexicon:Unitname">Kabadah Formation</gml:name>
            <gml:name codeSpace="urn:cgi:classifierScheme:GA:StratigraphicLexicon:Map_symbol">Ojck</gml:name>
            <gml:name codeSpace="urn:ietf:rfc:2141">urn:cgi:feature:GA:Stratno:29570</gml:name>
            <observationMethod>
                <CGI_TermValue>
                    <value codeSpace="urn:cgi:classifierScheme:GA:ObservationMethods">published description</value>
                </CGI_TermValue>
            </observationMethod>
            <purpose>typicalNorm</purpose>
            <preferredAge>
                <GeologicEvent>
                    <eventAge>
                        <CGI_TermValue>
                            <value codeSpace="urn:cgi:classifierScheme:ICS:StratChart:2004">urn:cgi:classifier:ICS:StratChart:2004:Ordovician</value>
                        </CGI_TermValue>
                    </eventAge>
                    <eventProcess>
                        <CGI_TermValue>
                            <value codeSpace="urn:cgi:classifierScheme:GA:Process">unspecified</value>
                        </CGI_TermValue>
                    </eventProcess>
                </GeologicEvent>
            </preferredAge>
            …
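A short Python sketch shows how a consumer might pull the register references out of such an instance. It uses a trimmed, well-formed fragment of the example. Omitting the GeoSciML default namespace is a simplification, and the function name `term_values` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed from the slide's example; the GeoSciML namespace is omitted
# here for brevity, which is a simplification of a real instance.
FRAGMENT = """
<LithologicUnit xmlns:gml="http://www.opengis.net/gml">
  <gml:name codeSpace="urn:cgi:classifierScheme:GA:StratigraphicLexicon:Unitname">Kabadah Formation</gml:name>
  <observationMethod>
    <CGI_TermValue>
      <value codeSpace="urn:cgi:classifierScheme:GA:ObservationMethods">published description</value>
    </CGI_TermValue>
  </observationMethod>
  <preferredAge>
    <GeologicEvent>
      <eventAge>
        <CGI_TermValue>
          <value codeSpace="urn:cgi:classifierScheme:ICS:StratChart:2004">urn:cgi:classifier:ICS:StratChart:2004:Ordovician</value>
        </CGI_TermValue>
      </eventAge>
    </GeologicEvent>
  </preferredAge>
</LithologicUnit>
"""


def term_values(xml_text):
    """Return (property name, codeSpace, value) for every CGI_TermValue."""
    root = ET.fromstring(xml_text.strip())
    found = []
    for prop in root.iter():
        # A property element wraps its value in a CGI_TermValue child.
        term = prop.find("CGI_TermValue/value")
        if term is not None:
            found.append((prop.tag, term.get("codeSpace"), term.text))
    return found


for name, code_space, value in term_values(FRAGMENT):
    print(f"{name}: {value} (from {code_space})")
```

Because every term value follows the same property/`CGI_TermValue`/`value` pattern, one generic traversal is enough to collect all vocabulary references, whatever the property is called.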

Extensibility
  Related communities are already building specializations on top of GeoSciML
    GroundWaterML
    GeochronML
    Mineral Occurrences ML
  As well as internal extensibility, GeoSciML is designed to be extended or specialized by sub-domains within, and related to, the geosciences. For example, two papers in this afternoon’s session describe languages explicitly derived from GeoSciML.

Extensibility methodology
  Same pattern as GeoSciML’s specialization of ISO & O&M …
  The pattern used to accomplish this follows exactly the same method as the basic GeoSciML design, i.e. specialization of, and reference to, externally governed components (the blue classes are in the GWML domain).
  N.B. this is enforced within the development environment by the use of “controlled packages” in a variety of Subversion code repositories.

Tooling

Tooling to support standards-based approach
  UML for design, XML for transfer
  HollowWorld UML template
    Standard UML profile
    ISO 19100 components
    OGC Observation & Sampling components
  FullMoon XMI processor to automate XML schema and documentation production
  GeoSciML documentation

Delegation

Benefits of delegating governance
  Platform supports governance arrangements
    UML packages (XML namespaces) reflect system boundaries → discrete governance arrangements
    Markup conventions support late-binding of selected elements (esp. vocabularies and scales)
  Understand the scope and reach of your community
    Only maintain the elements that are:
      important to you
      not governed by someone else
  Enable extensions to your model
    Publish re-usable components in http repository, e.g. XMI of UML model; XML Schema
  Maintain your components in an orderly way
    Don’t cause surprises!

Summary

Key points
  Methodology for information communities to reach consensus
    UML design stays close to conceptual level
    Re-use of cross-domain components and standard applications
    Patterns enable delegation to the appropriate authority
    Enhanced interoperability
  GeoSciML is an example of a community agreement developed using a standards-based methodology
    Specialized schemas are being built on top of GeoSciML
  Standards build on standards
    Don’t re-invent unnecessarily: it’s easier (and more interoperable) to borrow elements already managed by someone else
    Allow others to borrow yours
    But this imposes an obligation on you to maintain an orderly governance process

Thank you
Exploration & Mining
Simon Cox, Research Scientist
Phone: 08 6436 8639
Email: Simon.Cox@csiro.au
Web: www.seegrid.csiro.au
Contact Us
Phone: 1300 363 400 or +61 3 9545 2176
Email: enquiries@csiro.au
Web: www.csiro.au