Presentation is loading. Please wait.

Presentation is loading. Please wait.

Simon Cox Research Scientist 16 April 2008

Similar presentations


Presentation on theme: "Simon Cox Research Scientist 16 April 2008"— Presentation transcript:

1 Simon Cox Research Scientist 16 April 2008
Standards-based methodology for developing a geoscience markup language Markup languages have been developed for data transfer in a variety of earth science disciplines. Most of these have been developed using an informal methodology – typically guided by a data model implicitly defined in some existing document or database, but with the XML schema often designed directly using ad-hoc patterns, or sometimes created automatically by some proprietary toolkit. This often leads to a language that is efficient for a single application or within a workgroup or community, but with limited scope for interoperability across domain boundaries. The latter is a serious constraint to the use of data from diverse sources in cross-disciplinary investigations. A uniform methodology, based on standards published by Open Geospatial Consortium and ISO, has been developed and applied in the design of GeoSciML. The method is based on the Object Management Group’s Model Driven Architecture (MDA), with model design in UML using the General Feature Model from ISO 19109, the use of components from other standards in the ISO series, and production of the XML schema following the encoding rules specified in ISO The resultant encoding shows a literal and explicit relationship to the UML model. This is unlikely to be as compact as hand-coded special cases, but is consistently structured across similar models. Full structure and meaning is preserved, and compactness is easily dealt with using standard compression techniques. Furthermore, the use of standard components for elements that are common across domains ensures maximum interoperability. To assist in the use of this methodology, we have developed two tools: “HollowWorld” – a UML template with ISO components, stereotypes, and tags pre-loaded, plus some other cross-domain components; “FullMoon” – a UML processing framework, based on application of sets of rules against the XMI representation of a model. We use a UML design tool that allows direct binding to one or more SVN repositories. These host the various UML packages that are under separate governance arrangements. This overcomes an important limitation of most UML-based methodologies, which effectively treat the entire model as a single artefact. Three rule-sets are available For FullMoon: validating the UML model with respect to the profile described in ISO 19136 generating GML-conformant XML schema according to the rules in ISO 19136 generating human-readable documentation of the model, in the form of an HTML frameset. Use of these tools has allowed the GeoSciML team to develop and maintain the model as a single normative artefact (XMI). Implementation views in XML Schema and HTML documentation are generated automatically at significant release points. This addresses two key issues with ad-hoc approaches: ensuring normative and descriptive content are consistent across maintenance activities, and the ability to support convenient cross-reference between the conceptual model and the XML encoding. Simon Cox Research Scientist 16 April 2008

2 Outline The problem GeoSciML Re-use and delegation patterns
Tooling to support methdology Summary CSIRO EGU-2008-A Cox Standards-based methodology

3 The problem CSIRO EGU-2008-A Cox Standards-based methodology

4 The problem Typical markup language strategy: XML is a meta-language
Manually crafted schema Implicit data model from existing db or processing service Ad-hoc xml patterns Single use-case  no interoperability XML is a meta-language Syntactic vs structural or semantic interoperability CSIRO EGU-2008-A Cox Standards-based methodology

5 A better way Model-driven, standards-based
UML formalization Automatic transformation to implementation platform (e.g. XML) Standard meta-model ISO General Feature Model + Coverage Standard X-domain components Geometry, CRS, Temporal Observations, Sampling Features Formally governed vocabularies Published in online registers CSIRO EGU-2008-A Cox Standards-based methodology

6 GeoSciML CSIRO EGU-2008-A Cox Standards-based methodology

7 GeoSciML A language for exchange of geoscience information
UML logical model XML transfer encoding Scope: interpreted geology and supporting observations MappedFeature, GeologicUnit, GeologicStructure, Fossil, Geologic timescale, Borehole, Observation, etc i.e. information required to maintain geologic maps Design in pictures – but using a formal notation: Unified Modeling Language UML Automatic transformation into an XML Schema XML document format is for data transfer uses Geology model, not geological-map model Maps are views of the world, projected or sampled on a particular plane etc. CSIRO EGU-2008-A Cox Standards-based methodology

8 GeologicUnit A core part of the model: Geologic Unit
Some “simple” attributes, plus some complex properties (associations). Specializations as LithologicUnit, ChronostratigraphicUnit, DeformationUnit (maybe more to come). CSIRO EGU-2008-A Cox Standards-based methodology

9 Use of standards CSIRO EGU-2008-A Cox Standards-based methodology

10 MappedFeature ISO 19109 Feature Model ISO 19107 Geometry
ISO Metadata OGC Sampling Model How does the use of this framework show up? Click Use of standard UML stereotypes ( specific encoding patterns) Reference to standard external components E.g. Geometry GM_Object (from ISO 19107), metadata MD_Metadata (from ISO 19115), SamplingFeatures (from OGC O&M) CSIRO EGU-2008-A Cox Standards-based methodology

11 Boreholes, outcrops and specimens
ISO/OGC Sampling Model ISO/OGC Observation model CSIRO EGU-2008-A Cox Standards-based methodology

12 Borehole logs ISO/OGC Sampling Model ISO/OGC Coverage Model
CSIRO EGU-2008-A Cox Standards-based methodology

13 Orderly delegation of responsibility
Interoperability levels: Schematic/model – common XML Schema GeoScML v2.0 - see other paper in this conference Semantic – common vocabularies CGI GeoSciML provides the data structure E.g. LithostratigraphicUnit is a kind of GeologicFeature with the properties “preferredAge”, “classifier”, “beddingPattern” etc Data providers use appropriate vocabs and reference systems See SOA/Registry paper GeoSciML model/schema defines a the data structures – to quite a high degree of detail, But the data values in many cases can be scoped “at run time” to a specific vocabulary, scale, etc i.e. the governance of scales, vocabularies, etc are delegated to the data provider Though for maximum interoperability it is recommended to use published, well-governed vocabularies etc. CSIRO EGU-2008-A Cox Standards-based methodology

14 Example Most property values are references to registers
<MappedFeature>     …     <observationMethod> <CGI_TermValue>             <value codeSpace="urn:cgi:classifierScheme:GA:1MillionGeology_ObservationMethods“ >GSNSW785</value>         </CGI_TermValue></observationMethod>     <positionalAccuracy> <CGI_NumericValue>             <principalValue uom="urn:ogc:def:uom:UCUM:m">500</principalValue>         </CGI_NumericValue> </positionalAccuracy>     <samplingFrame xlink:href="urn:cgi:classsifier:GA:SurfaceGeologyOfEasternAustralia_1MillionScale"/>      <specification>         <LithologicUnit >             <gml:description>Mafic volcaniclastic sandstone, siltstone, shale, chert; minor limestone, conglomerate</gml:description>             <gml:name codeSpace="urn:cgi:classifierScheme:GA:StratigraphicLexicon:Unitname“ >Kabadah Formation</gml:name>             <gml:name codeSpace="urn:cgi:classifierScheme:GA:StratigraphicLexicon:Map_symbol“ >Ojck</gml:name>             <gml:name codeSpace="urn:ietf:rfc:2141">urn:cgi:feature:GA:Stratno:29570</gml:name>             <observationMethod> <CGI_TermValue>                     <value codeSpace="urn:cgi:classifierScheme:GA:ObservationMethods“ >published description</value>                 </CGI_TermValue> </observationMethod>             <purpose>typicalNorm</purpose>             <preferredAge> <GeologicEvent> <eventAge> <CGI_TermValue>                             <value codeSpace="urn:cgi:classifierScheme:ICS:StratChart:2004“ >urn:cgi:classifier:ICS:StratChart:2004:Ordovician</value>                         </CGI_TermValue> </eventAge>  <eventProcess> <CGI_TermValue>                             <value codeSpace="urn:cgi:classifierScheme:GA:Process">unspecified</value>                         </CGI_TermValue> </eventProcess>  </GeologicEvent> </preferredAge> Most property values are references to registers Common values  interoperability CSIRO EGU-2008-A Cox Standards-based methodology

15 Extensibility Related communities are already building specializations on top of GeoSciML GroundWaterML GeochronML Mineral Occurrences ML As well as internal extensibility, GeoSciML is designed to be extended or specialized by sub-domains within, and related to, geosciences For example, two papers in this afternoon’s session describe languages explicitly derived from GeoSciML. CSIRO EGU-2008-A Cox Standards-based methodology

16 Extensibility methodology
Same pattern as GeoSciML’s specialization of ISO & O&M … The pattern used to accomplish this follows exactly the same method as the basic GeosciML design i.e. specialization-of, and reference-to externally governed components. (the blue classes are in the GWML domain). (N.B. this is enforced within the development environment by the use of “controlled packages” in a variety of SubVersion code repositories). CSIRO EGU-2008-A Cox Standards-based methodology

17 Tooling CSIRO EGU-2008-A Cox Standards-based methodology

18 Tooling to support standards-based approach
UML for design, XML for transfer HollowWorld UML template Standard UML profile ISO components OGC Observation & Sampling components FullMoon XMI processor to automate XML schema documentation production GeoSciML documentation CSIRO EGU-2008-A Cox Standards-based methodology

19 Delegation CSIRO EGU-2008-A Cox Standards-based methodology

20 Benefits of delegating governance
Platform supports governance arrangements UML packages (XML namespaces) reflect system boundaries  discrete governance arrangements Markup conventions support late-binding of selected elements (esp. vocabularies and scales) Understand the scope and reach of your community Only maintain the elements that are: important to you not governed by someone else Enable extensions to your model Publish re-usable components in http repository e.g. XMI of UML model; XML Schema Maintain your components in an orderly way Don’t cause surprises! CSIRO EGU-2008-A Cox Standards-based methodology

21 Summary CSIRO EGU-2008-A Cox Standards-based methodology

22 Key points Methodology for information communities to reach consensus
UML design stays close to conceptual level Re-use of cross-domain components and standard applications Patterns enable delegation to the appropriate authority Enhanced interoperability GeoSciML is an example of a community agreement developed using a standards-based methodology Specialized schemas are being built on top of GeoSciML Standards build on standards Don’t re-invent unnecessarily - its easier (and more interoperable) to borrow elements already managed by someone else Allow others to borrow yours But this imposes an obligation on you to maintain an orderly governance process. CSIRO EGU-2008-A Cox Standards-based methodology

23 Thank you Exploration & Mining Simon Cox Research Scientist
Contact Us Phone: or Web: Exploration & Mining Simon Cox Research Scientist Phone: Web: Thank you


Download ppt "Simon Cox Research Scientist 16 April 2008"

Similar presentations


Ads by Google