Metadata Catalogue Implementations: GSQ & TERN
for the ANZLIC Metadata WG
Nicholas Car | Senior Experimental Scientist
21 February 2019
Land & Water
What is interesting about these?
  Both are catalogues contributing to a total information graph
  Both use familiar off-the-shelf tech + extensions
  Both use the latest work from the W3C Dataset Exchange Working Group (DXWG)
GSQ – Geological Survey of Queensland
  GSQ is undertaking a “Geoscience Data Modernisation Project”
  CSIRO is providing a model-based approach to thinking about data
  CSIRO is also training GSQ staff to model and implement models
GSQ – perspectives
  All GSQ data can be viewed from at least 2 perspectives
GSQ – perspectives
  ‘Regular’ catalogues (GeoNetwork, CKAN, MAGDA etc.) only deal with the management perspective
  Catalogue tools can cater for some ‘realm’ perspective classes
GSQ – perspectives
  Realm perspective cataloguing in CKAN
  Either the final or an initial catalogue
GSQ – perspectives
  Catalogue tools cannot cater for all ‘realm’ perspective classes: not GSQ Samples
  The Samples catalogue will be a catalogue just like GA's!
GSQ – making it all work
  Linked Data brings all the catalogues together
  An overarching model crosses the perspectives
  All tools understand URIs
  All tools ‘speak’ RDF
  Caches of all content can be made
  [Figure: Dataset A (management perspective), Borehole N and Sample X (realm perspective), connected by ‘source’ and ‘isSampleOf’ links]
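The linking idea on this slide can be sketched with plain (subject, predicate, object) triples. This is only an illustration: the example.org URIs and the predicate names are hypothetical, and the direction of the ‘source’ link (borehole to dataset) is an assumption, since the slide diagram does not survive in the text.

```python
# Hypothetical URIs and predicate names, for illustration only.
BASE = "http://example.org/"
SOURCE = BASE + "def/source"            # assumed direction: realm item -> dataset
IS_SAMPLE_OF = BASE + "def/isSampleOf"

DATASET_A = BASE + "dataset/A"    # management perspective (e.g. a catalogue record)
BOREHOLE_N = BASE + "borehole/N"  # realm perspective
SAMPLE_X = BASE + "sample/X"      # realm perspective

# Each statement is a (subject, predicate, object) triple, as in RDF.
triples = {
    (BOREHOLE_N, SOURCE, DATASET_A),
    (SAMPLE_X, IS_SAMPLE_OF, BOREHOLE_N),
}


def objects(graph, subject, predicate):
    """Return every object linked from `subject` by `predicate`."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def datasets_for_sample(graph, sample):
    """Follow isSampleOf, then source, to reach the dataset(s) behind a sample."""
    datasets = set()
    for feature in objects(graph, sample, IS_SAMPLE_OF):
        datasets |= objects(graph, feature, SOURCE)
    return datasets
```

Because every item is identified by a URI, a query can cross from the realm perspective (Sample X) to the management perspective (Dataset A) without either tool knowing the other's internals.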
GSQ – Use of vocabs in catalogues
  GSQ is creating vocabs for all code list items in catalogues
  The CKAN catalogue calls on vocabs at page-load time for up-to-date use
  CKAN “Scheming” extension
GSQ – Use of vocabs in catalogues
  GSQ is using VocPrez to list all their vocabs and others of interest to them
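A minimal sketch of the page-load lookup described above, assuming the vocabulary is served as a JSON list of objects with "uri" and "prefLabel" keys. That response shape, the function names and the URL passed in are all assumptions, not GSQ's actual setup. The output uses value/label pairs, the shape the CKAN Scheming extension uses for field choices.

```python
import json
from urllib.request import urlopen


def fetch_concepts(vocab_url):
    """Fetch a vocabulary as JSON.

    Assumed response shape: [{"uri": "...", "prefLabel": "..."}, ...]
    """
    with urlopen(vocab_url, timeout=10) as response:
        return json.load(response)


def concepts_to_choices(concepts):
    """Turn vocabulary concepts into value/label pairs, sorted by label."""
    choices = [{"value": c["uri"], "label": c["prefLabel"]} for c in concepts]
    return sorted(choices, key=lambda choice: choice["label"].lower())
```

Calling `concepts_to_choices(fetch_concepts(url))` each time a form is rendered means a newly added vocabulary term shows up in the catalogue without a redeploy, at the cost of one request per page load.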
GSQ – CKAN export as RDF
  A test geochemistry dataset is available in RDF
  It conforms to DCAT (rev) and to a geochemistry profile thereof
GSQ – classes & tool list
  Datasets - CKAN
  Organisations - CKAN
  Permits - CKAN
  People - CKAN?
  Sites (inc. Boreholes & Mines) - CKAN but perhaps others
  Samples - Graph DB + Linked Data API
  Observations - Relational DB + application
TERN – Terrestrial Ecosystem Research Network
  CSIRO is providing a model-based approach to thinking about data
  CSIRO is also training TERN staff to model and implement models
  CSIRO is recommending that data exchange with partner agencies (suppliers and users) use RDF as the format and the TERN model as the data model
TERN – perspectives
  All TERN data can be viewed from at least 2 perspectives
TERN – classes & tool list
  Datasets - GeoNetwork
  Organisations - Drupal
  People - Drupal
  Sites - Graph DB + Linked Data API
  Samples - Graph DB + Linked Data API
  Observations - Graph DB + Linked Data API
TERN – data ingestion
  TERN can get data in a number of ways:
    human data entry – via catalogue forms
    machine data entry – via API (will all be RDF)
  APIs validate data
  [Figure: an RDF document is sent to the API, checked by the validator and, if valid, stored in the DB]
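The validate-then-store flow can be sketched as below. This is not TERN's validator: the single rule used here (subjects and predicates must be http(s) URIs) is a placeholder for whatever checks against the TERN model would actually apply, and the in-memory set stands in for the DB.

```python
def validate(triples):
    """Return a list of error messages; an empty list means the document is valid.

    Placeholder rule: every statement has 3 terms, and its subject and
    predicate are http(s) URIs.
    """
    errors = []
    for index, triple in enumerate(triples):
        if len(triple) != 3:
            errors.append(f"statement {index}: expected 3 terms, got {len(triple)}")
            continue
        subject, predicate, _ = triple
        for role, term in (("subject", subject), ("predicate", predicate)):
            if not str(term).startswith(("http://", "https://")):
                errors.append(f"statement {index}: {role} is not a URI: {term!r}")
    return errors


def ingest(triples, store):
    """Store the triples only if validation passes; otherwise report the errors."""
    errors = validate(triples)
    if errors:
        return {"stored": False, "errors": errors}
    store.update(triples)
    return {"stored": True, "errors": []}
```

Rejecting the whole document on any error keeps the store free of partially valid submissions, and the supplier gets the full error list back in one response.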
TERN & GSQ data pooling
  In both the GSQ and TERN cases, when (meta)data from the various systems needs to be pooled for cross-querying, we have 2 options:
    cross indexing: one system knows about some of the content in other systems
    caching: bring data from multiple systems into one DB
      very easy if all systems speak RDF!
      tested at scale in a number of projects already
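The two pooling options can be contrasted in a small sketch. The two "systems", the example.org URIs and the predicate names are hypothetical. Triples are modelled as plain tuples, which is enough to show why caching is simple when every system already exposes RDF-like statements.

```python
# Hypothetical URIs and predicates, for illustration only.
EX = "http://example.org/"
IS_SAMPLE_OF = EX + "def/isSampleOf"
TITLE = EX + "def/title"

# Two separate systems, each holding its own triples.
catalogue = {(EX + "site/1", TITLE, "Site One")}
samples_db = {(EX + "sample/9", IS_SAMPLE_OF, EX + "site/1")}


def build_cross_index(graph, predicate):
    """Cross indexing: record only the subject -> object links for one predicate."""
    index = {}
    for s, p, o in graph:
        if p == predicate:
            index.setdefault(s, set()).add(o)
    return index


def build_cache(*system_graphs):
    """Caching: copy every triple from every system into one pooled set."""
    pooled = set()
    for graph in system_graphs:
        pooled |= set(graph)
    return pooled


def site_titles_for_sample(pooled, sample):
    """Cross-query the pooled cache: sample -> site -> title."""
    sites = {o for s, p, o in pooled if s == sample and p == IS_SAMPLE_OF}
    return {o for s, p, o in pooled if s in sites and p == TITLE}


index = build_cross_index(samples_db, IS_SAMPLE_OF)
pooled = build_cache(catalogue, samples_db)
```

The cross index only tells one system where to look in another, while the cache answers the whole question locally because the merge is a set union with no schema mapping.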
Thank you
  Nicholas Car
  Senior Experimental Scientist
  t +61 7 3833 5632
  e nicholas.car@csiro.au
  w people.csiro.au/Nicholas-Car