DATA SYSTEMS FOR SAMPLE-BASED OBSERVATIONS
Kerstin Lehnert

Presentation transcript:


Data from Samples
- Distributed data acquisition
  - Different labs/researchers analyze the same sample or subsamples of it.
- Distributed data publication
  - Different data for the same sample are published in different papers.
- Distributed data archiving
  - Data for the same sample are kept in different data systems.
- Integrated data access is required to maximize utility.

Geochemical Data
- diverse
  - hundreds of parameters
  - thousands of materials
  - values vary with space and time over a range of more than ten orders of magnitude
- complex
  - mostly sample-based, with complex relations among samples & subsamples
  - distributed data acquisition (one sample analyzed in different labs by different researchers at different times)
  - idiosyncratic data acquisition methods

Geoinformatics for Geochemistry
- DATABASES
  - thematic geochemical databases (PetDB, SedDB, VentDB)
- DATA REPOSITORY
  - Geochemical Resource Library
- REGISTRIES
  - System for Earth Sample Registration (SESAR)
  - IEDA Data Publication Agent of the STD-DOI system (DataCite®)
  - GeoPass: single sign-on authentication system
- DATA ACCESS & ANALYSIS TOOLS
  - GfG user interfaces
  - EarthChem Data Engine (Portal)

GfG Architecture
[Diagram. Labels: EarthChem XML DB; metadata catalog; datasets (original data & derived products); GCDM DB; USGS; NAVDAT; GEOROC; EarthChem Portal; GfG Data Entry; User Submission; External Databases; Topical Data Collections; Geochemical Resource Library]

GeoChemical Data Model
[Diagram. Labels: observed value; publication; data source; method/DQ; sample; feature of interest; collection, geospatial; analysis; material; preparation, obs. point]
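The entities named on this slide can be sketched as a small set of linked record types. This is only an illustrative reading of the diagram labels, not the actual GCDM schema: every class name, field name, and relationship below is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    """A sample taken from some feature of interest (hypothetical fields)."""
    name: str
    material: str
    feature_of_interest: Optional[str] = None


@dataclass
class Method:
    """Analytical method and data-quality (DQ) information (hypothetical fields)."""
    technique: str
    laboratory: str
    precision_percent: Optional[float] = None


@dataclass
class ObservedValue:
    """One measured value, tied to its sample, method, and data source."""
    parameter: str
    value: float
    unit: str
    sample: Sample
    method: Method
    data_source: str  # e.g. a publication or dataset citation


def values_for_sample(values: List[ObservedValue], sample_name: str) -> List[ObservedValue]:
    """Collect every observed value reported for one sample, across all sources."""
    return [v for v in values if v.sample.name == sample_name]
```

Keeping the sample, method, and data source as separate records is what lets values from different labs and papers be gathered for the same sample.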

Metadata
- Geospatial
  - Geographical coordinates
  - Geographical names
- Collection
  - Sampling technique
  - Field program
- Description & Age
  - Classification
  - Texture
  - Alteration
  - Age
- Data Quality
  - Technique
  - Instrument
  - Laboratory
  - Precision
  - Reference material measurements
  - Correction procedures


Standards for Data Access & Integration
- WMS, WFS
  - For visualization tools
- OAI-PMH
  - For joint data inventories
- EarthChemML
  - For integration across geochemical data systems
  - For interoperability with other systems
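As a rough illustration of how a joint inventory might harvest records over OAI-PMH, the sketch below builds a `ListRecords` request URL. The verb and the `metadataPrefix`, `set`, and `from` arguments are part of the OAI-PMH protocol; the base URL and the function name are placeholders invented for this example, not an actual IEDA or EarthChem endpoint.

```python
from typing import Optional
from urllib.parse import urlencode


def oai_list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    set_spec: Optional[str] = None,
    from_date: Optional[str] = None,
) -> str:
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec        # restrict harvesting to one set
    if from_date:
        params["from"] = from_date      # selective harvesting by datestamp
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint, used only to show the shape of the request.
print(oai_list_records_url("https://example.org/oai", from_date="2011-01-01"))
```

A harvester would fetch this URL, parse the returned XML, and follow any `resumptionToken` until the list is complete.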


IEDA System-wide Inventory
[Diagram. Labels: Inventory (expedition metadata, reference metadata, dataset metadata, geospatial metadata); RSS feed; MGDS; SESAR; EarthChem; GRL; Geochem DBs; object registration; object metadata; chemical data; cruise info; DOI registration]

EarthChem Portal
[Diagram. Labels: PetDB; USGS; GEOROC; NAVDAT; others; XML; EarthChem Data Engine database; EarthChem Data Engine search & visualization]
Partner databases encode their data & metadata in XML and send them to the EarthChem Portal database in Kansas. Queries submitted at the EarthChem Portal search the contents of the EarthChem Portal database.


Access Levels

EarthChemML

EarthChem Repository: user submission
- need tools that are easy to use and support the data flow from lab to publication
- ideally, represent ‘pipelines’ for data capture early in the data acquisition process
- tools need to include data validation and DQC procedures
- offer citable data publication
- need data policies

IEDA data publication service

STD-DOIs
- The STD-DOI metadata are mainly Dublin Core elements, plus data-specific elements.
- The metadata are transmitted to the National Library via web service (HTTP/SOAP) and incorporated into the library catalogue.
- The metadata may contain references to other objects (DOI, IGSN, ...):
  - Element
  - isCited, isParent, isChild, isDuplicate, …

STD-DOIs
- The element can be used to point to other electronic objects:
  - Point to the literature where the data set is interpreted.
  - Point to samples from which the data were derived.
  - Point to other datasets that belong to the same collection of datasets.
- These links can be used by machines (e.g. data portals) to make search suggestions and thus aid discovery of data, literature and samples, or to support other value-added services.
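A portal could use such links roughly as follows. This is a minimal sketch under stated assumptions: the relation types (isCited, isChild, ...) come from the slide, but the `Relation` record, its field names, and the DOI shown are hypothetical and do not reproduce the STD-DOI schema.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Relation:
    """One link from a dataset to another object (hypothetical structure)."""
    relation_type: str  # e.g. "isCited", "isParent", "isChild", "isDuplicate"
    scheme: str         # identifier scheme of the target, e.g. "DOI" or "IGSN"
    identifier: str


def search_suggestions(relations: List[Relation]) -> Dict[str, List[str]]:
    """Group linked identifiers by scheme so a portal can offer them as suggestions."""
    grouped: Dict[str, List[str]] = {}
    for rel in relations:
        grouped.setdefault(rel.scheme, []).append(rel.identifier)
    return grouped


# Placeholder identifiers for illustration only.
links = [
    Relation("isCited", "DOI", "10.0000/example-paper"),
    Relation("isChild", "IGSN", "XXX0065B3"),
]
print(search_suggestions(links))
```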

STD-DOI System Architecture

Data DOIs

Information Discovery
[Figure labels: link to publication; citation of data; IGSN points to sample]

The International GeoSample Number

Ambiguous Sample Naming
Examples from the PetDB Database
- Sample names are duplicated.
- Sample names are modified or changed.

- Provides & manages unique identifiers for samples
- IGSN: International Geo Sample Number
- Assigned upon registration of sample metadata
- Catalogs & archives sample metadata
- Access to sample metadata via web site & web services
- Long-term preservation of metadata
- Link to sample archives
- Facilitates links to data
- IGSN will be incorporated into persistent resolvable GUIDs

IGSN:SIO8JH3M4
International GeoSample Number: A Global Unique Identifier for Earth Samples
- Strict syntax (9 characters, alphanumeric)
- First three characters are a unique user code (registered with SESAR)
- Last 6 characters are random numbers + letters
- Allows 2,176,782,336 sample identifiers per registrant
- Does not replace personal or institutional names.
- Applied to samples & sub-samples
  - system tracks relations
[Figure label: Name space]
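The syntax rules on this slide translate directly into a small check, and the quoted capacity follows from 36 possible characters (26 letters plus 10 digits) in each of the 6 random positions: 36^6 = 2,176,782,336. The sketch below is an illustration, not SESAR's validation code; it assumes uppercase letters and digits are allowed in every position and that the `IGSN:` prefix is optional, neither of which the slide states.

```python
import re

# Assumption: all nine characters may be uppercase letters or digits.
IGSN_PATTERN = re.compile(r"[A-Z0-9]{9}")

# 36 symbols (A-Z, 0-9) in each of the last 6 positions.
IDENTIFIERS_PER_REGISTRANT = 36 ** 6


def parse_igsn(igsn: str) -> tuple:
    """Split an IGSN into (user_code, random_part); raise ValueError if malformed."""
    text = igsn.strip().upper()
    if text.startswith("IGSN:"):
        text = text[len("IGSN:"):]
    if not IGSN_PATTERN.fullmatch(text):
        raise ValueError(f"not a 9-character alphanumeric IGSN: {igsn!r}")
    return text[:3], text[3:]


print(parse_igsn("IGSN:SIO8JH3M4"))   # ('SIO', '8JH3M4')
print(IDENTIFIERS_PER_REGISTRANT)     # 2176782336
```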

Geoinformatics for Geochemistry
[Diagram: parent-child sample hierarchy. Labels: Core; Core Section 1, Core Section 2, Core Section 3; Sample 1, Sample 2, Sample 3 under each section; sub-samples (rock powder, mineral conc., leachate, fossil separate, microprobe mount); Parent; Child. Example identifiers: IGSN:XXX, IGSN:XXX0065B3, IGSN:XXX9K23G6, IGSN:XXX07ST4K, IGSN:XYZ0G693M, IGSN:ABC0L98SW, IGSN:ABC0L53NW, IGSN:ABC0L653X, IGSN:ABC078HGB]
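Tracking parent-child relations between registered samples can be sketched with a simple registry that stores each identifier's parent. This is an illustrative data structure only, not the SESAR implementation; the class and method names are hypothetical, and which identifier from the diagram belongs to which core, section, or sample is assumed for the example.

```python
from typing import Dict, List, Optional


class SampleRegistry:
    """Toy registry mapping each IGSN to its parent IGSN (None for top-level objects)."""

    def __init__(self) -> None:
        self._parent: Dict[str, Optional[str]] = {}

    def register(self, igsn: str, parent: Optional[str] = None) -> None:
        """Register a sample; a parent, if given, must already be registered."""
        if parent is not None and parent not in self._parent:
            raise KeyError(f"unknown parent IGSN: {parent}")
        self._parent[igsn] = parent

    def lineage(self, igsn: str) -> List[str]:
        """Return the chain from a sample up to its top-level ancestor."""
        chain = []
        current: Optional[str] = igsn
        while current is not None:
            chain.append(current)
            current = self._parent[current]
        return chain

    def children(self, igsn: str) -> List[str]:
        """Return the direct sub-samples of a sample, sorted by identifier."""
        return sorted(c for c, p in self._parent.items() if p == igsn)


registry = SampleRegistry()
registry.register("XXX0065B3")                      # assumed: a core
registry.register("XXX9K23G6", parent="XXX0065B3")  # assumed: a core section
registry.register("XXX07ST4K", parent="XXX9K23G6")  # assumed: a sample from that section
print(registry.lineage("XXX07ST4K"))
```

Because sub-samples may be registered under a different user code than their parent, the link is stored explicitly rather than inferred from the identifier.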

Sample Types
- “Sampling events” such as holes, cores, dredges, stratigraphic sections
- “Individual samples”: specimens rocks, minerals, fossils, fluid samples, precipitates, synthetic material, etc.
- “Sub-samples” of any of the above: processed samples such as mineral or fossil separates, leachates, thin sections, etc.

Sample Registration
[Diagram. Labels: spreadsheet forms for batch loading; interoperability (web services); SESAR web site]

Implementation Challenges
- Diversity of users
  - Large sampling campaigns (IODP, ICDP, ECS)
  - Repositories
  - Data systems
  - Individual investigators
- Diversity of sample types
- Integration into existing policies, procedures, data systems
- International scope
- Connectivity in the field

Solutions
- Schema improvements
- Web-service based registration from client data systems
- Distributed system of registration nodes (Trusted Agents)
- Handle service for IGSNs (persistent, resolvable)
- Tools to facilitate registration
  - iSESAR (registration via iPhone)
  - eCollections (personal sample management)
  - webCollections (hosting services for repositories)
- IGSN International Consortium