Published by Edwina Neal. Modified over 9 years ago.
An Example in the DCO Data Portal

Formal Specification of Data Types in the Deep Carbon Observatory Data Portal

Xiaogang (Marshall) Ma (max7@rpi.edu), John Erickson, Patrick West, Stephan Zednik, Peter Fox, Han Wang, Yu Chen
Tetherless World Constellation, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY, USA
(Image credit: Ainsley Seago, PLoS Biology)

Background
Data types are often treated only as the syntax of variables, such as integer, float, boolean, character, and string. Such declarations give the data types no domain-specific meaning. Our intention is to let a data type carry more meaning, such as who created the data type, the source standard that the data type derives from, the operations that can be performed on datasets of that data type, and the typical scientific domains, software programs and/or instruments that use the data type. Initial results have already been achieved in the Deep Carbon Observatory (DCO) Data Portal (http://info.deepcarbon.net).

Nature of Efforts
A registered DCO dataset is asserted as an instance of a BASIC DATA TYPE, such as Dataset, Image, Video, or Audio. A registered dataset can be further annotated with the SPECIFIC DATA TYPES defined by DCO community members.

Our Aim
Any human or machine encountering a data type can quickly understand, or at least be in a position to process, the details within a dataset without even downloading it.

Initial Results
Updates to the DCO Ontology:
A new class dco:DataType.
Each specific data type is an instance of it.
An object property dco:hasDataType linking a dataset and a data type.
A collection of other classes and properties associated with dco:DataType.

Each registered object, such as a dataset or a data type, has a unique identifier called a DCO ID, which is similar to the DOI for a journal paper.
(Image credits: deepcarbon.net and X. Ma)

Geospatial/geotemporal: country, latitude, longitude, elevation
Geologic context: rock types/mineralogy, age, structure/tectonic, depth
Field Geochemical: P, T, fluid comp. (inorganic, organic), pH, Eh, EC, biomarkers, gases, isotopes, sampling protocols, sample storage, sample archiving and tracking, time series results
Analytical: measurement type, sample preparation, instrument type, instrument conditions, accuracy, precision, error propagation
Bench Geochemical: P, T, fluid comp. (inorganic, organic), pH, Eh, EC, biomarkers, gases, sampling protocols, sample storage, sample archiving, isotopes
Biochemical: microbial inventory, DNA sequencing [data links to DL], substrates
Monitoring: time series, sensor data recovery, resolution (signal/noise) – link to R&F
Modeling: empirical, canned codes (e.g. EQ3/EQ6; Chiller, GWB), MD
Kinetics: dynamics of chemical deep carbon processes; field-based versus laboratory-based
Thermodynamics: equation of state of carbon-bearing systems; link to robust data sets identified in EPC
Surface and interface science, catalysis: solid-fluid interactions under extreme conditions
…

Future Work
More use case analyses relevant to data types in the DCO community
Refine the schema for the annotation and provenance of specific data types
A faceted ‘data type browser’ on the DCO Data Portal
Interoperability between DCO specific data types and data types registered in other communities

WHY Should You Care?
Data types make aspects of data more visible
Data types group data sets with similar characteristics
Data types will help you find data sets matching your needs
Data types enable machines to find tools and algorithms for specific datasets
More features in an ‘inter-linked world’…
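
The ontology updates listed under Initial Results can be illustrated with a small sketch. It is not the DCO Data Portal's implementation: it models statements as plain (subject, predicate, object) tuples in a Python set rather than in a triple store. Only the names dco:DataType and dco:hasDataType come from the poster. The ex: identifiers, the creator and source-standard properties, and all function names are hypothetical, chosen for illustration, and the basic types are written without a prefix because the poster does not give one.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
DATA_TYPE_CLASS = "dco:DataType"   # class named on the poster
HAS_DATA_TYPE = "dco:hasDataType"  # object property named on the poster

# Hypothetical annotation properties, invented for this sketch only.
CREATOR = "ex:creator"
SOURCE_STANDARD = "ex:sourceStandard"


def register_data_type(graph, type_id, creator, source_standard):
    """Assert a specific data type as an instance of dco:DataType."""
    graph.add((type_id, RDF_TYPE, DATA_TYPE_CLASS))
    graph.add((type_id, CREATOR, creator))
    graph.add((type_id, SOURCE_STANDARD, source_standard))


def annotate_dataset(graph, dataset_id, basic_type, specific_type=None):
    """Assert a dataset's basic type and, optionally, a specific data type."""
    graph.add((dataset_id, RDF_TYPE, basic_type))
    if specific_type is not None:
        # Only data types that were registered first may be linked.
        if (specific_type, RDF_TYPE, DATA_TYPE_CLASS) not in graph:
            raise ValueError(f"{specific_type} is not a registered data type")
        graph.add((dataset_id, HAS_DATA_TYPE, specific_type))


def datasets_by_data_type(graph):
    """Group dataset identifiers by specific data type (a single facet)."""
    facets = defaultdict(list)
    for subject, predicate, obj in graph:
        if predicate == HAS_DATA_TYPE:
            facets[obj].append(subject)
    return {data_type: sorted(ids) for data_type, ids in facets.items()}


# All identifiers below are placeholders, not real DCO IDs.
graph = set()
register_data_type(graph, "ex:type-A", "ex:person-1", "ex:standard-1")
annotate_dataset(graph, "ex:dataset-1", "Dataset", "ex:type-A")
annotate_dataset(graph, "ex:dataset-2", "Image", "ex:type-A")
annotate_dataset(graph, "ex:dataset-3", "Video")  # basic type only

facets = datasets_by_data_type(graph)
print(facets)
```

Grouping datasets by the object of dco:hasDataType is one simple way a faceted data type browser, as listed under future work, might select datasets; a production portal would more plausibly run such a query against its triple store.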