More Data Management and Services


1 More Data Management and Services
Spring Training, 8 Sep 2017, Canberra
Jingbo Wang

2 Turning barriers into drivers
driver: large scale Source:

3 FAIR principles
F: Findable. A: Accessible. I: Interoperable. R: Reusable.

5 Access Data through NCI’s data services
Open Geospatial Consortium (OGC) services: Web mapping services (WMS), Web coverage services (WCS), Web feature services (WFS), Web processing services (WPS), Web coverage-processing services (WCPS).
Other NCI data services: THREDDS data subsetting; GSKY; ESGF (services used for international Climate Model Intercomparison Projects (MIPS)); ASVO; ERDDAP, GeoServer, Rasdaman.
The THREDDS Data Server (TDS) has become the most broadly used general-purpose server because of the range of data services and protocols it supports.
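
As an illustration of what a request to one of these OGC services looks like, here is a minimal sketch that assembles a WMS 1.3.0 GetMap URL. The endpoint `https://example.org/ows` and the layer name `example_layer` are placeholders invented for this example, not NCI addresses; the query parameters are the standard GetMap keys. Note that WMS 1.3.0 with EPSG:4326 expects the bounding box in latitude, longitude order.

```python
from urllib.parse import urlencode


def build_getmap_url(endpoint, layer, bbox, width=512, height=512,
                     crs="EPSG:4326", image_format="image/png"):
    """Return a WMS 1.3.0 GetMap URL for one layer and one bounding box.

    bbox is (min_lat, min_lon, max_lat, max_lon) when crs is EPSG:4326.
    """
    params = {
        "service": "WMS",
        "version": "1.3.0",
        "request": "GetMap",
        "layers": layer,
        "styles": "",          # empty value asks for the server's default style
        "crs": crs,
        "bbox": ",".join(str(v) for v in bbox),
        "width": width,
        "height": height,
        "format": image_format,
    }
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer; the box roughly covers Australia.
url = build_getmap_url("https://example.org/ows", "example_layer",
                       (-44.0, 112.0, -10.0, 154.0))
print(url)
```

The same pattern applies to the other OGC services: only the `service`, `request`, and service-specific parameters change.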

6 Scalable and Distributed Data Access - GSKY
Features: Distributed, Scalable, Concurrent, Multi Cloud.
Figure: the user’s browser or WMS client sends an OGC request (WMS, WCS, or WPS request) as input to GSKY, which returns the OGC output. (Courtesy of Pablo Rozas Larraondo and Joseph Antony.)
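
One way a client might take advantage of a concurrent OGC server is to split a large bounding box into tiles and issue one request per tile in parallel. The sketch below shows only that client-side pattern; it does not describe how GSKY works internally. `fetch_tile` is a hypothetical stand-in that performs no network access.

```python
from concurrent.futures import ThreadPoolExecutor


def split_bbox(bbox, nx, ny):
    """Split (min_x, min_y, max_x, max_y) into nx * ny equal tiles, row by row."""
    min_x, min_y, max_x, max_y = bbox
    dx = (max_x - min_x) / nx
    dy = (max_y - min_y) / ny
    return [
        (min_x + i * dx, min_y + j * dy,
         min_x + (i + 1) * dx, min_y + (j + 1) * dy)
        for j in range(ny)
        for i in range(nx)
    ]


def fetch_tile(tile):
    """Hypothetical placeholder for one WMS/WCS request; it only echoes the tile."""
    return {"bbox": tile, "status": "not fetched (sketch)"}


tiles = split_bbox((112.0, -44.0, 154.0, -10.0), nx=2, ny=2)

# pool.map preserves input order, so results line up with tiles.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fetch_tile, tiles))

for result in results:
    print(result["bbox"])
```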

7 Provenance implementation to Access versioned Data
Figure: a chain from raw data to the final version, with Metadata and Data at each step. A publication carries a data extract reference:
URI points to data extract
URI points to an earlier version x of data extract
URI points to an even earlier version 1 of data extract
URI points to the original source used to generate a series of data extracts
Reference: Wang et al. Enabling dynamic access to dynamic petascale Earth Systems and Environmental data collections is easy: citing and reproducing the actual data extracts used in research publications is NOT. American Geophysical Union Fall meeting.
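
The chain of URIs on this slide can be modelled as records that each point to the version they were derived from. The following is a minimal sketch of that idea, not NCI's implementation: the `ExtractRecord` type, the `derived_from` field, and the `example.org` URIs are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ExtractRecord:
    uri: str
    version: str
    derived_from: Optional[str]  # URI of the earlier version; None for the original source


def provenance_chain(records: Dict[str, ExtractRecord], start_uri: str) -> List[str]:
    """Follow derived_from links from a cited extract back to the original source."""
    chain: List[str] = []
    seen = set()
    uri: Optional[str] = start_uri
    while uri is not None:
        if uri in seen:
            raise ValueError(f"provenance cycle detected at {uri}")
        seen.add(uri)
        chain.append(uri)
        uri = records[uri].derived_from
    return chain


# Hypothetical URIs standing in for the slide's four levels.
records = {
    r.uri: r
    for r in [
        ExtractRecord("https://example.org/source", "original", None),
        ExtractRecord("https://example.org/extract/v1", "1", "https://example.org/source"),
        ExtractRecord("https://example.org/extract/vx", "x", "https://example.org/extract/v1"),
        ExtractRecord("https://example.org/extract/cited", "cited", "https://example.org/extract/vx"),
    ]
}

chain = provenance_chain(records, "https://example.org/extract/cited")
print(" -> ".join(chain))
```

Starting from the URI cited in a publication, the walk ends at the original source, which is what makes the cited extract reproducible.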

8 Access quality Data
Data Quality Control (QC) and Quality Assurance (QA) reports, along with benchmarking use cases, should be available to the community. When users access the data, they can also access the data quality report.
NCI Data Quality Strategy
Reference: Evans et al. (invited). A data quality strategy for programmatic access to large collections of diverse datasets on an integrated high performance platform. MDPI Informatics.

9 Research Graph
Researcher A: uses Data 1; supported by Grant a; generates Paper I, II.
Researcher B: uses Data 1; supported by Grant b; generates Paper II, III.
Researcher B: uses Data 2; supported by Grant b; generates Paper IV.
© National Computational Infrastructure 2016

10 Research Graph (figure)
The same relationships drawn as a graph. Nodes: Researcher A, Researcher B, Grant a, Grant b, Data 1, Data 2, Paper I, Paper II, Paper III, Paper IV. Edge labels: use, Supported by, Generate.
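
The researcher, data, grant, and paper relationships on these two slides can be written down as labelled edges. Below is a minimal sketch, assuming one plausible reading of the slide labels; the relation names `uses`, `supported_by`, and `generates` and the helper `sources_of` are invented for this example and are not taken from the Research Graph schema.

```python
# Edges read from the slide: (source node, relation, target node).
EDGES = [
    ("Researcher A", "uses", "Data 1"),
    ("Researcher A", "supported_by", "Grant a"),
    ("Researcher A", "generates", "Paper I"),
    ("Researcher A", "generates", "Paper II"),
    ("Researcher B", "uses", "Data 1"),
    ("Researcher B", "uses", "Data 2"),
    ("Researcher B", "supported_by", "Grant b"),
    ("Researcher B", "generates", "Paper II"),
    ("Researcher B", "generates", "Paper III"),
    ("Researcher B", "generates", "Paper IV"),
]


def sources_of(edges, relation, target):
    """Return the sorted nodes that have the given relation to target."""
    return sorted(src for src, rel, dst in edges if rel == relation and dst == target)


data1_users = sources_of(EDGES, "uses", "Data 1")
paper2_authors = sources_of(EDGES, "generates", "Paper II")

print("Data 1 is used by:", data1_users)
print("Paper II was generated by:", paper2_authors)
```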

11 More information: http://researchgraph.org/schema

12 IDMM: Immunome Database for Marsupials and Monotremes
Dr Emily Wong
Prof. Katherine Belov: http://sydney.edu.au/vetscience/about/staff/profiles/kathy.belov
Live Demo:

13 Example questions that can be answered by querying this graph database
What datasets are connected to each other?
A group is downloading the climate re-analysis data, and another group is doing the same. They don’t know each other. Possible collaboration?
Any conflict of interest?
Figure: example graph linking researcher 1, researcher 2, paper 1, paper 2, a dataset node, and data2, data3, data4, data5.
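
The "possible collaboration" question can be phrased as a graph query: find pairs of researchers who use the same dataset but share no paper. Here is a minimal in-memory sketch under that assumption; the sample data and the function name `possible_collaborations` are hypothetical, and a real graph database would express this in its own query language.

```python
from itertools import combinations

# Hypothetical sample data: which datasets each researcher uses,
# and which papers each researcher has contributed to.
uses = {
    "researcher 1": {"climate re-analysis"},
    "researcher 2": {"climate re-analysis"},
    "researcher 3": {"data2"},
}
papers = {
    "researcher 1": {"paper 1"},
    "researcher 2": {"paper 2"},
    "researcher 3": {"paper 2"},
}


def possible_collaborations(uses, papers):
    """Return (a, b, shared datasets) for pairs that share data but no paper."""
    pairs = []
    for a, b in combinations(sorted(uses), 2):
        shared_data = uses[a] & uses[b]
        shared_papers = papers.get(a, set()) & papers.get(b, set())
        if shared_data and not shared_papers:
            pairs.append((a, b, sorted(shared_data)))
    return pairs


candidates = possible_collaborations(uses, papers)
for a, b, datasets in candidates:
    print(f"{a} and {b} both use {datasets} but have no paper in common")
```

A conflict-of-interest check could invert the condition, flagging pairs that already share a paper or a grant.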
