Data and meta-data exploration and data quality reporting for GCOS
Jared Lewis (Bodeker Scientific, New Zealand)

Meta-data is often overshadowed by its big brothers, the data and its uncertainty.
Meta-data is extremely important. Other presenters have alluded to this, but the focus is typically on geographic meta-data (visualising and subsetting datasets).
It is also important for future reprocessing of datasets.
Today I want to discuss the use of meta-data for engaging data users and evaluating data quality.
This is a case study on meta-data exploration of the GRUAN data.
GRUAN
GCOS Reference Upper Air Network
Continuous long-term records of high-quality reference data
Best possible uncertainties

[Figure: reference data quality. Labels: Best estimate + Uncertainty; GRUAN Measurement; Uncertainty of input data; Traceable sensor calibration; Transparent processing algorithm; Disregarded systematic effects; Black box software; Proprietary methods]

GRUAN's goal is to provide continuous long-term records of high-quality reference data. Reference data in this respect means that the data are free of instrumental effects, that all known systematic biases have been corrected for, and that best-possible uncertainty estimates are provided. For example, in the case of a long-term data record that is a composite of various measurement systems (e.g. radiosondes), you don't have to worry about homogenisation. In other words: as a climate researcher doing analyses, you can simply use GRUAN data as is.
The concept of reference data quality is illustrated by this picture:
Black box software: you don't know what has been done to your data.
Disregarded systematic effects: your data is wrong!
Proprietary methods: when improved or new corrections become available in the future, you can't reprocess your data.
The problem
Extracting value from meta-data is difficult
Large number of variables
Large amounts of data
Need to know the question you want to answer

Discuss GRUAN and how its meta-data is underutilised.
Similar problem
Websites create large volumes of logs containing meta-data
Meta-data provides information about customers
Large incentive to extract value for decision making ($$$)
Sparked the "Big Data" revolution
New industry using meta-data
New tools

Need to make sure that the audience knows that the web site
Visualising Meta-data
Log files
Adapting existing technology to solve our problem
Visualising Meta-data
Flight Data
Adapting existing technology to solve our problem
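
As a rough illustration of what adapting a log-analysis tool to flight meta-data could involve, the sketch below converts flight records into the newline-delimited JSON that the Elasticsearch bulk API accepts, which is the store Kibana (named in the Outlook) reads from. The slides do not say how the GRUAN meta-data was actually loaded, so this is an assumption; the index name `gruan-flights`, the field names and the values are all hypothetical.

```python
import json


def to_bulk_lines(flights, index_name="gruan-flights"):
    """Convert flight meta-data records into Elasticsearch bulk-API text.

    `index_name` is a hypothetical index, not one taken from the talk.
    """
    lines = []
    for flight in flights:
        # Action line: names the index that should receive the next document.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the flight meta-data itself, one JSON object per line.
        lines.append(json.dumps(flight, sort_keys=True))
    # The bulk format is newline-delimited and every line ends with "\n".
    return "".join(line + "\n" for line in lines)


# Hypothetical records: field names and values are illustrative only.
flights = [
    {"site": "SITE_A", "launch_time": "2014-01-01T12:00:00Z",
     "sonde_type": "TYPE_X", "burst_altitude_m": 31000},
    {"site": "SITE_B", "launch_time": "2014-01-01T12:05:00Z",
     "sonde_type": "TYPE_Y", "burst_altitude_m": 27500},
]

payload = to_bulk_lines(flights)
print(payload)
```

The resulting text could be posted to an Elasticsearch `_bulk` endpoint, after which each meta-data field becomes something a dashboard can filter and aggregate on, in the same way as fields parsed from web logs.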
Potential Benefits
Empowers users to answer their own questions
Identify where the data may be falling short of satisfying users' needs
Provide quantitative feedback for sites
Benchmark existing sites
Identify problems early
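
To make "quantitative feedback" and "benchmark existing sites" concrete, here is a minimal sketch of a per-site summary computed from flight meta-data. The metric (fraction of flights reaching a target burst altitude), the 30 km threshold and the field names are assumptions chosen for illustration; they are not metrics defined by GRUAN or shown in the talk.

```python
from collections import defaultdict


def site_summary(flights, min_burst_altitude_m=30000):
    """Per-site flight count and fraction of flights reaching a target altitude.

    The threshold and the record fields are hypothetical.
    """
    counts = defaultdict(lambda: {"flights": 0, "reached_target": 0})
    for flight in flights:
        entry = counts[flight["site"]]
        entry["flights"] += 1
        if flight["burst_altitude_m"] >= min_burst_altitude_m:
            entry["reached_target"] += 1
    # Turn raw counts into a fraction so sites of different size are comparable.
    return {
        site: {
            "flights": c["flights"],
            "fraction_reached": c["reached_target"] / c["flights"],
        }
        for site, c in counts.items()
    }


# Hypothetical records: values are illustrative only.
flights = [
    {"site": "SITE_A", "burst_altitude_m": 31000},
    {"site": "SITE_A", "burst_altitude_m": 25000},
    {"site": "SITE_B", "burst_altitude_m": 32000},
]

summary = site_summary(flights)
for site, stats in sorted(summary.items()):
    print(site, stats["flights"], stats["fraction_reached"])
```

A dashboard tool would compute this kind of aggregate interactively; the point of the sketch is only that a site-level benchmark reduces to grouping flight meta-data by site and comparing a shared metric.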
Outlook
Kibana is being actively developed to be more user-friendly
Help the GRUAN working group to capture useful metrics
Create site-specific dashboards and metrics
Need to find other datasets which could use these tools
Key message: Promoting the ability of data users to explore the meta-data may increase their use of the datasets