Spatial Data Quality Workshop. GeoViQua project
Dan Cornford, Lorenzo Bigagli, Jon Blower, Victoria Luch, Maud van der Broek, Simon Thum and Joan Masó
Splinter session SPM1.34, Room: Y7 (46), Tuesday 8th April, 19:00–20:00
The aim
GeoViQua will provide a set of scientifically developed software components and services that facilitate the creation, search and visualization of quality information on EO data, integrated and validated in the GEOSS Common Infrastructure.
[Diagram labels: Pilot case studies; CROSS SBA; Community building; GEO S&T Label]
Abstract
GeoViQua is a European FP7 project that contributes to the Global Earth Observation System of Systems (GEOSS) by adding rigorous data quality representations to the existing search and visualization functionalities of the GEO Portal. The open workshop will review and present these developments:
–The GeoViQua quality framework, which enhances producer metadata and proposes the addition of user feedback. The producer model builds on existing ISO standards (19115 and 19157), adding reference dataset information, citations, traceability of quality statements and discovered issues. The user model informs the database structure of a feedback server where comments, citations, discovered issues, ratings and reports of usage can be stored and retrieved.
–A quality-aware discovery service, namely a quality-aware extension of the OGC Catalog Service for the Web (CSW-Q), which can handle quality-constrained search. It will be included in the GEOSS Discovery and Access Broker.
–A standards-based approach to visualizing quality and uncertainty information in 2D, developed using the OGC Web Map Service (WMS). This extension reuses UncertML concepts in the ncWMS server implementation.
–A GEO label: a graphical representation of a dataset in the GEOSS (or other data portals and clearinghouses) based on the quality information that is available for it.
–A user feedback catalogue where users can enter comments, citations, discovered issues, ratings and reports of usage. This information can then be retrieved by the Discovery and Access Broker.
–Some enhancements in metadata presentation, such as side-by-side metadata comparison, rubric-based assessment of metadata completeness and provenance visualization.
Agenda
19:00 Introduction to the GeoViQua quality framework –Dan Cornford
19:10 Quality-aware GEOSS Discovery and Access Broker –Lorenzo Bigagli
19:20 Quality in OGC Web Map Service WMS-Q –Jon Blower
19:30 GEO label –Victoria Luch
19:40 A user feedback catalogue –Maud van der Broek and Simon Thum
19:50 Enhancements in metadata presentation –Joan Masó
Introduction to the GeoViQua quality framework
Dan Cornford, 19:00
Quality models
From the requirements process and user interviews
Producer quality data model
GeoViQua data model
Statistical uncertainties: UncertML
[Annotated example; the encoded fragment itself is not preserved]
–3.6 m: value of the vertical DEM accuracy
–Explicit recognition that the errors acceptably fit a Normal distribution
–Mean 1.2: an overall positive bias was observed (a difficult feature to convey by traditional means)
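To make the annotated example concrete, the sketch below treats the DEM error as a Normal distribution. It assumes, beyond what the slide states, that 3.6 m is the standard deviation and that the mean of 1.2 is also in metres. The variable names are illustrative.

```python
from statistics import NormalDist

# Assumption: 3.6 m is the standard deviation of the vertical DEM error,
# and the mean error (bias) of 1.2 is expressed in metres as well.
dem_error = NormalDist(mu=1.2, sigma=3.6)

# Probability that an individual error is positive.
p_positive = 1.0 - dem_error.cdf(0.0)

# Central 95% interval of the error; it is centred on the bias, not on zero.
lower = dem_error.inv_cdf(0.025)
upper = dem_error.inv_cdf(0.975)

print(f"P(error > 0) = {p_positive:.3f}")
print(f"95% interval = [{lower:.2f} m, {upper:.2f} m]")
```

A single accuracy figure cannot express this asymmetry around zero, which is why encoding the full distribution is useful.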
The need for a measure dictionary
Current quality measure names in the GCI:
Measure name                                      Count
Absolute external positional accuracy             2
Anweisung Straßeninformationsbank (Bundes…        1
Codelist omission                                 2
completeness                                      198
Feature represented as a single object            2
horizontal                                        3146
Horizontal Positional Accuracy                    3265
Lagegenauigkeit                                   3
Latitude Resolution                               3437
Longitude Resolution                              3350
Mean value of positional uncertainties (2D)       3
Overlapping polygon                               2
Quantitative Attribute Accuracy Assessment        255
Rate of missing items                             87
Sach- und Geodatenüberprüfung                     7
Temporal Resolution                               2870
Überprüfung der Toplogie                          2
Valid code Test                                   2
Vertical Positional Accuracy                      1826
Vertical Resolution                               812
vertikal                                          348
Vollständigkeit                                   4
–Nothing to do with the ISO 19138 list of possible measures
–Not well defined
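One way such a measure dictionary could work is sketched below: raw names found in catalogue records are folded onto a shared label so that their counts can be compared. The mapping and the canonical labels are hypothetical placeholders, not identifiers from ISO 19138, and treating "horizontal" and "vertikal" as positional accuracy measures is an assumption.

```python
# Hypothetical dictionary: maps measure names as found in catalogue records
# to one illustrative canonical label. The labels are placeholders, not
# identifiers taken from ISO 19138.
MEASURE_DICTIONARY = {
    "horizontal positional accuracy": "horizontal positional accuracy",
    "horizontal": "horizontal positional accuracy",
    "vertical positional accuracy": "vertical positional accuracy",
    "vertikal": "vertical positional accuracy",
    "lagegenauigkeit": "positional accuracy",
    "completeness": "completeness",
    "vollständigkeit": "completeness",
}


def canonical_measure(name):
    """Return the canonical label for a raw measure name, or None if unknown."""
    return MEASURE_DICTIONARY.get(name.strip().casefold())


def merge_counts(raw_counts):
    """Sum the counts of raw names that share a canonical label."""
    merged = {}
    for name, count in raw_counts.items():
        label = canonical_measure(name) or "unmapped"
        merged[label] = merged.get(label, 0) + count
    return merged


# A few of the names and counts listed on the slide.
counts = merge_counts({
    "completeness": 198,
    "Vollständigkeit": 4,
    "vertikal": 348,
    "Vertical Positional Accuracy": 1826,
    "Overlapping polygon": 2,
})
print(counts)
```

Names that the dictionary does not cover are kept under "unmapped" rather than dropped, so gaps in the dictionary stay visible.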
Consumer quality data model
Explained later
Quality-aware GEOSS Discovery and Access Broker
Lorenzo Bigagli, 19:10
URLs of interest
GeoViQua DAB, Capabilities Document: http://…/gvq-demo/services/cswiso?service=CSW&REQUEST=GetCapabilities&Version=2.0.0
GeoViQua test portal
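The Capabilities request above can be assembled programmatically. The sketch below only builds the URL and makes no network call. The host `example.org` is a hypothetical stand-in, since the real server address is not preserved on the slide.

```python
from urllib.parse import urlencode

# Hypothetical host: the real server address is not preserved in the slide.
BASE_URL = "http://example.org/gvq-demo/services/cswiso"

# Query parameters exactly as they appear on the slide.
params = {"service": "CSW", "REQUEST": "GetCapabilities", "Version": "2.0.0"}

capabilities_url = f"{BASE_URL}?{urlencode(params)}"
print(capabilities_url)
```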
Demo portal
Quality in OGC Web Map Service WMS-Q
Jon Blower, 19:20
Scope and aims
Our aim was to develop a specification and prototype for a “quality-enabled” Web Map Service (“WMS-Q”).
“Quality” means different things:
–Completeness, consistency, accuracy, lineage …
We focused on two main aspects of data quality:
–Visualizing thematic accuracy, expressed as uncertainties
–Linking to further information recorded in metadata documents
We considered quality information at various levels:
–Dataset, variable and sample level
We aimed to avoid extending WMS 1.3.0, restricting ourselves to specializations of the spec.
GeoViQua project; Yang et al., 2012 (doi: /rsta)
Background on sample-level quality: UncertML and NetCDF-U
We consider statistical uncertainties to be the most useful measure of thematic accuracy at the variable and sample level.
UncertML provides a taxonomy of terms for quantifying and exchanging uncertainty information, considering:
–Samples (uncertainties represented by recording each individual sample from a population)
–Statistics (e.g. mean, variance, summarizing groups of samples)
–Distributions (e.g. Gaussian, Binomial, where the mathematical form of the uncertainties is understood)
NetCDF-U (OGC discussion paper) provides a means for encoding UncertML concepts in NetCDF format. (The Climate and Forecast conventions have a more basic means of encoding uncertainty.)
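The three UncertML categories can be illustrated with Python's standard library. This is a minimal sketch of the concepts only, not of the UncertML or NetCDF-U encodings, and the sample values are made up for illustration.

```python
from statistics import NormalDist, mean, variance

# 1. Samples: the uncertainty is carried by the individual realisations.
#    These values are made up for illustration.
samples = [14.8, 15.1, 15.4, 14.9, 15.3]

# 2. Statistics: summaries computed from a group of samples.
sample_mean = mean(samples)
sample_variance = variance(samples)  # unbiased (n - 1) estimator

# 3. Distributions: a parametric form, here Gaussian, built from the summaries.
gaussian = NormalDist(mu=sample_mean, sigma=sample_variance ** 0.5)

print(sample_mean, sample_variance, gaussian)
```

Each step discards detail: the statistics no longer hold the individual samples, and the distribution adds an assumption about the mathematical form.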
Semantic groupings of WMS Layers
We need a method to convey that individual Layers are related semantically:
–E.g. one Layer represents the variance of another Layer
We use Layer nesting for this, coupled with Keywords from the UncertML vocabulary.
See the fragment of the Capabilities document (simplified):
–Shows that uncertainties are normally distributed
This also applies to other kinds of semantic groupings:
–E.g. components of a velocity field
[Capabilities fragment; element markup not preserved]
Sea Surface Temperature | normal
  sst | Sea Surface Temperature Mean | normal#mean
  sst | Sea Surface Temperature Variance | normal#variance
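A client could discover such a grouping by walking the nested Layers and reading their Keywords. The sketch below assumes a reconstructed fragment: the element layout and the layer names `sst_mean` and `sst_variance` are hypothetical, and the XML namespace used by real WMS 1.3.0 documents is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified Capabilities fragment. The layer names and the
# exact element layout are assumptions; real WMS 1.3.0 documents also use
# the http://www.opengis.net/wms XML namespace, omitted here for brevity.
CAPABILITIES = """
<Layer>
  <Title>Sea Surface Temperature</Title>
  <KeywordList><Keyword>normal</Keyword></KeywordList>
  <Layer>
    <Name>sst_mean</Name>
    <Title>Sea Surface Temperature Mean</Title>
    <KeywordList><Keyword>normal#mean</Keyword></KeywordList>
  </Layer>
  <Layer>
    <Name>sst_variance</Name>
    <Title>Sea Surface Temperature Variance</Title>
    <KeywordList><Keyword>normal#variance</Keyword></KeywordList>
  </Layer>
</Layer>
"""


def uncertainty_group(parent):
    """Map each child's role (the text after '#') to its layer name."""
    distribution = parent.findtext("KeywordList/Keyword")
    roles = {}
    for child in parent.findall("Layer"):
        keyword = child.findtext("KeywordList/Keyword", default="")
        prefix, _, role = keyword.partition("#")
        # Keep only children whose keyword refers to the parent's distribution.
        if prefix == distribution and role:
            roles[role] = child.findtext("Name")
    return distribution, roles


root = ET.fromstring(CAPABILITIES)
distribution, roles = uncertainty_group(root)
print(distribution, roles)
```

With the roles resolved, a client knows which layer to request for the mean and which for the variance without relying on naming conventions.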
Styling of Layers
There are many different ways of representing uncertainties visually:
–Contours, textures, shading, transparency, bivariate colour maps…
Different methods suit different datasets and users.
WMS provides two methods:
–Named Styles: simple but inflexible
–Styled Layer Descriptors and Symbology Encoding: more flexible but still rather basic for raster data
ncWMS provides some simple extensions to WMS.
None of these meets the use cases for visualization of uncertainty.
Hence we have developed a new XML language for specifying styles for raster data:
–Named styles can map to XML definitions for backward compatibility
Sample XML style descriptor
–Mean field rendered as a colour-mapped raster
–Standard deviation field rendered as stipples, with 5 different levels
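The stippling idea can be sketched as a simple quantization of the standard deviation into five density levels. This is not the project's XML style language, whose schema is not shown on the slide. The function name, the equal-width binning and the scale range are assumptions.

```python
def stipple_level(std_dev, min_std, max_std, levels=5):
    """Return a stipple density level from 0 (sparsest) to levels - 1."""
    if max_std <= min_std:
        raise ValueError("max_std must be greater than min_std")
    # Clamp to the scale range, then split it into equal-width bins.
    clamped = min(max(std_dev, min_std), max_std)
    fraction = (clamped - min_std) / (max_std - min_std)
    return min(int(fraction * levels), levels - 1)


# Illustrative standard deviations on a 0.0 to 2.0 scale.
levels = [stipple_level(s, 0.0, 2.0) for s in (0.0, 0.5, 1.0, 1.9, 2.5)]
print(levels)
```

Values outside the scale range are clamped to the nearest end, so out-of-range pixels take the sparsest or densest pattern instead of raising an error.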
Conclusions and Future Work
First version of a set of “WMS-Q” conventions published as an Engineering Report (OGC document …):
–Compatible with WMS 1.3.0, with one very minor alteration (the “type” of the MetadataURL)
Focuses on conveying uncertainties of raster data:
–Different conventions would be required for categorical and/or vector data
Uses the UncertML vocabulary, compatible with NetCDF-U.
Requires a new styling mechanism, beyond SLD/SE:
–This new mechanism works for other use cases too, not just uncertainties (e.g. vectors)
–Gives clients fine-grained control over styling
–We intend to publish it as a discussion paper when ready
Prototype software based on ncWMS demonstrated here:
–Will be part of the core ncWMS release in due course
Future work will focus on:
–Constraining behaviour of GetFeatureInfo for …
–Linking with clients (e.g. Godiva2, Greenland)
GEO label
Victoria Luch, 19:30
What is it?
–The GEO Label is intended to “assist the user to assess the scientific relevance, quality, acceptance and societal needs of the components” (ST Task Team, 2010).
Purposes?
–Be a quality indicator for GEOSS geospatial data and datasets. Problem: usability depends on the data application; there is no defined threshold.
–Improve user recognition of, and trust in, validated datasets. Problem: who is going to certify this?
–Assist in searching by providing users with visual clues of dataset quality and relevance.
–Provide accreditation, provenance and monitoring.
–Increase the visibility of EO data.
–Emphasize open access and easy availability.
Possible shape?
–Certification label
–A formal way to present quality indicators, provenance and attribution
Phases
Phase I: An online questionnaire was conducted to define the initial user and producer views on the role that a GEO Label should serve.
–It presented some examples of common review and rating systems, and of commonly used seals with click-to-verify functionality.
–It aimed to identify the participants’ opinions on such systems.
Phase II: A further study presenting some GEO Label examples will be conducted, based on the results of our first study. We will elicit feedback on these examples under controlled conditions and in a well-managed and structured way.
Phase III: We will create physical prototypes which will be used in a human subject study. The most successful prototypes will then be used to define the GEO Label concept and the role that a GEO Label will serve.
Copyright © 2012 Open Geospatial Consortium
Conclusions of Phase I
We received a total of 87 valid responses: 57 from dataset users and 30 from dataset producers.
–Overall, the results of our study show that users and producers of geospatial data appear to have generally very positive attitudes towards the development and introduction of a GEO label.
How important are the following informational aspects?
–expert judgement of the dataset and its quality;
–a dataset’s compliance with international standards;
–community advice and recommendations on what datasets are best to use;
–information about the reputation of the dataset provider;
–dataset citations (e.g., a list of journal articles or other publications where the dataset has been used and quality checks have been reported);
–‘soft knowledge’ (subjective and informal statements) about the dataset quality that is provided by the creator or provider of the dataset; and
–an ability to visualise metadata records side-by-side when comparing two or more datasets.
The role of the GEO label
–The majority (50 respondents) indicated a preference for a drill-down interrogation facility, with a large number of respondents additionally and/or alternatively stating a preference for a certification seal.
–Overall, the results show that users and producers of geospatial data agree on the benefits of introducing a GEO label, with no distinct difference being apparent between user and producer views.
Second questionnaire
–Section A: general information about the respondent
–Section B: show several GEO labels in the search results and see what summary respondents want
–Section C: show a full GEO label page with detailed information (citations, user reviews) and see if respondents like the idea
–Section D: closing summary with general comments about the GEO label
A user feedback catalogue
Maud van der Broek and Simon Thum, 19:40
Components focused on user feedback Page 29 Silver Spring, USA. GeoViQua CREAF March 27, 2013
User feedback model
Service is already in place: /api/v1/feedback/1/
We will work on perfecting this service, preparing an editor and connecting to the GEO Portal.
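The sketch below shows how a client might address one feedback item and hold its content locally. Only the path `/api/v1/feedback/<id>/` comes from the slide. The host, the class and its field names are hypothetical, chosen to mirror the feedback types listed in the abstract (comments, citations, discovered issues, ratings, reports of usage), and do not describe the real service schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical base address: the slide preserves only the path.
BASE_URL = "http://example.org"


def feedback_url(item_id):
    """Build the address of one feedback item under /api/v1/feedback/."""
    return f"{BASE_URL}/api/v1/feedback/{item_id}/"


@dataclass
class FeedbackItem:
    """Illustrative record; field names are assumptions, not the real schema."""
    target_dataset: str
    comment: Optional[str] = None
    rating: Optional[int] = None
    citations: List[str] = field(default_factory=list)
    discovered_issues: List[str] = field(default_factory=list)
    usage_reports: List[str] = field(default_factory=list)


# Made-up example values.
item = FeedbackItem(target_dataset="example-dataset-id", rating=4,
                    comment="Example comment about the dataset.")
print(feedback_url(1), item)
```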
Enhancements in metadata presentation
Joan Masó, 19:50
Provenance visualization
Provenance visualization
Metadata comparison
Evaluation of the metadata completeness
Thanks!