Beyond Metadata: Towards User-Centric Description of Data Quality. Michael F. Goodchild, University of California, Santa Barbara.



Metadata
• Data about data
  – handling instructions
  – catalog entry
  – fitness for use
• What is known about data quality
  – a measure of the success of spatial data quality research
  – much progress has been made
  – FGDC CSDGM 1994
  – ISO
  – DDI
  – EML

Two tests of success
• Geobrowsers
  – Google Earth
  – geotagging
  – Wikimapia
  – Where 2.0

CSDGM, ISO
• Do they match the state of research?
  – early 1990s
  – SDTS discussions of the 1980s
  – the five-fold way
    · positional accuracy
    · attribute accuracy
    · logical consistency
    · completeness
    · lineage
• Do they represent a user perspective?
  – committees staffed by data producers
  – production control mechanisms?

Producer or user?
• Producer-centric
  – details of the production process: the measurement and compilation systems used
  – tests of data quality conducted under carefully controlled conditions
  – formal specifications of data set contents
• User-centric
  – effects of uncertainties on specific uses of the data, from simple queries to complex analyses
  – simple descriptions of quality that are readily understood by non-expert users
  – tools to enable the user to determine the effects of quality on results

Increasing complexity
• Self-documentation
  – notes to oneself
• A colleague
  – brief description
• Another discipline, language, culture
  – ideal metadata/data ratio?

[Figure: axes labeled "social distance" and "complexity of metadata"]

Seven issues
• Areas in which research has moved beyond the standards
  – Accuracy of Spatial Databases (1989)
  – Measurements from Maps (1989)
  – 15 books
  – 1000 journal articles

1. Decoupling the representative fraction
• Ratio of distance on the map to distance on the ground
  – no flat map of a curved surface can have a constant RF
• RF as a surrogate
  – positional accuracy
  – spatial resolution
  – map content
• RF undefined for digital data
  – inherited from source maps
  – extended by convention
    · aerial photographs (RF of the photographic plate)
    · digital orthoimagery (positional accuracy)
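The point that a flat map cannot hold a constant RF can be checked with a little arithmetic. The sketch below assumes a spherical Mercator projection, which is an illustrative choice and not one named in the slides; the function name and the nominal 1:1,000,000 scale are hypothetical.

```python
import math

def mercator_rf_denominator(nominal_denominator: float, latitude_deg: float) -> float:
    """Local RF denominator on a spherical Mercator map.

    The map is stretched by sec(latitude) relative to the equator, so the
    denominator of the representative fraction shrinks by cos(latitude).
    """
    return nominal_denominator * math.cos(math.radians(latitude_deg))

# Nominal RF of 1:1,000,000, taken here to hold at the equator (an assumption).
for lat in (0, 30, 60):
    print(f"latitude {lat:2d}: RF = 1:{mercator_rf_denominator(1_000_000, lat):,.0f}")
```

Under this assumption the denominator halves by 60 degrees of latitude, so a single RF printed in the map margin describes only part of the sheet.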

2. Accuracy or uncertainty?
• Accuracy
  – a true value z exists
  – a measured value z*
  – error z* - z
  – RMSE
  – theory of measurement error
  – error propagation
• Uncertainty
  – vagueness in definitions
    · no truth
    · perhaps a consensus?
  – lack of replicability
• Change of paradigm around 1992
[Table: columns CSDGM, ISO; row "accuracy": 857 (digits fused in extraction); row "uncertainty": 0, 0]
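The accuracy paradigm can be made concrete with a short sketch. It assumes a true value z is available for every measured value z*, which is exactly what the uncertainty paradigm denies; the sample elevations and function names are hypothetical, and the propagation rule shown covers only the sum of two independent errors.

```python
import math

def rmse(measured, true):
    """Root mean square error of measured values z* against true values z."""
    if len(measured) != len(true) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((m - t) ** 2 for m, t in zip(measured, true)) / len(measured))

def propagate_sum(sigma_a, sigma_b):
    """Standard error of a + b when the two errors are independent."""
    return math.hypot(sigma_a, sigma_b)

# Hypothetical elevations in metres: surveyed "truth" vs. values read from a data set.
z_true = [100.0, 105.0, 98.0, 110.0]
z_star = [101.0, 104.0, 100.0, 108.0]
print(rmse(z_star, z_true))
print(propagate_sum(3.0, 4.0))
```

When definitions are vague and no z exists, neither function has anything to compare against, which is why the shift to uncertainty is a change of paradigm and not just of vocabulary.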

3. Objects and fields
• A fundamental distinction
  – 1992
  – appears nowhere in the standards
• Discrete object conceptualization
  – an empty table top
  – occupied by discrete, countable objects
  – points, lines, areas, volumes
• Continuous field conceptualization
  – a mapping from location x to value z
  – a single-valued function of location

z'(x) = z(x) + δz(x)

Separability
• Phenomenon conceptualized as a field
  – impossible to separate positional and attribute accuracy
  – interval/ratio (elevation)
  – nominal (land cover class)
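A minimal sketch of why the two kinds of accuracy cannot be separated for a field, assuming a hypothetical one-dimensional elevation profile with a constant slope (the field, the numbers, and the function names are illustrative only). A positional error and a measurement error produce the same observed discrepancy δz(x).

```python
def elevation(x: float) -> float:
    """Hypothetical field: elevation in metres rising 0.1 m per metre of x."""
    return 50.0 + 0.1 * x

def observed(x: float, shift: float, noise: float) -> float:
    """Value recorded at nominal location x when the true sample location
    is x + shift and the measurement itself is off by noise."""
    return elevation(x + shift) + noise

x = 200.0
# Case A: 10 m positional error, perfect measurement.
dz_a = observed(x, shift=10.0, noise=0.0) - elevation(x)
# Case B: perfect position, 1 m measurement error.
dz_b = observed(x, shift=0.0, noise=1.0) - elevation(x)
print(dz_a, dz_b)
```

Both cases leave the user with the same discrepancy at x, so reporting "positional accuracy" and "attribute accuracy" as separate quantities has no clear meaning for this field. The nominal case (land cover class) is not covered by this sketch.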

4. Granularity
• Metadata definable at any level
  – individual vertex
  – point, line, area
  – layer
  – geodatabase
• Metadata as a form of generalization
  – economies of scale
• Spatial non-stationarity
• Multiple lineages
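One way to picture metadata at several levels is a lookup that falls back from fine to coarse. This is a sketch under assumed names: the level labels, keys, and values below are hypothetical, "feature" stands in for the point, line, or area level, and no standard is claimed to work this way.

```python
from typing import Optional

# Levels from the slide, finest first.
LEVELS = ("vertex", "feature", "layer", "geodatabase")

def resolve(metadata: dict, key: str) -> Optional[object]:
    """Return the value of key from the finest level that defines it."""
    for level in LEVELS:
        value = metadata.get(level, {}).get(key)
        if value is not None:
            return value
    return None

# Hypothetical record: one accuracy statement for the whole layer,
# overridden for a single feature that came from a different source.
record = {
    "geodatabase": {"lineage": "agency compilation"},
    "layer": {"positional_rmse_m": 12.0},
    "feature": {"positional_rmse_m": 2.5, "lineage": "GPS survey"},
}
print(resolve(record, "positional_rmse_m"))
print(resolve(record, "lineage"))
```

Stating a value once at the layer level is the economy of scale; the feature-level override is what non-stationarity and multiple lineages require.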

5. Collection-level metadata
• Describing the properties of entire collections
• The Geospatial One-Stop –
• There will always be more than one one-stop
  – how to know where to look?

GOS coverage, 1/06

6. Spatial dependence
• Tobler’s First Law
  – nearby things are more similar than distant things
  – applies to errors
  – relative accuracy almost always better than absolute accuracy
  – covariances as important as variances
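The link between covariance and relative accuracy follows from the variance of a difference. The sketch below assumes two nearby points whose errors in one coordinate share a standard deviation sigma and a correlation rho, so that Var(e1 - e2) = 2·sigma²·(1 - rho); the 10 m figure and the function name are hypothetical.

```python
import math

def relative_error_sd(sigma: float, rho: float) -> float:
    """Standard deviation of the difference e1 - e2 between two errors that
    share standard deviation sigma and have correlation rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return math.sqrt(2.0 * sigma ** 2 * (1.0 - rho))

sigma = 10.0  # hypothetical absolute positional error, metres
for rho in (0.0, 0.5, 0.9, 0.99):
    print(f"rho = {rho:4.2f}: relative error sd = {relative_error_sd(sigma, rho):5.2f} m")
```

At rho = 0.5 the relative error equals the absolute error, and above that it is smaller. A variance alone cannot convey this, which is why the covariances matter as much as the variances.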

Marginal or joint properties?
• Visualization of marginal properties
• Analytic functions respond to joint properties
  – slope
  – area
• Joint properties must be described at a higher level
  – relative errors of vertex positions
  – described at level of vertex collection
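A small Monte Carlo sketch of how area responds to joint and not marginal properties. It assumes a hypothetical 100 m square and Gaussian vertex errors; in both cases every vertex has the same marginal error (standard deviation sigma per coordinate), and only the dependence between vertices differs.

```python
import random
import statistics

def shoelace_area(vertices):
    """Area of a simple polygon from its vertex list (shoelace formula)."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def area_sd(square, sigma, shared, trials=2000, seed=0):
    """Standard deviation of the area when every vertex coordinate has error
    sd sigma, either one shift shared by all vertices or independent shifts."""
    rng = random.Random(seed)
    areas = []
    for _ in range(trials):
        if shared:
            dx, dy = rng.gauss(0, sigma), rng.gauss(0, sigma)
            moved = [(x + dx, y + dy) for x, y in square]
        else:
            moved = [(x + rng.gauss(0, sigma), y + rng.gauss(0, sigma)) for x, y in square]
        areas.append(shoelace_area(moved))
    return statistics.pstdev(areas)

square = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
print(area_sd(square, sigma=5.0, shared=True))   # perfectly correlated errors
print(area_sd(square, sigma=5.0, shared=False))  # independent errors
```

With a shared shift the area is unchanged apart from floating-point rounding, while independent shifts of the same size make it vary widely. Per-vertex metadata would describe the two cases identically, so the difference has to be recorded at the level of the vertex collection.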

Cross-correlation
• How are errors on Layer 1 related to errors on Layer 2?
• Error as an issue in interoperability
  – what happens if I superimpose these layers?
• Two layers will almost always not fit
  – depends on lineage of each
  – how bad is the misfit?
  – will it affect my analysis?
• Binary metadata
  – the ability of a pair of data sets to interoperate
  – not available from either’s unary metadata
• If GIS is about overlay
  – then binary metadata are essential

The way forward
• Reopen the metadata debate
  – an unpopular move
  – it’s hard enough to persuade people to provide metadata
  – a standard before its time
  – standards should emerge only after research is complete
• It’s our responsibility
  – the research task does not end with journal publication
  – metadata standards express the state of our research
• Many other issues not related to data quality
  – possible allies