DATA QUALITY AND ERROR
- Terminology, types and sources
- Importance
- Handling error and uncertainty

DATA QUALITY
- GIGO: garbage in, garbage out
- Just because it’s in the computer doesn’t mean it’s right
- Accept that there will always be errors in GIS

INTRODUCTION
- GIS: great tool for spatial data analysis and display
- question: what about error?
  - data quality, error and uncertainty
  - error propagation
  - confidence in GIS outputs
- be careful, be aware, be upfront

TERMINOLOGY
- various (often confused) terms in use:
  - error
  - uncertainty
  - accuracy
  - precision
  - data quality

ERROR AND UNCERTAINTY
- Error
  - wrong or mistaken
  - degree of inaccuracy in a calculation, e.g. 2% error
- Uncertainty
  - lack of knowledge about level of error
  - unreliable

ACCURACY AND PRECISION
- Accuracy: extent of system-wide bias in measurement process
- Precision: level of exactness associated with measurement
[Figure: grid contrasting imprecise/precise with inaccurate/accurate]

DATA QUALITY
- degree of excellence in data
- general term for how good the data is
- takes all other definitions into account
  - error
  - uncertainty
  - precision
  - accuracy

DATA QUALITY
- based on the following elements:
  - positional accuracy
  - attribute accuracy
  - logical consistency
  - data completeness

POSITIONAL ACCURACY
- spatial: deviation from true position (horizontal or vertical)
  - general rule: be within the best possible data resolution
    - e.g. for a scale of 1:50,000, error can be no more than 25 m
  - can be measured as root mean square error (RMSE): the square root of the mean squared distance between the true and estimated locations
- temporal: difference from actual time and/or date
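
The RMSE measure and the scale rule of thumb above can be sketched in a few lines of Python. This is only an illustrative sketch: the check points are hypothetical, and the 0.5 mm map error used to reproduce the 25 m figure at 1:50,000 is an assumption, not something the slides state.

```python
import math


def ground_tolerance_m(scale_denominator, map_error_mm=0.5):
    """Ground distance in metres covered by a given error on the map sheet.

    map_error_mm=0.5 is an assumed value chosen because it reproduces
    the 25 m figure quoted for a 1:50,000 map.
    """
    return map_error_mm * scale_denominator / 1000.0


def positional_rmse(true_xy, estimated_xy):
    """Square root of the mean squared distance between paired points."""
    if len(true_xy) != len(estimated_xy) or not true_xy:
        raise ValueError("need two non-empty point lists of equal length")
    squared = [
        (tx - ex) ** 2 + (ty - ey) ** 2
        for (tx, ty), (ex, ey) in zip(true_xy, estimated_xy)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical check points in metres (surveyed vs. digitised positions).
true_pts = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
est_pts = [(3.0, 4.0), (100.0, 5.0), (95.0, 100.0), (0.0, 95.0)]

rmse = positional_rmse(true_pts, est_pts)
tolerance = ground_tolerance_m(50_000)
print(f"RMSE = {rmse:.1f} m, tolerance = {tolerance:.1f} m")
```

Comparing `rmse` against `tolerance` gives a quick pass/fail check for a dataset intended for use at a given scale.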

ATTRIBUTE ACCURACY
- classification and measurement accuracy
  - a feature is what the GIS thinks it to be
    - e.g. a railroad is a railroad and not a road
    - e.g. a soil sample agrees with the type mapped
- rated in terms of % correct
- in a database, forest types are grouped and placed within a boundary
  - in reality, there is no solid boundary where only pine trees grow on one side and spruce on the other
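
As a rough illustration of rating attribute accuracy as "% correct", the sketch below compares mapped classes against field-checked classes at the same sample sites. The function name and the sample values are hypothetical.

```python
def percent_correct(mapped, reference):
    """Share of sample sites (in %) where the mapped class matches the reference."""
    if len(mapped) != len(reference) or not mapped:
        raise ValueError("need two non-empty class lists of equal length")
    matches = sum(m == r for m, r in zip(mapped, reference))
    return 100.0 * matches / len(mapped)


# Hypothetical forest-type samples: what the GIS says vs. what was found in the field.
mapped_classes = ["pine", "pine", "spruce", "spruce", "pine"]
field_classes = ["pine", "spruce", "spruce", "spruce", "pine"]

score = percent_correct(mapped_classes, field_classes)
print(f"{score:.0f}% correct")
```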

LOGICAL CONSISTENCY
- absence of contradictory relationships in the database
- non-spatial inconsistencies:
  - some crimes recorded at place of occurrence, others at place where report taken
  - data for one country is for 2000, another for 2001
  - data uses different source or estimation technique for different years

LOGICAL CONSISTENCY
- spatial
  - overshoots and gaps in road networks or parcel polygons
[Figure: example of good logical consistency]
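
One simple way to look for overshoots and gaps, sketched here under strong simplifying assumptions, is to flag line endpoints that belong to only one segment ("dangles"). The network is hypothetical and coordinates must match exactly to count as connected. A real dead-end road would also be flagged, so the result is a list of candidates to inspect, not a list of errors.

```python
from collections import Counter


def dangling_endpoints(segments):
    """Return endpoints that are used by exactly one segment."""
    counts = Counter()
    for start, end in segments:
        counts[start] += 1
        counts[end] += 1
    return sorted(point for point, n in counts.items() if n == 1)


# Hypothetical road loop whose last segment stops short of (0, 0).
roads = [
    ((0, 0), (10, 0)),
    ((10, 0), (10, 10)),
    ((10, 10), (0, 10)),
    ((0, 10), (0, 0.5)),  # ends 0.5 units before the start node: a gap
]

dangles = dangling_endpoints(roads)
print(dangles)
```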

COMPLETENESS
- reliability concept
  - are all instances of a feature the GIS claims to include, in fact, there?
- partially a function of the criteria for including features
  - when does a road become a track?
- simply put, how much data is missing?
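
"How much data is missing?" can be answered roughly by comparing the features a dataset contains with an independent list of features it should contain. The sketch below assumes such a reference list exists; the road identifiers are hypothetical.

```python
def completeness(recorded_ids, expected_ids):
    """Return (% of expected features present, sorted list of missing ids)."""
    expected = set(expected_ids)
    if not expected:
        raise ValueError("expected_ids must not be empty")
    missing = expected - set(recorded_ids)
    present = len(expected) - len(missing)
    return 100.0 * present / len(expected), sorted(missing)


# Hypothetical road ids: reference inventory vs. what is in the GIS layer.
expected_roads = ["R1", "R2", "R3", "R4", "R5"]
recorded_roads = ["R1", "R2", "R3", "R4"]

pct_present, missing_roads = completeness(recorded_roads, expected_roads)
print(f"{pct_present:.0f}% complete, missing: {missing_roads}")
```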

SOURCES OF ERROR
- data collection and input
- human processing
- actual changes
- data manipulation
- data output

DATA COLLECTION AND INPUT
- inherent instability of the phenomenon itself
  - random variation of most phenomena (e.g. leaf size)
  - edges may not be sharp boundaries (e.g. forest edges)
- description of source data
  - data source
  - name, date of collection, method of collection, date of last modification, producer, reference, scale, projection
  - inclusion of metadata

DATA COLLECTION AND INPUT
- instrument inaccuracies:
  - satellite/air photo/GPS/spatial surveying
  - e.g. resolution and/or accuracy of digitizing equipment
    - thinnest visible line: mm at scale of 1:20, feet
    - anything smaller cannot be captured
  - attribute measuring instruments

DATA COLLECTION AND INPUT
- model used to represent data
  - e.g. choice of datum, classification system
- data encoding and entry
  - e.g. keying or digitizing errors
[Figure: original and digitised versions]

DATA COLLECTION AND INPUT
- Attribute uncertainty
  - uncertainty regarding characteristics (descriptors, attributes, etc.) of geographical entities
  - types: imprecise or vague, mixed up, plain wrong
  - sources: source document, misinterpretation, database error

HUMAN PROCESSING
- misinterpretation (e.g. of photos), spatial and attribute
- effects of classification (nominal/ordinal/interval)
- effects of scale change and generalization
[Figure: scale of data: global DEM, European DEM, national DEM, local DEM]

HUMAN PROCESSING
- generalization: simplification of reality by the cartographer to meet restrictions of map scale and physical size, effective communication and message
- can result in: reduction, alteration, omission and simplification of map elements
[Figure: City of Sapporo, Japan at 1:500,000, 1:25,000 and 1:10,000]

ACTUAL CHANGES
- gradual natural changes: river courses, glacier recession
- catastrophic changes: fires, floods, landslides
- seasonal and daily changes: lake/sea/river levels
- man-made: urban development, new roads
- attribute change: forest growth (height), discontinued trail/roads, road surfacing

ACTUAL CHANGES
- age of data
[Figure: Northallerton circa 1867 and Northallerton circa 1999]

DATA MANIPULATION
- vector to raster conversion errors
- coding and topological mismatch errors:
  - cell size (majority class and central point)
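
To show how the choice between the "majority class" and "central point" rules can change the result of a conversion to larger cells, here is a minimal sketch. It works on a small hypothetical grid of class codes (F = forest, W = water) rather than real vector data, and the function and rule names are illustrative only.

```python
from collections import Counter


def resample(fine, block, rule):
    """Aggregate a grid of class codes into coarser cells of block x block.

    rule="majority": each coarse cell takes the most frequent class in its block.
    rule="centre":   each coarse cell takes the class at the block's central cell.
    """
    rows, cols = len(fine), len(fine[0])
    coarse = []
    for r0 in range(0, rows, block):
        out_row = []
        for c0 in range(0, cols, block):
            if rule == "majority":
                cells = [
                    fine[r][c]
                    for r in range(r0, min(r0 + block, rows))
                    for c in range(c0, min(c0 + block, cols))
                ]
                out_row.append(Counter(cells).most_common(1)[0][0])
            elif rule == "centre":
                r = min(r0 + block // 2, rows - 1)
                c = min(c0 + block // 2, cols - 1)
                out_row.append(fine[r][c])
            else:
                raise ValueError(f"unknown rule: {rule}")
        coarse.append(out_row)
    return coarse


# Hypothetical 3 x 6 fine grid; each 3 x 3 block has one odd cell at its centre.
fine_grid = ["FFFWWW", "FWFWFW", "FFFWWW"]

by_majority = resample(fine_grid, 3, "majority")
by_centre = resample(fine_grid, 3, "centre")
print(by_majority, by_centre)
```

With this input the two rules disagree on both coarse cells, which illustrates why the cell size and the coding rule both need to be reported alongside a rasterised layer.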

DATA MANIPULATION
- vector to raster conversion errors
- coding and topological mismatch errors:
  - grid orientation

DATA MANIPULATION
- compounding effects of processing and analysis of multiple layers
  - if two layers each have an accuracy of 90%, the accuracy of the resulting overlay is around 81% (0.9 × 0.9)
- density of observations
  - TIN modeling and interpolation
- inappropriate or inadequate class intervals or inputs for models
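
The 81% figure follows from multiplying the layer accuracies together. The sketch below generalises that to any number of layers, under the assumption that errors in the layers are independent of each other. If errors are correlated, the product is only a rough guide.

```python
from math import prod


def overlay_accuracy(layer_accuracies):
    """Expected overlay accuracy as the product of per-layer accuracies.

    Assumes errors in different layers are independent of each other.
    """
    if not layer_accuracies:
        raise ValueError("need at least one layer")
    if any(not 0.0 <= a <= 1.0 for a in layer_accuracies):
        raise ValueError("accuracies must be fractions between 0 and 1")
    return prod(layer_accuracies)


two_layers = overlay_accuracy([0.9, 0.9])
four_layers = overlay_accuracy([0.9, 0.9, 0.9, 0.9])
print(f"2 layers: {two_layers:.0%}, 4 layers: {four_layers:.0%}")
```

Each additional layer lowers the expected accuracy further, so confidence in an output drops quickly as more layers are combined.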

DATA OUTPUT
- scaling inaccuracies
  - detail on scale bar and scale type
- error caused by inaccuracy of the output devices:
  - resolution of computer screen or printer
  - colour palettes: intended colours don’t match from screen to printer

DATA OUTPUT USE
- information may be incorrectly understood
- information may be inappropriately used

HANDLING ERROR
- must learn to cope with error and uncertainty in GIS applications
  - minimise risk of erroneous results
  - minimise risk to life/property/environment
- more research needed:
  - mathematical models
  - procedures for handling data error and propagation
  - empirical investigation of data error and effects
  - procedures for using output data uncertainty estimates
  - incorporation as standard GIS tools

HANDLING ERROR
- Awareness
  - knowledge of types, sources and effects
- Minimization
  - use of best available data
  - correct choices of data model/method
- Communication
  - to end user!