1 CPSC 695 Data Quality Issues M. L. Gavrilova. 2 Decisions…

Slides:



Advertisements
Similar presentations
Copyright, © Qiming Zhou GEOG1150. Cartography Data Models for Computer Cartography.
Advertisements

WFM 6202: Remote Sensing and GIS in Water Management © Dr. Akm Saiful IslamDr. Akm Saiful Islam WFM 6202: Remote Sensing and GIS in Water Management Akm.
GIS for Environmental Science
Managing Error, Accuracy, and Precision In GIS. Importance of Understanding Error *Until recently, most people involved with GIS paid little attention.
School of Environmental Sciences University of East Anglia
GIS: The Grand Unifying Technology. Introduction to GIS  What is GIS?  Why GIS?  Contributing Disciplines  Applications of GIS  GIS functions  Information.
Introduction to Cartography GEOG 2016 E
Maps as Numbers Getting Started with GIS Chapter 3.
Cartographic and GIS Data Structures
Geographic Information Systems
CPSC 695 Future of GIS Marina L. Gavrilova. The future of GIS.
Geog 458: Map Sources and Errors January 20, 2006 Data Storage and Editing.
Geographic Information Systems. What is a Geographic Information System (GIS)? A GIS is a particular form of Information System applied to geographical.
GIS 200 Introduction to GIS Buildings. Poly Streams, Line Wells, Point Roads, Line Zoning,Poly MAP SHEETS.
Maps as Numbers Lecture 3 Introduction to GISs Geography 176A Department of Geography, UCSB Summer 06, Session B.
So What is GIS??? “A collection of computer hardware, software and procedures that are used to organize, manage, analyze and display.
Data Input How do I transfer the paper map data and attribute data to a format that is usable by the GIS software? Data input involves both locational.
GEOGRAPHIC INFORMATION SYSTEM GIS are tools that allow for the processing of spatial data into information, generally information tied explicitly to, and.
GIS Tutorial 1 Lecture 6 Digitizing.
Lineage February 13, 2006 Geog 458: Map Sources and Errors.
Spatial Data: Elements, Levels and Types. Spatial Data: What GIS Uses Bigfoot Sightings: Spatial Data.
Week 16 GEOG2750 – Earth Observation and GIS of the Physical Environment 1 Lecture 13 Error and uncertainty Outline – terminology, types and sources –
GIS Development: Step5 - DB Planning and Design Step6 - Database Construction Step7 - Pilot/Benchmark (Source: GIS AsiaPacific, June/July & August/September.
Geography 241 – GIS I Dr. Patrick McHaffie Associate Professor Department of Geography Cook County, % population < 5.
Copyright, © Qiming Zhou GEOG1150. Cartography Quality Control and Error Assessment.
GI Systems and Science January 23, Points to Cover  What is spatial data modeling?  Entity definition  Topology  Spatial data models Raster.
9. GIS Data Collection.
Data Acquisition Lecture 8. Data Sources  Data Transfer  Getting data from the internet and importing  Data Collection  One of the most expensive.
Data Quality Data quality Related terms:
Spatial data Visualization spatial data Ruslan Bobov
Spatial data models (types)
Data Quality Issues-Chapter 10
GROUP 4 FATIN NUR HAFIZAH MULLAI J.DHANNIYA FARAH AN-NUR MOHAMAD AZUWAN LAU WAN YEE.
Ref: Geographic Information System and Science, By Hoeung Rathsokha, MSCIM GIS and Remote Sensing WHAT.
Data Sources Sources, integration, quality, error, uncertainty.
GIS technologies and Web Mapping Services
Map Scale, Resolution and Data Models. Components of a GIS Map Maps can be displayed at various scales –Scale - the relationship between the size of features.
Chapter 3 Sections 3.5 – 3.7. Vector Data Representation object-based “discrete objects”
OVERVIEW- What is GIS? A geographic information system (GIS) integrates hardware, software, and data for capturing, managing, analyzing, and displaying.
Geographic Information System GIS This project is implemented through the CENTRAL EUROPE Programme co-financed by the ERDF GIS Geographic Inf o rmation.
Applied Cartography and Introduction to GIS GEOG 2017 EL Lecture-2 Chapters 3 and 4.
GIS Data Quality.
Data input 1: - Online data sources -Map scanning and digitizing GIS 4103 Spring 06 Adina Racoviteanu.
Data Sources Sources, integration, quality, error, uncertainty.
8. Geographic Data Modeling. Outline Definitions Data models / modeling GIS data models – Topology.
How do we represent the world in a GIS database?
Cartographic and GIS Data Structures Dr. Ahmad BinTouq URL:
Difference Between Raster and Vector Images Raster and vector are the two basic data structures for storing and manipulating images and graphics data on.
URBDP 422 Urban and Regional Geo-Spatial Analysis Lecture 2: Spatial Data Models and Structures Lab Exercise 2: Topology January 9, 2014.
1 Peter Fox GIS for Science ERTH 4750 (98271) Week 8, Tuesday, March 20, 2012 Analysis and propagation of errors.
DATA QUALITY AND ERROR  Terminology, types and sources  Importance  Handling error and uncertainty.
Lecture 3 The Digital Image – Part I - Single Channel Data 12 September
URBDP 591 A Lecture 17: Mistakes that Scientists Make Objectives Evaluating Empirical Research Learning from Mistakes Mistakes in Research Design Mistakes.
School of Geography FACULTY OF ENVIRONMENT School of Geography FACULTY OF ENVIRONMENT GEOG5060 GIS & Environment Dr Steve Carver
Review: Exam I GEOG 370 Instructor: Christine Erlien.
INTRODUCTION TO GIS  Used to describe computer facilities which are used to handle data referenced to the spatial domain.  Has the ability to inter-
Distance measure Point A: UTM Eastings = 450,000m; Northings = 4,500,000m Point B: UTM Eastings = 550,000m; Northings = 4,500,000m.
Data Entry Getting coordinates and attributes into our GIS.
What is GIS? “A powerful set of tools for collecting, storing, retrieving, transforming and displaying spatial data”
Data Storage & Editing GEOG370 Instructor: Christine Erlien.
Czech Technical University in Prague Faculty of Transportation Sciences Department of Transport Telematics Doc. Ing. Pavel Hrubeš, Ph.D. Geographical Information.
Lecture 24: Uncertainty and Geovisualization
Data Quality Data quality Related terms:
INTRODUCTION TO GEOGRAPHICAL INFORMATION SYSTEM
Geographical Information Systems
Cartographic and GIS Data Structures
Lecture 2 Components of GIS
URBDP 422 URBAN AND REGIONAL GEO-SPATIAL ANALYSIS
Prepared by S Krishna Kumar
Geographic Information Systems
Presentation transcript:

1 CPSC 695 Data Quality Issues M. L. Gavrilova

2 Decisions…

3 Topics Data quality issues Factors affecting data quality Types of GIS errors Methods to deal with errors Estimating degree of errors

4 Development of computer methods for handling spatial data Computer cartography Computer cartography (digital mapping) is the generation, storage and editing of maps using a computer. Spatial statistics Initially, developments were in areas such as measures of spatial distribution, three-dimensional analysis, network analysis and modeling techniques. CAD Computer-aided design (CAD) is used to enter and manipulate graphics data, whereas GIS is used to store and analise spatial data. Traditional users of CAD used CAD for the production, maintenance and analysis of design graphics.

5 Causes of errors in spatial data Measurement errors: accuracy (ex. altitude measurement or soil samples, usually related to instruments) Computational errors: precision (ex. to what decimal point the data is represented?) Human error: error in using instruments, selecting scale, location of samples Data model representation errors Errors in derived data

6 What type of error?

7 Factors affecting reliability of GIS data Age of data – collected at different times Arial coverage Map scale and resolution Density of observations Data formats and exchanges in formats Accessibility

8 Data quality issues: errors during data collections

9 Data quality issues: Sources of error in GIS Errors arising from our understanding and modeling of reality The different ways in which people perceive reality can have effects on how they model the world using GIS. Errors in source data for GIS Survey data can contain errors due to mistakes made by people operating equipment, or due to technical problems with equipment. Remotely sensed, and aerial photography data could have spatial errors if they were referenced wrongly, and mistakes in classification and interpretation would create attribute errors.

10 Data quality issues: Sources of error in GIS Perceptions or reality – mental mapping Your mental map of your home town or local area is your personal representation. You could commit this to paper as a sketch map if required. Figure in next slide shows two sketch map describing the location of Northallerton in the UK. They would be flawed as cartographic products, but nevertheless are a valid model of reality. Keates (1982) considers these maps to be personal, fragmentary, incomplete and frequently erroneous. In all cases, distance and shape would be distorted in inconsistent ways and the final product would be influenced by the personality and experience of the ‘cartographer’.

11 Two mental maps of Northallerton, UK

12 Operational errors introduced during manual digitizing Psychological errors: Difficulties in perceiving the true centre of the line being digitized and inability to move the cursor cross-hairs accurately along it. Physiological errors: It result from involuntary muscle spasms that give rise to random displacement. Line thickness: The thickness of lines on a map is determined by the cartographic generation employed. Method of digitizing: Point mode and stream mode

13 Topological errors in vector GIS (a) Effects of tolerance on topological cleaning (b) Topological ambiguities in raster to vector conversion

14 Vector to raster classification error

15 Rasterization errors Vector to raster conversion can cause an interesting assortment of errors in the resulting data. For example Topological errors Loss of small polygons Effects of grid orientation Variations in grid origin and datum Topological error in vector GIS: (a) loss of connectivity and creation of false connectivity (b) loss of information

16 Errors in data processing and analysis GIS operations that can introduce errors include the classification of data, aggregation or disaggregation of area data and the integration of data using overlay technique. Where a certain level of spatial resolution or a certain set of polygon boundaries are required, data sets that are not mapped with these may need to be aggregated or disaggregated to the required level.

17 Attribute error due to processing Attribute error result from positional error (such as the missing ‘hole’ feature in map A that is present as an island in map B). If one of the two maps that overlaid contains an error, then a classification error will result in the composite map (polygon BA).

18 GIS project design and management errors

19 Project management and human error Measuring error Typos/drawing errors Incorrect implementation error Planning/coordination error Incorrect use of devices error Erroneous methodology error Other human errors

20 Geometry related errors Rounding errors Processing errors Geometric coordinate transformation Map scanning, geometric approximations Vector to fine raster errors

21 Numerical modeling - propagating errors In numerical modeling and simulation errors propagate due to the multiplication of numerical error and initial measurement error. This presents additional challenges.

22 How errors can be controlled? Estimating degree of an error is an interesting area of GIS and computational science.

23 Methods to deal with errors Initial data: control quality of measurement, develop standards, prevent human error. Data models: select correct data models based on experience or model appropriateness, reduce errors during conversion from one to another.

24 Finding and modeling errors in GIS Checking for errors Probably the simplest means of checking for data errors is by visual inspection. Various statistical methods can be employed to help pinpoint potential errors. Estimating degree of an error helps in controlling and correcting errors

25 Estimating degree of an error - map digitizing errors Map digitizing – errors in digitizing a line or a geometric shape are estimated by studying the number of segments for curve approximation and map properties (distorted, rotated map). Perkal’s concept – there is an epsilon error band around a digitized line within which the data is considered to be correct.

26 Estimating degree of an error – raster to vector graphics Statistical approaches for error estimation based on probabilities of error Estimating the size of the grid and its influence on data approximation (Switzer’s method) Double-conversion method for vector to raster conversion (Bregt’s method): convert data twice, compare differences

27 Estimating degree of an error while processing data Errors associated with overlaying data and other spatial operations on data arise from inexact representation of original data and processing errors. Measurement of agreement between an original and overlaid polygons can be taken (McAlpine and Cook)

28 Estimating degree of progressing errors Two approaches to the modeling of errors in spatial data: epsilon modeling and Monte Carlo simulation. Epsilon modeling is based on a well known method of line generation (epsilon width). A Monte Carlo simulation approach has also been used to model the effects of error propagation in GIS overlays.

29 Methods to deal with progressing errors Conversion errors – develop more reliable algorithms Numerical or progressing errors: maintain consistence of data or result utilize exact computation methods