Geog 458: Map Sources and Errors Uncertainty January 23, 2006.

Outline
1. Defining uncertainty
2. How to calculate uncertainty?
   1) Nominal case: confusion matrix
   2) Interval/ratio case: RMSE
3. How to validate uncertainty?
   1) Internal validation: MAUP
   2) External validation: conflation

1. Defining uncertainty
- Definition of uncertainty
  - Discrepancy between reality and its representation
- Different kinds of uncertainty
  - Vagueness: the representation does not fit the nature of what it represents (e.g. representing cities as a point layer, soil with crisp boundaries) → better human conceptualization needed
  - Ambiguity: the representation is not universally agreed upon by users (e.g. place names, occupation classification, indicators of environmental health) → standardization needed
- Accuracy vs. precision
  - Accuracy: difference between true values and those in the database
  - Precision: amount of detail present in the data

Questions
- What is your diagnosis among {uncertainty, precision, positional accuracy, attribute accuracy, vagueness, ambiguity}, and what is your prescription?
  - Longitude values in decimal degrees are stored as integers
  - Contour lines derived from a DEM do not line up well with the DRG
  - The map indicates this road is bidirectional, but it turns out to be one-way
  - Implementing an intelligent geocoding system based on English prepositions (e.g. across, at, over) for international users
  - Is the boundary of Mt. Everest well delineated? Is this polygon boundary a good representation of Mt. Everest?
- Which is broadest? How would you communicate these errors in your data quality report?

2. Calculating accuracy
- Nominal case
  - Confusion matrix (a.k.a. misclassification matrix)
- Interval/ratio case
  - Root Mean Square Error (RMSE)
- The confusion matrix is widely used to report attribute accuracy measured at a nominal scale
- RMSE is widely used to report positional accuracy measured at a numeric scale (e.g. metric x, y coordinates)

Confusion Matrix
- Table 6.2 (p. 138): evaluating the classification of land parcels; there are five land use codes, A to E
- Rows and columns in the misclassification matrix
  - Each row corresponds to the class as recorded in the database
  - Each column corresponds to the class as recorded in the field
- Correctly classified vs. incorrectly classified
  - Diagonal entries represent agreement between database and field
  - Off-diagonal entries represent disagreement between database and field
- How accurate would you say this data is?
  - Since 206 parcels (the sum of the diagonal entries) out of 304 are correctly classified, the overall accuracy is 206/304 ≈ 67.8%
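The diagonal-over-total calculation can be sketched in a few lines of Python. Table 6.2 itself is not reproduced in this transcript, so the 3-class matrix below is hypothetical, and the function name `overall_accuracy` is an illustrative choice rather than anything from the lecture.

```python
def overall_accuracy(matrix):
    """Return the share of samples on the diagonal of a square confusion matrix.

    Rows are the class recorded in the database, columns the class
    recorded in the field, following the convention on the slide.
    """
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical counts, NOT the values of Table 6.2.
example = [
    [40, 5, 5],
    [4, 30, 6],
    [1, 2, 7],
]
print(f"{overall_accuracy(example):.1%}")  # 77.0%
```

Here 77 of the 100 hypothetical samples fall on the diagonal, so the overall accuracy is 77%.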

Confusion matrix: exercise
- Let's say you decide to write a test report on the attribute accuracy of a land use map
- 100 reference points are selected to represent three classes: 49 points from natural, 28 points from agricultural, and 23 points from urban land use in your data
- Field checks resulted in 41 points confirmed to be natural, 21 points confirmed to be agricultural, and 19 points confirmed to be urban
- What is the overall accuracy of your data?
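One way to work the exercise is sketched below. It assumes that each "confirmed" count is a diagonal entry, i.e. a point whose class in the data matched the class found in the field. The per-class shares are an extra not asked for on the slide.

```python
# Reference points per class in the data, and how many the field check confirmed.
sampled = {"natural": 49, "agricultural": 28, "urban": 23}
confirmed = {"natural": 41, "agricultural": 21, "urban": 19}

# Overall accuracy: confirmed points (the diagonal) over all reference points.
overall = sum(confirmed.values()) / sum(sampled.values())

# Share of each class in the data that the field check confirmed.
per_class = {name: confirmed[name] / sampled[name] for name in sampled}

print(f"overall accuracy: {overall:.0%}")  # overall accuracy: 81%
for name, share in per_class.items():
    print(f"{name}: {share:.1%}")
```

Under that assumption, 81 of the 100 reference points agree with the field check, giving an overall accuracy of 81%.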

Root Mean Square Error
- $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(c_i - a_i)^2}$, where $c_i$ is the observed value, $a_i$ is the true value, and $n$ is the number of observations
- RMSE is the square root of the mean of the squared differences between each observed value ($c_i$) and its corresponding true value ($a_i$)
- Indicates how much the observed values deviate from the true values
- In the case of positional accuracy, $a_i$ is derived from a source of higher accuracy
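A minimal Python sketch of the calculation follows, using the standard definition of RMSE as the square root of the mean squared difference. The function name `rmse` and the sample values are illustrative assumptions.

```python
import math


def rmse(observed, actual):
    """Square root of the mean squared difference between paired values."""
    if len(observed) != len(actual) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(c - a) ** 2 for c, a in zip(observed, actual)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical observed values c_i and true values a_i.
observed = [10.0, 12.0, 9.0, 11.0]
actual = [10.0, 10.0, 10.0, 10.0]
print(round(rmse(observed, actual), 3))  # 1.225
```

The differences are 0, 2, -1 and 1, so the mean squared difference is 6/4 = 1.5 and the RMSE is √1.5 ≈ 1.225.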

RMSE: exercise
- Let's say you decide to write a test report on the positional accuracy of NHPN data
- You obtain data from sources with higher positional accuracy, such as geodetic points
- 7 points (intersections) are selected to be compared with 7 corresponding control points
- Distances for the 7 pairs are calculated as follows
- What is the RMSE?
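The distance table for this exercise did not survive in the transcript, so the sketch below uses three hypothetical point pairs rather than the seven from the slide. It assumes planar coordinates in metres; `rmse_from_distances` is an illustrative name.

```python
import math


def rmse_from_distances(distances):
    """Positional RMSE, treating each test-to-control distance as the error."""
    if not distances:
        raise ValueError("need at least one distance")
    return math.sqrt(sum(d * d for d in distances) / len(distances))


# Hypothetical planar coordinates in metres, NOT the exercise data.
test_points = [(3.0, 4.0), (10.0, 0.0), (0.0, 6.0)]
control_points = [(0.0, 0.0), (10.0, 5.0), (0.0, 0.0)]

# Euclidean distance between each test point and its control point.
distances = [math.dist(p, q) for p, q in zip(test_points, control_points)]
print(distances)
print(round(rmse_from_distances(distances), 3))
```

With the real exercise data, the seven tabulated distances would be passed to `rmse_from_distances` directly.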

3. Validating accuracy
- Internal validation
  - Examines the likely impact of uncertainty on the results of operations within a GIS
  - What would be the effects of different data aggregation schemes on the results? MAUP
- External validation
  - Validates the accuracy of test data against external data sources
  - How accurate is this data set relative to reference data? Conflation

Modifiable Areal Unit Problem
- Quite simply, different aggregations yield different results
  - From Openshaw
- Because geography sometimes does not have a natural unit of analysis
  - Population, vegetation
  - Remember that a census unit is an artificial boundary drawn for the purpose of enumeration
  - Space is used as a sampling scheme
- Question of the optimal unit of analysis
  - Urban center boundary for analyzing urban activities
  - Metropolitan area for analyzing the spatial labor market
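A toy illustration of "different aggregations yield different results" follows. It assumes a single row of six cells with made-up values, aggregated into equal-sized zones; none of it comes from the lecture.

```python
def zone_means(values, zone_size):
    """Average consecutive cells into zones of `zone_size` cells each."""
    if zone_size <= 0 or len(values) % zone_size != 0:
        raise ValueError("zone_size must evenly divide the number of cells")
    return [
        sum(values[i:i + zone_size]) / zone_size
        for i in range(0, len(values), zone_size)
    ]


# Hypothetical values for six cells along a transect.
cells = [1, 9, 2, 8, 3, 7]

print(zone_means(cells, 2))  # [5.0, 5.0, 5.0]
print(zone_means(cells, 3))  # [4.0, 6.0]
```

The same six cells look perfectly uniform when grouped in pairs but show a contrast between two zones when grouped in threes. Only the zoning changed, not the underlying data.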

Conflation
- Describes the range of functions that attempt to overcome differences between datasets or merge their contents, as with rubber-sheeting
- Visual inspection of a spatial overlay of a TIGER file on GPS measurements
- Lab 2: working with data from different sources, conflating test data with data from an independent source of higher accuracy, visually inspecting positional accuracy, and summarizing the positional accuracy of the test data with RMSE