Why is it there? (How can a GIS analyze data?) Getting Started, Chapter 6 Paula Messina.

Slides:



Advertisements
Similar presentations
Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 ~ Curve Fitting ~ Least Squares Regression Chapter.
Advertisements

Probabilistic & Statistical Techniques Eng. Tamer Eshtawi First Semester Eng. Tamer Eshtawi First Semester
Appendix A. Descriptive Statistics Statistics used to organize and summarize data in a meaningful way.
Regression Analysis Using Excel. Econometrics Econometrics is simply the statistical analysis of economic phenomena Here, we just summarize some of the.
AP Statistics Chapters 3 & 4 Measuring Relationships Between 2 Variables.
Basic geostatistics Austin Troy.
University of Wisconsin-Milwaukee Geographic Information Science Geography 625 Intermediate Geographic Information Science Instructor: Changshan Wu Department.
Why Is It There? Lecture 6 Introduction to Geographic Information Systems Geography 176A 2006 Summer, Session B Department of Geography University of California,
Spatial Analysis Longley et al., Ch 14,15. Transformations Buffering (Point, Line, Area) Point-in-polygon Polygon Overlay Spatial Interpolation –Theissen.
Why Is It There? Getting Started with Geographic Information Systems Chapter 6.
Analysis of Research Data
Introduction to GIS: Lecture #7 (GIS Analysis) GIS Analysis Describing Attributes Statistical Analysis Spatial Description Spatial Analysis Searching for.
Lecture 07: Terrain Analysis Geography 128 Analytical and Computer Cartography Spring 2007 Department of Geography University of California, Santa Barbara.
Business Statistics - QBM117 Statistical inference for regression.
Applications in GIS (Kriging Interpolation)
Correlation and Regression Analysis
Relationships Among Variables
Understanding Research Results
1 Doing Statistics for Business Doing Statistics for Business Data, Inference, and Decision Making Marilyn K. Pelosi Theresa M. Sandifer Chapter 11 Regression.
© 2005 The McGraw-Hill Companies, Inc., All Rights Reserved. Chapter 12 Describing Data.
Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc. 4.1 Chapter Four Numerical Descriptive Techniques.
Exploratory Data Analysis. Computing Science, University of Aberdeen2 Introduction Applying data mining (InfoVis as well) techniques requires gaining.
Agronomic Spatial Variability and Resolution What is it? How do we describe it? What does it imply for precision management?
Copyright © Allyn & Bacon 2007 Chapter 2: Research Methods.
1 DATA DESCRIPTION. 2 Units l Unit: entity we are studying, subject if human being l Each unit/subject has certain parameters, e.g., a student (subject)
Variable  An item of data  Examples: –gender –test scores –weight  Value varies from one observation to another.
Using ESRI ArcGIS 9.3 Spatial Analyst
BPS - 3rd Ed. Chapter 211 Inference for Regression.
© Copyright McGraw-Hill CHAPTER 3 Data Description.
Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 1 Part 4 Curve Fitting.
Agronomic Spatial Variability and Resolution What is it? How do we describe it? What does it imply for precision management?
Basic geostatistics Austin Troy.
Interpolation.
Why Is It There? Getting Started with Geographic Information Systems Chapter 6.
Interpolation Tools. Lesson 5 overview  Concepts  Sampling methods  Creating continuous surfaces  Interpolation  Density surfaces in GIS  Interpolators.
Geographic Information Science
Model Construction: interpolation techniques 1392.
GEOSTATISICAL ANALYSIS Course: Special Topics in Remote Sensing & GIS Mirza Muhammad Waqar Contact: EXT:2257.
TYPES OF STATISTICAL METHODS USED IN PSYCHOLOGY Statistics.
Statistical analysis Outline that error bars are a graphical representation of the variability of data. The knowledge that any individual measurement.
Chapter 8 – Geographic Information Analysis O’Sullivan and Unwin “ Describing and Analyzing Fields” By: Scott Clobes.
Spatial Interpolation Chapter 13. Introduction Land surface in Chapter 13 Land surface in Chapter 13 Also a non-existing surface, but visualized as a.
© Copyright McGraw-Hill Correlation and Regression CHAPTER 10.
Interpolation of Surfaces Spatial Data Analysis. Spatial Interpolation Spatial interpolation is the prediction of exact values of attributes at un-sampled.
Examining Bivariate Data Unit 3 – Statistics. Some Vocabulary Response aka Dependent Variable –Measures an outcome of a study Explanatory aka Independent.
Statistical Surfaces Any geographic entity that can be thought of as containing a Z value for each X,Y location –topographic elevation being the most obvious.
L15 – Spatial Interpolation – Part 1 Chapter 12. INTERPOLATION Procedure to predict values of attributes at unsampled points Why? Can’t measure all locations:
Lecture 6: Point Interpolation
Agronomic Spatial Variability and Resolution What is it? How do we describe it? What does it imply for precision management?
1 Chapter 10: Describing the Data Science is facts; just as houses are made of stones, so is science made of facts; but a pile of stones is not a house.
BPS - 5th Ed. Chapter 231 Inference for Regression.
(Unit 6) Formulas and Definitions:. Association. A connection between data values.
INTERPOLATION Procedure to predict values of attributes at unsampled points within the region sampled Why?Examples: -Can not measure all locations: - temperature.
Why Is It There? Chapter 6. Review: Dueker’s (1979) Definition “a geographic information system is a special case of information systems where the database.
Week 2 Normal Distributions, Scatter Plots, Regression and Random.
Statistics Descriptive Statistics. Statistics Introduction Descriptive Statistics Collections, organizations, summary and presentation of data Inferential.
Outline Sampling Measurement Descriptive Statistics:
Descriptive Statistics ( )
Statistical analysis.
MATH-138 Elementary Statistics
Part 5 - Chapter
Regression Analysis Module 3.
Data Mining: Concepts and Techniques
Statistical analysis.
APPROACHES TO QUANTITATIVE DATA ANALYSIS
Spatial Analysis Longley et al..
Introduction to Statistics
15.1 The Role of Statistics in the Research Process
REGRESSION ANALYSIS 11/28/2019.
Presentation transcript:

Why is it there? (How can a GIS analyze data?) Getting Started, Chapter 6 Paula Messina

GIS is capable of data analysis Attribute Data Describe with statistics Analyze with hypothesis testing Spatial Data Describe with maps Analyze with spatial analysis

Describing one attribute

Attribute Description extremes range The extremes of an attribute are the highest and lowest values, and the range is the difference between them in the units of the attribute. histogram A histogram is a two-dimensional plot of attribute values grouped by magnitude and the frequency of records in that group, shown as a variable-length bar. bell curve mean For a large number of records with random errors in their measurement, the histogram resembles a bell curve and is symmetrical about the mean.

Describing a classed raster grid % (blue) = 19/48

If the attributes are: Numbers statistical description min, max, range variance standard deviation

Statistical description Range : max-min Central tendency : mode, median, mean Variation : variance, standard deviation

Statistical description Range : outliers mode, median, mean Variation : variance, standard deviation

Elevation (book example)

GPS Example Data: Elevation Table 6.2: Sample GPS Readings Data ExtremeDateTimeDMSDMS Elev Minimum 6/14/9510:47am Maximum 6/15/9510:47pm Range 1 Day12 hours

Mean Statistical average Sum of the values for one attribute divided by the number of records X i i1= n = X / n

Variance The total variance is the sum of each record with its mean subtracted and then multiplied by itself. The standard deviation is the square root of the variance divided by the number of records less one.

l Average difference from the mean l Sum of the mean subtracted from the value for each record, squared, divided by the number of records- 1, square rooted. st.dev. = (X - X ) 2 i  n - 1 Standard Deviation

GPS Example Data: Elevation Standard Deviation Same units as the values of the records, in this case meters. Elevation is the mean (459.2 meters) plus or minus the expected error of meters Elevation is most likely to lie between meters and meters. These limits are called the error band or margin of error.

Standard Deviations and the Bell Curve Mean One Std. Dev. below the mean One Std. Dev. above the mean

Testing Means (1) Mean elevation of meters Standard deviation meters What is the chance of a GPS reading of meters? is 25.3 meters above the mean 0.31 standard deviations ( Z-score) of the curve lies between the mean and this value beyond it

Mean % % Testing Means (2)

Accuracy Determined by testing measurements against an independent source of higher fidelity and reliability. Must pay attention to units and significant digits. Not to be confused with precision!

The difference is the map GIS data description answers the question: Where? GIS data analysis answers the question: Why is it there? GIS data description is different from statistics because the results can be placed onto a map for visual analysis.

Spatial Statistical Description For coordinates, the means and standard deviations correspond to the mean center and the standard distance A centroid is any point chosen to represent a higher dimension geographic feature, of which the mean center is only one choice.

Spatial Statistical Description For coordinates, data extremes define the two corners of a bounding rectangle.

Geographic extremes Southernmost point in the continental United States. Range: e.g. elevation difference; map extent Depends on projection, datum etc.

Mean Center mean y mean x

Centroid: mean center of a feature

Mean center?

Comparing spatial means

Spatial Analysis Lower 48 United States 1996 Data from the U.S. Census on gender Gender Ratio = # females per 100 males Range is What does the spatial distribution look like?

Gender Ratio by State: 1996

Searching for Spatial Pattern A linear relation is a predictable straight- line link between the values of a dependent and an independent variable. (y = a + bx) It is a simple model of correlation. A linear relation can be tested for goodness of fit with least squares methods. The coefficient of determination r-squared is a measure of the degree of fit, and the amount of variance explained.

Simple linear relation dependent variable independent variable observation best fit regression line y = a + bx intercept gradient y=a+bx

Testing the relation gr = long.

GIS and Spatial Analysis Geographic inquiry examines the relationships between geographic features collectively to help describe and understand the real-world phenomena that the map represents. Spatial analysis compares maps, investigates variation over space, and predicts future or unknown maps. Many GIS systems have to be coaxed to generate a full set of spatial statistics.

You can lie with... MapsStatistics Correlation is not causation!

Terrain Analysis Paula Messina

Introduction to Terrain Analysis What is terrain analysis? How are data points interpolated to a grid? How are topographic data sets produced from non-point data? How are derivative data sets (i.e., slope and aspect maps) produced by ArcView?

What is Terrain Analysis? Terrain Analysis: the study of ground- surface relief and pattern by numerical methods (a.k.a geomorphometry). Geomorphology  qualitative Geomorphometry = quantitative

Interpolation to a Grid Assumptions: Elevations are continuously distributed The influence of one known point over an unknown point increases as distance between them decreases ?

Interpolation Using the Neighborhood Model Inverse-Distance theory dictates: The value of X > 58 The value of X < 97 The value of X is closer to 58 than 97 x

x Zx =Zx =  Z p d p -n P = 1 R  d p -n P = 1 R Z x = elevation at kernal (point x) Z p = elevation at known point p d p = distance from point x to point p n = “friction of distance” value; usually between 1 and 6 Neighborhood Interpolation Using Inverse Distance Weighting When n=2, the technique is called “inverse-squared distance weighting.” ArcGIS calls this IDW

Types of “Neighborhoods” used with IDW Nearest n Neighbors in this example, n = 3 this method isn’t effective when there are clusters of points “nearest in quadrants,” and “nearest in octants” searches can help Fixed Radius a radius is selected points are selected only if they lie within that fixed radius x x

Interpolation using the Spline Method The spline interpolator fits a minimum-curvature surface through input points. “Rubber sheet fit” The spline interpolator fits a mathematical function to a specified number of nearest points

Interpolation Using Kriging Based on regionalized variable theory Drift, random correlated component, noise This method produces a statistically optimal surface, but it is very computationally intensive Kriging is used frequently in soil science and geology

Trend Interpolator Fits a mathematical function (a polynomial of specified order) to input points Points may be chosen by nearest neighbor or radius searches --or-- All points may be used Uses a least-squares regression fit The surface produced does not necessarily pass through the points used This is an excellent choice when data points are sparse Not available as a menu item in ArcGIS

Which ArcView menu interpolator is better? IDW Assumption: The variable being mapped decreases in influence with distance Example: interpolating consumer purchasing power for a retail site analysis Spline Assumption: The variable being mapped is a smooth, continuous surface; it is not particularly good for surfaces with large variability over small horizontal distances Examples: terrain, water table heights, pollution concentration, etc.

The Finished Grid The Messina “Eyeball” Interpolator was used x Grids are subject to the “layer cake effect”

Point Data Collection in the Field It is critical to obtain data at the corners of the grid extent It is advisable to obtain the VIPs (Very Important Points) such as the highest and lowest elevations

Other Continuous Surface Sources USGS DEMs produced directly from USGS Topographic Maps Elevations of an area are averaged within the grid cell (pixel) High and low points can never be saved as a grid cell value Various techniques (i.e. stereograms) were used to accomplish this process Original datum (i.e. NAD27, NAD83) is preserved in the DEM Spatial resolution: 30m (7.5 minute data), 1 arc-second (1 degree data), 10m*, 5m* *(limited coverage)

Other Continuous Surface Sources Synthetic Aperture Radar, Side- looking Airborne Radar Shuttle Missions: Shuttle Radar Topography Mission, 2/00Shuttle Radar Topography Mission SIR-C, 1994SIR-C Other Orbiters Magellan Mapping Mission of Venus, Click here to see an animation of the Venutian surface topographyMagellan Mapping Mission of VenusClick here Airborne Radar Mappers AirSAR/TopSARAirSAR/TopSAR GeoSAR: California mappingGeoSAR Click hereClick here to link to Hunter College’s Radar Mapping Web Site

How is Slope Computed? Slope = arctan [ ( ) 2 + ( ) 2 ] Grid cell = 100m x 100m dZ dX dZ dY Calculate the slope for the central pixel. Click here Click here for the solution.

How is Aspect Computed? Aspect A’ = arctan - ( ) ( ) Grid cell = 100m x 100m dZ dY dZ dX Calculate the aspect for the central pixel. Click here Click here for the solution. If is negative, add 90 to A’ If is positive, and is negative: add 270 to A’ If is positive, and is positive: subtract A’ from 270 dZ dX dZ dY dZ dY dZ dX dZ dX