Download presentation
Presentation is loading. Please wait.
Published byDeborah Robbins Modified over 9 years ago
1
Why Is It There? Getting Started with Geographic Information Systems Chapter 6
2
6 Why Is It There? l 6.1 Describing Attributes l 6.2 Statistical Analysis l 6.3 Spatial Description l 6.4 Spatial Analysis l 6.5 Searching for Spatial Relationships l 6.6 GIS and Spatial Analysis
3
Duecker (1979) " A geographic information system is a special case of information systems where the database consists of observations on spatially distributed features, activities or events, which are definable in space as points, lines, or areas. A geographic information system manipulates data about these points, lines, and areas to retrieve data for ad hoc queries and analyses ".
4
GIS is capable of data analysis l Attribute Data –Describe with statistics –Analyze with hypothesis testing l Spatial Data –Describe with maps –Analyze with spatial analysis
5
Describing one attribute
6
Attribute Description l The extremes of an attribute are the highest and lowest values, and the range is the difference between them in the units of the attribute. l A histogram is a two-dimensional plot of attribute values grouped by magnitude and the frequency of records in that group, shown as a variable-length bar. l For a large number of records with random errors in their measurement, the histogram resembles a bell curve and is symmetrical about the mean.
7
If the records are: l Text –Length of text –word frequency –address matching l Example: Display all places called “State Street”
8
If the records are: l Classes –histogram by class –numbers in class –contiguity description
9
Describing a classed raster grid 5 10 15 20 P (blue) = 19/48
10
If the records are: l Numbers –statistical description –min, max, range –variance and standard deviation
11
Statistical description l Range (min, max, max-min) l Central tendency (mode, median, mean) l Variation (variance, standard deviation)
12
Elevation (book example)
13
Mean l Statistical average l Sum of the values for one attribute divided by the number of records X i i1= n = X
14
Computing the Mean l Sum of attribute values across all records, divided by the number of records. l A representative value, and for measurements with normally distributed error, converges on the true reading. l A value lacking sufficient data for computation is called a missing value.
15
Variance l The total variance is the sum of each record with its mean subtracted and then multiplied by itself. l The standard deviation is the square root of the variance divided by the number of records less one.
16
l Average difference from the mean l Sum of the mean subtracted from the value for each record, squared, divided by the number of records- 1, square rooted. st.dev. = (X - X ) 2 i n - 1 Standard Deviation
17
GPS Example Data: Elevation Standard deviation l Same units as the values of the records, in this case meters. l The average amount by which the readings differ from the average l Can be above or below the mean l Elevation is the mean (459.2 meters), plus or minus the expected error of 82.92 meters l Elevation is most likely to lie between 376.28 meters and 542.12 meters. l These limits are called the error band or margin of error.
18
Hypothesis testing l Establish NULL hypothesis (e.g. Values or Means are the same) l Establish ALTERNATIVE hypothesis, based on some expectation. l Test hypothesis. Try to reject NULL. l If null hypothesis is rejected, there is some support for the alternative (theory-based) hypothesis.
19
Uses of the standard deviation l Shorthand description : given the mean and s.d., we know where 67% of a random distribution lies. l A standardized measure : –a score of 80% can be good or bad, depending on the mean and s.d.
20
Testing the Mean l A test of means can establish whether two samples from a population are different from each other, or whether the different measures they have are the result of random variation.
21
Samples and populations l A sample is a set of measurements taken from a larger group or population. l Sample means and variances can serve as estimates for their populations.
22
Spatial analysis with GIS l GIS data description answers the question: Where? l GIS data analysis answers the question: Why is it there? l GIS data description is different from statistics because the results can be placed onto a map for visual analysis.
23
Spatial Statistical Description l For coordinates, the means and standard deviations correspond to the mean center and the standard distance l A centroid is any point chosen to represent a higher dimension geographic feature, of which the mean center is only one choice. l The standard distance for a set of point spatial measurements is the expected spatial error.
24
Spatial Statistical Description l For coordinates, data extremes define the two corners of a bounding rectangle.
25
Geographic extremes l Southernmost point in the continental United States. l Range: e.g. elevation difference; map extent
26
Mean Center mean y mean x
27
Centroid: mean center of a feature
28
GIS and Spatial Analysis l Descriptions of geographic properties such as shape, pattern, and distribution are often verbal l Quantitative measure can be devised, although few are computed by GIS. l GIS statistical computations are most often done using retrieval options such as buffer and spread. l Also by manipulating attributes with arithmetic commands (map algebra).
29
An example l Lower 48 United States l 1994 Data from the U.S. Census on gender l Gender Ratio = # females per 100 males l Range is 97 - 108 l What does the spatial distribution look like?
30
Gender Ratio by State: 1994
31
Searching for Spatial Pattern l A linear relationship is a predictable straight-line link between the values of a dependent and an independent variable. It is a simple model of the relationship. l A linear relation can be tested for goodness of fit with least squares methods. The coefficient of determination r-squared is a measure of the degree of fit, and the amount of variance explained.
32
Simple linear relationship dependent variable independent variable observation best fit regression line y = a + bx intercept gradient y=a+bx
33
Testing the relationship gr = 117.46 + 0.138 long.
34
Patterns in Residual Mapping l Differences between observed values of the dependent variable and those predicted by a model are called residuals. l A GIS allows residuals to be mapped and examined for spatial patterns. l A model helps explanation and prediction after the GIS analysis. l A model should be simple, should explain what it represents, and should be examined in the limits before use.
35
Mapping residuals from a model
36
Unexplained variance l More variables? l Different extent? l More records? l More spatial dimensions? l More complexity? l Another model? l Another approach?
37
GIS and Spatial Analysis l Many GIS systems have to be coaxed to generate a full set of spatial statistics.
38
Analytic Tools and GIS l Tools for searching out spatial relationships and for modeling are only lately being integrated into GIS. l Statistical and spatial analytical tools are also only now being integrated into GIS, and many people use separate software systems outside the GIS: “loosely coupled” analyses.
39
Analytic Tools and GIS l Real geographic phenomena are dynamic, but GISs have been mostly static. Time-slice and animation methods can help in visualizing and analyzing spatial trends. l GIS organizes real-world data to allow numerical description and allows the analyst to model, analyze, and predict with both the map and the attribute data.
40
You can lie with... l Maps l Statistics l Correlation is not causation!
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.