Published by Angela Peters. Modified over 9 years ago.
1
Spatial Data Mining
Satoru Hozumi
CS 157B
2
Learning Objectives
- Understand the concept of spatial data mining
- Learn techniques for finding spatial patterns
3
Examples of Spatial Patterns
- 1854 Asiatic cholera outbreak in London
  - A water pump identified as the source
- Cancer clusters, used to investigate health hazards
- Crime hot spots, used to plan police patrol routes
- Effects on US weather caused by unusual warming of the Pacific Ocean (El Niño)
4
What is a Spatial Pattern?
- What is not a pattern?
  - Random, haphazard, chance, stray, accidental, unexpected
  - Without definite direction, trend, rule, method, design, aim, purpose
- What is a pattern?
  - A frequent arrangement, configuration, composition, regularity
  - A rule, law, method, design, description
  - A major direction, trend, prediction
5
Defining Spatial Data Mining
- Search for spatial patterns
- Non-trivial search, as “automated” as possible
  - Large search space of plausible hypotheses
  - Ex. Asiatic cholera: candidate causes include water, food, air, insects
- Interesting, useful, and unexpected spatial patterns
  - Useful in a certain application domain
    - Ex. Shutting off the identified water pump => saved human lives
  - May provide a new understanding of the world
    - Ex. The connection between the water pump and cholera led to the “germ” theory
6
What is NOT Spatial Data Mining
- Simple querying of spatial data
  - Finding neighbors of Canada given names and boundaries of all countries (search space is not large)
- Uninteresting or obvious patterns
  - Heavy rainfall in Minneapolis is correlated with heavy rainfall in St. Paul (10 miles apart)
  - Common knowledge: nearby places have similar rainfall
- Mining of non-spatial data
  - Diaper sales and beer sales are correlated in evenings
7
Families of Spatial Data Mining Patterns
- Location prediction
  - Where will a phenomenon occur?
- Spatial interactions
  - Which subsets of spatial phenomena interact?
- Hot spots
  - Which locations are unusual or share commonalities?
8
Location Prediction
- Where will a phenomenon occur?
- Which spatial events are predictable?
- How can a spatial event be predicted from other spatial events?
- Examples
  - Where will an endangered bird nest?
  - Which areas are prone to fire, given maps of vegetation and drought?
  - What should be recommended to a traveler in a given location?
9
Spatial Interactions
- Which spatial events are related to each other?
- Which spatial phenomena depend on other phenomena?
- Examples
  - Earth science: climate and disturbance => {wild fires, hot, dry, lightning}
  - Epidemiology: disease type and environmental events => {West Nile disease, stagnant water source, dead birds, mosquitoes}
10
Hot Spots
- Is a phenomenon spatially clustered?
- Which spatial entities are unusual or share common characteristics?
- Examples
  - Crime hot spots to plan police patrols
11
Spatial Queries
- Spatial range queries
  - Find all cities within 50 miles of Paris
  - Query has an associated region (location, boundary)
  - Answer includes overlapping or contained data regions
- Nearest-neighbor queries
  - Find the 10 cities nearest to Paris
  - Results must be ordered by proximity
- Spatial join queries
  - Find all cities near a lake
  - Join condition involves regions and proximity
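The three query types can be sketched with brute-force scans over point data. This is only an illustration: the coordinates, names, and the planar `distance` helper are made up, and a real spatial database would normally answer these queries through a spatial index rather than a full scan.

```python
import math

# Hypothetical planar coordinates in miles; these are not real locations.
cities = {"A": (0, 0), "B": (30, 30), "C": (60, 80), "D": (3, 4)}
lakes = {"L1": (5, 5), "L2": (100, 100)}

def distance(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def range_query(points, center, radius):
    """Spatial range query: names of all points within `radius` of `center`."""
    return sorted(n for n, p in points.items() if distance(p, center) <= radius)

def nearest_neighbors(points, center, k):
    """Nearest-neighbor query: the k closest names, ordered by proximity."""
    return sorted(points, key=lambda n: distance(points[n], center))[:k]

def spatial_join(left, right, radius):
    """Spatial join: all (left, right) pairs whose points lie within `radius`."""
    return sorted(
        (a, b)
        for a, pa in left.items()
        for b, pb in right.items()
        if distance(pa, pb) <= radius
    )

print(range_query(cities, (0, 0), 50))       # cities within 50 miles of the origin
print(nearest_neighbors(cities, (0, 0), 2))  # the 2 cities nearest the origin
print(spatial_join(cities, lakes, 10))       # cities within 10 miles of a lake
```

Note that only the nearest-neighbor query has to preserve ordering by distance; the other two return unordered sets, which are sorted by name here just to make the output stable.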
12
Unique Properties of Spatial Patterns
- Items in traditional data are independent of each other, whereas properties of locations in a map are often “auto-correlated” (patterns exist)
- Traditional data deals with simple domains, e.g. numbers and symbols, whereas spatial data types are complex
- Items in traditional data describe discrete objects, whereas spatial data is continuous
13
Association Rules
- Support = the number of times a rule shows up in a database
- Confidence = conditional probability of Y given X
- Example
  - (Bedrock type = limestone), (soil depth = low) => (sink hole risk = high)
  - Support = 20%, confidence = 0.8
  - Interpretation: locations with limestone bedrock and low soil depth have a high risk of sink hole formation
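Support and confidence can be computed directly from a set of records. The sketch below uses five invented location records (not the data behind the slide's 20% / 0.8 figures) and reports support as a fraction of records, which is how the slide's example states it.

```python
# Hypothetical records: each location is described by a set of predicates.
records = [
    {"limestone", "shallow_soil", "high_risk"},
    {"limestone", "shallow_soil", "high_risk"},
    {"limestone", "shallow_soil"},
    {"limestone", "deep_soil"},
    {"granite", "shallow_soil"},
]

def support(records, itemset):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for r in records if itemset <= r) / len(records)

def confidence(records, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(X and Y) / support(X)."""
    base = support(records, antecedent)
    if base == 0:
        return 0.0
    return support(records, set(antecedent) | set(consequent)) / base

x = {"limestone", "shallow_soil"}
y = {"high_risk"}
rule_support = support(records, x | y)
rule_confidence = confidence(records, x, y)
print(rule_support, rule_confidence)
```

Here two of the five records satisfy the whole rule and three satisfy the antecedent, so the rule's support is 2/5 and its confidence is 2/3.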
14
Apriori Algorithm to Mine Association Rules
- Key challenge
  - Very large search space
- Key assumption
  - Few associations have support above a given threshold
  - Associations with low support are not interesting
- Key insight
  - If an association item set has high support, then so do all its subsets
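The key insight is what keeps the search tractable: a candidate item set only needs to be counted if all of its smaller subsets were already found frequent. A minimal sketch of that level-wise search follows; the transactions and the `min_support` value are invented for illustration, and the code finds frequent item sets only (rule generation is left out).

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while candidates:
        # Count each surviving candidate and keep those above the threshold.
        level = {}
        for c in candidates:
            s = sum(1 for t in transactions if c <= t) / n
            if s >= min_support:
                level[c] = s
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets into k-itemsets ...
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # ... and prune any candidate that has an infrequent (k-1)-subset.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent

# Hypothetical transactions over three abstract items.
transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"A"}, {"B", "C"}]
frequent = apriori(transactions, min_support=0.3)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

With this data every single item and every pair is frequent, but the triple {A, B, C} appears in only one of five transactions and is dropped.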
15
Association rules Example
16
Techniques for Association Mining
- Classical method
  - Association rules given item types and transactions
  - Assumes spatial data can be decomposed into transactions
  - Such decomposition may alter spatial patterns
- New spatial methods
  - Spatial association rule
  - Spatial co-location
17
Associations, Spatial associations, co-location
18
Associations, Spatial associations, co-location
19
Co-location Rules
- For point data in space
- Does not need transactions; works directly with continuous space
- Uses a neighborhood definition and spatial joins
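One way to make this concrete is to define "neighbor" as "within a fixed distance" and measure how often instances of two feature types occur near each other. The sketch below uses the participation index, a measure commonly used for co-location patterns but not named on the slide, so treat it as an assumption. The point data, feature names, and distance threshold are all hypothetical.

```python
import math

def is_neighbor(p, q, max_dist):
    """Neighborhood definition: two points are neighbors if within max_dist."""
    return math.hypot(p[0] - q[0], p[1] - q[1]) <= max_dist

def participation_index(points, type_a, type_b, max_dist):
    """Smaller of the two fractions: A instances with a B neighbor, and vice versa."""
    a_pts = [xy for t, xy in points if t == type_a]
    b_pts = [xy for t, xy in points if t == type_b]
    if not a_pts or not b_pts:
        return 0.0
    # A brute-force spatial join between the two feature types.
    a_hits = sum(1 for a in a_pts if any(is_neighbor(a, b, max_dist) for b in b_pts))
    b_hits = sum(1 for b in b_pts if any(is_neighbor(a, b, max_dist) for a in a_pts))
    return min(a_hits / len(a_pts), b_hits / len(b_pts))

# Hypothetical point events: (feature type, (x, y)).
points = [
    ("water", (0, 0)), ("water", (10, 10)),
    ("mosquito", (0, 1)), ("mosquito", (10, 11)), ("mosquito", (50, 50)),
    ("bird", (100, 100)),
]
print(participation_index(points, "water", "mosquito", max_dist=2))
print(participation_index(points, "water", "bird", max_dist=2))
```

No transactions are constructed anywhere: the neighbor relation over continuous coordinates plays the role that "appears in the same transaction" plays in classical association mining.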
20
Co-location rules
21
Clustering
- Process of discovering groups in large databases
- Spatial view: rows in a database = points in a multi-dimensional space
- Visualization may reveal interesting groups
22
Clustering
- Hierarchical
  - All points in one cluster
  - Split and merge till a stop criterion is reached
- Partitional
  - Start with random central points
  - Assign points to nearest central point
  - Update the central points
  - Approach with statistical rigor
- Density
  - Find clusters based on density of regions
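The partitional steps (pick central points, assign, update) match the familiar k-means loop. The slide does not name a specific algorithm, so the following is a k-means-style sketch under that assumption, with invented 2-D points.

```python
import math
import random

def kmeans(points, centers, max_iter=100):
    """Alternate assignment and update steps until the centers stop moving."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest central point.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its assigned points.
        # An empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

# Hypothetical data: two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# "Start with random central points": sample two of the data points.
start = random.Random(0).sample(points, 2)
centers, clusters = kmeans(points, start)
print(centers)
print(clusters)
```

The result of a partitional method can depend on the starting centers, which is why the starting points are passed in explicitly here rather than hidden inside the function.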
23
Outliers
- Observations inconsistent with the rest of the dataset
- Observations inconsistent with their neighborhoods
- A local instability or discontinuity
24
Variogram Cloud
- Create a variogram by plotting (attribute difference, distance) for each pair of points
- Select points common to many outlying pairs
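A rough sketch of the idea, without the plot: compute the (distance, attribute difference) pair for every pair of locations, treat pairs that are close together yet very different as "outlying", and count how often each location appears in such pairs. The sample data, the use of the absolute difference, and both thresholds are assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations

def variogram_cloud(samples):
    """One (name_a, name_b, distance, |value difference|) tuple per pair of samples."""
    cloud = []
    for (a, (pa, va)), (b, (pb, vb)) in combinations(samples.items(), 2):
        cloud.append((a, b, math.dist(pa, pb), abs(va - vb)))
    return cloud

def outlying_pair_counts(cloud, max_dist, min_diff):
    """Count, per location, the pairs that are nearby yet very different."""
    counts = Counter()
    for a, b, dist, diff in cloud:
        if dist <= max_dist and diff >= min_diff:
            counts[a] += 1
            counts[b] += 1
    return counts

# Hypothetical samples along a line: name -> ((x, y), attribute value).
samples = {
    "s1": ((0, 0), 10), "s2": ((1, 0), 11), "s3": ((2, 0), 50),
    "s4": ((3, 0), 12), "s5": ((4, 0), 11),
}
cloud = variogram_cloud(samples)
counts = outlying_pair_counts(cloud, max_dist=1.5, min_diff=20)
print(counts.most_common())
```

In this data `s3` takes part in both outlying pairs while its neighbors take part in one each, so `s3` is the location "common to many outlying pairs".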
25
Moran Scatter Plot
- Plot (normalized attribute value, weighted average in the neighborhood) for each location
- Select points in the upper left and lower right quadrants
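The quadrant test can be written without drawing anything: a location falls in the upper left or lower right quadrant exactly when its normalized value and its neighborhood average have opposite signs. The sketch below assumes equal weights for all neighbors and uses an invented neighbor graph (a low-valued group, a high-valued group, and one low value `x` sitting among the high ones).

```python
import statistics

def moran_scatter(values, neighbors):
    """Return (z, lag): normalized values and the mean z-score of each neighborhood."""
    names = list(values)
    data = [values[n] for n in names]
    mean = statistics.fmean(data)
    std = statistics.pstdev(data)
    z = {n: (values[n] - mean) / std for n in names}
    # Equal neighbor weights are an assumption; other weightings are possible.
    lag = {n: statistics.fmean([z[m] for m in neighbors[n]]) for n in names}
    return z, lag

def moran_outliers(values, neighbors):
    """Locations whose z-score and neighborhood average have opposite signs."""
    z, lag = moran_scatter(values, neighbors)
    return sorted(n for n in values if z[n] * lag[n] < 0)

# Hypothetical attribute values and neighbor lists.
values = {"a": 10, "b": 11, "c": 10, "d": 11,
          "e": 30, "f": 31, "g": 30, "h": 31, "x": 10}
neighbors = {
    "a": ["b", "c", "d"], "b": ["a", "c", "d"],
    "c": ["a", "b", "d"], "d": ["a", "b", "c"],
    "e": ["f", "g", "h", "x"], "f": ["e", "g", "h", "x"],
    "g": ["e", "f", "h", "x"], "h": ["e", "f", "g"],
    "x": ["e", "f", "g"],
}
print(moran_outliers(values, neighbors))
```

One caveat worth knowing: when neighborhoods are small, the neighbors of a strong outlier can land in these quadrants too, because the outlier dominates their neighborhood average.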
26
Scatter Plot
- Plot (normalized attribute value, weighted average in the neighborhood) for each location
- Fit a linear regression line
- Select points which are unusually far from the regression line
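As a sketch of this procedure, the code below fits an ordinary least-squares line through (attribute value, neighborhood average) pairs and flags locations whose residual is large relative to the spread of all residuals. The input pairs are invented, and the cutoff of two standard deviations is an arbitrary choice standing in for "unusually far".

```python
import statistics

def regression_outliers(points, threshold=2.0):
    """Names of points whose residual exceeds `threshold` residual standard deviations."""
    names = list(points)
    xs = [points[n][0] for n in names]
    ys = [points[n][1] for n in names]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    # Least-squares slope and intercept for y = intercept + slope * x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    spread = statistics.pstdev(residuals)
    return sorted(n for n, r in zip(names, residuals) if abs(r) > threshold * spread)

# Hypothetical (attribute value, neighborhood average) pairs per location.
points = {
    "a": (10, 10.67), "b": (11, 10.33), "c": (10, 10.67), "d": (11, 10.33),
    "e": (30, 25.5), "f": (31, 25.25), "g": (30, 25.5), "h": (31, 30.33),
    "x": (10, 30.33),
}
print(regression_outliers(points))
```

Location `x` has a low value but a high neighborhood average, so it sits far above the fitted line while the other points stay close to it.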
27
Conclusion
- Patterns are the opposite of random
- Common spatial patterns:
  - Location prediction
  - Feature interaction
  - Hot spot
- Spatial patterns may be discovered using:
  - Techniques like associations, clustering and outlier detection