Published by Chastity Rose. Modified over 8 years ago.
1
DC G. P. Patil
2
This report is very disappointing. What kind of software are you using?
3
Space Age and Stone Age Syndrome
Data: Space Age / Stone Age
Analysis: Space Age / Stone Age
[Table: Data (Space Age, Stone Age) against Analysis (Space Age, Stone Age); cell marks as extracted: Space Age ++, Stone Age +]
5
National Mortality Maps and Health Statistics
Health Service Areas, Counties, Zip Codes, …
Geographical Patterns for Health Resource Allocation
Study Areas for Putative Sources of Health Hazard
–Balance between dilution effect and edge effect
Case Event Analysis and Ecological Analysis
–Thresholds, contours, corresponding data
Regional Comparisons and Rankings with Multiple Indicators/Criteria
Choices of Reference/Control Areas
7
Baltimore Asthma Project
Interdisciplinary Analysis of Childhood Asthma in Baltimore, MD
1. Collect and integrate in-situ measurements, remotely sensed measurements and clinical records that have possible relationships to the occurrence of asthma in the Baltimore, Maryland region.
2. Identify key trigger variables from the data to predict asthma occurrence on a spatial and temporal basis.
3. Organize a multidisciplinary team to assist in model design, analysis and interpretation of model results.
4. Develop tools for integrating, accessing and manipulating relevant health and remote sensing data and make these tools available to the scientific and health communities.
Partners:
Baltimore City Health Department
Baltimore City School System
Baltimore City Planning Council, Mayor’s Office
State of Maryland Department of the Environment
State of Maryland Department of Health and Human Services
University of Maryland
[Figure: Asthma Assessment from GIS techniques; labels: Inner Harbor, Time, Aerosol Size]
The impact of asthma is escalating within the U.S., and children are particularly affected, with hospitalizations increasing 74% since 1979. This study is investigating climate and environmental links to asthma in Baltimore, Maryland, a city in the top quintile for children’s asthma in the U.S.
8
Urban Heat Islands Use of aircraft and spacecraft remote sensing data on a local scale to help quantify and map urban sprawl, land use change, urban heat island, air quality, and their impact on human health (e.g. pediatric asthma)
11
Infectious Diseases Use of remote sensing data and other available geospatial data on a continental scale to help evaluate landscape characteristics that may be precursors for vector-borne diseases leading to early warning systems involving landscape health, ecosystem health, and human health Water-Borne Diseases Air-Borne Diseases Emerging Infectious Diseases
12
Statistical Ecology, Environmental Statistics, Health Statistics – 2
Multiscale Advanced Raster Map Analysis System Initiative – 1
Geospatial Patterns and Pattern Metrics
–Landscape patterns, disease patterns, mortality patterns
Surface Topology and Spatial Structure
–Hotspots, outbreaks, critical areas
–Intrinsic hierarchical decomposition, study areas, reference areas
–Change detection, change analysis, spatial structure of change
13
Statistical Ecology, Environmental Statistics, Health Statistics – 3
Multiscale Advanced Raster Map Analysis System Initiative – 2
Partially Ordered Sets and Hasse Diagrams
–Multiple indicators, comparisons, fuzzy rankings
–Intrinsic hierarchical groups, reference areas
–Performance measures, composite indices
System Design and Development
–BAT, BPT, and synergistic collaboration
–Bilateral and multilateral partnerships
14
Multiscale Advanced Raster Map System MARMAP SYSTEM Design and Development PARTNERSHIP NSF Digital Government Research Program Proposal for Invited Re-Submission
15
Consider a 21st century digital government scenario of the following nature: What message does a remote sensing-derived land cover land use map have about the large landscape it represents? And at what scale and at what level of detail? Does the spatial pattern of the map reveal any societal, ecological, environmental condition of the landscape? And therefore can it be an indicator of change?
16
Consider a 21st century digital government scenario of the following nature: How do you automate the assessment of the spatial structure and behavior of change to discover critical areas, hot spots, and their corridors? Is the map accurate? How accurate is it? How do you assess the accuracy of the map? Of the change map over time for change detection?
17
Consider a 21st century digital government scenario of the following nature: What are the implications of the kind and amount of change and accuracy on what matters, whether climate change, carbon emission, water resources, urban sprawl, biodiversity, indicator species, early warning, or others? And with what confidence, even with a single map/change-map?
18
The needed partnership research is expected to find answers to these questions and a few more that involve multicategorical raster maps based on remote sensing and other geospatial data. It is also expected to design a prototype, user-friendly advanced raster map analysis system for digital governance.
20
Geospatial Cell-based Data Kinds of Data Cell as a Unit (Regular grid layout) –Categorical –Ordinal –Numerical –Multivariate Numerical Cell as an Object (Irregular cell sizes and shapes) –Partially Ordered –Ordinal –Numerical –Multivariate Numerical
21
Approaches to Research Issues
22
Landscape Pattern Extraction Spectral data Empirical extraction Thematic data Empirical extraction Spectral data Model-based extraction Thematic data Model-based extraction
23
Empirical Pattern Extraction Thematic Data Pattern = Spatial variability in thematic maps Proposed research limited to raster maps Empirical Pattern Extractors: –Landscape metrics (e.g., FRAGSTATS) –Multiscale fragmentation profiles (entropy-based) –Patch structure metrics Scaling domain detection
29
Model-based Pattern Extraction Pattern = Spatial variability in thematic maps Proposed research limited to raster maps Possible Parametric Models: –Geostatistics (Multi-indicator) –Markov Random Fields –Hierarchical Markov Transition Matrix models (HMTM)
30
Spatial Complexity with Single Response Variable Echelons Approach Echelon method analyzes cellular data pertaining to surface variables. Examines changes in topological connectivity of upper level sets as the level changes. Echelons elucidate spatial structure, help determine critical areas and corridors, emphasize areas of complexity, and map various aspects of surface organization Response can be numerical or ordinal Cellular tessellation can be regular or irregular
39
Echelons Description -- 1
Ingredients of an Echelon Analysis:
–Tessellation of a geographic region
–Response value Z on each cell
Determines a tessellated (piece-wise constant) surface with Z as elevation.
How does connectivity (number of connected components) of the tessellation change with elevation?
[Figure: tessellated region with cells a, b, c, d, e, f, g, h, i, j, k; a, b, c, … are cell labels]
40
Echelons Description -- 2
Think of the tessellated surface as a landform.
Initially the entire surface is under water.
As the water level recedes, more and more of the landform is exposed.
At each water level, cells are colored as follows:
–Green for previously exposed cells (green = vegetated)
–Yellow for newly exposed cells (yellow = sandy beach)
–Blue for unexposed cells (blue = under water)
For each newly exposed cell, one of three things happens:
–New island emerges. Cell is a local maximum. Morse index=2. Connectivity increases.
–Existing island increases in size. Cell is not a critical point. Connectivity unchanged.
–Two (or more) islands are joined. Cell is a saddle point. Morse index=1. Connectivity decreases.
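The receding-water description can be sketched in a few lines of code. This is only an illustration, not the presenters' implementation: the function name `classify_cells`, the union-find bookkeeping, and the seven-cell transect with its Z values are all assumptions made for the example. Cells are exposed from highest to lowest Z, and each one is labeled by how many existing islands it touches.

```python
def classify_cells(z, neighbors):
    """Expose cells from highest to lowest Z and classify each one.

    z: dict mapping cell -> response value (hypothetical data).
    neighbors: dict mapping cell -> list of adjacent cells.
    Returns (cell, event, number_of_islands) tuples in order of exposure.
    """
    parent = {}  # union-find forest over the cells exposed so far

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    events, islands = [], 0
    for cell in sorted(z, key=z.get, reverse=True):
        # Distinct islands touched by the newly exposed cell.
        roots = {find(n) for n in neighbors[cell] if n in parent}
        parent[cell] = cell
        for r in roots:
            parent[r] = cell  # merge every touched island into the new cell
        if not roots:
            event = "new island"      # local maximum: connectivity increases
        elif len(roots) == 1:
            event = "island grows"    # not a critical point
        else:
            event = "islands join"    # saddle point: connectivity decreases
        islands += 1 - len(roots)
        events.append((cell, event, islands))
    return events


# Hypothetical transect of seven cells in a row; cell i touches i-1 and i+1.
z = {0: 9, 1: 7, 2: 2, 3: 8, 4: 3, 5: 1, 6: 5}
neighbors = {i: [j for j in (i - 1, i + 1) if j in z] for i in z}
for cell, event, count in classify_cells(z, neighbors):
    print(cell, event, count)
```

Ties in Z are exposed one cell at a time here; a fuller treatment would have to decide how to handle plateaus.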
41
Echelons Illustrated -- 1
[Figure: tessellation with cells a to k beside the growing echelon tree (nodes a; b,c). Panel labels: Newly exposed island; Island grows]
42
Echelons Illustrated -- 2
[Figure: tessellation with cells a to k beside the echelon tree (nodes a; b,c; d; e; f,g). Panel labels: Second island appears; Both islands grow; New echelon]
43
Echelons Illustrated -- 3
[Figure: tessellation with cells a to k beside the echelon tree (nodes a; b,c; d; e; f,g; h; i,j,k). Panel labels: Islands join – saddle point; New echelon; Exposed land grows; Three echelons]
44
Echelons Illustrated -- 4
Each branch in the echelon tree determines an echelon.
Each echelon consists of cells in the tessellation.
The echelons partition the region.
Each echelon determines a set of response values Z (and a corresponding set of values of the explanatory variables X, if any).
[Figure: echelon tree (nodes a; b,c; d; e; f,g; h; i,j,k) and the corresponding echelon partitioning of the tessellation; three echelons]
45
Echelons Illustrated – 5
Higher Order Echelons
[Figure: receding waterline over the previous pictures; echelon tree labeled with echelon orders 1, 1, 1, 1, 1, 2, 2 (not 3), 2, 3]
46
Spatial Complexity with Single Response Variable Echelons Approach Issues to be addressed: Echelon trees and maps Echelon profiles and other tree metrics Noise effects and filtering Comparing echelon trees and maps Echelon stochastics: surface simulation, tree simulation, tree metric distributions
47
Pre-Classification Change Detection Echelons Approach Change vector approach (cell by cell) with actual spectral data Change vector approach (cell by cell) with compressed (hyperclustered) spectral data Pattern-based approach (compressed data only): Compare segment pattern at time1 with segment pattern at time 2.
48
Spatial Complexity with Multiple Indicators: Echelons Approach
Compare echelon features among indicators for consistency/inconsistency:
–Order
–Number of ancestors (distance from root of tree)
Compression by treating features as pseudo-bands
49
Geospatial Analysis for Disease Surveillance – 1
Case Event Point Data & Areal Unit Count Data
Geospatial surveillance
Cluster detection and evaluation
Spatial scan statistics
Choice of zonal parameter space
Candidate zones as circular windows of expanding size
Elliptical windows: Long Island breast cancer study
Hyperclusters, echelon trees, upper surface sets defined by thresholds-based nodes
50
Geospatial Analysis for Disease Surveillance – 2
Case Event Point Data & Areal Unit Count Data
Spatio-temporal surveillance
Cylinders-based spatio-temporal scan statistics
Three-dimensional echelons and echelon trees
Candidate zones as upper surface sets defined by thresholds-based nodes
Temporal persistence and patterns
Cluster alarms, suspect clusters, and their evaluation
51
Geospatial and Spatiotemporal Patterns of Change
Multiple cancer mortality statistics and maps
Multiple disease incidence statistics and maps
Across United States over years
Across individual states over years
Pooling over types of cancer/disease
Pooling over types of people
Change detection and change analysis
In space, in time, in space-time
Structure and behavior of change
Persistence and patterns of elevated areas
52
SaTScan – 1
Goal: Identify geographic zone(s) in which a response is significantly elevated relative to the rest of a region.
A list of candidate zones Z is specified a priori.
–This list becomes part of the parameter space and the zone must be estimated from within this list.
–Each candidate zone should generally be spatially connected, e.g., a union of contiguous spatial units or cells.
–Longer lists of candidate zones are usually preferable.
–Expanding circles or ellipses about specified centers are a common method of generating the list.
53
SaTScan – 5
Question: Are there data-driven (rather than a priori) ways of selecting the list of candidate zones?
Motivation for the question: A human being can look at a map and quickly determine a reasonable set of candidate zones and eliminate many other zones as obviously uninteresting. Can the computer do the same thing?
A data-driven proposal: Candidate zones are the connected components of the upper level sets of the response surface. The candidate zones have a tree structure (the echelon tree is a subtree), which may assist in automated detection of multiple, but geographically separate, elevated zones.
Null distribution: If the list is data-driven (i.e., random), its variability must be accounted for in the null distribution. A new list must be developed for each simulated data set.
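A minimal sketch of the data-driven proposal, under stated assumptions: the function name `upper_level_zones`, the adjacency representation, and the seven-cell example are hypothetical and are not part of SaTScan. For every observed response level, it collects the connected components of the set of cells at or above that level.

```python
def upper_level_zones(z, neighbors):
    """Return the distinct connected components of all upper level sets.

    z: dict mapping cell -> response value (e.g., a disease rate).
    neighbors: dict mapping cell -> list of adjacent cells.
    """
    zones = set()
    for level in sorted(set(z.values()), reverse=True):
        exposed = {c for c in z if z[c] >= level}
        seen = set()
        for start in exposed:
            if start in seen:
                continue
            # Depth-first search restricted to cells at or above the level.
            component, stack = set(), [start]
            while stack:
                c = stack.pop()
                if c in component:
                    continue
                component.add(c)
                stack.extend(n for n in neighbors[c]
                             if n in exposed and n not in component)
            seen |= component
            zones.add(frozenset(component))
    return zones


# Hypothetical rates on seven cells in a row; cell i touches i-1 and i+1.
rates = {0: 9, 1: 7, 2: 2, 3: 8, 4: 3, 5: 1, 6: 5}
adjacency = {i: [j for j in (i - 1, i + 1) if j in rates] for i in rates}
candidate_zones = upper_level_zones(rates, adjacency)
print(sorted(sorted(zone) for zone in candidate_zones))
```

With distinct response values, each level contributes exactly one new component, so the list has at most as many zones as there are cells, far fewer than the number of connected subsets.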
54
Multiple Criteria Analysis
Multiple Indicators
Partial Ordering Procedures
Cells are objects of primary interest, such as countries, states, watersheds, counties, etc.
Cell comparisons and rankings are the goals.
A suite of indicators is available on each cell.
Different indicators have different comparative messages, i.e., partial instead of linear ordering.
Hasse diagrams for visualization of partial orders: a multi-level diagram whose top level of nodes consists of all maximal elements in the partially ordered set of objects. The next level consists of all maximal elements when the top level is removed from the partially ordered set, etc. Nodes are joined by segments when they are immediately comparable.
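The level construction just described can be sketched as follows. The dominance rule (at least as large on every indicator, strictly larger on one), the function names, and the five regions with their indicator values are assumptions made for illustration only.

```python
def dominates(x, y):
    """True if x is >= y on every indicator and > y on at least one."""
    return (all(a >= b for a, b in zip(x, y))
            and any(a > b for a, b in zip(x, y)))


def hasse_levels(indicators):
    """Peel off maximal elements repeatedly to obtain the diagram levels."""
    remaining = dict(indicators)
    levels = []
    while remaining:
        top = sorted(name for name, v in remaining.items()
                     if not any(dominates(w, v) for w in remaining.values()))
        levels.append(top)
        for name in top:
            del remaining[name]
    return levels


def cover_edges(indicators):
    """Pairs (upper, lower) that are immediately comparable."""
    edges = []
    for hi, x in indicators.items():
        for lo, y in indicators.items():
            if dominates(x, y) and not any(
                    dominates(x, w) and dominates(w, y)
                    for w in indicators.values()):
                edges.append((hi, lo))
    return sorted(edges)


# Hypothetical regions scored on two indicators (larger is better).
regions = {"A": (3, 3), "B": (2, 1), "C": (1, 2), "D": (1, 1), "E": (3, 0)}
print(hasse_levels(regions))
print(cover_edges(regions))
```

Here B, C, and E share a level because none dominates another, which is exactly the situation where a linear ranking would have to impose a choice.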
55
Multiple Criteria Analysis Multiple Indicators Partial Ordering Procedures Issues to be addressed: Crisp rankings, interval rankings, fuzzy rankings Fuzzy comparisons Echelon analysis of partially ordered sets with ordinal response levels determined by successive levels in the Hasse diagram Hasse diagram metrics: height, width, dimension, ambiguity (departure from linear order), etc. Hasse diagram stochastics (random structure on the indicators or random structure on Hasse diagram) Hasse diagram comparisons, e.g., compare Hasse diagrams for different regions
58
Hasse Diagram (all countries)
60
Ranking Partially Ordered Sets – 1 S = partially ordered set (poset) with elements a, b, c, …. How can we rank the elements of S consistent with the partial order? Such rankings are called linear extensions of the partial order. Different people with different perceptions and priorities may choose different rankings. How many rankings assign rank 1 to element a ? Rank 2 ? Rank 3 ?, etc. If rankings are chosen randomly (equal probability), what is the likelihood that element x receives rank i ?
61
Ranking Partially Ordered Sets – 2
An Example
[Figure: poset (Hasse diagram) on elements a, b, c, d, e, f]
Some linear extensions, with their jump sizes:
a c b e d f   (jump size 3)
a c e b d f   (jump size 1)
a b c d e f   (jump size 5)
b a c d e f   (jump size 4)
b a c e d f   (jump size 2)
A jump or imputed link (-------) is a link in the ranking that is not implied by the partial order.
62
Ranking Partially Ordered Sets – 3
In the example from the preceding slide, there are a total of 16 linear extensions, giving the following frequency table.
            Rank
Element     1    2    3    4    5    6    Totals
a           9    5    2    0    0    0    16
b           7    5    3    1    0    0    16
c           0    4    6    6    0    0    16
d           0    2    4    6    4    0    16
e           0    0    1    3    6    6    16
f           0    0    0    0    6   10    16
Totals     16   16   16   16   16   16
Each (normalized) row gives the rank-frequency distribution for that element.
Each (normalized) column gives a rank-assignment distribution across the poset.
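The frequency table can be reproduced by brute-force enumeration. One caveat: the Hasse diagram itself is not available in this text, so the cover relations in `COVERS` are an assumption, chosen because they are consistent with the 16 linear extensions and the rank frequencies quoted above. The function names are likewise illustrative.

```python
from itertools import permutations

ELEMENTS = "abcdef"
# ASSUMED cover relations (upper, lower); inferred, not given in the source.
COVERS = [("a", "c"), ("b", "d"), ("c", "e"), ("c", "f"), ("d", "f")]


def linear_extensions(elements, covers):
    """Yield every ordering in which each upper element precedes its lower one."""
    for order in permutations(elements):
        position = {x: i for i, x in enumerate(order)}
        if all(position[hi] < position[lo] for hi, lo in covers):
            yield order


def rank_frequencies(elements, covers):
    """Count how often each element receives each rank (rank 1 = first)."""
    table = {x: [0] * len(elements) for x in elements}
    total = 0
    for order in linear_extensions(elements, covers):
        total += 1
        for rank, x in enumerate(order):
            table[x][rank] += 1
    return table, total


table, total = rank_frequencies(ELEMENTS, COVERS)
print("linear extensions:", total)
for element in ELEMENTS:
    print(element, table[element])
```

Enumerating all permutations is only feasible for very small posets; it is used here purely to check the table.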
63
Ranking Partially Ordered Sets – 3a
Rank-Frequency Distributions
[Figure: six panels of frequency against rank, one each for elements a, b, c, d, e, f]
64
Ranking Partially Ordered Sets – 3b
Properties of Rank-Frequency Distributions
The rank-frequency distributions for any poset are unimodal.
In fact, there is a theorem which asserts that each rank-frequency distribution is log-concave, i.e., if $f_1$, $f_2$, and $f_3$ are the frequencies (relative or absolute) for any three consecutive ranks assigned to an element of the poset, then $f_2^2 \ge f_1 f_3$.
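As a quick check of the log-concavity property, the sketch below tests the inequality on the rank-frequency table of the six-element example (16 linear extensions). The helper name `is_log_concave` is an assumption; the counts are copied from that table.

```python
# Rank-frequency counts for the six-element example poset (ranks 1 to 6).
RANK_TABLE = {
    "a": [9, 5, 2, 0, 0, 0],
    "b": [7, 5, 3, 1, 0, 0],
    "c": [0, 4, 6, 6, 0, 0],
    "d": [0, 2, 4, 6, 4, 0],
    "e": [0, 0, 1, 3, 6, 6],
    "f": [0, 0, 0, 0, 6, 10],
}


def is_log_concave(freqs):
    """Check f2**2 >= f1 * f3 for every three consecutive ranks."""
    return all(freqs[i] ** 2 >= freqs[i - 1] * freqs[i + 1]
               for i in range(1, len(freqs) - 1))


for element, freqs in RANK_TABLE.items():
    print(element, is_log_concave(freqs))
```

For element a, for instance, the first triple gives 5² = 25 ≥ 9 × 2 = 18.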
65
Ranking Partially Ordered Sets – 5
Linear extension decision tree
[Figure: poset (Hasse diagram) on a, b, c, d, e, f and the decision tree whose 16 root-to-leaf paths are the linear extensions]
Jump sizes of the 16 linear extensions: 1 3 3 2 3 5 4 3 3 2 4 3 4 4 2 2
66
Ranking Partially Ordered Sets – 8
In many cases of practical interest e(S) is too large for actual enumeration in a reasonable length of time.
For example, the HEI data set has 141 countries arranged in a Hasse diagram with 14 levels and level sizes 16, 14, 15, 12, 16, 24, 10, 9, 10, 7, 2, 2, 3, 1.
This gives $8.6 \times 10^{105} \le e(S) \le 1.9 \times 10^{243}$, which is completely beyond present-day computational capabilities.
So what do we do?
Markov Chain Monte Carlo (MCMC) applied to the uniform distribution on the set of all linear extensions lets us estimate the normalized rank-frequency distributions.
Estimating the absolute frequencies (approximate counting) is also possible but somewhat more difficult.
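The slide does not say how the two bounds were obtained or which MCMC scheme is meant, so the sketch below rests on assumptions. The product of the level-size factorials (levels ordered one after another, each level permuted freely) and 141! (all orderings) happen to reproduce the quoted bounds. The sampler is a lazy adjacent-transposition chain, one standard way to target the uniform distribution on linear extensions; it is run on a small six-element poset whose cover relations are assumed for illustration.

```python
import math
import random

HEI_LEVEL_SIZES = [16, 14, 15, 12, 16, 24, 10, 9, 10, 7, 2, 2, 3, 1]


def extension_count_bounds(level_sizes):
    """Lower bound: permute within levels only. Upper bound: all orderings."""
    lower = math.prod(math.factorial(k) for k in level_sizes)
    upper = math.factorial(sum(level_sizes))
    return lower, upper


def mcmc_rank_frequencies(start, covers, steps, rng):
    """Estimate normalized rank frequencies with a lazy swap chain.

    start must already be a linear extension. Two adjacent elements that are
    comparable must form a cover pair, so checking covers is sufficient.
    """
    order, cover_set, n = list(start), set(covers), len(start)
    counts = {x: [0] * n for x in order}
    for _ in range(steps):
        if rng.random() < 0.5:  # lazy step keeps the chain aperiodic
            i = rng.randrange(n - 1)
            if (order[i], order[i + 1]) not in cover_set:
                order[i], order[i + 1] = order[i + 1], order[i]
        for rank, x in enumerate(order):
            counts[x][rank] += 1
    return {x: [c / steps for c in row] for x, row in counts.items()}


lower, upper = extension_count_bounds(HEI_LEVEL_SIZES)
print(f"{float(lower):.1e} <= e(S) <= {float(upper):.1e}")

# ASSUMED cover relations (upper, lower) for a six-element example poset.
EXAMPLE_COVERS = [("a", "c"), ("b", "d"), ("c", "e"), ("c", "f"), ("d", "f")]
estimate = mcmc_rank_frequencies("abcdef", EXAMPLE_COVERS, 100_000,
                                 random.Random(0))
print([round(p, 3) for p in estimate["a"]])
```

For the small example the exact answer is known by enumeration (element a receives rank 1 in 9 of 16 extensions), which gives a simple sanity check on the sampler before applying the same idea to a poset the size of the HEI data set.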
67
Cumulative Rank Frequency Operator – 6
An Example of the Procedure
[Figure: cumulative rank-frequency curves for elements a to f, vertical axis up to 16]
The curves are stacked one above the other and the result is a linear ordering of the elements: a > b > c > d > e > f
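The stacking of the curves can be checked numerically. This sketch assumes that the example refers to the six-element poset with 16 linear extensions tabulated earlier; the function name `crf_ranking` and the dominance test are illustrative choices, not the authors' definition of the operator.

```python
from itertools import accumulate

# Rank-frequency counts for the six-element example poset (ranks 1 to 6).
RANK_COUNTS = {
    "a": [9, 5, 2, 0, 0, 0],
    "b": [7, 5, 3, 1, 0, 0],
    "c": [0, 4, 6, 6, 0, 0],
    "d": [0, 2, 4, 6, 4, 0],
    "e": [0, 0, 1, 3, 6, 6],
    "f": [0, 0, 0, 0, 6, 10],
}


def crf_ranking(table):
    """Order elements by their cumulative rank-frequency curves.

    Returns (order, stacked); stacked is True when each curve lies on or
    above the next one everywhere, i.e. the curves do not cross.
    """
    crf = {x: list(accumulate(row)) for x, row in table.items()}

    def above(x, y):
        return crf[x] != crf[y] and all(p >= q for p, q in zip(crf[x], crf[y]))

    order = sorted(crf, key=lambda x: sum(above(x, y) for y in crf),
                   reverse=True)
    stacked = all(above(x, y) for x, y in zip(order, order[1:]))
    return order, stacked


order, stacked = crf_ranking(RANK_COUNTS)
print(" > ".join(order), "| curves stacked:", stacked)
```

When `stacked` comes back False the curves cross, so the comparison yields only a partial order and the procedure would need to be applied again.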
68
Cumulative Rank Frequency Operator – 7
An example where F must be iterated
[Figure: original poset (Hasse diagram) on elements a to h, followed by the posets obtained after applying F and F^2]
70
Ranking Possible Disease Clusters in the State of New York Data Matrix
71
Multiple Criteria Analysis
Multiple Indicators and Choices
Health Statistics
Disease Etiology, Health Policy, Resource Allocation
First stage screening
–Significant clusters by SaTScan and/or upper surface level echelon sets
Second stage screening
–Multicriteria noteworthy clusters by partially ordered sets and Hasse diagrams
Final stage screening
–Follow up clusters for etiology, intervention based on multiple criteria using Hasse diagrams
72
Markov Random Fields Raster Map Pr [ pixel response | all other pixels ] = Pr [ pixel response | neighboring pixels ] (Parametric Form)
73
DIG Model
Model Description-1
[Figure: latent surface Z over lattice points; labels: gets green, gets red]
74
DIG Model
Model Description-2
Grid with lattice points t
Standard normal Gaussian process Z(t) on the grid with correlation function ρ(h)
Partition A1, A2, …, Ak of Z-axis
Replace Z(t) by the disjunctive indicators of A1, A2, …, Ak, which determine a unique category (color code) for grid point t
75
DIG Model
Model Description-3
The surface values Z(t) are latent (hidden) and are not observable. Only the categories as determined by the partition sets are observable.
The parameters of the model are:
–Correlation function ρ(h)
–Partition sets A1, A2, …, Ak
In general, the partition sets are not intervals, but may be disjoint unions of intervals.
Standard normal is not a limitation (probability integral transform).
76
DIG Model Model Simulation Straightforward in concept, via the usual Cholesky or spectral decomposition of the variance-covariance matrix of Z(t) Only difficulty is the size of the map and the resulting size of the matrices. Usual solution is to generate Z(t) in blocks according to the range of spatial dependence
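A small-grid sketch of the Cholesky route to simulation follows. Several choices are assumptions made for illustration rather than details from the slides: the exponential correlation function exp(-h / range), partition sets taken to be plain intervals defined by cut points, and the function names. The Cholesky factorization is written out by hand so the example needs only the standard library.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L @ L.T == a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def simulate_dig(rows, cols, corr_range, cuts, rng):
    """Simulate a categorical raster by thresholding a latent Gaussian surface.

    corr_range: range parameter of the ASSUMED exponential correlation.
    cuts: increasing cut points; category = number of cuts below Z(t).
    """
    points = [(r, c) for r in range(rows) for c in range(cols)]
    cov = [[math.exp(-math.dist(p, q) / corr_range) for q in points]
           for p in points]
    low = cholesky(cov)
    white = [rng.gauss(0.0, 1.0) for _ in points]
    # Correlated latent values Z = L @ white noise.
    z = [sum(low[i][k] * white[k] for k in range(i + 1))
         for i in range(len(points))]
    cats = [sum(v > cut for cut in cuts) for v in z]
    return [cats[r * cols:(r + 1) * cols] for r in range(rows)]


for row in simulate_dig(6, 6, 2.0, [-0.5, 0.5], random.Random(1)):
    print(row)
```

The covariance matrix has one row per pixel, so its size grows with the square of the number of pixels, which is why larger maps call for block-wise generation.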
77
DIG Model
Accuracy Assessment
Two overlaid maps
–t: true/reference values/categories
–d: declared categories
Same latent surface Z(s), same correlogram ρ(h)
Separate overlaid transitionograms
78
DIG Model
Accuracy Assessment
[Figure: latent surface Z(s) with overlaid transitionograms for d and t; labels: Red gets Red, Yellow gets Red]
Compiling t, d matches/mismatches in overlaid transitionograms gives the model-predicted error/confusion matrix.
79
HMTM Model Employs a duality between Spatial Transitions in the raster map and Hierarchical Transitions in the model
80
Spatial Transitions Scan map to obtain a matrix of transition frequencies between pixels a fixed distance apart (horiz. or vert.)
81
Auto-Association Matrices
Analogue of variogram for categorical responses
Symmetric, k x k where k = number of mapping categories
One auto-association matrix for each distance scale: adjacent pixels, 2 pixels apart, 4 pixels apart
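Scanning a map for transition frequencies at a fixed lag might look like the sketch below. The symmetrization (each pixel pair counted in both directions) and the normalization to relative frequencies are assumptions, as are the function name and the toy map.

```python
def auto_association(grid, k, lag):
    """Symmetric k x k matrix of category co-occurrence at a given lag.

    grid: list of rows of category codes 0..k-1.
    lag: pixel separation, applied horizontally and vertically.
    """
    counts = [[0] * k for _ in range(k)]
    rows, cols = len(grid), len(grid[0])
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, lag), (lag, 0)):
                r2, c2 = r + dr, c + dc
                if r2 < rows and c2 < cols:
                    a, b = grid[r][c], grid[r2][c2]
                    counts[a][b] += 1  # count the pair in both directions
                    counts[b][a] += 1  # so the matrix is symmetric
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


# Hypothetical two-category map; one matrix per distance scale.
land = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
for lag in (1, 2):
    print(lag, auto_association(land, 2, lag))
```

On this checkerboard of 2 x 2 blocks, like categories dominate at lag 1 and unlike categories at lag 2, which is the kind of scale dependence the set of matrices is meant to capture.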
82
HMTM Model Hierarchical Transitions Model generates a hierarchical sequence of raster maps, all having the same spatial extent Hierarchical Level 0 Hierarchical Level 1 Hierarchical Level 2
83
Assigning Categories to Pixels Assignment at coarsest scale is a random draw from the marginal land-cover distribution:
84
Assigning Categories to Pixels
Assignment at finer scales is via k by k row-stochastic matrices G.
[Figure: mother cell (category i) and its 4 daughter cells]
The transition is determined by 4 draws from the ith row of G.
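The coarse-to-fine assignment can be sketched as a quadtree-style simulation. Assumptions for this illustration: the coarsest level is a single cell, the four draws are independent, and the names `simulate_hmtm`, `marginal`, and `transitions` are hypothetical.

```python
import random


def simulate_hmtm(marginal, transitions, rng):
    """Simulate a categorical raster level by level.

    marginal: land-cover probabilities for the single coarsest cell.
    transitions: list of k x k row-stochastic matrices G, coarse to fine.
    Returns a 2**L x 2**L grid, where L = len(transitions).
    """
    cats = range(len(marginal))
    grid = [[rng.choices(cats, weights=marginal)[0]]]
    for g in transitions:
        size = len(grid)
        finer = [[0] * (2 * size) for _ in range(2 * size)]
        for r in range(size):
            for c in range(size):
                # Four daughters drawn from the row of G indexed by the
                # mother cell's category.
                d = rng.choices(cats, weights=g[grid[r][c]], k=4)
                finer[2 * r][2 * c], finer[2 * r][2 * c + 1] = d[0], d[1]
                finer[2 * r + 1][2 * c], finer[2 * r + 1][2 * c + 1] = d[2], d[3]
        grid = finer
    return grid


# Hypothetical two-category example with the same G at every level.
G = [[0.9, 0.1],
     [0.2, 0.8]]
for row in simulate_hmtm([0.5, 0.5], [G, G, G], random.Random(3)):
    print(row)
```

Using the same G at every level, as in this usage example, corresponds to the self-similar case; a fitted model would generally supply a different matrix per level.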
85
HMTM Model Fitting One matrix G for each transition in the hierarchy Estimated recursively from the auto-association matrices: Only matrix algebra and eigen-decomposition required
86
HMTM Simulation Alias-Urn Methods Quadtree Ordering of Pixels Very Fast
87
Applications of HMTM Model Fragmentation Profiles –Model predicted profiles –Confidence bands Variability of Landscape Metrics –Fragstats –Perimeter-Area exponents –Patch Structure
88
Applications of HMTM Model Self-Similarity –HMTM definition ( transition matrices G equal) –Formal Hypothesis Tests Parameter Reduction –Equally-spaced Eigenvalues
89
Applications of HMTM Model Eigenvectors as Landscape Metrics –Marginal land-cover distribution –Orthogonality (PCA) –Contrasts on land-cover categories
92
Thematic Accuracy Assessment Model-based Error Maps Thematic raster data consists of declared category (d) on each grid cell and reference (“true”) categories (t) on selected cells Model for joint (d,t) map is fitted. Repeated conditional simulation of fitted model yields a simulated probability distribution for each missing t-value. Models under consideration: MIG, MRF, HMTM
93
MARMAP SYSTEM Software Design and Development Data Resource Partners Active Data Repository Data Intensive Computing Programming Tools and Environments Visualization Tools
94
MARMAP SYSTEM Software Design and Development Algorithm development Computer programming/Coding User interface design and implementation User output/Visualization Documentation/On-line help Other considerations: –Supported platforms (Windows, UNIX ?, LINUX ?) –Programming languages ? (C/C++, Java, Visual Basic, Delphi, etc.) –Software distribution (CD, Website)
95
CENTER FOR GEOSPATIAL INFORMATICS AND STATISTICS -proposed federal partnership- NSF: DGP, FRG, ITR, SDSC-NPACI NASA, USGS, EPA, USFS, DOT, NCHS, CENSUS, NIMA, DOD
96
Case Study – NASA - PSU Issues Involved: Landcover classification –with available spectral image(s) –with a previous map and current spectral image –with fine or coarse segmentation Multi-period change detection Data Integration
97
Case Study – EPA – PSU Issues Involved: Indicators of Watershed Ecosystem Health Multiple Landscape Fragmentation Analysis Echelon Analysis of Spatial Structure and Behavior Multiscale Bivariate Raster Map Analysis Regional Human Environment Index: Formulation, Visualization, Evaluation, and Validation
98
Case Study –UNEP - PSU Nationwide Human Environment Index worldwide Construction and Evaluation of HEI Multiple Indicators and Comparisons without Integration of Indicators Hasse Diagrams, fuzzy rankings, and visualizations Handbook Interactive Queries
99
Partnership Synergistics Concept Prototype Software Implementation Pilot Tests Feedback Case Studies
100
Partnership Synergistics PI/CO-PI MG- PG CG CSG
101
Partnership Synergistics Methodology Group Concepts, Issues, Approaches, Methods Prototype Group Techniques, Algorithms, Routines Methodology Group Refinement, Adaptation, Development MARMAP SYSTEM VALIDATION MG, PG, CG, CSG Computational Group Data Management, Software Design and Development Case Studies Data Resources Issues Answers
102
Logo for Statistics, Ecology, Environment, and Society