Published by Betty Fields. Modified over 8 years ago.
1
This report is very disappointing. What kind of software are you using?
2
Space Age and Stone Age Syndrome
Data: Space Age/Stone Age
Analysis: Space Age/Stone Age

                      Data: Space Age   Data: Stone Age
Analysis: Space Age   +                 +
Analysis: Stone Age   +
3
The Value of Mapping Maps provide an efficient and unique method of demonstrating distributions of phenomena in space. Though [maps are] constructed primarily to show facts, to show spatial distributions with an accuracy which cannot be attained in pages of description or statistics, their prime importance is as research tools. They record observations in succinct form; they aid analysis; they stimulate ideas and aid in the formation of working hypotheses; they make it possible to communicate findings; they assist in research and policy research.
4
Disease Mapping Disease Mapping is about the use and interpretation of maps showing the incidence or prevalence of disease. Disease data occur either as individual cases or as groups (or counts) of cases within census tracts. Any disease map must be considered with the appropriate background population which gives rise to the incidence. Maps answer the question: where? They can reveal spatial patterns not easily recognized from lists of statistical data. Maps showing infectious diseases can help elucidate the cause of disease. Maps showing non-infectious diseases may be used to generate hypotheses of disease causation.
6
National Mortality Maps and Health Statistics
Health Service Areas, Counties, Zip Codes, …
Geographical Patterns for Health Resource Allocation
Study Areas for Putative Sources of Health Hazard
–Balance between dilution effect and edge effect
Case Event Analysis and Ecological Analysis
–Thresholds, contours, corresponding data
Regional Comparisons and Rankings with Multiple Indicators/Criteria
Choices of Reference/Control Areas
8
Baltimore Asthma Project
Interdisciplinary Analysis of Childhood Asthma in Baltimore, MD
1. Collect and integrate in-situ measurements, remotely sensed measurements and clinical records that have possible relationships to the occurrence of asthma in the Baltimore, Maryland region.
2. Identify key trigger variables from the data to predict asthma occurrence on a spatial and temporal basis.
3. Organize a multidisciplinary team to assist in model design, analysis and interpretation of model results.
4. Develop tools for integrating, accessing and manipulating relevant health and remote sensing data and make these tools available to the scientific and health communities.
Partners:
–Baltimore City Health Department
–Baltimore City School System
–Baltimore City Planning Council, Mayor’s Office
–State of Maryland Department of the Environment
–State of Maryland Department of Health and Human Services
–University of Maryland
[Figure: Asthma assessment from GIS techniques; labels: Inner Harbor, Time, Aerosol Size]
The impact of asthma is escalating within the U.S., and children are particularly affected, with hospitalization increasing 74% since 1979. This study is investigating climate and environmental links to asthma in Baltimore, Maryland, a city in the top quintile for children’s asthma in the U.S.
9
Urban Heat Islands Use of aircraft and spacecraft remote sensing data on a local scale to help quantify and map urban sprawl, land use change, urban heat island, air quality, and their impact on human health (e.g. pediatric asthma)
10
Infectious Diseases Use of remote sensing data and other available geospatial data on a continental scale to help evaluate landscape characteristics that may be precursors for vector-borne diseases leading to early warning systems involving landscape health, ecosystem health, and human health Water-Borne Diseases Air-Borne Diseases Emerging Infectious Diseases
12
Mekong Malaria and Filariasis Projects
Objectives:
–To develop a predictive model for assessing risk areas of malaria transmission in the Greater Mekong Sub-region
–To make risk maps for filariasis
–To map breeding sites for major vector species
–To explore the linkage between vector population density and disease transmission intensity with environmental variables
Anticipated Benefits
–Reduce malaria and filariasis transmission rates
–Minimize environmental damage by strategically using larvicides and insecticides
–Improve the health status and economic activity of populations affected by malaria and filariasis in the Greater Mekong Sub-region
Source: Southeast Asian Journal of Tropical Medicine and Public Health, volume 30 supplement 4, 1999.
13
African Dust
Quantities of African dust transported by winds across the Atlantic have been increasing due to prolonged drought and agricultural practices in North Africa.
Recent studies show dust carries microbes and pollutants that have been detected in the US and Caribbean Islands.
Objectives of new studies are to determine harmful effects, e.g., childhood asthma in Puerto Rico.
14
Vector-Borne Disease Detection Using NASA Satellite Data
[Figure: NDVI anomaly patterns over Africa during the 1997/98 ENSO warm event]
Research program on the relationships between environmental parameters (e.g., vegetation), climate (e.g., rainfall) and outbreaks of diseases such as: Rift Valley Fever (RVF), St. Louis Encephalitis Fever (EHF), Dengue Fever, Ebola Fever, Hanta Virus and others
BENEFITS
–Map and monitor eco-climatic patterns associated with disease outbreaks from satellite platforms
–Better understanding of the dynamics of climate-disease interactions
–Advance warning of disease outbreaks would enable preventive measures (vaccination, vector control, etc.) to be undertaken
–Provide disease surveillance tools to public health authorities
Using near real-time climate data and satellite imagery, scientists have discovered environmental triggers for Rift Valley Fever and other diseases.
Prediction of Rift Valley Fever outbreaks may be made up to 5 months in advance in Africa.
15
A New NASA Initiative... To apply Space-based capabilities to examine environmental conditions that affect human health To enable easy use of and timely access to Earth science data and models To help our health community partners to develop practical early warning systems
16
Statistical Ecology, Environmental Statistics, Health Statistics—1 Sampling, Monitoring, and Observational Economy Initiatives — Twentieth Century— Capture-Mark-Recapture Composite, Ranked Set Adaptive with Clusters and Networks Transect, Selection Bias, Meta-Analysis Partnerships:
17
Geospatial Patterns and Pattern Metrics –Landscape patterns, disease patterns, mortality patterns Surface Topology and Spatial Structure –Hotspots, outbreaks, critical areas –Intrinsic hierarchical decomposition, study areas, reference areas –Change detection, change analysis, spatial structure of change Statistical Ecology, Environmental Statistics, Health Statistics—2 Multiscale Advanced Raster Map Analysis System Initiative—1
18
Statistical Ecology, Environmental Statistics, Health Statistics—3 Multiscale Advanced Raster Map Analysis System Initiative—2 Partially Ordered Sets and Hasse Diagrams –Multiple indicators, comparisons, fuzzy rankings –Intrinsic hierarchical groups, reference areas –Performance measures, composite indices System Design and Development –BAT, BPT, and synergistic collaboration –Bilateral and multilateral partnerships
19
Mortality rate due to a specific cause of death Elevated rates areas, patterns Ordinal thematic maps Transition pattern, transitionogram Transition matrices; spatial association with varying distance Comparatives with different causes of death National Mortality Maps and Statistics Geographic Patterns—1
20
Surface topology and spatial structure High mortality area delineation –Hotspots, clusters, outbreaks, corridors Surface smoothing Masking of true geographic patterns? Echelon analysis, original surface, smoothed surface National Mortality Maps and Statistics Geographic Patterns—2
21
Study areas for response and explanatory variables relationships Response proximity –Hotspots, thresholds, contours, counter strips Spatial proximity –Buffers, putative hazards Dilution effect and edge effect National Mortality Maps and Statistics Relationships—1
22
Intrinsic study areas Intrinsic hierarchical decomposition Consistent vertical and horizontal balance Echelons and echelon trees Urban heat islands and pediatric asthma Infectious and vector-borne diseases UV radiation National Mortality Maps and Statistics Relationships—2
23
Multiscale Advanced Raster Map System MARMAP SYSTEM Design and Development PARTNERSHIP NSF Digital Government Research Program Proposal for Invited Re-Submission
24
MARMAP SYSTEM Partnership communication of June 11, 2001. NSF Partnership Proposal. Review and Response-1 Review and Response-2 Review and Response-3
25
MARMAP SYSTEM Partnership
Research and Outreach Prospectus: http://www.stat.psu.edu/~gpp/PDFfiles/prospectus 8-00.pdf
Our web page for raster map analysis: http://www.stat.psu.edu/~gpp/newpage11.htm
Our web page for raster map monographs: http://www.stat.psu.edu/~gpp/raster.htm
Our web page for UNEP HEI: http://www.stat.psu.edu/~gpp/unephei.htm
26
Geospatial Cell-based Data Kinds of Data Cell as a Unit (Regular grid layout) –Categorical –Ordinal –Numerical –Multivariate Numerical Cell as an Object (Irregular cell sizes and shapes) –Partially Ordered –Ordinal –Numerical –Multivariate Numerical
27
Approaches to Research Issues
28
Landscape Pattern Extraction Regional Geographic Patterns Spectral data Empirical extraction Thematic data Empirical extraction Spectral data Model-based extraction Thematic data Model-based extraction
30
Model-based Pattern Extraction Pattern = Spatial variability in thematic maps Proposed research limited to raster maps Possible Parametric Models: –Geostatistics (Multi-indicator) –Markov Random Fields –Hierarchical Markov Transition Matrix models (HMTM)
31
Upper Echelons of Surfaces
32
Spatial Complexity with Single Response Variable Echelons Approach
Echelon method analyzes cellular data pertaining to surface variables.
Examines changes in topological connectivity of upper level sets as the level changes.
Echelons elucidate spatial structure, help determine critical areas and corridors, emphasize areas of complexity, and map various aspects of surface organization.
Response can be numerical or ordinal.
Cellular tessellation can be regular or irregular.
41
Echelons Description -- 1
Ingredients of an Echelon Analysis:
–Tessellation of a geographic region
–Response value Z on each cell. Determines a tessellated (piece-wise constant) surface with Z as elevation.
How does connectivity (number of connected components) of the tessellation change with elevation?
[Figure: tessellated region with cells labeled a, b, c, …, k]
42
Echelons Description -- 2
Think of the tessellated surface as a landform
Initially the entire surface is under water
As the water level recedes, more and more of the landform is exposed
At each water level, cells are colored as follows:
–Green for previously exposed cells (green = vegetated)
–Yellow for newly exposed cells (yellow = sandy beach)
–Blue for unexposed cells (blue = under water)
For each newly exposed cell, one of three things happens:
–New island emerges. Cell is a local maximum. Morse index=2. Connectivity increases.
–Existing island increases in size. Cell is not a critical point. Connectivity unchanged.
–Two (or more) islands are joined. Cell is a saddle point. Morse index=1. Connectivity decreases.
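The receding-waterline description can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cell ids, Z values, and adjacency lists are hypothetical, ties in Z are broken by sort order, and the labels "peak", "growth", and "saddle" are shorthand for the three cases on the slide.

```python
from typing import Dict, Hashable, List


def classify_cells(z: Dict[Hashable, float],
                   neighbors: Dict[Hashable, List[Hashable]]) -> Dict[Hashable, str]:
    """Expose cells from highest to lowest Z and label each newly exposed cell.

    'peak'   : a new island emerges (connectivity increases)
    'growth' : an existing island grows (connectivity unchanged)
    'saddle' : two or more islands are joined (connectivity decreases)
    """
    parent: Dict[Hashable, Hashable] = {}  # union-find forest over exposed cells

    def find(cell):
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]  # path halving
            cell = parent[cell]
        return cell

    labels = {}
    for cell in sorted(z, key=z.get, reverse=True):
        # Islands (union-find roots) already exposed next to this cell.
        islands = {find(nb) for nb in neighbors[cell] if nb in parent}
        parent[cell] = cell
        for root in islands:
            parent[root] = cell  # merge every touching island into the new cell
        if not islands:
            labels[cell] = "peak"
        elif len(islands) == 1:
            labels[cell] = "growth"
        else:
            labels[cell] = "saddle"
    return labels


# Hypothetical strip of five cells, each adjacent to its left and right neighbor.
z = {0: 5.0, 1: 4.0, 2: 1.0, 3: 3.0, 4: 2.0}
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
labels = classify_cells(z, neighbors)
```

Union-find keeps the island bookkeeping cheap, so the cost is dominated by sorting the cells by Z.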
43
Echelons Illustrated -- 1
[Figure: tessellation with cells a to k and its echelon tree. Panels: "Newly exposed island" (tree: a); "Island grows" (tree: a; b,c)]
44
Echelons Illustrated -- 2
[Figure: tessellation with cells a to k and its echelon tree. Panels: "Second island appears" (tree adds d); "Both islands grow" (tree adds e; f,g), marked "New echelon"]
45
Echelons Illustrated -- 3
[Figure: tessellation with cells a to k and its echelon tree. Panels: "Islands join – saddle point" (tree adds h), marked "New echelon"; "Exposed land grows" (tree adds i,j,k), marked "Three echelons"]
46
Echelons Illustrated -- 4
Each branch in echelon tree determines an echelon
Each echelon consists of cells in the tessellation
The echelons partition the region
Each echelon determines a set of response values Z (and a corresponding set of values of the explanatory variables X, if any)
[Figure: echelon tree (a; b,c; d; e; f,g; h; i,j,k) and the echelon partitioning of the tessellation into three echelons]
47
Echelons Illustrated – 5 Higher Order Echelons
[Figure: receding waterline from the previous pictures, with the echelon tree labeled with echelon orders 1, 1, 1, 1, 1, 2, 2 (not 3), 2, 3]
48
Echelons Illustrated – 6 Echelon Order Defined
Analogy with stream networks (Horton-Strahler order)
Leaf branches have order 1
When two branches of orders p and q join, the new branch has order:
–max(p, q) if p ≠ q
–p + 1 if p = q
[Figure: echelon tree labeled with echelon orders 1, 1, 1, 1, 1, 2, 2 (not 3), 2, 3]
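As a small illustration of the order rule, the sketch below computes Horton-Strahler orders on a tree stored as a dictionary of child lists. The tree shape and node names are hypothetical, chosen only to be consistent with the order labels on the slide (five branches of order 1, three of order 2, one of order 3). The handling of joins with more than two branches is an assumption; the slide only defines the two-branch case.

```python
from typing import Dict, List


def branch_order(children: Dict[str, List[str]], node: str) -> int:
    """Horton-Strahler order of the branch ending at `node`.

    Leaves have order 1. For a two-branch join this reduces to the slide's
    rule: max(p, q) if p != q, and p + 1 if p == q. For joins of more than two
    branches, the order increases only when the maximum is attained at least
    twice (the usual Strahler convention, assumed here).
    """
    kids = children.get(node, [])
    if not kids:
        return 1
    orders = [branch_order(children, kid) for kid in kids]
    top = max(orders)
    return top + 1 if orders.count(top) >= 2 else top


# Hypothetical tree: L1..L5 are leaves, A, B, C are internal joins.
children = {
    "root": ["B", "C"],
    "B": ["A", "L3"],   # order 2 joined with order 1 stays 2 (not 3)
    "A": ["L1", "L2"],  # two order-1 branches give order 2
    "C": ["L4", "L5"],
}
orders = {name: branch_order(children, name) for name in ["A", "B", "C", "root"]}
```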
49
Echelons Illustrated – 7 Echelon Smoothing
Need for smoothing echelon trees
Alternative to direct smoothing of surface values
In complicated echelon trees, root nodes may be most indicative of noise and become prime candidates for pruning (contraction would be a better term)
Criteria for pruning: Echelon relief, Echelon basal area, others?
What is the corresponding smoothed surface?
[Figure: echelon tree labeled with echelon orders 1, 1, 1, 1, 1, 2, 2 (not 3), 2, 3, with a candidate branch marked "Prune ?"]
50
Spatial Complexity with Single Response Variable Echelons Approach Issues to be addressed: Echelon trees and maps Echelon profiles and other tree metrics Noise effects and filtering Comparing echelon trees and maps Echelon stochastics: surface simulation, tree simulation, tree metric distributions
51
Pre-Classification Change Detection Echelons Approach Change vector approach (cell by cell) with actual spectral data Change vector approach (cell by cell) with compressed (hyperclustered) spectral data Pattern-based approach (compressed data only): Compare segment pattern at time1 with segment pattern at time 2.
52
Spatial Complexity with Multiple Indicators Echelons Approach
Compare echelon features among indicators for consistency/inconsistency:
–Order
–Number of ancestors (distance from root of tree)
Compression by treating features as pseudo-bands
53
Geospatial Analysis for Disease Surveillance – 1
Case Event Point Data & Areal Unit Count Data
Geospatial surveillance
Cluster detection and evaluation
Spatial scan statistics
Choice of zonal parameter space
Candidate zones as circular windows of expanding size
Elliptical windows: Long Island breast cancer study
Hyperclusters, echelon trees, upper surface sets defined by thresholds-based nodes
54
Spatio-temporal surveillance Cylinders-based spatio-temporal scan statistics Three-dimensional echelons and echelon trees Candidate zones as upper surface sets defined by thresholds-based nodes Temporal persistence and patterns Cluster alarms, suspect clusters, and their evaluation Geospatial Analysis for Disease Surveillance—2 Case Event Point Data & Areal Unit Count Data
55
Geospatial and Spatiotemporal Patterns of Change
Multiple cancer mortality statistics and maps
Multiple disease incidence statistics and maps
Across United States over years
Across individual states over years
Pooling over types of cancer/disease
Pooling over types of people
Change detection and change analysis
In space, in time, in space-time
Structure and behavior of change
Persistence and patterns of elevated areas
56
To evaluate reported spatial or spatiotemporal disease clusters To see if they are statistically significant To test whether a disease is randomly distributed To perform geographical surveillance of disease To detect areas of significantly high or low rates Spatial and Spatiotemporal Scan Statistics—1 SaTScan
57
Spatial and Spatiotemporal Scan Statistics – 2 SaTScan
Poisson model, where the number of events in an area is Poisson distributed under the null hypothesis
Bernoulli model, with 0/1 event data such as cases and controls
The program adjusts for the underlying inhomogeneity of a background population
With the Poisson model, the program can also adjust for any number of categorical covariates provided by the user
58
SaTSCAN – 1 Goal: Identify geographic zone(s) in which a response is significantly elevated relative to the rest of a region A list of candidate zones Z is specified a priori. –This list becomes part of the parameter space and the zone must be estimated from within this list. –Each candidate zone should generally be spatially connected, e.g., a union of contiguous spatial units or cells. –Longer lists of candidate zones are usually preferable –Expanding circles or ellipses about specified centers are a common method of generating the list
59
SaTSCAN – 2
Example: Infected individuals in a tessellated region G with cells (spatial units) A
n(A) = # individuals in cell A (known)
x(A) = # infected individuals in cell A (data)
Individual infection results from independent Bernoulli trials
Full model:
–Bernoulli parameter is p inside zone Z and q < p outside Z
–p, q, Z are unknown parameters and must be estimated
Null model: Bernoulli parameter is constant (but unknown) throughout the region G
60
SaTSCAN – 3 Estimation: Maximum likelihood Likelihood = L( Z, p, q ) –For fixed Z maximize (analytically) with respect to p and q giving a partial likelihood L(Z) –Maximize L(Z) by explicit search through the list of candidate zones giving the likelihood estimate of Z
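The two-step maximization can be sketched as follows. This is an illustrative sketch, not SaTScan's implementation: the cell names, populations, case counts, and the short candidate list are all hypothetical, and a zone whose inside rate does not exceed its outside rate is simply scored with the null likelihood.

```python
import math
from typing import Dict, Sequence, Tuple


def bernoulli_loglik(x: int, n: int) -> float:
    """Log-likelihood of x cases among n individuals at the MLE p = x / n."""
    if n == 0 or x == 0 or x == n:
        return 0.0  # degenerate cases: the maximized likelihood equals 1
    p = x / n
    return x * math.log(p) + (n - x) * math.log(1 - p)


def profile_loglik(zone: Sequence[str], cases: Dict[str, int], pop: Dict[str, int]) -> float:
    """log L(Z): likelihood maximized analytically over p and q for a fixed zone."""
    x_all, n_all = sum(cases.values()), sum(pop.values())
    x_in = sum(cases[c] for c in zone)
    n_in = sum(pop[c] for c in zone)
    x_out, n_out = x_all - x_in, n_all - n_in
    # Elevated-rate constraint: if the rate inside is not higher, use the null fit.
    if n_in == 0 or n_out == 0 or x_in / n_in <= x_out / n_out:
        return bernoulli_loglik(x_all, n_all)
    return bernoulli_loglik(x_in, n_in) + bernoulli_loglik(x_out, n_out)


def best_zone(zones, cases, pop) -> Tuple[str, ...]:
    """Explicit search through the candidate list for the zone maximizing L(Z)."""
    return max(zones, key=lambda zone: profile_loglik(zone, cases, pop))


# Hypothetical data: five cells of 100 individuals each.
pop = {"a": 100, "b": 100, "c": 100, "d": 100, "e": 100}
cases = {"a": 5, "b": 20, "c": 18, "d": 4, "e": 6}
zones = [("a",), ("b",), ("b", "c"), ("c", "d"), ("d", "e")]
estimate = best_zone(zones, cases, pop)
```

With these made-up counts the search selects the zone made of cells b and c, whose combined rate is well above the rest of the region.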
61
SaTSCAN – 4 Hypothesis Testing: Likelihood ratio statistic –Non-standard situation. Traditional ML theory does not apply –Need to determine the null distribution by Monte Carlo simulation of replicate data sets under the null model. For each data set Z must be estimated and the value of the likelihood ratio test statistic computed
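A minimal Monte Carlo sketch of this test is given below. Several choices are assumptions rather than anything stated on the slide: replicate data sets are generated by conditioning on the observed total number of cases and scattering them at random over individuals, the candidate list is fixed, and the data, zone list, replicate count, and seed are hypothetical.

```python
import math
import random
from collections import Counter


def loglik(x, n):
    """Maximized Bernoulli log-likelihood for x cases among n individuals."""
    if n == 0 or x == 0 or x == n:
        return 0.0
    p = x / n
    return x * math.log(p) + (n - x) * math.log(1 - p)


def max_llr(cases, pop, zones):
    """Largest log likelihood ratio over the candidate zones (Z is re-estimated)."""
    x_all, n_all = sum(cases.values()), sum(pop.values())
    null = loglik(x_all, n_all)
    best = 0.0
    for zone in zones:
        x_in = sum(cases[c] for c in zone)
        n_in = sum(pop[c] for c in zone)
        x_out, n_out = x_all - x_in, n_all - n_in
        if n_in and n_out and x_in / n_in > x_out / n_out:
            best = max(best, loglik(x_in, n_in) + loglik(x_out, n_out) - null)
    return best


def monte_carlo_p_value(cases, pop, zones, n_sim=199, seed=0):
    """Rank the observed statistic among statistics from null replicates."""
    rng = random.Random(seed)
    observed = max_llr(cases, pop, zones)
    individuals = [cell for cell, n in pop.items() for _ in range(n)]
    total_cases = sum(cases.values())
    exceed = 0
    for _ in range(n_sim):
        # Null replicate: the same number of cases, placed uniformly at random.
        drawn = Counter(rng.sample(individuals, total_cases))
        sim_cases = {cell: drawn.get(cell, 0) for cell in pop}
        if max_llr(sim_cases, pop, zones) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)


# Hypothetical data: five cells of 100 individuals each.
pop = {"a": 100, "b": 100, "c": 100, "d": 100, "e": 100}
cases = {"a": 5, "b": 20, "c": 18, "d": 4, "e": 6}
zones = [("a",), ("b",), ("b", "c"), ("c", "d"), ("d", "e")]
p_value = monte_carlo_p_value(cases, pop, zones)
```

The zone is re-estimated inside `max_llr` for every replicate, which is the step the slide stresses.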
62
SaTSCAN – 5 Question: Are there data-driven (rather than a priori) ways of selecting the list of candidate zones ? Motivation for the question: A human being can look at a map and quickly determine a reasonable set of candidate zones and eliminate many other zones as obviously uninteresting. Can the computer do the same thing? A data-driven proposal: Candidate zones are the connected components of the upper level sets of the response surface. The candidate zones have a tree structure (echelon tree is a subtree), which may assist in automated detection of multiple, but geographically separate, elevated zones. Null distribution: If the list is data-driven (i.e., random), its variability must be accounted for in the null distribution. A new list must be developed for each simulated data set.
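One way to read the data-driven proposal is sketched below: for each distinct response level, take the cells at or above that level and split them into connected components. The response values and adjacency structure are hypothetical, and this sketch only builds the candidate list; it does not address the null-distribution issue raised on the slide.

```python
from typing import Dict, FrozenSet, List, Set


def upper_level_zones(z: Dict[str, float],
                      neighbors: Dict[str, List[str]]) -> Set[FrozenSet[str]]:
    """Connected components of every upper level set {cell : z[cell] >= level}."""
    zones: Set[FrozenSet[str]] = set()
    for level in sorted(set(z.values()), reverse=True):
        unseen = {cell for cell, value in z.items() if value >= level}
        while unseen:
            # Depth-first search collects one connected component.
            start = unseen.pop()
            component, stack = {start}, [start]
            while stack:
                cell = stack.pop()
                for nb in neighbors[cell]:
                    if nb in unseen:
                        unseen.remove(nb)
                        component.add(nb)
                        stack.append(nb)
            zones.add(frozenset(component))
    return zones


# Hypothetical strip of five cells, each adjacent to its left and right neighbor.
z = {"a": 5.0, "b": 3.0, "c": 4.0, "d": 1.0, "e": 2.0}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
candidate_zones = upper_level_zones(z, neighbors)
```

Because a component at a lower level always contains the components above it, the resulting zones are nested, which gives the tree structure the slide refers to.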
63
Multiple Criteria Analysis Multiple Indicators Partial Ordering Procedures
Cells are objects of primary interest, such as countries, states, watersheds, counties, etc.
Cell comparisons and rankings are the goals
A suite of indicators is available on each cell
Different indicators have different comparative messages, i.e., partial instead of linear ordering
Hasse diagrams for visualization of partial orders. Multi-level diagram whose top level of nodes consists of all maximal elements in the partially ordered set of objects. Next level consists of all maximal elements when top level is removed from the partially ordered set, etc. Nodes are joined by segments when they are immediately comparable.
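The level construction described for Hasse diagrams can be sketched by repeatedly peeling off the maximal elements. The objects and indicator values below are hypothetical, and the sketch assumes that larger indicator values are better and that one object is below another when it is no better on every indicator and the two vectors differ.

```python
from typing import Dict, List, Tuple


def hasse_levels(indicators: Dict[str, Tuple[float, ...]]) -> List[List[str]]:
    """Split objects into Hasse diagram levels by successive removal of maximal elements."""

    def below(u, v):
        # u is dominated by v: no better on any indicator, and not identical.
        return u != v and all(a <= b for a, b in zip(u, v))

    remaining = dict(indicators)
    levels = []
    while remaining:
        # Maximal elements: objects not dominated by any remaining object.
        top = sorted(name for name, u in remaining.items()
                     if not any(below(u, v) for v in remaining.values()))
        levels.append(top)
        for name in top:
            del remaining[name]
    return levels


# Hypothetical cells scored on two indicators.
indicators = {"A": (3, 3), "B": (2, 1), "C": (1, 2), "D": (0, 0)}
levels = hasse_levels(indicators)
```

Here B and C land on the same level because each beats the other on one indicator, which is exactly the partial rather than linear ordering the slide describes.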
64
Multiple Criteria Analysis Multiple Indicators Partial Ordering Procedures Issues to be addressed: Crisp rankings, interval rankings, fuzzy rankings Fuzzy comparisons Echelon analysis of partially ordered sets with ordinal response levels determined by successive levels in the Hasse diagram Hasse diagram metrics: height, width, dimension, ambiguity (departure from linear order), etc. Hasse diagram stochastics (random structure on the indicators or random structure on Hasse diagram) Hasse diagram comparisons, e.g., compare Hasse diagrams for different regions
66
Hasse Diagram (all countries)
67
Hasse Diagram (W Europe)
68
Ranking Partially Ordered Sets – 2 An Example
[Figure: poset (Hasse diagram) on elements a, b, c, d, e, f]
Some linear extensions, with jump sizes:
a c b e d f (jump size 3)
a c e b d f (jump size 1)
a b c d e f (jump size 5)
b a c d e f (jump size 4)
b a c e d f (jump size 2)
A jump, or imputed link (-------), is a link in the ranking that is not implied by the partial order
69
Ranking Partially Ordered Sets – 3
In the example from the preceding slide, there are a total of 16 linear extensions, giving the following frequency table.

          Rank
Element   1    2    3    4    5    6    Totals
a         9    5    2    0    0    0    16
b         7    5    3    1    0    0    16
c         0    4    6    6    0    0    16
d         0    2    4    6    4    0    16
e         0    0    1    3    6    6    16
f         0    0    0    0    6    10   16
Totals    16   16   16   16   16   16

Each (normalized) row gives the rank-frequency distribution for that element
Each (normalized) column gives a rank-assignment distribution across the poset
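The frequency table can be reproduced by brute-force enumeration. The Hasse diagram itself did not survive in this transcript, so the cover relations used below (a above c, c above e and f, b above d, d above f) are an assumption: they were inferred from the table and the listed linear extensions, not read from the original figure.

```python
from itertools import permutations

ELEMENTS = "abcdef"
# Assumed cover relations (higher, lower); inferred, not taken from the slide's figure.
ABOVE = [("a", "c"), ("c", "e"), ("c", "f"), ("b", "d"), ("d", "f")]


def linear_extensions(elements, above):
    """Yield every ranking (best first) that respects all order relations."""
    for perm in permutations(elements):
        position = {element: i for i, element in enumerate(perm)}
        if all(position[hi] < position[lo] for hi, lo in above):
            yield perm


extensions = list(linear_extensions(ELEMENTS, ABOVE))

# rank_freq[e][r] = number of linear extensions giving element e rank r + 1.
rank_freq = {element: [0] * len(ELEMENTS) for element in ELEMENTS}
for extension in extensions:
    for rank, element in enumerate(extension):
        rank_freq[element][rank] += 1

for element in ELEMENTS:
    print(element, rank_freq[element])
```

Checking all 720 permutations is fine for six elements but grows factorially, so it does not scale to realistic posets.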
70
Ranking Partially Ordered Sets – 5 Linear extension decision tree
[Figure: poset (Hasse diagram) on elements a to f, and the decision tree whose root-to-leaf paths are its linear extensions]
Jump Size: 1 3 3 2 3 5 4 3 3 2 4 3 4 4 2 2
71
Ranking Partially Ordered Sets – 8
In many cases of practical interest e(S), the number of linear extensions of the poset S, is too large for actual enumeration in a reasonable length of time. For example, the HEI data set has 141 countries arranged in a Hasse diagram with 14 levels and level sizes 16, 14, 15, 12, 16, 24, 10, 9, 10, 7, 2, 2, 3, 1. This gives 8.6 × 10^105 ≤ e(S) ≤ 1.9 × 10^243, which is completely beyond present-day computational capabilities.
So what do we do? Markov Chain Monte Carlo (MCMC) applied to the uniform distribution on the set of all linear extensions lets us estimate the normalized rank-frequency distributions. Estimating the absolute frequencies (approximate counting) is also possible but somewhat more difficult.
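The two bounds can be checked directly from the level sizes. The slide does not say how they were derived; the sketch below assumes the lower bound is the product of the factorials of the level sizes (elements on the same level are mutually incomparable, so any within-level ordering is allowed) and the upper bound is the factorial of the total number of elements (the count when no comparisons are imposed at all).

```python
import math

# Level sizes of the HEI Hasse diagram, as listed on the slide.
level_sizes = [16, 14, 15, 12, 16, 24, 10, 9, 10, 7, 2, 2, 3, 1]

n_elements = sum(level_sizes)

# Assumed lower bound: order the levels top to bottom, permute freely within each.
lower_bound = math.prod(math.factorial(size) for size in level_sizes)

# Assumed upper bound: all orderings of the elements, ignoring the partial order.
upper_bound = math.factorial(n_elements)

print(n_elements)
print(f"{lower_bound:.1e}")
print(f"{upper_bound:.1e}")
```

Python integers are arbitrary precision, so both factorial products are computed exactly before being formatted.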
72
Cumulative Rank Frequency Operator – 5 An Example of the Procedure
In the example from the preceding slide, there are a total of 16 linear extensions, giving the following cumulative frequency table.

          Rank
Element   1    2    3    4    5    6
a         9    14   16   16   16   16
b         7    12   15   16   16   16
c         0    4    10   16   16   16
d         0    2    6    12   16   16
e         0    0    1    4    10   16
f         0    0    0    0    6    16

Each entry gives the number of linear extensions in which the element (row) label receives a rank equal to or better than the column heading
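The cumulative table is a running sum of each row of the rank-frequency table, as the sketch below shows. The frequencies are the ones from the earlier example; the final check, that every cumulative row lies on or above the next one, is a hedged reading of how the curves come to be "stacked" into a linear ordering.

```python
from itertools import accumulate

# Rank-frequency table for the six-element example (16 linear extensions).
rank_freq = {
    "a": [9, 5, 2, 0, 0, 0],
    "b": [7, 5, 3, 1, 0, 0],
    "c": [0, 4, 6, 6, 0, 0],
    "d": [0, 2, 4, 6, 4, 0],
    "e": [0, 0, 1, 3, 6, 6],
    "f": [0, 0, 0, 0, 6, 10],
}

# Cumulative frequency: extensions giving the element this rank or better.
cumulative = {element: list(accumulate(row)) for element, row in rank_freq.items()}

# Order elements by total cumulative mass, then verify the curves really stack.
ordering = sorted(cumulative, key=lambda element: sum(cumulative[element]), reverse=True)
stacked = all(
    all(hi >= lo for hi, lo in zip(cumulative[upper], cumulative[lower]))
    for upper, lower in zip(ordering, ordering[1:])
)
```

If `stacked` were false for some pair, the curves would cross and a single pass would not yield a linear ordering.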
73
Cumulative Rank Frequency Operator – 6 An Example of the Procedure
[Figure: cumulative rank-frequency curves for elements a to f, rising to 16]
The curves are stacked one above the other and the result is a linear ordering of the elements: a > b > c > d > e > f
74
Cumulative Rank Frequency Operator – 7 An Example where F must be iterated
[Figure: original poset (Hasse diagram) on elements a to h, followed by the diagrams obtained after applying F, F^2, and F^3; in the final diagram c and g are tied]
76
Ranking Possible Disease Clusters in the State of New York Data Matrix
77
First stage screening –Significant clusters by SaTScan and/or upper surface level echelon sets Second stage screening –Multicriteria noteworthy clusters Final stage screening –Follow up clusters for etiology, intervention based on multiple criteria Multiple Criteria Analysis Multiple Indicators and Choices Health Statistics Disease Etiology, Health Policy, Resource Allocation
78
MARMAP SYSTEM Software Design and Development Algorithm development Computer programming/Coding User interface design and implementation User output/Visualization Documentation/On-line help Other considerations: –Supported platforms (Windows, UNIX ?, LINUX ?) –Programming languages ? (C/C++, Java, Visual Basic, Delphi, etc.) –Software distribution (CD, Website)
79
CENTER FOR GEOSPATIAL INFORMATICS AND STATISTICS -proposed federal partnership- NSF: DGP, FRG, ITR, SDSC-NPACI NASA, USGS, EPA, USFS, NRCS, NASS, DOT, NCHS, CDC, NCI, CENSUS, NIMA, DOD, NOAA
80
Case Study –UNEP - PSU Nationwide Human Environment Index worldwide Construction and Evaluation of HEI Multiple Indicators and Comparisons without Integration of Indicators Hasse Diagrams, fuzzy rankings, and visualizations Handbook Interactive Queries
81
Case Study – NASA - PSU Issues Involved: Landcover classification –with available spectral image(s) –with a previous map and current spectral image –with fine or coarse segmentation Multi-period change detection Data Integration
82
Case Study – EPA – PSU Issues Involved: Indicators of Watershed Ecosystem Health Multiple Landscape Fragmentation Analysis Echelon Analysis of Spatial Structure and Behavior Multiscale Bivariate Raster Map Analysis Regional Human Environment Index: Formulation, Visualization, Evaluation, and Validation
83
Partnership Synergistics Concept Prototype Software Implementation Pilot Tests Feedback Case Studies
84
Partnership Synergistics PI/CO-PI MG- PG CG CSG
85
Partnership Synergistics Methodology Group Concepts, Issues, Approaches, Methods Prototype Group Techniques, Algorithms, Routines Methodology Group Refinement, Adaptation, Development MARMAP SYSTEM VALIDATION MG, PG, CG, CSG Computational Group Data Management, Software Design and Development Case Studies Data Resources Issues Answers
86
Logo for Statistics, Ecology, Environment, and Society