1 NJ DHSS CES SEER G. P. Patil January 17, 2003. 2 This report is very disappointing. What kind of software are you using?

Slides:



Advertisements
Similar presentations
Sampling Design, Spatial Allocation, and Proposed Analyses Don Stevens Department of Statistics Oregon State University.
Advertisements

Salt Marsh Restoration Site Selection Tool An Example Application: Ranking Potential Salt Marsh Restoration Sites Using Social and Environmental Factors.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved. Lecture Slides Elementary Statistics Eleventh Edition and the Triola.
Early Detection of Disease Outbreaks Prospective Surveillance.
Empirical/Asymptotic P-values for Monte Carlo-Based Hypothesis Testing: an Application to Cluster Detection Using the Scan Statistic Allyson Abrams, Martin.
Raster Based GIS Analysis
Border around project area Everything else is hardly noticeable… but it’s there Big circles… and semi- transparent Color distinction is clear.
 Statistical approaches for detecting unexplained clusters of disease.  Spatial Aggregation Thomas Talbot New York State Department of Health Environmental.
CHAPTER 6 Statistical Analysis of Experimental Data
Why Geography is important.
Let’s pretty it up!. Border around project area Everything else is hardly noticeable… but it’s there Big circles… and semi- transparent Color distinction.
Geographic Information Science
Dept. of Civil and Environmental Engineering and Geodetic Science College of Engineering The Ohio State University Columbus, Ohio 43210
BC Jung A Brief Introduction to Epidemiology - IV ( Overview of Vital Statistics & Demographic Methods) Betty C. Jung, RN, MPH, CHES.
Using ArcGIS/SaTScan to detect higher than expected breast cancer incidence Jim Files, BS Appathurai Balamurugan, MD, MPH.
Title: Spatial Data Mining in Geo-Business. Overview  Twisting the Perspective of Map Surfaces — describes the character of spatial distributions through.
The Spatial Scan Statistic. Null Hypothesis The risk of disease is the same in all parts of the map.
Understanding and Interpreting maps
Epidemiology The Basics Only… Adapted with permission from a class presentation developed by Dr. Charles Lynch – University of Iowa, Iowa City.
Lecture 12 Statistical Inference (Estimation) Point and Interval estimation By Aziza Munir.
How do we represent the world in a GIS database?
Cluster Detection Comparison in Syndromic Surveillance MGIS Capstone Project Proposal Tuesday, July 8 th, 2008.
Kampala, Uganda, 23 June 2014 Applicability of the ITU-T E.803 Quality of service parameters for supporting service aspects Kwame Baah-Acheamfuor Chairman,
INTERACTIVE ANALYSIS OF COMPUTER CRIMES PRESENTED FOR CS-689 ON 10/12/2000 BY NAGAKALYANA ESKALA.
1 Enviromatics Environmental sampling Environmental sampling Вонр. проф. д-р Александар Маркоски Технички факултет – Битола 2008 год.
Ideas on a Network Evaluation and Design System Prepared for EPA OAQPS Richard Scheffe by Rudolf B. Husar and Stefan R. Falke Center for Air Pollution.
Geovisualization and Spatial Analysis of Cancer Data: Developing Visual-Computational Spatial Tools for Cancer Data Research Challenges for Spatial Data.
Extent and Mask Extent of original data Extent of analysis area Mask – areas of interest Remember all rasters are rectangles.
1 Spatial Data Models and Structure. 2 Part 1: Basic Geographic Concepts Real world -> Digital Environment –GIS data represent a simplified view of physical.
Defining Landscapes Forman and Godron (1986): A
INTRODUCTION TO GIS  Used to describe computer facilities which are used to handle data referenced to the spatial domain.  Has the ability to inter-
Using Population Data to Address the Human Dimensions of Population Change D.M. Mageean and J.G. Bartlett Jessica Daniel 10/27/2009.
Technical Details of Network Assessment Methodology: Concentration Estimation Uncertainty Area of Station Sampling Zone Population in Station Sampling.
Tutorial I: Missing Value Analysis
Spatial Data Models Geography is concerned with many aspects of our environment. From a GIS perspective, we can identify two aspects which are of particular.
INDIAN SCIENCE CONGRESS Mumbai 2015 Actuarial Science Symposium G. P. Patil Penn State University, University Park, PA USA.
1 RTI SYMPOSIUM on HOMELAND and HEALTH SECURITY Biosurveillance Geoinformatics of Hotspot Detection and Prioritization for Biosecurity G. P. Patil November.
NIEHS G. P. Patil. This report is very disappointing. What kind of software are you using?
2.1.1 Geospatial Analysis for Disease Surveillance—1 Case Event Point Data & Areal Unit Count Data Geospatial surveillance Cluster detection and evaluation.
Spatial Scan Statistic for Geographical and Network Hotspot Detection C. Taillie and G. P. Patil Center for Statistical Ecology and Environmental Statistics.
1 Forum for Interdisciplinary Mathematics Patna, India G. P. Patil December 2010.
General Elliptical Hotspot Detection Xun Tang, Yameng Zhang Group
Myers, W. L., Bishop, J., Brooks, R., and Patil, G. P. (2001). Composite spatial indexing of regional habitat importance. Community Ecology, 2(2), 213—220.
Project Geoinformatic Surveillance NSF DGP Grant G. P. Patil, Penn State, PI EPA: Watershed Characterization and Prioritization PADOH: Disease Clusters.
1 Surveillance GeoInformatics Hotspot Detection, Prioritization, and Early Warning G. P. Patil December 2004 – January 2005.
Illustrating NOAA’s Geospatial Role in Resilient Coastal Zones Joseph Klimavicz, NOAA CIO and Director of High Performance Computing and Communications.
Chapter 2. **The frequency distribution is a table which displays how many people fall into each category of a variable such as age, income level, or.
Hotspot Detection, Delineation, and Prioritization for Geographic Surveillance and Early Warning Organizer and Chair : G. P. Patil  2:00—2:05 Chair 
Albany New York (1) G. P. Patil. Albany New York (2) G. P. Patil.
Early Detection of Disease Outbreaks with Applications in New York City Martin Kulldorff University of Connecticut Farzad Mostashari and James Miller.
Multiscale Raster Map Analysis for Sustainble Environment and Development A Research and Outreach Prospectus of Advanced Mathematical, Statistical and.
Geographic and Network Surveillance for Arbitrarily Shaped Hotspots Overview Geospatial Surveillance Upper Level Set Scan Statistic System Spatial-Temporal.
1 Biosurveillance Sensor Networks and Resultant Spatiotemporal Data for Crisis-Index Development and Early Warning Austin, March 2005 G. P. Patil Austin,
1 Multi-criterion Ranking and Poset Prioritization G. P. Patil December 2004 – January 2005.
JalaSRI Consortium Delhi – Jalgaon Workshop TERI U G.P. Patil June 1, 2009.
1 Spatial Temporal Surveillance. 2 3 Geographic Surveillance and Hotspot Detection for Homeland Security: Cyber Security and Computer Network Diagnostics.
New York City and Other G. P. Patil. New York City Water Distribution Network.
1 Fukuoka Conference, Japan G. P. Patil November 2005.
4.6.1 Upper Echelons of Surfaces
Health GeoInformatics
Spatially Constrained Clustering and Upper Level Set Scan Hotspot Detection in Surveillance GeoInformatics G.P.Patil, Penn State University Reza Modarres,
5/22/2018 Forum for Interdisciplinary Mathematics Patna, India G. P. Patil December 2010.
INTRODUCTION TO GEOGRAPHICAL INFORMATION SYSTEM
Geoinformatics Seminar G. P. Patil March 2003
EPA Presentation March 13,2003 G. P. Patil
12.14 Myers, W. L., Bishop, J., Brooks, R., and Patil, G. P. (2001).
NSF Digital Government surveillance geoinformatics project, federal agency partnership and national applications for digital governance.
Indexing and Hashing Basic Concepts Ordered Indices
Geographic and Network Surveillance for Arbitrarily Shaped Hotspots
Albany New York (1) G. P. Patil
Presentation transcript:

1 NJ DHSS CES SEER G. P. Patil January 17, 2003

2 This report is very disappointing. What kind of software are you using?

3 Stone-Age Space-Age Syndrome Stone-age data Space-age data Stone-age analysis Space-age analysis

4

5

6 PROJECT GEOINFORMATIC SURVEILLANCE DECISION SUPPORT SYSTEM Geographic and Network Surveillance for Arbitrary Shaped Hotspots - Next Generation of Geographic Hotspot Detection, Prioritization, and Early Warning System- G. P. Patil, W. L. Myers, C. Taillie, and D. Wardrop Pennsylvania State University, University Park Webpage:

7 Geographical Surveillance Discrete response Discrete response Hotspot detection and upper level set scan Hotspot detection and upper level set scan Hotspot delineation and hot-spot rating Hotspot delineation and hot-spot rating Multiple hotspot detection and delineation Multiple hotspot detection and delineation Hotspot prioritization and poset ranking Hotspot prioritization and poset ranking Space-time detection and early warning Space-time detection and early warning Continuous response Continuous response User friendly software and downloadable website User friendly software and downloadable website

8 Areas of Application Biodiversity, species-rich, and species-poor areas Biodiversity, species-rich, and species-poor areas Water resources at watershed scales Water resources at watershed scales Power lines and their effects Power lines and their effects Networks of water distribution systems, subway systems, and road transport systems Networks of water distribution systems, subway systems, and road transport systems Urban and regional planning Urban and regional planning Disease epidemiology Disease epidemiology Medical imaging Medical imaging Reconnaissance Reconnaissance Astronomy Astronomy Archaeology Archaeology

9

10 Spatial Scan Statistic Limitations and Needs Circles capture only compactly shaped clusters Circles capture only compactly shaped clusters Want to identify clusters of arbitrary shape Want to identify clusters of arbitrary shape Circles handle only synoptic (tessellated ) data Circles handle only synoptic (tessellated ) data Want to handle data on a network Want to handle data on a network Circles provide point estimate of hotspot Circles provide point estimate of hotspot Want to assess estimation uncertainty (hotspot confidence set) Want to assess estimation uncertainty (hotspot confidence set)

11 Attractive Features Identifies arbitrarily shaped clusters Identifies arbitrarily shaped clusters Data-adaptive zonation of candidate hotspots Data-adaptive zonation of candidate hotspots Applicable to data on a network Applicable to data on a network Provides both a point estimate as well as a confidence set for the hotspot Provides both a point estimate as well as a confidence set for the hotspot Uses hotspot-membership rating to map hotspot boundary uncertainty Uses hotspot-membership rating to map hotspot boundary uncertainty Computationally efficient Computationally efficient Applicable to both discrete and continuous syndromic responses Applicable to both discrete and continuous syndromic responses Identifies arbitrarily shaped clusters in the spatial-temporal domain Identifies arbitrarily shaped clusters in the spatial-temporal domain Provides a typology of space-time hotspots with discriminatory surveillance potential Provides a typology of space-time hotspots with discriminatory surveillance potential Hotspot Detection Innovation Upper Level Set Scan Statistic

12 We also present a prioritization innovation. It lies in the ability for prioritization and ranking of hotspots based on multiple indicator and stakeholder criteria without having to integrate indicators into an index, using Hasse diagrams and partial order sets. This leads us to early warning systems, and also to the selection of investigational areas. Prioritization Innovation Partial Order Set Ranking

13 Syndromic Surveillance Symptoms of disease such as diarrhea, respiratory problems, headache, etc Symptoms of disease such as diarrhea, respiratory problems, headache, etc Earlier reporting than diagnosed disease Earlier reporting than diagnosed disease Less specific, more noise Less specific, more noise

14 The Spatial Scan Statistic Move a circular window across the map. Move a circular window across the map. Use a variable circle radius, from zero up Use a variable circle radius, from zero up to a maximum where 50 percent of the population is included.

15 A small sample of the circles used

16 For each circle: – Obtain actual and expected number of cases inside and outside the circle. – Calculate Likelihood Function Compare Circles: – Pick circle with maximum likelihood. This is the most likely cluster, i.e., the cluster least likely to have occurred by chance Inference: – Generate random replicas of the data set under the null- hypothesis of no clusters (Monte Carlo sampling) – Compare most likely clusters in real and random data sets (Likelihood ratio test)

17 Zonal p-Values -- 1 Zonal p-values depend upon:  zonal intensity  zonal sample size Example:  1 case among 5 persons is not significant  1 thousand cases among 5 thousand persons might well significant

18 Zonal p-Values -- 2 Sample Size Zonal Intensity Contour of constant p-values (p =.05) Zone 1 Zone 2 Zone 3 Zone 1 is not significant (high intensity but small sample size) Zone 2 is significant even though its intensity is smaller than that of Zone 1 Zone 3 is not significant due to low intensity

19 Spatial Scan Statistic: Properties – Adjusts for inhomogeneous population density. – Simultaneously tests for clusters of any size and any location, by using circular windows with continuously variable radius. – Accounts for multiple testing. – Possibility to include confounding variables, such as age, sex or socio-economic variables. – Aggregated or non-aggregated data (states, counties, census tracts, block groups, households, individuals).

20 Detecting Emerging Clusters Instead of a circular window in two dimensions, we use a cylindrical window in three dimensions. Instead of a circular window in two dimensions, we use a cylindrical window in three dimensions. The base of the cylinder represents space, while the height represents time. The base of the cylinder represents space, while the height represents time. The cylinder is flexible in its circular base and starting date, but we only consider those cylinders that reach all the way to the end of the study period. Hence, we are only considering ‘alive’ clusters. The cylinder is flexible in its circular base and starting date, but we only consider those cylinders that reach all the way to the end of the study period. Hence, we are only considering ‘alive’ clusters.

21 West Nile Virus Surveillance in New York City 2000 Data: Simulation/Testing of Prospective Surveillance System 2000 Data: Simulation/Testing of Prospective Surveillance System 2001 Data: Real Time Implementation of Daily Prospective Surveillance 2001 Data: Real Time Implementation of Daily Prospective Surveillance

Data - Dead birds reported by the public - Simulation of a daily prospective surveillance system - Start date: June 1, 2000.

23

24

25 Major epicenter on Staten Island Dead bird surveillance system: June 14 Dead bird surveillance system: June 14 Positive bird report: July 16 (coll. July 5) Positive bird report: July 16 (coll. July 5) Positive mosquito trap: July 24 (coll. July 7) Positive mosquito trap: July 24 (coll. July 7) Human case report: July 28 (onset July 20) Human case report: July 28 (onset July 20)

26 Hospital Emergency Admissions in New York City Hospital emergency admissions data from a majority of New York City hospitals. Hospital emergency admissions data from a majority of New York City hospitals. At midnight, hospitals report last 24 hour of At midnight, hospitals report last 24 hour of data to New York City Department of Health A spatial scan statistic analysis is performed every morning A spatial scan statistic analysis is performed every morning If an alarm, a local investigation is conducted If an alarm, a local investigation is conducted

27 Bird richness in the hexagons.

28 Statewide echelon map based on EMAP hexagons. The 4-digit number in each hexagon is the EPA-EMAP identifier, while the number below is species richness.

29 Aggregation-Resolution Issues

30 Aggregation-Resolution Issues

31

32 Ridge and Valley Hotspots Ridge and Valley hotspots within hotspots. Hotspots within Hotspots

33 Candidate Zones for Hotspots Goal: Identify geographic zone(s) in which a response is significantly elevated relative to the rest of a region Goal: Identify geographic zone(s) in which a response is significantly elevated relative to the rest of a region A list of candidate zones Z is specified a priori A list of candidate zones Z is specified a priori –This list becomes part of the parameter space and the zone must be estimated from within this list –Each candidate zone should generally be spatially connected, e.g., a union of contiguous spatial units or cells –Longer lists of candidate zones are usually preferable –Expanding circles or ellipses about specified centers are a common method of generating the list

34 Poor Hotspot Delineation by Circular Zones Hotspot Circular zone approximations Circular zones may represent single hotspot as multiple hotspots

35 Circular spatial scan statistic zonation Cholera outbreak along a river flood plain Small circles miss much of the outbreak Large circles include many unwanted cells

36

37 Urban Heat Islands Use of aircraft and spacecraft remote sensing data on a local scale to help quantify and map urban sprawl, land use change, urban heat island, air quality, and their impact on human health Use of aircraft and spacecraft remote sensing data on a local scale to help quantify and map urban sprawl, land use change, urban heat island, air quality, and their impact on human health (e.g. pediatric asthma) (e.g. pediatric asthma)

38 Baltimore Asthma Project Interdisciplinary Analysis of Childhood Asthma in Baltimore, MD 1.Collect and integrate in-situ measurements, remotely sensed measurements and clinical records that have possible relationships to the occurrence of asthma in the Baltimore, Maryland region. 2.Identify key trigger variables from the data to predict asthma occurrence on a spatial and temporal basis. 3.Organize a multidisciplinary team to assist in model design, analysis and interpretation of model results. 4.Develop tools for integrating, accessing and manipulating relevant health and remote sensing data and make these tools available to the scientific and health communities. Partners: Baltimore City Health Department Baltimore City School System Baltimore City Planning Council, Mayor’s Office State of Maryland Department of the Environment State of Maryland Department of Health and Human Services University of Maryland Asthma Assessment from GIS techniques Inner Harbor Time Aerosol Size The impact of asthma is escalating within the U.S. and children are particularly impacted with hospitalization increasing 74% since This study is investigating climate and environmental links to asthma in Baltimore, Maryland, a city in the top quintile for children’s asthma in the U.S.

39 Crime Rate Hotspots

40 Outbreak expanding in time Small cylinders miss much of the outbreak Large cylinders include many unwanted cells Space Time Cylindrical space-time scan statistic zonation

41 Some Space-Time Hotspots and Their Cylindrical Approximations Hotspot Cylindrical approximation Cylindrical approximation sees single hotspot as multiple hotspots Space Time

42 ULS Candidate Zones Question: Are there data-driven (rather than a priori) ways of selecting the list of candidate zones? Question: Are there data-driven (rather than a priori) ways of selecting the list of candidate zones? Motivation for the question: A human being can look at a map and quickly determine a reasonable set of candidate zones and eliminate many other zones as obviously uninteresting. Can the computer do the same thing? Motivation for the question: A human being can look at a map and quickly determine a reasonable set of candidate zones and eliminate many other zones as obviously uninteresting. Can the computer do the same thing? A data-driven proposal: Candidate zones are the connected A data-driven proposal: Candidate zones are the connected components of the upper level sets of the response surface. The candidate zones have a tree structure (echelon tree is a subtree), which may assist in automated detection of multiple, but geographically separate, elevated zones. Null distribution: If the list is data-driven (i.e., random), its variability must be accounted for in the null distribution. A new list must be developed for each simulated data set. Null distribution: If the list is data-driven (i.e., random), its variability must be accounted for in the null distribution. A new list must be developed for each simulated data set.

43 Data-adaptive approach to reduced parameter space  0 Data-adaptive approach to reduced parameter space  0 Zones in  0 are connected components of upper level sets of the empirical intensity function G a = Y a / A a Zones in  0 are connected components of upper level sets of the empirical intensity function G a = Y a / A a Upper level set (ULS) at level g consists of all cells a where G a  g Upper level set (ULS) at level g consists of all cells a where G a  g Upper level sets may be disconnected. Connected components are Upper level sets may be disconnected. Connected components are the candidate zones in  0 These connected components form a rooted tree under set inclusion. These connected components form a rooted tree under set inclusion. –Root node = entire region R –Leaf nodes = local maxima of empirical intensity surface –Junction nodes occur when connectivity of ULS changes with falling intensity level ULS Scan Statistic

44 Upper Level Set (ULS) of Intensity Surface Hotspot zones at level g (Connected Components of upper level set)

45 Changing Connectivity of ULS as Level Drops g

46 ULS Connectivity Tree Schematic intensity “surface” N.B. Intensity surface is cellular (piece-wise constant), with only finitely many levels A, B, C are junction nodes where multiple zones coalesce into a single zone A B C

47 Ingredients: Ingredients: – Tessellation of a geographic region: –Intensity value G on each cell. Determines a cellular (piece- wise constant) surface with G as elevation. Imagine surface initially inundated with water Imagine surface initially inundated with water Water evaporates gradually exposing the surface which appears as islands in the sea Water evaporates gradually exposing the surface which appears as islands in the sea How does connectivity (number of connected components) of the exposed surface change with falling water level? How does connectivity (number of connected components) of the exposed surface change with falling water level? ULS Connectivity Tree -1 a c b k d e f g h i j a, b, c, … are cell labels

48 Think of the tessellated surface as a landform Think of the tessellated surface as a landform Initially the entire surface is under water Initially the entire surface is under water As the water level recedes, more and more of the landform is As the water level recedes, more and more of the landform isexposed At each water level, cells are colored as follows: At each water level, cells are colored as follows: –Green for previously exposed cells (green = vegetated) –Yellow for newly exposed cells (yellow = sandy beach) –Blue for unexposed cells (blue = under water) For each newly exposed cell, one of three things happens: For each newly exposed cell, one of three things happens: –New island emerges. Cell is a local maximum. Morse index=2. Connectivity increases. –Existing island increases in size. Cell is not a critical point. Connectivity unchanged. –Two (or more) islands are joined. Cell is a saddle point Morse index=1. Connectivity decreases. ULS Connectivity Tree -2

49 a c b k d e f g h i j ULS Connectivity Tree -3 ULS Tree a a a b,c b k d e f g h i j c Newly exposed island Island grows

50 ULS Connectivity Tree -4 ULS Tree a a b,c b k d e f g h i j c Second island appears d a a b,c b k d e f g h i j c Both islands grow d e f,g New leaf node (local maximum)

51 ULS Connectivity Tree -5 ULS Tree a a b,c b k d e f g h i j c Islands join – saddle point d e f,g h Junction node a a b,c b d e f g h i j c Exposed land grows d e f,g h k i,j,k Root node

52 Agreement/Disagreement regarding hotspot locus Agreement/Disagreement regarding hotspot locus Comparative plausibility and accuracy of hotspot delineation Comparative plausibility and accuracy of hotspot delineation Execution time and computer efficiency Execution time and computer efficiency Comparison of Tree-Structured and Circle-Based SATScan

53 A hotspot confidence set with two connected components is shown on the ULS tree. The connected components correspond to different hotspot loci while the nodes within a connected component correspond to different delineations of that hotspot – all at the appropriate confidence level. Confidence Region on ULS Tree

54 Estimation Uncertainty in Hotspot Delineation Outer envelope Inner envelope MLE

55 Determine a confidence set for the hotspot Determine a confidence set for the hotspot Each member of the confidence set is a zone which is a statistically plausible delineation of the hotspot at specified confidence Each member of the confidence set is a zone which is a statistically plausible delineation of the hotspot at specified confidence Confidence set lets us rate individual cells a for hotspot membership Confidence set lets us rate individual cells a for hotspot membership Rating for cell a is percentage of zones in confidence set that contain a. (More generally, use weighted proportion.) Rating for cell a is percentage of zones in confidence set that contain a. (More generally, use weighted proportion.) Map of cell ratings: Map of cell ratings: –Inner envelope = cells with 100% rating –Outer envelope = cells with positive rating Hotspot Delineation and Hotspot Rating

56

57 Ranking Possible Disease Clusters in the State of New York Data Matrix

58 Regions of comparability and incomparability for the inherent importance ordering of hotspots. Hotspots form a scatterplot in indicator space and each hotspot partitions indicator space into four quadrants

59 Hotspot Prioritization and Poset Ranking Multiple hotspots with intensities significantly elevated relative to the rest of the region Multiple hotspots with intensities significantly elevated relative to the rest of the region Ranking based on likelihood values, and additional attributes: raw intensity values, socio-economic and demographic factors, feasibility scores, excess cases, seasonal residence, atypical demographics, etc. Ranking based on likelihood values, and additional attributes: raw intensity values, socio-economic and demographic factors, feasibility scores, excess cases, seasonal residence, atypical demographics, etc. Multiple attributes, multiple indicators Multiple attributes, multiple indicators Ranking without having to integrate the multiple indicators into a composite index Ranking without having to integrate the multiple indicators into a composite index

60 First stage screening First stage screening –Significant clusters by SaTScan and/or upper level sets upper level sets Second stage screening Second stage screening –Multicriteria noteworthy clusters by partially ordered sets and Hass diagrams Final stage screening Final stage screening –Follow up clusters for etiology, intervention based on multiple criteria using Hass diagrams based on multiple criteria using Hass diagrams Multiple Criteria Analysis, Multiple Indicators and Choices, Health Statistics, Disease Etiology, Health Policy, Resource Allocation

61 HUMAN ENVIRONMENT INTERFACE LAND, AIR, WATER INDICATORS RANK COUNTRYLANDAIRWATER 1Sweden 2Finland 3Norway 5 Iceland 13 Austria 22 Switzerland 39 Spain 45 France 47 Germany 51 Portugal 52 Italy 59 Greece 61 Belgium 64 Netherlands 77 Denmark 78 United Kingdom 81 Ireland for land - % of undomesticated land, i.e., total land area-domesticated (permanent crops and pastures, built up areas, roads, etc.) for air - % of renewable energy resources, i.e., hydro, solar, wind, geothermal for water - % of population with access to safe drinking water

62 Hasse Diagram (Western Europe)

63 Ranking Partially Ordered Sets – 5 Linear extension decision tree a b dc e f a c e b b d ff d ed f e e f c f d ed f e e f d f e e f c f e e f c c f d ed f e e f d f e e f c b a b a d Jump Size: Poset (Hasse Diagram)

64 Cumulative Rank Frequency Operator – 5 An Example of the Procedure In the example from the preceding slide, there are a total of 16 linear extensions, giving the following cumulative frequency table. Rank Element a b c d e f Each entry gives the number of linear extensions in which the element (row label) receives a rank equal to or better that the column heading

65 Cumulative Rank Frequency Operator – 6 An Example of the Procedure 16 The curves are stacked one above the other and the result is a linear ordering of the elements: a > b > c > d > e > f

66 Cumulative Rank Frequency Operator – 7 An example where F must be iterated Original Poset (Hasse Diagram) a f eb c g d h a f e b ad c h g a f e b ad c h g F F 2

67 Cumulative Rank Frequency Operator – 8 An example where F results in ties Original Poset (Hasse Diagram) a cb d a b, c (tied) d F Ties reflect symmetries among incomparable elements in the original Hasse diagram Elements that are comparable in the original Hasse diagram will not become tied after applying F operator

68

69 Incorporating Judgment Poset Cumulative Rank Frequency Approach Certain of the indicators may be deemed more important than the others Such differential importance can be accommodated by the poset cumulative rank frequency approach Instead of the uniform distribution on the set of linear extensions, we may use an appropriately weighted probability distribution , e.g.,

70 Network-based Surveillance Subway system surveillance Subway system surveillance Drinking water distribution system surveillance Drinking water distribution system surveillance Stream and river system surveillance Stream and river system surveillance Postal System Surveillance Postal System Surveillance Road transport surveillance Road transport surveillance

71 Figure 1. Upper Juniata river network with superimposed network of Pennsylvania Department of Environmental Protection (DEP) sampling stations. In Phase 1, scan statistics methods will be used to identify hotspots of biological impairment. Phase 2 incorporates potential explanatory factors and identifies residual hotspots that are subjected to detailed network modeling in Phase 3.

72 Disease Mapping and Evaluation Elevated rate cluster detection and assessment Elevated rate cluster detection and assessment Geographic surveillance and evaluation Geographic surveillance and evaluation Comparative studies involving statistical power Comparative studies involving statistical power Geometric accuracy in cluster estimation Geometric accuracy in cluster estimation Real data projects and validations Real data projects and validations Computing and software Computing and software

73 Agent Orange Dioxin Hotspots, coldspots, and midspots Hotspots, coldspots, and midspots Identification and prioritization Identification and prioritization Persistent hotspot trajectories Persistent hotspot trajectories Spatial disease monitoring, modeling, and tracking Spatial disease monitoring, modeling, and tracking Health effect surveillance system Health effect surveillance system

74 Agent Orange Dioxin Ecosystem impact assessment Ecosystem impact assessment Deforestation, defoliation, remote sensing Deforestation, defoliation, remote sensing Hyperspectral change detection and accuracy assessment Hyperspectral change detection and accuracy assessment Emergent advanced raster map analysis system Emergent advanced raster map analysis system

75 Crop Biosurveillance/Biosecurity

76 Crop Biosurveillance/Biosecurity

77 Crop Biosurveillance/Biosecurity

78 Emergent Surveillance Plexus Surveillance Sensor Network Testbed Autonomous Ocean Sampling Network Types of Hotspots Hotspots due to multiple, localized, stationary sources Hotspots due to multiple, localized, stationary sources Hotspots corresponding to areas of interest in a stationary mapped field Hotspots corresponding to areas of interest in a stationary mapped field Time-dependent, localized hotspots Time-dependent, localized hotspots Hotspots due to moving point sources Hotspots due to moving point sources

79 Ocean SAmpling MObile Network OSAMON

80 Ocean SAmpling MObile Network OSAMON Feedback Loop Network sensors gather preliminary data Network sensors gather preliminary data ULS scan statistic uses available data to estimate hotspot ULS scan statistic uses available data to estimate hotspot Network controller directs sensor vehicles to new locations Network controller directs sensor vehicles to new locations Updated data is fed into ULS scan statistic system Updated data is fed into ULS scan statistic system

81 High - -- Indoor / Outdoor GPS for cooperative navigation Cross-platform joint service experiments Operational testing for more realistic adaptation Aerial Mapping adds new perspective Expand data storage and processing Larger cross-platform teams capture realistic cooperation and dynamic adaptation Surveillance Sensor Network Testbed

82 SAmpling MObile Networks (SAMON) Additional Application Contexts Hotspots for radioactivity and chemical or biological agents to prevent or mitigate the effects of terrorist attacks or to detect nuclear testing Hotspots for radioactivity and chemical or biological agents to prevent or mitigate the effects of terrorist attacks or to detect nuclear testing Mapping elevation, wind, bathymetry, or ocean currents to better understand and protect the environment Mapping elevation, wind, bathymetry, or ocean currents to better understand and protect the environment Detecting emerging failures in a complex networked system like the electric grid, internet, cell phone systems Detecting emerging failures in a complex networked system like the electric grid, internet, cell phone systems Mapping the gravitational field to find underground chambers or tunnels for rescue or combat missions Mapping the gravitational field to find underground chambers or tunnels for rescue or combat missions

83 Environmental Justice Is environmental degradation worse in poor and minority communities? Is environmental degradation worse in poor and minority communities? Identification of persistent poverty and its trajectories Identification of persistent poverty and its trajectories Identification of pollution proximity and vulnerability Identification of pollution proximity and vulnerability Co-location research hampered by the lack of methods and tools of investigation Co-location research hampered by the lack of methods and tools of investigation

84

85 Space-Time Poverty Hotspot Typology Federal Anti-Poverty Programs have had little success in eradicating pockets of persistent poverty Federal Anti-Poverty Programs have had little success in eradicating pockets of persistent poverty Can spatial-temporal patterns of poverty hotspots provide clues to the causes of poverty and lead to improved location- specific anti-poverty policy ? Can spatial-temporal patterns of poverty hotspots provide clues to the causes of poverty and lead to improved location- specific anti-poverty policy ?

86 Dimensions of Tract Poverty in Four Metropolitan Areas Concentrated Persistent Camden, NJ Detroit, MI Oakland, CA Memphis, TN Growing Shifting

87 Growing poverty

88 Persistent poverty

89 Oakland 1970 Poverty dataOakland 1980 Poverty data Oakland 1990 Poverty data Shifting poverty

90 Camden 1990 Poverty data Concentrated poverty Camden 1970 Poverty dataCamden 1980 Poverty data

91 Hotspot Persistence Hotspot Persistence Space (census tract) Time (census year) Persistent Hotspot Long Duration Space (census tract) Time (census year) One-shot Hotspot Short Duration Persistence is a property of space-time hotspots Persistence can be assessed by the projection of the space-time hotspot onto the time axis

92 Typology of Persistent Space-Time Hotspots-1 Space (census tract) Time (census year) Stationary Hotspot Space (census tract) Time (census year) Shifting Hotspot Space (census tract) Time (census year) Expanding Hotspot Space (census tract) Time (census year) Contracting Hotspot

93 Typology of Persistent Space-Time Hotspots-2 Space (census tract) Time (census year) Bifurcating Hotspot These hotspots are connected in space-time However, certain time slices of the hotspot are disconnected in space Space (census tract) Time (census year) Merging Hotspot Spatially disconnected time slice

94 Trajectory of a Persistent Space-Time Hotspot A space-time hotspot is a three-dimensional object Visualization can be done by displaying the sequence of time slices---called the trajectory of the hotspot Time slices of space-time hotspot Space (census tract) Time (census year) Merging Hotspot

95 Trajectory of a Merging Hotspot

96 Trajectory of a Shifting Hotspot

97 Geospatial Patterns and Pattern Metrics Geospatial Patterns and Pattern Metrics –Landscape patterns, disease patterns, mortality patterns Surface Topology and Spatial Structure Surface Topology and Spatial Structure –Hotspots, outbreaks, critical areas –Intrinsic hierarchical decomposition, study areas, reference areas –Change detection, change analysis, spatial structure of change Multiscale Advanced Raster Map Analysis

98 Needed Downloadable geoinformatic analysis system Downloadable geoinformatic analysis system Identification of arbitrary shape hotspots, coldspots, midspots Identification of arbitrary shape hotspots, coldspots, midspots Multi-criteria prioritization and ranking Multi-criteria prioritization and ranking Synoptic and network-based surveillance Synoptic and network-based surveillance Multiple scales and aggregation levels Multiple scales and aggregation levels Space, time, and space-time Space, time, and space-time Zonal tree scan system Zonal tree scan system

99 Terra Seer Long Island Study-1 August 13, 2002 GEOGRAPHIC TRACE OF CANCER The Geographic Distribution of BLC Cancers in Long Island NY The Geographic Distribution of BLC Cancers in Long Island NY Geographic Trace of Cancer (the pattern of cases over time and space) may even become as important as genetic research in the search for causes Geographic Trace of Cancer (the pattern of cases over time and space) may even become as important as genetic research in the search for causes Need of an Exploratory, Integrative, Need of an Exploratory, Integrative, Multi-scalar Approach

100 Terra Seer Long Island Study-2 August 13, 2002 STRENGTHENING SATSCAN The techniques we employ are not ‘better’ or ‘more accurate’ than the scan statistic The techniques we employ are not ‘better’ or ‘more accurate’ than the scan statistic Using a battery of approaches allows us to quantify different aspects of univariate and multivariate clusters, to explore different scales of clustering, and to evaluate how sensitive the results are to different definitions of clustering Using a battery of approaches allows us to quantify different aspects of univariate and multivariate clusters, to explore different scales of clustering, and to evaluate how sensitive the results are to different definitions of clustering The methods we have used are not designed to replace the scan approach, rather they should be used with the scan statistic to develop a more comprehensive understanding of geographic variation of cancer morbidity The methods we have used are not designed to replace the scan approach, rather they should be used with the scan statistic to develop a more comprehensive understanding of geographic variation of cancer morbidity

101 Terra Seer Long Island Study-3 August 13, 2002 HOTSPOTS, PUBLIC AND AGENCIES Public: Being singled out as an elevated cluster community is such a source of dread for many people Public: Being singled out as an elevated cluster community is such a source of dread for many people Agency: Hotspot status creates intense pressure to study individual clusters instead of broader cancer patterns Agency: Hotspot status creates intense pressure to study individual clusters instead of broader cancer patterns Need: Hotspot rating along with hotspot detection, prioritization, and ranking interval with multicriteria Need: Hotspot rating along with hotspot detection, prioritization, and ranking interval with multicriteria

102 Terra Seer Long Island Study-4 August 13, 2002 Issues and Approaches Issues and Approaches –Local Moran test, Boundary Detection, Boundary Overlap, Multi-Scalar, and Spatially Constrained Clustering Adjacency: Centroid, common border, upper level set tree neighbor Adjacency: Centroid, common border, upper level set tree neighbor Multiscalar: Multivariate response, Multivariate Binomial, Poisson, Gamma, Lognormal Multiscalar: Multivariate response, Multivariate Binomial, Poisson, Gamma, Lognormal Scale of Study: Spatial extent, Hotspot in Hotspots Scale of Study: Spatial extent, Hotspot in Hotspots Scale of Process: Resolution, Aggregation Level Scale of Process: Resolution, Aggregation Level

103 Recommendations Research, education, and outreach initiative Research, education, and outreach initiative National Center National Center Multi-focal multi-disciplinary thrust Multi-focal multi-disciplinary thrust Concepts, methods, tools, validations Concepts, methods, tools, validations Solid, sound, sophisticated and yet user- friendly Priority activities and regional projects Priority activities and regional projects Downloadable user-friendly software system Downloadable user-friendly software system Innovative Synergistics Innovative Synergistics

104 Penn State Involvements Center for Statistical Ecology and Environmental Statistics/NSF,EPA Center for Statistical Ecology and Environmental Statistics/NSF,EPA Poverty Research Center/Ford Foundation Poverty Research Center/Ford Foundation GeoVista Center/NCI, NSF GeoVista Center/NCI, NSF Geovisualization and Spatial Analysis of Cancer Data NCI Program For: Geographic-based Research in Cancer Control and Epidemiology Appalachia Cancer Network/NCI Appalachia Cancer Network/NCI Environmental Consortium on Biosurveillance Environmental Consortium on Biosurveillance Cooperative Wetlands Center/EPA Cooperative Wetlands Center/EPA Center for Remote Sensing of Earth Resources/NASA, EPA Center for Remote Sensing of Earth Resources/NASA, EPA

105 Logo for Statistics, Environment, Health, Ecology, and Society