Project Geoinformatic Surveillance NSF DGP Grant 307010 G. P. Patil, Penn State, PI EPA: Watershed Characterization and Prioritization PADOH: Disease Clusters.

Presentation transcript:

Project Geoinformatic Surveillance
NSF DGP Grant
G. P. Patil, Penn State, PI
EPA: Watershed Characterization and Prioritization
PADOH: Disease Clusters and Prioritization
TAPAC: CDC, EPA, NASA, NIH, USGS

2

3

4 The Spatial Scan Statistic
Move a circular window across the map.
Use a variable circle radius, from zero up to a maximum where 50 percent of the population is included.

5 A small sample of the circles used

6 Spatial Scan Statistic: Properties
– Adjusts for inhomogeneous population density.
– Simultaneously tests for clusters of any size and any location, by using circular windows with continuously variable radius.
– Accounts for multiple testing.
– Can include confounding variables, such as age, sex, or socio-economic variables.
– Works with aggregated or non-aggregated data (states, counties, census tracts, block groups, households, individuals).
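
The circular scan described above can be illustrated with a short Python sketch. It is not the project's software. It assumes each spatial unit is a point with a population and a case count, that distance is Euclidean, and that zones are scored with the Poisson-model log likelihood ratio commonly used with the spatial scan statistic (the slides do not state the formula). The names `poisson_llr` and `circular_scan` are hypothetical.

```python
import math

def poisson_llr(c, e, C):
    """Log likelihood ratio for a zone with c observed and e expected cases,
    out of C cases in total (assumed Poisson model). Zero unless c exceeds e."""
    if e <= 0 or c <= e:
        return 0.0
    inside = c * math.log(c / e)
    outside = (C - c) * math.log((C - c) / (C - e)) if C > c else 0.0
    return inside + outside

def circular_scan(cells, max_pop_fraction=0.5):
    """cells: list of (x, y, population, cases).
    Returns (best_llr, center_index, member_indices) over all circles that are
    centered on a cell and hold at most max_pop_fraction of the population."""
    total_pop = sum(p for _, _, p, _ in cells)
    total_cases = sum(c for _, _, _, c in cells)
    best = (0.0, None, [])
    for i, (xi, yi, _, _) in enumerate(cells):
        # Growing the radius from zero is the same as adding cells nearest-first.
        order = sorted(range(len(cells)),
                       key=lambda j: (cells[j][0] - xi) ** 2 + (cells[j][1] - yi) ** 2)
        pop = cases = 0
        members = []
        for j in order:
            pop += cells[j][2]
            if pop > max_pop_fraction * total_pop:
                break  # circle would exceed the population cap
            cases += cells[j][3]
            members.append(j)
            expected = total_cases * pop / total_pop  # adjusts for population density
            llr = poisson_llr(cases, expected, total_cases)
            if llr > best[0]:
                best = (llr, i, list(members))
    return best
```

Significance would then be judged by recomputing the maximum on data sets simulated under the null hypothesis, which is how the method accounts for multiple testing; that step is omitted here.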

7 Detecting Emerging Clusters
Instead of a circular window in two dimensions, we use a cylindrical window in three dimensions.
The base of the cylinder represents space, while the height represents time.
The cylinder is flexible in its circular base and starting date, but we only consider those cylinders that reach all the way to the end of the study period. Hence, we are only considering ‘alive’ clusters.

8 Major epicenter on Staten Island
Dead bird surveillance system: June 14
Positive bird report: July 16 (coll. July 5)
Positive mosquito trap: July 24 (coll. July 7)
Human case report: July 28 (onset July 20)

9

10 Hospital Emergency Admissions in New York City
Hospital emergency admissions data from a majority of New York City hospitals.
At midnight, hospitals report the last 24 hours of data to the New York City Department of Health.
A spatial scan statistic analysis is performed every morning.
If there is an alarm, a local investigation is conducted.

11 You Are Invited NSF DGP PROJECT Geoinformatic Surveillance: Hotspot Detection and Prioritization Across Geographic Regions and Networks for Digital Government in the 21st Century Geoinformatic surveillance for spatial and temporal hotspot detection and prioritization is a critical need for the 21st century Digital Government. A hotspot can mean an unusual phenomenon, anomaly, aberration, outbreak, elevated cluster, or critical area. The declared need may be for monitoring, etiology, management, or early warning. The responsible factors may be natural, accidental or intentional, with relevance to both infrastructure and homeland security. This project describes a multi-disciplinary research program based on novel methods and tools for hotspot detection and prioritization, driven by a wide variety of case studies of direct interest to several government agencies. These case studies deal with critical societal issues, such as carbon budgets, water resources, ecosystem health, public health, drinking water distribution system, persistent poverty, environmental justice, crop pathogens, invasive species, biosecurity, biosurveillance, remote sensor networks, early warning and homeland security. The geosurveillance provides an excellent opportunity, challenge, and vehicle for synergistic collaboration of computational, technical, and social scientists. Our methodology involves an innovation of the popular circle-based spatial scan statistic methodology. In particular, it employs the notion of an upper level set and is accordingly called the upper level set scan statistic, pointing to the next generation of a sophisticated analytical and computational system, effective for the detection of arbitrarily shaped hotspots along spatio-temporal dimensions. We also propose a novel prioritization scheme based on multiple indicator and stakeholder criteria without having to integrate indicators into an index, using revealing Hasse diagrams and partially ordered sets. 
Responding to the Government’s role and need, we propose a cross-disciplinary collaboration among federal agencies and academic researchers to design and build the prototype system for surveillance infrastructure of hotspot detection and prioritization. The methodological toolbox and the software toolkit developed will support and leverage core missions of federal agencies as well as their interactive counterparts in the society. The research advances in the allied sciences and technologies necessary to make such a system work are the thrust of this five year project. The project will have a dual disciplinary and cross-disciplinary thrust. Dialogues and discussions will be particularly welcome, leading potentially to well considered synergistic case studies. The collaborative case studies are expected to be conceptual, structural, methodological, computational, applicational, developmental, refinemental, validational, and/or visualizational in their individual thrust. For additional information, see the webpages: (1) (2) (3) Project address: Penn State Center for Statistical Ecology and Environmental Statistics 421 Thomas Building, Penn State University, University Park, PA Telephone: (814) ;

12 National Applications
Biosurveillance
Carbon Management
Coastal Management
Community Infrastructure
Crop Surveillance
Disaster Management
Disease Surveillance
Ecosystem Health
Environmental Justice
Environmental Management
Environmental Policy
Homeland Security
Invasive Species
Poverty Policy
Public Health
Public Health and Environment
Syndromic Surveillance
Urban Crime
Water Management

13 Hotspot Detection Innovation: Upper Level Set Scan Statistic
Attractive Features
Identifies arbitrarily shaped clusters
Data-adaptive zonation of candidate hotspots
Applicable to data on a network
Provides both a point estimate and a confidence set for the hotspot
Uses hotspot-membership rating to map hotspot boundary uncertainty
Computationally efficient
Applicable to both discrete and continuous syndromic responses
Identifies arbitrarily shaped clusters in the spatial-temporal domain
Provides a typology of space-time hotspots with discriminatory surveillance potential

14 Candidate Zones for Hotspots
Goal: Identify geographic zone(s) in which a response is significantly elevated relative to the rest of a region
A list of candidate zones Z is specified a priori
– This list becomes part of the parameter space and the zone must be estimated from within this list
– Each candidate zone should generally be spatially connected, e.g., a union of contiguous spatial units or cells
– Longer lists of candidate zones are usually preferable
– Expanding circles or ellipses about specified centers are a common method of generating the list

15 Scan Statistic Zonation for Circles and Space-Time Cylinders

16 ULS Candidate Zones
Question: Are there data-driven (rather than a priori) ways of selecting the list of candidate zones?
Motivation for the question: A human being can look at a map and quickly determine a reasonable set of candidate zones and eliminate many other zones as obviously uninteresting. Can the computer do the same thing?
A data-driven proposal: Candidate zones are the connected components of the upper level sets of the response surface. The candidate zones have a tree structure (echelon tree is a subtree), which may assist in automated detection of multiple, but geographically separate, elevated zones.
Null distribution: If the list is data-driven (i.e., random), its variability must be accounted for in the null distribution. A new list must be developed for each simulated data set.

17 ULS Scan Statistic
Data-adaptive approach to reduced parameter space $\Omega_0$
Zones in $\Omega_0$ are connected components of upper level sets of the empirical intensity function $G_a = Y_a / A_a$
Upper level set (ULS) at level $g$ consists of all cells $a$ where $G_a \ge g$
Upper level sets may be disconnected. Connected components are the candidate zones in $\Omega_0$
These connected components form a rooted tree under set inclusion.
– Root node = entire region R
– Leaf nodes = local maxima of empirical intensity surface
– Junction nodes occur when connectivity of ULS changes with falling intensity level
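
One way to read the construction above is as a downward sweep of the intensity level: cells enter in decreasing order of their empirical intensity, and each newly formed or enlarged connected component becomes a candidate zone. The Python sketch below illustrates this under stated assumptions: intensities are given as a dict from cell to value, adjacency as a dict from cell to neighboring cells, and the function name `uls_zones` is hypothetical. It is a sketch, not the project's implementation.

```python
def uls_zones(rates, neighbors):
    """rates: dict cell -> empirical intensity G_a.
    neighbors: dict cell -> iterable of adjacent cells.
    Returns a list of (level, frozenset_of_cells), one per candidate zone,
    ordered from the highest level down to the root (the whole region)."""
    parent = {}   # union-find forest over cells already above the current level
    members = {}  # root cell -> set of cells in its connected component

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if len(members[ra]) < len(members[rb]):
            ra, rb = rb, ra
        parent[rb] = ra
        members[ra] |= members.pop(rb)

    zones = []
    # The surface is piecewise constant, so only finitely many levels matter.
    for g in sorted(set(rates.values()), reverse=True):
        new_cells = [a for a in rates if rates[a] == g]
        for a in new_cells:
            parent[a] = a
            members[a] = {a}
        for a in new_cells:
            for b in neighbors.get(a, ()):
                if b in parent:          # neighbor is already in the upper level set
                    union(a, b)
        # Every component touched at this level is a new node of the ULS tree.
        for r in {find(a) for a in new_cells}:
            zones.append((g, frozenset(members[r])))
    return zones
```

On a connected region the last zone returned is the entire region (the root), isolated local maxima appear as single-cell zones (leaves), and a zone that absorbs two or more earlier zones corresponds to a junction node. The list holds at most one zone per cell, which is what keeps the parameter space small.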

18 Upper Level Set (ULS) of Intensity Surface Hotspot zones at level g (Connected Components of upper level set)

19 Changing Connectivity of ULS as Level Drops g

20 ULS Connectivity Tree
Schematic intensity “surface”
N.B. Intensity surface is cellular (piecewise constant), with only finitely many levels
A, B, C are junction nodes where multiple zones coalesce into a single zone

21 A confidence set of hotspots on the ULS tree. The different connected components correspond to different hotspot loci while the nodes within a connected component correspond to different delineations of that hotspot

22 Network Analysis of Biological Integrity in Freshwater Streams

23 Network-Based Surveillance
Subway system surveillance
Drinking water distribution system surveillance
Stream and river system surveillance
Postal system surveillance
Road transport surveillance
Syndromic surveillance

24 Syndromic Surveillance
Symptoms of disease such as diarrhea, respiratory problems, headache, etc.
Earlier reporting than diagnosed disease
Less specific, more noise

25 Syndromic Surveillance
(left) The overall procedure, leading from admissions records to the crisis index for a hospital. The hotspot detection algorithm is then applied to the crisis index values defined over the hospital network.
(right) The ε-machine procedure for converting an event stream into a parse tree and finally into a probabilistic finite state automaton (PFSA).

26 Mapping Priority Hotspots of Vegetative Disturbance for Carbon Budgets

27 Crop Biosurveillance/Biosecurity

28 Crop Biosurveillance/Biosecurity: Data Processing Module
Diagram components: Hyperspectral Imagery; Signature Library; Image Segmentation (hyperclustering); Proxy Signal (per segment); Disease Signature; Similarity Index (per segment); Tessellation (segmentation) of raster grid; Signature Similarity Map; Hotspot/Anomaly Detection

29 Emergent Surveillance Plexus (ESP)
Surveillance Sensor Network Testbed
Autonomous Ocean Sampling Network
Types of Hotspots
Hotspots due to multiple, localized, stationary sources
Hotspots corresponding to areas of interest in a stationary mapped field
Time-dependent, localized hotspots
Hotspots due to moving point sources

30 Ocean SAmpling MObile Network OSAMON

31 Ocean SAmpling MObile Network (OSAMON) Feedback Loop
Network sensors gather preliminary data
ULS scan statistic uses available data to estimate hotspot
Network controller directs sensor vehicles to new locations
Updated data is fed into ULS scan statistic system

32 SAmpling MObile Networks (SAMON): Additional Application Contexts
Hotspots for radioactivity and chemical or biological agents to prevent or mitigate the effects of terrorist attacks or to detect nuclear testing
Mapping elevation, wind, bathymetry, or ocean currents to better understand and protect the environment
Detecting emerging failures in a complex networked system like the electric grid, internet, cell phone systems
Mapping the gravitational field to find underground chambers or tunnels for rescue or combat missions

Prioritization Innovation: Partially Ordered Set Ranking
We present a prioritization innovation:
Ability to prioritize and rank hotspots
Based on multiple indicator and stakeholder criteria without integrating indicators into an index
Employs Hasse diagrams and partially ordered sets
Leads to
– Early warning systems
– Selection of areas for focused investigation

Multiple Criteria Analysis: Multiple Indicators and Choices
Health Statistics, Disease Etiology, Health Policy, Resource Allocation
First stage screening
– Significant clusters by SaTScan and/or upper level sets
Second stage screening
– Multicriterion noteworthy clusters by partially ordered sets and Hasse diagrams
Final stage screening
– Follow up clusters for etiology, intervention based on multiple criteria using Hasse diagrams

Hotspot Prioritization: Poset Ranking
Main Features:
Multiple hotspots with intensities significantly elevated relative to the rest of the region
Ranking based on likelihood values, and additional attributes: raw intensity values, socio-economic and demographic factors, feasibility scores, excess cases, seasonal residence, atypical demographics, etc.
Multiple attributes, multiple indicators
Ranking without having to integrate the multiple indicators into a composite index
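
Ranking without a composite index can be illustrated by comparing hotspots indicator by indicator. In the Python sketch below, one hotspot ranks above another only if it is at least as high on every indicator and strictly higher on at least one; otherwise the two are left incomparable. The orientation (larger value means higher priority), the function names `dominates` and `hasse_covers`, and the sample data are all assumptions made for illustration.

```python
def dominates(u, v):
    """True if indicator tuple u is >= v on every indicator and > v on at least one."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))

def hasse_covers(items):
    """items: dict name -> tuple of indicator values (larger = higher priority).
    Returns the set of (upper, lower) pairs that are edges of the Hasse diagram,
    i.e. order relations with no third item strictly in between."""
    above = {(x, y) for x in items for y in items
             if x != y and dominates(items[x], items[y])}
    return {(x, y) for (x, y) in above
            if not any((x, z) in above and (z, y) in above for z in items)}

# Hypothetical hotspots scored on two indicators.
hotspots = {"A": (3, 3), "B": (2, 3), "C": (3, 1), "D": (1, 1)}
edges = hasse_covers(hotspots)
```

Here B and C are incomparable because each beats the other on one indicator, so the diagram has A at the top, D at the bottom, and B and C side by side. Items with identical indicator values are treated as unordered in this sketch.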

HUMAN ENVIRONMENT INTERFACE (HEI): LAND, AIR, WATER INDICATORS
Columns: RANK, COUNTRY, LAND, AIR, WATER
Countries: Sweden, Finland, Norway, Iceland, Austria, Switzerland, Spain, France, Germany, Portugal, Italy, Greece, Belgium, Netherlands, Denmark, United Kingdom, Ireland
LAND: percent undomesticated (excludes permanent crops, pastures, built up areas, roads)
AIR: percent renewable energy (e.g., hydro, solar, wind, geothermal)
WATER: percent of population with access to safe drinking water

Hasse Diagram for HEI (Western Europe)

Hasse Diagrams: Linear Extensions
The decision tree enumerates all possible linear extensions of the poset. Every downward path through the decision tree determines a linear extension. Dashed links in the decision tree are not implied by the partial order and are called jumps. Tracing the linear extension in the original Hasse diagram requires a jump at each dashed link. Note that there is a pure-jump linear extension (path a, b, c, d, e, f) in which every link is a jump.
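
The decision-tree enumeration can be sketched as a depth-first search that, at each step, places any element with nothing unplaced above it. The Python sketch below assumes the poset is given as a set of (upper, lower) pairs, either the cover relations of the Hasse diagram or the full order relation. The names `linear_extensions` and `count_jumps` and the four-element example are hypothetical; the example is not the poset shown on the slide.

```python
def linear_extensions(elements, above):
    """Yield every linear extension of a poset, listed from top to bottom.
    above: set of (x, y) pairs meaning x ranks above y."""
    def extend(prefix, remaining):
        if not remaining:
            yield tuple(prefix)
            return
        for x in sorted(remaining):
            # x may come next only if no unplaced element ranks above it.
            if not any((y, x) in above for y in remaining):
                yield from extend(prefix + [x], remaining - {x})
    yield from extend([], frozenset(elements))

def count_jumps(extension, above):
    """Number of consecutive links in the extension not implied by the partial order."""
    return sum(1 for x, y in zip(extension, extension[1:]) if (x, y) not in above)

# Hypothetical diamond-shaped poset: a above b and c, both above d.
covers = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
extensions = list(linear_extensions("abcd", covers))
```

Each yielded tuple corresponds to one downward path through the decision tree. For the diamond there are two extensions, and each contains one jump (the link between b and c). A poset with no order relations at all has only pure-jump extensions.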

Cumulative Rank Frequency Operator Example of the Procedure 16 The curves are stacked one above the other and the result is a linear ordering of the elements: a > b > c > d > e > f
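
The procedure can be sketched in Python under the assumption that every linear extension is equally likely: count how often each element lands at each rank, accumulate the counts from the top rank down, and check whether the resulting curves are stacked. The brute-force enumeration by permutations is practical only for small posets, and the function names and the three-element example are hypothetical rather than taken from the slides.

```python
from itertools import permutations

def linear_extensions(elements, above):
    """All orderings (top to bottom) that respect every (upper, lower) pair in above."""
    return [p for p in permutations(sorted(elements))
            if all(p.index(x) < p.index(y) for x, y in above)]

def cumulative_rank_frequencies(elements, above):
    """Return ({element: cumulative counts by rank}, number of linear extensions).
    Entry k of a tuple counts the extensions placing the element at rank k+1 or higher."""
    exts = linear_extensions(elements, above)
    crf = {}
    for x in elements:
        counts = [0] * len(elements)
        for p in exts:
            counts[p.index(x)] += 1
        running, cum = 0, []
        for c in counts:
            running += c
            cum.append(running)
        crf[x] = tuple(cum)
    return crf, len(exts)

def crf_order(crf):
    """Return the elements from highest to lowest if the cumulative curves are
    strictly stacked one above the other; return None if any two curves tie or cross."""
    ranked = sorted(crf, key=lambda x: crf[x], reverse=True)
    for hi, lo in zip(ranked, ranked[1:]):
        stacked = all(h >= l for h, l in zip(crf[hi], crf[lo]))
        if not stacked or crf[hi] == crf[lo]:
            return None
    return ranked

# Hypothetical poset: a ranks above b; c is incomparable to both.
crf, n_ext = cumulative_rank_frequencies(["a", "b", "c"], {("a", "b")})
order = crf_order(crf)
```

In this example the three curves are stacked and give the linear ordering a > c > b. When curves tie or cross, the result is again only a partial order, and the operator would have to be applied again.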

Cumulative Rank Frequency Operator
Iteration may be required to achieve a linear ordering
[Figure: original poset (Hasse diagram) on elements a through h, followed by the posets obtained after successive applications of the operator]

Cumulative Rank Frequency Approach: Incorporating Judgment
Certain of the indicators may be deemed more important than the others
Such differential importance can be accommodated in the poset cumulative rank frequency approach
Instead of the uniform distribution on the set of linear extensions, use an appropriately weighted probability distribution, e.g.,