 Statistical approaches for detecting unexplained clusters of disease.  Spatial Aggregation Thomas Talbot New York State Department of Health Environmental.

 Statistical approaches for detecting unexplained clusters of disease.  Spatial Aggregation Thomas Talbot New York State Department of Health Environmental Health Surveillance Section Albany School of Public Health GIS & Public Health Class March 3, 2009

Cluster. Webster's Dictionary: a number of similar things grouped closely together. Public health definition: unexplained concentrations of health events in space and/or time.

Occupation Sex, Age Socioeconomic class Behavior (smoking) Race Time Space

Spatial Autocorrelation. "Everything is related to everything else, but near things are more related than distant things." (Tobler's first law of geography) [Figure panels: negative autocorrelation; positive autocorrelation.]

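Moran's I. A test for spatial autocorrelation in disease rates. When nearby areas tend to have similar rates of disease, Moran's I is greater than 0 (positive spatial autocorrelation). When nearby areas are dissimilar, Moran's I is less than 0 (negative spatial autocorrelation). Values near 0 (more precisely, near the expected value -1/(n-1) for n areas) indicate no spatial autocorrelation; the statistic generally falls between -1 and +1.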
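
To make the sign convention concrete, here is a minimal sketch of the global Moran's I computation. The four-area layout, the rates, and the binary neighbor weights are hypothetical illustration data, and `morans_i` is a name chosen here, not a function from the presentation.

```python
def morans_i(values, weights):
    """Global Moran's I.

    values  -- one rate per area
    weights -- weights[i][j] > 0 when areas i and j are neighbors, else 0
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    ss = sum(d * d for d in dev)
    return (n / s0) * (cross / ss)


# Four areas in a row (hypothetical); adjacent areas are neighbors.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

clustered = [10.0, 10.0, 2.0, 2.0]    # similar rates sit next to each other
alternating = [10.0, 2.0, 10.0, 2.0]  # neighbors always differ

print(morans_i(clustered, w))    # positive
print(morans_i(alternating, w))  # negative
```

A significance test would normally follow, for example by permuting the rates across areas and recomputing the statistic.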

Detecting Clusters Consider scale Consider zone Control for multiple testing


Cluster Questions Does a disease cluster in space? Does a disease cluster in both time and space? Where is the most likely cluster? Where is the most likely cluster in both time and space?

More Cluster Questions At what geographic or population scale do clusters appear? Are cases of disease clustered in areas of high exposure?

Nearest Neighbor Analysis: Cuzick & Edwards Method. Count the number of cases whose nearest neighbors are cases and not controls. When cases are clustered, the nearest neighbor to a case will tend to be another case, and the test statistic will be large.
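
The counting rule above can be sketched as follows. This is an illustrative version under stated assumptions: significance is judged by randomly relabeling cases and controls (a Monte Carlo stand-in, not necessarily the reference distribution used in the original method), and the coordinates and function names are hypothetical.

```python
import math
import random


def cuzick_edwards_t(points, is_case, k=1):
    """Sum, over all cases, of how many of the k nearest neighbors are cases."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        if not is_case[i]:
            continue
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.hypot(points[j][0] - xi,
                                             points[j][1] - yi))
        total += sum(1 for j in others[:k] if is_case[j])
    return total


def permutation_p_value(points, is_case, k=1, n_perm=999, seed=0):
    """Shuffle the case/control labels to approximate the null distribution."""
    rng = random.Random(seed)
    observed = cuzick_edwards_t(points, is_case, k)
    labels = list(is_case)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if cuzick_edwards_t(points, labels, k) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)


# Hypothetical data: four tightly grouped cases, six scattered controls.
points = [(0, 0), (0.1, 0), (0, 0.15), (0.12, 0.13),
          (5, 5), (6, 5.2), (5.1, 6.3), (9, 1), (8, 2.1), (2, 8)]
is_case = [True] * 4 + [False] * 6

t_obs, p = permutation_p_value(points, is_case)
print(t_obs, p)
```

Because controls are drawn from the same population as the cases, the relabeling step automatically accounts for uneven population density.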

Nearest Neighbor Analyses

Advantages Accounts for the geographic variation in population density Accounts for confounders through judicious selection of controls Can detect clustering with many small clusters

Disadvantages Must have spatial locations of cases & controls Doesn’t show location of the clusters

Spatial Scan Statistic (Martin Kulldorff). Determines the location with an elevated rate that is statistically significant. Adjusts for multiple testing of the many possible locations and area sizes of clusters. Uses Monte Carlo testing techniques.
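
The idea can be sketched in a few dozen lines. This is a simplified illustration, not SaTScan: it assumes a Poisson model, circular windows centered on area centroids, a 50% cap on window population, and a null distribution simulated by allocating cases in proportion to population. The grid data and function names are hypothetical.

```python
import math
import random


def poisson_llr(c, e, total):
    """Log likelihood ratio for a window with c observed and e expected cases.
    Only windows with an excess of cases (c > e) are scored."""
    if c <= e:
        return 0.0
    llr = c * math.log(c / e)
    if total - c > 0:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr


def most_likely_cluster(coords, pop, cases, max_pop_frac=0.5):
    """Grow a circle around each area; return (best LLR, member indices)."""
    total_cases, total_pop = sum(cases), sum(pop)
    best = (0.0, [])
    for cx, cy in coords:
        order = sorted(range(len(coords)),
                       key=lambda j: math.hypot(coords[j][0] - cx,
                                                coords[j][1] - cy))
        c = p = 0
        for k, j in enumerate(order):
            c += cases[j]
            p += pop[j]
            if p > max_pop_frac * total_pop:
                break
            e = total_cases * p / total_pop
            llr = poisson_llr(c, e, total_cases)
            if llr > best[0]:
                best = (llr, sorted(order[:k + 1]))
    return best


def scan_p_value(coords, pop, cases, n_sim=199, seed=0):
    """Monte Carlo p-value: compare the observed maximum LLR with the
    maximum LLR of data sets simulated under equal risk everywhere."""
    rng = random.Random(seed)
    observed, members = most_likely_cluster(coords, pop, cases)
    areas = range(len(pop))
    exceed = 0
    for _ in range(n_sim):
        sim = [0] * len(pop)
        for j in rng.choices(areas, weights=pop, k=sum(cases)):
            sim[j] += 1
        if most_likely_cluster(coords, pop, sim)[0] >= observed:
            exceed += 1
    return observed, members, (exceed + 1) / (n_sim + 1)


# Hypothetical 3 x 3 grid of areas, 1,000 people each; two adjacent
# areas carry an excess of cases.
coords = [(x, y) for y in range(3) for x in range(3)]
pop = [1000] * 9
cases = [20, 20, 5, 5, 5, 5, 5, 5, 5]

llr, members, p = scan_p_value(coords, pop, cases)
print(members, round(llr, 2), p)
```

Taking the maximum over all windows in both the observed and the simulated data is what adjusts for the many locations and sizes examined.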

The Space-Time Scan Statistic Cylindrical window with a circular geographic base and a height corresponding to time. Cylindrical window is moved in space and time. P value for each cylinder calculated.

Knox Method: test for space-time interaction. Test statistic: the number of pairs of cases that are near in both time and space. When space-time interaction is present, cases that are near in space will also be near in time, and the test statistic will be large.
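
A minimal sketch of the Knox statistic follows. The distance and time cutoffs, the example cases, and the use of time permutation to obtain a p-value are assumptions made for illustration.

```python
import math
import random


def knox_statistic(locations, times, max_dist, max_days):
    """Number of case pairs that are close in both space and time."""
    count = 0
    n = len(locations)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.hypot(locations[i][0] - locations[j][0],
                           locations[i][1] - locations[j][1])
            if d <= max_dist and abs(times[i] - times[j]) <= max_days:
                count += 1
    return count


def knox_p_value(locations, times, max_dist, max_days, n_perm=999, seed=0):
    """Shuffle onset times across locations to break any space-time link."""
    rng = random.Random(seed)
    observed = knox_statistic(locations, times, max_dist, max_days)
    shuffled = list(times)
    at_least = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if knox_statistic(locations, shuffled, max_dist, max_days) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)


# Hypothetical cases: two small groups, each tight in space and in time,
# plus two isolated cases. Times are days since the start of the study.
locations = [(0, 0), (0.5, 0), (0, 0.5),
             (10, 10), (10.5, 10), (10, 10.5),
             (5, 5), (20, 0)]
times = [1, 2, 3, 50, 51, 52, 100, 200]

x, p = knox_p_value(locations, times, max_dist=1.0, max_days=5)
print(x, p)
```

The result depends on the chosen cutoffs, so they are best fixed before looking at the data.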

Focal tests for clustering Cross sectional or cohort approach: Is there a higher rate of disease in populations living in contaminated areas compared to populations in uncontaminated areas? (Relative risk) Case/control approach: Are there more cases than controls living in a contaminated area? (Odds ratio)
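
The two measures named above can be computed directly from counts. The numbers below are made up for illustration, and the function names are hypothetical.

```python
def relative_risk(exposed_cases, exposed_pop, unexposed_cases, unexposed_pop):
    """Cross-sectional or cohort approach: ratio of disease rates."""
    return (exposed_cases / exposed_pop) / (unexposed_cases / unexposed_pop)


def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls):
    """Case-control approach: ratio of the odds of living in the
    contaminated area among cases to the same odds among controls."""
    return ((exposed_cases * unexposed_controls)
            / (unexposed_cases * exposed_controls))


# Hypothetical cohort: 30 cases among 10,000 residents of the contaminated
# area versus 60 cases among 40,000 residents elsewhere.
rr = relative_risk(30, 10_000, 60, 40_000)

# Hypothetical case-control study: 20 of 100 cases and 10 of 100 controls
# live in the contaminated area.
or_ = odds_ratio(20, 80, 10, 90)

print(rr, or_)
```

Values above 1 point to more disease near the focus; a confidence interval would be needed before drawing conclusions.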

Focal Case-Control Design. [Figure: case and control locations, with distance rings at 250 m and 500 m.]

Regression Analysis. Control for known risk factors before analyzing for spatial clustering. Analyze for unexplained clusters. Follow up in areas with large regression residuals with traditional case-control or cohort studies. Obtain additional risk factor data to account for the large residuals.

At what geographic or population scale do clusters appear? Multiresolution mapping.

A cluster of cases in a neighborhood has a different epidemiological meaning than a cluster of cases across several adjacent counties. Results can change dramatically with the scale of analysis.

Interactive Selections by rate, population and p value

References
Talbot TO, Kulldorff M, Forand SP, Haley VB. Evaluation of Spatial Filters to Create Smoothed Maps of Health Data. Statistics in Medicine. 2000, 19:
Forand SP, Talbot TO, Druschel C, Cross PK. Data Quality and the Spatial Analysis of Disease Rates: Congenital Malformations in New York. Health and Place. 2002, 8:
Haley VB, Talbot TO. Geographic Analysis of Blood Lead Levels in New York State Children Born Environmental Health Perspectives. 2004, 112(15):
Kulldorff M, National Cancer Institute. SaTScan User Guide

Health data can be shown at different geographic scales Residential address Census blocks, and tracts Towns Counties State

Concerns about release of small area data Risk of disclosure of confidential information. Rates of disease are unreliable due to small numbers.

Rate maps with small numbers provide very little information.

Disclosure of confidential information Census Blocks

Smoothed or Aggregated Count & Rate Maps Protect Confidentiality so data can be shared. Reduce random fluctuations in rates due to small numbers.

Smoothed Rate Maps. Borrow data from neighboring areas to provide more stable rates of disease. –Shareware tools available –Empirical or hierarchical Bayesian approaches –Adaptive spatial filters –Headbanging –etc.
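
As one illustration of borrowing data from neighbors, the sketch below implements a simple adaptive spatial filter: each area pools its nearest areas until the pooled population reaches a chosen minimum. It is written for this transcript under those assumptions and is not the filter evaluated in Talbot et al. (2000); the data and the name `adaptive_filter_rates` are hypothetical.

```python
import math


def adaptive_filter_rates(coords, cases, pop, min_pop):
    """Smoothed rate per area: pool the nearest areas (the area itself
    first) until the pooled population is at least min_pop."""
    rates = []
    for cx, cy in coords:
        order = sorted(range(len(coords)),
                       key=lambda j: math.hypot(coords[j][0] - cx,
                                                coords[j][1] - cy))
        c = p = 0
        for j in order:
            c += cases[j]
            p += pop[j]
            if p >= min_pop:
                break
        rates.append(c / p)  # assumes the total population is positive
    return rates


# Hypothetical areas along a line; two have very small populations.
coords = [(0, 0), (1, 0), (2, 0), (3, 0)]
pop = [50, 500, 40, 600]
cases = [2, 5, 0, 6]

raw = [c / p for c, p in zip(cases, pop)]
smoothed = adaptive_filter_rates(coords, cases, pop, min_pop=500)
print(raw)
print(smoothed)
```

The small areas lose their extreme raw rates (4% and 0%) because their estimates now rest on at least 500 people, while the large areas are left unchanged.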

from Talbot et al., Statistics in Medicine, 2000

Problems with smoothing Does not provide counts & rates for defined geographic areas. Not clear how to link risk factor data with smoothed health data. Methods are sometimes difficult to understand - “black boxes” Does not meet requirements of some recent New York policies & legislation.

Environmental Facilities & Cancer Incidence Map Law, 2008 § Plot cancer cases by census block, except in cases where such plotting could make it possible to identify any cancer patient. Census blocks shall be aggregated to protect confidentiality.

Environmental Justice & Permitting NYSDEC Commissioner Policy 29 Incorporate existing human health data into the environmental review process. Data will be made available at a fine geographic scale (ZIP code or ZIP Code Groups).

Aggregated Count or Rate Maps Merge small areas with neighboring areas to provide more stable rates of disease and/or protect confidentiality. –Aggregation can be done manually. –Existing automated tools were difficult to use.

Original ZIP Codes 3 Years Low Birth Weight Incidence Ratios

Aggregated to 250 Births per ZIP Code Group

Our Tool Requirements. Goal: aggregate small areas into larger ones. User decides how much aggregation is needed. Works with various levels of geography: – census blocks, tracts, towns, ZIP codes etc. – can nest one level of geography in another. Uses software which is readily available in NYSDOH (SAS). Can output results for use in mapping programs.

Aggregation Tool. [Diagram: original block data (cases by block) → SAS tool → regions (cases by region and block). Example region totals: A 21 cases, B 23 cases, C 14 cases.] † Simulated data

Aggregation Process. Populated blocks with the fewest cases are merged first. If there is a tie, the program starts with the block with the fewest neighbors. The selected block is then merged with the closest neighbor in the same census block group. After merging the first block, the list of neighbors is updated. The process repeats until all regions have a minimum number of cases. –The program can also merge to a user-specified population.
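
The steps above can be sketched as a greedy loop. This is a Python illustration written for this transcript, not the SAS tool itself. It assumes every block is populated, measures closeness between centroids, recomputes a merged centroid as a member-count-weighted average, and falls back to any neighbor when none shares the block group; the data structures and names are hypothetical.

```python
import math


def aggregate_blocks(blocks, neighbors, min_cases):
    """Greedy aggregation of small areas.

    blocks    -- {id: {"cases": int, "xy": (x, y), "group": block group}}
    neighbors -- {id: set of adjacent ids}
    Returns {region id: {"cases", "xy", "group", "members"}}.
    """
    regions = {b: dict(v, members=[b]) for b, v in blocks.items()}
    adj = {b: set(n) for b, n in neighbors.items()}

    while True:
        small = [r for r in regions
                 if regions[r]["cases"] < min_cases and adj[r]]
        if not small:
            return regions
        # Fewest cases first; ties go to the region with the fewest neighbors.
        r = min(small, key=lambda k: (regions[k]["cases"], len(adj[k]), str(k)))

        def dist(k):
            (x1, y1), (x2, y2) = regions[r]["xy"], regions[k]["xy"]
            return math.hypot(x1 - x2, y1 - y2)

        # Closest neighbor in the same block group, else closest neighbor.
        same = [k for k in adj[r] if regions[k]["group"] == regions[r]["group"]]
        target = min(same or adj[r], key=lambda k: (dist(k), str(k)))

        # Merge r into target, then update the neighbor lists.
        t, s = regions[target], regions.pop(r)
        n_t, n_s = len(t["members"]), len(s["members"])
        t["xy"] = tuple((a * n_t + b * n_s) / (n_t + n_s)
                        for a, b in zip(t["xy"], s["xy"]))
        t["cases"] += s["cases"]
        t["members"] += s["members"]
        for k in adj.pop(r):
            adj[k].discard(r)
            if k != target:
                adj[k].add(target)
                adj[target].add(k)


# Simulated blocks in two block groups (g1, g2).
blocks = {
    "A": {"cases": 0, "xy": (0, 0), "group": "g1"},
    "B": {"cases": 2, "xy": (1, 0), "group": "g1"},
    "C": {"cases": 5, "xy": (2, 0), "group": "g1"},
    "D": {"cases": 1, "xy": (0, 1), "group": "g2"},
    "E": {"cases": 6, "xy": (1, 1), "group": "g2"},
}
neighbors = {"A": {"B", "D"}, "B": {"A", "C", "E"}, "C": {"B"},
             "D": {"A", "E"}, "E": {"B", "D"}}

result = aggregate_blocks(blocks, neighbors, min_cases=5)
for rid, reg in sorted(result.items()):
    print(rid, reg["cases"], sorted(reg["members"]))
```

With these inputs the five blocks collapse into two regions that follow the block-group boundaries, and the total case count is preserved.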

Special Situations. Tool tries to avoid merging blocks in different census areas: –Census block groups –Census tracts (homogeneous population characteristics) –Counties. Tool tries to avoid merging blocks across major water bodies, e.g. Finger Lakes, Hudson River, Atlantic Ocean.

Water

9 cases, population 98 († simulated data)

New York State Descriptive Statistics: Year 2000 populated census blocks. NY population: 18,976,457. NY number of cases: 470,000. Statistics calculated using populated regions only.
Number of regions by level of aggregation: original census blocks 225,167; 6 cases 39,748; 12 cases 21,525; 24 cases 11,381.
Other statistics tabulated: average population, median cases, median census blocks.

Performance Measures. Compactness. Homogeneity with respect to demographic factors (measured as index of dissimilarity). Similar population sizes. Number of aggregated areas. Aggregated zones are completely contained within larger areas. –e.g. block aggregation areas contained within tracts

Index of dissimilarity: the percentage of one group that would have to move to a different area in order to have an even distribution. D = (1/2) Σ_i |b_i/B − w_i/W|, where b_i = the minority population of the ith area, e.g. census tract; B = the total minority population of the large geographic entity for which the index of dissimilarity is being calculated; w_i = the non-minority population of the ith area; W = the total non-minority population of the large geographic entity for which the index of dissimilarity is being calculated. (Source: Wikipedia)
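
A short sketch of the calculation, using the standard formula D = (1/2) Σ |b_i/B − w_i/W| with the symbols defined above. The population counts are hypothetical.

```python
def index_of_dissimilarity(minority, non_minority):
    """minority[i] is b_i and non_minority[i] is w_i for area i.
    Returns D between 0 (even distribution) and 1 (complete separation)."""
    B = sum(minority)
    W = sum(non_minority)
    return 0.5 * sum(abs(b / B - w / W)
                     for b, w in zip(minority, non_minority))


# Every area has the same minority share: perfectly even.
even = index_of_dissimilarity([10, 20, 30], [100, 200, 300])

# The two groups never share an area: completely segregated.
segregated = index_of_dissimilarity([50, 0], [0, 80])

print(even, segregated)
```

Multiplying D by 100 gives the percentage described on the slide.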

The End