Early Detection of Disease Outbreaks with Applications in New York City Martin Kulldorff University of Connecticut Farzad Mostashari and James Miller.

Early Detection of Disease Outbreaks with Applications in New York City
Martin Kulldorff, University of Connecticut
Farzad Mostashari and James Miller, New York City Department of Health

Content
- Prospective Disease Surveillance
- The Spatial Scan Statistic
- Thyroid Cancer Incidence in New Mexico
- Dead Birds and West Nile Virus Surveillance in New York City
- Other Current Applications

Prospective Surveillance
For a pre-specified geographical area, there are existing statistical methods for the detection of a sudden disease outbreak, e.g. CUSUM methods.
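As a rough illustration of the CUSUM idea for a single fixed area, the sketch below accumulates the excess of standardized counts over a reference value and signals when the sum crosses a threshold. The function name, the Poisson assumption, and the values of `k` and `h` are illustrative assumptions, not taken from the talk.

```python
def cusum_alarms(counts, expected, k=0.5, h=4.0):
    """One-sided CUSUM on standardized counts for a single, fixed area.

    counts   : observed case counts per period
    expected : expected count per period under no outbreak (assumed Poisson)
    k, h     : reference value and alarm threshold (illustrative choices)
    Returns the indices of the periods at which the statistic exceeds h.
    """
    s = 0.0
    alarms = []
    for t, c in enumerate(counts):
        z = (c - expected) / expected ** 0.5  # Poisson: variance equals the mean
        s = max(0.0, s + z - k)               # accumulate only the excess above k
        if s > h:
            alarms.append(t)
            s = 0.0                           # restart after signalling
    return alarms


baseline = [4, 5, 3, 4, 6, 4]
outbreak = [9, 10, 11]
print(cusum_alarms(baseline + outbreak, expected=4.0))
```

The sketch monitors one area only, which is exactly the limitation the next slide raises.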

Three Important Issues
- An outbreak may start locally.
- Methods for a single area can be applied simultaneously to multiple geographical areas, but that leads to multiple testing.
- Disease outbreaks may not conform to the pre-specified geographical areas.

Level of Aggregation
- With too little geographical aggregation, the rates are statistically unstable.
- With aggregation, the boundaries chosen are arbitrary.

Step One: Purely Spatial Scan Statistic
- Pre-determined time period.
- Geographical areas with observed number of cases and population at risk.
- Evaluate overlapping circles of different sizes at different locations.

One-Dimensional Scan Statistic

The Spatial Scan Statistic
- Move a circular window across the map.
- Use a variable circle radius, from zero up to a maximum where 50 percent of the population is included.
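One way to picture the moving window is to centre circles on area centroids and grow each one until it would cover more than half of the population. The sketch below does that for aggregated data. The `(x, y, population)` tuple layout and the function name are assumptions made for illustration, and distance ties are not handled specially.

```python
import math


def candidate_circles(areas, max_pop_fraction=0.5):
    """Enumerate circular zones centred on each area centroid.

    areas: list of (x, y, population) tuples, one per area (hypothetical layout).
    Each zone is the set of area indices whose centroids fall inside a circle;
    the radius grows until the zone would exceed max_pop_fraction of the total.
    """
    total_pop = sum(p for _, _, p in areas)
    zones = set()
    for cx, cy, _ in areas:
        # Sort all areas by distance from this centre; growing the radius
        # adds them one at a time in this order.
        order = sorted(
            range(len(areas)),
            key=lambda i: math.hypot(areas[i][0] - cx, areas[i][1] - cy),
        )
        members, pop = [], 0
        for i in order:
            pop += areas[i][2]
            if pop > max_pop_fraction * total_pop:
                break
            members.append(i)
            zones.add(frozenset(members))
    return zones
```

Storing zones as sets of area indices removes duplicates, since circles with different centres and radii often contain the same areas.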

A small sample of the circles used

For each circle:
- Obtain the actual and expected number of cases inside and outside the circle.
- Calculate the likelihood function.
Compare circles:
- Pick the circle with the maximum likelihood. This is the most likely cluster, i.e., the cluster least likely to have occurred by chance.
Inference:
- Generate random replicas of the data set under the null hypothesis of no clusters (Monte Carlo sampling).
- Compare the most likely clusters in the real and random data sets (likelihood ratio test).
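The three steps above can be sketched in a few lines, assuming a Poisson model with expected counts proportional to population and looking for high-rate clusters only. All function names and the data layout (parallel lists of cases and populations, zones as sets of area indices) are assumptions for illustration; this is not the SaTScan implementation.

```python
import math
import random


def poisson_llr(c, e, total):
    """Log likelihood ratio for a zone with c observed and e expected cases,
    out of `total` cases overall (Poisson model, high-rate clusters only)."""
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    outside = 0.0 if c == total else (total - c) * math.log((total - c) / (total - e))
    return inside + outside


def scan(cases, pops, zones):
    """Return (maximum LLR, most likely zone) over all candidate zones."""
    total, total_pop = sum(cases), sum(pops)
    best = (0.0, None)
    for zone in zones:
        c = sum(cases[i] for i in zone)
        e = total * sum(pops[i] for i in zone) / total_pop
        llr = poisson_llr(c, e, total)
        if llr > best[0]:
            best = (llr, zone)
    return best


def monte_carlo_p(cases, pops, zones, replicas=999, seed=0):
    """Rank the observed maximum LLR among the maxima from random data sets."""
    rng = random.Random(seed)
    observed, zone = scan(cases, pops, zones)
    total = sum(cases)
    n_ge = 0
    for _ in range(replicas):
        # Null hypothesis: each case falls in an area with probability
        # proportional to that area's population.
        sim = [0] * len(pops)
        for i in rng.choices(range(len(pops)), weights=pops, k=total):
            sim[i] += 1
        if scan(sim, pops, zones)[0] >= observed:
            n_ge += 1
    return zone, observed, (n_ge + 1) / (replicas + 1)
```

Because each random data set is scanned with the same collection of circles and only its maximum is kept, the resulting p-value already accounts for the many circle locations and sizes examined.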

Spatial Scan Statistic: Properties
- Adjusts for inhomogeneous population density.
- Simultaneously tests for clusters of any size and any location, by using circular windows with continuously variable radius.
- Accounts for multiple testing.
- Possibility to include confounding variables, such as age, sex or socio-economic variables.
- Aggregated or non-aggregated data (states, counties, census tracts, block groups, households, individuals).

Example: Thyroid Cancer Incidence in New Mexico
- Data Source: New Mexico Tumor Registry
- Gender: Male
- Aggregation Level: 32 Counties
- Adjustments for: Age and Temporal Trends

Thyroid Cancer
- Median age at diagnosis: 44 years
- United States (SEER) incidence: 4.5 / 100,000
- United States mortality: 0.3 / 100,000
- Five-year survival: 95%
Known risk factors:
- Radiation treatment for head and neck conditions.
- Radioactive fallout (Hiroshima/Nagasaki, Chernobyl, Marshall Islands).
- Work as radiologic technician (USA) or x-ray operator (Sweden).

Purely Spatial Scan Statistic: 1978 analysis
[Table; columns: Data Years, Cases, Expected, RR, p, Most Likely Cluster. Most likely cluster: Bernalillo, Valencia. Numeric values not recoverable.]

Time-Periodic Surveillance
- New cases are added on a yearly basis.
- Reanalysis when new data arrive.
- Data Available:
- Surveillance Starts: 1978
- Total Cases: 333

Purely Spatial Scan Statistic
[Table; columns: Data Years, Cases, Expected, RR, p, Most Likely Cluster. Numeric values not recoverable. Most likely clusters, in the order listed:]
- Bernalillo, Valencia
- Bernalillo, Valencia
- Bernalillo, Valencia
- Bernalillo, Valencia
- Bernalillo, Valencia
- Bernalillo, Valencia
- North Central Counties
- North Central - San Miguel
- North Central + Colfax, Harding
- North Central + Colfax, Harding
- North Central - San Miguel
- North Central + Colfax, Harding
- North Central + Torrance
- North Central + Colfax, Harding
- North Central + Torrance
North Central Counties = Bernalillo, Los Alamos, Mora, Rio Arriba, Sandoval, San Miguel, Santa Fe and Taos.

North Central Counties

Two Problems
- While we are adjusting for the multiple testing stemming from many possible cluster locations and cluster sizes, we are not adjusting for the multiple testing due to repeated analyses every year.
- Low power to quickly detect emerging clusters.

Solution: Space-Time Scan Statistic

Detecting Emerging Clusters
- Instead of a circular window in two dimensions, we use a cylindrical window in three dimensions.
- The base of the cylinder represents space, while the height represents time.
- The cylinder is flexible in its circular base and starting date, but we only consider those cylinders that reach all the way to the end of the study period. Hence, we are only considering ‘alive’ clusters.
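The restriction to 'alive' clusters means a cylinder is fully described by its circular base and its start period, because the end is always the latest period. A minimal sketch, with hypothetical function names and an assumed `cases_by_period[t][i]` layout (period `t`, area `i`):

```python
def alive_cylinders(zones, n_periods):
    """Yield (zone, start) pairs for cylinders that end at the last period.

    zones     : iterable of spatial zones (the circular bases)
    n_periods : number of time periods observed so far, indexed 0..n_periods-1
    A cylinder covers periods start..n_periods-1, so every cylinder is 'alive'.
    """
    for zone in zones:
        for start in range(n_periods):
            yield zone, start


def cylinder_cases(cases_by_period, zone, start):
    """Count cases inside a cylinder; cases_by_period[t][i] is area i in period t."""
    return sum(
        cases_by_period[t][i]
        for t in range(start, len(cases_by_period))
        for i in zone
    )
```

Each cylinder can then be scored with the same likelihood ratio as a circle, using the cases and expected cases that fall inside it.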

Hypothesis Test
- Find the likelihood for each choice of cylinder.
- Through maximum likelihood estimation, find the most likely cluster.
- Apply the likelihood ratio test.
- Evaluate significance through Monte Carlo simulation.
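For reference, the steps above can be written out under the Poisson model that is commonly used with the scan statistic. The slides do not state the formula, so the notation here is an assumption: c_Z and E_Z are the observed and expected cases in cylinder Z, C is the total number of cases, and R is the number of Monte Carlo replicas.

```latex
% Likelihood ratio for cylinder Z (high-rate clusters only)
\[
\Lambda_Z =
\begin{cases}
\left(\dfrac{c_Z}{E_Z}\right)^{c_Z}
\left(\dfrac{C - c_Z}{C - E_Z}\right)^{C - c_Z} & \text{if } c_Z > E_Z,\\[2ex]
1 & \text{otherwise.}
\end{cases}
\]
% Test statistic: the maximum over all cylinders considered
\[
T = \max_Z \Lambda_Z
\]
% Monte Carlo p-value from R replicas generated under the null hypothesis
\[
p = \frac{1 + \#\{\, r : T^{(r)} \ge T_{\mathrm{obs}} \,\}}{1 + R}
\]
```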

Space-Time Scan Statistic: Alive Clusters
[Table; columns: Years, Most Likely Cluster, Cases, Cluster Period, Expected, RR, p (two columns). Numeric values not recoverable. Most likely clusters, in the order extracted:]
- Los Alamos, Rio Arriba
- Bernalillo + 7 counties West
- Los Alamos, Rio Arriba
- North Central - San Miguel
- North Central - San Miguel
- Bernalillo, Valencia
- North Central
- Lincoln
- North Central + Colfax, Harding
- North Central + Colfax, Harding
- North Central - San Miguel
- North Central + Colfax, Harding
- Los Alamos, Rio Arriba, Santa Fe, Taos
- Los Alamos
North Central Counties = Bernalillo, Los Alamos, Mora, Rio Arriba, Sandoval, San Miguel, Santa Fe and Taos.

Los Alamos

Space-Time Scan Statistic: Alive Clusters
[Table; columns: Years, Most Likely Cluster, Cases, Cluster Period, Expected, RR, p (two columns). Numeric values not recoverable. Most likely clusters, in the order extracted:]
- Los Alamos, Rio Arriba
- Bernalillo + 7 counties West
- Los Alamos, Rio Arriba
- North Central - San Miguel
- North Central - San Miguel
- Bernalillo, Valencia
- North Central
- Lincoln
- North Central + Colfax, Harding
- North Central + Colfax, Harding
- North Central - San Miguel
- North Central + Colfax, Harding
- Los Alamos, Rio Arriba, Santa Fe, Taos
- Los Alamos
- Los Alamos
North Central Counties = Bernalillo, Los Alamos, Mora, Rio Arriba, Sandoval, San Miguel, Santa Fe and Taos.

Problem
We have still not adjusted for repeated time-period analyses conducted every year.

Adjusting for Yearly Surveillance
- While interest is only in ‘alive’ clusters, the p-value will be calculated based on the probability of obtaining a likelihood higher than the observed one for any cylinder used during the present or past analyses.
- This is done using a space-time scan statistic evaluating all cylinders irrespective of start and end year.
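The adjustment only changes the reference distribution: the observed statistic still comes from alive cylinders, but it is ranked against the maximum over all cylinders in each simulated data set. The sketch below assumes those maxima have already been computed. The function names and the input layout are illustrative assumptions.

```python
def rank_p_value(observed, simulated_maxima):
    """Rank-based Monte Carlo p-value: (1 + #{sim >= observed}) / (1 + #sims)."""
    n_ge = sum(1 for s in simulated_maxima if s >= observed)
    return (n_ge + 1) / (len(simulated_maxima) + 1)


def surveillance_p_values(observed_alive_max, sims):
    """Return (unadjusted, adjusted) p-values for the current analysis.

    sims: one (max over alive cylinders, max over all cylinders) pair per
    simulated data set. The all-cylinder maximum is never smaller than the
    alive-only maximum, so the adjusted p-value is never smaller either.
    """
    unadjusted = rank_p_value(observed_alive_max, [a for a, _ in sims])
    adjusted = rank_p_value(observed_alive_max, [m for _, m in sims])
    return unadjusted, adjusted
```

This is why an adjusted p-value is always at least as large as the corresponding unadjusted one.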

Adjusting for Yearly Surveillance: The Los Alamos Cluster
- 1991 Analysis: p=0.13 (unadjusted p=0.02)
- 1992 Analysis: p=0.016 (unadjusted p=0.002)

Los Alamos cases

Thyroid Cancer in Los Alamos
- The New Mexico Department of Health has investigated the individual nature of all 17 male thyroid cancer cases reported in Los Alamos.
- All were confirmed cases.

Thyroid Cancer in Los Alamos
- 3/17 had a history of therapeutic ionizing radiation treatment to the head and neck.
- 8/17 had been regularly monitored for exposure to ionizing radiation due to their particular work at the Los Alamos National Laboratory.
- 2/17 had had significant workplace-related exposure to ionizing radiation from atmospheric weapons testing fieldwork.
A known risk factor, ionizing radiation, is hence a likely explanation for the observed cluster.

West Nile Virus Surveillance in New York City
- 2000 Data: Simulation/Testing of Prospective Surveillance System
- 2001 Data: Real Time Implementation of Daily Prospective Surveillance

2000 Data
- Dead birds reported by the public
- Simulation of a daily prospective surveillance system
- Start date: June 1, 2000

Major epicenter on Staten Island
- Dead bird surveillance system: June 14
- Positive bird report: July 16 (collected July 5)
- Positive mosquito trap: July 24 (collected July 7)
- Human case report: July 28 (onset July 20)

2001 Data
- Real-time prospective surveillance
- Daily analyses starting June 22

Syndromic Surveillance
- Symptoms of disease such as diarrhea, respiratory problems, headache, etc.
- Earlier reporting than diagnosed disease
- Less specific, more noise

Hospital Emergency Admissions in New York City
- Hospital emergency admissions data come from a majority of New York City hospitals.
- At midnight, hospitals report the last 24 hours of data to the New York City Department of Health.
- A spatial scan statistic analysis is performed every morning.
- If there is an alarm, a local investigation is conducted.

Other Syndromic Surveillance Data Sources
- 911 Ambulance Dispatches
- Pharmacy Sales
- Employee Sick Leave
- Physician Visits
- Veterinary Clinic Visits

Conclusions
- The space-time scan statistic can serve as an important tool in prospective systematic time-periodic geographical surveillance for the early detection of disease outbreaks.
- It is possible to detect emerging clusters, and we can adjust for the multiple tests performed over the years.
- The method can be used for different diseases.

References
- Early Detection: Kulldorff M. Prospective time-periodic geographical disease surveillance using a scan statistic. J Royal Statistical Society, A164:61-72,
- West Nile: Mostashari F, Kulldorff M, Miller J. Dead bird clustering: An early warning system for West Nile virus activity. Under review.
- Software: Kulldorff M, Rand K, Gherman G, Williams G, DeFrancesco D. SaTScan v.2.1: Software for the spatial and space-time scan statistics. Bethesda, MD: National Cancer Institute,