Early Statistical Detection of Bio-Terrorism Attacks by Tracking OTC Medication Sales Galit Shmueli Dept. of Statistics and CALD Carnegie Mellon University.

Presentation transcript:

Early Statistical Detection of Bio-Terrorism Attacks by Tracking OTC Medication Sales Galit Shmueli Dept. of Statistics and CALD Carnegie Mellon University With Stephen Fienberg (Statistics) Anna Goldenberg & Rich Caruana (CS)

Overview
Current bio-surveillance systems
– Monitoring traditional data
– Using simple SPC methods
Early detection
– Use of non-traditional data
– Building a flexible, automated detection system
– Evaluating the system
Results and enhancements

Traditional Data Sources
Public health sources
– School absence records
– Sentinel practices
– Laboratory data
Medical sources
– Patient visits at urgent care, outpatient clinics, emergency rooms
Speed of detection: weeks after the actual occurrence
– Rate of data arrival

Why is detection slow?
Data arrives late
– Projects using electronic reporting systems: influenza surveillance system (U of Utah); tracking ICD9 codes (U of Pittsburgh)
– Future: increasing availability of electronic means for gathering surveillance data
Data available on weekly or monthly scale
Data are nation-wide
Signature of outbreak in data is late!

Non-Traditional Data
Data that indirectly measure symptoms
– Over-the-counter medication and grocery sales
– Web browsing at medical websites
– Automatic body tracking devices
Different levels of availability
Regional, localized data
Confidentiality issues

Manifestation of Flu in Traditional and Non-Traditional Data
[Figure: timeline in weeks of flu manifestation across data series labeled Lab, Flu, WebMD, School, Cough & Cold, Throat, Resp, Viral, Death]

OTC Medication and Grocery Sales
Benefits
– Manifestation of outbreak is very early
– Timeliness in collection and reporting (daily)
– Extremely detailed (basket-level)
Drawbacks
– No info about epidemic manifestation in sales data
– Requires knowledge about marketing efforts (sales, discounts)
– Hard to detect if the outbreak replicates sales patterns (holidays are a big challenge)
– Hard to model!

Prior Uses of Non-Traditional Data
– Diarrheal disease surveillance: data from 38 drug stores in NY (Mikol et al., 2000)
– Monitoring near-real-time satellite vegetation and climate data for predicting emerging Rift Valley Fever epidemics in East Africa (DoD and NASA, 2001)

Description of Our Data
Daily sales of several OTC medication groups for 541 days between Aug 8, '99 and Jan 31, '01
Concentrated on cough & cold medication (inhalational symptoms):
– Cough medication
– Tabs & Caps
– Nasal medication

Hypothetical Scenario of an Inhalational Anthrax Attack
Symptoms: almost all typical of flu!
– fever
– fatigue
– cough
– mild chest discomfort
– but no runny nose (!)
Death may occur within hours

Sales of Four Sub-Categories

Overview
– Current bio-surveillance systems
– Non-traditional data
– The detection system
– An evaluation method
– Results and Conclusions
– Future work

The Detection System
Take into account special features of OTC and grocery sales data
– Time series
– Seasonality
– Weekday/Weekend effect
– Stores closed on certain days
– Influence of total sales patterns
– Very noisy, non-stationary
Create automated system

Layers of the Detection System
1) Preprocessing
2) De-noising
3) Forecasting next day sales
4) Creating a threshold
5) New day sales: real-time sales > threshold?
– YES: WARNING! Possible beginning of an epidemic/attack
– NO: no warning

Pre-Processing

De-Noising
Target: obtain main features of data, reduce noise to improve predictability
Selected method: Discrete Cosine Transform with horizontal filtering
How much to de-noise?
– Retain the minimal coefficient set that maximizes accuracy and optimizes predictability
– Use cross-validation and MSE-based criteria
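The slides do not spell out the filter itself, so the following is only a minimal sketch of DCT-based de-noising. It assumes that filtering means keeping the lowest-frequency coefficients and zeroing the rest, which may differ from the "horizontal filtering" used in the talk. The names `dct`, `idct`, `denoise`, `mse` and `choose_keep` are hypothetical, and `choose_keep` uses a plain reconstruction-MSE tolerance as a stand-in for the cross-validated criterion mentioned above.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a sequence (O(n^2), fine for short daily series)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for t in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (t + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def denoise(x, keep):
    """Assumed filter: keep the `keep` lowest-frequency coefficients, zero the rest."""
    c = dct(x)
    return idct(c[:keep] + [0.0] * (len(c) - keep))

def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((u - v) ** 2 for u, v in zip(a, b)) / len(a)

def choose_keep(x, max_mse):
    """Smallest coefficient count whose reconstruction error is within max_mse.
    A simplification: the talk mentions cross-validation, which is not done here."""
    for keep in range(1, len(x) + 1):
        if mse(x, denoise(x, keep)) <= max_mse:
            return keep
    return len(x)
```

Calling `denoise(sales, choose_keep(sales, tol))` on a list of daily sales would return a smoothed series of the same length; the tolerance `tol` is a tuning knob that the slides do not specify.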

De-Noising: DCT with Horizontal Filtering
[Figure: two de-noised versions of the sales series, labeled de-noised set 1 and de-noised set 2]

Forecasting
Target: predict next day sales
Use pre-processed, de-noised data
Problem: the series is non-stationary (ARIMA doesn't work)
Method:
1) decompose with wavelets
2) predict each wavelet resolution
3) sum to obtain overall prediction
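The three steps above can be sketched as follows. The slides name neither the wavelet family nor the per-resolution predictor, so everything here is an assumption: a causal, additive Haar ("à trous") decomposition, a no-intercept AR(1) forecast for each detail level, and linear extrapolation for the smooth trend. The function names are hypothetical.

```python
def haar_a_trous(x, levels):
    """Additive causal Haar decomposition.
    For every t: x[t] == smooth[t] + sum(d[t] for d in details)."""
    smooth = list(x)
    details = []
    for j in range(levels):
        lag = 2 ** j
        # Average each point with the one `lag` steps back (clamped at the start).
        coarser = [0.5 * (smooth[t] + smooth[max(t - lag, 0)])
                   for t in range(len(smooth))]
        details.append([s - c for s, c in zip(smooth, coarser)])
        smooth = coarser
    return smooth, details

def ar1_forecast(series):
    """One-step AR(1) forecast without intercept; phi fitted by least squares."""
    num = sum(a * b for a, b in zip(series[1:], series[:-1]))
    den = sum(b * b for b in series[:-1])
    phi = num / den if den > 0 else 0.0
    return phi * series[-1]

def forecast_next(x, levels=3):
    """Step 1: decompose. Step 2: predict each resolution. Step 3: sum."""
    smooth, details = haar_a_trous(x, levels)
    trend = smooth[-1] + (smooth[-1] - smooth[-2])  # assumed: linear extrapolation
    return trend + sum(ar1_forecast(d) for d in details)
```

Because the decomposition is additive, summing the per-level forecasts gives a forecast on the original sales scale without any inverse transform.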

Prediction Using Wavelets

Threshold Selection: SPC
Based on the empirical distribution of residuals (actual values minus predictions), we fit a "3σ" limit
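A minimal sketch of such a limit, assuming a one-sided upper threshold set at the forecast plus the residual mean plus three sample standard deviations. The slides say only that a "3σ" limit is fitted to the residuals, so the exact form, and the names `sales_threshold` and `is_alarm`, are illustrative.

```python
import statistics

def sales_threshold(actual, predicted, next_forecast, k=3.0):
    """Upper limit for the next day's sales from past forecast residuals."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    return next_forecast + statistics.mean(residuals) + k * statistics.stdev(residuals)

def is_alarm(new_sales, threshold):
    """Flag a day whose real-time sales exceed the threshold."""
    return new_sales > threshold
```

With `k=3.0` this mirrors a classic Shewhart-style control limit; if the residuals are far from normal, an empirical quantile could be swapped in instead.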

Comparing Next-Day Sales to the Threshold

Overview
– Current bio-surveillance systems
– Non-traditional data
– The detection system
– An evaluation method
– Results and Conclusions
– Ongoing work (basket-level data)
– Future work

Evaluating the System
How fast does it detect an anthrax footprint?
Problems:
– Data does not include outbreak signature
– We don't know what signature looks like in such data
Solution: simulated signature
[Figure: inhalational anthrax signature, a spike rising above the base level, plotted by day]

Constructing the Signature
Sverdlovsk outbreak, 1979
Based on data from Meselson et al., Science (1994)

Anthrax Signature in OTC Sales
Add signature at each data point sequentially, and look at rate of detection
Try different slopes, heights
Compare different configurations of system for different signatures
Example: with slope = 1/3, the system detects 100% of spikes within 3 days for height = 1.3 × (data range)
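The evaluation loop can be sketched as below, under stated assumptions. The signature is modeled as a linear ramp that rises by slope × height per day and is capped at height, which is only a guess at the shape shown in the talk. The detector is a fixed mean + 3σ limit on the clean series, standing in for the full de-noise/forecast/threshold pipeline. `ramp_signature` and `detection_rate` are hypothetical names.

```python
import statistics

def ramp_signature(height, slope, length):
    """Extra sales per day: rises by slope*height each day, capped at height."""
    return [min(height, height * slope * (d + 1)) for d in range(length)]

def detection_rate(sales, height, slope=1 / 3, max_delay=3, k=3.0):
    """Fraction of injection days whose signature is flagged within max_delay days."""
    # Stand-in detector: one fixed limit computed from the clean series.
    limit = statistics.mean(sales) + k * statistics.stdev(sales)
    signature = ramp_signature(height, slope, max_delay)
    starts = range(len(sales) - max_delay + 1)
    hits = 0
    for s in starts:
        # Inject the signature starting at day s and check the first max_delay days.
        spiked = [sales[s + d] + signature[d] for d in range(max_delay)]
        if any(v > limit for v in spiked):
            hits += 1
    return hits / len(starts)
```

Sweeping `height` and `slope` over a grid and tabulating `detection_rate` would give the kind of comparison across signatures and system configurations described above.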

Results and Conclusions
The detection system
– works with grocery data
– detects simulated footprint quickly
– has low false alarm rate
The system is flexible (tools are interchangeable)
Almost fully automated, efficient computation
The "perfect bio-attack" would occur on a holiday

Future Work
– Combine with traditional medical and public health data sources
– Aggregated data: track several series simultaneously
– Basket data: utilize other features of grocery data such as spatial factor, customer information