Bayesian Disease Outbreak Detection that Includes a Model of Unknown Diseases
  Yanna Shen and Gregory F. Cooper
  Intelligent Systems Program and Department of Biomedical Informatics, University of Pittsburgh
Introduction
  Outbreak detection algorithms:
    – Specific detection algorithms: look for a pre-defined anomalous pattern in the data
    – Non-specific detection algorithms: try to detect any anomalous event, relative to some baseline of "normal" behavior
Safety-net detection approaches
  Our safety-net algorithm:
    – A hybrid method that combines the specific and non-specific detection approaches
    – Detects known causes of anomalies well, while the non-specific approach serves as a "safety net"
    – Takes a Bayesian approach
    – Operates on a time series of Emergency Department (ED) patient symptoms, such as cough, fever, and diarrhea
The population-wide disease model
  [Figure: Bayesian network. Population-level nodes: "outbreak disease in population" and "fraction". Person-level nodes for each of persons 1 to N: "person_i disease" and "person_i evidence".]
An example population-wide disease model
  [Figure: Bayesian network with population-level nodes "outbreak disease in population" and "fraction", and person-level nodes "person_i disease" and "person_i cough state" for persons 1 to N.]
  Person's disease state             P(cough state = true | person's disease state)
  Non-outbreak disease (d_0)         p_0 ~ Beta(α_0, β_0)
  Specific outbreak disease (d_k)    p_k ~ Beta(α_k, β_k)
  Unknown disease (d*)               p_* ~ Beta(1, 1)
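The Beta priors in the table above can be illustrated with a minimal Python sketch. It relies only on the standard Beta-Bernoulli conjugacy: observing cough outcomes under a Beta(α, β) prior gives a Beta(α + coughs, β + non-coughs) posterior. The hyperparameter values and the observed counts are hypothetical, chosen for illustration; the slides do not give them.

```python
def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def update_beta(alpha, beta, coughs, no_coughs):
    """Conjugate update of a Beta prior after observing cough outcomes."""
    return alpha + coughs, beta + no_coughs


# Hypothetical hyperparameters (not from the slides), except that the
# unknown disease d* uses the uniform Beta(1, 1) prior stated above.
priors = {
    "d_0 (non-outbreak)": (2.0, 18.0),
    "d_k (specific outbreak)": (16.0, 4.0),
    "d* (unknown)": (1.0, 1.0),
}

# Hypothetical observation: 3 patients with cough, 1 without.
observed_coughs, observed_no_coughs = 3, 1

for name, (a, b) in priors.items():
    post_a, post_b = update_beta(a, b, observed_coughs, observed_no_coughs)
    print(f"{name}: prior mean {beta_mean(a, b):.3f}, "
          f"posterior mean {beta_mean(post_a, post_b):.3f}")
```

Because Beta(1, 1) is uniform on [0, 1], the unknown disease makes no prior commitment about how often it causes cough, which is what lets it act as a safety net.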
Inference
  Derive the posterior probability P(pop_dx | data)
  Derive P(data | pop_dx)
    – Time complexity of direct computation is exponential in N_E (the number of people who come to the ED)
  We adapted the inference method given in (Cooper 1995), which performs inference in time polynomial in N_E
  [Figure: the population-wide disease model with person_i cough state as evidence, annotated with the labels "pop_dx" and "data".]
  P(cough | disease state d_u) = p_u, where p_u ~ Beta(α_u, β_u)
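One way to see how a polynomial-time computation is possible is sketched below in Python. This is an illustrative simplification, not the method of (Cooper 1995): it assumes a fixed outbreak fraction f, a single binary symptom (cough), and that each patient independently has the outbreak disease with probability f. Because patients are exchangeable, the sum over all 2^N_E disease assignments can be grouped by two counts (how many coughing and how many non-coughing patients have the outbreak disease), which leaves O(N_E^2) terms. All numeric values and the hypothesis priors are hypothetical.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_marginal(coughs, no_coughs, alpha, beta):
    """Probability of one specific outcome sequence with p ~ Beta integrated out."""
    return exp(log_beta(alpha + coughs, beta + no_coughs) - log_beta(alpha, beta))


def sequence_likelihood(n, c, f, prior_outbreak, prior_background):
    """P(data | pop_dx) for one ordered sequence of n patients, c of whom cough.

    a = coughing patients with the outbreak disease,
    b = non-coughing patients with the outbreak disease.
    """
    total = 0.0
    for a in range(c + 1):
        for b in range(n - c + 1):
            ways = comb(c, a) * comb(n - c, b)           # assignments with these counts
            weight = f ** (a + b) * (1 - f) ** (n - a - b)
            total += (ways * weight
                      * beta_marginal(a, b, *prior_outbreak)
                      * beta_marginal(c - a, n - c - b, *prior_background))
    return total


def posterior(likelihoods, priors):
    """Bayes' rule over a finite set of population-level hypotheses."""
    joint = {h: likelihoods[h] * priors[h] for h in likelihoods}
    z = sum(joint.values())
    return {h: v / z for h, v in joint.items()}


# Hypothetical setup: 20 ED patients, 14 with cough, outbreak fraction 0.3.
n, c, f = 20, 14, 0.3
background = (2.0, 18.0)                                  # assumed Beta prior for d_0
likelihoods = {
    "no outbreak": sequence_likelihood(n, c, 0.0, background, background),
    "known d_k": sequence_likelihood(n, c, f, (16.0, 4.0), background),
    "unknown d*": sequence_likelihood(n, c, f, (1.0, 1.0), background),
}
hypothesis_priors = {"no outbreak": 0.98, "known d_k": 0.01, "unknown d*": 0.01}
print(posterior(likelihoods, hypothesis_priors))
```

The double loop visits (c + 1)(n − c + 1) count pairs, so the cost grows quadratically in the number of patients rather than exponentially.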
Creating the datasets
  Create a background time series:
    – Simulate the number of people who came to the ED on a given day without any disease outbreak
    – Simulate the cough status for each of these people
  Create the outbreak cases by using FLOO (Neill 2005)
  Overlay the outbreak cases onto the simulated background cases
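The dataset construction described above can be sketched roughly as follows. This Python sketch makes several assumptions that the slides do not state: daily ED visit counts are drawn from a Poisson distribution, cough status is an independent Bernoulli draw per patient, and the injected outbreak is a simple linear ramp used as a stand-in for FLOO rather than an implementation of it. The function names and every parameter value are hypothetical.

```python
import random
from math import exp


def poisson(rng, lam):
    """Draw a Poisson(lam) count using Knuth's multiplication method."""
    threshold = exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_series(days, mean_visits, p_cough, outbreak_start,
                    ramp, p_cough_outbreak, seed=0):
    """Return a list of (visits, coughs) per day: background plus injected cases."""
    rng = random.Random(seed)
    series = []
    for day in range(days):
        # Background: number of ED visits and each visitor's cough status.
        visits = poisson(rng, mean_visits)
        coughs = sum(rng.random() < p_cough for _ in range(visits))
        # Outbreak overlay: linearly growing number of injected cases
        # (an assumed stand-in, not the FLOO simulator itself).
        if day >= outbreak_start:
            injected = ramp * (day - outbreak_start + 1)
            visits += injected
            coughs += sum(rng.random() < p_cough_outbreak for _ in range(injected))
        series.append((visits, coughs))
    return series


series = simulate_series(days=30, mean_visits=40.0, p_cough=0.1,
                         outbreak_start=20, ramp=2, p_cough_outbreak=0.8, seed=1)
for day, (visits, coughs) in enumerate(series):
    print(day, visits, coughs)
```

Keeping the background and the injected cases as separate steps mirrors the overlay design on the slide: the same background series can be reused with different outbreak scenarios.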
Experimental setup 1
  Let d_u and d_v be two CDC Category A diseases, with d_u ≠ d_v
  [Table: model and test data for experiments A1 and B1]
Result (A1 vs. B1)
  [Figure: plots showing the AMOC performance for experiments A1 and B1]
Experimental setup 2
  [Table: model and test data for experiments A2 and B2]
Result (A2 vs. B2)
  [Figure: plots showing the AMOC performance for experiments A2 and B2]
Summary
  Introduced a Bayesian method for detecting disease outbreaks that combines a specific detection method with a non-specific method
  Provided support that this hybrid approach helps detect unexpected diseases more than it interferes with detecting known diseases
Future work
  Explore distributions other than the uniform distribution for the probability of a disease symptom, such as cough, under the safety-net disease
  Extend the model to consider multiple pieces of evidence per person
Acknowledgements
  This research was funded by a grant from the National Science Foundation (NSF IIS )
  We thank the following colleagues from the Department of Biomedical Informatics, University of Pittsburgh, for their helpful comments on this work:
    – Wendy Chapman
    – John Dowling
    – John Levander
    – Melissa Saul
    – Garrick Wallstrom