Bayesian Biosurveillance Using Causal Networks Greg Cooper RODS Laboratory and the Laboratory for Causal Modeling and Discovery Center for Biomedical Informatics.

Presentation transcript:

Bayesian Biosurveillance Using Causal Networks Greg Cooper RODS Laboratory and the Laboratory for Causal Modeling and Discovery Center for Biomedical Informatics University of Pittsburgh

Outline Biosurveillance goals Biosurveillance as diagnosis of a population Introduction to causal networks Examples of using causal networks for biosurveillance Summary and challenges

Biosurveillance Detection Goals Detect an unanticipated biological disease outbreak in the population as rapidly and as accurately as possible Determine the people who already have the disease Predict the people who are likely to get the disease

Biosurveillance as Diagnosis of a Population

The Similarity of Patient Diagnosis and Population Diagnosis Patient diagnosis: patient risk factors → patient disease → patient symptom 1, patient symptom 2. Population diagnosis: population risk factors → population disease → symptoms of patient 1, symptoms of patient 2.

Simple Examples of Patient Diagnosis and Population Diagnosis Patient diagnosis: smoking → lung cancer → weight loss, fatigue. Population diagnosis: threats of bioterrorism → aerosolized release of anthrax → patient 1 has respiratory symptoms, patient 2 has respiratory symptoms.

Population Diagnosis with a More Detailed Patient Model aerosolized release of anthrax threats of bioterrorism patient 1 disease status patient 2 disease status respiratory symptoms wide mediastinum on X-ray respiratory symptoms wide mediastinum on X-ray ? ? ?

Population-Level “Symptoms” aerosolized release of anthrax threats of bioterrorism patient 1 disease status patient 2 disease status respiratory symptoms wide mediastinum on X-ray respiratory symptoms wide mediastinum on X-ray local sales of over-the-counter (OTC) cough medications

An Alternative Way of Modeling OTC Sales aerosolized release of anthrax threats of bioterrorism patient 1 disease status patient 2 disease status respiratory symptoms wide mediastinum on X-ray respiratory symptoms wide mediastinum on X-ray local sales of over-the-counter (OTC) cough medications

aerosolized release of anthrax threats of bioterrorism patient 1 disease status patient 2 disease status respiratory symptoms wide mediastinum on X-ray respiratory symptoms wide mediastinum on X-ray sales of over-the-counter (OTC) cough medications

An Introduction to Causal Networks A causal network has two components: –Structure: A diagram in which nodes represent variables and arcs between nodes represent causal influence. The diagram (graph) is not allowed to contain directed cycles, which conveys that an effect cannot cause itself. –Parameters: A probability distribution for each effect given its direct causes.
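As a minimal sketch of the structure component, assuming a plain-Python representation (the dictionary layout and the function name `has_directed_cycle` are illustrative and not from the slides), each node can be mapped to its direct causes, and a depth-first search can check the no-directed-cycles requirement:

```python
# Structure: each variable maps to the list of its direct causes (parents).
# Node names follow the anthrax example; the representation is hypothetical.
structure = {
    "ARA": [],        # aerosolized release of anthrax
    "PDS": ["ARA"],   # patient disease status
    "RS": ["PDS"],    # respiratory symptoms
}


def has_directed_cycle(parents):
    """Return True if the parent map contains a directed cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {node: WHITE for node in parents}

    def visit(node):
        color[node] = GRAY
        for p in parents[node]:
            if color[p] == GRAY:          # reached a node already on the path
                return True
            if color[p] == WHITE and visit(p):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in parents)


print(has_directed_cycle(structure))
```

The search follows arcs from effect to cause, which detects the same cycles as following them from cause to effect.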

An Example of a Causal Network Causal network structure: aerosolized release of anthrax (ARA) → patient disease status (PDS) → respiratory symptoms (RS) Causal network parameters *: P(ARA = true) = ; P(PDS = respiratory anthrax | ARA = true) = ; P(PDS = respiratory anthrax | ARA = false) = ; P(RS = present | PDS = respiratory anthrax) = 0.8; P(RS = present | PDS = other) = 0.1 * These parameters are for illustration only.

A Previous Example of a Causal Network aerosolized release of anthrax threats of bioterrorism patient 1 disease status patient 2 disease status respiratory symptoms wide mediastinum on X-ray respiratory symptoms wide mediastinum on X-ray sales of over-the-counter (OTC) cough medications

The Causal Markov Condition The Causal Markov Condition: Let D be the direct causes of a variable X in a causal network. Let Y be a variable that is not causally influenced by X (either directly or indirectly). Then X and Y are independent given D. Example: aerosolized release of anthrax (Y) → patient disease status (D) → respiratory symptoms (X)

A Key Intuition Behind the Causal Markov Condition An effect is independent of its distant causes, given its immediate causes. Example: aerosolized release of anthrax (Y) → patient disease status (D) → respiratory symptoms (X)

Joint Probability Distributions For a model with binary variables X and Y, the joint probability distribution is: { P(X = t, Y = t), P(X = t, Y = f), P(X = f, Y = t), P(X = f, Y = f)} We can use the joint probability distribution to derive any conditional probability of interest on the model variables. Example: P(X = t | Y = t)
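A small sketch of deriving P(X = t | Y = t) from a joint distribution. The four joint probabilities below are invented for illustration, and the function name is hypothetical:

```python
# Joint distribution over two binary variables X and Y, keyed by (x, y).
# The four numbers are made up for illustration; they must sum to 1.
joint = {
    (True, True): 0.20,
    (True, False): 0.10,
    (False, True): 0.30,
    (False, False): 0.40,
}


def conditional_x_given_y(joint, x, y):
    """P(X = x | Y = y) = P(X = x, Y = y) / P(Y = y)."""
    # Marginalize X out of the joint to get P(Y = y).
    p_y = sum(p for (_, y_val), p in joint.items() if y_val == y)
    return joint[(x, y)] / p_y


p_x_given_y = conditional_x_given_y(joint, True, True)  # 0.20 / (0.20 + 0.30)
print(p_x_given_y)
```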

A Causal Network Specifies a Joint Probability Distribution The causal Markov condition permits the joint probability distribution to be factored as follows: Example: P(RS, PDS, ARA) = P(RS | PDS) P(PDS | ARA) P(ARA) ARA PDS RS
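The factorization can be sketched in a few lines. Only the two P(RS | PDS) values (0.8 and 0.1) come from the slides; the prior on ARA and the P(PDS | ARA) values are not preserved in this transcript, so the numbers used for them here are assumptions chosen purely for illustration:

```python
from itertools import product

# ASSUMED for illustration: prior probability of an aerosolized release.
P_ARA = {True: 0.001, False: 0.999}
# ASSUMED for illustration: P(PDS = respiratory anthrax | ARA).
P_ANTHRAX_GIVEN_ARA = {True: 0.05, False: 0.0}
# From the slides: P(RS = present | PDS), keyed by "PDS is respiratory anthrax".
P_RS_GIVEN_ANTHRAX = {True: 0.8, False: 0.1}


def joint(rs, anthrax, ara):
    """P(RS, PDS, ARA) = P(RS | PDS) * P(PDS | ARA) * P(ARA)."""
    p_ara = P_ARA[ara]
    p_pds = P_ANTHRAX_GIVEN_ARA[ara] if anthrax else 1 - P_ANTHRAX_GIVEN_ARA[ara]
    p_rs = P_RS_GIVEN_ANTHRAX[anthrax] if rs else 1 - P_RS_GIVEN_ANTHRAX[anthrax]
    return p_rs * p_pds * p_ara


# The eight joint entries should sum to 1.
total = sum(joint(*values) for values in product([True, False], repeat=3))


def posterior_ara_given_rs_present():
    """P(ARA = true | RS = present), obtained by summing the joint over PDS."""
    numerator = sum(joint(True, anthrax, True) for anthrax in (True, False))
    denominator = sum(
        joint(True, anthrax, ara)
        for anthrax, ara in product([True, False], repeat=2)
    )
    return numerator / denominator


print(total, posterior_ara_given_rs_present())
```

Enumerating the full joint like this is only practical for very small networks; it is shown here to make the factorization concrete.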

Causal Network Inference Inference algorithms exist for deriving a conditional probability of interest from the joint probability distribution defined by a causal network. Example: P(ARA = + | TOB = +, Pt1_RS = +, Pt2_WM = +, OTC = ) Nodes: aerosolized release of anthrax (ARA); threats of bioterrorism (TOB); patient 1 (Pt1) disease status; patient 2 (Pt2) disease status; for each patient, respiratory symptoms (RS) and wide mediastinum on X-ray (WM); sales of over-the-counter (OTC) cough medications

Examples of Using Bayesian Inference on Causal Networks for Biosurveillance The following models are highly simplified and serve as simple examples that suggest a set of research issues. They are intended only to illustrate basic principles. These models were implemented using Hugin (version 6.1).

Basic Population Model

Prior Risk of Release of Agent X

Basic Patient Model

A Model with One Patient Case

A Model with One Abstracted Patient Case

Where do the probabilities come from? Databases of prior cases Case studies in the literature Animal studies Computer models (e.g., particle dispersion models) Expert assessments

A Model with One Abstracted Patient Case

An Example in Which a Single Patient Case Is Inadequate to Detect a Release Data: A patient who presents with respiratory symptoms today

How Might We Distinguish Anticipated Diseases (e.g., Influenza) from Unanticipated Diseases (e.g., Respiratory Anthrax)? Differences in their expected spatio-temporal patterns over the population may be very helpful.

A Model with Two Patient Cases

A Model with Three Patient Cases

A Model with Ten Patient Cases

A Hypothetical Population of Ten People (not all of whom are patients)
Person  Home Location  Day of ED Visit  ED Symptoms
1       area 1         yesterday        respiratory
2       area 1         yesterday        non-respiratory
3       area 2         yesterday        non-respiratory
4       area 2         no visit to ED   NA
5       area 1         no visit to ED   NA
6       area 1         today            respiratory
7       area 2         today            non-respiratory
8       area 1         today            respiratory
9       area 1         no visit to ED   NA
10      area 2         no visit to ED   NA

Posterior Probability of a Release of X Among the Population of Ten People Being Modeled

Adding Population-Based Data Data: Increased OTC sales of cough medications today

For Each Person in the Population a Probability of Current Infection with Disease X Can be Estimated
Person  Home Location  Day of ED Visit  ED Symptoms      Risk for Disease X
1       area 1         yesterday        respiratory      26%
2       area 1         yesterday        non-respiratory  9%
3       area 2         yesterday        non-respiratory  6%
4       area 2         no visit to ED   NA               < 1%
5       area 1         no visit to ED   NA               < 1%
6       area 1         today            respiratory      27%
7       area 2         today            non-respiratory  11%
8       area 1         today            respiratory      27%
9       area 1         no visit to ED   NA               < 1%
10      area 2         no visit to ED   NA               < 1%

Modeling the Frequency Distribution Over the Number of Infected People

The Frequency Distribution Over the Number of Infected People in the Example
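One rough way to approximate such a frequency distribution from per-person risks is sketched below. It treats infections as independent given the evidence, which is an assumption made here for simplicity: in the causal network the individual risks are correlated through the shared release node, so the slides' own distribution may differ. The risks are those from the ten-person risk table, with each "< 1%" entry replaced by 0.01 as an assumed stand-in:

```python
# Per-person risk for Disease X from the ten-person table.
# Each "< 1%" entry is replaced by 0.01, an ASSUMED stand-in value.
risks = [0.26, 0.09, 0.06, 0.01, 0.01, 0.27, 0.11, 0.27, 0.01, 0.01]


def infected_count_distribution(risks):
    """Return dist with dist[k] = P(exactly k people are infected).

    ASSUMES infections are independent given the evidence; this is a
    simplification, not the method used in the slides.
    """
    dist = [1.0]  # with zero people considered, P(0 infected) = 1
    for p in risks:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)   # this person is not infected
            nxt[k + 1] += mass * p     # this person is infected
        dist = nxt
    return dist


dist = infected_count_distribution(risks)
expected_infected = sum(k * p_k for k, p_k in enumerate(dist))
print(dist, expected_infected)
```

Under the independence assumption, the expected number infected equals the sum of the individual risks.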

A More Detailed Patient Model

Incorporating Heterogeneous Patient Models Data: Same as before, except patient 1 is now known to have a chest X-ray result that is consistent with Disease X

We Can Use the Derived Posterior Probabilities in a Computer-Based Ongoing Decision Analysis Decision: sound an alarm or keep silent. If we sound an alarm: with probability P(dx X | evidence) the utility is U(alarm, dx X); with probability P(no dx X | evidence) the utility is U(alarm, no dx X). If we keep silent: with probability P(dx X | evidence) the utility is U(silent, dx X); with probability P(no dx X | evidence) the utility is U(silent, no dx X). The probabilities, shown in blue on the slide, can be derived using a causal network.

Summary of Bayesian Biosurveillance Using Causal Networks Biosurveillance can be viewed as ongoing diagnosis of an entire population. Causal networks provide a flexible and expressive means of coherently modeling a population at different levels of detail. Inference on causal networks can derive the type of posterior probabilities needed for biosurveillance. These probabilities can be used in a decision analytic system that determines whether to raise an alarm (and that can recommend which additional data to collect).
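The decision analysis amounts to comparing the expected utility of the two actions. A minimal sketch follows; the utility values and the function name `best_action` are hypothetical, chosen only to show the calculation:

```python
def best_action(p_dx, utility):
    """Return (action, expected_utilities) for the higher expected utility.

    p_dx is P(dx X | evidence); utility is keyed by (action, state).
    """
    expected = {}
    for action in ("alarm", "silent"):
        expected[action] = (
            p_dx * utility[(action, "dx X")]
            + (1 - p_dx) * utility[(action, "no dx X")]
        )
    return max(expected, key=expected.get), expected


# HYPOTHETICAL utilities: a missed outbreak is far worse than a false alarm.
utility = {
    ("alarm", "dx X"): -10,
    ("alarm", "no dx X"): -1,
    ("silent", "dx X"): -1000,
    ("silent", "no dx X"): 0,
}

print(best_action(0.05, utility))
print(best_action(0.0001, utility))
```

With these made-up utilities the alarm becomes the better action once the posterior exceeds roughly 0.001, which shows how a low posterior can still justify an alert when a missed outbreak is costly.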

Challenges Include...

One Challenge: Modeling Contagious Diseases One approach: Include arcs among the disease-status nodes of individuals who were in close proximity to each other during the period of concern being modeled.

Another Challenge: Achieving Tractable Inference on Very Large Causal Networks Possible approaches include: –Aggregating individuals into equivalence classes to reduce the size of the causal network –Using sampling methods to reduce the time of inference (at the expense of deriving only approximate posterior probabilities)
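To illustrate the sampling approach, here is a sketch of likelihood weighting on the three-node anthrax chain, estimating P(ARA = true | RS = present). Likelihood weighting is one standard sampling method, picked here as an example; the slides do not say which method was intended. The prior and the P(PDS | ARA) values are assumptions, and only the P(RS | PDS) values come from the slides:

```python
import random

P_ARA = 0.01                                    # ASSUMED prior on a release
P_ANTHRAX_GIVEN_ARA = {True: 0.5, False: 0.0}   # ASSUMED P(PDS = anthrax | ARA)
P_RS_GIVEN_ANTHRAX = {True: 0.8, False: 0.1}    # from the slides


def estimate_p_ara_given_rs(n_samples, seed=0):
    """Likelihood-weighting estimate of P(ARA = true | RS = present)."""
    rng = random.Random(seed)
    weight_ara = 0.0
    weight_total = 0.0
    for _ in range(n_samples):
        # Sample the unobserved variables from their conditional distributions.
        ara = rng.random() < P_ARA
        anthrax = rng.random() < P_ANTHRAX_GIVEN_ARA[ara]
        # Weight the sample by the likelihood of the evidence RS = present.
        weight = P_RS_GIVEN_ANTHRAX[anthrax]
        weight_total += weight
        if ara:
            weight_ara += weight
    return weight_ara / weight_total


def exact_p_ara_given_rs():
    """Exact posterior for comparison, by summing out PDS."""
    def p_rs_given_ara(ara):
        q = P_ANTHRAX_GIVEN_ARA[ara]
        return q * P_RS_GIVEN_ANTHRAX[True] + (1 - q) * P_RS_GIVEN_ANTHRAX[False]

    numerator = P_ARA * p_rs_given_ara(True)
    return numerator / (numerator + (1 - P_ARA) * p_rs_given_ara(False))


print(estimate_p_ara_given_rs(200_000), exact_p_ara_given_rs())
```

The estimate is approximate and improves with more samples, which is the trade-off named above: less inference time at the cost of exactness.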

Some Additional Challenges Constructing realistic outbreak models Constructing realistic decision models about when to raise an alert Developing explanations of alerts Evaluating the detection system

Suggested Reading R.E. Neapolitan, Learning Bayesian Networks (Prentice Hall, 2003).

A Sample of Causal Network Commercial Software Hugin, Netica, Bayesware