Statistical inference for epidemics on networks PD O’Neill, T Kypraios (Mathematical Sciences, University of Nottingham) Sep 2011 ICMS, Edinburgh.

Presentation transcript:

Statistical inference for epidemics on networks PD O’Neill, T Kypraios (Mathematical Sciences, University of Nottingham) Sep 2011 ICMS, Edinburgh

Outline:
1. Orientation
2. Inference for epidemics
3. Network models
4. Inference for network models
5. Open problems


1. Orientation. The basic problem: given data on a network and an infectious disease, can model parameters be inferred?

The basic problem, data: data can be partial or complete for the network, are usually partial for the disease, can be multi-scale, and may or may not be longitudinal.

The basic problem, model: the model can be for the network, for the disease, or for both.


2. Inference for epidemics: inference for network and disease given partial temporal data. Consider an Erdős-Rényi random graph on N vertices, and let p = Prob(an edge is present between any two given vertices). Run an SIR model on the graph, with infection rate β and removal rate γ. (A simulation sketch follows below.)
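To make the setup concrete, here is a minimal simulation sketch, not from the slides: an Erdős-Rényi graph is sampled, then an SIR epidemic with rates β and γ is run over it via a Gillespie-style event loop. The function name, default parameter values and the event-list output are all illustrative assumptions.

```python
# A minimal sketch (assumptions noted above): SIR epidemic on an
# Erdős-Rényi graph, simulated with a Gillespie-style event loop.
import random

def simulate_sir_on_er(N=100, p=0.05, beta=0.8, gamma=1.0, seed=1):
    rng = random.Random(seed)
    # Sample the Erdős-Rényi graph: each of the N(N-1)/2 edges is present w.p. p.
    nbrs = {v: set() for v in range(N)}
    for i in range(N):
        for j in range(i + 1, N):
            if rng.random() < p:
                nbrs[i].add(j)
                nbrs[j].add(i)
    status = {v: "S" for v in range(N)}
    status[0] = "I"                          # index case
    infected, t = {0}, 0.0
    events = [(0.0, 0, "infection")]
    while infected:
        # Each S-I edge fires at rate beta; each infective is removed at rate gamma.
        si_edges = [(i, j) for i in infected for j in nbrs[i] if status[j] == "S"]
        rate = beta * len(si_edges) + gamma * len(infected)
        t += rng.expovariate(rate)
        if rng.random() < beta * len(si_edges) / rate:
            _, j = rng.choice(si_edges)      # an infection event
            status[j] = "I"
            infected.add(j)
            events.append((t, j, "infection"))
        else:
            i = rng.choice(sorted(infected)) # a removal event
            status[i] = "R"
            infected.discard(i)
            events.append((t, i, "removal"))
    return events

print(len(simulate_sir_on_er()))  # total number of recorded events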

Given complete observation of the removal process, we wish to infer p, β and γ, i.e. to find the posterior density π(p, β, γ | data).

Bayes' Theorem gives π(p, β, γ | data) ∝ π(data | p, β, γ) π(p, β, γ). However, the likelihood π(data | p, β, γ) is intractable in practice.

One solution is to augment the parameter space to include the unobserved infection events. This leads to a tractable likelihood, and the resulting posterior density can be explored using MCMC methods; one ingredient of such a sampler is sketched below.
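As one standard ingredient of such a sampler: with exponential infectious periods and a Gamma prior, the removal rate γ has a conjugate Gamma full conditional given the augmented (complete) data. A minimal sketch, where the hyperparameters a and b are hypothetical:

```python
# A minimal sketch, assuming exponential infectious periods and a
# Gamma(a, b) prior (shape a, rate b) on the removal rate gamma.
import random

def gibbs_update_gamma(infection_times, removal_times, a=1.0, b=1.0, rng=random):
    # The n removals and the total infectious time sum(r_k - i_k) give the
    # conjugate full conditional Gamma(a + n, b + sum(r_k - i_k)).
    n = len(removal_times)
    total_infectious = sum(r - i for i, r in zip(infection_times, removal_times))
    # gammavariate takes (shape, scale), so pass 1/rate as the scale.
    return rng.gammavariate(a + n, 1.0 / (b + total_infectious))
```

The remaining steps of a sweep, updates of p and β and Metropolis-Hastings moves of the unobserved infection times, use the same augmented-data likelihood.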

Britton & O'Neill (2002): basic idea. Neal & Roberts (2005): improved computational aspects. Ray & Marzouk (2008): extension to two populations. Groendyke, Welch & Hunter (2011a): SEIR model.

Groendyke, Welch & Hunter (2011b): a more general network model in which p_jk is a function of covariates of j, of k and of the pair (j,k), but edges are still independent.

A general comment: this estimation problem often leads to parameter identifiability issues. For example, are the data better explained by a highly connected network with a low-infectivity disease, or by a sparse network with a high-infectivity disease?

Inference for disease given final outcome data and network data. Data tell us which individuals become infected and who is connected to whom. Again the likelihood is intractable; augment the data with the network of infectious contacts (Demiris & O'Neill 2005; O'Neill 2009; van Boven et al. 2010).


3. Network models. Most real-life networks require more general models which can incorporate a wide range of features, e.g. transitivity, homophily, self-organization, etc.

Latent position cluster models (Handcock, Raftery & Tantrum, 2007). Basic idea: directed edges have covariates X(i,j); each vertex i has a position Z(i) in a multivariate social space; the edge probability is Prob(edge i,j) = f( X(i,j), |Z(i) - Z(j)| ); and the Z(i) are i.i.d. (e.g. drawn from a Gaussian mixture).

The key point is that edge probabilities are conditionally (upon the Z(i)'s) independent. Given data on observed edges, inference can be carried out using MCMC or even maximum likelihood. A sketch of the edge-probability computation follows.
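A minimal sketch of these edge probabilities, assuming the common logistic form logit Prob(i,j) = a + b·X(i,j) - |Z(i) - Z(j)| used in latent space models (cf. Hoff, Raftery & Handcock, 2002); the coefficients, the two-component Gaussian mixture and all numerical values are illustrative assumptions.

```python
# A sketch under the stated assumptions: logistic edge probability driven by
# an edge covariate and the distance between latent positions.
import math
import random

def edge_prob(x_ij, z_i, z_j, a=1.0, b=0.5):
    dist = math.dist(z_i, z_j)                 # |Z(i) - Z(j)|
    return 1.0 / (1.0 + math.exp(-(a + b * x_ij - dist)))

rng = random.Random(0)

def draw_position():
    # Latent positions i.i.d. from a two-component Gaussian mixture.
    mu = (0.0, 0.0) if rng.random() < 0.5 else (3.0, 3.0)
    return tuple(m + rng.gauss(0.0, 1.0) for m in mu)

Z = [draw_position() for _ in range(5)]
print(edge_prob(x_ij=1.0, z_i=Z[0], z_j=Z[1]))
```

Because the edges are conditionally independent given the Z(i), the likelihood of an observed adjacency matrix is simply a product of such terms.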

Exponential random graph models (ERGMs; Frank & Strauss, 1986). A very widely used class of models in the social network literature, which can incorporate many features of interest.

Let Y be a random N × N adjacency matrix: Y(i,j) = 1 if the edge from i to j is present, 0 if not. For Y = y and i = 1, …, m, s(i,y) denotes a summary statistic of y (e.g. the number of edges, triangles, 3-stars, …).

Then the ERGM is defined by π(y | θ) = exp( Σ_i θ(i) s(i,y) ) / z(θ), where θ = (θ(1), …, θ(m)) is a real m-vector and z(θ) = Σ_y exp( Σ_i θ(i) s(i,y) ).

Example: N = 3, s(1,y) = number of edges, s(2,y) = number of triangles; there are 8 possible graphs (4 up to isomorphism).

Here π(y | θ) is proportional to 1, e^θ(1), e^(2θ(1)) or e^(3θ(1)+θ(2)) according to whether y has 0, 1, 2 or 3 edges, and z(θ) = 1 + 3e^θ(1) + 3e^(2θ(1)) + e^(3θ(1)+θ(2)). (A brute-force check follows below.)
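A brute-force check of this normalising constant, enumerating all 2^3 = 8 graphs; this sketch is ours, and the variable names are illustrative.

```python
# Enumerate all undirected graphs on N = 3 vertices and compute z(theta)
# directly, checking it against the closed form on the slide.
import itertools
import math

EDGES = [(0, 1), (0, 2), (1, 2)]           # the three possible edges

def stats(present):
    n_edges = sum(present)
    n_triangles = 1 if all(present) else 0  # only the complete graph has one
    return n_edges, n_triangles

def z(theta):
    t1, t2 = theta
    total = 0.0
    for present in itertools.product([0, 1], repeat=len(EDGES)):
        s1, s2 = stats(present)
        total += math.exp(t1 * s1 + t2 * s2)
    return total

t1, t2 = 2.0, 1.0
closed_form = 1 + 3*math.exp(t1) + 3*math.exp(2*t1) + math.exp(3*t1 + t2)
print(z((t1, t2)), closed_form)             # the two values agree
```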

θ(i) > 0 promotes s(i,y) and θ(i) < 0 inhibits s(i,y); e.g. in the example, θ(1) > 0 promotes edges and θ(1) < 0 inhibits edges.

ERGMs often exhibit near-degeneracy, in the sense that a small number of graphs y are far more likely than all the others.

For example, with θ = (2,1): z(θ) = 1 + 3e^2 + 3e^4 + e^7 ≈ 1283.6, so π(y | θ) ≈ 0.0008 for the empty graph, ≈ 0.006 for each single-edge graph, ≈ 0.043 for each two-edge graph, and ≈ 0.854 for the complete graph; the complete graph dominates.

A key computational problem with ERGMs is that z(θ) = Σ_y exp( Σ_i θ(i) s(i,y) ) is intractable unless N is very small, since the sum ranges over every possible graph on N vertices.


4. Inference for network models: exponential random graph models. Options include maximum pseudolikelihood (not that good in general) and Monte Carlo ML estimation (various practical problems).

Standard MCMC cannot be used since the posterior density is "doubly intractable": π(θ | y) ∝ π(y | θ) π(θ) = f(y | θ) π(θ) / z(θ), i.e. the likelihood itself is only known up to proportionality (we know f(y | θ) but not z(θ)).

One option (Møller et al., 2006) is to augment the parameter space to include a new variable on the data space, call this x, and then work with the augmented posterior density π(x, θ | y).

π(x, θ | y) = π(x | θ, y) π(θ | y) = π(x | θ, y) f(y | θ) π(θ) / ( z(θ) π(y) ).

A Metropolis-Hastings algorithm requires a proposal to update (x, θ). If we can draw a random graph from the distribution of y given θ, then we may choose q(x*, θ* | x, θ) = q(x* | θ*) q(θ* | θ) = f(x* | θ*) q(θ* | θ) / z(θ*).

The resulting M-H acceptance ratio then has the form [ π(x* | θ*, y) f(y | θ*) f(x | θ) q(θ | θ*) π(θ*) ] / [ π(x | θ, y) f(y | θ) f(x* | θ*) q(θ* | θ) π(θ) ], and z(θ) is not required.

The crucial assumption is the ability to sample from the original ERGM given θ; in practice this is usually achieved using MCMC. Variations of the Møller method have been developed, essentially different choices of π(x | θ, y); one such variation is sketched below.
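A runnable sketch of one such variation, the exchange algorithm of Murray, Ghahramani & MacKay (2006), applied to the tiny N = 3 ERGM above, where exact simulation from π(y | θ) is possible by enumeration; the flat prior, random-walk proposal and all numerical settings are illustrative assumptions.

```python
# Exchange-type auxiliary-variable MCMC for the tiny N = 3 ERGM; z(theta)
# cancels in the acceptance ratio, so it is never computed.
import itertools
import math
import random

rng = random.Random(0)
GRAPHS = list(itertools.product([0, 1], repeat=3))   # all 8 graphs

def stats(g):
    return (sum(g), 1 if all(g) else 0)              # (edges, triangles)

def log_f(g, theta):                                 # unnormalised log-likelihood
    s = stats(g)
    return theta[0] * s[0] + theta[1] * s[1]

def sample_graph(theta):                             # exact draw by enumeration
    w = [math.exp(log_f(g, theta)) for g in GRAPHS]
    return rng.choices(GRAPHS, weights=w)[0]

def exchange_step(theta, y, step=0.5):
    # Random-walk proposal, then an auxiliary graph drawn at the proposal.
    prop = tuple(t + rng.gauss(0.0, step) for t in theta)
    x = sample_graph(prop)
    # Log acceptance ratio under a flat prior; z(theta) has cancelled.
    log_a = (log_f(y, prop) - log_f(y, theta)
             + log_f(x, theta) - log_f(x, prop))
    return prop if math.log(rng.random()) < log_a else theta

y = (1, 1, 0)                                        # observed graph: two edges
theta = (0.0, 0.0)
for _ in range(1000):
    theta = exchange_step(theta, y)
print(theta)
```

For realistic N, sample_graph would be replaced by an (approximate) MCMC draw from the ERGM, which is exactly the crucial assumption noted above.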


5. Open problems. 1. Simulating random graphs from ERGMs? MCMC is considered the gold-standard method for drawing from π(y | θ) for given θ, which is essential in order to draw inference for θ. Is it possible to use an exact algorithm instead, for instance rejection sampling? What would be a good proposal distribution? How efficient would it be?

2. Approximate inference for ERGMs? Bayesian inference for ERGMs often relies on advanced MCMC algorithms (Caimo and Friel, 2010). Alternatively, one can resort to approximate methods which are easier to implement.

Data y; parameter θ; target distribution π(θ | y). Consider the following algorithm (a runnable sketch follows below):
1. Draw θ* from the prior π(θ).
2. Simulate data y* from π(y* | θ*).
3. If y* = y then accept θ*.
4. Go to 1.
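A runnable sketch of this rejection algorithm, again on the tiny N = 3 ERGM so that simulation from the model is exact; the uniform grid prior on θ is an illustrative assumption.

```python
# Exact-match rejection sampler: accepted theta values are draws from the
# posterior pi(theta | y), since y is discrete.
import itertools
import math
import random

rng = random.Random(1)
GRAPHS = list(itertools.product([0, 1], repeat=3))   # N = 3: three edges

def log_f(g, theta):                                 # edge and triangle stats
    return theta[0] * sum(g) + theta[1] * (1 if all(g) else 0)

def sample_graph(theta):                             # exact draw, tiny model
    w = [math.exp(log_f(g, theta)) for g in GRAPHS]
    return rng.choices(GRAPHS, weights=w)[0]

THETAS = [(0.5 * a, 0.5 * b) for a in range(-4, 5) for b in range(-4, 5)]

def abc_exact(y, n_accept=200):
    accepted = []
    while len(accepted) < n_accept:
        theta = rng.choice(THETAS)        # 1. draw theta* from the (uniform) prior
        y_star = sample_graph(theta)      # 2. simulate y* from the model
        if y_star == y:                   # 3. accept theta* iff y* = y
            accepted.append(theta)
    return accepted

print(abc_exact((1, 1, 0))[:5])           # observed graph: two edges present
```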

No evaluation of the likelihood is required, which makes the algorithm suitable when the likelihood is intractable or expensive to compute. It relies only on being able to simulate data from the model, which is usually easy to do. However, step 3 may not be feasible in practice, since an exact match y* = y can be a very rare event.

A variation of the previous algorithm:
1. Draw θ* from the prior π(θ).
2. Simulate data y* from π(y* | θ*).
3. If ρ(y, y*) ≤ ε then accept θ*.
4. Go to 1.
Here ρ(y, y*) is a measure of distance between y and y*.

Summary statistics: instead of calculating the distance between the "raw data" y and y*, we can calculate the distance between summary statistics of the data, S(y) and S(y*), i.e. use ρ(S(y), S(y*)).

Recall that the likelihood function is π(y | θ) = exp( Σ_i θ(i) s(i,y) ) / z(θ). A natural choice of summary statistics is therefore s(1,y), s(2,y), …, which are also sufficient statistics; a sketch follows below.
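A runnable sketch of the tolerance version, using the sufficient statistics s(y) = (number of edges, number of triangles) as summaries on the same tiny ERGM; the Euclidean distance, tolerance ε = 1 and uniform prior are illustrative assumptions.

```python
# Tolerance-based ABC with the ERGM's sufficient statistics as summaries.
import itertools
import math
import random

rng = random.Random(2)
GRAPHS = list(itertools.product([0, 1], repeat=3))

def stats(g):
    return (sum(g), 1 if all(g) else 0)      # sufficient statistics

def sample_graph(theta):
    w = [math.exp(theta[0] * s[0] + theta[1] * s[1]) for s in map(stats, GRAPHS)]
    return rng.choices(GRAPHS, weights=w)[0]

def abc_tolerance(y, eps=1.0, n_accept=200):
    s_obs = stats(y)
    accepted = []
    while len(accepted) < n_accept:
        theta = (rng.uniform(-2, 2), rng.uniform(-2, 2))   # prior draw
        s_sim = stats(sample_graph(theta))
        if math.dist(s_obs, s_sim) <= eps:   # accept if rho(S(y), S(y*)) <= eps
            accepted.append(theta)
    return accepted

print(len(abc_tolerance((1, 1, 0))))
```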

Challenges for approximate Bayesian computation (ABC): how to choose the distance metric ρ(∙)? How to choose ε? Can sequential Monte Carlo (SMC) methods help?

3. Model choice for ERGMs? Suppose we have some network data and a number of different ERGMs that we could fit to these data. How do we decide which ERGM the data support most? How can we tell whether a particular ERGM offers a good fit to the data? This is the problem of model choice/selection.

Bayesian model choice in general can be problematic (Bayes factors, marginal likelihoods). The key concept is the marginal likelihood π(y): π(θ | y) = π(y | θ) π(θ) / π(y), where π(y) = ∫ π(y | θ) π(θ) dθ. (A toy illustration follows below.)
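As a toy illustration on the tiny N = 3 ERGM, where z(θ) is computable by enumeration, here is a naive Monte Carlo estimate of π(y) (average the likelihood over prior draws) and the resulting Bayes factor between two hypothetical models; the priors and the model pair are illustrative assumptions, and this naive estimator is unavailable when the likelihood is intractable, which is precisely the difficulty raised on the next slide.

```python
# Naive Monte Carlo marginal likelihood: pi(y) ~ mean of pi(y | theta_m)
# over prior draws theta_m. Works here only because z(theta) is computable.
import itertools
import math
import random

rng = random.Random(3)
GRAPHS = list(itertools.product([0, 1], repeat=3))

def stats(g):
    return (sum(g), 1 if all(g) else 0)

def likelihood(y, theta):                 # exact: z(theta) by enumeration
    weights = [math.exp(sum(t * s for t, s in zip(theta, stats(g))))
               for g in GRAPHS]
    return math.exp(sum(t * s for t, s in zip(theta, stats(y)))) / sum(weights)

def marginal_likelihood(y, draw_prior, m=20000):
    return sum(likelihood(y, draw_prior()) for _ in range(m)) / m

y = (1, 1, 0)
# Model 1: edges only (theta(2) fixed at 0); Model 2: edges and triangles.
m1 = marginal_likelihood(y, lambda: (rng.gauss(0, 1), 0.0))
m2 = marginal_likelihood(y, lambda: (rng.gauss(0, 1), rng.gauss(0, 1)))
print("Bayes factor (M1 vs M2):", m1 / m2)
```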

Exact Bayesian inference for ERGMs is itself hard, due to the fact that the posterior density is "doubly intractable": π(θ | y) ∝ π(y | θ) π(θ) = f(y | θ) π(θ) / z(θ). Hence Bayesian model choice would be even harder, due to z(θ) being unknown.

4. Need for alternative, computationally tractable network models? Using ERGMs on large networks can be very computationally intensive. Can we develop models which preserve (some of) the nice features of ERGMs but are easier to handle computationally and more suitable for epidemic modelling?