Spatial assessment of deprivation and mortality risk in Nova Scotia: Comparison between Bayesian and non-Bayesian approaches (prepared for the 2008 CPHA Conference)

Presentation transcript:

1 Spatial assessment of deprivation and mortality risk in Nova Scotia: Comparison between Bayesian and non-Bayesian approaches
Prepared for 2008 CPHA Conference, June 1-4, 2008
Mikiko Terashima, Dr. Pantelis Andreou, Dr. Judith Guernsey
Department of Community Health & Epidemiology, Dalhousie University

2 Objectives
- To present the basic ideas of Bayesian approaches in disease mapping, from the perspective of a health researcher with minimal statistical background
- To demonstrate an example application of Bayesian approaches to spatial analysis of health, in comparison with non-Bayesian approaches

4 Key differences
Non-Bayesian
- Parameters are assumed fixed (constant)
- Uses the likelihood to estimate parameters
- “Confidence interval”: if we were to repeat the sampling, e.g. 100 times, about 95 of the resulting intervals would contain the constant (true) parameter value
- Standard Error (SE)
- Chi-square, AIC, -2 log-likelihood etc. for goodness-of-fit
Bayesian
- Parameters are assumed random
- Uses the likelihood and prior information to calculate the posterior distribution, and estimates parameters from it (estimated parameters, e.g. the mean or median, are used like the constant parameter values in non-Bayesian models)
- “Credible interval”: from a number of simulations we know the distribution of the parameter (the posterior distribution), and 95% of its values fall between A and B
- Markov Chain Standard Error (MCSE)
- DIC for goodness-of-fit (like AIC, but does not require maximum likelihood estimates of the parameters)
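The "credible interval from simulations" idea can be sketched in a few lines of Python. This is only an illustration under assumed inputs, not the model used in the presentation: it takes a single area with hypothetical observed and expected counts, assumes a Gamma(1, 1) prior on the relative risk (chosen purely for convenience, because the posterior is then also a gamma distribution), draws from the posterior, and reads the 95% interval off the sorted draws. The function names and the numbers 12 and 8.0 are made up for the example.

```python
import random


def posterior_draws(observed, expected, shape=1.0, rate=1.0,
                    n_draws=20000, seed=1):
    """Simulate relative-risk draws for one area.

    Assumes observed ~ Poisson(theta * expected) and a Gamma(shape, rate)
    prior on theta, so the posterior is Gamma(shape + observed,
    rate + expected). All inputs here are hypothetical.
    """
    rng = random.Random(seed)
    post_shape = shape + observed
    post_scale = 1.0 / (rate + expected)  # gammavariate expects a scale
    return [rng.gammavariate(post_shape, post_scale) for _ in range(n_draws)]


def credible_interval(draws, level=0.95):
    """Equal-tailed interval read directly from the simulated draws."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * n)]
    upper = ordered[min(n - 1, int((1.0 - tail) * n))]
    return lower, upper


draws = posterior_draws(observed=12, expected=8.0)
low, high = credible_interval(draws)
post_mean = sum(draws) / len(draws)
print(f"posterior mean RR = {post_mean:.2f}, 95% CrI = ({low:.2f}, {high:.2f})")
```

The interval here is a statement about where 95% of the simulated parameter values lie, which is the reading given on the slide; MCMC software such as WinBUGS reports the same kind of percentiles from its own samples.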

5 Why use Bayesian approaches in spatial analysis of health/disease mapping?
- They can incorporate spatial random effects in the analysis of spatial variation in health, disease or health/disease risks; the Bayesian way is more ‘natural’ and straightforward. The belief that parameters are random fits the assumption that rates differ somewhat between geographical areas.
- Because the model “borrows strength” from prior knowledge about factors not in the data, it is closer to reality than inference from the data alone, and it reduces noise in maps.
- They allow us to estimate values for areas with missing data by sampling them the same way as other parameters.

6 Challenges
- “Black box” for non-statisticians
- “PC vs. Apple computers”
- New and constantly evolving
- Application of prior information requires statisticians’ insights and background knowledge about the variables

7 Example application: Spatial analysis of deprivation and mortality in Nova Scotia communities

8 Scenario (Deprivation and mortality in Nova Scotia)
- A number of deaths (obs) were observed in 278 communities of Nova Scotia over the period.
- Counts of events are likely Poisson distributed: obs ~ Po(μ).
- We want to see the association between deprivation at the community level and observed counts of death.
- The community-level deprivation was measured by two sets of variables: material (Mat) and psychosocial (Psy).
- Spatial autocorrelation might be playing a role (assumed normal).
- We want to know if other (unknown) factors are affecting the rates.
- We know the expected counts for each community (exp).
- We want to map community-level risks.

9 Poisson regression model of counts Offset
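As a sketch of what a Poisson regression with an offset looks like for this scenario, the model can be written as below. The symbols are assumptions chosen to match the rest of the deck: O_i and E_i stand for the observed and expected deaths in community i, μ_i is the Poisson mean, and θ_i is the relative risk (called RR in the WinBUGS code). Random effects are left out here.

```latex
\begin{align*}
O_i &\sim \mathrm{Poisson}(\mu_i), \qquad \mu_i = E_i\,\theta_i,\\
\log \mu_i &= \underbrace{\log E_i}_{\text{offset}} + \beta_0
              + \beta_1\,\mathrm{Mat}_i + \beta_2\,\mathrm{Psy}_i,\\
\theta_i &= \exp\!\left(\beta_0 + \beta_1\,\mathrm{Mat}_i
              + \beta_2\,\mathrm{Psy}_i\right).
\end{align*}
```

Because the offset enters with a fixed coefficient of 1, the regression coefficients describe the relative risk θ_i rather than the raw count.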

10 Models
1. log(μ_i) = offset_i (+ β_0i) + β_1i × Mat + β_2i × Psy [+ e_i]
   β_0i = β_0 + u_0i, u_0i ~ N(0, σ_0)
   β_1i = β_1 + u_1i, u_1i ~ N(0, σ_1)
   β_2i = β_2 + u_2i, u_2i ~ N(0, σ_2)
   e_i ~ N(0, σ_e)
2. log(μ_i) = offset_i (+ β_0i) + β_1i × Mat + β_2i × Psy + v_0i + u_0i(neigh)
   Conditional Autoregressive model (unstructured random + neighbourhood effect), simulated from a similar model template provided within WinBUGS software
   β_0i ~ flat()
   β_1i ~ N(0, 1.0E-5)
   β_2i ~ N(0, 1.0E-5)
   v_0i ~ N(0, 1/σ²_v)
   u_0i(neigh) ~ CAR
   tau.u ~ gamma(0.5, )
   tau.v ~ gamma(0.5, )

11 Conditional AutoRegressive model (CAR)-Normal
- A common model to deal with spatial autocorrelation in disease mapping; it usually results in smoothing
- The spatial autocorrelation effect for neighbourhood i depends on the number of neighbours and the average of the surrounding (adjacent) areas (neighbours)
- Weights are typically 1 (e.g. Besag, York & Mollie, 1991)
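In the notation commonly used for the intrinsic CAR-Normal prior with all weights equal to 1, the dependence described on this slide can be written as follows. The symbols are assumptions for illustration: u_i is the spatial effect for neighbourhood i, δ_i is its set of adjacent areas, n_i is the number of neighbours, and σ_u² is the conditional variance parameter (the inverse of tau.u in the WinBUGS code).

```latex
\[
u_i \mid u_{j},\, j \neq i \;\sim\;
\mathrm{N}\!\left(\bar{u}_i,\ \frac{\sigma_u^{2}}{n_i}\right),
\qquad
\bar{u}_i = \frac{1}{n_i} \sum_{j \in \delta_i} u_j .
\]
```

The conditional mean is the average of the neighbours, which is why the model smooths, and the conditional variance shrinks as the number of neighbours grows.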

12 WinBUGS code
model {
   for (i in 1 : N) {
      # Likelihood: Poisson regression model
      O[i] ~ dpois(mu[i])
      log(mu[i]) <- log(Mrexp[i]) + beta0 + beta1 * Mat[i] + beta2 * Psy[i] + u[i] + v[i]
      # RR for each community
      RR[i] <- exp(beta0 + beta1 * Mat[i] + beta2 * Psy[i] + u[i] + v[i])
      # Filling missing values
      Mrexp[i] ~ dgamma(1, 0.1)
      Mat[i] ~ dnorm(0.0, 1.0)
      Psy[i] ~ dnorm(0.0, 1.0)
      # Unstructured random effects: exchangeable prior
      v[i] ~ dnorm(0, tau.v)
   }
   # CAR prior distribution for spatial random effects
   u[1:N] ~ car.normal(adj[], weights[], num[], tau.u)
   for (k in 1 : sumNumNeigh) {
      weights[k] <- 1
   }
   # Other parameter priors
   beta0 ~ dflat()
   beta1 ~ dnorm(0.0, 1.0E-5)
   beta2 ~ dnorm(0.0, 1.0E-5)
   # The second argument of each dgamma prior below is missing in the source
   tau.u ~ dgamma(0.5, )
   sigma.u <- sqrt(1 / tau.u)
   tau.v ~ dgamma(0.5, )
   sigma.v <- sqrt(1 / tau.v)
}
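The `car.normal` prior takes the neighbourhood structure as the flat vectors `adj[]` and `num[]`, with `sumNumNeigh` as their total length. The Python sketch below shows one way such vectors could be built and read, assuming areas are numbered 1 to N as WinBUGS expects. The four-area map, the function names and the values in `u` are invented for the example and are not from the Nova Scotia data.

```python
def to_adj_num(neighbours):
    """Flatten {area id: [neighbour ids]} (1-based ids) into adj and num.

    adj lists the neighbours of area 1, then area 2, and so on;
    num[i] is the number of neighbours of area i + 1.
    """
    adj, num = [], []
    for area in sorted(neighbours):
        adj.extend(neighbours[area])
        num.append(len(neighbours[area]))
    return adj, num


def neighbour_means(values, adj, num):
    """Average of the neighbouring values for every area (weights of 1)."""
    means = []
    start = 0
    for n_i in num:
        ids = adj[start:start + n_i]
        # An area with no neighbours has no defined neighbour mean.
        means.append(sum(values[j - 1] for j in ids) / n_i if n_i else float("nan"))
        start += n_i
    return means


# Hypothetical map of four areas.
neighbours = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2]}
adj, num = to_adj_num(neighbours)
sum_num_neigh = len(adj)          # plays the role of sumNumNeigh

u = [0.2, -0.1, 0.4, 0.0]          # made-up spatial effects
means = neighbour_means(u, adj, num)
print(adj, num, sum_num_neigh)
print([round(m, 3) for m in means])
```

The neighbour means computed here are the quantities that the CAR prior centres each area's spatial effect on.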

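13 Parameter outputs
Model 2 output with WinBUGS: table with columns node, mean, sd, MC error, 2.5%, median, 97.5%, start, sample and rows beta0, beta1, beta2, sigma.u, sigma.v (values not preserved in the transcript); DIC = (value not preserved)
- beta0/β0: Intercept
- beta1/β1: Material deprivation
- beta2/β2: Psychosocial deprivation
- sigma.u: Standard deviation of the spatially structured (autocorrelation) random effects
- sigma.v: Standard deviation of the unstructured random effects
Model 1 output with MLwiN: log(μ) = offset + β0 + β1 × Mat + β2 × Psy (fitted values not preserved in the transcript); DIC = (value not preserved)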
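For readers unfamiliar with the DIC reported for both models, its usual definition is sketched below. The notation is an assumption for illustration: y is the data, θ the model parameters, and a bar denotes a posterior mean computed from the MCMC samples.

```latex
\begin{align*}
D(\theta) &= -2 \log p(y \mid \theta), \\
\bar{D} &= \mathrm{E}_{\theta \mid y}\!\left[ D(\theta) \right], \qquad
p_D = \bar{D} - D(\bar{\theta}), \\
\mathrm{DIC} &= \bar{D} + p_D = D(\bar{\theta}) + 2\,p_D .
\end{align*}
```

As with AIC, a smaller DIC indicates a better trade-off between fit and the effective number of parameters p_D.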

14 SMR map (Model 1) versus RR map (Model 2)
- Extremely high and low rates were reduced
- The rate at each community shrank towards the mean (1.0)
- Communities with missing data on the SMR map now have predicted values
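The shrinkage towards 1.0 can be illustrated with a much simpler model than the CAR model used in the presentation. The sketch below swaps in a plain Poisson-gamma smoother: it assumes a gamma prior on the relative risk with mean 1, whose strength (`prior_strength`) is a made-up tuning value, and compares the raw SMR with the posterior mean for a small and a large hypothetical community that share the same SMR.

```python
def smr(observed, expected):
    """Standardized mortality ratio: observed over expected deaths."""
    return observed / expected


def smoothed_rr(observed, expected, prior_strength=5.0):
    """Posterior mean relative risk under a Gamma(a, a) prior (mean 1).

    With observed ~ Poisson(theta * expected), the posterior mean is
    (a + observed) / (a + expected), where a = prior_strength is an
    assumed value chosen only for this illustration.
    """
    return (prior_strength + observed) / (prior_strength + expected)


# Two hypothetical communities with the same raw SMR of 3.0.
areas = [("small", 3, 1.0), ("large", 300, 100.0)]
results = {}
for name, observed, expected in areas:
    results[name] = (smr(observed, expected), smoothed_rr(observed, expected))
    raw, smooth = results[name]
    print(f"{name}: SMR = {raw:.2f}, smoothed RR = {smooth:.2f}")
```

The small community, whose SMR rests on only three deaths, is pulled much closer to 1.0 than the large one. The CAR model on the slides shrinks towards the local neighbourhood average as well as the overall mean, but the mechanism of pulling unstable extremes inward is the same.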

15 Conclusion
- Three benefits of Bayesian approaches for spatial health analyses are: 1) random effects due to area variations can be easily incorporated; 2) estimates are smoothed; and 3) missing values can be inferred using prior knowledge (no holes in the map).
- Working closely with knowledgeable statisticians, health researchers can use Bayesian approaches to gain these benefits (e.g. when spatially analyzing and mapping health and health risks).

16 Acknowledgement
The student is funded by a Killam Pre-doctoral Scholarship and the CIHR Strategic Training Program in Public Health and the Agricultural Rural Ecosystem (PHARE).

17 Thank you!