Improving health worldwide www.lshtm.ac.uk George B. Ploubidis The role of sensitivity analysis in the estimation of causal pathways from observational.

Slides:



Advertisements
Similar presentations
Multilevel Models with Latent Variables Daniel J. Bauer Department of Psychology University of North Carolina 9/13/04 SAMSI Workshop.
Advertisements

Statistics for Improving the Efficiency of Public Administration Daniel Peña Universidad Carlos III Madrid, Spain NTTS 2009 Brussels.
The Simple Linear Regression Model Specification and Estimation Hill et al Chs 3 and 4.
Properties of Least Squares Regression Coefficients
The Simple Regression Model
Lecture 8 (Ch14) Advanced Panel Data Method
Hypothesis Testing Steps in Hypothesis Testing:
Copyright EM LYON Par accord du CFC Cession et reproduction interdites Research in Entrepreneurship- The problem of unobserved heterogeneity Frédéric Delmar.
CmpE 104 SOFTWARE STATISTICAL TOOLS & METHODS MEASURING & ESTIMATING SOFTWARE SIZE AND RESOURCE & SCHEDULE ESTIMATING.
Correlation and regression Dr. Ghada Abo-Zaid
3.2 OLS Fitted Values and Residuals -after obtaining OLS estimates, we can then obtain fitted or predicted values for y: -given our actual and predicted.
Linear Regression. PSYC 6130, PROF. J. ELDER 2 Correlation vs Regression: What’s the Difference? Correlation measures how strongly related 2 variables.
Omitted Variable Bias Methods of Economic Investigation Lecture 7 1.
Linear regression models
STAT 497 APPLIED TIME SERIES ANALYSIS
Exploratory factor analysis GHQ-12. EGO GHQ-12 EFA 1) Assuming items are continuous Variable: Names are ghq01 ghq02 ghq03 ghq04 ghq05 ghq06 ghq07 ghq08.
Some Terms Y =  o +  1 X Regression of Y on X Regress Y on X X called independent variable or predictor variable or covariate or factor Which factors.
Method Reading Group September 22, 2008 Matching.
The Simple Linear Regression Model: Specification and Estimation
Regulatory Network (Part II) 11/05/07. Methods Linear –PCA (Raychaudhuri et al. 2000) –NIR (Gardner et al. 2003) Nonlinear –Bayesian network (Friedman.
Chapter 12 Simple Regression
Structural Equation Modeling
ARIMA Forecasting Lecture 7 and 8 - March 14-16, 2011
Chapter 11 Multiple Regression.
The Simple Regression Model
6.4 Prediction -We have already seen how to make predictions about our dependent variable using our OLS estimates and values for our independent variables.
Basics of regression analysis
FIN357 Li1 The Simple Regression Model y =  0 +  1 x + u.
Linear and generalised linear models Purpose of linear models Least-squares solution for linear models Analysis of diagnostics Exponential family and generalised.
Statistics 350 Lecture 17. Today Last Day: Introduction to Multiple Linear Regression Model Today: More Chapter 6.
Getting Started with Hypothesis Testing The Single Sample.
Structural Equation Modeling Intro to SEM Psy 524 Ainsworth.
Mediation: Sensitivity Analysis David A. Kenny. 2 You Should Know Assumptions Detailed Example Solutions to Assumption Violation.
Regression and Correlation Methods Judy Zhong Ph.D.
Lifelong Socio Economic Position and biomarkers of later life health: Testing the relative contribution of competing hypotheses George B. Ploubidis, Emily.
ECE 8443 – Pattern Recognition LECTURE 06: MAXIMUM LIKELIHOOD AND BAYESIAN ESTIMATION Objectives: Bias in ML Estimates Bayesian Estimation Example Resources:
Model Inference and Averaging
Montecarlo Simulation LAB NOV ECON Montecarlo Simulations Monte Carlo simulation is a method of analysis based on artificially recreating.
Lecture 8: Generalized Linear Models for Longitudinal Data.
PARAMETRIC STATISTICAL INFERENCE
Introduction Multilevel Analysis
+ Chapter 12: Inference for Regression Inference for Linear Regression.
The Examination of Residuals. Examination of Residuals The fitting of models to data is done using an iterative approach. The first step is to fit a simple.
Generalized Linear Models All the regression models treated so far have common structure. This structure can be split up into two parts: The random part:
+ Chapter 12: More About Regression Section 12.1 Inference for Linear Regression.
Multiple Regression. Simple Regression in detail Y i = β o + β 1 x i + ε i Where Y => Dependent variable X => Independent variable β o => Model parameter.
Business Statistics for Managerial Decision Farideh Dehkordi-Vakil.
Estimation Method of Moments (MM) Methods of Moment estimation is a general method where equations for estimating parameters are found by equating population.
Lecture 2: Statistical learning primer for biologists
Constructs AKA... AKA... Latent variables Latent variables Unmeasured variables Unmeasured variables Factors Factors Unobserved variables Unobserved variables.
Simulation Study for Longitudinal Data with Nonignorable Missing Data Rong Liu, PhD Candidate Dr. Ramakrishnan, Advisor Department of Biostatistics Virginia.
Chapter 8: Simple Linear Regression Yang Zhenlin.
Review of Statistics.  Estimation of the Population Mean  Hypothesis Testing  Confidence Intervals  Comparing Means from Different Populations  Scatterplots.
Marshall University School of Medicine Department of Biochemistry and Microbiology BMS 617 Lecture 11: Models Marshall University Genomics Core Facility.
8-1 MGMG 522 : Session #8 Heteroskedasticity (Ch. 10)
Mediation: The Causal Inference Approach David A. Kenny.
Individual observations need to be checked to see if they are: –outliers; or –influential observations Outliers are defined as observations that differ.
1 Ka-fu Wong University of Hong Kong A Brief Review of Probability, Statistics, and Regression for Forecasting.
1 AAEC 4302 ADVANCED STATISTICAL METHODS IN AGRICULTURAL RESEARCH Part II: Theory and Estimation of Regression Models Chapter 5: Simple Regression Theory.
Econometric analysis of CVM surveys. Estimation of WTP The information we have depends on the elicitation format. With the open- ended format it is relatively.
Estimating standard error using bootstrap
Inference about the slope parameter and correlation
Multiple Imputation using SOLAS for Missing Data Analysis
The Centre for Longitudinal Studies Missing Data Strategy
How to handle missing data values
CHAPTER 29: Multiple Regression*
Joint Statistical Meetings, Vancouver, August 1, 2018
Chapter 4, Regression Diagnostics Detection of Model Violation
Topic 8 Correlation and Regression Analysis
Presentation transcript:

Improving health worldwide George B. Ploubidis The role of sensitivity analysis in the estimation of causal pathways from observational data

Outline Sensitivity analysis Causal Mediation -Two examples Advantages Limitations Summary

Causal inference Causal inference with observational data is a nearly alchemic task Estimates depend on the model being correctly specified – no unmeasured confounders – Sequential Ignorability Can’t be directly tested Things become more complicated when mediation is of interest

A simple idea Sensitivity analysis is an effective method for probing the plausibility of a nonrefutable assumption (sequential ignorability) The goal of sensitivity analysis is to quantify the degree to which the key assumption of no unmeasured confounders (sequential ignorability) must be violated for a researcher’s original conclusion to be reversed

If an inference is sensitive, a slight violation of the assumption may lead to substantively different conclusions Given the importance of sequential ignorability, it has been argued that when observational data are employed some kind of sensitivity analysis should always be carried out Simply put: What happens to my estimated parameters if I simulate the effect of unmeasured confounders?

It’s been done before Survival “frailty” models Time series with latent factors However, the difference with causal mediation is that indirect effects need to be estimated Bayesian semi parametric propensity scores

Three general scenarios Mediator – outcome confounders Exposure – mediator confounders Exposure – mediator – outcome confounders Formal approaches available for the first scenario, but model specific approaches available for the remaining two

Mediator - Outcome U

Exposure – mediator U

Exposure – mediator - outcome U

When it’s about the exposure No formal approach thus far But under certain assumptions we can “challenge” our parameter estimates We can capitalise on the properties of latent variable measurement models Latent variables capture unobserved heterogeneity Unmeasured confounders can be thought of as sources of unobserved heterogeneity

When the exposure is involved Include latent variable “U” to represent unmeasured confounder(s) U ~ N (0,1) Normally distributed (by definition), with mean 0 and variance 1

The latent variable(s) can represent the effect of one or more confounders The goal is to find out what happens to our estimates under several scenarios that involve latent “U” It can be shown that under certain assumptions latent variables can “imitate” the effect of observed confounders

A (relatively) simple example U

A simple LSEM All variables continuous and normally distributed No other confounders other than “U” Linear associations No interactions (although they could be accommodated) Estimation with MLR

First the parameter estimates without the confounder Y on X via W = ( )

Here comes the (observed) confounder! Y on X via W = ( )

Can a latent variable do the same? It can be shown that if we fix the intercepts, slopes (loadings) and variance of the latent variable according to the estimated parameters we can obtain the estimates from the previous model X = Ax + λxU + eX W = AW + λwU + eW Y = AY + λYU + eY

Estimates with the “latent confounder” U Y on X via W = ( )

Two possibilities a) The researcher suspects a set of unknown confounders b) A well known confounder, or a set of well known confounders have not been measured

Frequentist approach By specifying values for the effect of the confounder(s), the researcher will be able to test several scenarios of weak/moderate/strong confounding An iterative process The results of the trials can be quantified

Weak/No confounding U Y on X via W = ( )

Strong confounding U Y on X via W = ( )

A scree plot Finding the tipping point

Let’s go Bayesian “U” is a well known confounder It’s associations with X,W and Y have been quantified in the existing literature Hence, we use informative priors for the parameters that link “U” with X,W and Y UX ~ N (0.37, 0.01) UW ~N (0.26, 0.01) UY ~ N (-0.23, 0.01)

U Y on X via W = ( )

Limitations Not non parametrically identified (i.e. results depend on the distribution of the simulated confounder) No stopping rule – can’t be falsified Latent confounder can only be normally distributed Discrete latent variables possible – Principal Stratification

Advantages Properties of LVMs are well known Software availability DAG theory can be used to inform the sensitivity analyses

Mediator – Outcome Confounding Medsens (Stata, R, Mplus) Employs the correlation (Rho) between the residual variances (errors) of the models for the mediator and outcome Effects are computed given different fixed values of the residual covariance. The proposed sensitivity analysis asks the question of how large does Rho have to be for the mediation effect (Average Causal Mediation Effect – ACME) to disappear

Medsens example U X to Y via M = 0.14 (0.11 – 0.17)

Medsens Results

Medsens Results II Rho at which ACME = 0 is Product of residuals where ACME = 0, is Product of explained variances where ACME = 0, is The unmeasured confounder needs for example to explain 30% of the originally explained variance of the mediator and 39% of the outcome for the ACME to be 0 Since this is a product other combinations are plausible (0.20 * 0.57 etc)

Limitations Assumes all confounding due to Rho Only available for mediator - outcome associations Accommodates continuous mediator and continuous/ binary outcome, and binary mediator and continuous outcome Assumes normal distribution of error terms

Advantages No distributional assumption for the unmeasured confounder Can accommodate binary, ordinal outcomes Quintile regression (for the outcome model) also available (only in R) Easy to use software

Summary Under certain assumptions latent variables have the potential to “imitate” the effect of unmeasured confounders Medsens is a very useful tool to test the effects of mediator – outcome confounders Both approaches mostly effective in research areas (like the study of health inequalities) with strong theoretical underpinnings that can inform parameter specification/interpretation

Summary II Sensitivity analysis not a substitute for randomisation Be aware of the assumptions and limitations of all sensitivity analysis approaches But especially when estimation of indirect effects (mediation) is required Always carry out sensitivity analysis!!

Thank you!