Analysis of Observational Data

Adjusting for Bias: The Propensity Score and Instrumental Variable Analysis

Overview
Randomized clinical trials
Observational data
Statistical adjustment in observational studies
  Multivariable regression adjustment
  Calculating and using the propensity score
  Instrumental variable (IV) analysis

Randomized Controlled Trials (the “gold standard”)
A trial in which treatment assignment is determined by a randomization process.

Characteristics of RCTs
Participants have no contraindications to either treatment
Randomization balances the treatment groups, on average, on all factors, measured and unmeasured
Allow causal inference

But
High cost
Problems with generalizability:
  Many people who are given the treatments in “real life” are excluded from the trials
  Treatment is “ideal” (high compliance; careful follow-up means that any problems may be caught early)
  Tend to compare treatment to placebo, rather than comparing two “accepted” treatments to one another
Often short duration and/or underpowered
Some situations cannot be randomized

What is Observational Data
The choice of treatment is not under the control of the researcher: the researcher can only “observe” what treatment was given
Chart abstraction and/or administrative databases
All “patients”
All providers
But … missing key information

Example of an Observational Study using Administrative Data
Two medications used to treat chronic obstructive pulmonary disease (COPD)
  Long-acting anticholinergic (LAAC)
  Long-acting beta agonist (LABA)
Compare the risk of cardiovascular events (e.g. heart attack, stroke)

Hospital Discharge Database (diagnoses)
Ontario Drug Benefit Plan (which drug is the person taking)
Hospital Discharge Database (for outcomes)
Physician Billing Database (diagnoses)
Registered Persons Database (age, sex, SES)

Sorts of Predictors Available in the Administrative Data
Age, sex, neighborhood income quintile
Comorbidities
Diagnoses from hospital discharge records
Diagnoses from physician billing records
Prescription medications
Primary care (number of visits to a primary care doctor)
Characteristics of the health care region (availability of care)

Predictors not found in Administrative Data
Smoking history
Immigration status / language
Actual SES (income and education)
Disease severity (e.g. is the patient’s diabetes well controlled, how severe is their COPD)
Concern is unmeasured confounders

Observational Data Vs. Randomized Controlled Trials

Observational studies
Used when it is not feasible to use controlled experimentation
  Unethical to withhold treatment
  Exposure believed to be harmful
  Patients will not agree to be randomized
  RCT too expensive

Observational studies
Generalizable
Less expensive (though not cheap!)
Researcher has no control over the assignment of subjects to treatments
Researcher often has no control over what variables are collected, the quality of their measurement, or their definitions

Effectiveness vs. Efficacy
Effective treatment provides positive results in a usual or routine care condition.
  Effectiveness studies use real-world clinicians and clients, including clients who have multiple diagnoses or needs.
Efficacious treatment provides positive results in a controlled experimental research trial, often in highly constrained conditions.

Statistical Adjustment for Observational Data
Regression adjustment
Stratification
Propensity score
Instrumental variable analysis

Analysis of our Study
Exposure variable is choice of drug (LAAC vs. LABA)
Outcome is time to hospitalization for an adverse cardiovascular event → survival analysis
To account for possible differences between the two drug groups, the analysis will be adjusted for other covariates.

Multivariable Regression Analysis

Regression adjustment
Requires “sufficient” overlap between the treatment groups
Difficult to assess whether the model has sufficiently adjusted for differences between the two groups
Limitation on the number of covariates when studying rare outcomes

Regression Adjustment
Analysis not separated from the design
Estimates the conditional (adjusted) effect rather than the population-average (marginal) effect
Does not account for unmeasured patient characteristics affecting both the treatment decision and the outcome.

Stratification

Stratification
Subjects are grouped into subclasses; treated and untreated subjects who fall into the same subclass are compared
Cochran (1968) demonstrated that 5 subclasses are often sufficient to remove 90% of the bias due to the variable used to form the subclasses
As the number of covariates increases, the number of subclasses increases, and it becomes difficult to create strata that contain both treated and untreated subjects.

Propensity Score

Propensity Score
Builds on the general concept of stratification.
Rosenbaum and Rubin (1983): bias from measured covariates can be removed by controlling for a scalar-valued function of the covariates, the propensity score.

Calculating the propensity score
A way of summarizing the information in all of the prognostic variables
PS = probability of receiving one of the two treatments, given the observed covariates
Logistic regression: propensity score = P(LAAC rather than LABA) = f(age, sex, diabetes, etc.)

Logistic regression estimates the propensity for patients to be prescribed LAAC (rather than LABA), based on patient characteristics
  proc logistic; model LAAC = age sex comorbidity rurality …;
PS ~ propensity of physicians to prescribe one drug rather than the other to a certain type of patient
Patients predicted to be unlikely to be prescribed a LAAC (likely to be prescribed a LABA) → low propensity score
Patients of the sort who are likely to be prescribed a LAAC → high propensity score
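The logistic-regression step can be sketched in a few lines of Python. This is a minimal illustration, not the SAS analysis from the slides: the ten patients, the single age covariate, and the helper names `predict` and `fit_logistic` are all hypothetical, and the fit uses plain gradient ascent rather than the algorithm inside proc logistic.

```python
import math

def predict(beta, x):
    """Propensity score for covariates x under coefficients beta (intercept first)."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, treated, lr=0.5, n_iter=5000):
    """Fit P(treated = 1 | x) by gradient ascent on the mean log-likelihood."""
    n, p = len(rows), len(rows[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for x, t in zip(rows, treated):
            resid = t - predict(beta, x)      # observed minus predicted
            grad[0] += resid
            for j in range(p):
                grad[j + 1] += resid * x[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical patients: age in years, and 1 = prescribed LAAC, 0 = prescribed LABA.
ages = [55, 60, 62, 66, 70, 72, 75, 78, 81, 85]
laac = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

mean_age = sum(ages) / len(ages)
rows = [[(a - mean_age) / 10.0] for a in ages]   # centre age, scale to decades

beta = fit_logistic(rows, laac)
propensity = [predict(beta, x) for x in rows]    # one PS per patient
for a, ps in zip(ages, propensity):
    print(a, round(ps, 2))
```

In this made-up sample older patients are prescribed LAAC more often, so the fitted propensity score rises with age. A real model would include all the chosen baseline covariates.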

Variable Selection
Candidate sets of variables for the propensity score model:
  All measured baseline covariates
  Baseline covariates associated with treatment choice
  Baseline covariates associated with the outcome
  Baseline covariates associated with both treatment assignment and outcome

Propensity Score Methods
Covariate adjustment using the propensity score
Stratification on the PS
Matching on the PS
Inverse probability weighting

Covariate Adjustment
Regression using treatment plus PS as the independent variables.
Depends on the correctness of the propensity score calculation.
Assumes a linear relationship between PS and outcome.
In contrast, stratifying and matching depend mainly on the rank of the propensity scores.

Stratification
PS = probability of being prescribed a LAAC
[Figure: subjects ordered from low PS through intermediate PS to high PS and divided into strata]
Compare LAAC and LABA outcomes within each stratum
Combine the 5 estimates to get an overall estimate of the treatment difference.

Stratification
Within each stratum, covariates in the two treatment groups are similarly distributed
Strata homogeneous in the propensity score tend to balance the observed covariates (this can be tested).
Each stratum replicates an RCT, conditional on the observed covariates.
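A rough sketch of stratifying on the propensity score follows. The data are fabricated for illustration (a continuous outcome with a built-in treatment effect of 2, rather than the time-to-event outcome of the COPD study), and `stratified_difference` is a hypothetical helper that weights each stratum by its share of subjects.

```python
def stratified_difference(ps, treated, outcome, n_strata=5):
    """Rank subjects by PS, cut into equal-sized strata, and average the
    within-stratum differences in mean outcome (treated minus untreated),
    weighting each stratum by its number of subjects."""
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    n = len(order)
    total, used = 0.0, 0
    for s in range(n_strata):
        members = order[s * n // n_strata:(s + 1) * n // n_strata]
        y1 = [outcome[i] for i in members if treated[i] == 1]
        y0 = [outcome[i] for i in members if treated[i] == 0]
        if not y1 or not y0:
            continue  # stratum lacks one group, so no comparison is possible
        diff = sum(y1) / len(y1) - sum(y0) / len(y0)
        total += diff * len(members)
        used += len(members)
    if used == 0:
        raise ValueError("no stratum contains both treated and untreated subjects")
    return total / used

# Fabricated data: baseline outcome rises with PS (confounding); true effect = 2.
ps, treated, outcome = [], [], []
for p, n_treated in [(0.1, 1), (0.3, 3), (0.5, 5), (0.7, 7), (0.9, 9)]:
    for k in range(10):
        t = 1 if k < n_treated else 0
        ps.append(p)
        treated.append(t)
        outcome.append(10 * p + 2 * t)

y_treated = [y for y, t in zip(outcome, treated) if t == 1]
y_control = [y for y, t in zip(outcome, treated) if t == 0]
crude = sum(y_treated) / len(y_treated) - sum(y_control) / len(y_control)
adjusted = stratified_difference(ps, treated, outcome)
print(round(crude, 2), round(adjusted, 2))
```

The crude comparison mixes the treatment effect with the PS-related difference in baseline risk. Comparing within strata removes that part in this constructed example.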

I’m having trouble staying awake.

Matching
Matching on the propensity score eliminates a greater amount of treatment selection bias than does stratification.
Matching is on the logit of the propensity score.
Match within a caliper (usually 0.2 standard deviations of the logit of the score)
Can additionally match on other important prognostic factors.

Propensity score matching
Create a matched sample
Assess balance between treated and untreated subjects in the matched sample.
Repeat iteratively until acceptable balance is achieved.
Estimate the treatment effect in the matched sample.
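Creating the matched sample can be illustrated with a small greedy 1:1 matcher on the logit of the propensity score. This is a sketch under assumptions, not the gmatch macro: the function name `greedy_match` is hypothetical, treated subjects are processed in input order, and the caliper uses the standard deviation of the logit over the whole sample as a simplification.

```python
import math
import statistics

def logit(p):
    return math.log(p / (1.0 - p))

def greedy_match(ps, treated, caliper_sd=0.2):
    """1:1 greedy nearest-neighbour matching without replacement on logit(PS).
    Returns a list of (treated_index, control_index) pairs."""
    lps = [logit(p) for p in ps]
    caliper = caliper_sd * statistics.pstdev(lps)
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i in [k for k, t in enumerate(treated) if t == 1]:
        if not controls:
            break
        j = min(controls, key=lambda c: abs(lps[c] - lps[i]))
        if abs(lps[j] - lps[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)   # each control is used at most once
        # otherwise the treated subject stays unmatched and is discarded
    return pairs

# Hypothetical propensity scores: three treated, four untreated subjects.
ps = [0.30, 0.50, 0.90, 0.31, 0.52, 0.10, 0.49]
treated = [1, 1, 1, 0, 0, 0, 0]
pairs = greedy_match(ps, treated)
print(pairs)
```

Here the treated subject with PS 0.90 has no untreated subject within the caliper and is left unmatched, which shows how matching discards patients who are not realistic candidates for both treatments.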

The test of a good propensity score model is how well it balances the measured variables between treated and untreated subjects.

Assessing Balance Between Treatment Groups
Significance testing is not appropriate
  Confounded with sample size
  The matched set is smaller than the original sample, so it is less likely to show “significant” differences
  Samples based on admin data are often so large that even clinically meaningless differences are statistically significant

Standardized Differences for Comparing Balance
d = 100 × (treatment – control difference) / (pooled standard deviation)
No uniformly agreed-on criterion for assessing standardized differences; many authors use a threshold of 10%
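As a small worked example of the standardized difference for a continuous covariate, the sketch below uses made-up ages and assumes the pooled standard deviation is the square root of the average of the two group variances (one common convention; the slide does not spell out which pooling is meant).

```python
import math
import statistics

def standardized_difference(treated_values, control_values):
    """100 * (difference in means) / pooled SD,
    with pooled SD taken as sqrt((s_treated^2 + s_control^2) / 2)."""
    m1 = statistics.mean(treated_values)
    m0 = statistics.mean(control_values)
    v1 = statistics.variance(treated_values)
    v0 = statistics.variance(control_values)
    return 100.0 * (m1 - m0) / math.sqrt((v1 + v0) / 2.0)

# Hypothetical ages in the two treatment groups.
age_treated = [70, 72, 74, 76, 78]
age_control = [69, 71, 73, 75, 77]
d = standardized_difference(age_treated, age_control)
print(round(d, 1))
```

The groups differ by one year in mean age with a pooled SD of about 3.2 years, so d is about 32%, well above the commonly used 10% threshold. Unlike a p-value, d does not shrink or grow with the sample size.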

For unbalanced variables, add interactions with other variables in the model or higher-order terms (e.g. age²) to the propensity score model, and re-calculate

Analyses of Matched Data Must Incorporate the Matching
Outcome / model      | Analysis
Means                | Paired t-test
Proportions          | McNemar’s test
Survival models      | Stratify on matched pairs
Logistic regression  | GEE estimation to account for matched pairs
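To make the paired analysis of proportions concrete, here is a minimal sketch of the McNemar statistic (without continuity correction) for a binary outcome in matched pairs. The pair counts are invented and `mcnemar_statistic` is a hypothetical helper name.

```python
def mcnemar_statistic(pairs):
    """pairs: list of (treated_outcome, control_outcome) with 0/1 outcomes.
    Only discordant pairs carry information about the treatment difference."""
    b = sum(1 for yt, yc in pairs if yt == 1 and yc == 0)  # event in treated only
    c = sum(1 for yt, yc in pairs if yt == 0 and yc == 1)  # event in control only
    if b + c == 0:
        raise ValueError("no discordant pairs")
    chi_square = (b - c) ** 2 / (b + c)                    # 1 degree of freedom
    return b, c, chi_square

# Invented matched sample: 50 pairs, 20 of them discordant.
matched = [(1, 0)] * 15 + [(0, 1)] * 5 + [(1, 1)] * 10 + [(0, 0)] * 20
b, c, chi_square = mcnemar_statistic(matched)
print(b, c, chi_square)
```

The statistic is compared with a chi-square distribution on 1 degree of freedom (critical value 3.84 at the 5% level). The 30 concordant pairs do not enter the calculation.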

Matching pairs patients who are similar with respect to covariates, matching on many confounders simultaneously
Unmatched individuals are discarded
The resulting matched sample may not be representative of all patients receiving treatment
Compares patients who are all potential candidates for both treatments.

Interpretation
Result of a matched propensity score analysis: an aggregate estimate of the treatment effect, if it were applied to the treated population.
Estimates the ATT (average treatment effect for the treated)

Inverse Probability of Treatment Weighting Using the Propensity Score
Weights
Treated: $W = \dfrac{1}{PS}$
Untreated: $W = \dfrac{1}{1 - PS}$
Subjects are weighted by the inverse of the probability of receiving the treatment that was actually received.

The Idea
Create two datasets: one for each treatment group.
Everyone contributes to both datasets
In each dataset, there are people for whom information is missing:
  effect of treatment A is missing for people who received B
  effect of treatment B is missing for people who received A

The Idea
Dataset to estimate the effect of treatment A (treatment = 1)
Estimated average effect of treatment A = $\dfrac{1}{N} \sum_{i=1}^{N} \dfrac{\text{treatment}_i \times Y_i}{PS_i}$

ID | Treatment group (A=1; B=0) | PS  | Weight          | Outcome | Effect of treatment A
1  | A                          | 0.2 | 1/.2 = 5        | Y1      |
2  | A                          | 0.5 | 1/.5 = 2        | Y2      |
3  | B                          | 0.3 | 1/(1-.3) = 1.4  | Y3      | ??
4  | B                          | 0.4 | 1/(1-.4) = 1.7  | Y4      | ??
5  | A                          | 0.8 | 1/.8 = 1.3      | Y5      |

The Idea
Dataset to estimate the effect of treatment B (treatment = 0)
Estimated average effect of treatment B = $\dfrac{1}{N} \sum_{i=1}^{N} \dfrac{(1 - \text{treatment}_i) \times Y_i}{1 - PS_i}$

ID | Treatment group (A=1; B=0) | PS  | Weight          | Outcome | Effect of treatment B
1  | A                          | 0.2 | 1/.2 = 5        | Y1      | ??
2  | A                          | 0.5 | 1/.5 = 2        | Y2      | ??
3  | B                          | 0.3 | 1/(1-.3) = 1.4  | Y3      |
4  | B                          | 0.4 | 1/(1-.4) = 1.7  | Y4      |
5  | A                          | 0.8 | 1/.8 = 1.3      | Y5      | ??

The Idea
The treatment difference is obtained by taking the difference in means or difference in proportions
However, the estimate of the variance is not as straightforward.
Estimated difference (treatment A – treatment B) = $\dfrac{1}{N} \sum_{i=1}^{N} \dfrac{\text{treatment}_i \times Y_i}{PS_i} - \dfrac{1}{N} \sum_{i=1}^{N} \dfrac{(1 - \text{treatment}_i) \times Y_i}{1 - PS_i}$
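The weighted means can be traced through by hand with the five subjects from the tables. In the sketch below the treatment groups and propensity scores are those shown on the slides, but the outcome values Y1 to Y5 are invented, and `iptw_estimates` is a hypothetical helper name.

```python
def iptw_estimates(treatment, ps, outcome):
    """Inverse probability of treatment weighting.
    treatment: 1 = treatment A, 0 = treatment B; ps = P(treatment A | covariates)."""
    n = len(outcome)
    # Weight = 1 / probability of the treatment actually received.
    weights = [1.0 / p if t == 1 else 1.0 / (1.0 - p)
               for t, p in zip(treatment, ps)]
    mean_a = sum(t * y / p for t, p, y in zip(treatment, ps, outcome)) / n
    mean_b = sum((1 - t) * y / (1.0 - p)
                 for t, p, y in zip(treatment, ps, outcome)) / n
    return weights, mean_a, mean_b, mean_a - mean_b

treatment = [1, 1, 0, 0, 1]            # subjects 1, 2, 5 received A; 3, 4 received B
ps = [0.2, 0.5, 0.3, 0.4, 0.8]
outcome = [10.0, 12.0, 8.0, 9.0, 11.0]  # invented values standing in for Y1..Y5

weights, mean_a, mean_b, difference = iptw_estimates(treatment, ps, outcome)
print([round(w, 2) for w in weights], round(mean_a, 2), round(mean_b, 2))
```

With only five subjects the weights in each group do not add up to N, so the two weighted means are not meaningful as estimates; the example is only meant to show the arithmetic. In a realistically large sample each group's weights sum to roughly N, and the two means estimate the outcome had everyone received A or everyone received B.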

What are the Weights Doing
In dataset 1 we are missing information about the effect of treatment A for people who received B.
If Mr. X, who received A, had a low (e.g. 20%) probability of getting A, there must be 4 similar people who received B.
If they had received A, we expect their outcome would be the same as Mr. X’s outcome.
We impute the missing outcome for these people using Mr. X’s outcome.
Mr. X’s weight is 1/0.2 = 5. He represents 5 people on treatment A (himself plus four).

In dataset 1 we are missing information about the effect of treatment A for people who received B.
If Ms. Y, who received A, had a higher (e.g. 50%) probability of getting A, there should be only 1 similar person who received B.
If this person had received A, we expect their outcome would be the same as Ms. Y’s outcome.
We impute the missing outcome for this person using information from Ms. Y.
Ms. Y’s weight is 1/.5 = 2. She represents 2 people on treatment A (herself plus one).

For each person with a high propensity to receive treatment A, there are only a few people who actually got B.
There are only a few people for whom we do not know their outcome if they had received treatment A.
Each high-PS person who received treatment A represents only a small number of people. Their weight is low.
And vice versa.

In dataset 2 we are missing information about the effect of treatment B for people who received A.
If Mr. Z, who received B, had a large PS (80%), he had a low probability of receiving B.
His weight should be high, and it is: weight = 1 / (1-PS) = 1/.2 = 5.
When it comes to dataset 2, Mr. Z represents himself plus 4 people who got treatment A, whose outcome on treatment B is missing.

In effect, we have 2 datasets: one used to estimate the effect of treatment A and the other used to estimate the effect of treatment B.
Both are standardized to the overall dataset, so there is no longer any confounding by the measured covariates.
We can directly compare outcomes between the two treatment groups in the weighted samples.
See references for methods.

Interpretation
Result of an inverse probability weighted propensity score analysis is an aggregate estimate of the treatment effect, if it were applied to the entire population
Estimates the ATE (average treatment effect)

Unusual individuals (treated, but don’t fit the description of those usually treated: small PS) have high weights
Unusual individuals (not treated, but look a lot like people who are usually treated: low value for (1-PS)) have high weights
May trim high weights.

IPW is Related to Survey Weights
In the CCHS, we oversample people from rural areas. In order to obtain a population estimate, we upweight the responses from urban respondents and downweight the responses from rural people.

References
Peter C. Austin. An introduction to propensity score methods for reducing the effects of confounding in observational studies. Multivariate Behavioral Research, 46:399–424, 2011
Peter C. Austin. A tutorial and case study in propensity score analysis: An application to estimating the effect of in-hospital smoking cessation counseling on mortality. Multivariate Behavioral Research, 46:119–151, 2011

The gmatch macro (greedy matching)
Created by Jon Kosanke and Erik Bergstralh

It’s magic

Well, almost magic
Makes no claims to balance unmeasured covariates.
Removes hidden biases only to the extent that unmeasured variables are correlated with the measures used to compute the score.

Drawbacks to the Propensity Score
Dataset probably missing key covariates (e.g. living arrangements, smoking)
Definition of the baseline time may be difficult (should be the time at which the decision about treatment was made)
Does not eliminate the need to think about patient identification/selection

Advantages of the Propensity Score
Reduced dimensionality of covariates (useful if the outcome is rare).
If matching is used, can demonstrate that the two treatment groups are similar (on all measured covariates).
If inverse weighting is used, the two groups are similar on all measured covariates.

What question is the propensity score analysis answering
Impact of providing a given treatment to the entire population (IPW)
Impact of treatment on the treated population (matching)

What Questions are not Answered
Does not predict the outcome for a person with a given set of characteristics
Does not tell you the role of the other covariates in predicting the outcome (e.g. are older patients more likely to have a stroke); this is what regression does.
Does not tell you who will benefit most from a given treatment.

Instrumental Variable Analysis
Controls for hidden bias as well as overt bias
An IV has two characteristics
  It is highly correlated with treatment
  It does not affect the outcome other than through its effect on the treatment (it is not associated with measured or unmeasured patient health)
The “coin toss” used in an RCT is the perfect IV.

Types of Instrumental Variables
Medication
IV is the type of medication given by the same physician to the previous patient
To the extent that physicians favor a particular medication to treat a condition, the medication given to the previous patient is related to the medication given to the patient we are interested in.
The outcome for our patient is unlikely to be affected by the previous patient.

Types of Instrumental Variables
Geographical
Effect of invasive cardiac treatment on survival after an AMI
IV is the rate of angiography in the patient’s area of residence
Angiography availability → probability that a patient will have an angiography → probability of a bypass or PCI
Not related to AMI severity, so not related to patient outcome (patients in different regions were shown to have similar severity)

Analysis
Generic regression equation: $\text{outcome}_i = \alpha + \beta\,\text{treatment}_i + \varepsilon_i$
Due to treatment selection bias, treatment is correlated with unmeasured factors, whose effects end up in the error term.
A change in treatment affects the outcome in two ways:
  Because the treatment has changed
  Because other factors, contained in the error term (e.g., age, comorbidities), have changed

Analysis
Replacing the treatment with an instrumental variable: $\text{outcome}_i = \alpha_{IV} + \beta_{IV}\,IV_i + \text{error}_{IV,i}$
By definition, the IV is not associated with any confounding variables.
If changes to the IV are associated with changes in the outcome, this can only be due to the instrument’s correlation with the treatment.

Two Stage Least Squares Regression
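Stage 1: estimate the part of the treatment choice that is uncorrelated with the confounding variables, i.e. the part that is related to the IV
  $\text{treatment} = \alpha + \beta\,IV + \text{error}$
Stage 2: regress the outcome on the treatment predicted from Stage 1
  $\text{outcome} = \alpha + \beta_{TSLS}\,\widehat{\text{treatment}} + \text{error}$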
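The two stages can be sketched on simulated data. Everything in this example is an assumption made for illustration: a binary instrument, one unmeasured confounder, a true treatment effect of 2, and the helper names `ols` and `two_stage_least_squares`.

```python
import random

def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def two_stage_least_squares(iv, treatment, outcome):
    a1, b1 = ols(iv, treatment)                 # Stage 1: treatment on IV
    predicted = [a1 + b1 * z for z in iv]       # IV-driven part of treatment
    _, beta_tsls = ols(predicted, outcome)      # Stage 2: outcome on prediction
    return beta_tsls

random.seed(0)
iv, treatment, outcome = [], [], []
for _ in range(20000):
    z = random.randint(0, 1)                    # instrument, independent of u
    u = random.gauss(0, 1)                      # unmeasured confounder
    t = 1 if z + u > 0.5 else 0                 # treatment depends on both
    y = 2.0 * t + u + random.gauss(0, 1)        # true treatment effect = 2
    iv.append(z)
    treatment.append(t)
    outcome.append(y)

naive_slope = ols(treatment, outcome)[1]        # confounded by u
tsls_slope = two_stage_least_squares(iv, treatment, outcome)
print(round(naive_slope, 2), round(tsls_slope, 2))
```

The naive regression overstates the effect because the confounder raises both the chance of treatment and the outcome, while the two-stage estimate lands near the simulated value of 2. Standard errors read off the second-stage regression are not valid as they stand; dedicated IV routines correct for the first stage.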

Cannot prove that something is a good IV.
Can explore, but not prove, that the groups are similar in unmeasured patient characteristics.
Estimates the treatment effect on the “marginal” population: patients who would receive angiography if they lived in an area with higher rates but not if they lived in an area with lower rates.

Difficulty in identifying a good IV
  IV only weakly associated with treatment → very high standard errors
  IV associated with the outcome other than through the treatment → biased estimate

References
Stukel TA, Fisher ES, Wennberg DE, Alter DA, Gottlieb DJ, Vermeulen MJ. Analysis of Observational Studies in the Presence of Treatment Selection Bias: Effects of invasive cardiac management on AMI survival using propensity score and instrumental variable methods. JAMA. 2007; 297(3):
A series of great YouTube videos by Ben Lambert:
