Causation? Tim Wiemken, PhD MPH CIC, Assistant Professor, Division of Infectious Diseases, University of Louisville, Kentucky.

Presentation transcript:

Overview: 1. Testing for an Association 2. Other Measures of Association 3. Confidence Intervals

Testing for Association: 1. Develop hypothesis. Null hypothesis: There is no association. Alternative hypothesis: There is an association.

Testing for Association: 2. Choose your level of significance (α value). What P-value will you consider statistically significant? Usually 0.05, though there are arguments for bigger or smaller values.

Testing for Association: 3. Choose your test. Call your statistician. A bad test gives bad results. A good test may give bad results (bad data?). A good statistician may tell you if the results are bad, but cannot always tell you if the data were bad.

Testing for Association: Chi-squared Test. Will tell you if there is an association between two variables. Measures observed versus expected counts in study groups. Must have adequate sample size.

Testing for Association: Chi-squared Test. 2x2 table (categorical data):

              Outcome +   Outcome -
Predictor +
Predictor -

Example Research question: Does HIV impact mortality in hospitalized patients with community-acquired pneumonia?

Example: Does HIV have an effect on patient in-hospital mortality? Hospitalized CAP patients are split into HIV+ and HIV-, and each group into dead and alive. Predictor variable: ? Outcome variable: ?

Example: Does HIV have an effect on patient in-hospital mortality? Significance level? Null hypothesis? What test?

Example: Does HIV have an effect on patient in-hospital mortality? Counts needed: + HIV, + died; + HIV, - died; - HIV, + died; - HIV, - died. They fill the 2x2 table:

              Outcome +   Outcome -
Predictor +
Predictor -

Example: Does HIV have an effect on patient in-hospital mortality? How many patients died in-hospital? n=27

Example: Does HIV have an effect on patient in-hospital mortality? How many patients had HIV? n=30

Example: Does HIV have an effect on patient in-hospital mortality?

          Dead +   Dead -   Total
HIV+                        n=30
HIV-
Total     n=27              n=100

Example: Does HIV have an effect on patient in-hospital mortality? How many patients with HIV died? Count the number of deaths (column B, in_hosp_mort=1) that had HIV (column Z, hiv=1): =COUNTIFS(B2:B101, 1, Z2:Z101, 1)

Example: Does HIV have an effect on patient in-hospital mortality? Fill in the remaining cells from the totals (died n=27, HIV n=30, all patients n=100):
HIV+, Dead +: 11
HIV-, Dead +: 27 - 11 = 16
HIV+, Dead -: 30 - 11 = 19. Check this! =COUNTIFS(B2:B101, 0, Z2:Z101, 1)
HIV-, Dead -: 100 - (11 + 16 + 19) = 54

          Dead +   Dead -
HIV+        11       19
HIV-        16       54

Plug the data into your Excel stats program.
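The cell arithmetic on these slides can be sketched in Python as a cross-check on the spreadsheet. This is only an illustration: the function name `fill_two_by_two` and its parameter names are made up here, and the inputs are the totals quoted on the slides (n=100, 27 deaths, 30 with HIV, 11 deaths among those with HIV).

```python
def fill_two_by_two(n_total, n_outcome, n_exposed, n_both):
    """Derive all four cells of a 2x2 table from its margins and one counted cell.

    Returns [[exposed & outcome, exposed & no outcome],
             [unexposed & outcome, unexposed & no outcome]].
    """
    exposed_with = n_both
    unexposed_with = n_outcome - n_both
    exposed_without = n_exposed - n_both
    unexposed_without = n_total - (exposed_with + unexposed_with + exposed_without)
    cells = (exposed_with, exposed_without, unexposed_with, unexposed_without)
    if min(cells) < 0:
        raise ValueError("margins are inconsistent with the counted cell")
    return [[exposed_with, exposed_without],
            [unexposed_with, unexposed_without]]


# Totals from the slides: 100 patients, 27 deaths, 30 with HIV, 11 HIV+ deaths.
table = fill_two_by_two(n_total=100, n_outcome=27, n_exposed=30, n_both=11)
print(table)  # [[11, 19], [16, 54]]
```

Counting one more cell directly, as the "Check this!" formula does, is still worthwhile because the subtraction only works if the margins themselves are right.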

Example: Does HIV have an effect on in-hospital mortality? No! P=0.154, which is above 0.05, so the association is not statistically significant.
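For readers without the Excel stats program, the following is a minimal sketch of a chi-squared test on the HIV table, using only the Python standard library. It assumes the uncorrected Pearson statistic (no continuity correction), which may or may not be what the course spreadsheet uses; the function name `chi_squared_2x2` is hypothetical. The P-value step relies on the fact that, with 1 degree of freedom, the chi-squared statistic is a squared standard normal.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test for a 2x2 table, without continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of no association.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


hiv_table = [[11, 19], [16, 54]]  # rows: HIV+, HIV-; columns: Dead +, Dead -
chi2, p_value = chi_squared_2x2(hiv_table)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

Under these assumptions the result agrees with the P=0.154 reported on the slide.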

Where to publish? Example

Example: Maybe those without HIV are older than those with HIV, so the mortality ends up the same. How do we check this?

Example: Is age different in patients with and without HIV? Null hypothesis: The ages of patients with and without HIV are NOT different. Alternative hypothesis: The ages of patients with and without HIV ARE different.

Example: Back to your dataset! You need the total cases, mean age, and SD of age for HIV, and the same three numbers for non-HIV.

Example: Total cases. HIV: =COUNTIF(Z2:Z101,1). Non-HIV: =COUNTIF(Z2:Z101,0).

Example: Average age. HIV+: =AVERAGEIF(Z2:Z101,1,AN2:AN101). HIV-: =AVERAGEIF(Z2:Z101,0,AN2:AN101).

Example: Standard deviations... not as easy. You need an array formula with a nested IF. HIV+: =STDEV(IF(Z2:Z101=1,AN2:AN101)). DON'T HIT ENTER! On Windows: Control+Shift+Enter. On Mac: Command+Enter.
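As a point of comparison, the conditional count, mean, and standard deviation can be sketched in Python with the standard `statistics` module. The rows below are hypothetical toy values, not the course dataset, and the helper `ages_where` is made up for illustration. `statistics.stdev` computes the sample standard deviation (n - 1 denominator), the same quantity Excel's STDEV returns.

```python
import statistics

# Hypothetical toy rows of (hiv flag, age); NOT the course dataset.
rows = [(1, 42), (0, 61), (1, 55), (0, 58), (1, 50), (0, 49)]


def ages_where(rows, hiv_flag):
    """Return the ages of patients whose hiv column equals hiv_flag."""
    return [age for hiv, age in rows if hiv == hiv_flag]


for flag, label in ((1, "HIV+"), (0, "HIV-")):
    ages = ages_where(rows, flag)
    # Count, mean, and sample SD (n - 1 denominator) for this group.
    print(label, len(ages), statistics.mean(ages), round(statistics.stdev(ages), 2))
```

The list comprehension plays the role of the nested IF: it filters the age column by the HIV column before the summary function is applied.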

Example: Back to your stats program!
HIV: total cases = 30; mean age: 50.3; SD age:
Non-HIV: total cases = 70; mean age: 56.5; SD age:

Example: Is age different in patients with and without HIV? NO! P>0.05. BUT IT IS SOOOOO CLOSE!

Overview: 1. Testing for an Association 2. Other Measures of Association 3. Confidence Intervals

Measures of Association: 1. Risk Ratio. Used for cohort studies or clinical trials. Gold standard measure for observational studies. Answers: How much more (less) likely is this group to get an outcome versus this other group?

Example: Do those admitted to the ICU die more than those not admitted to the ICU? Use the 2x2 Totals tab.
Total with outcome: =COUNTIF(B2:B101,1), n=27
Total without outcome: 100 - 27, n=73
Total with outcome in the ICU: =COUNTIFS(B2:B101,1,I2:I101,1), n=9
Total without outcome in the ICU: =COUNTIFS(B2:B101,0,I2:I101,1), n=7

Example: Do those admitted to the ICU die more than those not in the ICU? Fill in the ICU- row from the totals: Dead + is 27 - 9 = 18 and Dead - is 73 - 7 = 66. P=0.004.

          Dead +   Dead -
ICU+         9        7
ICU-        18       66

Example: How much more likely are those admitted to the ICU to die?
Risk of death in ICU group: 9 / (9 + 7) = 56.3%
Risk of death in non-ICU group: 18 / (18 + 66) = 21.4%
Risk Ratio: 0.563 / 0.214 = 2.63

Example: Interpret the Risk Ratio. Who wants to interpret a risk ratio of 2.63? Patients admitted to the ICU are 2.63 times as likely to die as patients not admitted to the ICU.
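The risk ratio arithmetic for the ICU example can be sketched as follows. The function name `risk_ratio` and its argument order are illustrative assumptions; the counts are the ones on the slides (9 dead and 7 alive in the ICU, 18 dead and 66 alive outside it).

```python
def risk_ratio(exposed_with, exposed_without, unexposed_with, unexposed_without):
    """Return (risk in exposed, risk in unexposed, risk ratio) for a cohort 2x2 table."""
    risk_exposed = exposed_with / (exposed_with + exposed_without)
    risk_unexposed = unexposed_with / (unexposed_with + unexposed_without)
    return risk_exposed, risk_unexposed, risk_exposed / risk_unexposed


# ICU+: 9 dead, 7 alive. ICU-: 18 dead, 66 alive.
risk_icu, risk_non_icu, rr = risk_ratio(9, 7, 18, 66)
print(risk_icu, risk_non_icu, rr)
```

Without intermediate rounding the ratio is 2.625, which the slide reports as 2.63 after rounding the two risks first.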

Example

Example: Does empiric atypical pathogen coverage have an effect on patient mortality? CAP patients are split into empiric atypical pathogen coverage and no empiric atypical pathogen coverage, and each group into dead and alive.

Example: Assuming a cohort study… Do those patients who have empiric atypical pathogen coverage die less often than those without atypical coverage? Counts given: + Atypical, - Atypical, + Atypical + died, and - Atypical + died (110). They fill the 2x2 table:

              Outcome +   Outcome -
Predictor +
Predictor -

Example: Interpret the Risk Ratio. Anyone?? Those with atypical coverage are 42% less likely to die compared to those without atypical coverage.

Example: What does that mean? Remember your baseline risk. Assuming 8% of CAP patients die, what is the risk of death with empiric atypical pathogen coverage? Just multiply the original risk by the risk ratio: 8% x 0.58 = 4.64%.
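A one-line sketch of the baseline-risk calculation, assuming the 8% baseline is the risk in patients without atypical coverage (the slide does not say so explicitly). The function name `risk_with_exposure` is hypothetical.

```python
def risk_with_exposure(baseline_risk, rr):
    """Risk in the exposed group, given the unexposed (baseline) risk and the risk ratio."""
    return baseline_risk * rr


# 8% baseline mortality and a risk ratio of 0.58, as on the slide.
treated_risk = risk_with_exposure(0.08, 0.58)
print(f"{treated_risk:.2%}")
```

The same risk ratio means very different absolute risks at different baselines, which is why the slide asks you to remember the baseline.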

Example: Even better, the Number Needed to Treat (NNT).
NNT = 1 / Absolute Risk Reduction (ARR)
ARR = unexposed risk - exposed risk = risk without atypical coverage - risk with atypical coverage
Unexposed risk = 16.7%; exposed risk = 9.8%
NNT = 1 / (16.7% - 9.8%) = 1 / 0.069 = 14.5, so 15 (round up!)
Need to treat 15 patients to save 1.
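The NNT calculation can be sketched as below, with risks entered as proportions rather than percentages so that 1 / ARR comes out in patients. The function name `number_needed_to_treat` is an illustrative choice, and the error for a non-positive ARR is an assumption about how one might handle a treatment that shows no benefit.

```python
import math


def number_needed_to_treat(unexposed_risk, exposed_risk):
    """Return (absolute risk reduction, NNT rounded up to a whole patient)."""
    arr = unexposed_risk - exposed_risk
    if arr <= 0:
        raise ValueError("no risk reduction, so NNT is undefined")
    return arr, math.ceil(1 / arr)


# Risks from the slides: 16.7% without atypical coverage, 9.8% with it.
arr, nnt = number_needed_to_treat(0.167, 0.098)
print(f"ARR = {arr:.3f}, NNT = {nnt}")
```

Rounding up is the conservative choice: treating 14 patients would, on average, fall just short of preventing one death.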

Measures of Association: 2. Odds Ratio. Used for case-control studies. Is an approximation of the risk ratio. Answers: How much more (less) likely are those with the outcome to have been in this group versus this other group?

Measures of Association: 2. Odds Ratio. Only a good approximation when the outcome is rare. Can be an extremely bad approximation. Can correct with a formula: Zhang, J., & Yu, K. F. (1998). What's the relative risk? A method of correcting the odds ratio in cohort studies of common outcomes. JAMA, 280(19).
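The correction in the cited paper is usually written as RR = OR / ((1 - P0) + P0 x OR), where P0 is the risk of the outcome in the unexposed group. The sketch below applies it to the ICU counts from the risk ratio example; the function name `odds_ratio_to_risk_ratio` is hypothetical, and the slides themselves do not show this calculation.

```python
def odds_ratio_to_risk_ratio(odds_ratio, unexposed_risk):
    """Zhang and Yu style correction: convert an odds ratio to an approximate risk ratio.

    unexposed_risk is the incidence of the outcome in the unexposed group (P0).
    """
    return odds_ratio / ((1 - unexposed_risk) + unexposed_risk * odds_ratio)


# ICU example counts: 9 dead / 7 alive in the ICU, 18 dead / 66 alive outside it.
icu_odds_ratio = (9 * 66) / (7 * 18)   # cross-product odds ratio
non_icu_risk = 18 / (18 + 66)          # P0, a common outcome at about 21%
corrected = odds_ratio_to_risk_ratio(icu_odds_ratio, non_icu_risk)
print(icu_odds_ratio, corrected)
```

With an outcome this common, the odds ratio (about 4.7) is far from the risk ratio (about 2.6). For a crude 2x2 table the correction recovers the risk ratio exactly; its practical use is with adjusted odds ratios from a model, where it is only approximate.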

Example: Acinetobacter outbreak. Need to identify the risk factors. You gather information from 100 patients with Acinetobacter and 200 patients without. The sample is selected based on the outcome (Acinetobacter).

Key: Because the sample was selected based on the outcome (a subset of everyone who might get the outcome in your population), you can never know the actual incidence of the outcome in everyone who was exposed.

Cohort study sample: everyone exposed and everyone not exposed, followed to the outcome.

Case-control study sample: a subset with the outcome and a subset without the outcome, compared on exposure status. Cannot know everyone exposed who gets the outcome.

Example: Analyze a number of risk factors to see if they are associated with Acinetobacter infection.

Example: Outbreak investigation. Was having a traumatic wound associated with Acinetobacter baumannii infection? Assuming a case-control study…
+ Acinetobacter: 100
- Acinetobacter: 200
+ Acinetobacter + wound: 55
- Acinetobacter + wound: 10

           Acinetobacter +   Acinetobacter -
Wound +          55                10
Wound -          45               190

Example: Interpret the Odds Ratio. Anyone?? Those with Acinetobacter have about 23 times the odds of having a nonsurgical wound compared to those without Acinetobacter.
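The odds ratio behind that interpretation can be sketched from the counts given for the outbreak: 100 cases of whom 55 had a wound, and 200 controls of whom 10 had a wound. The function name `odds_ratio` and its argument order are illustrative.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = cases_exposed / cases_unexposed
    odds_controls = controls_exposed / controls_unexposed
    return odds_cases / odds_controls


# 55 of 100 cases and 10 of 200 controls had a wound.
wound_or = odds_ratio(55, 100 - 55, 10, 200 - 10)
print(round(wound_or, 1))
```

Note the direction of the comparison: the calculation starts from the outcome groups (cases versus controls) and asks about exposure, which is why the interpretation is phrased that way round.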

Example: Interpret the Odds Ratio. What? Order of interpretation in the 2x2 table (Outcome +/- in the columns, Predictor +/- in the rows):

Risk Versus Odds: So what's the difference? How you choose your population. Risk: you know the incidence of the outcome. Odds: you don't know the incidence of the outcome.

Risk Versus Odds: With odds, you can't identify the likelihood of someone with a predictor getting an outcome, because you don't know everyone who had the outcome.

Risk Versus Odds: Correct the odds. Common outcomes = odds is a poor approximation of risk.

Risk Versus Odds: Even Chuck Norris hates odds.

Measures of Association: 3. Hazard Ratio. Used for time-to-event data. As good as the risk ratio. Answers: How much more (less) likely are those in this group to get the outcome versus this other group at any given time?

Overview: 1. Testing for an Association 2. Other Measures of Association 3. Confidence Intervals

Confidence Intervals: Sampling takes you from the patients in the universe to the patients in the sample. Generalizing takes you from the sample back to the universe.

Confidence Intervals: The P-value is not good. It uses an arbitrary cutoff (0.05), doesn't give info on precision, and doesn't help you generalize. Fix: use a confidence interval.

Confidence Intervals: Definition of a 95% CI. You are 95% confident that the risk (odds) of the patients in the universe lies within that interval. "Universe" is not everyone in the world; it is everyone you can generalize back to. If the CI includes 1, that measure of association is not statistically significant (like a P-value >0.05). 'Tighter' CI = more power, more precision, larger sample.

Confidence Intervals: Caveat. Since the CI gets tighter with more people in the sample, every measure of association (except exactly 1) will eventually be significant with a large enough sample size.

Confidence Intervals: Is the RR from the bacteremia example (a 2x2 table of Bacteremia +/- by Dead +/-) statistically significant? Risk Ratio: 1.19; 95% CI: (0.83, 1.72). No: the 95% confidence interval includes 1.

Example: What happens as we increase the sample size, using the same proportions of predictors and outcomes?

Sample Size: Now is the RR from the bacteremia example statistically significant? Risk Ratio: 1.19 (same as before); 95% confidence interval: (1.05, 1.36). Yes: the 95% CI does not include 1.

Sample Size: What happens as we increase the sample size? The confidence interval becomes tighter. Assuming the proportion of patients in each group stays the same, the risk ratio eventually becomes statistically significant. This is because the power you have to detect that effect size has increased.
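The tightening can be illustrated with a short sketch. The bacteremia counts did not survive in this transcript, so the example reuses the HIV table from earlier (11/19 and 16/54) and multiplies every cell by 4. The interval uses the common log-scale (Katz) approximation, which is an assumption here; the slides do not say how their intervals were computed. The function name `risk_ratio_ci` is hypothetical.

```python
import math


def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio with an approximate 95% CI computed on the log scale.

    a, b: exposed with / without the outcome; c, d: unexposed with / without.
    """
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


small = risk_ratio_ci(11, 19, 16, 54)    # original HIV table, n=100
large = risk_ratio_ci(44, 76, 64, 216)   # same proportions, n=400

for label, (rr, lower, upper) in (("n=100", small), ("n=400", large)):
    print(f"{label}: RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The risk ratio is identical in both runs, but only the larger sample gives an interval that excludes 1, mirroring what the bacteremia slides show.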

Sample Size: What happens as we increase the sample size? The larger your sample, the closer you are to actually sampling the entire universe. Therefore, your confidence interval is tighter and closer to "the truth in your universe."

Sample Size: This makes sense. The more people in your study, the closer you are to having the universe as your sample. Therefore your statistic should be pretty close to the 'truth in the universe'.

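Confidence Intervals: From the patients in the universe to the patients in the sample: sampling (easy). From the sample back to the universe: generalizing (hard).

Confidence Intervals: From the patients in the universe to the patients in the sample: sampling (hard). From the sample back to the universe: generalizing (easy).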