Basic epidemiologic analysis with Stata Biostatistics 212 Lecture 5.


Housekeeping
Questions about Lab 4?
Lab 3 issues
  – Categorizing continuous variables (21-30 vs. 20-29)
  – Include p-values when appropriate
  – Don’t forget the missing values! Check your work with a cross-tabulation, e.g. tab genderp female, missing
Next week we’ll start with the Final Project!
  – What data will you use?
  – Explore and clean your data
  – Start planning tables and figures

Today...
What’s the difference between epidemiologic and statistical analysis?
Interaction and confounding with 2 x 2’s
Stata’s “Epitab” commands
Adjusting for many things at once
Logistic regression
Testing for trends

Epi vs. Biostats
Epidemiologic analysis
  – Analyzing and interpreting clinical research data in the context of scientific knowledge
Biostatistical analysis
  – Evaluating the role of chance

Epi vs. Biostats
Epi
  – Confounding, interaction, and causal diagrams
  – What to adjust for?
  – What do the adjusted estimates mean?
[Figure: causal diagrams relating variables A, B, and C]

2 x 2 Tables
“Contingency tables” are the traditional analytic tool of the epidemiologist

               Outcome +    Outcome -
Exposed            a            b
Unexposed          c            d

OR = (a/b) / (c/d) = ad/bc
RR = [a/(a+b)] / [c/(c+d)]
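
The two formulas above can be checked by hand. The following is a minimal Python sketch, not part of the lecture: the function names and the cell counts are hypothetical, and the cells follow the slide's layout (a, b = exposed with and without the outcome; c, d = unexposed with and without the outcome).

```python
def odds_ratio(a, b, c, d):
    """Odds of the outcome in the exposed divided by odds in the unexposed."""
    return (a / b) / (c / d)


def risk_ratio(a, b, c, d):
    """Risk of the outcome in the exposed divided by risk in the unexposed."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts, NOT the lecture's data.
a, b = 30, 70   # exposed: with outcome, without outcome
c, d = 15, 85   # unexposed: with outcome, without outcome

print(f"OR = {odds_ratio(a, b, c, d):.2f}")
print(f"RR = {risk_ratio(a, b, c, d):.2f}")
```

With an outcome this common the OR sits noticeably further from 1 than the RR, which is worth remembering when comparing cs and cc output.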

2 x 2 Tables
Example: binge drinking (exposure) and coronary calcium (outcome)
OR = 2.1 (1.6 – 2.7)
RR = 1.9 (1.6 – 2.4)

2 x 2 Tables
There is a statistically significant association, but is it causal?
Does male gender confound the association?
[Diagram: binge drinking, coronary calcium, and male gender]

2 x 2 Tables
Men more likely to binge
  – 34% of men, 14% of women
Men have more coronary calcium
  – 15% of men, 7% of women

2 x 2 Tables
But what does confounding look like in a 2x2 table?
And how do you adjust for it?

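2 x 2 Tables
First, stratify…
[Figure: 2x2 tables of CAC by binge drinking, overall and stratified into men and women]
Crude: RR = 1.94
Binge drinking: 34% of men, 14% of women
CAC: 15% of men, 7% of women
Strata-specific estimates for men and women: RR = 1.57 and RR = 1.50

2 x 2 Tables
…compare strata-specific estimates… (they’re about the same)
Binge drinking: 34% of men, 14% of women
CAC: 15% of men, 7% of women
Strata-specific estimates for men and women: RR = 1.57 and RR = 1.50

2 x 2 Tables
…compare to the crude estimate
Crude: RR = 1.94
Binge drinking: 34% of men, 14% of women
CAC: 15% of men, 7% of women
Strata-specific estimates for men and women: RR = 1.57 and RR = 1.50

2 x 2 Tables
…and then adjust the summary estimate
Strata-specific estimates for men and women: RR = 1.50 and RR = 1.57
RRadj = 1.51

[Figure: crude and stratified 2x2 tables of CAC by binge drinking]
Binge drinking: 34% of men, 14% of women
CAC: 15% of men, 7% of women
Strata-specific estimates for men and women: RR = 1.57 and RR = 1.50
Crude: RR = 1.94
RRadj = 1.51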
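
The stratify, compare, adjust sequence can be reproduced with simple arithmetic. The sketch below is in Python and assumes the adjusted estimate is a Mantel-Haenszel weighted summary, which is the usual choice for stratified 2x2 tables. All counts are hypothetical (they are not the lecture's data), chosen so that both strata have the same RR while the crude RR is inflated by the stratifying variable.

```python
def risk_ratio(a, b, c, d):
    """RR for one 2x2 table: a, b = exposed with/without outcome; c, d = unexposed."""
    return (a / (a + b)) / (c / (c + d))


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel summary RR over a list of (a, b, c, d) tables."""
    num = sum(a * (c + d) / (a + b + c + d) for a, b, c, d in strata)
    den = sum(c * (a + b) / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical counts, NOT the lecture's data.
men = (60, 340, 60, 540)     # more exposure and more outcome than women
women = (6, 94, 36, 864)
crude = tuple(m + w for m, w in zip(men, women))  # collapse over strata

print(f"RR in men   = {risk_ratio(*men):.2f}")
print(f"RR in women = {risk_ratio(*women):.2f}")
print(f"Crude RR    = {risk_ratio(*crude):.2f}")
print(f"Adjusted RR = {mantel_haenszel_rr([men, women]):.2f}")
```

When the strata-specific estimates agree with each other but differ from the crude estimate, the pattern points to confounding rather than interaction, and the adjusted summary is the number to report.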

2 x 2 Tables
How do we do this with Stata?
  – Tabulate: output not exactly what we want
  – The “epitab” commands
      Stata’s answer to stratified analyses
      cs, cc
      csi, cci
      tabodds, mhodds

2 x 2 Tables
Example: demo using Stata
cs cac binge
cs cac binge, by(male)
cs cac modalc
cs cac modalc, by(racegender)
cc cac binge

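2 x 2 Tables
Immediate commands
  – csi, cci
  – No dataset required, just 2x2 cell frequencies
csi a b c d
csi (for cac binge)
Note: csi and cci take the counts in Stata’s own order (exposed cases, unexposed cases, exposed noncases, unexposed noncases), which is not the row-by-row order of a table laid out with exposure in rows and outcome in columns.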

Multivariable adjustment
Binge drinking appears to be associated with coronary calcium
  – Association partially due to confounding by gender
What about race? Age? SES? Smoking?

Multivariable adjustment
manual stratification

                                         # 2x2 tables
Crude association                        1
Adjust for gender                        2
Adjust for gender, race                  4
Adjust for gender, race, age             68
Adjust for “” + income, education        816
Adjust for “” + “” + smoking             2448
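
The number of 2x2 tables grows as the product of the number of levels of each adjustment variable. The Python sketch below reproduces the slide's counts (1, 2, 4, 68, 816, 2448). The per-variable level counts are assumptions inferred from the ratios between those numbers and are not stated in the lecture.

```python
from math import prod

# Assumed number of levels per variable (inferred, not given in the lecture).
LEVELS = {
    "gender": 2,
    "race": 2,
    "age": 17,                # e.g. one stratum per year of age
    "income/education": 12,   # combined income and education categories
    "smoking": 3,
}


def n_strata(variables):
    """Number of 2x2 tables needed to stratify on every listed variable."""
    return prod(LEVELS[v] for v in variables)


adjusted_for = []
print("crude:", n_strata(adjusted_for))
for variable in LEVELS:
    adjusted_for.append(variable)
    print(", ".join(adjusted_for), "->", n_strata(adjusted_for))
```

With a few thousand strata most tables contain very few people, which is the practical limit of manual stratification.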

Multivariable adjustment
cs command
  – Does manual stratification for you
      Lists results from every stratum
      Tests for overall homogeneity
      Adjusted and crude results
  – Demo
      cs cac binge, by(male black age)
  – Can’t interpret interactions!

Multivariable adjustment
mhodds command
mhodds allows you to look at specific interactions, adjusted for multiple covariates
  – Does same stratification for you
  – Adjusted results for each interaction variable
  – P-value for specific interaction (homogeneity)
  – Summary adjusted result
Demo
mhodds cac binge age, by(racegender)
But strata get thin!

Multivariable adjustment
logistic command
Assumes logit model
  – Await biostats class for details!
  – Coefficients estimated, no actual stratification
  – Continuous variables used as they are

Multivariable adjustment
logistic command
Basic syntax:
logistic outcomevar [predictorvar1 predictorvar2 predictorvar3…]

Multivariable adjustment
logistic command
If using any categorical predictors:
xi: logistic outcomevar [i.catvar var2…]
Creates “dummy variables” on the fly
If you forget, Stata won’t know they are categorical, and you’ll get the wrong answer!
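
To make “dummy variables” concrete: a categorical predictor with k levels is replaced by k-1 indicator (0/1) columns, with one level left out as the reference. The Python sketch below illustrates that idea only. The function name, the column naming, and the smoking codes are hypothetical, and using the lowest level as the reference is an assumption for this illustration.

```python
def make_dummies(values, prefix):
    """Return k-1 indicator columns for a categorical variable with k levels.

    The lowest level is used as the reference category and gets no column.
    """
    levels = sorted(set(values))
    non_reference = levels[1:]
    return {
        f"{prefix}_{level}": [1 if v == level else 0 for v in values]
        for level in non_reference
    }


# Hypothetical coding: 0 = never, 1 = former, 2 = current smoker.
smoke = [0, 1, 2, 1, 0]

for name, column in make_dummies(smoke, "smoke").items():
    print(name, column)
```

If the codes 0/1/2 were entered as a single numeric predictor instead, the model would estimate one odds ratio per one-unit step, which forces former and current smokers onto a straight line on the log-odds scale.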

Multivariable adjustment
logistic command
Demo
logistic cac binge
logistic cac binge male
logistic cac binge male black
logistic cac binge male black age
xi: logistic cac binge male black age i.smoke

Multivariable adjustment
logistic command
Pros
  – Provides all ORs in the model
  – Accepted approach
  – Can deal with continuous variables
  – Better estimation for large models?
Cons
  – Interaction testing more cumbersome, less automatic
  – More assumptions
  – Harder to test for trends

Testing for trend
Alcohol consumption can be a lot or a little
  – Does association increase with larger amounts of consumption?
  – (no j-shaped curve)
Test of trend?
  – Look through epitab suite

Testing for trends
tabodds command
chi2 test of trend
  – tabodds cac alccat
  – Look at output
Adjustment for multiple variables possible
  – tabodds cac alccat, adjust(age male black)
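
Before reading a trend test, it helps to see what a trend looks like numerically: the odds of the outcome in each ordered exposure category. The Python sketch below uses hypothetical counts per alcohol category (not the lecture's data). It only tabulates the odds and checks whether they rise steadily. It does not reproduce the chi2 test of trend that tabodds reports.

```python
def odds_by_level(table):
    """Odds of the outcome (cases / noncases) in each ordered category."""
    return [cases / noncases for cases, noncases in table]


def is_strictly_increasing(values):
    """True when each value is larger than the one before it."""
    return all(x < y for x, y in zip(values, values[1:]))


# Hypothetical (cases, noncases) per ordered alcohol category, lowest first.
alccat_counts = [(20, 180), (30, 170), (45, 155), (60, 140)]

odds = odds_by_level(alccat_counts)
for level, value in enumerate(odds):
    print(f"category {level}: odds = {value:.3f}")
print("monotone increase:", is_strictly_increasing(odds))
```

A j-shaped pattern would show up here as odds that dip in a middle category before rising, in which case a single linear trend test would be a poor summary.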

Approaching your analysis
Number of potential models/analyses is daunting
  – Where do you start? How do you finish?
My suggestion
  – Explore
  – Plan definitive analysis, make dummy tables/figures
  – Do analysis (do/log files), fill in tables/figures
  – Show to collaborators, iterate as needed
  – Write paper

Summary
Make sure you understand confounding and interaction with 2x2 tables in Stata
Epitab commands are a great way to explore your data
  – Emphasis on interaction
Logistic regression is a more general approach, ubiquitous, but testing for interactions and trends is more difficult

In lab today…
Lab 5
  – Epi analysis of coronary calcium dataset
  – Walks you through evaluation of confounding and interaction
Judgment calls: often no right answer, just focus on reasoning.