1
Analysis of Complex Survey Data Day 3: Regression
2
Today’s schedule
Part I: Basic review of common regressions and when to use them
Part II: Introduction to
– PROC REGRESS
– PROC RLOGIST
– PROC LOGLINK
– PROC MULTILOG
3
Regression
Typically in epidemiologic research, our outcomes fall into five major types:
– Continuous
   Normally distributed
   Skewed
– Counts
– Binary
– Ordinal
– Nominal
4
Continuous outcome, normally distributed
Linear regression
5
Continuous outcome, right skewed
Poisson regression
6
Counts
Poisson regression
7
Binary outcome
Logistic regression
8
Ordinal
Polytomous regression, cumulative logit link function
– Likert scales
– Ordered categorical scales (age, income)
The cumulative logit link function assumes that the effect of going from 1 to 2 is the same as the effect of going from 2 to 3 (the proportional odds assumption).
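One informal way to see what the cumulative logit assumption implies is to collapse the ordinal outcome at each cut point and compare the resulting odds ratios. The sketch below does this for a binary exposure. The counts are hypothetical, the function name is made up for illustration, and survey weights and design are ignored, so treat it as a rough check rather than a substitute for a design-based model.

```python
def cumulative_odds_ratios(exposed, unexposed):
    """Odds ratio at each cut point of an ordinal outcome.

    Both arguments are counts per outcome level, lowest level first.
    At each cut, the outcome is split into "above the cut" versus
    "at or below the cut" and an ordinary 2x2 odds ratio is computed.
    """
    ratios = []
    for cut in range(1, len(exposed)):
        exp_above, exp_below = sum(exposed[cut:]), sum(exposed[:cut])
        unexp_above, unexp_below = sum(unexposed[cut:]), sum(unexposed[:cut])
        ratios.append((exp_above * unexp_below) / (exp_below * unexp_above))
    return ratios


# Hypothetical counts for a 3-level ordered outcome (levels 1, 2, 3).
exposed = [20, 30, 50]
unexposed = [40, 30, 30]

ratios = cumulative_odds_ratios(exposed, unexposed)
for cut, ratio in enumerate(ratios, start=1):
    print(f"Outcome above level {cut}: odds ratio = {ratio:.2f}")
```

If the odds ratios at the different cut points are close to one another, a single common effect is a plausible summary. If they differ widely, the cumulative logit assumption is questionable for that exposure.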
9
Nominal
Polytomous regression, general logit link function
– Race
– Diagnosis (depression versus anxiety versus substance use disorder)
The general logit link function gives a different estimate for the effect of going from 1 to 2 and the effect of going from 2 to 3.
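With a nominal outcome, the general logit approach amounts to comparing each category against a chosen reference category, so each comparison gets its own estimate. A minimal sketch for a binary exposure follows. The counts, the choice of reference category, and the function name are assumptions made for illustration, and survey weights and design are ignored.

```python
def generalized_logit_odds_ratios(exposed, unexposed, labels, reference):
    """One odds ratio per non-reference outcome category.

    exposed / unexposed: counts per outcome category, in the order of labels.
    reference: label of the category every other category is compared with.
    """
    ref = labels.index(reference)
    ratios = {}
    for k, label in enumerate(labels):
        if k == ref:
            continue
        ratios[label] = (exposed[k] * unexposed[ref]) / (exposed[ref] * unexposed[k])
    return ratios


# Hypothetical counts; depression is an arbitrary choice of reference.
labels = ["depression", "anxiety", "substance use disorder"]
exposed = [50, 30, 20]
unexposed = [70, 20, 10]

ratios = generalized_logit_odds_ratios(exposed, unexposed, labels, "depression")
for label, ratio in ratios.items():
    print(f"{label} versus depression: odds ratio = {ratio:.2f}")
```

Unlike the cumulative logit case, there is no expectation that these odds ratios agree with one another, because the categories have no order.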
10
Categorizing your exposure
Check assumptions regarding the functional form of the relationship between the exposure and the outcome
– E.g., relationship between age and alcohol use disorders. We would not want to enter age as a continuous variable because we do not think age is linearly related to risk of alcohol use disorders.
If you decide to categorize a continuous variable, cutpoints are best chosen from precedent in the literature
– Relying on data-driven cutpoints will make your work difficult to compare with other work in the literature
If there is no precedent:
– Use quartiles, or
– Break up the exposure into small categories, and examine the relationship with the outcome in a regression model with no other predictors (on the log-odds scale if using logistic regression).
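The quartile approach can be sketched as follows: compute cutpoints, assign each observation to a category, and look at the log odds of the outcome in each category. With category indicators as the only predictors, a logistic model reproduces these observed log odds, so the table gives a quick look at the functional form. The ages and outcomes below are hypothetical, the helper names are made up, and survey weights and design are ignored.

```python
import math
import statistics


def category_index(value, cutpoints):
    """Number of cutpoints the value exceeds (0 = lowest category)."""
    return sum(value > c for c in cutpoints)


def log_odds_by_category(exposure, outcome, cutpoints):
    """Observed log odds of a 0/1 outcome within each exposure category."""
    n_categories = len(cutpoints) + 1
    cases = [0] * n_categories
    totals = [0] * n_categories
    for x, y in zip(exposure, outcome):
        k = category_index(x, cutpoints)
        totals[k] += 1
        cases[k] += y
    # Log odds are undefined when a category has no cases or no non-cases.
    return [
        math.log(c / (n - c)) if 0 < c < n else None
        for c, n in zip(cases, totals)
    ]


# Hypothetical data: ages 18 to 57, with cases concentrated in mid-life.
ages = list(range(18, 58))
cases_per_quartile = [2, 5, 4, 1]
outcome = []
for n_cases in cases_per_quartile:
    outcome += [1] * n_cases + [0] * (10 - n_cases)

cutpoints = statistics.quantiles(ages, n=4)  # three quartile cutpoints
log_odds = log_odds_by_category(ages, outcome, cutpoints)
for k, value in enumerate(log_odds, start=1):
    print(f"Quartile {k}: log odds = {value:.2f}")
```

If the log odds rise or fall by roughly equal steps across categories, a linear term may be adequate. A pattern that rises and then falls, as in this made-up example, argues for keeping the exposure categorical.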
11
Choosing covariates
Most important: DO NOT SKIP THE GROUNDWORK!
– Check associations with exposure and outcome
– Check associations among covariates
– Categorize the covariates appropriately
When should something be evaluated as a moderator, and when should it be a confounder/covariate?
– Most of the time, it is clear: do you think that the relationship between exposure and outcome will be the same across levels of the third variable, or do you think it will be different?
– If you do not have an a priori hypothesis and are just trying to build a solid statistical model, try it as a moderator first. If the interaction is significant, leave it in as a moderator.
– Because interaction terms are sometimes difficult to interpret on their own, consider fitting separate models within subsets of the data instead.
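The subset idea can be illustrated with stratum-specific odds ratios for a binary exposure and outcome. With only the exposure, the moderator, and their product in a logistic model, the exponentiated interaction coefficient equals the ratio of the two stratum-specific odds ratios, which is why the subset estimates are often easier to read. The strata and counts below are hypothetical, and survey weights and design are ignored.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds ratio from the four cells of a 2x2 table."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Hypothetical 2x2 tables, one per level of the candidate moderator.
# Cell order: exposed cases, exposed non-cases, unexposed cases, unexposed non-cases.
strata = {
    "men": (40, 60, 20, 80),
    "women": (25, 75, 22, 78),
}

stratum_or = {name: odds_ratio(*cells) for name, cells in strata.items()}
for name, value in stratum_or.items():
    print(f"{name}: odds ratio = {value:.2f}")

# Ratio of the stratum-specific odds ratios: the multiplicative interaction.
interaction_ratio = stratum_or["men"] / stratum_or["women"]
print(f"Ratio of odds ratios (men versus women) = {interaction_ratio:.2f}")
```

A ratio near 1 suggests the exposure effect is similar across strata, and the third variable can be treated as a covariate. A ratio far from 1 points toward moderation, subject to a formal test that accounts for the survey design.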
12
LAB 3: Regression in SUDAAN