Linear Regression
Hein Stigum, Jul-15
Presentation, data and programs at:
CONCEPTS
Linear regression
Outcome and regression types
Numerical data
–Discrete (number of partners): Poisson regression
–Continuous (weight): Linear regression
Categorical data
–Nominal (disease/no disease): Logistic regression
–Ordinal (small/medium/large): Ordinal regression
Regression idea
Measures and Assumptions
Adjusted effects
–b1 is the increase in weight per day of gestational age
–b1 is adjusted for the other covariate in the model (the one with coefficient b2)
Assumptions
–Independent errors
–Linear effects
–Constant error variance
Robustness
–Influence
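The coefficients b1 and b2 above can be read off a two-covariate model. The notation below is a sketch, not taken from the slides: y is birth weight, x1 is gestational age in days, x2 is a second covariate (for example sex), b0 is the constant and e is the error term. All of these symbols except b1 and b2 are assumptions introduced here.

```latex
y = b_0 + b_1 x_1 + b_2 x_2 + e

% Adjusted effect: compare two births that differ by one day of
% gestational age but share the same value of x_2
E[y \mid x_1 + 1,\, x_2] - E[y \mid x_1,\, x_2] = b_1
```

The three assumptions then concern e (independent, constant variance) and the form of the mean (linear in x1 and x2).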
Workflow
DAG
Plots: distribution and scatter
Bivariate analysis
Regression
–Model estimation
–Test of assumptions
  Independent errors (discuss)
  Linear effects (plot)
  Constant error variance (plot)
–Robustness
  Influence
ANALYSIS
Continuous outcome: Linear regression, Birth weight
DAGs
[DAG nodes: E gest age, D birth weight, C1 sex, C2 parity]
Associations: Bivariate (unadjusted)
Causal effects: Multivariable (adjusted)
Draw your assumptions before your conclusions
Plot outcome by exposure
[Figure: outcome plotted by exposure; annotation: OK]
Be clear on the research question:
–overall birth weight: linear regression
–low birth weight: logistic regression
–linear and logistic can give opposite results
Effects on linear regression:
–May lead to non-constant error variance
–May have highly influential outliers
Plot outcome by exposure, cont.
Linear effects? Yes
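A minimal do-file sketch of the two plots in the workflow (distribution and scatter). The data are simulated stand-ins, not the course data; the variable names weight and gest follow the slides, while the numeric values, the seed and the graph names are assumptions.

```stata
* Simulated stand-in data (hypothetical values, not the course data)
clear
set seed 2015
set obs 500
generate gest   = round(rnormal(280, 12))                  // gestational age, days
generate weight = 3600 + 6*(gest - 280) + rnormal(0, 450)  // birth weight

* Distribution of the outcome
histogram weight, name(dist, replace)

* Outcome by exposure, with a lowess smoother to judge linearity
twoway (scatter weight gest, msize(small)) (lowess weight gest), ///
    ytitle("Birth weight") xtitle("Gestational age (days)")      ///
    name(byexposure, replace)
```

If the smoother stays close to a straight line, a linear effect is a reasonable starting point; fanning of the points suggests non-constant variance.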
Bivariate analysis
Outcome: birthweight
REGRESSION
Continuous outcome: Linear regression, Birth weight
Categorical covariates
2 categories
–OK, but know the coding
3+ categories
–Use “dummies”
–“Dummies” are 0/1 variables used to create contrasts
–Want 3 categories for parity: 0, 1 and 2-7 children
–Choose 0 as reference
–Make dummies for the two other categories
generate Parity1 =(parity==1) if parity<.
generate Parity2_7 =(parity>=2) if parity<.
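The two generate commands can be checked on a small simulated parity variable. This is a sketch under assumptions: the parity values and the five missing observations are made up for illustration. It shows why the `if parity<.` qualifier matters: Stata treats missing as larger than any number, so without it `parity>=2` would code missing parity as 1.

```stata
* Simulated stand-in for parity (hypothetical data)
clear
set seed 2015
set obs 200
generate parity = min(rpoisson(1), 7)
replace  parity = . in 1/5            // a few missing values, for illustration

* Dummies from the slide: a comparison evaluates to 1 (true) or 0 (false);
* "if parity<." leaves the dummy missing where parity is missing
generate Parity1   = (parity==1) if parity<.
generate Parity2_7 = (parity>=2) if parity<.

* Check the coding against the original variable
tabulate parity Parity1, missing
tabulate parity Parity2_7, missing
```

Parity 0 has both dummies equal to 0, which makes it the reference category.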
Model estimation
Syntax:
regress weight gest sex Parity1 Parity2_7
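A sketch that runs the syntax above on simulated stand-in data and sets the unadjusted (bivariate) estimate for gest next to the adjusted one. The data-generating values, the seed and the coding of sex (1 = boy, 2 = girl) are assumptions, as is the scalar name b_unadj.

```stata
* Simulated stand-in data (hypothetical values, not the course data)
clear
set seed 2015
set obs 500
generate gest   = round(rnormal(280, 12))
generate sex    = 1 + (runiform() < 0.5)      // assumed coding: 1 = boy, 2 = girl
generate parity = min(rpoisson(1), 7)
generate weight = 3600 + 6*(gest - 280) - 150*(sex==2) ///
                  + 80*(parity>=1) + rnormal(0, 450)
generate Parity1   = (parity==1) if parity<.
generate Parity2_7 = (parity>=2) if parity<.

* Unadjusted (bivariate) association
regress weight gest
scalar b_unadj = _b[gest]

* Adjusted model, as on the slide
regress weight gest sex Parity1 Parity2_7
display "b[gest] unadjusted = " b_unadj "   adjusted = " _b[gest]
```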
Create meaningful constant
Expected birth weight at:
–gest=0, sex=0, parity=0: the constant of the uncentered model, not a meaningful birth
–gest=280, sex=1, parity=0: a meaningful reference (a boy at 280 days)
Alternative: center variables
gen gest280=gest-280
–gest280 has a meaningful zero at 280 days
gen sex0=sex-1
–sex0 has a meaningful zero at boys
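A sketch of the centering idea on simulated stand-in data: the slopes stay the same and only the constant changes. The simulated values and the scalar name b_gest_raw are assumptions; the coding sex = 1 for boys follows from the slide's sex0 = sex - 1.

```stata
* Simulated stand-in data (hypothetical values, not the course data)
clear
set seed 2015
set obs 500
generate gest   = round(rnormal(280, 12))
generate sex    = 1 + (runiform() < 0.5)      // 1 = boy, 2 = girl
generate parity = min(rpoisson(1), 7)
generate weight = 3600 + 6*(gest - 280) - 150*(sex==2) ///
                  + 80*(parity>=1) + rnormal(0, 450)
generate Parity1   = (parity==1) if parity<.
generate Parity2_7 = (parity>=2) if parity<.

* Uncentered: _cons is the expected weight at gest=0, sex=0, parity=0
regress weight gest sex Parity1 Parity2_7
scalar b_gest_raw = _b[gest]

* Centered: _cons is the expected weight for a boy at 280 days, parity 0
generate gest280 = gest - 280
generate sex0    = sex - 1
regress weight gest280 sex0 Parity1 Parity2_7
display "Expected weight (boy, 280 days, parity 0): " _b[_cons]
```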
Model results
Test of assumptions
Independent residuals?
–Discuss
Linear effects? Constant variance?
–Plot residuals versus predicted y
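The residuals-versus-predicted plot can be sketched as follows, on simulated stand-in data. The variable names yhat and res and all simulated values are assumptions. A curved pattern around zero points to a non-linear effect; a funnel shape points to non-constant variance.

```stata
* Simulated stand-in data (hypothetical values, not the course data)
clear
set seed 2015
set obs 500
generate gest   = round(rnormal(280, 12))
generate sex    = 1 + (runiform() < 0.5)      // assumed coding: 1 = boy, 2 = girl
generate weight = 3600 + 6*(gest - 280) - 150*(sex==2) + rnormal(0, 450)

regress weight gest sex

* Predicted outcome and residuals from the fitted model
predict double yhat, xb
predict double res, residuals

* Residuals versus predicted y
scatter res yhat, yline(0)

* Built-in equivalent after regress
rvfplot, yline(0)
```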
Violations of assumptions
Dependent residuals
–Use linear mixed models
Non-linear effects
–Add square term
–Or use piecewise linear
Non-constant variance
–Use robust variance estimation
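A sketch of three of these remedies on simulated stand-in data. The knot at 280 days, the variable names gest280sq, gest_lo and gest_hi, and all simulated values are assumptions chosen for illustration. A mixed model is not shown because it needs a grouping identifier (for example a mother id) that the slides do not name.

```stata
* Simulated stand-in data (hypothetical values, not the course data)
clear
set seed 2015
set obs 500
generate gest   = round(rnormal(280, 12))
generate sex    = 1 + (runiform() < 0.5)      // assumed coding: 1 = boy, 2 = girl
generate weight = 3600 + 6*(gest - 280) - 150*(sex==2) + rnormal(0, 450)

* Non-linear effect, option 1: add a square term (centered first)
generate gest280   = gest - 280
generate gest280sq = gest280^2
regress weight gest280 gest280sq sex

* Non-linear effect, option 2: piecewise linear with one knot (280 is hypothetical)
mkspline gest_lo 280 gest_hi = gest
regress weight gest_lo gest_hi sex

* Non-constant variance: same coefficients, robust standard errors
regress weight gest sex
scalar b_plain = _b[gest]
regress weight gest sex, vce(robust)
```

Robust variance estimation changes the standard errors and confidence intervals, not the coefficients.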
Influence
Measures of influence
Measure change in:
–Predicted outcome
–Deviance
–Coefficients (beta)
Delta beta
–Remove obs 1, see change
–Remove obs 2, see change
Delta beta for gestational age
If obs nr 539 is removed, beta will change from 6 to 16
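A sketch of the delta-beta idea on simulated stand-in data with one planted outlier. The outlier's values, the variable names id and db_gest, and the scalar b_full are assumptions. Note that Stata's dfbeta statistic is the change in the coefficient scaled by its standard error, so the raw change (as quoted on the slide) is shown by refitting without the observation.

```stata
* Simulated stand-in data (hypothetical values, not the course data)
clear
set seed 2015
set obs 500
generate id     = _n
generate gest   = round(rnormal(280, 12))
generate sex    = 1 + (runiform() < 0.5)      // assumed coding: 1 = boy, 2 = girl
generate weight = 3600 + 6*(gest - 280) - 150*(sex==2) + rnormal(0, 450)

* Plant one hypothetical outlier: very short gestation, high weight
replace gest   = 160  in 1
replace weight = 4500 in 1

regress weight gest sex
scalar b_full = _b[gest]

* Standardized delta beta for gest, one value per observation
predict double db_gest, dfbeta(gest)
scatter db_gest id, mlabel(id)

* Raw change: refit without the outlier and compare
regress weight gest sex if id != 1
display "b[gest] full data = " b_full "   outlier removed = " _b[gest]
```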
Removing outlier
[Tables: model results for full data and with the outlier removed]
One outlier affected two estimates
Final model
Summing up
DAGs
–Guide analysis
Plots
–Unequal variance, non-linearity, outliers
Bivariate analysis
Linear regression
–Fit model
–Check assumptions
–Check robustness
–Make meaningful constant