Applied biostatistics
Francisco Javier Barón López
Dpto. Medicina Preventiva
Universidad de Málaga – España

Multivariate analysis

Generally used to study:
  The effect of one variable
    Numerical, dichotomous, or qualitative (encoded as multiple binary variables)
  On another variable
    Numerical: multiple linear regression
    Binary: logistic regression
  Controlling for the effect of a few other variables
    Control variables
    Covariates
    Confounders

The usual multivariate model in Health sciences

[Diagram: a Multivariate model relating the Outcome to the Interesting variable and the Covariates (age, sex, …)]

Does [interesting variable] influence [outcome variable] when [adjusting for / controlling for / taking into account] covariate 1, covariate 2, …?

Numerical outcome: Multiple linear regression

[Diagram: a Multivariate model (linear regression model) relating the Numeric outcome to the Interesting variable and the Covariates (age, sex, …)]

Results are reported in one of two forms:
  Estimate ± std. error; p
  Estimate; CI 95%; p

We are NOT (very) interested in the significance of covariates.

Reading the estimate:
  Estimate > 0: increasing effect
  Estimate < 0: decreasing effect
  Estimate = 0: no effect
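
The two report formats above carry the same information. As a rough sketch, the following converts "Estimate ± std. error" into "Estimate; CI 95%; p". It assumes a normal approximation, whereas regression software typically uses a t distribution, so results on small samples will differ slightly. The function name `summarize_estimate` and the numbers are hypothetical, chosen only for illustration.

```python
from statistics import NormalDist


def summarize_estimate(estimate, std_error, level=0.95):
    """Return (lower, upper, p_value) for a regression estimate.

    Assumption: normal approximation, not the t distribution that
    regression software normally uses.
    """
    standard_normal = NormalDist()
    # Critical value, about 1.96 for a 95% interval
    z_crit = standard_normal.inv_cdf(1 - (1 - level) / 2)
    lower = estimate - z_crit * std_error
    upper = estimate + z_crit * std_error
    # Two-sided p-value for the null hypothesis "Estimate = 0"
    z = estimate / std_error
    p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    return lower, upper, p_value


# Hypothetical estimate and standard error, for illustration only
lower, upper, p_value = summarize_estimate(estimate=2.0, std_error=0.5)
print(f"Estimate 2.0; CI 95% ({lower:.2f}, {upper:.2f}); p = {p_value:.5f}")
```

In this illustration the interval lies entirely above 0, which corresponds to an increasing effect.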

Binary logistic regression

[Diagram: a Multivariate model (logistic regression model) relating the Binary outcome (coded 0/1) to the Interesting variable and the Covariates (age, sex, …)]

Results are reported in one of two forms:
  OR; p
  OR; CI 95%; p

The estimates are now Odds Ratios (OR):
  OR > 1: increased risk
  OR < 1: decreased risk
  OR = 1: no effect
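
Logistic regression software usually estimates coefficients on the log-odds scale, and the odds ratio is the exponential of the coefficient. The sketch below shows that conversion and the reading rule from the slide. The names `odds_ratio` and `interpret`, the 1.96 critical value (normal approximation), and the input numbers are assumptions made for illustration.

```python
import math


def odds_ratio(coefficient, std_error, z_crit=1.96):
    """Convert a log-odds coefficient into (OR, CI lower, CI upper).

    Assumption: approximate 95% interval, built on the log-odds scale
    with a normal critical value and then exponentiated.
    """
    or_point = math.exp(coefficient)
    ci_lower = math.exp(coefficient - z_crit * std_error)
    ci_upper = math.exp(coefficient + z_crit * std_error)
    return or_point, ci_lower, ci_upper


def interpret(or_value):
    """Apply the reading rule for an odds ratio."""
    if or_value > 1:
        return "Increased risk"
    if or_value < 1:
        return "Decreased risk"
    return "No effect"


# Hypothetical coefficient and standard error, for illustration only
or_point, ci_lower, ci_upper = odds_ratio(coefficient=math.log(2), std_error=0.2)
print(f"OR {or_point:.2f}; CI 95% ({ci_lower:.2f}, {ci_upper:.2f}): {interpret(or_point)}")
```

In practice an estimated OR is almost never exactly 1, so "no effect" is usually judged by whether the confidence interval contains 1.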

Dummy variables

[Diagram: a Multivariate model relating the Outcome to a Qualitative interesting variable with 3+ levels and the Covariates (age, sex, …)]

We must encode non-binary qualitative variables using only binary variables. How?

Encoding dummy variables

Job category (Categoria laboral)    dummySeguridad    dummyDirectivo
Administrative (Administrativo)     0                 0
Security (Seguridad)                1                 0
Manager (Directivo)                 0                 1
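
The encoding above can be sketched in a few lines of code. This is a minimal illustration, assuming the administrative level (spelled here as `Administrativo`) is the reference category that gets 0 in every dummy. The helper `encode_dummies` and the constants are hypothetical names, not part of any library.

```python
# Reference level: every dummy is 0 for this category (assumption based on the table)
REFERENCE = "Administrativo"
# One dummy variable per remaining level
DUMMY_LEVELS = ["Seguridad", "Directivo"]


def encode_dummies(category):
    """Return the dummy variables for one job category as a dict."""
    if category != REFERENCE and category not in DUMMY_LEVELS:
        raise ValueError(f"Unknown category: {category!r}")
    # Each dummy is 1 only when the category matches its level
    return {f"dummy{level}": int(category == level) for level in DUMMY_LEVELS}


for category in [REFERENCE] + DUMMY_LEVELS:
    print(category, encode_dummies(category))
```

A variable with k levels needs only k - 1 dummies, because the reference level is identified by all dummies being 0.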
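
Dummy variables

Coding qualitative variables using dummy variables:

[Diagram, before coding: a Multivariate model relating the Outcome to a Qualitative interesting variable with 3+ levels and the Covariates (…)]

[Diagram, after coding: a Multivariate model relating the Outcome to Dummy 1, Dummy 2, … and the Covariates (…)]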