EHS 655 Lecture 15: Exposure variability and modeling
What we’ll cover today
  Within- and between-subject variability
  Variance components
  Fixed, random, and mixed effects
WITHIN- AND BETWEEN-PERSON VARIABILITY
  We’ve already talked about within- and between-group variability
  What about variability at the individual level?
  Between-person variability
    Reflects differences between workers beyond those explained by shared characteristics (e.g., workplace)
  Within-person variability
    Reflects day-to-day differences in an individual worker’s exposure
Example: within- and between-person variability (Burdorf, 2005)
Example: within-person variability between groups Rappaport, 2008
How do we evaluate within- and between-person variability?
  Step 1: measure it
    Need repeated measurements on individuals
  Step 2: analyze it
    Need to evaluate variance components
      Within-person
      Between-person
  Why do we care?
    Relative importance of day-to-day differences in exposure vs. between-worker differences
    Useful when considering approaches to exposure control
Exercise
  Why do we care about variance components?
  Think about how within-person variability applies to exposure controls
  For which jobs do you think training and personal protective equipment might be most and least effective at controlling exposures?
    Welder: within-person, between-person
    Mechanic: within-person, between-person
    Assembler: within-person, between-person
Within- and between-person variability and controls
  We always adhere to the hierarchy of controls
  But…theoretically…
    Engineering controls
      Could be more effective where within-subject variance >> between-subject variance
    Administrative controls and personal protective equipment (PPE)
      Could be more effective where between-subject variance >> within-subject variance
VARIANCE COMPONENTS
  $\sigma_T^2 = \sigma_W^2 + \sigma_B^2$
  Where
    $\sigma_T^2$ = total variance
    $\sigma_W^2$ = within-person variance
    $\sigma_B^2$ = between-person variance
  Estimating $\sigma_W^2$ requires repeated measurements on individuals
Repeated measurements
  Correlation between repeated measurements on the same person must be taken into account
  $\rho$ = correlation between any two repeated measurements on an individual
  We typically assume this is constant over time
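Under the variance components model $\sigma_T^2 = \sigma_W^2 + \sigma_B^2$, the correlation between two repeated measurements on one person is commonly written as the intraclass correlation, $\rho = \sigma_B^2 / (\sigma_B^2 + \sigma_W^2)$. That form is assumed here as a standard result rather than taken from the slide, and the function name and variance values below are hypothetical. A minimal sketch:

```python
def intraclass_correlation(var_between: float, var_within: float) -> float:
    """Share of total variance that is between-person (assumed ICC form)."""
    total = var_between + var_within
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_between / total


# Hypothetical variance components (e.g., on a log-exposure scale)
rho = intraclass_correlation(var_between=0.5, var_within=1.5)
print(rho)  # 0.25
```

A value near 0 means day-to-day variability dominates; a value near 1 means most variability is between workers.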
Can we analyze variance via standard multivariable linear regression?
  Not accurately
  Problem
    Ignores within-person variability and the correlation between measurements on the same person (i.e., errors are not independent)
    Appropriate only if each person has exactly one measurement
Repeated measurements – attenuation
  Where
    b = expected value of the regression coefficient of dependent Y on independent X
    $\beta$ = true regression coefficient
    $\sigma_W^2$ = within-person variance
    $\sigma_B^2$ = between-person variance
  Heederik, Boleij, Kromhout, Smid, 1991
Variance components example – job title Heederik, Boleij, Kromhout, Smid, 1991
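One widely used form of the attenuation relationship, assumed here rather than taken from the slide, is $b = \beta / (1 + \lambda/n)$, where $\lambda = \sigma_W^2/\sigma_B^2$ and n is the number of repeated measurements averaged per person. With n = 1 this reduces to $b = \beta\,\sigma_B^2/(\sigma_B^2 + \sigma_W^2)$. The function name and input values below are hypothetical. A minimal sketch:

```python
def attenuated_slope(beta_true: float, var_within: float,
                     var_between: float, n_repeats: int = 1) -> float:
    """Expected observed slope under the assumed form b = beta / (1 + lambda/n)."""
    if var_between <= 0 or n_repeats < 1:
        raise ValueError("need var_between > 0 and n_repeats >= 1")
    lam = var_within / var_between  # variance ratio, lambda
    return beta_true / (1 + lam / n_repeats)


# Hypothetical values: lambda = 1.5 / 0.5 = 3
b_one = attenuated_slope(1.0, var_within=1.5, var_between=0.5, n_repeats=1)
b_four = attenuated_slope(1.0, var_within=1.5, var_between=0.5, n_repeats=4)
print(b_one, b_four)
```

Under this form, more repeated measurements per person pull the expected slope back toward the true coefficient.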
Variance components example – different exposure predictors Kromhout et al, 1993
So, how do we get estimates of within- and between-person variance?
  Things to include in analyses
    We’ve already been considering fixed effects
      Shared characteristic (e.g., workplace, department)
      Average response of group from common regression model
    Now we can add random effects
      Allow estimation of responses for each individual (i.e., account for individual variability)
    Mixed models include both kinds of effects
Fixed vs. random effects Rappaport et al, Ann Occup Hyg, 1999
Variance components approach 1: One-way random effects ANOVA
  Strength
    Addresses between- and within-person variability
  Weakness
    Ignores work characteristics
  Stata: loneway depvar idvar
    Take the within- and between-person SDs that are output and square them to get variance components
    Compute $\lambda = \sigma_W^2/\sigma_B^2$
Variance components approach 1: One-way random effects ANOVA
  Can also look at within- and between-group variance
  Stata: loneway depvar groupvar
    Take the within- and between-group SDs that are output and square them to get variance components
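To make the arithmetic behind approach 1 concrete, the sketch below computes ANOVA (method-of-moments) variance components for a balanced design, where every person has the same number of measurements. It illustrates the general technique and is not a reproduction of Stata's loneway internals; the function name and data are hypothetical.

```python
from statistics import mean


def oneway_variance_components(data):
    """Return (var_within, var_between) for balanced repeated measurements.

    data maps each person (or group) id to a list of measurements.
    """
    groups = list(data.values())
    k = len(groups)       # number of persons
    n = len(groups[0])    # measurements per person
    if k < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need a balanced design with k >= 2 and n >= 2")
    grand_mean = mean(x for g in groups for x in g)
    person_means = [mean(g) for g in groups]
    # Mean squares from the one-way ANOVA table
    ms_between = n * sum((m - grand_mean) ** 2 for m in person_means) / (k - 1)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, person_means)
                    for x in g) / (k * (n - 1))
    var_within = ms_within
    # Truncate at zero: the moment estimator can go negative
    var_between = max((ms_between - ms_within) / n, 0.0)
    return var_within, var_between


# Hypothetical log-transformed exposures, three workers, three days each
exposures = {"A": [1.0, 2.0, 3.0], "B": [2.0, 3.0, 4.0], "C": [4.0, 5.0, 6.0]}
var_w, var_b = oneway_variance_components(exposures)
lam = var_w / var_b  # variance ratio, lambda
print(var_w, var_b, lam)
```

For these made-up data the within-person mean square is 1.0 and the between-person mean square is 7.0, giving a within-person variance of 1.0, a between-person variance of 2.0, and a ratio of 0.5.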
Reference: One-way random effects ANOVA model for between- and within-person variance
  Where
    $X_{ij}$ is the exposure level for the ith worker on the jth day
    $\mu_y$ is the mean exposure
    $\beta_i$ is the random deviation of the ith person’s true exposure $\mu_{y,i}$ from $\mu_y$
    $\varepsilon_{ij}$ is the random deviation of the ith person’s exposure on the jth day from their true exposure $\mu_{y,i}$
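The symbols defined for the one-way random effects model suggest an additive form like the one below. The log transformation $Y_{ij} = \ln(X_{ij})$ and the normality assumptions are not stated on the slide; they are assumed here because the y subscript on the means hints at a transformed scale and because they are common in this literature.

```latex
Y_{ij} = \ln(X_{ij}) = \mu_y + \beta_i + \varepsilon_{ij},
\qquad
\beta_i \sim N(0, \sigma_B^2), \quad
\varepsilon_{ij} \sim N(0, \sigma_W^2)
```

With $\beta_i$ and $\varepsilon_{ij}$ independent, the total variance is $\sigma_T^2 = \sigma_W^2 + \sigma_B^2$, matching the variance components decomposition.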
Variance components approach 2: Mixed effects regression
  Strengths
    Can address work characteristics
    Can evaluate variance components
    Can consider several sources of variability
  Handles
    One continuous dependent variable (linear regression)
    One binary dependent variable (logistic regression)
    One to many categorical or continuous predictors
Mixed effects regression
  Includes both fixed and random effects
    Fixed effect: same coefficient applied to everyone in the group
    Random effect: coefficients vary by individual, site, or some other factor
  Estimates within- and between-person variance while adjusting for fixed effects
  Linear regression
    Stata: xtmixed depvar indepvar || idvar:
    (xtmixed was renamed mixed in Stata 13 and later)
Mixed effects vs multivariable linear regression Peretz et al, 2002
Comparison: random vs mixed effects models Peretz et al, 2002
Comparison: random vs fixed effects models Peretz et al, 2002
Comparing models
  Log likelihood
    Absolute number is meaningless on its own
    Higher is better; equivalently, a lower −2 × log likelihood is better
  To compare nested models only (logistic regression)
    Wald test: essentially a chi-square test of whether a coefficient = 0
    Likelihood ratio test: compares −2 × log likelihood between the two models, where −2 × log likelihood measures unexplained variation in the dependent variable
      Smaller −2 × log likelihood = better fit, but the absolute number is meaningless
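As a rough illustration of the likelihood ratio test for two nested models that differ by one parameter, the sketch below takes the two log likelihoods as inputs. It assumes the usual chi-square reference distribution with 1 degree of freedom; the function name and log-likelihood values are hypothetical, not output from any model in the lecture.

```python
import math


def likelihood_ratio_test_1df(loglik_reduced: float, loglik_full: float):
    """Return (LR statistic, p-value) for nested models differing by one parameter."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    if stat < 0:
        raise ValueError("full model should not fit worse than the reduced model")
    # Upper tail of a chi-square with 1 df: P(Z^2 > stat) = erfc(sqrt(stat / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log likelihoods: the full model is higher (better)
lr_stat, p_value = likelihood_ratio_test_1df(loglik_reduced=-105.0, loglik_full=-102.0)
print(lr_stat, p_value)
```

Only the difference between the two log likelihoods matters here, which is why the absolute values carry no meaning on their own.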
Resources
  Linear models – repeated measurements analysis
    http://faculty.ucr.edu/~hanneman/linear_models/c8.html
  Interpreting Stata linear mixed model output
    http://blog.stata.com/2013/02/18/multilevel-linear-models-in-stata-part-2-longitudinal-data/
    www.stata.com/meeting/fnasug08/gutierrez.pdf
Reference: Mixed effects regression model
  Where
    $X_{ij}$ is the exposure level
      i = worker 1 to k
      j = 1 to n repetitions for the ith person
    $\beta_0$ is the intercept for the background exposure group
    $\beta_1$ to $\beta_p$ are fixed effects
    $x_{ij1}$ to $x_{ijp}$ are the values of the variables for the ith person on the jth day
    $b_1$ to $b_k$ are the worker random effects ($b_i$ is the discrepancy between the ith worker’s intercept and the group intercept)
    $z_1$ to $z_k$ are person indicators (0/1, i.e., dummy variables)
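The terms defined for the mixed effects model can be assembled into one equation, sketched below. The log transformation of exposure, the residual term $\varepsilon_{ij}$, and the distributional assumptions are not stated on the slide and are assumed here as the usual conventions for this kind of model.

```latex
Y_{ij} = \ln(X_{ij})
       = \beta_0 + \sum_{u=1}^{p} \beta_u x_{iju}
       + \sum_{m=1}^{k} b_m z_m + \varepsilon_{ij},
\qquad
b_m \sim N(0, \sigma_B^2), \quad
\varepsilon_{ij} \sim N(0, \sigma_W^2)
```

For a measurement on worker i, only $z_i = 1$, so the random-effects sum reduces to $b_i$ and the fixed effects shift that worker's expected exposure away from the group intercept $\beta_0$.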