Lecture 4, part 1: Linear Regression Analysis: Two Advanced Topics


1 Lecture 4, part 1: Linear Regression Analysis: Two Advanced Topics
Karen Bandeen-Roche, PhD
Department of Biostatistics, Johns Hopkins University
July 14, 2011
Introduction to Statistical Measurement and Modeling

2 Data examples Boxing and neurological injury
Scientific question: Does amateur boxing lead to decline in neurological performance?
Some related statistical questions:
  Is there a dose-response increase in the rate of cognitive decline with increased boxing exposure?
  Is boxing-associated decline independent of initial cognition and age?
  Is there a threshold of boxing that initiates harm?

3 Boxing data

4 Outline
Topic #1: Confounding
  Handling this is crucial if we are to draw correct conclusions about risk factors
Topic #2: Signal / noise decomposition
  Signal: Regression model predictions
  Noise: Residual variation
  Another way of approaching inference, precision of prediction

5 Topic #1: Confounding
Confound means to “confuse”
Arises when the comparison is between groups that are otherwise not similar in ways that affect the outcome
  Example: coffee drinking and smoking in relation to cardiovascular disease (CVD)
  Lurking variables, ...

6 Confounding Example: Drowning and Eating Ice Cream
[Figure: scatter plot of drowning rate against ice cream eaten]

7 Confounding
Epidemiology definition: A characteristic “C” is a confounder if it is associated (related) with both the outcome (Y: drowning) and the risk factor (X: ice cream) and is not causally in between.
[Diagram: Ice Cream Consumption, Drowning rate, ??]
(Slide footer: July 2010, JHU Intro to Clinical Research)

8 Confounding
Statistical definition: A characteristic “C” is a confounder if the strength of relationship between the outcome (Y: drowning) and the risk factor (X: ice cream) differs with, versus without, adjustment for C.
[Diagram: Ice Cream Eaten, Drowning rate, Outdoor Temperature]
[Presenter note: Remind what “adjustment” means: direct vs. indirect effect]

9 Confounding Example: Drowning and Eating Ice Cream
[Figure: scatter plot of drowning rate against ice cream eaten, with points grouped by warm temperature and cool temperature]
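The ice cream example can be mimicked with a small simulation. This is only a sketch under assumed, made-up numbers: temperature drives both ice cream consumption and drowning, and ice cream has no effect of its own. The variable names, coefficients and the helper `ols` are illustrative and do not come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Assumed data-generating process: temperature (C) affects both X and Y;
# ice cream (X) has no direct effect on drowning (Y).
temperature = rng.normal(20.0, 8.0, size=n)
ice_cream = 2.0 * temperature + rng.normal(0.0, 5.0, size=n)
drowning = 0.5 * temperature + rng.normal(0.0, 3.0, size=n)

def ols(y, *columns):
    """Least-squares coefficients for y on an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

crude = ols(drowning, ice_cream)[1]                  # slope without adjustment
adjusted = ols(drowning, ice_cream, temperature)[1]  # slope adjusted for temperature

print(f"crude slope:    {crude:.3f}")
print(f"adjusted slope: {adjusted:.3f}")
```

Under these assumptions the crude slope is clearly positive, while the adjusted slope is close to zero: the strength of the relationship differs with versus without adjustment for temperature, which is the statistical definition of confounding.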

10 Effect modification
A characteristic “E” is an effect modifier if the strength of relationship between the outcome (Y: drowning) and the risk factor (X: ice cream) differs within levels of E.
Example: birth control pills and smoking in relation to CVD
[Diagram: Ice Cream Consumption, Drowning rate, Outdoor temperature]

11 Effect Modification: Drowning and Eating Ice Cream
[Figure: scatter plot of drowning rate against ice cream eaten, with points grouped by warm temperature and cool temperature]
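One common way to represent effect modification in a regression is an interaction (product) term. The sketch below assumes a binary modifier (warm vs. cool) and invented slopes; none of the numbers or names come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400

warm = rng.integers(0, 2, size=n)           # effect modifier E: 1 = warm, 0 = cool
ice_cream = rng.uniform(0.0, 10.0, size=n)  # risk factor X

# Assumed truth: slope 0.2 when cool, 0.2 + 0.8 = 1.0 when warm.
drowning = (1.0 + 0.2 * ice_cream + 0.8 * warm * ice_cream
            + rng.normal(0.0, 1.0, size=n))

# Columns: intercept, X, E, X*E (the interaction term).
design = np.column_stack([np.ones(n), ice_cream, warm, warm * ice_cream])
beta, *_ = np.linalg.lstsq(design, drowning, rcond=None)

slope_cool = beta[1]            # X slope within E = 0
slope_warm = beta[1] + beta[3]  # X slope within E = 1

print(f"slope when cool: {slope_cool:.2f}")
print(f"slope when warm: {slope_warm:.2f}")
```

The X slope differs between the levels of E, which is what distinguishes effect modification from confounding, where a single adjusted slope applies at every level.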

12 Topic #2: Signal/Noise Decomposition
Lovely due to geometry of least squares
Facilitates testing involving multiple parameters at once
Provides insight into R-squared

13 Signal/Noise Decomposition
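First step: decomposition of variance
  “Regression” part: Variance of the fitted values $\hat{Y}_i$
  “Error” or “Residual” part: Variance of the residuals $e_i = Y_i - \hat{Y}_i$
  Together: These determine “total” variance of the $Y_i$
“Sums of Squares” (SS) rather than variance per se
  Regression SS (SSR): $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$
  Error SS (SSE): $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$
  Total SS (SST): $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$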

14 Signal/Noise Decomposition
Properties
  SST = SSR + SSE
  SSR/SST = “proportion of variance explained” by regression = R-squared
    Follows from geometry
  SSR and SSE are independent (assuming A1-A5) and have easily characterized probability distributions
    Provides convenient testing methods
    Follows from geometry plus assumptions
[Presenter note: Do the first one from the geometry]
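The identity SST = SSR + SSE can be checked numerically for any least-squares fit that includes an intercept. The data below are simulated purely for illustration; the sample size, coefficients and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 50, 2

# Simulated predictors and outcome (illustrative values only).
X = rng.normal(size=(n, p))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=n)

# Least-squares fit with an intercept.
design = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
fitted = design @ beta

sst = np.sum((y - y.mean()) ** 2)       # total sum of squares
ssr = np.sum((fitted - y.mean()) ** 2)  # regression sum of squares
sse = np.sum((y - fitted) ** 2)         # error sum of squares
r_squared = ssr / sst

print(f"SST = {sst:.3f}, SSR + SSE = {ssr + sse:.3f}")
print(f"R-squared = {r_squared:.3f}")
```

The two printed sums agree up to floating-point error because the residual vector is orthogonal to the fitted values, which is the geometric fact the slide refers to.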

15 Signal/Noise Decomposition
SSR and SSE are independent
  Define M = span(X) and take “Y” as centered at $\bar{Y}$
  It is possible to orthogonally rotate the coordinate axes so that the first p axes $\in M$ and the remaining n-p-1 axes $\in M^{\perp}$
    Gram-Schmidt orthogonalization
  Doing this transforms Y into $TY := Z$, for some orthonormal matrix T whose rows are $\{e_1, \ldots, e_{n-1}\}$ (equivalently, the columns of $T'$)
  Distribution of Z: $N(T\,E[Y|X],\ T\,\mathrm{Var}(Y)\,T') = N(T\,E[Y|X],\ T\sigma^2 I\,T') = N(T\,E[Y|X],\ \sigma^2 I)$, since $TT' = I$

16 Signal/Noise Decomposition
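SSR and SSE are independent - continued
  $TY = Z$, so $Y = T'Z = \sum_{j=1}^{n-1} Z_j e_j$ (with Y centered at $\bar{Y}$)
  SSE = squared length of $Y - \hat{Y}$ = $\sum_{j=p+1}^{n-1} Z_j^2$
  SSR = squared length of $\hat{Y} - \bar{Y}\mathbf{1}$ = $\sum_{j=1}^{p} Z_j^2$
  Claim now follows: SSR & SSE are independent because $(Z_1, \ldots, Z_p)$ and $(Z_{p+1}, \ldots, Z_{n-1})$ are independent
  SSE expression due to $Y - \hat{Y} = \sum_{j=p+1}^{n-1} Z_j e_j$ (because the $e_j$ are orthogonal, length = 1)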
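The rotation argument can be illustrated numerically. The sketch below uses a QR factorization in place of explicit Gram-Schmidt (both produce an orthonormal basis); the simulated data and all names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 30, 2

X = rng.normal(size=(n, p))
y = 0.5 + X @ np.array([1.5, -2.0]) + rng.normal(size=n)

# Center the outcome and the predictors.
y_c = y - y.mean()
X_c = X - X.mean(axis=0)

# Complete QR of [1, X_c]: column 0 of Q is proportional to the ones vector,
# columns 1..p span M = span(X_c), and the remaining n-p-1 columns span the
# part of the centered space orthogonal to M.
Q, _ = np.linalg.qr(np.column_stack([np.ones(n), X_c]), mode="complete")
T = Q[:, 1:].T  # (n-1) x n matrix whose rows are e_1, ..., e_{n-1}
Z = T @ y_c     # rotated coordinates of the centered outcome

ssr_rot = np.sum(Z[:p] ** 2)  # squared length of the component in M
sse_rot = np.sum(Z[p:] ** 2)  # squared length of the component orthogonal to M

# Direct computation from an ordinary least-squares fit, for comparison.
design = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
fitted = design @ beta
ssr_direct = np.sum((fitted - y.mean()) ** 2)
sse_direct = np.sum((y - fitted) ** 2)

print(np.isclose(ssr_rot, ssr_direct), np.isclose(sse_rot, sse_direct))
```

Because SSR depends only on the first p rotated coordinates and SSE only on the remaining ones, independence of the coordinates carries over to the two sums of squares.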

17 Signal/Noise Decomposition
Under A1-A5, SSE, SSR and their scaled ratio have convenient distributions
  Under A1-A2: $E[Y|X] \in M$, so $E[Z_j|X] = 0$ for all $j > p$
  Recall $\{Z_1, \ldots, Z_{n-1}\}$ are mutually independent normal with variance $\sigma^2$
  Thus SSE = $\sum_{j=p+1}^{n-1} Z_j^2 = \sigma^2 \sum_{j=p+1}^{n-1} (Z_j/\sigma)^2 \sim \sigma^2 \chi^2_{n-p-1}$ under A1-A5
    (a sum of k independent squared N(0,1) variables is $\chi^2_k$)

18 Signal/Noise Decomposition
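Under A1-A5, SSE, SSR and their scaled ratio have convenient distributions
  For $j \le p$, $E[Z_j|X] \ne 0$ in general
  Exception: H0: $\beta_1 = \ldots = \beta_p = 0$
  Then SSR = $\sum_{j=1}^{p} Z_j^2 \sim \sigma^2 \chi^2_p$ under A1-A5
  and $\frac{SSR/p}{SSE/(n-p-1)} \sim F_{p,\,n-p-1}$, a ratio of $\chi^2_p/p$ to $\chi^2_{n-p-1}/(n-p-1)$ with numerator and denominator independent.
[Presenter note: pause to remark re the t distribution]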
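The F distribution of the scaled ratio under the null can be checked by simulation. In the sketch below the outcome is generated with no dependence on X, so H0 holds by construction; the sample size, number of replicates and noise scale are arbitrary assumptions. The check uses the known mean of an F distribution with denominator degrees of freedom d, which is d/(d-2) for d > 2.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, reps = 20, 2, 5000

X = rng.normal(size=(n, p))
design = np.column_stack([np.ones(n), X])
H = design @ np.linalg.pinv(design)  # hat matrix: projection onto span(design)

# Each row of Y is one simulated outcome vector generated under H0.
Y = rng.normal(loc=3.0, scale=2.0, size=(reps, n))
fitted = Y @ H  # H is symmetric, so each row equals H @ y for that replicate
ybar = Y.mean(axis=1, keepdims=True)

ssr = np.sum((fitted - ybar) ** 2, axis=1)
sse = np.sum((Y - fitted) ** 2, axis=1)
f_stats = (ssr / p) / (sse / (n - p - 1))

d = n - p - 1
print(f"simulated mean of F: {f_stats.mean():.3f}")
print(f"theoretical mean:    {d / (d - 2):.3f}")
```

The simulated mean should be close to the theoretical one. Note that the unknown variance cancels in the ratio, which is why the noise scale chosen above does not matter.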

19 Signal/Noise Decomposition
An organizational tool: The analysis of variance (ANOVA) table

SOURCE       Sum of Squares (SS)   Degrees of freedom (df)   Mean square (SS/df)
Regression   SSR                   p                         MSR = SSR/p
Error        SSE                   n-p-1                     MSE = SSE/(n-p-1)
Total        SST = SSR + SSE       n-1

F = MSR/MSE

20 “Global” hypothesis tests
These involve sets of parameters
Hypotheses of the form H0: βj = 0 for all j in a defined subset of {j=1,...,p} vs. H1: βj ≠ 0 for at least one of the j
  [Note wording of the hypothesis: all = 0 vs. any not equal to 0]
Example 1: H0: βLATITUDE = 0 and βLONGITUDE = 0 [no association between geographical location and temperature]
Example 2: H0: all polynomial or spline coefficients involving a given variable = 0 [longitude association is linear]
Example 3: H0: all coefficients involving a variable = 0

21 “Global” hypothesis tests
Testing method: Sequential decomposition of sums of squares
  Hypothesis to be tested is H0: $\beta_{j_1} = \ldots = \beta_{j_k} = 0$ in the full model
  Fit the smaller model excluding $x_{j_1}, \ldots, x_{j_k}$: save SSE = $SSE_S$
  Fit the “full” (or larger) model adding $x_{j_1}, \ldots, x_{j_k}$ to the smaller model: save SSE = $SSE_L$, often = overall SSE
  Test statistic $S = \frac{(SSE_S - SSE_L)/k}{SSE_L/(n-p-1)}$
  Distribution under null: F(k, n-p-1)
  Define rejection region based on this distribution
  Compute S
  Reject or not as S is in rejection region or not
[Presenter notes: Draw out the model on the board, box in part of it; draw the F]
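The testing recipe can be sketched in a few lines. The helper names `sse` and `partial_f`, and the simulated data in which only x1 matters, are hypothetical and chosen for illustration. The statistic divides the larger model's SSE by its error degrees of freedom, n-p-1.

```python
import numpy as np

def sse(y, design):
    """Error sum of squares from a least-squares fit of y on the design matrix."""
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(np.sum((y - design @ beta) ** 2))

def partial_f(y, design_small, design_large):
    """F statistic comparing nested models (both designs include an intercept)."""
    n = len(y)
    k = design_large.shape[1] - design_small.shape[1]  # coefficients being tested
    df_error = n - design_large.shape[1]               # n - p - 1 for the larger model
    sse_s, sse_l = sse(y, design_small), sse(y, design_large)
    return ((sse_s - sse_l) / k) / (sse_l / df_error)

rng = np.random.default_rng(5)
n = 100
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 2.0 * x1 + rng.normal(size=n)  # assumed truth: only x1 matters

ones = np.ones(n)
full = np.column_stack([ones, x1, x2, x3])

# H0: beta_1 = 0 (false in this simulation).
s_signal = partial_f(y, np.column_stack([ones, x2, x3]), full)
# H0: beta_2 = beta_3 = 0 (true in this simulation).
s_null = partial_f(y, np.column_stack([ones, x1]), full)

print(f"S for x1:       {s_signal:.1f}")
print(f"S for (x2, x3): {s_null:.2f}")
```

To complete the test, S would be compared with an upper quantile of F(k, n-p-1); if SciPy is available, `scipy.stats.f.sf(S, k, n - p - 1)` gives the corresponding p-value.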

22 Signal/Noise Decomposition
An augmented version for global testing

SOURCE        Sum of Squares (SS)   Degrees of freedom (df)   Mean square (SS/df)
Regression    SSR                   p                         SSR/p
  X1          SST-SSE_S             p1
  X2|X1       SSE_S-SSE_L           p2                        (SSE_S-SSE_L)/p2
Error         SSE_L                 n-p-1                     SSE_L/(n-p-1)
Total         SST = SSR + SSE       n-1

F = MSR(2|1)/MSE
[Presenter note: Go back and forth to geometry slide (ok to Boxing)]

23 R-squared – Another view
From last lecture: ECDF
  $\mathrm{Corr}(Y, \hat{Y})$ squared
More conventional: $R^2 = SSR/SST$
Geometry justifies why they are the same
  $\mathrm{Cov}(Y, \hat{Y}) = \mathrm{Cov}(e + \hat{Y}, \hat{Y}) = \mathrm{Cov}(e, \hat{Y}) + \mathrm{Var}(\hat{Y})$
  Covariance = inner product, so the first term = 0
A measure of precision with which regression model describes individual responses
[Presenter note: Illustrate on plot; complete the argument]
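One way to complete the argument, written out as a short derivation. It assumes the model includes an intercept, so that the residuals sum to zero and the fitted values have mean $\bar{Y}$; Cov, Var and Corr denote sample quantities with divisor n-1.

```latex
\begin{align*}
\mathrm{Cov}(Y,\hat{Y})
  &= \mathrm{Cov}(e + \hat{Y},\, \hat{Y})
   = \underbrace{\mathrm{Cov}(e,\hat{Y})}_{=\,0 \text{ since } e \,\perp\, \hat{Y}}
     + \mathrm{Var}(\hat{Y})
   = \mathrm{Var}(\hat{Y}), \\[4pt]
\mathrm{Corr}(Y,\hat{Y})^2
  &= \frac{\mathrm{Cov}(Y,\hat{Y})^2}{\mathrm{Var}(Y)\,\mathrm{Var}(\hat{Y})}
   = \frac{\mathrm{Var}(\hat{Y})}{\mathrm{Var}(Y)}
   = \frac{SSR/(n-1)}{SST/(n-1)}
   = \frac{SSR}{SST}
   = R^2 .
\end{align*}
```

The only geometric input is the orthogonality of the residual vector to the fitted values, the same fact that gives SST = SSR + SSE.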

24 Outline: A few more topics
Collinearity
Overfitting
Influence
Mediation
Multiple comparisons

25 Main points
Confounding occurs when an apparent association between a predictor and outcome reflects the association of each with a third variable
  A primary goal of regression is to “adjust” for confounding
Least squares decomposition of Y into fit and residual provides an appealing statistical testing framework
  An association of an outcome with predictors is evidenced if SS due to regression is large relative to SSE
  Geometry: orthogonal decomposition provides convenient sampling distribution, view of R²
  ANOVA

