1
Chapter 13 Nonlinear and Multiple Regression
Assessing Model Accuracy
Regression with Transformed Variables
Polynomial Regression
Multiple Regression Analysis
Other Issues in Multiple Regression
2
Multilinear Regression
Testing for linear association between a population response variable Y and multiple predictor variables X1, X2, X3, etc. "Response = Model + Error": Y = β0 + β1X1 + β2X2 + ⋯ + βkXk + ε, where the βiXi terms are the "main effects." For now, assume the "additive model," i.e., main effects only.
3
Multilinear Regression
[Figure: regression plane over the (X1, X2) predictor space, marking the true response yi at (x1i, x2i), the fitted response, and the residual between them.]
Least Squares calculation of the regression coefficients is computer-intensive; the formulas require Linear Algebra (matrices)! Once calculated, how do we then test the null hypothesis? ANOVA.
4
Multilinear Regression
Main-effects ("additive") model. R code example: lsreg = lm(y ~ x1 + x2 + x3)
5
Multilinear Regression
Adding quadratic terms, etc. ("polynomial regression"). R code example (powers must be wrapped in I() inside an R formula): lsreg = lm(y ~ x + I(x^2) + I(x^3))
6
Multilinear Regression
Adding "interactions" between predictors. R code examples (equivalent): lsreg = lm(y ~ x1*x2) or lsreg = lm(y ~ x1 + x2 + x1:x2)
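A minimal, self-contained R sketch tying the formula styles above together; the data are simulated purely for illustration, and the variable names and coefficient values are not from the slides.

# Simulated data for illustration only
set.seed(1)
n  <- 50
x1 <- runif(n); x2 <- runif(n)
y  <- 1 + 2*x1 - 3*x2 + 4*x1*x2 + rnorm(n, sd = 0.5)
fit.main <- lm(y ~ x1 + x2)          # main effects only (additive model)
fit.int  <- lm(y ~ x1*x2)            # same as y ~ x1 + x2 + x1:x2
fit.poly <- lm(y ~ x1 + I(x1^2))     # quadratic term protected by I()
summary(fit.int)                     # t-tests for each coefficient
anova(fit.main, fit.int)             # F-test: does the interaction term help?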
11
Recall… Multiple Linear Regression with interaction, using an indicator ("dummy") variable. Suppose these are actually two subgroups (I = 1 and I = 0), requiring two distinct linear regressions! Example in R (reformatted for brevity):
> I = c(1,1,1,1,1,0,0,0,0,0)
> lsreg = lm(y ~ x*I)
> summary(lsreg)
Coefficients:
            Estimate
(Intercept)    …
x              …
I              …
x:I            …
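A hedged R sketch of this two-subgroup idea, with simulated data (the values, sample sizes, and use of I as a variable name are illustrative only; naming a vector I shadows R's I() function, though the fit still works):

# Simulated data: two subgroups with different regression lines
set.seed(2)
x <- runif(20)
I <- rep(c(1, 0), each = 10)          # indicator ("dummy") variable
y <- 2 + 3*x + I*(1 - 2*x) + rnorm(20, sd = 0.2)
fit <- lm(y ~ x*I)                    # one model: the x:I coefficient is the slope difference
summary(fit)
# Equivalent pair of separate regressions, one per subgroup
# (same fitted lines; standard errors differ since the single model pools the residual variance):
coef(lm(y[I == 1] ~ x[I == 1]))       # line for the I = 1 subgroup
coef(lm(y[I == 0] ~ x[I == 0]))       # line for the I = 0 subgroup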
12
ANOVA Table (revisited)
From a sample of n data points, we test the overall null hypothesis H0: β1 = β2 = ⋯ = βk = 0. Note that if H0 were true, then it would follow that the model reduces to Y = β0 + ε, i.e., no predictor has a linear association with the response. But how are these regression coefficients calculated in general? Via the "Normal equations," solved by computer (intensive).
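For reference, the textbook matrix form of those normal equations (standard least-squares algebra, not specific to these slides), where X is the n × (k+1) design matrix with a leading column of ones and y is the response vector:

\[
(X^{\mathsf T}X)\,\hat{\boldsymbol\beta} \;=\; X^{\mathsf T}\mathbf{y}
\qquad\Longrightarrow\qquad
\hat{\boldsymbol\beta} \;=\; (X^{\mathsf T}X)^{-1}X^{\mathsf T}\mathbf{y},
\]

assuming the columns of X are linearly independent so that the inverse exists.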
13
ANOVA Table (revisited)
(based on n data points and k predictors)

Source       df          SS       MS                     F             p-value
Regression   k           SSReg    MSReg = SSReg / k      MSReg / MSE   P(F(k, n−k−1) > F)
Error        n − k − 1   SSE      MSE = SSE / (n−k−1)
Total        n − 1       SSTot

*** How are only the statistically significant variables determined? ***
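In R, this overall F-test is reported at the bottom of summary(), and anova() further splits the Regression sum of squares term by term. A minimal sketch, assuming the lsreg fit from the earlier slides:

lsreg <- lm(y ~ x1 + x2 + x3)   # full model, as on the earlier slides
summary(lsreg)                  # overall F-test of H0: beta1 = beta2 = beta3 = 0
anova(lsreg)                    # sequential SS, df, MS, F, p-value for each term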
14
“MODEL SELECTION”(BE)
Step 0. Conduct an overall F-test of significance (via ANOVA) of the full model, Y on X1, X2, X3, X4. If significant, then…
Step 1. t-tests of the individual coefficients, giving p-values p1 < p2 < p4 < .05 ≤ p3: Reject H0 for X1, X2, X4; fail to reject ("Accept") H0 for X3.
Step 2. Are all coefficients significant at level α? If not…
15
“MODEL SELECTION”(BE)
(continued) If some coefficient is not significant at level α, delete that term (here X3), leaving X1, X2, X4 in the model…
16
“MODEL SELECTION”(BE)
(continued) …delete that term and recompute new coefficients for the reduced model Y on X1, X2, X4.
Step 3. Repeat Steps 1-2 as necessary until all remaining coefficients are significant → reduced model. (A sketch of this procedure in R follows below.)
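A hedged R sketch of the backward-elimination ("BE") procedure described above, dropping the least significant term at each pass until every remaining coefficient has p < alpha. R's built-in step() selects by AIC instead, so the p-value loop is written out by hand; the data frame dat and the variable names are illustrative.

# Backward elimination by coefficient p-values (assumes numeric predictors,
# so coefficient names match the terms in the formula; factors would need drop1()).
backward.eliminate <- function(formula, dat, alpha = 0.05) {
  fit <- lm(formula, data = dat)
  repeat {
    cf <- summary(fit)$coefficients
    p  <- setNames(cf[, "Pr(>|t|)"], rownames(cf))
    p  <- p[names(p) != "(Intercept)"]
    if (length(p) == 0 || max(p) < alpha) return(fit)        # Step 2: all significant, stop
    worst <- names(which.max(p))                              # least significant term
    fit <- update(fit, as.formula(paste(". ~ . -", worst)))   # delete it, recompute (Step 3)
  }
}
# Example call (illustrative data frame 'dat'):
# reduced <- backward.eliminate(y ~ x1 + x2 + x3 + x4, dat)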
21
Re-plot data on a “log-log” scale.
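What the "log-log" re-plot buys you, as a hedged R sketch with illustrative power-law data: a power law y = a·x^b becomes a straight line, log y = log a + b·log x, so ordinary least squares applies on the transformed scale.

# Illustrative power-law data: y = 2 * x^1.5 with multiplicative noise
set.seed(3)
x <- runif(40, 1, 10)
y <- 2 * x^1.5 * exp(rnorm(40, sd = 0.1))
plot(log(x), log(y))          # log-log re-plot: approximately linear
fit <- lm(log(y) ~ log(x))    # slope estimates b, intercept estimates log(a)
coef(fit)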
24
Re-plot data on a "log" scale (of Y only).
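Similarly, a hedged sketch for re-plotting log(Y) against X (illustrative exponential-growth data): y = a·e^(bx) becomes linear, log y = log a + b·x.

# Illustrative exponential data: y = 3 * exp(0.4 * x) with multiplicative noise
set.seed(4)
x <- seq(0, 10, length.out = 40)
y <- 3 * exp(0.4 * x) * exp(rnorm(40, sd = 0.1))
plot(x, log(y))               # semi-log re-plot: approximately linear
fit <- lm(log(y) ~ x)         # slope estimates b, intercept estimates log(a)
coef(fit)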
25
Binary outcome, e.g., “Have you ever had surgery?” (Yes / No)
27
“MAXIMUM LIKELIHOOD ESTIMATION”
Binary outcome, e.g., "Have you ever had surgery?" (Yes / No). Model the "log-odds" ("logit") of p = P(Yes): ln[p / (1 − p)] = β0 + β1X1 + ⋯ + βkXk, an example of a general "link function." The coefficients are fit by "MAXIMUM LIKELIHOOD ESTIMATION." (Note: not based on LS, which implies "pseudo-R2," etc.)
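A minimal R sketch of fitting such a logistic regression by maximum likelihood with glm(); the data frame surg and the predictors age and smoker are illustrative names, not from the slides.

# surgery coded 0/1 (No/Yes) in an illustrative data frame 'surg'
fit <- glm(surgery ~ age + smoker, data = surg, family = binomial)  # logit link by default
summary(fit)       # MLE coefficients with Wald z-tests (no ordinary R-squared here)
exp(coef(fit))     # exponentiated coefficients = odds ratios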
28
Binary outcome, e.g., “Have you ever had surgery?” (Yes / No)
“log-odds” (“logit”) Suppose one of the predictor variables is binary… SUBTRACT!
32
Binary outcome, e.g., “Have you ever had surgery?” (Yes / No)
"log-odds" ("logit"): suppose one of the predictor variables is binary… SUBTRACT the two logits (X = 1 minus X = 0): what remains is that variable's coefficient, which implies ODDS RATIO = e^coefficient (derivation below).
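The subtraction written out (standard logistic-regression algebra; β denotes the coefficient of the binary predictor X, with all other predictors held fixed):

\[
\ln\frac{p_{X=1}}{1-p_{X=1}} \;-\; \ln\frac{p_{X=0}}{1-p_{X=0}}
\;=\; (\beta_0 + \beta\cdot 1 + \cdots) - (\beta_0 + \beta\cdot 0 + \cdots) \;=\; \beta,
\]
so the odds ratio comparing X = 1 to X = 0 is
\[
\mathrm{OR} \;=\; \frac{p_{X=1}/(1-p_{X=1})}{p_{X=0}/(1-p_{X=0})} \;=\; e^{\beta}.
\]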
33
Logistic growth in population dynamics
Unrestricted population growth (e.g., bacteria): population size y obeys the law dy/dt = a·y, with constant a > 0 → exponential growth.
Restricted population growth (disease, predation, starvation, etc.): population size y obeys the law dy/dt = a·y·(1 − y/M), with constant a > 0 and "carrying capacity" M → logistic growth.
Both with initial condition y(0) = y0. Let survival probability = …
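The two growth laws with their standard solutions (textbook results, stated here for reference with initial population y0 = y(0)):

\[
\text{Exponential growth:}\quad \frac{dy}{dt} = a\,y,\;\; a>0
\quad\Longrightarrow\quad y(t) = y_0\,e^{a t},
\]
\[
\text{Logistic growth:}\quad \frac{dy}{dt} = a\,y\!\left(1-\frac{y}{M}\right)
\quad\Longrightarrow\quad y(t) = \frac{M}{1 + \dfrac{M-y_0}{y_0}\,e^{-a t}},
\]
which rises toward the carrying capacity M as t → ∞; the solution curve is the familiar S-shaped logistic function.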
34
Summary ~ Categorical (Qualitative)
e.g., Income Level: Low, Mid, High. Two or more categories per each of two variables I (rows 1, 2, …, r) and J (columns 1, 2, …, c), laid out in an r × c contingency table.
H0: "There is no association between (the categories of) I and (the categories of) J."
Chi-squared Tests:
- Test of Independence (1 population, 2 categorical variables)
- Test of Homogeneity (2 populations, 1 categorical variable)
- "Goodness-of-Fit" Test (1 population, 1 categorical variable)
Modifications:
- McNemar Test for paired 2 × 2 categorical data, to control for "confounding variables" (e.g., case-control studies)
- Fisher's Exact Test for small "expected values" (< 5), to avoid possible "spurious significance"
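For reference, a minimal R sketch of these tests; the 2 × 2 counts and the goodness-of-fit proportions below are purely illustrative.

tab <- matrix(c(20, 15, 10, 30), nrow = 2)    # illustrative 2 x 2 contingency table
chisq.test(tab)                               # test of independence / homogeneity
fisher.test(tab)                              # Fisher's exact test (small expected counts)
mcnemar.test(tab)                             # McNemar test for paired 2 x 2 data
chisq.test(c(18, 25, 17), p = c(1, 1, 1)/3)   # goodness-of-fit against stated proportions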
35
Summary ~ Numerical (Quantitative) e.g., $ Annual Income
2 POPULATIONS, H0: μ1 = μ2
Independent samples (e.g., RCT; Sample 1 vs. Sample 2 with standard deviations σ1, σ2):
- Normally distributed? Check via Q-Q plots, Shapiro-Wilk, Anderson-Darling, others…
- If Yes: Equivariance? (F-test, Bartlett, others…)
  - Yes: 2-sample T (w/ pooling)
  - No: 2-sample T (w/o pooling): Satterthwaite, Welch
- If No: "Nonparametric Tests": Wilcoxon Rank Sum (aka Mann-Whitney U), "Approximate" T
Paired (matched) samples (e.g., Pre- vs. Post-):
- If normal: Paired T
- If not: "Nonparametric Tests": Sign Test, Wilcoxon Signed Rank
MORE THAN 2 POPULATIONS:
- ANOVA F-test (w/ "repeated measures" or "blocking" for matched designs)
- Nonparametric counterparts: Kruskal-Wallis (one-way); Friedman, Kendall's W, others… (repeated measures)
- Regression Methods; various modifications
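And a minimal R sketch of the two-sample branch of this flowchart; the vectors x1, x2 and the paired vectors pre, post are illustrative names.

shapiro.test(x1); shapiro.test(x2)       # normality checks (also qqnorm() for Q-Q plots)
var.test(x1, x2)                         # F-test of equal variances
t.test(x1, x2, var.equal = TRUE)         # 2-sample T with pooling (normal, equal variances)
t.test(x1, x2)                           # Welch / Satterthwaite T (unequal variances)
wilcox.test(x1, x2)                      # Wilcoxon rank-sum (Mann-Whitney U)
t.test(pre, post, paired = TRUE)         # paired T
wilcox.test(pre, post, paired = TRUE)    # Wilcoxon signed-rank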