
CHAPTER 7 Linear Correlation & Regression Methods
7.1 - Motivation
7.2 - Correlation / Simple Linear Regression
7.3 - Extensions of Simple Linear Regression

Parameter Estimation via SAMPLE DATA … Testing for association between two POPULATION variables X and Y. Categorical variables → Chi-squared Test. Examples: X = Disease status (D+, D–) vs. Y = Exposure status (E+, E–); X = # children in household (0, 1-2, 3-4, 5+) vs. Y = Income level (Low, Middle, High). Numerical variables → ??? PARAMETERS: Means (μ_X, μ_Y), Variances (σ_X², σ_Y²), Covariance (σ_XY).

Parameter Estimation via SAMPLE DATA … Numerical variables. PARAMETERS: Means μ_X, μ_Y; Variances σ_X², σ_Y²; Covariance σ_XY. STATISTICS (sample estimates): Means x̄, ȳ; Variances s_X², s_Y²; Covariance s_XY (can be +, –, or 0).

Parameter Estimation via SAMPLE DATA … Sample of n data points (x_1, y_1), (x_2, y_2), …, (x_n, y_n), displayed as a scatterplot of Y vs. X (JAMA. 2003;290:). PARAMETERS: Means, Variances, Covariance. STATISTICS (sample estimates; the sample covariance can be +, –, or 0). Does this suggest a linear trend between X and Y? If so, how do we measure it?

Testing for LINEAR association between two population variables X and Y… Numerical variables → ??? PARAMETERS: Means, Variances, Covariance, and the Linear Correlation Coefficient ρ = σ_XY / (σ_X σ_Y), always between –1 and +1.

Parameter Estimation via SAMPLE DATA … Sample of n data points (x_1, y_1), …, (x_n, y_n); scatterplot of Y vs. X (JAMA. 2003;290:). PARAMETERS: means, variances, covariance. STATISTICS (sample estimates; covariance can be +, –, or 0), plus the sample Linear Correlation Coefficient r = s_XY / (s_X s_Y), always between –1 and +1. Example in R (reformatted for brevity):
> pop = seq(0, 20, 0.1)
> x = sort(sample(pop, 10))    # n = 10
> y = sample(pop, 10)
> plot(x, y, pch = 19)
> c(mean(x), mean(y))
> var(x)
> var(y)
> cov(x, y)
> cor(x, y)

Parameter Estimation via SAMPLE DATA … Sample of n data points (x_1, y_1), …, (x_n, y_n); scatterplot of Y vs. X (JAMA. 2003;290:). Linear Correlation Coefficient r: always between –1 and +1. r measures the strength of LINEAR association; r > 0 indicates positive linear correlation, r < 0 indicates negative linear correlation. In R it is computed with cor(x, y).

Testing for LINEAR association between two numerical population variables X and Y… Linear Correlation Coefficient. Now that we have r, we can conduct HYPOTHESIS TESTING on ρ: H_0: ρ = 0 vs. H_A: ρ ≠ 0. Test statistic for the p-value: t = r √(n – 2) / √(1 – r²) ~ t_{n–2}. Here (n = 10, r ≈ –.72) t = –2.935 on 8 df, so p-value = 2 * pt(-2.935, 8) = .0189 < .05.
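A minimal R sketch of the same test, assuming the simulated x and y from the earlier slide; cor.test() reproduces the t-statistic and two-sided p-value computed by hand above:
> r = cor(x, y)
> n = length(x)
> t.stat = r * sqrt(n - 2) / sqrt(1 - r^2)    # test statistic for H0: rho = 0
> 2 * pt(-abs(t.stat), df = n - 2)            # two-sided p-value
> cor.test(x, y)                              # same result in one call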

SIMPLE LINEAR REGRESSION via the METHOD OF LEAST SQUARES. If such a linear association between X and Y exists, then for some intercept β_0 and slope β_1 we have "Response = Model + Error": Y = β_0 + β_1 X + ε. Find estimates β̂_0 and β̂_1 for the "best" line ŷ = β̂_0 + β̂_1 x. "Best" in what sense? The "Least Squares Regression Line" is the one that minimizes the sum of squared residuals, SS_Err = Σ (y_i – ŷ_i)².
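As a reminder of what the minimization yields (standard least squares results, not shown explicitly in this transcript), the estimates can be computed directly and checked against lm():
> b1 = cov(x, y) / var(x)          # slope estimate: b1 = s_XY / s_X^2
> b0 = mean(y) - b1 * mean(x)      # intercept estimate: b0 = ybar - b1 * xbar
> c(b0, b1)
> coef(lm(y ~ x))                  # same values from R's least squares fit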

SIMPLE LINEAR REGRESSION via the METHOD OF LEAST SQUARES. For each data point: predictor x_i, observed response y_i, fitted response ŷ_i = β̂_0 + β̂_1 x_i, and residual e_i = y_i – ŷ_i. The estimates β̂_0 and β̂_1 are the values that minimize SS_Err = Σ e_i². (~EXERCISE~: compute the fitted responses and residuals for the sample data.)

Testing for LINEAR association between two numerical population variables X and Y… Linear Regression Coefficients. "Response = Model + Error": Y = β_0 + β_1 X + ε. Now that we have β̂_0 and β̂_1, we can conduct HYPOTHESIS TESTING on β_0 and β_1, e.g., H_0: β_1 = 0 (no linear association). Test statistic for the p-value?

Testing for LINEAR association between two numerical population variables X and Y… Linear Regression Coefficients. Now that we have β̂_0 and β̂_1, we can conduct HYPOTHESIS TESTING on β_0 and β_1. Test statistic for the p-value: t = β̂_1 / SE(β̂_1) ~ t_{n–2} under H_0: β_1 = 0. This yields the same t-score as the test of H_0: ρ = 0, so again p-value = .0189.

> plot(x, y, pch = 19)
> lsreg = lm(y ~ x)    # or lsfit(x, y)
> abline(lsreg)
> summary(lsreg)
Call: lm(formula = y ~ x)
Residuals: Min 1Q Median 3Q Max
Coefficients:
             Estimate   Std. Error   t value   Pr(>|t|)
(Intercept)      …           …          …          …    ***
x                …           …       -2.935      0.0189 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: … on 8 degrees of freedom
Multiple R-squared: 0.5185, Adjusted R-squared: …
F-statistic: … on 1 and 8 DF, p-value: 0.0189
BUT WHY HAVE TWO METHODS FOR THE SAME PROBLEM??? Because this second method generalizes…

ANOVA Table (recall, one-way ANOVA):
Source      df      SS      MS      F-ratio      p-value
Treatment
Error
Total                       –

ANOVA Table for the simple linear regression (df shown; SS, MS, and F are filled in below):
Source        df      SS      MS      F-ratio      p-value
Regression     1
Error          8
Total          9              –

Parameter Estimation via SAMPLE DATA … STATISTICS (sample means and variances of the n data points; scatterplot, JAMA. 2003;290:). Decomposing the variability:
SS_Tot = Σ (y_i – ȳ)² is a measure of the total amount of variability in the observed responses (i.e., before any model-fitting).
SS_Reg = Σ (ŷ_i – ȳ)² is a measure of the total amount of variability in the fitted responses (i.e., after model-fitting).
SS_Err = Σ (y_i – ŷ_i)² is a measure of the total amount of variability in the resulting residuals (i.e., after model-fitting).

SIMPLE LINEAR REGRESSION via the METHOD OF LEAST SQUARES (~EXERCISE~, continued): with predictor x_i, observed response y_i, fitted response ŷ_i, and residuals e_i, the exercise data give SS_Tot = (n – 1) s_Y² = 9 × s_Y² = 204.2, and the decomposition SS_Tot = SS_Reg + SS_Err holds, with SS_Err at its minimum for the least squares line.
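A brief R sketch of that decomposition, assuming the fitted object lsreg = lm(y ~ x) from above; the three sums of squares satisfy SS_Tot = SS_Reg + SS_Err up to rounding:
> y.hat = fitted(lsreg)
> SS.Tot = sum((y - mean(y))^2)       # total variability of the observed responses
> SS.Reg = sum((y.hat - mean(y))^2)   # variability captured by the fitted line
> SS.Err = sum(residuals(lsreg)^2)    # leftover variability in the residuals
> c(SS.Tot, SS.Reg + SS.Err)          # the two should agree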

ANOVA Table (general form):
Source        df       SS        MS                          F-ratio                p-value
Regression    k – 1    SS_Reg    MS_Reg = SS_Reg/(k – 1)     F = MS_Reg / MS_Err    0 < p < 1 (from F_{k–1, n–k})
Error         n – k    SS_Err    MS_Err = SS_Err/(n – k)
Total         n – 1    SS_Tot    –
For our exercise this F-test gives the same p-value as before (.0189). In R:
> summary(aov(lsreg))
            Df   Sum Sq   Mean Sq   F value   Pr(>F)
x            1                                 0.0189 *
Residuals    8

Coefficient of Determination: r² = SS_Reg / SS_Tot = 0.5185. The least squares regression line accounts for 51.85% of the total variability in the observed response, with 48.15% remaining. This is the quantity reported by summary(lsreg) as "Multiple R-squared," and for simple linear regression it equals the square of the sample correlation, cor(x, y)².
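A quick check in R, assuming lsreg = lm(y ~ x) as above; both lines below print the same number (approximately 0.5185 for the exercise data):
> cor(x, y)^2                 # squared sample correlation
> summary(lsreg)$r.squared    # coefficient of determination from the fit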

Summary of Linear Correlation and Simple Linear Regression. Given a sample (x_1, y_1), …, (x_n, y_n) with means, variances, and covariance (scatterplot of Y vs. X, JAMA. 2003;290:):
Linear Correlation Coefficient: –1 ≤ r ≤ +1; r measures the strength of LINEAR association.
Least Squares Regression Line: ŷ = β̂_0 + β̂_1 x minimizes SS_Err = SS_Tot – SS_Reg (ANOVA).
Coefficient of Determination: r² = SS_Reg / SS_Tot, the proportion of total variability modeled by the regression line's variability.
All point estimates can be upgraded to confidence intervals for hypothesis testing, etc., including upper and lower 95% confidence bands around the fitted line (see notes for "95% prediction intervals").
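A short R sketch of those interval estimates, assuming lsreg = lm(y ~ x) as above; predict() gives the 95% confidence band for the mean response and the wider 95% prediction band for a new observation:
> new.x = data.frame(x = seq(min(x), max(x), length.out = 50))
> conf.band = predict(lsreg, newdata = new.x, interval = "confidence")   # 95% confidence band
> pred.band = predict(lsreg, newdata = new.x, interval = "prediction")   # 95% prediction band
> plot(x, y, pch = 19)
> abline(lsreg)
> lines(new.x$x, conf.band[, "lwr"], lty = 2); lines(new.x$x, conf.band[, "upr"], lty = 2)
> lines(new.x$x, pred.band[, "lwr"], lty = 3); lines(new.x$x, pred.band[, "upr"], lty = 3)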

Multilinear Regression: Testing for LINEAR association between a population response variable Y and multiple predictor variables X_1, X_2, X_3, … etc. "Response = Model + Error": Y = β_0 + β_1 X_1 + β_2 X_2 + β_3 X_3 + … + ε, where β_1, β_2, β_3, … are the "main effects." For now, assume the "additive model," i.e., main effects only.

Multilinear Regression with two predictors X_1 and X_2: each observation has predictors (x_1i, x_2i), a true response y_i, a fitted response ŷ_i on the regression plane, and a residual y_i – ŷ_i. Least Squares calculation of the regression coefficients is computer-intensive; the formulas require Linear Algebra (matrices)! Once calculated, how do we then test the null hypothesis? ANOVA.
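A minimal sketch of that matrix calculation with made-up illustrative data (x1, x2, y are assumptions, not the slides' data); the closed-form least squares solution is b = (XᵀX)⁻¹Xᵀy, which lm() computes internally by a numerically stabler route:
> x1 = c(1, 2, 3, 4, 5); x2 = c(2, 1, 4, 3, 5); y = c(3, 4, 8, 7, 11)
> X = cbind(1, x1, x2)                    # design matrix with intercept column
> b = solve(t(X) %*% X, t(X) %*% y)       # solve the normal equations (X'X) b = X'y
> b
> coef(lm(y ~ x1 + x2))                   # same coefficients from lm()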

Multilinear Regression, "Response = Model + Error," in R:
Main effects only: lsreg = lm(y ~ x1 + x2 + x3)
Quadratic terms, etc. ("polynomial regression"): lsreg = lm(y ~ x + I(x^2) + I(x^3)) — the I() wrapper is needed so that ^ is treated arithmetically rather than as formula syntax.
"Interactions": lsreg = lm(y ~ x1 + x2 + x1:x2), or equivalently lsreg = lm(y ~ x1*x2)

Recall… Suppose these are actually two subgroups (I = 1 and I = 0), requiring two distinct linear regressions! Multiple linear regression with an interaction with an indicator ("dummy") variable. Example in R (reformatted for brevity):
> I = c(1,1,1,1,1,0,0,0,0,0)
> lsreg = lm(y ~ x*I)
> summary(lsreg)
Coefficients: Estimates for (Intercept), x, I, and x:I.
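A sketch of how those four coefficients translate into the two group-specific lines (standard dummy-variable algebra; the coefficient values themselves are not reproduced from the slides):
> b = coef(lsreg)
> line0 = c(intercept = unname(b["(Intercept)"]), slope = unname(b["x"]))                     # subgroup with I = 0
> line1 = c(intercept = unname(b["(Intercept)"] + b["I"]), slope = unname(b["x"] + b["x:I"])) # subgroup with I = 1
> line0; line1
A significant x:I coefficient indicates that the two subgroups have different slopes, so a single common regression line is not adequate.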

ANOVA Table (revisited). From a sample of n data points… How are these regression coefficients calculated in general? Via the "normal equations," which are solved via computer (intensive). Note that if the null hypothesis were true, then it would follow that …

ANOVA Table (based on n data points):
Source        df      SS      MS      F      p-value
Regression
Error
Total
*** How are only the statistically significant variables determined? ***

"MODEL SELECTION" (Backward Elimination, BE)
Step 0. Conduct an overall F-test of significance (via ANOVA) of the full model Y = β_0 + β_1 X_1 + β_2 X_2 + β_3 X_3 + β_4 X_4 + … If significant, then…
Step 1. t-tests on the individual coefficients; p-values: p_1 < .05 (Reject H_0), p_2 < .05 (Reject H_0), p_3 ≥ .05 (Accept H_0), p_4 < .05 (Reject H_0), …
Step 2. Are all coefficients significant at level α? If not, delete that term (here X_3) and recompute new coefficients!
Step 3. Repeat 1-2 as necessary until all coefficients are significant → reduced model Y = β_0 + β_1 X_1 + β_2 X_2 + β_4 X_4 + …
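A sketch of backward elimination driven by t-test p-values, using made-up illustrative data (the data frame dat and variables x1–x4 are assumptions, not the slides' data):
> set.seed(1)
> dat = data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50), x4 = rnorm(50))
> dat$y = 2 + 1.5*dat$x1 - 2*dat$x2 + 0.8*dat$x4 + rnorm(50)
> fit = lm(y ~ x1 + x2 + x3 + x4, data = dat)            # Step 0: fit (and F-test) the full model
> alpha = 0.05
> repeat {
+   p = summary(fit)$coefficients[, "Pr(>|t|)"][-1]       # Step 1: t-test p-values (intercept excluded)
+   if (all(p < alpha)) break                             # Step 2: all significant? keep this model
+   worst = names(which.max(p))                           # otherwise mark the least significant term...
+   fit = update(fit, as.formula(paste(". ~ . -", worst)))  # ...delete it and recompute (Step 3)
+ }
> summary(fit)                                            # reduced model
R's built-in step() function automates a similar search, but it ranks candidate models by AIC rather than by individual p-values.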

11 22 kk = = = H0:H0: k  2 independent, equivariant, normally-distributed “treatment groups” Recall ~

Re-plot data on a “log-log” scale.

Re-plot data on a "log" scale (of Y only).
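A sketch of what those re-plots correspond to in R, with made-up data: if Y follows a power law in X, the "log-log" plot is linear; if Y grows exponentially in X, only log(Y) needs transforming.
> x = seq(1, 10, by = 0.5)
> y.power = 3 * x^2 * exp(rnorm(length(x), sd = 0.1))           # power-law relationship
> y.exp = 2 * exp(0.5 * x) * exp(rnorm(length(x), sd = 0.1))    # exponential relationship
> plot(log(x), log(y.power), pch = 19)    # "log-log" scale: straight line, slope = power
> lm(log(y.power) ~ log(x))
> plot(x, log(y.exp), pch = 19)           # "log" scale of Y only: straight line, slope = growth rate
> lm(log(y.exp) ~ x)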

Binary outcome, e.g., “Have you ever had surgery?” (Yes / No)

"MAXIMUM LIKELIHOOD ESTIMATION": the "log-odds" ("logit") is an example of a general "link function." (Note: since the fit is not based on Least Squares, measures such as "pseudo-R²" are used instead.)

Binary outcome, e.g., "Have you ever had surgery?" (Yes / No). Suppose one of the predictor variables is binary… Write out the "log-odds" ("logit") separately for the two levels of that predictor, then SUBTRACT: the difference of the two log-odds is the coefficient of the binary predictor, which implies that exponentiating that coefficient gives the corresponding odds ratio.
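A sketch of a logistic regression with a binary predictor, using made-up data (the variable names surgery, exposed, and age are assumptions for illustration); exp() of the fitted coefficient is the odds ratio described above:
> set.seed(2)
> exposed = rbinom(200, 1, 0.5)                              # binary predictor
> age = rnorm(200, 50, 10)                                   # numerical predictor
> p = plogis(-3 + 0.9*exposed + 0.05*age)                    # true model on the logit scale
> surgery = rbinom(200, 1, p)                                # binary outcome (Yes = 1 / No = 0)
> fit = glm(surgery ~ exposed + age, family = binomial)      # maximum likelihood fit
> summary(fit)
> exp(coef(fit)["exposed"])     # odds ratio for the binary predictor, adjusted for age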

In population dynamics: Unrestricted population growth (e.g., bacteria): population size y obeys the law dy/dt = a y with constant a > 0 and initial condition y(0) = y_0 → Exponential growth. Restricted population growth (disease, predation, starvation, etc.): y obeys dy/dt = a y (1 – y/M) with constant a > 0 and "carrying capacity" M → Logistic growth. Let the survival probability π follow the same logistic (sigmoidal) form.
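For reference, a brief write-up of the two growth laws and their solutions (standard results, reconstructed here rather than copied from the slides), and the resulting sigmoidal form that motivates the logistic model for a probability π:
\frac{dy}{dt} = a\,y, \quad y(0) = y_0 \;\Longrightarrow\; y(t) = y_0\,e^{a t} \quad \text{(exponential growth)}
\frac{dy}{dt} = a\,y\left(1 - \frac{y}{M}\right), \quad y(0) = y_0 \;\Longrightarrow\; y(t) = \frac{M}{1 + \frac{M - y_0}{y_0}\,e^{-a t}} \quad \text{(logistic growth)}
\pi(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}, \qquad \text{logit}(\pi) = \ln\frac{\pi}{1 - \pi} = \beta_0 + \beta_1 x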