
Sections 6.7, 6.8, 7.7 (Note: The approach used here to present the material in these sections is substantially different from the approach used in the textbook.)

Recall: If X and Y are random variables with $E(X) = \mu_X$, $E(Y) = \mu_Y$, $Var(X) = \sigma_X^2$, $Var(Y) = \sigma_Y^2$, and $Cov(X,Y) = \rho\,\sigma_X\sigma_Y$, then the least squares line for predicting Y from X is

$$y = \mu_Y + \rho\,\frac{\sigma_Y}{\sigma_X}(x - \mu_X), \qquad\text{that is,}\qquad y = \underbrace{\mu_Y - \rho\,\frac{\sigma_Y}{\sigma_X}\,\mu_X}_{a} + \underbrace{\rho\,\frac{\sigma_Y}{\sigma_X}}_{b}\,x .$$

The least squares line is derived in Section 4.2 by minimizing $E\{[Y - (a + bX)]^2\}$.

Consider a set of observed data $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. Imagine that we treat this data as describing a joint p.m.f. for two random variables X and Y, where each point is assigned a probability of $1/n$.

Then we see that

$\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$ plays the role of $E(X) = \mu_X$,

$\bar{y} = \dfrac{1}{n}\sum_{i=1}^{n} y_i$ plays the role of $E(Y) = \mu_Y$,

$\dfrac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \dfrac{n-1}{n}\,s_x^2$ plays the role of $Var(X) = \sigma_X^2$,

$\dfrac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2 = \dfrac{n-1}{n}\,s_y^2$ plays the role of $Var(Y) = \sigma_Y^2$, and

$\dfrac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = \dfrac{n-1}{n}\,c$ plays the role of $Cov(X,Y) = \rho\,\sigma_X\sigma_Y$.

We define the sample covariance to be

$$c = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n - 1}\,,$$

and we define the sample correlation to be $r = \dfrac{c}{s_x s_y}$. Consequently, the least squares line for predicting Y from X is

$$\hat{y} = \bar{y} + r\,\frac{s_y}{s_x}(x - \bar{x}), \qquad\text{that is,}\qquad \hat{y} = \underbrace{\bar{y} - r\,\frac{s_y}{s_x}\,\bar{x}}_{a} + \underbrace{r\,\frac{s_y}{s_x}}_{b}\,x .$$

This least squares line minimizes $\sum_{i=1}^{n}[y_i - (a + b x_i)]^2$. The sample correlation r is a measure of the strength and direction of a linear relationship for the sample, in the same way that the correlation $\rho$ is a measure of the strength and direction of a linear relationship for the two random variables X and Y.
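These sample formulas translate directly into code. The following sketch (our own illustration, not part of the original slides; the function and variable names are ours) computes c, r, and the least squares line for arbitrary data vectors:

```python
import numpy as np

def least_squares_line(x, y):
    """Sample covariance, correlation, and least squares line, as defined above."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    c = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)  # sample covariance
    s_x, s_y = x.std(ddof=1), y.std(ddof=1)                # sample standard deviations
    r = c / (s_x * s_y)                                    # sample correlation
    b = r * s_y / s_x                                      # slope
    a = y.mean() - b * x.mean()                            # intercept
    return a, b, r
```

For any data set, a and b here agree with the coefficients returned by np.polyfit(x, y, 1), since both minimize the same sum of squared residuals.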

[Figure: a grid of scatter plots illustrating sample correlations — r = +1, r close to +1, and r positive; r = −1, r close to −1, and r negative; and r close to 0.]

Suppose $Y_1, Y_2, \ldots, Y_n$ are independent with respective $N(\mu_1, \sigma^2), N(\mu_2, \sigma^2), \ldots, N(\mu_n, \sigma^2)$ distributions. Let $x_1, x_2, \ldots, x_n$ be fixed values not all equal, and suppose that for $i = 1, 2, \ldots, n$, $\mu_i = \beta_0 + \beta_1 x_i$. Then the joint p.d.f. of $Y_1, Y_2, \ldots, Y_n$ is

$$\prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}\,\exp\!\left[-\,\frac{[y_i - (\beta_0 + \beta_1 x_i)]^2}{2\sigma^2}\right] = \frac{1}{\sigma^n (2\pi)^{n/2}}\,\exp\!\left[-\,\frac{\sum_{i=1}^{n}[y_i - (\beta_0 + \beta_1 x_i)]^2}{2\sigma^2}\right]$$

for $-\infty < y_1 < \infty$, $-\infty < y_2 < \infty$, ..., $-\infty < y_n < \infty$.

If we treat this joint p.d.f. as a function $L(\beta_0, \beta_1)$, that is, a function of the unknown parameters $\beta_0$ and $\beta_1$, then we can find the maximum likelihood estimates for $\beta_0$ and $\beta_1$ by maximizing the function $L(\beta_0, \beta_1)$. It is clear that the function $L(\beta_0, \beta_1)$ will be maximized when $\sum_{i=1}^{n}[y_i - (\beta_0 + \beta_1 x_i)]^2$ is minimized.

The previous result concerning the least squares line for predicting Y from X with a sample of data points tells us that the mle of $\beta_1$ is

$$\hat{\beta}_1 = R\,\frac{S_y}{s_x} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})\,Y_i}{\sum_{i=1}^{n}(x_i - \bar{x})^2} = \sum_{i=1}^{n}\left[\frac{x_i - \bar{x}}{\sum_{j=1}^{n}(x_j - \bar{x})^2}\right] Y_i\,,$$

and the mle of $\beta_0$ is

$$\hat{\beta}_0 = \bar{Y} - R\,\frac{S_y}{s_x}\,\bar{x} = \sum_{i=1}^{n}\frac{Y_i}{n} - \sum_{i=1}^{n}\left[\frac{(x_i - \bar{x})\,\bar{x}}{\sum_{j=1}^{n}(x_j - \bar{x})^2}\right] Y_i = \sum_{i=1}^{n}\left[\frac{1}{n} - \frac{(x_i - \bar{x})\,\bar{x}}{\sum_{j=1}^{n}(x_j - \bar{x})^2}\right] Y_i\,.$$
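As a quick sanity check (our own sketch, not from the slides; the simulated data are hypothetical), minimizing the sum of squares numerically reproduces the closed-form maximum likelihood estimates:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
x = np.array([14., 16., 18., 20., 22.])
y = 40.2 + 1.2 * x + rng.normal(0, 2, size=x.size)  # simulated responses

# Closed-form MLEs from the slide.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Numerical minimization of the sum of squared deviations (equivalently,
# maximization of the likelihood L(beta0, beta1)).
res = minimize(lambda b: np.sum((y - b[0] - b[1] * x) ** 2), x0=[0.0, 0.0])
print(res.x)       # numerical minimizer
print((b0, b1))    # closed form; the two agree to numerical tolerance
```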

1. (a) Suppose we are interested in predicting a person's height from the person's length of stride (distance between footprints). The following data is recorded for a random sample of 5 people:

Length of Stride (inches) | Height (inches)
[table data not recovered from the slide]

Find the equation of the least squares line for predicting a person's height from the person's length of stride.

The slope of the least squares line is $\dfrac{120}{100} = 1.2$. The intercept of the least squares line is $61.8 - (1.2)(18) = 40.2$. The least squares line can be written $\hat{y} = 40.2 + 1.2x$.
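Since only the summary statistics survive here, a minimal computation from those sums (our own sketch) reproduces the fitted line:

```python
# Summary statistics from the exercise (the raw data table is on the slide).
n, x_bar, y_bar = 5, 18.0, 61.8
Sxy, Sxx = 120.0, 100.0          # sum (x - x bar)(y - y bar) and sum (x - x bar)^2

b1 = Sxy / Sxx                   # slope: 1.2
b0 = y_bar - b1 * x_bar          # intercept: 40.2
print(f"least squares line: y-hat = {b0:.1f} + {b1:.1f} x")
```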

(b) Suppose we assume that the height of humans has a normal distribution with mean $\beta_0 + \beta_1 x$ and variance $\sigma^2$, where x is the length of stride. Find the maximum likelihood estimates for $\beta_0$ and $\beta_1$.

The mle of $\beta_1$ is $\dfrac{120}{100} = 1.2$. The mle of $\beta_0$ is $61.8 - (1.2)(18) = 40.2$.

2. (a) Use the theorem from Class Exercise 5.5-1 to find the distribution of the maximum likelihood estimator of $\beta_1$.

Suppose $Y_1, Y_2, \ldots, Y_n$ are independent with respective $N(\mu_1, \sigma^2), N(\mu_2, \sigma^2), \ldots, N(\mu_n, \sigma^2)$ distributions. Let $x_1, x_2, \ldots, x_n$ be fixed values not all equal, and suppose that for $i = 1, 2, \ldots, n$, $\mu_i = \beta_0 + \beta_1 x_i$. Then

$$\hat{\beta}_1 = \sum_{i=1}^{n}\left[\frac{x_i - \bar{x}}{\sum_{j=1}^{n}(x_j - \bar{x})^2}\right] Y_i$$

has a normal distribution with mean

$$\sum_{i=1}^{n}\frac{(x_i - \bar{x})(\beta_0 + \beta_1 x_i)}{\sum_{j=1}^{n}(x_j - \bar{x})^2} = \sum_{i=1}^{n}\frac{(x_i - \bar{x})[\beta_0 + \beta_1\bar{x} + \beta_1(x_i - \bar{x})]}{\sum_{j=1}^{n}(x_j - \bar{x})^2} = \frac{(\beta_0 + \beta_1\bar{x})\sum_{i=1}^{n}(x_i - \bar{x})}{\sum_{j=1}^{n}(x_j - \bar{x})^2} + \frac{\beta_1\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sum_{j=1}^{n}(x_j - \bar{x})^2} = \beta_1$$

(the first term vanishes since $\sum_{i=1}^{n}(x_i - \bar{x}) = 0$)

and variance

$$\sum_{i=1}^{n}\left[\frac{x_i - \bar{x}}{\sum_{j=1}^{n}(x_j - \bar{x})^2}\right]^2 \sigma^2 = \frac{\sigma^2\sum_{i=1}^{n}(x_i - \bar{x})^2}{\left[\sum_{j=1}^{n}(x_j - \bar{x})^2\right]^2} = \frac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,.$$
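A short Monte Carlo experiment (our own illustration; the x values and parameters are hypothetical) confirms this sampling distribution:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.array([14., 16., 18., 20., 22.])
beta0, beta1, sigma = 40.2, 1.2, 2.0
Sxx = np.sum((x - x.mean()) ** 2)             # 40

# 20,000 replicates of beta1-hat = sum (x_i - x bar) Y_i / Sxx.
Y = beta0 + beta1 * x + rng.normal(0, sigma, size=(20000, x.size))
est = Y @ (x - x.mean()) / Sxx

print(est.mean())   # close to beta1 = 1.2
print(est.var())    # close to sigma^2 / Sxx = 4/40 = 0.1
```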

2. - continued

(b) Use the theorem from Class Exercise 5.5-1 to find the distribution of the maximum likelihood estimator of $\beta_0$.

$$\hat{\beta}_0 = \sum_{i=1}^{n}\left[\frac{1}{n} - \frac{(x_i - \bar{x})\,\bar{x}}{\sum_{j=1}^{n}(x_j - \bar{x})^2}\right] Y_i$$

has a normal distribution with mean

$$\sum_{i=1}^{n}\left[\frac{1}{n} - \frac{(x_i - \bar{x})\,\bar{x}}{\sum_{j=1}^{n}(x_j - \bar{x})^2}\right](\beta_0 + \beta_1 x_i) = \sum_{i=1}^{n}\frac{\beta_0 + \beta_1 x_i}{n} - \bar{x}\sum_{i=1}^{n}\frac{(x_i - \bar{x})(\beta_0 + \beta_1 x_i)}{\sum_{j=1}^{n}(x_j - \bar{x})^2}$$

(we already found the second sum, equal to $\beta_1$, in part (a))

 0 +  1 x –  1 x = 00 and variance  i = 1 n 1 — n –  j = 1 (x j – x) 2 n (x i – x) x 22 2 =  i = 1 n 1 — n 2 22 –  j = 1 (x j – x) 2 n 2 (x i – x) x n  j = 1 (x j – x) 2 n (x i – x) x2x2 = 22 1 — n + x2x2  (x i – x) 2 i = 1 n

Suppose we treat the joint p.d.f. as a function $L(\beta_0, \beta_1, \sigma^2)$, that is, a function of the three unknown parameters (instead of just two). Then, analogous to Text Example 6.1-3, we find that the maximum likelihood estimates for $\beta_0$ and $\beta_1$ are the same as previously derived, and that the maximum likelihood estimator for $\sigma^2$ is

$$\frac{\sum_{i=1}^{n}[Y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2}{n}\,.$$

Recall: If $Y_1, Y_2, \ldots, Y_n$ are independent with each having a $N(\mu, \sigma^2)$ distribution (i.e., a random sample from a $N(\mu, \sigma^2)$ distribution), then $\bar{Y} = \sum_{i=1}^{n}\dfrac{Y_i}{n}$ has a $N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$ distribution, $\dfrac{(n-1)S^2}{\sigma^2}$ has a $\chi^2(n-1)$ distribution, and

the random variables $\bar{Y}$ and $\dfrac{(n-1)S^2}{\sigma^2}$ are independent.

Analogous results for the more general situation previously considered can be proven using matrix algebra. Suppose $Y_1, Y_2, \ldots, Y_n$ are independent with respective $N(\mu_1, \sigma^2), \ldots, N(\mu_n, \sigma^2)$ distributions. Let $x_1, \ldots, x_n$ be fixed values not all equal, and suppose that for $i = 1, \ldots, n$, $\mu_i = \beta_0 + \beta_1 x_i$. Then

$\hat{\beta}_1$ has a $N\!\left(\beta_1,\ \dfrac{\sigma^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right)$ distribution,

$\hat{\beta}_0$ has a $N\!\left(\beta_0,\ \sigma^2\left[\dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]\right)$ distribution,

$$\frac{\sum_{i=1}^{n}[Y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2}{\sigma^2} \ \text{ has a } \chi^2(n-2) \text{ distribution},$$

the random variables $\hat{\beta}_1$ and $\dfrac{\sum_{i=1}^{n}[Y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2}{\sigma^2}$ are independent, and

the random variables $\hat{\beta}_0$ and $\dfrac{\sum_{i=1}^{n}[Y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2}{\sigma^2}$ are independent.

For each $i = 1, 2, \ldots, n$, we define the random variable $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$; that is, $\hat{Y}_i$ is the predicted value corresponding to $x_i$. With appropriate algebra, it can be shown that

$$\sum_{i=1}^{n}(Y_i - \bar{Y})^2 = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2\,.$$

The left side is called the total sum of squares and is denoted SST; the first sum on the right is called the regression sum of squares and is denoted SSR; the second sum on the right is called the error (residual) sum of squares and is denoted SSE.

Since, as we have noted, SSE/$\sigma^2$ has a $\chi^2(n-2)$ distribution, we say that the df (degrees of freedom) associated with SSE is $n - 2$. If $Y_1, Y_2, \ldots, Y_n$ all have the same mean, that is, if $\beta_1 = 0$, then SST/$\sigma^2$ has a $\chi^2(n-1)$ distribution; consequently, the df associated with SST is $n - 1$. If $Y_1, Y_2, \ldots, Y_n$ all have the same mean, that is, if $\beta_1 = 0$, then it can be shown that SSR and SSE are independent, and that SSR/$\sigma^2$ has a $\chi^2(1)$ distribution; consequently, the df associated with SSR is 1.
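The decomposition is easy to verify numerically (our own sketch with simulated, hypothetical data):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(10, 25, 12)
y = 26 + 2 * x + rng.normal(0, 8, size=x.size)   # hypothetical data

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

SST = np.sum((y - y.mean()) ** 2)
SSR = np.sum((y_hat - y.mean()) ** 2)
SSE = np.sum((y - y_hat) ** 2)
print(SST, SSR + SSE)    # equal up to rounding: SST = SSR + SSE
```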

3. Suppose $Y_1, \ldots, Y_n$ are independent with respective $N(\mu_1, \sigma^2), \ldots, N(\mu_n, \sigma^2)$ distributions. Let $x_1, \ldots, x_n$ be fixed values not all equal, and suppose that for $i = 1, \ldots, n$, $\mu_i = \beta_0 + \beta_1 x_i$. Prove that SST = SSR + SSE.

First, we observe that for $i = 1, 2, \ldots, n$, $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i = \bar{Y} + \hat{\beta}_1(x_i - \bar{x})$. Then

$$\text{SST} = \sum_{i=1}^{n}(Y_i - \bar{Y})^2 = \sum_{i=1}^{n}\left[\hat{\beta}_1(x_i - \bar{x}) + (Y_i - \bar{Y}) - \hat{\beta}_1(x_i - \bar{x})\right]^2$$

$$= \sum_{i=1}^{n}\left\{[\hat{\beta}_1(x_i - \bar{x})]^2 + [(Y_i - \bar{Y}) - \hat{\beta}_1(x_i - \bar{x})]^2 + 2\hat{\beta}_1(x_i - \bar{x})[(Y_i - \bar{Y}) - \hat{\beta}_1(x_i - \bar{x})]\right\}$$

$$= \sum_{i=1}^{n}[\hat{\beta}_1(x_i - \bar{x})]^2 + \sum_{i=1}^{n}[Y_i - \bar{Y} - \hat{\beta}_1(x_i - \bar{x})]^2 + 2\hat{\beta}_1\sum_{i=1}^{n}(x_i - \bar{x})(Y_i - \bar{Y}) - 2\hat{\beta}_1^2\sum_{i=1}^{n}(x_i - \bar{x})^2$$

$$= \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2 + 2\,\frac{\sum_{i=1}^{n}(x_i - \bar{x})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\sum_{i=1}^{n}(x_i - \bar{x})(Y_i - \bar{Y}) - 2\left[\frac{\sum_{i=1}^{n}(x_i - \bar{x})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]^2\sum_{i=1}^{n}(x_i - \bar{x})^2$$

$$= \text{SSR} + \text{SSE} + 0 = \text{SSR} + \text{SSE}\,.$$

A mean square is a sum of squares divided by its degrees of freedom. The error (residual) mean square is $\text{MSE} = \dfrac{\text{SSE}}{n-2}$. The regression mean square is $\text{MSR} = \dfrac{\text{SSR}}{1}$.

Total variation in Y is based on SST. Variation in Y explained by the linear relationship with X is based on SSR. Variation in Y explained by random error is based on SSE.

If $Y_1, \ldots, Y_n$ all have the same mean, that is, if $\beta_1 = 0$, then

$$E(\text{SSR}/\sigma^2) = 1 \qquad\text{and}\qquad E\!\left(\frac{\text{SSE}/\sigma^2}{n-2}\right) = 1\,.$$

If $\beta_1 \ne 0$, then

$$E(\text{SSR}/\sigma^2) = E\!\left[\frac{\sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2}{\sigma^2}\right] = E\!\left[\frac{\sum_{i=1}^{n}[\hat{\beta}_1(x_i - \bar{x})]^2}{\sigma^2}\right] = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma^2}\,E(\hat{\beta}_1^2)$$

$$= \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma^2}\left\{\text{Var}(\hat{\beta}_1) + [E(\hat{\beta}_1)]^2\right\} = 1 + \frac{\beta_1^2\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sigma^2} > 1\,.$$

This suggests that if $\beta_1 = 0$, then the ratio of SSR/$\sigma^2$ to $\dfrac{\text{SSE}/\sigma^2}{n-2}$ is expected to be close to one, but if $\beta_1 \ne 0$, then this ratio is expected to be larger than one. If $\beta_1 = 0$, then since SSR and SSE are independent, the ratio of SSR/$\sigma^2$ to $\dfrac{\text{SSE}/\sigma^2}{n-2}$ must have an $f(1, n-2)$ distribution.

Consequently, we can perform the hypothesis test

H0: $\beta_1 = 0$ (the linear relationship (correlation) between X and Y is not significant) vs. H1: $\beta_1 \ne 0$ (the linear relationship (correlation) between X and Y is significant)

by using the test statistic

$$F = \frac{\text{SSR}/\sigma^2}{\dfrac{\text{SSE}/\sigma^2}{n-2}} = \frac{\text{MSR}}{\text{MSE}}\,.$$

We reject H0 (in favor of H1) when $f > f_{\alpha}(1, n-2)$.

The calculations in a regression leading up to the f statistic are often organized into an analysis of variance (ANOVA) table such as the following:

Source     | df    | SS  | MS  | f       | p-value
Regression | 1     | SSR | MSR | MSR/MSE |
Error      | n – 2 | SSE | MSE |         |
Total      | n – 1 | SST |     |         |

It can be shown that the squared sample correlation is $R^2 = \dfrac{\text{SSR}}{\text{SST}}$, which is often called the proportion of variation in Y explained by X. (This is done in Class Exercise #7.) The standard error of estimate is defined to be $S_{XY} = \sqrt{\text{MSE}}$.
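The whole table can be produced with a few lines of code (a sketch of the formulas above, not the course's SPSS output; the function name is ours):

```python
import numpy as np
from scipy import stats

def anova_table(x, y):
    """ANOVA quantities for simple linear regression."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    y_hat = b0 + b1 * x
    SSR = np.sum((y_hat - y.mean()) ** 2)
    SSE = np.sum((y - y_hat) ** 2)
    SST = SSR + SSE
    MSR, MSE = SSR / 1, SSE / (n - 2)
    f = MSR / MSE
    p = stats.f.sf(f, 1, n - 2)               # upper-tail p-value
    return {"SSR": SSR, "SSE": SSE, "SST": SST, "MSR": MSR, "MSE": MSE,
            "f": f, "p": p, "R2": SSR / SST, "S_XY": np.sqrt(MSE)}
```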

1. - continued

(c) Find the sums of squares SSR, SSE, and SST; then construct the ANOVA table and perform the corresponding f test with $\alpha$ = 0.05, find and interpret the squared sample correlation, and find the standard error of estimate.

$$\text{SSR} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2\sum_{i=1}^{n}(x_i - \bar{x})^2 = (1.2)^2(100) = 144$$
$$\text{SST} = \sum_{i=1}^{n}(Y_i - \bar{Y})^2 = 174.8$$
$$\text{SSE} = \text{SST} - \text{SSR} = 174.8 - 144 = 30.8$$

Source     | df | SS    | MS    | f     | p-value
Regression | 1  | 144   | 144   | 14.03 | 0.025 < p < 0.05
Error      | 3  | 30.8  | 10.27 |       |
Total      | 4  | 174.8 |       |       |

Since f = 14.03 > $f_{0.05}(1,3)$ = 10.13, we reject H0. We conclude that the slope in the linear regression of height on stride length is different from zero (0.025 < p-value < 0.05), and the results suggest that this slope is positive. (Note: We could alternatively conclude that the linear relationship or correlation is significant, and that the results suggest a positive linear relationship or correlation.)

$R^2 = 144/174.8 = 0.824$; 82.4% of the variation in height is explained by stride length. The standard error of estimate is $S_{XY} = \sqrt{10.27} = 3.20$ inches.

8. (a) The prediction of grip strength from age for right-handed males is of interest. It is assumed that for any age x in years, where 10 ≤ x ≤ 25, Y = "grip strength in pounds" has a $N(\mu_x, \sigma^2)$ distribution where $\mu_x = \beta_0 + \beta_1 x$. For a random sample of right-handed males, the following data are recorded:

Age (years) | Grip Strength (lbs.)
[table data not recovered from the slide]

Obtain the calculations below from a calculator and from SPSS.

$\bar{x} = 18$, $n = 12$, $\bar{y} = 62$, $\sum_{i=1}^{n}(x_i - \bar{x})^2 = 256$, $\sum_{i=1}^{n}(y_i - \bar{y})^2 = 1728$, $\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}) = 512$, $r = 0.770$

8. - continued

(b) Find the equation of the least squares line from a calculator and from SPSS.

$\hat{\beta}_1 = 2$ and $\hat{\beta}_0 = 26$, so the least squares line can be written $\hat{y} = 26 + 2x$.

(c) Write a one-sentence interpretation of the slope in the least squares line and a one-sentence interpretation of the intercept in the least squares line.

Grip strength appears to increase on average by about 2 pounds with each increase of one year in age. The intercept is the mean grip strength at age zero, which makes no sense in this situation.

(d) Find $r^2$, and write a one-sentence interpretation.

$r^2 = 0.593$. About 59.3% of the variation in grip strength is explained by age.

(e) Find the standard error of estimate.

$s = 8.390$

8. - continued

(f) Construct the ANOVA table and perform the corresponding f test with $\alpha$ = 0.05.

$$\text{SSR} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2 = \hat{\beta}_1^2\sum_{i=1}^{n}(x_i - \bar{x})^2 = (2)^2(256) = 1024$$
$$\text{SST} = \sum_{i=1}^{n}(Y_i - \bar{Y})^2 = 1728$$
$$\text{SSE} = \text{SST} - \text{SSR} = 1728 - 1024 = 704$$

Source     | df | SS   | MS   | f     | p-value
Regression | 1  | 1024 | 1024 | 14.55 | p < 0.01
Error      | 10 | 704  | 70.4 |       |
Total      | 11 | 1728 |      |       |

The test statistic is f = 14.55.

The critical region with $\alpha$ = 0.05 is $f \ge f_{0.05}(1,10) = 4.96$. Since f = 14.55 > $f_{0.05}(1,10)$ = 4.96, we reject H0. We conclude that the slope in the linear regression of grip strength on age is different from zero (p < 0.01), and the results suggest that this slope is positive. (Note: We could alternatively conclude that the linear relationship or correlation is significant, and that the results suggest a positive linear relationship or correlation.) The p-value is smaller than 0.01 (from the table), or p = 0.003 (from the SPSS output).

4. Suppose $Y_1, \ldots, Y_n$ are independent with respective $N(\mu_1, \sigma^2), \ldots, N(\mu_n, \sigma^2)$ distributions. Let $x_1, \ldots, x_n$ be fixed values not all equal, and suppose that for $i = 1, \ldots, n$, $\mu_i = \beta_0 + \beta_1 x_i$. Show that the maximum likelihood estimator of $\sigma^2$ is not unbiased; then, find a constant multiple of this maximum likelihood estimator which is an unbiased estimator of $\sigma^2$.

$$E\!\left[\frac{\sum_{i=1}^{n}[Y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2}{n}\right] = \frac{\sigma^2}{n}\,E\!\left[\frac{\sum_{i=1}^{n}[Y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2}{\sigma^2}\right] = \frac{\sigma^2}{n}\,(n - 2) \ne \sigma^2\,.$$

An unbiased estimator of $\sigma^2$ is

$$\frac{n}{n-2}\cdot\frac{\sum_{i=1}^{n}[Y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2}{n} = \frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{n-2} = \frac{\text{SSE}}{n-2} = \text{MSE}\,.$$
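A simulation (our own illustration; the parameters are hypothetical, chosen to match the grip-strength exercise) makes the bias visible:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(10, 25, 12)                   # n = 12
beta0, beta1, sigma2 = 26.0, 2.0, 70.4
xc = x - x.mean()

Y = beta0 + beta1 * x + rng.normal(0, np.sqrt(sigma2), size=(20000, x.size))
b1 = Y @ xc / np.sum(xc ** 2)                 # slope for each replicate
b0 = Y.mean(axis=1) - b1 * x.mean()
sse = np.sum((Y - b0[:, None] - b1[:, None] * x) ** 2, axis=1)

print((sse / 12).mean())   # MLE: close to (n-2)/n * sigma^2 = 58.7 (biased low)
print((sse / 10).mean())   # MSE: close to sigma^2 = 70.4 (unbiased)
```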

11 ^ – 11   (x i – x) 2 i = 1 n SSE /  2 ——— n – 2 = 11 ^ – 11 MSE  (x i – x) 2 i = 1 n has adistribution.t(n – 2) 00 ^ – 00  1 — n + x2x2  (x i – x) 2 i = 1 n SSE /  2 ——— n – 2 = 00 ^ – 00 has adistribution.t(n – 2) 1 — n + x2x2  (x i – x) 2 i = 1 n MSE

5. Suppose $Y_1, \ldots, Y_n$ are independent with respective $N(\mu_1, \sigma^2), \ldots, N(\mu_n, \sigma^2)$ distributions. Let $x_1, \ldots, x_n$ be fixed values not all equal, and suppose that for $i = 1, \ldots, n$, $\mu_i = \beta_0 + \beta_1 x_i$. For a given value $x_0$, use the theorem from Class Exercise 5.5-1 to find the distribution of $\hat{Y}|x_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0$ (the predicted value of Y corresponding to the value $x_0$).

$$\hat{Y}|x_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0 = \bar{Y} + \hat{\beta}_1(x_0 - \bar{x}) = \sum_{i=1}^{n}\frac{Y_i}{n} + \sum_{i=1}^{n}\left[\frac{(x_i - \bar{x})(x_0 - \bar{x})}{\sum_{j=1}^{n}(x_j - \bar{x})^2}\right] Y_i = \sum_{i=1}^{n}\left[\frac{1}{n} + \frac{(x_i - \bar{x})(x_0 - \bar{x})}{\sum_{j=1}^{n}(x_j - \bar{x})^2}\right] Y_i$$

has a normal distribution with

mean $E(\hat{\beta}_0 + \hat{\beta}_1 x_0) = \beta_0 + \beta_1 x_0$ and variance

$$\sum_{i=1}^{n}\left[\frac{1}{n} + \frac{(x_i - \bar{x})(x_0 - \bar{x})}{\sum_{j=1}^{n}(x_j - \bar{x})^2}\right]^2 \sigma^2 = \sigma^2\left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right],$$

which we can find with algebra analogous to that in Class Exercise #2(b) (there the middle sign was a minus, and in place of $x_0 - \bar{x}$ we had just $\bar{x}$).

For a given value $x_0$, we define $\hat{Y}|x_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0$ to be the predicted value of Y corresponding to the value $x_0$; this predicted value has a

$$N\!\left(\beta_0 + \beta_1 x_0\,,\ \sigma^2\left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]\right)$$

distribution, and the random variables $\hat{Y}|x_0$ and $\dfrac{\sum_{i=1}^{n}[Y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)]^2}{\sigma^2}$ are independent. Consequently,

$$\frac{(\hat{\beta}_0 + \hat{\beta}_1 x_0) - (\beta_0 + \beta_1 x_0)}{\sqrt{\text{MSE}\left[\dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}} \ \text{ has a } t(n-2) \text{ distribution.}$$

We assume $\min\{x_1, \ldots, x_n\} \le x_0 \le \max\{x_1, \ldots, x_n\}$, since prediction outside the range of the data may not be valid.

6. (a) Suppose $Y_1, \ldots, Y_n$ are independent with respective $N(\mu_1, \sigma^2), \ldots, N(\mu_n, \sigma^2)$ distributions. Let $x_1, \ldots, x_n$ be fixed values not all equal, and suppose that for $i = 1, \ldots, n$, $\mu_i = \beta_0 + \beta_1 x_i$. Derive a 100(1 – $\alpha$)% confidence interval for the slope $\beta_1$.

$$P\!\left(-t_{\alpha/2}(n-2) \le \frac{\hat{\beta}_1 - \beta_1}{\sqrt{\dfrac{\text{MSE}}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}} \le t_{\alpha/2}(n-2)\right) = 1 - \alpha$$

$$P\!\left(\hat{\beta}_1 - t_{\alpha/2}(n-2)\sqrt{\frac{\text{MSE}}{\sum_{i=1}^{n}(x_i - \bar{x})^2}} \ \le\ \beta_1 \ \le\ \hat{\beta}_1 + t_{\alpha/2}(n-2)\sqrt{\frac{\text{MSE}}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}\right) = 1 - \alpha$$

(b) If $\beta_{1(0)}$ is a hypothesized value for $\beta_1$, then derive the test statistic and rejection regions corresponding to the one sided and two sided hypothesis tests for testing H0: $\beta_1 = \beta_{1(0)}$ with significance level $\alpha$.

The test statistic is

$$T = \frac{\hat{\beta}_1 - \beta_{1(0)}}{\sqrt{\dfrac{\text{MSE}}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}}\,.$$

For H1: $\beta_1 < \beta_{1(0)}$, the rejection region is $t \le -t_{\alpha}(n-2)$.
For H1: $\beta_1 > \beta_{1(0)}$, the rejection region is $t \ge t_{\alpha}(n-2)$.
For H1: $\beta_1 \ne \beta_{1(0)}$, the rejection region is $|t| \ge t_{\alpha/2}(n-2)$.
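Both the interval in part (a) and the test in part (b) are captured by the following sketch (ours, not the course software; scipy's t distribution supplies the critical values in place of the tables):

```python
import numpy as np
from scipy import stats

def slope_inference(x, y, b1_null=0.0, alpha=0.05):
    """100(1 - alpha)% CI for beta1 and the t test of H0: beta1 = b1_null."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    Sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
    b0 = y.mean() - b1 * x.mean()
    mse = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)
    se = np.sqrt(mse / Sxx)
    t_crit = stats.t.ppf(1 - alpha / 2, n - 2)
    ci = (b1 - t_crit * se, b1 + t_crit * se)
    t_stat = (b1 - b1_null) / se
    p = 2 * stats.t.sf(abs(t_stat), n - 2)    # two-sided p-value
    return ci, t_stat, p
```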

8. - continued

(g) Perform the t test for H0: $\beta_1 = 0.8$ vs. H1: $\beta_1 \ne 0.8$ with $\alpha$ = 0.05.

The test statistic is

$$t = \frac{\hat{\beta}_1 - 0.8}{\sqrt{\dfrac{\text{MSE}}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}} = \frac{2 - 0.8}{\sqrt{\dfrac{70.4}{256}}} = 2.289\,.$$

The two-sided critical region with $\alpha$ = 0.05 is $|t| \ge t_{0.025}(10) = 2.228$. The p-value is between 0.02 and 0.05 (from the table).

Since t = 2.289 > $t_{0.025}(10)$ = 2.228, we reject H0. We conclude that the slope in the linear regression of grip strength on age is different from 0.8 lbs. (0.02 < p < 0.05), and the results suggest that this slope is greater than 0.8 lbs. (Note: We could alternatively conclude that the average change in grip strength is different from 0.8 lbs. per year, and that this change is greater than 0.8 lbs. per year.)

8. - continued

(h) Considering the results of the hypothesis tests in parts (f) and (g), explain why a 95% confidence interval for the slope in the regression would be of interest. Then find and interpret the confidence interval.

Since rejecting H0 in part (f) suggests that the hypothesized zero slope is not correct, and rejecting H0 in part (g) suggests that the hypothesized slope of 0.8 is not correct, a 95% confidence interval will provide us with some information about the value of the slope, which estimates the average change in grip strength with an increase of one year in age.

$$\hat{\beta}_1 \pm t_{\alpha/2}(n-2)\sqrt{\frac{\text{MSE}}{\sum_{i=1}^{n}(x_i - \bar{x})^2}} = 2 \pm (2.228)\sqrt{\frac{70.4}{256}}\,, \qquad\text{that is, } 0.832 \text{ and } 3.168.$$

We are 95% confident that the slope in the regression to predict grip strength from age is between 0.832 and 3.168 lbs.

6. - continued

(c) Derive a 100(1 – $\alpha$)% confidence interval for the intercept $\beta_0$.

$$P\!\left(-t_{\alpha/2}(n-2) \le \frac{\hat{\beta}_0 - \beta_0}{\sqrt{\text{MSE}\left[\dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}} \le t_{\alpha/2}(n-2)\right) = 1 - \alpha$$

$$P\!\left(\hat{\beta}_0 - t_{\alpha/2}(n-2)\sqrt{\text{MSE}\left[\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]} \ \le\ \beta_0 \ \le\ \hat{\beta}_0 + t_{\alpha/2}(n-2)\sqrt{\text{MSE}\left[\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}\right) = 1 - \alpha$$

(d) If $\beta_{0(0)}$ is a hypothesized value for $\beta_0$, then derive the test statistic and rejection regions corresponding to the one sided and two sided hypothesis tests for testing H0: $\beta_0 = \beta_{0(0)}$ with significance level $\alpha$.

The test statistic is

$$T = \frac{\hat{\beta}_0 - \beta_{0(0)}}{\sqrt{\text{MSE}\left[\dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}}\,.$$

For H1: $\beta_0 < \beta_{0(0)}$, the rejection region is $t \le -t_{\alpha}(n-2)$.
For H1: $\beta_0 > \beta_{0(0)}$, the rejection region is $t \ge t_{\alpha}(n-2)$.
For H1: $\beta_0 \ne \beta_{0(0)}$, the rejection region is $|t| \ge t_{\alpha/2}(n-2)$.

8. - continued

(i) Perform the t test for H0: $\beta_0 = 0$ vs. H1: $\beta_0 \ne 0$ with $\alpha$ = 0.05.

The test statistic is

$$t = \frac{\hat{\beta}_0 - 0}{\sqrt{\text{MSE}\left[\dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}} = \frac{26 - 0}{\sqrt{70.4\left[\dfrac{1}{12} + \dfrac{18^2}{256}\right]}} = 2.668\,.$$

The two-sided critical region with $\alpha$ = 0.05 is $|t| \ge t_{0.025}(10) = 2.228$. The p-value is between 0.02 and 0.05 (from the table), or p = 0.024 (from the SPSS output).

Since t = 2.668 > $t_{0.025}(10)$ = 2.228, we reject H0. We conclude that the intercept in the linear regression of grip strength on age is different from zero (0.02 < p < 0.05), and the results suggest that this intercept is positive.

8. - continued

(j) Considering the results of the hypothesis test in part (i), explain why a 95% confidence interval for the intercept in the regression would be of interest. Then find and interpret the confidence interval.

Since rejecting H0 in part (i) suggests that the hypothesized zero intercept is not correct, a 95% confidence interval will provide us with some information about the value of the intercept.

$$\hat{\beta}_0 \pm t_{\alpha/2}(n-2)\sqrt{\text{MSE}\left[\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}$$

$$26 \pm (2.228)\sqrt{70.4\left[\frac{1}{12} + \frac{18^2}{256}\right]}\,, \qquad\text{that is, } 4.29 \text{ and } 47.71.$$

We are 95% confident that the intercept in the regression to predict grip strength from age is between 4.29 and 47.71 lbs.

6. - continued

(e) For a given value $x_0$, we can call $E(Y|x_0) = \beta_0 + \beta_1 x_0$ the mean of Y corresponding to the value $x_0$, and an unbiased estimator of this mean is $\hat{Y}|x_0 = \hat{\beta}_0 + \hat{\beta}_1 x_0$ (from Class Exercise #5). Derive a 100(1 – $\alpha$)% confidence interval for $E(Y|x_0) = \beta_0 + \beta_1 x_0$.

$$P\!\left(-t_{\alpha/2}(n-2) \le \frac{(\hat{\beta}_0 + \hat{\beta}_1 x_0) - (\beta_0 + \beta_1 x_0)}{\sqrt{\text{MSE}\left[\dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}} \le t_{\alpha/2}(n-2)\right) = 1 - \alpha$$

$$P\!\left((\hat{\beta}_0 + \hat{\beta}_1 x_0) - t_{\alpha/2}(n-2)\sqrt{\text{MSE}\left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]} \ \le\ \beta_0 + \beta_1 x_0 \ \le\ (\hat{\beta}_0 + \hat{\beta}_1 x_0) + t_{\alpha/2}(n-2)\sqrt{\text{MSE}\left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}\right) = 1 - \alpha$$

6. - continued

(f) If $\mu_0$ is a hypothesized value for $E(Y|x_0) = \beta_0 + \beta_1 x_0$, then derive the test statistic and rejection regions corresponding to the one sided and two sided hypothesis tests for testing H0: $E(Y|x_0) = \mu_0$ with significance level $\alpha$.

The test statistic is

$$T = \frac{(\hat{\beta}_0 + \hat{\beta}_1 x_0) - \mu_0}{\sqrt{\text{MSE}\left[\dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}}\,.$$

For H1: $E(Y|x_0) < \mu_0$, the rejection region is $t \le -t_{\alpha}(n-2)$.
For H1: $E(Y|x_0) > \mu_0$, the rejection region is $t \ge t_{\alpha}(n-2)$.
For H1: $E(Y|x_0) \ne \mu_0$, the rejection region is $|t| \ge t_{\alpha/2}(n-2)$.
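The same pattern gives the interval from part (e) in code (our sketch; the formula is the one above):

```python
import numpy as np
from scipy import stats

def mean_response_ci(x, y, x0, alpha=0.05):
    """100(1 - alpha)% CI for E(Y | x0) = beta0 + beta1 * x0."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    Sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
    b0 = y.mean() - b1 * x.mean()
    mse = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)
    fit = b0 + b1 * x0
    half = stats.t.ppf(1 - alpha / 2, n - 2) * np.sqrt(
        mse * (1 / n + (x0 - x.mean()) ** 2 / Sxx))
    return fit - half, fit + half
```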

8. - continued

(k) Perform the t test for H0: $E(Y|x=20) = 80$ (the mean grip strength for 20-year-old right-handed males is 80 lbs.) vs. H1: $E(Y|x=20) \ne 80$ (the mean grip strength for 20-year-old right-handed males is different from 80 lbs.) with $\alpha$ = 0.05.

The test statistic is

$$t = \frac{(\hat{\beta}_0 + \hat{\beta}_1 x_0) - \mu_0}{\sqrt{\text{MSE}\left[\dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}} = \frac{66 - 80}{\sqrt{70.4\left[\dfrac{1}{12} + \dfrac{(20 - 18)^2}{256}\right]}} = -5.304\,.$$

The two-sided critical region with $\alpha$ = 0.05 is $|t| \ge t_{0.025}(10) = 2.228$. The p-value is less than 0.01 (from the table). Since |t| = 5.304 > $t_{0.025}(10)$ = 2.228, we reject H0. We conclude that the mean grip strength for 20-year-old right-handed males is different from 80 lbs. (p < 0.01), and the results suggest that this mean is less than 80 lbs.

8. - continued

(l) Considering the results of the hypothesis test in part (k), explain why a 95% confidence interval for the mean grip strength for 20-year-old right-handed males would be of interest. Then find and interpret the confidence interval.

Since rejecting H0 in part (k) suggests that the mean grip strength for 20-year-old right-handed males is not 80 lbs., a 95% confidence interval will provide us with some information about this mean.

$$(\hat{\beta}_0 + \hat{\beta}_1 x_0) \pm t_{\alpha/2}(n-2)\sqrt{\text{MSE}\left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}$$

$$66 \pm (2.228)\sqrt{70.4\left[\frac{1}{12} + \frac{(20 - 18)^2}{256}\right]}\,, \qquad\text{that is, } 60.12 \text{ and } 71.88.$$

We are 95% confident that the mean grip strength for 20-year-old right-handed males is between 60.12 and 71.88 lbs.

6. - continued

(g) For a given value $x_0$, let $Y_0$ be a random variable independent of $Y_1, Y_2, \ldots, Y_n$ and having a $N(\mu_0, \sigma^2)$ distribution where $\mu_0 = \beta_0 + \beta_1 x_0$; that is, $Y_0$ is a "new" random observation. Use the theorem from Class Exercise 5.5-1 to derive a 100(1 – $\alpha$)% prediction interval for the value of $Y_0$.

$Y_0 - (\hat{\beta}_0 + \hat{\beta}_1 x_0)$ has a normal distribution with mean $\beta_0 + \beta_1 x_0 - (\beta_0 + \beta_1 x_0) = 0$ and variance

$$\sigma^2 + \sigma^2\left[\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right].$$

Consequently,

$$\frac{Y_0 - (\hat{\beta}_0 + \hat{\beta}_1 x_0)}{\sqrt{\text{MSE}\left[1 + \dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}} \ \text{ has a } t(n-2) \text{ distribution.}$$

6. (g) - continued

$$P\!\left(-t_{\alpha/2}(n-2) \le \frac{Y_0 - (\hat{\beta}_0 + \hat{\beta}_1 x_0)}{\sqrt{\text{MSE}\left[1 + \dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}} \le t_{\alpha/2}(n-2)\right) = 1 - \alpha$$

$$P\!\left((\hat{\beta}_0 + \hat{\beta}_1 x_0) - t_{\alpha/2}(n-2)\sqrt{\text{MSE}\left[1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]} \ \le\ Y_0 \ \le\ (\hat{\beta}_0 + \hat{\beta}_1 x_0) + t_{\alpha/2}(n-2)\sqrt{\text{MSE}\left[1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]}\right) = 1 - \alpha$$
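The only change from the mean-response interval is the extra "1 +" under the square root, visible in this sketch (ours):

```python
import numpy as np
from scipy import stats

def prediction_interval(x, y, x0, alpha=0.05):
    """100(1 - alpha)% prediction interval for a new observation Y0 at x0."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    Sxx = np.sum((x - x.mean()) ** 2)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
    b0 = y.mean() - b1 * x.mean()
    mse = np.sum((y - b0 - b1 * x) ** 2) / (n - 2)
    fit = b0 + b1 * x0
    half = stats.t.ppf(1 - alpha / 2, n - 2) * np.sqrt(
        mse * (1 + 1 / n + (x0 - x.mean()) ** 2 / Sxx))  # note the leading "1 +"
    return fit - half, fit + half
```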

8. - continued

(m) Find and interpret a 95% prediction interval for the grip strength for a 20-year-old right-handed male.

$$(\hat{\beta}_0 + \hat{\beta}_1 x_0) \pm t_{\alpha/2}(n-2)\sqrt{\text{MSE}\left[1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}\right]} = 66 \pm (2.228)\sqrt{70.4\left[1 + \frac{1}{12} + \frac{(20 - 18)^2}{256}\right]}$$

that is, 46.40 and 85.60.

We are 95% confident that the grip strength for a randomly selected 20-year-old right-handed male will be between 46.40 and 85.60 lbs. OR: At least 95% of 20-year-old right-handed males have a grip strength between 46.40 and 85.60 lbs.

(n) For what age group of right-handed males will the confidence interval for mean grip strength and the prediction interval for a particular grip strength both have the smallest length?

18 year olds (the interval half-width is smallest when $x_0 = \bar{x} = 18$).

6. - continued

(h) Consider the two sided hypothesis test H0: $\beta_1 = 0$ vs. H1: $\beta_1 \ne 0$ with significance level $\alpha$, which is one of the hypothesis tests defined in part (b). Prove that the square of the t test statistic for this hypothesis test is equal to the f test statistic in the ANOVA.

$$\left[\frac{\hat{\beta}_1 - 0}{\sqrt{\dfrac{\text{MSE}}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}}\right]^2 = \frac{\hat{\beta}_1^2\sum_{i=1}^{n}(x_i - \bar{x})^2}{\text{MSE}} = \frac{\text{MSR}}{\text{MSE}}\,,$$

where the last equality follows from the derivation in Class Exercise #3, which shows that $\text{SSR} = \hat{\beta}_1^2\sum_{i=1}^{n}(x_i - \bar{x})^2$ (and MSR = SSR/1).

7. Suppose $Y_1, \ldots, Y_n$ are independent with respective $N(\mu_1, \sigma^2), \ldots, N(\mu_n, \sigma^2)$ distributions. Let $x_1, \ldots, x_n$ be fixed values not all equal, and suppose that for $i = 1, \ldots, n$, $\mu_i = \beta_0 + \beta_1 x_i$. Prove that $R^2 = \dfrac{\text{SSR}}{\text{SST}}$.

Since $\hat{\beta}_1 = R\,\dfrac{S_y}{s_x}$, $(n-1)\,s_x^2 = \sum_{i=1}^{n}(x_i - \bar{x})^2$, and $(n-1)\,S_y^2 = \sum_{i=1}^{n}(Y_i - \bar{Y})^2$,

$$R^2 = \frac{R^2\,\dfrac{S_y^2}{s_x^2}\;s_x^2}{S_y^2} = \frac{\hat{\beta}_1^2\sum_{i=1}^{n}(x_i - \bar{x})^2}{\sum_{i=1}^{n}(Y_i - \bar{Y})^2} = \frac{\text{SSR}}{\text{SST}}\,.$$

8. - continued

(o) Use SPSS to graph the least squares line on a scatter plot, and comment on how appropriate the linear model seems to be.

[Figure: SPSS scatter plot of grip strength versus age with the least squares line $\hat{y} = 26 + 2x$ superimposed.] The linear model appears to be reasonable.
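For readers without SPSS, a matplotlib stand-in (our own sketch; the function name is ours) produces the same kind of plot:

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_fit(x, y):
    """Scatter plot with the least squares line superimposed."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0 = y.mean() - b1 * x.mean()
    xs = np.linspace(x.min(), x.max(), 2)
    plt.scatter(x, y, label="data")
    plt.plot(xs, b0 + b1 * xs, label=f"y-hat = {b0:.1f} + {b1:.1f} x")
    plt.xlabel("age (years)")
    plt.ylabel("grip strength (lbs.)")
    plt.legend()
    plt.show()
```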

9. Voters in a state are surveyed. Each respondent is assigned an identification number (ID), and the following information about each is recorded: sex, area of residence (RES), political party affiliation (POL), number of children (CHL), age, yearly income (INC), job satisfaction score (JSS) where 0 = totally dissatisfied and 10 = totally satisfied, weekly hours spent watching television (TVH), and weekly hours spent listening to radio (RAD). The resulting data is as follows:

ID | SEX | RES | POL | CHL | AGE | INC | JSS | TVH | RAD
[data for the 30 respondents not recovered from the slides]

The data is stored in the SPSS data file survey, with income entered in units of thousands of dollars. The prediction of yearly income from age is of interest, and the 30 individuals selected for the data set are treated as a random sample for simple linear regression. It is assumed that for any age x in years, where 20 ≤ x, Y = "yearly income" has a $N(\mu_x, \sigma^2)$ distribution with $\mu_x = \beta_0 + \beta_1 x$.

9. - continued

(a) Use the Analyze > Regression > Linear options in SPSS to select the variable income for the Dependent slot and select the variable age for the Independent(s) section. To have the mean and standard deviation displayed for the dependent and independent variables, click on the Statistics button, and select the Descriptives option. To add predicted values and residuals to the data file, click on the Save button, select the Unstandardized option in the Predicted Values section, and select the Unstandardized option in the Residuals section. Do this exercise for homework.

10. Use the fact that the random variable $\dfrac{\text{SSE}}{\sigma^2}$ has a $\chi^2(n-2)$ distribution to derive a 100(1 – $\alpha$)% confidence interval for $\sigma^2$.

$$P\!\left(\chi^2_{1-\alpha/2}(n-2) < \frac{\text{SSE}}{\sigma^2} < \chi^2_{\alpha/2}(n-2)\right) = 1 - \alpha$$

$$P\!\left(\frac{1}{\chi^2_{1-\alpha/2}(n-2)} > \frac{\sigma^2}{\text{SSE}} > \frac{1}{\chi^2_{\alpha/2}(n-2)}\right) = 1 - \alpha$$

$$P\!\left(\frac{\text{SSE}}{\chi^2_{1-\alpha/2}(n-2)} > \sigma^2 > \frac{\text{SSE}}{\chi^2_{\alpha/2}(n-2)}\right) = 1 - \alpha$$

$$P\!\left(\frac{\text{SSE}}{\chi^2_{\alpha/2}(n-2)} < \sigma^2 < \frac{\text{SSE}}{\chi^2_{1-\alpha/2}(n-2)}\right) = 1 - \alpha$$
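In code (our sketch; scipy's chi-square quantiles replace the table lookups), applied to the grip-strength exercise where SSE = 704 and n = 12:

```python
from scipy import stats

def sigma2_ci(sse, n, alpha=0.05):
    """100(1 - alpha)% confidence interval for sigma^2 from SSE/sigma^2 ~ chi^2(n-2)."""
    lower = sse / stats.chi2.ppf(1 - alpha / 2, n - 2)   # divide by the upper point
    upper = sse / stats.chi2.ppf(alpha / 2, n - 2)       # divide by the lower point
    return lower, upper

print(sigma2_ci(704.0, 12))   # roughly (34.4, 216.8)
```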