Statistical inference for the slope and intercept in SLR


Statistical inference for the slope and intercept in SLR

In this topic, we first study the parameters β0 and β1 in the regression model, then learn how to compute confidence intervals and perform hypothesis tests about them.

Simple Linear Regression Model

Yi = β0 + β1·Xi + εi,  for i = 1, 2, …, n

Parameters: β0 is the intercept; β1 is the slope; the εi are independent, normally distributed random errors with mean 0 and variance σ², i.e., εi ~ iid N(0, σ²).

Throughout this topic and the remainder of the simple linear regression topics (unless otherwise stated), we assume the normal error regression model applies. In this model, β0 and β1 are parameters that define a linear function, the Xi are known constants, and the random errors εi are independent and normally distributed.
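As a minimal simulation sketch of this model in R (the values of β0, β1, σ, and the X range below are illustrative assumptions, not the lecture data):

set.seed(1)
n     <- 48
beta0 <- -260    # hypothetical intercept
beta1 <- 3700    # hypothetical slope
sigma <- 32      # hypothetical error standard deviation
x <- runif(n, 0.15, 0.35)                                # X treated as known constants
y <- beta0 + beta1 * x + rnorm(n, mean = 0, sd = sigma)  # eps_i ~ iid N(0, sigma^2)
fit <- lm(y ~ x)                                         # fit the SLR model
summary(fit)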

Inference for the slope, β1

Recall that

b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²,

which we can rewrite as

b1 = Σ ci (Yi − Ȳ) = Σ ci Yi (since Σ ci = 0),  where ci = (Xi − X̄) / Σ(Xi − X̄)²,

so b1 is normally distributed. It can be proved that

E{b1} = β1  and  Var{b1} = σ² / Σ(Xi − X̄)²  (denoted σ²{b1}).

Replacing the parameter σ² with MSE, its unbiased estimator, gives the point estimators

s²{b1} = MSE / Σ(Xi − X̄)²  and  s{b1} = sqrt( MSE / Σ(Xi − X̄)² ).

The point estimator b1 was given in Topic 1 and is shown here. b1 is a linear combination of the responses Yi, and each Yi is normally distributed, so b1 is also normally distributed. The sampling distribution of b1 describes the different values of b1 obtained in repeated sampling while the X values are held fixed. The mean of b1 is β1, and the variance of b1 is σ² divided by SSX. Replacing σ² with MSE yields the point estimates.
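Continuing the simulation sketch above, these estimators can be checked by hand in R (the object names ci, b1, mse, and s_b1 are mine):

ci   <- (x - mean(x)) / sum((x - mean(x))^2)   # the weights c_i
b1   <- sum(ci * y)                            # b1 as a linear combination of the Y_i
mse  <- sum(residuals(fit)^2) / (n - 2)        # MSE, the unbiased estimator of sigma^2
s_b1 <- sqrt(mse / sum((x - mean(x))^2))       # s{b1}
c(b1 = b1, s_b1 = s_b1)                        # matches coef(summary(fit))["x", 1:2]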

Sampling distribution of (b1 − β1)/s{b1}, denoted t*

Since b1 is normally distributed, the standardized statistic

(b1 − β1) / σ{b1} ~ N(0, 1).

Ordinarily we must estimate σ{b1} by s{b1}, the standard error, and hence are interested in the distribution of

t* = (b1 − β1) / s{b1} ~ t(n − 2).

There are n − 2 degrees of freedom because two parameters must be estimated to obtain the numerator of s²: SSE = Σ[Yi − (b0 + b1Xi)]².

When a statistic is standardized but the denominator is an estimated standard deviation rather than the true standard deviation, it is called a studentized statistic, or t. Here t* follows a t distribution with n − 2 degrees of freedom; two degrees of freedom are lost because two parameters (β0 and β1) must be estimated first.

Confidence interval for the slope β1

Since t* = (b1 − β1)/s{b1} ~ t(n − 2), we can make the probability statement

P( t(α/2; n−2) ≤ (b1 − β1)/s{b1} ≤ t(1 − α/2; n−2) ) = 1 − α,

where t(α/2; n−2) denotes the (α/2)·100 percentile of the t distribution with n − 2 degrees of freedom. Because the t distribution is symmetric about its mean 0, the upper and lower percentiles have the same magnitude with opposite signs:

t(α/2; n−2) = −t(1 − α/2; n−2).

Rearranging the inequalities in the probability statement, we obtain the (1 − α) confidence interval for β1:

b1 ± t(1 − α/2; n−2) · s{b1}

In general, a confidence interval has the form point estimate ± margin of error, where the margin of error (denoted ME) equals the t critical value times the standard error.
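Continuing the same sketch, a hand computation of this interval (assuming the objects n, b1, s_b1, and fit from the blocks above):

alpha <- 0.05
tc <- qt(1 - alpha / 2, df = n - 2)            # t(1 - alpha/2; n - 2)
c(lower = b1 - tc * s_b1, upper = b1 + tc * s_b1)
confint(fit, "x", level = 0.95)                # should agree with the line above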

Significance tests for β1

H0: β1 = β1*  vs.  Ha: β1 ≠ β1*

The test statistic is t* = (b1 − β1*)/s{b1} ~ t(n − 2).

For a two-sided test: reject H0 if |t*| ≥ tc, where tc = t(1 − α/2; n − 2); or reject H0 if the p-value ≤ α.

For a one-sided test: reject H0 if t* falls in the appropriate tail beyond tc = t(1 − α; n − 2), that is, t* ≥ tc for Ha: β1 > β1* (or t* ≤ −tc for Ha: β1 < β1*); or reject H0 if the p-value ≤ α.

Since the test statistic follows a t distribution, the test concerning β1 is a regular t test, which should have been covered in your previous statistics course. Here I assume everyone has a good understanding of how to perform the test, and just present the main rejection rules.
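Again continuing the sketch, the two-sided test statistic and p-value for H0: β1 = 0 can be computed by hand and compared with summary(fit):

t_star <- (b1 - 0) / s_b1                               # test statistic
p_two  <- 2 * pt(abs(t_star), df = n - 2, lower.tail = FALSE)  # two-sided p-value
c(t = t_star, p = p_two)   # matches the "t value" and "Pr(>|t|)" columns for x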

Inference for the intercept, β0

b0 = Ȳ − b1·X̄

It can be proved that

E{b0} = β0  and  Var{b0} = σ²[ 1/n + X̄² / Σ(Xi − X̄)² ]  (denoted σ²{b0}).

Replacing the parameter σ² with MSE, its unbiased estimator, gives the point estimators

s²{b0} = MSE[ 1/n + X̄² / Σ(Xi − X̄)² ]  and  s{b0} = sqrt( MSE[ 1/n + X̄² / Σ(Xi − X̄)² ] ).

Analogous to the theorem for b1,

t* = (b0 − β0)/s{b0} ~ t(n − 2).

Now let's switch to the intercept, β0. The point estimate b0 is the average of Y minus b1 times the average of X. Its sampling distribution consists of the different values of b0 that would be obtained in repeated sampling with the X values held fixed. The mean is the true β0, and the variance is again related to the residual variance σ², as shown here. Like the slope estimator b1, b0 is unbiased. Replacing σ² with MSE gives the standard error, denoted s{b0}, and we use the analogous t statistic for inference about β0.
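Continuing the simulation sketch, b0 and its standard error can be verified the same way (object names assumed from the earlier blocks):

b0   <- mean(y) - b1 * mean(x)                                   # point estimate of beta0
s_b0 <- sqrt(mse * (1 / n + mean(x)^2 / sum((x - mean(x))^2)))   # s{b0}
c(b0 = b0, s_b0 = s_b0)   # matches coef(summary(fit))["(Intercept)", 1:2]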

Confidence interval for β0:  b0 ± t(1 − α/2; n−2) · s{b0}

Significance tests for β0:  H0: β0 = β0*  vs.  Ha: β0 ≠ β0*, with test statistic t* = (b0 − β0*)/s{b0} ~ t(n − 2).

Comments on the inference assumptions

Both b1 and b0 follow normal distributions because they are linear combinations of the Yi, which are themselves independent and normally distributed: b1 = Σ ci Yi and b0 = Ȳ − b1·X̄. As long as the Yi are close to normal, inferences (CIs and hypothesis tests) based on the t distribution will be approximately correct, even with small sample sizes. In general, the CLT ensures that b0 and b1 are asymptotically normal as long as the random errors are independently and identically distributed (iid), so inferences based on the t distribution will be approximately correct as long as n is large enough.

Regarding the assumptions behind confidence intervals and hypothesis tests for β0 and β1: the sampling distributions of b0 and b1 are both normal since they are computed from Y. When the sample size is small, the t methods are approximately correct as long as Y is close to normal. Note that "close to normal" means Y need not follow a normal distribution exactly, but it should at least have a symmetric distribution with no outliers. When the sample size is large, b0 and b1 are asymptotically normal as long as the random errors are iid. This means that when Y is not close to normal, with a skewed pattern or outliers, you will need a large sample size to ensure the t methods are appropriate. There is no fixed rule for how large; an n of 25 to 40 is a good starting point in the general case for using the t test.

Comments on the inference assumptions (continued)

Often the value of the intercept is not of direct interest, so there is no need to calculate CIs or hypothesis tests for β0: it is just the single value of Y when X = 0 and is of little value for predicting other Y values. Caution again: the linear regression model might not be appropriate when the scope of the model is extended to X = 0.

Because σ²{b1} = σ² / Σ(Xi − X̄)², we can increase the precision of the estimator, i.e., reduce this variance, by increasing the spread in X, i.e., making Σ(Xi − X̄)² larger.

Now that the basic theory is laid out, I want to mention some practical issues. In a linear model, the slope means the change in Y when X changes by one unit. For example, in the diamond case, b1 = 3721: for every additional carat, the price goes up by $3721 on average. The intercept, on the other hand, is usually of no direct interest since it is the single value of Y when X equals 0; think about "what is the price when carat = 0?" The linear regression model might not be appropriate when the X scope of the model is extended to 0.

Second, because the variance of the slope estimator is σ² over SSX, one way to increase the precision of the estimator is to reduce the residual variance or to increase SSX. To do this, when designing an experiment, collect X values over a wider range: for example, if we can only study 100 diamond rings, try to collect rings with a variety of weights rather than many rings of similar weight. The precision of the estimators also depends on other factors, such as the difference between the sample size and the number of functional parameters (βs) to be estimated. A quick numerical sketch of the SSX effect follows.
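As a rough illustration of this point, the sketch below compares σ²{b1} for a narrow and a wide X design; the specific ranges are made up for illustration (sigma comes from the earlier simulation block):

x_narrow <- seq(0.24, 0.26, length.out = 48)   # weights clustered near 0.25 carat
x_wide   <- seq(0.15, 0.35, length.out = 48)   # weights spread over a wider range
sigma^2 / sum((x_narrow - mean(x_narrow))^2)   # large Var{b1}: small SSX
sigma^2 / sum((x_wide   - mean(x_wide))^2)     # much smaller Var{b1}: large SSX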

The diamond weight and price example: confidence interval for the slope β1

b1 ± t(1 − α/2; n−2) · s{b1},  where α = 0.05 and n = 48.

Now we compute the confidence interval and hypothesis test for the slope β1, showing both how to do it in R and by hand. In the diamond example, b1 is estimated to be 3721 and the residual standard error s is 31.84, from the lm output. We use a significance level of 0.05. The confint function computes the confidence interval in R. Its first argument is the lm model; the second is the target parameter (here we want the confidence interval for the slope, i.e., weight; by default it shows confidence intervals for both the intercept and the slope); the third argument is the confidence level, 1 − α.

Where α = 0.05, n = 48, df = 46 (round down to 40 for the t table).

Now see how to compute the interval by hand with the t table. The table does not list every degree of freedom or p-value, so we sometimes need to approximate. For example, df = 46 is not available, so we round down to the closest listed value, 40. The reason for using a smaller df is to get a larger t value and a wider interval, which is conservative. In this case, the t value for 40 df is 2.021, and we can use it to compute the confidence interval. In R, the qt function gives the t value for 46 df, 2.013. You may use both values in the homework, but use only the t table in the exam.
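For reference, a quick check of these two critical values with qt (the probability and degrees of freedom are the ones quoted above):

qt(0.975, df = 46)   # 2.0129, the exact critical value for df = 46
qt(0.975, df = 40)   # 2.0211, the value read from the t table's df = 40 row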

The diamond weight and price example: confidence interval for the slope β1

MSE = s² = 31.84² = 1013.8

s{b1} = sqrt( MSE / Σ(Xi − X̄)² ) = sqrt( 1013.8 / (sX²·(n − 1)) ) = sqrt( 1013.8 / (0.0568² × 47) ) = sqrt( 1013.8 / 0.152 ) = 81.7

b1 ± t(1 − α/2; n−2) · s{b1} = 3721 ± 2.013 × (81.79) = (3556.4, 3885.65),  where α = 0.05 and n = 48.

We now know that b1 is 3721 and the t value is 2.013 (or 2.021 from the t table). The last ingredient is the standard error of b1. In the R output, the standard error is given as 81.79. You can also compute it from the formula, using an MSE of 1013.8 and an SSX of 0.152. Note that SSX can be computed from the standard deviation of X: SSX = sX² × (n − 1), where sX is the usual sample standard deviation of X. This is a useful trick, especially for the exam; in case you have forgotten it from your past statistics course, I suggest you copy it onto your cheat sheet.

Conclusion: we are 95% confident that the average price increases by at least $3556 and at most $3886 when the weight increases by 1 carat. This matches the confint output.
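A sketch of the corresponding R steps; the data frame name `diamond` and the column names `price` and `weight` are assumptions based on the variable names quoted in this lecture:

fit_d <- lm(price ~ weight, data = diamond)   # assumed data frame and column names
confint(fit_d, "weight", level = 0.95)        # 95% CI for the slope only
confint(fit_d, level = 0.95)                  # default: intervals for intercept and slope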

The diamond weight and price example: hypothesis test for the slope β1

H0: β1 = 0  vs.  Ha: β1 ≠ 0

Test statistic: ts = (b1 − 0)/s{b1} = (3721 − 0)/81.79 = 45.5

p-value = 2·P(T > 45.5) < 0.0001 (or < 0.001 using the t table).

Since the p-value is less than the significance level, α = 0.05, reject H0.

Next consider a two-sided test for β1. The test statistic ts is 45.5; in the lm output in R, the last two columns give the t value and the p-value. When reporting the p-value, any value lower than 0.0001 can be reported as < 0.0001. Since the p-value is very small, we reject H0 and conclude that the slope is significantly different from 0. The result is consistent with the confidence interval for β1 computed on the previous page, which does not include 0 (all positive values).
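The two-sided p-value can be reproduced with pt; the df of 46 comes from n − 2 = 48 − 2:

ts <- 45.5
2 * pt(abs(ts), df = 46, lower.tail = FALSE)   # essentially 0, reported as < 0.0001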

Estimating the p-value from the t table: two-sided test, df = 40, ts = 45.5. Since ts = 45.5 > 3.551, the p-value < 0.001.

Now estimate the p-value with the t table. Using 40 degrees of freedom, the largest tabled t value is 3.551, and the corresponding two-sided p is 0.001. This means the probability is only 0.001 that a test statistic from this distribution is more extreme than 3.551 (> 3.551 or < −3.551), and the probability gets smaller as the test statistic gets larger. Our test statistic, 45.5, is much larger than 3.551, so the p-value is estimated to be less than 0.001.

The diamond weight and price example: hypothesis test for the slope β1 (one-sided test)

Comment: R output is usually for the two-sided test and can be adjusted for a one-sided test.

1. H0: β1 = 0  vs.  Ha: β1 > 0

Test statistic: ts = (b1 − 0)/s{b1} = (3721 − 0)/81.79 = 45.5

p-value = P(T > 45.5) < 0.00005 (or < 0.0005 using the t table).

Since the p-value is less than the significance level, α = 0.05, reject H0.

Suppose we want to do a one-sided test. The test statistic is the same as long as the data are the same: ts is also 45.5. For the one-sided alternative β1 > 0, the larger ts is, the less consistent the sample is with H0 and the more reason to reject H0, so the p-value is the area to the right. This p-value is the smaller tail area and can be read directly from the t table: since ts is greater than 3.551, the largest tabled value for df = 40, it occurs with probability less than 0.0005, so the p-value is < 0.0005.

The diamond weight and price example: hypothesis test for the slope β1 (one-sided test)

2. H0: β1 = 0  vs.  Ha: β1 < 0

Test statistic: ts = (b1 − 0)/s{b1} = (3721 − 0)/81.79 = 45.5

p-value = P(T < 45.5) > 1 − 0.00005 = 0.99995 (or > 0.9995 using the t table).

Since the p-value is greater than the significance level, α = 0.05, do not reject H0.

On the other hand, if the alternative points in the opposite direction, β1 < 0, the test statistic does not change since the sample data are the same, and we have the opposite of the previous example: the smaller ts is, the more evidence against H0, so the p-value is the area to the left. Now the p-value is the larger area, which is 1 minus the smaller area. Recall that the t table only gives the smaller tail area, which here is < 0.0005, so we obtain the p-value by subtracting from 1: p-value > 1 − 0.0005, i.e., p-value > 0.9995.
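For completeness, both one-sided p-values via pt (df = 46; ts is redefined here so the snippet stands alone):

ts <- 45.5
pt(ts, df = 46, lower.tail = FALSE)  # Ha: beta1 > 0, upper-tail area (essentially 0)
pt(ts, df = 46, lower.tail = TRUE)   # Ha: beta1 < 0, lower-tail area (essentially 1)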

The diamond weight and price example: hypothesis test for the intercept β0 (self exercise)

H0: β0 = 0  vs.  Ha: β0 ≠ 0

Test statistic: ts = (b0 − 0)/s{b0} = (−259.63 − 0)/17.32 = −14.99

p-value = 2·P(T > 14.99) < 0.0001

Since the p-value is less than the significance level, α = 0.05, reject H0.

As self practice, now try finding the confidence interval and hypothesis test for the intercept, β0.

The diamond weight and price example: confidence interval for the intercept β0 (self exercise)

Answer: b0 ± tc·s{b0} = −259.62 ± 2.013 × (17.31) = (−294.487, −224.765)

So β0 < 0; what does it mean? The confidence interval for the intercept runs from about −294 to −225. But the intercept is the value of Y when X = 0: the price of a diamond ring that does not exist (X = 0)? By itself it means nothing. This is an example of why we should consider the range of X (and hence Y) that is actually meaningful for the model, i.e., the "effective range" of the model. Having discussed the parameters β0 and β1 in the regression model, we will learn how to use the model to estimate the mean of Y or predict Y in the next topic.