Inference about the Slope and Intercept

Presentation transcript:

Inference about the Slope and Intercept

Recall, we have established that the least squares estimates b0 and b1 are linear combinations of the Yi's. Further, we have shown that they are unbiased and have the following variances:

Var(b1) = σ²/SXX and Var(b0) = σ²(1/n + X̄²/SXX), where SXX = Σ(Xi − X̄)².

In order to make inference we assume that the εi's have a Normal distribution, that is, εi ~ N(0, σ²). This in turn means that the Yi's are normally distributed. Since both b0 and b1 are linear combinations of the Yi's, they also have Normal distributions.

Inference for β1 in the Normal Error Regression Model

The least squares estimate of β1 is b1. Because it is a linear combination of normally distributed random variables (the Yi's), we have the following result:

b1 ~ N(β1, σ²/SXX).

We estimate the variance of b1 by S²/SXX, where S² is the MSE, which has n − 2 df.

Claim: The distribution of (b1 − β1)/(S/√SXX) is t with n − 2 df. Proof: see the sketch below.
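A minimal sketch of the standard argument (the slide leaves the proof blank): it combines the normality of b1 with the chi-square distribution of the scaled MSE, which is independent of b1. In LaTeX notation,

\[
Z = \frac{b_1 - \beta_1}{\sigma/\sqrt{S_{XX}}} \sim N(0,1), \qquad
\frac{(n-2)S^2}{\sigma^2} \sim \chi^2_{n-2},
\]

with Z independent of S². Hence

\[
\frac{b_1 - \beta_1}{S/\sqrt{S_{XX}}}
= \frac{Z}{\sqrt{\frac{(n-2)S^2/\sigma^2}{n-2}}} \sim t_{n-2},
\]

by the definition of the t distribution as a standard normal over the square root of an independent chi-square divided by its df.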

Tests and CIs for β1

The hypothesis of interest about the slope in a Normal linear regression model is H0: β1 = 0. The test statistic for this hypothesis is

t* = b1/(S/√SXX).

We compare the above test statistic to a t distribution with n − 2 df to obtain the P-value. Further, a 100(1 − α)% CI for β1 is

b1 ± t(α/2; n−2) · S/√SXX.
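A minimal sketch of these computations in Python, assuming x and y are arrays of predictor and response values (the function name and setup are illustrative, not from the slides):

import numpy as np
from scipy import stats

def slope_inference(x, y, alpha=0.05):
    """t test of H0: beta1 = 0 and 100(1-alpha)% CI for the slope in SLR."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    xbar = x.mean()
    sxx = np.sum((x - xbar) ** 2)
    b1 = np.sum((x - xbar) * (y - y.mean())) / sxx   # least squares slope
    b0 = y.mean() - b1 * xbar                        # least squares intercept
    resid = y - (b0 + b1 * x)
    s2 = np.sum(resid ** 2) / (n - 2)                # MSE, estimates sigma^2
    se_b1 = np.sqrt(s2 / sxx)                        # estimated SD of b1
    t_stat = b1 / se_b1                              # test statistic for H0: beta1 = 0
    p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)  # two-sided P-value
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    return b1, t_stat, p_value, ci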

Important Comment

Similar results can be obtained about the intercept in a Normal linear regression model; see the book for details. However, in many cases the intercept has no practical meaning, and it is then not necessary to make inference about it.

Example

We have data on violent and property crimes in 23 US metropolitan areas. The data contain the following three variables:

violcrim = number of violent crimes
propcrim = number of property crimes
popn = population in 1000's

We are interested in the relationship between the size of the city and the number of violent crimes…
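A sketch of how such a fit could be run in Python; the file name crime.csv and its layout are assumptions for illustration, not given in the slides:

import numpy as np
from scipy import stats

# Hypothetical file: one header row with columns violcrim, propcrim, popn.
data = np.genfromtxt("crime.csv", delimiter=",", names=True)
x = data["popn"]       # city population in 1000's (predictor)
y = data["violcrim"]   # number of violent crimes (response)

res = stats.linregress(x, y)   # simple linear regression of y on x
print(f"b1 = {res.slope:.3f}, se(b1) = {res.stderr:.3f}, p = {res.pvalue:.4f}")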

Prediction of Mean Response

Very often we want to use the estimated regression line to make predictions about the mean of the response for a particular X value (assumed to be fixed). We know that the least squares line Ŷ = b0 + b1X is an estimate of E(Y) = β0 + β1X. Now, we can pick a point Xh in the range of the data; then Ŷh = b0 + b1Xh is an estimate of E(Yh) = β0 + β1Xh.

Claim: Var(Ŷh) = σ²(1/n + (Xh − X̄)²/SXX). This is the variance of the estimate of E(Y) when X = Xh. Proof: see the sketch below.
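A sketch of the variance calculation (a standard argument; the slide leaves the proof blank). Writing b0 = Ȳ − b1X̄ gives Ŷh = Ȳ + b1(Xh − X̄), and Ȳ and b1 are uncorrelated, so

\[
\operatorname{Var}(\hat{Y}_h)
= \operatorname{Var}(\bar{Y}) + (X_h - \bar{X})^2 \operatorname{Var}(b_1)
= \frac{\sigma^2}{n} + (X_h - \bar{X})^2 \frac{\sigma^2}{S_{XX}}
= \sigma^2\left(\frac{1}{n} + \frac{(X_h - \bar{X})^2}{S_{XX}}\right).
\]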

Confidence Interval for E(Yh)

For a given Xh, a 100(1 − α)% CI for the mean value of Y is

Ŷh ± t(α/2; n−2) · s√(1/n + (Xh − X̄)²/SXX).

Note, the CI above gets wider the further Xh is from X̄.

Example

Consider the snow gauge data. Suppose we wish to predict the mean loggain when the device was calibrated at density 0.5, that is, when Xh = 0.5…
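A minimal sketch in Python, assuming density and loggain are numpy arrays holding the snow gauge data (the names are illustrative, not from the slides):

import numpy as np
from scipy import stats

def mean_response_ci(x, y, xh, alpha=0.05):
    """100(1-alpha)% CI for E(Y) at X = xh in simple linear regression."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    xbar = x.mean()
    sxx = np.sum((x - xbar) ** 2)
    b1 = np.sum((x - xbar) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * xbar
    s2 = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)     # MSE
    yh_hat = b0 + b1 * xh                               # estimate of E(Y_h)
    se = np.sqrt(s2 * (1 / n + (xh - xbar) ** 2 / sxx)) # SE of the mean response
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    return yh_hat - t_crit * se, yh_hat + t_crit * se

# e.g. mean_response_ci(density, loggain, xh=0.5)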

Prediction of New Observation

We want to use the regression line to predict a particular value of Y for a given X = Xh,new, a new point taken after the n observations. The predicted value of a new point measured when X = Xh,new is Ŷh,new = b0 + b1Xh,new.

Note, the above predicted value is the same as the estimate of E(Y) at Xh,new, but it should have larger variance. The predicted value has two sources of variability. One is due to the regression line being estimated by b0 + b1X. The second is due to εh,new, i.e., points don't fall exactly on the line. To calculate the variance of the prediction error we look at the difference Yh,new − Ŷh,new.
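A sketch of the calculation, assuming the new observation is independent of the original n points (so the two sources of variability add):

\[
\operatorname{Var}(Y_{h,\text{new}} - \hat{Y}_{h,\text{new}})
= \operatorname{Var}(\varepsilon_{h,\text{new}}) + \operatorname{Var}(\hat{Y}_{h,\text{new}})
= \sigma^2\left(1 + \frac{1}{n} + \frac{(X_{h,\text{new}} - \bar{X})^2}{S_{XX}}\right).
\]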

Prediction Interval for New Observation

A 100(1 − α)% prediction interval for Yh,new when X = Xh,new is

Ŷh,new ± t(α/2; n−2) · s√(1 + 1/n + (Xh,new − X̄)²/SXX).

This is not a confidence interval; CIs are for parameters, and here we are estimating the value of a random variable.
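The corresponding computation as a sketch; compared with mean_response_ci above, the only change is the extra "1 +" inside the square root, reflecting the εh,new term:

import numpy as np
from scipy import stats

def prediction_interval(x, y, xh, alpha=0.05):
    """100(1-alpha)% prediction interval for a new Y at X = xh."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    xbar = x.mean()
    sxx = np.sum((x - xbar) ** 2)
    b1 = np.sum((x - xbar) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * xbar
    s2 = np.sum((y - (b0 + b1 * x)) ** 2) / (n - 2)         # MSE
    se = np.sqrt(s2 * (1 + 1 / n + (xh - xbar) ** 2 / sxx)) # note the extra 1 +
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
    yh_hat = b0 + b1 * xh
    return yh_hat - t_crit * se, yh_hat + t_crit * se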

Confidence Bands for E(Y)

Confidence bands capture the true mean of Y, E(Y) = β0 + β1X, everywhere over the range of the data. For this we use the Working–Hotelling procedure, which gives the following boundary values at any given Xh:

Ŷh ± W · s√(1/n + (Xh − X̄)²/SXX), where W² = 2F(2, n−2; α)

and F(2, n−2; α) is the upper α-quantile of an F distribution with 2 and n − 2 df (Table B.4).

Decomposition of Sum of Squares

The total sum of squares (SS) in the response variable is SSTO = Σ(Yi − Ȳ)². The total SS can be decomposed into two main sources: error SS and regression SS. The error SS is SSE = Σ(Yi − Ŷi)². The regression SS is SSR = Σ(Ŷi − Ȳ)². It is the amount of variation in the Y's that is explained by the linear relationship of Y with X.

Claims

First, SSTO = SSR + SSE, that is, Σ(Yi − Ȳ)² = Σ(Ŷi − Ȳ)² + Σ(Yi − Ŷi)². Proof: see the sketch below. An alternative decomposition is SSR = b1²·SXX. Proof: exercise.
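A sketch of the first claim (a standard argument; the slide leaves the proof blank). Adding and subtracting Ŷi,

\[
\sum (Y_i - \bar{Y})^2
= \sum \big[(Y_i - \hat{Y}_i) + (\hat{Y}_i - \bar{Y})\big]^2
= \text{SSE} + \text{SSR} + 2\sum (Y_i - \hat{Y}_i)(\hat{Y}_i - \bar{Y}),
\]

and the cross term vanishes because the least squares residuals ei = Yi − Ŷi satisfy the normal equations Σei = 0 and ΣeiXi = 0, hence ΣeiŶi = Σei(b0 + b1Xi) = 0.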

Analysis of Variance Table

The decomposition of SS discussed above is usually summarized in an analysis of variance (ANOVA) table as follows:

Source       df      SS      MS                 F
Regression   1       SSR     MSR = SSR/1        MSR/MSE
Error        n − 2   SSE     MSE = SSE/(n − 2)
Total        n − 1   SSTO

Note that the MSE is s², our estimate of σ².

Coefficient of Determination

The coefficient of determination is R² = SSR/SSTO = 1 − SSE/SSTO. It must satisfy 0 ≤ R² ≤ 1. R² gives the proportion of variation in the Y's that is explained by the regression line.

Claim

R² = r², that is, the coefficient of determination is the square of the correlation coefficient. Proof: see the sketch below.
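A sketch (the slide leaves the proof blank), writing SXY = Σ(Xi − X̄)(Yi − Ȳ) and SYY = SSTO, and using b1 = SXY/SXX together with SSR = b1²SXX from the previous slide:

\[
R^2 = \frac{\text{SSR}}{\text{SSTO}}
= \frac{b_1^2 S_{XX}}{S_{YY}}
= \frac{S_{XY}^2 / S_{XX}}{S_{YY}}
= \left(\frac{S_{XY}}{\sqrt{S_{XX} S_{YY}}}\right)^2
= r^2.
\]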

Important Comments about R²

It is a useful measure, but:
- There is no absolute rule about how big it should be.
- It is not resistant to outliers.
- It is not meaningful for models with no intercept.
- It is not useful for comparing models unless one set of predictors is a subset of the other.

ANOVA F Test

The ANOVA table gives us another test of H0: β1 = 0. The test statistic is

F* = MSR/MSE,

which under H0 has an F distribution with 1 and n − 2 df. Derivation: see the sketch below.
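A sketch of why this agrees with the t test of the slope (a standard identity; the slide leaves the derivation blank). Since SSR = b1²SXX and MSE = S²,

\[
F^* = \frac{\text{MSR}}{\text{MSE}}
= \frac{b_1^2 S_{XX}}{S^2}
= \left(\frac{b_1}{S/\sqrt{S_{XX}}}\right)^2
= (t^*)^2,
\]

and the square of a t random variable with n − 2 df has an F(1, n − 2) distribution, so the two tests give the same P-value.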