1
Regression Analysis: Statistical Inference
2
Simple Linear Regression Model (SLR)
Assume the relationship to be linear: $y = \beta_0 + \beta_1 x + \varepsilon$
Where
y = dependent variable
x = independent variable
$\beta_0$ = y-intercept
$\beta_1$ = slope
$\varepsilon$ = random error
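As a minimal sketch of this model, the snippet below simulates data from $y = \beta_0 + \beta_1 x + \varepsilon$ and recovers the two coefficients by least squares. The parameter values, the seed, and the helper name `fit_slr` are illustrative assumptions, not values from the slides.

```python
import random

def fit_slr(x, y):
    """Return least squares (intercept, slope) for y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Assumed "true" parameters, chosen only for illustration.
beta0, beta1, sigma = 2.0, 0.5, 1.0
rng = random.Random(0)
x = [i / 10 for i in range(200)]
y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]

b0_hat, b1_hat = fit_slr(x, y)
print(f"b0_hat = {b0_hat:.3f}, b1_hat = {b1_hat:.3f}")
```

The estimates should land near the assumed values but not exactly on them, which is the point of the random error term.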
3
Random Error Component ($\varepsilon$)
Makes this a probabilistic model...
Represents uncertainty: random variation not explained by x
Deterministic Model = Exact relationship
Examples:
Temperature: °F = (9/5) °C + 32
Assets = Liabilities + Equity
Probabilistic Model = Deterministic Model + Error
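The contrast can be sketched in a few lines. The deterministic function is the temperature conversion from the slide. The noisy version is a hypothetical "thermometer with measurement error", and the error standard deviation of 0.5 is an assumption made only for illustration.

```python
import random

def fahrenheit(celsius):
    """Deterministic model: exact conversion, no error term."""
    return celsius * 9 / 5 + 32

def noisy_reading(celsius, sigma, rng):
    """Probabilistic model: deterministic part plus a random error."""
    return fahrenheit(celsius) + rng.gauss(0, sigma)

rng = random.Random(1)
exact = fahrenheit(100)
readings = [noisy_reading(100, 0.5, rng) for _ in range(5)]
print("deterministic:", exact)
print("probabilistic:", [round(r, 2) for r in readings])
```

Repeated calls to the deterministic model always agree, while the probabilistic readings scatter around the same value.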
4
Model Parameters
$\beta_0$ and $\beta_1$ are estimated from the data
Data collected as a pair (x,y)
5
Model Assumptions
$E(\varepsilon) = 0$
$\mathrm{Var}(\varepsilon) = \sigma^2$
$\varepsilon$ is normally distributed
The $\varepsilon_i$ are independent
Before performing regression analysis, these assumptions should be validated.
6
Assumptions for Regression
Unknown Relationship: $Y = \beta_0 + \beta_1 X$
Recall that the model for the linear regression has the form $Y = \beta_0 + \beta_1 X + \varepsilon$. When you perform a regression analysis, several assumptions about the distribution of the error terms must be met to provide valid tests of hypothesis and confidence intervals. The assumptions are that the error terms, $\varepsilon \sim \text{i.i.d. } N(0, \sigma^2)$,
have a mean of 0 at each value of the predictor variable
are normally distributed at each value of the predictor variable
have the same variance at each value of the predictor variable
are independent, thus making them i.i.d.
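One informal way to start checking these assumptions is to look at the residuals of a fitted line. The sketch below uses a small hypothetical data set (not from the slides) and compares the residual spread in the lower and upper halves of the x range. With an intercept in the model the residual mean is zero by construction, so the comparison of spreads is the informative part. This is a rough screen, not a formal test.

```python
def residuals(x, y):
    """Residuals e_i = y_i - (b0 + b1 * x_i) from a least squares fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
    b0 = y_bar - b1 * x_bar
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

def mean_square(values):
    return sum(v * v for v in values) / len(values)

# Hypothetical data, already sorted by x.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

e = residuals(x, y)
half = len(e) // 2
print("mean residual:", sum(e) / len(e))
print("mean squared residual, low x: ", mean_square(e[:half]))
print("mean squared residual, high x:", mean_square(e[half:]))
```

A large gap between the two mean squares would hint at non-constant variance and would be worth following up with a residual plot.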
7
Scatter Plot of Correct Model
Y = X   $R^2 = 0.67$
To illustrate the importance of plotting data, consider the following four examples. In each example, the scatter plot of the data values is different. However, the regression equation and the R-square statistic are the same. In the first plot, a regression line adequately describes the data.
8
Scatter Plot of Curvilinear Model
Y = X   $R^2 = 0.67$
In the second plot, a simple linear regression model is not appropriate because you are fitting a straight line through a curvilinear relationship.
9
Scatter Plot of Outlier Model
Y = X   $R^2 = 0.67$
In the third plot, there seems to be an outlying data value that is affecting the regression line.
10
Scatter Plot of Influential Model
Y = X   $R^2 = 0.67$
In the fourth plot, the outlying data point dramatically changes the fit of the regression line. In fact, the slope would be undefined without the outlier.
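The four plots described above match the pattern of Anscombe's quartet, a well-known set of four small data sets with nearly identical fitted lines and $R^2$. The slides do not name their data, so using the quartet here is an assumption, and the labels in the dictionary are only borrowed from the slide titles. The sketch fits each set and also checks the remark about the fourth plot: once the single outlying point is dropped, all remaining x values are equal, so $SS_{xx} = 0$ and the slope cannot be computed.

```python
def fit_and_r2(x, y):
    """Return (intercept, slope, R^2) for a simple linear regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1, ss_xy ** 2 / (ss_xx * ss_yy)

# Anscombe's quartet (assumed to be the kind of data behind the slides).
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "correct": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "curvilinear": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "outlier": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "influential": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

results = {name: fit_and_r2(x, y) for name, (x, y) in quartet.items()}
for name, (b0, b1, r2) in results.items():
    print(f"{name:12s} b0 = {b0:.2f}  b1 = {b1:.2f}  R^2 = {r2:.2f}")

# Fourth set without its single x = 19 point: every x equals 8.
x4_rest = [xi for xi in x4 if xi != 19]
mean_rest = sum(x4_rest) / len(x4_rest)
ss_xx_rest = sum((xi - mean_rest) ** 2 for xi in x4_rest)
print("SSxx without the outlier:", ss_xx_rest)
```

The summary statistics agree across the four sets even though the scatter plots differ, which is why the data should be plotted before the fit is trusted.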
11
Homogeneous Variance
12
Heterogeneous Variance
13
Model Assumptions (Cont.)
Recall $\varepsilon \sim N(0, \sigma^2)$
$\sigma^2$ is unknown and must be estimated
Recall the one-sample case: $s^2 = \sum (y_i - \bar{y})^2 / (n - 1)$
In regression, we have the Mean Squared Error (MSE) to estimate $\sigma^2$.
14
Degrees of Freedom (df)
In general, the df associated with the estimation of $\sigma^2$ in regression is n - (k + 1)
where
n = sample size
k = number of independent variables
"1" represents the intercept
15
Degrees of Freedom - Example
Model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$
Degrees of freedom associated with this model are n - (3 + 1) = n - 4
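The rule n - (k + 1) is simple enough to wrap in a helper. The function name `residual_df` and the sample size of 30 are hypothetical, chosen only to show the arithmetic for the three-predictor model.

```python
def residual_df(n, k):
    """Degrees of freedom for estimating sigma^2: n - (k + 1)."""
    if n <= k + 1:
        raise ValueError("need more observations than estimated coefficients")
    return n - (k + 1)

# Three independent variables and an assumed sample size of 30.
print(residual_df(n=30, k=3))
# Simple linear regression (k = 1) gives the familiar n - 2.
print(residual_df(n=20, k=1))
```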
16
What Does MSE Mean? (see – Central Company Output)
Just like the sample variance, a more intuitive meaning would come from the standard deviation, $s = \sqrt{MSE}$.
Approximately 95% of the observed y values should fall within $\pm 2s$ of their predicted values.
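A minimal sketch of the calculation for simple linear regression, where MSE = SSE / (n - 2). The four data points are hypothetical and kept tiny so the arithmetic can be followed by hand. They are not the Central Company output mentioned in the slide title.

```python
import math

def mse_and_s(x, y):
    """Return (MSE, s) for an SLR fit, with MSE = SSE / (n - 2)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    mse = sse / (n - 2)
    return mse, math.sqrt(mse)

# Hypothetical data.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 2.0, 4.0]
mse, s = mse_and_s(x, y)
print(f"MSE = {mse:.3f}, s = {s:.3f}, rough 95% band = +/- {2 * s:.3f}")
```

Because s is in the units of y, the band of two standard deviations is easier to interpret than MSE itself.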
17
Inferences about $\beta_1$
Goal is to model the relationship between x and y via $y = \beta_0 + \beta_1 x + \varepsilon$
What does it mean if there is no relationship?
18
Inferences about $\beta_1$ (Cont.)
Graphically...
19
Inferences about $\beta_1$ (Cont.)
What hypothesis are we interested in?
We want to test whether $\beta_1$ is significantly different from 0
That is,
H0: $\beta_1 = 0$
H1: $\beta_1 \neq 0$
20
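Inferences about $\beta_1$ (Cont.)
Need the sampling distribution of the estimator of $\beta_1$
FACT: For the model $y = \beta_0 + \beta_1 x + \varepsilon$, with $\varepsilon \sim N(0, \sigma^2)$, the LS estimator of $\beta_1$ is normal with a mean of $\beta_1$ and a variance of $\sigma^2 / SS_{xx}$.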
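The stated mean and variance can be checked roughly by simulation. The sketch below repeatedly generates data from the model with fixed x values, refits the slope each time, and compares the empirical mean and variance of the slopes with $\beta_1$ and $\sigma^2 / SS_{xx}$. The parameter values, the x grid, the seed, and the number of replications are all assumptions made for illustration.

```python
import random
import statistics

# Assumed "true" values for the simulation.
beta0, beta1, sigma = 1.0, 2.0, 3.0
x = [float(i) for i in range(1, 21)]
x_bar = sum(x) / len(x)
ss_xx = sum((xi - x_bar) ** 2 for xi in x)

rng = random.Random(42)
slopes = []
for _ in range(5000):
    y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]
    y_bar = sum(y) / len(y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slopes.append(ss_xy / ss_xx)

mean_slope = statistics.mean(slopes)
var_slope = statistics.variance(slopes)
theory_var = sigma ** 2 / ss_xx
print(f"mean of slopes     = {mean_slope:.4f} (beta1 = {beta1})")
print(f"variance of slopes = {var_slope:.5f} (sigma^2 / SSxx = {theory_var:.5f})")
```

The two variances should agree to within simulation noise, and a histogram of `slopes` would look approximately bell-shaped.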
21
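Inferences about $\beta_1$ (Cont.)
Test Statistic: $t = \dfrac{\hat{\beta}_1 - \beta_1}{s / \sqrt{SS_{xx}}}$, where $s = \sqrt{MSE}$
Has (n - 2) degrees of freedom for SLR
The hypothesized $\beta_1$ will normally be 0, because we just want to determine whether there is a relationship between x and y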
22
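Hypothesis Test for $\beta_1$
Null Hypothesis: H0: $\beta_1 = 0$
Alternative Hypothesis:
(a) H1: $\beta_1 < 0$
(b) H1: $\beta_1 > 0$
(c) H1: $\beta_1 \neq 0$
Test Statistic: $t_{obs} = \dfrac{\hat{\beta}_1 - 0}{s / \sqrt{SS_{xx}}}$
Rejection Region:
(a) Reject H0 if $t_{obs} < -t_{\alpha, df}$
(b) Reject H0 if $t_{obs} > t_{\alpha, df}$
(c) Reject H0 if $t_{obs} < -t_{\alpha/2, df}$ or if $t_{obs} > t_{\alpha/2, df}$
Decision and Conclusion in terms of the problem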
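The two-sided version of this test can be sketched as follows. The function name `slope_t_test` and the data are hypothetical. The critical value is passed in by the caller (here 2.306, the tabled $t_{0.025, 8}$ for n = 10), because the standard library has no t-distribution quantile function.

```python
import math

def slope_t_test(x, y, t_crit, beta1_null=0.0):
    """Two-sided t test of H0: beta1 = beta1_null. Returns (t_obs, reject)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))          # s = sqrt(MSE), df = n - 2
    t_obs = (b1 - beta1_null) / (s / math.sqrt(ss_xx))
    return t_obs, abs(t_obs) > t_crit

# Hypothetical data with n = 10, so df = 8.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
t_obs, reject = slope_t_test(x, y, t_crit=2.306)
print(f"t_obs = {t_obs:.2f}, reject H0: {reject}")
```

For a one-sided alternative, the comparison would use $t_{\alpha, df}$ and only one tail, as in cases (a) and (b).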
23
Confidence Interval for $\beta_1$
A $100(1-\alpha)\%$ confidence interval (CI) for $\beta_1$ is given by $\hat{\beta}_1 \pm t_{\alpha/2,\, n-2} \cdot s / \sqrt{SS_{xx}}$
Interpretation: We are $100(1-\alpha)\%$ confident that the true mean change in response per unit change in x is between the LCL and the UCL for $\beta_1$.
What affects the CI?
confidence level
sample size
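To see how the confidence level changes the interval, the sketch below computes the interval from summary statistics for two tabled critical values with 18 degrees of freedom. The summary numbers (`b1_hat`, `s`, `ss_xx`) and the function name `slope_ci` are hypothetical.

```python
import math

def slope_ci(b1_hat, s, ss_xx, t_crit):
    """Return (LCL, UCL) = b1_hat -/+ t_crit * s / sqrt(SSxx)."""
    half_width = t_crit * s / math.sqrt(ss_xx)
    return b1_hat - half_width, b1_hat + half_width

# Hypothetical summary statistics from a fit with n = 20 (df = 18).
b1_hat, s, ss_xx = 1.0, 0.5, 10.0
ci_95 = slope_ci(b1_hat, s, ss_xx, t_crit=2.101)  # t_{0.025, 18}
ci_99 = slope_ci(b1_hat, s, ss_xx, t_crit=2.878)  # t_{0.005, 18}
print("95% CI:", tuple(round(v, 3) for v in ci_95))
print("99% CI:", tuple(round(v, 3) for v in ci_99))
```

A higher confidence level widens the interval. A larger sample typically increases $SS_{xx}$ and lowers the critical value, and both effects narrow it.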
24
Inferences about Slope - Example 1
The director of admissions of a small college administered a newly designed entrance test to 20 students selected at random from the new freshman class in a study to determine whether a student’s grade point average (GPA) at the end of the freshman year (y) can be predicted from the entrance test score (x).
25
Inferences about Slope - Example 1 (Cont.)
Obtain the least squares estimates of $\beta_0$ and $\beta_1$, and state the estimated regression equation.
26
Inferences about Slope - Example 1 (Cont.)
Obtain a 99% confidence interval for $\beta_1$. Interpret your confidence interval.
$\alpha = 0.01$, $\alpha/2 = 0.005$, df = n - 2 = 20 - 2 = 18
$t_{0.005, 18} = 2.878$
99% Confidence interval is: $\hat{\beta}_1 \pm 2.878 \cdot (0.4350)/3.0199$ = (0.4253, )
Interpretation: We are 99% confident that the true value of $\beta_1$ is contained in the above interval.
Meaning: ?
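The margin of error in this example can be reproduced from the three numbers quoted on the slide. Reading 0.4350 as $s$ and 3.0199 as $\sqrt{SS_{xx}}$ is an assumption based on the form of the confidence interval formula. The slide does not label them.

```python
# Values quoted on the slide.
t_crit = 2.878        # t_{0.005, 18}
s = 0.4350            # numerator; assumed to be s = sqrt(MSE)
sqrt_ss_xx = 3.0199   # denominator; assumed to be sqrt(SSxx)

std_err = s / sqrt_ss_xx      # estimated standard error of the slope
margin = t_crit * std_err     # half-width of the 99% interval
print(f"standard error of slope = {std_err:.4f}")
print(f"margin of error         = {margin:.4f}")
```

The interval limits are the slope estimate minus and plus this margin.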
28
F Test for Linear Regression Model
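To test H0: $\beta_1 = \beta_2 = \dots = \beta_k = 0$ versus Ha: At least one of $\beta_1, \beta_2, \dots, \beta_k$ is not equal to 0
Test Statistic: $F(\text{model}) = \dfrac{SSR / k}{SSE / [n - (k + 1)]}$, where SSR is the explained (regression) sum of squares and SSE is the unexplained (error) sum of squares
Reject H0 in favor of Ha if: F(model) > $F_\alpha$ or p-value < $\alpha$
$F_\alpha$ is based on k numerator and n - (k + 1) denominator degrees of freedom.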
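A minimal sketch of the overall F statistic, written as mean square regression over mean square error. The function name `overall_f` and the sums of squares in the usage lines are hypothetical. The critical value $F_\alpha$ would still come from a table or a statistics package.

```python
def overall_f(ssr, sse, n, k):
    """F(model) = (SSR / k) / (SSE / (n - (k + 1)))."""
    df_error = n - (k + 1)
    if k < 1 or df_error < 1:
        raise ValueError("need k >= 1 and n > k + 1")
    return (ssr / k) / (sse / df_error)

# Hypothetical sums of squares for a model with k = 4 predictors and n = 25.
f_model = overall_f(ssr=80.0, sse=20.0, n=25, k=4)
print(f"F(model) = {f_model:.2f} on 4 and 20 degrees of freedom")
```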
31
The Partial F Test: Testing the Significance of a Portion of a Regression Model
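To test H0: $\beta_{g+1} = \beta_{g+2} = \dots = \beta_k = 0$ versus Ha: At least one of $\beta_{g+1}, \beta_{g+2}, \dots, \beta_k$ is not equal to 0
Partial F Statistic: $F = \dfrac{(SSE_R - SSE_C) / (k - g)}{SSE_C / [n - (k + 1)]}$, where $SSE_R$ is the SSE of the reduced model (first g variables only) and $SSE_C$ is the SSE of the complete model (all k variables)
Reject H0 in favor of Ha if: F > $F_\alpha$ or p-value < $\alpha$
$F_\alpha$ is based on k - g numerator and n - (k + 1) denominator degrees of freedom.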
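The partial F statistic compares the error sum of squares of a reduced model (the first g variables) with that of the complete model (all k variables). The sketch below assumes both SSE values are already available from two separate fits. The function name `partial_f` and the numbers are hypothetical.

```python
def partial_f(sse_reduced, sse_complete, n, k, g):
    """Partial F = ((SSE_R - SSE_C) / (k - g)) / (SSE_C / (n - (k + 1)))."""
    df_num = k - g
    df_den = n - (k + 1)
    if df_num < 1 or df_den < 1:
        raise ValueError("need g < k and n > k + 1")
    if sse_reduced < sse_complete:
        raise ValueError("the reduced model cannot fit better than the complete model")
    return ((sse_reduced - sse_complete) / df_num) / (sse_complete / df_den)

# Hypothetical: does adding variables 3 and 4 to a two-variable model help?
f_partial = partial_f(sse_reduced=30.0, sse_complete=20.0, n=25, k=4, g=2)
print(f"partial F = {f_partial:.2f} on 2 and 20 degrees of freedom")
```

With g = 0 the reduced model contains only the intercept, and the statistic reduces to the overall F test.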
32
Multiple Regression Salsberry Realty
33
Estimation & Prediction
The fitted SLR model is $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
Estimating the mean of y at a given value of x, say $x_p$, yields the same value as predicting an individual y at $x_p$.
The difference is in the precision of the estimate... the sampling errors
34
Estimation & Prediction (Cont.)
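Sampling Error for the Estimate of the mean of y at $x_p$: $\sigma_{\hat{y}} = \sigma \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$
Sampling Error for the Prediction of y at $x_p$: $\sigma_{(y - \hat{y})} = \sigma \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$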
35
Estimation & Prediction (Cont.)
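A $100(1-\alpha)\%$ Confidence Interval for the mean of y at $x = x_p$ is given by $\hat{y} \pm t_{\alpha/2,\, n-2} \cdot s \sqrt{\dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$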
36
Estimation & Prediction (Cont.)
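A $100(1-\alpha)\%$ Prediction Interval for y at $x = x_p$ is given by $\hat{y} \pm t_{\alpha/2,\, n-2} \cdot s \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_p - \bar{x})^2}{SS_{xx}}}$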
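Both intervals share the same center and differ only in the extra "1 +" under the square root. The sketch below computes the two half-widths side by side. The function name `intervals_at` and the four data points are hypothetical, and the critical value 4.303 is the tabled $t_{0.025, 2}$ for this tiny sample.

```python
import math

def intervals_at(x, y, x_p, t_crit):
    """Return (y_hat, ci_half_width, pi_half_width) at x = x_p for an SLR fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ss_xx
    b0 = y_bar - b1 * x_bar
    s = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))
    y_hat = b0 + b1 * x_p
    h = 1 / n + (x_p - x_bar) ** 2 / ss_xx      # term shared by both intervals
    ci_half = t_crit * s * math.sqrt(h)         # mean of y at x_p
    pi_half = t_crit * s * math.sqrt(1 + h)     # individual y at x_p
    return y_hat, ci_half, pi_half

# Hypothetical data, n = 4, so df = 2.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 2.0, 4.0]
y_hat, ci_half, pi_half = intervals_at(x, y, x_p=1.5, t_crit=4.303)
print(f"y_hat = {y_hat:.3f}")
print(f"CI for mean of y: +/- {ci_half:.3f}")
print(f"PI for a new y:   +/- {pi_half:.3f}")
```

The prediction interval is always the wider of the two, and both widen as $x_p$ moves away from $\bar{x}$.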