1
Simple Linear Regression
Statistics 515 Lecture
2
Example for Illustration
The human body takes in more oxygen when exercising than when it is at rest. To deliver oxygen to the muscles, the heart must beat faster. Heart rate is easy to measure, but measuring oxygen uptake requires elaborate equipment. If oxygen uptake (VO2) can be accurately predicted from heart rate (HR), the predicted values may replace actually measured values for various research purposes. Unfortunately, not all human bodies are the same, so no single prediction equation works for all people. Researchers can, however, measure both HR and VO2 for one person under varying sets of exercise conditions and calculate a regression equation for predicting that person’s oxygen uptake from heart rate. 12/3/2018 Simple Linear Regression
3
Data From An Individual
Goals in this illustration:
1. Examine the scatterplot: is the relationship linear or not?
2. Obtain the best-fitting line using least squares.
3. Test whether the model is significant.
4. Obtain a confidence interval for the regression coefficient.
5. Obtain predictions.
4
Simple Linear Regression
The Scatterplot
5
Simple Linear Regression Model
1. Conditional on $X = x$, the response variable $Y$ has mean $\mu(x) = \alpha + \beta x$.
2. $\alpha$ is the y-intercept and $\beta$ is the slope of the regression line; $\beta$ can be interpreted as the change in the mean of $Y$ per unit change in the independent variable.
3. For each $X = x$, the conditional distribution of $Y$ is normal with mean $\mu(x)$ and variance $\sigma^2$.
4. $Y_1, Y_2, \ldots, Y_n$ are independent of each other.
Shorthand: $Y_i = \alpha + \beta x_i + \varepsilon_i$ with $\varepsilon_i$ IID $N(0, \sigma^2)$.
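One way to get a feel for these assumptions is to simulate from the model. The sketch below is illustrative only: the function name `simulate_slr` and every parameter value are assumptions made for the example, not quantities from the lecture.

```python
import random

def simulate_slr(xs, alpha, beta, sigma, seed=0):
    """Draw one response per x from Y_i = alpha + beta * x_i + eps_i,
    where the eps_i are independent N(0, sigma^2) errors."""
    rng = random.Random(seed)  # seeded so the draw is reproducible
    return [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical parameter values and x-values, chosen only for illustration.
xs = [90, 100, 110, 120, 130]
ys = simulate_slr(xs, alpha=-1.0, beta=0.03, sigma=0.1)
for x, y in zip(xs, ys):
    print(f"x = {x:3d}  mean = {-1.0 + 0.03 * x:.2f}  simulated y = {y:.2f}")
```

Each simulated value scatters around its conditional mean with the same spread, which is what assumptions 1 and 3 describe.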
6
Least-Squares (LS) Regression
One of the goals in regression analysis is to estimate the parameters $\alpha$, $\beta$, and $\sigma^2$ of the regression model. Denote by $\hat{y} = a + bx$ the estimate of the regression line, so that $a$ estimates $\alpha$ and $b$ estimates $\beta$. Then for the observed values of $X$, namely $x_1, x_2, \ldots, x_n$, we obtain the predicted value of the response variable $Y$ at each of these X-values: $\hat{y}_i = a + b x_i$, $i = 1, 2, \ldots, n$.
7
Simple Linear Regression
Predicted Values
A good estimate of the regression line should produce predicted values that are close to the actual observed values of the response variable. That is, the deviations $e_i = y_i - \hat{y}_i$, $i = 1, 2, \ldots, n$, should ideally be close (if not equal) to zero. These deviations between observed and predicted values are called residuals.
8
Principle of Least-Squares (LS)
In least-squares regression, the best-fitting regression line is the one that makes the sum of the squared deviations (residuals) as small as possible. Thus, the regression coefficients $a$ and $b$ are chosen to minimize the quantity
$Q(a, b) = \sum_{i=1}^{n} (y_i - a - b x_i)^2$.
Using calculus, the values of $a$ and $b$ that minimize this quantity are given by
$b = S_{XY} / S_{XX}$ and $a = \bar{y} - b \bar{x}$,
where $S_{XX} = \sum_{i=1}^{n} (x_i - \bar{x})^2$ and $S_{XY} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$.
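The closed-form least-squares estimates can be computed in a few lines. This is a minimal sketch using only the standard library; the helper name `least_squares_fit` is hypothetical, and the five (x, y) pairs are made-up numbers, not the HR and VO2 measurements from the lecture.

```python
def least_squares_fit(xs, ys):
    """Return (a, b), the intercept and slope minimizing sum (y_i - a - b*x_i)^2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)                       # S_XX
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))  # S_XY
    b = s_xy / s_xx          # slope estimate
    a = y_bar - b * x_bar    # intercept estimate
    return a, b

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_fit(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
print(f"a = {a:.3f}, b = {b:.3f}")
print("residuals:", [round(e, 3) for e in residuals])
```

With an intercept in the model, the residuals of the fitted line always sum to zero (up to rounding), which is a quick sanity check on any implementation.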
9
Least-Squares Solution
10
Estimating the Variance
11
Interpretations of Quantities
SSE: measures the variation not explained by the predictor variable.
SSR: measures the variation explained by the predictor variable.
$S_{YY}$: total variation in the Y-values. This is partitioned into SSR and SSE, so that $S_{YY} = \mathrm{SSR} + \mathrm{SSE}$.
$R^2 = \mathrm{SSR} / S_{YY}$: the coefficient of determination; indicates the proportion of variation in the Y-values explained by the predictor variable.
$\mathrm{MSE} = \mathrm{SSE} / (n - 2)$: the mean-squared error. This provides an unbiased estimate of the common variance $\sigma^2$.
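These quantities follow directly from the fitted line. The sketch below assumes a least-squares fit with the usual closed-form estimates; the function name `variation_summary` and the data are hypothetical and serve only to show the partition of the total variation.

```python
def variation_summary(xs, ys):
    """Compute SYY, SSR, SSE, R2 and MSE for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    fitted = [a + b * x for x in xs]
    s_yy = sum((y - y_bar) ** 2 for y in ys)               # total variation
    ssr = sum((f - y_bar) ** 2 for f in fitted)            # explained variation
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))    # unexplained variation
    return {"SYY": s_yy, "SSR": ssr, "SSE": sse,
            "R2": ssr / s_yy, "MSE": sse / (n - 2)}

# Made-up data for illustration only (not the lecture's measurements).
summary = variation_summary([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
for name, value in summary.items():
    print(f"{name} = {value:.3f}")
```

Here SSR and SSE are computed separately rather than by subtraction, so checking that they add up to SYY confirms the partition numerically.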
12
Sampling Distributions of Estimators
To estimate the variances of the estimators $a$ and $b$, the unknown $\sigma^2$ is replaced by the MSE.
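For reference, the standard sampling distributions of the least-squares estimators under the normal-error model can be written as follows. The notation $S_{XX} = \sum_{i=1}^{n} (x_i - \bar{x})^2$ and the symbol $\mathrm{SE}$ for an estimated standard error are assumptions made here for the summary.

```latex
\begin{align*}
b &\sim N\!\left(\beta,\ \frac{\sigma^2}{S_{XX}}\right), &
a &\sim N\!\left(\alpha,\ \sigma^2\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{XX}}\right]\right), \\[4pt]
\mathrm{SE}(b) &= \sqrt{\frac{\mathrm{MSE}}{S_{XX}}}, &
\mathrm{SE}(a) &= \sqrt{\mathrm{MSE}\left[\frac{1}{n} + \frac{\bar{x}^2}{S_{XX}}\right]}, \\[4pt]
\frac{(n-2)\,\mathrm{MSE}}{\sigma^2} &\sim \chi^2_{n-2}, &
\frac{b - \beta}{\mathrm{SE}(b)} &\sim t_{n-2}.
\end{align*}
```

The last line is what makes the t-distribution with $n - 2$ degrees of freedom appear once $\sigma^2$ is replaced by the MSE.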
13
Simple Linear Regression
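Testing Hypothesis
To test the null hypothesis $H_0: \beta = \beta_0$ versus $H_1: \beta \neq \beta_0$, we use the t-statistic
$T_c = (b - \beta_0) / \sqrt{\mathrm{MSE} / S_{XX}}$, where $S_{XX} = \sum_{i=1}^{n} (x_i - \bar{x})^2$,
which follows a t-distribution with $n - 2$ degrees of freedom under the null hypothesis. Thus, we reject $H_0$ if $|T_c| > t_{n-2;\alpha/2}$, where $\alpha$ in the subscript denotes the significance level rather than the intercept. Similarly, for testing $H_0: \alpha = \alpha_0$, we use
$T_c = (a - \alpha_0) / \sqrt{\mathrm{MSE}\,(1/n + \bar{x}^2 / S_{XX})}$.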
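The slope test can be sketched with the standard library alone. The function name `slope_t_test` is hypothetical, the data are made up, and the critical value is passed in by the caller (here 3.182, the tabulated $t_{3;0.025}$ for $n = 5$ and a 5% two-sided test) rather than computed.

```python
import math

def slope_t_test(xs, ys, beta0, t_crit):
    """Return (T_c, reject) for H0: beta = beta0 against a two-sided alternative.
    t_crit is the table value t_{n-2; alpha/2} supplied by the caller."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    mse = sse / (n - 2)                        # estimate of sigma^2
    t_c = (b - beta0) / math.sqrt(mse / s_xx)  # slope estimate over its standard error
    return t_c, abs(t_c) > t_crit

# Made-up data; n = 5 gives n - 2 = 3 degrees of freedom.
t_c, reject = slope_t_test([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], beta0=0.0, t_crit=3.182)
print(f"T_c = {t_c:.3f}, reject H0: {reject}")
```

For the test of $\beta = 0$, the square of this statistic equals the ratio MSR/MSE, so the t-test and the regression F-test agree.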
14
Simple Linear Regression
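Confidence Interval for Mean and Predicting the Value of Y of a New Unit
Estimate of mean and predicted value at $x_0$: $\hat{y}(x_0) = a + b x_0$.
Variance: $\sigma^2 [1/n + (x_0 - \bar{x})^2 / S_{XX}]$ for estimating the mean $\mu(x_0)$, and $\sigma^2 [1 + 1/n + (x_0 - \bar{x})^2 / S_{XX}]$ for the error in predicting a new $Y(x_0)$, where $S_{XX} = \sum_{i=1}^{n} (x_i - \bar{x})^2$.
CI for $\mu(x_0)$: $\hat{y}(x_0) \pm t_{n-2;\alpha/2} \sqrt{\mathrm{MSE}\,[1/n + (x_0 - \bar{x})^2 / S_{XX}]}$.
Prediction interval for $Y(x_0)$: $\hat{y}(x_0) \pm t_{n-2;\alpha/2} \sqrt{\mathrm{MSE}\,[1 + 1/n + (x_0 - \bar{x})^2 / S_{XX}]}$.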
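Both intervals share the same center and differ only in the standard error. The sketch below uses the standard formulas for simple linear regression; the function name `intervals_at`, the data, and the choice of $x_0$ are hypothetical, and the critical value $t_{n-2;\alpha/2}$ is supplied from a table.

```python
import math

def intervals_at(xs, ys, x0, t_crit):
    """Return (y0_hat, ci_mean, pred_int) at x0 for a simple linear regression.
    t_crit is the table value t_{n-2; alpha/2} supplied by the caller."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    mse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    y0_hat = a + b * x0
    leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx
    se_mean = math.sqrt(mse * leverage)        # for the mean response at x0
    se_pred = math.sqrt(mse * (1 + leverage))  # for a new observation at x0
    ci_mean = (y0_hat - t_crit * se_mean, y0_hat + t_crit * se_mean)
    pred_int = (y0_hat - t_crit * se_pred, y0_hat + t_crit * se_pred)
    return y0_hat, ci_mean, pred_int

# Made-up data; 3.182 is the tabulated t_{3; 0.025} for n = 5.
y0_hat, ci_mean, pred_int = intervals_at([1, 2, 3, 4, 5], [2, 4, 5, 4, 5], x0=4, t_crit=3.182)
print(f"estimate at x0 = 4: {y0_hat:.3f}")
print(f"95% CI for the mean: ({ci_mean[0]:.3f}, {ci_mean[1]:.3f})")
print(f"95% prediction interval: ({pred_int[0]:.3f}, {pred_int[1]:.3f})")
```

The prediction interval is always wider than the confidence interval because it also accounts for the variability of a single new observation around its mean.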
15
Results of Regression Analysis (using Minitab)
Items highlighted in the Minitab output: prediction line; P-value for regression; P-value; coefficient of determination; the F-statistic (MSR)/(MSE).
16
Fitted Line on the Scatterplot
17
Simple Linear Regression
Confidence Interval for Mean and Prediction Interval
For predicting the mean value: the confidence interval.
For predicting the value of the response: the prediction interval.