Published by Neil Jenkins. Modified over 9 years ago.
1
Bivariate Regression (Part 1)
Chapter 12
- Visual Displays and Correlation Analysis
- Bivariate Regression
- Regression Terminology
- Ordinary Least Squares Formulas
- Tests for Significance
McGraw-Hill/Irwin. Copyright © 2009 by The McGraw-Hill Companies, Inc. All rights reserved.
2
Visual Displays and Correlation Analysis: Visual Displays
Begin the analysis of bivariate data (i.e., two variables) with a scatter plot.
A scatter plot
- displays each observed data pair $(x_i, y_i)$ as a dot on an X/Y grid
- indicates visually the strength of the relationship between the two variables
3
Visual Displays and Correlation Analysis: Visual Displays
[Figure 12.1]
4
Visual Displays and Correlation Analysis: Correlation Analysis
The sample correlation coefficient (r) measures the degree of linearity in the relationship between X and Y.
$-1 \le r \le +1$
Values of r near -1 indicate a strong negative relationship; values near +1 indicate a strong positive relationship.
r = 0 indicates no linear relationship.
In Excel, use =CORREL(array1,array2), where array1 is the range for X and array2 is the range for Y.
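As a rough illustration of what =CORREL computes, the sample correlation can be sketched in plain Python. The data values and the function name `sample_correlation` are made up for this example and do not come from the slides.

```python
import math

def sample_correlation(x, y):
    """Pearson sample correlation r between two equal-length sequences."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of cross products and squared deviations about the means.
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)

# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = sample_correlation(x, y)
print(round(r, 4))  # 0.7746
```

Reversing the sign of every y value flips the sign of r but leaves its magnitude unchanged, which is one quick sanity check on an implementation like this.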
5
Visual Displays and Correlation Analysis: Correlation Analysis
6
Visual Displays and Correlation Analysis: Correlation Analysis
[Figure panels: Strong Positive Correlation; Weak Positive Correlation]
7
Visual Displays and Correlation Analysis: Correlation Analysis
[Figure panels: Weak Negative Correlation; Strong Negative Correlation]
8
Visual Displays and Correlation Analysis: Correlation Analysis
[Figure panels: No Correlation; Nonlinear Relation]
9
Visual Displays and Correlation Analysis: Tests for Significance
r is an estimate of the population correlation coefficient $\rho$ (rho).
To test the hypothesis $H_0: \rho = 0$, the test statistic is:
$t_{calc} = r \sqrt{\dfrac{n-2}{1-r^2}}$
The critical value $t_\alpha$ is obtained from Appendix D using $\nu = n - 2$ degrees of freedom for any $\alpha$.
Find the p-value using Excel's function =TDIST(t,deg_freedom,tails) or MINITAB.
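The test can be sketched in Python under a few assumptions. The statistic used is the standard one for $H_0: \rho = 0$, namely $t = r\sqrt{(n-2)/(1-r^2)}$. The p-value is approximated by numerically integrating the Student t density, which stands in for Excel's TDIST; in practice a library routine such as `scipy.stats.t.sf` would normally be used instead. The sample values (r = 0.7746, n = 5) and all function names are hypothetical.

```python
import math

def t_statistic_for_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def t_density(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p_value(t_calc, df, steps=2000):
    """Approximate P(|T| > |t_calc|) using Simpson's rule on [0, |t_calc|]."""
    upper = abs(t_calc)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = 2 * total * h / 3  # P(-|t_calc| < T < |t_calc|)
    return 1 - central_area

# Hypothetical sample: r = 0.7746 from n = 5 observations.
r, n = 0.7746, 5
t_calc = t_statistic_for_r(r, n)
p_value = two_tailed_p_value(t_calc, df=n - 2)
print(round(t_calc, 3), round(p_value, 3))
```

With so few observations even a fairly large r gives a p-value above .05, which is why the sample size matters as much as the size of r.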
10
Visual Displays and Correlation Analysis: Tests for Significance
Equivalently, you can calculate the critical value for the correlation coefficient using
$r_\alpha = \dfrac{t_\alpha}{\sqrt{t_\alpha^2 + n - 2}}$
This method gives a benchmark for the correlation coefficient.
However, it yields no p-value and is inflexible if you change your mind about $\alpha$.
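A minimal sketch of the critical-value approach, assuming the usual conversion $r_\alpha = t_\alpha / \sqrt{t_\alpha^2 + n - 2}$. The sample size n = 20 is an invented example; 2.101 is the standard t-table value for a two-tailed test at the .05 level with 18 degrees of freedom. The function name is hypothetical.

```python
import math

def critical_r(t_crit, n):
    """Critical correlation implied by a critical t value with n - 2 df."""
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)

# Assumed example: n = 20, two-tailed alpha = .05, so df = 18 and the
# t-table value is 2.101.
n = 20
t_crit = 2.101
r_crit = critical_r(t_crit, n)
print(round(r_crit, 4))

# Converting r_crit back into a t statistic recovers t_crit, which shows
# that the two decision rules are equivalent.
t_back = r_crit * math.sqrt((n - 2) / (1 - r_crit ** 2))
```

Any sample correlation from 20 observations whose absolute value exceeds `r_crit` would be judged significant at this level.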
11
Visual Displays and Correlation Analysis: Steps in Testing if $\rho = 0$
Step 1: State the Hypotheses
Determine whether you are using a one- or two-tailed test and the level of significance ($\alpha$).
$H_0: \rho = 0$
$H_1: \rho \ne 0$
Step 2: Specify the Decision Rule
For degrees of freedom $\nu = n - 2$, look up the critical value $t_\alpha$ in Appendix D, then calculate
$r_\alpha = \dfrac{t_\alpha}{\sqrt{t_\alpha^2 + n - 2}}$
12
Visual Displays and Correlation Analysis: Steps in Testing if $\rho = 0$
Step 3: Calculate the Test Statistic
$t_{calc} = r \sqrt{\dfrac{n-2}{1-r^2}}$
Step 4: Make the Decision
If the sample correlation coefficient r exceeds the critical value $r_\alpha$ (in absolute value for a two-tailed test), then reject $H_0$.
If using the t statistic method, reject $H_0$ if $|t_{calc}| > t_\alpha$ or if the p-value $< \alpha$.
13
Visual Displays and Correlation Analysis: Quick Rule for Significance
A quick test for significance of a correlation at $\alpha = .05$ is $|r| > 2/\sqrt{n}$
[Table 12.1]
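To see how close the quick rule $|r| > 2/\sqrt{n}$ comes to the exact two-tailed .05 critical value, the two can be compared for a few sample sizes. This is only a sketch: the sample sizes are arbitrary, the critical t values are copied from a standard t table, and the exact critical r is assumed to be $t/\sqrt{t^2 + n - 2}$.

```python
import math

# (n, two-tailed .05 critical t for n - 2 df), taken from a standard t table.
T_TABLE = [(10, 2.306), (20, 2.101), (30, 2.048), (62, 2.000)]

def quick_rule(n):
    """Approximate critical |r| at alpha = .05."""
    return 2 / math.sqrt(n)

def exact_critical_r(t_crit, n):
    """Critical |r| implied by the critical t value with n - 2 df."""
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)

comparison = [(n, quick_rule(n), exact_critical_r(t, n)) for n, t in T_TABLE]
for n, quick, exact in comparison:
    print(f"n={n:3d}  quick={quick:.3f}  exact={exact:.3f}")
```

For these sample sizes the two benchmarks differ by less than 0.01.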
14
Visual Displays and Correlation Analysis: Autocorrelation
Autocorrelation is a special type of correlation analysis useful in business for time series data.
The autocorrelation coefficient is the simple correlation between $y_t$ and $y_{t-k}$, where k is any lag.
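Following the description above literally, a lag-k autocorrelation can be sketched as the ordinary correlation between the series and a copy of itself shifted by k periods. The series `sales` and the function names are invented for illustration. Other definitions of autocorrelation exist (for example, ones that use the overall series mean), so this is one reading, not the only one.

```python
import math

def correlation(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    ss_ab = sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))
    ss_aa = sum((ai - a_bar) ** 2 for ai in a)
    ss_bb = sum((bi - b_bar) ** 2 for bi in b)
    return ss_ab / math.sqrt(ss_aa * ss_bb)

def autocorrelation(y, k):
    """Simple correlation between y_t and y_{t-k} for lag k >= 1."""
    if k < 1 or k > len(y) - 2:
        raise ValueError("lag k must leave at least two (y_t, y_{t-k}) pairs")
    # y[k:] holds y_t and y[:-k] holds the matching y_{t-k}.
    return correlation(y[k:], y[:-k])

# Hypothetical time series, for illustration only.
sales = [10, 12, 11, 13, 12, 14, 13, 15]
lag1 = autocorrelation(sales, 1)
lag2 = autocorrelation(sales, 2)
print(round(lag1, 3), round(lag2, 3))
```

Each extra lag drops one more pair of observations, so long lags on short series rest on very little data.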
15
Bivariate Regression: What is Bivariate Regression?
Bivariate regression analyzes the relationship between two variables.
It specifies one dependent (response) variable and one independent (predictor) variable.
The hypothesized relationship may be linear, quadratic, or some other form.
16
Bivariate Regression: Model Form
[Figure 12.6]
17
Regression Terminology: Models and Parameters
Unknown parameters are
$\beta_0$ Intercept
$\beta_1$ Slope
The assumed model for a linear relationship is
$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ for all observations (i = 1, 2, …, n)
The error term $\varepsilon_i$ is not observable and is assumed normally distributed with mean 0 and standard deviation $\sigma$.
18
Regression Terminology: Models and Parameters
The fitted model used to predict the expected value of Y for a given value of X is
$\hat{y}_i = b_0 + b_1 x_i$
The fitted coefficients are
$b_0$ the estimated intercept
$b_1$ the estimated slope
The residual is $e_i = y_i - \hat{y}_i$.
Residuals may be used to estimate $\sigma$, the standard deviation of the errors.
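The fitted model and its residuals can be illustrated with a short sketch. The data are hypothetical, and the coefficients `b0 = 2.2` and `b1 = 0.6` are simply assumed here (they happen to be the least squares values for these data).

```python
def fitted_values(b0, b1, x):
    """Predicted value y_hat_i = b0 + b1 * x_i for each x_i."""
    return [b0 + b1 * xi for xi in x]

def residuals(y, y_hat):
    """Residual e_i = y_i - y_hat_i for each observation."""
    return [yi - fi for yi, fi in zip(y, y_hat)]

# Hypothetical data and assumed fitted coefficients.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

y_hat = fitted_values(b0, b1, x)
e = residuals(y, y_hat)
print([round(v, 2) for v in y_hat])
print([round(v, 2) for v in e])
```

A positive residual means the observation lies above the fitted line, and a negative one means it lies below.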
19
Regression Terminology: Fitting a Regression on a Scatter Plot in Excel
Step 1:
- Highlight the data columns.
- Click on the Chart Wizard and choose Scatter Plot.
- In the completed graph, click once on the points in the scatter plot to select the data.
- Right-click and choose Add Trendline.
- Choose Options and check Display Equation.
20
Regression Terminology: Fitting a Regression on a Scatter Plot in Excel
[Figure 12.8]
21
Regression Terminology
22
Ordinary Least Squares Formulas: Slope and Intercept
The ordinary least squares (OLS) method estimates the slope and intercept of the regression line so that the sum of the squared residuals is as small as possible.
The sum of the residuals is zero: $\sum_{i=1}^{n} e_i = 0$
The sum of the squared residuals is SSE: $SSE = \sum_{i=1}^{n} e_i^2$
23
Ordinary Least Squares Formulas: Slope and Intercept
The OLS estimator for the slope is:
$b_1 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$
The OLS estimator for the intercept is:
$b_0 = \bar{y} - b_1 \bar{x}$
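A minimal sketch of the OLS estimators, assuming the standard textbook formulas: the slope is the sum of cross products of deviations divided by the sum of squared x deviations, and the intercept is $\bar{y} - b_1 \bar{x}$. The data and the function name `ols_fit` are hypothetical.

```python
def ols_fit(x, y):
    """Return (b0, b1), the least squares intercept and slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ss_xy / ss_xx        # slope
    b0 = y_bar - b1 * x_bar   # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = ols_fit(x, y)
print(round(b0, 4), round(b1, 4))  # 2.2 0.6
```

With these coefficients the residuals sum to zero, as the OLS properties require.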
24
Ordinary Least Squares Formulas: Assessing Fit
We want to explain the total variation in Y around its mean (SST, the total sum of squares):
$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2$
The regression sum of squares (SSR) is the explained variation in Y:
$SSR = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2$
25
Ordinary Least Squares Formulas: Assessing Fit
The error sum of squares (SSE) is the unexplained variation in Y:
$SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
If the fit is good, SSE will be relatively small compared to SST.
A perfect fit is indicated by SSE = 0.
The magnitude of SSE depends on n and on the units of measurement.
26
Ordinary Least Squares Formulas: Coefficient of Determination
$R^2$ is a measure of relative fit based on a comparison of SSR and SST:
$R^2 = \dfrac{SSR}{SST}$
$0 \le R^2 \le 1$
Often expressed as a percent, an $R^2 = 1$ (i.e., 100%) indicates perfect fit.
In a bivariate regression, $R^2 = (r)^2$
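The three sums of squares and $R^2$ can be computed together in a short sketch. It assumes the usual definitions (SST from deviations of y about its mean, SSR from deviations of the fitted values about that mean, SSE from the residuals) and uses hypothetical data with assumed least squares coefficients.

```python
def sums_of_squares(x, y, b0, b1):
    """Return (SST, SSR, SSE) for the fitted line y_hat = b0 + b1 * x."""
    y_bar = sum(y) / len(y)
    y_hat = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # explained variation
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    return sst, ssr, sse

# Hypothetical data; b0 and b1 are assumed to be the least squares estimates.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

sst, ssr, sse = sums_of_squares(x, y, b0, b1)
r_squared = ssr / sst
print(round(sst, 4), round(ssr, 4), round(sse, 4), round(r_squared, 4))
```

The identity SST = SSR + SSE holds only when `b0` and `b1` are the least squares estimates; with arbitrary coefficients the pieces need not add up.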
27
Tests for Significance: Standard Error of Regression
The standard error ($s_{yx}$) is an overall measure of model fit:
$s_{yx} = \sqrt{\dfrac{SSE}{n-2}}$
If the fitted model's predictions are perfect (SSE = 0), then $s_{yx} = 0$. Thus, a small $s_{yx}$ indicates a better fit.
It is used to construct confidence intervals.
The magnitude of $s_{yx}$ depends on the units of measurement of Y and on data magnitude.
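As a sketch, the standard error of the regression can be computed from the residuals, assuming the usual formula $s_{yx} = \sqrt{SSE/(n-2)}$. The data and coefficients below are hypothetical.

```python
import math

def standard_error_of_regression(x, y, b0, b1):
    """s_yx = sqrt(SSE / (n - 2)) for the fitted line y_hat = b0 + b1 * x."""
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (len(x) - 2))

# Hypothetical data and assumed least squares coefficients.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1 = 2.2, 0.6

s_yx = standard_error_of_regression(x, y, b0, b1)
print(round(s_yx, 4))  # 0.8944
```

The divisor is n - 2 because two parameters, the intercept and the slope, are estimated from the data.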
28
Tests for Significance: Confidence Intervals for Slope and Intercept
Standard error of the slope:
$s_{b_1} = \dfrac{s_{yx}}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$
Standard error of the intercept:
$s_{b_0} = s_{yx} \sqrt{\dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2}}$
29
Tests for Significance: Confidence Intervals for Slope and Intercept
Confidence interval for the true slope:
$b_1 - t_{n-2}\, s_{b_1} \le \beta_1 \le b_1 + t_{n-2}\, s_{b_1}$
Confidence interval for the true intercept:
$b_0 - t_{n-2}\, s_{b_0} \le \beta_0 \le b_0 + t_{n-2}\, s_{b_0}$
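The intervals can be sketched as follows, assuming the standard formulas: the slope's standard error is $s_{yx}$ divided by the square root of the sum of squared x deviations, the intercept's standard error is $s_{yx}\sqrt{1/n + \bar{x}^2/\sum(x_i-\bar{x})^2}$, and each interval is the estimate plus or minus the critical t value times its standard error. The data, coefficients, and $s_{yx}$ are hypothetical; 3.182 is the t-table value for 95% confidence with 3 degrees of freedom.

```python
import math

def coefficient_standard_errors(x, s_yx):
    """Return (se_b0, se_b1) given the x values and the standard error s_yx."""
    n = len(x)
    x_bar = sum(x) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    se_b1 = s_yx / math.sqrt(ss_xx)
    se_b0 = s_yx * math.sqrt(1 / n + x_bar ** 2 / ss_xx)
    return se_b0, se_b1

def confidence_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin

# Hypothetical inputs: five x values, assumed OLS results, t table value.
x = [1, 2, 3, 4, 5]
b0, b1 = 2.2, 0.6
s_yx = math.sqrt(0.8)   # assumed standard error of the regression
t_crit = 3.182          # 95% confidence, df = n - 2 = 3

se_b0, se_b1 = coefficient_standard_errors(x, s_yx)
slope_ci = confidence_interval(b1, se_b1, t_crit)
intercept_ci = confidence_interval(b0, se_b0, t_crit)
print([round(v, 3) for v in slope_ci], [round(v, 3) for v in intercept_ci])
```

In this made-up example the slope interval contains zero, so a two-tailed test at the .05 level would not reject a zero slope.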
30
Tests for Significance: Hypothesis Tests
If $\beta_1 = 0$, then X cannot influence Y and the regression model collapses to a constant $\beta_0$ plus random error.
The hypotheses to be tested are:
Slope: $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \ne 0$
Intercept: $H_0: \beta_0 = 0$ versus $H_1: \beta_0 \ne 0$
31
Tests for Significance: Hypothesis Tests
A t test is used with $\nu = n - 2$ degrees of freedom. The test statistics for the slope and intercept are:
Slope: $t_{calc} = \dfrac{b_1 - 0}{s_{b_1}}$
Intercept: $t_{calc} = \dfrac{b_0 - 0}{s_{b_0}}$
$t_{n-2}$ is obtained from Appendix D or Excel for a given $\alpha$.
Reject $H_0$ if $|t_{calc}| > t_{n-2}$ or if p-value $< \alpha$.
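A small sketch of the decision rule, assuming each test statistic is the estimate minus its hypothesized value (zero) divided by its standard error. The estimates and standard errors are hypothetical, and 3.182 is the t-table value for a two-tailed test at the .05 level with 3 degrees of freedom.

```python
def t_statistic(estimate, std_error, hypothesized=0.0):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error

# Hypothetical regression output for n = 5 observations.
b0, se_b0 = 2.2, 0.9381   # intercept and its standard error
b1, se_b1 = 0.6, 0.2828   # slope and its standard error
t_crit = 3.182            # two-tailed alpha = .05, df = n - 2 = 3

t_slope = t_statistic(b1, se_b1)
t_intercept = t_statistic(b0, se_b0)

# Two-tailed rule: reject H0 when |t| exceeds the critical value.
reject_slope = abs(t_slope) > t_crit
reject_intercept = abs(t_intercept) > t_crit
print(round(t_slope, 3), reject_slope)
print(round(t_intercept, 3), reject_intercept)
```

Neither statistic exceeds the critical value here, so with these invented numbers neither null hypothesis would be rejected.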
32
Tests for Significance: Using Excel
33
Tests for Significance: Using MegaStat
34
Tests for Significance: Using MINITAB
35
Tests for Significance: Using MINITAB
36
Applied Statistics in Business & Economics: End of Chapter 12A