SE-280 Dr. Mark L. Hornick 1 Statistics Review Linear Regression & Correlation

In subsequent labs, we'll be predicting actual size or time using linear regression, based on historical estimated-size data from previous labs. Note: this example shows historical data for 13 labs.

SE-280 Dr. Mark L. Hornick 3 Linear Regression prediction for Actual LOC vs Estimated LOC (Proxy LOC)

By fitting a regression line to historical data, we can compensate for estimating errors. The fitted line has slope β1 and offset β0; applying it to a raw x estimate gives the projected value (corrected estimate).

To compute a new estimate, we use the regression line equation:

y_proj = β0 + β1 · x_est

where
x_est = raw estimate
y_proj = projected value (corrected estimate)
β0 = offset of the regression line
β1 = slope of the regression line

These formulas are used to calculate the regression parameters:

β1 = ( Σ(x_i · y_i) − n · x̄ · ȳ ) / ( Σ(x_i²) − n · x̄² )
β0 = ȳ − β1 · x̄

where n is the number of historical data points and x̄, ȳ are the means of the x and y values.
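To make the calculation concrete, here is a minimal sketch (not part of the course materials) that computes β0 and β1 with the least-squares formulas above and then applies the regression line from the previous slide. The function names and the sample LOC values are illustrative assumptions, not data from the labs.

```python
def regression_parameters(xs, ys):
    """Return (beta0, beta1) for the least-squares line y = beta0 + beta1 * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    beta1 = (sum(x * y for x, y in zip(xs, ys)) - n * x_mean * y_mean) / (
        sum(x * x for x in xs) - n * x_mean * x_mean)
    beta0 = y_mean - beta1 * x_mean
    return beta0, beta1


def project(beta0, beta1, x_est):
    """Apply the regression line to a raw estimate to get the corrected estimate."""
    return beta0 + beta1 * x_est


# Hypothetical historical data: estimated vs. actual LOC from earlier labs.
est_loc = [130, 650, 99, 150, 128, 302, 95, 945, 368, 961]
act_loc = [186, 699, 132, 272, 291, 331, 199, 1890, 788, 1601]
b0, b1 = regression_parameters(est_loc, act_loc)
print(project(b0, b1, 400))  # corrected estimate for a raw estimate of 400 LOC
```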

SE-280 Dr. Mark L. Hornick 7 Correlation (r) is a measure of the strength of the linear relationship between two sets of variables.
- Value is +1 in the case of a (perfectly) increasing linear relationship
- Or −1 in the case of a perfectly decreasing relationship
- Some value in between in all other cases
- Indicates the degree of linear dependence between the variables
- The closer the coefficient is to either −1 or +1, the stronger the correlation between the variables
- r > 0.7 is considered "good" for PSP planning purposes

After calculating the regression parameters (β values), we can also calculate the correlation coefficient.
- To get the correlation coefficient (r), we first need to calculate r².
- With a single independent variable (x), we can get a signed correlation coefficient. In the general case, we only get the absolute value of the correlation coefficient (|r|); the "direction" of the correlation is determined by the sign of the β1 "slope" value.
- The correlation coefficient [0.0 to 1.0] is a measure of how well (high) or poorly (low) the historical data points fall on or near the regression line.

Let's look at an example of calculating the correlation. For future reference, these data points come from test case 4 of lab 2.

We have already discussed how to calculate the regression parameters (beta values). β0 = … β1 = …

If we evaluate the regression line equation at each x value, we get the predicted y values: y_pred = β0 + β1 · x

To determine the correlation, we also need to calculate the mean y value (ȳ). ȳ = 6.07 (mean of original y values)

Next, we need to sum the squares of two differences: (y − ȳ) and (y_pred − ȳ).

Once we have the two sums, we can calculate the correlation coefficient:

r² = Σ(y_pred − ȳ)² / Σ(y − ȳ)²

Just in case you are curious, statisticians label the sum-square values like this:
- Σ(y − ȳ)² — total sum of squares (variability)
- Σ(y_pred − ȳ)² — sum of squares, predicted (explained)
- Σ(y − y_pred)² — sum of squares, error (unexplained)

One more time, where do the y_pred values come from? (They come from evaluating the regression line at each historical x value.)
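As a rough illustration of this step, the sketch below reuses the hypothetical regression_parameters() helper from the earlier sketch to compute the predicted y values, the two sums of squares, and then r² and the signed r (the sign is taken from the β1 slope, as noted two slides back). The function name is an assumption for illustration.

```python
def correlation(xs, ys):
    """Return (r_squared, r) from the sums of squares described on this slide."""
    beta0, beta1 = regression_parameters(xs, ys)   # helper from the earlier sketch
    y_mean = sum(ys) / len(ys)
    y_pred = [beta0 + beta1 * x for x in xs]       # regression line at each x
    ss_total = sum((y - y_mean) ** 2 for y in ys)             # total (variability)
    ss_explained = sum((yp - y_mean) ** 2 for yp in y_pred)   # predicted (explained)
    r_squared = ss_explained / ss_total
    r = (1 if beta1 >= 0 else -1) * r_squared ** 0.5  # sign comes from the slope
    return r_squared, r
```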

Here are the actual numbers used to calculate the correlation in this example (table columns: x, y, ȳ, y_pred, y_pred − ȳ, y − ȳ, y_pred − y, (y_pred − ȳ)², (y − ȳ)², (y_pred − y)²).

SE-280 Dr. Mark L. Hornick 16 We said we needed historical data to make predictions based on regression analysis. How do we know when it's OK to use regression?
1. Quantity of data is satisfactory
   - We must have at least three points!
   - It's good to have a lot more; 10 or more most recent projects are adequate
2. Quality of data is satisfactory
   - Data points must correlate (r² ≥ 0.5, |r| ≥ 0.707)
   - This means that your process must be stable (repeatable)
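A minimal sketch of these two checks, reusing the hypothetical correlation() helper from the earlier sketch; the function name and the thresholds-as-parameters are illustrative, not course-specified.

```python
def regression_ok(xs, ys, min_points=3, min_r_squared=0.5):
    """Check the quantity and quality criteria before trusting a regression."""
    if len(xs) < min_points:                # quantity: at least three historical points
        return False
    r_squared, _ = correlation(xs, ys)      # quality: r^2 >= 0.5 (|r| >= 0.707)
    return r_squared >= min_r_squared
```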

SE-280 Dr. Mark L. Hornick 17 We can also use linear regression to predict actual time. Other examples are on page … (Chart: Actual Size (LOC) on the x-axis as x_k vs. Actual Time (hrs) on the y-axis as y_k.)
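As a usage example under the same assumptions, the hypothetical helpers above can be pointed at size/time pairs instead of size/size pairs; the hour values below are made up for illustration and are not data from the labs.

```python
# x values: actual size (LOC); y values: actual time (hrs) from past projects.
act_loc = [186, 699, 132, 272, 291, 331, 199, 1890, 788, 1601]
act_hrs = [15.0, 69.9, 6.5, 22.4, 28.4, 65.9, 19.4, 198.7, 38.8, 138.2]

if regression_ok(act_loc, act_hrs):
    b0, b1 = regression_parameters(act_loc, act_hrs)
    print(project(b0, b1, 400))  # projected hours for an estimated 400 LOC
```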