Introduction to Regression

1. We are investigating only linear relationships.
2. For each x value, y is a random variable having a normal (bell-shaped) distribution, and all of these y distributions have the same variance. Also, for a given value of x, the distribution of y-values has a mean that lies on the regression line. (Results are not seriously affected if departures from normality and equal variance are not too extreme.)

The regression equation is obtained by first finding the error (or distance) between the actual data points and the predicted values on the line. Each error is then squared to make the values consistently positive. The goal of regression is to find the equation that produces the smallest total amount of squared error. Thus, the regression equation produces the “best fitting” line for the data points.

The regression equation is defined by the slope constant, b = SP/SSX, and the Y-intercept, a = MY − bMX, producing a linear equation of the form Y = bX + a. The equation can be used to compute a predicted Y value for each of the X values in the data.
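These formulas can be computed directly. The following sketch is not from the slides (the data values are hypothetical), but it implements exactly the definitions above: SP is the sum of products of deviations and SSX is the sum of squared deviations of X.

```python
# Sketch of the least-squares formulas: b = SP/SSX, a = M_Y - b*M_X.
# The data below are hypothetical, chosen only for illustration.
def simple_regression(x, y):
    n = len(x)
    mx = sum(x) / n                                        # M_X
    my = sum(y) / n                                        # M_Y
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # SP
    ssx = sum((xi - mx) ** 2 for xi in x)                    # SS_X
    b = sp / ssx                                           # slope
    a = my - b * mx                                        # Y-intercept
    return b, a

b, a = simple_regression([1, 2, 3, 4], [2, 4, 5, 7])  # b = 1.6, a = 0.5
```

Each predicted value is then Ŷ = bX + a, which is what the figures below plot as the regression line.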

The simple concept is that each new variable provides more information and allows for more accurate predictions. Having two predictors in the equation will produce more accurate predictions (less error and smaller residuals) than can be obtained using either predictor by itself.
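With two predictors, the slopes can be found from the same deviation sums. The closed-form two-predictor formulas below are the standard ones for this case (they may not appear in this chapter's slides), and the data are made up so that Y is an exact linear function of X1 and X2.

```python
# Two-predictor multiple regression: Y-hat = b1*X1 + b2*X2 + a.
# Standard closed-form solution from deviation sums (hypothetical data).
def two_predictor_regression(x1, x2, y):
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    ss1 = sum((v - m1) ** 2 for v in x1)                       # SS_X1
    ss2 = sum((v - m2) ** 2 for v in x2)                       # SS_X2
    sp1y = sum((u - m1) * (v - my) for u, v in zip(x1, y))     # SP_X1Y
    sp2y = sum((u - m2) * (v - my) for u, v in zip(x2, y))     # SP_X2Y
    sp12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))    # SP_X1X2
    denom = ss1 * ss2 - sp12 ** 2
    b1 = (sp1y * ss2 - sp2y * sp12) / denom
    b2 = (sp2y * ss1 - sp1y * sp12) / denom
    a = my - b1 * m1 - b2 * m2
    return b1, b2, a

# Here Y was generated as 2*X1 + 3*X2 + 1, so the fit recovers those values.
b1, b2, a = two_predictor_regression([1, 2, 3, 4], [2, 1, 4, 3], [9, 8, 19, 18])
```

The denominator term SP_X1X2 reflects the overlap between the two predictors; when they overlap heavily (as IQ and SAT do in Figure 17.7), the second predictor adds little beyond the first.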

Figure 17.1 (p. 550) Hypothetical data showing the relationship between SAT scores and GPA with a regression line drawn through the data points. The regression line defines a precise, one-to-one relationship between each X value (SAT score) and its corresponding Y value (GPA).

Figure 17.2 (p. 551) Relationship between total cost and number of hours playing tennis. The tennis club charges a $25 membership fee plus $5 per hour, so the relationship is described by a linear equation: Total cost = $5 × (number of hours) + $25, that is, Y = bX + a with b = 5 and a = 25.

Figure 17.3 (p. 553) The distance between the actual data point (Y) and the predicted point on the line (Ŷ) is defined as Y – Ŷ. The goal of regression is to find the equation for the line that minimizes these distances.

Figure 17.4 (p. 555) The scatterplot for the data in Example 17.1 is shown with the best-fitting straight line. The predicted Y values (Ŷ) are on the regression line. Unless the correlation is perfect (+1.00 or –1.00), there will be some error between the actual Y values and the predicted Y values. The larger the correlation is, the less the error will be.

Figure 17.5 (p. 558) (a) Scatter plot showing data points that perfectly fit the regression equation Ŷ = 1.6X – 2. Note that the correlation is r = 1.00. (b) Scatter plot for the data from Example 17.1. Notice that there is error between the actual data points and the predicted Y values of the regression line.

Figure 17.6 (p. 563) The partitioning of SS and df for analysis of regression. The variability in the original Y scores (both SSY and dfY) is partitioned into two components: (a) the variability that is explained by the regression equation, and (b) the residual variability.

Table 17.1 (p. 563) A summary table showing the results from an analysis of regression.

Figure 17.7 (p. 564) Predicting the variance in academic performance from IQ and SAT scores. The overlap between IQ and academic performance indicates that 40% of the variance in academic performance can be predicted from IQ scores. Similarly, 30% of the variance in academic performance can be predicted from SAT scores. However, IQ and SAT also overlap, so that SAT scores contribute an additional prediction of only 10% beyond what is already predicted by IQ.

Table 17.2 (p. 566) Hypothetical data consisting of three scores for each person. Two of the scores, X1 and X2, are used to predict the Y score for each individual.

Table 17.3 (p. 567) The predicted Y values and the residuals for the data in Table 17.2. The predicted Y values were obtained using the values of X1 and X2 in the multiple-regression equation for each individual.

A significant F-ratio indicates that the regression equation predicts a significant portion (more than would be expected by chance) of the variance in the Y scores. To evaluate the relative contribution of each independent variable, examine its beta weight.
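The F-ratio itself follows directly from the SS and df partition in Figure 17.6: F = MS_regression / MS_residual, where each MS is the corresponding SS divided by its df. A minimal sketch (the input values here are hypothetical, not from the text's examples):

```python
# F-ratio for analysis of regression: F = MS_regression / MS_residual.
def regression_f_ratio(ss_reg, ss_res, df_reg, df_res):
    ms_reg = ss_reg / df_reg        # MS_regression = SS_reg / df_reg
    ms_res = ss_res / df_res        # MS_residual = SS_res / df_res
    return ms_reg / ms_res

# Hypothetical values: SS_reg = 12.8 with df = 1, SS_res = 0.2 with df = 2.
f = regression_f_ratio(12.8, 0.2, 1, 2)   # f = 128.0
```

The obtained F is then compared against the critical value for (df_regression, df_residual) degrees of freedom, as in the summary table (Table 17.1).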