Regression

[Example 3.] As a motivating example, suppose we are modelling sales data over time:

    TIME    1990  1991  1992  1993  1994  1995
    SALES      3     5     4     5     6     7

We seek the straight line $Y = mX + c$ that best approximates the data. By "best" in this case, we mean the line which minimizes the sum of squares of the vertical deviations of the points from the line:

    $SS = \sum ( Y_i - [\, m X_i + c \,] )^2$.

Setting the partial derivatives of SS with respect to m and c to zero leads to the "normal equations":

    $\sum Y = m \sum X + n c$, where n = number of points,
    $\sum XY = m \sum X^2 + c \sum X$.

Let 1990 correspond to Year 0. The working sums are then:

    X:     0   1   2   3   4   5     ΣX  =  15
    Y:     3   5   4   5   6   7     ΣY  =  30
    X·X:   0   1   4   9  16  25     ΣX² =  55
    X·Y:   0   5   8  15  24  35     ΣXY =  87
    Y·Y:   9  25  16  25  36  49     ΣY² = 160

[Figure: scatter plot of Sales against Time, showing the fitted line $Y = mX + c$ and the vertical deviation of a typical point $Y_i$ from its fitted value $m X_i + c$.]
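The normal equations can be solved mechanically from these working sums. Below is a minimal Python sketch (the language and variable names are ours; the slides give no code) that accumulates ΣX, ΣY, ΣX², ΣXY for the sales data and solves for m and c:

```python
# Least-squares fit for the sales example.
xs = [0, 1, 2, 3, 4, 5]   # Time: 1990..1995 coded as Year 0..5
ys = [3, 5, 4, 5, 6, 7]   # Sales

n = len(xs)
sum_x = sum(xs)                              # ΣX  = 15
sum_y = sum(ys)                              # ΣY  = 30
sum_xx = sum(x * x for x in xs)              # ΣX² = 55
sum_xy = sum(x * y for x, y in zip(xs, ys))  # ΣXY = 87

# Eliminating c between the two normal equations gives a closed form
# for the slope; the intercept then follows from the first equation.
m = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
c = (sum_y - m * sum_x) / n

print(m, c)  # 0.6857..., 3.2857...  i.e. m = 24/35, c = 23/7
```

The closed form for m is just the elimination step of the next slide carried out symbolically.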

Example 3 - Workings.

The normal equations are:

    30 = 15 m +  6 c   (×5)  =>  150 =  75 m + 30 c
    87 = 55 m + 15 c   (×2)  =>  174 = 110 m + 30 c

Subtracting gives 24 = 35 m, so m = 24/35; then 30 = 15 (24/35) + 6 c gives c = 23/7.

Thus the regression line of Y on X is Y = (24/35) X + (23/7), and to plot the line we need two points, so

    X = 0 => Y = 23/7   and   X = 5 => Y = (24/35)(5) + 23/7 = 47/7.

It is easy to see that $(\bar{X}, \bar{Y})$ satisfies the fitted equation (divide the first normal equation by n to get $\bar{Y} = m\bar{X} + c$), so that the regression line of Y on X passes through the "center of gravity" of the data. By expanding terms, we also get

    $\sum ( Y_i - \bar{Y} )^2 = \sum ( Y_i - [\, m X_i + c \,] )^2 + \sum ( [\, m X_i + c \,] - \bar{Y} )^2$

    Total Sum of Squares = Error Sum of Squares + Regression Sum of Squares
    SST = SSE + SSR

In regression, we refer to the X variable as the independent variable and Y as the dependent variable.

[Figure: the scatter plot again, showing for one point the decomposition of $Y_i - \bar{Y}$ into the error part $Y_i - (m X_i + c)$ and the regression part $(m X_i + c) - \bar{Y}$.]
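The workings can be checked numerically. The sketch below (ours, not the slide's; it uses exact rational arithmetic so the fractions 24/35 and 23/7 come out exactly) verifies that the fitted line passes through the center of gravity and that SST = SSE + SSR:

```python
from fractions import Fraction

xs = [0, 1, 2, 3, 4, 5]
ys = [3, 5, 4, 5, 6, 7]
n = len(xs)

m = Fraction(24, 35)
c = Fraction(23, 7)

# The line passes through the center of gravity (X-bar, Y-bar):
x_bar = Fraction(sum(xs), n)   # 5/2
y_bar = Fraction(sum(ys), n)   # 5
assert m * x_bar + c == y_bar

# Sum-of-squares decomposition: SST = SSE + SSR.
fitted = [m * x + c for x in xs]
sst = sum((y - y_bar) ** 2 for y in ys)
sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))
ssr = sum((f - y_bar) ** 2 for f in fitted)

assert sst == sse + ssr
print(sst, sse, ssr)  # 10, 62/35, 288/35
```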

Correlation

The coefficient of determination $r^2$ (which takes values in the range 0 to 1) is a measure of the proportion of the total variation that is associated with the regression process:

    $r^2 = SSR / SST = 1 - SSE / SST$.

The coefficient of correlation r (which takes values in the range -1 to +1) is more commonly used as a measure of the degree to which a mathematical relationship exists between X and Y. It can be calculated from the formula:

    $r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \sum (Y - \bar{Y})^2}} = \frac{n \sum XY - \sum X \sum Y}{\sqrt{\{ n \sum X^2 - (\sum X)^2 \}\{ n \sum Y^2 - (\sum Y)^2 \}}}$

Example. In our case

    $r = \{6(87) - (15)(30)\} / \sqrt{\{6(55) - (15)^2\}\{6(160) - (30)^2\}} = 72 / \sqrt{6300} \approx 0.907$.

[Figure: example scatter plots illustrating r = -1 (perfect negative correlation), r = +1 (perfect positive correlation), and r = 0 (no linear relationship).]
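The computational form of the formula is easy to apply directly. A short Python sketch (again ours, reusing the working sums from the example):

```python
from math import sqrt

xs = [0, 1, 2, 3, 4, 5]
ys = [3, 5, 4, 5, 6, 7]
n = len(xs)

sum_x, sum_y = sum(xs), sum(ys)
sum_xx = sum(x * x for x in xs)              # 55
sum_yy = sum(y * y for y in ys)              # 160
sum_xy = sum(x * y for x, y in zip(xs, ys))  # 87

# Computational form of the correlation coefficient:
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))

print(r, r * r)  # ≈ 0.9071, 0.8229
```

Note that $r^2 \approx 0.823$ agrees with SSR/SST = (288/35)/10 from the decomposition on the previous slide.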

Collinearity

If the value of the correlation coefficient is greater than 0.9 or less than -0.9, we would take this to mean that there is a mathematical relationship between the variables. This does not imply that a cause-and-effect relationship exists.

Consider a country with a slowly changing population size, where a certain political party retains a relatively stable percentage of the poll in elections. Let

    X = number of people that vote for the party in an election,
    Y = number of people that die due to a given disease in a year,
    Z = population size.

Then the correlation coefficient between X and Y is likely to be close to 1, indicating that there is a mathematical relationship between them; this arises because X is a function of Z, and Y is a function of Z also. It would clearly be silly to suggest that the incidence of the disease is caused by the number of people that vote for the given political party. This is known as the problem of collinearity. (A toy simulation of this scenario is sketched below.)

Spotting hidden dependencies between distributions can be difficult. Statistical experimentation can only be used to disprove hypotheses, or to lend evidence to the view that reputed relationships between variables may be valid. Thus, the fact that we observe a high correlation coefficient between deaths due to heart failure in a given year and the number of cigarettes consumed twenty years earlier does not establish a cause-and-effect relationship. However, such a result may be of value in directing biological research in a particular direction.
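The voting/disease scenario is easy to reproduce. The sketch below is a hypothetical illustration, not from the slides: the population size, growth rate, vote share, and death rate are all invented, and serve only to show that two quantities driven by a common factor Z can have a correlation near 1 with no causal link between them.

```python
import random

random.seed(0)

# Invented figures: a growing population Z drives both the vote count X
# and the death count Y, so X and Y end up highly correlated without
# either causing the other.
years = range(30)
z = [1_000_000 * 1.02 ** t for t in years]                 # population
x = [0.30 * zt * random.uniform(0.95, 1.05) for zt in z]   # votes
y = [0.002 * zt * random.uniform(0.95, 1.05) for zt in z]  # deaths

def corr(a, b):
    """Pearson correlation via the computational formula above."""
    n = len(a)
    sa, sb = sum(a), sum(b)
    num = n * sum(ai * bi for ai, bi in zip(a, b)) - sa * sb
    den = ((n * sum(ai * ai for ai in a) - sa ** 2)
           * (n * sum(bi * bi for bi in b) - sb ** 2)) ** 0.5
    return num / den

print(corr(x, y))  # typically well above 0.9: X and Y are linked only via Z
```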