Statistics & Econometrics

Presentation transcript:

Statistics for Economists
Chap 7. The Error for Regression
1. Difference between Actual and Predicted Values
2. Computing RMSE Using the Correlation
3. The Residual Plot
4. The Vertical Strips
5. Approximating to the Normal Curve Inside a Vertical Strip

1. Difference between Actual and Predicted Values
Root-mean-square error (RMSE), also called the standard error of estimate or the standard error of regression.
[Figure: an actual value, its estimate on the regression line, and the error between them.]

Estimation error (1)
Data: 4,514 Korean men with age [value missing in the transcript]
- Average height = 167.5 cm
- SD of height = 8.5 cm
- Average weight = 63.5 kg
- SD of weight = 11.9 kg
- Correlation coefficient = [value missing in the transcript]
Person A has height 141 cm. The average weight at height 141 cm is 38.7 kg.
residual = actual weight - predicted weight
Residual of A = 54.5 kg - 38.7 kg = +15.8 kg
Residual of B = 67.4 kg - 84.0 kg = -16.6 kg
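
The two residuals quoted on this slide can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `residual` is made up for illustration, and the four weights are the figures quoted on the slide.

```python
def residual(actual, predicted):
    """Residual = actual value minus the value predicted by the regression line."""
    return actual - predicted

# Weights in kg, as quoted on the slide.
residual_a = residual(54.5, 38.7)  # person A: point above the line
residual_b = residual(67.4, 84.0)  # person B: point below the line

print(round(residual_a, 1), round(residual_b, 1))
```

A positive residual means the point lies above the regression line, a negative one means it lies below.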

Estimation error (2)
The estimation error, actual weight - predicted weight, is generally called the residual. It is the vertical distance of the point from the regression line. The overall size of these errors is measured by taking their root mean square.
[Figure: scatter plot of weight against height, marking the actual weight, the predicted weight, and the error.]

Computing the RMSE
$\mathrm{RMSE} = \sqrt{\dfrac{\sum_i (\text{actual}_i - \text{predicted}_i)^2}{n - 2}} = 8.9$ kg
Meaning: a typical point on the scatter plot is above or below the regression line by about 8.9 kg (vertical distance).
The divisor: degrees of freedom = 4514 - 2 = 4512. The errors are computed from the regression line, and the regression line is defined by two estimated quantities, the slope and the intercept, each of which lowers the degrees of freedom by one.
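
As a rough illustration of the computation described on this slide, the sketch below fits a least-squares line and divides the sum of squared residuals by n - 2. The function name `regression_rmse` and the five data points are assumptions made up for the example; they are not the survey data.

```python
import math

def regression_rmse(xs, ys):
    """RMSE around the least-squares line, using n - 2 degrees of freedom."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared vertical distances from the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Two estimated quantities (slope, intercept) cost two degrees of freedom.
    return math.sqrt(sse / (n - 2))

# Hypothetical heights (cm) and weights (kg), for illustration only.
heights = [150, 160, 165, 170, 180]
weights = [48, 55, 62, 64, 75]
print(regression_rmse(heights, weights))
```

With thousands of observations, as in the survey, dividing by n or by n - 2 makes almost no difference; the distinction matters mainly for small samples like this toy one.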

Regression line & RMSE vs. average & SD
- The group average corresponds to the height of the regression line.
- The typical distance from the center (the SD) corresponds to the RMSE.
- The normal curve: the 68%-95% rule of thumb on the next slide follows from it.

Regression and the rule of thumb
About 68% of the points on a scatter diagram will be within 1 RMSE of the regression line; about 95% of them will be within 2 RMSEs.
[Figure: two scatter plots with bands of 1 RMSE (68%) and 2 RMSEs (95%) around the regression line.]
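
The rule of thumb can be checked empirically. The sketch below uses simulated data, which is an assumption: the line `-90 + 0.92 * x` and the noise level are invented for illustration, and the helper names `fit_line` and `share_within` are hypothetical. The rule is only expected to hold when the residuals are roughly normal with similar spread everywhere.

```python
import math
import random

def fit_line(xs, ys):
    """Least-squares slope and intercept of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def share_within(xs, ys, k):
    """Fraction of points lying within k RMSEs of the regression line."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    rmse = math.sqrt(sum(e * e for e in residuals) / (len(xs) - 2))
    return sum(abs(e) <= k * rmse for e in residuals) / len(xs)

# Simulated heights and weights (made-up parameters, not the survey data).
random.seed(0)
xs = [random.gauss(167.5, 8.5) for _ in range(5000)]
ys = [-90 + 0.92 * x + random.gauss(0, 8.9) for x in xs]

within_1 = share_within(xs, ys, 1)
within_2 = share_within(xs, ys, 2)
print(f"within 1 RMSE: {within_1:.0%}, within 2 RMSEs: {within_2:.0%}")
```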

Elementary method for RMSE
Estimate y while ignoring x: the estimate is always the average of y, which gives a horizontal line of estimates.
estimate = (average y)
residual = (actual y) - (average y)
The RMSE of this elementary method is SDy.
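
A small check of this statement, as a sketch with made-up weights: when every prediction is the average of y, the root mean square of the errors is exactly the SD of y. The example assumes the SD is computed with divisor n (`statistics.pstdev`).

```python
import math
import statistics

# Hypothetical weights in kg, for illustration only.
ys = [48, 55, 62, 64, 75]

# Predict the average for everyone, ignoring x entirely.
mean_y = statistics.fmean(ys)
errors = [y - mean_y for y in ys]
baseline_rmse = math.sqrt(sum(e * e for e in errors) / len(ys))

# Same number as the population SD of y.
sd_y = statistics.pstdev(ys)
print(baseline_rmse, sd_y)
```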

2. Computing RMSE Using the Correlation
RMSE of the regression line and SDy
[Figure: scatter plot comparing the RMSE around the regression line with SDy around the horizontal line at the average of y.]
The RMSE of regression is about $\sqrt{1 - r^2} \times SD_y$.
RMSE of regression < SDy, because the regression line gets closer to the points than the horizontal line does. (Recall: the regression line is the line that comes closest to the points of the scatter.)
r = 1 → RMSE = 0
r = -1 → RMSE = 0
r = 0 → RMSE ≈ SDy (approximately, because of the degrees of freedom)
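
The slide does not show where the factor $\sqrt{1 - r^2}$ comes from. One way to see it, sketched here under the assumption that the SDs and r are all computed with divisor n, is to expand the squared residuals of the regression line and use $\sum_i (x_i - \bar{x})(y_i - \bar{y}) = n\,r\,SD_x\,SD_y$:

```latex
\begin{align*}
e_i &= (y_i - \bar{y}) - r\,\frac{SD_y}{SD_x}\,(x_i - \bar{x}) \\
\sum_i e_i^2 &= \sum_i (y_i - \bar{y})^2
  - 2r\,\frac{SD_y}{SD_x} \sum_i (x_i - \bar{x})(y_i - \bar{y})
  + r^2\,\frac{SD_y^2}{SD_x^2} \sum_i (x_i - \bar{x})^2 \\
&= n\,SD_y^2 - 2n\,r^2\,SD_y^2 + n\,r^2\,SD_y^2
 = n\,(1 - r^2)\,SD_y^2 \\
\sqrt{\frac{1}{n} \sum_i e_i^2} &= \sqrt{1 - r^2}\; SD_y
\end{align*}
```

Dividing by n - 2 instead of n makes the RMSE slightly larger, which is why the relation holds only approximately.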

RMSE and the correlation coefficient
- Correlation coefficient: measures spread relative to the SDs, without units.
- RMSE: measures the vertical spread around the regression line, in the units of y.
We can get the RMSE from SDy using the correlation coefficient.

Regression analysis and the correlation coefficient
- r describes the clustering of the points around the SD line, relative to the SDs.
- Associated with each 1 SD increase in x, there is an increase of only r SDs in y, on average.
- r determines the accuracy of the regression predictions, through the formula RMSE = $\sqrt{1 - r^2} \times SD_y$.
- RMSE describes how well the regression line summarizes the data.
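
The formula is easy to try out numerically. In this minimal sketch the function name `rmse_from_r` is hypothetical, and the sample values r = 0.49 and SDy = 13.8 are borrowed from the exam example in section 5 of this transcript.

```python
import math

def rmse_from_r(r, sd_y):
    """Approximate RMSE of the regression line from r and the SD of y."""
    return math.sqrt(1 - r * r) * sd_y

# Limiting cases listed in this section.
print(rmse_from_r(1.0, 13.8))   # perfect positive correlation: no error
print(rmse_from_r(-1.0, 13.8))  # perfect negative correlation: no error
print(rmse_from_r(0.0, 13.8))   # no correlation: error equals SDy

# Exam example from section 5: r = 0.49, SD of final scores = 13.8.
exam_rmse = rmse_from_r(0.49, 13.8)
print(round(exam_rmse, 2))
```

Note how slowly the RMSE shrinks: even with r = 0.49 the typical prediction error is still close to 87% of SDy.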

3. The Residual Plot
Plotting the residual plot
- The residuals average out to 0.
- The regression line for the residual plot is the horizontal x-axis. The reason is that all the trend up or down has been taken out of the residuals and is in the regression line.
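
Both properties of the residuals can be verified on a toy data set. The sketch below uses made-up heights and weights and a hypothetical helper `fit_line`; it fits a line, computes the residuals, and then fits a second line to the residuals themselves.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept of y on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical heights (cm) and weights (kg), for illustration only.
xs = [150, 160, 165, 170, 180]
ys = [48, 55, 62, 64, 75]

slope, intercept = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Property 1: the residuals average out to 0 (up to rounding error).
mean_residual = sum(residuals) / len(residuals)

# Property 2: regressing the residuals on x gives a flat line on the x-axis.
residual_slope, residual_intercept = fit_line(xs, residuals)

print(mean_residual, residual_slope, residual_intercept)
```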

A residual plot with a strong pattern
Such a pattern appears when it was a mistake to use a regression line. The residual plot should not have a strong pattern.

4. The Vertical Strips
Scatter plot and histograms inside the vertical strips
[Figure: histograms of weight for the group of people with height about 165 cm and the group with height about 170 cm.]
The two histograms have similar shapes, and their SDs are nearly the same.

Homoscedasticity and heteroscedasticity
- Homoscedasticity: all the vertical strips in a scatter plot show similar amounts of spread, and the SDs of weight are not related to the x-value. The size of that spread is about the RMSE.
- Heteroscedasticity: the SDs of income within groups vary from one vertical strip to another. In this case, the RMSE of the regression line only gives a sort of average error across all the different x-values.
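
One informal way to tell the two cases apart is to compute the SD of y inside each vertical strip and compare. The sketch below does this on simulated data; the data-generating lines, the strip edges, and the helper name `strip_sds` are all assumptions made for illustration.

```python
import random
import statistics

def strip_sds(xs, ys, edges):
    """SD of y inside each vertical strip [edges[i], edges[i+1])."""
    sds = []
    for lo, hi in zip(edges, edges[1:]):
        in_strip = [y for x, y in zip(xs, ys) if lo <= x < hi]
        sds.append(statistics.pstdev(in_strip))
    return sds

random.seed(1)
xs = [random.uniform(0, 10) for _ in range(3000)]
# Homoscedastic: the same noise level at every x.
ys_homo = [2 * x + random.gauss(0, 1) for x in xs]
# Heteroscedastic: the noise level grows with x.
ys_hetero = [2 * x + random.gauss(0, 0.2 + 0.3 * x) for x in xs]

edges = [0, 2, 4, 6, 8, 10]
homo_sds = strip_sds(xs, ys_homo, edges)
hetero_sds = strip_sds(xs, ys_hetero, edges)

print([round(s, 2) for s in homo_sds])
print([round(s, 2) for s in hetero_sds])
```

In the first list the strip SDs stay roughly constant; in the second they grow from left to right, so a single RMSE understates the error at large x and overstates it at small x.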

5. Approximating to the Normal Curve Inside a Vertical Strip
When the approximation is impossible
- The errors do not follow the normal curve.
- The regression method using the RMSE is off by different amounts in different parts of the scatter plot.
- The estimates by themselves are then meaningless.

Example 1
Midterm and final scores in econometrics, spring semester 2002. The scatter plot is oval shaped.
- midterm average = 27.9
- midterm SD = 8.5
- final average = 56.4
- final SD = 13.8
- r = 0.49
(1) What percentage of students got 66 or over on the final?
(2) What percentage of the students whose midterm score is 33 got 66 or over on the final?

Example 1 (1)
Neither the midterm statistics nor the correlation coefficient is needed.
z = (66 - 56.4) / 13.8 ≈ 0.7
By the standard normal curve, about 24%.
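
Part (1) can be reproduced with Python's standard library instead of a normal table. This is a sketch that assumes the final scores follow the normal curve, as the slide does; the variable names are made up.

```python
from statistics import NormalDist

# Figures from the example: final-exam average and SD.
final_mean, final_sd = 56.4, 13.8

# Convert the cutoff of 66 to standard units.
z = (66 - final_mean) / final_sd

# Area under the standard normal curve to the right of z.
share_66_plus = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, share scoring 66 or over = {share_66_plus:.0%}")
```

The result agrees with the slide's answer of about 24%.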

Example 1 (2)
We get the new average from the regression analysis and the new SD from the RMSE of the regression line.
Regression analysis method:
1. The midterm score is above the average by (33 - 27.9) / 8.5 = 0.6 SDx.
2. r = 0.49; 0.6 × 0.49 ≈ 0.3.
3. The final score is above average by 0.3 SDy = 0.3 × 13.8 ≈ 4.1.
4. The new average is 56.4 + 4.1 = 60.5.
New SD = RMSE = $\sqrt{1 - 0.49^2} \times 13.8 \approx 12.0$
z = (66 - 60.5) / 12.0 ≈ 0.5
By the standard normal curve, about 31%.
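
The same steps can be written out in code. This is a sketch under the slide's assumptions (oval scatter, normal curve inside the strip); the variable names are made up. Because the code does not round intermediate values, it gives z of about 0.46 and a share of about 32%, whereas the slide rounds z to 0.5 and reads 31% from the table.

```python
import math
from statistics import NormalDist

# Figures from the example.
mid_mean, mid_sd = 27.9, 8.5
final_mean, final_sd = 56.4, 13.8
r = 0.49

# Step 1: midterm score of 33 in standard units.
z_mid = (33 - mid_mean) / mid_sd

# Steps 2-4: regression estimate of the average final score in this strip.
new_mean = final_mean + r * z_mid * final_sd

# New SD inside the strip: the RMSE of the regression line.
new_sd = math.sqrt(1 - r * r) * final_sd

# Share of this group scoring 66 or over, using the normal curve.
z = (66 - new_mean) / new_sd
share = 1 - NormalDist().cdf(z)

print(f"new average = {new_mean:.1f}, new SD = {new_sd:.1f}")
print(f"z = {z:.2f}, share = {share:.0%}")
```

Either way, students with an above-average midterm are noticeably more likely to reach 66 on the final than the class as a whole.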