Accuracy of Prediction: How accurate are predictions based on a correlation?


Accuracy depends on r_XY
• If we know nothing about an individual (e.g., we try to predict the IQ of a randomly selected person), we should guess the mean.
• If we always guess the mean, the variance tells us the average "cost" (squared error) of our guesses.
• However, if we use X to predict Y, we reduce this cost by a proportion equal to r².
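A minimal sketch of the "cost" idea, using a small made-up set of scores (the values and variable names are illustrative assumptions, not from the slides). It checks that the average squared error of always guessing the mean equals the variance, and that a different constant guess costs more.

```python
from statistics import fmean, pvariance

# Hypothetical scores, chosen only for illustration.
scores = [85, 95, 100, 105, 115]

def average_cost(guess, values):
    """Mean squared error of using one constant guess for every value."""
    return fmean([(y - guess) ** 2 for y in values])

mean_guess = fmean(scores)                    # always guess the mean
cost_at_mean = average_cost(mean_guess, scores)
cost_elsewhere = average_cost(90, scores)     # any other constant guess

print(cost_at_mean, pvariance(scores), cost_elsewhere)
```

The first two printed values agree, which is the sense in which the variance is the cost of guessing the mean.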

On Sale: How Accurate?
• Squaring the correlation tells us what proportion of the variance is removed by using X to predict Y.
• If r = 1 or r = -1, the squared value is 1. Both are cases of perfect prediction, like 100% off.
• If r = ½ or r = -½, the squared correlation is ¼, or .25. A correlation of .5 therefore reduces the cost by only 25%.

Variance of Residuals and the "standard error of regression"
• The average squared deviation between the guess and the actual value of Y is called the variance of residuals (errors).
• You compute it by multiplying the original variance of Y by (1 - r²), where r is the correlation between X and Y.
• The standard error of regression is the square root of this variance.
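Written as formulas, the rule above might look as follows. The symbols s_Y (standard deviation of Y) and s_res (standard deviation of the residuals) are notation assumed here for illustration; the slides do not name them.

```latex
s_{\mathrm{res}}^{2} = s_{Y}^{2}\,\bigl(1 - r^{2}\bigr),
\qquad
s_{\mathrm{res}} = \sqrt{s_{\mathrm{res}}^{2}} = s_{Y}\sqrt{1 - r^{2}}
```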

Sample Problem
• Suppose we use a sister's IQ (X) to predict her brother's IQ (Y). The means of X and Y are both 100, and the standard deviations are both 15.
• If we don't know Jane's IQ, the variance of our errors in predicting Joe's IQ is 15² = 225.
• The correlation is .5, so the variance of the residuals is (1 - .25)(225) = 168.75.

Standard Deviation of Errors
• Take the square root of the variance of residuals to compute the standard error of regression, i.e., the standard deviation of the differences between predicted and obtained values of Y.
• For our problem, the square root of 168.75 is 12.99, approximately 13.
• Knowing the sister's IQ reduces the standard deviation of the errors from 15 to about 13.
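The arithmetic of the sample problem can be reproduced in a few lines. This is only a sketch; the variable names are assumptions, while the numbers (standard deviation 15, correlation .5) come from the slides.

```python
import math

sd_y = 15.0  # standard deviation of brothers' IQ (from the slides)
r = 0.5      # sister-brother correlation (from the slides)

var_y = sd_y ** 2                        # variance when guessing the mean
var_residual = (1 - r ** 2) * var_y      # variance left after using X
se_regression = math.sqrt(var_residual)  # standard deviation of the errors

print(var_y, var_residual, round(se_regression, 2))
```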

Summary
• If Jane has an IQ of 130, we predict that her brother has an IQ of 115.
• However, not all brothers of sisters with an IQ of 130 will score exactly 115.
• As a group, they will have a mean IQ of 115, with a standard deviation of about 13.
• Assuming the errors are normally distributed, the probability that Joe has a higher IQ than his sister is only about 12%.
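The summary figures can be checked with a short sketch. It assumes the z-score form of the prediction rule (predicted z for Y equals r times z for X) and normally distributed residuals; neither is spelled out in the transcript, and the variable names are illustrative.

```python
from statistics import NormalDist

mean_x = mean_y = 100.0  # means of sister's and brother's IQ (from the slides)
sd_x = sd_y = 15.0       # standard deviations (from the slides)
r = 0.5                  # correlation (from the slides)
jane_iq = 130.0

# Prediction in z-score form: predicted z_Y = r * z_X (assumed rule).
z_jane = (jane_iq - mean_x) / sd_x
predicted_joe = mean_y + r * z_jane * sd_y

# Spread of actual brothers' scores around that prediction.
se_regression = sd_y * (1 - r ** 2) ** 0.5

# Assumption: residuals are normal, so Joe ~ Normal(predicted_joe, se_regression).
p_joe_higher = 1 - NormalDist(predicted_joe, se_regression).cdf(jane_iq)

print(predicted_joe, round(se_regression, 2), round(p_joe_higher, 2))
```

Under those assumptions the probability comes out close to the 12% quoted in the summary.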