Alternatively, dependent variable and independent variable. Alternatively, endogenous variable and exogenous variable.


Association versus causation

Scatterplots

[Scatterplot: weeks since beginning of semester (x-axis) versus percentage of computers in computer labs that are free (y-axis)]

Stata Exercise 1

Stata Exercise 2 Suppose we were considering the effect of hiring more people into the firm. On average, what total billings can we expect from a staff of 50? 150?

Stata Exercise 3

Stata Exercise 4

Stata Exercise 5 Adding Categorical Values to a Scatterplot Often it is useful to have a way of distinguishing groups of data in a scatterplot

Stata Exercise 6

Transforming Data Data analysts often look for a transformation of the data that simplifies the overall pattern. The transformation typically involves turning a non-Normally distributed variable into a more-or-less Normally distributed variable. Stata Exercise 7

Categorical Explanatory Variable What if the explanation for the numbers is not another number but the category? For example, investing in a particular sector of the economy might be great in some years or terrible in others. Stata Exercise 8

More scatterplots Relations between competitors Stata Exercise 9

Correlation

Which one has the stronger correlation?

r = covariance(x, y) / [stdev(x) * stdev(y)]
r = (1/(n-1)) * sum of [(standardized value of x) * (standardized value of y)]
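The two forms of r above can be checked against each other numerically. The following is a minimal sketch in Python using only the standard library; the height and weight values are made up for illustration and do not come from the slides.

```python
from statistics import mean, stdev

def correlation_from_covariance(x, y):
    """r = covariance(x, y) / [stdev(x) * stdev(y)], with n - 1 in the covariance."""
    n = len(x)
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    return cov / (stdev(x) * stdev(y))

def correlation_from_z_scores(x, y):
    """r = (1/(n-1)) * sum of products of standardized values."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sx, sy = stdev(x), stdev(y)
    products = (((xi - mx) / sx) * ((yi - my) / sy) for xi, yi in zip(x, y))
    return sum(products) / (n - 1)

# Hypothetical heights (inches) and weights (pounds), invented for this sketch.
heights = [61.5, 64.0, 66.5, 69.0, 71.5, 76.5]
weights = [120, 135, 150, 160, 185, 210]

r_cov = correlation_from_covariance(heights, weights)
r_z = correlation_from_z_scores(heights, weights)
print(round(r_cov, 4), round(r_z, 4))  # the two values agree
```

Both functions divide by n - 1, which matches the sample standard deviation returned by `statistics.stdev`.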

Correlation The r coefficient between measures of height and weight is positive because people who are of above-average height tend to be of above-average weight … so if the z-score for height is large, the z-score for weight tends to be large.
r = (1/(n-1)) * sum of [(standardized value of x) * (standardized value of y)]
Correlation applet at

Stata Exercise 11

Correlation Correlation coefficients, as well as scatterplots, can be used for comparisons. For example, how well did Vanguard International Growth Fund (an investment vehicle) do compared to an average of the stocks in Europe, Australasia and the Far East? Stata Exercise 12

Correlation
– Doesn’t tell you anything about causality
– Variables must be numerical
– It is indifferent to units of measurement
– r > 0 means positive association; r < 0, negative
– -1 ≤ r ≤ 1. r = -1 means a perfectly straight downward-sloping line; r = 0 means no linear relation
– r only measures linear relations
– r is not resistant to outliers
Stata Exercise 13
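Several of these properties can be seen in a small numerical experiment. The sketch below uses only the Python standard library; all data values are invented for illustration.

```python
from statistics import mean, stdev

def corr(x, y):
    """Sample correlation r, computed from standardized values."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / ((len(x) - 1) * sx * sy)

# Hypothetical, nearly linear data.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

# Indifferent to units: rescaling either variable leaves r unchanged.
r_original = corr(x, y)
r_rescaled = corr([2.54 * a for a in x], [b / 1000 for b in y])

# A perfectly straight downward-sloping line gives r = -1.
r_down = corr(x, [10 - 2 * a for a in x])

# r only measures linear relations: an exact parabola can have r = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
r_parabola = corr(xs, [a * a for a in xs])

# Not resistant to outliers: one added point changes r sharply.
r_with_outlier = corr(x + [7], y + [0.0])

print(r_original, r_rescaled, r_down, r_parabola, r_with_outlier)
```

The parabola case shows why r = 0 should be read as "no linear relation": y is completely determined by x there, yet r is zero.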

Regression

The Linear Regression Model Errors have a mean of 0 and a constant standard deviation σ, and are independent of x.

(66.5’’, $20,000) (76.5’’, $35,600) (61.5’’, $12,200)
y – 20,000 = 1560 (x )
y = – 84, x
Sketch a scatterplot of the data consistent with this line
[Figure annotations: $37,694; 95% of values]

Draw the best-fitting line through the circles

Mark with an “X” the average “y” value for each “x” value. Then draw the best-fitting line through the Xs

Fact 1 Regression (unlike correlation) is sensitive to your determination of which variable is explanatory and which is the response.
Sales = a + b(item)
Item = a + b(sales)
Stata Exercise 14
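To illustrate Fact 1, the sketch below fits both regressions to the same data with an ordinary least-squares helper. The `items` and `sales` values and the helper name `least_squares` are hypothetical and serve only this example.

```python
from statistics import mean

def least_squares(x, y):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b

# Hypothetical data, invented for this sketch.
items = [10, 20, 30, 40, 50]
sales = [25, 38, 70, 75, 110]

a_sales, b_sales = least_squares(items, sales)  # Sales = a + b(item)
a_items, b_items = least_squares(sales, items)  # Item = a + b(sales)

# If the two fits described the same line, b_items would equal 1 / b_sales.
print(b_items, 1 / b_sales)
# The product of the two slopes equals r squared, which is below 1 here.
print(b_sales * b_items)
```

Swapping the roles of the variables changes which distances are minimized (vertical versus horizontal), so the two fitted lines coincide only when the points fall exactly on a line.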

Facts 2 and 3 If x changes by one standard deviation of x, the predicted y changes by r standard deviations of y.
– E.g., s_x = 1, s_y = 2, and r = 0.61. If x changes by 1, y will change by 2*0.61 = 1.22.
The regression line goes through the point (x̄, ȳ), the means of x and y.
– The point-slope form of the line requires only the information on this slide to draw a line.
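Facts 2 and 3 together say that the slope is r times the ratio of the standard deviations and that the line passes through the point of means. Here is a minimal sketch under that reading, in standard-library Python. The data values are made up; only the last line uses the slide's own numbers.

```python
from statistics import mean, stdev

# Hypothetical data, invented for this sketch.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 6.0]

n = len(x)
mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
r = sum((a - mx) * (b - my) for a, b in zip(x, y)) / ((n - 1) * sx * sy)

# Fact 2: slope = r * (s_y / s_x).
slope_from_r = r * sy / sx

# The same slope computed directly by least squares, for comparison.
slope_direct = (sum((a - mx) * (b - my) for a, b in zip(x, y))
                / sum((a - mx) ** 2 for a in x))

# Fact 3: the line goes through (mean of x, mean of y), which fixes the intercept.
intercept = my - slope_from_r * mx

def predict(x_new):
    """Predicted y from the point-slope form y - my = slope * (x - mx)."""
    return intercept + slope_from_r * x_new

# The slide's arithmetic: s_x = 1, s_y = 2, r = 0.61 gives a slope of 1.22.
slide_slope = 0.61 * 2 / 1
print(slope_from_r, slope_direct, predict(mx), slide_slope)
```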

Fact 4 Correlation r is related to the slope of the regression line and therefore to the relation between x and y. In fact, the square of r, written R², is the fraction of the variation in y that is explained by the variation in x.

Because most of the variation in gas consumption is explained by temperature, the R² of this regression is very high.
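Fact 4 can be checked by computing the explained fraction of variation directly and comparing it with the square of r. The sketch below uses made-up temperature and gas-consumption figures, not the data behind the slide.

```python
from statistics import mean, stdev

# Hypothetical temperatures and gas consumption, invented for this sketch.
temps = [20, 25, 30, 35, 40, 45, 50, 55]
gas = [9.8, 9.1, 8.0, 7.4, 6.1, 5.6, 4.4, 3.9]

n = len(temps)
mx, my = mean(temps), mean(gas)

# Least-squares line: gas = a + b * temp.
b = (sum((t - mx) * (g - my) for t, g in zip(temps, gas))
     / sum((t - mx) ** 2 for t in temps))
a = my - b * mx

predicted = [a + b * t for t in temps]
sse = sum((g - p) ** 2 for g, p in zip(gas, predicted))  # unexplained variation
sst = sum((g - my) ** 2 for g in gas)                    # total variation in y
r_squared = 1 - sse / sst

# Correlation computed separately, then squared for comparison.
r = (sum((t - mx) * (g - my) for t, g in zip(temps, gas))
     / ((n - 1) * stdev(temps) * stdev(gas)))

print(r, r ** 2, r_squared)
```

Here r is negative because consumption falls as temperature rises, while its square is close to 1.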

[Table columns: tbill98, tbill98_hat, residuals]
Excel Exercise 1

Stata Exercises 15 and 16

[Figure panels: regression line with the influential observation; regression line without the influential observation]
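The effect of a single influential observation can be reproduced by fitting the line twice, once with and once without it. This is a minimal sketch on invented data; the extra point at x = 15 is a hypothetical influential observation, not one from the slides.

```python
from statistics import mean

def least_squares(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    return my - b * mx, b

# Hypothetical data lying close to a line with slope about 1.
x = [1, 2, 3, 4, 5, 6]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.0]

# One made-up point far out in x that does not follow the pattern.
x_all = x + [15]
y_all = y + [2.0]

_, slope_without = least_squares(x, y)
_, slope_with = least_squares(x_all, y_all)

print(slope_without, slope_with)  # the single extra point flattens the line
```

A point that is extreme in the x direction pulls the line toward itself, which is why it can change the slope far more than an ordinary point would.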

Stata Exercise 17

Cautions about Correlation and Regression
– Don’t extrapolate too far
– Correlations are stronger for averages than for individuals
– Beware of lurking (latent, hidden, excluded, neglected) variables
– Association is not causation
  – Establishing causation takes a lot of work (see p. 139).