Correlation. Definition: the statistic r is called Pearson's correlation coefficient.


Correlation

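Definition

The statistic

$$r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}}, \qquad S_{xy} = \sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}),\quad S_{xx} = \sum_{i=1}^{n}(x_i-\bar{x})^2,\quad S_{yy} = \sum_{i=1}^{n}(y_i-\bar{y})^2,$$

is called Pearson's correlation coefficient.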
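As a minimal sketch of the definition, the function below computes r as S_xy / sqrt(S_xx S_yy) from two lists of numbers. The function name `pearson_r` and the small data sets are made up for illustration and do not come from the slides.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when x or y is constant")
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up illustrative data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_line_up = [3.0, 5.0, 7.0, 9.0, 11.0]    # exactly y = 1 + 2x
y_line_down = [9.0, 7.0, 5.0, 3.0, 1.0]   # exactly y = 11 - 2x
y_noisy = [2.0, 1.0, 4.0, 3.0, 5.0]       # positive but imperfect relationship

r_up = pearson_r(x, y_line_up)
r_down = pearson_r(x, y_line_down)
r_noisy = pearson_r(x, y_noisy)
```

Points lying exactly on a line give r = +1 or r = -1 according to the sign of the slope, while the scattered data give a value strictly between the two.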

Properties

1. $-1 \le r \le 1$, equivalently $|r| \le 1$ or $r^2 \le 1$.
2. $|r| = 1$ ($r = +1$ or $-1$) if the points $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$ lie along a straight line (positive slope for +1, negative slope for -1).

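Proof

The proof uses the Cauchy-Schwarz inequality.

Cauchy-Schwarz Inequality

Let $u_1, \dots, u_n$ and $v_1, \dots, v_n$ be any two sets of real numbers. Then

$$\left(\sum_{i=1}^{n} u_i v_i\right)^2 \le \left(\sum_{i=1}^{n} u_i^2\right)\left(\sum_{i=1}^{n} v_i^2\right),$$

with equality if $v_i = b\,u_i$ for some $b$ and $i = 1, 2, \dots, n$.

Proof: Let

$$g(b) = \sum_{i=1}^{n}(v_i - b\,u_i)^2 = \sum v_i^2 - 2b\sum u_i v_i + b^2\sum u_i^2 \ge 0.$$

This is a quadratic function of $b$ and (assuming the $u_i$ are not all zero) has a minimum when $g'(b) = 0$:

$$-2\sum u_i v_i + 2b\sum u_i^2 = 0 \quad\text{or}\quad b_{\min} = \frac{\sum u_i v_i}{\sum u_i^2},$$

hence

$$g(b_{\min}) = \sum v_i^2 - \frac{\left(\sum u_i v_i\right)^2}{\sum u_i^2} \ge 0.$$

Thus $\left(\sum u_i v_i\right)^2 \le \left(\sum u_i^2\right)\left(\sum v_i^2\right)$, and equality holds if $g(b_{\min}) = 0$, i.e. $v_i = b_{\min} u_i$ for $i = 1, 2, \dots, n$.

Finally, take $u_i = x_i - \bar{x}$ and $v_i = y_i - \bar{y}$, so that $\sum u_i v_i = S_{xy}$, $\sum u_i^2 = S_{xx}$ and $\sum v_i^2 = S_{yy}$. Then $S_{xy}^2 \le S_{xx} S_{yy}$, or $r^2 = S_{xy}^2 / (S_{xx} S_{yy}) \le 1$, i.e. $-1 \le r \le 1$.

Also $r^2 = 1$, i.e. $|r| = 1$, if and only if $y_i - \bar{y} = b_{\min}(x_i - \bar{x})$, or $y_i = \bar{y} + b_{\min}(x_i - \bar{x})$, for $i = 1, 2, \dots, n$; that is, the points lie on a straight line.

Note: $b_{\min} = S_{xy}/S_{xx}$ and $r = S_{xy}/\sqrt{S_{xx} S_{yy}}$, so $r$ has the same sign as the slope of the line.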

Properties of Pearson's correlation coefficient r

1. The value of r is always between -1 and +1.
2. If the relationship between X and Y is positive, then r will be positive.
3. If the relationship between X and Y is negative, then r will be negative.
4. If there is no linear relationship between X and Y, then r will be zero.
5. The value of r will be +1 if the points (x_i, y_i) lie on a straight line with positive slope.
6. The value of r will be -1 if the points (x_i, y_i) lie on a straight line with negative slope.

Scatter plots (not reproduced) illustrating r = 1, 0.95, 0.7, 0.4, 0, -0.4, -0.7, -0.8, -0.95 and -1.

The test for independence (zero correlation)

H_0: X and Y are independent
H_A: X and Y are correlated

The test statistic:

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$

The critical region: reject H_0 if $|t| > t_{\alpha/2}$ (df = n - 2).

This is a two-tailed critical region; the critical region could also be one-tailed.
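The test can be sketched in a few lines, assuming the usual statistic t = r sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The function names and the values r = 0.6, n = 18 are hypothetical. The critical value is passed in by the caller; 2.120 is the tabulated t_{0.025} for 16 degrees of freedom, which corresponds to an assumed two-tailed level of 0.05.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: zero correlation, with n - 2 degrees of freedom."""
    if n <= 2:
        raise ValueError("need n > 2 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("t is undefined for |r| >= 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def reject_independence(r, n, t_critical):
    """Two-tailed decision: reject H0 when |t| exceeds the critical value."""
    return abs(correlation_t_statistic(r, n)) > t_critical

# Hypothetical example: r = 0.6 from n = 18 pairs, so df = 16.
# t_critical = 2.120 is the tabulated t_{0.025}(16) (alpha = 0.05 assumed).
t_example = correlation_t_statistic(0.6, 18)
decision_strong = reject_independence(0.6, 18, t_critical=2.120)
decision_weak = reject_independence(0.1, 18, t_critical=2.120)
```

For a one-tailed test the comparison would use t rather than |t|, with the critical value t_alpha in place of t_{alpha/2}.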

Example

In this example we are studying building fires in a city and are interested in the relationship between:

1. X = the distance between the building that raises the alarm and the closest fire hall, and
2. Y = the cost of the damage (in $1000s).

The data were collected on n = 15 fires.

The Data

Scatter Plot

Computations

Computations Continued

The correlation coefficient and the test for independence (zero correlation)

The test statistic is $t = r\sqrt{n-2}/\sqrt{1-r^2}$ with df = n - 2 = 13. We reject H_0: independence if $|t| > t_{\alpha/2}$.

Conclusion: H_0: independence is rejected.
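A rough check of this conclusion can be worked by hand. It assumes $r^2 \approx 0.923$ (the 92.3% quoted for the fire example under the coefficient of determination), a positive $r$, and a significance level $\alpha = 0.05$. Both the sign of $r$ and the level are assumptions, since the values on the original slide are not preserved.

$$r \approx \sqrt{0.923} \approx 0.961, \qquad t \approx \frac{0.961\,\sqrt{13}}{\sqrt{1 - 0.923}} \approx \frac{0.961 \times 3.606}{0.277} \approx 12.5, \qquad t_{0.025}(13) = 2.160.$$

Since 12.5 is far larger than 2.160, the hypothesis of independence would be rejected under these assumptions.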

Relationship between Regression and Correlation

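Recall that the least-squares slope is $\hat{\beta} = S_{xy}/S_{xx}$ and $r = S_{xy}/\sqrt{S_{xx} S_{yy}}$, where $S_{xy} = \sum (x_i-\bar{x})(y_i-\bar{y})$, $S_{xx} = \sum (x_i-\bar{x})^2$ and $S_{yy} = \sum (y_i-\bar{y})^2$, and since $S_{xy} = \hat{\beta}\, S_{xx}$,

$$r = \hat{\beta}\sqrt{\frac{S_{xx}}{S_{yy}}}.$$

The test for independence (zero correlation)

H_0: X and Y are independent
H_A: X and Y are correlated

Uses the test statistic:

$$t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}$$

Note:

$$s^2 = \frac{S_{yy} - \hat{\beta}\, S_{xy}}{n-2} = \frac{S_{yy}(1-r^2)}{n-2} \quad\text{and}\quad t = \frac{\hat{\beta}}{s/\sqrt{S_{xx}}} = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}.$$

The two tests

1. The test for independence (zero correlation): H_0: X and Y are independent, H_A: X and Y are correlated
2. The test for zero slope: H_0: $\beta = 0$, H_A: $\beta \ne 0$

are equivalent.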
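The equivalence can be checked numerically. The sketch below computes the t statistic for the slope as slope / (s / sqrt(S_xx)) and the t statistic for the correlation as r sqrt(n - 2) / sqrt(1 - r^2), then compares them. The function name and the data are made up for illustration, and the code assumes simple linear regression fitted by least squares.

```python
import math

def slope_and_correlation_t(x, y):
    """Return (t for zero slope, t for zero correlation) for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)

    slope = s_xy / s_xx                      # least-squares slope estimate
    r = s_xy / math.sqrt(s_xx * s_yy)        # Pearson's r
    rss = s_yy - slope * s_xy                # residual sum of squares
    s = math.sqrt(rss / (n - 2))             # residual standard deviation

    t_slope = slope / (s / math.sqrt(s_xx))
    t_corr = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    return t_slope, t_corr

# Made-up illustrative data (not from the slides).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
t_slope, t_corr = slope_and_correlation_t(x, y)
```

The two values agree up to floating-point rounding, which is why either test leads to the same decision at a given significance level.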

The Coefficient of Determination

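The Residual Sum of Squares in Regression

$$RSS = \sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = S_{yy} - \frac{S_{xy}^2}{S_{xx}}, \qquad S_{yy} = \sum (y_i-\bar{y})^2,\; S_{xx} = \sum (x_i-\bar{x})^2,\; S_{xy} = \sum (x_i-\bar{x})(y_i-\bar{y}).$$

Note: $RSS = S_{yy}(1 - r^2)$, since $r^2 = S_{xy}^2/(S_{xx} S_{yy})$.

This follows from the decomposition

Total Variance in Y = Variance Unexplained + Variance Explained

$$\sum (y_i-\bar{y})^2 = \sum (y_i-\hat{y}_i)^2 + \sum (\hat{y}_i-\bar{y})^2.$$

Proportion of Variance Unexplained = $RSS / S_{yy} = 1 - r^2$

Proportion of Variance Explained = 1 - Proportion of Variance Unexplained = $r^2$

$r^2$ is called the Coefficient of Determination.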
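A small numerical sketch of this decomposition follows, assuming a least-squares line fitted to paired data. The function name, the dictionary keys, and the data are hypothetical and chosen only for illustration.

```python
import math

def variance_decomposition(x, y):
    """Split the total sum of squares of y into unexplained and explained parts."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)

    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]

    unexplained = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # RSS
    explained = sum((fi - y_bar) ** 2 for fi in fitted)
    r_squared = s_xy ** 2 / (s_xx * s_yy)
    return {
        "total": s_yy,
        "unexplained": unexplained,
        "explained": explained,
        "r_squared": r_squared,
    }

# Made-up illustrative data (not from the slides).
parts = variance_decomposition([1.0, 2.0, 3.0, 4.0, 5.0],
                               [2.0, 1.0, 4.0, 3.0, 5.0])
proportion_explained = 1.0 - parts["unexplained"] / parts["total"]
```

Here the proportion explained computed from the residuals matches r squared computed directly from the sums of squares and cross-products.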

Example: Fire Example

$r^2$ = the Coefficient of Determination = 0.923, so 92.3% = Proportion of Variance in Y (cost of damage) explained by X (distance to the closest fire hall).

Proportion of Variance Unexplained = $1 - r^2$ = 0.077 (7.7%).