Correlation



- The variance of a variable X provides information on the variability of X.
- The covariance of two variables X and Y provides information on the related variability of X and Y together.
- Note the similarity in the structure of the formulas: instead of relating X to itself, as in the variance, the covariance relates X to the other variable Y.
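The formulas the slide refers to are not reproduced in this transcript. Assuming the usual sample definitions (divisor n - 1, with $\bar{x}$ and $\bar{y}$ the sample means), the parallel structure can be written as:

```latex
s_X^2  = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})
\qquad
s_{XY} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
```

Replacing the second factor $(x_i - \bar{x})$ with $(y_i - \bar{y})$ is the only change between the two expressions.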

- There is an Excel function to calculate covariance: =COVAR(range1, range2)
- Unfortunately, for most common purposes, Excel does not calculate the sample covariance but instead calculates what is known as the population covariance.
- Therefore, to transform Excel’s covariance calculation into the more useful sample covariance, it is necessary to multiply Excel’s result by the factor n/(n-1).
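The n/(n-1) adjustment can be checked with a small sketch outside Excel. The function names and the data below are hypothetical, chosen only for illustration; the population version uses divisor n, which is what the slide says COVAR returns.

```python
def population_covariance(xs, ys):
    """Covariance with divisor n (the 'population' version)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def sample_covariance(xs, ys):
    """Covariance with divisor n - 1, obtained by rescaling by n / (n - 1)."""
    n = len(xs)
    return population_covariance(xs, ys) * n / (n - 1)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

pop_cov = population_covariance(x, y)
samp_cov = sample_covariance(x, y)
print(f"population: {pop_cov:.3f}, sample: {samp_cov:.3f}")
```

With only five observations the two values differ by a factor of 5/4; the gap shrinks as n grows.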

- Covariance measures how much Y and X tend to vary in the same direction.
- High positive covariance means the highest values of Y tend to occur along with the highest values of X.
- However, it’s hard to interpret because it has no standard scale of reference. A covariance of 300,000 could be trivial, while another of 2.1 could be fairly substantial.

A more useful way to describe this relationship between X and Y is to express the covariance as a fraction of the product of the standard deviations of X and Y. This ratio is known as the “standardized” covariance, or the correlation coefficient (correlation for short), and is commonly denoted r. In Excel, the correlation formula is =CORREL(range1, range2)
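As a rough sketch of this standardization, the snippet below divides the sample covariance by the product of the two sample standard deviations. The helper names and the data are hypothetical. It also rescales Y by 1000 to show that the covariance changes with the units while r does not.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance (divisor n - 1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # variance is the covariance of X with itself
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
y_rescaled = [1000.0 * v for v in y]  # same data in different units

r = correlation(x, y)
r_rescaled = correlation(x, y_rescaled)
cov_ratio = sample_covariance(x, y_rescaled) / sample_covariance(x, y)
print(f"r = {r:.4f}, r after rescaling = {r_rescaled:.4f}, covariance ratio = {cov_ratio:.0f}")
```

The covariance grows by the same factor of 1000 as the units, which is why it has no standard scale of reference, whereas r is unchanged.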

- The correlation coefficient (r) measures how much Y and X tend to vary in the same direction on a standard scale. (Varying in the same direction is implicitly a linear relationship.)
- It will always be between -1 and +1:
  - r = +1 implies a perfect positive relationship
  - r = -1 implies a perfect negative relationship
  - r = 0 implies no linear relationship exists!

- Since it is unlikely that any real social data will have either a perfect positive correlation (r = 1) or a perfect negative correlation (r = -1), how does an analyst know if there is “enough” correlation?
- A simple rule of thumb is that a correlation of less than 30% in absolute value (|r| < 0.3) suggests no linear relationship, whereas one of more than 70% (|r| > 0.7) suggests a strong linear relationship. Everything in between is, say, “somewhat of a relationship”.


- The hypotheses are: H0: correlation = 0 versus H1: correlation ≠ 0
- Approximate the standard error using the formula: $s_r = \sqrt{(1 - r^2)/(n - 2)}$
- Calculate the t-statistic, with n-2 degrees of freedom. The formula is: $t = r / s_r$

- Suppose that, for a sample of size 20, the sample covariance between two variables X and Y is 87, the sample variance of X is 100, and the sample variance of Y is 400. Is there a statistically significant linear relationship?

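With r = 87 / (10 × 20) = 0.435, the t-statistic is approximately 2.05 with 18 degrees of freedom. A linear relationship between the two variables is therefore statistically significant at the 10% level but not at the 5% level (the two-tailed critical values are 1.734 and 2.101, respectively).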
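The arithmetic of this example can be sketched as follows. The standard error formula sqrt((1 - r²)/(n - 2)) is an assumption here (the usual one for this test), and the critical values are simply the two quoted on the slide rather than values computed from a t-distribution.

```python
import math

# Figures given in the example.
n = 20
cov_xy = 87.0
var_x = 100.0
var_y = 400.0

# Correlation: covariance over the product of the standard deviations.
r = cov_xy / (math.sqrt(var_x) * math.sqrt(var_y))

# Assumed standard error of r and the resulting t-statistic (n - 2 dof).
std_error = math.sqrt((1.0 - r**2) / (n - 2))
t_stat = r / std_error

# Two-tailed critical values for 18 dof, as quoted on the slide.
T_CRIT_10 = 1.734
T_CRIT_05 = 2.101

significant_10 = abs(t_stat) > T_CRIT_10
significant_05 = abs(t_stat) > T_CRIT_05
print(f"r = {r:.3f}, SE = {std_error:.4f}, t = {t_stat:.3f}")
print(f"significant at 10%: {significant_10}, at 5%: {significant_05}")
```

The t-statistic of roughly 2.05 falls between the two critical values, which matches the stated conclusion.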


- Correlation is most useful for quickly considering possible relationships between many different variables.
- Suppose, for example, that the analysis is examining 10 different variables: X1 … X10
- Using Excel’s CORREL function would require entering 45 such calculations, one for each pair of variables.
- A better exploratory (one-time) way is to use Excel’s built-in Data Analysis ToolPak.

Complete the dialog box, which leads to the results. [Figures: dialog box; results]
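Outside Excel, the same exploratory pass over every pair of variables can be sketched in a few lines. The ten variables below are randomly generated placeholders, not data from the presentation, and the helper `correlation` is a hypothetical name.

```python
import itertools
import math
import random


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Placeholder data: 10 unrelated variables with 50 observations each.
random.seed(0)
data = {f"X{i}": [random.gauss(0, 1) for _ in range(50)] for i in range(1, 11)}

# Every unordered pair of the 10 variables: 10 * 9 / 2 = 45 pairs.
pairs = list(itertools.combinations(data, 2))
corr = {(a, b): correlation(data[a], data[b]) for a, b in pairs}
print(f"{len(pairs)} pairwise correlations")

# List the three pairs with the largest absolute correlation.
for (a, b), r in sorted(corr.items(), key=lambda kv: -abs(kv[1]))[:3]:
    print(a, b, round(r, 3))
```

Sorting by absolute value is one simple way to scan the 45 results for pairs worth a closer look.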

- The correlation coefficient measures linearity.
- If there is a nonlinear relationship, r will underestimate the predictive power of the relationship between the two variables.
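A small constructed example makes the point. Here Y is determined exactly by X through Y = X², yet the linear correlation is zero because the relationship is symmetric around the mean of X. The data and the helper name are hypothetical.

```python
import math


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Y is a perfect (but nonlinear) function of X.
x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
y = [v**2 for v in x]

r = correlation(x, y)
print(f"r = {r:.3f}")
```

Knowing X predicts Y perfectly in this case, so a small r should not be read as "no relationship", only as "no linear relationship".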

- Rank correlation measures how two variables are related in a more general way.
- A high rank correlation says that large values of X tend to occur with large values of Y, and low with low, whether or not the relationship is linear.
- Generally, this type of correlation test might be applied to data that are highly skewed, in other words, data with a significant number of very extreme values.

- Compute the ranks for the set of X values, then for the Y values, low to high.
- Compute the differences of the ranks and the squares of the differences.
- The statistic then is: $r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$, where $d_i$ is the difference between the two ranks of observation i.

For simplicity, if some values are tied, interpolate the ranks (give each tied value the average of the ranks it spans) and use the formula above. Technically, in this case the previous correlation calculation should be applied to the rankings rather than the formula above.
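The ranking steps can be sketched as below, assuming the usual difference-based rank correlation formula 1 - 6Σd²/(n(n² - 1)) and average ranks for ties. The function names and the data are hypothetical. The sketch also applies the ordinary correlation to the ranks, which is the technically preferred route when ties are present.

```python
import math


def average_ranks(values):
    """Rank low to high; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_correlation_from_differences(xs, ys):
    """Difference-based formula: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))


def correlation(xs, ys):
    """Ordinary (Pearson) correlation, used here on the ranks."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical skewed data: one extreme X value, no ties.
x = [10, 20, 30, 40, 1000]
y = [1, 3, 2, 5, 4]

r_s = rank_correlation_from_differences(x, y)
r_on_ranks = correlation(average_ranks(x), average_ranks(y))
tied_ranks = average_ranks([5, 1, 5, 3])  # the two 5s share ranks 3 and 4
print(f"formula: {r_s:.3f}, correlation of ranks: {r_on_ranks:.3f}")
print(tied_ranks)
```

Without ties the two routes agree; with ties the difference-based formula is only an approximation, which is why the correlation of the ranks is the safer calculation.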

- The hypotheses are: H0: correlation = 0 versus H1: correlation ≠ 0
- Approximate the standard error using the formula: $s_{r_s} = \sqrt{(1 - r_s^2)/(n - 2)}$
- Calculate the t-statistic, with n-2 degrees of freedom. The formula is: $t = r_s / s_{r_s}$
