Correlation, covariance


1 Correlation, covariance
Association Correlation, covariance

2 Which of the following shows a scatterplot of a positive correlation?
B C D E

3 Which of the following is an example in which the correlation coefficient does not represent the relationship? A. A&B B. B C. C D. D&E E. E

4 What is the correlation of the following scatterplot?
B. .1 C. .9 D. -.9 E. -.1

5 What is the correlation of the following scatterplot?
B. .1 C. .9 D. -.9 E. -.1

6 What is the correlation of the following scatterplot?
B. .1 C. .9 D. -.9 E. -.1

7 What is the correlation of the following scatterplot?
B. .1 C. .9 D. -.9 E. -.1

8 Expected values, covariance, correlation and expected values
Introduction to Bivariate Regression
Changes:
- Updated ANES example using data ANES.dta
- Generated new class dataset from the QOG data: class_qog.dta
- New figures, with code introduced along the way

9 Does talking about politics cause people to be more interested in politics? Or does an interest in politics cause people to talk about politics?

10 Is this a causal relationship?
Diagram labels: Talking about politics; Interest in Politics

11 How often do you talk about politics?
Table columns: Times/Week, Freq., Percent, Cum., with a Total row.
Data: ANES. Stata code: tab talkpol

12 How interested are you in politics?
Table columns: Frequency, Percent, Cumulative.
Response categories: 1. Not interested at all; 2. Slightly interested; 3. Moderately interested; 4. Very interested; 5. Extremely interested; Total.
Data: ANES. Stata code: tab intpol

13 Review standard deviation and variance
Variance: for each unit or observation, take its distance from the mean and square it; then add up these squared distances and divide by the number of units.
Standard deviation: the square root of the variance. Because variance is in squared units, it is hard to interpret directly. The standard deviation can be understood in terms of the original measurement unit.

14 Calculating variance and standard deviations
Obs. Prestige Deviation^2
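
The calculation described in the review can be sketched in a few lines of Python. This is a minimal illustration, not the slide's own worked table: the prestige scores below are hypothetical values invented for the example, and the function names `variance` and `std_dev` are assumptions. The variance divides by the number of units, as in the review slide.

```python
import math


def variance(values):
    """Variance as on the review slide: mean of squared deviations (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(v - mean) ** 2 for v in values]
    return sum(squared_deviations) / n


def std_dev(values):
    """Standard deviation: square root of the variance, in the original units."""
    return math.sqrt(variance(values))


# Hypothetical prestige scores, invented for illustration only.
prestige = [20, 30, 40, 50, 60]

prestige_variance = variance(prestige)  # in squared prestige units
prestige_sd = std_dev(prestige)         # back in prestige units
print(prestige_variance, prestige_sd)
```

The variance is reported in squared units, which is why the standard deviation is usually the easier number to read.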

15 Review: Units, mean, variance and standard deviation
Table columns: Variable, Obs., Mean, Variance, Std. Dev.
Rows: Talking politics; Interest politics

16 Expected value v. probability
If our population is the set of numbers 1, 1, 3, 3, 17, then the expected value is 5, even though P(X = 5) = 0.
Suppose we know that E(X) = 5 and that y = 5 + 7x. What is E(Y)?
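
Both points on this slide can be checked with a short sketch. It assumes each number in the population is equally likely to be drawn; the helper names `expected_value` and `probability` are made up for the example. The second part relies on the linearity of expectation, E(a + bX) = a + b E(X).

```python
from collections import Counter
from fractions import Fraction

population = [1, 1, 3, 3, 17]


def expected_value(values):
    """Probability-weighted average, assuming each listed value is equally likely."""
    n = len(values)
    counts = Counter(values)
    return sum(Fraction(count, n) * value for value, count in counts.items())


def probability(values, target):
    """Share of the population equal to target."""
    return Fraction(values.count(target), len(values))


e_x = expected_value(population)   # 5
p_five = probability(population, 5)  # 0: the value 5 never occurs

# Apply y = 5 + 7x to every member of the population and average again.
y_values = [5 + 7 * x for x in population]
e_y = expected_value(y_values)

# Linearity of expectation gives the same answer without transforming the data.
assert e_y == 5 + 7 * e_x
print(e_x, p_five, e_y)
```

The expected value is a long-run average, so it does not have to be a value that can actually occur.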

17 Expected values
Interest in politics: Missing 26, Obs 1380, Mean 3.26, Std. Dev, Var
What is the expected value? What is the range? Mode? Why are there 26 missing?
Talk politics: Missing 7, Obs 1399, Mean, Std. Dev, Variance
What is the expected value? Why are the standard deviation and variance so high?

18 Causation
Time ordering
Covariation

19 Co-variation from variation?
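$\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$: the average squared distance between the mean of x and each x value
aka $\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})(x_i - \bar{x})$
(This divides by n, as in the variance review; the covariance formula on the next slide divides by n - 1.)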

20 Covariation?
$\mathrm{cov}(x, y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
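
As a rough sketch of the formula on this slide, the function below sums the products of paired deviations and divides by n - 1. The name `sample_covariance` and the small data set are assumptions made for illustration. Passing the same variable twice shows how covariance grows out of variance: one of the two x deviations is simply swapped for a y deviation.

```python
def mean(values):
    return sum(values) / len(values)


def sample_covariance(x, y):
    """Sum of (xi - xmean) * (yi - ymean), divided by n - 1."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of observations")
    if len(x) < 2:
        raise ValueError("at least two observations are needed")
    x_mean, y_mean = mean(x), mean(y)
    products = [(xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)]
    return sum(products) / (len(x) - 1)


# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

cov_xy = sample_covariance(x, y)
var_x = sample_covariance(x, x)  # covariance of x with itself is its (n - 1) variance
print(cov_xy, var_x)
```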

21 Covariation
Covariance can take any value from negative infinity to positive infinity.

22 Intuitive explanation
$\mathrm{cov}(x, y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
When x and y are high at the same time, and low at the same time, the covariance is positive.
In each such pair, x and y are either both above their means or both below them, so the products being added together are positive.

23 Plot showing positive covariance

24 Intuitive explanation
$\mathrm{cov}(x, y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
When x is low while y is high, and vice versa, the covariance is negative.
In each such pair, one value is above its mean and the other is below its mean, so the products being added together are negative.

25 Plot showing negative covariance
R code:
library(foreign)  # provides read.dta() for Stata files
library(car)      # provides scatterplot(); the argument names below follow older car releases
# Choose the file `class_qog.dta'
myFile <- file.choose()
dat <- read.dta(myFile)
# Make scatterplot
scatterplot(wdi_mort ~ gle_gdp, reg.line=lm, smooth=TRUE, spread=TRUE, boxplots='xy', span=0.5, data=dat)

Stata code:
twoway (lfitci wdi_mort gle_gdp) (scatter wdi_mort gle_gdp)

26 Intuitive explanation
$\mathrm{cov}(x, y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
When about half of the time x and y are high together or low together, and about half of the time x is low when y is high and vice versa, the covariance is about 0.
Large positive products are added to large negative products, so they roughly cancel.
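
The three intuitive cases (positive, negative, and roughly zero covariance) can be seen by printing the individual products before they are summed. This is a sketch on tiny hypothetical data sets chosen so the arithmetic is easy to follow; `deviation_products` and `sample_covariance` are illustrative names.

```python
def deviation_products(x, y):
    """One product (xi - xmean) * (yi - ymean) per observation."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    return [(xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)]


def sample_covariance(x, y):
    return sum(deviation_products(x, y)) / (len(x) - 1)


# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4]
y_together = [1, 2, 3, 4]  # high when x is high, low when x is low
y_opposite = [4, 3, 2, 1]  # high when x is low, and vice versa
y_mixed = [3, 1, 1, 3]     # half the pairs move together, half move apart

for label, y in [("together", y_together),
                 ("opposite", y_opposite),
                 ("mixed", y_mixed)]:
    print(label, deviation_products(x, y), sample_covariance(x, y))
```

In the mixed case the positive and negative products are the same size, so they cancel exactly; with real data the cancellation is only approximate.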

27 Plot showing no covariance

28 Covariance is a function of…
Variance (standard deviation) of x
Variance (standard deviation) of y
Relationship between x and y

29 How can you compare a covariance of 132 and 134,847?
A covariance of 134,847 could reflect a high variance of x, a high variance of y, high variance of both variables, or a strong relationship between x and y. On its own, the number is not that helpful.
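
One way to see why raw covariances are hard to compare: changing the measurement unit of x changes the covariance even though the relationship between x and y is untouched. The sketch below uses hypothetical distances recorded first in kilometres and then in metres; the variable names and values are invented for the example.

```python
def sample_covariance(x, y):
    """Sum of (xi - xmean) * (yi - ymean), divided by n - 1."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    products = [(xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)]
    return sum(products) / (len(x) - 1)


# Hypothetical data, invented for illustration only.
distance_km = [1, 2, 3, 4, 5]
distance_m = [1000 * d for d in distance_km]  # same distances, different unit
y = [2, 4, 6, 8, 10]

cov_km = sample_covariance(distance_km, y)
cov_m = sample_covariance(distance_m, y)
print(cov_km, cov_m)  # the second is 1000 times the first
```

The two covariances describe exactly the same relationship, which is the motivation for rescaling by the standard deviations.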

30 Divide by the standard deviation of x * the standard deviation of y
How can you change the covariance to a number that tells you only the magnitude of the relationship between x and y?
Divide by the standard deviation of x * the standard deviation of y.
Correlation: $r = \frac{1}{n-1}\sum_{i=1}^{n}\frac{(x_i - \bar{x})(y_i - \bar{y})}{s_x s_y}$, where $s_x$ and $s_y$ are the standard deviations of x and y.
Pearson r ranges from -1 to +1.
Rough guide: weak correlation = .1, moderate correlation = .4, strong correlation = .7
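
A minimal sketch of the correlation formula follows, dividing the covariance by the product of the two standard deviations (all computed with n - 1). The function names and data are hypothetical. The last lines rescale x by 1000 to show that, unlike the covariance, r does not change with the measurement unit.

```python
import math


def sample_covariance(x, y):
    """Sum of (xi - xmean) * (yi - ymean), divided by n - 1."""
    x_mean = sum(x) / len(x)
    y_mean = sum(y) / len(y)
    products = [(xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)]
    return sum(products) / (len(x) - 1)


def sample_sd(values):
    """Standard deviation with n - 1, matching the covariance above."""
    return math.sqrt(sample_covariance(values, values))


def pearson_r(x, y):
    """Covariance divided by sd(x) * sd(y)."""
    return sample_covariance(x, y) / (sample_sd(x) * sample_sd(y))


# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)
r_rescaled = pearson_r([1000 * xi for xi in x], y)  # change the unit of x
print(sample_covariance(x, y), r, r_rescaled)
```

Because the standard deviations absorb the units, r stays between -1 and +1 and can be compared across pairs of variables.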

