1
Correlation, covariance
Association Correlation, covariance
2
Which of the following shows a scatterplot of a positive correlation?
B C D E
3
Which of the following is an example in which the correlation coefficient does not represent the relationship?
A. A&B B. B C. C D. D&E E. E
4
What is the correlation of the following scatterplot?
B. .1 C. .9 D. -.9 E. -.1
5
What is the correlation of the following scatterplot?
B. .1 C. .9 D. -.9 E. -.1
6
What is the correlation of the following scatterplot?
B. .1 C. .9 D. -.9 E. -.1
7
What is the correlation of the following scatterplot?
B. .1 C. .9 D. -.9 E. -.1
8
Expected values, covariance, and correlation
Introduction to Bivariate Regression
Changes:
- Updated ANES example using data ANES.dta
- Generated new class dataset from the QOG data: class_qog.dta
- New figures; code introduced along the way.
9
Does talking about politics cause people to be more interested in politics? Or does an interest in politics cause people to talk about politics?
10
Is this a causal relationship?
[Diagram: "Talking about politics" and "Interest in Politics", drawn twice]
11
How often do you talk about politics?
Frequency table with columns Times/Week, Freq., Percent, Cum., plus a Total row (values not shown).
Data = ANES; Stata code: tab talkpol
12
How interested are you in politics?
Frequency table with columns Frequency, Percent, Cumulative (values not shown). Response categories:
1. Not interested at all
2. Slightly interested
3. Moderately interested
4. Very interested
5. Extremely interested
Total
Data = ANES; Stata code: tab intpol
13
Review standard deviation and variance
Variance: for each unit or observation, take its distance from the mean and square it; then add up these squared distances and divide by the number of units.
Standard deviation: the square root of the variance. Because the variance is in squared units, it is hard to interpret directly. The standard deviation can be understood in terms of the original measurement unit.
14
Calculating variance and standard deviations
Obs. Prestige Deviation^2
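The calculation described on the review slide can be sketched in a few lines. This is a minimal illustration in Python (the slides themselves use Stata and R); the prestige scores below are invented for the example and are not the class data, and the function names are hypothetical.

```python
import math

def variance(values):
    """Population variance: sum of squared deviations from the mean, divided by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def std_dev(values):
    """Standard deviation: square root of the variance, in the original units."""
    return math.sqrt(variance(values))

# Hypothetical prestige scores, invented for illustration only.
prestige = [2, 4, 4, 4, 5, 5, 7, 9]

print(variance(prestige))  # 4.0 (squared units)
print(std_dev(prestige))   # 2.0 (original units)
```

Dividing by n follows the definition given in the review slide; statistical software usually divides by n - 1 for sample data.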
15
Review: Units, mean, variance and standard deviation
Variable Obs. Mean Variance Std. Dev. Talking politics Interest politics
16
Expected value v. probability
If our population set of numbers is: 1,1,3,3,17, then the expected value is 5, even though P(5) = 0. Suppose we know that E(X) = 5 with the equation y = 5 + 7x. What is E(Y)?
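As a quick check of the arithmetic, the sketch below computes the expected value of the population 1, 1, 3, 3, 17 and then works through the question by applying y = 5 + 7x to each value. It is written in Python purely for illustration; the helper name expected_value is an assumption of this example.

```python
def expected_value(values):
    """Expected value of an equally weighted population: the arithmetic mean."""
    return sum(values) / len(values)

population = [1, 1, 3, 3, 17]

e_x = expected_value(population)
# Share of the population equal to 5: none of the values is 5.
p_five = population.count(5) / len(population)

# E(Y) for y = 5 + 7x, computed two ways.
e_y_shortcut = 5 + 7 * e_x                                    # E(a + bX) = a + b*E(X)
e_y_direct = expected_value([5 + 7 * x for x in population])  # transform, then average

print(e_x, p_five)               # 5.0 0.0
print(e_y_shortcut, e_y_direct)  # 40.0 40.0
```

Both routes agree because expectation is linear: transforming every value and averaging gives the same result as transforming the average.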
17
Expected values
Interest in politics: Missing 26, Obs. 1380, Mean 3.26, Std. Dev. and Var. (values not shown)
What is the expected value? What is the range? Mode? Why are there 26 missing?
Talk politics: Missing 7, Obs. 1399, Mean, Std. Dev. and Variance (values not shown)
What is the expected value? Why are the standard deviation and variance so high?
18
Causation
- Time ordering
- Covariation
19
Co-variation from variation?
Variance: Σ (xi - xmean)^2 / n
The average squared distance between the mean of x and each x value,
aka Σ (xi - xmean) * (xi - xmean) / n
(For sample data, the divisor is n - 1 rather than n.)
20
Covariation?
Σ (xi - xmean) * (yi - ymean) / (n - 1)
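A minimal sketch of the covariance formula, assuming the sample form with n - 1 in the denominator. It is written in Python for illustration only; the data are made up and the function names are hypothetical. Note that the covariance of x with itself is just the sample variance of x, which is the link between variation and co-variation.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Sample covariance: sum of products of paired deviations, divided by n - 1."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of observations")
    x_mean, y_mean = mean(x), mean(y)
    products = [(xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)]
    return sum(products) / (len(x) - 1)

# Made-up paired observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(covariance(x, y))  # 1.5
print(covariance(x, x))  # 2.5, the sample variance of x
```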
21
Covariation
Covariance can take any value from negative infinity to positive infinity.
22
Intuitive explanation
Σ (xi - xmean) * (yi - ymean) / (n - 1)
When x and y are high at the same time and x and y are low at the same time, then the covariance is positive.
In each pair, both values are above their means or both are below their means, so the products being added together are positive.
23
Plot showing positive covariance
24
Intuitive explanation
Σ (xi - xmean) * (yi - ymean) / (n - 1)
When x is low when y is high and vice versa, then the covariance is negative.
In each pair, one value is above its mean and the other is below its mean, so the products being added together are negative.
25
Plot showing negative covariance
R code:
library(foreign)  # provides read.dta
library(car)      # provides scatterplot
# Choose the file `class_qog.dta'
myFile <- file.choose()
dat <- read.dta(myFile)  # read.dta takes no header argument
# Make scatterplot (argument names as in the original slides; newer versions of car may name them differently)
scatterplot(wdi_mort~gle_gdp, reg.line=lm, smooth=TRUE, spread=TRUE, boxplots='xy', span=0.5, data=dat)
Stata code:
twoway (lfitci wdi_mort gle_gdp) (scatter wdi_mort gle_gdp)
26
Intuitive explanation
Σ (xi - xmean) * (yi - ymean) / (n - 1)
When about half of the time x and y are high at the same time or low at the same time,
and the other half of the time x is low when y is high and vice versa,
then the covariance is about 0.
High positive numbers are added to high negative numbers, so they cancel out.
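The three intuitive cases (positive, negative, and roughly zero covariance) can be checked numerically. The sketch below is in Python for illustration only and uses made-up data; the variable names are hypothetical and the n - 1 form of the formula is assumed.

```python
def covariance(x, y):
    """Sample covariance: sum of products of paired deviations, divided by n - 1."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    return sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)

x = [1, 2, 3, 4, 5]

y_together = [2, 4, 6, 8, 10]  # high when x is high, low when x is low
y_opposite = [10, 8, 6, 4, 2]  # high when x is low, low when x is high
y_mixed = [4, 1, 0, 1, 4]      # together for some pairs, opposite for others

print(covariance(x, y_together))  # 5.0
print(covariance(x, y_opposite))  # -5.0
print(covariance(x, y_mixed))     # 0.0
```

In the mixed case the positive products (+1 and +4) are exactly offset by the negative products (-4 and -1), which is the cancelling described on the slide.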
27
Plot showing no covariance
28
Covariance is a function of…
- Variance (standard deviation) of x
- Variance (standard deviation) of y
- Relationship between x and y
29
How can you compare a covariance of 132 and 134,847?
A covariance of 134,847 could reflect a high variance of x, a high variance of y, high variance of both variables, or a strong relationship between x and y. On its own, the number is not that helpful.
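To see why a raw covariance is hard to compare, the sketch below rescales the same made-up data (for example, recording x in different units) and recomputes the covariance. It is a Python illustration under the n - 1 form of the formula; the data and names are hypothetical.

```python
def covariance(x, y):
    """Sample covariance: sum of products of paired deviations, divided by n - 1."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    return sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)

# Made-up paired observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# The same observations with each variable recorded in units 100 times smaller.
x_rescaled = [xi * 100 for xi in x]
y_rescaled = [yi * 100 for yi in y]

print(covariance(x, y))                    # 1.5
print(covariance(x_rescaled, y))           # 150.0
print(covariance(x_rescaled, y_rescaled))  # 15000.0
```

The relationship between x and y is identical in all three lines; only the spread of the variables changed, yet the covariance grew by a factor of 10,000.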
30
Divide by the standard deviation of x * the standard deviation of y
How can you change the covariance to a number that tells you only the magnitude of the relationship between x and y?
Divide by the standard deviation of x * the standard deviation of y.
Correlation = [Σ (xi - xmean) * (yi - ymean) / (n - 1)] / (sd(x) * sd(y))
Pearson r ranges from -1 to +1.
As a rule of thumb:
- Weak correlation = .1
- Moderate correlation = .4
- Strong correlation = .7
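A minimal sketch of the correlation formula, dividing the covariance by the product of the two standard deviations. It is in Python for illustration only, with made-up data and a hypothetical function name. It assumes the standard deviations use the same n - 1 divisor as the covariance, which is what keeps r between -1 and +1.

```python
import math

def pearson_r(x, y):
    """Pearson r: sample covariance divided by sd(x) * sd(y), all using n - 1."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    cov = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)
    sd_x = math.sqrt(sum((xi - x_mean) ** 2 for xi in x) / (n - 1))
    sd_y = math.sqrt(sum((yi - y_mean) ** 2 for yi in y) / (n - 1))
    return cov / (sd_x * sd_y)

# Made-up paired observations, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
x_rescaled = [xi * 100 for xi in x]  # same data, different measurement unit

print(round(pearson_r(x, y), 3))           # 0.775
print(round(pearson_r(x_rescaled, y), 3))  # 0.775, unchanged by rescaling
```

Unlike the covariance, r does not change when a variable is rescaled, so two correlations can be compared directly.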