Published by Victor Haynes. Modified over 9 years ago.
1
Correlation and Covariance
2
Overview
[Table: chart type by variable type]
Predictor Variable (X-Axis): Continuous, Categorical
Outcome, Dependent Variable (Y-Axis): Height
Charts: Histogram, Scatter, Boxplot
3
Correlation
Covariance is High: r ~ 1
Covariance is Low: r ~ 0
4
Things to Know about the Correlation
It varies between -1 and +1; 0 = no relationship.
It is an effect size: ±.1 = small effect, ±.3 = medium effect, ±.5 = large effect.
Coefficient of determination, r²: by squaring the value of r you get the proportion of variance in one variable shared by the other.
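As a small illustration of these points, the R sketch below computes r and r² for two short vectors. The numbers are invented for this example and are not from the slides.

```r
# Hypothetical data, invented for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 1, 4, 3, 6, 5)

r <- cor(x, y)       # Pearson correlation; always lies between -1 and +1
r_squared <- r^2     # coefficient of determination: proportion of shared variance

print(r)
print(r_squared)
```

With the effect-size guide above, an |r| of .5 or more would count as a large effect.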
5
Variables
Dependent Variable (Y): Height
Independent Variables (X's): X1, X2, X3, X4
7
Little Correlation
8
Correlation is For Linear Relationships
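A quick way to see this in R is a sketch with a perfect but non-linear relationship. The vectors are made up for this example.

```r
# y is completely determined by x, but the relationship is a parabola, not a line
x <- -5:5
y <- x^2

# The rising and falling halves cancel, so the linear correlation is zero
r <- cor(x, y)
print(r)
```

A value of r near 0 therefore means "no linear relationship", not "no relationship".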
9
Outliers Can Skew Correlation Values
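The effect can be sketched in R by adding one extreme point to a small, loosely related sample. All values here are hypothetical.

```r
# Five hypothetical points with a moderate relationship
x <- c(1, 2, 3, 4, 5)
y <- c(3, 1, 4, 2, 5)
r_without <- cor(x, y)

# Add a single extreme point far from the rest
x_out <- c(x, 20)
y_out <- c(y, 20)
r_with <- cor(x_out, y_out)

print(r_without)
print(r_with)   # much closer to 1, driven almost entirely by the one outlier
```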
10
Correlation and Regression Are Related
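One way to see the link, sketched in R with invented data: in a simple linear regression the slope equals r times sd(y)/sd(x), and the model's R-squared equals r².

```r
# Hypothetical data, invented for illustration only
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2, 1, 4, 3, 6, 5)

fit <- lm(y ~ x)                           # simple linear regression of y on x
slope_from_lm <- unname(coef(fit)["x"])    # fitted slope
slope_from_r <- cor(x, y) * sd(y) / sd(x)  # slope rebuilt from the correlation

r2_from_lm <- summary(fit)$r.squared       # R-squared reported by the model
r2_from_cor <- cor(x, y)^2                 # squared correlation

print(c(slope_from_lm, slope_from_r))
print(c(r2_from_lm, r2_from_cor))
```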
11
Covariance
[Plot: Y against X]
Persons 2, 3, and 5 appear to deviate from their means by similar magnitudes.
12
Covariance
Calculate the error [deviation] between the mean and each subject's score for the first variable (x).
Calculate the error [deviation] between the mean and their score for the second variable (y).
Multiply these error values.
Add these values and you get the cross-product deviations.
The covariance is the average of the cross-product deviations: cov(x, y) = Σ (x_i − x̄)(y_i − ȳ) / (N − 1)
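The steps can be followed one at a time in R. This is a minimal sketch with made-up vectors, assuming the sample covariance (division by N - 1), which is also what R's built-in cov() uses.

```r
# Hypothetical scores for four subjects
x <- c(2, 4, 6, 8)
y <- c(1, 3, 2, 6)

dev_x <- x - mean(x)              # step 1: deviations from the mean of x
dev_y <- y - mean(y)              # step 2: deviations from the mean of y
cross_products <- dev_x * dev_y   # step 3: multiply the deviations
sum_cp <- sum(cross_products)     # step 4: sum of cross-product deviations
cov_manual <- sum_cp / (length(x) - 1)  # step 5: average, using N - 1

print(cov_manual)
print(cov(x, y))                  # built-in result for comparison
```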
13
Covariance
Age  Income  Education
7    4       3
4    1       8
6    3       5
8    6       1
8    5       7
7    2       9
5    3       3
9    5       8
7    4       5
8    2       2
9    5       2
8    4       2
9    2       3
8    4       7
3    1       4
3    1       3
8    2       6
1    2       5
3    1       7
6    3       3
Do they VARY the same way relative to their own means?
Covariance of Age and Income: 2.47
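The slide's figure can be checked in R. The sketch assumes each three-digit group on the slide is one row of single-digit Age, Income, and Education values. Under that reading, the sample covariance of Age and Income comes out at 2.47.

```r
# Values transcribed from the slide, assuming one digit per column per row
age       <- c(7, 4, 6, 8, 8, 7, 5, 9, 7, 8, 9, 8, 9, 8, 3, 3, 8, 1, 3, 6)
income    <- c(4, 1, 3, 6, 5, 2, 3, 5, 4, 2, 5, 4, 2, 4, 1, 1, 2, 2, 1, 3)
education <- c(3, 8, 5, 1, 7, 9, 3, 8, 5, 2, 2, 2, 3, 7, 4, 3, 6, 5, 7, 3)

# Sample covariance of Age and Income (divides by N - 1 = 19)
cov_age_income <- cov(age, income)
print(round(cov_age_income, 2))   # 2.47

# Full covariance matrix for all three variables
cov_matrix <- cov(data.frame(age, income, education))
print(round(cov_matrix, 2))
```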
14
Limitations of Covariance
It depends upon the units of measurement. E.g. the covariance of two variables measured in miles might be 4.25, but if the same scores are converted to kilometres, the covariance is 11.
One solution: standardize it (normalize the data). Divide by the standard deviations of both variables.
The standardized version of covariance is known as the correlation coefficient. It is relatively unaffected by units of measurement.
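The unit problem and its fix can be sketched in R. The two vectors are an assumption, chosen only because their covariance is 4.25, matching the slide's figure. The slide does not say which data it used.

```r
# Hypothetical distances in miles, chosen so that the covariance is 4.25
miles_x <- c(5, 4, 4, 6, 8)
miles_y <- c(8, 9, 10, 13, 15)
km_per_mile <- 1.609

cov_miles <- cov(miles_x, miles_y)                           # 4.25
cov_km <- cov(miles_x * km_per_mile, miles_y * km_per_mile)  # 4.25 * 1.609^2, about 11

# Standardize: divide the covariance by both standard deviations
r_manual <- cov_miles / (sd(miles_x) * sd(miles_y))

# The correlation is the same in either unit
r_miles <- cor(miles_x, miles_y)
r_km <- cor(miles_x * km_per_mile, miles_y * km_per_mile)

print(c(cov_miles, cov_km))
print(c(r_manual, r_miles, r_km))
```

Rescaling both variables multiplies the covariance by the square of the conversion factor, while the correlation stays put.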
15
The Correlation Coefficient
16
Correlation
Covariance is High: r ~ 1
Covariance is Low: r ~ 0
17
Correlation
18
Need inter-item/variable correlations >.30
19
Data Structures Framework
Numeric Vector: a <- c(1,2,5.3,6,-2,4)
Character Vector: b <- c("one","two","three")
Matrix: y <- matrix(1:20, nrow=5, ncol=4)
Dataframe:
d <- c(1,2,3,4)
e <- c("red", "white", "red", NA)
f <- c(TRUE,TRUE,TRUE,FALSE)
mydata <- data.frame(d,e,f)
names(mydata) <- c("ID","Color","Passed")
List: w <- list(name="Fred", age=5.3)
Source: Hadley Wickham
20
Correlation Matrix
21
Correlation and Covariance
22
Revisiting the Height Dataset
23
Galton: Height Dataset
cor(heights)
Error in cor(heights) : 'x' must be numeric
The cor() function does not handle Factors. Excel correl() does not either.
Initial workaround: create a data.frame without the Factors:
h2 <- data.frame(h$father,h$mother,h$avgp,h$childNum,h$kids)
Later we will RECODE the variable into 0, 1.
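An alternative to listing the numeric columns by hand is to select them programmatically. The sketch below uses a small stand-in data frame. Its name, columns, and values are invented for illustration and are not the Galton data.

```r
# Hypothetical stand-in for the height data; names and values are invented
heights_demo <- data.frame(
  father = c(70, 68, 72, 66, 69),
  mother = c(64, 65, 63, 62, 66),
  gender = factor(c("M", "F", "M", "F", "M")),
  height = c(71, 64, 73, 62, 70)
)

# cor(heights_demo) would stop with an error because 'gender' is a factor.
# Keep only the numeric columns before computing the correlation matrix.
numeric_cols <- sapply(heights_demo, is.numeric)
cor_matrix <- cor(heights_demo[, numeric_cols])

print(round(cor_matrix, 2))
```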
24
Histogram of Correlation Coefficients +1
25
Correlations Matrix: Both Types
library(car)
scatterplotMatrix(heights)
Zoom in on Gender
26
Correlation Matrix for Continuous Variables
chart.Correlation(num2)
PerformanceAnalytics package
27
Categorical: Revisit Box Plot
Factors (categorical variables) work with boxplots; however, some functions are not set up to handle Factors.
Note there is an equation here: Y = mx + b
Correlation will depend on the spread of the distributions.
28
Manual Calculation: Note Stdev is Lower
With values of 0 and 1, the deltas from the mean are small, so the standard deviation is lower, whereas the continuous variable has a lot of variation (spread).
29
Categorical: Recode!
Gender recoded as 0 = Female, 1 = Male.
@correl does not work with Factor variables.
Formula now works!
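The recoding step might look like this in R. The factor levels "F" and "M" and all values are assumptions made for this sketch, since the slide does not show how the dataset labels gender.

```r
# Hypothetical data; factor levels and heights are invented
gender <- factor(c("M", "F", "M", "F", "M", "F"))
height <- c(71, 64, 73, 62, 70, 65)

# Recode the factor as numeric: 0 = Female, 1 = Male
gender01 <- ifelse(gender == "M", 1, 0)

# cor() now accepts the variable because it is numeric
r <- cor(gender01, height)
print(r)
```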
30
Correlation: Continuous & Discrete
More examples of cor.test()
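A minimal cor.test() sketch for a continuous variable paired with a 0/1 variable follows. The data are invented. cor.test() reports the estimate along with a p-value and a confidence interval, which cor() alone does not.

```r
# Hypothetical data: a 0/1 coded group and a continuous measurement
group01 <- c(0, 0, 0, 0, 1, 1, 1, 1)
score <- c(61, 64, 63, 66, 68, 70, 67, 72)

# Pearson correlation test (the default method)
result <- cor.test(group01, score)

print(result$estimate)   # the correlation coefficient r
print(result$p.value)    # p-value for the null hypothesis r = 0
print(result$conf.int)   # 95% confidence interval for r
```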
31
Correlation Regression
32
Summary
[Table: chart, model, and summary-statistic choices by variable type]
Predictor Variable (X-Axis): Continuous, Categorical (examples: Parents Height, Gender)
Outcome, Dependent Variable (Y-Axis): Continuous, Categorical
Charts: Histogram, Scatter, Boxplot, Bar, Pie, Mosaic, Cross Table
Regression Model: Linear Regression, Logistic Regression
Summary statistics: Mean, Median, Standard Deviation; Proportions; Frequency (0, 1)