Published by Pierce Park. Modified over 9 years ago.
1
Introduction to Correlation
2
Correlation – when a relationship exists between two sets of data
The news is filled with examples of correlation
◦ If you eat so many helpings of tomatoes…
◦ One alcoholic beverage a day…
◦ Driving faster than the speed limit…
◦ Women who smoke during pregnancy…
◦ If you eat only fast food for 30 days…
◦ If your parents did not have offspring, then you won’t either (huh?)
3
Make an XY scatterplot of the data, putting one variable on the x-axis and one variable on the y-axis.
Insert a linear trendline on the graph and include the R² value.
Interpret the results.
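The trendline and R² that the spreadsheet reports can also be computed by hand. The sketch below is only an illustration: the height and weight numbers are invented for this example (they are not taken from the slides' .xls files), and the function name `linear_trendline` is hypothetical.

```python
# Hypothetical data, invented for illustration only.
heights_in = [60, 62, 64, 66, 68, 70, 72]
weights_lb = [115, 120, 135, 140, 155, 165, 180]


def linear_trendline(xs, ys):
    """Return (slope, intercept, r_squared) of the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (unexplained variation) / (total variation)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_trendline(heights_in, weights_lb)
print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r_squared:.3f}")
```

For a straight-line fit like this one, the R² printed here is the same quantity a spreadsheet shows next to a linear trendline.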
4
The higher the R² value, the stronger the linear relationship
If you only have a few data points, then you need a higher R² value in order to conclude there is a correlation
Crude estimate:
◦ R² > 0.5: most people say there is a correlation
◦ R² < 0.3: the correlation is essentially non-existent
◦ R² between 0.3 and 0.5: gray area!
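The crude estimate above can be written as a small lookup. This is a sketch only: the function name `interpret_r_squared` is hypothetical, the cutoffs are the slide's rough rule of thumb, and the code does not account for the number of data points.

```python
def interpret_r_squared(r_squared):
    """Apply the slide's crude rule of thumb to an R^2 value."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R^2 must be between 0 and 1")
    if r_squared > 0.5:
        return "correlation"
    if r_squared < 0.3:
        return "essentially no correlation"
    # Anything from 0.3 to 0.5 inclusive is the gray area.
    return "gray area"


for value in (0.74, 0.40, 0.12):
    print(value, "->", interpret_r_squared(value))
```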
5
Look at:
◦ CigarettesBirthweight.xls
◦ SpeedLimits.xls
◦ HeightWeight.xls
◦ Grades.xls
◦ WineConsumption.xls
◦ BreastCancerTemperature.xls
6
In SPSS, click on Analyze -> Correlate -> Bivariate
Select the two columns of data you want to analyze (move them from the left box to the right box)
You can actually pick more than two columns, but we’ll keep it simple for now
7
Make sure the checkbox for Pearson Correlation Coefficients is checked
Click OK to run the correlation
You should get an output window something like the following slide
8
The correlation between height and weight is 0.861
The Pearson Correlation value is not the same as Excel’s R-squared value; it can be positive or negative
9
Positive correlation: as the values of one variable increase, the values of a second variable increase (values from 0 to 1.0)
Negative correlation: as the values of one variable increase, the values of a second variable decrease (values from 0 to -1.0)
Note: Ignoring the sign, the SPSS R value will be at least as large as Excel’s R² value, because R² is R multiplied by itself and R never exceeds 1 in size. R = .5 is equivalent to R² = .25
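The relationship between the signed Pearson R and R² can be checked directly. The sketch below uses made-up TV-hours and grade numbers purely for illustration, and `pearson_r` is a hypothetical helper written out by hand rather than a call into SPSS or Excel.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, invented for illustration only.
tv_hours = [1, 2, 3, 4, 5]
grades = [90, 85, 80, 70, 65]

r = pearson_r(tv_hours, grades)
r_squared = r ** 2

print(f"R = {r:.3f} (sign shows the direction), R^2 = {r_squared:.3f} (sign is lost)")
print(0.5 ** 2)  # the slide's own example: R = .5 gives R^2 = 0.25
```

Squaring removes the sign, which is why R² alone cannot tell a positive correlation from a negative one.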
10
There is a negative correlation between TV viewing and class grades—students who spend more time watching TV tend to have lower grades (or, students with higher grades tend to spend less time watching TV).
11
[Figure panels: Positive correlation; Negative correlation]
12
When looking for correlation, a positive correlation is not necessarily stronger than a negative one; strength is judged by how far the value is from 0, regardless of sign
Which correlation is the greatest?
-.34, .72, -.81, .40, -.12
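Comparing strength by distance from zero can be sketched in a couple of lines, using the five values from the slide. The variable names are illustrative.

```python
correlations = [-0.34, 0.72, -0.81, 0.40, -0.12]

# Strength is the absolute value; the sign only gives the direction.
strongest = max(correlations, key=abs)
ranked = sorted(correlations, key=abs, reverse=True)

print(strongest)  # -0.81
print(ranked)     # [-0.81, 0.72, 0.4, -0.34, -0.12]
```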
13
If two variables are correlated, then we can predict one based on the other
But correlation does NOT imply cause!
It might be the case that having more education causes a person to earn a higher income.
It might be the case that having higher income allows a person to go to school more.
There could also be a third variable. Or a fourth. Or a fifth…
14
Causality – one variable, say A, actually causes the change in B. In the absence of any other evidence, data from observational studies simply cannot be used to establish causation.
15
Common underlying cause or causes (the most important possibility) – A is correlated with B, but there is a third factor C (the common underlying cause) that causes the changes in both A and B.
Example: as ice cream sales go up, so do crime rates. A plausible common cause is hot weather.
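A common underlying cause can be illustrated with a toy data set. Everything below is an assumption made for illustration: the temperature, ice cream and crime numbers are invented and deliberately constructed so that temperature is the only link between the other two, and the residual check (comparing what is left after removing the straight-line effect of temperature) is one common technique, not something the slides prescribe.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def residuals(xs, ys):
    """What remains of ys after subtracting its least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(xs, ys)]


# Hypothetical data, constructed for illustration only.
temperature = [50, 60, 70, 80, 90, 100]        # C: the common cause
ice_cream_sales = [105, 115, 140, 160, 175, 205]  # A
crime_reports = [27, 32, 31, 36, 47, 52]          # B

r_ab = pearson_r(ice_cream_sales, crime_reports)

# Remove the linear effect of temperature from both A and B, then correlate the leftovers.
r_ab_given_c = pearson_r(
    residuals(temperature, ice_cream_sales),
    residuals(temperature, crime_reports),
)

print(f"A vs B: R = {r_ab:.3f}")
print(f"A vs B after removing C: R = {r_ab_given_c:.3f}")
```

In this constructed example the strong A-versus-B correlation disappears once temperature is accounted for; real data will rarely be this clean.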
16
What Can We Conclude?
Sheer coincidence – the two variables have nothing in common, but they create a strong R or R² value
Both variables are changing over time – divorce rates are going up and so are drug offenses. Is an increase in divorce causing more people to use drugs (and get caught)?