1
Scatterplots, Association and Correlation
“If he was all on the same scale as his foot, he must certainly have been a giant.” - Sherlock Holmes, The Adventure of Wisteria Lodge
2
Scatterplots, Association and Correlation
Using function-fitting tools (regression): y = 0.9x + 18
3
Scatterplots, Association and Correlation
Response variable (y-axis), explanatory variable (x-axis). Direction: positive (vs. negative) relationship.
4
Scatterplot and line of fit: Calculator example
Calculator example: tuition at a university
Data: 6546, 6996, 6996, 7350, 7500, 7978, 8377, 8710, 9110, 9411, 9800
Y = x r = .99
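The calculator fit can be reproduced with a short least-squares sketch. The tuition values are the ones on the slide. The x values are an assumption: the slide does not list them, so the sketch numbers the years 0 through 10, which agrees with the points (6, 8377) and (7, 8710) used in the by-hand example. The helper name `least_squares` is hypothetical.

```python
# Tuition values copied from the slide.
tuition = [6546, 6996, 6996, 7350, 7500, 7978, 8377, 8710, 9110, 9411, 9800]

# ASSUMPTION: x counts years starting at 0; the slide does not list x values.
years = list(range(len(tuition)))


def least_squares(xs, ys):
    """Return (slope, intercept, r) for the least-squares line through the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


slope, intercept, r = least_squares(years, tuition)
print(f"Y = {slope:.1f}x + {intercept:.1f}, r = {r:.2f}")
```

Under that assumption the slope comes out near 326 per year and r rounds to .99, consistent with the r = .99 reported on the slide.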
5
Scatterplot and line of fit: By hand
“Eyeball” a line of fit. Pick two representative points (points on your line). Use y – y0 = m(x – x0).
Using (6, 8377) and (7, 8710):
m = (8710 – 8377)/(7 – 6) = 333
Y – 8710 = 333(x – 7)
Y = 333x – 2331 + 8710
Y = 333x + 6379
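The point-slope calculation can be checked with a few lines of Python. This is a minimal sketch; the function name `line_through` is hypothetical, and the two points are the ones chosen on the slide.

```python
def line_through(p0, p1):
    """Return (slope, intercept) of the line through two (x, y) points."""
    (x0, y0), (x1, y1) = p0, p1
    if x0 == x1:
        raise ValueError("vertical line: slope is undefined")
    m = (y1 - y0) / (x1 - x0)
    # Point-slope form y - y0 = m(x - x0), rearranged to y = m*x + b.
    b = y0 - m * x0
    return m, b


m, b = line_through((6, 8377), (7, 8710))
print(f"Y = {m:g}x + {b:g}")  # prints "Y = 333x + 6379"
```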
6
Scatterplot and line of fit: Comparison of methods
22 178 25 173 20 150 24 169 166 29 170 163 23 179 28 167 27 157 161
Looks like our “eyeball” line was off by a good bit compared with the software's line: Y = 3x vs. y = 1.3x + 135
7
Correlation
Measures the linear association between two variables.
Correlation coefficient: sum the products of the x and y z-scores for each ordered pair, then divide by n – 1.
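Written as a formula, the verbal recipe above looks like the following. The symbols are notation assumed here, following the names in the slide's worked table (xbar, ybar, sx, sy), where s_x and s_y are the sample standard deviations (n – 1 in the denominator).

```latex
r = \frac{1}{n-1}\sum_{i=1}^{n} z_{x_i}\, z_{y_i},
\qquad
z_{x_i} = \frac{x_i - \bar{x}}{s_x},
\quad
z_{y_i} = \frac{y_i - \bar{y}}{s_y}
```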
8
Correlation
Correlation Coefficient

x     y     x-xbar   y-ybar   (x-xbar)(y-ybar)/(sx*sy) = zx*zy
6     5     -8       -2       0.7613
10    3     -4       -4       0.7613
14    7     0        0        0.0000
19    8     5        1        0.2379
21    12    7        5        1.6652
sum                           3.4256
sum/(n-1)                     0.8564
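The same calculation can be sketched in Python with the five (x, y) pairs from the slide. The function name `correlation` is hypothetical; `statistics.stdev` is the sample standard deviation, which matches the n – 1 divisor.

```python
from statistics import mean, stdev

# The five ordered pairs from the slide's worked example.
x = [6, 10, 14, 19, 21]
y = [5, 3, 7, 8, 12]


def correlation(xs, ys):
    """Sum the products of paired z-scores and divide by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    z_products = [
        ((a - x_bar) / s_x) * ((b - y_bar) / s_y) for a, b in zip(xs, ys)
    ]
    return sum(z_products) / (n - 1)


r = correlation(x, y)
print(round(r, 4))
```

At full precision this gives about 0.855. The slide's 0.8564 is slightly higher, most likely because sx and sy were rounded to two decimals before dividing.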
9
Correlation
Measures the linear association between two variables. A correlation whose absolute value is greater than 0.8 is generally described as strong, whereas one whose absolute value is less than 0.5 is generally described as weak.
10
Correlation
Measures the linear association between two variables.
Conditions for using the correlation coefficient: quantitative variables, a linear relationship, and no outliers distorting the correlation.
11
Correlation
Sometimes we can make non-linear data linear
These are the f-stop data for a camera. Here we square the data to make the relationship linear. Another transformation that statisticians sometimes use involves logarithms.
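The effect of squaring can be illustrated with a short sketch. The slide's own data did not survive in this transcript, so the values below are an assumption: standard full-stop f-numbers paired with standard shutter speeds, chosen purely for illustration. The function name `correlation` is hypothetical.

```python
from statistics import mean, stdev

# ASSUMPTION: illustrative standard camera settings, not the slide's data.
shutter_speed = [1/1000, 1/500, 1/250, 1/125, 1/60, 1/30, 1/15, 1/8]  # seconds
f_stop = [2.8, 4, 5.6, 8, 11, 16, 22, 32]


def correlation(xs, ys):
    """Sum the products of paired z-scores and divide by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    return sum(
        ((a - x_bar) / s_x) * ((b - y_bar) / s_y) for a, b in zip(xs, ys)
    ) / (n - 1)


# Correlation before and after squaring the f-stop values.
r_before = correlation(shutter_speed, f_stop)
r_after = correlation(shutter_speed, [f ** 2 for f in f_stop])
print(f"raw f-stop: r = {r_before:.3f}")
print(f"f-stop squared: r = {r_after:.3f}")
```

With these values the correlation is already high before the transformation, which is why the scatterplot matters: the raw plot bends, while the squared values fall much closer to a straight line and r moves nearer to 1.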
12
Correlation vs. Causation
Discuss possible lurking variables:
Possibly poor decisions based on incorrect causation assumptions:
The number of AP courses a student takes and SAT performance strongly correlate, so we should enroll all students in more AP courses.
Studies show that the number of police in an area positively correlates with the amount of gang activity, so we should reduce the number of policemen to reduce gang activity.
Homework and student performance have a positive correlation, so all teachers should assign more homework every night in every course.
The amount of damage caused by a fire has a positive correlation with the number of firemen at the scene. To reduce the amount of damage due to fires, we should send fewer firemen.