Published by Suzanna Brooks. Modified over 9 years ago.
1
Introduction to Statistics
Correlation
Chapter 15
April 23-28, 2009
Classes #27-28
2
Correlation
A statistical technique used to measure and describe the relationship between two variables
–For example:
GPA and TDs scored
Statistics exam scores and amount of time spent studying
3
Notation A correlation requires two scores for each individual –One score from each of the two variables –They are normally identified as X and Y
4
Three characteristics of X and Y are being measured…
The direction of the relationship
–Positive or negative
The form of the relationship
–Usually linear form
The strength or consistency of the relationship
–Perfect correlation = 1.00 (positive) or -1.00 (negative); no consistency would be 0.00
–Therefore, a correlation measures the degree of relationship between two variables on a scale from 0.00 to 1.00 in magnitude, and its sign gives the direction, so r itself ranges from -1.00 to +1.00.
5
Assumptions
There are 3 main assumptions…
–1. The dependent and independent variables are normally distributed. We can check this by looking at the histograms for the two variables
–2. The relationship between X and Y is linear. We can check this by looking at the scattergram
–3. The relationship is homoscedastic. We can check homoscedasticity by looking at the scattergram and observing that the data points form a “roughly symmetrical, cigar-shaped pattern” about the regression line.
If all 3 assumptions have been met, then we can use correlation and test r for significance
6
Pearson r
The most commonly used correlation
Measures the degree of straight-line relationship
Computation: $r = \dfrac{SP}{\sqrt{(SS_X)(SS_Y)}}$
7
Example 1
         X      Y       X²        Y²       XY
        30    160      900    25,600    4,800
        38    180    1,444    32,400    6,840
        52    180    2,704    32,400    9,360
        90    210    8,100    44,100   18,900
        95    240    9,025    57,600   22,800
Sum    305    970   22,173   192,100   62,700
Totals: ∑X = 305, ∑Y = 970, ∑X² = 22,173, ∑Y² = 192,100, ∑XY = 62,700
8
Example 1
$SS_X = \sum X^2 - \dfrac{(\sum X)^2}{n} = 22{,}173 - \dfrac{305^2}{5} = 22{,}173 - \dfrac{93{,}025}{5} = 22{,}173 - 18{,}605 = 3{,}568$
$SS_Y = \sum Y^2 - \dfrac{(\sum Y)^2}{n} = 192{,}100 - \dfrac{970^2}{5} = 192{,}100 - \dfrac{940{,}900}{5} = 192{,}100 - 188{,}180 = 3{,}920$
9
Example 1
$SP = \sum XY - \dfrac{(\sum X)(\sum Y)}{n} = 62{,}700 - \dfrac{(305)(970)}{5} = 62{,}700 - \dfrac{295{,}850}{5} = 62{,}700 - 59{,}170 = 3{,}530$
10
Example 1
$r = \dfrac{SP}{\sqrt{(SS_X)(SS_Y)}} = \dfrac{3{,}530}{\sqrt{(3{,}568)(3{,}920)}} = \dfrac{3{,}530}{\sqrt{13{,}986{,}560}} = \dfrac{3{,}530}{3{,}739.861} = .944$
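The arithmetic in Example 1 can be checked with a short script. This is a minimal sketch, not part of the original slides: the function names are illustrative, and the Y list assumes the score 180 occurs twice, which is what the column totals (∑Y = 970, ∑Y² = 192,100) and the XY products imply.

```python
from math import sqrt

# Example 1 scores. Assumption: Y contains 180 twice, as implied by the
# column totals and the XY products on the slide.
X = [30, 38, 52, 90, 95]
Y = [160, 180, 180, 210, 240]


def sum_of_squares(scores):
    """Computational formula: SS = sum(x^2) - (sum(x))^2 / n."""
    n = len(scores)
    return sum(s * s for s in scores) - sum(scores) ** 2 / n


def sum_of_products(xs, ys):
    """Computational formula: SP = sum(xy) - sum(x) * sum(y) / n."""
    n = len(xs)
    return sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n


def pearson_r(xs, ys):
    """Pearson correlation: r = SP / sqrt(SS_X * SS_Y)."""
    return sum_of_products(xs, ys) / sqrt(sum_of_squares(xs) * sum_of_squares(ys))


ss_x = sum_of_squares(X)
ss_y = sum_of_squares(Y)
sp = sum_of_products(X, Y)
r = pearson_r(X, Y)
print(ss_x, ss_y, sp, round(r, 3))  # 3568.0 3920.0 3530.0 0.944
```

The three helper functions mirror the three slides of the worked example, so each intermediate value can be compared with the hand calculation.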
11
Coefficient of Determination (r²)
The value r² is called the coefficient of determination because it measures the proportion of variability in one variable that can be determined from the relationship with the other variable
–For example: A correlation of r = .42 (or r = -.42) means that r² = .1764, so about 18% of the variability in the Y scores can be predicted from the relationship with the X scores
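Squaring r is a one-line calculation, but a small sketch makes the point that the sign of r drops out. The function name below is illustrative, not from the slides.

```python
def coefficient_of_determination(r):
    """Return r squared: the proportion of variability in Y predictable from X."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return r * r


# A positive and a negative correlation of the same size give the same r squared.
for r in (0.42, -0.42):
    r2 = coefficient_of_determination(r)
    print(f"r = {r:+.2f}  r^2 = {r2:.4f}")
```

Both values of r give r² = .1764, which is why r² describes strength only and says nothing about direction.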
12
Find the Coefficient of Determination (r²) and Interpret:
The coefficient of determination is r² = .891.
Education, by itself, explains 89.1% of the variation in voter turnout.
13
Example 2 A researcher predicts that there is a high correlation between years of education and voter turnout –She chooses Alamosa, Boston, Chicago, Detroit, and NYC to test her theory
14
Example 2
The scores on each variable are displayed in table format:
–Y = % Turnout
–X = Years of Education
City         X      Y
Alamosa    11.9    55
Boston     12.1    60
Chicago    12.7    65
Detroit    12.8    68
NYC        13.0    70
15
Scatterplot The relationship between X and Y is linear.
16
Make a Computational Table
   X      Y     X²     Y²     XY
 11.9    55
 12.1    60
 12.7    65
 12.8    68
 13.0    70
∑X =   ∑Y =   ∑X² =   ∑Y² =   ∑XY =
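One way to fill in the computational table is to build the three derived columns and total each one. The sketch below uses the Example 2 scores as listed on the slide; the variable names are illustrative, and rounding is applied only to trim floating-point noise from the decimal X values.

```python
# Example 2 data as listed on the slide.
cities = ["Alamosa", "Boston", "Chicago", "Detroit", "NYC"]
X = [11.9, 12.1, 12.7, 12.8, 13.0]   # years of education
Y = [55, 60, 65, 68, 70]             # % turnout

# One row per city: X, Y, X^2, Y^2, XY.
rows = [(x, y, x * x, y * y, x * y) for x, y in zip(X, Y)]

print(f"{'City':<8}{'X':>7}{'Y':>6}{'X^2':>9}{'Y^2':>8}{'XY':>9}")
for city, (x, y, x2, y2, xy) in zip(cities, rows):
    print(f"{city:<8}{x:>7.1f}{y:>6d}{x2:>9.2f}{y2:>8d}{xy:>9.1f}")

# Column totals; round() only removes floating-point noise.
totals = [round(sum(column), 2) for column in zip(*rows)]
sum_x, sum_y, sum_x2, sum_y2, sum_xy = totals
print(f"{'Sum':<8}{sum_x:>7.1f}{sum_y:>6d}{sum_x2:>9.2f}{sum_y2:>8d}{sum_xy:>9.1f}")
```

The five totals are exactly the quantities that the SS and SP formulas need.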
17
Find Pearson’s r and Interpret:
18
Pearson’s r
Had the correlation between % college educated and turnout been r = .32:
–This relationship would have been positive and weak to moderate.
Had the correlation between % college educated and turnout been r = -.12:
–This relationship would have been negative and weak.
19
Find the Coefficient of Determination (r²) and Interpret:
20
Hypothesis Testing with Pearson
We can have a two-tailed hypothesis:
H0: ρ = 0.0
H1: ρ ≠ 0.0
We can have a one-tailed hypothesis:
H0: ρ = 0.0
H1: ρ < 0.0 (or ρ > 0.0)
Note that ρ (rho) is the population parameter, while r is the sample statistic
21
Find r critical
See Table B.6 (page 537)
–You need to know the alpha level
–You need to know the sample size
–We always use df = n - 2
22
Find r calculated See previous slides for formulas
23
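Make your decision…
If r calculated < r critical, then Retain H0
If r calculated > r critical, then Reject H0
For a two-tailed test with a negative r, compare the absolute value of r calculated with r critical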
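The decision rule can be sketched as a small function. This is an illustration under stated assumptions, not the textbook's procedure: the function names are made up, only the two-tailed case is handled, and the critical value .878 is the figure standard tables of critical r give for df = 3 at alpha = .05, two-tailed (it is assumed that Table B.6 lists the same value).

```python
def degrees_of_freedom(n):
    """df for a test of Pearson's r is n - 2."""
    return n - 2


def decide_two_tailed(r_calculated, r_critical):
    """Compare the magnitude of r with the critical value from the table."""
    if abs(r_calculated) > r_critical:
        return "Reject H0"
    return "Retain H0"


# Example 1: r = .944 from n = 5 pairs of scores.
n = 5
df = degrees_of_freedom(n)   # 3
r_critical = 0.878           # assumed table value: df = 3, alpha = .05, two-tailed
decision = decide_two_tailed(0.944, r_critical)
print(df, decision)          # 3 Reject H0
```

Taking the absolute value means a strong negative correlation leads to the same decision as a strong positive one in a two-tailed test.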
24
Always include a brief summary of your results:
Was it positive or negative?
Was it significant?
Explain the correlation
Explain the variation
–Coefficient of Determination (r²)
25
Credits
http://campus.houghton.edu/orgs/psychology/stat15b.ppt#267,2,Review
http://publish.uwo.ca/~pakvis/Interval.ppt#276,17,Practical Example using Healey P. 418 Problem 15.1