Introduction to Statistics: Correlation. Chapter 15. April 23-28, 2009. Classes #27-28.


1 Introduction to Statistics: Correlation. Chapter 15. April 23-28, 2009. Classes #27-28

2 Correlation A statistical technique used to measure and describe a relationship between two variables –For example: GPA and TD’s scored; statistics exam scores and amount of time spent studying

3 Notation A correlation requires two scores for each individual –One score from each of the two variables –They are normally identified as X and Y

4 Three characteristics of X and Y are being measured… The direction of the relationship –Positive or negative The form of the relationship –Usually linear form The strength or consistency of the relationship –Perfect correlation = ±1.00; no consistency would be 0.00 –Therefore, the magnitude of a correlation measures the degree of relationship between two variables on a scale from 0.00 to 1.00, and its sign gives the direction, so r itself ranges from -1.00 to +1.00.

5 Assumptions There are 3 main assumptions… –1. The dependent and independent variables are normally distributed. We can check this by looking at the histograms for the two variables. –2. The relationship between X and Y is linear. We can check this by looking at the scattergram. –3. The relationship is homoscedastic. We can check homoscedasticity by looking at the scattergram and observing that the data points form a “roughly symmetrical, cigar-shaped pattern” about the regression line. If the above 3 assumptions have been met, then we can use correlation and test r for significance.

6 Pearson r The most commonly used correlation. Measures the degree of straight-line relationship. Computation: $r = \dfrac{SP}{\sqrt{(SS_X)(SS_Y)}}$
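The computation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the function name `pearson_r` is hypothetical, and it uses the definitional (deviation-score) forms of SS and SP, which are algebraically the same as the computational forms used in Example 1.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson r = SP / sqrt(SS_X * SS_Y) for paired scores.

    Hypothetical helper for illustration; uses deviation scores.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    # Sum of products of deviations
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable has no variability")
    return sp / sqrt(ss_x * ss_y)


# A perfectly linear increasing relationship and a perfectly linear decreasing one
print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))
```

The two calls illustrate the extremes of the scale: the first pair of lists is perfectly positively related and the second perfectly negatively related.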

7 Example 1
         X      Y      X²       Y²       XY
        30    160     900   25,600    4,800
        38    180   1,444   32,400    6,840
        52    180   2,704   32,400    9,360
        90    210   8,100   44,100   18,900
        95    240   9,025   57,600   22,800
Sum    305    970  22,173  192,100   62,700
      (ΣX)   (ΣY)   (ΣX²)    (ΣY²)    (ΣXY)

8 Example 1
$SS_X = \sum X^2 - \dfrac{(\sum X)^2}{n} = 22{,}173 - \dfrac{305^2}{5} = 22{,}173 - \dfrac{93{,}025}{5} = 22{,}173 - 18{,}605 = 3{,}568$
$SS_Y = \sum Y^2 - \dfrac{(\sum Y)^2}{n} = 192{,}100 - \dfrac{970^2}{5} = 192{,}100 - \dfrac{940{,}900}{5} = 192{,}100 - 188{,}180 = 3{,}920$

9 Example 1
$SP = \sum XY - \dfrac{(\sum X)(\sum Y)}{n} = 62{,}700 - \dfrac{(305)(970)}{5} = 62{,}700 - \dfrac{295{,}850}{5} = 62{,}700 - 59{,}170 = 3{,}530$
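For readers wondering where the computational formula for SP comes from, the short derivation below shows that it equals the sum of products of deviation scores. The symbols $M_X$ and $M_Y$ for the two sample means are notation assumed here; the slides do not name them.

```latex
\begin{aligned}
SP &= \sum (X - M_X)(Y - M_Y), \qquad M_X = \frac{\sum X}{n},\quad M_Y = \frac{\sum Y}{n} \\
   &= \sum XY - M_X \sum Y - M_Y \sum X + n\,M_X M_Y \\
   &= \sum XY - \frac{(\sum X)(\sum Y)}{n} - \frac{(\sum X)(\sum Y)}{n} + \frac{(\sum X)(\sum Y)}{n} \\
   &= \sum XY - \frac{(\sum X)(\sum Y)}{n}
\end{aligned}
```

Setting Y = X in the same argument gives the computational formula for $SS_X$, and likewise for $SS_Y$.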

10 Example 1
$r = \dfrac{SP}{\sqrt{(SS_X)(SS_Y)}} = \dfrac{3{,}530}{\sqrt{(3{,}568)(3{,}920)}} = \dfrac{3{,}530}{\sqrt{13{,}986{,}560}} = \dfrac{3{,}530}{3{,}739.861} = .944$
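The arithmetic in Example 1 can be checked with a short script. This is a sketch using only the standard library; the variable names are illustrative, and the Y scores are taken to be 160, 180, 180, 210, 240, the five values consistent with the XY column and with ΣY = 970.

```python
from math import sqrt

# Example 1 scores (Y values as implied by the XY column and the sum of 970)
x = [30, 38, 52, 90, 95]
y = [160, 180, 180, 210, 240]
n = len(x)

# Column sums from the computational table
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Computational formulas for SS_X, SS_Y and SP
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n
sp = sum_xy - sum_x * sum_y / n

r = sp / sqrt(ss_x * ss_y)

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 305 970 22173 192100 62700
print(ss_x, ss_y, sp)                        # 3568.0 3920.0 3530.0
print(round(r, 3))                           # 0.944
```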

11 Coefficient of Determination (r²) The value r² is called the coefficient of determination because it measures the proportion of variability in one variable that can be determined from the relationship with the other variable –For example: A correlation of r = .42 (or r = -.42) means that r² = .1764, so about 18% of the variability in the Y scores can be predicted from the relationship with the X scores

12 Find the Coefficient of Determination (r²) and Interpret: The coefficient of determination is r² = .891. Education, by itself, explains 89.1% of the variation in voter turnout.

13 Example 2 A researcher predicts that there is a high correlation between years of education and voter turnout –She chooses Alamosa, Boston, Chicago, Detroit, and NYC to test her theory

14 Example 2 The scores on each variable are displayed in table format: –X = Years of Education –Y = % Turnout
City        X      Y
Alamosa   11.9    55
Boston    12.1    60
Chicago   12.7    65
Detroit   12.8    68
NYC       13.0    70

15 Scatterplot The relationship between X and Y is linear.

16 Make a Computational Table
   X      Y     X²     Y²     XY
 11.9    55
 12.1    60
 12.7    65
 12.8    68
 13.0    70
∑X =    ∑Y =    ∑X² =    ∑Y² =    ∑XY =

17 Find Pearson’s r and Interpret:
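One way to fill in the computational table and carry the calculation through is sketched below. It assumes the five city scores listed in Example 2 and uses illustrative variable names; treat it as a check on hand calculation rather than as the deck's own worked solution.

```python
from math import sqrt

# Example 2 data: X = years of education, Y = % turnout
cities = ["Alamosa", "Boston", "Chicago", "Detroit", "NYC"]
x = [11.9, 12.1, 12.7, 12.8, 13.0]
y = [55, 60, 65, 68, 70]
n = len(x)

# Print the computational table row by row
print(f"{'City':<8}{'X':>7}{'Y':>5}{'X^2':>9}{'Y^2':>7}{'XY':>9}")
for city, xi, yi in zip(cities, x, y):
    print(f"{city:<8}{xi:>7.1f}{yi:>5d}{xi * xi:>9.2f}{yi * yi:>7d}{xi * yi:>9.1f}")

# Column sums
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(v * v for v in x)
sum_y2 = sum(v * v for v in y)
sum_xy = sum(a * b for a, b in zip(x, y))
print(f"{'Sum':<8}{sum_x:>7.1f}{sum_y:>5d}{sum_x2:>9.2f}{sum_y2:>7d}{sum_xy:>9.1f}")

# Computational formulas, then Pearson r and r squared
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n
sp = sum_xy - sum_x * sum_y / n
r = sp / sqrt(ss_x * ss_y)

print(f"SS_X = {ss_x:.2f}, SS_Y = {ss_y:.1f}, SP = {sp:.1f}")
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

With these five pairs the sketch gives r of about .984 and r² of about .968, a strong positive relationship.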

18 Pearson’s r Had the correlation between % college educated and turnout been r = .32, the relationship would have been positive and weak to moderate. Had the correlation between % college educated and turnout been r = -.12, the relationship would have been negative and weak.

19 Find the Coefficient of Determination (r²) and Interpret:

20 Hypothesis Testing with Pearson We can have a two-tailed hypothesis: H₀: ρ = 0.0, H₁: ρ ≠ 0.0. We can have a one-tailed hypothesis: H₀: ρ = 0.0, H₁: ρ > 0.0 (or ρ < 0.0). Note that ρ (rho) is the population parameter, while r is the sample statistic.

21 Find r critical See Table B.6 (page 537) –You need to know the alpha level –You need to know the sample size –We will always use df = n - 2

22 Find r calculated See previous slides for formulas

23 Make your decision… If r calculated < r critical, then retain H₀. If r calculated > r critical, then reject H₀. (For a two-tailed test, compare the absolute value of r calculated with r critical.)
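The decision rule can be written out as a small sketch for the two-tailed case. The function names are hypothetical, and the critical value .878 (df = 3, alpha = .05, two-tailed) is an assumption taken from commonly published tables of critical r; confirm it against Table B.6 before relying on it.

```python
def degrees_of_freedom(n):
    """df for testing Pearson r: n pairs of scores minus 2."""
    return n - 2


def decide_two_tailed(r_calculated, r_critical):
    """Two-tailed decision: compare the magnitude of r with the critical value."""
    if abs(r_calculated) > r_critical:
        return "reject H0"
    return "retain H0"


# Example 1: n = 5 pairs, r = .944
n = 5
df = degrees_of_freedom(n)

# Assumed table value for df = 3, alpha = .05, two-tailed (verify in Table B.6)
R_CRITICAL = 0.878

print(df)                                    # 3
print(decide_two_tailed(0.944, R_CRITICAL))  # reject H0
print(decide_two_tailed(-0.12, R_CRITICAL))  # retain H0
```

A one-tailed test would instead check r against the critical value in the predicted direction only, with the critical value read from the one-tailed column of the table.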

24 Always include a brief summary of your results: Was it positive or negative? Was it significant? Explain the correlation. Explain the variation –Coefficient of Determination (r²)

25 Credits http://campus.houghton.edu/orgs/psychology/stat15b.ppt#267,2,Review http://publish.uwo.ca/~pakvis/Interval.ppt#276,17,Practical Example using Healey P. 418 Problem 15.1

