Week 3: Association and correlation. Trevor Thompson, 15-10-2007.

1 Week 3: Association and correlation
Handout & additional course notes available at http://homepages.gold.ac.uk/aphome
15-10-2007, Trevor Thompson

2 Overview
1) What are tests of association and which test do I use?
2) Associations within categorical data
   - descriptives (frequency tables)
   - the chi-square test
   - Howell (2002), ‘Statistical Methods for Psychology’, Chap 6 & 9
3) Associations within continuous data
   - descriptives (scatterplots)
   - Spearman’s and Pearson’s ‘r’

3 What is association/correlation?
- To examine whether there is a relationship between variables
- Variables are either associated or independent (which is the null hypothesis?)
- Causation vs. association depends on the experimental design, not the test used

4 Which test to use?
Test selection depends on the data:
- Categorical data: Chi-square
- Ordinal (ranked) data: Spearman’s rho
- Interval/ratio data: Pearson’s r
Other, less commonly used tests exist (tetrachoric, Kendall’s tau, phi etc.); see Howell
Logistic regression is covered in a later lecture

5 Which test to use: examples
- Is there an association between height and weight? Pearson’s r
- Is there an association between 50 cities ranked for ‘livability’ 10 years ago and these cities ranked for ‘livability’ today? Spearman’s rho
- Is there an association between gender (male / female) and yogurt preference (light / dark)? Chi-square test
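
For the ranked-cities example, a minimal sketch of Spearman's rho may help. It assumes untied ranks and uses the standard shortcut formula, which is not given on the slide; the five city rankings are hypothetical and chosen only for illustration.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman's rho for two lists of untied ranks.

    Uses the shortcut formula 1 - 6 * sum(d^2) / (n * (n^2 - 1)),
    where d is the difference between the two ranks of each item.
    """
    n = len(ranks_x)
    sum_d_squared = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical 'livability' ranks for five cities, then and now.
ranks_then = [1, 2, 3, 4, 5]
ranks_now = [2, 1, 3, 5, 4]

rho = spearman_rho(ranks_then, ranks_now)
print(round(rho, 2))
```

With tied ranks the shortcut formula no longer holds exactly; in that case Pearson's r computed on the ranks is the usual route.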

6 Pearson’s chi-square test for categorical data
- descriptives
- assumptions
- chi-square significance test
Research question: Is gender associated with preference for a specifically coloured yogurt?

7 Chi-square test: data entry
- Each row should represent the responses of one participant
- Compute a contingency (frequency) table
- An n-way table denotes the number of variables: gender & yogurt is a 2-way table
- Tables are also described in terms of how many levels each variable has. So a 3*2 table would represent one variable with 3 levels & one variable with 2 levels; gender & yogurt preference is a 2*2 table
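
The step from one-row-per-participant data to a contingency table can be sketched in Python as follows. The participant rows are hypothetical, constructed so that the cell counts match the 20/10/10/20 table used in the worked chi-square example later in the lecture.

```python
from collections import Counter

# Hypothetical raw data: one (gender, preference) row per participant.
# Counts are chosen to match the lecture's 2*2 example (N = 60).
rows = (
    [("female", "light")] * 20
    + [("female", "dark")] * 10
    + [("male", "light")] * 10
    + [("male", "dark")] * 20
)

# Count how many participants fall into each cell.
cells = Counter(rows)

genders = ["female", "male"]
preferences = ["light", "dark"]

# One table row per gender, one column per preference.
table = [[cells[(g, p)] for p in preferences] for g in genders]

for gender, counts in zip(genders, table):
    print(gender, counts)
```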

8 Chi-square test: descriptives
Contingency tables [three example tables, not reproduced]:
- Probable association
- Probable independence (no association)
- Possible association?

9 Chi-square test: assumptions
1. Observations must be independent
2. Observations must be mutually exclusive: each response should fall into only one cell, e.g. prefer either dark or light yogurt, not both
3. Inclusion of non-occurrences: include all responses (e.g. both ‘yes’ and ‘no’), otherwise the results can be misleading
4. Cell size: expected cell size > 5
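
The cell-size assumption can be checked before running the test. The sketch below computes expected counts from the row and column totals and applies the "expected cell size > 5" rule of thumb. The function names are illustrative, and `small_table` is a hypothetical sample included only for contrast.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def meets_cell_size_rule(table, minimum=5):
    """True if every expected count exceeds the minimum (rule of thumb)."""
    return all(e > minimum for row in expected_counts(table) for e in row)


lecture_table = [[20, 10], [10, 20]]  # gender * yogurt counts from the lecture
small_table = [[3, 1], [2, 4]]        # hypothetical small sample

print(meets_cell_size_rule(lecture_table))
print(meets_cell_size_rule(small_table))
```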

10 Chi-square test: significance testing
- Are two variables significantly associated?
- Run Pearson’s chi-square

11 Chi-square test: Pearson’s $\chi^2$ statistic
- Gender & yogurt preference significantly associated ($\chi^2 = 6.67$, $p < .05$)
- Is this in the expected direction? Our hypothesis was 2-tailed. If 1-tailed (e.g. females will prefer light yogurts), then check the contingency table for direction
- Can halve the p-value if 1-tailed, but only if both variables have 2 levels

12 Chi-square test
Degrees of freedom
- $df = (R-1)(C-1)$, where $R$ = rows, $C$ = columns
Yates’ continuity correction
- Only applicable to 2*2 tables
- Changes $(O-E)^2$ in the formula to $(|O-E| - 0.5)^2$
- Not really needed
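
A short sketch of the degrees-of-freedom rule and of Yates' correction, applied to the lecture's gender * yogurt counts (20/10/10/20, expected 15 per cell). The function names are illustrative, not from any particular package.

```python
def degrees_of_freedom(table):
    """df = (rows - 1) * (columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


def chi_square(observed, expected, yates=False):
    """Pearson chi-square; with yates=True each |O - E| is reduced by 0.5."""
    total = 0.0
    for o_row, e_row in zip(observed, expected):
        for o, e in zip(o_row, e_row):
            diff = abs(o - e)
            if yates:
                diff -= 0.5  # continuity correction, 2*2 tables only
            total += diff ** 2 / e
    return total


observed = [[20, 10], [10, 20]]
expected = [[15, 15], [15, 15]]

print(degrees_of_freedom(observed))
print(round(chi_square(observed, expected), 2))
print(round(chi_square(observed, expected, yates=True), 2))
```

The corrected statistic is smaller than the uncorrected one, which makes the test more conservative.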

13 Chi-square test: likelihood ratio
- An alternative test for associations in categorical data
- For large samples, the likelihood ratio is approximately equal to Pearson chi-square
- For small samples, the chi-square test may be more accurate
- The likelihood ratio is useful for multi-dimensional associations; covered in the logistic regression lecture
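
The slide does not give a formula for the likelihood ratio. The sketch below assumes the standard likelihood-ratio chi-square, $G = 2 \sum O \ln(O/E)$, and applies it to the lecture's gender * yogurt counts so it can be compared with the Pearson value of 6.67.

```python
import math


def likelihood_ratio_g(observed, expected):
    """Likelihood-ratio chi-square: G = 2 * sum(O * ln(O / E)).

    Cells with O = 0 contribute nothing and are skipped.
    """
    return 2 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )


observed = [20, 10, 10, 20]  # lecture's cell counts
expected = [15, 15, 15, 15]  # expected under independence

g = likelihood_ratio_g(observed, expected)
print(round(g, 2))
```

For this table the two statistics are close but not identical.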

14 Chi-square test: odds-ratio (OR) estimate
How large is our significant association?
- Odds of females choosing light relative to dark? 2/1
- Odds of males choosing light relative to dark? 1/2
- Odds ratio $= \frac{a/b}{c/d}$, or equivalently $OR = \frac{ad}{bc}$ (a, b = female light, dark counts; c, d = male light, dark counts)
- Odds ratio: what are the odds of choosing a light yogurt for females relative to males? 4/1
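
The odds-ratio arithmetic can be reproduced in a few lines, using the cell counts from the lecture's worked chi-square example (20 and 10 for females, 10 and 20 for males). The cell labels a to d follow the formula on this slide.

```python
# Cell counts: a, b = female light, dark; c, d = male light, dark.
a, b = 20, 10
c, d = 10, 20

odds_female = a / b  # odds of light vs. dark for females
odds_male = c / d    # odds of light vs. dark for males

odds_ratio = odds_female / odds_male
odds_ratio_shortcut = (a * d) / (b * c)  # equivalent form, OR = ad / bc

print(odds_female, odds_male, odds_ratio, odds_ratio_shortcut)
```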

15 Chi-square test: underlying logic
Pearson $\chi^2 = \sum \frac{(O-E)^2}{E}$, where $O$ = observed frequency, $E$ = expected frequency
- The $\chi^2$ statistic represents how far the actual observed data deviate from what is expected by chance
Calculating $\chi^2$
Step 1. Calculate expected frequencies
- Prob of choosing light yogurt? ½ (30/60)
- Prob of being female? ½
- Prob of being female & preferring light yogurt? ¼ [joint prob = p1 x p2]
- So if N=60, expected freq for each cell = 15 (60 x ¼)
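
Step 1 can be written out directly. The sketch assumes 30 of the 60 participants are female and 30 prefer light yogurt, as implied by the probabilities of ½ on this slide.

```python
n = 60

p_light = 30 / n   # probability of choosing light yogurt
p_female = 30 / n  # probability of being female

# Under independence, the joint probability is the product p1 x p2.
p_female_and_light = p_light * p_female

# Expected frequency for the cell = N * joint probability.
expected_female_light = n * p_female_and_light

print(p_female_and_light, expected_female_light)
```

Because both variables split 50/50 here, every cell has the same expected frequency; with unequal margins each cell would need its own product.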

16 Chi-square test: underlying logic
Step 2. Observed frequencies
- The bigger the deviations between observed and chance-expected cell sizes, the greater the likelihood of a significant association
$\chi^2 = \sum \frac{(O-E)^2}{E} = \frac{(20-15)^2}{15} + \frac{(10-15)^2}{15} + \frac{(10-15)^2}{15} + \frac{(20-15)^2}{15} = 6.67$, same as in the SPSS output
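
The same sum can be checked cell by cell. This is a plain-Python sketch of the arithmetic on this slide, not a reproduction of the SPSS procedure.

```python
observed = [20, 10, 10, 20]  # observed frequency in each cell
expected = [15, 15, 15, 15]  # chance-expected frequency in each cell

# Each cell contributes (O - E)^2 / E to the statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

chi_square = sum(contributions)

print([round(c, 3) for c in contributions])
print(round(chi_square, 2))
```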

17 Chi-square test: underlying logic
- The probability value corresponding to $\chi^2 = 6.67$ is $p = .01$ (meaning a value of 6.67 or larger occurs about 1 time in 100 by chance)
- The chi-square distribution [figure not reproduced] shows the values of the chi-square statistic that would be obtained by chance in repeated sampling
- The distribution of $\chi^2$ changes according to df
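
For the 2*2 case (df = 1) the p-value can be obtained from the standard library alone. The sketch relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable; it does not generalise to other df, where a statistics package would normally be used.

```python
import math


def chi_square_p_value_df1(chi_square):
    """Upper-tail p-value for a chi-square statistic with df = 1.

    P(chi2 >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(chi_square / 2))


p = chi_square_p_value_df1(6.67)
print(round(p, 4))
```

The result is just under .01, in line with the value quoted on the slide.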

18 Correlation and regression
- Detailed coverage of correlation/regression in lectures 8 & 9
- When X & Y are continuous variables, we use Pearson’s correlation coefficient ‘r’ (or the equivalent Spearman’s rho for ranked data)
Correlation vs. regression
i. Correlation is used to index strength of association; regression is used in prediction
ii. (Historically) if X is fixed then regression, if X is random then correlation

19 Correlation and regression: descriptives
Scatterplot [figure not reproduced]
- Correlation (r) is related to the degree to which the points cluster around the line (from 0 to 1 or -1)
- Regression line is the “line of best fit”
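
To make 'r' concrete, here is a minimal sketch of Pearson's product-moment correlation computed from deviations about the means. The height and weight values are hypothetical and chosen only so that the points cluster tightly around a rising line.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: sum of cross-products over the root of the summed squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical data: heights in cm and weights in kg for five people.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 75, 88]

r = pearson_r(heights, weights)
print(round(r, 3))
```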

20 Correlation and regression: significance testing
Pearson’s product-moment correlation
- r = 0: no correlation; r = +1 or -1: max correlation
- Null hyp is population r = 0, with r normally distributed
- To evaluate the significance of ‘r’, convert to ‘t’: $t = \frac{r\sqrt{N-2}}{\sqrt{1-r^2}}$
- Assumptions of normality and homogeneity of variance apply; covered in detail in lecture 6
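
The conversion from 'r' to 't' is a one-line calculation. In the sketch below the values r = 0.5 and N = 27 are hypothetical; the resulting t is evaluated against a t distribution with N - 2 degrees of freedom.

```python
import math


def t_from_r(r, n):
    """Convert a correlation r from n pairs to t = r * sqrt(n - 2) / sqrt(1 - r^2)."""
    if abs(r) >= 1:
        raise ValueError("t is undefined when r is exactly +1 or -1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Hypothetical example: r = 0.5 observed in a sample of 27 participants.
r, n = 0.5, 27
t = t_from_r(r, n)
df = n - 2

print(round(t, 3), df)
```

The same r gives a larger t as N grows, which is why a weak correlation can still be significant in a large sample.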

21 Summary
- Selection of appropriate test depends on data
- Chi-square test: explanation of output
- Chi-square test: underlying logic
- Correlation and regression
