Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 17: Chi-Square.


1 Intro to Statistics for the Behavioral Sciences PSYC 1900 Lecture 17: Chi-Square

2 Chi-Square Analyses Chi-Square tests are used to analyze categorical (as opposed to continuous or ranked) data. Both independent and dependent variables are on nominal scales. Data in cells represent frequencies as opposed to measured scores on variables.

3 One Classification Variable: Chi-Square Goodness-of-Fit Test Sometimes, we may be interested in determining whether a specific category of a nominal variable occurs more frequently than would be expected by chance alone. For example, are people more likely to be right-handed than left-handed? Is there a significant preference for salty as opposed to sweet or spicy snacks? We can answer such questions by comparing observed frequencies with theoretically predicted ones.

4 Example We have a sample of 99 participants and ask each of them to choose one of 3 snacks (salty, sweet, spicy). The null hypothesis would be that no divergent preferences exist: each option is equally likely to be selected. Expected frequencies are the number of observations expected if the null is true. This would imply an expected frequency of 33 for each type of snack.

5 Example We can then compare the actual versus predicted preferences. Observed: 45 (Salty), 26 (Sweet), 28 (Spicy). Expected: 33 (Salty), 33 (Sweet), 33 (Spicy). Our task now is to determine whether the deviation from expected frequencies is unlikely to represent sampling error.

6 Chi-Square Test The logic of the Chi-Square test is straightforward. We calculate the squared deviation of each observed frequency from its expected frequency, scaled by the size of that expected frequency. For example, if we had expected only 10 observations and found 20, that is a large discrepancy. If we had expected 100 and found 110, it is much less consequential.

7 Chi-Square Test
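
The logic described on the previous slide corresponds to the standard statistic, chi-square = sum over categories of (O - E)^2 / E, where O is an observed and E an expected frequency. As a minimal sketch (the function and variable names are illustrative, not from the lecture), the snack data can be run through it like this:

```python
# Observed snack choices from the lecture example (N = 99).
observed = {"salty": 45, "sweet": 26, "spicy": 28}


def chi_square_gof(observed_counts, expected_counts):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e
               for o, e in zip(observed_counts, expected_counts))


n = sum(observed.values())
# Under the null hypothesis every snack is equally likely: 99 / 3 = 33 each.
expected = [n / len(observed)] * len(observed)

chi2 = chi_square_gof(observed.values(), expected)
print(round(chi2, 2))  # 6.61
```

The three terms are 144/33, 49/33, and 25/33, which sum to 218/33, or about 6.61.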

8 Is it Significant? Of course, we now have to determine the likelihood of this value. We do so by referring to the Chi-Square distribution, with df = number of groups - 1. Like t and F, the Chi-Square distribution is a family of distributions whose shape changes as a function of df. It is positively skewed, especially for small df.

9 Is it Significant? The critical value for a df = 2 test at alpha = .05 is 5.99. The obtained value for the snack data, 218/33 or about 6.61, exceeds this, so we can reject the null and state that there seems to be a significant preference for salty snacks.
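
The critical value can be checked without a table in this particular case, because a chi-square distribution with df = 2 is an exponential distribution with mean 2, so its upper-tail probability is exp(-x / 2). The sketch below relies on that special case only; for other df one would use a printed table or a statistics library. The function names are illustrative.

```python
import math


def chi2_upper_tail_df2(x):
    """P(X >= x) for a chi-square variable with df = 2 (special case only)."""
    return math.exp(-x / 2)


def chi2_critical_df2(alpha):
    """Value cut off by an upper-tail area of alpha when df = 2."""
    return -2 * math.log(alpha)


chi2 = 218 / 33                      # statistic for the snack example
critical = chi2_critical_df2(0.05)   # about 5.99
p_value = chi2_upper_tail_df2(chi2)  # about 0.037

print(round(critical, 2), round(p_value, 3), chi2 > critical)
```

Both routes agree: the statistic exceeds the critical value, and equivalently the p-value falls below .05.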

10 Two Classification Variables A more common use occurs with 2 variables (often an IV and a DV). For example, does a political advertisement that makes you angry result in more votes for a candidate than a more neutral one? Have participants watch one type of ad and then record their voting behavior.

11 Data The formula for calculating chi-square is the same. The expected frequency for each cell is calculated as the product of its row total and column total (i.e., the marginal totals) divided by the total sample size N.

12 Results df = (R-1)(C-1), where R and C are the numbers of rows and columns in the table.
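
The data table for the advertisement study did not survive in this transcript, so the sketch below uses hypothetical counts (they are an assumption, not the lecture's data) purely to illustrate how expected frequencies, the chi-square statistic, and df are obtained for a two-variable table.

```python
# HYPOTHETICAL counts, not the lecture's data.
# Rows: ad type (angry, neutral). Columns: voted for candidate (yes, no).
table = [[30, 20],
         [18, 32]]


def expected_frequencies(counts):
    """Expected cell count = row total * column total / N."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(counts):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_frequencies(counts)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(counts, expected)
               for o, e in zip(obs_row, exp_row))


expected = expected_frequencies(table)
chi2 = chi_square(table)
df = (len(table) - 1) * (len(table[0]) - 1)

print(expected)        # [[24.0, 26.0], [24.0, 26.0]]
print(round(chi2, 2))  # 5.77
print(df)              # 1
```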

13 Effect Size The most common measure is Cramer's Phi. Cramer's Phi squared gives an index of the amount of variance explained (similar to eta squared).
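
The effect-size formula itself is missing from the transcript. The usual definition of Cramer's Phi is the square root of chi-square / (N * (k - 1)), where k is the smaller of the number of rows and columns. A minimal sketch under that definition, with hypothetical input values chosen only for illustration:

```python
import math


def cramers_phi(chi2, n, n_rows, n_cols):
    """Cramer's Phi = sqrt(chi2 / (N * (k - 1))), with k = min(rows, cols)."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


# HYPOTHETICAL values: a 2 x 2 table with N = 100 and chi-square = 5.77.
phi = cramers_phi(chi2=5.77, n=100, n_rows=2, n_cols=2)
print(round(phi, 2))       # 0.24
print(round(phi ** 2, 3))  # 0.058
```

For a 2 x 2 table k - 1 equals 1, so the measure reduces to the square root of chi-square / N.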

14 Chi-Square and Proportions Chi-Square tests can be used to analyze proportions if you convert the proportions to actual frequencies.
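
Converting proportions to frequencies only requires the group sizes. The sketch below assumes two groups of 50 with hypothetical "yes" proportions of .60 and .36; both the numbers and the function name are illustrative.

```python
def proportions_to_counts(p_yes, group_size):
    """Turn a proportion of 'yes' responses into [yes, no] frequencies."""
    yes = round(p_yes * group_size)
    return [yes, group_size - yes]


# HYPOTHETICAL proportions and group sizes, for illustration only.
table = [proportions_to_counts(0.60, 50),
         proportions_to_counts(0.36, 50)]
print(table)  # [[30, 20], [18, 32]]
```

The resulting table of counts, not the proportions themselves, is what goes into the chi-square formula.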

15 Chi-Square Assumptions All data are independent. No participant can be included more than once. As a rule of thumb, the expected frequencies for all cells should be no smaller than 5.
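
The rule of thumb about expected frequencies is easy to check mechanically before running a test. A minimal sketch, with an illustrative helper name and a second table of made-up expected values included only to show a failing case:

```python
def expected_counts_ok(expected, minimum=5):
    """True if every expected frequency is at least `minimum`."""
    return all(e >= minimum for row in expected for e in row)


# Snack example: one row of three expected frequencies of 33.
print(expected_counts_ok([[33, 33, 33]]))  # True

# HYPOTHETICAL expected values for a small sample.
print(expected_counts_ok([[3.2, 6.8], [4.8, 10.2]]))  # False
```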

