1
Qualitative data – tests of association
The Chi-Square Distribution and Test for Independence
Hypothesis testing between two or more categorical variables
Sporiš Goran, PhD.
2
Chi-Square Distribution
The chi-square distribution results when independent variables with standard normal distributions are squared and summed.
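The definition above can be checked with a small simulation. This is a minimal sketch using only Python's standard library; the helper name `chi_square_draw`, the seed, the choice of 3 degrees of freedom, and the number of draws are illustrative assumptions, not part of the presentation.

```python
import random

def chi_square_draw(df, rng):
    """One draw: the sum of `df` squared independent standard normal values."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

rng = random.Random(0)  # fixed seed (assumption) so the run is repeatable
df = 3                  # illustrative choice of degrees of freedom
draws = [chi_square_draw(df, rng) for _ in range(20000)]

# The mean of a chi-square distribution equals its degrees of freedom,
# so the sample mean should land close to df.
sample_mean = sum(draws) / len(draws)
print(f"df = {df}, sample mean = {sample_mean:.2f}")
```

Because every term is a square, no draw can be negative, which is why the distribution starts at zero and is skewed to the right.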
3
Chi-square Degrees of freedom
df = (r - 1)(c - 1)
where r = number of rows and c = number of columns.
Thus, in any 2 x 2 contingency table, the degrees of freedom = 1.
As the degrees of freedom increase, the distribution shifts to the right and the critical values of chi-square become larger.
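As a rough illustration of the formula, the sketch below computes df for a few table shapes and looks up the matching critical value. The function name `degrees_of_freedom` and the dictionary `CRITICAL_05` are hypothetical; the critical values are the standard tabled chi-square values for alpha = .05, hard-coded here as an assumption rather than computed.

```python
def degrees_of_freedom(rows, cols):
    """df for a contingency table with `rows` rows and `cols` columns."""
    return (rows - 1) * (cols - 1)

# Standard tabled critical values for alpha = .05 (hard-coded, not computed).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

for rows, cols in [(2, 2), (2, 3), (3, 3)]:
    df = degrees_of_freedom(rows, cols)
    print(f"{rows} x {cols} table: df = {df}, critical value = {CRITICAL_05[df]}")
```

The printed critical values grow with df, which matches the statement that the distribution shifts to the right as the degrees of freedom increase.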
4
Chi-Square Test of Independence
5
Using the Chi-Square Test
Often used with contingency tables (i.e., crosstabulations), e.g., gender x student.
The chi-square test of independence tests whether the columns are contingent on the rows in the table.
In this case, the null hypothesis is that there is no relationship between row and column frequencies.
H0: The two variables are independent.
6
Requirements for Chi-Square test
The data must be a random sample from the population.
Data must be in raw frequencies (counts, not percentages).
Observations must be independent of one another.
Categories for each I.V. must be mutually exclusive and exhaustive.
7
Example Crosstab: Gender x Student
            Student       Not Student    Total
Males       46 (40.97)    71 (76.02)     117
Females     37 (42.03)    83 (77.97)     120
Total       83            154            237
Cell entries: Observed (Expected)
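The expected counts in this crosstab follow from (row total x column total) / grand total. The sketch below recomputes them from the observed counts and then sums (observed - expected)^2 / expected over the four cells. It uses the uncorrected Pearson statistic; variable names are illustrative assumptions.

```python
# Observed counts from the Gender x Student crosstab.
observed = [
    [46, 71],  # Males:   Student, Not Student
    [37, 83],  # Females: Student, Not Student
]

row_totals = [sum(row) for row in observed]        # [117, 120]
col_totals = [sum(col) for col in zip(*observed)]  # [83, 154]
n = sum(row_totals)                                # 237

# Expected count for each cell under independence.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print([[round(e, 2) for e in row] for row in expected])
print(f"chi-square = {chi_square:.2f}")
```

The statistic comes out at roughly 1.87, which is below the 3.84 critical value for df = 1 at alpha = .05, so on these counts the null hypothesis of independence would not be rejected.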
8
Special Cases
Fisher’s Exact Test: Use when you have a 2 x 2 table with expected frequencies less than 5.
Strength of Association: Some use Cramer’s V (for any two nominal variables) or Phi (for 2 x 2 tables) to give a value of association between the variables.
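For the strength-of-association measures, a minimal sketch follows. It assumes the usual definition V = sqrt(chi-square / (n x min(r - 1, c - 1))), which reduces to the magnitude of Phi for a 2 x 2 table. The function names are hypothetical, and the counts are the Gender x Student crosstab used as an illustration.

```python
import math

def pearson_chi_square(table):
    """Uncorrected Pearson chi-square for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(table))
        for j in range(len(table[0]))
    )

def cramers_v(table):
    """Cramer's V; for a 2 x 2 table this equals the magnitude of Phi."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(pearson_chi_square(table) / (n * k))

v = cramers_v([[46, 71], [37, 83]])
print(f"Cramer's V = {v:.3f}")
```

V ranges from 0 (no association) to 1 (perfect association), so a value below 0.1 here points to a very weak relationship between gender and student status in this sample.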
9
Two chi square tests
Goodness of fit: One variable. Determines how well the sample proportions match a pre-specified distribution.
Independence: Two variables. Determines whether there is a relationship between two variables.
10
Steps in hypothesis testing
State the hypotheses (null and research)
Select an alpha level and determine the critical value
Compute the test statistic
Make a decision
11
Test for goodness of fit
Forms of the null hypothesis
No preference
  There is no difference in proportions among the categories
  Participants do not prefer one category over another
  Example: Pepsi: 50%, Coke: 50%
No difference from a comparison population
  There is no difference between the sample distribution and a known (population) distribution
  Example: ND: 20% Bl, 75% Br, 5% R; US: 20% Bl, 75% Br, 5% R
12
Test for goodness of fit
Null hypothesis: Specifies a distribution of proportions.
Research hypothesis: Specifies that the distribution will be different from that indicated in the null hypothesis.
13
Calculating the test statistic
Observed frequencies ($f_o$): the number of individuals from the sample who are classified in a particular category.
Expected frequencies ($f_e$): the number of individuals from the sample who are expected to be classified in a particular category.
14
Calculating the test statistic
Coin flip: What percentage of people will predict heads? Tails?
              Heads   Tails
Percentages   50%     50%
Proportions   .5      .5
15
Calculating the test statistic
Expected frequency: $f_e = pn$
n = 50 (sample size)
$f_e = .5 \times 50 = 25$
Expected      Heads   Tails
Proportions   .5      .5
Frequencies   25      25
16
Calculating the test statistic
Question: The last five flips were tails. What do you predict for the next flip?
            Heads   Tails
Observed    35      15
Expected    25      25
17
Calculating the test statistic
            Heads   Tails
Observed    35      15
Expected    25      25
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$
Steps
Find the difference between $f_o$ and $f_e$ for each category
Square the difference
Divide the squared difference by $f_e$
Sum the values from all categories
18
                          Heads   Tails
Observed ($f_o$)          35      15
Expected ($f_e$)          25      25
$f_o - f_e$               10      -10
$(f_o - f_e)^2$           100     100
$(f_o - f_e)^2 / f_e$     4       4
$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 4 + 4 = 8$
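The hand calculation for the coin-flip example can be reproduced in a few lines. This is a minimal sketch; the function name `goodness_of_fit` is a hypothetical helper, while the counts (35 heads, 15 tails) and the null proportions (.5, .5) come from the slides.

```python
def goodness_of_fit(observed, proportions):
    """Return (expected frequencies, chi-square) for a goodness-of-fit test.

    observed    -- observed count per category (fo)
    proportions -- proportion per category under the null hypothesis
    """
    n = sum(observed)
    expected = [p * n for p in proportions]  # fe = p * n
    chi_square = sum(
        (fo - fe) ** 2 / fe for fo, fe in zip(observed, expected)
    )
    return expected, chi_square

expected, chi_square = goodness_of_fit([35, 15], [0.5, 0.5])
print(expected, chi_square)  # [25.0, 25.0] 8.0
```

Each category contributes (10)^2 / 25 = 4 to the total, which is where the 4 + 4 = 8 in the table comes from.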
19
Chi square distribution
[Figure: chi-square distribution curve; x-axis: $\chi^2$, from low chi square to high chi square, with the critical range marked]
20
Critical values for chi square distribution
Table B.8
21
Test for goodness of fit
The greater the number of categories, the greater the likelihood of a large observed chi square value.
Degrees of freedom (df): the number of values that are free to vary.
df = C - 1
C = the number of categories
22
Chi square distribution
23
Critical values for chi square distribution
24
Chi square distribution
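[Figure: chi-square distribution; x-axis: $\chi^2$; the upper 5% of the distribution lies beyond 3.84]
Critical value (df = 1, $\alpha$ = .05) = 3.84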
25
Goodness of fit Make a decision
Critical value = 3.84 with df = 1 and $\alpha$ = .05.
Observed chi square = 8.0
8.0 > 3.84, so the observed chi square is greater than the critical value.
We reject the null hypothesis.
Conclude that the category frequencies are different.
People were more likely to predict heads than tails.
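The decision step can also be expressed as a p-value. The sketch below relies on the fact that, for df = 1 only, chi-square is the square of a standard normal value, so the upper-tail probability can be written with the complementary error function from Python's `math` module. The helper name `p_value_df1` is an assumption for illustration and does not apply to other degrees of freedom.

```python
import math

def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square value with 1 degree of freedom.

    Valid for df = 1 only: P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi_square / 2.0))

alpha = 0.05
critical_value = 3.84       # tabled value for df = 1, alpha = .05
observed_chi_square = 8.0   # from the coin-flip example

reject_null = observed_chi_square > critical_value
p = p_value_df1(observed_chi_square)
print(f"reject H0: {reject_null}, p = {p:.4f}")
```

Comparing the statistic with the critical value and comparing p with alpha are two views of the same decision: the p-value here is below .05, consistent with rejecting the null hypothesis.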