Published by Dana Gaines. Modified over 8 years ago.
1
Chapter 11: Categorical Data
- A chi-square goodness of fit test allows us to examine the distribution of a single categorical variable in one population.
- A chi-square test for homogeneity allows us to examine how the distribution of one categorical variable differs across two or more populations.
2
Different Chi-square Tests
- Goodness of fit: one variable in one population.
- Homogeneity: one variable in two or more populations (groups).
- Association/independence: two variables in one population.
3
Chi-square Goodness of Fit
- In the test, we compare the observed counts with the counts that would be expected if Ho is true.
- The larger the differences between observed and expected counts, the stronger the evidence against the null hypothesis.
- The chi-square statistic measures how far the observed counts are from the expected counts.
4
Chi-square Goodness of Fit
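The statistic is conventionally written as chi-square = sum over all categories of (observed - expected)^2 / expected. A minimal Python sketch of that calculation follows; the function name and the die-roll counts are made up for illustration and do not come from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 60 rolls of a die; Ho says each face is equally likely,
# so every expected count is 60 / 6 = 10.
observed = [8, 12, 9, 11, 13, 7]
expected = [10] * 6
statistic = chi_square_statistic(observed, expected)
print(statistic)
```

Larger values of the statistic mean the observed counts sit farther from what Ho predicts.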
5
Chi-square Distributions
The chi-square distributions are a family of distributions that take only positive values and are skewed to the right. A particular chi-square distribution is specified by giving its degrees of freedom.
6
Chi-square Distributions
The chi-square goodness of fit test uses the chi-square distribution with degrees of freedom equal to the number of categories minus 1. As with t distributions, we use a table to find the P-value. The mean of a particular chi-square distribution equals its degrees of freedom.
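Instead of a table, the P-value (the area to the right of the statistic) can be computed directly. The sketch below is one possible standard-library implementation, using the series for the regularized lower incomplete gamma function; `chi_square_sf` is an illustrative name, the statistic value is hypothetical, and in practice a library routine such as `scipy.stats.chi2.sf` would normally be used instead.

```python
import math


def chi_square_sf(x, df):
    """Upper-tail area P(X >= x) for a chi-square distribution with df degrees of freedom.

    Series expansion of the regularized lower incomplete gamma function.
    Adequate for classroom-sized statistics; precision drops for very large x.
    """
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# Hypothetical goodness of fit test with 6 categories, so df = 6 - 1 = 5.
number_of_categories = 6
df = number_of_categories - 1
p_value = chi_square_sf(2.8, df)
print(df, p_value)
```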
7
Chi-square Distributions
For df > 2, the mode (peak) of the chi-square density curve is at df - 2. For example, when df = 8, the chi-square distribution has a mean of 8 and a mode of 6.
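The mean and mode for df = 8 can be checked numerically. This is a rough sketch that evaluates the chi-square density on a grid; the function name, grid spacing, and upper limit of 100 are arbitrary choices made for illustration.

```python
import math


def chi_square_pdf(x, df):
    """Chi-square density at x for df degrees of freedom."""
    if x <= 0:
        return 0.0
    k = df / 2.0
    return math.exp((k - 1) * math.log(x) - x / 2 - k * math.log(2) - math.lgamma(k))


df = 8
step = 0.01
grid = [i * step for i in range(1, 10001)]  # 0.01, 0.02, ..., 100.0

# Mode: the grid point where the density is highest.
mode = max(grid, key=lambda x: chi_square_pdf(x, df))

# Mean: Riemann-sum approximation of the integral of x * f(x).
mean = sum(x * chi_square_pdf(x, df) for x in grid) * step

print(round(mode, 2), round(mean, 2))
```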
8
Chi-square Goodness of Fit Test
All expected counts must be at least 5. This Large Sample Size condition takes the place of the Normal condition for z and t procedures. We must also check that the Random and Independent conditions are met.
9
Chi-square Goodness of Fit Test
The chi-square test statistic compares observed and expected counts. Do not try to perform the calculations with observed and expected proportions. When checking the Large Sample Size condition, examine the expected counts, not the observed counts. In this test, the null hypothesis Ho is that a categorical variable has a specified distribution.
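To illustrate working with counts instead of proportions, the sketch below turns the proportions specified by Ho into expected counts (sample size times proportion) and then checks the Large Sample Size condition on those expected counts. The function names and the proportions are hypothetical.

```python
def expected_counts(hypothesized_proportions, sample_size):
    """Expected count for each category: sample size times the Ho proportion."""
    if abs(sum(hypothesized_proportions) - 1.0) > 1e-9:
        raise ValueError("hypothesized proportions must sum to 1")
    return [sample_size * p for p in hypothesized_proportions]


def large_sample_condition_met(expected, minimum=5):
    """True when every expected count is at least the minimum (5 by default)."""
    return all(e >= minimum for e in expected)


# Hypothetical Ho distribution over four categories.
proportions = [0.5, 0.3, 0.15, 0.05]

expected_small = expected_counts(proportions, 60)   # last category expects about 3
expected_large = expected_counts(proportions, 200)  # last category expects about 10

print(large_sample_condition_met(expected_small))
print(large_sample_condition_met(expected_large))
```

With the same proportions, the condition fails at n = 60 and holds at n = 200, which is why it is checked on expected counts for the actual sample size.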
10
Chi-square Goodness of Fit Test
If the test finds a statistically significant result, do a follow-up analysis that compares the observed and expected counts and looks for the largest components of the chi-square statistic.
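A follow-up analysis of this kind can be sketched as below: each category contributes (observed - expected)^2 / expected to the statistic, and the largest contribution shows where the data depart most from Ho. The category labels and counts are invented for illustration.

```python
def chi_square_components(observed, expected):
    """Per-category contributions (observed - expected)^2 / expected."""
    return [(o - e) ** 2 / e for o, e in zip(observed, expected)]


# Hypothetical counts for four categories.
categories = ["A", "B", "C", "D"]
observed = [55, 30, 15, 20]
expected = [40, 40, 20, 20]

components = chi_square_components(observed, expected)
largest_category, largest_component = max(
    zip(categories, components), key=lambda pair: pair[1]
)

for name, o, e, c in zip(categories, observed, expected, components):
    direction = "above" if o > e else "below" if o < e else "equal to"
    print(f"{name}: component {c:.3f}, observed {direction} expected")
```

The components add up to the chi-square statistic, so a single large component can account for most of a significant result.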
11
Chi-square Goodness of Fit Test
13
Inference for Relationships
- Two-way tables can describe relationships between two categorical variables.
- We can test whether the distribution of a categorical variable is the same for each of several populations or treatments.
- We can also test whether there is an association between the row and column variables in a two-way table.
14
Inference for Relationships
- Multiple comparisons usually have two parts:
  - An overall test to see whether there is good evidence of any differences among the parameters that we want to compare.
  - A detailed follow-up analysis to decide which of the parameters differ and to estimate how large the differences are.
15
Inference for Relationships
16
Chi-square Test for Homogeneity
- This test is also known as a chi-square test for homogeneity of proportions.
- It is used when comparing two or more populations or treatments.
- Ho: The true category proportions are the same for all the populations or treatments (homogeneity of populations or treatments).
17
Chi-square Test for Homogeneity
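For a two-way table, the usual textbook calculation takes each expected count as (row total × column total) / table total and uses df = (rows - 1) × (columns - 1). The sketch below assumes that standard form; the function name and the counts are hypothetical.

```python
def two_way_chi_square(table):
    """Return (statistic, df, expected) for a two-way table of observed counts.

    Rows are the populations or treatments; columns are the categories.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


# Hypothetical counts: two treatment groups (rows), two outcome categories (columns).
table = [
    [30, 20],
    [20, 30],
]
statistic, df, expected = two_way_chi_square(table)
print(statistic, df, expected)
```

The same arithmetic applies to the test for association/independence; the two tests differ in how the data were collected and in how the hypotheses are stated.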
18
Chi-square: Association/Independence
19
Ho: There is no association between two categorical variables in the population of interest.
Ha: There is an association between two categorical variables in the population of interest.
20
Chi-square: Association/Independence
Ho: Two categorical variables are independent in the population of interest.
Ha: Two categorical variables are not independent in the population of interest.
Conditions: Random; Large Sample Size (all expected counts at least 5); and Independent (when sampling without replacement, check that the population is at least 10 times as large as the sample).
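The two numerical conditions can be checked mechanically, as in the sketch below. The function names, population size, sample size, and expected counts are all hypothetical; the Random condition depends on how the data were collected and cannot be checked from the numbers.

```python
def ten_percent_condition_met(population_size, sample_size):
    """True when the population is at least 10 times as large as the sample."""
    return population_size >= 10 * sample_size


def all_expected_counts_at_least(expected_table, minimum=5):
    """True when every expected count in the two-way table is at least the minimum."""
    return all(e >= minimum for row in expected_table for e in row)


# Hypothetical study: a sample of 200 drawn without replacement from 5000 individuals.
population_size = 5000
sample_size = 200
expected_table = [
    [48.0, 72.0],
    [32.0, 48.0],
]

independent_ok = ten_percent_condition_met(population_size, sample_size)
large_sample_ok = all_expected_counts_at_least(expected_table)
print(independent_ok, large_sample_ok)
```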
21
Cautions & Limitations
- Don’t confuse tests for homogeneity with tests for independence. Tests for homogeneity are used when individuals in each of two or more independent samples are classified according to a single categorical variable.
22
Cautions & Limitations
- Tests for independence are used when individuals in a single sample are classified according to two categorical variables.
- As in other hypothesis tests, we can never say that we have strong support for the null hypothesis. If we do not reject Ho for independence, we can’t conclude that the variables are independent.
23
Cautions & Limitations
- Make sure that your assumptions are reasonable. P-values for chi-square tests are only approximate.
- In the chi-square test for homogeneity, the assumption of independent samples is particularly important.
24
Cautions & Limitations
- Don’t jump to conclusions about causation. Just as a strong correlation between two numerical variables does not mean that there is a cause-and-effect relationship between them, an association between two categorical variables doesn’t imply a causal relationship.