GOODNESS OF FIT TEST & CONTINGENCY TABLE CHAPTER 6 BCT2053
Contents 6.1 Introduction 6.2 Goodness of Fit Test 6.3 Contingency Table
6.1 Introduction Lesson outcomes: By the end of this topic you should be able to: Identify the main situations in which the chi-square distribution and its significance tests are used.
When to use the Chi-Square Distribution
- Find a confidence interval for a variance or standard deviation
- Test a hypothesis about a single variance or standard deviation
- Test hypotheses concerning frequency distributions (Goodness of Fit)
- Test the independence of two variables (Contingency Table)
- Test the homogeneity of proportions (Contingency Table)
6.2 Goodness of Fit Test Lesson outcomes: By the end of this topic you should be able to: Test a distribution for goodness of fit using Chi-square
When to use Chi-Square Goodness of fit test When you have some practical data and you want to know how well a particular statistical distribution, such as binomial or normal, models the data. Example: To meet customer demands, a manufacturer of running shoes may wish to see whether buyers show a preference for a specific style. If there were no preference, one would expect each style to be selected with equal frequency.
Null and Alternative Hypotheses
H0 : There is no difference or no change. Example: Buyers show no preference for a specific style.
H1 : There is a difference or change. Example: Buyers show a preference for a specific style.
Formula and Assumptions
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where O = observed frequency and E = expected frequency (when no preference is assumed, E = sample size / number of categories), with degrees of freedom equal to the number of categories minus 1.
Assumptions
- The data are obtained from a random sample.
- The expected frequency for each category must be 5 or more.
Procedure
1. State the hypotheses and identify the claim.
2. Compute the test value.
3. Find the critical value. The test is always right-tailed, since the differences O - E are squared and every term is therefore non-negative.
4. Make the decision: reject H0 if $\chi^2 > \chi^2_{\alpha,\,k-1}$, where k is the number of categories.
5. Summarize the result.
Why this test is called goodness of fit
If the observed values and expected values are plotted together, one can see whether they are close together or far apart.
When observed values and expected values are close together, the chi-square test value will be small. The decision is to not reject H0. Hence there is a "good fit".
When observed values and expected values are far apart, the chi-square test value will be large. The decision is to reject H0 in favour of H1. Hence there is "not a good fit".
Example 1 : goodness of fit test
A market analyst wished to see whether consumers have any preference among five flavors of a new fruit soda. A sample of 100 people provided these data. Is there enough evidence to reject the claim that there is no preference in the selection of fruit soda flavors at the 0.05 significance level?
Flavor      Cherry   Strawberry   Orange   Lime   Grape
Frequency   32       28           16       14     10
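The calculation for Example 1 can be sketched in a few lines of Python. This is only an illustration of the procedure, not part of the original slides. The critical value 9.488 is an assumption taken from a standard chi-square table (α = 0.05, 4 degrees of freedom), and the variable names are chosen for this sketch.

```python
# Example 1: goodness-of-fit test for "no preference" among five flavors.
observed = {"Cherry": 32, "Strawberry": 28, "Orange": 16, "Lime": 14, "Grape": 10}

n = sum(observed.values())   # sample size (100)
k = len(observed)            # number of categories (5)
expected = n / k             # equal expected frequency under H0 (20)

# Test value: sum of (O - E)^2 / E over all categories.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
df = k - 1

# Assumption: critical value read from a chi-square table (alpha = 0.05, df = 4).
critical_value = 9.488

reject_h0 = chi_square > critical_value
print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject_h0}")
```

Since the test value (18.0) exceeds the critical value, H0 is rejected: there is enough evidence that consumers show a preference among the flavors.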
Other Applications
H0 : The stated distribution provides a model for the data. Example: state the claimed distribution.
H1 : It does not. Example: The distribution is not the same as stated in the null hypothesis.
Expected value (E) = given percentage × sample size
Example 2 : goodness of fit test The adviser of an ecology club at a college believes that the group consists of 10% freshmen, 20% sophomores, 40% juniors, and 30% seniors. The membership for the club this year consisted of 14 freshmen, 19 sophomores, 51 juniors, and 16 seniors. At α = 0.10, test the adviser's conjecture.
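When the claimed distribution is given as percentages, the expected frequencies come from E = given percentage × sample size. A minimal sketch for Example 2 follows. The critical value 6.251 is an assumption taken from a standard chi-square table (α = 0.10, 3 degrees of freedom), and the variable names are illustrative.

```python
# Example 2: goodness-of-fit test against claimed percentages.
groups = ["freshmen", "sophomores", "juniors", "seniors"]
claimed_percent = [10, 20, 40, 30]   # adviser's conjecture (H0)
observed = [14, 19, 51, 16]          # this year's membership

n = sum(observed)                                  # sample size (100)
expected = [p * n / 100 for p in claimed_percent]  # E = percentage x sample size

# Assumption check from the slides: every expected frequency is at least 5.
assert min(expected) >= 5

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(groups) - 1

# Assumption: critical value read from a chi-square table (alpha = 0.10, df = 3).
critical_value = 6.251

reject_h0 = chi_square > critical_value
print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject_h0}")
```

The test value is about 11.208, which is larger than 6.251, so the adviser's conjecture is rejected at α = 0.10.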
Example 3 : goodness of fit test According to a particular genetic theory, the colours (pink, white and blue) of a certain flower should appear in the ratio 3:2:5. In 200 randomly chosen plants, the corresponding numbers of each colour were 48, 28 and 124. Test, at the 1% significance level, whether the data are consistent with the theory.
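A worked calculation for Example 3 is sketched below. The ratio 3:2:5 is converted to expected frequencies by dividing the 200 plants in proportion 3/10, 2/10 and 5/10. The critical value 9.210 is an assumption taken from a standard chi-square table (α = 0.01, 2 degrees of freedom).

```latex
E_{\text{pink}} = 200 \times \tfrac{3}{10} = 60, \qquad
E_{\text{white}} = 200 \times \tfrac{2}{10} = 40, \qquad
E_{\text{blue}} = 200 \times \tfrac{5}{10} = 100

\chi^2 = \frac{(48 - 60)^2}{60} + \frac{(28 - 40)^2}{40} + \frac{(124 - 100)^2}{100}
       = 2.4 + 3.6 + 5.76 = 11.76

\text{df} = 3 - 1 = 2, \qquad \chi^2_{0.01,\,2} = 9.210, \qquad 11.76 > 9.210
```

Because the test value exceeds the critical value, H0 is rejected: at the 1% level the data do not support the 3:2:5 ratio.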
EXCEL application
Insert - Function - CHITEST(actual_range, expected_range), which returns the P-value of the test.
Reject H0 if P-value ≤ α
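The P-value decision rule can also be illustrated outside Excel. The sketch below is not the Excel implementation. It uses a closed form for the right-tail probability of a chi-square variable that holds only when the degrees of freedom are even, and applies it to the Example 1 data (df = 4). The function name `chi_square_p_value_even_df` is hypothetical and introduced only for this illustration.

```python
import math


def chi_square_p_value_even_df(statistic, df):
    """Right-tail P(X > statistic) for a chi-square variable with even df.

    Uses the identity P(X > x) = exp(-x/2) * sum_{j < df/2} (x/2)^j / j!,
    which is valid only for positive even degrees of freedom.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = statistic / 2
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms


# Example 1 data: five flavors, equal expected frequencies under H0.
observed = [32, 28, 16, 14, 10]
expected = [20, 20, 20, 20, 20]

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi_square_p_value_even_df(statistic, df=len(observed) - 1)

alpha = 0.05
reject_h0 = p_value <= alpha   # same rule as on the slide
print(f"P-value = {p_value:.6f}, reject H0: {reject_h0}")
```

The P-value is roughly 0.0012, well below α = 0.05, so the conclusion matches the critical-value approach.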
6.3 Contingency Table Lesson outcomes: By the end of this topic you should be able to: Test two variables for independence using Chi-square Test proportions for homogeneity using Chi-square
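The Chi-Square Independence Test
To test the independence of two variables:
H0 : The variables are independent (x has no relationship with y)
H1 : The variables are not independent (x has a relationship with y)
Reject H0 if $\chi^2 > \chi^2_{\alpha,\,(I-1)(J-1)}$, where
$\chi^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$ and $E_{ij} = \frac{(\text{row } i \text{ sum}) \times (\text{column } j \text{ sum})}{\text{grand total}}$
I = number of rows (x), J = number of columns (y), $O_{ij}$ = observed values, $E_{ij}$ = expected values.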
Example 4: Chi-Square Independence Test
The data below show the number of insomnia patients according to their smoking habit in Malaysia. At α = 0.01, can we say that insomnia is independent of smoking habit?
Habit          Smoking   Not smoking
Insomnia       20        40
Not insomnia   10        80
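A minimal sketch of the independence test for Example 4, computing each expected value as (row sum × column sum) / grand total. The critical value 6.635 is an assumption taken from a standard chi-square table (α = 0.01, 1 degree of freedom), and the variable names are illustrative.

```python
# Example 4: rows = insomnia / not insomnia, columns = smoking / not smoking.
observed = [
    [20, 40],
    [10, 80],
]

row_sums = [sum(row) for row in observed]         # [60, 90]
col_sums = [sum(col) for col in zip(*observed)]   # [30, 120]
grand_total = sum(row_sums)                       # 150

# Expected value for each cell: row sum x column sum / grand total.
expected = [[r * c / grand_total for c in col_sums] for r in row_sums]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(row_sums) - 1) * (len(col_sums) - 1)

# Assumption: critical value read from a chi-square table (alpha = 0.01, df = 1).
critical_value = 6.635

reject_h0 = chi_square > critical_value
print(f"chi-square = {chi_square:.3f}, df = {df}, reject H0: {reject_h0}")
```

The expected values are 12, 48, 18 and 72, giving a test value of about 11.111. This exceeds 6.635, so H0 is rejected: the data suggest insomnia is not independent of smoking habit.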
Example 5: Chi-Square Independence Test
A researcher wishes to see whether the way people obtain information is independent of their educational background. A survey of 400 high school and college graduates yielded the following information. At α = 0.05, can the researcher conclude that the way people obtain information is independent of their educational background?
              Television   Newspaper   Other sources
High School   159          90          51
College       27           42          31
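A worked calculation for Example 5 is sketched below, with values rounded to four decimal places. The critical value 5.991 is an assumption taken from a standard chi-square table (α = 0.05, 2 degrees of freedom).

```latex
\text{Row sums: } 300,\ 100 \qquad \text{Column sums: } 186,\ 132,\ 82 \qquad \text{Grand total: } 400

E_{11} = \tfrac{300 \times 186}{400} = 139.5, \quad
E_{12} = \tfrac{300 \times 132}{400} = 99, \quad
E_{13} = \tfrac{300 \times 82}{400} = 61.5

E_{21} = \tfrac{100 \times 186}{400} = 46.5, \quad
E_{22} = \tfrac{100 \times 132}{400} = 33, \quad
E_{23} = \tfrac{100 \times 82}{400} = 20.5

\chi^2 = \frac{(159 - 139.5)^2}{139.5} + \frac{(90 - 99)^2}{99} + \frac{(51 - 61.5)^2}{61.5}
       + \frac{(27 - 46.5)^2}{46.5} + \frac{(42 - 33)^2}{33} + \frac{(31 - 20.5)^2}{20.5}

\chi^2 \approx 2.7258 + 0.8182 + 1.7927 + 8.1774 + 2.4545 + 5.3780 \approx 21.347

\text{df} = (2 - 1)(3 - 1) = 2, \qquad \chi^2_{0.05,\,2} = 5.991, \qquad 21.347 > 5.991
```

Because the test value exceeds the critical value, H0 is rejected: the way people obtain information appears to be related to their educational background.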
Example 6 : Chi-Square Independence Test A sociologist wishes to see whether the number of years of college a person has completed is related to his or her place of residence. A sample of 88 people is selected and classified as shown. At the 0.05 significance level, can the sociologist conclude that the years of college education are dependent on the person's location? Location No college 4 year degree Advanced degree Urban 15 12 8 Suburban 9 Rural 6 7
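Test for Homogeneity of Proportions
Samples are selected from several different populations, and the researcher is interested in determining whether the proportion of elements that have a common characteristic is the same for each population.
H0 : $p_1 = p_2 = \dots = p_k$
H1 : At least one proportion is different from the others
Reject H0 if $\chi^2 > \chi^2_{\alpha,\,(I-1)(J-1)}$, where
$\chi^2 = \sum_{i=1}^{I} \sum_{j=1}^{J} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$ and $E_{ij} = \frac{(\text{row } i \text{ sum}) \times (\text{column } j \text{ sum})}{\text{grand total}}$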
Example 7 : Homogeneity Test for Proportions
A researcher selected a sample of 50 seniors from each of three area high schools and asked each senior, "Do you drive to school in a car owned by either you or your parent?" The data are shown in the table. At α = 0.05, test the claim that the proportion of students who drive their own or their parents' car is the same at all three schools.
       SCHOOL 1   SCHOOL 2   SCHOOL 3
Yes    18         22         16
No     32         28         34
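The homogeneity test uses the same computation as the independence test, so it can be sketched with one small helper. The function name `chi_square_for_table` is hypothetical, and the critical value 5.991 is an assumption taken from a standard chi-square table (α = 0.05, 2 degrees of freedom).

```python
def chi_square_for_table(observed):
    """Return (statistic, df, expected) for a two-way table of counts."""
    row_sums = [sum(row) for row in observed]
    col_sums = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_sums)
    # Expected value for each cell: row sum x column sum / grand total.
    expected = [[r * c / grand_total for c in col_sums] for r in row_sums]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_sums) - 1) * (len(col_sums) - 1)
    return statistic, df, expected


# Example 7: rows = Yes / No, columns = School 1, School 2, School 3.
drivers = [
    [18, 22, 16],
    [32, 28, 34],
]

statistic, df, expected = chi_square_for_table(drivers)

# Assumption: critical value read from a chi-square table (alpha = 0.05, df = 2).
critical_value = 5.991

reject_h0 = statistic > critical_value
print(f"chi-square = {statistic:.3f}, df = {df}, reject H0: {reject_h0}")
```

The test value is about 1.596, which is below 5.991, so H0 is not rejected: there is not enough evidence that the proportions differ between the three schools.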
SUMMARY
Three uses of the Chi-Square distribution were explained in this chapter:
- Test a distribution for goodness of fit using Chi-square
- Test two variables for independence using Chi-square
- Test proportions for homogeneity using Chi-square
The test is always a right-tailed test: reject H0 if the chi-square test value is greater than the critical value.
THE END