Chi-Square Goodness of Fit

3/30/2018

Using Chi-Square
In this chapter, we are going to do significance tests for categorical variables, unlike the past few chapters, which used quantitative variables. Some things about this chapter are easier: there are no confidence intervals, and the null and alternative hypotheses are always the same. Other things are potentially difficult: we are going to learn 3 different types of Chi-Square tests, and you have to know which one to use.

Chi-Square Goodness of Fit Test
For today, we are only going to learn one of the types of Chi-Square test: the Chi-Square Goodness of Fit test. What does the Chi-Square Goodness of Fit Test do? It tells us whether a certain result is consistent with a hypothetical result (same as other significance tests we’ve done).

For Example
Mars, Inc. makes M&M’s. The company claims that the distribution of colors is as follows: 13% brown, 13% red, 14% yellow, 16% green, 20% orange, and 24% blue. We take a random sample of 60 M&M’s (one bag) and count how many of each color we get. Is this sample consistent with the company’s claims? If not, maybe we shouldn’t believe the company…


With what we have already learned, we could do a one-proportion Z-test for each different color. But this has some potential problems. First, it is more work: imagine a variable with 35 categories, where you would have to do 35 significance tests. Second, what if one category rejects the null and another doesn’t? You probably want to draw a conclusion about the company’s claims overall, not about each color individually. Instead, we can do a single Chi-Square Goodness of Fit test, which considers all categories at the same time.

Hypotheses
One nice thing about Chi-Square Goodness of Fit tests: we typically state our hypotheses just with words, rather than symbols.

Hypotheses
We could write our hypotheses with symbols if we wanted to, but this is not as clear/intuitive to most people. I would recommend using words.
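For the M&M example, a symbolic version might look like the sketch below. The subscripted p's are assumed notation for the true proportion of each color; they are not taken from the slides.

```latex
\begin{aligned}
H_0 &: p_{\text{brown}} = 0.13,\; p_{\text{red}} = 0.13,\; p_{\text{yellow}} = 0.14,\;
       p_{\text{green}} = 0.16,\; p_{\text{orange}} = 0.20,\; p_{\text{blue}} = 0.24 \\
H_a &: \text{at least one of these proportions is different}
\end{aligned}
```

In words: the null says the company’s claimed color distribution is correct, and the alternative says it is not.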

Observed vs. Expected

Back to our Example
Claimed proportions: 13% brown, 13% red, 14% yellow, 16% green, 20% orange, and 24% blue. Calculate the expected number of each color in a sample of 60 M&M’s.
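Each expected count is just the sample size times the claimed proportion. Here is a minimal Python sketch of that arithmetic; the variable names are illustrative only.

```python
# Claimed color proportions from the company (as listed on the slide).
claimed = {
    "brown": 0.13,
    "red": 0.13,
    "yellow": 0.14,
    "green": 0.16,
    "orange": 0.20,
    "blue": 0.24,
}
n = 60  # sample size: one bag of M&M's

# Expected count = sample size * claimed proportion.
expected = {color: n * p for color, p in claimed.items()}

for color, count in expected.items():
    print(f"{color:>6}: {count:.1f}")

# The expected counts should add back up to the sample size.
total_expected = sum(expected.values())
```

This gives 7.8 brown, 7.8 red, 8.4 yellow, 9.6 green, 12.0 orange, and 14.4 blue. Expected counts do not have to be whole numbers.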

Finding the Test Statistic
From these observed and expected values, we can calculate a test statistic (just like Z or T in previous significance tests). In this case, our test statistic is Chi-Square.
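For reference, the standard chi-square statistic adds up, over every category, the squared difference between the observed and expected counts divided by the expected count. The symbols O_i and E_i (observed and expected count in category i) are notation assumed here.

```latex
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
```

A category whose observed count is far from its expected count adds a large term to the sum, so a large χ² means the sample looks unlike the claimed distribution.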

Finding the Test Statistic
Let’s try it for our example

Chi-Square Distribution

Using the Test Statistic

We want to find out the probability of getting a result at least as extreme as the one we got. You could use Table C (feel free to try it), but I don’t particularly like Table C. We can use χ²cdf instead, just like we used normalcdf for a normal sampling distribution and tcdf for a T sampling distribution.


In our example, χ² = 10.18, so we enter the following into χ²cdf. Lower bound: 10.18. Upper bound: a big number. df: 5.

Result: what does this tell us?
This is our p-value. There is about a 7% chance of getting a result at least as extreme as the one we got, IF THE NULL HYPOTHESIS WERE TRUE. Using our standard cutoff of .05 for statistical significance, we would say that this is NOT a statistically significant difference. We fail to reject the null. We cannot conclude that the company’s claim is incorrect.
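As a cross-check on the calculator, the upper-tail area can be computed in plain Python. This is a sketch, not the calculator’s method: `chi2_upper_tail` is a hypothetical helper that uses the closed-form expressions for the chi-square tail with whole-number degrees of freedom.

```python
import math

def chi2_upper_tail(x, df):
    """Return P(chi-square with `df` degrees of freedom >= x)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: e^(-x/2) * sum over k = 0 .. df/2 - 1 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * e^(-x/2) * (1 + x/3 + x^2/15 + ...)
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    tail = math.sqrt(2 * x / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + tail

# M&M example: test statistic 10.18 with 6 colors - 1 = 5 degrees of freedom.
p_value = chi2_upper_tail(10.18, 5)
print(f"p-value is about {p_value:.2f}")
```

The result is about 0.07, matching the 7% chance above. If SciPy is installed, `scipy.stats.chi2.sf(10.18, 5)` should give the same tail area.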

Calculator
Unlike previous tests, we HAVE to enter our data into STAT-Edit before doing the test (and the older TI-83’s don’t even have this test), so it isn’t quite as convenient. If you want to try it, go to STAT, then TESTS, and then GOF-Test. You have to enter all of your observed counts in L1 and all of your expected counts in L2. On the plus side, you don’t have to calculate the test statistic by hand. Particularly as you get more categories, this may or may not be worth it.

Go ahead and give it a try. Do we get the same p-value?
P.S. make sure you use counts, not proportions

Conditions
Note: for the large sample size condition, use EXPECTED counts, not observed.
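A quick way to check the large sample size condition is sketched below for the M&M example. The slide does not state a threshold; the value of 5 used here is the commonly taught rule of thumb and should be treated as an assumption.

```python
# Claimed proportions and sample size from the M&M example.
claimed = [0.13, 0.13, 0.14, 0.16, 0.20, 0.24]
n = 60

MIN_EXPECTED = 5  # assumed rule-of-thumb threshold, not stated on the slide

# The condition is checked on EXPECTED counts, not observed counts.
expected = [n * p for p in claimed]
condition_met = all(e >= MIN_EXPECTED for e in expected)

print("smallest expected count:", round(min(expected), 1))
print("large sample size condition met:", condition_met)
```

Here the smallest expected count is 7.8, so the condition holds under that assumed threshold.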

An Example Problem

An Example Problem p-value of No, we fail to reject the null hypothesis that there is no difference, and we cannot conclude that births have different likelihoods on different days of the week.


Another Example
When creating a new form of tobacco plant, biologists expect the colors of the offspring to be distributed as follows: 25% green, 25% albino (white), and 50% yellow. Of 84 plants, 23 were green, 50 were yellow, and 11 were albino. Were the results consistent with the researchers’ expectations? NO (p-value of .039).
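The tobacco numbers can be worked through in a few lines of Python. This is a sketch with illustrative variable names; the p-value line uses a shortcut that is only valid when there are exactly 2 degrees of freedom.

```python
import math

# Observed counts and expected proportions from the tobacco example.
categories = ["green", "albino", "yellow"]
observed = [23, 11, 50]
proportions = [0.25, 0.25, 0.50]

n = sum(observed)                        # 84 plants
expected = [n * p for p in proportions]  # 21, 21, 42

# Chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(categories) - 1                 # 3 categories - 1 = 2

# For df = 2 only, the upper-tail area simplifies to e^(-chi_square / 2).
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.3f}")
```

The statistic comes out to about 6.476 and the p-value to about .039, which is below .05, so we reject the null.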

Follow-up Analysis
This is ONLY necessary on an AP Test if they specifically ask for it. As far as I can tell, this has only happened once, so being able to do the test is WAY more important than the follow-up analysis. If we reject the null, we conclude that the expected distribution is incorrect. However, we don’t know which categories contribute the most to that. See next slide.

So (even though we didn’t reject the null here), the category contributing the most towards rejecting is yellow, because of its contribution of 5.186. In fact, more than half of the total distance is due to the difference in yellow.

An Example
The # of students with each letter grade in AP Statistics is listed below. If the school wanted 20% A’s, 30% B’s, 30% C’s, 15% D’s, and 5% F’s, would Mr. Wetherbee get in trouble? If yes, which letter is contributing most to that?

Grade        A     B     C     D     F
# Students   10    9     9     2     5

                          A       B       C       D       F
Observed                  10      9       9       2       5
Expected                  7       10.5    10.5    5.25    1.75
Chi-Square Contribution   1.286   .214    .214    2.012   6.036

Total χ² = 9.762
P-value: .045
Yes, Mr. Wetherbee would be in trouble. The # of F’s contributes more than 60% of this distance, so this would primarily be due to the high # of F’s. The second largest contribution would be from the lower-than-expected number of D’s.
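The follow-up analysis for the grades example can be sketched in Python as below. Two assumptions: the count of 9 C’s is inferred from the total of 9.762 and from the expected counts (which imply 35 students), and the p-value line uses a shortcut that only works for 4 degrees of freedom.

```python
import math

grades = ["A", "B", "C", "D", "F"]
observed = [10, 9, 9, 2, 5]                # C = 9 is inferred, see note above
target = [0.20, 0.30, 0.30, 0.15, 0.05]    # distribution the school wanted

n = sum(observed)                          # 35 students
expected = [n * p for p in target]

# Each category's contribution to chi-square: (observed - expected)^2 / expected.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

# For df = 4 only, the upper-tail area is e^(-x/2) * (1 + x/2).
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)

# Follow-up: which grade contributes the most, and what share of the total?
largest = grades[contributions.index(max(contributions))]
share = max(contributions) / chi_square

for g, c in zip(grades, contributions):
    print(f"{g}: {c:.3f}")
print(f"chi-square = {chi_square:.3f}, p-value = {p_value:.3f}")
print(f"largest contribution: {largest} ({share:.0%} of the total)")
```

This reproduces the contributions in the table, a total of about 9.762, and a p-value of about .045, with F accounting for roughly 62% of the total.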
