Chi-Square Goodness of Fit Test
By: Laura Hasley
Statistics Gr. 12

Today, we will be learning about the chi-square goodness of fit test. This test is used to check how closely an observed set of data matches the expected values.
What is χ²?
Test statistic
Sum of relative differences: (O - E)²/E
Always positive (or zero for a perfect match)
χ² and p-values
Distribution

χ² is the test statistic we are solving for when conducting this test. It is the sum of the relative differences between the observed data and the expected data. To find each relative difference, compute (observed - expected)²/expected for that category. Adding all of those values up gives the value of χ², that is, χ² = Σ (O - E)²/E. χ² is never negative because finding each relative difference involves squaring a value, which removes any negative signs. As the χ² value increases, the p-value becomes smaller. This is because a larger difference between the observed and expected counts is less likely to occur by chance if the expected distribution is accurate. Because χ² cannot fall below zero, its distribution is always right-skewed. Remember that when determining skew, you look at the tail of the curve, not the hump.
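To make the formula concrete, here is a minimal Python sketch of the calculation described above. The function name chi_square_statistic and the counts are made up for illustration; they are not from the lesson.

```python
def chi_square_statistic(observed, expected):
    """Sum the relative differences (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need one count per category")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts for three categories (illustration only).
observed = [25, 25, 50]
expected = [30, 20, 50]

chi_sq = chi_square_statistic(observed, expected)
print(round(chi_sq, 4))  # 25/30 + 25/20 + 0/50, about 2.0833
```

Notice that the third category contributes nothing because its observed and expected counts match exactly, while the two mismatched categories each add a positive amount.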
Running the test
1. Name the test
2. State null hypothesis
3. State alternative hypothesis
4. Significance level

In running the chi-square goodness of fit test, there are nine steps to follow. The first is to name the test you are using, so all you do is write down "chi-square goodness of fit test." Next, you need to state the null hypothesis. For this test, the null hypothesis always follows this format: the distribution of whatever you are testing matches the distribution provided by the company. The symbol to use when writing out this hypothesis is H0. After that, you must state the alternative hypothesis. This hypothesis always follows this format: the distribution of whatever you are testing does not match the distribution provided by the company. The symbol to use when writing out this hypothesis is HA. The fourth step is to state the significance level, which will always be provided for you in the question. The symbol for the significance level is α.
Running the test (cont.)
5. Conditions: random sample (RS), expected count > 5
6. Sketch the distribution
7. Calculate χ² and df
8. Find p-value

Next comes checking the conditions for this test. First, we have to make sure the sample is a random one. Next, we have to make sure the expected count is greater than five for each category. To find each expected count, multiply the total number in the random sample by the percentage given for that category. After the conditions have been met, you have to sketch the curve. A simple curve with axes will suffice, like the one pictured here. However, make sure you add a domain and label where χ² would fall. Now is the time to calculate χ² and the degrees of freedom for the test. Remember, to find χ² you sum all of the relative differences, and to find the degrees of freedom you take the number of categories minus 1. Once those are found, you can use your TI-83 or TI-84 to find the p-value.
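Steps 5, 7 and 8 can be sketched in Python as a cross-check on the calculator work. This is only an illustration under stated assumptions: the sample size, claimed proportions and observed counts are invented, the function names are hypothetical, and the p-value function uses the standard closed-form upper-tail formula for a whole-number df in place of the TI-83/84 step.

```python
import math


def expected_counts(sample_size, claimed_proportions):
    """Expected count per category: sample size times the claimed proportion."""
    return [sample_size * p for p in claimed_proportions]


def chi_square_p_value(chi_sq, df):
    """Area to the right of chi_sq under the chi-square curve (whole-number df)."""
    if df < 1:
        raise ValueError("df must be at least 1")
    if chi_sq <= 0:
        return 1.0
    y = chi_sq / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * (1 + y + y^2/2! + ... + y^(df/2 - 1)/(df/2 - 1)!)
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= y / k
            total += term
        return math.exp(-y) * total
    # Odd df: start from the df = 1 case, erfc(sqrt(y)), and add correction terms.
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += y ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(y)) + math.exp(-y) * total


# Made-up example: a company claims a 30% / 20% / 50% split; we sample 100 items.
observed = [25, 25, 50]
expected = expected_counts(100, [0.30, 0.20, 0.50])

# Step 5: every expected count must be greater than 5.
conditions_met = all(e > 5 for e in expected)

# Step 7: chi-square statistic and degrees of freedom (categories - 1).
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Step 8: p-value.
p_value = chi_square_p_value(chi_sq, df)
print(conditions_met, round(chi_sq, 3), df, round(p_value, 3))
```

For this invented data, the relative differences are small, so the p-value comes out fairly large (roughly 0.35).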
Interpreting p-value
P-value < significance level
P-value > significance level

After finding the p-value, one of two conclusions can be drawn. If the p-value is less than the significance level, you can reject H0, and you have evidence to suggest that the proportions provided by the company aren't true. However, if the p-value is greater than the significance level, you fail to reject H0. You never want to say you accept H0. Since you can't reject H0, you do not have evidence to suggest that the proportions provided by the company aren't true.
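The decision rule above can be written as a tiny Python helper. The function name interpret_p_value and the example values are hypothetical, chosen only to show both outcomes.

```python
def interpret_p_value(p_value, alpha):
    """Return the conclusion for a goodness of fit test at significance level alpha."""
    if p_value < alpha:
        return ("Reject H0: there is evidence to suggest the proportions "
                "provided by the company aren't true.")
    # We never "accept" H0; we only fail to reject it.
    return ("Fail to reject H0: there is not evidence to suggest the proportions "
            "provided by the company aren't true.")


# Made-up p-values compared against a 0.05 significance level.
print(interpret_p_value(0.01, 0.05))
print(interpret_p_value(0.35, 0.05))
```

The wording of the second message matters: it reports a lack of evidence against the company's proportions, not proof that they are correct.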
Different χ² tests
Goodness of Fit: relationship between observed and expected values; proportions
Test of Independence: tests to see if two data factors are independent; contingency table

Who can tell me the other type of χ² test we've already studied? Good, we've already looked at the χ² test of independence. Now, some students tend to get these confused because they both deal with χ², so here are some ways to differentiate between the two tests. First, the tests have different purposes. The goodness of fit test checks to see if provided proportions are really what companies say they are, and the test of independence checks to see if two factors are independent of one another. Next, probably the easiest way to decide which test to use is to look at the data provided. A goodness of fit test will give you proportions, while the test of independence will give you a contingency table.
Bibliography
Graph on slide 2: https://onlinecourses.science.psu.edu/stat100/node/32
H0 and HA symbols on slide 3: http://www.psychology.emory.edu/clinical/bliwise/Tutorials/SPOWER/spowhtest.htm
Alpha symbol on slide 3: http://blog.minitab.com/blog/michelle-paret/alphas-p-values-confidence-intervals-oh-my
Graph on slide 4: http://moore7oddanswers.wikispaces.com/++++12++++.++++11
Rejected picture on slide 5: http://www.jasonshen.com/2010/the-rejection-therapy-challenge-week-1/
Accepted picture on slide 5: http://depositphotos.com/3654522/stock-photo-Rejected-and-accepted-stamps.html
Red circle with slash through it on slide 5: Microsoft Office Word