MATH 4030 – 11A ANALYSIS OF R X C TABLES (GOODNESS-OF-FIT TEST)


1 MATH 4030 – 11A ANALYSIS OF R X C TABLES (GOODNESS-OF-FIT TEST)

2 Example 1: Three shops are used to repair electric motors. 100 motors are sent to each shop. We need to compare the work done by the 3 shops.
- How many independent samples?
- What should we compare? Null hypothesis?
- How do we test? Sample statistic design?
- Distribution of statistic(s)?

                     Shop 1   Shop 2   Shop 3   Total
Repair Complete          78       56       54     188
Adjustment Needed        15       30       31      76
Repair Incomplete         7       14       15      36
Total                   100      100      100     300

3 Analysis of r × c Table (Sec. 10.4):

           C1     C2     ...    Cc     R Total
R1         X11    X12    ...    X1c
R2         X21    X22    ...    X2c
...        ...    ...    ...    ...
Rr         Xr1    Xr2    ...    Xrc
C Total                                N

Column totals: sample sizes. Row totals: total count for each outcome. N: grand total.

                     Shop 1   Shop 2   Shop 3   Total
Repair Complete          78       56       54     188
Adjustment Needed        15       30       31      76
Repair Incomplete         7       14       15      36
Total                   100      100      100     300

4
                     Shop 1   Shop 2   Shop 3   Total
Repair Complete          78       56       54     188
Adjustment Needed        15       30       31      76
Repair Incomplete         7       14       15      36
Total                   100      100      100     300

If there is no difference between shops, the proportions of “complete”, “adjustment”, and “incomplete” are the same for all shops.
Pooled proportions: 188/300 (complete), 76/300 (adjustment), 36/300 (incomplete).
Expected counts per shop of 100 motors: 188/300 × 100 = 62.7 and 36/300 × 100 = 12.
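The pooled-proportion reasoning on this slide can be sketched in a few lines of Python. This is only an illustration under the "no difference between shops" assumption; the variable names (`observed`, `shop_sizes`, `expected`) are made up for the example and do not come from the slides.

```python
# Observed counts from the repair-shop table (rows: outcome, columns: shop).
observed = {
    "Repair Complete":   [78, 56, 54],
    "Adjustment Needed": [15, 30, 31],
    "Repair Incomplete": [7, 14, 15],
}
shop_sizes = [100, 100, 100]      # fixed column totals (sample sizes)
grand_total = sum(shop_sizes)     # 300

# Under homogeneity, every shop shares the pooled proportion of each outcome,
# so the expected count is (pooled proportion) x (shop sample size).
expected = {}
for outcome, counts in observed.items():
    pooled = sum(counts) / grand_total
    expected[outcome] = [pooled * n for n in shop_sizes]

for outcome, row in expected.items():
    print(outcome, [round(e, 2) for e in row])
```

Because all three shops received 100 motors, each row of expected counts is constant across shops; with unequal sample sizes the same code would scale each column by its own size.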

5 1. Test for Homogeneity
c independent samples from c populations. Column totals are fixed (sample sizes).

           C1     C2     ...    Cc     R Total
R1         X11    X12    ...    X1c
R2         X21    X22    ...    X2c
...        ...    ...    ...    ...
Rr         Xr1    Xr2    ...    Xrc
C Total                                N

Test whether the outcome distributions are the same for all populations.

6 Example 2: To determine whether there really is a relationship (dependency) between a student’s lecture attendance and final exam mark, data are collected from 400 students.
- How many independent sample(s)?
- What should we compare? Null hypothesis?
- How do we test? Sample statistic design?
- Distribution of statistic(s)?

7
           C1     C2     ...    Cc     R Total
R1         X11    X12    ...    X1c
R2         X21    X22    ...    X2c
...        ...    ...    ...    ...
Rr         Xr1    Xr2    ...    Xrc
C Total                                N

Column totals: total count for each outcome of lecture attendance. Row totals: total count for each outcome of final exam marks. N: sample size.

8 If there is no relationship (dependence) between “lecture attendance” and “final exam performance”:
Row (final exam) proportions: 112/400, 167/400, 121/400.
Column (attendance) proportions: 60/400, 188/400, 152/400.
- Is 23/60 = 112/400? Or 23/400 = 112/400 × 60/400?
- Is 63/152 = 121/400? Or 63/400 = 121/400 × 152/400?
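The two questions on this slide compare an observed joint proportion with the product of its marginal proportions. A small Python sketch of that comparison follows, using only the counts quoted on the slide; the cell labels and the `cells`/`results` names are illustrative assumptions.

```python
N = 400  # sample size (grand total)

cells = [
    # (label, observed count, row total, column total)
    ("exam < 50%, attendance < 25%", 23, 112, 60),
    ("exam > 70%, attendance > 60%", 63, 121, 152),
]

results = []
for label, count, row_total, col_total in cells:
    joint = count / N                              # observed joint proportion
    product = (row_total / N) * (col_total / N)    # value implied by independence
    results.append((label, joint, product))
    print(f"{label}: observed {joint:.4f}, under independence {product:.4f}")
```

Multiplying the independence value by N gives the expected cell count, which is the same as (row total × column total) / N.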

9 2. Test for Independence
One sample, but two (categorical) measures (values) for each unit in the sample. Only the grand total is fixed (sample size).

           C1     C2     ...    Cc     R Total
R1         X11    X12    ...    X1c
R2         X21    X22    ...    X2c
...        ...    ...    ...    ...
Rr         Xr1    Xr2    ...    Xrc
C Total                                N

Test whether the two (categorical) variables/factors are independent.

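10 Chi-Square (Goodness of Fit) Test:
Observed cell frequencies: $O_{ij} = X_{ij}$
Expected cell frequencies: $E_{ij} = \dfrac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{N}$
Test statistic: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$, with $(r-1)(c-1)$ degrees of freedom.
Assumption: Large sample(s), so that all expected frequencies are at least 5. Otherwise, groups may be combined.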
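As a rough sketch of how the statistic on this slide could be computed for any r × c table, the Python below derives the expected frequencies from the row and column totals, sums the cell contributions, and checks the large-sample assumption. The function name `chi_square` is hypothetical; the demonstration table is the repair-shop data from Example 1.

```python
def chi_square(observed):
    """Return (expected, statistic, df) for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected frequency: (row total) * (column total) / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Sum of (O - E)^2 / E over all cells.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df


shops = [[78, 56, 54], [15, 30, 31], [7, 14, 15]]
expected, statistic, df = chi_square(shops)

# Large-sample assumption: every expected frequency should be at least 5.
large_enough = all(e >= 5 for row in expected for e in row)
print(df, round(statistic, 2), large_enough)
```

The same function serves both the homogeneity and the independence test, since the two differ in sampling design and interpretation rather than in the arithmetic.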

11 Example 1: Three shops are used to repair electric motors. 100 motors are sent to each shop.

Oij                  Shop 1   Shop 2   Shop 3   Total
Repair Complete          78       56       54     188
Adjustment Needed        15       30       31      76
Repair Incomplete         7       14       15      36
Total                   100      100      100     300

Eij                  Shop 1   Shop 2   Shop 3   Total
Repair Complete       62.67    62.67    62.67     188
Adjustment Needed     25.33    25.33    25.33      76
Repair Incomplete     12.00    12.00    12.00      36
Total                   100      100      100     300

12 Hypothesis Testing:

Oij                  Shop 1   Shop 2   Shop 3   Total
Repair Complete          78       56       54     188
Adjustment Needed        15       30       31      76
Repair Incomplete         7       14       15      36
Total                   100      100      100     300

Level of significance: α = 0.05 (right-tailed).
With df = (3 – 1)(3 – 1) = 4, the critical chi-squared value is 9.488.
The chi-squared value from the sample is approximately 15.17.
Conclusion: Since the chi-squared value from the sample data is greater than 9.488, we reject the null hypothesis. For at least one repair status, the proportions from the 3 shops are not all the same.
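For reference, the sample statistic can be worked out by hand from the observed counts and the expected frequencies 62.67, 25.33, and 12.00. The expected values are rounded to two decimals, so the row subtotals below are approximate.

```latex
\begin{aligned}
\chi^2 &= \frac{(78-62.67)^2}{62.67} + \frac{(56-62.67)^2}{62.67} + \frac{(54-62.67)^2}{62.67} \\
&\quad + \frac{(15-25.33)^2}{25.33} + \frac{(30-25.33)^2}{25.33} + \frac{(31-25.33)^2}{25.33} \\
&\quad + \frac{(7-12)^2}{12} + \frac{(14-12)^2}{12} + \frac{(15-12)^2}{12} \\
&\approx 5.66 + 6.34 + 3.17 \approx 15.17 > 9.488
\end{aligned}
```

The three subtotals correspond to the "Repair Complete", "Adjustment Needed", and "Repair Incomplete" rows; Shop 1 contributes the largest term in each row.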

13 Example 2: To determine whether there really is a relationship (dependency) between a student’s lecture attendance and final exam mark.

Oij                  < 25% attendance   25% - 60% attendance   > 60% attendance   Total
Final Exam < 50%                   23                     60                 29     112
Final Exam 51-70%                  28                     79                 60     167
Final Exam > 70%                    9                     49                 63     121
Total                              60                    188                152     400

Eij                  < 25% attendance   25% - 60% attendance   > 60% attendance   Total
Final Exam < 50%                16.80                  52.64              42.56     112
Final Exam 51-70%               25.05                  78.49              63.46     167
Final Exam > 70%                18.15                  56.87              45.98     121
Total                              60                    188                152     400

14 Hypothesis Testing:

Oij                  < 25% attendance   25% - 60% attendance   > 60% attendance   Total
Final Exam < 50%                   23                     60                 29     112
Final Exam 51-70%                  28                     79                 60     167
Final Exam > 70%                    9                     49                 63     121
Total                              60                    188                152     400

Level of significance: α = 0.05 (right-tailed).
With df = (3 – 1)(3 – 1) = 4, the critical chi-squared value is 9.488.
The chi-squared value from the sample is approximately 20.18.
Conclusion: Since the chi-squared value from the sample data is greater than 9.488, we reject the null hypothesis. Lecture attendance and final exam mark are not independent.
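To see where the evidence against independence comes from, one option is to look at each cell's contribution (O − E)²/E rather than only the total. The Python sketch below does this for the attendance table; the names `contributions` and `largest_cell` are illustrative, and the critical value 9.488 is the one quoted on the slide.

```python
observed = [
    [23, 60, 29],   # Final Exam < 50%
    [28, 79, 60],   # Final Exam 51-70%
    [9, 49, 63],    # Final Exam > 70%
]
CRITICAL = 9.488    # critical value for df = 4, as stated on the slide

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Contribution of each cell to the chi-squared statistic.
contributions = [
    [(o - r * c / n) ** 2 / (r * c / n) for o, c in zip(row, col_totals)]
    for row, r in zip(observed, row_totals)
]
statistic = sum(sum(row) for row in contributions)

# Position (row index, column index) of the largest contribution.
largest_cell = max(
    ((i, j) for i in range(3) for j in range(3)),
    key=lambda ij: contributions[ij[0]][ij[1]],
)
reject = statistic > CRITICAL
print(round(statistic, 2), largest_cell, reject)
```

In this table the corner cells dominate the total: high attendance combined with a high mark occurs more often than independence would predict, and low attendance combined with a high mark occurs less often.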

15 Goodness of Fit Test (for distributions):
To compare the observed frequencies (from sample(s)) with a theoretical distribution.
- Build the grouped frequency table with all expected frequencies at least 5; combine groups if needed.
- Calculate the expected frequencies using the theoretical probabilities.
- The same chi-square statistic can be used.
- Null hypothesis: the population from which the sample is drawn has the assumed distribution (normal, Poisson, etc.).

16 Example 3: The number of radio messages received by an air traffic controller during a 5-minute period is assumed to have a Poisson distribution with parameter 4.6. To verify this assumption, data from 400 five-minute intervals are collected, with the following frequency table:

Number of radio messages   Observed frequencies
 0                           3
 1                          15
 2                          47
 3                          76
 4                          68
 5                          74
 6                          46
 7                          39
 8                          15
 9                           9
10                           5
11                           2
12                           0
13                           1

17 Observed vs. expected frequencies:

Number of radio messages   Observed frequencies   Poisson probability   Expected frequencies
 0                           3                     0.0101                  4.02
 1                          15                     0.0462                 18.50
 2                          47                     0.1063                 42.54
 3                          76                     0.1631                 65.23
 4                          68                     0.1875                 75.01
 5                          74                     0.1725                 69.01
 6                          46                     0.1323                 52.91
 7                          39                     0.0869                 34.77
 8                          15                     0.0500                 19.99
 9                           9                     0.0255                 10.22
10                           5                     0.0118                  4.70
11                           2                     0.0049                  1.97
12                           0                     0.0019                  0.75
13                           1                     0.0010                  0.39
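The slides stop at the table of expected frequencies. A hedged Python sketch of how the remaining steps might go is given below: recompute the Poisson probabilities, combine adjacent groups until every expected frequency is at least 5, and form the chi-squared statistic. Several things here are assumptions not stated on the slides: the last row is treated as "13 or more" (which reproduces the tabulated 0.0010), the helper `combine_small` and its left-to-right merging rule are one of several reasonable choices, the significance level is taken as 0.05, and the critical value 16.919 is the standard table value for 9 degrees of freedom.

```python
import math

LAMBDA = 4.6
observed = [3, 15, 47, 76, 68, 74, 46, 39, 15, 9, 5, 2, 0, 1]
n = sum(observed)  # 400 intervals

# Poisson probabilities for 0..12, plus the upper tail P(X >= 13) (assumption).
probs = [math.exp(-LAMBDA) * LAMBDA**k / math.factorial(k) for k in range(13)]
probs.append(1 - sum(probs))
expected = [n * p for p in probs]


def combine_small(obs, exp, minimum=5.0):
    """Merge adjacent groups, left to right, until each expected count >= minimum."""
    merged_obs, merged_exp = [], []
    acc_o, acc_e = 0, 0.0
    for o, e in zip(obs, exp):
        acc_o += o
        acc_e += e
        if acc_e >= minimum:
            merged_obs.append(acc_o)
            merged_exp.append(acc_e)
            acc_o, acc_e = 0, 0.0
    if acc_e > 0:  # leftover tail is too small: fold it into the last group
        merged_obs[-1] += acc_o
        merged_exp[-1] += acc_e
    return merged_obs, merged_exp


merged_obs, merged_exp = combine_small(observed, expected)
statistic = sum((o - e) ** 2 / e for o, e in zip(merged_obs, merged_exp))
df = len(merged_obs) - 1   # parameter 4.6 is given, not estimated from the data

CRITICAL_05_DF9 = 16.919   # assumed: table value for alpha = 0.05, df = 9
reject = statistic > CRITICAL_05_DF9
print(len(merged_obs), df, round(statistic, 2), reject)
```

With this merging rule the groups become {0, 1}, 2 through 9 individually, and {10 or more}, giving 10 groups. If the parameter had been estimated from the data instead of being specified in advance, one further degree of freedom would be lost.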

