Published by Naomi Parks. Modified over 9 years ago.
1
Goodness of Fit Multinomials
2
Multinomial Proportions
Thus far we have discussed proportions for situations where the result for the qualitative variable could be only “success” or “failure.”
Now we discuss the situation where there are multiple outcomes for the qualitative variable.
3
EXAMPLE
Suppose the 1000 people had 5 choices:
COLA          OBSERVED FREQUENCY
(1) Coke      f_1 = 410
(2) Pepsi     f_2 = 350
(3) RC        f_3 = 80
(4) Shasta    f_4 = 50
(5) Jolt      f_5 = 110
4
QUESTIONS
(1) Can we conclude that there are differences in cola preference?
(2) Last year 40% favored Coke, 35% favored Pepsi, and 25% favored all other brands. Can we conclude these preferences have changed?
(3) Give a 95% confidence interval for the proportion who favor Coke.
5
(1) CAN IT BE CONCLUDED THAT THERE ARE DIFFERENCES IN COLA PREFERENCES?
The answer is NO unless we can conclusively show otherwise.
H_0: (NO) p_1 = p_2 = p_3 = p_4 = p_5 = .2
H_A: (YES) At least one p_j ≠ .2
α = .05
THIS IS A χ² (Chi-squared) TEST!
6
THE χ² (Chi-squared) STATISTIC
The χ² (Chi-squared) statistic is defined as the sum, over all categories, of the squared differences between the observed values (f_i) and the values expected if H_0 were true (e_i), each divided by its expected value:
χ² = Σ (f_i - e_i)² / e_i
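The definition above can be sketched in a few lines of Python. This is only an illustration: the function name chi_squared_statistic is not from the slides, and the demo applies it to the cola counts under the equal-preference hypothesis (each p_i = .2).

```python
def chi_squared_statistic(observed, expected):
    """Sum of (f_i - e_i)^2 / e_i over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((f - e) ** 2 / e for f, e in zip(observed, expected))


# Cola survey: 1000 respondents, five brands.
observed = [410, 350, 80, 50, 110]
n = sum(observed)

# Under H_0 every brand has the same share, so each e_i = n / 5 = 200.
expected = [n / len(observed)] * len(observed)

chi2 = chi_squared_statistic(observed, expected)
print(round(chi2, 1))
```

Each term is scaled by its own expected count, so the same absolute difference counts for more in a category that was expected to be small.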
7
RULE OF 5
χ² (Chi-squared) is actually only an approximate distribution for the test statistic.
To be a “valid” approximation, ALL e_i’s should be ≥ 5.
If the rule of 5 is violated, combine some categories so that the condition is met.
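One way to check the rule of 5 and merge offending categories is sketched below. The helper names and the small data set are hypothetical, chosen only to show a violation. They are not taken from the cola example, where every expected count is far above 5.

```python
def rule_of_five_violations(expected, minimum=5):
    """Return the indices of categories whose expected count is below `minimum`."""
    return [i for i, e in enumerate(expected) if e < minimum]


def combine_categories(counts, indices):
    """Merge the categories at `indices` into one category, appended last."""
    merged = sum(counts[i] for i in indices)
    kept = [c for i, c in enumerate(counts) if i not in indices]
    return kept + [merged]


# Hypothetical expected and observed counts for four categories (illustration only).
expected = [20, 12, 4, 4]
observed = [18, 15, 3, 4]

bad = rule_of_five_violations(expected)
if bad:
    # Combine the same categories in both lists so they stay aligned.
    expected = combine_categories(expected, bad)
    observed = combine_categories(observed, bad)

print(bad, expected, observed)
```

Combining categories reduces k, so DF = k - 1 must be recomputed from the merged table.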
8
THE χ² (Chi-squared) TEST
Reject H_0 if χ² > χ²_{.05,DF}
DF = k - 1, where k = # categories (= 5, here)
χ²_{.05,4} = a critical value found in a χ² table
χ²_{.05,4} = 9.48773
If H_0 were true, p_1 = p_2 = p_3 = p_4 = p_5 = .2
We would expect to find: e_1 = e_2 = e_3 = e_4 = e_5 = .2(1000) = 200
ALL e_i’s ARE ≥ 5
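The table lookup can be cross-checked numerically. The sketch below assumes an even DF, for which the χ² upper-tail probability has a simple closed form, and finds the critical value by bisection. The function names are illustrative, and a statistics library would normally be used instead.

```python
import math


def chi2_upper_tail(x, df):
    """P(X > x) for a chi-squared variable; this closed form holds only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


def chi2_critical_value(alpha, df, hi=1000.0, tol=1e-9):
    """Find x whose upper-tail probability equals alpha, by bisection on [0, hi]."""
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_upper_tail(mid, df) > alpha:
            lo = mid  # tail area still too large: the critical value is further right
        else:
            hi = mid
    return (lo + hi) / 2


# alpha = .05 with DF = 4 (five categories) and DF = 2 (three categories).
print(chi2_critical_value(0.05, 4))
print(chi2_critical_value(0.05, 2))
```

Both results should land within rounding of the tabulated critical values used in this presentation.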
9
CALCULATION OF χ²
THE MULTINOMIAL TABLE
Cola   Observed   Expected   Difference    Mean Sq. Dif.
i      f_i        e_i        (f_i - e_i)   (f_i - e_i)²/e_i
1      410        200         210          220.5
2      350        200         150          112.5
3       80        200        -120           72.0
4       50        200        -150          112.5
5      110        200         -90           40.5
SUM = 558.0 = χ²
10
RESULTS
χ² = 558.0 > χ²_{.05,4} = 9.48773
There is strong evidence that differences exist in cola preferences.
11
(2) CAN IT BE CONCLUDED COLA PREFERENCES HAVE CHANGED SINCE LAST YEAR?
H_0: (NO) p_1 = .40; p_2 = .35; p_OTHER = .25
H_A: (YES) At least one p_j ≠ its hypothesized value
α = .05
There are now k = 3 categories.
Reject H_0 if χ² > χ²_{.05,2} = 5.99147
12
CALCULATION OF χ²
Cola    Observed   Expected   Difference    Mean Sq. Dif.
i       f_i        e_i        (f_i - e_i)   (f_i - e_i)²/e_i
1       410        400         10           .25
2       350        350          0           0
Other   240        250        -10           .40
SUM = .65 = χ²
All e_i’s > 5
χ² = .65 < 5.99147
Cannot conclude preferences have changed.
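The second test can be reproduced with a short script. It is a sketch only: it collapses RC, Shasta and Jolt into a single "Other" category, as the table does, and takes the critical value 5.99147 from the slide rather than computing it.

```python
observed_by_brand = {"Coke": 410, "Pepsi": 350, "RC": 80, "Shasta": 50, "Jolt": 110}
last_year = {"Coke": 0.40, "Pepsi": 0.35, "Other": 0.25}  # hypothesized shares

# Collapse every brand without its own hypothesized share into "Other".
observed = {"Coke": 0, "Pepsi": 0, "Other": 0}
for brand, count in observed_by_brand.items():
    key = brand if brand in ("Coke", "Pepsi") else "Other"
    observed[key] += count

n = sum(observed.values())
expected = {brand: share * n for brand, share in last_year.items()}

chi2 = sum((observed[b] - expected[b]) ** 2 / expected[b] for b in observed)

critical_value = 5.99147  # table value for alpha = .05, DF = 2, as given on the slide
reject_h0 = chi2 > critical_value
print(observed, round(chi2, 2), reject_h0)
```

The statistic comes out far below the critical value, so H_0 is not rejected.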
13
(3) CONFIDENCE INTERVAL FOR PROPORTION WHO FAVOR COKE
This is now a binomial problem: Coke versus everything else.
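The interval formula from this slide did not survive in the transcript. The sketch below therefore assumes the usual large-sample interval for a binomial proportion, p̂ ± z·sqrt(p̂(1 - p̂)/n), with p̂ = 410/1000. Both the function name and the choice of formula are assumptions, not something the slide confirms.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Large-sample (normal approximation) interval for a binomial proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 410 of the 1000 respondents favored Coke.
lower, upper = proportion_confidence_interval(410, 1000)
print(round(lower, 3), round(upper, 3))
```

Under these assumptions the interval runs from about .38 to .44.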
14
Excel
The Excel function CHITEST returns the p-value for the hypothesis test. Its form is:
CHITEST(range of observed values, range of expected values)
15
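Question (1), equal preferences:
Expected values: =C3*$B$8, dragged down to D4:D7
=CHITEST(B3:B7,D3:D7) gives a VERY LOW p-value
Question (2), last year's preferences:
Expected values: =C12*$B$15, dragged down to D13:D14
=CHITEST(B12:B14,D12:D14) gives a VERY HIGH p-value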
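For readers without Excel, the same p-values can be approximated in plain Python. This is a sketch, not a reimplementation of CHITEST: the function name is made up, DF is taken as k - 1, and the closed-form tail probability used here is valid only when DF is even, which holds for both cola tests (DF = 4 and DF = 2).

```python
import math


def chitest_p_value(observed, expected):
    """Upper-tail chi-squared p-value with DF = k - 1 (even DF only in this sketch)."""
    df = len(observed) - 1
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles even DF (e.g. k = 3 or k = 5)")
    chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    half = chi2 / 2
    # For even DF, P(X > chi2) = exp(-chi2/2) * sum_{j < DF/2} (chi2/2)^j / j!
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


# Question (1): observed counts against equal expected counts of 200.
p_equal = chitest_p_value([410, 350, 80, 50, 110], [200] * 5)

# Question (2): Coke, Pepsi, Other against last year's shares applied to n = 1000.
p_last_year = chitest_p_value([410, 350, 240], [400, 350, 250])

print(p_equal, p_last_year)
```

The first p-value is vanishingly small and the second is well above .05, in line with the "very low" and "very high" labels above.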
16
Review
Multinomial problems exist when there are more than two possible outcomes for a qualitative variable.
Excel approach: compare observed values to expected values by using CHITEST to give the p-value.
Hand approach: compare the χ² statistic to χ²_{α,DF}, where DF = # categories - 1 = k - 1.
The value of the χ² statistic is: χ² = Σ (f_i - e_i)² / e_i