
1 Chapter 14: Chi-Square Procedures

2 14.1 – Test for Goodness of Fit

3 Chi-square test for goodness of fit: used to determine whether the outcomes you expect actually occur.
Observed counts: the counts of the actual results.
Expected counts: the counts you would expect under the hypothesized proportions. To find each expected count, multiply the expected proportion by the sample size.
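The expected-count rule above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_counts` is an assumption, the fair-die case uses the 96 rolls from Example #2 of this section, and the unequal proportions are made up.

```python
def expected_counts(proportions, sample_size):
    """Expected count for each category: expected proportion times sample size."""
    return [p * sample_size for p in proportions]

# A fair six-sided die rolled 96 times: every face has proportion 1/6.
fair_die = expected_counts([1 / 6] * 6, 96)
print(fair_die)

# Unequal hypothesized proportions work the same way (made-up values).
unequal = expected_counts([0.5, 0.3, 0.2], 200)
print(unequal)
```

The expected counts always add up to the sample size, which is a quick way to check the arithmetic.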

4 Note: The expected proportions may all be equal (for example, a fair die) or may differ from category to category.

5 Chi-square test for goodness of fit:
Ho: All of the proportions are as expected
HA: One or more of the proportions are different from expected

6 Chi-square test for goodness of fit: χ² = Σ (O – E)²/E, with df = k – 1, where k is the number of categories

7 Properties of the chi-square distribution:
- Never negative (the statistic is a sum of squared terms)
- Skewed to the right
- Shape changes as the degrees of freedom change
- The P-value is the area shaded to the right of the statistic

10 Chi-square test for goodness of fit: Conditions: all expected counts are ≥ 5

11 Calculator Tip! Goodness of Fit
L1: Observed
L2: Expected
L3: (L1 – L2)² / L2
Then: List – Math – Sum – L3

12 Calculator Tip! To find P(χ² > #): 2nd – Distr – χ²cdf(χ², big #, df)


15 Example #1 A study yields a chi-square statistic of 20 (χ² = 20). What is the P-value of the test if the study was a goodness-of-fit test with 12 categories?
df = 12 – 1 = 11
From the table: 0.025 < P < 0.05
Or by calculator: 2nd – Distr – χ²cdf(20, 10000000, 11) = 0.04534
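For readers without the calculator, the upper-tail area can be approximated with Python's standard library. The sketch below is an assumption-laden stand-in for χ²cdf, not the calculator's own method: the name `chi2_upper_tail` is hypothetical, and it relies on the standard recurrence that links the tail for df + 2 to the tail for df, starting from the exact results for df = 1 and df = 2.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom > x), for a positive integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        tail, k = math.exp(-half), 2             # exact tail for df = 2
    else:
        tail, k = math.erfc(math.sqrt(half)), 1  # exact tail for df = 1
    # Moving from k to k + 2 degrees of freedom adds one more term.
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail

# Example #1: chi-square statistic 20 with 12 categories, so df = 11.
p_value = chi2_upper_tail(20, 11)
print(round(p_value, 5))
```

The result should land close to the calculator value of 0.04534 and inside the table bounds 0.025 < P < 0.05.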

18 Example #2 Find the expected counts and calculate χ² for the 96 rolls of the die. Then find the probability.
Each expected count is 96/6 = 16.
Face value       1       2       3       4       5       6
Observed        19      15      10      14      17      21
Expected        16      16      16      16      16      16
(O – E)²/E  0.5625  0.0625    2.25    0.25  0.0625  1.5625
For example, (19 – 16)²/16 = 0.5625 and (15 – 16)²/16 = 0.0625.
χ² = 0.5625 + 0.0625 + 2.25 + 0.25 + 0.0625 + 1.5625 = 4.75
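The list arithmetic for the die example (observed, expected, then (O – E)²/E summed) can be reproduced as a short Python sketch. The variable names are illustrative only.

```python
observed = [19, 15, 10, 14, 17, 21]   # counts for faces 1 through 6
expected = [sum(observed) / 6] * 6    # fair die: 96 / 6 = 16 per face

# One (O - E)^2 / E contribution per face, then the total.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

print(contributions)  # [0.5625, 0.0625, 2.25, 0.25, 0.0625, 1.5625]
print(chi_square)     # 4.75
```

Looking at the individual contributions shows which faces drive the statistic: face 3 (2.25) and face 6 (1.5625) account for most of the total.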

20 df = 6 – 1 = 5
From the table: P(χ² > 4.75) > 0.25
Or by calculator: χ²cdf(4.75, 10000000, 5) = 0.4471

23 Example #3 The number of defects from a manufacturing process by day of the week is as follows:
Day        Monday  Tuesday  Wednesday  Thursday  Friday
Defects        36       23         26        25      40
The manufacturer is concerned that the number of defects is greater on Monday and Friday. Test, at the 0.05 level of significance, the claim that the proportion of defects is the same each day of the week.
P: The true proportion of defects from the manufacturing process on each day of the week

24 H:
Ho: The proportion of defects from the manufacturing process is the same Mon-Fri
HA: The proportion of defects from the manufacturing process is not the same Mon-Fri (it differs on one or more days)

25 A: There are 150 defects in total, so each expected count is 150/5 = 30.
Day        Monday  Tuesday  Wednesday  Thursday  Friday
Defects        36       23         26        25      40
Expected       30       30         30        30      30
(All expected counts > 5)
N: Chi-Square Goodness of Fit

26 T: χ² = Σ (O – E)²/E = (36 – 30)²/30 + (23 – 30)²/30 + … = 7.533

27 O: df = 5 – 1 = 4
From the table: 0.10 < P(χ² > 7.533) < 0.15
Or by calculator: χ²cdf(7.533, 10000000, 4) = 0.1102

30 M: P-value = 0.1102 > α = 0.05, so we fail to reject the null hypothesis

31 S: There is not enough evidence to claim that the proportion of defects from the manufacturing process differs among the days Monday through Friday.
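The whole defects test can be traced end to end in Python. This is a minimal sketch, not the slides' method: the helper names are hypothetical, and the tail formula used here is the closed form that holds only when df is even (df = 4 in this example).

```python
import math

def goodness_of_fit(observed, proportions):
    """Return (expected counts, chi-square statistic, df) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [p * n for p in proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, statistic, len(observed) - 1

def upper_tail_even_df(x, df):
    """P(chi-square > x) for even df: exp(-x/2) * sum of (x/2)^i / i!."""
    assert df % 2 == 0, "this closed form only covers even df"
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))

defects = [36, 23, 26, 25, 40]   # Monday through Friday
expected, statistic, df = goodness_of_fit(defects, [1 / 5] * 5)
p_value = upper_tail_even_df(statistic, df)

alpha = 0.05
decision = "reject Ho" if p_value < alpha else "fail to reject Ho"
print(round(statistic, 3), df, round(p_value, 4), decision)
```

The decision rule compares the P-value with α: only a P-value below 0.05 would lead to rejecting Ho.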

32 14.2 – Inference for Two-Way Tables

33 To compare two proportions, we use a 2-Proportion Z Test. If we want to compare three or more proportions, we need a new procedure.

34 Two-Way Table:
- Organizes the data for several proportions
- r rows and c columns
- Dimensions are r x c

35 To calculate the expected counts, multiply the row total by the column total, and divide by the table total:
Expected count = (Row total x Column total) / Table total
Degrees of Freedom: (r – 1)(c – 1)
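A short Python sketch of the expected-count and degrees-of-freedom rules follows. The function names are assumptions made for illustration, and the sample table uses the smoker counts from Example #2 of this section (rows: male, female; columns: smoker, non-smoker).

```python
def expected_table(observed):
    """Expected count for each cell: row total * column total / table total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    return [[r * c / table_total for c in col_totals] for r in row_totals]

def degrees_of_freedom(observed):
    """(r - 1)(c - 1) for a table with r rows and c columns."""
    return (len(observed) - 1) * (len(observed[0]) - 1)

smokers = [[23, 47],   # male:   smoker, non-smoker
           [56, 91]]   # female: smoker, non-smoker

expected = expected_table(smokers)
for row in expected:
    print([round(cell, 3) for cell in row])
print(degrees_of_freedom(smokers))
```

Each row and column of the expected table keeps the same total as the observed table, which is a useful check on the arithmetic.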

36 Chi-Square test for Homogeneity: Compare two or more populations on one categorical variable
Ho: The proportions are the same among the 2 or more populations
Ha: The proportions are different among the 2 or more populations
Conditions: SRS; expected counts are ≥ 5

37 Chi-Square test for Association/Independence: Two categorical variables collected from a single population
Ho: There is no association between the two categorical variables
Ha: There is an association between the two categorical variables
Conditions: SRS; expected counts are ≥ 5

38 Calculator Tip! Test for Homogeneity/Independence
2nd – Matrx – Edit – [A] – r x c – enter the table
Then: Stat – Tests – χ²-Test, with Observed: [A] and Expected: [B] – Calculate
Note: Expected: [B] is filled in automatically!

39 Example #1 The table shows the number of people in each grade of high school who preferred a different color of socks. a. What is the expected value for the number of 12th graders who prefer red socks?
With a row total of 20, a column total of 14, and a table total of 56:
Expected count = (Row total x Column total) / Table total = (20 x 14) / 56 = 5

40 Example #1 The table shows the number of people in each grade of high school who preferred a different color of socks. b. Find the degrees of freedom.
(r – 1)(c – 1) = (3 – 1)(4 – 1) = (2)(3) = 6

41 Example #2 An SRS of teens enrolled in alternative schooling programs was asked whether they smoked. The results are classified by gender in the table. Find the expected counts for each cell, and then find the chi-square statistic, degrees of freedom, and its corresponding probability.
Observed   Smoker  Non-Smoker  Total
Male           23          47     70
Female         56          91    147
Total          79         138    217
Expected count = (Row total x Column total) / Table total
Male smoker:        (70 x 79) / 217 = 25.484
Female smoker:      (147 x 79) / 217 = 53.516
Male non-smoker:    (70 x 138) / 217 = 44.516
Female non-smoker:  (147 x 138) / 217 = 93.484

42 Expected Counts:
           Smoker  Non-Smoker
Male       25.484      44.516
Female     53.516      93.484
χ² = Σ (O – E)²/E = (23 – 25.484)²/25.484 + (47 – 44.516)²/44.516 + (56 – 53.516)²/53.516 + (91 – 93.484)²/93.484 = 0.56197
Degrees of Freedom: (r – 1)(c – 1) = (2 – 1)(2 – 1) = (1)(1) = 1
P(χ² > 0.56197): more than 0.25 from the table, or 0.4535 by calculator
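The smoker calculation can be checked in Python. This sketch takes the rounded expected counts exactly as printed on the slide, so the statistic may differ from 0.56197 in the later decimal places. For the P-value it uses a fact specific to df = 1: a chi-square variable with one degree of freedom is a squared standard normal, so the upper tail equals erfc(√(χ²/2)).

```python
import math

# Cell order: male smoker, male non-smoker, female smoker, female non-smoker.
observed = [23, 47, 56, 91]
expected = [25.484, 44.516, 53.516, 93.484]   # rounded values from the slide

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Valid only for df = 1: P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(chi_square, 3), round(p_value, 3))
```

A P-value this large is consistent with the table lookup of "more than 0.25".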

46 Example #3 At a school, a random sample of 20 males and 16 females was asked which political party they identified with.
          Democrat  Republican  Independent
Male            11           7            2
Female           7           8            1
Are the proportions of Democrats, Republicans, and Independents the same within both populations? Conduct a test of significance at the α = 0.05 level.
P: The true proportions of Democrats, Republicans, and Independents among the males and among the females at the school.

47 H:
Ho: The proportions identifying with each political party are the same for males and females
HA: The proportions identifying with each political party are different for males and females

48 A: SRS (stated). Expected counts ≥ 5? Expected count = (Row total x Column total) / Table total
          Democrat             Republican             Independent           Total
Male      (20 x 18)/36 = 10    (20 x 15)/36 = 8.33    (20 x 3)/36 = 1.67       20
Female    (16 x 18)/36 = 8     (16 x 15)/36 = 6.67    (16 x 3)/36 = 1.33       16
Total     18                   15                     3                        36
Not all expected counts are ≥ 5, so proceed with caution!

49 N: Chi-Square test for Homogeneity

50 T: χ² = Σ (O – E)²/E = (11 – 10)²/10 + (7 – 8)²/8 + (7 – 8.33)²/8.33 + (8 – 6.67)²/6.67 + (2 – 1.67)²/1.67 + (1 – 1.33)²/1.33 = 0.855
Observed  Democrat  Republican  Independent
Male            11           7            2
Female           7           8            1
Expected  Democrat  Republican  Independent
Male            10        8.33         1.67
Female           8        6.67         1.33

51 O: Degrees of Freedom: (r – 1)(c – 1) = (2 – 1)(3 – 1) = (1)(2) = 2
P(χ² > 0.855): more than 0.25 from the table, or 0.6521 by calculator

54 M: P-value = 0.6521 > α = 0.05, so we fail to reject the null hypothesis

55 S: There is not enough evidence to claim that the proportions identifying with each political party are different for males and females.
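The political-party test can be followed from the observed table to the decision in one Python sketch. The function name `chi_square_two_way` is hypothetical, and the P-value line uses the shortcut P(χ² > x) = exp(–x/2), which holds only because df = 2 here.

```python
import math

def chi_square_two_way(observed):
    """Return (statistic, df, expected) for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    table_total = sum(row_totals)
    expected = [[r * c / table_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

party = [[11, 7, 2],   # male:   Democrat, Republican, Independent
         [7, 8, 1]]    # female: Democrat, Republican, Independent

statistic, df, expected = chi_square_two_way(party)
small_cells = [e for row in expected for e in row if e < 5]   # condition check
p_value = math.exp(-statistic / 2)   # valid only because df == 2

decision = "reject Ho" if p_value < 0.05 else "fail to reject Ho"
print(round(statistic, 3), df, round(p_value, 4), len(small_cells), decision)
```

The two small expected counts (both in the Independent column) are the reason the slide says to proceed with caution.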

56 Example #4 The following chart represents the score distribution on the AP Exams for different subjects at a certain high school. Is there evidence that the score distribution depends on the subject? P: Determine whether AP scores are independent of the subject area.

57 H:
Ho: AP score and AP subject are independent
HA: AP score and AP subject are not independent

58 A: SRS (stated). Expected counts ≥ 5? Expected count = (Row total x Column total) / Table total
Column totals: 52, 68, 30. Row totals: 34, 42, 44, 18, 12. Table total: 150.
Row total   Column total 52          Column total 68          Column total 30
34          (34 x 52)/150 = 11.787   (34 x 68)/150 = 15.413   (34 x 30)/150 = 6.8
42          (42 x 52)/150 = 14.56    (42 x 68)/150 = 19.04    (42 x 30)/150 = 8.4
44          (44 x 52)/150 = 15.253   (44 x 68)/150 = 19.947   (44 x 30)/150 = 8.8
18          (18 x 52)/150 = 6.24     (18 x 68)/150 = 8.16     (18 x 30)/150 = 3.6
12          (12 x 52)/150 = 4.16     (12 x 68)/150 = 5.44     (12 x 30)/150 = 2.4
Not all expected counts are ≥ 5 (4.16, 3.6, and 2.4 fall below), so proceed with caution!
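Because only the row and column totals are needed, the expected counts and the "≥ 5" check can be scripted without the observed table. The sketch below is illustrative: the variable names are assumptions, the row totals are keyed by AP score as on the expected-count slide that follows (score 5 has row total 34), and the three columns are left unnamed because the subject labels are not given in the transcript.

```python
# Row totals keyed by AP score; column totals in the order used on the slides.
row_totals = {5: 34, 4: 42, 3: 44, 2: 18, 1: 12}
col_totals = [52, 68, 30]
table_total = sum(col_totals)

# Expected count = row total * column total / table total, for every cell.
expected = {
    score: [r * c / table_total for c in col_totals]
    for score, r in row_totals.items()
}

# Flag the cells that fail the "all expected counts >= 5" condition.
below_five = [
    (score, round(e, 3))
    for score, cells in expected.items()
    for e in cells
    if e < 5
]

for score, cells in expected.items():
    print(score, [round(e, 3) for e in cells])
print(below_five)
```

Three of the fifteen cells fall below 5, all in the rows for scores 1 and 2.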

61 N: Chi-Square test for Independence

62 T: Expected counts by AP score:
Score   Column 1   Column 2   Column 3
5         11.787     15.413        6.8
4          14.56      19.04        8.4
3         15.253     19.947        8.8
2           6.24       8.16        3.6
1           4.16       5.44        2.4
χ² = Σ (O – E)²/E = 5.227698

63 O: Degrees of Freedom: (r – 1)(c – 1) = (5 – 1)(3 – 1) = (4)(2) = 8
P(χ² > 5.227698): more than 0.25 from the table, or 0.732985 by calculator

66 M: P-value = 0.732985 > α = 0.05, so we fail to reject the null hypothesis

67 S: There is not enough evidence to claim that AP score and AP subject are dependent.

