Published by Mabel Lloyd. Modified over 9 years ago.
1
Chi-square test (χ² test)
Notes: Page 218
1. Goodness of Fit
2. Independence
3. Homogeneity
2
χ² Test for Independence
Used with categorical, bivariate data from ONE sample.
Used to see if the two categorical variables are associated (dependent) or not associated (independent).
3
Assumptions & formula remain the same!
4
Hypotheses (written in words)
H0: the two variables are independent
Ha: the two variables are dependent
Be sure to write in context!
5
A beef distributor wishes to determine whether there is a relationship between geographic region and cut of meat preferred. If there is no relationship, we will say that beef preference is independent of geographic region. Suppose that, in a random sample of 500 customers, 300 are from the North and 200 from the South. Also, 150 prefer cut A, 275 prefer cut B, and 75 prefer cut C.
6
Expected Counts
Assuming H0 is true, expected count = (row total x column total) / (table total)
7
If beef preference is independent of geographic region, how would we expect this table to be filled in?

         North  South  Total
Cut A       90     60    150
Cut B      165    110    275
Cut C       45     30     75
Total      300    200    500

(300/500) x 150 = 90
(200/500) x 150 = 60
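The filled-in table can be reproduced with a few lines of Python. This is only a sketch of the arithmetic on the slide: the function name `expected_from_totals` and the dictionary layout are illustrative choices, not part of the original notes.

```python
# Marginal totals from the beef example on the slide.
row_totals = {"Cut A": 150, "Cut B": 275, "Cut C": 75}
col_totals = {"North": 300, "South": 200}
grand_total = 500


def expected_from_totals(row_totals, col_totals, grand_total):
    """Expected count per cell = row total * column total / grand total."""
    return {
        (row, col): r * c / grand_total
        for row, r in row_totals.items()
        for col, c in col_totals.items()
    }


expected = expected_from_totals(row_totals, col_totals, grand_total)
for (row, col), value in expected.items():
    print(f"{row:6s} {col:6s} {value:6.1f}")
```

Each cell is the row total scaled by that region's share of the sample, which is the same calculation as (300/500) x 150 = 90.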
8
Degrees of freedom
df = (number of rows - 1) x (number of columns - 1)
Or cover up one row & one column & count the number of cells remaining!
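Covering up one row and one column leaves (rows - 1) x (columns - 1) cells, which is the usual degrees-of-freedom rule for a two-way table. A minimal sketch follows; the helper name `degrees_of_freedom` is hypothetical and used only for illustration.

```python
def degrees_of_freedom(n_rows, n_cols):
    """df for a chi-square test on an r x c table: (r - 1) * (c - 1)."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("a two-way table needs at least 2 rows and 2 columns")
    return (n_rows - 1) * (n_cols - 1)


# Beef example: 3 cuts (rows) by 2 regions (columns).
df_beef = degrees_of_freedom(3, 2)
print(df_beef)
```

Totals are not counted as rows or columns, so the beef table is 3 x 2 and not 4 x 3.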
9
Now suppose that in the actual sample of 500 consumers the observed numbers were as follows:

         North  South  Total
Cut A      100     50    150
Cut B      150    125    275
Cut C       50     25     75
Total      300    200    500

Is there sufficient evidence to suggest that geographic regions and beef preference are not independent? (Is there a difference between the expected and observed counts?)
10
Assumptions:
Have a random sample of people.
All expected counts are greater than 5.
H0: geographic region and beef preference are independent
Ha: geographic region and beef preference are dependent
P-value = .0226, df = 2, α = .05
Since p-value < α, I reject H0. There is sufficient evidence to suggest that geographic region and beef preference are dependent.

Expected Counts:
      N     S
A    90    60
B   165   110
C    45    30

Calculator: 2nd x⁻¹ (Matrix), EDIT, 3 x 2 (Row x Column)
For Matrix A: Enter Observed data
For Matrix B: Enter Expected data
STAT, TEST, χ²-Test, Calculate
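For readers without the calculator, the same test can be sketched in Python. This assumes the standard chi-square statistic, the sum of (observed - expected)² / expected over all cells. It also uses the fact that for df = 2 specifically, the upper-tail probability is exactly exp(-χ²/2). The function name is illustrative.

```python
import math

# Observed and expected counts from the beef example (rows: cuts A, B, C;
# columns: North, South).
observed = [[100, 50], [150, 125], [50, 25]]
expected = [[90, 60], [165, 110], [45, 30]]


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


chi_sq = chi_square_statistic(observed, expected)

# Shortcut valid ONLY for df = 2: P(X > x) = exp(-x / 2).
p_value = math.exp(-chi_sq / 2)

alpha = 0.05
print(f"chi-square = {chi_sq:.4f}, p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The p-value agrees with the .0226 reported on the slide, so the decision to reject H0 at α = .05 is the same.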
11
HOWEVER, the calculator can find Matrix B for you! Try this:
2nd x⁻¹ (Matrix), EDIT, 3 x 2 (Row x Column)
For Matrix A: Enter Observed data
For Matrix B: Leave this blank
STAT, TEST, χ²-Test, Calculate
It still ran the test! Now go to 2nd x⁻¹ (Matrix), EDIT, Select B and you will see the Expected Values.

SO, there are TWO ways to find your Expected Values: use the formula OR let the calculator find them for you. Be sure to know BOTH methods.
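What the calculator does when Matrix B is left blank can be imitated in Python by deriving the expected counts directly from the observed table. This is a sketch of the idea, not a description of the calculator's internals; `expected_counts` is a hypothetical helper name.

```python
def expected_counts(observed):
    """Build the expected-count table from the observed table alone."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [
        [r * c / grand_total for c in col_totals]
        for r in row_totals
    ]


# "Matrix A": observed beef data (rows: cuts A, B, C; columns: North, South).
matrix_a = [[100, 50], [150, 125], [50, 25]]

# "Matrix B": expected counts computed from Matrix A.
matrix_b = expected_counts(matrix_a)
for row in matrix_b:
    print(row)
```

Only the observed counts are needed as input because the row, column, and grand totals are all recoverable from them.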
12
χ² test for homogeneity
Used with a single categorical variable from two (or more) independent samples.
Used to see if the two (or more) populations are the same (homogeneous).
13
Assumptions & formula remain the same!
Expected counts & df are found the same way as in the test for independence.
The only change is the hypotheses!
14
Hypotheses (written in words)
H0: the proportions for the two (or more) distributions are the same
Ha: at least one of the proportions for the distributions is different
Be sure to write in context!
15
Ex 2) The following data is on drinking behavior for independently chosen random samples of male and female students. Does there appear to be a gender difference with respect to drinking behavior? (Note: low = 1-7 drinks/wk, moderate = 8-24 drinks/wk, high = 25 or more drinks/wk)

           Men  Women  Total
None       140    186    326
Low        478    661   1139
Moderate   300    173    473
High        63     16     79
Total      981   1036   2017
16
Assumptions:
Have 2 independent random samples of students.
All expected counts are greater than 5.
H0: the proportions of drinking behaviors are the same for female & male students
Ha: at least one of the proportions of drinking behavior is different for female & male students
P-value = 8.67E-21 ≈ 0, df = 3, α = .05
Since p-value < α, I reject H0. There is sufficient evidence to suggest that drinking behavior is not the same for female & male students.

Expected Counts:
        M       F
0   158.6   167.4
L   554.0   585.0
M   230.1   243.0
H    38.4    40.6

Remember, let the calculator find the Expected Counts (Matrix B) for you:
2nd x⁻¹ (Matrix), EDIT, 4 x 2 (Row x Column)
For Matrix A: Enter Observed data
For Matrix B: Leave this blank
STAT, TEST, χ²-Test, Calculate.
2nd x⁻¹ (Matrix), EDIT, Select B and you will see the Expected Values.
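The whole homogeneity test can be sketched end to end in Python using only the standard library. The function names are illustrative. The p-value helper relies on the known closed-form expressions for the chi-square upper tail at integer degrees of freedom (a finite sum for even df, plus an `erfc` term for odd df) rather than on a statistics package.

```python
import math


def expected_counts(observed):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_sf(x, df):
    """Upper-tail probability P(X > x) for chi-square with integer df >= 1."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        terms = (half ** i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * sum(terms)
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_i (x/2)^(i - 1/2) / Gamma(i + 1/2)
    series = sum(
        half ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(half)) + math.exp(-half) * series


# Observed counts (rows: None, Low, Moderate, High; columns: Men, Women).
observed = [[140, 186], [478, 661], [300, 173], [63, 16]]

expected = expected_counts(observed)
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
p_value = chi_square_sf(chi_sq, df)

print([[round(e, 1) for e in row] for row in expected])
print(f"chi-square = {chi_sq:.2f}, df = {df}, p-value = {p_value:.3g}")
```

The computation is identical to the test for independence; only the sampling design and the wording of the hypotheses differ.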
17
Homework: Finish Page 222, 223.
This concludes our last lesson. The test will be next Tuesday. (We will practice and review Friday and Monday.)
SuperMonday 6-9pm, RHS Cafeteria Annex. Be there!