Published by Edward Rose. Modified over 9 years ago.
1
WARM-UP: The Math club and the Spanish club traditionally have a similar distribution of class levels. A random sample of this year's Math club members gives the following results: 12 Freshmen, 8 Sophomores, 21 Juniors, and 19 Seniors. A random sample of this year's Spanish club members gives the following results: 20 Freshmen, 10 Sophomores, 38 Juniors, and 36 Seniors. Is there evidence of a similar distribution of class level among the two groups?

                Freshman   Sophomore   Junior   Senior
Math Club          12          8         21       19
Spanish Club       20         10         38       36
2
WARM-UP: Is there evidence of a similar distribution of class level among the two groups?

Observed counts (expected counts in parentheses):

                Freshman     Sophomore    Junior       Senior
Math Club       12 (11.71)    8 (6.59)    21 (21.59)   19 (20.12)
Spanish Club    20 (20.29)   10 (11.41)   38 (37.41)   36 (34.88)

X² Test of Homogeneity

H0: The distribution of class level in the Math Club is the same as in the Spanish Club.
Ha: The distribution of class level in the Math Club is NOT the same as in the Spanish Club.

X² = 0.614, df = (2 - 1)(4 - 1) = 3
P-Value = 0.8931

Conditions:
1. SRS – stated.
2. All expected counts are 1 or greater.
3. No more than 20% of the expected counts are less than 5.

Since the P-Value is greater than α = 0.05, we FAIL to REJECT H0. There is insufficient evidence to conclude that the distribution of class level differs between the clubs.
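The expected counts and the X² statistic in the warm-up can be reproduced by hand or with a few lines of code. The following is a minimal Python sketch using only the warm-up data; the function names `expected_counts` and `chi_square_statistic` are illustrative choices, not part of the original slides.

```python
# Observed counts from the warm-up (rows: clubs, columns: class levels).
observed = {
    "Math Club":    [12, 8, 21, 19],
    "Spanish Club": [20, 10, 38, 36],
}

def expected_counts(table):
    """Expected count for each cell = row total * column total / grand total."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand = sum(row_totals)
    return [[rt * ct / grand for ct in col_totals] for rt in row_totals]

def chi_square_statistic(table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    rows = list(table.values())
    exp = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(rows, exp)
               for o, e in zip(obs_row, exp_row))

expected = expected_counts(observed)
x2 = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(expected[0]) - 1)  # (rows - 1)(columns - 1)

print([[round(e, 2) for e in row] for row in expected])
print("X2 =", round(x2, 3), "df =", df)
```

The same two functions work for any two-way table of counts, so they apply equally to the homogeneity and independence examples later in the deck.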
3
The THREE types of Chi-Square Tests:

1. The Chi-Square Test for Goodness of Fit.
   H0: The distribution of the variable is uniform or is equivalent to the prescribed %.
   Ha: The distribution does not equal…

2. The Chi-Square Test for Independence.
   H0: Variable #1 is independent of Variable #2 (no relationship).
   Ha: Variable #1 is NOT independent of Variable #2 (an association is present).

3. The Chi-Square Test for Homogeneity.
   H0: The distribution of the variable is equivalent among the populations.
   Ha: The distribution of the variable within one population is NOT equivalent to that of the other population(s).
4
M&M color percentages by variety:

Color     Plain   Peanut   Crispy   Minis   Peanut Butter / Almond
Brown      13%     12%      17%      13%            10%
Blue       24%     23%      17%      25%            20%
Orange     20%     23%      16%      25%            20%
Green      16%     15%      16%      12%            20%
Red        13%     12%      17%      12%            10%
Yellow     14%     15%      17%      13%            20%

Steps for each test:
1. Hypothesis
2. Name of Test
3. Obs/Exp. Counts
4. X² / p-Value
5. Check Assumptions
6. Conclusion
5
1. Lay out a large sheet of paper; you'll be sorting M&M's on this.
2. Open up a bag of M&M's.
3. DO NOT EAT ANY OF THE M&M'S… yet.
4. Separate the M&M's into color categories and count the number of each color of M&M you have.
5. Record your counts in Data Chart 1.
6. Determine the Chi-square value and p-value for your data.

Data Chart 1

Color Categories      Brown   Blue   Orange   Green   Red   Yellow   Total
Expected %             ___%   ___%    ___%    ___%   ___%    ___%
Observed (O) count
Expected (E) count
(O – E)² / E

X² = Σ ((O – E)² / E) = _________
P-Value = X²cdf(X², E99, df) = _________
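The calculator command X²cdf(X², E99, df) returns the area under the chi-square curve to the right of the test statistic. As a rough stand-in for students without a calculator, the sketch below computes the same upper-tail area in plain Python for whole-number degrees of freedom. The function name `chi2_upper_tail` is an assumption made for this illustration; it is not a calculator or library function.

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with `df` degrees of freedom >= x), for integer df >= 1.

    Uses the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, starting from
    Q(1/2, z) = erfc(sqrt(z)) when df is odd or Q(1, z) = exp(-z) when df is even.
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        a, q = 0.5, math.erfc(math.sqrt(z))
    else:
        a, q = 1.0, math.exp(-z)
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

# Same role as X²cdf(4.967, E99, 5) on the calculator.
print(round(chi2_upper_tail(4.967, 5), 4))
```

For a goodness of fit test with six color categories, df = 6 - 1 = 5, so the call would be `chi2_upper_tail(x2, 5)` with `x2` taken from the bottom row of the data chart.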
6
H0: The distribution of colors in the packet of "Plain" M&M's is equal to the percentages indicated by the Mars Company.
Ha: The distribution of colors in the packet of "Plain" M&M's is NOT equal to the percentages indicated by the Mars Company.

1. Hypothesis
2. Name of Test
3. Obs/Exp. Counts
4. X² / p-Value
5. Check Assumptions
6. Conclusion
7
EXAMPLE: Is the distribution of colors in a package of PLAIN M&M's statistically equivalent to the distribution of colors in a package of PEANUT M&M's? One package of plain and one package of peanut M&M's are randomly selected and analyzed.

Observed counts (expected counts in parentheses):

          Brown      Blue       Orange     Green      Red        Yellow
PLAIN     4 (3.39)   3 (4.06)   2 (2.71)   7 (6.10)   0 (0.68)   5 (4.06)
PEANUT    1 (1.61)   3 (1.94)   2 (1.29)   2 (2.90)   1 (0.32)   1 (1.94)

X² Test of Homogeneity

H0: The distribution of colors in the Plain packet of M&M's is equivalent to that of the Peanut M&M's.
Ha: The distribution of colors in the Plain packet of M&M's is NOT equivalent to that of the Peanut M&M's.

X² = 4.967
P-Value = X²cdf(4.967, E99, 5) = 0.4200
8
EXAMPLE: Is the distribution of colors in a package of PLAIN M&M's statistically equivalent to the distribution of colors in a package of PEANUT M&M's? One package of plain and one package of peanut M&M's are randomly selected and analyzed.

Observed counts (expected counts in parentheses):

          Brown      Blue       Orange     Green      Red        Yellow
PLAIN     4 (3.39)   3 (4.06)   2 (2.71)   7 (6.10)   0 (0.68)   5 (4.06)
PEANUT    1 (1.61)   3 (1.94)   2 (1.29)   2 (2.90)   1 (0.32)   1 (1.94)

X² Test of Homogeneity

H0: The distribution of colors in the Plain packet of M&M's is equivalent to that of the Peanut M&M's.
Ha: The distribution of colors in the Plain packet of M&M's is NOT equivalent to that of the Peanut M&M's.

X² = 4.967, df = (2 - 1)(6 - 1) = 5
P-Value = X²cdf(4.967, E99, 5) = 0.4200

Since the P-Value is NOT less than α = 0.05, we FAIL to REJECT H0. There is no evidence that the distributions are NOT equivalent. However, because conditions 2 and 3 below are not met, this result is uncertain.

CONDITIONS
1. SRS - stated. √
2. All expected counts are 1 or greater. X (two expected counts, 0.68 and 0.32, are below 1)
3. No more than 20% of the expected counts are less than 5. X (11 of the 12 expected counts are below 5)
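The two expected-count conditions can be checked mechanically once the expected counts are known. The sketch below does this for the plain vs. peanut data; the helper names `expected_counts` and `check_conditions` are illustrative assumptions, not from the slides.

```python
# Observed counts from the PLAIN vs. PEANUT example (Brown ... Yellow).
observed = [
    [4, 3, 2, 7, 0, 5],   # Plain
    [1, 3, 2, 2, 1, 1],   # Peanut
]

def expected_counts(rows):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand = sum(row_totals)
    return [[rt * ct / grand for ct in col_totals] for rt in row_totals]

def check_conditions(expected):
    """Return (all cells >= 1, at most 20% of cells < 5, share of cells < 5)."""
    cells = [e for row in expected for e in row]
    ok_min = all(e >= 1 for e in cells)
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return ok_min, share_below_5 <= 0.20, share_below_5

ok_min, ok_share, share_below_5 = check_conditions(expected_counts(observed))
print("All expected counts >= 1:", ok_min)
print("At most 20% of expected counts < 5:", ok_share)
print("Share of expected counts below 5:", round(share_below_5, 2))
```

With only 31 candies in total, most cells are too sparse for the chi-square approximation, which is why the slide marks conditions 2 and 3 with an X.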
9
An SRS of 120 voters from AR and an SRS of 115 voters from TX were taken to determine whether there was a significant difference in how people, as of that moment, would vote with regard to Obama.

Observed counts (expected counts in parentheses):

            Definitely Would   Most Likely   Probably Would Not   Definitely Would Not
Arkansas    35 (33.19)         45 (42.38)    28 (22.98)           12 (21.45)
Texas       30 (31.81)         38 (40.62)    17 (22.02)           30 (20.55)

X² Test of Homogeneity

H0: The distribution of how people would vote today in the State of Arkansas is equal to that of Texas.
Ha: The distribution of how people would vote today in the State of Arkansas is NOT equal to that of Texas.

X² = 11.277
P-Value = X²cdf(11.277, E99, 3) = 0.0103

Conditions:
1. SRS – stated.
2. All expected counts are 1 or greater.
3. No more than 20% of the expected counts are less than 5.

Since the P-Value is less than α = 0.05, we REJECT H0. There is evidence that support differs between AR and TX.
10
Does one's regional location have an effect on political affiliation? To begin to investigate this question, data from 177 randomly selected voters were analyzed.

Political Affiliation by Location:

Location      Democrat   Republican
West             39          27
Northeast        35          15
Southeast        17          44

a.) Find the proportion of Democrats within each regional location.
b.) Make a bar chart of these proportions.
c.) Find the expected values for each cell.
d.) Find evidence of an effect.

a.) Proportion of Democrats: West 0.591, Northeast 0.700, Southeast 0.279.

b.) [Bar chart: % of Democrats (0 to 100) by regional location]

c.) Expected counts:

Location      Democrat   Republican
West           33.93       32.07
Northeast      25.71       24.29
Southeast      31.36       29.64
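Parts (a) and (c) are straightforward arithmetic on the row and column totals. A minimal Python sketch of both calculations follows, using the table above; the variable names are illustrative only.

```python
# Observed counts: regional location by political affiliation.
observed = {
    "West":      {"Democrat": 39, "Republican": 27},
    "Northeast": {"Democrat": 35, "Republican": 15},
    "Southeast": {"Democrat": 17, "Republican": 44},
}

grand_total = sum(sum(row.values()) for row in observed.values())
party_totals = {
    party: sum(row[party] for row in observed.values())
    for party in ("Democrat", "Republican")
}

# (a) Proportion of Democrats within each region.
dem_proportion = {
    region: row["Democrat"] / sum(row.values())
    for region, row in observed.items()
}

# (c) Expected count = region total * party total / grand total.
expected = {
    region: {
        party: sum(row.values()) * party_totals[party] / grand_total
        for party in party_totals
    }
    for region, row in observed.items()
}

for region in observed:
    print(region, round(dem_proportion[region], 3),
          {p: round(e, 2) for p, e in expected[region].items()})
```

If affiliation were independent of location, the three proportions in part (a) would be roughly equal; the large gap between Northeast and Southeast is what the chi-square test in part (d) quantifies.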
11
Does one's regional location have an effect on political affiliation? To begin to investigate this question, data from 177 randomly selected voters were analyzed.

Political Affiliation by Location, observed counts (expected counts in parentheses):

Location      Democrat      Republican
West          39 (33.93)    27 (32.07)
Northeast     35 (25.71)    15 (24.29)
Southeast     17 (31.36)    44 (29.64)

d.) Find evidence of an effect.

X² Test of Independence

H0: Political Affiliation is INDEPENDENT of Location.
Ha: Political Affiliation is NOT INDEPENDENT of Location.

X² = 22.01
P-Value = X²cdf(22.01, E99, 2) ≈ 0 (about 0.00002)

Conditions:
1. SRS – stated.
2. All expected counts are 5 or greater.

Since the P-Value is less than α = 0.05, we REJECT H0. There is evidence that Political Affiliation is NOT INDEPENDENT of Location.
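The calculator reports the p-value as 0 only because it rounds. With 2 degrees of freedom the upper-tail area has a simple closed form, exp(-X²/2), so the actual size of the p-value is easy to check. The short sketch below assumes the X² value from the slide and applies that df = 2 shortcut; it does not generalize to other degrees of freedom.

```python
import math

x2 = 22.01            # test statistic from the slide
rows, cols = 3, 2     # three regions, two parties
df = (rows - 1) * (cols - 1)

# Closed form valid only for df = 2: P(chi-square >= x) = exp(-x / 2).
assert df == 2
p_value = math.exp(-x2 / 2)

print("df =", df)
print("p-value =", p_value)
```

The result is a very small positive number rather than exactly zero, which is why the conclusion is worded as "evidence" against H0 rather than proof.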
12
#18 Medical researchers followed an SRS of 6272 Swedish men for 30 years to see if there was an association between the amount of fish in their diet and prostate cancer. Is there any evidence of such an association?

Observed counts (expected counts in parentheses):

Fish Consumption      Total Subjects   Prostate Cancer   NO Prostate Cancer
Never                       124         14 (9.21)         110 (114.79)
Small part of diet         2621        201 (194.74)      2420 (2426.26)
Moderate part              2978        209 (221.26)      2769 (2756.74)
Large part                  549         42 (40.79)        507 (508.21)

X² Test of Independence

H0: There is NO relationship between fish consumption and the development of prostate cancer.
Ha: There is a relationship between fish consumption and the development of prostate cancer.

X² = 3.677
P-Value = X²cdf(3.677, E99, 3) = 0.2985
13
WARM-UP: Medical researchers followed 6272 Swedish men for 30 years to see if there was an association between the amount of fish in their diet and prostate cancer. Is there any evidence of such an association?

Observed counts (expected counts in parentheses):

Fish Consumption      Total Subjects   Prostate Cancer   NO Prostate Cancer
Never                       124         14 (9.21)         110 (114.79)
Small part of diet         2621        201 (194.74)      2420 (2426.26)
Moderate part              2978        209 (221.26)      2769 (2756.74)
Large part                  549         42 (40.79)        507 (508.21)

X² Test of Independence

H0: There is NO relationship between fish consumption and the development of prostate cancer.
Ha: There is a relationship between fish consumption and the development of prostate cancer.

X² = 3.677
P-Value = X²cdf(3.677, E99, 3) = 0.2985

Since the P-Value is NOT less than α = 0.05, we FAIL to REJECT H0. There is insufficient evidence of a relationship between fish consumption and prostate cancer.

CONDITIONS
1. SRS - stated. √
2. All expected counts are 1 or greater. √
3. No more than 20% of the expected counts are less than 5. √
14
EXAMPLE: Texas introduces a new scratch-off lottery ticket for which it claims the following prize payouts: 2% - $1000; 10% - $100; 25% - $20; 40% - $5; and the rest (23%) are losing tickets. You buy a random sample of 200 tickets and get the following results. Are the claimed ticket payouts valid?

             $1000   $100   $20    $5    $0
OBS. DATA       6     21     26    63    84
EXP. DATA       4     20     50    80    46

X² Goodness of Fit Test

H0: The Texas Lottery scratch-off payout matches the advertised %'s.
Ha: The Texas Lottery scratch-off payout does NOT match the advertised %'s.

X² = 47.574
P-Value = X²cdf(47.574, E99, 4) ≈ 0

Since the P-Value is less than α = 0.05, we REJECT H0. There is evidence that the Texas Lottery scratch-off payout does NOT match the advertised %'s.
15
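CONDITIONS
1. SRS - stated.
2. All expected counts are 1 or greater. √
3. No more than 20% of the expected counts are less than 5. √ (only one of the five, the expected count of 4, is below 5, which is exactly 20%)

             $1000   $100   $20    $5    $0
EXP. DATA       4     20     50    80    46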
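The lottery example follows the same row-by-row layout as the M&M data chart: expected counts come from the advertised percentages, and each category contributes (O - E)² / E to the statistic. A minimal Python sketch of that calculation, using only the numbers in the example, might look like this (variable names are illustrative):

```python
prizes = ["$1000", "$100", "$20", "$5", "$0"]
advertised_pct = [2, 10, 25, 40]                   # stated percentages
advertised_pct.append(100 - sum(advertised_pct))   # "the rest" are losing tickets
observed = [6, 21, 26, 63, 84]

n = sum(observed)                                   # 200 tickets bought
expected = [n * pct / 100 for pct in advertised_pct]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
x2 = sum(contributions)
df = len(observed) - 1                              # categories - 1

for prize, o, e, c in zip(prizes, observed, expected, contributions):
    print(f"{prize:>6}  O={o:3d}  E={e:5.1f}  (O-E)^2/E={c:7.3f}")
print("X2 =", round(x2, 3), "df =", df)
```

Printing the individual contributions shows which categories drive the result: the $0 and $20 rows account for most of the statistic.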
16
WHAT Chi-Square Test should you choose? It is ALL A MATTER OF HOW THE DATA ARE SAMPLED.

Goodness of Fit Test – ONE sample from a single population using ONE categorical variable.
Test of Independence – ONE sample from a single population using TWO categorical variables.
Test of Homogeneity – TWO or more samples from populations using ONE categorical variable.
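The decision rule above depends on just two facts about the study design, so it can be written as a small lookup. The function below is a hypothetical helper written for this summary, not something from the slides; it simply encodes the three cases listed.

```python
def choose_chi_square_test(num_samples, num_categorical_variables):
    """Pick a chi-square test from how the data were sampled."""
    if num_samples == 1 and num_categorical_variables == 1:
        return "Goodness of Fit"
    if num_samples == 1 and num_categorical_variables == 2:
        return "Independence"
    if num_samples >= 2 and num_categorical_variables == 1:
        return "Homogeneity"
    raise ValueError("design not covered by the three tests on this slide")

# Examples from the deck:
print(choose_chi_square_test(1, 1))  # lottery tickets: one sample, prize category
print(choose_chi_square_test(1, 2))  # Swedish men: one sample, fish diet and cancer status
print(choose_chi_square_test(2, 1))  # AR and TX voters: two samples, voting intention
```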