WARM – UP: The Math club and the Spanish club traditionally are composed of a similar distribution of class level. A random sample of this year’s math.

Presentation transcript:

WARM – UP: The Math Club and the Spanish Club traditionally have a similar distribution of class levels. A random sample of this year’s Math Club members gives the following results: 12 Freshmen, 8 Sophomores, 21 Juniors, and 19 Seniors. A random sample of this year’s Spanish Club members gives the following results: 20 Freshmen, 10 Sophomores, 38 Juniors, and 36 Seniors. Is there evidence of a similar distribution of class level among the two groups?

               Freshman  Sophomore  Junior  Senior
Math Club            12          8      21      19
Spanish Club         20         10      38      36

WARM – UP: Is there evidence of a similar distribution of class level among the two groups?

χ² Test of Homogeneity
H₀: The distribution of class level in the Math Club is similar to that in the Spanish Club.
Hₐ: The distribution of class level in the Math Club is NOT similar to that in the Spanish Club.
χ² =
P-Value =
Since the P-Value is greater than α = 0.05, we FAIL to REJECT H₀. There is insufficient evidence to conclude that there is a difference in class level % among the clubs.

CONDITIONS
1. SRS – stated
2. All Expected Counts are 1 or greater.
3. No more than 20% of the Expected Counts are less than 5.
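The warm-up numbers can be checked with a short calculation. This is a minimal sketch, assuming the counts stated in the warm-up question (Math Club: 12, 8, 21, 19; Spanish Club: 20, 10, 38, 36); the variable names are illustrative, and the closed-form tail area used here is valid only for df = 3.

```python
import math

# Observed counts from the warm-up (columns: Freshman, Sophomore, Junior, Senior).
observed = {
    "Math Club": [12, 8, 21, 19],
    "Spanish Club": [20, 10, 38, 36],
}

row_totals = {club: sum(counts) for club, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected count for a cell = (row total * column total) / grand total.
expected = {
    club: [row_totals[club] * c / grand_total for c in col_totals]
    for club in observed
}

# Chi-square statistic: sum of (O - E)^2 / E over every cell.
chi_square = sum(
    (o - e) ** 2 / e
    for club in observed
    for o, e in zip(observed[club], expected[club])
)

df = (len(observed) - 1) * (len(col_totals) - 1)  # (2 - 1)(4 - 1) = 3

# Upper-tail area for df = 3 only; plays the role of the calculator's
# chi-square cdf from the statistic up to E99.
p_value = math.erfc(math.sqrt(chi_square / 2)) + math.sqrt(
    2 * chi_square / math.pi
) * math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.3f}, df = {df}, p-value = {p_value:.3f}")
```

The P-value comes out well above α = 0.05, which agrees with the decision to fail to reject H₀.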

The THREE types of Chi-Square Tests:

1. The Chi-Square Test for Goodness of Fit.
   H₀: The distribution of the variable is uniform or is equivalent to the prescribed %.
   Hₐ: The distribution does not equal…
2. The Chi-Square Test for Independence.
   H₀: Variable #1 is independent of Variable #2 (no relationship).
   Hₐ: Variable #1 is NOT independent of Variable #2 (an association is present).
3. The Chi-Square Test for Homogeneity.
   H₀: The distribution of the variable is equivalent among the populations.
   Hₐ: The distribution of the variable within the populations is NOT equivalent to that of the other population.

% color by M&M’s variety:

Color    Plain  Peanut  Crispy  Minis  Peanut Butter/Almond
Brown    13%    12%     17%     13%    10%
Blue     24%    23%     17%     25%    20%
Orange   20%    23%     16%     25%    20%
Green    16%    15%     16%     12%    20%
Red      13%    12%     17%     12%    10%
Yellow   14%    15%     17%     13%    20%

1. Hypothesis
2. Name of Test
3. Obs/Exp. Counts
4. χ² / P-Value
5. Check Assumptions
6. Conclusion

1. Lay out a large sheet of paper; you’ll be sorting M&M’s on this.
2. Open up a bag of M&M’s.
3. DO NOT EAT ANY OF THE M&M’S… yet.
4. Separate the M&M’s into color categories and count the number of each color of M&M you have.
5. Record your counts in Data Chart 1.
6. Determine the chi-square value and P-value for your data.

Data Chart 1

Color Categories     Brown  Blue  Orange  Green  Red  Yellow  Total
Expected %             %     %      %       %     %     %
Observed (O) count
Expected (E) count
(O – E)²/E

Σ((O – E)²/E) = χ²
P-Value = χ²cdf(χ², E99, df): _________
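The Data Chart arithmetic can be sketched in code. This is an illustration only: the expected percentages are the "Plain" column of the color table in this transcript, the observed counts are hypothetical (made up for the example, not measured data), and `chi2_upper_tail` is a helper name introduced here to stand in for the calculator's χ²cdf(χ², E99, df) for whole-number df.

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-square with integer df >= x), i.e. the area from x up to E99."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2) * sum of x^k / (1*3*...*(2k+1))
    term, total = 1.0, 0.0
    for k in range((df - 1) // 2):
        if k > 0:
            term *= x / (2 * k + 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total


colors = ["Brown", "Blue", "Orange", "Green", "Red", "Yellow"]
expected_pct = [0.13, 0.24, 0.20, 0.16, 0.13, 0.14]  # "Plain" column of the table
observed = [8, 12, 11, 9, 7, 8]  # HYPOTHETICAL bag of 55 candies

n = sum(observed)
expected = [n * p for p in expected_pct]                 # Expected (E) count row
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]  # (O - E)^2 / E row
chi_square = sum(contributions)
df = len(colors) - 1                                     # categories minus one
p_value = chi2_upper_tail(chi_square, df)

for color, o, e, c in zip(colors, observed, expected, contributions):
    print(f"{color:<7} O={o:>3} E={e:6.2f} (O-E)^2/E={c:.4f}")
print(f"chi-square = {chi_square:.4f}, df = {df}, p-value = {p_value:.4f}")
```

Replacing `observed` with the counts from an actual bag reproduces each row of the chart.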

H₀: The distribution of colors in the packet of “Plain” M&M’s is equal to the percentages indicated by the Mars Company.
Hₐ: The distribution of colors in the packet of “Plain” M&M’s is NOT equal to the percentages indicated by the Mars Company.

1. Hypothesis
2. Name of Test
3. Obs/Exp. Counts
4. χ² / P-Value
5. Check Assumptions
6. Conclusion

EXAMPLE: Is the distribution of colors in a package of PLAIN M&M’s statistically equivalent to the distribution of colors in a package of PEANUT M&M’s? A random package of each type (plain and peanut) is selected and analyzed.

         Brown  Blue  Orange  Green  Red  Yellow
PLAIN
PEANUT

χ² Test of Homogeneity
H₀: The distribution of colors in the Plain packet of M&M’s is equivalent to that of the Peanut M&M’s.
Hₐ: The distribution of colors in the Plain packet of M&M’s is NOT equivalent to that of the Peanut M&M’s.
χ² = 4.967
P-Value = χ²cdf(4.967, E99, 5) =

EXAMPLE: Is the distribution of colors in a package of PLAIN M&M’s statistically equivalent to the distribution of colors in a package of PEANUT M&M’s? A random package of each type (plain and peanut) is selected and analyzed.

         Brown  Blue  Orange  Green  Red  Yellow
PLAIN
PEANUT

χ² Test of Homogeneity
H₀: The distribution of colors in the Plain packet of M&M’s is equivalent to that of the Peanut M&M’s.
Hₐ: The distribution of colors in the Plain packet of M&M’s is NOT equivalent to that of the Peanut M&M’s.
χ² = 4.967
P-Value = χ²cdf(4.967, E99, 5) =
Since the P-Value is NOT less than α = 0.05, there is NO evidence to reject H₀. There is no evidence that the distributions are NOT equivalent, although the results are uncertain because the expected-count conditions are not met.

CONDITIONS
1. SRS – stated √
2. All Expected Counts are 1 or greater. ✗
3. No more than 20% of the Expected Counts are less than 5. ✗
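The degrees of freedom and P-value for this example can be sketched as follows. The test statistic 4.967 is the value quoted in the slide; a two-way table with 2 rows (PLAIN, PEANUT) and 6 color columns has df = (2 − 1)(6 − 1) = 5. The function name `upper_tail_df5` is illustrative, and its closed form is valid only for df = 5.

```python
import math

rows, cols = 2, 6                 # PLAIN and PEANUT; six color categories
df = (rows - 1) * (cols - 1)      # degrees of freedom for a two-way table

chi_square = 4.967                # test statistic quoted in the slide


def upper_tail_df5(x):
    """P(chi-square with 5 df >= x); closed form valid only for df = 5."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(
        -x / 2
    ) * (1 + x / 3)


p_value = upper_tail_df5(chi_square)
print(f"df = {df}, p-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```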

An SRS of 120 voters from AR and an SRS of 115 voters from TX were taken to determine whether there was a significant difference in how people, as of that moment, would vote with regard to Obama.

           Definitely Would   Most Likely   Probably Would Not   Definitely Would Not
Arkansas
Texas

χ² Test of Homogeneity
H₀: The distribution of how people would vote today in the State of Arkansas is equal to that of Texas.
Hₐ: The distribution of how people would vote today in the State of Arkansas is NOT equal to that of Texas.
χ² = 11.277
P-Value = χ²cdf(11.277, E99, 3) =
Since the P-Value is less than α = 0.05, we REJECT H₀. Support is different between AR and TX.

CONDITIONS
1. SRS – stated
2. All Expected Counts are 1 or greater.
3. No more than 20% of the Expected Counts are less than 5.

Does one’s regional location have an effect on their political affiliation? To begin to investigate this situation, data from 177 random voters were analyzed.

                 Political Affiliation
Location         Democrat   Republican
West                   39           27
Northeast              35           15
Southeast              17           44

a.) Find the proportion of Democrats among each regional location.
b.) Make a bar chart for these proportions.
c.) Find the Expected Values for each cell.
d.) Find evidence of an effect.

[Bar chart: % of Democrats by regional location]
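Parts a.) and c.) can be worked through with a few lines of arithmetic. This is a minimal sketch using the counts in the table above; the dictionary and variable names are illustrative.

```python
# Observed counts from the table: (Democrat, Republican) for each location.
observed = {
    "West": (39, 27),
    "Northeast": (35, 15),
    "Southeast": (17, 44),
}

dem_total = sum(dem for dem, _ in observed.values())
rep_total = sum(rep for _, rep in observed.values())
grand_total = dem_total + rep_total

proportion_dem = {}   # part a.)
expected = {}         # part c.)
for location, (dem, rep) in observed.items():
    row_total = dem + rep
    proportion_dem[location] = dem / row_total
    # Expected count = (row total * column total) / grand total.
    expected[location] = (
        row_total * dem_total / grand_total,
        row_total * rep_total / grand_total,
    )

for location in observed:
    exp_dem, exp_rep = expected[location]
    print(
        f"{location:<10} proportion Dem = {proportion_dem[location]:.3f}, "
        f"expected Dem = {exp_dem:.2f}, expected Rep = {exp_rep:.2f}"
    )
```

The proportions are the bar heights for part b.), and every expected count is above 5, which supports the condition check for the test.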

Does one’s regional location have an effect on their political affiliation? To begin to investigate this situation, data from 177 random voters were analyzed.

                 Political Affiliation
Location         Democrat   Republican
West                   39           27
Northeast              35           15
Southeast              17           44

d.) Find evidence of an effect.

χ² Test of Independence
H₀: Political Affiliation is INDEPENDENT of Location.
Hₐ: Political Affiliation is NOT INDEPENDENT of Location.
χ² = 22.01
P-Value = χ²cdf(22.01, E99, 2) ≈ 0
Since the P-Value is less than α = 0.05, we REJECT H₀. There is evidence that Political Affiliation is NOT INDEPENDENT of Location.

CONDITIONS
1. SRS – stated
2. All Expected Counts are 5 or greater.
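The test statistic quoted in this slide can be reproduced from the table. This is a sketch under the assumption that the counts are exactly those shown (West 39/27, Northeast 35/15, Southeast 17/44); for df = 2 the upper-tail area has the simple closed form exp(−χ²/2), which is used here in place of the calculator call.

```python
import math

# Rows: West, Northeast, Southeast. Columns: Democrat, Republican.
observed = [
    [39, 27],
    [35, 15],
    [17, 44],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected count
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(col_totals) - 1)  # (3 - 1)(2 - 1) = 2

# Upper-tail area for df = 2 only: P(X >= x) = exp(-x / 2).
p_value = math.exp(-chi_square / 2)

print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.6f}")
```

The P-value is far below 0.05 (it rounds to 0 on a calculator display), matching the decision to reject H₀.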

#18 Medical researchers followed an SRS of 6272 Swedish men for 30 years to see if there was an association between the amount of fish in their diet and prostate cancer. Is there any evidence of such an association?

Fish Consumption      Total Subjects   Prostate Cancer   NO Prostate Cancer
Never                            124                14
Small part of diet
Moderate part
Large part                       549                42

χ² Test of Independence
H₀: There is NO relationship between fish consumption and the development of prostate cancer.
Hₐ: There is a relationship between fish consumption and the development of prostate cancer.
χ² = 3.677
P-Value = χ²cdf(3.677, E99, 3) =

WARM – UP: Medical researchers followed 6272 Swedish men for 30 years to see if there was an association between the amount of fish in their diet and prostate cancer. Is there any evidence of such an association?

Fish Consumption      Total Subjects   Prostate Cancer   NO Prostate Cancer
Never                            124                14
Small part of diet
Moderate part
Large part                       549                42

χ² Test of Independence
H₀: There is NO relationship between fish consumption and the development of prostate cancer.
Hₐ: There is a relationship between fish consumption and the development of prostate cancer.
χ² = 3.677
P-Value = χ²cdf(3.677, E99, 3) =
Since the P-Value is NOT less than α = 0.05, there is NO evidence to reject H₀. There is no evidence of a relationship between fish consumption and prostate cancer.

CONDITIONS
1. SRS – stated √
2. All Expected Counts are 1 or greater. √
3. No more than 20% of the Expected Counts are less than 5. √

EXAMPLE: Texas introduces a new scratch-off lottery ticket for which it claims the following prize payouts: 2% – $1000; 10% – $100; 25% – $20; 40% – $5; and the rest – losing tickets. You buy a random sample of 200 tickets and get the following results. Are the ticket payouts valid?

             $1000   $100   $20   $5   $0
OBS. DATA
EXP. DATA

χ² Goodness of Fit Test
H₀: The Texas Lottery scratch-off payout matches the advertised %’s.
Hₐ: The Texas Lottery scratch-off payout does NOT match the advertised %’s.
χ² = 47.574
P-Value = χ²cdf(47.574, E99, 4) ≈ 0
Since the P-Value is less than α = 0.05, we REJECT H₀. The Texas Lottery scratch-off payout does NOT match the advertised %.

CONDITIONS
1. SRS – stated
2. All Expected Counts are 1 or greater. √
3. No more than 20% of the Expected Counts are less than 5. √

EXP. DATA
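The expected counts behind these condition checks follow directly from the advertised percentages and the sample of 200 tickets. This is a minimal sketch; it assumes "the rest" means the remaining 23% of tickets are losers, and the variable names are illustrative. The observed counts are not used, since they do not appear in this transcript.

```python
# Advertised payout percentages (whole numbers, to keep the arithmetic exact).
advertised_pct = {"$1000": 2, "$100": 10, "$20": 25, "$5": 40}
advertised_pct["$0"] = 100 - sum(advertised_pct.values())  # "the rest": losing tickets

n = 200  # number of tickets bought

# Expected count for each prize level = n * advertised proportion.
expected = {prize: n * pct / 100 for prize, pct in advertised_pct.items()}

# Condition 2: all expected counts are 1 or greater.
all_at_least_one = all(e >= 1 for e in expected.values())

# Condition 3: no more than 20% of the expected counts are less than 5.
num_small = sum(1 for e in expected.values() if e < 5)
at_most_20_percent_small = 5 * num_small <= len(expected)

df = len(expected) - 1  # goodness of fit: categories minus one

print(expected)
print("all expected >= 1:", all_at_least_one)
print("no more than 20% below 5:", at_most_20_percent_small)
print("df =", df)
```

Only the $1000 category has an expected count below 5, which is 1 of 5 cells (exactly 20%), so both conditions hold and df = 4 matches the calculator call in the example.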

WHAT Chi-Square Test should you choose? It is ALL A MATTER OF HOW THE DATA IS SAMPLED.

Goodness of Fit Test – ONE sample from a single population using ONE categorical variable.
Test of Independence – ONE sample from a single population using TWO categorical variables.
Test of Homogeneity – TWO or more samples from populations using ONE categorical variable.
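The selection rule above can be written as a small lookup. This is only an illustrative sketch; the function name and its two parameters are introduced here for the example and are not part of the presentation.

```python
def choose_chi_square_test(num_samples, num_categorical_variables):
    """Pick the chi-square test from how the data were sampled."""
    if num_samples == 1 and num_categorical_variables == 1:
        return "Goodness of Fit"
    if num_samples == 1 and num_categorical_variables == 2:
        return "Independence"
    if num_samples >= 2 and num_categorical_variables == 1:
        return "Homogeneity"
    raise ValueError("sampling design not covered by the three chi-square tests")


# Examples from this presentation:
print(choose_chi_square_test(1, 1))  # lottery tickets: one sample, one variable
print(choose_chi_square_test(1, 2))  # 177 voters: location and political affiliation
print(choose_chi_square_test(2, 1))  # separate AR and TX samples, one variable
```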