Chapter 14 Inference for Distribution of Categorical Variables: Chi-Squared Procedures.


Chapter Objectives: Understand and conduct chi-square tests for goodness of fit, homogeneity of populations, and association/independence. Given a two-way table, compute conditional distributions. Use technology to conduct a chi-square significance test.

Introduction: The chi-square goodness of fit test allows us to determine whether a specified population distribution seems valid. We can compare two or more population proportions using a chi-square test for homogeneity of populations. The chi-square test of association/independence uses the information in a two-way table to determine whether the distribution of one variable is related to another variable.

Questions that chi-square can answer: 1. Are you more likely to have a car accident when using your cell phone? 2. Does our distribution of M&M colors match the distribution stated by the Mars company? 3. Is there an association between a franchise firm having an exclusive territory and the success of the business?

14.1 Test for Goodness of Fit: Suppose you want to know how likely it is to get only 2 red M&M's in a bag of 52. You could conduct the z-test described in Chapter 12, which would tell you how likely this one proportion is. However, if you wanted to look at all six sample proportions (one for each color), conducting six separate tests of significance would be quite inefficient.

Chi-Square can help!!! The chi-square test for goodness of fit can be used to see whether the observed sample distribution differs significantly from the hypothesized population distribution. In this example, we can test all six color proportions at the same time to see if there is a significant difference.

M&M Activity: Population of interest: M&M candies. H0: The distribution of candy colors is as given by Mars. Ha: At least one of the proportions is different from the stated distribution. Complete activity.

Car Collisions While On Cell Phones Are car collisions while on cell phones equally likely for each day of the week? A study of 699 drivers who were using a cell phone while involved in a crash examined this question.

Car Collisions ex. cont'd. H0: Car accidents involving cell phone use are equally likely on each day of the week: p_Sunday = p_Monday = … = p_Saturday = 1/7. Ha: The probability of a motor vehicle accident involving a cell phone varies from day to day (at least one of the proportions differs from the stated value).

Car Collisions Data: Table of day (Sun, Mon, Tue, Wed, Thu, Fri, Sat) against number of collisions. In this table, it is clear that the number of collisions is NOT equally distributed across the days. The bar graph to the right makes the same point, with the left bar showing the expected count per day (about 100) and the right bar showing the observed count.

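Car Collision Statistics: Table columns are Day, Observed, and Expected. Sunday: observed 20, expected 699 × 1/7 ≈ 99.86. Under H0 the expected count is the same 99.86 for each of the other days, Monday through Saturday.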

Car Collision: Are the conditions met? Are all expected counts at least 1? Yes. Are no more than 20% of the expected counts less than 5? Yes. What are the degrees of freedom? 7 − 1 = 6. The p-value represents the probability, assuming H0 is true, of observing a value of χ² at least as extreme as the one actually observed.
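
The checks on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's own computation: only Sunday's count (20) and the total (699) appear in the transcript, so the other six observed counts below are assumed values chosen to sum to 699, and `gof_statistic` is a hypothetical helper name.

```python
# Chi-square goodness-of-fit statistic for the "equally likely days" hypothesis.
# ASSUMPTION: only Sunday = 20 and the total of 699 come from the slides;
# the remaining six counts are assumed values that sum to 699.
observed = {
    "Sun": 20, "Mon": 133, "Tue": 126, "Wed": 159,
    "Thu": 136, "Fri": 113, "Sat": 12,
}


def gof_statistic(observed_counts, null_proportions):
    """Return (chi-square statistic, list of expected counts)."""
    n = sum(observed_counts)
    expected = [n * p for p in null_proportions]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected))
    return chi2, expected


counts = list(observed.values())
null = [1 / 7] * 7                      # H0: each day has probability 1/7
chi2, expected = gof_statistic(counts, null)

# Conditions from the slide: every expected count at least 1,
# and no more than 20% of the expected counts below 5.
all_at_least_one = all(e >= 1 for e in expected)
share_below_five = sum(e < 5 for e in expected) / len(expected)
df = len(counts) - 1                    # 7 categories - 1 = 6

print(f"expected per day = {expected[0]:.2f}")
print(f"chi-square = {chi2:.2f}, df = {df}")
print(f"conditions met: {all_at_least_one and share_below_five <= 0.20}")
```

Passing the null proportions as a list keeps the same function usable for hypotheses with unequal proportions, such as a stated color distribution.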

P-value of Car Collision: For our example, with 6 degrees of freedom, a significance level of 0.05 requires a critical value of χ² = 12.59. In fact, even at a significance level of 0.0005 we would need a χ² of 24.10. Since our χ² is bigger than the critical value required at any of these significance levels, we have enough evidence to reject H0. In other words, the difference in the distribution of collisions per day is statistically significant.

χ² on the Graphing Calculator: The goodness of fit test appears on the TI-84, not the TI-83. For the TI-84: Enter observed values in L1 and expected values in L2. Then choose Stat → Test → GOF-Test, set Observed to L1 and Expected to L2, enter the df, and select Calculate. This will give the χ² test statistic and the p-value.

χ² on the Graphing Calculator: For the TI-83: Enter observed values in L1 and expected values in L2. Define L3 as (L1 − L2)²/L2. Then 2nd List → MATH → sum(L3) will give the χ² test statistic. Use the table to determine the p-value, or use χ²cdf(χ², upper bound, df), where the first argument is the test statistic and the upper bound is a very large number.
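
As a cross-check away from the calculator, the upper-tail area that χ²cdf returns has a closed form whenever df is a whole number. The sketch below uses only Python's standard library; `chi2_upper_tail` is a hypothetical helper name, not a library function, and it assumes df is a positive integer.

```python
import math


def chi2_upper_tail(x, df):
    """P(chi-square with `df` degrees of freedom exceeds x), integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k = 0 .. df/2 - 1 of (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: a Normal-tail piece plus a finite correction series.
    tail = math.erfc(math.sqrt(half))
    term = math.sqrt(x) * math.sqrt(2.0 / math.pi) * math.exp(-half)
    total = 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= x / (2 * k - 1)
        total += term
    return tail + total


# The critical value 12.59 for df = 6 should leave roughly 0.05 in the tail.
print(round(chi2_upper_tail(12.59, 6), 4))
```

Calling `chi2_upper_tail(statistic, df)` plays the same role as χ²cdf with the test statistic as the lower bound and a very large upper bound.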

Chi-Square Properties: The total area under a chi-square curve is 1. Each curve begins at 0 on the horizontal axis; for df greater than 2 it rises to a single peak and then approaches the horizontal axis asymptotically from above (for df = 1 and df = 2 the curve has no such peak and only decreases). Each chi-square curve is skewed right, and as the df increases the curve becomes more and more symmetrical and appears more Normal.
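
These properties can be checked numerically. The following is a rough sketch using the standard chi-square density formula and a simple midpoint rule; the function names and the integration settings (`upper`, `steps`) are illustrative choices, not part of the original slides.

```python
import math


def chi2_pdf(x, df):
    """Density of the chi-square distribution with `df` degrees of freedom (x > 0)."""
    k = df / 2.0
    return x ** (k - 1) * math.exp(-x / 2.0) / (2.0 ** k * math.gamma(k))


def area_under_curve(df, upper=150.0, steps=50_000):
    """Midpoint-rule approximation of the area under the curve from 0 to `upper`."""
    width = upper / steps
    return sum(chi2_pdf((i + 0.5) * width, df) for i in range(steps)) * width


def skewness(df):
    """Theoretical skewness of a chi-square distribution: sqrt(8 / df)."""
    return math.sqrt(8.0 / df)


for df in (3, 5, 10, 30):
    peak = df - 2                       # location of the peak when df > 2
    print(df, round(area_under_curve(df), 4), peak, round(skewness(df), 3))
```

The shrinking skewness as df grows is the numerical counterpart of the curve looking more symmetrical and more Normal.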

Goodness of Fit Tests Notes: In the chi-square test for goodness of fit we test the null hypothesis that a categorical variable has a specified distribution. If we find significance, we can conclude that our variable has a distribution different from the specified one. These tests do not specifically tell us which categories have the greatest differences. In our car collisions example, which days have the largest difference in observed vs. expected values? Saturday and Sunday. HW: p846 #14.3, 10
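
One way to see which categories drive a significant result is to list each category's contribution, (observed − expected)²/expected, and sort them. The sketch below does this for the car collisions example; as before, only Sunday's count of 20 and the total of 699 come from the slides, and the other six counts are assumed values.

```python
# Per-day contributions to the chi-square statistic, largest first.
# ASSUMPTION: every count except Sunday's 20 is an assumed value;
# together the counts sum to the 699 crashes mentioned in the slides.
days = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]
observed = [20, 133, 126, 159, 136, 113, 12]

expected = sum(observed) / len(observed)   # same expected count for every day

contributions = {
    day: (obs - expected) ** 2 / expected
    for day, obs in zip(days, observed)
}
ranked = sorted(contributions, key=contributions.get, reverse=True)

for day in ranked:
    print(f"{day}: {contributions[day]:.2f}")
```

With these assumed counts the two weekend days sit at the top of the ranking, in line with the answer given on the slide.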

14.2 Inference for Two-Way Tables: Goodness of fit tests work for one-way tables; for two-way tables we need a new approach. We could conduct multiple proportion tests and compare the results, but that would be inefficient and also a bit unclear. The chi-square test for homogeneity of populations makes this much simpler.

χ² Tests for Homogeneity of Populations: We are testing one categorical variable measured across two or more populations. H0 will be that the distribution is the same across all of the populations being compared. Ha will be that the distribution is not the same in all of them.

Conditions: 1. Data must come from independent SRSs from the populations of interest. 2. All expected counts are at least 1 and no more than 20% are less than 5. Once again we will be using a four-step process.

Does background music influence wine purchases? 3 types of music (None, French, and Italian) were played at random times and the counts of 3 types of wine sold (French, Italian and Other) were taken to see if there are any significant differences based on type of music.

Wine example: In this example, we have three different populations. Pop 1: bottles of wine sold when no music is playing. Pop 2: bottles of wine sold when French music is playing. Pop 3: bottles of wine sold when Italian music is playing. We are interested in the distribution of types of wine selected for each music type.

Wine example: H0: The proportions of each wine type sold are the same in all three populations; that is, the distribution of wine selected is the same for all three music types. Ha: The distributions of wine types are not all the same.

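Calculating Chi-Square: χ² = Σ (observed − expected)² / expected, summed over every cell of the table, where the expected counts are found by expected count = (row total × column total) / table total.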

Calculate the expected counts Note: expected counts need not be whole numbers
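
The expected counts for a two-way table follow the rule row total × column total / table total. Below is a minimal sketch of that rule; the wine table itself is not reproduced in the transcript, so the counts used here are assumed values, and `expected_counts` is a hypothetical helper name.

```python
# Expected counts for a two-way table under H0.
# ASSUMPTION: the observed table does not appear in the transcript;
# these counts are assumed values (rows = wine type, columns = music type).
observed = [
    [30, 39, 30],   # French wine under None, French, Italian music
    [11,  1, 19],   # Italian wine
    [43, 35, 35],   # Other wine
]


def expected_counts(table):
    """Return the table of expected counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

The printed values are not whole numbers, which is fine: expected counts are long-run averages under H0, not actual tallies.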

Calculations: The calculator will compute the results for you, but you do need to demonstrate that you understand the formula. df = (r − 1) × (c − 1) = (3 − 1) × (3 − 1) = 4

P-value and conclusion: P(χ² > 18.28) for df = 4 is between 0.001 and 0.0025. Conclusion: We have strong evidence to reject H0 and conclude that the type of music being played has a significant effect on wine sales.
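
The whole calculation for a two-way table can be sketched as follows. The observed counts are assumed values, since the table is missing from the transcript; with them the statistic comes out close to the 18.28 quoted above. The p-value line uses a closed form that holds only for df = 4, and `chi_square_two_way` is a hypothetical helper name.

```python
import math

# ASSUMPTION: assumed counts (rows = wine type, columns = music type);
# the original table is not reproduced in the transcript.
observed = [
    [30, 39, 30],
    [11,  1, 19],
    [43, 35, 35],
]


def chi_square_two_way(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / total
            chi2 += (obs - exp) ** 2 / exp
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


chi2, df = chi_square_two_way(observed)

# Upper-tail area P(chi-square > chi2); this closed form is valid only for df = 4.
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(f"chi-square = {chi2:.2f}, df = {df}, p-value = {p_value:.4f}")
```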

Using your calculator: Enter the values from your two-way table into Matrix A (in this case a 3×3). Perform a chi-square test: [Stat] > [Test] > [chi-square]. Make sure Matrix A is the observed and Matrix B is the expected, then hit Calculate. Notice that Matrix B now contains your expected values, so you can compare where, if at all, major deviations occurred.

Caution: The chi-square test only confirms that there is some relationship. It does not in itself tell us what population our conclusion describes. If the study was done in one market on a Saturday, the results may apply only to Saturday shoppers at this market. HW: pg. 855 #14.11, 14.15, 14.17

Chi-Square Test of Association/Independence: Two categorical variables are measured across a single population. We draw a single sample and break it down into categories.

Example: Franchises that succeed. A franchise is a business that has multiple locations owned by individuals, rather than owned by the parent company (think Applebee's, Home Depot, Dunkin' Donuts, etc.). The franchisee pays money to the parent company in exchange for an established brand. Some franchises offer exclusive-territory clauses in their contracts, which state that the franchisee will be the only representative of the parent company in a specified area.

How does the presence of an exclusive-territory clause in the contract relate to the survival of the business? A study designed to address this question collected data from a sample of 170 new franchise firms. Two categorical variables were measured for each firm. First: whether the firm was successful, based on whether it was still franchising as of a certain date. Second: whether or not it had an exclusive-territory clause.

Performing the Test: Here are the results: H0: There is no association between success and exclusive territory. Ha: There is an association between success and exclusive territory.

Conditions: The data come from an SRS. We must check the expected counts. Since all expected counts are at least 5, our conditions are met.

Calculations: df = (r − 1)(c − 1) = (2 − 1)(2 − 1) = 1. P-value: between 0.01 and 0.02
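
For a 2×2 table the statistic can also be computed with the shortcut n(ad − bc)² / (row₁ · row₂ · col₁ · col₂), and with df = 1 the p-value reduces to a Normal-tail calculation. The sketch below illustrates both. The four cell counts are assumed values (the results table is missing from the transcript); they sum to the 170 firms mentioned in the slides, and with them the statistic lands near the reported 5.91.

```python
import math

# ASSUMPTION: the four cell counts are assumed values; only the sample
# size of 170 firms comes from the slides.
a, b = 108, 15   # successful firms: with clause, without clause
c, d = 34, 13    # unsuccessful firms: with clause, without clause

n = a + b + c + d
row1, row2 = a + b, c + d
col1, col2 = a + c, b + d

# Expected counts, used to check the "all at least 5" condition.
expected = [
    [row1 * col1 / n, row1 * col2 / n],
    [row2 * col1 / n, row2 * col2 / n],
]
smallest_expected = min(min(row) for row in expected)

# Shortcut formula for the chi-square statistic of a 2x2 table.
chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)

# With df = 1, P(chi-square > x) equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"smallest expected count = {smallest_expected:.2f}")
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.4f}")
```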

Interpretation: We have sufficient evidence (χ² = 5.91, df = 1, 0.01 < P < 0.02) of an association between success and an exclusive territory in the population of franchisees. pg. 874 #

What type of chi-square test should you use? Are the proportions of employees of different races represented at a company the same as in the population? GOF test. A particular company has two offices, one in New York and one in Minneapolis. Are the proportions of employees of different races the same at each office? Homogeneity of populations.

Which type of chi-square test should you use? Does the level of education received affect whether you will become a millionaire or not? Association/Independence. pg. 876 #14.26, 14.29, 14.39