Elementary Statistics

Elementary Statistics Thirteenth Edition Chapter 11 Goodness-of-Fit and Contingency Tables Copyright © 2018, 2014, 2012 Pearson Education, Inc. All Rights Reserved

Goodness-of-Fit and Contingency Tables 11-1 Goodness-of-Fit 11-2 Contingency Tables

Key Concept We now consider methods for analyzing contingency tables, which include frequency counts for categorical data arranged in a table with at least two rows and at least two columns. We present a method for conducting a hypothesis test of the null hypothesis that the row and column variables are independent of each other. We will also consider three variations of the basic method: (1) test of homogeneity, (2) Fisher’s exact test, and (3) McNemar’s test for matched pairs.

Contingency Table (1 of 2) A contingency table (or two-way frequency table) is a table consisting of frequency counts of categorical data corresponding to two different variables. (One variable is used to categorize rows, and a second variable is used to categorize columns.)

Contingency Table (2 of 2) Test of Independence In a test of independence, we test the null hypothesis that in a contingency table, the row and column variables are independent. (That is, there is no dependency between the row variable and the column variable.)

Objective Conduct a hypothesis test of independence between the row variable and column variable in a contingency table.

Notation O represents the observed frequency in a cell of a contingency table. E represents the expected frequency in a cell, found by assuming that the row and column variables are independent. r represents the number of rows in a contingency table (not including labels or row totals). c represents the number of columns in a contingency table (not including labels or column totals).

Requirements The sample data are randomly selected. The sample data are represented as frequency counts in a two-way table. For every cell in the contingency table, the expected frequency E is at least 5. (There is no requirement that every observed frequency must be at least 5.)

Null and Alternative Hypotheses The null and alternative hypotheses are as follows: H0: The row and column variables are independent. H1: The row and column variables are dependent.

Test Statistic for a Test of Independence χ² = Σ (O − E)² / E, where O is the observed frequency in a cell and E is the expected frequency in that cell, found by evaluating E = (row total)(column total) / (grand total).

P-values P-values are typically provided by technology, or a range of P-values can be found from Table A-4.

Critical Values The critical values are found in Table A-4 using degrees of freedom = (r − 1)(c − 1), where r is the number of rows and c is the number of columns. Tests of independence with a contingency table are always right-tailed.

Relationships Among Key Components in a Test of Independence

Example: Finding Expected Frequency (1 of 4) The table on the next slide is a contingency table with four rows and two columns. The cells of the table contain frequency counts. The frequency counts are the observed values, and the expected values are shown in parentheses. The row variable identifies the treatment used for a stress fracture in a foot bone, and the column variable identifies the outcome as a success or failure. Refer to the table and find the expected frequency for the cell in the first row and first column, where the observed frequency is 54.

Example: Finding Expected Frequency (2 of 4)
Treatment                                       Success            Failure
Surgery                                         54 (E = 47.478)    12 (E = 18.522)
Weight-Bearing Cast                             41 (E = 66.182)    51 (E = 25.818)
Non-Weight-Bearing Cast for 6 Weeks             70 (E = 52.514)    3 (E = 20.486)
Non-Weight-Bearing Cast for Less Than 6 Weeks   17 (E = 15.826)    5 (E = 6.174)

Example: Finding Expected Frequency (3 of 4) Solution The first cell lies in the first row (with a total frequency of 66) and the first column (with a total frequency of 182). The “grand total” is the sum of all frequencies in the table, which is 253. The expected frequency of the first cell is E = (row total)(column total) / (grand total) = (66)(182) / 253 = 47.478.
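
The expected frequency of a cell is its row total times its column total, divided by the grand total. A minimal Python sketch of that rule for the stress fracture table follows; the function name expected_frequencies is illustrative only and does not come from the slides.

```python
def expected_frequencies(observed):
    """Expected cell counts under independence:
    (row total)(column total) / (grand total)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Observed counts from the stress fracture table; columns are (success, failure).
observed = [
    [54, 12],  # surgery
    [41, 51],  # weight-bearing cast
    [70, 3],   # non-weight-bearing cast for 6 weeks
    [17, 5],   # non-weight-bearing cast for less than 6 weeks
]

expected = expected_frequencies(observed)
first_cell = round(expected[0][0], 3)                    # 66 * 182 / 253
smallest = round(min(min(row) for row in expected), 3)   # used in the "E >= 5" check
print(first_cell, smallest)
```

The smallest expected frequency is the value to compare against 5 when checking the requirements.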

Example: Finding Expected Frequency (4 of 4) Interpretation We can interpret the expected value by stating that if we assume that success is independent of the treatment, then we expect to find that 47.478 of the subjects would be treated with surgery and that treatment would be successful. There is a discrepancy between O = 54 and E = 47.478, and such discrepancies are key components of the test statistic that is a collective measure of the overall disagreement between the observed frequencies and the frequencies expected with independence between the row and column variables.

Example: Does the Choice of Treatment for a Fracture Affect Success? (1 of 9) Use the same sample data from the previous example with a 0.05 significance level to test the claim that success of the treatment is independent of the type of treatment. What does the result indicate about the increasing trend to use surgery?

Example: Does the Choice of Treatment for a Fracture Affect Success? (2 of 9) Solution REQUIREMENT CHECK (1) On the basis of the study description, we will treat the subjects as being randomly selected and randomly assigned to the different treatment groups. (2) The results are expressed as frequency counts. (3) The expected frequencies are all at least 5. (The lowest expected frequency is 6.174.) The requirements are satisfied.

Example: Does the Choice of Treatment for a Fracture Affect Success? (3 of 9) Solution The null hypothesis and alternative hypothesis are as follows: H0: Success is independent of the treatment. H1: Success and the treatment are dependent. The significance level is α = 0.05.

Example: Does the Choice of Treatment for a Fracture Affect Success? (4 of 9) Solution Because the data are in the form of a contingency table, we use the χ² distribution with this test statistic: χ² = Σ (O − E)² / E = (54 − 47.478)² / 47.478 + ... + (5 − 6.174)² / 6.174 = 58.393.
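
The chi-square test statistic sums (O − E)² / E over every cell of the table. The sketch below computes it for the stress fracture data, together with the degrees of freedom (r − 1)(c − 1). The function name chi_square_statistic is a hypothetical helper, not something defined in the slides.

```python
def chi_square_statistic(observed):
    """Sum of (O - E)^2 / E over all cells, with E computed under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
            chi2 += (o - e) ** 2 / e
    return chi2


# Rows: four treatments; columns: (success, failure).
observed = [[54, 12], [41, 51], [70, 3], [17, 5]]

chi2 = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (r - 1)(c - 1)
print(round(chi2, 3), df)
```

The result should agree with the χ² value that the slides report from technology, with 3 degrees of freedom.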

Example: Does the Choice of Treatment for a Fracture Affect Success? (5 of 9) Solution P-Value from Technology If using technology, results typically include the χ² test statistic and the P-value. For example, see the accompanying XLSTAT display showing the test statistic is χ² = 58.393 and the P-value is less than 0.0001.

Example: Does the Choice of Treatment for a Fracture Affect Success? (6 of 9) Solution P-Value from Table A-4 If using Table A-4 instead of technology, first find the number of degrees of freedom: (r − 1)(c − 1) = (4 − 1)(2 − 1) = 3 degrees of freedom. Because the test statistic of χ² = 58.393 exceeds the highest value (12.838) in Table A-4 for the row corresponding to 3 degrees of freedom, we know that P-value < 0.005. Because the P-value is less than the significance level of 0.05, we reject the null hypothesis of independence between success and treatment.

Example: Does the Choice of Treatment for a Fracture Affect Success? (7 of 9) Solution Critical Value If using the critical value method of hypothesis testing, the critical value of χ² = 7.815 is found from Table A-4, with α = 0.05 in the right tail and the number of degrees of freedom given by (r − 1)(c − 1) = (4 − 1)(2 − 1) = 3.

Example: Does the Choice of Treatment for a Fracture Affect Success? (8 of 9) Solution Critical Value (cont.) The test statistic and critical value are shown in the figure. Because the test statistic does fall within the critical region, we reject the null hypothesis of independence between success and treatment.
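
Both decision methods rest on the right-tail area of the chi-square distribution. The sketch below evaluates that area with only the Python standard library, using the closed-form expressions that hold for whole-number degrees of freedom. The function name chi2_right_tail is an assumption made for illustration; statistical software normally supplies an equivalent routine.

```python
import math


def chi2_right_tail(x, df):
    """Right-tail area P(X >= x) for a chi-square distribution with integer df.

    Uses the finite-sum forms of the regularized upper incomplete gamma
    function, which are exact for integer degrees of freedom.
    """
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k-1/2) / Gamma(k+1/2)
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total


df = (4 - 1) * (2 - 1)                               # 3 degrees of freedom
p_value = chi2_right_tail(58.393, df)                # P-value method
area_beyond_critical = chi2_right_tail(7.815, df)    # should be close to alpha = 0.05
reject_null = p_value < 0.05
print(p_value, round(area_beyond_critical, 3), reject_null)
```

The area beyond 7.815 comes out near 0.05, which is what makes 7.815 the critical value for a right-tailed test at the 0.05 significance level with 3 degrees of freedom.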

Example: Does the Choice of Treatment for a Fracture Affect Success? (9 of 9) Interpretation It appears that success is dependent on the treatment. Although the results of this test do not tell us which treatment is best, we can see that the success rates of 81.8%, 44.6%, 95.9%, and 77.3% suggest that the best treatment is to use a non–weight-bearing cast for 6 weeks. These results suggest that the increasing use of surgery is a treatment strategy that is not supported by the evidence.
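
The success rates quoted in the interpretation are the successes in each row divided by that row's total. A short sketch, with the dictionary layout chosen only for illustration:

```python
# (successes, failures) for each treatment, from the stress fracture table.
counts = {
    "Surgery": (54, 12),
    "Weight-Bearing Cast": (41, 51),
    "Non-Weight-Bearing Cast for 6 Weeks": (70, 3),
    "Non-Weight-Bearing Cast for Less Than 6 Weeks": (17, 5),
}

# Success rate as a percentage, rounded to one decimal place.
success_rates = {
    treatment: round(100 * success / (success + failure), 1)
    for treatment, (success, failure) in counts.items()
}
best_treatment = max(success_rates, key=success_rates.get)

for treatment, rate in success_rates.items():
    print(f"{treatment}: {rate}%")
```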

Chi-Square Test of Homogeneity A chi-square test of homogeneity is a test of the claim that different populations have the same proportions of some characteristics.

Sampling from Different Populations In a typical test of independence, sample subjects are randomly selected from one population and values of two different variables are observed. In a typical chi-square test of homogeneity, subjects are randomly selected from the different populations separately.

Procedure In conducting a test of homogeneity, we can use the same notation, requirements, test statistic, critical value, and procedures given previously, with this exception: Instead of testing the null hypothesis of independence between the row and column variables, we test the null hypothesis that the different populations have the same proportion of some characteristic.

Example: The Lost Wallet Experiment (1 of 6) The next slide lists results from a Reader’s Digest experiment in which 12 wallets were intentionally lost in each of 16 different cities, including New York City, London, Amsterdam, and so on. Use a 0.05 significance level with the data to test the null hypothesis that the cities have the same proportion of returned wallets. The Reader’s Digest headline “Most Honest Cities: The Reader’s Digest Lost Wallet Test” implies that whether a wallet is returned is dependent on the city in which it was lost. Test the claim that the proportion of returned wallets is not the same in the 16 different cities.

Example: The Lost Wallet Experiment (2 of 6)
City                 A   B   C   D   E   F   G   H   I   J   K   L   M   N   O   P
Wallet Returned      8   5   7   11  5   8   6   7   3   1   4   2   4   6   4   9
Wallet Not Returned  4   7   5   1   7   4   6   5   9   11  8   10  8   6   8   3

Example: The Lost Wallet Experiment (3 of 6) Solution REQUIREMENT CHECK (1) Based on the description of the study, we will treat the subjects as being randomly selected and randomly assigned to the different cities. (2) The results are expressed as frequency counts. (3) The expected frequencies are all at least 5. (All expected values are either 5.625 or 6.375.) The requirements are satisfied.

Example: The Lost Wallet Experiment (4 of 6) Solution The null hypothesis and alternative hypothesis are as follows: H0: Whether a lost wallet is returned is independent of the city in which it was lost. H1: A lost wallet being returned depends on the city in which it was lost.

Example: The Lost Wallet Experiment (5 of 6) Solution The accompanying StatCrunch display shows the test statistic of χ² = 35.388 (rounded) and the P-value of 0.002 (rounded). Because the P-value of 0.002 is less than the significance level of 0.05, we reject the null hypothesis of independence between the two variables. (“If the P is low, the null must go.”)

Example: The Lost Wallet Experiment (6 of 6) Interpretation We reject the null hypothesis of independence, so it appears that the proportion of returned wallets depends on the city in which they were lost. There is sufficient evidence to conclude that the proportion of returned wallets is not the same in the 16 different cities.
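
The homogeneity test uses the same arithmetic as the test of independence. The sketch below reruns it for the lost wallet data. The per-city counts of returned wallets are an assumption taken from the textbook's data table (12 wallets per city); they reproduce the expected frequencies of 5.625 and 6.375 quoted in the requirement check. The helper names are hypothetical.

```python
import math

# Assumed per-city counts of returned wallets for cities A through P.
returned = [8, 5, 7, 11, 5, 8, 6, 7, 3, 1, 4, 2, 4, 6, 4, 9]
not_returned = [12 - r for r in returned]  # 12 wallets were lost in each city
observed = [returned, not_returned]        # 2 rows, 16 columns


def chi_square_statistic(table):
    """Return (chi-square statistic, expected frequencies) for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return chi2, expected


def chi2_right_tail(x, df):
    """Right-tail chi-square area for integer df (exact finite-sum forms)."""
    half = x / 2.0
    if df % 2 == 0:
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    total = math.erfc(math.sqrt(half))
    for k in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (k - 0.5) / math.gamma(k + 0.5)
    return total


chi2, expected = chi_square_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)  # (2 - 1)(16 - 1) = 15
p_value = chi2_right_tail(chi2, df)
distinct_expected = sorted({e for row in expected for e in row})
print(round(chi2, 3), df, round(p_value, 3), distinct_expected)
```

Under these assumed counts, the statistic and P-value should match the rounded values on the StatCrunch display described above.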

Fisher’s Exact Test The chi-square procedures for contingency tables require that every cell have an expected frequency of at least 5. Fisher’s exact test is often used for a 2 × 2 contingency table with one or more expected frequencies that are below 5. Fisher’s exact test provides an exact P-value. Because the calculations are quite complex, it’s a good idea to use technology.

Example: Does Yawning Cause Others to Yawn? The MythBusters show on the Discovery Channel tested the theory that when someone yawns, others are more likely to yawn. The results are summarized below. Using Fisher’s exact test results in a P-value of 0.513, so there is not sufficient evidence to support the myth that people exposed to yawning actually yawn more than those not exposed to yawning.
                         Subject Exposed to Yawning?
                         Yes    No
Did Subject Yawn? Yes    10     4
Did Subject Yawn? No     24     12
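
Fisher’s exact test treats the row and column totals as fixed and sums hypergeometric probabilities over tables at least as extreme as the one observed. The sketch below assumes a one-sided (right-tail) version, which is the form that matches the P-value of 0.513 quoted for the yawning data. The function name is illustrative only.

```python
from math import comb


def fisher_exact_right_tail(a, b, c, d):
    """One-sided Fisher exact P-value for the 2 x 2 table [[a, b], [c, d]].

    Returns P(top-left count >= a) when all row and column totals are fixed,
    computed from the hypergeometric distribution.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    k_max = min(row1, col1)
    favorable = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, k_max + 1)
    )
    return favorable / comb(n, row1)


# Rows: did the subject yawn (yes, no); columns: exposed to yawning (yes, no).
a, b, c, d = 10, 4, 24, 12

# Smallest expected frequency, to show why the chi-square requirement fails here.
n = a + b + c + d
smallest_expected = min(a + b, c + d) * min(a + c, b + d) / n

p_value = fisher_exact_right_tail(a, b, c, d)
print(round(smallest_expected, 2), round(p_value, 3))
```

One expected frequency falls below 5, which is why the exact test is used here instead of the chi-square approximation.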

McNemar’s Test for Matched Pairs (1 of 2) For 2 × 2 tables consisting of frequency counts that result from matched pairs, the frequency counts within each matched pair are not independent and, for such cases, we can use McNemar’s test of the null hypothesis that the frequencies from the discordant (different) categories occur in the same proportion.

McNemar’s Test for Matched Pairs (2 of 2)
                          Treatment X: Cured    Treatment X: Not Cured
Treatment Y: Cured        a                     b
Treatment Y: Not Cured    c                     d
McNemar’s test requires that for a table as shown, the frequencies are such that b + c ≥ 10. The test is a right-tailed chi-square test with the following test statistic: χ² = (|b − c| − 1)² / (b + c).

Example: Are Hip Protectors Effective? (1 of 4) A randomized controlled trial was designed to test the effectiveness of hip protectors in preventing hip fractures in the elderly. Nursing home residents each wore protection on one hip, but not the other. Results are summarized below.
                                       No Hip Protector Worn:    No Hip Protector Worn:
                                       No Hip Fracture           Hip Fracture
Hip Protector Worn: No Hip Fracture    309                       10
Hip Protector Worn: Hip Fracture       15                        2

Example: Are Hip Protectors Effective? (2 of 4) McNemar’s test can be used to test the null hypothesis that the following two proportions are the same: (1) the proportion of subjects with no hip fracture on the protected hip and a hip fracture on the unprotected hip; (2) the proportion of subjects with a hip fracture on the protected hip and no hip fracture on the unprotected hip.

Example: Are Hip Protectors Effective? (3 of 4) Solution Using the discordant (different) pairs with the general format, we have b = 10 and c = 15, so the test statistic is calculated as follows: χ² = (|b − c| − 1)² / (b + c) = (|10 − 15| − 1)² / (10 + 15) = 0.640.

Example: Are Hip Protectors Effective? (4 of 4) Solution With a 0.05 significance level and degrees of freedom given by df = 1, we refer to Table A-4 to find the critical value of χ² = 3.841 for this right-tailed test. The test statistic of χ² = 0.640 does not exceed the critical value of χ² = 3.841, so we fail to reject the null hypothesis. The proportion of hip fractures with the protectors worn is not significantly different from the proportion of hip fractures without the protectors worn.
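
McNemar’s test uses only the two discordant counts. The sketch below computes the continuity-corrected statistic (|b − c| − 1)² / (b + c) for the hip protector data and compares it with the critical value 3.841. The function name mcnemar_statistic is hypothetical, and the P-value line relies on the fact that, for 1 degree of freedom, the right-tail chi-square area equals erfc(√(χ²/2)).

```python
import math


def mcnemar_statistic(b, c):
    """Continuity-corrected McNemar statistic from the two discordant counts."""
    if b + c < 10:
        raise ValueError("McNemar's chi-square test requires b + c >= 10")
    return (abs(b - c) - 1) ** 2 / (b + c)


b = 10  # no fracture on the protected hip, fracture on the unprotected hip
c = 15  # fracture on the protected hip, no fracture on the unprotected hip

chi2 = mcnemar_statistic(b, c)
critical_value = 3.841                        # Table A-4, df = 1, alpha = 0.05
reject_null = chi2 > critical_value
p_value = math.erfc(math.sqrt(chi2 / 2))      # right-tail area for df = 1
print(round(chi2, 3), reject_null, round(p_value, 3))
```

The concordant counts (309 and 2) play no part in the statistic, since they carry no information about a difference between the protected and unprotected hips.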