Chapter 16 Goodness-of-Fit Tests and Contingency Tables

Chapter 16: Goodness-of-Fit Tests and Contingency Tables
Statistics for Business and Economics, 6th Edition
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.

Chapter Goals
After completing this chapter, you should be able to:
  Use the chi-square goodness-of-fit test to determine whether data fit a specified distribution
  Perform tests for normality
  Set up a contingency analysis table and perform a chi-square test of association

Chi-Square Goodness-of-Fit Test
Do sample data conform to a hypothesized distribution?
Examples:
  Do sample results conform to specified expected probabilities?
  Are technical support calls equal across all days of the week? (i.e., do calls follow a uniform distribution?)
  Do measurements from a production process follow a normal distribution?

Chi-Square Goodness-of-Fit Test (continued)
Are technical support calls equal across all days of the week? (i.e., do calls follow a uniform distribution?)
Sample data for 10 days per day of week:

Day         Sum of calls for this day
Monday      290
Tuesday     250
Wednesday   238
Thursday    257
Friday      265
Saturday    230
Sunday      192
Total       1722

Logic of Goodness-of-Fit Test
If calls are uniformly distributed, the 1722 calls would be expected to be equally divided across the 7 days:
  $1722 / 7 = 246$ expected calls per day if uniform
Chi-Square Goodness-of-Fit Test: test to see if the sample results are consistent with the expected results

Observed vs. Expected Frequencies

Day         Observed O_i   Expected E_i
Monday      290            246
Tuesday     250            246
Wednesday   238            246
Thursday    257            246
Friday      265            246
Saturday    230            246
Sunday      192            246
TOTAL       1722           1722

Chi-Square Test Statistic
H0: The distribution of calls is uniform over days of the week
H1: The distribution of calls is not uniform
The test statistic is
  $\chi^2 = \sum_{i=1}^{K} \frac{(O_i - E_i)^2}{E_i}$
where:
  K = number of categories
  O_i = observed frequency for category i
  E_i = expected frequency for category i

The Rejection Region
H0: The distribution of calls is uniform over days of the week
H1: The distribution of calls is not uniform
Reject H0 if $\chi^2 > \chi^2_{\alpha}$ (with K - 1 degrees of freedom)
[Figure: chi-square density. Do not reject H0 to the left of $\chi^2_{\alpha}$; reject H0 in the upper tail of area $\alpha$ to its right.]

Chi-Square Test Statistic
H0: The distribution of calls is uniform over days of the week
H1: The distribution of calls is not uniform
K - 1 = 6 (7 days of the week), so use 6 degrees of freedom: with $\alpha = .05$, $\chi^2_{.05} = 12.5916$
Conclusion: $\chi^2 = 23.05 > \chi^2_{.05} = 12.5916$, so reject H0 and conclude that the distribution is not uniform
[Figure: chi-square density with 6 degrees of freedom. Do not reject H0 to the left of $\chi^2_{.05} = 12.5916$; reject H0 in the upper tail of area $\alpha = .05$.]
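
The calculation behind this conclusion can be reproduced in a few lines. The following is a minimal sketch in plain Python, using the observed counts and the critical value 12.5916 quoted in the slides; the variable names are illustrative and not from the text.

```python
# Observed call totals from the slides (10 days sampled per weekday).
observed = {
    "Monday": 290, "Tuesday": 250, "Wednesday": 238, "Thursday": 257,
    "Friday": 265, "Saturday": 230, "Sunday": 192,
}

total = sum(observed.values())        # total number of calls
expected = total / len(observed)      # equal share per day under H0 (uniform)

# Chi-square statistic: sum over the K categories of (O_i - E_i)^2 / E_i.
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())

df = len(observed) - 1                # K - 1 degrees of freedom
critical_value = 12.5916              # chi-square, 6 df, alpha = .05 (from the slides)

reject_h0 = chi_square > critical_value
print(f"chi-square = {chi_square:.2f}, df = {df}, reject H0: {reject_h0}")
```

The critical value is hard-coded from the slides rather than computed, so the sketch needs no statistics library.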

Goodness-of-Fit Tests, Population Parameters Unknown
Idea: Test whether data follow a specified distribution (such as binomial, Poisson, or normal) without assuming the parameters of the distribution are known
Use sample data to estimate the unknown population parameters

Goodness-of-Fit Tests, Population Parameters Unknown (continued)
Suppose that a null hypothesis specifies category probabilities that depend on the estimation (from the data) of m unknown population parameters
The appropriate goodness-of-fit test is the same as in the previous section, except that the number of degrees of freedom for the chi-square random variable is
  $\text{Degrees of freedom} = K - m - 1$
where K is the number of categories
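
As a small illustration of the degrees-of-freedom adjustment, the sketch below subtracts one degree of freedom for each estimated parameter, i.e. K - m - 1. The function name and the Poisson and normal cases are hypothetical examples, not taken from the slides; only the uniform case corresponds to the call data above.

```python
def gof_degrees_of_freedom(num_categories: int, num_estimated_params: int) -> int:
    """Degrees of freedom K - m - 1 for a chi-square goodness-of-fit test."""
    df = num_categories - num_estimated_params - 1
    if df < 1:
        raise ValueError("too few categories for the number of estimated parameters")
    return df


# Fully specified distribution (m = 0), as in the 7-day call example.
uniform_df = gof_degrees_of_freedom(7, 0)

# Hypothetical: Poisson fit over 6 categories, mean estimated from the data (m = 1).
poisson_df = gof_degrees_of_freedom(6, 1)

# Hypothetical: normal fit over 8 categories, mean and variance estimated (m = 2).
normal_df = gof_degrees_of_freedom(8, 2)

print(uniform_df, poisson_df, normal_df)
```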

Test of Normality
The assumption that data follow a normal distribution is common in statistics
Normality was assessed in prior chapters
  Normal probability plots (Chapter 6)
  Normal quantile plots (Chapter 8)
Here, a chi-square test is developed

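Test of Normality (continued)
Two population parameters can be estimated using sample data:
  $\text{Skewness} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{n s^3}$
  $\text{Kurtosis} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{n s^4}$
where $\bar{x}$ is the sample mean and s is the sample standard deviation
For a normal distribution, Skewness = 0 and Kurtosis = 3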
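
A minimal sketch of how the two sample measures might be computed. It assumes the moment-based definitions Skewness = sum((x_i - mean)^3) / (n s^3) and Kurtosis = sum((x_i - mean)^4) / (n s^4), with s the sample standard deviation; other texts and software use slightly different conventions. The data values are made up for illustration only.

```python
import statistics


def sample_skewness(data):
    """Sum of cubed deviations divided by n * s^3 (assumed definition)."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)        # sample standard deviation (n - 1 divisor)
    return sum((x - mean) ** 3 for x in data) / (n * s ** 3)


def sample_kurtosis(data):
    """Sum of fourth-power deviations divided by n * s^4 (assumed definition)."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)
    return sum((x - mean) ** 4 for x in data) / (n * s ** 4)


# Hypothetical, perfectly symmetric data: skewness should be zero.
values = [1, 2, 3, 4, 5]
skew = sample_skewness(values)
kurt = sample_kurtosis(values)
print(f"skewness = {skew:.3f}, kurtosis = {kurt:.3f}")
```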

Bowman-Shelton Test for Normality
Consider the null hypothesis that the population distribution is normal
The Bowman-Shelton Test for Normality is based on the closeness of the sample skewness to 0 and the sample kurtosis to 3
The test statistic is
  $B = n \left[ \frac{(\text{Skewness})^2}{6} + \frac{(\text{Kurtosis} - 3)^2}{24} \right]$
As the number of sample observations becomes very large, this statistic has a chi-square distribution with 2 degrees of freedom
The null hypothesis is rejected for large values of the test statistic

Bowman-Shelton Test for Normality (continued)
The chi-square approximation is close only for very large sample sizes
If the sample size is not very large, the Bowman-Shelton test statistic is compared to significance points from text Table 16.7

Sample size N   10% point   5% point
20              2.13        3.26
30              2.49        3.71
40              2.70        3.99
50              2.90        4.26
75              3.09        4.27
100             3.14        4.29
125             3.31        4.34
150             3.43        4.39
200             3.48        4.43
250             3.54        ...
300             3.68        4.60
400             3.76        4.74
500             3.91        4.82
800             4.32        5.46
∞               4.61        5.99

Example: Bowman-Shelton Test for Normality
The average daily temperature has been recorded for 200 randomly selected days, with sample skewness 0.232 and kurtosis 3.319
Test the null hypothesis that the true distribution is normal
  $B = 200 \left[ \frac{(0.232)^2}{6} + \frac{(3.319 - 3)^2}{24} \right] = 2.642$
From Table 16.7 the 10% critical value for n = 200 is 3.48; since 2.642 < 3.48, there is not sufficient evidence to reject the hypothesis that the population is normal
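
The temperature example can be checked numerically. This sketch assumes the usual form of the Bowman-Shelton statistic, B = n[Skewness^2/6 + (Kurtosis - 3)^2/24], and takes the 10% significance point 3.48 for n = 200 from Table 16.7; the function name is illustrative.

```python
def bowman_shelton(n: int, skewness: float, kurtosis: float) -> float:
    """Bowman-Shelton statistic: n * (skewness^2 / 6 + (kurtosis - 3)^2 / 24)."""
    return n * (skewness ** 2 / 6 + (kurtosis - 3) ** 2 / 24)


# Values from the temperature example in the slides.
b_stat = bowman_shelton(n=200, skewness=0.232, kurtosis=3.319)

critical_10pct = 3.48                 # Table 16.7, n = 200, 10% point
reject_normality = b_stat > critical_10pct

print(f"B = {b_stat:.3f}, reject normality at 10%: {reject_normality}")
```

Because the statistic falls below the 10% point, it also falls below the 5% point, so normality is not rejected at either level.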

Contingency Tables
Used to classify sample observations according to a pair of attributes
Also called a cross-classification or cross-tabulation table
Assume r categories for attribute A and c categories for attribute B
Then there are (r x c) possible cross-classifications

r x c Contingency Table

                   Attribute B
Attribute A   1      2      ...    c      Totals
1             O11    O12    ...    O1c    R1
2             O21    O22    ...    O2c    R2
...           ...    ...    ...    ...    ...
r             Or1    Or2    ...    Orc    Rr
Totals        C1     C2     ...    Cc     n

Test for Association
Consider n observations tabulated in an r x c contingency table
Denote by O_ij the number of observations in the cell that is in the ith row and the jth column
The null hypothesis is
  H0: No association exists between the two attributes in the population
The appropriate test is a chi-square test with (r-1)(c-1) degrees of freedom

Test for Association (continued)
Let R_i and C_j be the row and column totals
The expected number of observations in the cell in row i and column j, given that H0 is true, is
  $E_{ij} = \frac{R_i C_j}{n}$
A test of association at significance level $\alpha$ is based on the chi-square distribution and the following decision rule:
  Reject H0 if $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} > \chi^2_{(r-1)(c-1), \alpha}$

Contingency Table Example
Left-Handed vs. Gender
Dominant Hand: Left vs. Right
Gender: Male vs. Female
H0: There is no association between hand preference and gender
H1: Hand preference is not independent of gender

Contingency Table Example (continued)
Sample results organized in a contingency table:

           Hand Preference
Gender     Left    Right    Total
Female     12      108      120
Male       24      156      180
Total      36      264      300

Sample size = n = 300:
  120 Females, 12 were left handed
  180 Males, 24 were left handed

Logic of the Test
H0: There is no association between hand preference and gender
H1: Hand preference is not independent of gender
If H0 is true, then the proportion of left-handed females should be the same as the proportion of left-handed males
The two proportions above should be the same as the proportion of left-handed people overall

Finding Expected Frequencies
120 Females, 12 were left handed
180 Males, 24 were left handed
Overall: P(Left Handed) = 36/300 = .12
If no association, then P(Left Handed | Female) = P(Left Handed | Male) = .12
So we would expect 12% of the 120 females and 12% of the 180 males to be left handed, i.e., we would expect
  (120)(.12) = 14.4 females to be left handed
  (180)(.12) = 21.6 males to be left handed

Expected Cell Frequencies (continued)
Expected cell frequencies:
  $E_{ij} = \frac{R_i C_j}{n} = \frac{(\text{ith row total})(\text{jth column total})}{\text{total sample size}}$
Example:
  $E_{11} = \frac{(120)(36)}{300} = 14.4$
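
The expected count for every cell can be derived from the row and column totals in one pass. The sketch below applies E_ij = R_i C_j / n to the hand-preference counts from the example; the helper name `expected_frequencies` is illustrative.

```python
def expected_frequencies(observed):
    """Expected cell counts E_ij = R_i * C_j / n under no association."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: Female, Male. Columns: Left, Right (counts from the slides).
observed = [[12, 108],
            [24, 156]]

expected = expected_frequencies(observed)
for label, row in zip(["Female", "Male"], expected):
    print(label, [round(e, 1) for e in row])
```

Each row of expected counts sums to the same row total as the observed counts, which gives a quick sanity check on the arithmetic.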

Observed vs. Expected Frequencies
Observed frequencies vs. expected frequencies:

           Hand Preference
Gender     Left               Right               Total
Female     Observed = 12      Observed = 108      120
           Expected = 14.4    Expected = 105.6
Male       Observed = 24      Observed = 156      180
           Expected = 21.6    Expected = 158.4
Total      36                 264                 300

The Chi-Square Test Statistic
The chi-square test statistic is:
  $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$
where:
  O_ij = observed frequency in cell (i, j)
  E_ij = expected frequency in cell (i, j)
  r = number of rows
  c = number of columns

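Contingency Analysis
With (2 - 1)(2 - 1) = 1 degree of freedom and $\alpha = 0.05$, $\chi^2_{.05} = 3.841$
Decision Rule: If $\chi^2 > 3.841$, reject H0; otherwise, do not reject H0
  $\chi^2 = \frac{(12 - 14.4)^2}{14.4} + \frac{(108 - 105.6)^2}{105.6} + \frac{(24 - 21.6)^2}{21.6} + \frac{(156 - 158.4)^2}{158.4} = 0.7576$
Here, $\chi^2 = 0.7576 < 3.841$, so we do not reject H0: the sample does not provide sufficient evidence that gender and hand preference are associated
[Figure: chi-square density with 1 degree of freedom. Do not reject H0 to the left of $\chi^2_{.05} = 3.841$; reject H0 in the upper tail of area $\alpha = 0.05$.]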
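
The whole test of association can be sketched end to end. The code below computes the expected counts, the chi-square statistic, and the degrees of freedom for the hand-preference table, then applies the decision rule with the critical value 3.841 quoted in the slides. With these counts the statistic evaluates to roughly 0.7576. The function name is illustrative.

```python
def chi_square_association(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n   # expected count under H0
            statistic += (o_ij - e_ij) ** 2 / e_ij

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Rows: Female, Male. Columns: Left, Right (counts from the slides).
table = [[12, 108],
         [24, 156]]

chi_square, df = chi_square_association(table)
critical_value = 3.841                # chi-square, 1 df, alpha = .05 (from the slides)
reject_h0 = chi_square > critical_value

print(f"chi-square = {chi_square:.4f}, df = {df}, reject H0: {reject_h0}")
```

The same function works for larger r x c tables, provided the matching critical value for (r-1)(c-1) degrees of freedom is supplied.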

Chapter Summary
Used the chi-square goodness-of-fit test to determine whether sample data match specified probabilities
Conducted goodness-of-fit tests when a population parameter was unknown
Tested for normality using the Bowman-Shelton test
Used contingency tables to perform a chi-square test for association
Compared observed cell frequencies to expected cell frequencies