22-1 Copyright © 2010 McGraw-Hill Australia Pty Ltd PowerPoint slides to accompany Croucher, Introductory Mathematics and Statistics, 5e Chapter 22 Analysis of Frequency Data

Presentation transcript:

22-1 Copyright © 2010 McGraw-Hill Australia Pty Ltd PowerPoint slides to accompany Croucher, Introductory Mathematics and Statistics, 5e Chapter 22 Analysis of Frequency Data Introductory Mathematics & Statistics

22-2 Learning Objectives
– Understand the meaning of a categorical variable
– Understand the difference between a single-variable problem and a two-variable problem
– Construct a table for a single-variable problem
– Construct a contingency table for a two-variable problem
– Analyse single-variable data
– Analyse two-variable data

22-3 22.1 Categorical data
Data are often non-numerical, in the sense that each individual observation is a description rather than a number. Averages cannot be used in these circumstances.
Systems where the observations are descriptive (rather than numerical) are described as categorical, because the individuals are being classified into categories.
Examples:
– What gender are you?
– What colour are your eyes?
– Do you have a valid driver’s licence?
– What suburb do you live in?
– Have you ever travelled overseas?
– Who is your favourite lecturer?
– Do you have an internet connection at home?

22-4 22.1 Categorical data (cont…)
The following statistical questions also involve categorical variables:
– Are people who are avid followers of sport more likely to own a large-screen television than those who do not follow sport?
– Does area of residence affect the likelihood of owning a motor vehicle?
– Do people who live in a particular part of a city have different radio preferences from those who live elsewhere?
– Do males and females differ in their level of interest in attending the opera?
– Is there a significantly higher proportion of older wine-drinkers than younger wine-drinkers?

22-5 22.1 Categorical data (cont…)
These questions may also conveniently be expressed as questions about differences between proportions, such as:
– Does the proportion of individuals owning a large-screen television differ between avid followers of sport and others?
– Does the proportion of people who own motor vehicles differ from one area of residence to another?
– Does the proportion of people preferring various radio stations differ depending on where people live in a city?
– Does the proportion of males interested in attending the opera differ from the proportion for females?
– Does the proportion of wine-drinkers differ with age?

22-6 22.2 Single-variable categorical data
It is common practice to have a standard form of presentation.
It is convenient to work with frequency data, that is, data in which the number of occurrences of each category is recorded.
A frequency table is a table in which the number of occurrences of each category is recorded.
Table 22.1 Outcomes of 60 rolls of a fair six-sided die
  Category    1   2   3   4   5   6   Total
  Frequency                           60
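As an illustration of how a frequency table is built from raw categorical observations, here is a minimal Python sketch. The function name `frequency_table` and the ten sample rolls are hypothetical and are not the data behind Table 22.1.

```python
from collections import Counter


def frequency_table(observations):
    """Count how often each category occurs; return sorted rows and the total."""
    counts = Counter(observations)
    rows = sorted(counts.items())          # (category, frequency) pairs
    total = sum(counts.values())           # total number of observations
    return rows, total


# Hypothetical raw data: ten rolls of a six-sided die.
rolls = [3, 1, 6, 3, 2, 5, 3, 6, 1, 4]
rows, total = frequency_table(rolls)

print(f"{'Category':>8} {'Frequency':>9}")
for category, frequency in rows:
    print(f"{category:>8} {frequency:>9}")
print(f"{'Total':>8} {total:>9}")
```

The total row is simply the sum of the frequencies, so it should always equal the number of observations recorded.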

22-7 22.3 Contingency tables
Some problems involve two categorical variables, and questions often arise about their relationship.
A contingency table is a two-dimensional table in which one variable is presented along the rows and the other variable down the columns.
Table 22.3 A typical contingency table for the residence and internet survey
                         Live
  Internet   North   South   East   West   Total
  Yes
  No
  Total

22-8 22.3 Contingency tables (cont…)
Contingency tables have characteristics that are common to all such tables. These include:
– The final column is a total column
– The final row is a total row
– It generally does not matter which variable is along the columns and which is along the rows
– Frequencies must add up along each row
– Frequencies must add up down each column
– The value in the bottom right-hand corner of the table represents the total number of observations overall. It is often referred to as the grand total frequency
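The row, column and grand totals described above can be sketched in a few lines of Python. The helper name `with_totals` and the counts are hypothetical, chosen only to show the bookkeeping; they are not the survey data of Table 22.3.

```python
def with_totals(table):
    """Append a total column to every row, then a total row at the bottom."""
    rows = [row + [sum(row)] for row in table]
    rows.append([sum(col) for col in zip(*rows)])
    return rows


# Hypothetical counts. Rows: internet Yes / No. Columns: North, South, East, West.
observed = [
    [30, 25, 20, 25],
    [20, 25, 30, 25],
]

full = with_totals(observed)
grand_total = full[-1][-1]  # bottom right-hand corner of the table

for row in full:
    print(" ".join(f"{value:>5}" for value in row))
```

Because the total row is computed after the total column has been added, the bottom right-hand cell is the sum of the row totals, which must also equal the sum of the column totals.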

22-9 22.4 Analysis of single-variable problems
The question to be answered is whether an observed set of categorical data is reasonably consistent with what was expected by some prior line of reasoning. The analysis of a single-variable problem is known as a goodness-of-fit test.
The steps involved in the analysis of a single-variable problem are as follows:
1. Construct the null hypothesis for the problem. This usually takes the general form:
   H0: There is no difference between the observed frequencies and the expected frequencies
   This should be modified for each individual problem. The alternative hypothesis (using a two-sided alternative) is:
   H1: There is a difference between the observed frequencies and the expected frequencies

22-10 22.4 Analysis of single-variable problems (cont…)
2. Obtain the observed frequencies from the data of the problem
3. Determine the expected frequencies; these are the ones we might ‘expect’ to occur if H0 were true
4. Calculate the measure of the discrepancy between the observed and expected frequencies using the chi-square test statistic:
   $\chi^2 = \sum \frac{(O - E)^2}{E}$
   where O is an observed frequency, E is the corresponding expected frequency, and the sum is taken over all categories
– The symbol χ² is called ‘chi-square’, with the ‘chi’ being pronounced as ‘ky’
– Since the square of a number can never be negative, the value of a χ²-test statistic can also never be negative

22-11 22.4 Analysis of single-variable problems (cont…)
5. Associated with the test statistic are degrees of freedom. Determine the degrees of freedom for a goodness-of-fit test using:
   Degrees of freedom = number of categories – 1
6. Obtain the critical value from Table 9. Two pieces of information are required: the degrees of freedom (down the left-hand column) and the significance level desired (across the top row)

22-12 22.4 Analysis of single-variable problems (cont…)
7. Compare the value of χ² that you calculated with the critical value from Table 9
   If χ² < the critical value, we cannot reject H0
   If χ² > the critical value, we reject H0
8. Based on the outcome of Step 7, draw an appropriate conclusion

22-13 22.4 Analysis of single-variable problems (cont…)
Example
Suppose that a statistician is presented with a six-sided die and asked to determine whether it is ‘fair’, that is, whether it is equally likely that the outcome will be a 1, 2, 3, 4, 5 or 6 when the die is tossed. The die is rolled a total of 300 times. The outcomes are shown in the following table:
  Outcome   Frequency
  1         48
  2         57
  3         60
  4         42
  5         44
  6         49
  Total     300

22-14 22.4 Analysis of single-variable problems (cont…)
Solution
If the die is really fair, there is a 1/6 probability that any given face will appear at any roll. Thus, in a loose sense, the 300 rolls would be ‘expected’ to yield 300 × 1/6 = 50 occurrences of each face.
Step 1: H0: The die is fair
        H1: The die is not fair
Step 2: The observed frequencies are the actual values obtained for each category; that is, 48, 57, 60, 42, 44 and 49
Step 3: Since H0 assumes that the die is fair, the expected frequency for each category is the same, that is, 300 × 1/6 = 50

22-15 22.4 Analysis of single-variable problems (cont…)
Step 4: For the die, the calculations required for the χ²-test statistic are:
   $\chi^2 = \frac{(48-50)^2}{50} + \frac{(57-50)^2}{50} + \frac{(60-50)^2}{50} + \frac{(42-50)^2}{50} + \frac{(44-50)^2}{50} + \frac{(49-50)^2}{50} = \frac{4 + 49 + 100 + 64 + 36 + 1}{50} = \frac{254}{50} = 5.08$
Step 5: For the die, since there are 6 categories, the degrees of freedom are 6 – 1 = 5

22-16 22.4 Analysis of single-variable problems (cont…)
Step 6: If a significance level of α = 0.05 is desired, we go to the degrees of freedom row 5 and column 0.05 to obtain a critical value of 11.07
Step 7: For the die, we have χ² = 5.08 and 5.08 < 11.07. Therefore, in this case, we cannot reject H0
Step 8: Since we cannot reject H0, the conclusion is that it is quite possible that the die is fair. That is, the evidence of the outcomes of the rolls does not give us grounds to conclude that the die is not fair
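The eight steps of the die example can be traced in a short Python sketch. The observed frequencies are those of the example; the function name `chi_square_statistic` is hypothetical, and the critical value 11.07 is hard-coded on the assumption that it is the standard chi-square table entry for 5 degrees of freedom at the 0.05 level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Step 2: observed frequencies for faces 1 to 6 (from the example).
observed = [48, 57, 60, 42, 44, 49]

# Step 3: under H0 (the die is fair) every face is expected 300 * 1/6 = 50 times.
n = sum(observed)
expected = [n / len(observed)] * len(observed)

# Step 4: test statistic.
chi2 = chi_square_statistic(observed, expected)

# Step 5: degrees of freedom = number of categories - 1.
df = len(observed) - 1

# Step 6: assumed table value for df = 5 and significance level 0.05.
CRITICAL_VALUE = 11.07

# Step 7: reject H0 only if the statistic exceeds the critical value.
reject_h0 = chi2 > CRITICAL_VALUE

print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {reject_h0}")
```

Hard-coding the critical value mirrors the table lookup in Step 6; a statistics library could supply it instead, at the cost of an extra dependency.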

22-17 22.5 Analysis of contingency tables
The χ² technique can be generalised to the case where two variables are involved. The data will be in the form of a contingency table with any number of rows and columns.
The steps involved in the analysis of contingency tables are as follows:
1. Construct the null hypothesis for the problem. This usually takes the general form that the two variables are independent or that there is no relationship between them:
   H0: The two variables are independent
   or
   H0: There is no relationship between the two variables
   The alternative hypothesis (using a two-sided alternative) would be:
   H1: The two variables are not independent
   or
   H1: There is a relationship between the two variables

22-18 22.5 Analysis of contingency tables (cont…)
2. Identify the observed frequencies from the data of the problem. There will be one observed frequency for each cell of the contingency table
3. Calculate the expected frequencies, those that we might ‘expect’ to occur if H0 were true. For each cell of the contingency table there will also be an expected frequency. The expected frequency for each cell can be found using:
   $E = \frac{\text{row total} \times \text{column total}}{\text{grand total frequency}}$
   The grand total frequency can be found in the bottom right-hand corner of the table
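As a sketch of Step 3, the expected frequency of each cell is the product of its row total and column total divided by the grand total frequency. The function name `expected_frequencies` and the counts below are hypothetical and serve only to illustrate the calculation.

```python
def expected_frequencies(observed):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical counts. Rows: internet Yes / No. Columns: North, South, East, West.
observed = [
    [30, 25, 20, 25],
    [20, 25, 30, 25],
]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

In this made-up table every row total is 100 and every column total is 50, so each expected frequency is 100 × 50 / 200 = 25.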

22-19 22.5 Analysis of contingency tables (cont…)
4. Calculate the measure of the discrepancy between the observed and expected frequencies using the χ² test statistic. The formula is:
   $\chi^2 = \sum \frac{(O - E)^2}{E}$
   where O is the observed frequency and E the expected frequency of a cell. Note that there is one term required in the calculation for each cell of the table
5. Determine the degrees of freedom for the contingency table:
   Degrees of freedom = (number of rows – 1) × (number of columns – 1)

22-20 22.5 Analysis of contingency tables (cont…)
6. Obtain the critical value from Table 9, using both the degrees of freedom and the desired significance level
7. Compare the value of χ² that you calculated with the critical value from Table 9
   If χ² < the critical value, we cannot reject H0
   If χ² > the critical value, we can reject H0
8. Based on the outcome of Step 7, draw an appropriate conclusion
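The steps for a contingency table can be put together in one Python sketch. The function name `chi_square_independence`, the counts and the small lookup of critical values are assumptions made for illustration: the counts are invented, and the lookup holds standard chi-square table entries at the 0.05 significance level rather than a copy of Table 9. Rows and columns exclude the totals.

```python
def chi_square_independence(observed):
    """Return the chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
            chi2 += (o - e) ** 2 / e                         # one term per cell

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Assumed standard critical values for a 0.05 significance level, keyed by df.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

# Hypothetical counts. Rows: internet Yes / No. Columns: North, South, East, West.
observed = [
    [30, 25, 20, 25],
    [20, 25, 30, 25],
]

chi2, df = chi_square_independence(observed)
reject_h0 = chi2 > CRITICAL_05[df]

print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {reject_h0}")
```

For these invented counts the statistic falls below the critical value, so H0 (the two variables are independent) would not be rejected.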

22-21 Summary
We have understood:
– the meaning of a categorical variable
– the difference between a single-variable problem and a two-variable problem
We constructed:
– a table for a single-variable problem
– a contingency table for a two-variable problem
We analysed single-variable data.
Lastly, we analysed two-variable data.