Two Way Tables and the Chi-Square Test


● Here we study relationships between two categorical variables.
– The data can be displayed in a two-way table (also called a contingency table), showing the counts or percents of individuals that fall into various categories.
● How do we test whether there is a relationship between the independent variable and the dependent variable?
– This is a special case of hypothesis testing: all the general ideas apply.
– We make use of the Chi-square distribution and the Chi-square statistic.

Two-Way Tables
● Each row represents a value of the independent (or explanatory) variable; each column represents a value of the dependent variable.
– Not an iron rule. Some like to do the opposite. It doesn't matter as long as you know what you are doing and make it clear to the reader.
● The number of observations falling into each combination of categories is entered into the corresponding cell of the table.
– A visual idea of the (lack of) relationship between the two variables is based on the percents computed from the counts in the table (raw counts can be misleading because the groups have unequal sample sizes).
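To make the point about percents concrete, here is a minimal Python sketch that turns rows of counts into row percents. The counts are the ones used in the religiosity example in these slides; the helper name `row_percents` is an illustrative choice, not something from the original deck.

```python
# Counts from the religiosity example in these slides
# (columns: Never allow, Depends, Personal choice).
counts = {
    "Low": [44, 313, 445],
    "Moderate": [41, 218, 186],
    "High": [115, 237, 80],
}


def row_percents(row):
    """Convert one row of counts to percents of that row's total (hypothetical helper)."""
    total = sum(row)
    return [round(100 * c / total, 1) for c in row]


percents = {level: row_percents(row) for level, row in counts.items()}
for level, pct in percents.items():
    print(level, pct)
```

Comparing the rows of `percents` rather than the raw counts removes the effect of the different group sizes (802, 445, and 432).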

Two-Way Table: Example
(Abortion opinions, by level of attendance at religious services)

                    Abortion opinion
Religiosity   Never allow   Depends     Personal choice   Total
Low           44            313         445               802
              (5.5%)        (39%)       (55.5%)           (100%)
Moderate      41            218         186               445
              (9.2%)        (49%)       (41.8%)           (100%)
High          115           237         80                432
              (26.6%)       (54.9%)     (18.5%)           (100%)
Total         200           768         711               1,679
              (11.9%)       (45.7%)     (42.4%)           (100%)

The Hypotheses
● The null hypothesis: Religiosity has no effect on Abortion opinion.
● The alternative: Religiosity has an effect on Abortion opinion.
● Key idea: Under the null hypothesis, the distribution over the different values of Abortion opinion should be the same regardless of the value of Religiosity, and is approximately given by the last row of the table: (11.9%, 45.7%, 42.4%).
– So ask: if the null hypothesis were true, what would the expected counts be in the cells for each value of Religiosity?
● e.g., for Religiosity = “low”, there are a total of 802 observations. How many of these should say “Never allow”, “Depends”, and “Personal choice”? Answer:
– 11.9% * 802, 45.7% * 802, 42.4% * 802,
– i.e., (95, 367, 340)
● This is what we'd expect if the null is true. What we actually observe is instead (44, 313, 445).
● Are differences like these due to random chance? Or are they “significant”, so that we would reject the null hypothesis?
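The expected-count arithmetic for the “low” row can be reproduced in a few lines of Python. This is only a sketch of the calculation described on the slide; the variable names are illustrative.

```python
# Overall distribution of Abortion opinion (last row of the table), in percent.
overall_percents = [11.9, 45.7, 42.4]

# Number of observations with Religiosity = "low".
row_total = 802

# Expected counts if the null hypothesis were true, rounded to whole people.
expected = [round(p / 100 * row_total) for p in overall_percents]

# What was actually observed in that row.
observed = [44, 313, 445]

# Observed minus expected, cell by cell.
differences = [o - e for o, e in zip(observed, expected)]

print(expected)
print(differences)
```

The gaps between observed and expected counts are the raw material for the test statistic.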

Two-Way Table Example: Showing Expected Counts
(Expected counts are shown in parentheses after the observed counts.)

                    Abortion opinion
Religiosity   Never allow   Depends     Personal choice   Total
Low           44 (95)       313 (367)   445 (340)         802
              (5.5%)        (39%)       (55.5%)           (100%)
Moderate      41 (53)       218 (204)   186 (188)         445
              (9.2%)        (49%)       (41.8%)           (100%)
High          115 (52)      237 (198)   80 (183)          432
              (26.6%)       (54.9%)     (18.5%)           (100%)
Total         200           768         711               1,679
              (11.9%)       (45.7%)     (42.4%)           (100%)

Testing the Hypotheses
● Now recall the logic of hypothesis testing: we need to find the probability of observing the test statistic (or something more extreme) if the null is true (the p-value). If the p-value is “small enough”, we reject the null.
● In our current situation, what is the test statistic? What is its distribution?
● Intuitively, the test statistic would involve all the differences between the observed counts and the expected counts.
● Surely that's the case! And it turns out that our test statistic follows a different distribution from the normal, a so-called “Chi-Square distribution.” But that's about it. All the rest of the ideas are the same as in the tests we discussed before.

The Chi-square Statistic
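The formula from this slide did not survive in the transcript. For reference, the standard Pearson chi-square statistic, which matches the "observed versus expected counts" idea used throughout these slides, can be written as follows (the symbols O and E are the usual textbook notation, assumed here rather than taken from the deck):

```latex
\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E},
\qquad
E = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}}
```

Here O is the observed count in a cell and E is the count expected in that cell if the null hypothesis were true. Each cell's squared difference is divided by E so that a gap of a given size counts for more in a cell where few observations were expected.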

Sampling Distribution of the Chi-square Statistic

The Family of Chi-square Distributions
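The members of the family are indexed by their degrees of freedom. For a two-way table, the usual rule, written here in standard notation that is assumed rather than copied from the slide, is:

```latex
df = (r - 1)(c - 1)
```

where r is the number of rows and c is the number of columns, not counting the totals. A 2×2 table therefore has df = 1 and the 3×3 religiosity table has df = (3 - 1)(3 - 1) = 4.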

Table of Critical Values for Chi-Square Test

Making the Decision: Is the Relationship Statistically Significant?
● For 2×2 tables, the “magic number” is 3.84: if the Chi-Square statistic is greater than or equal to this value, the relationship is considered significant at the α = 0.05 level.
● What's the magic number for our 3×3 example? (9.49)
● Our computed Chi-Square statistic in the above example is … What's our conclusion?
● Stata example:
    sysuse nlsw88
    tab race married, chi2
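As a cross-check on the decision rule, the sketch below computes the chi-square statistic for the religiosity table in plain Python and compares it with the 9.49 cutoff quoted above. It uses unrounded expected counts (row total × column total / grand total), so its result may differ slightly from a hand calculation based on the rounded values in the table; the function name `chi_square_statistic` is an illustrative choice.

```python
# Observed counts from the religiosity example
# (rows: Low, Moderate, High; columns: Never allow, Depends, Personal choice).
observed = [
    [44, 313, 445],
    [41, 218, 186],
    [115, 237, 80],
]


def chi_square_statistic(table):
    """Return (statistic, degrees of freedom) for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under the null hypothesis of no relationship.
            exp = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - exp) ** 2 / exp

    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


stat, df = chi_square_statistic(observed)

# Critical value for a 3x3 table at the 0.05 level, as quoted on the slide.
critical_value = 9.49
reject_null = stat >= critical_value

print(f"chi-square = {stat:.2f}, df = {df}, reject null: {reject_null}")
```

The statistic lands far above 9.49, so under this sketch the null hypothesis of no relationship is rejected at the 0.05 level.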

Chi-Square Test for a Single Variable
● The Chi-Square test can easily be adapted to test whether the distribution of a single variable departs from some expected distribution.
● e.g., are some months more popular for giving birth than others?
● We have observations on the number of births in each of the 12 months.
● Under the null hypothesis that all months are equally popular, the expected distribution is 1/12 for each of the 12 months.
● Compute Chi-Square the usual way, comparing the observed and the expected data. Degrees of freedom: 12 - 1 = 11.
● Another view: the “dependent variable” has 12 values (the two-way table has 12 columns).
● Imagine an “independent variable” that takes two values, “observed” and “expected”; for the latter, the cell counts are the same as the expected distribution.
● Compute Chi-Square the usual way; degrees of freedom = (12-1)(2-1) = 11.
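A minimal sketch of the single-variable version follows. The monthly birth counts are made up purely for illustration (the slides give no data), and the critical value 19.68 for 11 degrees of freedom at the 0.05 level is taken from a standard chi-square table, not from this deck.

```python
# Hypothetical births per month, January to December (invented for illustration).
births = [98, 92, 101, 97, 103, 105, 110, 112, 108, 100, 90, 84]

total = sum(births)

# Under the null hypothesis every month gets 1/12 of all births.
expected_per_month = total / 12

# Same statistic as before: sum of (observed - expected)^2 / expected.
stat = sum((obs - expected_per_month) ** 2 / expected_per_month for obs in births)

# Degrees of freedom: number of categories minus one.
df = len(births) - 1

# Standard table value for df = 11 at the 0.05 level (an assumption, not from the slides).
critical_value = 19.68
reject_null = stat >= critical_value

print(f"chi-square = {stat:.2f}, df = {df}, reject null: {reject_null}")
```

With these invented counts the statistic stays below the cutoff, so the data would not give grounds to say that some months are more popular than others.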