Chapter 10 Analyzing the Association Between Categorical Variables

Slides:



Advertisements
Similar presentations
Hypothesis Testing IV Chi Square.
Advertisements

Chapter 13: The Chi-Square Test
Copyright ©2011 Brooks/Cole, Cengage Learning More about Inference for Categorical Variables Chapter 15 1.
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Categorical Variables Chapter 15.
Statistics 303 Chapter 9 Two-Way Tables. Relationships Between Two Categorical Variables Relationships between two categorical variables –Depending on.
Chapter 26: Comparing Counts. To analyze categorical data, we construct two-way tables and examine the counts of percents of the explanatory and response.
Review for Exam 2 Some important themes from Chapters 6-9 Chap. 6. Significance Tests Chap. 7: Comparing Two Groups Chap. 8: Contingency Tables (Categorical.
1 Chapter 20 Two Categorical Variables: The Chi-Square Test.
Presentation 12 Chi-Square test.
Chapter 10 Analyzing the Association Between Categorical Variables
How Can We Test whether Categorical Variables are Independent?
Copyright (C) 2002 Houghton Mifflin Company. All rights reserved. 1 Understandable Statistics S eventh Edition By Brase and Brase Prepared by: Lynn Smith.
Goodness-of-Fit Tests and Categorical Data Analysis
Chapter 11 Chi-Square Procedures 11.3 Chi-Square Test for Independence; Homogeneity of Proportions.
Chi-Square as a Statistical Test Chi-square test: an inferential statistics technique designed to test for significant relationships between two variables.
BPS - 5th Ed. Chapter 221 Two Categorical Variables: The Chi-Square Test.
Statistical Significance for a two-way table Inference for a two-way table We often gather data and arrange them in a two-way table to see if two categorical.
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc. Chap 11-1 Chapter 11 Chi-Square Tests Business Statistics: A First Course Fifth Edition.
Section 10.2 Independence. Section 10.2 Objectives Use a chi-square distribution to test whether two variables are independent Use a contingency table.
© Copyright McGraw-Hill CHAPTER 11 Other Chi-Square Tests.
1 Chapter 11: Analyzing the Association Between Categorical Variables Section 11.1: What is Independence and What is Association?
Chapter Outline Goodness of Fit test Test of Independence.
Chapter 11: Chi-Square  Chi-Square as a Statistical Test  Statistical Independence  Hypothesis Testing with Chi-Square The Assumptions Stating the Research.
11.2 Tests Using Contingency Tables When data can be tabulated in table form in terms of frequencies, several types of hypotheses can be tested by using.
© Copyright McGraw-Hill 2004
Section 12.2: Tests for Homogeneity and Independence in a Two-Way Table.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 10 Comparing Two Groups Section 10.1 Categorical Response: Comparing Two Proportions.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 11 Analyzing the Association Between Categorical Variables Section 11.2 Testing Categorical.
Lecture 11. The chi-square test for goodness of fit.
Chapter 13- Inference For Tables: Chi-square Procedures Section Test for goodness of fit Section Inference for Two-Way tables Presented By:
Chapter 14 – 1 Chi-Square Chi-Square as a Statistical Test Statistical Independence Hypothesis Testing with Chi-Square The Assumptions Stating the Research.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 12 Tests of Goodness of Fit and Independence n Goodness of Fit Test: A Multinomial.
BPS - 5th Ed. Chapter 221 Two Categorical Variables: The Chi-Square Test.
Statistics 300: Elementary Statistics Section 11-3.
Chapter 11: Categorical Data n Chi-square goodness of fit test allows us to examine a single distribution of a categorical variable in a population. n.
Class Seven Turn In: Chapter 18: 32, 34, 36 Chapter 19: 26, 34, 44 Quiz 3 For Class Eight: Chapter 20: 18, 20, 24 Chapter 22: 34, 36 Read Chapters 23 &
Section 10.2 Objectives Use a contingency table to find expected frequencies Use a chi-square distribution to test whether two variables are independent.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 11 Analyzing the Association Between Categorical Variables Section 11.1 Independence.
Chi-Square hypothesis testing
Warm Up Check your understanding on p You do NOT need to calculate ALL the expected values by hand but you need to do at least 2. You do NOT need.
Presentation 12 Chi-Square test.
CHAPTER 11 Inference for Distributions of Categorical Data
Chi-square test or c2 test
10 Chapter Chi-Square Tests and the F-Distribution Chapter 10
CHAPTER 11 CHI-SQUARE TESTS
Chapter 11 Chi-Square Tests.
Hypothesis Testing Review
Chapter 12 Tests with Qualitative Data
CHAPTER 11 Inference for Distributions of Categorical Data
Qualitative data – tests of association
Chi Square Two-way Tables
AP Stats Check In Where we’ve been… Chapter 7…Chapter 8…
Chapter 11: Inference for Distributions of Categorical Data
CHAPTER 11 Inference for Distributions of Categorical Data
Contingency Tables: Independence and Homogeneity
Overview and Chi-Square
Chapter 11 Chi-Square Tests.
Inference on Categorical Data
Lesson 11 - R Chapter 11 Review:
Analyzing the Association Between Categorical Variables
CHAPTER 11 CHI-SQUARE TESTS
CHAPTER 11 Inference for Distributions of Categorical Data
Copyright © Cengage Learning. All rights reserved.
UNIT V CHISQUARE DISTRIBUTION
S.M.JOSHI COLLEGE, HADAPSAR
Chapter 11 Analyzing the Association Between Categorical Variables
Chapter Outline Goodness of Fit test Test of Independence.
Chapter 11 Chi-Square Tests.
Analysis of two-way tables
Presentation transcript:

Chapter 10 Analyzing the Association Between Categorical Variables Learn …. How to detect and describe associations between categorical variables

What Is Independence and What is Association? Section 10.1 What Is Independence and What is Association?

Example: Is There an Association Between Happiness and Family Income?

Example: Is There an Association Between Happiness and Family Income?

Example: Is There an Association Between Happiness and Family Income? The percentages in a particular row of a table are called conditional percentages They form the conditional distribution for happiness, given a particular income level

Example: Is There an Association Between Happiness and Family Income?

Guidelines when constructing tables with conditional distributions: Example: Is There an Association Between Happiness and Family Income? Guidelines when constructing tables with conditional distributions: Make the response variable the column variable Compute conditional proportions for the response variable within each row Include the total sample sizes

Independence vs Dependence For two variables to be independent, the population percentage in any category of one variable is the same for all categories of the other variable For two variables to be dependent (or associated), the population percentages in the categories are not all the same

Example: Happiness and Gender

Example: Happiness and Gender

Example: Belief in Life After Death

Example: Belief in Life After Death Are race and belief in life after death independent or dependent? The conditional distributions in the table are similar but not exactly identical It is tempting to conclude that the variables are dependent

Example: Belief in Life After Death Are race and belief in life after death independent or dependent? The definition of independence between variables refers to a population The table is a sample, not a population

Independence vs Dependence Even if variables are independent, we would not expect the sample conditional distributions to be identical Because of sampling variability, each sample percentage typically differs somewhat from the true population percentage

How Can We Test whether Categorical Variables are Independent? Section 10.2 How Can We Test whether Categorical Variables are Independent?

A Significance Test for Categorical Variables The hypotheses for the test are: H0: The two variables are independent Ha: The two variables are dependent (associated) The test assumes random sampling and a large sample size

What Do We Expect for Cell Counts if the Variables Are Independent? The count in any particular cell is a random variable Different samples have different values for the count The mean of its distribution is called an expected cell count This is found under the presumption that H0 is true

How Do We Find the Expected Cell Counts? For a particular cell, the expected cell count equals:

Example: Happiness by Family Income

The Chi-Squared Test Statistic The chi-squared statistic summarizes how far the observed cell counts in a contingency table fall from the expected cell counts for a null hypothesis

Example: Happiness and Family Income

Example: Happiness and Family Income State the null and alternative hypotheses for this test H0: Happiness and family income are independent Ha: Happiness and family income are dependent (associated)

Example: Happiness and Family Income Report the statistic and explain how it was calculated: To calculate the statistic, for each cell, calculate: Sum the values for all the cells The value is 73.4

Example: Happiness and Family Income The larger the value, the greater the evidence against the null hypothesis of independence and in support of the alternative hypothesis that happiness and income are associated

The Chi-Squared Distribution To convert the test statistic to a P-value, we use the sampling distribution of the statistic For large sample sizes, this sampling distribution is well approximated by the chi-squared probability distribution

The Chi-Squared Distribution

The Chi-Squared Distribution Main properties of the chi-squared distribution: It falls on the positive part of the real number line The precise shape of the distribution depends on the degrees of freedom: df = (r-1)(c-1)

The Chi-Squared Distribution Main properties of the chi-squared distribution: The mean of the distribution equals the df value It is skewed to the right The larger the value, the greater the evidence against H0: independence

The Chi-Squared Distribution

The Five Steps of the Chi-Squared Test of Independence 1. Assumptions: Two categorical variables Randomization Expected counts ≥ 5 in all cells

The Five Steps of the Chi-Squared Test of Independence 2. Hypotheses: H0: The two variables are independent Ha: The two variables are dependent (associated)

The Five Steps of the Chi-Squared Test of Independence 3. Test Statistic:

The Five Steps of the Chi-Squared Test of Independence 4. P-value: Right-tail probability above the observed value, for the chi-squared distribution with df = (r-1)(c-1) 5. Conclusion: Report P-value and interpret in context If a decision is needed, reject H0 when P-value ≤ significance level

Chi-Squared is Also Used as a “Test of Homogeneity” The chi-squared test does not depend on which is the response variable and which is the explanatory variable When a response variable is identified and the population conditional distributions are identical, they are said to be homogeneous The test is then referred to as a test of homogeneity

Example: Aspirin and Heart Attacks Revisited

Example: Aspirin and Heart Attacks Revisited What are the hypotheses for the chi-squared test for these data? The null hypothesis is that whether a doctor has a heart attack is independent of whether he takes placebo or aspirin The alternative hypothesis is that there’s an association

Example: Aspirin and Heart Attacks Revisited Report the test statistic and P-value for the chi-squared test: The test statistic is 25.01 with a P-value of 0.000 This is very strong evidence that the population proportion of heart attacks differed for those taking aspirin and for those taking placebo

Example: Aspirin and Heart Attacks Revisited The sample proportions indicate that the aspirin group had a lower rate of heart attacks than the placebo group

Limitations of the Chi-Squared Test If the P-value is very small, strong evidence exists against the null hypothesis of independence But… The chi-squared statistic and the P-value tell us nothing about the nature of the strength of the association

Limitations of the Chi-Squared Test We know that there is statistical significance, but the test alone does not indicate whether there is practical significance as well