1 Statistical Analysis Professor Lynne Stokes Department of Statistical Science Lecture #1 Chi-square Contingency Table Test.

Presentation transcript:

2 Independence: Employment Status is independent of Age. Note: One population, responses formed by two categorizations.

3 Homogeneity: If nondiscriminatory, promotions are binomially distributed with a common $\pi$ for both gender categories. Note: Two populations, common distribution of responses.

4 Cognitive Learning in Rats (Tolman, Ritchie, Kalish, 1946). Prior Theory: Discrete Learning Steps (Hull). Candidate Theory: Cognitive Learning (Tolman). [Figure: maze with a goal, a barrier, and paths A, B, C, D]

5 Goodness of Fit. [Table: Number of Rats by Path Chosen (A, B, C, D, Total); counts not recovered.] Evidence of cognitive learning? If random selection, Multinomial with $\pi_j = 1/4$.
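As a minimal sketch of the goodness-of-fit calculation this slide sets up, the snippet below computes the Pearson statistic $X^2 = \sum_j (O_j - E_j)^2 / E_j$ with $E_j = n\pi_j$ and df = (number of categories) - 1. The path counts are hypothetical placeholders, since the slide's table did not survive in this transcript, and the function name `gof_statistic` is introduced here only for illustration.

```python
def gof_statistic(observed, probs):
    """Pearson goodness-of-fit statistic for fully specified multinomial probabilities."""
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    n = sum(observed)
    # Expected count in category j under the null hypothesis: n * pi_j
    expected = [n * p for p in probs]
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1  # no parameters are estimated from the data
    return x2, df, expected


# Hypothetical counts for paths A, B, C, D (NOT the data from the slide).
counts = [4, 5, 8, 15]
x2, df, expected = gof_statistic(counts, [0.25] * 4)
print(f"X^2 = {x2:.2f} on {df} df")
```

A large value of the statistic relative to the chi-square distribution on 3 df would count as evidence against random path selection.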

6 Compare Incidence of Death Penalty. Are victim's race and sentence independent? Is aggravation level an explanatory factor? Aggravation levels: Drunk, Lover's Quarrel, Argument, etc. More Serious: Vicious, Cold-blooded, Unprovoked, Murder, etc.

7 Chi-Square Tests for Count Data. Independence: Distribution of responses across one categorization is identical for each category of a second categorization. Homogeneity: Distribution of responses is identical across several categories of one categorical variable or across several independent samples. Goodness of Fit: Responses are consistent with a stated probability distribution (parameters specified, or unknown parameter values).

8 Sampling Schemes

9 Chi-square Tests 1. Tests for independence in contingency tables

10 Contingency Tables (Crosstabs). Two categorizations (rows and columns), each with mutually exclusive categories. Sample of n independent observations. Are the two categorizations statistically independent? E.g., is employment status statistically independent of age? Note: Equivalent to Homogeneity Test, Unspecified p, When Only 2 Rows.

11 Notation for Observed Frequencies. Row categories are indexed i = 1, ..., r and column categories j = 1, ..., c. $O_{ij}$ = observed frequency in row i and column j; $R_i$ = row i total; $C_j$ = column j total; n = overall total.

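12 Chi-square Test for Independence. $H_0$: Row and column categories are independent. $H_a$: Row and column categories are not independent. If row and column categories are independent, the estimated expected frequencies are $E_{ij} = R_i C_j / n$ and the test statistic is $X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} (O_{ij} - E_{ij})^2 / E_{ij}$. Reject $H_0$ if $X^2 > \chi^2_\alpha$, where $\chi^2_\alpha$ is the upper-$\alpha$ critical value of the chi-square distribution with df = (r - 1)(c - 1).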

13 Degrees of Freedom for Contingency Tables. Given Row and Column Totals, df = (r - 1)(c - 1). Row 1: df = c - 1. Row 2: df = c - 1. ... Row r-1: df = c - 1. Row r: no additional df, because the estimated expected frequencies in column j must sum to $C_j$.
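The same count can be reached by a parameter-counting argument. This is offered as a sketch of the standard reasoning, not as something shown on the slide: start from the rc - 1 free cell probabilities and subtract the marginal probabilities that are estimated under independence.

```latex
\begin{aligned}
\text{df} &= \underbrace{(rc - 1)}_{\text{free cell probabilities}}
           - \underbrace{\big[(r - 1) + (c - 1)\big]}_{\text{estimated row and column probabilities}} \\
          &= rc - r - c + 1 \\
          &= (r - 1)(c - 1).
\end{aligned}
```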

14 Chi-square Contingency Table Test Summary. Reject $H_0$ if $X^2 > \chi^2_\alpha$, where $\chi^2_\alpha$ is the chi-square critical value with df = (r - 1)(c - 1). Notational Convention: $E_{ij}$ is written without a hat even though it is estimated.
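The summary above can be sketched in a few lines of Python. The function name `contingency_chi_square` and the 2 x 2 table are hypothetical and are not taken from the lecture. The code uses the usual estimated expected frequencies (row total times column total, divided by n) and df = (r - 1)(c - 1).

```python
def contingency_chi_square(table):
    """Pearson chi-square statistic for an r x c table of observed counts."""
    r, c = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Estimated expected frequency under independence: E_ij = R_i * C_j / n
    expected = [[ri * cj / n for cj in col_totals] for ri in row_totals]
    x2 = sum(
        (table[i][j] - expected[i][j]) ** 2 / expected[i][j]
        for i in range(r)
        for j in range(c)
    )
    df = (r - 1) * (c - 1)
    return x2, df, expected


# Hypothetical 2 x 2 table of counts (NOT data from the lecture).
observed = [[10, 20],
            [30, 40]]
x2, df, expected = contingency_chi_square(observed)
print(f"X^2 = {x2:.4f}, df = {df}")
```

The statistic is then compared with the chi-square critical value on the stated degrees of freedom.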

15 Employment Discrimination: Observed Frequencies, Expected Frequencies, Chi-square Calculation. [Tables not recovered.]

16 Employment Discrimination. [Table: Age (yrs) by Employment Status; counts not recovered.] Are age and employment status related?

17 Employment Discrimination. $H_0$: Employment Status and Age are independent. $H_a$: Employment Status and Age are not independent. Reject $H_0$ if $X^2 > \chi^2_\alpha$ ($\alpha$ = 0.01, df = 1). $X^2$ = [value not recovered]. Conclusion: There is sufficient evidence (p < 0.001), using a significance level of 0.05, to conclude that employment status and age are not statistically independent. Reason: A greater number of older employees were terminated than expected under the hypothesis of independence.

18 Drug Usage. [Table: Group by Frequency of Drug Use; counts not recovered.]

19 Drug Usage: Observed Frequencies, Expected Frequencies, Chi-Square Calculation. [Tables not recovered.]

20 Drug Usage. $H_0$: Drug Usage and Campus Group are independent. $H_a$: Drug Usage and Campus Group are not independent. Reject $H_0$ if $X^2 > \chi^2_\alpha$ ($\alpha$ = 0.05, df = 2). $X^2$ = 6.87. Conclusion: Using a significance level of 0.05, there is sufficient evidence (0.025 < p < 0.05) to conclude that drug usage and campus group are not statistically independent. Reason: A greater number of athletes and fewer members of campus organizations reported monthly usage of drugs than expected under the hypothesis of independence.
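The decision on this slide can be checked with a short calculation. With df = 2 the chi-square distribution is an exponential distribution with mean 2, so the upper-tail probability is exp(-x/2) and the critical value is -2 ln(alpha). The sketch below relies on that df = 2 special case only, and the function names are hypothetical.

```python
import math


def chi2_df2_p_value(x2):
    """Upper-tail probability of a chi-square variable with exactly 2 df."""
    return math.exp(-x2 / 2.0)


def chi2_df2_critical(alpha):
    """Upper-alpha critical value of the chi-square distribution with exactly 2 df."""
    return -2.0 * math.log(alpha)


x2 = 6.87      # statistic reported on the slide
alpha = 0.05   # significance level stated on the slide
p_value = chi2_df2_p_value(x2)
critical = chi2_df2_critical(alpha)
reject = x2 > critical
print(f"critical value = {critical:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

The computed p-value falls inside the 0.025 to 0.05 bracket quoted in the conclusion.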

22 Chi-square Tests 1. Tests for independence in contingency tables 2. Tests for homogeneity

23 Binomial Samples (Product Binomial Sampling). Hypothesis #1: Is $\pi_W = 0.5$? Binomial inference on $\pi$; equivalently, overall goodness of fit (known $\pi$). Hypothesis #2: Are all the $\pi_W$ equal? Test for homogeneity (equal but unknown $\pi$). Hypothesis #3: Is each $\pi_W = 0.5$? Goodness of fit (8 samples, known $\pi$). Genetic Theory: $H_0$: $\pi_W = 0.5$ vs. $H_a$: $\pi_W \neq 0.5$. Assumptions: 8 samples, mutually independent counts.

24 Test of Homogeneity of k Binomial Samples, Specified $\pi$. $H_0$: $\pi_1 = \pi_2 = \dots = \pi_8 = 0.5$ vs. $H_a$: $\pi_j \neq 0.5$ for some j. $X^2$ = 22.96, df = 8, p = [value not recovered]. Does not assume homogeneity (see below).

25 Test of Homogeneity of k Binomial Samples: Unspecified $\pi$. $H_0$: $\pi_1 = \pi_2 = \dots = \pi_8$ vs. $H_a$: $\pi_j \neq \pi_k$ for some (j, k).
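To make the difference between the specified and unspecified cases concrete, here is a minimal sketch with hypothetical counts. The lecture's eight samples are not reproduced in this transcript, and `binomial_homogeneity` is an illustrative name. When $\pi$ is specified, each of the k samples contributes one degree of freedom. When a common $\pi$ is estimated from the pooled data, one degree of freedom is lost.

```python
def binomial_homogeneity(successes, trials, pi0=None):
    """Pearson X^2 over the success/failure cells of k independent binomial samples.

    pi0 given -> specified pi, df = k.
    pi0 None  -> common pi estimated by pooling, df = k - 1.
    """
    k = len(successes)
    if pi0 is None:
        pi = sum(successes) / sum(trials)  # pooled estimate of the common pi
        df = k - 1
    else:
        pi = pi0
        df = k
    x2 = 0.0
    for y, n in zip(successes, trials):
        e_success = n * pi
        e_failure = n * (1.0 - pi)
        x2 += (y - e_success) ** 2 / e_success
        x2 += ((n - y) - e_failure) ** 2 / e_failure
    return x2, df


# Hypothetical data: 3 samples of 20 trials each (NOT the lecture's data).
successes = [12, 8, 15]
trials = [20, 20, 20]
x2_spec, df_spec = binomial_homogeneity(successes, trials, pi0=0.5)
x2_unspec, df_unspec = binomial_homogeneity(successes, trials)
print(f"specified pi:   X^2 = {x2_spec:.3f}, df = {df_spec}")
print(f"unspecified pi: X^2 = {x2_unspec:.3f}, df = {df_unspec}")
```

Within each sample the failure count is determined by the success count, so only one cell of each pair is free. That is why the degrees of freedom are counted in samples rather than cells.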

26 Test of Homogeneity of k Binomial Samples: Unspecified $\pi$. $H_0$: $\pi_1 = \pi_2 = \dots = \pi_8$ vs. $H_a$: $\pi_j \neq \pi_k$ for some (j, k). $X^2$ = 20.43, df = 7, p = [value not recovered]. Note: Only one of each pair of expected values is independently estimated (k = 8, not 16).
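The p-values for the two reported statistics (22.96 on 8 df and 20.43 on 7 df) did not survive in this transcript, but they can be recomputed. The sketch below uses only the Python standard library and the classical finite-sum expressions for the chi-square upper tail with integer degrees of freedom. The function name `chi2_sf` is an assumption made for illustration. A statistics package would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with integer df > x)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2.0) * total
    # Odd df: erfc(sqrt(x/2)) plus a finite sum over odd-number products
    total = math.erfc(math.sqrt(x / 2.0))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return total


# Statistics and degrees of freedom as reported on the slides.
p_specified = chi2_sf(22.96, 8)    # specified pi, df = 8
p_unspecified = chi2_sf(20.43, 7)  # unspecified pi, df = 7
print(f"specified pi:   p = {p_specified:.4f}")
print(f"unspecified pi: p = {p_unspecified:.4f}")
```

Both tail probabilities are well below 0.05, so either test would reject its null hypothesis at that level.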