Published by Mabel Christiana Booth. Modified over 9 years ago.
1
1 Statistical Analysis Professor Lynne Stokes Department of Statistical Science Lecture #1 Chi-square Contingency Table Test
2
2 Independence Employment Status is independent of Age Note: One population, responses formed by two categorizations
3
3 Homogeneity If nondiscriminatory, promotions are binomially distributed with a common success probability for both gender categories. Note: Two populations, common distribution of responses
4
4 Cognitive Learning in Rats -- Tolman, Ritchie, Kalish (1946) Prior Theory (Hull): Discrete Learning Steps. Candidate Theory (Tolman): Cognitive Learning. [Figure: maze with paths A, B, C, D, a barrier, and the goal]
5
5 Goodness of Fit
Path Chosen:     A   B   C   D   Total
Number of Rats:  4   5   8   15  32
Evidence of cognitive learning? If random selection, Multinomial with $\pi_j = 1/4$
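As a rough check of the goodness-of-fit idea on this slide, the sketch below computes the Pearson statistic for the four counts against equal probabilities of 1/4. The variable names are illustrative, and the closed-form tail probability used here is the standard one for 3 degrees of freedom; neither is taken from the lecture.

```python
import math

# Observed counts from the slide: 32 rats over four paths.
# The statistic does not depend on which path holds which count,
# because every path has the same expected count under H0.
observed = [4, 5, 8, 15]
n = sum(observed)
expected = [n * 0.25] * len(observed)  # 8 rats per path under random selection

# Pearson statistic: sum of (O - E)^2 / E over the categories.
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1  # all probabilities specified, nothing estimated

# Upper-tail probability of a chi-square variable with 3 df (closed form).
p_value = (math.erfc(math.sqrt(x2 / 2))
           + math.sqrt(2 * x2 / math.pi) * math.exp(-x2 / 2))

print(f"X^2 = {x2:.2f}, df = {df}, p = {p_value:.3f}")
```

A small p-value here would count against random path selection, which is the sense in which the slide asks about evidence of cognitive learning.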
6
6 Compare Incidence of Death Penalty Are victim’s race and sentence independent? Is aggravation level an explanatory factor? Aggravation levels range from "Drunk, Lover’s Quarrel, Argument, etc." to the more serious "Vicious, Cold-blooded, Unprovoked, Murder, etc."
7
7 Chi-Square Tests for Count Data
Independence: Distribution of responses across one categorization is identical for each category of a second categorization
Homogeneity: Distribution of responses is identical across several categories of one categorical variable or across several independent samples
Goodness of Fit: Responses are consistent with a stated probability distribution (parameters specified, or unknown parameter values)
8
8 Sampling Schemes
9
9 Chi-square Tests 1. Tests for independence in contingency tables
10
10 Contingency Tables (Crosstabs) Two categorizations (rows and columns), each with mutually exclusive categories. Sample of n independent observations. Are the two categorizations statistically independent? e.g., Is employment status statistically independent of age? Note: Equivalent to the homogeneity test with unspecified probability when there are only 2 rows
11
11 Notation for Observed Frequencies Row categories are indexed i = 1, ..., r and column categories j = 1, ..., c. $O_{ij}$ is the observed frequency in row i and column j, $R_i$ is the row i total, $C_j$ is the column j total, and n is the overall total.
12
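12 Chi-square Test for Independence $H_0$: Row and column categories are independent. $H_a$: Row and column categories are not independent. If row and column categories are independent, the expected frequency in cell (i, j) is estimated from the margins as $E_{ij} = R_i C_j / n$. Reject $H_0$ if $X^2 > \chi^2_{\alpha}$, where $X^2$ is the chi-square statistic with df = (r - 1)(c - 1)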
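For reference, the test on this slide can be written out in its usual Pearson form. The symbols $O_{ij}$, $R_i$, $C_j$, and n follow the notation slide; the formulas themselves are the textbook ones and are stated here as an assumption about what the slide displayed.

```latex
E_{ij} = \frac{R_i \, C_j}{n}, \qquad
X^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
\text{reject } H_0 \text{ if } X^2 > \chi^2_{\alpha,\,(r-1)(c-1)}
```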
13
13 Degrees of Freedom for Contingency Tables Given row and column totals, df = (r - 1)(c - 1). Row 1: df = c - 1. Row 2: df = c - 1. ... Row r-1: df = c - 1. Row r: no free cells, because the estimated expected frequencies in column j must sum to $C_j$
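The counting argument above can be illustrated with a short sketch: once the margins are fixed, choosing the (r - 1)(c - 1) cells in the upper-left block determines every other cell. The function name `complete_table` and the margins are hypothetical, chosen only for illustration.

```python
def complete_table(free_cells, row_totals, col_totals):
    """Fill an r x c table from its (r-1) x (c-1) free cells and the margins."""
    r, c = len(row_totals), len(col_totals)
    # Last cell of each of the first r-1 rows is forced by the row total.
    table = [row[:] + [row_totals[i] - sum(row)]
             for i, row in enumerate(free_cells)]
    # The whole last row is forced: each column must sum to its column total.
    last_row = [col_totals[j] - sum(table[i][j] for i in range(r - 1))
                for j in range(c)]
    table.append(last_row)
    return table

# Hypothetical margins for a 3 x 3 table (n = 60), not taken from the lecture.
row_totals = [20, 25, 15]
col_totals = [30, 18, 12]
free_cells = [[10, 6], [13, 7]]  # (3 - 1) * (3 - 1) = 4 freely chosen cells

table = complete_table(free_cells, row_totals, col_totals)
df = (len(row_totals) - 1) * (len(col_totals) - 1)
print(table, df)
```

Only the four entries in `free_cells` were chosen; the remaining five follow from the totals, which matches df = 4 for a 3 x 3 table.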
14
14 Chi-square Contingency Table Test Summary Reject $H_0$ if $X^2 > \chi^2_{\alpha}$, where $X^2$ is the chi-square statistic and df = (r - 1)(c - 1). Notational convention: the expected frequencies are written $E_{ij}$ even though they are estimated
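The summary can be sketched as a small function that estimates the expected frequencies from the margins and accumulates the statistic. The function name `chi_square_independence` and the 2 x 3 table are hypothetical; the lecture's own tables did not survive in this transcript.

```python
def chi_square_independence(observed):
    """Return (X^2, df, expected) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Estimated expected frequency: row total * column total / n.
    expected = [[ri * cj / n for cj in col_totals] for ri in row_totals]
    x2 = sum((o - e) ** 2 / e
             for o_row, e_row in zip(observed, expected)
             for o, e in zip(o_row, e_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return x2, df, expected

# Hypothetical counts for illustration only.
observed = [[20, 30, 50],
            [30, 30, 40]]
x2, df, expected = chi_square_independence(observed)
print(f"X^2 = {x2:.3f}, df = {df}")
```

With df = 2, the statistic would be compared with the 0.05 critical value of 5.991 quoted later in the lecture.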
15
15 Employment Discrimination Observed Frequencies Expected Frequencies Chi-square Calculation
16
16 Employment Discrimination Are age and employment status related? [Table: Employment Status by Age (yrs)]
17
17 Employment Discrimination $H_0$: Employment Status and Age are independent. $H_a$: Employment Status and Age are not independent. Reject $H_0$ if $X^2 > 6.635$ ($\alpha = 0.01$, df = 1). $X^2 = 138.67$. Conclusion: There is sufficient evidence (p < 0.001), using a significance level of 0.01, to conclude that employment status and age are not statistically independent. Reason: A greater number of older employees were terminated than expected under the hypothesis of independence.
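The decision on this slide can be checked numerically. The sketch below uses only the statistic and critical value quoted above, together with the standard closed form for the upper tail of a chi-square variable with 1 degree of freedom; the variable names are illustrative.

```python
import math

x2 = 138.67        # statistic reported on the slide
critical = 6.635   # chi-square critical value quoted for df = 1

# For df = 1, P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(x2 / 2))
reject = x2 > critical

print(f"reject H0: {reject}, p = {p_value:.3g}")
```

The tail probability is far below 0.001, in line with the p-value bound stated in the conclusion.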
18
18 Drug Usage [Table: Frequency of Drug Use by Group]
19
19 Drug Usage Observed Frequencies Expected Frequencies Chi-Square Calculation
20
20 Drug Usage $H_0$: Drug Usage and Campus Group are independent. $H_a$: Drug Usage and Campus Group are not independent. Reject $H_0$ if $X^2 > 5.991$ ($\alpha = 0.05$, df = 2). $X^2 = 6.87$. Conclusion: Using a significance level of 0.05, there is sufficient evidence (0.025 < p < 0.05) to conclude that drug usage and campus group are not statistically independent. Reason: A greater number of athletes and fewer members of campus organizations reported monthly usage of drugs than expected under the hypothesis of independence.
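The p-value bracket quoted in the conclusion can be verified with the closed form for 2 degrees of freedom, where the upper tail is a plain exponential. This is a minimal sketch using only the numbers on the slide; the variable names are illustrative.

```python
import math

x2 = 6.87          # statistic reported on the slide
critical = 5.991   # chi-square critical value quoted for df = 2

# For df = 2, P(chi-square > x) = exp(-x / 2).
p_value = math.exp(-x2 / 2)
reject = x2 > critical

print(f"reject H0: {reject}, p = {p_value:.4f}")
```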
21
21
22
22 Chi-square Tests 1. Tests for independence in contingency tables 2. Tests for homogeneity
23
23 Binomial Samples (Product Binomial Sampling)
Genetic Theory: $H_0: \pi = 0.5$ vs. $H_a: \pi \neq 0.5$
Assumptions: 8 samples, mutually independent counts
Hypothesis #1: Is $\pi = 0.5$? Binomial inference on $\pi$. Equivalently, overall goodness of fit (known $\pi$)
Hypothesis #2: Are all the $\pi_j$ equal? Test for homogeneity (equal but unknown $\pi$)
Hypothesis #3: Is each $\pi_j = 0.5$? Goodness of fit (8 samples, known $\pi$)
24
24 Test of Homogeneity of k Binomial Samples, Specified $\pi$ $H_0: \pi_1 = \pi_2 = \dots = \pi_8 = 0.5$ vs. $H_a: \pi_j \neq 0.5$ for some j. $X^2 = 22.96$, df = 8, p = 0.003. Does not assume homogeneity (see below)
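A minimal sketch of this test, assuming the usual Pearson form in which each sample contributes a success cell and a failure cell. The function name `chi_square_specified` and the counts are hypothetical; the lecture's eight samples are not reproduced in this transcript.

```python
def chi_square_specified(successes, trials, pi0):
    """Pearson X^2 for k binomial samples against a specified probability pi0."""
    x2 = 0.0
    for y, n in zip(successes, trials):
        exp_success = n * pi0
        exp_failure = n * (1 - pi0)
        x2 += (y - exp_success) ** 2 / exp_success
        x2 += ((n - y) - exp_failure) ** 2 / exp_failure
    df = len(trials)  # nothing is estimated, so each sample contributes 1 df
    return x2, df

# Hypothetical counts for four samples of 20 trials each.
successes = [12, 9, 15, 11]
trials = [20, 20, 20, 20]
x2, df = chi_square_specified(successes, trials, pi0=0.5)
print(f"X^2 = {x2:.2f}, df = {df}")
```

Because the probability is fully specified, df equals the number of samples, which is why the slide reports df = 8 for eight samples.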
25
25 Test of Homogeneity of k Binomial Samples: Unspecified $\pi$ $H_0: \pi_1 = \pi_2 = \dots = \pi_8$ vs. $H_a: \pi_j \neq \pi_k$ for some (j, k)
26
26 Test of Homogeneity of k Binomial Samples: Unspecified $\pi$ $H_0: \pi_1 = \pi_2 = \dots = \pi_8$ vs. $H_a: \pi_j \neq \pi_k$ for some (j, k). $X^2 = 20.43$, df = 7, p = 0.005. Note: Only one of each pair of expected values is independently estimated (k = 8, not 16)
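When the common probability is not specified, it is usually estimated by pooling all samples, which costs one degree of freedom. The sketch below assumes that pooled-estimate approach; the function name `chi_square_homogeneity` and the counts are hypothetical and are not the lecture's data.

```python
def chi_square_homogeneity(successes, trials):
    """Pearson X^2 for equality of k binomial probabilities (value unspecified)."""
    pooled = sum(successes) / sum(trials)  # estimate of the common probability
    x2 = 0.0
    for y, n in zip(successes, trials):
        exp_success = n * pooled
        exp_failure = n * (1 - pooled)
        x2 += (y - exp_success) ** 2 / exp_success
        x2 += ((n - y) - exp_failure) ** 2 / exp_failure
    df = len(trials) - 1  # one df is used to estimate the common probability
    return x2, df, pooled

# Hypothetical counts for four samples of 20 trials each.
successes = [12, 9, 15, 11]
trials = [20, 20, 20, 20]
x2, df, pooled = chi_square_homogeneity(successes, trials)
print(f"X^2 = {x2:.3f}, df = {df}, pooled estimate = {pooled:.4f}")
```

With eight samples this gives df = 7, matching the slide, and the failure cell in each sample is determined by the success cell, which is the point of the note about k = 8 rather than 16.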