Statistical Analysis Professor Lynne Stokes

Department of Statistical Science Lecture #2 Chi-square Tests for Homogeneity, Chi-square Goodness of Fit Test,

Chi-square Tests
Tests for independence in contingency tables
Tests for homogeneity

Binomial Samples (Product Binomial Sampling)
Genetic theory: Ho: pW = 0.5 vs. Ha: pW ≠ 0.5
Assumptions: 8 samples, mutually independent counts
Hypothesis #1: Is pW = 0.5? Binomial inference on p; equivalently, overall goodness of fit (known p)
Hypothesis #2: Are all the pW equal? Test for homogeneity (equal but unknown p)
Hypothesis #3: Is each pW = 0.5? Goodness of fit (8 samples, known p)

Test of Homogeneity of k Binomial Samples, Specified p
Ho: p1 = p2 = … = p8 = 0.5 vs. Ha: pj ≠ 0.5 for some j
Does not assume homogeneity (see below)
X2 with df = 8: p = 0.003
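A minimal sketch of this test, assuming k independent binomial samples and a specified p0. The counts below are hypothetical, not the lecture's data, and the helper names are illustrative. Because no parameter is estimated, each sample contributes one degree of freedom, so df = k.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form holds only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

def homogeneity_specified_p(successes, sizes, p0):
    """X2 and df for Ho: every p_j equals the specified p0."""
    x2 = 0.0
    for y, n in zip(successes, sizes):
        # Summing (O - E)^2 / E over both cells of sample j reduces to this one term.
        x2 += (y - n * p0) ** 2 / (n * p0 * (1 - p0))
    return x2, len(sizes)  # nothing is estimated, so df = k

# Hypothetical counts, NOT the lecture's data.
successes = [10, 12, 8, 15]
sizes = [20, 20, 20, 20]
x2, df = homogeneity_specified_p(successes, sizes, p0=0.5)
p_value = chi2_sf_even_df(x2, df)
print(f"X2 = {x2:.2f}, df = {df}, p = {p_value:.3f}")
```

The closed-form tail probability is used only to keep the sketch free of third-party libraries; a statistics package would normally supply it.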

Test of Homogeneity of k Binomial Samples: Unspecified p
Ho: p1 = p2 = … = p8 vs. Ha: pj ≠ pk for some (j, k)
X2 with df = 7: p = 0.005
Note: Only one of each pair of expected values is independently estimated (k = 8, not 16)
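When p is unspecified, it can be estimated by pooling the samples, which costs one degree of freedom (df = k - 1). The sketch below uses hypothetical counts and illustrative names. It writes out both expected values for each sample to show that the second is fixed once the first is known.

```python
def homogeneity_unspecified_p(successes, sizes):
    """X2, df and pooled estimate for Ho: p_1 = ... = p_k (common p unknown)."""
    k = len(sizes)
    p_hat = sum(successes) / sum(sizes)  # pooled estimate of the common p
    x2 = 0.0
    for y, n in zip(successes, sizes):
        expected_success = n * p_hat
        expected_failure = n - expected_success  # determined by expected_success
        x2 += (y - expected_success) ** 2 / expected_success
        x2 += ((n - y) - expected_failure) ** 2 / expected_failure
    return x2, k - 1, p_hat  # one estimated parameter: df = k - 1

# Hypothetical counts, NOT the lecture's data.
successes = [10, 12, 8, 15]
sizes = [20, 20, 20, 20]
x2, df, p_hat = homogeneity_unspecified_p(successes, sizes)
print(f"pooled p = {p_hat:.4f}, X2 = {x2:.3f}, df = {df}")
```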

Chi-square Tests
Tests for independence in contingency tables
Tests for homogeneity
Goodness of fit tests

Chi-square Goodness of Fit Test: Specified Probabilities
Assumptions
  n independent observations
  k mutually exclusive possible outcomes
  pj = Pr(outcome j) is the same on every trial
Sample size condition
  All npj > 1
  At least 80% of the npj > 5

Goodness of Fit Test: Specified Probabilities
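Sample size: n
Observed count for outcome j: Oj
Expected count for outcome j: Ej = npj
Ho: Pr(outcome j) = pj for j = 1, ..., k
Ha: Pr(outcome j) ≠ pj for at least one j
Test statistic: X2 = Σ (Oj – Ej)^2 / Ej, summed over j = 1, ..., k
Reject Ho if X2 > χ²α, the upper-α critical value of the chi-square distribution with df = k – 1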
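The calculation can be sketched as follows, using the usual Pearson statistic X2 = Σ (Oj - Ej)^2 / Ej. The observed counts are hypothetical and the function name is illustrative. The critical value 7.815 is the standard table value for α = 0.05 with 3 degrees of freedom.

```python
def chi_square_gof(observed, probs):
    """Pearson goodness-of-fit statistic for fully specified probabilities."""
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    n = sum(observed)
    expected = [n * p for p in probs]          # Ej = n * pj
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return x2, len(observed) - 1, expected     # df = k - 1

# Hypothetical counts for four equally likely outcomes.
observed = [10, 20, 30, 40]
probs = [0.25, 0.25, 0.25, 0.25]
x2, df, expected = chi_square_gof(observed, probs)

CRITICAL_05_DF3 = 7.815  # chi-square table value, alpha = 0.05, df = 3
reject = x2 > CRITICAL_05_DF3
print(f"X2 = {x2:.2f}, df = {df}, reject Ho: {reject}")
```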

Sufficient Evidence of Cognitive Learning?
Path chosen: A, B, C, D, Total
Table rows: Number of rats; Expected number
p = 0.026
Using a significance level of α = 0.05, there is sufficient evidence (p = 0.026) to reject the hypothesis that rats choose the 4 doors with equal probability.

Mendelian Inheritance
Do the genotypes of a cross-breeding occur in the ratio 9:3:3:1?
Reject Ho if X2 > 7.81 (α = 0.05, df = 3)

Mendelian Inheritance
X2 = 2.66
There is insufficient evidence (p > 0.10) at a significance level of 0.05 to conclude that the genotypes from this type of cross-breeding occur in proportions that differ from those predicted by Mendelian inheritance theory.
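As a check on the reported conclusion, the sketch below converts the 9:3:3:1 ratio into expected counts and evaluates the upper-tail probability for X2 = 2.66 with 3 degrees of freedom. The sample size n = 160 is a hypothetical value chosen for illustration, since the lecture's counts are not in the transcript. The tail formula is the closed form that holds for exactly 3 degrees of freedom.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail chi-square probability for exactly 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

ratio = [9, 3, 3, 1]
probs = [r / sum(ratio) for r in ratio]   # 9/16, 3/16, 3/16, 1/16

n = 160                                   # hypothetical sample size
expected = [n * p for p in probs]

p_value = chi2_sf_df3(2.66)               # statistic reported in the lecture
print("expected counts:", expected)
print(f"p-value for X2 = 2.66, df = 3: {p_value:.3f}")
```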

Chi-Square Goodness of Fit Test: Unknown Parameters
Estimate the parameters of the distribution
Divide range of data values into mutually exclusive and exhaustive classes
  Discrete data: often use the values themselves
  Continuous data: use k = n^(1/2) or k = log(n) classes
Estimate the probability of being in each class
Compare the observed (Oi) counts in each class with the estimated expected (Ei) counts

Chi-Square Goodness of Fit Test for the Poisson Distribution
Number of senders (automated telephone equipment) in use at a given time
23 – 1 = 22 categories
Ho: number ~ Poisson vs. Ha: number not Poisson
Reject Ho if X2 > χ²0.05(20) = 31.4
df: 22 – 1 (mutually exclusive & exhaustive) – 1 (estimated parameter) = 20
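A minimal sketch of the Poisson fit, under two stated assumptions: the frequencies below are hypothetical (the sender data are not in the transcript), and the open-ended last class is treated as its lower limit when estimating the mean. The degrees of freedom follow the rule on the slide: classes, minus 1, minus 1 estimated parameter.

```python
import math

def poisson_gof(observed):
    """observed[j] = number of periods with count j; last class means 'j or more'."""
    k = len(observed)
    n = sum(observed)
    # Assumption: the open-ended last class is counted at its lower limit.
    lam = sum(j * o for j, o in enumerate(observed)) / n
    probs = [math.exp(-lam) * lam ** j / math.factorial(j) for j in range(k - 1)]
    probs.append(1.0 - sum(probs))   # tail class keeps the classes exhaustive
    expected = [n * p for p in probs]
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = k - 1 - 1                   # classes sum to n, one parameter estimated
    return x2, df, lam, expected

# Hypothetical frequencies for counts 0, 1, 2, 3, 4 and "5 or more".
observed = [12, 25, 28, 20, 10, 5]
x2, df, lam, expected = poisson_gof(observed)
print(f"lambda = {lam:.2f}, X2 = {x2:.3f}, df = {df}")
```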

Chi-Square Goodness of Fit Test for the Normal Distribution
Divide the data into mutually exclusive and exhaustive (contiguous) classes
  First and last classes are open-ended
  (–∞, U1), (L2, U2), (L3, U3), …, (Lk, ∞) with Lj = Uj-1
Estimate the mean and standard deviation
Calculate z-scores for the limits of each class
Estimate the probability content for each class: pj = Pr(zLj < z < zUj)
Estimate the expected frequency for each class: Ej = npj
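The steps above can be sketched with Python's standard library. The data and cut points are hypothetical, and the sample is far too small to satisfy the usual expected-count condition, so this only illustrates the mechanics. Two parameters are estimated, so df = k - 2 - 1.

```python
from statistics import NormalDist, mean, stdev

def normal_gof(data, cut_points):
    """Classes are (-inf, c1), [c1, c2), ..., [cm, +inf); cut_points must be ascending."""
    n = len(data)
    m, s = mean(data), stdev(data)                     # estimated parameters
    z_limits = [(c - m) / s for c in cut_points]       # z-scores of class limits
    cdf = [0.0] + [NormalDist().cdf(z) for z in z_limits] + [1.0]
    probs = [upper - lower for lower, upper in zip(cdf, cdf[1:])]
    expected = [n * p for p in probs]                  # Ej = n * pj
    observed = [0] * (len(cut_points) + 1)
    for x in data:
        observed[sum(x >= c for c in cut_points)] += 1
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 2 - 1                         # k - r - 1 with r = 2
    return observed, expected, x2, df

# Hypothetical measurements and class limits.
data = [3.1, 3.8, 4.2, 4.4, 4.6, 4.7, 4.9, 5.0, 5.1, 5.2,
        5.3, 5.4, 5.5, 5.6, 5.8, 6.0, 6.1, 6.3, 6.7, 7.2]
observed, expected, x2, df = normal_gof(data, cut_points=[4.0, 5.0, 6.0])
print(observed, [round(e, 2) for e in expected], round(x2, 3), df)
```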

Chi-Square Goodness of Fit Test
Can be applied to any discrete or continuous probability distribution; only probabilities need be specified: Ei = npi
Asymptotic chi-square distribution
  All Ei > 1 and at least 80% of the Ei > 5
Does not have the highest power for specific distributions, against specific alternatives
Degrees of freedom (k classes)
  If each class represents an independent sample (i.e., k replicate samples) and all parameters are known (i.e., known probabilities), df = k
  If the classes represent mutually exclusive and exhaustive categories (i.e., expected frequencies must sum to n), data are independent and from a single sample:
    All parameters are known: df = k – 1
    r parameters are estimated: df = k – r – 1
    e.g., (n – 1)s^2/σ^2 ~ χ²(n – 1)
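The sample-size condition and the degrees-of-freedom rules can be expressed as two small helpers. The function names are illustrative. The case of independent samples with an estimated parameter is an extrapolation from the homogeneity test earlier in the lecture (k = 8, one pooled estimate, df = 7), not a rule stated on this slide.

```python
def expected_counts_ok(expected):
    """True if all Ei > 1 and at least 80% of the Ei > 5."""
    all_above_one = all(e > 1 for e in expected)
    share_above_five = sum(e > 5 for e in expected) / len(expected)
    return all_above_one and share_above_five >= 0.8

def gof_df(k, estimated_params=0, independent_samples=False):
    """Degrees of freedom for a chi-square goodness-of-fit test with k classes."""
    if independent_samples:
        # k replicate samples: no sum-to-n constraint across classes.
        return k - estimated_params
    # Single sample with exhaustive classes: expected counts must sum to n.
    return k - estimated_params - 1

print(expected_counts_ok([10, 6, 5.5, 2, 8]))    # 4 of 5 classes exceed 5
print(gof_df(22, estimated_params=1))            # Poisson example: 22 classes
print(gof_df(8, independent_samples=True))       # 8 binomial samples, known p
```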

Goodness of Fit to the Binomial, Known p
Normal theory approximation
Chi-square tests

Binomial Sample, Specified p: Normal Theory Approximation
Genetic theory: Ho: pW = 0.5 vs. Ha: pW ≠ 0.5
Greater power by combining samples (assuming homogeneity)
p = 0.110

Alternative to the Binomial Test: Chi-square Goodness of Fit, Specified p
Genetic theory: Ho: pW = 0.5 vs. Ha: pW ≠ 0.5
p = 0.110
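The two approaches give the same p-value because, for a single binomial sample, the chi-square statistic with 1 degree of freedom is the square of the normal-theory z statistic. A short sketch with hypothetical pooled counts (not the lecture's data) makes the identity concrete:

```python
import math

def binomial_z_test(y, n, p0):
    """Two-sided normal-approximation test of Ho: p = p0."""
    z = (y - n * p0) / math.sqrt(n * p0 * (1 - p0))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail area
    return z, p_value

def binomial_chi_square(y, n, p0):
    """Chi-square goodness of fit on the two cells (success, failure)."""
    observed = [y, n - y]
    expected = [n * p0, n * (1 - p0)]
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = math.erfc(math.sqrt(x2 / 2.0))       # chi-square upper tail, df = 1
    return x2, p_value

# Hypothetical pooled counts.
z, p_z = binomial_z_test(45, 80, 0.5)
x2, p_x2 = binomial_chi_square(45, 80, 0.5)
print(f"z = {z:.4f}, z^2 = {z * z:.4f}, X2 = {x2:.4f}")
print(f"p (normal) = {p_z:.4f}, p (chi-square) = {p_x2:.4f}")
```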

Overall Binomial Test vs. Test of Homogeneity, Specified p
Ho: p1 = p2 = … = p8 = 0.5 vs. Ha: pj ≠ 0.5 for some j
Overall binomial test: X2 with df = 1, p = 0.110 (greater power if homogeneous)
Test of homogeneity: X2 with df = 8, p = 0.003 (greater power if not homogeneous)

Homogeneity, Unspecified p: Equivalent to Independence
Binomial samples, pW unspecified: the test of homogeneity with unspecified p is equivalent to the chi-square test of independence.
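The equivalence can be checked numerically. In the sketch below (hypothetical counts, illustrative names), the k binomial samples are laid out as a 2 x k table and the usual independence statistic is computed from row and column totals. It matches the homogeneity statistic built from the pooled estimate of p.

```python
def independence_x2(table):
    """Pearson chi-square for independence in an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected under independence
            x2 += (o - e) ** 2 / e
    df = (len(table) - 1) * (len(col_totals) - 1)
    return x2, df

# Hypothetical binomial samples: successes out of 20 in each of 4 samples.
successes = [10, 12, 8, 15]
sizes = [20, 20, 20, 20]
table = [successes, [n - y for y, n in zip(successes, sizes)]]
x2_table, df = independence_x2(table)

# Homogeneity statistic with the pooled estimate of p.
p_hat = sum(successes) / sum(sizes)
x2_binomial = sum((y - n * p_hat) ** 2 / (n * p_hat * (1 - p_hat))
                  for y, n in zip(successes, sizes))
print(f"independence X2 = {x2_table:.6f}, homogeneity X2 = {x2_binomial:.6f}, df = {df}")
```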

Some Goodness of Fit Tests
Chi-square goodness-of-fit test
  Very general, can have little power
Kolmogorov-Smirnov goodness-of-fit test
  Good general test, especially for continuous random variables
Shapiro-Wilk test for normality
  Regarded as the best test for normality

Comparing Odds Ratios Across Categories

Race and Death Penalty Punishment
Are the results consistent across aggravation levels?

Mantel-Haenszel Test
Several 2 x 2 tables
Assuming a common odds ratio, test that the odds ratio = 1

Expected frequencies for chi-square test of independence
Note: None have sufficient sample sizes for tests of independence

Mantel-Haenszel Test
Select one cell; e.g., upper-left
Calculate the excess for each table: Excess = Observed – Expected; e.g., Excess = O11 – E11
Calculate the variances of the excesses: Variance = R1 R2 C1 C2 / [n^2 (n – 1)]
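A sketch of these steps on hypothetical 2 x 2 tables follows. The final step, dividing the summed excess by the square root of the summed variance to get a z statistic, is the usual form of the test without a continuity correction. It is an assumption here, since that step is not preserved in the transcript.

```python
import math

def mantel_haenszel(tables):
    """Each table is ((a, b), (c, d)); the upper-left cell a is the selected cell."""
    total_excess = 0.0
    total_variance = 0.0
    for (a, b), (c, d) in tables:
        r1, r2 = a + b, c + d          # row totals
        c1, c2 = a + c, b + d          # column totals
        n = r1 + r2
        expected_11 = r1 * c1 / n
        total_excess += a - expected_11
        total_variance += r1 * r2 * c1 * c2 / (n ** 2 * (n - 1))
    z = total_excess / math.sqrt(total_variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return total_excess, total_variance, z, p_two_sided

# Hypothetical strata, NOT the death-penalty data.
tables = [((8, 2), (4, 6)), ((6, 4), (3, 7))]
excess, variance, z, p_value = mantel_haenszel(tables)
print(f"excess = {excess:.2f}, variance = {variance:.4f}, z = {z:.3f}, p = {p_value:.3f}")
```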

Conclusion: Nearly 7 more white-victim murderers received the death penalty than would be expected if the odds were the same for white- and black-victim murderers

Estimating the Common Odds Ratio
Death Penalty and Race
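The slide's own formula is not preserved in the transcript. One common choice, shown here as an assumption, is the Mantel-Haenszel estimator, which weights each table's cross-products by its sample size: Σ(a d / n) / Σ(b c / n). The tables are hypothetical.

```python
def mh_common_odds_ratio(tables):
    """Mantel-Haenszel estimate of a common odds ratio; each table is ((a, b), (c, d))."""
    numerator = sum(a * d / (a + b + c + d) for (a, b), (c, d) in tables)
    denominator = sum(b * c / (a + b + c + d) for (a, b), (c, d) in tables)
    return numerator / denominator

# Hypothetical strata, NOT the death-penalty data.
tables = [((8, 2), (4, 6)), ((6, 4), (3, 7))]
odds_ratio = mh_common_odds_ratio(tables)
print(f"estimated common odds ratio = {odds_ratio:.2f}")
```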
