Section VII Chi-square test for comparing proportions and frequencies.


Z test for comparing proportions between two independent groups
$Z = (P_1 - P_2) / SE_d$, where $SE_d = \sqrt{P_1(1-P_1)/n_1 + P_2(1-P_2)/n_2}$
Definition: $Z^2 = \chi^2(1)$ = chi-square statistic with one degree of freedom (df = 1).
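A minimal sketch of the Z test above, showing that the two-sided Gaussian p value for Z equals the upper-tail p value of a chi-square with 1 df evaluated at Z squared. The counts (40/100 and 25/100) and the function name `two_proportion_z` are hypothetical, chosen only for illustration.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z = (P1 - P2) / SE_d, using the standard error defined on the slide."""
    p1, p2 = x1 / n1, x2 / n2
    se_d = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se_d

# Hypothetical counts (not from the slides): 40/100 vs 25/100 with the outcome.
z = two_proportion_z(40, 100, 25, 100)

# Two-sided p value from the Gaussian distribution.
p_from_z = math.erfc(abs(z) / math.sqrt(2))

# Upper-tail p value of a chi-square with 1 df at Z^2.
# For df = 1 the upper-tail probability is erfc(sqrt(x / 2)).
p_from_chi2 = math.erfc(math.sqrt(z * z / 2))

print(f"Z = {z:.3f}, Z^2 = {z * z:.3f}")
print(f"p from Z = {p_from_z:.4f}, p from chi-square(1) = {p_from_chi2:.4f}")
```

The two p values agree because squaring Z folds both Gaussian tails into the single upper tail of the chi-square distribution.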

What if there are many (k) groups, and so many proportions to compare, and we worry about multiplicity? We can do an overall (omnibus) $\chi^2$ test. The overall $\chi^2$ test with k-1 degrees of freedom (df) tests the null hypothesis that $\pi_1 = \pi_2 = \pi_3 = \dots = \pi_k = \pi$. It is the analog of the overall F test in ANOVA for comparing many means.

Ex: Troublesome morning sickness
137/350 = 39% have troublesome morning sickness overall after treatment (100% had it before treatment).

Observed frequencies
           no tx   acupressure   dummy  |  total
yes          67                         |   137
no           52                         |   213
total       119                         |   350
Pct yes     56%        24%        37%   |   39%

What frequencies are expected if there is no association between treatment and outcome? (Null hypothesis: $\pi_1 = \pi_2 = \pi_3 = \pi$)

Expected frequencies if no association
           no tx   acupressure   dummy  |  total
yes         46.6                        |   137
no                                      |   213
total       119                         |   350
39% yes, 61% no in each group

Calculating expected frequency: example for the "yes, no tx" cell.
Expected freq = 119 × (137/350) = 46.6
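The expected-frequency arithmetic can be sketched as below, using only the totals quoted in the slides (119 in the no-treatment group, 137 yes, 213 no, 350 overall). The helper name `expected_count` is an assumption made for this illustration.

```python
def expected_count(group_total, outcome_total, grand_total):
    """Expected cell count under no association: group total x overall outcome rate."""
    return group_total * outcome_total / grand_total

# Totals quoted in the slides.
no_tx_total = 119
yes_total, no_total, grand_total = 137, 213, 350

e_yes = expected_count(no_tx_total, yes_total, grand_total)
e_no = expected_count(no_tx_total, no_total, grand_total)

print(f"expected 'yes, no tx' = {e_yes:.1f}")
print(f"expected 'no, no tx'  = {e_no:.1f}")
```

The two expected counts add back up to the group total of 119, since the overall yes and no rates sum to 100%.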

If there is no association, the observed and expected frequencies should be similar. The chi-square statistic is a measure of the squared differences between observed and expected frequencies.
$\chi^2 = \sum (\text{observed} - \text{expected})^2 / \text{expected}$
In this example (with six cells), $\chi^2 = (67 - 46.6)^2/46.6 + \dots$ = …, df = (# rows - 1)(# cols - 1) = (2-1)(3-1) = 2, p value < … (get from =CHIDIST(χ2, df) in Excel or from a chi-square table).
If this overall p value is NOT significant, we conclude that the proportions are not significantly different from each other at the α level; that is, we find no significant association.
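A minimal sketch of the overall chi-square computation for a table of counts. The 2 × 3 table below is hypothetical (the full morning-sickness counts are not reproduced here), and `chi_square` is an illustrative name. The p value line relies on the fact that, for df = 2 only, the upper-tail chi-square probability is exp(-x/2).

```python
import math

def chi_square(observed):
    """Return (chi-square statistic, df) for a rows x columns table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count if no association
            stat += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

# Hypothetical counts: rows are yes/no, columns are three treatment groups.
table = [[30, 20, 10],
         [20, 30, 40]]
stat, df = chi_square(table)

# Valid for df = 2 only: upper-tail probability is exp(-x / 2).
p_value = math.exp(-stat / 2)

print(f"chi-square = {stat:.3f}, df = {df}, p = {p_value:.5f}")
```

For other degrees of freedom the p value would come from a chi-square table or a function such as Excel's CHIDIST, as noted in the slide.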

Rule of thumb for chi-square significance
If the null hypothesis is true, the expected (average) value of the $\chi^2$ statistic is equal to its degrees of freedom: $E(\chi^2) = df$. So, if the null hypothesis is true, $\chi^2/df \approx 1.0$. Therefore a $\chi^2$ value less than its df (or, equivalently, $\chi^2/df < 1$) is never statistically significant at conventional α levels.
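The claim that the average chi-square value equals its df can be checked with a small simulation. This sketch assumes the standard construction of a chi-square variable with df degrees of freedom as the sum of df squared independent standard Gaussian draws; the function name, seed, and number of draws are arbitrary choices for illustration.

```python
import random

def simulate_chi2_mean(df, draws, seed=0):
    """Average of simulated chi-square values, each a sum of df squared N(0,1) draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        total += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return total / draws

mean_df5 = simulate_chi2_mean(df=5, draws=20000)
print(f"simulated mean for df = 5: {mean_df5:.2f}")
```

The simulated mean should land close to 5, so the ratio of the statistic to its df is close to 1 under the null hypothesis.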

Technical note: Fisher's exact test computation of the chi-square test p value
Conventionally, the p values for the Z and $\chi^2$ statistics are obtained by looking them up on the Gaussian distribution or the corresponding $\chi^2$ distribution, which is derived from the Gaussian distribution. However, when the sample size is small and the expected frequencies are less than 5 in at least 20% of the cells, the central limit theorem approximation may not be accurate, so Z may not follow the Gaussian distribution and the $\chi^2$ statistic may not follow the $\chi^2$ distribution. The p values from the Gaussian or $\chi^2$ tables may then be inaccurate. In this case, the exact p value can be computed based on the multinomial distribution (which we have not studied), although this computation is very difficult without a computer program. The algorithm for computing the exact p value was developed by R. A. Fisher, so p values computed this way are said to be computed using "Fisher's exact test". The purpose is still to compare frequencies and proportions, as with the Z and chi-square tests. In principle, the Fisher procedure could always be used in place of looking up a p value on the chi-square distribution.
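A minimal sketch of an exact p value for the simplest case, a 2 × 2 table. It assumes the usual formulation in which the row and column totals are held fixed, so the probability of each possible table is hypergeometric, and it uses one common two-sided convention: sum the probabilities of all tables no more likely than the observed one. The counts and the function name are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided exact p value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)  # smallest feasible top-left count
    hi = min(row1, col1)      # largest feasible top-left count
    # Small tolerance so ties with the observed probability are included.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical small table: 3 of 4 "yes" in one group, 1 of 4 in the other.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(f"exact two-sided p = {p:.4f}")
```

With only 8 observations every expected count here is 2, which is the small-sample situation the note describes.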

F statistic for means is the analog of the chi-square statistic for proportions
$F = \dfrac{\sum_{i=1}^{k} n_i (\bar{Y}_i - \bar{Y})^2 / (k-1)}{S_e^2}$
where $\bar{Y}_i$ is the mean of the i-th group, $n_i$ is the sample size in the i-th group (i = 1, 2, 3, ..., k), $\bar{Y}$ is the overall mean, k is the number of groups, and $S_e^2$ is the squared pooled standard deviation defined by
$S_e^2 = \dfrac{(n_1-1)S_1^2 + (n_2-1)S_2^2 + \dots + (n_k-1)S_k^2}{(n_1 + n_2 + \dots + n_k) - k}$
If the overall F-based p value is not significant, we conclude none of the means are significantly different from each other.
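The F statistic defined above can be sketched directly from the two formulas. The three groups of values are hypothetical, and `f_statistic` is an illustrative name; the overall mean is taken over all observations pooled together.

```python
from statistics import mean, variance

def f_statistic(groups):
    """Overall F: between-group mean square divided by the pooled variance S_e^2."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Numerator: sum of n_i * (group mean - overall mean)^2, divided by k - 1.
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    # Denominator: pooled variance, weighting each group variance by n_i - 1.
    pooled_var = sum((len(g) - 1) * variance(g) for g in groups) / (n_total - k)
    return between / pooled_var

# Hypothetical data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value = f_statistic(groups)
print(f"F = {f_value:.2f} with {len(groups) - 1} and {9 - len(groups)} df")
```

Here the group means are 2, 3 and 4 and each group variance is 1, so the numerator is 6/2 = 3 and the pooled variance is 1.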

Chi-square goodness of fit
Comparing observed results to those expected from "theory".
Example: Poisson distribution. We observe n = 100 persons and record the number of colds each had in a three-month period.
Number colds | number persons (o) | expected number (e)
Under the Poisson distribution with mean = 92/100 = 0.92, the mean in the data, we computed the expected (e) number of persons with 0, 1, 2, 3 and 4 or more colds:
$e = 100 \cdot 0.92^{y} e^{-0.92} / y!$
The chi-square statistic = $\sum (o-e)^2/e$ = …, df = 5-1 = 4, p = …; NOT significant if the data fit the model.
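A minimal sketch of the expected-count calculation under the Poisson model with mean 0.92 and n = 100, as stated in the slide. Treating the "4 or more" category as the remainder after the first four categories is an assumption about how the last cell was formed, and `poisson_expected` is an illustrative name. The observed counts are not reproduced here, so the chi-square statistic itself is not computed.

```python
import math

def poisson_expected(n, mean, max_count):
    """Expected counts for 0, 1, ..., max_count - 1 events, plus a final open-ended category."""
    expected = [
        n * mean ** y * math.exp(-mean) / math.factorial(y)
        for y in range(max_count)
    ]
    # Assumption: "max_count or more" takes whatever is left of the n persons.
    expected.append(n - sum(expected))
    return expected

expected = poisson_expected(n=100, mean=0.92, max_count=4)
labels = ["0", "1", "2", "3", "4 or more"]
for label, e in zip(labels, expected):
    print(f"{label:>9} colds: expected {e:.1f} persons")
```

These expected counts would then be compared with the observed counts using the sum of (o - e)²/e over the five categories.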

Gregor Mendel pea seed form
If round dominates angular, expect 25% + 50% = 75% round phenotype in hybrids.
Observed (one row per plant for 10 plants, plus a total row): plant | round | angular | total | pct round
Chi-square = 5.297, df = 9, chi sq/df = 0.59, p value = 0.8077
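The quoted p value can be checked from the statistic and df alone. This sketch computes the upper-tail chi-square probability for whole-number df with the standard finite-sum formulas (an exponential sum for even df, an erfc term plus a sum for odd df). The function name `chi2_sf` is an assumption for this illustration; in Excel the same lookup is =CHIDIST(5.297, 9).

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with df degrees of freedom > x), integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) * sum_{j=0}^{df/2 - 1} h^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= h / j
            total += term
        return math.exp(-h) * total
    # Odd df: erfc(sqrt(h)) + exp(-h) * sum_{j=0}^{(df-3)/2} h^(j + 1/2) / Gamma(j + 3/2)
    total = math.erfc(math.sqrt(h))
    term = math.sqrt(h) * math.exp(-h) / math.gamma(1.5)
    for j in range((df - 1) // 2):
        total += term
        term *= h / (j + 1.5)
    return total

# Values quoted in the slide for Mendel's data.
p = chi2_sf(5.297, 9)
print(f"chi-square = 5.297, df = 9, chi sq/df = {5.297 / 9:.2f}, p = {p:.4f}")

# Rule of thumb: a statistic equal to its df is far from significant.
for df in (1, 2, 5, 9, 20):
    print(f"df = {df:2d}: p at chi-square = df is {chi2_sf(df, df):.3f}")
```

Because the statistic is smaller than its df, the ratio is below 1 and the p value is large, in line with the rule of thumb for chi-square significance.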