Chi-square, Goodness of fit, and Contingency Tables

Slides:



Advertisements
Similar presentations
What is Chi-Square? Used to examine differences in the distributions of nominal data A mathematical comparison between expected frequencies and observed.
Advertisements

CHI-SQUARE(X2) DISTRIBUTION
Lecture (11,12) Parameter Estimation of PDF and Fitting a Distribution Function.
Tutorial: Chi-Square Distribution Presented by: Nikki Natividad Course: BIOL Biostatistics.
Contingency Tables Chapters Seven, Sixteen, and Eighteen Chapter Seven –Definition of Contingency Tables –Basic Statistics –SPSS program (Crosstabulation)
Multinomial Experiments Goodness of Fit Tests We have just seen an example of comparing two proportions. For that analysis, we used the normal distribution.
Aaker, Kumar, Day Seventh Edition Instructor’s Presentation Slides
1 Nominal Data Greg C Elvers. 2 Parametric Statistics The inferential statistics that we have discussed, such as t and ANOVA, are parametric statistics.
The Chi-Square Test Used when both outcome and exposure variables are binary (dichotomous) or even multichotomous Allows the researcher to calculate a.
The Chi-square Statistic. Goodness of fit 0 This test is used to decide whether there is any difference between the observed (experimental) value and.
Cross Tabulation and Chi-Square Testing. Cross-Tabulation While a frequency distribution describes one variable at a time, a cross-tabulation describes.
Aaker, Kumar, Day Ninth Edition Instructor’s Presentation Slides
1 Psych 5500/6500 Chi-Square (Part Two) Test for Association Fall, 2008.
Chi-Square as a Statistical Test Chi-square test: an inferential statistics technique designed to test for significant relationships between two variables.
Maximum Likelihood Estimator of Proportion Let {s 1,s 2,…,s n } be a set of independent outcomes from a Bernoulli experiment with unknown probability.
Multinomial Experiments Goodness of Fit Tests We have just seen an example of comparing two proportions. For that analysis, we used the normal distribution.
Learning Objectives Copyright © 2002 South-Western/Thomson Learning Statistical Testing of Differences CHAPTER fifteen.
Chi Square Classifying yourself as studious or not. YesNoTotal Are they significantly different? YesNoTotal Read ahead Yes.
Chi square analysis Just when you thought statistics was over!!
Section 10.2 Independence. Section 10.2 Objectives Use a chi-square distribution to test whether two variables are independent Use a contingency table.
Chapter Outline Goodness of Fit test Test of Independence.
State the ‘null hypothesis’ State the ‘alternative hypothesis’ State either one-tailed or two-tailed test State the chosen statistical test with reasons.
Copyright © 2013, 2009, and 2007, Pearson Education, Inc. Chapter 11 Analyzing the Association Between Categorical Variables Section 11.2 Testing Categorical.
Leftover Slides from Week Five. Steps in Hypothesis Testing Specify the research hypothesis and corresponding null hypothesis Compute the value of a test.
Chi Squared Statistical test used to see if the results of an experiment support a theory or to check that categorical data is independent of each other.
Chapter 14 – 1 Chi-Square Chi-Square as a Statistical Test Statistical Independence Hypothesis Testing with Chi-Square The Assumptions Stating the Research.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 12 Tests of Goodness of Fit and Independence n Goodness of Fit Test: A Multinomial.
Statistics 300: Elementary Statistics Section 11-3.
Class Seven Turn In: Chapter 18: 32, 34, 36 Chapter 19: 26, 34, 44 Quiz 3 For Class Eight: Chapter 20: 18, 20, 24 Chapter 22: 34, 36 Read Chapters 23 &
Section 10.2 Objectives Use a contingency table to find expected frequencies Use a chi-square distribution to test whether two variables are independent.
Chi Square Test of Homogeneity. Are the different types of M&M’s distributed the same across the different colors? PlainPeanutPeanut Butter Crispy Brown7447.
Chi Square Test Dr. Asif Rehman.
Comparing Counts Chi Square Tests Independence.
Introduction to Marketing Research
The Chi Square Test A statistical method used to determine goodness of fit Chi-square requires no assumptions about the shape of the population distribution.
Statistical Analysis: Chi Square
CHI-SQUARE(X2) DISTRIBUTION
Test of independence: Contingency Table
Chapter 11 – Test of Independence - Hypothesis Test for Proportions of a Multinomial Population In this case, each element of a population is assigned.
Chi-Square hypothesis testing
Chapter 9: Non-parametric Tests
Presentation 12 Chi-Square test.
10 Chapter Chi-Square Tests and the F-Distribution Chapter 10
Chi-square: Comparing Observed and
Hypothesis Testing Review
Community &family medicine
Data Analysis for Two-Way Tables
Chapter 11 Goodness-of-Fit and Contingency Tables
Chi Square SBI3UP.
Chi-square: Comparing Observed and Expected Counts
Chi Square Two-way Tables
Econ 3790: Business and Economics Statistics
Reasoning in Psychology Using Statistics
Hypothesis Testing and Comparing Two Proportions
Chapter 11: Inference for Distributions of Categorical Data
Chapter 10 Analyzing the Association Between Categorical Variables
Contingency Tables: Independence and Homogeneity
Statistical Analysis Chi-Square.
STAT 312 Introduction Z-Tests and Confidence Intervals for a
Analyzing the Association Between Categorical Variables
Lecture 43 Sections 14.4 – 14.5 Mon, Nov 26, 2007
Assistant prof. Dr. Mayasah A. Sadiq FICMS-FM
Reasoning in Psychology Using Statistics
Reasoning in Psychology Using Statistics
Inference for Two Way Tables
UNIT V CHISQUARE DISTRIBUTION
S.M.JOSHI COLLEGE, HADAPSAR
Chapter Outline Goodness of Fit test Test of Independence.
Quadrat sampling & the Chi-squared test
Quadrat sampling & the Chi-squared test
Presentation transcript:

Chi-square, Goodness of fit, and Contingency Tables

What is the χ2 distribution Basically a distribution of squared differences

Useful for detecting categorical differences Calculate the χ2 test statistic= (observed-expected)2/expected Degrees of freedom = number of categories -1 Look up χ2 value for that degree of freedom and chosen alpha value. If test statistic > table value, then significant

Two sided test: find the column corresponding to α/2 in the table for upper critical values and reject the null hypothesis if the test statistic is greater than the tabled value. Use 1 - α /2 in the table for lower critical values and reject null if the test statistic is less than the tabled value. Upper one-sided test: find column corresponding to α in upper critical values table. If test statistic greater, reject.

Also useful for model fitting Assume you have a fit a model to some data and have some residual errors left over. You want to check if residuals are normally distributed. You bin them in a histogram Estimate proportions of residuals in each, compare to actual data

Model Fitting Example Consider a classic genetics experiment. The offspring of a cross between the F1 brassicas was 53 dark green and 11 yellow. If the plants are heterozygous for color the ratio of 3 dark green to 1 yellow would be expected.   Dark Green Yellow Total Observed numbers (O) 53 11 64 Expected numbers (E) 48 16 O - E 5 -5 (O-E)2 25 (O-E)2 / E 25/48 = 0.52 25/16 = 1.56 2.08

Compound Hypotheses and Directionality With multiple categories, compound hypotheses are possible H0 Pr(cat 1) = 0.25, Pr(cat 2) = 0.50 and Pr(cat 3) = 0.75 HA: one of the above not the case Where there are 2 categories, a “directional alternative” is possible

Directional Alternatives Only in the case of “dichotomous variables” – two categories, effectively. Step 1: Check Directionality of trend If not, p-value > 0.5 by necessity If so, proceed to step 2 The P-value is half what it would be if HA were non directional

Directional Alternative Example Two football teams records are compared against the average number of wins by an NFL team per year, 9. Team 1 won 14 games this year and several players were caught doping with HGF. Team 2 won 11 games this year and tested clean. Is there evidence that doping increased the number of wins by team 1?

Contingency Tables Use χ2 test statistic as above, but Calculate expected values for each element in table from E=(row total)*(column total)/Grand Total; Df =1

2x2 Contingency Tables Can indicate either Two independent samples with a dichotomous observed variabled One sample with two dichotomous observed variables Female Male Tot(col) HIV test 9 8 17 No HIV test 52 51 103 Tot (row) 61 59 120

Relation to Independence of data You can interpret contingency tables in terms of conditional probabilities Pr(HIV test | female)= 9/61 Pr(female | HIV test) = 9/17 Test becomes H0 : Likelihood of taking and HIV test is independent of sex Female Male Tot(col) HIV test 9 8 17 No HIV test 52 51 103 Tot (row) 61 59 120

Rxk contingency tables Same as above, but degrees of freedom = (r-1)*(k-1).

Corrections to the Chi-Squared Test It is a requirement that a chi-squared test be applied to discrete data. Counting numbers are appropriate, continuous measurements are not. Assuming continuity in the underlying distribution distorts the p value and may make false positives more likely. Frank Yates proposed a correction to the chi-squared formula. Adding a small negative term to the argument. This tends to increase the p-value, and makes the test more conservative, making false positives less likely. However, the test may now be *too* conservative. Additionally, chi squared test should not be used when the observed values in a cell are <5. It is, at times not inappropriate to pad an empty cell with a small value, though, as one can only assume the result would be more significant with no value there.