Chi-square Goodness of Fit


Friday, Dec. 3: Chi-square Goodness of Fit; Chi-square Test of Independence: Two Variables. Summing up!

[Punnett square for a yg x yg cross: four equally likely offspring cells, 25% each: yy, yg, yg, gg]

χ² Goodness of Fit

Pea Color   freq Observed   freq Expected
Yellow           158             150
Green             42              50
TOTAL            200             200

χ² = Σ (f_o - f_e)² / f_e, summed over i = 1 to k

d.f. = k - 1, where k = the number of categories of the variable.

“… the general level of agreement between Mendel’s expectations and his reported results shows that it is closer than would be expected in the best of several thousand repetitions. The data have evidently been sophisticated systematically, and after examining various possibilities, I have no doubt that Mendel was deceived by a gardening assistant, who knew only too well what his principal expected from each trial made…” -- R. A. Fisher

χ² Goodness of Fit: Mendel's cooking!

Pea Color   freq Observed   freq Expected
Yellow           151             150
Green             49              50
TOTAL            200             200

χ² = Σ (f_o - f_e)² / f_e, summed over i = 1 to k

d.f. = k - 1, where k = the number of categories of the variable.
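The two pea tables above can be hand-checked with a short script. This is a minimal sketch in plain Python (no library assumed); the counts come from the slides:

```python
# Chi-square goodness of fit: sum (f_o - f_e)^2 / f_e over all k categories.
def chi_square_gof(observed, expected):
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# Mendel's reported peas (yellow, green), expected under a 3:1 ratio for n = 200
print(chi_square_gof([158, 42], [150, 50]))   # ~1.71: ordinary chance deviation

# The suspiciously tidy version: almost no deviation at all
print(chi_square_gof([151, 49], [150, 50]))   # ~0.027: "too good to be true"
```

If SciPy is available, scipy.stats.chisquare computes the same statistic and also returns a p-value.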

From Peas to Kids: Another Goodness-of-Fit Example. At my children's school science fair last year, where participation was voluntary but strongly encouraged, I counted about 60 boys and 40 girls who had submitted entries. Since I would expect a 50:50 ratio if there were no gender difference in submission, does this observation deviate beyond chance level?

            Boys   Girls
Expected:     50      50
Observed:     60      40

            Boys   Girls
Expected:     50      50
Observed:     60      40

χ² = Σ (f_o - f_e)² / f_e, summed over i = 1 to k
   = (60 - 50)²/50 + (40 - 50)²/50 = 2.00 + 2.00 = 4.00

For each of k categories, square the difference between the observed and the expected frequency, divide by the expected frequency, and sum over all k categories.

This value, chi-square, is distributed with known probability values, where the degrees of freedom are a function of the number of categories (not n). In this one-variable case, d.f. = k - 1.

The critical value of chi-square at α = .05, d.f. = 1 is 3.84, so reject H0.
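The science-fair calculation can be verified the same way; the 3.84 cutoff is the critical value quoted on the slide (α = .05, d.f. = 1):

```python
# Same goodness-of-fit statistic for the science-fair counts.
def chi_square_gof(observed, expected):
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

chi2 = chi_square_gof([60, 40], [50, 50])
print(chi2)                    # 4.0
CRITICAL_05_DF1 = 3.84         # chi-square critical value, alpha = .05, d.f. = 1
print(chi2 > CRITICAL_05_DF1)  # True -> reject H0
```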

Chi-square Test of Independence: Are two nominal-level variables related, or independent of each other? Is race related to SES, or are they independent?

             White   Black   Total
SES   Hi        12       3      15
      Lo        16      16      32
      Total     28      19      47

The expected frequency of any given cell is: (Row n × Column n) / Total n

χ² = Σ Σ (f_o - f_e)² / f_e, summed over r = 1 to R rows and c = 1 to C columns,

at d.f. = (r - 1)(c - 1)

The expected frequency of any given cell is (Row n × Column n) / Total n:

             White                  Black                  Total
Hi     (15×28)/47 = 8.94      (15×19)/47 = 6.06        15
Lo     (32×28)/47 = 19.06     (32×19)/47 = 12.94       32
Total        28                     19                  47

Please calculate: χ² = Σ Σ (f_o - f_e)² / f_e, summed over all rows and columns.

Observed (expected):
             White           Black          Total
Hi       12 (8.94)        3 (6.06)           15
Lo       16 (19.06)      16 (12.94)          32
Total        28              19              47
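A sketch of the full calculation the slide asks for, assuming only the counts shown above; d.f. = (2 - 1)(2 - 1) = 1, so the 3.84 critical value applies:

```python
# Test of independence: expected cell = (row total * column total) / grand total,
# then sum (f_o - f_e)^2 / f_e over all r x c cells.
observed = [[12, 3],    # Hi SES: White, Black
            [16, 16]]   # Lo SES: White, Black

row_totals = [sum(row) for row in observed]        # [15, 32]
col_totals = [sum(col) for col in zip(*observed)]  # [28, 19]
grand = sum(row_totals)                            # 47

expected = [[r * c / grand for c in col_totals] for r in row_totals]
chi2 = sum((observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
           for i in range(len(observed)) for j in range(len(observed[0])))
print(round(chi2, 2))   # 3.82, just under 3.84 -> fail to reject H0 at alpha = .05
```

If SciPy is available, scipy.stats.chi2_contingency(observed, correction=False) returns the same χ² along with its p-value.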

Important assumptions:
- Independent observations.
- Categories are mutually exclusive: each observation falls in exactly one category.
- Expected frequencies should be reasonably large:
  - at d.f. = 1, at least 5;
  - at d.f. = 2, greater than 2;
  - at d.f. ≥ 3, all expected frequencies but one should be at least 5, and the remaining one at least 1.

Univariate Statistics:

Level      Central tendency   Test
Interval   Mean               one-sample t-test
Ordinal    Median
Nominal    Mode               Chi-squared goodness of fit

Bivariate Statistics (rows: level of Y; columns: level of X):

  Y \ X      Nominal                       Ordinal             Interval
  Nominal    χ²
  Ordinal    Rank-sum, Kruskal-Wallis H    Spearman rs (rho)
  Interval   t-test, ANOVA                                     Pearson r, Regression

Who said this? "The definition of insanity is doing the same thing over and over again and expecting different results".

I hate this quote! I hate this quote! I hate this quote! Who said this? I hate this quote! "The definition of insanity is doing the same thing over and over again and expecting different results". I hate this quote! I hate this quote! I hate this quote! I hate this quote!

I don't like it because, from a statistical point of view, it is insane to do the same thing over and over again and expect the same results! More to the point, the wisdom of statistics lies in understanding that some ways of repeating things yield results that are more alike than others. Hmm. Think about this for a moment. Statistics allows one to understand the expected variability in results, even when the same thing is done, as a function of σ and N.
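A quick simulation makes the point, under the assumption of normally distributed data with σ = 10: "doing the same thing" (drawing a sample of size N and taking its mean) over and over gives results whose spread matches the standard error of the mean, σ/√N:

```python
import random
import statistics

random.seed(0)
sigma, reps = 10.0, 2000

for n in (4, 25, 100):
    # "Do the same thing over and over": draw a size-n sample, take its mean
    means = [statistics.fmean(random.gauss(0, sigma) for _ in range(n))
             for _ in range(reps)]
    # Observed spread of the repeated results vs. the theoretical standard error
    print(n, round(statistics.stdev(means), 2), round(sigma / n ** 0.5, 2))
```

The results differ from repetition to repetition, but by a predictable amount, and larger N makes them more alike; that predictability is exactly the statistical argument against the quote.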

Your turn! Given this start, explain why Uncle Albert heads us down the wrong path. In your answer, make sure you refer to the error statistic (e.g., the standard error of the mean, the standard error of the difference between means, or Mean Square Within) as well as the sample size N. In short, explain why statistical thinking is beautiful, and why Albert Einstein (if he ever said it) was wrong.