Wednesday, December 7: Chi-square Goodness of Fit. Chi-square Test of Independence: Two Variables. Summing up!

Presentation transcript:


[Figure: Punnett square with four offspring genotypes (yy, yg, gy, gg), each at 25%.]

Chi Square Goodness of Fit

Pea Color    freq Observed    freq Expected
Yellow
Green
TOTAL

$\chi^2 = \sum_{i=1}^{k} \frac{(f_o - f_e)^2}{f_e}$

d.f. = k - 1, where k = number of categories in the variable.

“… the general level of agreement between Mendel’s expectations and his reported results shows that it is closer than would be expected in the best of several thousand repetitions. The data have evidently been sophisticated systematically, and after examining various possibilities, I have no doubt that Mendel was deceived by a gardening assistant, who knew only too well what his principal expected from each trial made…” -- R. A. Fisher


Peas to Kids: Another Example Goodness of Fit At my children’s school science fair last year, where participation was voluntary but strongly encouraged, I counted about 60 boys and 40 girls who had submitted entries. Since I expect a ratio of 50:50 if there were no gender preference for submission, is this observation deviant, beyond chance level?

            Boys    Girls
Expected:   50      50
Observed:   60      40

$\chi^2 = \sum_{i=1}^{k} \frac{(f_o - f_e)^2}{f_e}$

For each of k categories, square the difference between the observed and the expected frequency, divide by the expected frequency, and sum over all k categories.

$\chi^2 = \frac{(60-50)^2}{50} + \frac{(40-50)^2}{50} = 4.00$

This value, chi-square, will be distributed with known probability values, where the degrees of freedom is a function of the number of categories (not n). In this one-variable case, d.f. = k - 1.

Critical value of chi-square at $\alpha$ = .05, d.f. = 1 is 3.84, so reject $H_0$.
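The science-fair calculation can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, the counts are the ones on the slide, and the p-value helper uses the fact that a chi-square with 1 d.f. is a squared standard normal, so it only applies to the 1 d.f. case.

```python
import math

def chi_square_gof(observed, expected):
    """Sum (f_o - f_e)^2 / f_e over all k categories."""
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

def p_value_df1(chi2):
    """Upper-tail probability of a chi-square with 1 d.f. only."""
    return math.erfc(math.sqrt(chi2 / 2))

observed = [60, 40]        # boys, girls (counts from the slide)
expected = [50, 50]        # 50:50 ratio under H0

chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1     # d.f. = k - 1
CRITICAL_05_DF1 = 3.84     # critical value quoted on the slide

print("chi-square:", chi2)
print("d.f.:", df)
print("reject H0 at .05:", chi2 > CRITICAL_05_DF1)
print("p-value:", round(p_value_df1(chi2), 4))
```

The statistic comes out to 4.0, matching the hand calculation, and the p-value falls just under .05, which is why the decision is close to the critical value.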

Chi-square Test of Independence. Are two nominal-level variables related to, or independent of, each other? Is race related to SES, or are they independent?

SES by race:

        White   Black
Lo
Hi

The expected frequency of any given cell is (Row n x Column n) / Total n.

22 = (f o - f e ) 2 fefe  r=1 r  c=1 c At d.f. = (r - 1)(c - 1)

The expected frequency of any given cell is (Row n x Column n) / Total n:

(15 x 28)/47    (15 x 19)/47
(32 x 28)/47    (32 x 19)/47
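The same expected frequencies can be worked out from the marginal totals alone. A minimal sketch, using the row totals (15, 32) and column totals (28, 19) that appear on the slide; the function name is hypothetical.

```python
def expected_frequencies(row_totals, col_totals):
    """Expected count per cell: (row n x column n) / total n."""
    total = sum(row_totals)
    if total != sum(col_totals):
        raise ValueError("row and column totals must add up to the same n")
    return [[r * c / total for c in col_totals] for r in row_totals]

row_totals = [15, 32]   # from the slide
col_totals = [28, 19]   # from the slide

expected = expected_frequencies(row_totals, col_totals)
for row in expected:
    print([round(fe, 2) for fe in row])
```

The four expected counts add back up to the total n of 47, which is a quick check on the arithmetic.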

22 = (f o - f e ) 2 fefe  r=1 r  c=1 c Please calculate:

Important assumptions:
Independent observations.
Observations are mutually exclusive.
Expected frequencies should be reasonably large:
    d.f. = 1: at least 5
    d.f. = 2: greater than 2
    d.f. > 3: all expected frequencies but one are greater than or equal to 5, and the one that is not is at least equal to 1.
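The expected-frequency rule of thumb can be turned into a quick check. This is a rough sketch with a hypothetical function name. Two assumptions are made: the slide does not say what to do at exactly d.f. = 3, so that case is treated like the larger ones here, and a table with every expected frequency at 5 or more is counted as passing.

```python
def expected_frequencies_ok(expected, df):
    """Apply the slide's rule of thumb to a flat list of expected frequencies."""
    if df == 1:
        return all(fe >= 5 for fe in expected)
    if df == 2:
        return all(fe > 2 for fe in expected)
    # d.f. of 3 or more (treating d.f. = 3 this way is an assumption):
    # at most one expected frequency below 5, and that one must be at least 1.
    small = [fe for fe in expected if fe < 5]
    return len(small) == 0 or (len(small) == 1 and small[0] >= 1)

# Expected frequencies from the 2 x 2 example, where d.f. = 1
print(expected_frequencies_ok([420 / 47, 285 / 47, 896 / 47, 608 / 47], df=1))
```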

Univariate Statistics:

Level       Statistic   Test
Interval    Mean        one-sample t-test
Ordinal     Median
Nominal     Mode        Chi-squared goodness of fit

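Bivariate Statistics (by level of measurement of X and Y):

Nominal and Nominal:    $\chi^2$
Nominal and Ordinal:    Rank-sum, Kruskal-Wallis H
Nominal and Interval:   t-test, ANOVA
Ordinal and Ordinal:    Spearman $r_s$ (rho)
Interval and Interval:  Pearson r, Regression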

Who said this? "The definition of insanity is doing the same thing over and over again and expecting different results".


I don’t like it because from a statistical point of view, it is insane to do the same thing over and over again and expect the same results! More to the point, the wisdom of statistics lies in understanding that repeating things some ways ends up with results that are more the same than others. Hmm. Think about this for a moment. Statistics allows one to understand the expected variability in results even when the same thing is done, as a function of σ and N.
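One way to see the point about σ and N is to repeat "the same thing" many times and watch how much the results vary. The sketch below assumes a normal population and uses illustrative values (mean 100, σ = 10) that are not from the slides. It compares the spread of simulated sample means with the standard error of the mean, σ/√N.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(N)."""
    return sigma / math.sqrt(n)

def sd_of_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n and return the SD of their means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]
    return statistics.stdev(means)

# Illustrative values only (assumptions, not from the slides)
for n in (25, 100):
    print(
        "N =", n,
        "| predicted SE:", standard_error(10, n),
        "| simulated SD of means:", round(sd_of_sample_means(100, 10, n, 2000), 2),
    )
```

Doing the same thing over and over gives different results each time, but the size of the differences is predictable and shrinks as N grows.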

Your turn! Given this start, explain why Uncle Albert heads us down the wrong path. In your answer, make sure you refer to the error statistic (e.g., standard error of the mean, standard error of the difference between means, Mean Square within) as well as the sample size N. In short, explain why statistical thinking is beautiful, and why Albert Einstein (if he ever said it) was wrong.