Distributions of Nominal Variables

Slides:



Advertisements
Similar presentations
CHI-SQUARE(X2) DISTRIBUTION
Advertisements

Basic Statistics The Chi Square Test of Independence.
Hypothesis Testing IV Chi Square.
Chapter 10 Chi-Square Tests and the F- Distribution 1 Larson/Farber 4th ed.
PSY 307 – Statistics for the Behavioral Sciences
BHS Methods in Behavioral Sciences I
Ch 15 - Chi-square Nonparametric Methods: Chi-Square Applications
Chapter 26: Comparing Counts. To analyze categorical data, we construct two-way tables and examine the counts of percents of the explanatory and response.
Chi-Square Test.
1 Nominal Data Greg C Elvers. 2 Parametric Statistics The inferential statistics that we have discussed, such as t and ANOVA, are parametric statistics.
Chi-Square Test A fundamental problem in genetics is determining whether the experimentally determined data fits the results expected from theory (i.e.
Distributions of Nominal Variables 12/02. Nominal Data Some measurements are just types or categories – Favorite color, college major, political affiliation,
The Chi-square Statistic. Goodness of fit 0 This test is used to decide whether there is any difference between the observed (experimental) value and.
11.4 Hardy-Wineberg Equilibrium. Equation - used to predict genotype frequencies in a population Predicted genotype frequencies are compared with Actual.
1 Psych 5500/6500 Chi-Square (Part Two) Test for Association Fall, 2008.
Section 10.1 Goodness of Fit. Section 10.1 Objectives Use the chi-square distribution to test whether a frequency distribution fits a claimed distribution.
Chi-Square as a Statistical Test Chi-square test: an inferential statistics technique designed to test for significant relationships between two variables.
Chapter 9: Non-parametric Tests n Parametric vs Non-parametric n Chi-Square –1 way –2 way.
PSY 307 – Statistics for the Behavioral Sciences Chapter 16 – One-Factor Analysis of Variance (ANOVA)
10.1: Multinomial Experiments Multinomial experiment A probability experiment consisting of a fixed number of trials in which there are more than two possible.
Chapter 20 For Explaining Psychological Statistics, 4th ed. by B. Cohen 1 These tests can be used when all of the data from a study has been measured on.
Chi-Square Test.
Chapter 10 Chi-Square Tests and the F-Distribution
Nonparametric Tests: Chi Square   Lesson 16. Parametric vs. Nonparametric Tests n Parametric hypothesis test about population parameter (  or  2.
GOODNESS OF FIT Larson/Farber 4th ed 1 Section 10.1.
Fitting probability models to frequency data. Review - proportions Data: discrete nominal variable with two states (“success” and “failure”) You can do.
Chi-Square Test James A. Pershing, Ph.D. Indiana University.
Non-parametric tests (chi-square test) Dr. Omar Al Jadaan Assistant Professor – Computer Science & Mathematics.
Statistics in IB Biology Error bars, standard deviation, t-test and more.
Statistics 300: Elementary Statistics Section 11-2.
1 1 Slide © 2008 Thomson South-Western. All Rights Reserved Chapter 12 Tests of Goodness of Fit and Independence n Goodness of Fit Test: A Multinomial.
Chi Square Test for Goodness of Fit Determining if our sample fits the way it should be.
Section 10.1 Goodness of Fit © 2012 Pearson Education, Inc. All rights reserved. 1 of 91.
Chi Square Test of Homogeneity. Are the different types of M&M’s distributed the same across the different colors? PlainPeanutPeanut Butter Crispy Brown7447.
The Chi Square Test A statistical method used to determine goodness of fit Chi-square requires no assumptions about the shape of the population distribution.
Basic Statistics The Chi Square Test of Independence.
Chi-Square Test.
M & M Statistics: A Chi Square Analysis
Chapter 9: Non-parametric Tests
AP Biology Intro to Statistics
Math 4030 – 10b Inferences Concerning Variances: Hypothesis Testing
Distributions of Nominal Variables
Distributions of Nominal Variables
Hypothesis Testing Review
Active Learning Lecture Slides
Chapter 25 Comparing Counts.
Section 10-1 – Goodness of Fit
Data Analysis for Two-Way Tables
Elementary Statistics: Picturing The World
Chi-Square Test.
The Chi Square Test A statistical method used to determine goodness of fit Goodness of fit refers to how close the observed data are to those predicted.
MENDELIAN GENETICS CHI SQUARE ANALYSIS
The Chi Square Test A statistical method used to determine goodness of fit Goodness of fit refers to how close the observed data are to those predicted.
Chi-Square Analysis.
Chi-Square Test.
Goodness of Fit Test - Chi-Squared Distribution
Contingency Tables (cross tabs)
Statistical Process Control
The Chi Square Test A statistical method used to determine goodness of fit Goodness of fit refers to how close the observed data are to those predicted.
Statistical Analysis: Chi Square
Chi-Square Test.
Day 66 Agenda: Quiz Ch 12 & minutes.
Chapter 26 Comparing Counts.
Chapter 26 Comparing Counts Copyright © 2009 Pearson Education, Inc.
UNIT V CHISQUARE DISTRIBUTION
S.M.JOSHI COLLEGE, HADAPSAR
The Binomial Distributions
Chapter 26 Comparing Counts.
Will use Fruit Flies for our example
Presentation transcript:

Distributions of Nominal Variables

Nominal Data Some measurements are just types or categories Favorite color, college major, political affiliation, how you get to school, where you’re from Minimal mathematical structure, but we can still do hypothesis testing Hypotheses about frequencies or probabilities Are all categories equally likely? Do two groups differ in their distributions? Are two nominal variables related or independent?

Extending the Binomial Test Frequency of observations in yes/true category Compare to prediction of null hypothesis Normal approximation Treat binomial distribution as Normal Convert frequency to z-score mfreq f

Extending the Binomial Test Multinomial test Count observations in every category Observed frequencies, f obs Convert each to z-score H0 predicts each z should be near 0 Chi-square statistic Sum of squared z-scores Measures deviation from null hypothesis p-value Probability of result greater than c2 Uses chi-square distribution df = k – 1 (k is number of categories) Counts are not independent; last constrained by rest

Details of z-score Expected frequencies: Standard error Frequency of each category predicted by H0 Expected value or mean of sampling distribution Category probability times number of observations (n) If all categories equally likely: Standard error Denominator of z formula Standard deviation of sampling distribution (adjusted for degrees of freedom) Equals square root of expected frequency:

Example: Favorite Colors Choices: Red, Yellow, Green, Blue, Purple Are they all equally popular? Null hypothesis: For each color, Deviances: Squared z-scores: Chi-square statistic: Critical value (df = 4, a = 5%): 9.49

Independence of Nominal Variables Are two nominal variables related? Same question as correlation, but need different approach Do probabilities for one variable differ between categories of another? Experimental condition vs. success of learning; sex vs. political affiliation; origin vs. major Independent nominal variables Probabilities for each variable unaffected by other Example: 80% from CO, 10% psych majors 80% of psych majors are from CO 80%10% = 8% both psych and from CO p(x & y) = p(x)p(y)

Chi-square Test of Independence Null hypothesis: Variables are independent Use H0 to calculate expected frequencies Find observed marginal frequencies for each variable Total count for each category, ignoring levels of other variable Multiply marginal frequencies to get expected frequency for combination Same formula as before: y1 y2 y3 x1 40 20 10 40 30 20 80 x2 30 30 80 60 10 30 120 x3 20 40 90 80 50 40 160 90 180 360

General Principles of Chi-square Tests Can use any prediction about data as null hypothesis Very general approach Measure goodness of fit Actually badness of fit Deviation of data from prediction Nominal data Calculate z-score for each frequency within its sampling distribution Observed minus expected frequency, divided by sqrt(f exp) Square zs and sum, to get c2 Distribution of one variable; dependence between two variables Compare GoF to chi-square distribution to get p-value p = p(c2df > c2) df comes from number of parameters constrained by H0 k – 1 for multinomial test; (kX – 1)(kY – 1) for independence test