The Chi-Square Distribution and Test for Independence

Hypothesis testing between two categorical variables

Agenda
Wrapping up the t-test discussion: effect size, how big of a sample, and the t-test.do program example
Chi-square and the chi-square test of independence

Discussion from Last Class: Statistical Significance vs. Practical Significance

Effect Size
Effect size is a measure of practical significance, not statistical significance. Cohen’s d is a common effect size calculation. Values of .8 or greater are usually considered large effects, .5 medium, and .2 small.
So, if you found a very large effect size (.9) but the effect was not statistically significant:
1) How would you interpret this result?
2) What could you do about it?
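As a rough illustration of the effect size calculation, the Python sketch below computes Cohen's d for two groups using the pooled standard deviation. The function name and the scores are invented for this example; they do not come from the t-test.do program.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


# Hypothetical scores, invented for this example only.
treatment = [12, 15, 14, 10, 13, 16]
control = [11, 12, 10, 9, 13, 11]
d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.2f}")
```

With only six cases per group, a d this large can still fail a significance test, which is the situation the two questions above ask about.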

How Big of a Sample?
We can use a power analysis to determine how big our sample needs to be in order to reject the null hypothesis that there is no difference.
Requires an idea of the two means and their standard deviations (usually from pre-testing or the literature).
In Stata, use the “sampsi” command.
A power analysis is based on our best guesses about our sample(s). It is not a guarantee of significance.
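To show what such a calculation involves, here is a minimal Python sketch using the standard normal-approximation formula for comparing two means with equal group sizes. The function name, the default alpha and power, and the input values are assumptions made for illustration; the result may not match Stata's output exactly.

```python
import math
from statistics import NormalDist


def n_per_group(mean1, mean2, sd1, sd2, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample test of means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical z for the test
    z_power = NormalDist().inv_cdf(power)          # z matching the desired power
    n = (z_alpha + z_power) ** 2 * (sd1 ** 2 + sd2 ** 2) / (mean1 - mean2) ** 2
    return math.ceil(n)                            # round up to whole cases


# Hypothetical guesses: means of 50 and 55, both standard deviations 10.
print(n_per_group(50, 55, 10, 10))
```

The answer is only as good as the guessed means and standard deviations, so it is worth rerunning with a few plausible values.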

Chi-Square test of independence and the chi-square distribution

Application of the Chi-Square
Used with two categorical variables. Crossing them produces a crosstabulation (crosstab table) of cells.
How do you know if the cell counts are different from what would be expected by chance alone? Or, are the two variables that define the cells independent of one another?
“It’s like playing Russian roulette. If you keep on going, sooner or later you’re going to lose”

Chi-Square Distribution
The chi-square distribution results when independent variables with standard normal distributions are squared and summed.
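The definition above can be checked with a small simulation. This Python sketch (the function name, seed, and number of draws are arbitrary choices for illustration) sums k squared standard normal values many times; the average of the sums should land near k, the mean of a chi-square distribution with k degrees of freedom.

```python
import random
import statistics


def chi_square_draw(k, rng):
    """One draw: the sum of k squared independent standard normal values."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


rng = random.Random(42)  # fixed seed so the run is repeatable
k = 3                    # number of squared normals summed (the degrees of freedom)
draws = [chi_square_draw(k, rng) for _ in range(20000)]
sim_mean = statistics.mean(draws)
print(f"simulated mean = {sim_mean:.2f} (theory: {k})")
```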

Chi-square Degrees of freedom
df = (r-1)(c-1)
where r = # of rows and c = # of columns.
Thus, in any 2x2 contingency table, the degrees of freedom = 1.
As the degrees of freedom increase, the distribution shifts to the right and the critical values of chi-square become larger.
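A short Python sketch of the degrees-of-freedom rule follows; the function name is made up for illustration. It also derives the .05 critical value for df = 1, using the fact that a chi-square variable with 1 degree of freedom is a single squared standard normal. Critical values for larger df need a chi-square table or statistical software.

```python
from statistics import NormalDist


def chi_square_df(n_rows, n_cols):
    """Degrees of freedom for an r x c contingency table: (r - 1)(c - 1)."""
    return (n_rows - 1) * (n_cols - 1)


# With 1 degree of freedom, chi-square is the square of one standard normal,
# so the .05 critical value is the square of the two-sided z critical value.
critical_df1 = NormalDist().inv_cdf(0.975) ** 2

print(chi_square_df(2, 2))     # 1
print(chi_square_df(3, 4))     # 6
print(round(critical_df1, 3))  # 3.841
```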

Chi-Square Test of Independence

Using the Chi-Square Test
Often used with contingency tables (i.e., crosstabulations), e.g., gender x student.
The chi-square test of independence tests whether the columns are contingent on the rows in the table.
In this case, the null hypothesis is that there is no relationship between row and column frequencies.
H0: The 2 variables are independent.

Requirements for Chi-Square test
Must be a random sample from the population
Data must be in raw frequencies (counts, not percentages)
Observations must be independent of one another
Categories for each variable must be mutually exclusive and exhaustive

Example Crosstab: Gender x Student
             Student       Not Student   Total
Males        46 (40.97)    71 (76.02)    117
Females      37 (42.03)    83 (77.97)    120
Total        83            154           237
Cell entries: Observed (Expected)
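The expected counts and the test statistic for this crosstab can be worked through by hand. The Python sketch below is one way to do it, assuming a plain Pearson chi-square with no continuity correction; the variable names are illustrative. Each expected count is (row total x column total) / n, and the statistic sums (observed - expected)^2 / expected over the four cells.

```python
from statistics import NormalDist

observed = [[46, 71],   # Males: Student, Not Student
            [37, 83]]   # Females: Student, Not Student

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

# df = 1 here, so chi-square is a squared standard normal and the p-value
# can be taken from the normal distribution (this shortcut only works for df = 1).
p_value = 2 * (1 - NormalDist().cdf(chi2 ** 0.5))
critical_value = NormalDist().inv_cdf(0.975) ** 2  # about 3.84 at alpha = .05

print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
print("reject H0" if chi2 > critical_value else "fail to reject H0")
```

For these counts the statistic falls below the critical value, so the data give no grounds to reject the hypothesis that gender and student status are independent.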

Special Cases
Fisher’s Exact Test
Use when you have a 2 x 2 table with expected frequencies less than 5.
Strength of Association
Some use Cramer’s V (for any two nominal variables) or Phi (for 2 x 2 tables) to give a value of association between the variables. Cramer’s V is interpreted much like a correlation. Values range from 0 to 1, where under .2 is weak, .2 to .4 is strong, and anything higher is very strong.
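As an illustration of the strength-of-association measure, this Python sketch computes Cramer's V as sqrt(chi-square / (n x min(r-1, c-1))), which for a 2 x 2 table equals the absolute value of Phi. The function name is hypothetical, and the counts are the gender x student counts from the example crosstab.

```python
import math


def cramers_v(table):
    """Cramer's V = sqrt(chi2 / (n * min(r - 1, c - 1))) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / n  # expected count under independence
            chi2 += (obs - exp) ** 2 / exp
    return math.sqrt(chi2 / (n * min(len(table) - 1, len(col_totals) - 1)))


v = cramers_v([[46, 71], [37, 83]])  # the gender x student counts
print(f"Cramer's V = {v:.3f}")
```

The value comes out well under .2, which the rule of thumb above would call a weak association.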

Practical Examples: chi2dist.do, chisquare.do, Auto.dta
