Published by Austen Chandler. Modified over 9 years ago.
1
Lecture 8 Chi-Square STAT 3120 Statistical Methods I
2
STAT3120 – Chi Square
Dependent Variable | Independent (predictor) Variable | Statistical Test | Comments
Quantitative | Categorical | T-test (one, two or paired sample) | Determines if the categorical variable (factor) affects the dependent variable; typically used for experimental or planned change studies
Quantitative | Quantitative | Correlation / Regression Analysis | Test establishes a regression model; used to explain, predict or control the dependent variable
Categorical | Categorical | Chi-Square | Tests if the variables are statistically independent (i.e. are they related or not?)
3
STAT3120 – Chi Square
When presented with categorical data, one common method of analysis is the “Contingency Table” or “Cross Tab”, which is a convenient way to display frequencies. For example, let's say that a firm has the following data:
- 120 male and 80 female employees
- 40 males and 10 females have been promoted
4
STAT3120 – Chi Square
Using this data, we could create the following 2x2 matrix:
| Promoted | Not Promoted | Total
Male | 40 | 80 | 120
Female | 10 | 70 | 80
Total | 50 | 150 | 200
5
STAT3120 – Chi Square
Now, a few questions…
1) From the data, what is the probability of being promoted?
2) Given that you are MALE, what is the probability of being promoted?
3) Given that you are promoted, what is the probability that you are MALE?
4) Given that you are FEMALE, what is the probability of being promoted?
5) Given that you are promoted, what is the probability that you are FEMALE?
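The five questions can be checked with a short calculation. The following is a minimal Python sketch that uses only the counts from the promotion example; the variable and function names are illustrative choices, not part of the lecture.

```python
# Cell counts from the promotion example: (gender, status) -> count.
counts = {
    ("male", "promoted"): 40,
    ("male", "not_promoted"): 80,
    ("female", "promoted"): 10,
    ("female", "not_promoted"): 70,
}

def total(gender=None, status=None):
    """Sum the cells that match the given gender and/or status (None = any)."""
    return sum(
        n for (g, s), n in counts.items()
        if (gender is None or g == gender) and (status is None or s == status)
    )

n = total()  # all employees

# 1) P(promoted)
p_promoted = total(status="promoted") / n
# 2) P(promoted | male)
p_promoted_given_male = counts[("male", "promoted")] / total(gender="male")
# 3) P(male | promoted)
p_male_given_promoted = counts[("male", "promoted")] / total(status="promoted")
# 4) P(promoted | female)
p_promoted_given_female = counts[("female", "promoted")] / total(gender="female")
# 5) P(female | promoted)
p_female_given_promoted = counts[("female", "promoted")] / total(status="promoted")

for label, value in [
    ("P(promoted)", p_promoted),
    ("P(promoted | male)", p_promoted_given_male),
    ("P(male | promoted)", p_male_given_promoted),
    ("P(promoted | female)", p_promoted_given_female),
    ("P(female | promoted)", p_female_given_promoted),
]:
    print(f"{label} = {value:.3f}")
```

Comparing 2) with 4) is the informal version of the question the Chi-Square test asks: if promotion and gender were unrelated, the two conditional probabilities would be roughly equal.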
6
STAT3120 – Chi Square
The answers to these questions help us start to understand if promotion status and gender are related. Specifically, we could test this relationship using a Chi-Square test, which is used to determine if two categorical variables are related. The relevant hypothesis statements for a Chi-Square test are:
H0: Variable 1 and Variable 2 are NOT Related
Ha: Variable 1 and Variable 2 ARE Related
Develop the appropriate hypothesis statements and testing matrix for the gender/promotion data.
7
STAT3120 – Chi Square
The Chi-Square Test uses the χ² test statistic, whose distribution is skewed to the right (it approaches normality as the degrees of freedom increase). You can see an example of the distribution on pg 641. The χ² test statistic calculation can be found on page 640. The observed counts are provided in the dataset. The expected counts are the counts that would be expected if there were NO relationship between the two variables.
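For reference, the statistic in its usual textbook form (Pearson's chi-square) is written out below. The symbols O_i and E_i for the observed and expected count in cell i are notation chosen here, not taken from the slides, and the sum runs over every interior cell of the table.

```latex
\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}
```

Each term measures how far one observed count is from its expected count, scaled by the expected count, so large values of χ² indicate a poor match with the "no relationship" assumption.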
8
STAT3120 – Chi Square
Going back to our example, the data provided is “observed”:
| Promoted | Not Promoted | Total
Male | 40 | 80 | 120
Female | 10 | 70 | 80
Total | 50 | 150 | 200
What would the matrix look like if there was no relationship between promotion status and gender? The resulting matrix would be “expected”…
9
STAT3120 – Chi Square
From the data, 25% of all employees were promoted. Therefore, if gender plays no role, then we should see 25% of the males promoted (75% not promoted) and 25% of the females promoted…
| Promoted | Not Promoted | Total
Male | 120 × 0.25 = 30 | 120 × 0.75 = 90 | 120
Female | 80 × 0.25 = 20 | 80 × 0.75 = 60 | 80
Total | 50 | 150 | 200
Notice that the marginal values did not change…only the interior values changed.
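The expected matrix can also be built directly from the marginal totals: each expected count is (row total × column total) / grand total, which is the same calculation as applying the overall 25% / 75% split to each row. A minimal Python sketch, with illustrative variable names:

```python
# Observed counts: rows are Male, Female; columns are Promoted, Not Promoted.
observed = [
    [40, 80],
    [10, 70],
]

row_totals = [sum(row) for row in observed]        # [120, 80]
col_totals = [sum(col) for col in zip(*observed)]  # [50, 150]
grand_total = sum(row_totals)                      # 200

# Expected count under "no relationship": row total * column total / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for name, row in zip(["Male", "Female"], expected):
    print(name, row)
```

Because the expected counts are derived from the marginals, the row and column totals of the expected matrix match those of the observed matrix.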
10
STAT3120 – Chi Square
Now, calculate the χ² statistic using the observed and the expected matrices:
$$\chi^2 = \frac{(40-30)^2}{30} + \frac{(80-90)^2}{90} + \frac{(10-20)^2}{20} + \frac{(70-60)^2}{60} = 3.33 + 1.11 + 5.00 + 1.67 = 11.11$$
This is conceptually equivalent to a t-statistic or a z-score.
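The same arithmetic can be sketched in Python. The function name `chi_square_statistic` is a hypothetical helper introduced for illustration; the observed and expected counts are the ones from the promotion example.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over every interior cell of two same-shaped tables."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Rows are Male, Female; columns are Promoted, Not Promoted.
observed = [[40, 80], [10, 70]]
expected = [[30, 90], [20, 60]]

chi2_stat = chi_square_statistic(observed, expected)
print(f"chi-square statistic = {chi2_stat:.2f}")
```

Only the interior cells enter the sum; the marginal totals are identical in both tables and contribute nothing.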
11
STAT3120 – Chi Square
To determine if this is in the rejection region, we must determine the df and then use the table on page 732. df = (r-1)*(c-1)… In the current example, we have two rows and two columns, so df = 1*1 = 1. At alpha = .05 and 1 df, the critical value is 3.84…our value of 11.11 is clearly in the rejection region…so what does this mean?
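The table lookup can be reproduced for this particular case with the Python standard library. The sketch below relies on a fact that holds only for df = 1: a chi-square variable with one degree of freedom is the square of a standard normal variable, so the critical value and p-value can be obtained from the normal distribution. For other df values a chi-square table or a statistics package would be needed. Variable names are illustrative.

```python
import math
from statistics import NormalDist

rows, cols = 2, 2
df = (rows - 1) * (cols - 1)  # 1 for a 2x2 table
alpha = 0.05
chi2_stat = 100 / 9           # the 11.11 from the promotion example

# ASSUMPTION: df == 1, so chi-square = Z^2 and the normal distribution suffices.
assert df == 1
critical_value = NormalDist().inv_cdf(1 - alpha / 2) ** 2  # about 3.84
p_value = math.erfc(math.sqrt(chi2_stat / 2))              # P(chi-square > chi2_stat)

reject_h0 = chi2_stat > critical_value
print(f"df = {df}, critical value = {critical_value:.2f}")
print(f"statistic = {chi2_stat:.2f}, p-value = {p_value:.5f}, reject H0: {reject_h0}")
```

Since the statistic exceeds the critical value, H0 (promotion status and gender are NOT related) is rejected at alpha = .05: the observed counts differ from the expected counts by more than chance alone would plausibly produce. The test indicates an association; it does not by itself explain the cause.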
12
In the book Outliers, Malcolm Gladwell makes the point that the month in which a boy is born affects his probability of playing in the NHL. The months of birth for players in the NHL are on the next page… (data taken from http://sports.espn.go.com/espn/page2/story?page=merron/081208)
13
STAT3120 – Chi Square
Month | Players
January | 51
February | 46
March | 61
April | 49
May | 46
June | 49
July | 36
August | 41
September | 36
October | 34
November | 33
December | 30
Now, if there is NO relationship between birth month and playing hockey, what SHOULD the distribution of months look like? Let's do this one in Excel… Note that this is technically referred to as a “goodness of fit” test, where we are assessing if the actual distribution “fits” what would be expected.
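As an alternative to a spreadsheet, the goodness of fit calculation can be sketched in Python. Two assumptions are made here that the slides do not state: the expected count is taken to be the same for every month (total players / 12, ignoring differences in month length), and the critical value 19.675 is the standard chi-square table value for alpha = .05 with 11 degrees of freedom (df = number of categories - 1).

```python
# Observed NHL player counts by birth month, as listed on the slide.
observed = {
    "January": 51, "February": 46, "March": 61, "April": 49,
    "May": 46, "June": 49, "July": 36, "August": 41,
    "September": 36, "October": 34, "November": 33, "December": 30,
}

n_players = sum(observed.values())
n_months = len(observed)

# ASSUMPTION: with no relationship, every month gets an equal share of players.
expected_per_month = n_players / n_months

chi2_stat = sum(
    (count - expected_per_month) ** 2 / expected_per_month
    for count in observed.values()
)

df = n_months - 1        # goodness of fit: categories - 1
critical_value = 19.675  # ASSUMPTION: table value for alpha = .05, df = 11

reject_h0 = chi2_stat > critical_value
print(f"players = {n_players}, expected per month = {expected_per_month:.2f}")
print(f"chi-square = {chi2_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```

A refinement would weight the expected counts by the number of days in each month (or by actual birth rates per month) rather than assuming equal shares.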
14
STAT3120 – Chi Square
Practice Problems for Chi-Square: 15.55, 15.56, 15.57, 15.58
For all of these, identify the hypothesis statements, the testing matrix, and the decision.