Chi-Square (χ²). Parking lot exercise: graph the distribution of car values for each parking lot; fill in the frequency and percentage tables.


1 Chi-Square (χ²)

2 Parking lot exercise
- Graph the distribution of car values for each parking lot
- Fill in the frequency and percentage tables

3

4

5 The “null” hypothesis
- Inferential statistics use samples to draw conclusions about a population
- Whenever we use inferential statistics the “null hypothesis” applies
  – Null hypothesis: any apparent effect of the independent variable(s) on the dependent variable(s) was produced by chance
  – Unless you can show otherwise, THE NULL IS ASSUMED TO BE TRUE
- Researchers want to REJECT the null hypothesis
  – Rejecting the null hypothesis supports the working hypothesis
- The only way to reject the null is for the results of statistical tests (e.g., difference between the means) to be very, very substantial
- How substantial? The test statistic (e.g., r, t, b, χ²) must be so large that it goes well beyond what one would expect from sampling error alone
- How far is that? To reject the null, the probability of getting a result this large by chance if the null were true must be LESS than 5 in 100 (p < .05)
- How do we know if it is?
  – If you’re doing the computation, compare the test statistic to a table
  – If you’re reading a study, is there an asterisk by the test statistic?
- Usually one asterisk (*) means p < .05: fewer than 5 chances in 100 of a result this large if the null is true. Two asterisks (**) are better (p < .01, fewer than 1 in 100). Three (***) is great (p < .001, fewer than 1 in 1,000).
- If there are NO asterisks, we cannot reject the null hypothesis

6 Chi-square (χ²)
- Example: Does gender affect court disposition?
- Used with moderate-size random samples
- Tests for a relationship between two nominal variables (categorical, cannot be ordered) that have been cross-tabulated
- Evaluates the difference between Observed and Expected cell frequencies:
  – “Observed” means the cell frequencies that are actually present
  – “Expected” means the cell frequencies we would “expect” if there were no relationship between the variables (null hypothesis is true)
  – If there is no difference, χ² is zero
  – The greater the difference, the larger the value of χ²
- Chi-square (χ²) is used when all variables are categorical (not ordinal)

7 Class exercise
Hypothesis: Gender → Disposition
Observed cell frequencies (court disposition)

| Gender | Jail | Released | Total |
|--------|------|----------|-------|
| Male   | 84   | 16       | 100   |
| Female | 30   | 20       | 50    |
| Total  | 114  | 36       | n = 150 |

8 Creating the “Expected” table: cell frequencies if the null hypothesis is true

Expected cell frequency = (independent variable category total ÷ grand total) × dependent variable category total

| Gender | Jail | Released | Total |
|--------|------|----------|-------|
| Male   |      |          | 100   |
| Female |      |          | 50    |
| Total  | 114  | 36       | n = 150 |

- Male/Jail: 100/150 × 114 = 76
- Male/Released: 100/150 × 36 = 24
- Female/Jail: 50/150 × 114 = 38
- Female/Released: 50/150 × 36 = 12

9 Expected frequencies (court disposition)

| Gender | Jail | Released | Total |
|--------|------|----------|-------|
| Male   | 76   | 24       | 100   |
| Female | 38   | 12       | 50    |
| Total  | 114  | 36       | n = 150 |
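The expected-frequency rule from the previous slide can be sketched in a few lines of Python. This is a minimal illustration and not part of the original slides; the function name `expected_frequencies` is made up for this example.

```python
def expected_frequencies(observed):
    """Return the expected cell frequencies for a cross-tabulation.

    Each expected cell = (row total / grand total) * column total,
    i.e. what we would see if the null hypothesis were true.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Observed court dispositions: rows = Male, Female; columns = Jail, Released
observed = [[84, 16], [30, 20]]
expected = expected_frequencies(observed)
print(expected)  # [[76.0, 24.0], [38.0, 12.0]]
```

The row and column totals of the expected table match those of the observed table, which is a quick way to check the arithmetic.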

10 Obtaining χ²

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

- O = observed frequency
- E = expected frequency (what we would get if the null hypothesis is true)
- χ² is the ratio of systematic variation to chance variation
- The higher the ratio (the more the systematic variation exceeds the chance variation), the more likely that we can reject the null
- Chi-square is not a good measure because its significance level is closely tied to sample size
- It over-estimates significance with very large samples and under-estimates it with very small samples

11 Observed frequencies (court disposition)

| Gender | Jail | Released | Total |
|--------|------|----------|-------|
| Male   | 84   | 16       | 100   |
| Female | 30   | 20       | 50    |
| Total  | 114  | 36       | n = 150 |

Expected frequencies (court disposition)

| Gender | Jail | Released | Total |
|--------|------|----------|-------|
| Male   | 76   | 24       | 100   |
| Female | 38   | 12       | 50    |
| Total  | 114  | 36       | n = 150 |

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(84-76)^2}{76} + \frac{(16-24)^2}{24} + \frac{(30-38)^2}{38} + \frac{(20-12)^2}{12} = 10.5$$

12 χ² = 10.5
df = (r − 1) × (c − 1) = (2 − 1) × (2 − 1) = 1
Reject the null hypothesis: there is less than one chance in a hundred that the relationship between gender and court disposition is due to chance (p < .01)
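For readers who want the probability itself rather than a table lookup, here is a small sketch that covers the df = 1 case only. It relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable, so its upper-tail probability can be written with the complementary error function. The function name `p_value_df1` is hypothetical and not from the slides.

```python
import math

def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square statistic with df = 1.

    With one degree of freedom, chi-square is a squared standard normal,
    so P(X2 >= x) = erfc(sqrt(x / 2)). Not valid for other df.
    """
    return math.erfc(math.sqrt(chi_square / 2))

p = p_value_df1(10.5)
print(p < 0.01)   # True: significant at the .01 level
print(p < 0.001)  # False: not significant at the .001 level
```

This agrees with the slide's conclusion: the result clears the .01 threshold but not the .001 threshold.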

13 Class exercise
Hypothesis: More building alarms → Less crime (building alarms lead to less crime)
- Randomly sampled 120 businesses with alarms: 50 had crimes, 70 didn’t
- Randomly sampled 90 businesses without alarms: 50 had crimes, 40 didn’t
- Build an observed table, then an expected table
- Remember, they’re tables, so place the values of the independent variable in rows
- Compute χ²

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

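14 Observed (obtained) frequencies

| Alarm | Crime: Y | Crime: N | Total |
|-------|----------|----------|-------|
| Y     | 50       | 70       | 120   |
| N     | 50       | 40       | 90    |
| Total | 100      | 110      | 210   |

Expected (by chance) frequencies

| Alarm | Crime: Y | Crime: N | Total |
|-------|----------|----------|-------|
| Y     | 57       | 63       | 120   |
| N     | 43       | 47       | 90    |
| Total | 100      | 110      | 210   |

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(50-57)^2}{57} + \frac{(70-63)^2}{63} + \frac{(50-43)^2}{43} + \frac{(40-47)^2}{47} = 3.82$$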

15 χ² = 3.82
df = (r − 1) × (c − 1) = (2 − 1) × (2 − 1) = 1
To reject at the .05 level need χ² = 3.841 or greater
Cannot reject the null hypothesis: NO significant relationship; what’s there could be due to chance
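Because 3.82 sits so close to the 3.841 cutoff, it may be worth checking how much the whole-number rounding of the expected frequencies matters. The Python sketch below recomputes the statistic both ways; the helper name `chi_square` is an assumption made for illustration and does not come from the slides.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Rows = alarm (Y, N); columns = crime (Y, N)
observed = [[50, 70], [50, 40]]

# Expected frequencies as rounded on the slide
rounded_expected = [[57, 63], [43, 47]]

# Expected frequencies without rounding: row total * column total / n
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
exact_expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_rounded = chi_square(observed, rounded_expected)
chi_exact = chi_square(observed, exact_expected)
print(round(chi_rounded, 2))  # 3.82
print(round(chi_exact, 2))    # 3.98
```

With unrounded expected frequencies the statistic comes out near 3.98, which falls on the other side of the 3.841 cutoff. Borderline results like this one deserve a second look before drawing a conclusion.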

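16 Demonstrating the meaning of “expected”
Checking the expected frequencies table by converting it into percentages

Expected (by chance) frequencies

| Alarm | Crime: Y | Crime: N | Total |
|-------|----------|----------|-------|
| Y     | 57       | 63       | 120   |
| N     | 43       | 47       | 90    |
| Total | 100      | 110      | 210   |

Expected (by chance) frequencies, as row percentages

| Alarm | Crime: Y | Crime: N | Total |
|-------|----------|----------|-------|
| Y     | 47%      | 53%      | 120   |
| N     | 48%      | 52%      | 90    |
| Total | 100      | 110      | 210   |

- In a properly done expected table, as you change the value of the independent variable, the distribution across the dependent variable shouldn’t change (the one-point difference between the rows here comes from rounding)
- A properly done expected table will always show no relationship: it’s the null hypothesis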
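The percentage check can be sketched in Python as follows. This is an illustration rather than part of the slides: it uses unrounded expected frequencies, and the function name `row_percentages` is hypothetical.

```python
def row_percentages(table):
    """Convert each row of frequencies into percentages of its row total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]

# Unrounded expected frequencies for the alarm exercise:
# row total * column total / grand total
expected = [
    [120 * 100 / 210, 120 * 110 / 210],  # alarm: Y
    [90 * 100 / 210, 90 * 110 / 210],    # alarm: N
]

for row in row_percentages(expected):
    print([round(pct, 1) for pct in row])  # each row: [47.6, 52.4]
```

Both rows show the same split, which is what "no relationship" looks like in a table.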

17 Back to the parking lots…
- Use the frequency (not percentage) table to create a “frequencies expected” table (meaning, expected if there is no relationship)
- This table should artificially reflect no relationship between income and car value
- Instructions on next slide…

18 Computing expected frequencies

$$E = \frac{\text{row marginal}}{\text{total cases}} \times \text{column marginal}$$

19 Expected frequencies
- Now compute the Chi-Square
- Instructions on next slide

20 Computing Chi-Square
1. Cell by corresponding cell, subtract EXPECTED from OBSERVED.
2. Square each difference.
3. Divide each result by the frequency EXPECTED.
4. Total them up.
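The four steps can be traced one at a time in Python. Since the parking lot tables are not reproduced in this transcript, the sketch below borrows the court disposition numbers from the earlier class exercise; the variable names are illustrative only.

```python
# Court disposition example, cells listed as:
# Male/Jail, Male/Released, Female/Jail, Female/Released
observed = [84, 16, 30, 20]
expected = [76, 24, 38, 12]

# Step 1: subtract EXPECTED from OBSERVED, cell by cell
differences = [o - e for o, e in zip(observed, expected)]

# Step 2: square each difference
squares = [d ** 2 for d in differences]

# Step 3: divide each result by the EXPECTED frequency
ratios = [s / e for s, e in zip(squares, expected)]

# Step 4: total them up
chi_square = sum(ratios)

print(differences)           # [8, -8, -8, 8]
print(squares)               # [64, 64, 64, 64]
print(round(chi_square, 1))  # 10.5
```

In a 2 × 2 table every difference has the same absolute size, so the cells with the smallest expected frequencies contribute the most to the total.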

21

22 In scientific research the greatest risk we can take of being wrong is five in one hundred (.05 column). Our Chi-square, 8.66, is more than the minimum required of 7.815. So we can reject the NULL hypothesis and accept the WORKING hypothesis that higher income persons drive more expensive cars.
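The table lookup can be sketched as a small Python dictionary. The critical values below are the standard chi-square table entries for 1 to 3 degrees of freedom; the slide's 7.815 matches the .05 entry for df = 3, although the slide itself does not state the degrees of freedom. The names `CRITICAL_VALUES` and `strictest_level` are made up for this example.

```python
# Chi-square critical values from a standard table, keyed by df.
# Each entry lists (significance level, minimum chi-square needed).
CRITICAL_VALUES = {
    1: [(0.05, 3.841), (0.01, 6.635), (0.001, 10.828)],
    2: [(0.05, 5.991), (0.01, 9.210), (0.001, 13.816)],
    3: [(0.05, 7.815), (0.01, 11.345), (0.001, 16.266)],
}

def strictest_level(chi_square, df):
    """Return the smallest tabled level the statistic reaches, or None."""
    reached = [
        level for level, cutoff in CRITICAL_VALUES[df] if chi_square >= cutoff
    ]
    return min(reached) if reached else None

print(strictest_level(8.66, 3))  # 0.05
print(strictest_level(10.5, 1))  # 0.01
print(strictest_level(40.1, 1))  # 0.001
print(strictest_level(3.82, 1))  # None
```

A return value of `None` corresponds to "no asterisks": the statistic does not reach even the .05 column.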

23 Homework

24 Homework exercise
Hypothesis: Sergeants have more stress than patrol officers
1. Calculate expected cell frequencies (null hypothesis of no relationship is true)
2. Compute Chi-square
3. Use table in Appendix E to determine your chi-square’s probability level
4. Can we reject the null hypothesis?

25 Homework answer
Observed cell frequencies: 30, 60, 86, 24
Expected cell frequencies: 52, 38, 64, 46

$$\chi^2 = \frac{(30-52)^2}{52} + \frac{(60-38)^2}{38} + \frac{(86-64)^2}{64} + \frac{(24-46)^2}{46} = 40.1$$

26 χ² = 40.1
df = (r − 1) × (c − 1) = (2 − 1) × (2 − 1) = 1
To reject at the .05 level need χ² = 3.841 or greater
Reject the null hypothesis: less than 1 chance in 1,000 that the relationship is due to chance

27 Practice for the final

28 You will test a hypothesis using two categorical variables and determine whether the independent variable has a statistically significant effect.
- You will be asked to state the null hypothesis.
- You will use supplied data to create an Observed frequencies table.
- You will use it to create an Expected frequencies table. You will be given a formula but should know the procedure.
- You will compute the Chi-Square statistic and degrees of freedom. You will be given formulas but should know the procedures by heart.
- You will use the Chi-Square table to determine whether the results support the working hypothesis.
  – Print and bring to class: http://www.sagepub.com/fitzgerald/study/materials/appendices/app_e.pdf
- Sample question: Hypothesis is that alarm systems prevent burglary. Random sample of 120 businesses with an alarm system and 90 without. Fifty businesses of each kind were burglarized.
  – Null hypothesis: No significant difference in crime between businesses with and without alarms
  – Observed frequencies and Expected frequencies: tables shown on the slide

29

$$\frac{(50-57)^2}{57} + \frac{(70-63)^2}{63} + \frac{(50-43)^2}{43} + \frac{(40-47)^2}{47} = .86 + .78 + 1.14 + 1.04 = 3.82$$

  – Chi-Square = 3.82
  – df = (r − 1) × (c − 1) = 1
  – Check the table. Do the results support the working hypothesis? No. Chi-Square must be at least 3.84 to reject the null hypothesis of no relationship between alarm systems and crime at the .05 level (no more than five chances in 100 of a result this large if the null is true)

