Chi-Square (χ²)

Review: the “null” hypothesis
Inferential statistics are used to test hypotheses
Whenever we use inferential statistics the “null hypothesis” applies
– Null hypothesis: There is no relationship between variables. Any apparent effect was produced by chance
– To reject the null, the test statistic (e.g., R², t, b, χ², etc.) must be so large that a value at least that extreme would turn up fewer than five times in one hundred (p < .05) if the null were true
How do we decide whether to reject the null?
– Compare the test statistic to a table
– “Probability” or p means the chance of getting a test statistic at least this large if the null hypothesis is true. It is not the probability that the null itself is true
– In a study, look for asterisks in the statistic’s column. If there is no asterisk, the null for that relationship cannot be rejected
– Usually one asterisk (*) means p < .05 (a result this extreme would occur by chance fewer than 5 times in 100). Two asterisks (**) is better (p < .01, fewer than one in 100). Three (***) is great (p < .001, fewer than one in 1,000)
[Figure: scale running from “Null hypothesis is true” to “Reject null hypothesis”]

Chi-Square (χ²)
A test statistic, used to test hypotheses
Tests for relationship between two categorical variables (nominal or ordinal)
Yields a coefficient that can be looked up in a table
– The larger the coefficient, the less consistent the data are with the null hypothesis
Evaluates difference between Observed and Expected cell frequencies:
– “Observed” means the actual data
– “Expected” means what we would expect if there was no relationship between the variables
– If there is no difference between observed and expected frequencies, χ² is zero and the data give no reason to doubt the null hypothesis
– Greater the difference, the larger the value of χ², thus the stronger the evidence against the null hypothesis
We will always place the values of the IV in rows, and of the DV in columns. It can be done the other way, and does not affect computing χ².
Hypothesis: Gender → Court disposition
Court disposition (observed)
Gender    Jail   Released   Total
Male        84         16     100
Female      30         20      50
Total      114         36     n = 150
Court disposition (expected)
Gender    Jail   Released   Total
Male        76         24     100
Female      38         12      50
Total      114         36     n = 150

Building the “expected” table
Hypothesis: Gender → Court disposition
“Observed” table - the actual data
Court disposition (observed)
Gender    Jail   Released   Total
Male        84         16     100
Female      30         20      50
Total      114         36     n = 150
“Expected” table - what you expect if the null hypothesis of no relationship is true. Create a new table from scratch:
1. Bring over the “marginals” - all the totals
Court disposition (expected)
Gender    Jail   Released   Total
Male                           100
Female                          50
Total      114         36     n = 150
2. Fill in each cell, one at a time: divide its row total by the grand total, then multiply by its column total
Cells to fill in: Male/Jail, Male/Released, Female/Jail, Female/Released

Building the “expected” table
Hypothesis: Gender → Court disposition
“Observed” table - the actual data
Court disposition (observed)
Gender    Jail   Released   Total
Male        84         16     100
Female      30         20      50
Total      114         36     n = 150
“Expected” table - “expected” frequencies if the null hypothesis of no relationship between variables is true. Create a new table from scratch:
1. Bring over the “marginals” - all the totals
2. Fill in each cell, one at a time: divide its row total by the grand total, then multiply by its column total
Male/Jail: 100/150 X 114 = 76
Male/Released: 100/150 X 36 = 24
Female/Jail: 50/150 X 114 = 38
Female/Released: 50/150 X 36 = 12
Court disposition (expected)
Gender    Jail   Released   Total
Male        76         24     100
Female      38         12      50
Total      114         36     n = 150
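
The fill-in rule above (row total divided by grand total, times column total) can be sketched in a few lines of Python. This is only an illustration: the function name `expected_table` and the list-of-rows layout are assumptions made for the example, not part of the slides.

```python
def expected_table(observed):
    """Expected frequencies under the null hypothesis.

    `observed` is a list of rows (IV values in rows, DV values in columns).
    Each expected cell is row_total / grand_total * column_total.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r / grand_total * c for c in col_totals] for r in row_totals]


# Gender by court disposition: rows are Male, Female; columns are Jail, Released.
observed = [[84, 16], [30, 20]]
expected = expected_table(observed)

for label, row in zip(["Male", "Female"], expected):
    print(label, [round(value, 1) for value in row])
```

For this table the expected frequencies come out as whole numbers, so no rounding is needed.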

Demonstrating the meaning of “expected”
Checking the expected frequencies table by converting it into percentages
In an expected table, as the value of the independent variable changes, the distribution across the dependent variable should remain the same
In this example, as we switch the value of independent variable gender, the distribution across dependent variable court disposition doesn’t change
A properly done expected table will always show no relationship -- it’s the null hypothesis!
Court disposition (expected freqs.)
Gender    Jail   Released   Total
Male        76         24     100
Female      38         12      50
Total      114         36     n = 150
Court disposition (expected pcts.)
Gender    Jail   Released   Total
Male       76%        24%    100%
Female     76%        24%    100%
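
The percentage check described above is easy to automate. The sketch below converts each row of a frequency table to row percentages; `row_percentages` is a hypothetical helper name chosen for this example. The observed table is included only for contrast.

```python
def row_percentages(table):
    """Convert each row of a frequency table to percentages of its row total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]


expected = [[76, 24], [38, 12]]   # Male, Female by Jail, Released
observed = [[84, 16], [30, 20]]

expected_pcts = row_percentages(expected)
observed_pcts = row_percentages(observed)

print("expected:", expected_pcts)  # both rows share one distribution
print("observed:", observed_pcts)  # the rows differ
```

In the expected table every row has the same distribution (76% / 24%), which is what "no relationship" means. The observed rows differ (84% / 16% for males, 60% / 40% for females).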

Comparing the observed and expected tables: the meaning of Chi-Square (χ²)
The observed table is the data, as we find it
The expected table is purposely built to demonstrate no relationship between variables. It “is” the null hypothesis.
To determine whether the observed table demonstrates a relationship between variables, we compare its cell frequencies to those in the “expected” table
– The less similar the tables, the stronger the evidence against the null hypothesis and in favor of the working hypothesis
χ² is a ratio that reflects the dissimilarity in cell frequencies. The more dissimilar, the larger the χ².
$\chi^2 = \sum \frac{(O - E)^2}{E}$
O = observed (actual) frequency
E = expected frequency (if null hypothesis is true)
More formally, χ² is the ratio of systematic variation to chance variation. The larger the ratio, the more likely that we can reject the null hypothesis.
Chi-square is not always a good measure because its behavior is closely tied to sample size.
– With very large samples even trivial differences reach significance; with small samples real differences can be missed
– Ideal sample size is around 150, with no expected cell frequency less than 5

Computing χ²
Observed frequencies
Court disposition
Gender    Jail   Released   Total
Male        84         16     100
Female      30         20      50
Total      114         36     n = 150
Expected frequencies
Court disposition
Gender    Jail   Released   Total
Male        76         24     100
Female      38         12      50
Total      114         36     n = 150
Always pair up the corresponding cells and divide by the expected frequency
$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(84-76)^2}{76} + \frac{(16-24)^2}{24} + \frac{(30-38)^2}{38} + \frac{(20-12)^2}{12} = 10.5$
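
As a check on the arithmetic, the cell-by-cell computation can be written as a short Python sketch. The function name `chi_square` is an assumption made for this example; it simply sums (O - E)² / E over paired cells.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over corresponding cells of two tables."""
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


observed = [[84, 16], [30, 20]]   # Male, Female by Jail, Released
expected = [[76, 24], [38, 12]]

statistic = chi_square(observed, expected)
print(round(statistic, 1))
```

Each cell differs from its expected value by 8, so every numerator is 64; the cells with the smallest expected frequencies contribute the most to the total.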

Assessing the significance of χ²
To reject the null hypothesis a test statistic, such as χ², must be of sufficient magnitude. The larger the better!
df = (rows minus 1) X (columns minus 1): (r - 1) X (c - 1) = (2 - 1) X (2 - 1) = 1
In social science research we reject the null hypothesis when a result this extreme would occur fewer than five times in 100 (p < .05) if the null were true.
Our chi-square is larger than what we need: χ² = 10.5 exceeds the value required at the .01 level for df = 1 (6.635), so p < .01. It falls just short of the value required at the .001 level (10.828).
Our observed data have proven so different from what would be expected if there was no relationship between variables that we can reject the null hypothesis of no relationship. The data thus support the working hypothesis that gender affects disposition. There is less than one chance in a hundred that a difference this large would arise by chance alone!
[Figure: scale running from “Null hypothesis is true” to “Reject null hypothesis”, with χ² = 10.5 marked]
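
The degrees-of-freedom rule and the probability level can also be checked numerically. The sketch below is an illustration, not the slides' method: the slides use a printed table, whereas this computes the probability directly. It relies on the fact that for one degree of freedom the upper-tail probability of χ² equals `erfc(sqrt(x / 2))`, so it only applies to 2 X 2 tables. The function names are hypothetical.

```python
import math


def degrees_of_freedom(n_rows, n_cols):
    """df = (rows - 1) * (columns - 1)."""
    return (n_rows - 1) * (n_cols - 1)


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square statistic with df = 1 only."""
    return math.erfc(math.sqrt(chi2 / 2))


df = degrees_of_freedom(2, 2)
p = p_value_df1(10.5)

print("df =", df)
print("p =", round(p, 4))
print("p < .01:", p < 0.01, "| p < .001:", p < 0.001)
```

For χ² = 10.5 the probability lands between .001 and .01, which is why the result is reported at the .01 level.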

Class exercise
Hypothesis: More building alarms → Less crime
Randomly sampled 120 businesses with alarms: 50 had crimes, 70 didn’t
Randomly sampled 90 businesses without alarms: 50 had crimes, 40 didn’t
Build the observed and expected tables
– Remember, they’re tables, so place the values of the independent variable in rows
Compute χ²: $\chi^2 = \sum \frac{(O - E)^2}{E}$
Use the table to assess the probability level. df = (r - 1) X (c - 1)
Convey your findings using simple words. What does the data show about building alarms and crime? How certain are you of your conclusions?

Observed (obtained) frequencies
Crime
Alarm      Y      N   Total
Y         50     70     120
N         50     40      90
Total    100    110     210
Expected (by chance) frequencies, rounded to whole numbers
Crime
Alarm      Y      N   Total
Y         57     63     120
N         43     47      90
Total    100    110     210
$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(50-57)^2}{57} + \frac{(70-63)^2}{63} + \frac{(50-43)^2}{43} + \frac{(40-47)^2}{47} = 3.82$

χ² = 3.82
df = (r - 1) X (c - 1) = (2 - 1) X (2 - 1) = 1
To reject the null hypothesis at the .05 level we need a χ² of 3.84 or greater
Our chi-square is smaller, so the probability level is above the maximum of five in one-hundred (it defaults to the next level in the table, .10, or ten chances in one-hundred)
So we cannot reject the null hypothesis – there is NO significant relationship between crime and alarms

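The alarm exercise sits very close to the .05 cutoff, so it is worth seeing how much the rounding of expected frequencies matters. The sketch below recomputes χ² twice, once with expected frequencies rounded to whole numbers (as in the slides) and once unrounded. The helper names are assumptions made for this example.

```python
def expected_table(observed):
    """Expected cell = row_total / grand_total * column_total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r / grand_total * c for c in col_totals] for r in row_totals]


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over corresponding cells."""
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


# Rows: alarm Y, alarm N. Columns: crime Y, crime N.
observed = [[50, 70], [50, 40]]
exact = expected_table(observed)
rounded = [[round(cell) for cell in row] for row in exact]

chi2_rounded = chi_square(observed, rounded)
chi2_exact = chi_square(observed, exact)

print("rounded expected:", rounded, "chi-square:", round(chi2_rounded, 2))
print("exact expected chi-square:", round(chi2_exact, 2))
```

With rounded expected frequencies the statistic is about 3.82, just under 3.84. With unrounded expected frequencies it is about 3.98, just over. A result this close to the cutoff should be reported cautiously either way.
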
Parking lot exercise
1. Graph the distribution of car values for each parking lot
2. Fill in the frequency and percentage tables

3. Use the frequency (not percentage!) table to create an “expected frequencies” table (meaning, expected if the null hypothesis of no relationship is correct)
Expected frequency = (row marginal / total cases) X column marginal
Example: … X 6 = 3
[Tables: Frequencies observed; Frequencies expected]

4. Compute χ²: Cell by corresponding cell, subtract EXPECTED from OBSERVED. Square each difference. Divide each result by EXPECTED. Then total them up.

5. Check the table. Begin with the largest probability level that allows you to reject the null hypothesis, .05. Is the Chi-square at least that large? If not, we cannot reject the null hypothesis.
The greatest risk we can take of wrongly rejecting the null hypothesis is five in one-hundred (.05)
Our Chi-square, 8.66, is greater than 7.815, the required minimum at the .05 level (the tabled value for df = 3)
We can thus reject the NULL hypothesis and accept the WORKING hypothesis that higher income persons drive more expensive cars, with only five chances in 100 of being wrong.
Larger Chi-squares could have reduced that risk to two in one-hundred (.02), one in one-hundred (.01), or even one in one-thousand (.001)
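
The table check in step 5 can be sketched as a lookup. The critical values below are taken from a standard chi-square table for df = 1 to 3; they are an assumption standing in for the course's own table, which is not reproduced in the slides. The function name `strictest_level` is hypothetical.

```python
# Critical values of chi-square from a standard table (assumed, df = 1..3).
CRITICAL_VALUES = {
    1: {0.05: 3.841, 0.02: 5.412, 0.01: 6.635, 0.001: 10.828},
    2: {0.05: 5.991, 0.02: 7.824, 0.01: 9.210, 0.001: 13.816},
    3: {0.05: 7.815, 0.02: 9.837, 0.01: 11.345, 0.001: 16.266},
}


def strictest_level(chi2, df):
    """Smallest tabled probability level whose critical value chi2 reaches.

    Returns None when chi2 is below the .05 value, meaning the null
    hypothesis cannot be rejected.
    """
    reached = [
        level
        for level, critical in CRITICAL_VALUES[df].items()
        if chi2 >= critical
    ]
    return min(reached) if reached else None


print(strictest_level(8.66, 3))   # parking lot exercise
print(strictest_level(10.5, 1))   # gender and court disposition
print(strictest_level(3.82, 1))   # alarms and crime
```

A return value of `None` corresponds to "cannot reject the null hypothesis"; any other value is the probability level to report.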

Homework

Homework exercise
Hypothesis: Sergeants have more stress than patrol officers
1. Calculate expected cell frequencies (null hypothesis of no relationship is true)
2. Compute Chi-square
3. Use table in Appendix E to determine your chi-square’s probability level
4. Can we reject the null hypothesis?

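Homework answer
[Tables: Observed; Expected]
$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(30-52)^2}{52} + \frac{(60-38)^2}{38} + \frac{(86-64)^2}{64} + \frac{(24-46)^2}{46} = 40.1$

χ² = 40.1
df = (r - 1) X (c - 1) = (2 - 1) X (2 - 1) = 1
To reject at the .05 level we need χ² = 3.84 or greater
Reject null hypothesis – Less than 1 chance in 1,000 (p < .001) that a relationship this strong would appear by chance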

Practice for the final

You will test a hypothesis using two categorical variables and determine whether the independent variable has a statistically significant effect.
You will be asked to state the null hypothesis.
You will use supplied data to create an Observed frequencies table.
You will use it to create an Expected frequencies table. You will be given a formula but should know the procedure.
You will compute the Chi-Square statistic and degrees of freedom. You will be given formulas but should know the procedures by heart.
You will use the Chi-Square table to determine whether the results support the working hypothesis.
– Print and bring to class:
Sample question: Hypothesis is that alarm systems prevent burglary. Random sample of 120 businesses with an alarm system and 90 without. Fifty businesses of each kind were burglarized.
– Null hypothesis: No significant difference in crime between businesses with and without alarms
Observed frequencies
Crime
Alarm      Y      N   Total
Y         50     70     120
N         50     40      90
Total    100    110     210
Expected frequencies (rounded to whole numbers)
Crime
Alarm      Y      N   Total
Y         57     63     120
N         43     47      90
Total    100    110     210

$\chi^2 = \frac{(50-57)^2}{57} + \frac{(70-63)^2}{63} + \frac{(50-43)^2}{43} + \frac{(40-47)^2}{47} = 3.82$
– Chi-Square = 3.82
– df = (r - 1) X (c - 1) = 1
– Check the table. Do the results support the working hypothesis? No - Chi-Square must be at least 3.84 to reject the null hypothesis of no relationship between alarm systems and crime at the .05 level (only five chances in 100 of a result this large if the null is true)