Chi-square: A very brief intro

Distinctions
• The distribution
  – Chi-square is a probability distribution
      · A special case of the gamma distribution
  – The t and F distributions are derived from it
      · t = ratio of a standard normal to the square root of an independent chi-square divided by its df
      · F = ratio of two independent chi-squares, each divided by its df
• Goodness of fit tests
  – You may see chi-square as the test statistic in a variety of procedures that determine whether some data ‘fit’ what is theoretically expected
• Tests of independence
  – Assess whether paired observations on two categorical variables are independent of each other
      · Contingency table
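
A quick way to see the first point is by simulation. The sketch below is not from the slides: it assumes the standard construction of a chi-square(k) variable as a sum of k squared standard normals, and compares it with gamma draws of shape k/2 and scale 2. The seed, k = 4, and the sample size are arbitrary choices.

```python
import random
import statistics

def chi_square_draw(k, rng):
    """One chi-square(k) draw: the sum of k squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))

rng = random.Random(0)   # fixed seed (arbitrary) so the run is repeatable
k = 4                    # degrees of freedom (arbitrary choice)
n_draws = 20000

draws = [chi_square_draw(k, rng) for _ in range(n_draws)]
# Gamma with shape k/2 and scale 2 should have the same distribution.
gamma_draws = [rng.gammavariate(k / 2, 2.0) for _ in range(n_draws)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)
gamma_mean = statistics.fmean(gamma_draws)

print(f"sum of squared normals: mean ~ {sample_mean:.2f}, variance ~ {sample_var:.2f}")
print(f"gamma(k/2, scale 2):    mean ~ {gamma_mean:.2f}")
print(f"theory:                 mean = {k}, variance = {2 * k}")
```

Both sample means should land close to k and the variance close to 2k, which is what the gamma special case predicts.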

Goodness of Fit
• Do the data conform to expectations?
• The following are program numbers for 5700 [table not preserved]
• If we expected a balanced distribution, do the data suggest that is true?
• Calculation: for each category, square the difference between the observed and expected frequencies and divide by the expected frequency, then sum: X² = Σ (O − E)² / E
• X² = [value not preserved], df = 4, p-value = 0.84
• Conclusion? Not statistically different from expectations
• Note, however, that we wouldn’t actually expect a balanced distribution; we could have set the expected values to a more reasonable estimate based on past entry rates.
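
The calculation can be sketched in a few lines. The counts below are hypothetical (the program numbers from the slide are not available here); only the five-category setup, and hence df = 4, matches the slide. The helper `chi2_sf` is an illustrative name, and it uses closed-form expressions for the chi-square upper tail that hold for integer df.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite sum of Poisson-type terms.
        terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * terms
    # Odd df: erfc term plus a finite sum with half-integer gamma values.
    terms = sum(half ** (j - 0.5) / math.gamma(j + 0.5)
                for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms

# HYPOTHETICAL counts for five programs; not the data from the slide.
observed = [18, 22, 20, 25, 15]

# Balanced expectation: every category gets the same expected frequency.
expected = [sum(observed) / len(observed)] * len(observed)

# Sum of (observed - expected)^2 / expected over the categories.
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p_value = chi2_sf(stat, df)

print(f"X^2 = {stat:.2f}, df = {df}, p-value = {p_value:.3f}")
```

To test against past entry rates instead of a balanced distribution, replace `expected` with the total count multiplied by each program's historical proportion.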

Independence
• Moving beyond the single variable, we can test for the independence of two categorical variables
• What do undergrad stat students do with their free time?
  – Updating their Myspace/Facebook or whatever blog thing whose contents will get them fired from some job in the future
  – Talking on cell phone about their drama loudly enough that now total strangers know how the ‘tests’ turned out
  – Texting instead of just calling the person and actually talking to them
  – Staring at the ceiling

            Updating  Talking  Texting  Staring
  Males        30       40       20       10
  Females      20       30       40       10

Is there a relationship between gender and what the stats kids do with their free time?
• Expected = (R_i × C_j)/N, where R_i is the total for row i, C_j is the total for column j, and N is the grand total
• Example for males, Updating: (100 × 50)/200 = 25

  (activities abbreviated to their first word)
            Updating  Talking  Texting  Staring  Total
  Males        30       40       20       10      100
  Females      20       30       40       10      100
  Total        50       70       60       20      200
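
The expected-count formula is easy to check by hand or in code. This sketch uses the observed counts from the gender-by-activity table in this deck; the variable names are illustrative.

```python
# Observed counts; columns are Updating, Talking, Texting, Staring.
observed = [
    [30, 40, 20, 10],   # Males
    [20, 30, 40, 10],   # Females
]

row_totals = [sum(row) for row in observed]          # R_i
col_totals = [sum(col) for col in zip(*observed)]    # C_j
n = sum(row_totals)                                  # N

# Expected count for cell (i, j) under independence: R_i * C_j / N.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for label, row in zip(["Males", "Females"], expected):
    print(label, row)
```

Because both row totals are 100, the two rows of expected counts come out identical.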

Table with expectations added
• df = (R − 1)(C − 1)

  (activities abbreviated to their first word; expected counts in parentheses)
                Updating  Talking  Texting  Staring  Total
  Males (E)     30 (25)   40 (35)  20 (30)  10 (10)   100
  Females (E)   20 (25)   30 (35)  40 (30)  10 (10)   100

Interpretation
• X² = [value not preserved], df = 3, p-value = [value not preserved]
• Reject H₀: there is some relationship between gender and how stats students spend their free time
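
For reference, the test can be recomputed from the observed counts in the table. The figures this prints are a recalculation under the usual Pearson formula, not values copied from the original slide, and `chi2_sf` is an illustrative helper that relies on closed forms valid for integer df.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * terms
    terms = sum(half ** (j - 0.5) / math.gamma(j + 0.5)
                for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * terms

# Observed counts; columns are Updating, Talking, Texting, Staring.
observed = [
    [30, 40, 20, 10],   # Males
    [20, 30, 40, 10],   # Females
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson statistic: sum over all cells of (O - E)^2 / E, with E = R_i * C_j / N.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)   # (R - 1)(C - 1)
p_value = chi2_sf(chi2, df)

print(f"X^2 = {chi2:.3f}, df = {df}, p-value = {p_value:.4f}")
```

Most of the statistic comes from the Texting column, where observed and expected counts differ by 10 in each row.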

Assumptions
• Obviously the data themselves do not have to follow any particular distribution
  – Nonparametric
• Independence
  – As usual, we assume observations are independent of one another
• Inclusion of non-occurrences
  – The data must include all categories of information
  – You put ‘Don’t know’ as a response on your survey, suffer the consequences!

Other Versions/Extensions
• For 2 x 2: Yates correction, Fisher’s exact test
• Beyond the two-way setting: Loglinear analysis (covered in your Howell text)
• Categorical X Ordinal outcomes
  – Tests of linear associations
  – Correlational approach (see Howell 10.4)
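
As a rough illustration of the two 2 x 2 options, the sketch below implements the Yates-corrected statistic and a two-sided Fisher's exact test from their textbook definitions. The 2 x 2 table is hypothetical (it is not from this deck), the function names are illustrative, and the "sum every table no more probable than the observed one" rule is one common convention for the two-sided p-value.

```python
from math import comb

def yates_chi2(table):
    """Chi-square for a 2 x 2 table with Yates' continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    # Shrink |ad - bc| by N/2, but never below zero.
    diff = max(0.0, abs(a * d - b * c) - n / 2)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def fisher_exact_two_sided(table):
    """Two-sided Fisher's exact p-value for a 2 x 2 table with fixed margins."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Add up every possible table that is no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# HYPOTHETICAL small-sample table, chosen only for illustration.
table = [[3, 1],
         [1, 3]]

corrected = yates_chi2(table)
p_exact = fisher_exact_two_sided(table)
print(f"Yates-corrected X^2 = {corrected:.2f}")
print(f"Fisher exact p (two-sided) = {p_exact:.4f}")
```

With counts this small the expected frequencies are all 2, which is the kind of situation where the exact test is usually preferred over the chi-square approximation.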

Effect Size
• 2 X 2
• d family measures of difference
  – Relative risk
  – Odds ratio
• r family measures of association
  – Phi and Cramér’s phi
• Measure of agreement
  – Kappa
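
Here is a sketch of how some of these measures could be computed. It reuses the observed gender-by-activity counts from this deck for Cramér's phi; the 2 x 2 table (Texting versus everything else) is a collapsed version made up for this illustration, not something the slides analyze. Formulas are the standard textbook ones, and the function name is illustrative.

```python
import math

def pearson_chi2(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum((o - r * c / n) ** 2 / (r * c / n)
               for row, r in zip(table, row_totals)
               for o, c in zip(row, col_totals))

# Observed counts; columns are Updating, Talking, Texting, Staring.
observed = [
    [30, 40, 20, 10],   # Males
    [20, 30, 40, 10],   # Females
]
n = sum(map(sum, observed))

# Cramér's phi: sqrt(X^2 / (N * min(R - 1, C - 1))).
smaller_dim = min(len(observed), len(observed[0])) - 1
cramers_phi = math.sqrt(pearson_chi2(observed) / (n * smaller_dim))

# ILLUSTRATIVE 2 x 2: collapse the activities to Texting vs. anything else.
collapsed = [
    [20, 80],   # Males:   texting, other
    [40, 60],   # Females: texting, other
]
(a, b), (c, d) = collapsed

relative_risk = (a / (a + b)) / (c / (c + d))   # P(texting | male) / P(texting | female)
odds_ratio = (a * d) / (b * c)                  # odds of texting, males vs. females
phi = math.sqrt(pearson_chi2(collapsed) / n)    # phi for the 2 x 2 table

print(f"Cramér's phi (2 x 4 table) = {cramers_phi:.3f}")
print(f"Relative risk = {relative_risk:.2f}, odds ratio = {odds_ratio:.3f}, phi = {phi:.3f}")
```

Note how the relative risk and the odds ratio differ even though they describe the same 2 x 2 table; the two only come close to each other when the outcome is rare.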

Summary
• While you may see the chi-square statistic used frequently, the chi-square tests themselves are increasingly less common
  – The reason is that it is relatively rare for a research question to entail only categorical variables
• However, the tests are still viable for descriptive and exploratory forays into data, and are often used as such