Chi-squared Association Index


The other variants of chi-squared (looking for a difference and goodness of fit) are covered separately.

What does it do?
There are two ways to use this test:
- Looking for an association between two factors. Eg: Are snail shell colour and habitat choice associated?
- Looking for a difference in population/employment structures. Eg: Is the population structure the same in two villages?
You do the calculations the same way in both cases!
The structures version is strictly for comparing two sets of data that are “on a level”. This method should not be used to compare, say, local and national data; that is covered in Chi-squared goodness of fit.

Planning to use it?
Make sure that:
- You are working with numbers of things (counts), not, eg, area, weight, length, %…
- You have an average of at least 5 things (people/plants/species…) in each category

How does it work?
- For Association, you assume (null hypothesis) there is no association
- For Difference in Structures, you assume (null hypothesis) there is no difference in structures
It compares:
- observed values: the data you collected
- expected values: what you’d get if there was really no association or no difference in structures

Doing the test
These are the stages in doing the test:
1. Write down your hypotheses
2. Work out the expected values
3. Use the chi-squared formula to get a chi-squared value
4. Work out your degrees of freedom
5. Look at the tables
6. Make a decision
Two worked examples follow: Association, and Difference in structures.

Hypotheses
Association:
- H0: There is no association
- H1: There is some association
Difference in structures:
- H0: There is no difference in structures
- H1: There is some difference in structures
Note: this is the standard form for hypotheses for this type of chi-squared test. The others are not the same.

Expected Values
Your data here will be in a table. To work out the expected values:
- Add up the totals of all the rows, and the totals of all the columns. Also find the overall total of all the data.
- Work out the expected value for each cell using
$$\text{Expected value} = \frac{\text{Row total} \times \text{Column total}}{\text{Overall total}}$$
Eg, to work out the expected value for something in the 2nd row, 3rd column, multiply the total of the 2nd row by the total of the 3rd column and divide by the overall total.
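The rule above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_values` and the small 2 x 3 table are made up for the sketch and do not come from the slides.

```python
def expected_values(observed):
    """Return the expected count for every cell of a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall_total = sum(row_totals)
    # Expected value = row total x column total / overall total
    return [[r * c / overall_total for c in col_totals] for r in row_totals]


# Made-up counts, for illustration only (2 rows, 3 columns)
observed = [[10, 20, 30],
            [20, 20, 20]]
expected = expected_values(observed)

# 2nd row, 3rd column: row total 60 x column total 50 / overall total 120
print(expected[1][2])
```

As a check, the rows and columns of `expected` add up to the same totals as the observed table.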

Chi-Squared Formula
For each cell in your table, work out
$$\frac{(O - E)^2}{E}$$
where
- O = Observed value (your data)
- E = Expected value (which you’ve calculated)
Then add all your values up. This gives the chi-squared value:
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where $\sum$ = “Sum of”.
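As a rough sketch, the formula translates directly into Python. The function name `chi_squared` and both small tables below are made up for illustration; the expected values are simply assumed to have been worked out already.

```python
def chi_squared(observed, expected):
    """Add up (O - E)^2 / E over every cell of the table."""
    total = 0.0
    for obs_row, exp_row in zip(observed, expected):
        for o, e in zip(obs_row, exp_row):
            total += (o - e) ** 2 / e
    return total


# Made-up observed counts and their expected values, for illustration only
observed = [[10, 20, 30],
            [20, 20, 20]]
expected = [[15.0, 20.0, 25.0],
            [15.0, 20.0, 25.0]]

value = chi_squared(observed, expected)
print(round(value, 3))
```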

Degrees of freedom
The formula here for degrees of freedom is
degrees of freedom = (rows – 1)(columns – 1)
You do not need to worry about what this means; just make sure you know the formula! But in case you’re interested: the larger your table, the more likely you are to get a “strange” result in one or more cells. The degrees of freedom is a way of allowing for this in the test.
Note: worth emphasising that this is a different formula to the other types of chi-squared.

Tables
[Image of a chi-squared table, not reproduced here. Its labels point out the significance levels (eg 0.05 = 5%) and the degrees of freedom (df).]
Note: perhaps a good time to check they’re OK on reading tables?

Make a decision
- If the value you calculated is bigger than the value from the tables (the critical value), you reject your null hypothesis
- If the value you calculated is smaller than the value from the tables, you accept your null hypothesis
Remember in each case to refer back to the actual example!
Note: in all tests except Mann-Whitney & Wilcoxon, they’ll be rejecting if their value is bigger.
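The last three stages (degrees of freedom, tables, decision) could be sketched as follows. The function names are made up for illustration, and the lookup dictionary holds only the two 5% critical values quoted in the worked examples of this presentation; a real chi-squared table has an entry for every degrees-of-freedom value.

```python
# 5% critical values quoted in the two worked examples of this presentation.
# Assumption: a full table would be needed for any other degrees of freedom.
CRITICAL_5_PERCENT = {1: 3.841, 5: 11.070}


def degrees_of_freedom(n_rows, n_cols):
    """degrees of freedom = (rows - 1)(columns - 1)"""
    return (n_rows - 1) * (n_cols - 1)


def decide(chi_squared_value, n_rows, n_cols):
    """Compare the calculated value with the 5% critical value."""
    df = degrees_of_freedom(n_rows, n_cols)
    critical = CRITICAL_5_PERCENT[df]
    if chi_squared_value > critical:
        return "reject H0"
    return "accept H0"


snail_decision = decide(15.776, 2, 2)    # 2 x 2 table, df = 1
village_decision = decide(16.445, 6, 2)  # 6 x 2 table, df = 5
print(snail_decision, village_decision)
```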

Example: Snail shell colour & habitat
Samples were taken from limestone woodland and limestone pavement, and the numbers of light- and dark-shelled snails noted.
Hypotheses:
- H0: Shell colour and habitat preference are not associated
- H1: Shell colour and habitat preference are associated

The data
            Light   Dark
Pavement      115     76
Woodland       69    106

Totals
            Light   Dark   Totals
Pavement      115     76      191
Woodland       69    106      175
Totals        184    182      366
Note: if you have any keen mathematicians, they could also look up Yates’ correction (for 2 x 2 tables).

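Expected Values
$$\text{Expected value} = \frac{\text{Row total} \times \text{Column total}}{\text{Overall total}}$$
Expected values (worked out from the totals; the figures on the original slide were not preserved in this transcript):
            Light     Dark
Pavement   96.022   94.978
Woodland   87.978   87.022
Note: as a check, they could see that the row & column totals still add up to the same thing.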

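The calculations: $(O - E)^2 / E$
(Worked out from the observed and expected values; the figures on the original slide were not preserved in this transcript.)
            Light    Dark
Pavement    3.751   3.792
Woodland    4.094   4.139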


The test
$\chi^2 = 15.776$
Degrees of freedom = (2 – 1)(2 – 1) = 1
Critical value (5%) = 3.841
Reject H0: there is some association between snail shell colour and habitat preference.
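The snail example can be re-run from the raw counts as a check. This is a minimal sketch; the variable names are chosen for illustration, and the critical value is the 5% figure quoted above rather than one looked up by the code.

```python
# Observed counts from the snail example: rows = Pavement, Woodland;
# columns = Light, Dark
observed = [[115, 76],
            [69, 106]]

row_totals = [sum(row) for row in observed]        # 191, 175
col_totals = [sum(col) for col in zip(*observed)]  # 184, 182
overall_total = sum(row_totals)                    # 366

# Expected value = row total x column total / overall total
expected = [[r * c / overall_total for c in col_totals] for r in row_totals]

# (O - E)^2 / E for each cell, then add them all up
contributions = [[(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
                 for obs_row, exp_row in zip(observed, expected)]
chi_squared = sum(sum(row) for row in contributions)

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 3.841  # 5% level, 1 degree of freedom, as quoted in the slides

print(round(chi_squared, 3), df, chi_squared > critical_value)
```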


Example: Comparing population structures
Data on the population age structures of two villages were obtained. The aim is to assess whether there is any difference in the age structures.
Hypotheses:
- H0: There is no difference in the villages’ population structures
- H1: There is a difference in the villages’ population structures

The data
Age      Village A   Village B
0-10            16          25
11-20           12          32
21-30           32          50
31-40           40          68
41-50           60          70
51+             40          25

Totals
Age        A     B   Total
0-10      16    25      41
11-20     12    32      44
21-30     32    50      82
31-40     40    68     108
41-50     60    70     130
51+       40    25      65
Total    200   270     470

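Expected Values
$$\text{Expected value} = \frac{\text{Row total} \times \text{Column total}}{\text{Overall total}}$$
Age      Village A   Village B
0-10        17.447      23.553
11-20       18.723      25.277
21-30       34.894      47.106
31-40       45.957      62.043
41-50       55.319      74.681
51+         27.660      37.340
(The values for the first two age groups were missing from this transcript and have been worked out from the totals.)
At this stage, check whether any expected value is less than 5. If so, amalgamate some age categories.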

The calculations: $(O - E)^2 / E$
Age      Village A   Village B
0-10         0.120       0.089
11-20        2.414       1.788
21-30        0.240       0.178
31-40        0.772       0.572
41-50        0.396       0.293
51+          5.505       4.078


The test
$\chi^2 = 0.120 + 0.089 + 2.414 + 1.788 + 0.240 + 0.178 + 0.772 + 0.572 + 0.396 + 0.293 + 5.505 + 4.078$
$\chi^2 = 16.445$
Degrees of freedom = (6 – 1)(2 – 1) = 5
Critical value (5%) = 11.070
Reject H0: the population structures of the two villages are different.
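The village example can be checked the same way. This is a sketch with made-up function and variable names. It also includes the "any expected value less than 5?" check. Because the code keeps full precision instead of rounding each cell to three decimal places, its total can differ from 16.445 in the third decimal place; the decision is unaffected.

```python
def chi_squared_from_counts(observed):
    """Return (chi-squared value, table of expected values) for observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    overall_total = sum(row_totals)
    expected = [[r * c / overall_total for c in col_totals] for r in row_totals]
    value = sum((o - e) ** 2 / e
                for obs_row, exp_row in zip(observed, expected)
                for o, e in zip(obs_row, exp_row))
    return value, expected


# Rows = age groups 0-10, 11-20, 21-30, 31-40, 41-50, 51+;
# columns = Village A, Village B
villages = [[16, 25],
            [12, 32],
            [32, 50],
            [40, 68],
            [60, 70],
            [40, 25]]

chi_squared, expected = chi_squared_from_counts(villages)

# If any expected value were below 5, some age categories should be amalgamated
smallest_expected = min(min(row) for row in expected)
needs_amalgamating = smallest_expected < 5

df = (len(villages) - 1) * (len(villages[0]) - 1)
critical_value = 11.070  # 5% level, 5 degrees of freedom, as quoted in the slides

print(round(chi_squared, 2), df, needs_amalgamating, chi_squared > critical_value)
```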
