Chi-square: Comparing Observed and Expected Counts


Scientific Practice
Chi-square: Comparing Observed and Expected Counts

Where We Are/Where We Are Going
Most of what we have done so far has looked at variables and how they might vary in relation to something else…
  BP when a subject takes a drug (eg paired t-test): what was the effect of the drug on the mean BP?
  Lung function in association with another variable (correlation/regression): how does carbon dioxide affect Minute Volume?
A different approach involves putting subjects into ‘bins’ based on overall outcomes, eg ‘the patient died’.
We can then compare the bin sizes of the actual (observed) data against what we expected.

Proportions: Observed vs Expected
Here’s an example from Intuitive Biostatistics…
On average, 10% of patients die following a particularly risky operation.
This month, 16 out of 75 died.
Is this a ‘real change’ or just ‘coincidence’?
  We can’t say for certain; however, we can estimate the probability of seeing a proportion of 16/75 (21.3%) if the background average is still 10%.
We need to compare what we observed (16/75) with what we expected (7.5/75, ie 10%).
We can show this as a table…
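As a minimal sketch of where the ‘expected’ figures come from, the counts can be derived from the 10% background rate in a few lines of Python. The numbers are those quoted on the slide; the variable names are illustrative and not part of the original material.

```python
# Figures quoted on the slide; variable names are illustrative assumptions.
n_patients = 75
observed_dead = 16
null_death_rate = 0.10  # long-run average assumed to still apply

# Observed counts in each 'bin'.
observed = {"Alive": n_patients - observed_dead, "Dead": observed_dead}

# Expected counts if the 10% rate were unchanged.
expected = {
    "Alive": n_patients * (1 - null_death_rate),
    "Dead": n_patients * null_death_rate,
}

observed_rate = observed_dead / n_patients

for outcome in observed:
    print(f"{outcome:<6} observed={observed[outcome]:>3}  expected={expected[outcome]:>5.1f}")
print(f"Observed death rate: {observed_rate:.1%}")
```

Expected counts need not be whole numbers: 7.5 deaths is simply 10% of 75.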

Proportions: Observed vs Expected
         #Observed   #Expected (10%)
Alive        59          67.5
Dead         16           7.5
Total        75          75
Step 1 : The Null Hypothesis
  there has been no change in the proportion dying
Step 2 : Generate Test Statistic, Chi-square (χ²)
  first work out what is ‘expected’ under the Null Hypo
  then, χ² = Σ ((Observed – Expected)² / Expected)
  the more the data do not fit the expected pattern, the bigger χ² gets

Proportions: Observed vs Expected
Step 2 : Generate Test Statistic, Chi-square (χ²)
  χ² = Σ ((Observed – Expected)² / Expected)
  χ² = ((59 – 67.5)² / 67.5) + ((16 – 7.5)² / 7.5) = 10.7
Step 3 : Calculate the probability
  use a table of critical values of χ²
  cols = levels of significance (p-value, or alpha)
  rows = degrees of freedom
  α = 0.05
  df = categories – 1 = 2 – 1 = 1
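The Step 2 arithmetic can be checked with a short Python sketch. The helper name chi_square_statistic is hypothetical (it is not from the slides); it simply sums (Observed – Expected)² / Expected over the categories.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Alive / Dead counts from the slide.
observed = [59, 16]
expected = [67.5, 7.5]

chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1  # degrees of freedom = categories - 1

print(f"chi-square = {chi2:.2f}, df = {df}")  # chi-square = 10.70, df = 1
```

Most of the total comes from the ‘Dead’ category, because the same difference of 8.5 is divided by a much smaller expected count.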


Proportions: Observed vs Expected
Critical value for χ² = 3.84 (α = 0.05, df = 1)
Our value for χ² = 10.70
So, p < 0.05
  in fact, the critical value for p = 0.0025 is 9.14 and ours is 10.70, so p < 0.0025
Step 4 : Interpret the probability
  the probability of seeing a proportion dying at least as extreme as ours (21.3%) if the underlying rate of 10% still applied is less than 0.25% (p < 0.0025)
  so reject the Null Hypo in favour of the Alt Hypo… that some factor other than chance is responsible
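The slides read the probability from a printed table. As a rough cross-check, and only for the df = 1 case, the upper-tail probability can be computed directly, because a chi-square variable with one degree of freedom is the square of a standard normal variable. The function name p_value_df1 is an assumption made for this sketch; it does not apply to other degrees of freedom.

```python
import math


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square value with 1 degree of freedom.

    Uses the identity P(Z^2 > x) = erfc(sqrt(x / 2)); valid only for df = 1.
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


chi2 = 10.70
p = p_value_df1(chi2)
print(f"p = {p:.4f}")

# Compare our statistic with the tabulated critical values used on the slide.
for critical, alpha in [(3.84, 0.05), (9.14, 0.0025)]:
    verdict = "exceeds" if chi2 > critical else "does not exceed"
    print(f"chi-square {verdict} {critical} (alpha = {alpha})")
```

The computed p falls just above 0.001, consistent with the table-based conclusion that p < 0.0025.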

Comparing Two Proportions
The above example just looked at one variable with two ‘states’ (dead and alive).
But what if we wanted to include male and female in our investigation?
  eg we suspect more men than women are dying
χ² can be used to do this via a 2 x 2 table.
Example, disease progression in AZT/placebo patients…
          Disease progression   No prog   Total
AZT               76              399      475
Placebo          129              332      461
Total            205              731      936

Comparing Two Proportions
          Disease progression   No prog   Total
AZT               76              399      475
Placebo          129              332      461
Total            205              731      936
Step 1 : The Null Hypothesis
  no difference in disease progression in AZT vs placebo
Step 2 : Generate the Test Statistic
  first, need to work out what we would expect if the Null Hypo were true…

Comparing Two Proportions
          Disease progression   No prog   Total
AZT               76              399      475
Placebo          129              332      461
Total            205              731      936
Overall disease progression is 205/936 = 21.9%
so, if the Null Hypo is true, we’d expect…
  21.9/100 x 475 = 104 AZT to progress
  21.9/100 x 461 = 101 Placebo to progress
that leaves 475 – 104 = 371 AZT to not prog
and 461 – 101 = 360 Placebo to not prog
(can do this with row total x col total / grand total)
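The ‘row total x col total / grand total’ shortcut can be sketched in Python as follows. The function name expected_counts is hypothetical; the observed counts are the ones in the AZT/placebo table.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total x col total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Rows: AZT, Placebo. Columns: disease progression, no progression.
observed = [[76, 399], [129, 332]]
expected = expected_counts(observed)

for label, row in zip(["AZT", "Placebo"], expected):
    print(label, [round(cell, 1) for cell in row])
```

Rounded to whole numbers, these match the 104, 371, 101 and 360 worked out by hand.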

Comparing Two Proportions
OBSERVED  Disease progression   No prog   Total
AZT               76              399      475
Placebo          129              332      461
Total            205              731      936
EXPECTED  Disease progression   No prog   Total
AZT              104              371      475
Placebo          101              360      461

Comparing Two Proportions
Step 2 : Generate Test Statistic, Chi-square (χ²)
  χ² = Σ ((Observed – Expected)² / Expected)
  χ² = ((76 – 104)² / 104) + ((399 – 371)² / 371) + ((129 – 101)² / 101) + ((332 – 360)² / 360)
  χ² = 7.538 + 2.113 + 7.762 + 2.178 = 19.59

Comparing Two Proportions
Step 3 : Calculate the probability
  use a table of critical values of χ²
  cols = levels of significance (p-value, or alpha)
  rows = degrees of freedom
  α = 0.05
  df = (rows – 1) x (cols – 1) = (2 – 1) x (2 – 1) = 1
Critical value for χ² = 3.84
Our value for χ² = 19.59
So, p < 0.05
  in fact, the critical value for p = 0.001 is 10.83 and ours is 19.59, so p < 0.001
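Steps 2 and 3 for the 2 x 2 table can be sketched together in Python. Two assumptions to note: the sketch uses unrounded expected counts, so its statistic comes out slightly higher than the hand calculation with whole-number expected counts, and it applies no continuity correction. The p-value line relies on an identity that holds only for df = 1 and is not part of the slides.

```python
import math

# Rows: AZT, Placebo. Columns: disease progression, no progression.
observed = [[76, 399], [129, 332]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Accumulate (O - E)^2 / E over all four cells.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi2 += (o - e) ** 2 / e

# Degrees of freedom = (rows - 1) x (cols - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability; this erfc shortcut is valid only because df == 1.
p = math.erfc(math.sqrt(chi2 / 2.0))

print(f"chi-square = {chi2:.2f}, df = {df}, p = {p:.2g}")
```

Either way the statistic is far beyond the 10.83 critical value, so the conclusion p < 0.001 does not depend on the rounding.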

Comparing Two Proportions
Step 4 : Interpret the probability
  if the Null Hypo is true (no difference in disease progression in AZT vs placebo)…
  …there is a < 0.1% chance of us observing a pattern of progression at least as extreme as the one we actually saw
  so reject the Null Hypo in favour of the Alt Hypo… that AZT produces different progression rates
  a reduction! Phew!
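The χ² value itself says only that the rates differ, not in which direction. A quick sketch (variable names are illustrative) confirms from the observed counts that progression was less common on AZT:

```python
# Observed counts from the 2 x 2 table.
progressed = {"AZT": 76, "Placebo": 129}
group_size = {"AZT": 475, "Placebo": 461}

# Proportion of each group whose disease progressed.
rates = {group: progressed[group] / group_size[group] for group in progressed}

for group, rate in rates.items():
    print(f"{group:<8} progression rate = {rate:.1%}")

print("Lower rate on AZT:", rates["AZT"] < rates["Placebo"])
```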

Chi-square in Minitab
C1 = AZT, C2 = Placebo
Row 1 = Progression, Row 2 = No prog

Summary
Where counts can be put in ‘bins’, tests of proportions can be carried out.
One such test is the chi-square (χ²) test.
It involves working out what we would expect to see if the Null Hypo applied.
χ² = Σ ((Observed – Expected)² / Expected)
The bigger χ² is, the less likely it is that we would see data like ours if the Null Hypo (ie the ‘expected’ pattern) were true.
A common use of χ² is the 2 x 2 table.