Data Analysis: Simple Statistical Tests Modified for AP Biology Statistics Unit Lesson.

Sampling a Population
When a study draws a random sample from a general population, certain characteristics of that sample need to be determined. If those characteristics correspond to the properties of the population, the conclusion reached at the end of the study may be assumed to be representative of that population.

Why Choose a Statistical Analysis?
Choose an estimator function for the population characteristic under study, then apply this function to the sample to obtain an estimate. Then use the appropriate statistical test to determine whether this estimate could be the result of chance alone.

The Null Hypothesis
The hypothesis that the estimate is based solely on chance is called the null hypothesis (H₀). The null hypothesis is retained if the observed data (in the sample) do not differ from what would be expected on the basis of chance alone. The complement of the null hypothesis is called the alternative hypothesis.

The Alternative Hypothesis
The alternative hypothesis, denoted by H₁ or Hₐ, is the hypothesis that sample observations are influenced by some non-random cause. “For example, suppose we wanted to determine whether a coin was fair and balanced. A null hypothesis might be that half the flips would result in Heads and half, in Tails. The alternative hypothesis might be that the number of Heads and Tails would be very different. Symbolically, these hypotheses would be expressed as H₀: p = 0.5 and Hₐ: p ≠ 0.5.”
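
The coin example can be made concrete with an exact two-sided binomial test. The sketch below is only an illustration: the 60 heads in 100 flips are hypothetical data, and the function name `binomial_two_sided_p` is an assumption, not part of the original lesson.

```python
from math import comb


def binomial_two_sided_p(heads, flips, p_null=0.5):
    """Exact two-sided p-value under H0: P(heads) = p_null.

    Adds up the probability of every outcome that is no more likely
    than the observed one when the null hypothesis is true.
    """
    def prob(k):
        return comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)

    observed = prob(heads)
    # The small tolerance guards against floating-point ties.
    return sum(
        prob(k) for k in range(flips + 1) if prob(k) <= observed * (1 + 1e-9)
    )


# Hypothetical data: 60 heads in 100 flips of the coin.
p_value = binomial_two_sided_p(60, 100)
print(f"p-value = {p_value:.4f}")
```

With these made-up counts the p-value falls slightly above 0.05, so the data alone would not be strong evidence against a fair coin.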

Chi-Square Statistics Example
A common analysis is whether Disease X occurs as much among people in Group A as it does among people in Group B. People are often sorted into groups based on their exposure to some disease risk factor. We then perform a test of the association between exposure and disease in the two groups.

Hypothetical Outbreak of Salmonella on a Cruise Ship
All 300 people on the cruise ship were interviewed, and 60 of them had symptoms consistent with Salmonella. Questionnaires indicated that many of the case-patients ate tomatoes from the salad bar.

The Study and the Tested Population
Research question: Is there a statistical difference in the amount of illness between those who ate tomatoes (41/130) and those who did not (19/170)?
Null H₀: Salmonella infection occurs as much among people in Group A (ate tomatoes) as it does among people in Group B (did not eat tomatoes).
Alternative H₁: Salmonella infection occurs much more among people in Group A than it does among people in Group B.

Table 2a. Cohort study: Exposure to tomatoes and Salmonella infection

              Salmonella?
              Yes    No     Total
Tomatoes       41     89     130
No Tomatoes    19    151     170
Total          60    240     300

Characteristics of the Study
To conduct a chi-square test, the following conditions must be met: (1) there must be a total of at least 30 observations (people) in the table, and (2) each cell must contain a count of 5 or more.
To conduct a chi-square test, we compare the observed data (from the study results) with the data we would expect to see (calculated).
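
The two conditions can be checked mechanically before running the test. This is a minimal sketch: the observed counts are derived from the figures quoted in the study description (41 of 130 tomato eaters and 19 of 170 non-eaters became ill), and the function name `meets_chi_square_conditions` is an assumption made for illustration.

```python
# Observed counts derived from the study description:
# 41 of 130 tomato eaters and 19 of 170 non-eaters became ill.
observed = {
    "Tomatoes": {"Yes": 41, "No": 130 - 41},
    "No Tomatoes": {"Yes": 19, "No": 170 - 19},
}


def meets_chi_square_conditions(table, min_total=30, min_cell=5):
    """Return True if the table has enough observations overall
    and every cell holds at least the minimum count."""
    cells = [count for row in table.values() for count in row.values()]
    return sum(cells) >= min_total and all(c >= min_cell for c in cells)


print(meets_chi_square_conditions(observed))
```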

Table 2b. How to calculate the expected values: total size

              Salmonella?
              Yes    No     Total
Tomatoes        ?      ?     130
No Tomatoes     ?      ?     170
Total          60    240     300

The totals give the overall distribution of people who ate tomatoes or did not, and of people who became sick or did not. Based on these distributions, we can fill in the empty cells with the expected values.

Calculating the Expected Values
Expected value = (Row total × Column total) / Grand total
For the first cell, people who ate tomatoes and became ill:
Expected value = (130 × 60) / 300 = 26
The same formula can be used to calculate the expected values for each of the other cells.

Table 2c. Complete expected values for exposure to tomatoes

              Salmonella?
              Yes                      No                        Total
Tomatoes      (130 × 60) / 300 = 26    (130 × 240) / 300 = 104   130
No Tomatoes   (170 × 60) / 300 = 34    (170 × 240) / 300 = 136   170
Total         60                       240                       300

Formula for each cell of the table: (Observed - Expected)² / Expected
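
The expected-value formula (row total × column total / grand total) can be applied to every cell at once. The following is a small sketch using the row and column totals from the cruise ship example; the variable names are chosen only for illustration.

```python
# Row and column totals from the cruise ship example.
row_totals = {"Tomatoes": 130, "No Tomatoes": 170}
col_totals = {"Yes": 60, "No": 240}
grand_total = 300

# Expected count for each cell = row total * column total / grand total.
expected = {
    row: {col: r * c / grand_total for col, c in col_totals.items()}
    for row, r in row_totals.items()
}

for row, cells in expected.items():
    print(row, cells)
```

Each row of expected values adds up to its row total, which is a quick way to check the arithmetic.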

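Table 2d. Chi-square calculation from observed and expected values for exposure to tomatoes

              Salmonella?
              Yes                      No                         Total
Tomatoes      (41 - 26)² / 26 = 8.7    (89 - 104)² / 104 = 2.2    130
No Tomatoes   (19 - 34)² / 34 = 6.6    (151 - 136)² / 136 = 1.7   170
Total         60                       240                        300

The chi-square (χ²) for this example is: 8.7 + 2.2 + 6.6 + 1.7 = 19.2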
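
The chi-square sum can be reproduced in a few lines. This sketch assumes the observed counts implied by 41/130 and 19/170 and the expected values 26, 104, 34, and 136. Note that summing the unrounded terms gives about 19.1, while the 19.2 quoted in the lesson comes from rounding each term to one decimal place before adding.

```python
# Observed and expected counts, in the order:
# [tomatoes/ill, tomatoes/not ill], [no tomatoes/ill, no tomatoes/not ill]
observed = [[41, 89], [19, 151]]
expected = [[26, 104], [34, 136]]

# One (observed - expected)^2 / expected term per cell.
terms = [
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
]

chi_square = sum(terms)                         # unrounded terms
chi_square_rounded = sum(round(t, 1) for t in terms)  # terms rounded first

print(f"chi-square (unrounded terms): {chi_square:.2f}")
print(f"chi-square (rounded terms):   {chi_square_rounded:.1f}")
```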

Analyze the Chi-Square Test
In general, the higher the chi-square value, the greater the likelihood that there is a statistically significant difference between the two groups you are comparing. To know for sure, you need to look up the p-value in a chi-square table.

P-Values
Using our hypothetical cruise ship Salmonella outbreak: 32% of people who ate tomatoes got Salmonella, as compared with 11% of people who did not eat tomatoes. How do we know whether the difference between 32% and 11% is a “real” difference? In other words, how do we know that our chi-square value (calculated as 19.2) indicates a statistically significant difference? The p-value is our indicator.

P-Values
Many statistical tests give both a numeric result (e.g., a chi-square value) and a p-value. The p-value ranges between 0 and 1. What does the p-value tell you? The p-value is the probability of getting a result at least as extreme as the one you got, assuming that the two groups you are comparing are actually the same.

P-Values
Start by assuming there is no difference in outcomes between the groups. Look at the test statistic and p-value to see if they indicate otherwise. A low p-value means that (assuming the groups are the same) the probability of observing these results by chance is very small, so the difference between the two groups is statistically significant. A high p-value means that the two groups were not that different. A p-value of 1 means that there was no difference between the two groups.

P-Values < 0.05
Generally, if the p-value is less than 0.05, the difference observed is considered statistically significant, i.e., the difference is unlikely to have happened by chance alone.
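
Instead of a printed table, the p-value for a chi-square statistic with 1 degree of freedom (the case for a 2 × 2 table) can be computed directly. The sketch below relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability is erfc(sqrt(x / 2)). The function name `chi_square_p_value_df1` is an assumption for illustration, and the formula does not apply to other degrees of freedom.

```python
from math import erfc, sqrt


def chi_square_p_value_df1(chi_square):
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    Valid only for df = 1: P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return erfc(sqrt(chi_square / 2))


# Chi-square value from the cruise ship example.
p_value = chi_square_p_value_df1(19.2)
print(f"p-value = {p_value:.2e}")
print("significant at 0.05:", p_value < 0.05)
```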

1) The chi-square value is calculated as 19.2.
2) There are two groups.
3) Degrees of freedom = 1.
If p-value > 0.05, there is not a significant difference between groups.
If p-value < 0.05, there is a significant difference between groups.

Null H₀: Salmonella infection occurs as much among people in Group A as it does among people in Group B.
χ² = 19.2, p-value < 0.05
Reject H₀ because 19.2 is greater than 3.84 (the chi-square value for p-value = 0.05 with 1 degree of freedom).
There is a statistically significant difference between the two groups. The Salmonella outbreak might have been due to contaminated tomatoes at the salad bar.
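
The table lookup behind this decision can be sketched as a comparison against critical values. The 3.84 threshold comes from the lesson; the other two entries are standard chi-square table values for 1 degree of freedom, added here for illustration. The names `CRITICAL_VALUES_DF1` and `smallest_significance_level` are assumptions, not part of the original material.

```python
# Chi-square critical values for 1 degree of freedom.
# 3.84 (p = 0.05) is quoted in the lesson; 6.63 and 10.83 are the
# standard table values for p = 0.01 and p = 0.001.
CRITICAL_VALUES_DF1 = {0.05: 3.84, 0.01: 6.63, 0.001: 10.83}


def smallest_significance_level(chi_square):
    """Return the smallest tabulated p-value threshold that the statistic
    exceeds, or None if it does not exceed even the 0.05 critical value."""
    passed = [
        alpha for alpha, critical in CRITICAL_VALUES_DF1.items()
        if chi_square > critical
    ]
    return min(passed) if passed else None


level = smallest_significance_level(19.2)
if level is None:
    print("Fail to reject H0 at the 0.05 level")
else:
    print(f"Reject H0: p-value < {level}")
```

For the cruise ship value of 19.2, the statistic exceeds every tabulated threshold, so the null hypothesis is rejected well beyond the 0.05 level.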
