Warm up On slide.

Presentation transcript:

Warm up On slide

Section 11.1 Chi-Square

Inference Summary (Hypothesis Tests and Confidence Intervals)
Means: one-sample z procedures, one-sample t procedures, matched pairs t procedures, two-sample t procedures
Proportions: one-proportion z procedures, two-proportion z procedures

The questions then are…
What if we want to compare MORE than 2 proportions? For example, let’s examine the proportion of high school students who go on to four-year colleges. Is that proportion different based on race (White, African American, Asian, Hispanic)? We’d be comparing 4 proportions!
What if we want to check actual results against a predicted model? For example, we want to compare the actual results of mating two red-eyed fruit flies to the results the model predicts.
What if we want to compare two categorical variables to see if there is a relationship? For example, is smoking behavior (current smoker, former smoker, never smoked) associated with socioeconomic status (high, medium, low)?

The answer is… Spelled Chi-Squared. Pronounced like KITE without the “te.”

Then there were three
There are three types of chi-squared tests:
Goodness of fit
Homogeneity of proportions
Association / independence
Today our focus will be the Chi-Squared Goodness of Fit test.

Goodness of Fit
The Chi-squared goodness of fit test checks whether an observed sample distribution is significantly different from the hypothesized distribution. The idea is to compare the observed count in each category to the expected count for that category based on the hypothesized distribution.

H0: The specified distribution of the categorical variable is correct. Ha: The specified distribution of the categorical variable is not correct.

Conditions
Use the chi-squared goodness of fit test if:
The data come from a simple random sample (SRS).
The variable under study is categorical.
The expected count in each category of the variable is at least 5.

Mars, Incorporated makes milk chocolate candies. Here’s what the company’s Consumer Affairs Department says about the color distribution of its M&M’S Milk Chocolate Candies: On average, the new mix of colors of M&M’S Milk Chocolate Candies will contain 13 percent each of browns and reds, 14 percent yellows, 16 percent greens, 20 percent oranges and 24 percent blues.

The one-way table below summarizes the data from a sample bag of M&M’S Milk Chocolate Candies. In general, one-way tables display the distribution of a categorical variable for the individuals in a sample. Since the company claims that 24% of all M&M’S Milk Chocolate Candies are blue, we might believe that something fishy is going on. We could use the one-sample z test for a proportion from Chapter 9 to test the hypotheses
H0: p = 0.24
Ha: p ≠ 0.24
where p is the true population proportion of blue M&M’S. We could then perform additional significance tests for each of the remaining colors.

Hypotheses
The null hypothesis in a chi-square goodness-of-fit test should state a claim about the distribution of a single categorical variable in the population of interest. In our example, the appropriate hypotheses are
H0: The company’s stated color distribution for M&M’S Milk Chocolate Candies is correct.
Ha: The company’s stated color distribution for M&M’S Milk Chocolate Candies is not correct.

We can also write the hypotheses in symbols as
H0: p_blue = 0.24, p_orange = 0.20, p_green = 0.16, p_yellow = 0.14, p_red = 0.13, p_brown = 0.13
Ha: At least one of the p_i’s is incorrect
where p_color = the true population proportion of M&M’S Milk Chocolate Candies of that color.
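
To make the null hypothesis concrete, here is a minimal Python sketch that turns the claimed proportions into expected counts and checks the "at least 5" condition. The bag size of 60 is an assumption made for illustration only; the slides do not give the sample size.

```python
# Hypothesized color distribution from the company's claim (H0).
claimed = {
    "blue": 0.24, "orange": 0.20, "green": 0.16,
    "yellow": 0.14, "red": 0.13, "brown": 0.13,
}

# ASSUMPTION: a hypothetical bag of 60 candies (sample size is not in the slides).
n = 60

# A valid distribution must account for 100% of the candies.
assert abs(sum(claimed.values()) - 1.0) < 1e-9

# Expected count for each color = n * hypothesized proportion.
expected = {color: n * p for color, p in claimed.items()}

# Large-counts condition: every expected count should be at least 5.
condition_met = all(count >= 5 for count in expected.values())

for color, count in expected.items():
    print(f"{color:>6}: expected {count:.1f}")
print("All expected counts >= 5:", condition_met)
```

With a smaller hypothetical bag the red and brown expected counts would drop below 5 first, since those colors have the smallest claimed proportions.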

The formula
χ² = Σ (Observed − Expected)² / Expected
Remember Σ means sum. So compute (Observed − Expected)² / Expected for each category and add them all up!
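
The "compute it for each category and add them up" recipe can be sketched in a few lines of Python. The function name and the three-category counts below are hypothetical, chosen only to show the arithmetic.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories (illustration only).
observed = [18, 22, 20]
expected = [20.0, 20.0, 20.0]

# (18-20)^2/20 + (22-20)^2/20 + (20-20)^2/20 = 0.2 + 0.2 + 0.0
stat = chi_square_statistic(observed, expected)
print(round(stat, 3))
```

A category whose observed count matches its expected count contributes nothing to the total, so large values of the statistic come only from categories that miss the hypothesized distribution.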

P-value = .0703

Example
Back in 1980, the US population had the following distribution by age:

Age Group       Percent of the Population
0 to 24         41.39%
25 to 44        27.68%
45 to 64        19.64%
65 and older    11.28%

1996…
Suppose I take a sample of 500 US residents in 1996 and find the following distribution:

Age Group       Count
0 to 24         177
25 to 44        158
45 to 64        101
65 and older    64
Total           500

I want to know: does the distribution of my sample in 1996 match the distribution of age from 1980?

Let’s Compare:

Age Group   Observed (based on sample of 500)   Expected (based on 1980 percentage * 500)
0-24        177                                 206.95
25-44       158                                 138.4
45-64       101                                 98.2
65+         64                                  56.4

Help me fill in the last column!
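
As a check on the expected column, the sketch below recomputes each expected count as the 1980 percentage times 500 and then adds up each age group's contribution to the chi-squared statistic. The variable names are illustrative.

```python
age_groups = ["0 to 24", "25 to 44", "45 to 64", "65 and older"]
percent_1980 = [41.39, 27.68, 19.64, 11.28]   # hypothesized distribution
observed_1996 = [177, 158, 101, 64]           # sample counts
n = sum(observed_1996)                        # sample size, 500

# Expected count = sample size * hypothesized proportion.
expected = [n * pct / 100 for pct in percent_1980]

# Each group's contribution: (observed - expected)^2 / expected.
contributions = [
    (o - e) ** 2 / e for o, e in zip(observed_1996, expected)
]
chi_square = sum(contributions)

for group, o, e, c in zip(age_groups, observed_1996, expected, contributions):
    print(f"{group:<13} observed={o:>3}  expected={e:>6.2f}  contribution={c:.3f}")
print(f"chi-square = {chi_square:.3f}")
```

The youngest group contributes the most to the total (its observed count falls about 30 short of expected), and the statistic comes out at roughly 8.2.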

We see that the distributions are different. The question is ARE THEY SIGNIFICANTLY DIFFERENT?
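
One way to answer that question is with a P-value. The sketch below is a minimal illustration that makes two assumptions not stated in the slides: a significance level of 0.05, and 3 degrees of freedom (4 age groups minus 1). It uses the closed-form upper-tail area of the chi-squared distribution for 3 degrees of freedom, so it needs only the standard library.

```python
import math


def chi_square_sf_df3(x):
    """Upper-tail area P(X > x) for a chi-squared distribution with 3 df."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


percent_1980 = [41.39, 27.68, 19.64, 11.28]
observed_1996 = [177, 158, 101, 64]
n = sum(observed_1996)

expected = [n * pct / 100 for pct in percent_1980]
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed_1996, expected))

alpha = 0.05  # ASSUMPTION: significance level, not given in the slides
p_value = chi_square_sf_df3(chi_square)

print(f"chi-square = {chi_square:.3f}, P-value = {p_value:.4f}")
if p_value < alpha:
    print("Reject H0: the 1996 sample does not match the 1980 distribution.")
else:
    print("Fail to reject H0: no convincing evidence of a change.")
```

Under these assumptions the P-value falls below 0.05, so the 1996 sample gives evidence that the age distribution differs from 1980.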

Characteristics of the Chi-Squared Statistic
Chi-Square is ALWAYS (always? Yes, always) skewed RIGHT.
As the degrees of freedom increase, the graph becomes less skewed. It becomes more symmetric and looks more like a normal curve.
The total area under a chi-square curve is 1. WHY?
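
These characteristics can be checked numerically. The following sketch evaluates the chi-squared density for a few degrees of freedom, approximates the area under each curve with a midpoint sum, and reports the skewness, which for a chi-squared distribution equals the square root of 8/df. The function names, the choice of degrees of freedom, and the integration cutoff of 200 are assumptions made for illustration.

```python
import math


def chi_square_pdf(x, df):
    """Density of the chi-squared distribution with df degrees of freedom."""
    k = df / 2
    return x ** (k - 1) * math.exp(-x / 2) / (2 ** k * math.gamma(k))


def area_under_curve(df, upper=200.0, steps=20_000):
    """Midpoint-rule approximation of the area from 0 to `upper`."""
    width = upper / steps
    return width * sum(
        chi_square_pdf((i + 0.5) * width, df) for i in range(steps)
    )


def skewness(df):
    """Skewness of a chi-squared distribution: sqrt(8 / df), always positive."""
    return math.sqrt(8 / df)


results = {df: (area_under_curve(df), skewness(df)) for df in (2, 4, 8, 16)}
for df, (area, skew) in results.items():
    print(f"df={df:>2}  area={area:.4f}  skewness={skew:.3f}")
```

The area stays at 1 for every df because the curve is a probability density: the total probability over all possible values of the statistic must be 1. The skewness is positive for every df and shrinks toward 0 as df grows.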

In Calc
Put Observed in L1 and Expected in L2.
Stat, Test, χ² GOF-Test.
Enter your df (number of categories minus 1).
CAUTION!!!! You still need to know how to use the formula and table… Sometimes your calculator will give you an error! This happened in the 2008 Free Response!
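
The formula-and-table route can be sketched as follows. The function name is hypothetical, and the table holds only the standard upper-tail critical values at a 0.05 significance level for 1 to 6 degrees of freedom. The choice of 0.05 is an assumption, since the slides do not state a level. The data are the 1980 and 1996 age counts from the example above.

```python
# Standard chi-squared critical values for an upper-tail area of 0.05.
# ASSUMPTION: alpha = 0.05; the slides do not state a significance level.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070, 6: 12.592}


def gof_by_table(observed, expected):
    """Return (statistic, df, reject_h0) using the formula and the table."""
    df = len(observed) - 1  # number of categories minus 1
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, df, stat > CRITICAL_05[df]


observed = [177, 158, 101, 64]          # 1996 sample counts
expected = [206.95, 138.4, 98.2, 56.4]  # 1980 percentage * 500

stat, df, reject = gof_by_table(observed, expected)
print(f"chi-square = {stat:.3f}, df = {df}, critical value = {CRITICAL_05[df]}")
print("Reject H0" if reject else "Fail to reject H0")
```

Comparing the statistic to a critical value gives the same decision as comparing the P-value to the significance level, which is why the table still works when the calculator does not.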

How to recognize χ² Goodness of Fit
You are given a percent for each category of one variable and you want to know if your sample matches that distribution.

Homework
Chapter 11 #9, 10, 13(a-c), 15, 19-22 explain