Warm-up
1) The first 115 Kentucky Derby winners by color of horse were as follows: roan, 1; gray, 4; chestnut, 36; bay, 53; dark bay, 17; and black, 4. Which of the following visual displays is most appropriate?
a) Bar chart b) Histogram c) Stemplot d) Boxplot e) Time plot
2) Suppose the average score on a national test is 500 with a standard deviation of 100. If each score is increased by 25%, what are the new mean and standard deviation?
a) 500, 100 b) 525, 100 c) 625, 100 d) 625, 105 e) 625, 125
Chi-Squared Goodness of Fit Chapter 14
Do you remember what we did the first day?
Portion of data for our Day 1 Activity:
Color   Blue  Orange  Green  Yellow  Red  Brown  Total
Count   9     8       12     15      10   6      60
According to the Mars Company, we should have gotten 24% blue M&Ms... did we? Nope... we got 9/60, or 15%.
So, like all hypothesis tests, we have a null and an alternative hypothesis.
Null:
Alternative:
The idea of the chi-squared goodness-of-fit test is this: we compare the observed counts from our sample with the counts that would be expected if H0 is true. So how do we get the expected counts?
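As a rough sketch of where expected counts come from: each one is the sample size times the proportion claimed under H0. Only the 24% figure for blue appears in these notes; the other proportions below are an assumption (commonly cited Mars figures for milk chocolate M&Ms), and the variable names are illustrative.

```python
# Claimed color proportions under H0.
# Only Blue = 0.24 is given in these notes; the other values are assumed.
claimed = {"Blue": 0.24, "Orange": 0.20, "Green": 0.16,
           "Yellow": 0.14, "Red": 0.13, "Brown": 0.13}

n = 60  # sample size from the Day 1 activity

# Expected count for each color = sample size * claimed proportion
expected = {color: n * p for color, p in claimed.items()}

for color, count in expected.items():
    print(f"{color}: {count:.1f}")
```

Expected counts do not have to be whole numbers, and they always add up to the sample size.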
A large difference between the observed and expected counts is good evidence against the null hypothesis. But what we want to know is: how likely is it that differences this large or larger would occur just by chance in random samples of size 60 from the population distribution claimed by Mars, Inc.?
The smaller the χ² –
The larger the χ² –
Computing Chi-Squared
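The chi-squared statistic adds up (observed − expected)² / expected over all categories. Here is a minimal sketch for the Day 1 M&M counts. It assumes the full set of claimed proportions (only the 24% for blue is stated in these notes; the rest are assumed values), so treat the result as illustrative.

```python
# Observed counts from the Day 1 activity: Blue, Orange, Green, Yellow, Red, Brown
observed = [9, 8, 12, 15, 10, 6]

# Claimed proportions in the same order (only 0.24 for Blue is from the notes;
# the others are assumed for this sketch)
claimed = [0.24, 0.20, 0.16, 0.14, 0.13, 0.13]

n = sum(observed)                      # total sample size
expected = [n * p for p in claimed]    # expected counts if H0 is true

# Sum of (observed - expected)^2 / expected over all categories
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(chi_sq, 2))
```

Under these assumed proportions, the statistic lands between the two Table D critical values quoted for df = 5 (9.24 and 11.07).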
Chi-Squared Distribution: a family of distributions specified by the degrees of freedom (df), with the following properties:
Finding the P-Value
Option 1: Table D
In Table D, look up df = 5. Our test statistic is between critical values 9.24 and 11.07. These correspond to ________ and ________.
Option 2: Calculator
χ²cdf(
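For readers without Table D or a calculator, the upper-tail area can also be computed directly. The sketch below is one possible way to do it with only the standard library; `chi2_upper_tail` is a hypothetical helper written for this example, not a calculator or library function. It uses a series for the regularized lower incomplete gamma function.

```python
import math

def chi2_upper_tail(x, df):
    """Area to the right of x under the chi-squared curve with df degrees of freedom.

    Hypothetical helper for illustration: computes 1 - P(df/2, x/2), where P is
    the regularized lower incomplete gamma function, summed as a power series.
    """
    if x <= 0:
        return 1.0
    a, half_x = df / 2, x / 2
    term = 1.0 / a          # first term of the series
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= half_x / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return 1.0 - lower

# Tail areas at the two critical values quoted from Table D (df = 5)
for critical_value in (9.24, 11.07):
    print(critical_value, round(chi2_upper_tail(critical_value, 5), 3))
```

Because the test statistic lies between the two critical values, its P-value lies between the two tail areas printed.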
So what would we conclude at the 0.05 significance level?
Chi-Squared Test
The null hypothesis for the χ² test is:
The alternative hypothesis is:
Conditions/Assumptions for the Goodness of Fit test
1.
2.
3.
Example: When Were You Born?
Are births evenly distributed across the days of the week? The one-way table below shows the distribution of births across the days of the week in a random sample of 140 births from local records in a large city. Do these data give significant evidence that local births are not equally likely on all days of the week?
Day     Sun  Mon  Tues  Wed  Thurs  Fri  Sat
Births  13   23   24    20   27     18   15
Hypotheses:
Assumptions:
Name of Test:
Test Statistic:
Obtain P-Value:
Make Decision:
Statement in Context:
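One way to check the arithmetic for the births example is sketched below. Under H0 every day is equally likely, so each expected count is 140/7 = 20. The P-value line uses a closed-form expression for the chi-squared upper tail that is valid only when df is even (here df = 6); the variable names are illustrative.

```python
import math

# Observed births by day of the week (from the one-way table)
births = {"Sun": 13, "Mon": 23, "Tues": 24, "Wed": 20,
          "Thurs": 27, "Fri": 18, "Sat": 15}

n = sum(births.values())        # 140 births in the sample
expected = n / len(births)      # equal expected count for each day under H0

# Test statistic: sum of (observed - expected)^2 / expected
chi_sq = sum((obs - expected) ** 2 / expected for obs in births.values())

df = len(births) - 1            # number of categories - 1

# Upper-tail area; this closed form only works for even df
half = chi_sq / 2
p_value = math.exp(-half) * sum(half ** k / math.factorial(k)
                                for k in range(df // 2))

print(f"chi-squared = {chi_sq:.2f}, df = {df}, P-value = {p_value:.3f}")
```

The P-value here is well above 0.05, so at that level we would fail to reject H0: these data do not give significant evidence that local births are unequally distributed across the days of the week.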
Goodness of Fit recap
The test uses univariate data.
It checks how well the observed counts "fit" what we expect the counts to be.
Use the χ²cdf function of the calculator to find P-values.
The test is based on df, where df = number of categories – 1.
Hypotheses are written in words (be sure to write them in context):
H0: the observed counts are equal to the expected counts
Ha: the observed counts are not equal to the expected counts