Inference for Tables: Chi-Squares procedures (2 more chapters to go!)


1 Inference for Tables: Chi-Squares procedures (2 more chapters to go!)

2 Sometimes we want to examine the distribution of proportions in a single population. The chi-square test for goodness of fit allows us to determine whether a specified population distribution seems valid.
We can compare two or more population proportions using a chi-square test for homogeneity of populations. In doing so, we organize our data in a two-way table.
It is also possible to use the information provided in a two-way table to determine whether the distribution of one variable has been influenced by another variable. The chi-square test of association/independence helps us decide this issue.

3 Does your zodiac sign determine how successful you will be in later life? Fortune magazine collected the zodiac signs of 256 heads of the largest 400 companies. Here are the numbers of births for each sign:

Births  Sign
23      Aries
20      Taurus
18      Gemini
23      Cancer
20      Leo
19      Virgo
18      Libra
21      Scorpio
19      Sagittarius
22      Capricorn
24      Aquarius
29      Pisces

We can see some variation in the number of births per sign, and there are more Pisces than any other sign, but is it enough to claim that successful people are more likely to be born under some signs than others? How closely do the observed numbers of births per sign fit the simple “null” model in which births are spread evenly across the signs?

4 Goodness-of-fit We have specified a model for the distribution and want to know whether it fits. There is no single parameter to estimate so a confidence interval wouldn’t make any sense.

5 M&M’s To see if you got a fair share of blue M&M’s, you could perform a significance test on the proportion of blue (like we have been doing). You could then perform significance tests on each of the other colors. This would be inefficient, and it would not tell us how likely it is that six sample proportions differ from the values stated by the company as much as our sample does.

6 Chi-square Test for Goodness of Fit
Compare the color distribution of each bag with the distribution stated by the company.

7 Goodness of Fit
H0: the actual population proportions are equal to the hypothesized proportions.
Ha: the actual population proportions are different from the hypothesized proportions (at least one differs).
In order to determine whether the distribution is different, calculate the quantity
$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
where the sum runs over all categories. This sum is the chi-square statistic ($\chi^2$), and its degrees of freedom = number of categories - 1.
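The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_statistic` and the die-roll counts are made up for the example and do not come from the slides.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data (not from the slides): 60 rolls of a die assumed fair.
observed = [8, 12, 9, 11, 13, 7]
expected = [60 / 6] * 6            # 10 expected rolls per face under H0

statistic = chi_square_statistic(observed, expected)
df = len(observed) - 1             # number of categories - 1

print(f"chi-square = {statistic:.2f}, df = {df}")   # chi-square = 2.80, df = 5
```

Each category contributes one term to the sum, so a single badly fitting category can dominate the statistic.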

8 Properties of the chi-squared distribution
The chi-square distributions are a family of distributions that take only positive values and are skewed to the right. A specific chi-square distribution is specified by a single parameter, the degrees of freedom (for the goodness-of-fit test, number of categories - 1).

9 Chi squared density curve
The total area under a chi-square curve is equal to 1.
The curve (for more than 2 degrees of freedom) begins at 0 on the horizontal axis, increases to a peak, and then approaches the horizontal axis asymptotically.
The chi-square curve is skewed to the right.
As the number of degrees of freedom increases, the curve becomes more and more symmetrical and looks more like a normal curve.

10 Density curves for three members of the chi-square family of distributions
As the degrees of freedom increase, the density curve becomes less skewed and larger values become more probable
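These properties can be checked numerically. The sketch below uses the standard chi-square density formula and a simple trapezoid rule; the three df values (3, 5, 10) are an assumption chosen for illustration, since the transcript does not say which curves the slide showed, and the helper names are hypothetical.

```python
import math


def chi2_pdf(x, df):
    """Chi-square density with df degrees of freedom (intended for df > 2)."""
    if x <= 0:
        return 0.0
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))


def area_and_peak(df, upper=100.0, step=0.01):
    """Trapezoid-rule area on [0, upper] and the x value where the curve peaks."""
    n = int(upper / step)
    xs = [i * step for i in range(n + 1)]
    ys = [chi2_pdf(x, df) for x in xs]
    area = step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
    peak_x = xs[ys.index(max(ys))]
    return area, peak_x


for df in (3, 5, 10):                      # assumed df values, for illustration only
    area, peak_x = area_and_peak(df)
    skewness = math.sqrt(8 / df)           # theoretical skewness of a chi-square curve
    print(f"df={df:2d}  area={area:.3f}  peak near x={peak_x:.2f}  skewness={skewness:.2f}")
```

The area stays at 1 for every df, the peak moves to the right (it sits at df - 2), and the skewness shrinks as df grows, which is the behavior the slides describe.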

11 Hypotheses used for this test
Follow the same 4-step process. Remember: when no significance level is given, use 0.05.
Conditions:
Random sample.
All individual expected counts are at least 1.
No more than 20% of the expected counts are less than 5.

12 Example 13.2: Biologists wish to mate two fruit flies having genetic makeup RrCc, indicating that each has one dominant gene (R) and one recessive gene (r) for eye color, along with one dominant (C) and one recessive (c) gene for wing type. Each offspring will receive one gene for each of the two traits from both parents. The following table, often called a Punnett square, shows the possible combinations of genes received by the offspring.

                        Parent 2 passes on
                  RC         Rc         rC         rc
Parent 1    RC    RRCC (x)   RRCc (x)   RrCC (x)   RrCc (x)
passes on   Rc    RRCc (x)   RRcc (y)   RrCc (x)   Rrcc (y)
            rC    RrCC (x)   RrCc (x)   rrCC (z)   rrCc (z)
            rc    RrCc (x)   Rrcc (y)   rrCc (z)   rrcc (w)

13 Any offspring receiving an R gene will have red eyes, and any offspring receiving a C gene will have straight wings. So based on this Punnett square, the biologists predict a ratio of 9 red-eyed, straight wing (x): 3 red-eyed, curly wing (y): 3 white-eyed, straight (z): 1 white-eyed, curly (w) offspring. In order to test their hypothesis, the biologists mate the fruit flies. Of 200 offspring, 101 had red eyes and straight wings, 42 had red eyes and curly wings, 49 had white eyes and straight wings, and 10 had white eyes and curly wings. Do these data differ significantly from what the biologists have predicted?

14 Step 1: State The biologists are interested in the proportion of offspring that fall into each genetic category for the population of all fruit flies that would result from crossing two parents with genetic makeup RrCc.
H0: $p_{\text{red,straight}} = 9/16 = 0.5625$, $p_{\text{red,curly}} = 3/16 = 0.1875$, $p_{\text{white,straight}} = 3/16 = 0.1875$, $p_{\text{white,curly}} = 1/16 = 0.0625$
Ha: at least one of these proportions is incorrect

15 Step 2: Plan/Conditions
We will use a chi-square goodness of fit test provided all conditions are met.
Random: the fruit flies were selected randomly.
Conditions: check expected counts
red-eyed, straight-wing: 200 * 0.5625 = 112.5
red-eyed, curly-wing: 200 * 0.1875 = 37.5
white-eyed, straight-wing: 200 * 0.1875 = 37.5
white-eyed, curly-wing: 200 * 0.0625 = 12.5
Since all expected counts are greater than 5, we can continue.
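The expected-count check can be sketched as follows, using the 9:3:3:1 ratio and the 200 offspring from the example. The helper names are hypothetical, and the condition rule coded here is the one stated on the conditions slide (all expected counts at least 1, no more than 20% below 5).

```python
def expected_counts(n, proportions):
    """Expected count in each category: sample size times hypothesized proportion."""
    return [n * p for p in proportions]


def expected_count_conditions_met(expected):
    """True if all expected counts are >= 1 and at most 20% of them are < 5."""
    all_at_least_1 = all(e >= 1 for e in expected)
    share_below_5 = sum(e < 5 for e in expected) / len(expected)
    return all_at_least_1 and share_below_5 <= 0.20


categories = [
    "red-eyed, straight-wing",
    "red-eyed, curly-wing",
    "white-eyed, straight-wing",
    "white-eyed, curly-wing",
]
ratio = [9, 3, 3, 1]                                  # predicted 9:3:3:1 ratio
proportions = [r / sum(ratio) for r in ratio]         # 0.5625, 0.1875, 0.1875, 0.0625
expected = expected_counts(200, proportions)

for name, count in zip(categories, expected):
    print(f"{name}: {count}")
print("conditions met:", expected_count_conditions_met(expected))
```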

16 Step 3: Do
$$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
$$\chi^2 = \frac{(101 - 112.5)^2}{112.5} + \frac{(42 - 37.5)^2}{37.5} + \frac{(49 - 37.5)^2}{37.5} + \frac{(10 - 12.5)^2}{12.5}$$
$$\chi^2 = 1.176 + 0.540 + 3.527 + 0.500 = 5.742$$
Using Table E with df = 3, we get a P-value between 0.10 and 0.15.
Using our calculator: go to DISTR, down to option 8: χ2cdf(χ2, 999, df) = χ2cdf(5.742, 999, 3) ≈ 0.125
Or try STAT, TESTS, and go down to D: χ2GOF-Test
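For readers without the calculator, the same computation can be sketched in plain Python. The helper `chi2_sf` is a hypothetical name; it uses the closed-form upper-tail formulas that hold for whole-number degrees of freedom and plays the role of χ2cdf(χ2, 999, df).

```python
import math


def chi2_sf(x, df):
    """Upper-tail area P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    a = x / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum
        return math.exp(-a) * sum(a ** j / math.factorial(j) for j in range(df // 2))
    # Odd df: erfc term plus a finite sum with half-integer gamma values
    tail = sum(a ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(a)) + math.exp(-a) * tail


observed = [101, 42, 49, 10]               # fruit-fly counts from the example
expected = [112.5, 37.5, 37.5, 12.5]       # 200 offspring split 9:3:3:1

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
p_value = chi2_sf(statistic, df)

print(f"chi-square = {statistic:.3f}, df = {df}, P-value = {p_value:.4f}")
```

If SciPy happens to be installed, `scipy.stats.chisquare(observed, f_exp=expected)` should report the same statistic and P-value.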

17 Step 4: Conclude The P-value of about 0.125 indicates that the probability of obtaining a sample of 200 fruit fly offspring in which the proportions differ from the hypothesized values by at least as much as the ones in our sample is over 12%, assuming that the null hypothesis is true. This is not enough evidence to reject the biologists’ predicted distribution.

18 Follow-up analysis Even when there is evidence that the distribution differs significantly from the hypothesized one, one must still look at the individual components of χ2 to see where the largest differences have occurred.
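A minimal sketch of such a follow-up, reusing the fruit-fly counts from Example 13.2 purely as an illustration (that test was not significant, so no follow-up would actually be called for there). The variable names are hypothetical.

```python
categories = [
    "red-eyed, straight-wing",
    "red-eyed, curly-wing",
    "white-eyed, straight-wing",
    "white-eyed, curly-wing",
]
observed = [101, 42, 49, 10]
expected = [112.5, 37.5, 37.5, 12.5]

# One component per category: (observed - expected)^2 / expected
components = {
    name: (o - e) ** 2 / e
    for name, o, e in zip(categories, observed, expected)
}

# List the categories from largest to smallest contribution
for name, value in sorted(components.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {value:.3f}")

largest = max(components, key=components.get)
print("largest component:", largest)
```

Here the white-eyed, straight-wing category contributes the most to the statistic, because 49 observed offspring is well above the 37.5 expected.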

19 Going back to the zodiac problem, I want to know whether births of successful people are uniformly distributed across the signs of the zodiac or not.
H0: births are uniformly distributed over zodiac signs.
Ha: births are not uniformly distributed over zodiac signs.
Use a chi-square goodness-of-fit test if conditions are met.

20 Check conditions:
Random: this is a convenience sample of executives, but there’s no reason to suspect bias.
Expected counts: 1/12*256 = 21.3 for each sign. These are all > 5, so this condition is satisfied.
df = 12-1 = 11

21 $$\chi^2 = \sum \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}}$$
χ2 = 5.094
P-value = χ2cdf(5.094, 999, 11) ≈ 0.926
The P-value of about 0.926 says that if the zodiac signs of executives were in fact distributed uniformly, an observed chi-square value of 5.09 or higher would occur about 93% of the time. This certainly isn’t unusual, so I fail to reject the null hypothesis, and conclude that these data show virtually no evidence of nonuniform distribution of zodiac signs among executives.
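The zodiac test can be reproduced with a short standard-library sketch. One assumption to flag: the counts used here for Cancer, Leo, Libra and Sagittarius (23, 20, 18, 19) are taken from the commonly cited version of this data set; they are consistent with the stated total of 256 and with the statistic 5.094. The helper `chi2_sf` is a hypothetical name standing in for χ2cdf(χ2, 999, df).

```python
import math


def chi2_sf(x, df):
    """Upper-tail area P(X >= x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    a = x / 2
    if df % 2 == 0:
        return math.exp(-a) * sum(a ** j / math.factorial(j) for j in range(df // 2))
    tail = sum(a ** (j - 0.5) / math.gamma(j + 0.5) for j in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(a)) + math.exp(-a) * tail


# Births per sign; Cancer, Leo, Libra and Sagittarius are assumed values (see above).
births = {
    "Aries": 23, "Taurus": 20, "Gemini": 18, "Cancer": 23,
    "Leo": 20, "Virgo": 19, "Libra": 18, "Scorpio": 21,
    "Sagittarius": 19, "Capricorn": 22, "Aquarius": 24, "Pisces": 29,
}

total = sum(births.values())
expected = total / len(births)             # uniform model: 256 / 12, about 21.3 per sign

statistic = sum((count - expected) ** 2 / expected for count in births.values())
df = len(births) - 1
p_value = chi2_sf(statistic, df)

print(f"total = {total}, chi-square = {statistic:.3f}, df = {df}, P-value = {p_value:.3f}")
```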

