381 Goodness of Fit Tests QSCI 381 – Lecture 40 (Larson and Farber, Sect 10.1)


1 Goodness of Fit Tests QSCI 381 – Lecture 40 (Larson and Farber, Sect 10.1)

2 Multinomial Experiments
A multinomial experiment is a probability experiment consisting of a fixed number of independent trials in which there are more than two possible outcomes for each trial. The probability of each outcome is fixed, and each outcome is classified into categories.
Examples of multinomial experiments include:
- You sample 100 animals from a population. The categories could be age, length, or maturity state.
- You sample 1000 poppies in a field. The categories could be colour.
- You sample 20 animals and record the frequency of each genetic haplotype.

3 Goodness-of-fit Tests
A goodness-of-fit test is used to test whether an observed frequency distribution fits an expected distribution. We need to specify a null and an alternative hypothesis. Generally, the null hypothesis is that the observed frequency distribution (the data) fits the expected distribution. The alternative hypothesis is that it does not.

4 Example-I
We expect that a “healthy” marine mammal population should consist of an equal number of males and females, and that 60% of the population should be mature. We sample 150 animals and count the number in each of four categories:

Category           Observed
Mature Female      30
Mature Male        40
Immature Female    32
Immature Male      48

5 Observed and Expected Frequencies
The observed frequency O of a category is the frequency for the category observed in the data. The expected frequency E of a category is the calculated frequency for the category. Expected frequencies are obtained by assuming that the specified (or hypothesized) distribution is correct. The expected frequency for the i-th category is:

$E_i = n p_i$

where n is the number of trials and $p_i$ is the assumed probability for the i-th category.

6 Observed and Expected Frequencies (Example)

Category           Observed frequency   Assumed probability   Expected frequency
Mature Female      30                   0.3                   45 (150 × 0.3)
Mature Male        40                   0.3                   45 (150 × 0.3)
Immature Female    32                   0.2                   30 (150 × 0.2)
Immature Male      48                   0.2                   30 (150 × 0.2)
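
The expected frequencies in this table can be reproduced with a few lines of Python. This is a minimal sketch: the variable names are illustrative, and the four probabilities are derived from the assumptions stated in Example-I (equal numbers of males and females, 60% mature).

```python
# Expected frequencies E_i = n * p_i for the marine mammal example.
# Assumed probabilities follow from: half the animals female, 60% mature.
n = 150
p_mature = 0.6
p_female = 0.5

assumed_p = {
    "Mature Female": p_mature * p_female,
    "Mature Male": p_mature * (1 - p_female),
    "Immature Female": (1 - p_mature) * p_female,
    "Immature Male": (1 - p_mature) * (1 - p_female),
}

# Multiply each assumed probability by the number of trials.
expected = {category: n * p for category, p in assumed_p.items()}

for category, e in expected.items():
    print(f"{category}: {e:.1f}")
```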

7 The Chi-square goodness-of-fit Test-I
IF:
1. the observed frequencies are obtained from a random sample, and
2. each expected frequency is greater than or equal to 5 (pool categories if this is not the case),
then the sampling distribution for the goodness-of-fit test is a chi-square distribution with k - 1 degrees of freedom, where k is the number of categories. The test statistic is:

$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$

where $O_i$ and $E_i$ are the observed and expected frequencies for the i-th category.

8 The Chi-square goodness-of-fit Test-II
1. Identify the claim and state the null and alternative hypotheses.
2. Specify the level of significance, α.
3. Determine the degrees of freedom, d.f. = k - 1.
4. Find the critical value of the chi-square distribution and hence define the rejection region for the test.
5. Calculate the test statistic.
6. Check whether or not the value of the test statistic is in the rejection region.

9 Example (Test using α = 0.01)
H0: the distribution of animals between sex and maturity classes equals that expected for a healthy population.
The degrees of freedom = k - 1 = 3.
The critical value of the chi-square distribution is 11.34 (CHIINV(0.01,3)).
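
The critical value quoted from the spreadsheet function can be cross-checked without a statistics library. The sketch below is one way to do it, not the method used in the lecture: it relies on the closed-form upper-tail probability of a chi-square variable with 3 degrees of freedom and a simple bisection search. The function names are hypothetical, and the closed form applies only to d.f. = 3.

```python
import math


def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 3 d.f."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def critical_value_df3(alpha, lo=0.0, hi=100.0, tol=1e-8):
    """Find x such that P(X > x) = alpha by bisection.

    The tail probability decreases as x grows, so the search interval
    is halved toward the point where it crosses alpha.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_sf_df3(mid) > alpha:
            lo = mid  # tail still larger than alpha: move right
        else:
            hi = mid  # tail at or below alpha: move left
    return (lo + hi) / 2


crit = critical_value_df3(0.01)
print(f"critical value for alpha = 0.01, d.f. = 3: {crit:.3f}")
```

The result should agree with the value of 11.34 given above to two decimal places.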

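10 Example (Test using α = 0.01)

Category           Observed frequency   Expected frequency   (O - E)^2 / E
Mature Female      30                   45                   5
Mature Male        40                   45                   0.56
Immature Female    32                   30                   0.13
Immature Male      48                   30                   10.80

The test statistic is the sum of the last column, 16.49, which exceeds the critical value of 11.34. We reject the null hypothesis at the 1% level of significance.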
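
The calculation for this example can be sketched as follows. The observed and expected frequencies and the critical value are taken from the slides; the variable names are illustrative.

```python
# Chi-square goodness-of-fit statistic for the marine mammal example.
observed = [30, 40, 32, 48]   # Mature F, Mature M, Immature F, Immature M
expected = [45, 45, 30, 30]   # 150 * (0.3, 0.3, 0.2, 0.2)

# Contribution of each category: (O_i - E_i)^2 / E_i
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

critical_value = 11.34        # alpha = 0.01, d.f. = 3 (value from the slides)
reject_null = chi_square > critical_value

print("contributions:", [round(c, 2) for c in contributions])
print("chi-square:", round(chi_square, 2))
print("reject H0:", reject_null)
```

Most of the statistic comes from the Mature Female and Immature Male categories, which is where the sample departs furthest from the expected frequencies.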

11 Example-A-1 (α = 0.05)
The probability of a particular bird species utilizing each of five habitats is known. We collect data for a different species (n = 137) and wish to assess whether the two species differ in their habitat requirements.

Habitat type   Expected p   Observed
1              0.2          30
2              0.1          17
3              0.05         0
4              0.5          72
5              0.15         18

12 Example-A-2 (α = 0.05)

Habitat type   Observed frequency   Expected frequency   (O - E)^2 / E
1              30                   27.4                 0.25
2              17                   13.7                 0.79
3              0                    6.85                 6.85
4              72                   68.5                 0.18
5              18                   20.55                0.32

The test statistic is the sum of the last column, 8.39. The critical value is 9.49 (d.f. = 4), so we fail to reject the null hypothesis.
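
A short sketch of the habitat example, again using only values from the slides. It also checks the condition from the earlier slide that every expected frequency is at least 5; the variable names are illustrative.

```python
# Chi-square goodness-of-fit test for the habitat example (alpha = 0.05).
n = 137
expected_p = [0.2, 0.1, 0.05, 0.5, 0.15]   # known for the reference species
observed = [30, 17, 0, 72, 18]             # counts for the second species

expected = [n * p for p in expected_p]

# Condition for the chi-square approximation: every E_i >= 5.
all_expected_ok = min(expected) >= 5

contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_square = sum(contributions)

critical_value = 9.49                      # alpha = 0.05, d.f. = 4 (from the slides)
reject_null = chi_square > critical_value

print("expected:", [round(e, 2) for e in expected])
print("all expected >= 5:", all_expected_ok)
print("chi-square:", round(chi_square, 2))
print("reject H0:", reject_null)
```

Habitat type 3, with no observations against an expected 6.85, contributes most of the statistic, yet the total still falls short of the critical value.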

13 Testing for Normality
We can use the chi-square test in some cases to assess whether a variable is normally distributed. The null and alternative hypotheses are:
H0: The variable has a normal distribution.
Ha: The variable does not have a normal distribution.

14 Example

Class boundaries   Frequency
5-15               6
15-25              23
25-35              53
35-45              45
45-55              22

Can we assume that these data are normal (assume α = 0.05)?

15 Calculating the Test Statistic

Class        Observed      Cumulative normal    Expected p     Expected      (O - E)^2 / E
boundaries   frequency O   Lower      Upper     (difference)   frequency E
5-15         6             0.0030     0.0368    0.0338         5.0           0.1822
15-25        23            0.0368     0.2037    0.1669         24.9          0.1407
25-35        53            0.2037     0.5526    0.3488         52.0          0.0202
35-45        45            0.5526     0.8627    0.3102         46.2          0.0318
45-55        22            0.8627     0.9800    0.1172         17.5          1.1746
Σ            149                                0.977          145.57        1.5497

$x_i$ is the mid-point of each class, and $E_i = p_i \times 149$.
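
The table can be reproduced approximately with Python's standard library. The slide does not say how the mean and standard deviation of the fitted normal were obtained. The sketch below assumes they were estimated from the grouped data using the class mid-points and an n - 1 divisor, which appears to match the tabulated cumulative probabilities. It stops at the test statistic because the slide gives neither the degrees of freedom nor a critical value for this example.

```python
import math
from statistics import NormalDist

boundaries = [5, 15, 25, 35, 45, 55]
observed = [6, 23, 53, 45, 22]
n = sum(observed)

# Assumption: mean and s.d. are estimated from class mid-points x_i.
midpoints = [(lo + hi) / 2 for lo, hi in zip(boundaries, boundaries[1:])]
mean = sum(f * x for f, x in zip(observed, midpoints)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(observed, midpoints)) / (n - 1)
fitted = NormalDist(mean, math.sqrt(variance))

# Cumulative normal probability at each class boundary.
cumulative = [fitted.cdf(b) for b in boundaries]

# Expected probability of each class = upper cumulative - lower cumulative.
expected_p = [hi - lo for lo, hi in zip(cumulative, cumulative[1:])]
expected = [n * p for p in expected_p]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

for (lo, hi), o, p, e in zip(zip(boundaries, boundaries[1:]), observed, expected_p, expected):
    print(f"{lo}-{hi}: O={o}, p={p:.4f}, E={e:.1f}")
print("chi-square:", round(chi_square, 4))
```

Note that the expected probabilities sum to less than 1 because the fitted normal places some probability below 5 and above 55.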

