Lecture 36 Section 14.1 – 14.3 Mon, Nov 27, 2006


Test of Goodness of Fit

Count Data
Count data – data that count the number of observations falling into each of several categories. The data may be univariate or bivariate.
Univariate example – observe a person's opinion on a subject (strongly agree, agree, etc.).
Bivariate example – observe a person's opinion on a subject and that person's education level (< high school, high school, etc.).

Univariate Example
Observe a person's opinion on a question.

Strongly Agree   Agree   Neutral   Disagree   Strongly Disagree
      100         120       80        50

Bivariate Example
Observe people's opinions on a subject and their education levels.

                Strongly Agree   Agree   Neutral   Disagree   Strongly Disagree
< High School         35           25       30        5
High School           40           15       10
College               20
> College

Observed and Expected Counts Observed counts – The counts that were actually observed in the sample. Expected counts – The counts that would be expected if the null hypothesis were true.

Tests of Goodness of Fit The goodness-of-fit test applies only to univariate data. The null hypothesis specifies a discrete distribution for the population. We want to determine whether a sample from that population supports this hypothesis.

Examples
If we roll a die 60 times, we expect 10 of each number. If we get frequencies 8, 10, 14, 12, 9, 7, does that indicate that the die is not fair?
If we toss a fair coin twice, we should get two heads ¼ of the time, two tails ¼ of the time, and one of each ½ of the time. Suppose we repeat the two-toss experiment 100 times and get two heads 16 times, two tails 36 times, and one of each 48 times. Is the coin fair?

Examples If we selected 20 people from a group that was 60% male and 40% female, we would expect to get 12 males and 8 females. If we got 15 males and 5 females, would that indicate that our selection procedure was not random (i.e., discriminatory)? What if we selected 100 people from the group and got 75 males and 25 females?

Null Hypothesis The null hypothesis specifies the probability (or proportion) for each category. In other words, it specifies a discrete probability distribution. Each probability is the probability that a random observation would fall into that category.

Null Hypothesis To test a die for fairness, the null hypothesis would be H0: p1 = 1/6, p2 = 1/6, …, p6 = 1/6. The alternative hypothesis will always be a simple negation of H0: H1: At least one of the probabilities is not 1/6. or more simply, H1: H0 is false.

Expected Counts
To find the expected counts, we apply the hypothetical (H0) probabilities to the sample size. For example, if the hypothetical probabilities are 1/6 and the sample size is 60, then the expected counts are (1/6) × 60 = 10.

Example
We will use the sample data given for 60 rolls of a die to calculate the χ² statistic. Make a chart showing both the observed and expected counts (in parentheses).

Face:     1       2        3        4       5       6
Count:  8 (10) 10 (10)  14 (10)  12 (10)  9 (10)  7 (10)

The Chi-Square Statistic
Denote the observed counts by O and the expected counts by E. Define the chi-square (χ²) statistic to be
χ² = Σ (O − E)² / E,
where the sum runs over all categories. Clearly, if the observed counts are close to the expected counts, then χ² will be small. But if even a few observed counts are far from the expected counts, then χ² will be large.
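The formula can be checked directly in a few lines of Python (a sketch, not part of the lecture), using the die data from the earlier slide:

```python
# Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E over categories.
# Data: 60 rolls of a die; under H0 each face has probability 1/6,
# so every expected count is (1/6) * 60 = 10.
observed = [8, 10, 14, 12, 9, 7]
expected = [60 / 6] * 6                  # [10.0, 10.0, ..., 10.0]

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_square, 4))              # 3.4, the value used later in the lecture
```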

Chi-Square Degrees of Freedom
The chi-square distribution has an associated number of degrees of freedom, just like the t distribution. Each chi-square distribution has a slightly different shape, depending on the number of degrees of freedom.

Chi-Square Degrees of Freedom
[Graph: density curves of χ²(2), χ²(5), and χ²(10)]

Properties of 2 The chi-square distribution with df degrees of freedom has the following properties. 2  0. It is unimodal. It is skewed right (not symmetric!) 2 = df. 2 = (2df). If df is large, then 2(df) is approximately N(df, (2df)).

Chi-Square vs. Normal
[Graph: density of χ²(128) compared with N(128, 16)]

The Chi-Square Table See page A-11. The left column is degrees of freedom: 1, 2, 3, …, 15, 16, 18, 20, 24, 30, 40, 60, 120. The column headings represent areas of lower tails: 0.005, 0.01, 0.025, 0.05, 0.10, 0.90, 0.95, 0.975, 0.99, 0.995. Of course, the lower tails 0.90, 0.95, 0.975, 0.99, 0.995 are the same as the upper tails 0.10, 0.05, 0.025, 0.01, 0.005.

Example
If df = 10, what value of χ² cuts off a lower tail of 0.05? If df = 10, what value of χ² cuts off an upper tail of 0.05?
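The table lookups can also be reproduced numerically. The following sketch (illustrative, not a lecture method) inverts the χ²(10) upper-tail probability by bisection; for even df the tail probability has a simple closed form:

```python
import math

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom > x), for even df only:
    a finite Poisson-style sum, exp(-x/2) * sum of (x/2)^i / i!."""
    assert df % 2 == 0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= (x / 2) / i
        total += term
    return math.exp(-x / 2) * total

def critical_value(tail, df, lo=0.0, hi=100.0):
    """Find x such that P(chi2(df) > x) = tail, by bisection."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_upper_tail(mid, df) > tail:
            lo = mid                     # tail still too big: x too small
        else:
            hi = mid
    return (lo + hi) / 2

# df = 10: lower tail of 0.05 means an upper tail of 0.95, and vice versa.
print(round(critical_value(0.95, 10), 2))   # about 3.94
print(round(critical_value(0.05, 10), 2))   # about 18.31
```

These match the df = 10 row of the chi-square table.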

TI-83 – Chi-Square Probabilities
To find a chi-square probability (p-value) on the TI-83,
Press DISTR.
Select χ²cdf (item #7).
Press ENTER.
Enter the lower endpoint, the upper endpoint, and the degrees of freedom.
The probability appears.

Example Now calculate 2.

Computing the p-value
The number of degrees of freedom is 1 less than the number of categories in the table. In this example, df = 5. To find the p-value, use the TI-83 to calculate the probability that χ²(5) would be at least as large as 3.4:
p-value = χ²cdf(3.4, E99, 5) = 0.6386 (accept H0).
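For readers without a TI-83, the same tail probability can be computed in plain Python. This is a sketch using only the standard library; the two branches are the standard closed forms for the chi-square tail with integer degrees of freedom:

```python
import math

def chi2_sf(x, df):
    """P(chi2(df) > x) for integer df -- what the TI-83's chi2cdf(x, E99, df) returns."""
    if df % 2 == 0:                      # even df: finite Poisson-style sum
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # odd df: erfc base case (df = 1) plus one term for each extra pair of df
    q = math.erfc(math.sqrt(x / 2))
    term = math.sqrt(2 * x / math.pi) * math.exp(-x / 2)
    for i in range(1, (df + 1) // 2):
        q += term
        term *= x / (2 * i + 1)
    return q

print(round(chi2_sf(3.4, 5), 4))         # about 0.6386, the p-value on the slide
```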

The Effect of the Sample Size
What if the previous sample distribution persisted in a much larger sample, say n = 600? Would it be significant?

Face:      1         2         3          4         5        6
Count:  80 (100) 100 (100) 140 (100) 120 (100)  90 (100)  70 (100)
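Scaling every count by 10 scales the χ² statistic by 10 as well, which is what makes the larger sample decisive. A quick check (Python, illustrative only):

```python
# Same sample proportions as before, but ten times the sample size (n = 600).
observed = [80, 100, 140, 120, 90, 70]
expected = [600 / 6] * 6                 # 100 per face under H0

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi_square, 1))              # 34.0, ten times the earlier 3.4
```

With df = 5 this χ² gives a p-value far below 0.05, so the same proportions that looked consistent with a fair die at n = 60 are significant at n = 600.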

TI-83 – Goodness of Fit Test
The TI-83 will not automatically do a goodness-of-fit test. The following procedure will compute χ².
Enter the observed counts into list L1.
Enter the expected counts into list L2.
Evaluate the expression (L1 − L2)²/L2.
Select LIST > MATH > sum and apply the sum function to the previous result, i.e., sum(Ans).
The result is the value of χ².

The List of Expected Counts
To get the list of expected counts, you may
Store the list of hypothetical probabilities in L3.
Multiply L3 by the sample size and store the result in L2.
For example, if the probabilities for 6 categories are p1 = 1/6, p2 = 1/6, p3 = 1/6, p4 = 1/6, p5 = 1/6, and p6 = 1/6, and the sample size is n = 60, then
Store {1/6, 1/6, 1/6, 1/6, 1/6, 1/6} in L3.
Compute L3*60 and store the result in L2.
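The same list manipulation can be mirrored in Python (a sketch of the procedure above, not the TI-83 itself; the list names echo the calculator's):

```python
# Mirror of the TI-83 procedure: L3 holds the hypothetical probabilities,
# L2 = L3 * n holds the expected counts, and L1 holds the observed counts.
n = 60
L3 = [1 / 6] * 6                         # hypothetical probabilities
L2 = [p * n for p in L3]                 # expected counts, each 10
L1 = [8, 10, 14, 12, 9, 7]               # observed counts

chi_square = sum((o - e) ** 2 / e for o, e in zip(L1, L2))
print(round(chi_square, 4))              # 3.4
```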

Example
To test whether the coin is fair, the null hypothesis would be
H0: pHH = 1/4, pTT = 1/4, pHT = 1/2.
The alternative hypothesis would be
H1: H0 is false.
Let α = 0.05.

Expected Counts
To find the expected counts, we apply the hypothetical probabilities to the sample size.
Expected HH = (1/4) × 100 = 25.
Expected TT = (1/4) × 100 = 25.
Expected HT = (1/2) × 100 = 50.

Example
We will use the sample data given for 100 repetitions of the two-toss experiment to calculate the χ² statistic. Make a chart showing both the observed and expected counts (in parentheses).

Outcome:    HH       TT       HT
Count:    16 (25)  36 (25)  48 (50)

Example Now calculate 2.

Compute the p-value
In this example, df = 2. To find the p-value, use the TI-83 to calculate the probability that χ²(2) would be at least as large as 8.16:
χ²cdf(8.16, E99, 2) = 0.0169.
Therefore, p-value = 0.0169 (reject H0). The coin appears to be unfair.
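The whole coin test fits in a few lines of Python (a sketch; for df = 2 the tail probability reduces to e^(−χ²/2)):

```python
import math

# Coin example: 100 repetitions of the two-toss experiment.
observed = [16, 36, 48]                  # HH, TT, HT
expected = [25, 25, 50]                  # 100 * (1/4, 1/4, 1/2) under H0

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = math.exp(-chi_square / 2)      # df = 2 closed form for the upper tail
print(round(chi_square, 2), round(p_value, 4))   # 8.16 0.0169
```

Since the p-value is below α = 0.05, we reject H0, matching the slide's conclusion.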

Example Suppose we select 20 people from a group that is 60% male and 40% female and we get 15 males and 5 females. Is it reasonable to believe that we selected the 20 people at random?

The Hypotheses
To test that the process was random, the null hypothesis would be
H0: pM = 0.60, pF = 0.40.
The alternative hypothesis would be
H1: H0 is false.
Let α = 0.05.

Calculate the Expected Counts
To find the expected counts, multiply the hypothetical probabilities by the sample size.
Expected no. of males = 0.60 × 20 = 12.
Expected no. of females = 0.40 × 20 = 8.

Make the Chart
Make a chart showing both the observed and expected counts (in parentheses).

            M        F
         15 (12)   5 (8)

Compute 2 Now calculate 2.

Compute the p-value
In this example, df = 1. To find the p-value, use the TI-83 to calculate the probability that χ²(1) would be at least as large as 1.875:
χ²cdf(1.875, E99, 1) = 0.1709.
Therefore, p-value = 0.1709 (accept H0). There is no evidence that the people were not selected at random.
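Again as a plain-Python check (a sketch; for df = 1 the tail probability is erfc(√(χ²/2))):

```python
import math

# Selection example: 20 people from a group that is 60% male, 40% female.
observed = [15, 5]                       # males, females
expected = [12, 8]                       # 20 * (0.60, 0.40) under H0

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = math.erfc(math.sqrt(chi_square / 2))   # df = 1 closed form
print(round(chi_square, 3), round(p_value, 4))   # 1.875 0.1709
```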

Increase the Sample Size Repeat the previous problem with a sample of 75 males and 25 females.
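Following the same recipe for the larger sample gives (a sketch of the exercise, with the counts taken from the problem statement):

```python
import math

# Larger sample: 100 people, with 75 males and 25 females observed.
observed = [75, 25]
expected = [60, 40]                      # 100 * (0.60, 0.40) under H0

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = math.erfc(math.sqrt(chi_square / 2))   # df = 1 closed form
print(round(chi_square, 3))              # 9.375
print(p_value < 0.05)                    # True: now we reject H0
```

The same 75%/25% split that was not significant at n = 20 is strongly significant at n = 100, echoing the sample-size effect seen earlier with the die.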