1 Outline 1. Count data 2. Properties of the multinomial experiment 3. Testing the null hypothesis 4. Examples

Presentation transcript:

1 Outline
1. Count data
2. Properties of the multinomial experiment
3. Testing the null hypothesis
4. Examples

2 Count data Sometimes, the data we have to analyze are produced by counting things. – How many people choose each of Brands A, B, and C of coffee?

3 Count data Usually, we count things in a sample in order to make an inference to a population. – E.g., are the proportions of people choosing each brand different from one another? – Or, are the proportions of people choosing each brand different from some hypothetical values in the population?

4 Count data To answer such questions, we need to know approximately how much difference between the various counts could be produced by sampling error. We determine that quantity using the ‘multinomial probability distribution,’ an extension of the binomial probability distribution.

5 Properties of the Multinomial Experiment
1. There are $n$ identical trials.
2. There are $k$ possible outcomes on each trial.
3. The probabilities of the outcomes are the same across trials.
4. Trials are all independent of each other.
5. The multinomial random variables are the $k$ values $n_1, n_2, \ldots, n_k$.
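The properties above can be illustrated with a minimal Python sketch of the standard multinomial probability formula, $P(n_1, \ldots, n_k) = \frac{n!}{n_1! \cdots n_k!} p_1^{n_1} \cdots p_k^{n_k}$. The slides name the distribution but do not spell out this formula; the function name `multinomial_pmf` is an illustrative choice.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """Probability of observing exactly `counts` in n = sum(counts) independent trials."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)  # exact integer division at every step
    return coef * prod(p ** c for p, c in zip(probs, counts))

# Two categories reduce to the binomial: 2 heads in 4 fair coin tosses.
print(multinomial_pmf([2, 2], [0.5, 0.5]))  # 0.375

# Summing over every split of n = 4 trials into k = 3 categories gives 1
# (up to floating-point rounding).
total = sum(
    multinomial_pmf([a, b, 4 - a - b], [1 / 3, 1 / 3, 1 / 3])
    for a in range(5)
    for b in range(5 - a)
)
print(round(total, 10))
```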

6 Testing the null hypothesis We often want to test the null hypothesis that all the categories are equal in frequency. If we asked 60 people which of Brands A, B, and C they prefer, equal frequency would look like this:
A    B    C
20   20   20

7 Testing the null hypothesis At other times, we might want to test a specific null hypothesis, such as that B and C are equally popular, but A is twice as popular as either (proportions ½, ¼, ¼). For the same 60 people, that would look like this:
A    B    C
30   15   15
In both cases, we call the values shown the “expected values.”

8 Testing the null hypothesis The null hypothesis can be tested using the statistic $\chi^2$:
$$\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$
$\chi^2$ increases as the observed values, $n_i$, get further from the expected values, $E(n_i)$.
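As a minimal sketch, the statistic is a one-line sum in Python. The function name `chi_square_stat` and the second set of counts are illustrative assumptions, not values from the slides.

```python
def chi_square_stat(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Observed counts equal to the expected counts give a statistic of 0.
print(chi_square_stat([20, 20, 20], [20, 20, 20]))  # 0.0

# Hypothetical counts that stray from the expected values give a larger statistic.
print(chi_square_stat([10, 20, 30], [20, 20, 20]))  # 10.0
```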

9 Chi-square – example Suppose we want to know whether there is any population preference for brands of coffee among brands A, B, and C. We need to know two things: – How should choices among the brands be distributed in a sample if there is no preference (all are equally popular)? – How are choices distributed in our sample?

10 Chi-square – example We ask a sample of 90 people for their preference. If there is no preference, each brand should be chosen by ⅓ of the people asked:
A    B    C
30   30   30
These are the “expected values”: the values expected if the null hypothesis is true.

11 Chi-square – example We ask a sample of 90 people for their preference. The actual choices look like this:
A    B    C
15   42   33
These are the “observed values.”

12 Expected vs. Observed Values
                  A    B    C
Expected values   30   30   30   (each value = ⅓ × 90)
Observed values   15   42   33

13 Chi-square – example
$$\chi^2 = \sum_{i=1}^{k} \frac{[n_i - E(n_i)]^2}{E(n_i)}$$
$$\chi^2 = \frac{(15-30)^2}{30} + \frac{(42-30)^2}{30} + \frac{(33-30)^2}{30} = 12.6$$
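The arithmetic can be checked with a short Python sketch that uses the observed counts 15, 42, and 33 from the calculation above. The variable names are illustrative.

```python
observed = [15, 42, 33]           # counts for Brands A, B, C
n = sum(observed)                 # 90 people in the sample
expected = [n / 3] * 3            # no preference: one third of the sample each

# One term per brand: (observed - expected)^2 / expected
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(terms)

print(terms)              # [7.5, 4.8, 0.3]
print(round(chi_sq, 2))   # 12.6
```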

14 Chi-square – the formal hypothesis test
$H_0$: $P_A = P_B = P_C = \tfrac{1}{3}$
$H_A$: Something different: at least one $P > \tfrac{1}{3}$
Test statistic: $\chi^2 = \sum_i \frac{[n_i - E(n_i)]^2}{E(n_i)}$, with d.f. $= k - 1$, where $k$ = number of categories

15 Chi-square – the formal hypothesis test
Rejection region: $\chi^2_{\text{obt}} > \chi^2_{\text{crit}} = \chi^2(.05, 2) = 5.99$ (note: the rejection region is always $> \chi^2_{\text{crit}}$)
Decision: since $\chi^2_{\text{obt}} > \chi^2_{\text{crit}}$, reject $H_0$. The brands are not equally popular.
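For 2 degrees of freedom the critical value does not need a table: the upper-tail probability of a chi-square variable with 2 d.f. is $e^{-x/2}$, so the critical value is $-2\ln\alpha$. The sketch below relies on that closed form, which holds only for d.f. = 2; the function names are illustrative.

```python
import math

def chi2_crit_df2(alpha):
    """Critical value of chi-square with 2 d.f. (closed form, 2 d.f. only)."""
    return -2.0 * math.log(alpha)

def chi2_pvalue_df2(x):
    """Upper-tail probability of chi-square with 2 d.f."""
    return math.exp(-x / 2.0)

chi_sq_obt = 12.6                  # statistic from the coffee-brand example
crit = chi2_crit_df2(0.05)

print(round(crit, 2))              # 5.99
print(chi_sq_obt > crit)           # True, so H0 is rejected
print(chi2_pvalue_df2(chi_sq_obt) < 0.05)  # True
```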

16 Chi-square – Example 1 At a recent meeting of the Coin Flippers Society, each member flipped three coins simultaneously and the number of tails occurring was recorded. Shown below are the numbers of members who had certain numbers of tails. Is there evidence that the coin flipping outcomes were different from what would be expected if all the coins used were fair? (α = .01)
Number of Tails   Number of Members
0                 65
1                 182
2                 194
3                 59

17 Chi-square – Example 1 Shown below are the numbers of members who had certain numbers of tails. Number of tails = the categories people fall into. Number of members = number of people in each category. Number of members is the dependent variable. Do you see why?

18 Chi-square – Example 1 To begin, we need to compute the expected values for each of the categories. That is, we need to figure out how many of our 500 members would fall into each category if all the coins used were fair. Wait a minute! How do we know there are 500 members?

19 Chi-square – Example 1 At a recent meeting of the Coin Flippers Society, each member flipped three coins simultaneously and the number of tails occurring was recorded. Shown below are the numbers of members who had certain numbers of tails. Is there evidence that the coin flipping outcomes were different from what would be expected if all the coins used were fair? (α = .01)
Number of Tails   Number of Members
0                 65
1                 182
2                 194
3                 59
                  Σ = 500

20 Chi-square – Example 1 How many possible outcomes are there for one trial? HHH HHT HTH THH HTT THT TTH TTT. There are 8 possible outcomes.

21 Chi-square – Example 1 Of these eight possible outcomes:
– How many involve getting 0 tails? Just one: HHH.
– How many involve getting 1 tail? 3: HHT, HTH, THH.
– How many involve getting 2 tails? 3: HTT, THT, TTH.
– How many involve getting 3 tails? 1: TTT.
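The counting above can be reproduced by enumerating the outcomes in Python. This is a minimal sketch; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

# Every ordered outcome of flipping three coins: HHH, HHT, ..., TTT
outcomes = ["".join(flips) for flips in product("HT", repeat=3)]

# How many outcomes contain 0, 1, 2, or 3 tails
tails_counts = Counter(outcome.count("T") for outcome in outcomes)

# With fair coins every outcome is equally likely, so probability = count / 8
probs = {k: tails_counts[k] / len(outcomes) for k in sorted(tails_counts)}

print(len(outcomes))   # 8
print(probs)           # {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
```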

22 Chi-square – Example 1
$H_0$: $P_0 = .125$, $P_1 = .375$, $P_2 = .375$, $P_3 = .125$
$H_A$: At least one $P$ is different from the value specified in $H_0$.
Test statistic: $\chi^2 = \sum_i \frac{[n_i - E(n_i)]^2}{E(n_i)}$
Rejection region: $\chi^2_{\text{obt}} > \chi^2_{\text{crit}} = \chi^2(.01, 3) = 11.34$

23 Chi-square – Example 1 Now we compute the expected values using (a) the probabilities in $H_0$ and (b) our sample $n$:
$P_0 \times 500 = .125 \times 500 = 62.5$
$P_1 \times 500 = .375 \times 500 = 187.5$
$P_2 \times 500 = .375 \times 500 = 187.5$
$P_3 \times 500 = .125 \times 500 = 62.5$

24 Chi-square – Example 1
$$\chi^2 = \frac{[65-62.5]^2}{62.5} + \frac{[182-187.5]^2}{187.5} + \frac{[194-187.5]^2}{187.5} + \frac{[59-62.5]^2}{62.5} = 0.68$$
Decision: Do not reject. There is no evidence that the coin flipping outcomes were different from what would be expected if all the coins used were fair.
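The whole of Example 1 can be sketched in Python using the observed counts 65, 182, 194, and 59. For 3 degrees of freedom the upper-tail probability has the closed form $\operatorname{erfc}(\sqrt{x/2}) + \sqrt{2x/\pi}\,e^{-x/2}$, used here so that no statistics library is assumed; the helper name `chi2_sf_df3` is illustrative.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability of chi-square with 3 d.f. (closed form, 3 d.f. only)."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

observed = [65, 182, 194, 59]            # members with 0, 1, 2, 3 tails
probs = [0.125, 0.375, 0.375, 0.125]     # fair-coin probabilities under H0
n = sum(observed)                        # 500 members
expected = [p * n for p in probs]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi2_sf_df3(chi_sq)

print(expected)            # [62.5, 187.5, 187.5, 62.5]
print(round(chi_sq, 3))    # 0.683
print(p_value > 0.01)      # True, so H0 is not rejected at alpha = .01
```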

25 Chi-square – Example 2 There is an “old wives’ tale” that babies don’t tend to be born randomly during the day but tend more to be born in the middle of the night, specifically between the hours of 1 AM and 5 AM. To investigate this, a researcher collects birth-time data from a large maternity hospital. The day was broken into 4 parts: Morning (5 AM to 1 PM), Mid-day (1 PM to 5 PM), Evening (5 PM to 1 AM), and Mid-night (1 AM to 5 AM). The numbers of births at these times for the last three months (January to March) are shown on the next slide.

26 Chi-square – Example 2
Morning     110
Mid-day      50
Evening     100
Mid-night   100
Does it appear that births are not randomly distributed throughout the day? (α = .01)

27 Chi-square – Example 2 The critical thing about a chi-square question is usually the expected values. In the previous example, we computed the expected values on the basis of probabilities of various outcomes for a fair coin. In this question, expected values for the number of births in each segment of the day will be based on one variable: how long each segment is, in hours.

28 Chi-square – Example 2
Morning: 5 AM to 1 PM = 8 hours
Mid-day: 1 PM to 5 PM = 4 hours
Evening: 5 PM to 1 AM = 8 hours
Mid-night: 1 AM to 5 AM = 4 hours
These periods are not all equal in length!

29 Chi-square – Example 2 If time of day were irrelevant to when babies are born, we would expect every period of, say, 4 hours to produce the same number of babies. Since the Morning and Evening segments each contain two 4-hour periods and the Mid-day and Mid-night segments each contain one 4-hour period, our expected proportions will be:
Morning   Mid-day   Evening   Mid-night
1/3       1/6       1/3       1/6

30 Chi-square – Example 2 Our sample totals 360 babies. In 1/6 of a day (4 hours) we would expect 360/6 = 60 babies to be born, under the null hypothesis, giving these expected values for the four segments of the day:
Morning   Mid-day   Evening   Mid-night
120       60        120       60
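The step from segment lengths to expected counts can be sketched in Python. The segment lengths (8, 4, 8, and 4 hours) and the sample total of 360 come from the slides; the variable names are illustrative.

```python
from fractions import Fraction

hours = {"Morning": 8, "Mid-day": 4, "Evening": 8, "Mid-night": 4}
n_births = 360
hours_in_day = sum(hours.values())   # 24

# Under H0 the expected proportion is the segment's share of the day.
proportions = {seg: Fraction(h, hours_in_day) for seg, h in hours.items()}

# Expected count = proportion x sample size
expected = {seg: int(p * n_births) for seg, p in proportions.items()}

print(proportions["Morning"], proportions["Mid-day"])   # 1/3 1/6
print(expected)   # {'Morning': 120, 'Mid-day': 60, 'Evening': 120, 'Mid-night': 60}
```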

31 Chi-square – Example 2
$H_0$: $P_{\text{morn}} = 1/3$, $P_{\text{midday}} = 1/6$, $P_{\text{even}} = 1/3$, $P_{\text{midnight}} = 1/6$
$H_A$: At least one $P$ is different from the value specified in $H_0$.
Test statistic: $\chi^2 = \sum_i \frac{[n_i - E(n_i)]^2}{E(n_i)}$
Rejection region: $\chi^2_{\text{obt}} > \chi^2_{\text{crit}} = \chi^2(.05, 3) = 7.81$

32 Chi-square – Example 2
$$\chi^2_{\text{obt}} = \frac{[110-120]^2}{120} + \ldots + \frac{[100-60]^2}{60} = 32.5$$

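33 Chi-square – Example 2 $\chi^2_{\text{obt}} = 32.5$. Decision: since $32.5 > 7.81$, reject $H_0$. Births are not randomly scattered throughout the day. (The statistic also exceeds the α = .01 critical value, $\chi^2(.01, 3) = 11.34$.)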
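Example 2 can be checked end to end with a short Python sketch. The observed counts (110, 50, 100, 100) and segment lengths come from the slides; the closed-form tail probability applies only to 3 degrees of freedom, and the helper name `chi2_sf_df3` is illustrative.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability of chi-square with 3 d.f. (closed form, 3 d.f. only)."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)

observed = [110, 50, 100, 100]     # Morning, Mid-day, Evening, Mid-night
hours = [8, 4, 8, 4]               # length of each segment
n = sum(observed)                  # 360 births
expected = [n * h / sum(hours) for h in hours]

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi2_sf_df3(chi_sq)

print(expected)             # [120.0, 60.0, 120.0, 60.0]
print(round(chi_sq, 2))     # 32.5
print(chi_sq > 7.81)        # True: reject H0 at alpha = .05
print(p_value < 0.01)       # True: the result also holds at alpha = .01
```

Most of the statistic comes from the Mid-night segment, whose term is (100 − 60)² / 60 ≈ 26.67.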