Lecture 3. The Multinomial Distribution

Presentation transcript:

SA3202, Lecture 3 (2/5/2019)

Outline for Today
1. Definition
2. Examples
3. Statistical Applications
4. Basic Properties
5. Estimation
6. Testing Simple Hypotheses

Definition

Multi-outcome trial: a trial or experiment that has k possible outcomes A1, A2, …, Ak, which occur with probabilities p1, p2, …, pk respectively, where p1 + p2 + … + pk = 1. That is, P(Aj) = pj, j = 1, 2, …, k.
e.g. A student's grade in a course may be A, B, C, D, or F (5 possible outcomes).

Multi-value r.v.: a r.v. that takes k possible values. Let X = j if Aj happens; then X is a multi-value r.v.
e.g. Let X be the GPA points assigned to the grades A, B, C, D, and F. Then P(X=4) = P(A), …, P(X=0) = P(F).

Multinomial Distribution

Multivariate (multi-dimensional) r.v.: a random variable with several components, X=(X1,X2,…,Xk)'.

Multinomial r.v.: Let Xj be the number of times that outcome Aj occurs in n independent repetitions of a multi-outcome trial, j=1,2,…,k. Then X=(X1,X2,…,Xk)' is a multinomial r.v. Note that X1+X2+…+Xk = n.

Multinomial distribution: the distribution of a multinomial r.v. For l=(l1,l2,…,lk)' with l1+l2+…+lk = n,

$$P(X = l) = \frac{n!}{l_1!\, l_2! \cdots l_k!}\; p_1^{l_1} p_2^{l_2} \cdots p_k^{l_k}.$$

This is denoted X~M(n; p1,p2,…,pk), with index n and parameters p1,p2,…,pk.
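As a minimal sketch, the multinomial probability n!/(l1! l2! … lk!) · p1^l1 · p2^l2 · … · pk^lk can be evaluated directly in Python. The function name `multinomial_pmf` and the probabilities 0.5, 0.3, 0.2 are illustrative assumptions, not part of the lecture.

```python
from math import factorial, prod

def multinomial_pmf(counts, probs):
    """P(X = counts) for X ~ M(n; probs), where n = sum(counts)."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    n = sum(counts)
    # Multinomial coefficient n! / (l1! l2! ... lk!)
    coef = factorial(n)
    for l in counts:
        coef //= factorial(l)
    # Product of p_j ** l_j over the k categories
    return coef * prod(p ** l for p, l in zip(probs, counts))

# Illustrative probabilities (an assumption, not from the lecture)
example_probs = [0.5, 0.3, 0.2]
p_211 = multinomial_pmf([2, 1, 1], example_probs)
print(f"P(X = (2, 1, 1)) = {p_211:.4f}")
```

The coefficient is built with integer division so that it stays exact even for large n; only the final product uses floating point.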

Examples

Example 1: Consider an experiment with k=3 possible outcomes, A, B, and C, with probabilities p1, p2, and p3 respectively. Suppose the experiment is repeated n=4 times. What is the probability that A appears twice, B once, and C once?

Trial:   1  2  3  4     Probability
         A  A  B  C     p1 p1 p2 p3
         A  A  C  B     p1 p1 p3 p2
         B  A  A  C     p2 p1 p1 p3
         C  A  A  B     p3 p1 p1 p2
         B  C  A  A     p2 p3 p1 p1
         C  B  A  A     p3 p2 p1 p1
         …

The number of possible orderings is 4!/(2!1!1!) = 12, each with probability p1^2 p2 p3. Thus P(A twice, B once, C once) = 12 p1^2 p2 p3.
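The count of 12 orderings in Example 1 can be checked by brute force. The sketch below lists every distinct arrangement of two A's, one B, and one C, then sums their probabilities; the numerical values chosen for p1, p2, p3 are illustrative assumptions.

```python
from itertools import permutations

# All distinct orderings of two A's, one B and one C over n = 4 trials
arrangements = sorted(set(permutations("AABC")))
num_arrangements = len(arrangements)

# Illustrative probabilities for A, B, C (an assumption, not from the lecture)
p = {"A": 0.5, "B": 0.3, "C": 0.2}

# Add up the probability of each ordering; every term equals p1^2 * p2 * p3
total = 0.0
for seq in arrangements:
    prob = 1.0
    for outcome in seq:
        prob *= p[outcome]
    total += prob

print(num_arrangements, total)
```

The sum over orderings agrees with 12 · p1^2 · p2 · p3 for the chosen values, which is the multinomial coefficient argument used in the example.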

Example 2: Suppose a die is thrown 20 times. Let Xj denote the number of times that the number "j" appears. Then X=(X1,X2,…,X6)'~M(20; p1,p2,…,p6), with pj = 1/6 for all j if the die is fair.

Example 3: Suppose 100 random digits are generated. Let Xj denote the number of times that the digit "j" is obtained. Then X=(X0,X1,X2,…,X9)'~M(100; p0,p1,…,p9), with pj = 1/10 for all j if the digits are truly random.

Example 4: Suppose a pair of coins is tossed 50 times. Let X1, X2, X3 denote the number of times that HH (two heads), TT (two tails), and HT (one head and one tail, in either order) appear, respectively. Then X=(X1,X2,X3)'~M(50; p1,p2,p3), with (p1,p2,p3) = (1/4, 1/4, 1/2) if both coins are fair.

Example 5: The Number of Boys Data. The following table shows the number of boys among the first 4 children in 3343 Swedish families of size 4 or more.

Number of boys   0     1     2      3     4     Total
Frequency        183   789   1250   875   246   3343

Let Xj, j=0,1,2,3,4, be the number of families with j boys among the first four children in the 3343 families, and let pj, j=0,1,2,3,4, denote the associated probabilities. Then

X=(X0,X1,X2,X3,X4)'~M(3343; p0,p1,p2,p3,p4).

Under the usual assumption (each child is a boy with probability 1/2, independently of the others), the number of boys, Y say, follows a binomial distribution, Y~Binom(4, 1/2). Thus the probabilities are

$$p_j = P(Y = j) = \binom{4}{j}\left(\tfrac{1}{2}\right)^{4}, \quad j = 0, 1, 2, 3, 4,$$

that is, p0=1/16, p1=4/16, p2=6/16, p3=4/16, p4=1/16. Thus the distribution of X is

X~M(3343; 1/16, 4/16, 6/16, 4/16, 1/16).
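To make the binomial model for the Number of Boys data concrete, the following sketch computes the cell probabilities pj = C(4, j)(1/2)^4 exactly and the frequencies n·pj that the model would predict. The variable names are illustrative; the counts are those in the table.

```python
from fractions import Fraction
from math import comb

n_families = 3343
observed = [183, 789, 1250, 875, 246]  # families with 0, 1, 2, 3, 4 boys

# p_j = P(Y = j) for Y ~ Binom(4, 1/2), kept exact with Fraction
cell_probs = [comb(4, j) * Fraction(1, 2) ** 4 for j in range(5)]

# Frequencies n * p_j predicted by the binomial model
expected = [float(n_families * p) for p in cell_probs]

for j, (obs, e) in enumerate(zip(observed, expected)):
    print(f"{j} boys: observed {obs:5d}, model frequency {e:9.3f}")
```

Using `Fraction` keeps the probabilities as exact sixteenths, so they can be compared directly with the values 1/16, 4/16, 6/16, 4/16, 1/16.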

Statistical Applications

Suppose that each member of a population can be classified into one of k categories (cells):

Category      1    2    3    …   k
Probability   p1   p2   p3   …   pk

A random sample of size n is drawn from the population. Let Xj be the number of sample units in the j-th category. Then X=(X1,X2,…,Xk)'~M(n; p1,p2,…,pk).

Example: According to recent census figures, the proportions of adults in the US in 5 age categories were

Age           18-24   25-34   35-44   45-64   65+
Probability   .18     .23     .16     .27     .16

If 5 adults are drawn at random, then the probability that the sample contains 1 person from the 18-24 age group, 2 from the 25-34 age group, and 2 from the 45-64 age group is

$$\frac{5!}{1!\,2!\,0!\,2!\,0!}\,(.18)^{1}(.23)^{2}(.16)^{0}(.27)^{2}(.16)^{0} = 30 \times .18 \times .0529 \times .0729 \approx .0208.$$
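The age-group probability can be reproduced numerically. This is a short sketch that evaluates the multinomial probability for the counts (1, 2, 0, 2, 0) using the census proportions in the table; the variable names are illustrative.

```python
from math import factorial

probs = [0.18, 0.23, 0.16, 0.27, 0.16]  # age-group proportions from the table
counts = [1, 2, 0, 2, 0]                # sample composition asked about

# Multinomial coefficient 5! / (1! 2! 0! 2! 0!)
coef = factorial(sum(counts))
for l in counts:
    coef //= factorial(l)

# Multiply by p_j ** l_j for each age group
prob = float(coef)
for p, l in zip(probs, counts):
    prob *= p ** l

print(f"coefficient = {coef}, probability = {prob:.4f}")
```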

Some Basic Properties

The (marginal) distribution of Xj is binomial: Xj~Binom(n; pj). Thus

E(Xj) = n pj,   Var(Xj) = n pj (1-pj).

Moreover, for j ≠ l,

Cov(Xj, Xl) = -n pj pl.
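These moment formulas can be checked exactly for a small case by summing over every possible count vector. The sketch below does this for n = 4 and three categories; the values of n and the probabilities are illustrative assumptions, and `pmf` and `expect` are hypothetical helper names.

```python
from itertools import product
from math import factorial

n, probs = 4, [0.5, 0.3, 0.2]  # small illustrative case (assumed values)
k = len(probs)

def pmf(counts):
    """Multinomial probability of one count vector."""
    coef = factorial(n)
    term = 1.0
    for l, p in zip(counts, probs):
        coef //= factorial(l)
        term *= p ** l
    return coef * term

# Every count vector (l1, ..., lk) with l1 + ... + lk = n
support = [c for c in product(range(n + 1), repeat=k) if sum(c) == n]

def expect(f):
    """Exact expectation of f(X) under M(n; probs)."""
    return sum(f(c) * pmf(c) for c in support)

mean_1 = expect(lambda c: c[0])                       # compare with n * p1
var_1 = expect(lambda c: c[0] ** 2) - mean_1 ** 2     # compare with n * p1 * (1 - p1)
cov_12 = expect(lambda c: c[0] * c[1]) - mean_1 * expect(lambda c: c[1])  # -n * p1 * p2

print(mean_1, var_1, cov_12)
```

The negative covariance reflects the constraint X1+…+Xk = n: a larger count in one category leaves fewer trials for the others.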

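Estimation

The natural estimator of pj is the sample proportion

$$\hat{p}_j = \frac{X_j}{n},$$

with mean

$$E(\hat{p}_j) = p_j,$$

variance

$$\mathrm{Var}(\hat{p}_j) = \frac{p_j(1-p_j)}{n},$$

and standard error

$$\mathrm{SE}(\hat{p}_j) = \sqrt{\frac{\hat{p}_j(1-\hat{p}_j)}{n}}.$$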
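As an illustration of estimation, the sample proportions Xj/n and their estimated standard errors sqrt(p̂j(1-p̂j)/n) can be computed for the Number of Boys data from Example 5. This is a sketch only; the variable names are illustrative.

```python
from math import sqrt

observed = [183, 789, 1250, 875, 246]  # families with 0, 1, 2, 3, 4 boys
n = sum(observed)

# Sample proportions and their estimated standard errors
p_hats = [x / n for x in observed]
std_errors = [sqrt(p * (1 - p) / n) for p in p_hats]

for j, (p, se) in enumerate(zip(p_hats, std_errors)):
    print(f"j = {j}: p_hat = {p:.4f}, SE = {se:.4f}")
```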

Testing Simple Hypotheses

The usual procedure for testing hypotheses about the parameters of a multinomial distribution is to compare the observed frequencies with their expected values under the hypothesis.

Consider testing the simple hypothesis

H0: pj = pj*, j=1,2,…,k,

where the pj* are given values that sum to 1. Under H0, the expected and observed frequencies are

mj* = n pj* (expected),   Xj (observed),   j=1,2,…,k.

When H0 is true, the expected frequencies mj* should be close to the observed frequencies Xj for j=1,2,…,k. Equivalently, the hypothesized (population) proportions pj* should be close to the observed (sample) proportions Xj/n.

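H0 can be tested by Pearson's goodness-of-fit test statistic

$$T = \sum_{j=1}^{k} \frac{(X_j - m_j^*)^2}{m_j^*},$$

or by Wilks' likelihood ratio test statistic

$$G = 2 \sum_{j=1}^{k} X_j \log\frac{X_j}{m_j^*}.$$

When H0 is true and n is large, both test statistics have approximately a chi-square distribution with df = k-1 degrees of freedom, where k is the number of categories (this includes the equi-probability model pj* = 1/k). H0 is rejected for large values of the statistic.

Note the effect of the sample size n: for the same sample proportions, both statistics grow in proportion to n, so the larger n is, the easier it is to detect a small difference.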
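The two statistics, Pearson's sum of (Xj - mj*)^2 / mj* and the likelihood ratio 2·sum of Xj·log(Xj / mj*), can be sketched in a few lines, which also makes the sample-size effect visible. The function names and the counts below are hypothetical, chosen only for illustration.

```python
from math import log

def pearson_statistic(observed, probs0):
    """Sum over cells of (X_j - m_j)^2 / m_j with m_j = n * p_j under H0."""
    n = sum(observed)
    return sum((x - n * p) ** 2 / (n * p) for x, p in zip(observed, probs0))

def likelihood_ratio_statistic(observed, probs0):
    """2 * sum of X_j * log(X_j / m_j); cells with X_j = 0 contribute 0."""
    n = sum(observed)
    return 2 * sum(x * log(x / (n * p)) for x, p in zip(observed, probs0) if x > 0)

# Hypothetical counts (not from the lecture), tested against equal probabilities
probs0 = [0.25, 0.25, 0.25, 0.25]
small = [27, 23, 26, 24]
large = [10 * x for x in small]  # same sample proportions, 10 times the sample size

t_small = pearson_statistic(small, probs0)
t_large = pearson_statistic(large, probs0)
g_small = likelihood_ratio_statistic(small, probs0)
g_large = likelihood_ratio_statistic(large, probs0)

print(t_small, t_large)
print(g_small, g_large)
```

With the sample proportions held fixed, multiplying n by 10 multiplies both statistics by 10, while the degrees of freedom stay at k-1. The same relative departure from H0 therefore becomes easier to detect as n grows.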

Examples

Example 1: Consider the Random Numbers Data again. The null hypothesis is H0: pj = .1, j=0,1,2,…,9. The expected frequencies under H0 are mj* = 100 × .1 = 10 for all j. The computed Pearson goodness-of-fit statistic is T = 9.4 with 10-1 = 9 df. This is below the 5% critical value of the chi-square distribution with 9 df (16.92), so H0 is not rejected. That is, the data give no evidence against the calculator's random number generator.

Example 2: Consider the Number of Boys Data. A simple hypothesis is that the number of boys among the first 4 children follows the binomial distribution Binom(4; 1/2). That is,

H0: p0=1/16, p1=4/16, p2=6/16, p3=4/16, p4=1/16.
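For the Number of Boys hypothesis, both test statistics can be computed from the observed frequencies 183, 789, 1250, 875, 246. The sketch below assumes a 5% significance level and uses the standard table value 9.488 for the chi-square distribution with 4 df; both choices are assumptions made for illustration rather than values stated in the lecture.

```python
from math import log

observed = [183, 789, 1250, 875, 246]           # families with 0..4 boys
probs0 = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]  # H0: Binom(4, 1/2) cell probabilities

n = sum(observed)
expected = [n * p for p in probs0]

# Pearson goodness-of-fit statistic
T = sum((x - m) ** 2 / m for x, m in zip(observed, expected))

# Likelihood ratio statistic
G = 2 * sum(x * log(x / m) for x, m in zip(observed, expected))

df = len(observed) - 1
critical_5pct = 9.488  # upper 5% point of chi-square with 4 df (table value, assumed level)

reject = T > critical_5pct
print(f"T = {T:.2f}, G = {G:.2f}, df = {df}, reject H0 at 5%: {reject}")
```

With these counts the Pearson statistic comes out at roughly 14.3, which exceeds 9.488, so under the assumed 5% level the Binom(4; 1/2) hypothesis would be rejected.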