Lecture 3. The Multinomial Distribution
Outline for Today
1. Definition
2. Examples
3. Statistical Applications
4. Basic Properties
5. Estimation
6. Testing Simple Hypotheses
2/5/2019 SA3202, Lecture 3
Definition

Multi-outcome Trial: a trial or experiment that has k possible outcomes A1, A2, …, Ak, which occur with probabilities p1, p2, …, pk respectively. That is, P(Aj) = pj, j = 1, 2, …, k, with p1 + p2 + … + pk = 1.
e.g. A student's grade in a course may be A, B, C, D, or F (5 possible outcomes).

Multi-value R.V.: a r.v. that takes k possible values. Let X = j if Aj happens; then X is a multi-value r.v.
e.g. Let X be the GPA point for the grades A, B, C, D, and F respectively. Then P(X=4) = P(A), P(X=0) = P(F).
Multivariate/Multi-dimensional R.V.: a random variable that has several components, X = (X1, X2, …, Xk)'.

Multinomial R.V.: Let Xj be the number of times that outcome Aj occurs in n independent repetitions of a multi-outcome trial, j = 1, 2, …, k. Then X = (X1, X2, …, Xk)' is a multinomial r.v.

Multinomial Distribution: the distribution of a multinomial r.v.,

$$P(X = l) = \frac{n!}{l_1!\, l_2! \cdots l_k!}\; p_1^{l_1} p_2^{l_2} \cdots p_k^{l_k}, \qquad l = (l_1, l_2, \ldots, l_k)', \quad l_1 + l_2 + \cdots + l_k = n.$$

Denoted X ~ M(n; p1, p2, …, pk), with index n and parameters p1, p2, …, pk.
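The probability formula for M(n; p1, …, pk) can be evaluated directly. The following is a minimal Python sketch, not part of the original slides; the function name `multinomial_pmf` and the probabilities (0.5, 0.3, 0.2) are assumptions chosen only for illustration.

```python
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """Return P(X1=l1, ..., Xk=lk) for X ~ M(n; p1, ..., pk), where n = sum(counts)."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    # Multinomial coefficient n! / (l1! l2! ... lk!), kept as an exact integer.
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # the division is exact at every step
    return coef * prod(p ** c for p, c in zip(probs, counts))


# Hypothetical illustration: k = 3 outcomes, n = 4 trials, counts (2, 1, 1).
example = multinomial_pmf([2, 1, 1], [0.5, 0.3, 0.2])  # 12 * 0.5^2 * 0.3 * 0.2
print(example)
```

With these illustrative probabilities the result is approximately 0.18, that is, 12 × 0.5² × 0.3 × 0.2.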
Examples

Example 1: Consider an experiment with k = 3 possible outcomes, A, B, and C, with probabilities p1, p2, and p3 respectively. Suppose the experiment is repeated n = 4 times. What is the probability that A appears twice, B once, and C once?

Experiment    Probability
A A B C       p1p1p2p3
A A C B       p1p1p3p2
B A A C       p2p1p1p3
C A A B       p3p1p1p2
B C A A       p2p3p1p1
C B A A       p3p2p1p1
…             …

The number of possible orderings is 4!/(2!1!1!) = 12, each with probability p1^2p2p3. Thus P(A twice, B once, C once) = 12p1^2p2p3.
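The count of 12 orderings in Example 1 can be checked by brute force. This short Python sketch (an illustration added here, with variable names that are not from the lecture) lists every distinct arrangement of the letters A, A, B, C and compares the count with 4!/(2!1!1!).

```python
from itertools import permutations
from math import factorial

# All distinct orderings of two A's, one B, and one C.
# The set removes duplicates that arise from swapping the two A's.
arrangements = sorted(set(permutations("AABC")))

count = len(arrangements)
formula = factorial(4) // (factorial(2) * factorial(1) * factorial(1))

for seq in arrangements:
    print(" ".join(seq))
print(count, formula)
```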
Example 2: Suppose a die is thrown 20 times. Let Xj denote the number of times that the number "j" appears. Then X = (X1, X2, …, X6)' ~ M(20; p1, p2, …, p6), where pj = 1/6 for all j if the die is fair.

Example 3: Suppose 100 random digits are generated. Let Xj denote the number of times that the digit "j" is obtained. Then X = (X0, X1, X2, …, X9)' ~ M(100; p0, p1, …, p9), where pj = 1/10 for all j if the digits are uniformly distributed.

Example 4: Suppose a pair of coins is tossed 50 times. Let X1, X2, X3 denote the number of times that HH (two heads), TT (two tails), and HT (one head and one tail) appear, respectively. Then X = (X1, X2, X3)' ~ M(50; p1, p2, p3), where p1 = 1/4, p2 = 1/4, p3 = 1/2 if both coins are fair and tossed independently.
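To make the die example concrete, one can simulate a multinomial count vector. The sketch below is an illustration only: the helper `sample_multinomial`, the fair-die probabilities, and the seed are all assumptions, not part of the lecture.

```python
import random
from collections import Counter


def sample_multinomial(n, probs, rng):
    """Draw one count vector (X1, ..., Xk) from n independent multi-outcome trials."""
    draws = rng.choices(range(len(probs)), weights=probs, k=n)
    tally = Counter(draws)
    return [tally[j] for j in range(len(probs))]  # Counter gives 0 for unseen outcomes


rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
die_counts = sample_multinomial(20, [1 / 6] * 6, rng)  # assumes a fair die
print(die_counts, sum(die_counts))
```

Whatever counts are drawn, the components always sum to the index n = 20.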
Example 5: The Number of Boys Data. The following table shows the number of boys among the first 4 children in 3343 Swedish families of size 4 or more.

Number of boys    0    1    2    3    4    Total
Frequency         …    …    …    …    …    3343

Let Xj, j = 0, 1, 2, 3, 4, be the number of families with j boys among the first four children in the 3343 families, and let pj, j = 0, 1, 2, 3, 4, denote the associated probabilities. Then
X = (X0, X1, X2, X3, X4)' ~ M(3343; p0, p1, p2, p3, p4).

Under the usual assumptions (births are independent and each child is a boy with probability 1/2), the number of boys, Y say, follows a binomial distribution, Y ~ Binom(4, 1/2). Thus the probabilities are

$$p_j = P(Y = j) = \binom{4}{j} \left(\frac{1}{2}\right)^{4}, \qquad j = 0, 1, 2, 3, 4,$$

and the distribution of X is X ~ M(3343; 1/16, 4/16, 6/16, 4/16, 1/16).
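The binomial cell probabilities for the Number of Boys Data, and the family counts they imply for 3343 families, can be computed exactly. This is a small Python sketch for illustration; the names `boys_probs` and `expected_families` are made up here.

```python
from fractions import Fraction
from math import comb

n_families = 3343   # from the example
m = 4               # first four children

# p_j = C(4, j) * (1/2)^4 under Y ~ Binom(4, 1/2), kept as exact fractions.
boys_probs = [Fraction(comb(m, j), 2 ** m) for j in range(m + 1)]

# Expected number of families with j boys: 3343 * p_j.
expected_families = [float(n_families * p) for p in boys_probs]

for j, (p, e) in enumerate(zip(boys_probs, expected_families)):
    print(j, p, e)
```

The probabilities print in lowest terms (1/16, 1/4, 3/8, 1/4, 1/16), which equal 1/16, 4/16, 6/16, 4/16, 1/16.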
Statistical Applications
Suppose that each member of a population can be classified into one of k categories (cells):

Category       1    2    3    …    k
Probability    p1   p2   p3   …    pk

A random sample of size n is drawn from the population. Let Xj be the number of sample units in the j-th category. Then X = (X1, X2, …, Xk)' ~ M(n; p1, p2, …, pk).

Example: According to recent census figures, the proportions of adults in the US associated with 5 age categories were

Age            …
Probability    …

If 5 adults are drawn at random, then the probability that the sample would contain 1 person from the … age group, 2 from the … age group, and 2 from the … age group is …
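The general shape of the census calculation follows from the multinomial formula. In the sketch below, p_a, p_b, and p_c are placeholder symbols (not values from the lecture) for the probabilities of the three age groups that appear in the sample; the two remaining groups have count 0 and contribute a factor of 1.

```latex
P(\text{1 from group } a,\ \text{2 from group } b,\ \text{2 from group } c)
  = \frac{5!}{1!\,2!\,2!\,0!\,0!}\; p_a^{1}\, p_b^{2}\, p_c^{2}
  = 30\, p_a\, p_b^{2}\, p_c^{2}
```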
Some Basic Properties

The (marginal) distribution of Xj is binomial: Xj ~ Binom(n; pj). Thus
E(Xj) = npj, Var(Xj) = npj(1-pj).
Moreover, for j ≠ l,
Cov(Xj, Xl) = -npjpl.
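These moment formulas can be verified numerically for a small case by summing over every possible count vector. The following Python sketch is an illustration under assumed values (n = 4 and probabilities 0.5, 0.3, 0.2 are hypothetical, as are the function names).

```python
from math import factorial, prod


def outcomes(n, k):
    """Yield every k-tuple of non-negative integers that sums to n."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in outcomes(n - first, k - 1):
            yield (first,) + rest


def pmf(counts, probs):
    """Multinomial probability of one count vector."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    return coef * prod(p ** c for p, c in zip(probs, counts))


def exact_moments(n, probs, j, l):
    """Return E(Xj), Var(Xj), Cov(Xj, Xl) by exhaustive summation."""
    e_j = e_l = e_jj = e_jl = 0.0
    for x in outcomes(n, len(probs)):
        w = pmf(x, probs)
        e_j += w * x[j]
        e_l += w * x[l]
        e_jj += w * x[j] ** 2
        e_jl += w * x[j] * x[l]
    return e_j, e_jj - e_j ** 2, e_jl - e_j * e_l


n, probs = 4, [0.5, 0.3, 0.2]  # hypothetical values
mean_1, var_1, cov_12 = exact_moments(n, probs, 0, 1)
print(mean_1, var_1, cov_12)  # compare with n*p1, n*p1*(1-p1), -n*p1*p2
```

For these values the formulas give n·p1 = 2, n·p1(1-p1) = 1, and -n·p1·p2 = -0.6, which the enumeration reproduces up to floating-point rounding.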
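Estimation

The natural estimator of pj is $\hat{p}_j = X_j / n$, with

Mean: $E(\hat{p}_j) = p_j$

Variance: $\mathrm{Var}(\hat{p}_j) = \dfrac{p_j (1 - p_j)}{n}$

Standard Error: $\mathrm{SE}(\hat{p}_j) = \sqrt{\dfrac{\hat{p}_j (1 - \hat{p}_j)}{n}}$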
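As a rough sketch of how the estimates and their standard errors might be computed from observed counts, consider the Python snippet below. The counts (18, 22, 60) are invented for illustration and `estimate_proportions` is a hypothetical helper name.

```python
from math import sqrt


def estimate_proportions(counts):
    """Return a list of (p_hat_j, standard_error_j) pairs from cell counts."""
    n = sum(counts)
    results = []
    for x in counts:
        p_hat = x / n                          # natural estimator Xj / n
        se = sqrt(p_hat * (1 - p_hat) / n)     # plug-in standard error
        results.append((p_hat, se))
    return results


observed = [18, 22, 60]  # hypothetical counts, n = 100
estimates = estimate_proportions(observed)
for p_hat, se in estimates:
    print(round(p_hat, 3), round(se, 4))
```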
Testing Simple Hypotheses
The usual procedure for testing hypotheses about the parameters of a multinomial distribution is to compare the observed frequencies with their expected values under the hypothesis.

Consider testing the simple hypothesis
H0: pj = pj*, j = 1, 2, …, k,
where the pj* are some given, reasonable values.

Under H0, the expected and observed frequencies are
mj* = npj* (expected), Xj (observed), j = 1, 2, …, k.

When H0 is true, the expected frequencies mj* should be close to the observed frequencies Xj for j = 1, 2, …, k, or equivalently, the hypothetical (population) proportions should be close to the observed (sample) proportions.
H0 can be tested by Pearson's Goodness of Fit Test Statistic

$$T = \sum_{j=1}^{k} \frac{(X_j - m_j^*)^2}{m_j^*}$$

or by Wilks' Likelihood Ratio Test Statistic

$$2 \sum_{j=1}^{k} X_j \log \frac{X_j}{m_j^*}.$$

Under H0 and for large n, both test statistics have approximately chi-square distributions. For a simple hypothesis such as the equi-probability model, the degrees of freedom are
df = k-1, k = the number of categories.

Note the effect of the sample size n: the larger n is, the easier it is to detect small differences.
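Both statistics compare the observed frequencies Xj with the expected frequencies mj* = npj*. A minimal Python sketch follows, assuming the standard forms of the Pearson and likelihood ratio statistics; the function names and the counts (12, 8, 10, 10) are hypothetical.

```python
from math import log


def pearson_statistic(observed, probs):
    """Sum over cells of (Xj - mj*)^2 / mj*, with mj* = n * pj*."""
    n = sum(observed)
    return sum((x - n * p) ** 2 / (n * p) for x, p in zip(observed, probs))


def likelihood_ratio_statistic(observed, probs):
    """2 * sum over cells of Xj * log(Xj / mj*); empty cells contribute 0."""
    n = sum(observed)
    return 2 * sum(x * log(x / (n * p)) for x, p in zip(observed, probs) if x > 0)


observed = [12, 8, 10, 10]        # hypothetical counts, n = 40
h0_probs = [0.25] * 4             # equi-probability model, so df = 4 - 1 = 3
pearson = pearson_statistic(observed, h0_probs)
wilks = likelihood_ratio_statistic(observed, h0_probs)
print(round(pearson, 4), round(wilks, 4))
```

For these made-up counts the two statistics are close to each other (about 0.80 and 0.81), as is typical when the observed and expected frequencies do not differ much.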
Examples

Example 1: Consider the Random Numbers Data again (100 random digits). The null hypothesis is
H0: pj = 0.1, j = 0, 1, 2, …, 9.
The expected frequencies under H0 are mj* = 100 × 0.1 = 10 for all j. The computed Pearson's Goodness of Fit Test Statistic is T = 9.4 with 10-1 = 9 df. This is below the 5% chi-square critical value of 16.92 for 9 df, so H0 is not rejected. That is, the data give no evidence against the calculator's random number generator.

Example 2: Consider the Number of Boys Data. A simple hypothesis is that the number of boys among the first 4 children follows a binomial distribution Binom(4; 1/2). That is,
H0: p0 = 1/16, p1 = 4/16, p2 = 6/16, p3 = 4/16, p4 = 1/16.
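For the random digits test, the chi-square tail probability of T = 9.4 with 9 df can be computed with the standard library alone. The sketch below is an added illustration: `chi2_sf` is a hypothetical helper that uses the recurrence Q(a+1, y) = Q(a, y) + y^a e^(-y) / Γ(a+1) for the upper regularized incomplete gamma function, starting from Q(1/2, y) = erfc(√y) for odd df or Q(1, y) = e^(-y) for even df.

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(x, df):
    """Upper tail probability P(chi-square with df degrees of freedom > x), x >= 0."""
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, exp(-y)          # Q(1, y)
    else:
        a, q = 0.5, erfc(sqrt(y))    # Q(1/2, y)
    # Step a up to df / 2 using Q(a+1, y) = Q(a, y) + y^a e^(-y) / Gamma(a+1).
    while a < df / 2.0 - 1e-9:
        q += y ** a * exp(-y) / gamma(a + 1.0)
        a += 1.0
    return q


T, df = 9.4, 9                      # values from Example 1
p_value = chi2_sf(T, df)
print(round(p_value, 3))            # a large p-value means H0 is not rejected
```

The resulting tail probability is roughly 0.40, far above the usual 0.05 level, which agrees with not rejecting H0 for the random digits.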