Multinomial Distribution

Multinomial Distribution. Topics: multinomial coefficients; definition; marginals are binomial; maximum likelihood; hypothesis tests.

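Multinomial Coefficient: From n objects, the number of ways to choose n_1 of type 1, n_2 of type 2, …, n_k of type k (with n_1 + … + n_k = n) is $\binom{n}{n_1 \cdots n_k} = \frac{n!}{n_1! \, n_2! \cdots n_k!}$.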

Of 30 graduating students, how many ways are there for 15 to be employed in a job related to their field of study, 10 to be employed in a job unrelated to their field of study, and 5 unemployed?
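A minimal sketch of this count in R, assuming the standard formula n!/(n_1! n_2! … n_k!) for the multinomial coefficient. The variable names are illustrative and not from the slides.

```r
# Counts for the three groups: related job, unrelated job, unemployed.
counts <- c(15, 10, 5)

# Multinomial coefficient: n! / (n1! * n2! * n3!)
n_ways <- factorial(sum(counts)) / prod(factorial(counts))

# Same count built in two steps: choose the 15 with related jobs,
# then choose the 10 with unrelated jobs from the remaining 15.
n_ways_stepwise <- choose(30, 15) * choose(15, 10)

print(format(n_ways_stepwise, big.mark = ","))
```

Both routes give 465,817,912,560 ways; the stepwise version shows why the multinomial coefficient is a product of binomial coefficients.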

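Multinomial Distribution: A statistical experiment with k outcomes is repeated independently n times. Pr(Outcome j) = p_j, j = 1, …, k. The number of times outcome j occurred is x_j, j = 1, …, k. This is a multivariate distribution, with probability mass function $P(x_1, \ldots, x_k) = \binom{n}{x_1 \cdots x_k} \, p_1^{x_1} \cdots p_k^{x_k}$, where $\sum_{j} x_j = n$ and $\sum_{j} p_j = 1$.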

But the counts are constrained to sum to n: if one x_j = n, all the others are zero.

Marginals are also multinomial. This is too messy; students are not responsible for it. Using the binomial theorem …

Observe: Adding over x_{k-1} throws it into the “leftover” category. The labels 1, …, k are arbitrary, so this means you can combine any 2 categories and the result is still multinomial. k is arbitrary, so you can keep doing it and combine any number of categories. When only two categories are left, the result is binomial: E(x_j) = n p_j, Var(x_j) = n p_j (1 - p_j). You are responsible for these IMPLICATIONS of the last slide.
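The claim that each marginal is binomial can be checked numerically. The sketch below uses hypothetical values (n = 6 and p = (0.5, 0.3, 0.2), not from the slides): it sums the joint multinomial probabilities over the other two categories and compares the result with the Binomial(n, p_1) probabilities.

```r
# Hypothetical example values (not from the slides).
n <- 6
p <- c(0.5, 0.3, 0.2)

# Marginal pmf of x_1: for each value of x1, add the joint probabilities
# over every split of the remaining n - x1 trials between categories 2 and 3.
marginal_x1 <- sapply(0:n, function(x1) {
  rest <- n - x1
  sum(sapply(0:rest, function(x2) dmultinom(c(x1, x2, rest - x2), prob = p)))
})

# Binomial(n, p_1) pmf for comparison.
binomial_x1 <- dbinom(0:n, size = n, prob = p[1])
print(all.equal(marginal_x1, binomial_x1))

# Mean and variance computed directly from the marginal pmf.
mean_x1 <- sum(0:n * marginal_x1)
var_x1  <- sum((0:n - mean_x1)^2 * marginal_x1)
```

The computed mean and variance should agree with n p_1 and n p_1 (1 - p_1) up to floating-point error.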

Sample problem: P(Job related to field of study) = 0.60, P(Job unrelated to field of study) = 0.30, P(No job) = 0.10. Of 30 randomly chosen students, what is the probability that 15 are employed in a job related to their field of study, 10 are employed in a job unrelated to their field of study, and 5 are unemployed? What is the probability that exactly 5 are unemployed? How did I get that exact answer?!! An alternative is dmultinom(c(15,10,5), prob=c(60,30,10)).
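A short sketch of both answers in R. `dmultinom` rescales `prob` to sum to one internally, which is why `prob=c(60,30,10)` works as well as `prob=c(0.6,0.3,0.1)`. The second question uses the binomial marginal of the unemployed count. Variable names are illustrative.

```r
p <- c(0.60, 0.30, 0.10)
x <- c(15, 10, 5)

# Joint probability of (15, 10, 5) out of 30.
p_joint <- dmultinom(x, prob = p)

# The same number from the formula: coefficient times product of p_j^x_j.
p_joint_by_hand <- factorial(30) / prod(factorial(x)) * prod(p^x)

# prob is normalized internally, so unscaled weights give the same answer.
p_joint_unscaled <- dmultinom(x, prob = c(60, 30, 10))

# Exactly 5 unemployed: the marginal of x_3 is Binomial(30, 0.10).
p_five_unemployed <- dbinom(5, size = 30, prob = 0.10)

print(c(joint = p_joint, five_unemployed = p_five_unemployed))
```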

Data File. Table layout: columns Case, Job, x1, x2, x3; one row per case (1, 2, 3, 4, …, N), plus a Total row. A data file almost always has a variable giving category membership; it is almost never in the true multinomial setup.

Lessons from the data file: The cases (N of them) are independent M(1,p), so E(x_{i,j}) = p_j. Column totals count the number of times each category occurs; their joint distribution is M(N,p). These are the table (cell) frequencies! They are random variables, and now we know their joint distribution. Each individual table frequency is B(N,p_j). The expected value of frequency j is m_j = N p_j. Tables of two or more dimensions present no problems: use combination variables. Expected frequencies are important. Note the notation m_j.
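To make the data-file picture concrete, here is a small sketch with a made-up six-case file (the `job` values are hypothetical). It builds the indicator columns x1, x2, x3 from the category-membership variable and confirms that the column totals are the table frequencies.

```r
# Hypothetical data file: one row per case, with a category-membership variable.
job <- factor(c("related", "unrelated", "related", "none", "related", "unrelated"),
              levels = c("related", "unrelated", "none"))
N <- length(job)

# Indicator columns x1, x2, x3: each row is one M(1, p) observation.
indicators <- sapply(levels(job), function(lv) as.integer(job == lv))

# Column totals are the cell frequencies, the same as a frequency table.
frequencies <- colSums(indicators)
print(frequencies)
print(table(job))

# Sample means of the indicator columns are the sample proportions.
proportions <- colMeans(indicators)
```

Each row of `indicators` contains a single 1, so the row sums are all one and the column totals sum to N.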

More about the frequencies: We are in the familiar situation of estimating expected values with sample means. And these sample means are just sample proportions: $\bar{x}_j = \frac{1}{N} \sum_{i=1}^{N} x_{i,j}$ is the proportion of cases in category j.

Simple Tools for Estimation: So the (multivariate) sample mean is an unbiased estimator of the vector of multinomial probabilities. The Law of Large Numbers says the sample mean converges to p as N grows. The CLT says the multivariate sample mean has an approximate multivariate normal distribution for large N. This is the basis of large-sample tests and confidence intervals.

Maximum Likelihood: The likelihood is a product of N probability mass functions, each M(1,p): $\prod_{i=1}^{N} \prod_{j=1}^{k} p_j^{x_{i,j}} = \prod_{j=1}^{k} p_j^{\sum_i x_{i,j}}$. It depends upon the sample data only through the vector of k frequency counts, which by the factorization theorem is a sufficient statistic. All the information about the parameter in the sample data is contained in the sufficient statistic. Actually it’s minimal sufficient and complete, but there is no need to go there.

Following the book’s notation: Write the frequencies as x_1, …, x_k. Later, x values with multiple subscripts will refer to frequencies in a multi-dimensional table; for example, x_{i,j,k} will be the frequency in row i and column j of sub-table k. Write the likelihood function as $L(p) = p_1^{x_1} \cdots p_{k-1}^{x_{k-1}} \left(1 - \sum_{j=1}^{k-1} p_j\right)^{N - \sum_{j=1}^{k-1} x_j}$. To maximize the likelihood function, we must allow for the facts that the probabilities sum to one and the frequencies sum to N; substituting for p_k and x_k like this is easier than Lagrange multipliers.

Log likelihood: k-1 parameters. Set all k-1 derivatives to zero and solve for p_1, …, p_{k-1}. Verify that p_i = x_i/N for i = 1, …, k-1 works. It’s unique, too, if no frequency equals zero. The MLE is the sample mean.
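One way to carry out this verification is sketched below. It assumes the likelihood is written in terms of the k-1 free probabilities, with p_k = 1 - (p_1 + … + p_{k-1}) and x_k = N - (x_1 + … + x_{k-1}), and drops the multinomial coefficient because it does not involve p. The constant c is introduced only for this derivation.

```latex
\begin{align*}
\ell(p) &= \sum_{j=1}^{k-1} x_j \ln p_j
           + x_k \ln\Bigl(1 - \sum_{j=1}^{k-1} p_j\Bigr) \\
\frac{\partial \ell}{\partial p_j}
        &= \frac{x_j}{p_j} - \frac{x_k}{p_k} = 0
           \quad\Longrightarrow\quad
           \frac{x_j}{p_j} = \frac{x_k}{p_k} = c,
           \qquad j = 1, \ldots, k-1 \\
N = \sum_{j=1}^{k} x_j &= c \sum_{j=1}^{k} p_j = c
           \quad\Longrightarrow\quad
           \hat{p}_j = \frac{x_j}{N}, \qquad j = 1, \ldots, k
\end{align*}
```

The division by p_j is where the condition that no frequency equals zero matters: with every x_j > 0 the solution lies strictly inside the parameter space.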

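Likelihood Ratio Tests: $G^2 = -2 \ln \frac{\max_{H_0} L(p)}{\max L(p)}$, comparing the likelihood maximized under H0 with the unrestricted maximum. Under H0, G² has an approximate chi-square distribution for large N. Degrees of freedom = number of (non-redundant, linear) equalities specified by H0. Reject when G² is large. We need more detail about degrees of freedom.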

Degrees of Freedom: Express H0 as a set of linear combinations of the parameters, set equal to constants (usually zeros). Degrees of freedom = number of non-redundant linear combinations. Example: df = 3.

p = (p_1, p_2, p_3, p_4, p_5). H0: p_1 = 0.25, p_2 = (p_3 + p_4)/2, p_4 = p_5, so df = 3. H0: p_1 = 1/5, p_2 = 1/5, p_3 = 1/5, p_4 = 1/5, p_5 = 1/5, so df = 4, not 5, because the probabilities add to one, so one equality is redundant. The matrix stuff is just there for completeness. If θ is a k×1 vector and H0: Cθ = h, where C is an r×k matrix, the degrees of freedom is the row rank (number of linearly independent rows) of C, which is usually r. But remember, if θ = p for the multinomial, there are really k-1 parameters.
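The two examples can be checked with a rank calculation. The helper `df_multinomial` below is hypothetical (not from the slides). It rests on one assumption: because the probabilities must add to one, that constraint is appended to C as an extra row of ones, and the degrees of freedom are the rank of the stacked matrix minus one. This presumes H0 is consistent with the sum-to-one constraint.

```r
# H0: p1 = 0.25, p2 = (p3 + p4)/2, p4 = p5, written as rows of C.
C1 <- rbind(c(1, 0,    0,    0,  0),
            c(0, 1, -0.5, -0.5,  0),
            c(0, 0,    0,    1, -1))

# H0: p1 = p2 = p3 = p4 = p5 = 1/5, one row per equality.
C2 <- diag(5)

# Hypothetical helper: rank of C stacked on the sum-to-one row, minus one
# for the sum-to-one row itself.
df_multinomial <- function(C) {
  k <- ncol(C)
  qr(rbind(C, rep(1, k)))$rank - 1
}

df1 <- df_multinomial(C1)   # 3
df2 <- df_multinomial(C2)   # 4: one of the five equalities is redundant
rank_C2 <- qr(C2)$rank      # 5: the row rank of C alone overcounts here
```

For the first hypothesis the row of ones is linearly independent of the three rows of C, so nothing is redundant. For the second, the five equalities together already imply the sum-to-one constraint, which is why the answer is 4 rather than 5.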

Example: University administrators recognize that the percentage of students who are unemployed after graduation will vary depending upon economic conditions, but they claim that still, about twice as many students will be employed in a job related to their field of study, compared to those who get an unrelated job. To test this hypothesis, they select a random sample of 200 students from the most recent class, and observe 106 employed in a job related to their field of study, 74 employed in a job unrelated to their field of study, and 20 unemployed. Test the hypothesis using a large-sample likelihood ratio test and significance level α = 0.05. State your conclusions in symbols and words.

What is the model? What is the null hypothesis, in symbols? What are the degrees of freedom for this test?

What is the restricted MLE? Your answer is a symbolic expression. It’s a vector. Show your work. There is just one parameter in this restricted model. The answer makes sense because x_1/N -> 2p and x_2/N -> p. Note that H0 does not concern p_3, so its estimate is just the sample mean.
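A sketch of the work, assuming H0: p_1 = 2p_2 is imposed by writing the probability vector as (2p, p, 1 - 3p), with p the single free parameter used in the note above. Constants that do not involve p are dropped from the log likelihood.

```latex
\begin{align*}
\ell(p) &= x_1 \ln(2p) + x_2 \ln p + x_3 \ln(1 - 3p) \\
\frac{d\ell}{dp} &= \frac{x_1 + x_2}{p} - \frac{3x_3}{1 - 3p} = 0 \\
(x_1 + x_2)(1 - 3p) &= 3 x_3 \, p
  \quad\Longrightarrow\quad
  x_1 + x_2 = 3p \,(x_1 + x_2 + x_3) = 3pN \\
\hat{p} &= \frac{x_1 + x_2}{3N},
  \qquad
  \hat{\mathbf{p}} = \left( \frac{2(x_1 + x_2)}{3N}, \;
                            \frac{x_1 + x_2}{3N}, \;
                            \frac{x_3}{N} \right)
\end{align*}
```

The third component is 1 - 3p̂ = 1 - (x_1 + x_2)/N = x_3/N, the ordinary sample proportion, as expected since H0 says nothing about p_3.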

What is the unrestricted MLE? Your answer is a numeric vector: 3 numbers. What is the restricted MLE? Your answer is a numeric vector: 3 numbers. What are the estimated expected frequencies under the null hypothesis? Your answer is a numeric vector: 3 numbers. Notice the NATURAL way of estimating the expected frequencies. When our text says expected frequencies, they almost always mean estimated expected frequencies.
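The three numeric vectors can be computed as follows. This sketch assumes the restricted MLE under H0: p_1 = 2p_2 has the form (2(x_1+x_2)/(3N), (x_1+x_2)/(3N), x_3/N), which comes from maximizing the likelihood over probability vectors of the form (2p, p, 1 - 3p). Variable names are illustrative.

```r
# Observed frequencies: related job, unrelated job, unemployed.
x <- c(106, 74, 20)
N <- sum(x)

# Unrestricted MLE: the sample proportions.
p_hat <- x / N

# Restricted MLE under H0: p1 = 2 * p2.
p2_restricted <- (x[1] + x[2]) / (3 * N)
p_hat0 <- c(2 * p2_restricted, p2_restricted, x[3] / N)

# Estimated expected frequencies under H0: N times the restricted MLE.
m_hat <- N * p_hat0

print(rbind(unrestricted = p_hat, restricted = p_hat0))
print(m_hat)
```

This gives (0.53, 0.37, 0.10) for the unrestricted MLE, (0.60, 0.30, 0.10) for the restricted MLE, and estimated expected frequencies (120, 60, 20).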

Calculate G². Show your work.

Or, with R: Note that log is ln. You don't need tables, and the p-value is easy.
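A minimal sketch of how the calculation might look in R (this is not the original slide's code). It assumes G² = 2 Σ observed × ln(observed / expected), with estimated expected frequencies (120, 60, 20) under H0: p_1 = 2p_2, and df = 1 because H0 specifies one equality.

```r
obs      <- c(106, 74, 20)
expected <- c(120, 60, 20)   # estimated expected frequencies under H0

# Likelihood ratio statistic; log() is the natural logarithm.
G2 <- 2 * sum(obs * log(obs / expected))

# One equality in H0, so one degree of freedom.
df <- 1
p_value  <- pchisq(G2, df = df, lower.tail = FALSE)
critical <- qchisq(0.95, df = df)

print(c(G2 = G2, critical = critical, p_value = p_value))
```

G² comes out at about 4.74, which exceeds the 0.05 critical value of about 3.84, so the p-value is below 0.05.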

State your conclusions. In symbols: Reject H0: p_1 = 2p_2 at α = 0.05. In words: More graduates appear to be employed in jobs unrelated to their fields of study than expected. Of course that's estimated expected. STATEMENT IN WORDS IS VALUABLE! The statement in words is justified because:

            Related  Unrelated  Unemployed
Observed      106       74         20
Expected      120       60         20
Obs-Exp       -14       14          0

Obs-Exp is kind of like a residual in regression.

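For a general hypothesis about a multinomial: $G^2 = 2 \sum_j x_j \ln\left(\frac{x_j}{\hat{m}_j}\right)$, where x_j is the observed frequency and $\hat{m}_j$ is the estimated expected frequency under H0. Summation is over all cells.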

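Two chi-square formulas. Likelihood Ratio: $G^2 = 2 \sum_j x_j \ln\left(\frac{x_j}{\hat{m}_j}\right)$. Pearson: $X^2 = \sum_j \frac{(x_j - \hat{m}_j)^2}{\hat{m}_j}$. Summation is over all cells. By expected frequency, we mean estimated expected frequency. The two statistics are asymptotically equivalent and have the same degrees of freedom. The book's formula for df applies only to log-linear models; use the approach given here, for now. X² comes directly from the CLT, G² indirectly.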

Pearson Chi-square on the jobs data:

            Related  Unrelated  Unemployed
Observed      106       74         20
Expected      120       60         20

Asymptotically equivalent, and this is N = 200.
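A short sketch of the Pearson calculation in R, assuming X² = Σ (observed - expected)² / expected over the three cells and df = 1 as for the likelihood ratio test. Variable names are illustrative.

```r
obs      <- c(106, 74, 20)
expected <- c(120, 60, 20)   # estimated expected frequencies under H0

# Pearson chi-square: 14^2/120 + 14^2/60 + 0
X2 <- sum((obs - expected)^2 / expected)

# Likelihood ratio statistic on the same data, for comparison.
G2 <- 2 * sum(obs * log(obs / expected))

p_value_X2 <- pchisq(X2, df = 1, lower.tail = FALSE)
p_value_G2 <- pchisq(G2, df = 1, lower.tail = FALSE)

print(c(X2 = X2, G2 = G2))
```

X² is 4.9 and G² is about 4.74. The two are close but not identical at N = 200, and both lead to rejecting H0 at the 0.05 level.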