
1 STA 517 – Introduction: Distribution and Inference
1.5 STATISTICAL INFERENCE FOR MULTINOMIAL PARAMETERS
- Recall multi$(n, \pi = (\pi_1, \pi_2, \ldots, \pi_c))$.
- Suppose that each of $n$ independent, identical trials can have its outcome in any of $c$ categories. Let $y_{ij} = 1$ if trial $i$ has its outcome in category $j$, and $y_{ij} = 0$ otherwise.
- $y_i = (y_{i1}, y_{i2}, \ldots, y_{ic})$ represents a multinomial trial, with $\sum_j y_{ij} = 1$.
- Let $n_j = \sum_i y_{ij}$ denote the number of trials having their outcome in category $j$.
- The counts $(n_1, n_2, \ldots, n_c)$ have the multinomial distribution.
- Note: $n_1, \ldots, n_c$ are random variables.
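The indicator notation above can be illustrated with a small simulation. This is a minimal Python sketch, not part of the original slides; the function name `simulate_multinomial`, the fixed seed, and the choice of 1000 trials with probabilities (0.75, 0.25) are assumptions made only for illustration.

```python
import random

def simulate_multinomial(n, probs, seed=0):
    """Simulate n independent, identical trials with category probabilities probs.

    Returns (trials, counts): each trial is an indicator vector y_i with
    exactly one 1, and counts[j] is n_j = sum over i of y_ij.
    """
    rng = random.Random(seed)  # fixed seed: an illustrative choice for reproducibility
    c = len(probs)
    trials = []
    for _ in range(n):
        j = rng.choices(range(c), weights=probs)[0]  # category of this trial
        trials.append([1 if k == j else 0 for k in range(c)])
    counts = [sum(y[k] for y in trials) for k in range(c)]
    return trials, counts

trials, counts = simulate_multinomial(1000, [0.75, 0.25])
print(counts)  # the counts are random, but they always sum to n
```

Changing the seed changes the counts, which is the sense in which $n_1, \ldots, n_c$ are random variables.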

2 Example: Mendel’s theory
- To test his theories of natural inheritance, Mendel crossed pea plants of a pure yellow strain with plants of a pure green strain.
- He predicted that second-generation hybrid seeds would be 75% yellow and 25% green, yellow being the dominant strain.
- One experiment produced $n = 8023$ seeds, of which $n_1 = 6022$ were yellow and $n_2 = 2001$ were green.
- We want to test whether these counts are consistent with the predicted 3:1 ratio.

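3 1.5.1 Estimation of Multinomial Parameters
- To obtain the MLE, note that the multinomial probability mass function is proportional to the kernel
  $\prod_j \pi_j^{n_j}$, where all $\pi_j \ge 0$ and $\sum_j \pi_j = 1$. (1.14)
- The MLE are the $\{\pi_j\}$ that maximize (1.14).
- Log likelihood: $L(\pi) = \sum_j n_j \log \pi_j$.
- To remove the redundancy, treat $L$ as a function of $(\pi_1, \ldots, \pi_{c-1})$, with $\pi_c = 1 - (\pi_1 + \cdots + \pi_{c-1})$, so that $\partial \pi_c / \partial \pi_j = -1$.
- Differentiating $L$ with respect to $\pi_j$ gives the likelihood equation
  $\partial L / \partial \pi_j = n_j/\pi_j - n_c/\pi_c = 0$.
- The ML solution satisfies $\hat\pi_j / \hat\pi_c = n_j / n_c$.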

4 MLE
- Now $\sum_j \hat\pi_j = 1 = \hat\pi_c \left(\sum_j n_j\right)/n_c = \hat\pi_c\, n / n_c$.
- Thus $\hat\pi_c = n_c / n$.
- MLE: $\hat\pi_j = n_j / n$.
- The MLE are the sample proportions.
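As a quick numerical illustration of this result, the following Python sketch (not from the slides) computes the sample proportions for Mendel's counts and compares the log of the kernel at the MLE with its value at the hypothesized (0.75, 0.25). The function names `multinomial_mle` and `log_kernel` are hypothetical helpers.

```python
import math

def multinomial_mle(counts):
    """Return the sample proportions n_j / n, the ML estimates of pi_j."""
    n = sum(counts)
    return [nj / n for nj in counts]

def log_kernel(counts, probs):
    """Log of the multinomial kernel: sum of n_j * log(pi_j)."""
    return sum(nj * math.log(pj) for nj, pj in zip(counts, probs))

counts = [6022, 2001]              # Mendel's yellow and green seed counts
pi_hat = multinomial_mle(counts)   # sample proportions

# The log-kernel at the MLE is at least as large as at any other probability vector.
at_mle = log_kernel(counts, pi_hat)
at_null = log_kernel(counts, [0.75, 0.25])
print(pi_hat, at_mle - at_null)
```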

5 1.5.2 Pearson Statistic for Testing a Specified Multinomial
- In 1900 the eminent British statistician Karl Pearson introduced a hypothesis test that was one of the first inferential methods.
- It had a revolutionary impact on categorical data analysis, which had focused on describing associations.
- Pearson’s test evaluates whether multinomial parameters equal certain specified values.

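6 Pearson Statistic
- Consider $H_0\colon \pi_j = \pi_{j0}$, $j = 1, \ldots, c$, where $\sum_j \pi_{j0} = 1$.
- When $H_0$ is true, the expected values of $\{n_j\}$, called expected frequencies, are $\mu_j = n\pi_{j0}$.
- Pearson proposed the test statistic
  $X^2 = \sum_j (n_j - \mu_j)^2 / \mu_j$.
- Greater differences $\{n_j - \mu_j\}$ produce greater $X^2$ values, for fixed $n$.
- Let $X_o^2$ denote the observed value of $X^2$. The P-value is the null value of $P(X^2 \ge X_o^2)$.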

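7 1.5.3 Example: Testing Mendel’s Theories
- $n_1 = 6022$ yellow, $n_2 = 2001$ green, $n = 8023$.
- MLE: $\hat\pi_1 = 6022/8023 \approx 0.7506$, $\hat\pi_2 = 2001/8023 \approx 0.2494$.
- Test whether the data follow the 3:1 ratio, i.e. $H_0\colon \pi_{10} = 0.75,\ \pi_{20} = 0.25$.
- Expected frequencies are $\mu_1 = 8023(0.75) = 6017.25$ and $\mu_2 = 8023(0.25) = 2005.75$.
- $X^2 = (6022 - 6017.25)^2/6017.25 + (2001 - 2005.75)^2/2005.75 \approx 0.015$ with df $= 1$, giving a P-value of about 0.90.
- This does not contradict Mendel’s hypothesis.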
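The calculation for Mendel's data can be reproduced with a short Python sketch. It is not part of the original deck; `pearson_x2` and `chi2_sf_df1` are hypothetical helper names, and the P-value helper covers only df = 1, where the chi-squared upper tail equals `erfc(sqrt(x/2))`.

```python
import math

def pearson_x2(counts, null_probs):
    """Pearson statistic: sum of (n_j - mu_j)^2 / mu_j with mu_j = n * pi_j0."""
    n = sum(counts)
    return sum((nj - n * pj0) ** 2 / (n * pj0)
               for nj, pj0 in zip(counts, null_probs))

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

counts = [6022, 2001]       # yellow, green
null_probs = [0.75, 0.25]   # Mendel's 3:1 hypothesis

x2 = pearson_x2(counts, null_probs)
p_value = chi2_sf_df1(x2)   # df = c - 1 = 1
print(round(x2, 3), round(p_value, 2))
```

For more than two categories the df = 1 shortcut no longer applies, and a general chi-squared tail function would be needed.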

8 SAS code
    data D;
      input outcome $ w;
      cards;
    yellow 6022
    green 2001
    ;
    proc freq;
      weight w;
      /* TESTP= follows the level order of the table (green, yellow by default) */
      table outcome / chisq TESTP=(0.25 0.75);
    run;

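9 Pearson statistic
- When $c = 2$, it can be proved that the Pearson chi-squared statistic is the squared score statistic: $X^2 = z_S^2$, where $z_S = (y - n\pi_0)/\sqrt{n\pi_0(1-\pi_0)}$.
- Proof: by symbolic computation (Maple engine) in MATLAB.
    syms y n pi0
    % Pearson statistic for c = 2; expected frequencies are n*pi0 and n*(1-pi0)
    f = (y-n*pi0)^2/(n*pi0) + ((n-y)-n*(1-pi0))^2/(n*(1-pi0));
    f1 = simplify(f)   % simple(f) in older MATLAB releases
    % result: -(-y+pi0*n)^2/n/pi0/(-1+pi0), i.e. (y-n*pi0)^2/(n*pi0*(1-pi0))
- How about $c > 2$?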
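The $c = 2$ identity can also be checked numerically. The Python sketch below is an illustration added here, not the slide's symbolic proof; the function names are hypothetical, and Mendel's counts are reused as the test case.

```python
import math

def pearson_x2_binomial(y, n, pi0):
    """Pearson statistic for two categories with counts y and n - y."""
    return ((y - n * pi0) ** 2 / (n * pi0)
            + ((n - y) - n * (1 - pi0)) ** 2 / (n * (1 - pi0)))

def score_z_squared(y, n, pi0):
    """Squared score statistic: (y - n*pi0)^2 / (n*pi0*(1 - pi0))."""
    return (y - n * pi0) ** 2 / (n * pi0 * (1 - pi0))

x2 = pearson_x2_binomial(6022, 8023, 0.75)
z2 = score_z_squared(6022, 8023, 0.75)
print(x2, z2)  # the two values agree up to floating-point rounding
```

The agreement follows from $1/\pi_0 + 1/(1-\pi_0) = 1/[\pi_0(1-\pi_0)]$, since both squared deviations equal $(y - n\pi_0)^2$.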

10 1.5.5 Likelihood-Ratio Chi-Squared
- An alternative test for multinomial parameters uses the likelihood-ratio test.
- The kernel of the multinomial likelihood is $\prod_j \pi_j^{n_j}$.
- Under $H_0$ the likelihood is maximized when $\hat\pi_j = \pi_{j0}$.
- In the general case, it is maximized when $\hat\pi_j = n_j/n$.
- The ratio of the likelihoods equals
  $\Lambda = \prod_j (\pi_{j0})^{n_j} \big/ \prod_j (n_j/n)^{n_j}$.
- Thus, the likelihood-ratio statistic is
  $G^2 = -2\log\Lambda = 2\sum_j n_j \log\!\big(n_j/(n\pi_{j0})\big)$.

11 LR
- In the general case, the parameter space consists of $\{\pi_j\}$ subject to $\sum_j \pi_j = 1$, so the dimensionality is $c-1$. Under $H_0$, the $\{\pi_j\}$ are specified completely, so the dimension is 0. The difference in these dimensions equals $c-1$.
- For large $n$, $G^2$ has a chi-squared null distribution with df $= c-1$.
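For comparison with the Pearson statistic, here is a minimal Python sketch of $G^2$ applied to Mendel's counts. It is an added illustration; `lr_g2` and `chi2_sf_df1` are hypothetical names, and the tail helper is valid only for df = 1.

```python
import math

def lr_g2(counts, null_probs):
    """Likelihood-ratio statistic G^2 = 2 * sum of n_j * log(n_j / (n * pi_j0)).

    Cells with n_j = 0 contribute 0, following the convention 0 * log(0) = 0.
    """
    n = sum(counts)
    return 2.0 * sum(nj * math.log(nj / (n * pj0))
                     for nj, pj0 in zip(counts, null_probs) if nj > 0)

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

counts = [6022, 2001]
null_probs = [0.75, 0.25]

g2 = lr_g2(counts, null_probs)
df = len(counts) - 1        # c - 1 = 1
p_value = chi2_sf_df1(g2)
print(round(g2, 3), df, round(p_value, 2))
```

With a sample this large, $G^2$ and $X^2$ are nearly identical, consistent with their asymptotic equivalence.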

12
- For large $n$, both $X^2$ and $G^2$ have chi-squared null distributions with df $= c-1$.
- The two statistics are asymptotically equivalent.

13 Wu, Ma, George (2007)

14 1.5.6 Testing with Estimated Expected Frequencies
- Pearson’s chi-squared statistic was proposed for testing $H_0\colon \pi_j = \pi_{j0}$, where the $\pi_{j0}$ are fixed.
- In some applications, $\pi_{j0} = \pi_{j0}(\theta)$ are functions of a small set of unknown parameters $\theta$.
- ML estimates of $\theta$ determine ML estimates of $\{\pi_{j0} = \pi_{j0}(\theta)\}$ and hence ML estimates of the expected frequencies in $X^2$.
- Replacing $\{\mu_j\}$ by estimates affects the distribution of $X^2$.
- The true df $= (c-1) - \dim(\theta)$.

15 Example
- A sample of 156 dairy calves born in Okeechobee County, Florida, were classified according to whether they caught pneumonia within 60 days of birth.
- Calves that got a pneumonia infection were also classified according to whether they got a secondary infection within 2 weeks after the first infection cleared up.
- Hypothesis: the primary infection had an immunizing effect that reduced the likelihood of a secondary infection.
- How to test it?

16 Data structure
- Calves that did not get a primary infection could not get a secondary infection, so no observations can fall in the category for "no" primary infection and "yes" secondary infection.
- That combination is called a structural zero.

17 Test
- Test whether the probability of primary infection was the same as the conditional probability of secondary infection, given that the calf got the primary infection.
- If $\pi_{ab}$ denotes the probability that a calf is classified in row $a$ (primary infection) and column $b$ (secondary infection) of the table, the null hypothesis is
  $H_0\colon \pi_{11} + \pi_{12} = \pi_{11}/(\pi_{11} + \pi_{12})$.
- Let $\pi = \pi_{11} + \pi_{12}$ denote the probability of primary infection. Then the hypothesized probabilities are
  $\pi_{11} = \pi^2$, $\pi_{12} = \pi(1-\pi)$, $\pi_{22} = 1-\pi$.

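18 MLE and chi-squared test
- Likelihood (kernel): $(\pi^2)^{n_{11}} (\pi - \pi^2)^{n_{12}} (1-\pi)^{n_{22}}$.
- Log likelihood: $L(\pi) = n_{11}\log \pi^2 + n_{12}\log(\pi - \pi^2) + n_{22}\log(1-\pi)$.
- Differentiation with respect to $\pi$ gives the likelihood equation
  $2n_{11}/\pi + n_{12}/\pi - n_{12}/(1-\pi) - n_{22}/(1-\pi) = 0$.
- Solution: $\hat\pi = (2n_{11} + n_{12})/(2n_{11} + 2n_{12} + n_{22})$.
- For the example, $\hat\pi$ follows by substituting the observed cell counts.
- Expected counts for each cell: $\hat\mu_{11} = n\hat\pi^2$, $\hat\mu_{12} = n\hat\pi(1-\hat\pi)$, $\hat\mu_{22} = n(1-\hat\pi)$. With $c = 3$ cells and one estimated parameter, $X^2$ has df $= (3-1) - 1 = 1$.
Conclusion: the primary infection had an immunizing effect that reduced the likelihood of a secondary infection.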
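The steps on this slide can be sketched in Python as follows. This is an added illustration, not the lecturer's code. The cell counts did not survive in this transcript, so the values 30, 63 and 63 used below are an assumption (they sum to the stated 156 calves); `fit_calves_model` is a hypothetical name.

```python
import math

def fit_calves_model(n11, n12, n22):
    """Fit pi11 = pi^2, pi12 = pi(1 - pi), pi22 = 1 - pi by ML and test the fit."""
    n = n11 + n12 + n22
    pi_hat = (2 * n11 + n12) / (2 * n11 + 2 * n12 + n22)   # ML solution
    expected = [n * pi_hat ** 2,
                n * pi_hat * (1 - pi_hat),
                n * (1 - pi_hat)]
    observed = [n11, n12, n22]
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = (3 - 1) - 1                           # (c - 1) - dim(theta)
    p_value = math.erfc(math.sqrt(x2 / 2.0))   # chi-squared upper tail, df = 1 only
    return pi_hat, expected, x2, df, p_value

# ASSUMED counts: (primary yes, secondary yes), (yes, no), (no, no)
pi_hat, expected, x2, df, p_value = fit_calves_model(30, 63, 63)
print(round(pi_hat, 3), [round(e, 1) for e in expected], round(x2, 1), df)
```

Under these assumed counts the expected count for the (yes, no) cell is well below the observed one, which is the pattern an immunizing effect would produce.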

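19 Standard Error
- Since
  $-\partial^2 L/\partial \pi^2 = (2n_{11} + n_{12})/\pi^2 + (n_{12} + n_{22})/(1-\pi)^2$,
- the information is its expected value, which is
  $n[2\pi^2 + \pi(1-\pi)]/\pi^2 + n[\pi(1-\pi) + (1-\pi)]/(1-\pi)^2$,
- which simplifies to $n(1+\pi)/[\pi(1-\pi)]$.
- The asymptotic standard error is the square root of the inverse information, or
  $\sqrt{\hat\pi(1-\hat\pi)/[n(1+\hat\pi)]}$.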
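A short Python sketch can confirm the simplification numerically and evaluate the standard error. It is an added illustration with hypothetical function names; $n = 156$ comes from the example, while the value 0.494 for $\hat\pi$ is an assumed, illustrative input rather than a figure taken from the slides.

```python
import math

def expected_information(pi, n):
    """Expected value of -d^2 L / d pi^2, before simplification."""
    return (n * (2 * pi ** 2 + pi * (1 - pi)) / pi ** 2
            + n * (pi * (1 - pi) + (1 - pi)) / (1 - pi) ** 2)

def simplified_information(pi, n):
    """Simplified form n(1 + pi) / [pi(1 - pi)]."""
    return n * (1 + pi) / (pi * (1 - pi))

def asymptotic_se(pi_hat, n):
    """Square root of the inverse information evaluated at pi_hat."""
    return math.sqrt(pi_hat * (1 - pi_hat) / (n * (1 + pi_hat)))

n = 156
pi_hat = 0.494  # ASSUMED illustrative value of the ML estimate

info_long = expected_information(pi_hat, n)
info_short = simplified_information(pi_hat, n)
se = asymptotic_se(pi_hat, n)
print(info_long, info_short, round(se, 4))
```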

20 How about confidence limits?
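One common answer, offered here as an assumption since the slide does not say which method it has in mind, is a Wald interval: estimate plus or minus a normal quantile times the asymptotic standard error. The Python sketch below uses the binomial counts $y = 842$, $n = 1824$ that appear in the deck's SAS code; `wald_interval` is a hypothetical name.

```python
import math
from statistics import NormalDist

def wald_interval(estimate, se, level=0.95):
    """Wald confidence limits: estimate -/+ z * se, with z the normal quantile."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return estimate - z * se, estimate + z * se

y, n = 842, 1824
pi_hat = y / n
se = math.sqrt(pi_hat * (1 - pi_hat) / n)   # binomial asymptotic standard error

lower, upper = wald_interval(pi_hat, se)
print(round(lower, 4), round(upper, 4))
```

Score and likelihood-ratio intervals, obtained by inverting the corresponding tests, are alternatives that often behave better when the estimate is near 0 or 1.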

21 SAS code - MLE, test for binomial
    proc IML;
      y=842; n=1824; pi0=0.5;                    /* data */
      pihat=y/n;                                 /* MLE */
      SE=sqrt(pihat*(1-pihat)/n);                /* standard error at the MLE */
      WaldStat=(pihat-pi0)**2/SE**2;
      pWald=1-CDF('CHISQUARE', WaldStat, 1);
      LR=2*(y*log(pihat/(pi0)) + (n-y)*log((1-pihat)/(1-pi0)));
      pLR=1-CDF('CHISQUARE', LR, 1);
      ScoreStat=(pihat-pi0)**2/(pi0*(1-pi0)/n);  /* variance evaluated at pi0 */
      pScore=1-CDF('CHISQUARE', ScoreStat, 1);
      print WaldStat pWald;
      print LR pLR;
      print ScoreStat pScore;
    quit;
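For readers without SAS, the same three statistics can be sketched in Python. This is a hedged translation of the PROC IML step above, not a replacement for it; the variable names mirror the SAS code, and `chi2_sf_df1` is a hypothetical helper valid only for df = 1.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

y, n, pi0 = 842, 1824, 0.5
pihat = y / n                                   # MLE

# Wald: variance evaluated at the MLE
wald_stat = (pihat - pi0) ** 2 / (pihat * (1 - pihat) / n)
# Likelihood ratio
lr_stat = 2 * (y * math.log(pihat / pi0)
               + (n - y) * math.log((1 - pihat) / (1 - pi0)))
# Score: variance evaluated at the null value pi0
score_stat = (pihat - pi0) ** 2 / (pi0 * (1 - pi0) / n)

for name, stat in [("Wald", wald_stat), ("LR", lr_stat), ("Score", score_stat)]:
    print(name, round(stat, 3), chi2_sf_df1(stat))
```

The three statistics differ only in where the variance is evaluated or in using the likelihood directly, so with $n = 1824$ they are close to one another.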

22 SAS code - MLE, test for binomial
    data D;
      input outcome $ w;
      cards;
    Yes 842
    No 982
    ;
    proc freq;
      weight w;
      table outcome / all CL BINOMIAL(P=0.5 LEVEL="Yes");
      exact binomial;
    run;

23 SAS code – multinomial
    data D;
      input outcome $ w;
      cards;
    yellow 6022
    green 2001
    ;
    proc freq;
      weight w;
      /* TESTP= follows the level order of the table (green, yellow by default) */
      table outcome / chisq TESTP=(0.25 0.75);
    run;

