Published by August Edwards. Modified over 9 years ago.
STA 517 – Introduction: Distribution and Inference

1.5 STATISTICAL INFERENCE FOR MULTINOMIAL PARAMETERS

Recall multi$(n, \pi = (\pi_1, \pi_2, \ldots, \pi_c))$. Suppose that each of $n$ independent, identical trials can have an outcome in any of $c$ categories. Let

$y_{ij} = 1$ if trial $i$ has outcome in category $j$, and $y_{ij} = 0$ otherwise.

Then $y_i = (y_{i1}, y_{i2}, \ldots, y_{ic})$ represents a multinomial trial, with $\sum_j y_{ij} = 1$. Let $n_j = \sum_i y_{ij}$ denote the number of trials having outcome in category $j$. The counts $(n_1, n_2, \ldots, n_c)$ have the multinomial distribution.

Note: the $n_j$ are random variables.
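The construction above (independent trials, one category per trial, counts obtained by summing indicators) can be sketched in a few lines of Python. This is only an illustration: the function name `multinomial_counts`, the seed, and the choice of $n = 8023$ with probabilities (0.75, 0.25) are assumptions made for the example.

```python
import random
from collections import Counter

def multinomial_counts(n, probs, seed=0):
    """Simulate n independent trials and return the counts (n_1, ..., n_c)."""
    rng = random.Random(seed)            # fixed seed so the run is reproducible
    categories = range(len(probs))
    # Each trial falls in exactly one category j, chosen with probability probs[j].
    outcomes = rng.choices(categories, weights=probs, k=n)
    tally = Counter(outcomes)
    return [tally[j] for j in categories]

counts = multinomial_counts(8023, [0.75, 0.25])
print(counts, sum(counts))               # the counts are random but always sum to n
```

Changing the seed changes the counts, which is the sense in which the $n_j$ are random variables.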
Example: Mendel's theory

To test his theories of natural inheritance, Mendel crossed pea plants of a pure yellow strain with plants of a pure green strain. He predicted that second-generation hybrid seeds would be 75% yellow and 25% green, yellow being the dominant strain.

One experiment produced $n = 8023$ seeds, of which $n_1 = 6022$ were yellow and $n_2 = 2001$ were green. The question is whether these counts follow the predicted 3:1 ratio.
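1.5.1 Estimation of Multinomial Parameters

To obtain the MLE, note that the multinomial probability mass function is proportional to the kernel

$$\prod_j \pi_j^{n_j}, \quad \text{where all } \pi_j \ge 0 \text{ and } \sum_j \pi_j = 1. \qquad (1.14)$$

The MLE are the $\{\pi_j\}$ that maximize (1.14).

Log likelihood: $L(\pi) = \sum_j n_j \log \pi_j$.

Because $\sum_j \pi_j = 1$, treat $L$ as a function of $(\pi_1, \ldots, \pi_{c-1})$ with $\pi_c = 1 - (\pi_1 + \cdots + \pi_{c-1})$. Differentiating $L$ with respect to $\pi_j$ gives the likelihood equation

$$\frac{\partial L(\pi)}{\partial \pi_j} = \frac{n_j}{\pi_j} - \frac{n_c}{\pi_c} = 0.$$

The ML solution satisfies $\hat\pi_j / \hat\pi_c = n_j / n_c$.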
MLE

Now

$$\sum_j \hat\pi_j = 1 = \frac{\hat\pi_c \sum_j n_j}{n_c} = \frac{\hat\pi_c \, n}{n_c}.$$

Thus $\hat\pi_c = n_c / n$, and then

$$\hat\pi_j = \frac{n_j}{n}.$$

The MLE are the sample proportions.
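As a quick numerical check of this result, the sketch below computes the sample proportions for the pea counts from the Mendel example (6022 yellow, 2001 green) and confirms that the log likelihood is lower at nearby probability vectors. The helper names `log_likelihood` and `multinomial_mle` are illustrative, not part of the course material.

```python
import math

def log_likelihood(counts, probs):
    """Multinomial log-likelihood kernel: sum_j n_j * log(pi_j)."""
    return sum(n_j * math.log(p_j) for n_j, p_j in zip(counts, probs))

def multinomial_mle(counts):
    """ML estimates are the sample proportions n_j / n."""
    n = sum(counts)
    return [n_j / n for n_j in counts]

counts = [6022, 2001]                    # yellow, green
pi_hat = multinomial_mle(counts)
best = log_likelihood(counts, pi_hat)

# Shift a little probability between the two categories in each direction;
# the log likelihood should drop both times.
for eps in (-0.01, 0.01):
    nearby = [pi_hat[0] + eps, pi_hat[1] - eps]
    assert log_likelihood(counts, nearby) < best

print(pi_hat)
```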
1.5.2 Pearson Statistic for Testing a Specified Multinomial

In 1900 the eminent British statistician Karl Pearson introduced a hypothesis test that was one of the first inferential methods. It had a revolutionary impact on categorical data analysis, which had focused on describing associations. Pearson's test evaluates whether multinomial parameters equal certain specified values.
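Pearson Statistic

Consider $H_0: \pi_j = \pi_{j0}$, $j = 1, \ldots, c$, where $\sum_j \pi_{j0} = 1$.

When $H_0$ is true, the expected values of $\{n_j\}$, called expected frequencies, are $\mu_j = n\pi_{j0}$, $j = 1, \ldots, c$.

Pearson proposed the test statistic

$$X^2 = \sum_j \frac{(n_j - \mu_j)^2}{\mu_j}.$$

Greater differences $\{n_j - \mu_j\}$ produce greater $X^2$ values, for fixed $n$.

Let $X_o^2$ denote the observed value of $X^2$. The P-value is the null value of $P(X^2 \ge X_o^2)$.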
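1.5.3 Example: Testing Mendel's Theories

$n_1 = 6022$ yellow, $n_2 = 2001$ green.

MLE: $\hat\pi_1 = 6022/8023 = 0.7506$, $\hat\pi_2 = 2001/8023 = 0.2494$.

Test whether the data follow the 3:1 ratio, i.e. $H_0: \pi_{10} = 0.75, \ \pi_{20} = 0.25$.

Expected frequencies are $\mu_1 = 8023(0.75) = 6017.25$ and $\mu_2 = 8023(0.25) = 2005.75$.

$X^2 = (6022 - 6017.25)^2/6017.25 + (2001 - 2005.75)^2/2005.75 = 0.015$ with df $= 1$, which gives a P-value of about 0.90.

This does not contradict Mendel's hypothesis.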
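The arithmetic of this example can be reproduced with a short Python sketch. It uses only the standard library; the function names are illustrative, and the P-value helper relies on the fact that a chi-squared variable with 1 df is the square of a standard normal, so it applies only when df = 1.

```python
import math

def pearson_chi2(observed, null_probs):
    """Return Pearson's X^2 and the expected frequencies mu_j = n * pi_j0."""
    n = sum(observed)
    expected = [n * p for p in null_probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected

def chi2_sf_df1(x):
    """P(chi-squared with 1 df >= x) = P(|Z| >= sqrt(x)) for standard normal Z."""
    return math.erfc(math.sqrt(x / 2.0))

x2, mu = pearson_chi2([6022, 2001], [0.75, 0.25])
p_value = chi2_sf_df1(x2)
print(mu, x2, p_value)
```

For more than two categories the reference distribution has df = c − 1 > 1, and a general chi-squared tail function (for example from a statistics library) would be needed in place of `chi2_sf_df1`.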
SAS code

data D;
  input outcome $ w;
  cards;
yellow 6022
green 2001
;
proc freq;
  weight w;
  /* levels are ordered alphabetically (green, yellow), so TESTP lists green first */
  table outcome / chisq TESTP=(0.25 0.75);
run;
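Pearson statistic

When $c = 2$, it can be proved that the Pearson chi-square statistic is the squared score statistic:

$$X^2 = \frac{(y - n\pi_0)^2}{n\pi_0} + \frac{[(n - y) - n(1 - \pi_0)]^2}{n(1 - \pi_0)} = \frac{(y - n\pi_0)^2}{n\pi_0(1 - \pi_0)} = z_S^2.$$

PROOF: by symbolic computation (Maple engine in MATLAB)

syms y n pi0
% Pearson statistic for c = 2; the expected counts are n*pi0 and n*(1-pi0)
f = (y - n*pi0)^2/(n*pi0) + ((n - y) - n*(1 - pi0))^2/(n*(1 - pi0));
f1 = simplify(f)   % older MATLAB releases used simple(f)
% result: -(-y+pi0*n)^2/n/pi0/(-1+pi0), i.e. (y - n*pi0)^2/(n*pi0*(1 - pi0))

How about $c > 2$?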
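The $c = 2$ identity can also be checked numerically. The sketch below compares the two-cell Pearson statistic with the squared score statistic $z_S^2 = (\hat\pi - \pi_0)^2 / [\pi_0(1 - \pi_0)/n]$ for a few $(y, n, \pi_0)$ triples. The first two triples reuse counts that appear in this presentation; the third is an arbitrary small case, and the function names are illustrative. A numerical check of this kind is not a proof.

```python
import math

def pearson_two_cells(y, n, pi0):
    """Pearson X^2 for c = 2 with observed counts (y, n - y)."""
    return ((y - n * pi0) ** 2 / (n * pi0)
            + ((n - y) - n * (1 - pi0)) ** 2 / (n * (1 - pi0)))

def score_squared(y, n, pi0):
    """Squared score statistic: null standard error in the denominator."""
    z = (y / n - pi0) / math.sqrt(pi0 * (1 - pi0) / n)
    return z ** 2

for y, n, pi0 in [(6022, 8023, 0.75), (842, 1824, 0.5), (3, 10, 0.2)]:
    x2 = pearson_two_cells(y, n, pi0)
    zs2 = score_squared(y, n, pi0)
    assert math.isclose(x2, zs2, rel_tol=1e-9)
    print(y, n, pi0, x2, zs2)
```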
1.5.5 Likelihood-Ratio Chi-Squared

An alternative test for multinomial parameters uses the likelihood-ratio test. The kernel of the multinomial likelihood is

$$\prod_j \pi_j^{n_j}.$$

Under $H_0$ the likelihood is maximized when $\hat\pi_j = \pi_{j0}$. In the general case, it is maximized when $\hat\pi_j = n_j/n$. The ratio of the likelihoods equals

$$\Lambda = \frac{\prod_j (\pi_{j0})^{n_j}}{\prod_j (n_j/n)^{n_j}}.$$

Thus, the likelihood-ratio statistic is

$$G^2 = -2 \log \Lambda = 2 \sum_j n_j \log\!\left(\frac{n_j}{n\pi_{j0}}\right).$$
LR

In the general case, the parameter space consists of $\{\pi_j\}$ subject to $\sum_j \pi_j = 1$, so the dimensionality is $c - 1$. Under $H_0$, the $\{\pi_j\}$ are specified completely, so the dimension is 0. The difference in these dimensions equals $c - 1$. For large $n$, $G^2$ has a chi-squared null distribution with df $= c - 1$.
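As an illustration, the likelihood-ratio statistic $G^2 = 2\sum_j n_j \log(n_j / (n\pi_{j0}))$ can be computed for Mendel's counts (6022 yellow, 2001 green, null probabilities 0.75 and 0.25) and compared with Pearson's $X^2$. This is a minimal sketch with illustrative function names; skipping cells with a zero count follows the usual convention that $0 \log 0 = 0$, which is an assumption not stated on the slides.

```python
import math

def likelihood_ratio_g2(observed, null_probs):
    """G^2 = 2 * sum_j n_j * log(n_j / (n * pi_j0)); zero counts contribute 0."""
    n = sum(observed)
    return 2.0 * sum(o * math.log(o / (n * p))
                     for o, p in zip(observed, null_probs) if o > 0)

def pearson_x2(observed, null_probs):
    """X^2 = sum_j (n_j - mu_j)^2 / mu_j with mu_j = n * pi_j0."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, null_probs))

observed = [6022, 2001]
null_probs = [0.75, 0.25]
g2 = likelihood_ratio_g2(observed, null_probs)
x2 = pearson_x2(observed, null_probs)
df = len(observed) - 1                   # c - 1 free parameters in general, 0 under H0
print(g2, x2, df)
```

With counts this large and this close to the null, the two statistics nearly coincide.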
Under $H_0$ and for large $n$, $X^2$ and $G^2$ both have a chi-squared distribution with df $= c - 1$. The two statistics are asymptotically equivalent.
Wu, Ma, George (2007)
1.5.6 Testing with Estimated Expected Frequencies

Pearson's chi-square was proposed for testing $H_0: \pi_j = \pi_{j0}$, where the $\pi_{j0}$ are fixed.

In some applications, $\pi_{j0} = \pi_{j0}(\theta)$ are functions of a small set of unknown parameters $\theta$.

ML estimates $\hat\theta$ of $\theta$ determine ML estimates $\{\pi_{j0}(\hat\theta)\}$ of $\{\pi_{j0}\}$ and hence ML estimates $\hat\mu_j = n\pi_{j0}(\hat\theta)$ of the expected frequencies in $X^2$.

Replacing $\{\mu_j\}$ by the estimates $\{\hat\mu_j\}$ affects the distribution of $X^2$: the true df $= (c - 1) - \dim(\theta)$.
Example

A sample of 156 dairy calves born in Okeechobee County, Florida, were classified according to whether they caught pneumonia within 60 days of birth. Calves that got a pneumonia infection were also classified according to whether they got a secondary infection within 2 weeks after the first infection cleared up.

Hypothesis: the primary infection had an immunizing effect that reduced the likelihood of a secondary infection.

How to test it?
Data structure

Calves that did not get a primary infection could not get a secondary infection, so no observations can fall in the category for "no" primary infection and "yes" secondary infection. That combination is called a structural zero.
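Test: whether the probability of primary infection was the same as the conditional probability of secondary infection, given that the calf got the primary infection.

Let $\pi_{ab}$ denote the probability that a calf is classified in row $a$ (primary infection) and column $b$ (secondary infection) of this table. The null hypothesis is

$$H_0: \pi_{11} + \pi_{12} = \frac{\pi_{11}}{\pi_{11} + \pi_{12}}.$$

Let $\pi = \pi_{11} + \pi_{12}$ denote the probability of primary infection. Then under $H_0$ the cell probabilities are

Primary infection   Secondary: yes   Secondary: no    Total
yes                 $\pi^2$          $\pi(1 - \pi)$   $\pi$
no                  (structural 0)   $1 - \pi$        $1 - \pi$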
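MLE and chi-squared test

Likelihood (kernel): $(\pi^2)^{n_{11}} (\pi - \pi^2)^{n_{12}} (1 - \pi)^{n_{22}}$

Log likelihood: $L(\pi) = n_{11} \log \pi^2 + n_{12} \log(\pi - \pi^2) + n_{22} \log(1 - \pi)$

Differentiation with respect to $\pi$:

$$\frac{2n_{11}}{\pi} + \frac{n_{12}}{\pi} - \frac{n_{12}}{1 - \pi} - \frac{n_{22}}{1 - \pi} = 0$$

Solution:

$$\hat\pi = \frac{2n_{11} + n_{12}}{2n_{11} + 2n_{12} + n_{22}}$$

For the example: with the cell counts $n_{11} = 30$, $n_{12} = 63$, $n_{22} = 63$ reported for this study in Agresti's Categorical Data Analysis (Table 1.1), $\hat\pi = 123/249 = 0.494$.

Expected counts for each cell: $\hat\mu_{11} = n\hat\pi^2 = 38.1$, $\hat\mu_{12} = n\hat\pi(1 - \hat\pi) = 39.0$, $\hat\mu_{22} = n(1 - \hat\pi) = 78.9$. The Pearson statistic is $X^2 = 19.7$ with df $= (3 - 1) - 1 = 1$, so the P-value is about 0.00001.

Conclusion: there is strong evidence against $H_0$. Many more calves got a primary infection but no secondary infection than $H_0$ predicts, which supports the view that the primary infection had an immunizing effect that reduced the likelihood of a secondary infection.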
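The steps of this example (closed-form MLE, estimated expected counts, Pearson statistic with reduced df) can be sketched in Python. The cell counts 30, 63, 63 are an assumption here: they are the values published for this calf study in Agresti's Categorical Data Analysis and are not given in the slide text, although they do sum to the stated 156 calves. The P-value helper is valid only for df = 1.

```python
import math

# Assumed cell counts (from the published version of the example, not the slide text):
# n11 = primary yes / secondary yes, n12 = primary yes / secondary no, n22 = primary no.
n11, n12, n22 = 30, 63, 63
n = n11 + n12 + n22

# Closed-form ML estimate of pi, the probability of primary infection, under H0.
pi_hat = (2 * n11 + n12) / (2 * n11 + 2 * n12 + n22)

# Estimated expected counts under H0: n*pi^2, n*pi*(1 - pi), n*(1 - pi).
observed = [n11, n12, n22]
expected = [n * pi_hat ** 2, n * pi_hat * (1 - pi_hat), n * (1 - pi_hat)]

x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = (len(observed) - 1) - 1             # (c - 1) minus one estimated parameter
p_value = math.erfc(math.sqrt(x2 / 2.0)) # chi-squared upper tail, valid for df = 1

print(pi_hat, expected, x2, df, p_value)
```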
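Standard Error

Since

$$\frac{\partial^2 L(\pi)}{\partial \pi^2} = -\frac{2n_{11} + n_{12}}{\pi^2} - \frac{n_{12} + n_{22}}{(1 - \pi)^2},$$

the information is the negative of its expected value, which is

$$\frac{n[2\pi^2 + \pi(1 - \pi)]}{\pi^2} + \frac{n[\pi(1 - \pi) + (1 - \pi)]}{(1 - \pi)^2},$$

which simplifies to

$$\frac{n(1 + \pi)}{\pi(1 - \pi)}.$$

The asymptotic standard error is the square root of the inverse information, or

$$\sqrt{\frac{\hat\pi(1 - \hat\pi)}{n(1 + \hat\pi)}}.$$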
How about confidence limits?
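One possible answer, offered as a sketch rather than as the method intended on the slide, is a Wald interval $\hat\pi \pm z_{0.975}\,\mathrm{SE}$, using the asymptotic standard error $\sqrt{\hat\pi(1 - \hat\pi) / [n(1 + \hat\pi)]}$ for the calf model. The cell counts below are again assumed from the published version of the example, and the 95% level is an illustrative choice.

```python
import math
from statistics import NormalDist

# Assumed cell counts for the calf example (not given in the slide text).
n11, n12, n22 = 30, 63, 63
n = n11 + n12 + n22

pi_hat = (2 * n11 + n12) / (2 * n11 + 2 * n12 + n22)

# Asymptotic standard error: square root of the inverse information.
se = math.sqrt(pi_hat * (1 - pi_hat) / (n * (1 + pi_hat)))

# Wald 95% limits: estimate plus or minus the normal quantile times the SE.
z = NormalDist().inv_cdf(0.975)
lower, upper = pi_hat - z * se, pi_hat + z * se

print(pi_hat, se, (lower, upper))
```

Intervals obtained by inverting the score or likelihood-ratio test are alternatives and can behave better when the estimate is near 0 or 1.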
SAS code - MLE, test for binomial

proc IML;
  y=842; n=1824; pi0=0.5;                   /* data */
  pihat=y/n; SE=sqrt(pihat*(1-pihat)/n);    /* MLE and its standard error */
  /* Wald test */
  WaldStat=(pihat-pi0)**2/SE**2;
  pWald=1-CDF('CHISQUARE', WaldStat, 1);
  /* likelihood-ratio test */
  LR=2*(y*log(pihat/(pi0)) + (n-y)*log((1-pihat)/(1-pi0)));
  pLR=1-CDF('CHISQUARE', LR, 1);
  /* score test */
  ScoreStat=(pihat-pi0)**2/(pi0*(1-pi0)/n);
  pScore=1-CDF('CHISQUARE', ScoreStat, 1);
  print WaldStat pWald;
  print LR pLR;
  print ScoreStat pScore;
quit;
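For readers without SAS, the same three statistics can be cross-checked with a Python sketch that mirrors the IML formulas line by line. The variable names are illustrative, and the P-values use the normal-tail identity that holds for a chi-squared distribution with 1 df.

```python
import math

y, n, pi0 = 842, 1824, 0.5               # data and null value, as in the IML program

pi_hat = y / n                           # MLE of the binomial parameter
se = math.sqrt(pi_hat * (1 - pi_hat) / n)

def chi2_sf_df1(x):
    """Upper tail of a chi-squared distribution with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

# Wald: standard error evaluated at the estimate.
wald = (pi_hat - pi0) ** 2 / se ** 2
# Likelihood ratio: twice the log ratio of maximized likelihoods.
lr = 2 * (y * math.log(pi_hat / pi0) + (n - y) * math.log((1 - pi_hat) / (1 - pi0)))
# Score: standard error evaluated at the null value.
score = (pi_hat - pi0) ** 2 / (pi0 * (1 - pi0) / n)

for name, stat in [("Wald", wald), ("LR", lr), ("Score", score)]:
    print(name, stat, chi2_sf_df1(stat))
```

The three statistics differ only in where the variance is evaluated, so with $n = 1824$ they should be close to one another.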
SAS code - MLE, test for binomial

data D;
  input outcome $ w;
  cards;
Yes 842
No 982
;
proc freq;
  weight w;
  table outcome / all CL BINOMIAL(P=0.5 LEVEL="Yes");
  exact binomial;
run;
SAS code – multinomial

data D;
  input outcome $ w;
  cards;
yellow 6022
green 2001
;
proc freq;
  weight w;
  table outcome / chisq TESTP=(0.25 0.75);
run;