Estimating π0: Estimating the proportion of true null hypotheses with the method of moments. By Jose M Muino.
The objective: To obtain some information (π0 and moments) to help construct the critical region in a multiple-hypothesis testing problem. The situation: the sample size is low; the distribution under the null hypothesis is unknown, but the expectation of the null distribution is known.
Definitions: Let T_i, i = 1,...,m, be the test statistics for testing the null hypotheses H_{0,i} based on observable random variables. Assume that H_{0,i} is true with probability π0 and false with probability (1 − π0). Assume that T_i follows a density function f0(T) under H_{0,i} and f1(T) if H_{0,i} is false. Assume that the first m0 = m·π0 hypotheses H_{0,i} are true and the next m1 = m·(1 − π0) are false.
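The definitions above imply a two-component mixture for the test statistics. As a short worked statement (a sketch that follows from the definitions only, writing π0 for the proportion of true nulls; the expectation subscripts are notation introduced here, not taken from the slides):

```latex
\begin{align}
  % Marginal density of a test statistic when the truth of H_{0,i} is unknown
  f(T) &= \pi_0 \, f_0(T) + (1 - \pi_0) \, f_1(T) \\
  % Raw moments inherit the same mixture structure, for k = 1, 2, \dots
  \mathrm{E}\!\left[T^k\right] &= \pi_0 \, \mathrm{E}_{f_0}\!\left[T^k\right]
      + (1 - \pi_0) \, \mathrm{E}_{f_1}\!\left[T^k\right]
\end{align}
```

This linearity in π0 is what makes a method-of-moments approach possible: any moment that is known under the null gives one equation relating π0 to observable quantities.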
The idea Define: Then:
The estimators Assumed known
Any moment Because: Then:
Estimators: Sample level; Test value level
Example: The mean value as test statistic. The properties of the estimators can be studied with Taylor series expansions. They are illustrated here with the mean value as test statistic: m hypotheses are tested on m observed samples x_{i,j}, i = 1,…,m, j = 1,…,n, using the mean of the observations in each sample as the test statistic.
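For reference, the test statistic in this example and its first two moments can be written out as follows. These are standard results for the mean of n independent observations with finite variance; the symbols μ_i, σ_i² and s_i² are notation assumed here, and the slides' own expressions were not recoverable.

```latex
\begin{align}
  % Test statistic: the mean of the n observations in sample i
  T_i &= \bar{x}_i = \frac{1}{n} \sum_{j=1}^{n} x_{i,j} \\
  % With E[x_{i,j}] = \mu_i and Var(x_{i,j}) = \sigma_i^2, independent over j
  \mathrm{E}[T_i] &= \mu_i, \qquad \mathrm{Var}(T_i) = \frac{\sigma_i^2}{n} \\
  % Sample-level estimate of Var(T_i), using the unbiased sample variance s_i^2
  \widehat{\mathrm{Var}}(T_i) &= \frac{s_i^2}{n}, \qquad
  s_i^2 = \frac{1}{n-1} \sum_{j=1}^{n} \left( x_{i,j} - \bar{x}_i \right)^2
\end{align}
```

The last line is an example of sample-level information, while the spread of the T_i across the m hypotheses is test-value-level information.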
Properties Assuming independence
Numerical Simulations: m0 = 450, m1 = 50; H0: N(0, 1), H1: N(1, 1) (2000 simulations). [Figure: 5000 simulations]
Numerical Simulations 5000 simulations
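The simulation setting (m0 = 450, m1 = 50, H0: N(0, 1), H1: N(1, 1), mean as test statistic) can be reproduced with a small script. The estimator below is NOT the one from the talk, whose formulas did not survive extraction. It is an illustrative moment-matching sketch under extra assumptions: a common within-sample variance, a single alternative mean, and n = 5 observations per sample. The function names are hypothetical.

```python
import random


def variance(values, ddof):
    """Variance with divisor len(values) - ddof."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - ddof)


def simulate_samples(m0, m1, n, mu1, rng):
    """m0 samples from N(0, 1) (true nulls) followed by m1 from N(mu1, 1)."""
    centers = [0.0] * m0 + [mu1] * m1
    return [[rng.gauss(mu, 1.0) for _ in range(n)] for mu in centers]


def pi0_moment_sketch(samples, null_mean=0.0):
    """Illustrative moment estimator of pi0 (an assumption, not the talk's formula).

    With delta the shift of the alternative mean from the known null mean:
      mean(T) - null_mean           estimates (1 - pi0) * delta
      var(T) - (pooled s^2) / n     estimates pi0 * (1 - pi0) * delta**2
    Solving the two equations gives pi0 = b / (a**2 + b).
    """
    n = len(samples[0])
    t = [sum(x) / n for x in samples]                       # test value level
    s2 = sum(variance(x, ddof=1) for x in samples) / len(samples)  # sample level
    a = sum(t) / len(t) - null_mean
    b = variance(t, ddof=0) - s2 / n
    return b / (a * a + b)                                  # can fall outside [0, 1]


if __name__ == "__main__":
    rng = random.Random(1)
    estimates = [
        pi0_moment_sketch(simulate_samples(450, 50, 5, 1.0, rng))
        for _ in range(200)
    ]
    print("mean estimate of pi0:", sum(estimates) / len(estimates))
```

The sketch combines the known null expectation, the pooled within-sample variance (sample level) and the mean and variance of the test statistics (test value level). Nothing constrains the result to [0, 1], so individual estimates can leave the parameter space.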
From moments to quantiles: A family of distributions (e.g. the Pearson family) can be used to calculate the quantiles.

Type I error   n=3                 n=4                 n=5
               MM      Classical   MM      Classical   MM      Classical
0.5            0.487   0.499       0.493   0.5         0.496   0.499
0.1            0.096   0.099       0.097   0.099       0.098   0.099
0.05           0.056   0.049       0.052   0.049       0.051   0.049
0.01           0.022   0.009       0.014   0.01        0.012   0.009
0.001          0.01    0.001       0.003   0.001       0.002   0.…

(… simulations)
From moments to quantiles

Type I error   n=3                 n=4                 n=5
               MM      Classical   MM      Classical   MM      Classical
0.5            0.476   0.499       0.489   0.5         0.494   0.5
0.1            0.096   0.099       0.096   0.099       0.097   0.1
0.05           0.063   0.049       0.054   0.049       0.052   0.05
0.01           0.038   0.009       0.019   0.01        0.015   0.01
0.001          0.029   0.0009      0.007   0.001       0.003   0.001

Type I error   n=3                 n=4                 n=5
               MM      Classical   MM      Classical   MM      Classical
0.5            0.489   0.5         0.495   0.5         0.496   0.5
0.1            0.097   0.1         0.098   0.099       0.098   0.1
0.05           0.054   0.05        0.052   0.049       0.051   0.05
0.01           0.018   0.01        0.013   0.01        0.012   0.01
0.001          0.006   0.0009      0.002   0.001       0.002   0.001
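To illustrate the step from moments to quantiles, here is a minimal sketch that turns a mean, standard deviation, skewness and excess kurtosis into an approximate quantile. It uses the Cornish-Fisher expansion as a simple stand-in, not a Pearson-family fit, so it shows the general idea rather than the method behind the tables. The function name and the moment values in the usage example are hypothetical.

```python
from statistics import NormalDist


def cornish_fisher_quantile(p, mean, sd, skew=0.0, excess_kurt=0.0):
    """Approximate the p-quantile of a distribution from its first four moments.

    Cornish-Fisher expansion around the standard normal quantile z_p.
    This is a swapped-in technique, not a Pearson-family fit.
    """
    z = NormalDist().inv_cdf(p)
    w = (
        z
        + (z**2 - 1) * skew / 6
        + (z**3 - 3 * z) * excess_kurt / 24
        - (2 * z**3 - 5 * z) * skew**2 / 36
    )
    return mean + sd * w


if __name__ == "__main__":
    # Hypothetical null moments of a test statistic; upper-tail critical
    # values at the type I error levels used in the tables.
    for alpha in (0.5, 0.1, 0.05, 0.01, 0.001):
        q = cornish_fisher_quantile(1 - alpha, mean=0.0, sd=0.45,
                                    skew=0.2, excess_kurt=0.5)
        print(alpha, round(q, 4))
```

With zero skewness and zero excess kurtosis the function reduces to the normal quantile. The expansion can lose accuracy far in the tails or for strongly non-normal moments, which is one reason a fitted family such as Pearson's may be preferred.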
Advantages: Combines information from the sample level and the test value level. No assumptions about the shape of the distribution (only finite moments are required). Analytical solution. Properties of the estimators can be obtained. The estimator can be improved.
Disadvantages: A different estimator is needed for each test statistic. Estimators of the central moments of the test statistics are required. The estimate can fall outside the parameter space.
Thanks for your attention! Questions? Suggestions? Or write to me at:
Work funded by the Marie Curie RTN “Transistor” (“Trans-cis elements regulating key switches in plant development”).