
The Need For Resampling In Multiple Testing

Correlation Structures Tukey's T method exploits the correlation structure between the test statistics and has somewhat smaller critical values than the Bonferroni-style critical values. It is easier to obtain a statistically significant result when the correlation structure is incorporated.

Correlation Structures The incorporation of correlation structure results in a smaller adjusted p-value than Bonferroni-style adjustment, again resulting in more powerful tests. The incorporation of correlation structures can be very important when the correlations are extremely large.

Correlation Structures Often, certain variables are recognized as duplicating information and are dropped, or perhaps the variables are combined into a single measure. In that case, the correlations among the resulting variables are less extreme.

Correlation Structures In cases of moderate correlation structures, the difference between the Bonferroni adjustment and the exact adjustment can be very slight. Bonferroni inequality: Pr{∩_{i=1}^r A_i} ≥ 1 − Σ_{i=1}^r Pr{A_i^c}. A small value of Pr{A_i^c} corresponds to a small per-comparison error rate.
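
As a sanity check, the Bonferroni inequality can be verified by simulation. The events and correlation structure below are illustrative choices, not taken from the slides:

```python
import math
import random

# Monte Carlo check of Pr{intersection of A_i} >= 1 - sum Pr{A_i^c}.
# A_i is the event {|Z_i| <= c}, where the Z_i share a common component,
# so the events are positively correlated (an illustrative construction).
random.seed(1)
r, c, nsim = 5, 2.0, 20000

all_hold = 0            # simulations in which every A_i occurs
comp_counts = [0] * r   # occurrences of each complement A_i^c

for _ in range(nsim):
    shared = random.gauss(0.0, 1.0)
    events = []
    for i in range(r):
        z = math.sqrt(0.5) * shared + math.sqrt(0.5) * random.gauss(0.0, 1.0)
        inside = abs(z) <= c
        events.append(inside)
        if not inside:
            comp_counts[i] += 1
    if all(events):
        all_hold += 1

lhs = all_hold / nsim                            # Pr{intersection of A_i}
rhs = 1.0 - sum(n / nsim for n in comp_counts)   # 1 - sum Pr{A_i^c}
print(lhs, rhs)
assert lhs >= rhs  # the Bonferroni inequality holds, here even exactly per count
```

The inequality holds for the empirical proportions by the same union-bound counting argument, so the final assertion is deterministic, not a statistical check.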

Correlation Structures Incorporating the dependence structure becomes less important at smaller significance levels. If a Bonferroni-style correction is reasonable, then why bother with resampling?

Distributional Characteristics Other distributional characteristics, such as discreteness and skewness, can have a dramatic effect, even for small p-values. Nonnormality is of equal or greater concern than correlation structure in multiple testing applications.

The Need For Resampling In Multiple Testing: Distribution Of Extremal Statistics Under Nonnormality

Noreen's analysis of tests for a single lognormal mean The observations are Y_ij, i = 1, …, 10, j = 1, …, n, all independent and identically distributed as e^Z, where Z denotes a standard normal random variable. The hypotheses tested are H_i: E(Y_ij) = √e, with upper- or lower-tailed alternatives. The test statistic is t = (ȳ − √e)/(s/√n).

Distributions of t-statistics For each graph, the t-statistics were simulated using lognormal y_ij. The solid lines (actual) show the distribution of t when sampling from the lognormal population, and the dotted lines (nominal) show the distribution of t when sampling from a normal population.
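
A minimal sketch of this simulation; the sample size, replication count, and critical value are illustrative choices:

```python
import math
import random

# Distribution of the one-sample t-statistic when the data are standard
# lognormal. E[e^Z] = e^(1/2) = sqrt(e) for Z ~ N(0,1), so sqrt(e) is the
# true mean and the null hypothesis holds.
random.seed(0)
n, nrep = 10, 20000
mu = math.sqrt(math.e)
tcrit = 1.833  # approximate upper 5% point of Student's t with 9 df

lower, upper = 0, 0
for _ in range(nrep):
    y = [math.exp(random.gauss(0.0, 1.0)) for _ in range(n)]
    ybar = sum(y) / n
    s = math.sqrt(sum((v - ybar) ** 2 for v in y) / (n - 1))
    t = (ybar - mu) / (s / math.sqrt(n))
    if t <= -tcrit:
        lower += 1
    if t >= tcrit:
        upper += 1

# Nominal rejection rate is 0.05 in each tail; the actual lower-tail rate
# is inflated and the upper-tail rate deflated by the skewness.
print(lower / nrep, upper / nrep)
```

The printed proportions illustrate the slide's point: the heavy lower tail of the actual distribution inflates lower-tailed rejections, while the curtailed upper tail deflates upper-tailed rejections.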

Distributions of t-statistics

Because the lower tail area of the actual distribution of the t-statistic is larger than the corresponding tail of the approximating Student's t-distribution, the lower-tailed test rejects H_i more often than it should. The upper tail area of the actual distribution is smaller than that of the approximating t-distribution, yielding fewer rejections than expected.

Distributions of t-statistics As can be expected, with larger sample sizes the approximations become better, and the actual proportion of rejections more closely approximates the nominal proportion.

Distributions of minimum and maximum t-statistics When one considers maximal and minimal t-statistics, the effect of the skewness is greatly amplified.
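
The amplification can be seen in a small simulation that records the minimum and maximum of k = 10 t-statistics per replication; all sizes are illustrative:

```python
import math
import random

# Extremes of k one-sample t-statistics from independent lognormal samples.
# Under normality, min and max would be (roughly) mirror images; under
# lognormal data, the minimum reaches much further out than the maximum.
random.seed(2)
k, n, nrep = 10, 10, 5000
mu = math.sqrt(math.e)  # true mean of the standard lognormal

def tstat(y):
    ybar = sum(y) / len(y)
    s = math.sqrt(sum((v - ybar) ** 2 for v in y) / (len(y) - 1))
    return (ybar - mu) / (s / math.sqrt(len(y)))

mins, maxs = [], []
for _ in range(nrep):
    ts = [tstat([math.exp(random.gauss(0.0, 1.0)) for _ in range(n)])
          for _ in range(k)]
    mins.append(min(ts))
    maxs.append(max(ts))

# The average minimum lies far below zero, while the average maximum
# stays comparatively close to it: skewness amplified by the extremes.
print(sum(mins) / nrep, sum(maxs) / nrep)
```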

Distributions of minimum t-statistics

Distributions of minimum t-statistics (lower-tail) Because values in the extreme lower tails of the actual distributions are much more likely than under the corresponding t-distribution, the possibility of observing a significant result can be much larger than expected under the assumption of normal data. This causes false significances.

Distributions of minimum t-statistics (upper-tail) It is quite difficult to achieve a significant upper-tailed test, since the true distributions are so sharply curtailed in the upper tails. The test has very low power and will likely fail to detect alternative hypotheses.

Distributions of maximum t-statistics

Distributions of minimum and maximum t-statistics We can expect that these results will become worse as the number of tests (k) increases.

Two-sample Tests The normal-based tests are much more robust when testing contrasts involving two or more groups: T = (Ȳ_1 − Ȳ_2)/(s√(1/n_1 + 1/n_2)).

There is an approximate cancellation of skewness terms in the distribution of T, leaving the distribution roughly symmetric. We therefore expect the normal-based procedures to perform better than in the one-sample case.

Two-sample Tests According to the rejection proportions, both procedures perform fairly well. Still, the bootstrap performs better than the normal approximation.
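
A sketch of a bootstrap two-sample test of this kind, resampling mean-centered data to approximate the null distribution of T. The sample sizes, replication count, and two-sided form are illustrative choices, not the slides' exact setup:

```python
import math
import random

# Bootstrap two-sample t-test: center each sample at its own mean so the
# bootstrap world obeys H0, then resample to get the null distribution of T.
random.seed(3)

def tstat2(y1, y2):
    n1, n2 = len(y1), len(y2)
    m1, m2 = sum(y1) / n1, sum(y2) / n2
    ss = sum((v - m1) ** 2 for v in y1) + sum((v - m2) ** 2 for v in y2)
    s = math.sqrt(ss / (n1 + n2 - 2))  # pooled standard deviation
    return (m1 - m2) / (s * math.sqrt(1 / n1 + 1 / n2))

# Two lognormal samples drawn under the null (equal means).
y1 = [math.exp(random.gauss(0.0, 1.0)) for _ in range(15)]
y2 = [math.exp(random.gauss(0.0, 1.0)) for _ in range(15)]
t_obs = tstat2(y1, y2)

c1 = [v - sum(y1) / len(y1) for v in y1]  # centered sample 1
c2 = [v - sum(y2) / len(y2) for v in y2]  # centered sample 2

nboot, count = 2000, 0
for _ in range(nboot):
    b1 = [random.choice(c1) for _ in range(len(c1))]
    b2 = [random.choice(c2) for _ in range(len(c2))]
    if abs(tstat2(b1, b2)) >= abs(t_obs):
        count += 1
p_boot = count / nboot  # two-sided bootstrap p-value
print(p_boot)
```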

The Need For Resampling In Multiple Testing: The Performance Of Bootstrap Adjustments

Bootstrap Adjustments Use the adjusted p-values for the lower-tailed tests. The pivotal statistics used to test the ten hypotheses are the t-statistics defined earlier.

Bootstrap Adjustments For Ten Independent Samples

Bootstrap Adjustments The adjustment algorithm in Algorithm 2.7 was placed within an 'outer loop', in which the data y_ij were repeatedly generated iid from the standard lognormal distribution.

Bootstrap Adjustments We generated NSIM = 4000 data sets, all under the complete null hypothesis. For each data set, we computed the bootstrap adjusted p-values using NBOOT = 1000 bootstrap samples. The proportion of the NSIM samples having an adjusted p-value below α estimates the true FWE level of the method.
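
The outer-loop experiment can be sketched as follows. Because the statistics are pivotal, a minT-style bootstrap adjustment is used here in place of raw p-value resampling, and NSIM and NBOOT are reduced well below the slides' values to keep the sketch fast:

```python
import math
import random

# Estimate the familywise error rate of a single-step bootstrap (minT,
# lower-tailed) adjustment under the complete null. The slides use
# NSIM = 4000 and NBOOT = 1000; smaller values are used here for speed.
random.seed(4)
k, n = 5, 10
NSIM, NBOOT, alpha = 200, 200, 0.05
mu = math.sqrt(math.e)  # true mean of the standard lognormal

def tstat(y, center):
    ybar = sum(y) / len(y)
    s = math.sqrt(sum((v - ybar) ** 2 for v in y) / (len(y) - 1))
    return (ybar - center) / (s / math.sqrt(len(y)))

rejections = 0
for _ in range(NSIM):
    data = [[math.exp(random.gauss(0.0, 1.0)) for _ in range(n)]
            for _ in range(k)]
    t_min = min(tstat(y, mu) for y in data)
    # Bootstrap the distribution of the minimum t, resampling each
    # sample around its own mean (the statistics are pivotal).
    count = 0
    for _ in range(NBOOT):
        tb = [tstat([random.choice(y) for _ in range(n)], sum(y) / n)
              for y in data]
        if min(tb) <= t_min:
            count += 1
    if count / NBOOT <= alpha:  # adjusted p-value for the smallest t
        rejections += 1

print(rejections / NSIM)  # rough estimate of the true FWE
```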

Rejection Proportions

The bootstrap adjustments The bootstrap adjustments are a much better approximation. They may have fewer excess Type I errors than the parametric Sidak adjustments (lower tail), and they may be more powerful than the parametric Sidak adjustments (upper tail).

Step-down Methods For Free Combinations

Step-down methods Rather than adjusting all p-values according to the distribution of min_j P_j, only the minimum p-value is adjusted using this distribution. The remaining p-values are then adjusted according to smaller and smaller sets of p-values. This makes the adjusted p-values smaller, thereby improving on the power of the single-step adjustment method.

Free combinations If, for every subcollection of j hypotheses {H_i1, …, H_ij}, the simultaneous truth of {H_i1, …, H_ij} and falsehood of the remaining hypotheses is a plausible event, then the hypotheses satisfy the free combinations condition. In other words, each of the 2^k outcomes of the k-hypothesis problem is possible.

Holm’s method (Step-down methods)

Bonferroni Step-down Adjusted p-values A consequence of the max adjustment is that the adjusted p-values have the same monotonicity as the original p-values.

Example Consider a multiple testing situation with k = 5 and α = 0.05, where the ordered p-values p_(i) are 0.009, 0.011, 0.012, 0.134, and a fifth, larger value. Let H_(1) be the hypothesis corresponding to the p-value 0.009, H_(2) the hypothesis corresponding to 0.011, and so on.

Example

Monotonicity enforcement In stages 2 and 3, the adjusted p-values were set equal to the first adjusted p-value, 0.045. Without such monotonicity enforcement, the adjusted p-values p̃_2 and p̃_3 would be smaller than p̃_1: one might accept H_(1) yet reject H_(2) and H_(3), which would run contrary to Holm's algorithm.

Bonferroni Step-down Method Using the single-step method, the adjusted p-values are obtained by multiplying every raw p-value by five. Only the H_(1) test would be declared significant at FWE = 0.05. The step-down Bonferroni method is clearly superior to the single-step Bonferroni method. Slightly less conservative adjustments are possible by using the Sidak inequality, taking the adjustments to be 1 − (1 − p_(j))^(k−j+1) at step j.
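
Holm's step-down adjustment for this example can be computed directly. The fifth, largest p-value did not survive in the transcript, so 0.512 below is a purely illustrative stand-in; it does not affect the first three adjusted values:

```python
# Holm's step-down Bonferroni adjustment with monotonicity enforcement.
# The fifth p-value (0.512) is an illustrative stand-in, not from the slides.
pvals = [0.009, 0.011, 0.012, 0.134, 0.512]

def holm_adjust(p):
    """Multiply the j-th smallest p-value by k - j + 1, cap at 1, and
    take running maxima so the adjusted values stay monotone."""
    k = len(p)
    order = sorted(range(k), key=lambda i: p[i])
    adj = [0.0] * k
    running = 0.0
    for j, i in enumerate(order):
        running = max(running, min((k - j) * p[i], 1.0))
        adj[i] = running
    return adj

adjusted = holm_adjust(pvals)
print([round(a, 3) for a in adjusted])  # first three are all 0.045
```

With these numbers the first three adjusted p-values are 0.045, so H_(1), H_(2), and H_(3) are all rejected at FWE = 0.05, whereas the single-step Bonferroni adjustment rejects only H_(1).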

The free step-down adjusted p-values (Resampling) The adjustments may be made less conservative by incorporating the precise dependence characteristics. Let the ordered p-values have indexes r_1, r_2, …, so that p_(1) = p_r1, p_(2) = p_r2, …, p_(k) = p_rk.

The free step-down adjusted p-values (Resampling)

The adjustments are uniformly smaller than the single-step adjusted p-values, since the minima are taken over successively smaller sets.
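
A sketch of the free step-down resampling idea, in a minT flavour for lower-tailed pivotal t-statistics; the data generation and all sizes are illustrative:

```python
import math
import random

# Free step-down resampling adjustment: at step j the resampled minimum
# is taken only over the hypotheses not yet adjusted, and running maxima
# enforce monotonicity of the adjusted p-values.
random.seed(5)
k, n, NBOOT = 5, 10, 500
mu = math.sqrt(math.e)  # true mean of the standard lognormal

def tstat(y, center):
    ybar = sum(y) / len(y)
    s = math.sqrt(sum((v - ybar) ** 2 for v in y) / (len(y) - 1))
    return (ybar - center) / (s / math.sqrt(len(y)))

data = [[math.exp(random.gauss(0.0, 1.0)) for _ in range(n)]
        for _ in range(k)]
t_obs = [tstat(y, mu) for y in data]
order = sorted(range(k), key=lambda i: t_obs[i])  # most significant first

# Bootstrap null t-statistics, resampling each sample around its own mean.
boot = []
for _ in range(NBOOT):
    boot.append([tstat([random.choice(data[i]) for _ in range(n)],
                       sum(data[i]) / n) for i in range(k)])

adj = [0.0] * k
running = 0.0
for j, i in enumerate(order):
    rest = order[j:]  # indices of hypotheses not yet adjusted
    count = sum(1 for tb in boot if min(tb[l] for l in rest) <= t_obs[i])
    running = max(running, count / NBOOT)
    adj[i] = running
print([round(a, 3) for a in adj])
```

Because the minima are taken over successively smaller index sets, each step-down adjustment can only shrink relative to the single-step adjustment, mirroring the statement above.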

Free Step-down Resampling Method

Example k = 5. The p-values are 0.009, 0.011, 0.012, 0.134, and a fifth, larger value. Suppose these correspond to the original hypotheses H_2, H_4, H_1, H_3, and H_5.

A Specific Step-down Illustration

Step-Down Methods For Restricted Combinations

When the hypotheses are restricted, then certain combinations of true hypotheses necessarily imply truth or falsehood of other hypotheses. In these cases, the adjustments may be made smaller than the free step-down adjusted p-values.

Step-Down Methods For Restricted Combinations The restricted step-down method starts with the ordered p-values p_(1) ≤ … ≤ p_(k), p_(j) = p_rj. If H_(j) is rejected, then H_(1), …, H_(j−1) must have been previously rejected. The multiplicity adjustment for the restricted step-down method at stage j considers only those hypotheses that can possibly be true, given that the previous j − 1 hypotheses are all false.

Step-Down Methods For Restricted Combinations Define sets S_j of hypotheses, each including H_(j), that can be true at stage j given that all previous hypotheses are false. Here S = {r_1, …, r_k} = {1, …, k}.

The Bonferroni adjustments Define |K| = the number of elements in the set K, and M_j = max_{K ∈ S_j} |K|.

Step-Down Methods For Restricted Combinations (Bonferroni) The adjusted p-values can be no larger than the free Bonferroni adjustments, since M_j ≤ k − j + 1. In the case of free combinations, the truth of a collection of null hypotheses indexed by {r_j, …, r_k} cannot contradict the falsehood of all nulls indexed by {r_1, …, r_j−1}. In this case, S_j = {{r_j, …, r_k}}, thus M_j = k − j + 1, and the restricted method reduces to the free method as a special case.

Step-Down Methods For Restricted Combinations (Resampling)

At each step of (2.13), the probabilities are computed over subsets of the sets in (2.10). Thus, the restricted adjustments (2.13) can be no larger than the free adjustments.

Error Rate Control For Step-Down Methods: Error Rate Control Under H_0^K

The probability of rejecting at least one true H_0i is no larger than α, no matter which subset of the k hypotheses happens to be true. Let K_0 = {i_1, …, i_j} denote the collection of hypotheses H_0i that are true, and let x_K^α denote the α quantile of the null distribution of min_{t ∈ K} P_t.

Error Rate Control Under H_0^K Define

Critical Value-Based Sequentially Rejective Algorithm

Error Rate Control Under H_0^K We have the following relationships, where j ≤ k − |K_0| + 1 is defined by min_{t ∈ K_0} P_t = P_(j) = P_rj.

Error Rate Control Under H_0^K

which demonstrates that the restricted step-down adjustments strongly control the FWE.

Error Rate Control Under H_k Suppose that H_k is true; then the distribution of Y is G. Suppose also that there exist random variables P_i0, defined on the same probability space as the P_i, for which P_i ≥ P_i0 for all i.

Error Rate Control Under H_k The error rate is controlled:

Error Rate Control Under H_k Such P_i0 frequently exist in parametric analyses; for example, the two-sample t-statistic for testing H_0: μ_1 ≤ μ_2 may be written as t = t_0 + (μ_1 − μ_2)/(s√(1/n_1 + 1/n_2)), where t_0 denotes the statistic computed with the means recentered at μ_1 and μ_2.

Error Rate Control Under H_k The p-value for this test is p = Pr(T_2(n−1) ≥ t). Letting p_0 be defined by p_0 = Pr(T_2(n−1) ≥ t_0), we have p_0 < p whenever μ_1 < μ_2.