Sequence comparison: Multiple testing correction

1 Sequence comparison: Multiple testing correction
Genome 559: Introduction to Statistical and Computational Genomics Prof. William Stafford Noble

2 One-minute response Difficulty connecting conceptual part of course to programming. This will improve now that you know more programming constructs. Allocate more time to practice problems. Remind how can you exit an if test for a command line argument? sys.exit() will stop the program It might be helpful to have the previous question’s answer on the screen at the same time as the next question. I don’t know if I have space. More sample problems with solutions would be helpful. There are more sample problems linked from the last page of the slides. It would be helpful to do a brief recap at the beginning of class with Most Important Functions we’ve covered so far. Explanation of string formatting was helpful. Nice pace. Python is really fun! I liked the examples for the if statement problems. Love going through Python in class.

3 The show so far … Most statistical tests compare observed data to the expected result according to the null hypothesis. Sequence similarity scores follow an extreme value distribution, which is characterized by a larger tail. The p-value associated with a score is the area under the curve to the right of that score. The p-value for score x for an extreme value distribution with parameters μ and λ is P = 1 − exp(−e^(−λ(x − μ))).
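
A minimal Python sketch of this formula (the score and parameter values in the example call are made up for illustration, not taken from the slides):

```python
import math

def evd_pvalue(x, mu, lam):
    """Area to the right of score x under an extreme value distribution
    with location parameter mu and scale parameter lam (lambda)."""
    return 1.0 - math.exp(-math.exp(-lam * (x - mu)))

# Illustrative values only: a high score gives a small p-value.
print(evd_pvalue(45.0, mu=25.0, lam=0.3))
```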

4 What p-value is significant?

5 What p-value is significant?
The most common thresholds are 0.01 and 0.05. A threshold of 0.05 means you are 95% sure that the result is significant. Is 95% enough? It depends upon the cost associated with making a mistake. Examples of costs: Doing expensive wet lab validation. Making clinical treatment decisions. Misleading the scientific community. Most sequence analysis uses more stringent thresholds because the p-values are not very accurate.

6 Two types of errors In statistical testing, there are two types of errors: False positive (aka “Type I error”): incorrectly rejecting the null hypothesis. I.e., saying that two sequences are homologous when they are not. False negative (aka “Type II error”): incorrectly accepting the null hypothesis. I.e., saying that two sequences are not homologous when they are. We use p-values to control the false positive rate.

7 Multiple testing Say that you perform a statistical test with a 0.05 threshold, but you repeat the test on twenty different observations. Assume that all of the observations are explainable by the null hypothesis. What is the chance that at least one of the observations will receive a p-value less than 0.05?

8 Multiple testing Say that you perform a statistical test with a 0.05 threshold, but you repeat the test on twenty different observations. Assuming that all of the observations are explainable by the null hypothesis, what is the chance that at least one of the observations will receive a p-value less than 0.05? Pr(making a mistake) = 0.05. Pr(not making a mistake) = 0.95. Pr(not making any mistake) = 0.95²⁰ = 0.358. Pr(making at least one mistake) = 1 − 0.358 = 0.642. There is a 64.2% chance of making at least one mistake.
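
The same arithmetic in Python (assuming the twenty tests are independent):

```python
# Chance of at least one false positive across 20 independent tests,
# each performed at the 0.05 level, when every null hypothesis is true.
alpha = 0.05
n_tests = 20
p_no_mistake = (1 - alpha) ** n_tests   # 0.95**20, about 0.358
p_any_mistake = 1 - p_no_mistake        # about 0.642
print(round(p_no_mistake, 3), round(p_any_mistake, 3))
```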

9 Family-wise error rate
For a single test, the Type I error rate is the probability of incorrectly rejecting the null hypothesis. For multiple tests, the family-wise error rate is the probability of incorrectly rejecting at least one null hypothesis.

10 Bonferroni correction
Divide the desired p-value threshold by the number of tests performed. For the previous example, 0.05 / 20 = 0.0025. Pr(making a mistake) = 0.0025. Pr(not making a mistake) = 0.9975. Pr(not making any mistake) = 0.9975²⁰ = 0.9512. Pr(making at least one mistake) = 1 − 0.9512 = 0.0488.
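
And the corresponding check for the corrected threshold (again assuming independent tests):

```python
# Bonferroni: test each of the 20 hypotheses at 0.05 / 20 = 0.0025.
alpha = 0.05
n_tests = 20
corrected = alpha / n_tests
fwer = 1 - (1 - corrected) ** n_tests   # about 0.049, i.e. at most 0.05
print(corrected, round(fwer, 4))
```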

11 Proof: Bonferroni adjustment controls the family-wise error rate
Let m = the number of hypotheses, m₀ = the number of true null hypotheses, ⍺ = the desired control, and pᵢ = the i-th p-value. Reject a hypothesis whenever pᵢ ≤ ⍺/m. Then
FWER = Pr(rejecting at least one true null)
≤ Σ over the true nulls of Pr(pᵢ ≤ ⍺/m)  (Boole's inequality)
≤ m₀ · (⍺/m)  (definition of a p-value: under the null, Pr(pᵢ ≤ t) ≤ t)
≤ ⍺  (definitions of m and m₀, since m₀ ≤ m).
Note: the Bonferroni adjustment does not require that the tests be independent, because Boole's inequality holds regardless.
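
A small simulation sketch (mine, not from the slides) illustrating the result: when all null hypotheses are true, p-values are uniform on [0, 1], so drawing them at random shows the uncorrected procedure making at least one false rejection most of the time, while the Bonferroni procedure keeps the family-wise error rate at or below ⍺.

```python
import random

random.seed(0)
m, alpha, trials = 20, 0.05, 100000
uncorrected_errors = 0
bonferroni_errors = 0
for _ in range(trials):
    pvals = [random.random() for _ in range(m)]   # all nulls true
    if min(pvals) <= alpha:                       # any rejection at 0.05
        uncorrected_errors += 1
    if min(pvals) <= alpha / m:                   # any rejection at 0.0025
        bonferroni_errors += 1
print(uncorrected_errors / trials)   # roughly 0.64
print(bonferroni_errors / trials)    # roughly 0.05 or less
```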

12 Database searching Say that you search the non-redundant protein database at NCBI, containing roughly one million sequences. What p-value threshold should you use?

13 Database searching Say that you search the non-redundant protein database at NCBI, containing roughly one million sequences. What p-value threshold should you use? Say that you want to use a conservative p-value threshold of 0.001. Recall that you would observe such a p-value by chance approximately once in every 1000 trials on a random database. A Bonferroni correction would suggest using a p-value threshold of 0.001 / 1,000,000 = 0.000000001 = 10⁻⁹.

14 E-values The E-value is the expected number of times that the given score would appear in a random database of the given size. One simple way to compute the E-value is to multiply the p-value by the size of the database. Thus, for a p-value of 0.001 and a database of 1,000,000 sequences, the corresponding E-value is 0.001 × 1,000,000 = 1,000. BLAST actually calculates E-values in a more complex way.
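
A one-line sketch of this simple conversion (as noted above, BLAST's actual E-value calculation is more involved):

```python
def simple_evalue(pvalue, db_size):
    """Expected number of chance hits at least this good:
    the p-value times the number of sequences searched."""
    return pvalue * db_size

print(simple_evalue(0.001, 1_000_000))   # 1000.0
```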

18 Summary Confidence thresholds of 0.01 and 0.05 are arbitrary.
In practice, the threshold should be selected based on the costs associated with false positive and false negative conclusions. The Bonferroni correction divides the desired threshold by the number of tests performed. The E-value is equivalent to a Bonferroni adjustment, except that instead of dividing the threshold, it multiplies the p-value by the number of tests performed.

