Published by Mavis Elliott. Modified over 8 years ago.
1
Statistical Genomics
Lecture 4: Statistical inference
Zhiwu Zhang, Washington State University
2
Administration
Homework 1, due Wednesday, Feb 3, 3:10 PM
3
Outline
- X² test on contingency table
- Empirical null distribution
- X² test on variance
- t test
- Hypothesis test
- Two types of error
- Power
4
Observed and expected frequency

Observed:
                Transgenic   Non-transgenic   SUM
Herbicide           35              5          40
No herbicide        35             25          60
SUM                 70             30         100

Expected:
                Transgenic   Non-transgenic   SUM
Herbicide           28             12          40
No herbicide        42             18          60
SUM                 70             30         100
5
Approximate distributions
Poisson distribution: Mean = Var = Expected
(Observed - Expected) / sqrt(Expected) ~ N(0, 1)
SUM (Observed - Expected)² / Expected ~ X²(df)
df = number of independent cells
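The normal approximation above can be checked by simulation. The following is a minimal sketch, not part of the lecture: the seed, the number of draws, and the choice of 28 as the expected count (one cell of the expected table) are assumptions made for illustration.

```r
set.seed(1)                       # assumed seed, only for reproducibility
expected <- 28                    # assumed expected count of one cell
obs <- rpois(100000, expected)    # simulated observed counts, Poisson with mean 28

# Standardize: (Observed - Expected) / sqrt(Expected)
z <- (obs - expected) / sqrt(expected)

z_mean <- mean(z)                 # should be near 0
z_var  <- var(z)                  # should be near 1
z2_mean <- mean(z^2)              # should be near 1, the mean of a X2(1) variable
c(mean = z_mean, var = z_var, mean_sq = z2_mean)
```

With a smaller expected count the Poisson distribution is more skewed and the approximation gets rougher.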
6
Observed and expected frequency
Using the observed counts (35, 5, 35, 25) and expected counts (28, 12, 42, 18):
(35-28)²/28 + (5-12)²/12 + (35-42)²/42 + (25-18)²/18
= 49/28 + 49/12 + 49/42 + 49/18 = 9.72
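The same calculation can be written in R. This is a sketch under the assumption that the expected counts come from the row and column totals (row total times column total, divided by the grand total); the variable names are illustrative and not from the lecture.

```r
# Observed counts from the herbicide table
observed <- matrix(c(35,  5,
                     35, 25),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("Herbicide", "No herbicide"),
                                   c("Transgenic", "Non-transgenic")))

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# X2 statistic: SUM (Observed - Expected)^2 / Expected
x2 <- sum((observed - expected)^2 / expected)

# Independent cells in an r x c table: (r - 1) * (c - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail probability under X2(df)
p <- 1 - pchisq(x2, df)

# Cross-check against the built-in test, with the continuity correction turned off
chk <- chisq.test(observed, correct = FALSE)
c(x2 = x2, df = df, p = p)
```

By default `chisq.test` applies Yates' continuity correction to 2 x 2 tables, so `correct = FALSE` is needed to reproduce the hand calculation.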
7
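Distribution of X²(1)
Observed 9.72: P < 1% (99% percentile: 6.97)
    k=10000   # number of random draws; k is not defined on the slide, 10000 is an assumed value
    par(mfrow=c(2,2),mar = c(3,4,1,1))
    x=rchisq(k,1)     # k draws from X2 with 1 df
    d=density(x)
    plot(x)
    plot(d)
    hist(x)
    plot(ecdf(x))
    quantile(x,.99)   # empirical 99% percentile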
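The simulated percentile can be compared with the theoretical X²(1) distribution. The sketch below is an illustration, not the lecture's code; the seed and the 100,000 draws are assumptions. The theoretical 99% percentile is about 6.63, so a simulated value such as 6.97 reflects sampling noise in the upper tail.

```r
# Theoretical values from the X2(1) distribution
q_theory <- qchisq(0.99, df = 1)      # 99% percentile
p_theory <- 1 - pchisq(9.72, df = 1)  # upper-tail probability of the observed 9.72

# Empirical counterparts from simulated draws
set.seed(1)                           # assumed seed, only for reproducibility
x <- rchisq(100000, 1)                # assumed number of draws
q_emp <- unname(quantile(x, 0.99))
p_emp <- mean(x > 9.72)               # proportion of draws above the observed value

c(q_theory = q_theory, q_emp = q_emp, p_theory = p_theory, p_emp = p_emp)
```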
8
Tests on samples
A sample has a mean of 103.6 and a variance of 27.82. The sample has 10 observations.
Q1: What is the probability that the sample was from a normal distribution with variance of 25?
Q2: What is the probability that the sample was from a normal distribution with mean of 100?
9
Q1: distribution with variance of 25
Empirical solution:
1. Sample ten observations from a normal distribution with variance of 25.
2. Calculate the observed variance.
3. Repeat the sampling to get the null distribution of the sample variances.
4. Find the percentile of the observed variance on the null distribution.
10
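Observed 27.82: P > 25% (75% percentile: 31.6)
    # Null distribution: 10,000 sample variances, each from 10 draws of N(0, sd = 5)
    x=replicate(10000, {
      s=rnorm(10,0,5)
      var(s)
    })
    length(x[x>27.82])/10000   # proportion above the observed variance; [1] 0.3516
    par(mfrow=c(2,2),mar = c(3,4,1,1))
    d=density(x)
    plot(x)
    plot(d)
    hist(x)
    plot(ecdf(x))
    quantile(x,.75)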
11
Q1: distribution with variance of 25
Theoretical solution: (n-1) * sample variance / 25 follows X²(n-1)
    v=(10-1)*27.82/25   # 10.0152
    1-pchisq(v,9)       # about 0.349
vs. 0.3516 from empirical
12
Q2: distribution with mean of 100
Empirical solution:
1. Sample ten observations from N(100, 25).
2. Calculate the mean.
3. Repeat the process 10,000 times.
4. Take the 10,000 means as the null distribution.
5. Determine the percentile of the tested mean (103.6) on the null distribution.
13
Q2: distribution with mean of 100
Observed 103.6: 1% < P < 5% (95% percentile: 102.6; 99% percentile: about 103.7)
    # Null distribution: 10,000 sample means, each from 10 draws of N(100, sd = 5)
    x=replicate(10000, {
      s=rnorm(10,100,5)
      mean(s)
    })
    length(x[x>103.6])/10000   # proportion above the observed mean; [1] 0.0132
    par(mfrow=c(2,2),mar = c(3,4,1,1))
    d=density(x)
    plot(x)
    plot(d)
    hist(x)
    plot(ecdf(x))
    quantile(x,.95)
    quantile(x,.99)
14
t test
15
    T=(103.6-100)/(5/sqrt(10))   # standard error uses sd = 5
    P=1-pt(T,9)                  # upper-tail probability, 9 degrees of freedom
    c(T,P)                       # 2.27683992 0.02440704
At the 5% threshold, reject the hypothesis that the sample was from a distribution with mean of 100.
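The statistic above divides by 5, the standard deviation of the hypothesized distribution. A t test more commonly estimates the standard deviation from the sample itself. The sketch below is an illustration, not the lecture's calculation: it assumes the sample variance of 27.82 quoted earlier and compares both versions, with illustrative variable names.

```r
n    <- 10       # sample size
xbar <- 103.6    # sample mean
s2   <- 27.82    # sample variance
mu0  <- 100      # mean under H0

# Statistic as computed on the slide: standard error from the assumed sd of 5
t_known <- (xbar - mu0) / (5 / sqrt(n))
p_known <- 1 - pt(t_known, n - 1)

# Alternative (assumption of this sketch): standard error from the sample variance
t_sample <- (xbar - mu0) / sqrt(s2 / n)
p_sample <- 1 - pt(t_sample, n - 1)

c(t_known = t_known, p_known = p_known, t_sample = t_sample, p_sample = p_sample)
```

Both one-sided P values fall below 5%, so the conclusion does not change here. The sample-based statistic is slightly smaller because the sample variance exceeds 25.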
16
F test
17
Hypothesis test
Null hypothesis (H0): initial assumption
Alternative hypothesis (Ha): opposite to the assumption
Find the probability, under H0, of a result at least as extreme as the one observed (the P value)
If the probability is below the threshold (e.g. 5%), reject H0 and accept Ha
Otherwise, accept H0 (that is, fail to reject it)
18
Two types of errors and power
Type I error: reject a true H0 (false positive); its probability is the threshold used, e.g. α = 5%
Type II error: accept a false H0 (false negative); its probability is β
Power: probability of rejecting a false H0, 1 - β
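Both error rates can be estimated by simulation. The sketch below is an illustration only: the helper `reject_rate`, the seed, the one-sided t test, and the alternative mean of 103.6 are all assumptions, chosen to match the running example (n = 10, sd = 5, H0 mean of 100).

```r
set.seed(1)   # assumed seed, only for reproducibility

# Hypothetical helper: proportion of simulated samples in which a one-sided
# t test of H0 (mean = mu0) is rejected at level alpha
reject_rate <- function(true_mean, n = 10, mu0 = 100, sd = 5,
                        alpha = 0.05, reps = 10000) {
  p <- replicate(reps, {
    s <- rnorm(n, true_mean, sd)
    t.test(s, mu = mu0, alternative = "greater")$p.value
  })
  mean(p < alpha)
}

type1 <- reject_rate(100)     # H0 is true: the rejection rate estimates alpha
power <- reject_rate(103.6)   # H0 is false (assumed true mean 103.6): estimates 1 - beta
beta  <- 1 - power            # Type II error rate

c(type1 = type1, power = power, beta = beta)
```

The Type I rate stays near the chosen threshold whatever the sample size, while power depends on the sample size and on how far the true mean lies from 100.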
19
Summary

Test                    H0 is True                  H0 is False
Positive (reject H0)    False positive, Type I: α   Power = 1-β
Negative (accept H0)    Specificity = 1-α           False negative, Type II: β
Sum                     100%                        100%
20
Highlight
- X² test on contingency table
- Empirical null distribution
- X² test on variance
- t test
- Hypothesis test
- Two types of error
- Power