Identifying Differentially Expressed Genes

Statistical models for expression measurements under two different conditions:
Assume we have two experimental conditions (j = 1, 2). We measure the expression of all genes n times under both experimental conditions (n two-channel microarrays). For a specific gene (focusing on a single gene), x_ij is the i-th measurement under condition j.
μ1, μ2 and σ are unknown model parameters: μ_j represents the average expression measurement over a large number of replicated experiments, and σ represents the variability of the measurements.
The question of whether the gene is differentially expressed corresponds to assessing whether μ1 ≠ μ2. The strength of evidence in the observed data that this is the case is expressed in terms of a p-value.
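The slide does not write out the distributional form of the model; the usual assumption behind the t-test in this setting is that each measurement is normally distributed around the condition mean μ_j with a common standard deviation σ. A minimal R sketch of that model for a single gene (all parameter values below are made up purely for illustration):

# Hypothetical parameter values for one gene (illustration only)
mu1   <- 8.0   # average log2 expression under condition 1
mu2   <- 9.0   # average log2 expression under condition 2
sigma <- 0.5   # measurement variability, assumed the same in both conditions
n     <- 6     # number of replicate measurements per condition

set.seed(1)
x1 <- rnorm(n, mean = mu1, sd = sigma)   # x_i1, i = 1..n
x2 <- rnorm(n, mean = mu2, sd = sigma)   # x_i2, i = 1..n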
Estimate the model parameters based on the data.
P-value:
Calculate the t-statistic, which summarizes the information in the data about our hypothesis of interest (μ1 ≠ μ2).
Establish the null-distribution of the t-statistic (the distribution assuming the "null-hypothesis" that μ1 = μ2). The "null-distribution" in this case turns out to be the t-distribution with n-1 degrees of freedom.
The p-value is the probability of observing, under the "null-distribution", a value as extreme as or more extreme than the one calculated from the data (t*).
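As a concrete sketch of those steps in R, here is the pooled-variance two-sample t-statistic and its two-tailed p-value, assuming x1 and x2 hold the measurements from the two conditions (for instance the simulated values above). This is the same formula that is worked through by hand later in the transcript, where the degrees of freedom come out as n1 + n2 - 2 (df = 10 for 6 replicates per condition):

n1 <- length(x1); n2 <- length(x2)
vp    <- (((n1-1)*var(x1)) + ((n2-1)*var(x2))) / (n1 + n2 - 2)   # pooled variance
tstar <- (mean(x1) - mean(x2)) / sqrt(vp * ((1/n1) + (1/n2)))    # t-statistic t*
df    <- n1 + n2 - 2                                             # degrees of freedom
2 * pt(abs(tstar), df, lower.tail = FALSE)                       # two-tailed p-value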
t-distribution
The number of experimental replicates affects the precision at two levels:
1. Everything else being equal, an increase in sample size increases t*.
2. Everything else being equal, an increase in sample size "shrinks" the "null-distribution" (the t-distribution has lighter tails as the degrees of freedom grow).
Suppose that t* = 3. What is the difference in p-values depending on the sample size (degrees of freedom) alone?
df = 1: p-value = 0.2
df = 2: p-value = 0.1
df = 10: p-value = 0.01
df = 100: p-value = 0.003
t-distribution

# Plotting t-distributions with different degrees of freedom
x <- seq(-5, 5, .1)
plot(x, dt(x, 100), type="l", col="black", lwd=2, ylab="")
lines(x, dt(x, 10), col="green", lwd=2)
lines(x, dt(x, 2), col="blue", lwd=2)
lines(x, dt(x, 1), col="red", lwd=2)
legend(2, y=0.4, c("df = 1", "df = 2", "df = 10", "df = 100"),
       col=c("red", "blue", "green", "black"), lty=rep("solid", 4), lwd=2)

# Calculating two-tailed p-values (outputs shown rounded)
> 2*pt(3,100,lower.tail=FALSE)
[1] 0.0034
> 2*pt(3,10,lower.tail=FALSE)
[1] 0.0133
> 2*pt(3,2,lower.tail=FALSE)
[1] 0.0955
> 2*pt(3,1,lower.tail=FALSE)
[1] 0.2048
>
Performing t-test

> loadURL("
> SimpleData[1,]
     Name ID W1 C1 W2 C2 W3 C3 C4 W4 C5 W5 C6 W6
1 no name Rn
> LSimpleData <- SimpleData
> LSimpleData[,3:14] <- log(SimpleData[,3:14], base=2)
> LSimpleData[1,]
     Name ID W1 C1 W2 C2 W3 C3 C4 W4 C5 W5 C6 W6
1 no name Rn
> grep("W", dimnames(SimpleData)[[2]])
[1]  3  5  7 10 12 14
> grep("C", dimnames(SimpleData)[[2]])
[1]  4  6  8  9 11 13
> W <- grep("W", dimnames(SimpleData)[[2]])
> C <- grep("C", dimnames(SimpleData)[[2]])
> t.test(LSimpleData[1,W], LSimpleData[1,C], var.equal=TRUE)

        Two Sample t-test

data:  LSimpleData[1, W] and LSimpleData[1, C]
t = , df = 10, p-value =
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
sample estimates:
mean of x mean of y
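The transcript only prints the full t.test() output. For downstream use it helps to know that t.test() returns a list (an "htest" object) whose components can be pulled out directly; a small sketch, assuming LSimpleData, W and C are defined as above:

# Store the test result and extract its individual components
tt <- t.test(LSimpleData[1,W], LSimpleData[1,C], var.equal=TRUE)
tt$statistic   # the t-statistic (t*)
tt$parameter   # degrees of freedom
tt$p.value     # two-tailed p-value
tt$conf.int    # 95 percent confidence interval for the difference in means
tt$estimate    # the two sample means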
Performing t-test "by hand" (the same calculation as t.test() above)

> MW <- mean(t(LSimpleData[1,W]))      # mean of the W measurements (condition 1)
> MW
[1]
> MC <- mean(t(LSimpleData[1,C]))      # mean of the C measurements (condition 2)
> MC
[1]
> VW <- var(t(LSimpleData[1,W]))       # sample variance, condition 1
> VW
> VC <- var(t(LSimpleData[1,C]))       # sample variance, condition 2
> VC
> NW <- sum(!is.na(LSimpleData[1,W]))  # number of non-missing measurements, condition 1
> NW
[1] 6
> NC <- sum(!is.na(LSimpleData[1,C]))  # number of non-missing measurements, condition 2
> NC
[1] 6
> VWC <- (((NW-1)*VW) + ((NC-1)*VC)) / (NC+NW-2)      # pooled variance estimate
> VWC
> DF <- NW + NC - 2                                   # degrees of freedom
> DF
[1] 10
> TStat <- abs(MW-MC) / ((VWC*((1/NW)+(1/NC)))^0.5)   # t-statistic
> TStat
> TPvalue <- 2*pt(TStat, DF, lower.tail=FALSE)        # two-tailed p-value
> TPvalue
>
# The same result, using the built-in function:
> t.test(LSimpleData[1,W], LSimpleData[1,C], var.equal=TRUE)

        Two Sample t-test

data:  LSimpleData[1, W] and LSimpleData[1, C]
t = , df = 10, p-value =
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
sample estimates:
mean of x mean of y
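The transcript stops at a single gene. To flag differentially expressed genes across the whole array, the same test has to be run on every row. One way to do this (not shown in the original slides) is sketched below, assuming LSimpleData, W and C are defined as above and that every row has complete data:

# Two-sample t-test p-value for every gene (row) of the log2 data
nW <- length(W)
nC <- length(C)
gene.pvalues <- apply(LSimpleData[, c(W, C)], 1, function(x)
  t.test(x[1:nW], x[nW + (1:nC)], var.equal = TRUE)$p.value)

# Number of genes passing the 5% cut-off discussed below
sum(gene.pvalues < 0.05, na.rm = TRUE)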
Statistical Inference and Statistical Significance – P-value

Statistical inference consists of drawing conclusions about the measured phenomenon (e.g. gene expression) in terms of probabilistic statements based on the observed data. The p-value is one way of doing this.
The p-value is NOT the probability of the null hypothesis being true. The rigorous interpretation of a p-value is tricky. It was introduced to measure the level of evidence against the "null-hypothesis", or better to say in favor of a "positive experimental finding". In this context a smaller p-value (say, 0.001) is interpreted as stronger evidence than a p-value of 0.01.
Establishing statistical significance (is a difference in expression level statistically significant or not) requires that we establish "cut-off" points for our "measure of significance" (the p-value). For various historical reasons the cut-off 0.05 is generally used to establish "statistical significance". It is a rather arbitrary cut-off, but it is taken as a gold standard.
Originally the p-value was introduced as a descriptive measure to be used in conjunction with other criteria to judge the strength of evidence one way or another.
Statistical Inference and Statistical Significance – Hypothesis Testing

The 5% cut-off point comes from the hypothesis-testing world. In this world the exact magnitude of the p-value does not matter; it only matters whether it is smaller than the pre-specified statistical significance cut-off (α). The null hypothesis is rejected in favor of the alternative hypothesis at a significance level of α = 0.05 if p-value < 0.05.
A Type I error is committed when the null hypothesis is falsely rejected. A Type II error is committed when the null hypothesis is not rejected but it is false.
By following this "decision making scheme" you will, on average, falsely reject 5% of true null hypotheses. If such a "decision making scheme" is adopted to identify differentially expressed genes on a microarray, 5% of non-differentially expressed genes will be falsely implicated as differentially expressed (see the simulation sketch below). A family-wise Type I error is committed if any null hypothesis in a set of null hypotheses is falsely rejected.
Establishing statistical significance is a necessary but not sufficient step in assuring the "reproducibility" of a scientific finding – an important point that will be further discussed when we start talking about issues in experimental design. The other essential ingredient is a "representative sample" from the "population of interest". This is still a murky point in molecular biology experimentation.
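The 5% false-rejection rate can be checked directly by simulation. A minimal sketch, not part of the original slides: generate many "genes" with no real difference between conditions (both groups drawn from the same distribution) and count how often the t-test p-value falls below 0.05.

# Simulate 10000 non-differentially expressed genes, 6 replicates per condition
set.seed(123)
n.genes <- 10000
null.pvalues <- replicate(n.genes, {
  g1 <- rnorm(6, mean = 8, sd = 0.5)   # condition 1
  g2 <- rnorm(6, mean = 8, sd = 0.5)   # condition 2, same mean: null is true
  t.test(g1, g2, var.equal = TRUE)$p.value
})
mean(null.pvalues < 0.05)   # close to 0.05: about 5% are falsely "significant"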