1-18-2005 Randomization issues. Two-sample t-test vs paired t-test. I made a mistake in creating the dataset, so previous analyses will not be comparable.


1 1-18-2005 Outline
- Randomization issues
- Two-sample t-test vs paired t-test
- I made a mistake in creating the dataset, so previous analyses will not be comparable with new ones
- Issues with background subtraction
- Limma as the general tool for analyzing microarray data
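
To make the two-sample vs paired distinction concrete, here is a minimal base R sketch. All numbers are made up for illustration (they are not from the course dataset): the pairs differ a lot from each other, but the within-pair difference is consistent.

```r
# HYPOTHETICAL log-intensities for six pairs; values are invented for illustration.
baseline  <- c(1, 5, 9, 13, 17, 21)   # large pair-to-pair differences
control   <- baseline + c(0.10, -0.10, 0.05, -0.05, 0.02, -0.02)
treatment <- baseline + 1 + c(-0.05, 0.05, 0.10, -0.10, 0.03, -0.03)

# Two-sample t-test: ignores the pairing, so pair-to-pair spread inflates the error.
two_sample <- t.test(treatment, control, var.equal = TRUE)

# Paired t-test: works on within-pair differences, so the pair effect cancels.
paired <- t.test(treatment, control, paired = TRUE)

print(c(two_sample = two_sample$p.value, paired = paired$p.value))
```

With these values the paired test detects the shift of about 1, while the two-sample test does not, because its error term is dominated by the differences between pairs.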

2 limma
... is a package for the analysis of microarray data, especially the use of linear models for analyzing designed experiments and the assessment of differential expression.
- Specially constructed data objects to represent various aspects of microarray data
- Specially constructed "object methods" for importing, normalizing, displaying and analyzing microarray data
- All objects and methods are transparent
- All objects can be accessed and modified outside of limma
- Unique in the implementation of the empirical Bayes procedure for identifying differentially expressed genes by "borrowing" information from different genes (everything so far has been gene by gene)
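
A rough base R sketch of what "borrowing" information means: each gene's variance is shrunk toward a common prior value, using the posterior-variance formula (d0*s0^2 + d*s^2)/(d0 + d). The data are simulated, and the prior degrees of freedom d0 and prior variance s0_sq below are simply assumed here; limma estimates both from the data, which this sketch does not attempt.

```r
set.seed(1)
n_genes  <- 1000
n_arrays <- 6
# Simulated log-ratios, one row per gene (not the course data)
M <- matrix(rnorm(n_genes * n_arrays, sd = 0.3), nrow = n_genes)

d     <- n_arrays - 1        # residual degrees of freedom per gene
s2    <- apply(M, 1, var)    # gene-by-gene variance estimates
d0    <- 4                   # ASSUMED prior df (limma estimates this from all genes)
s0_sq <- mean(s2)            # crude stand-in for the prior variance (also estimated by limma)

# Shrunken variance: weighted average of the prior and the gene-wise variance
s2_mod <- (d0 * s0_sq + d * s2) / (d0 + d)

t_ordinary  <- rowMeans(M) / sqrt(s2 / n_arrays)       # gene-by-gene t-statistic
t_moderated <- rowMeans(M) / sqrt(s2_mod / n_arrays)   # moderated t, referred to d0 + d df

print(c(var_ordinary = var(s2), var_shrunken = var(s2_mod)))
```

The shrunken variances are less extreme than the gene-wise ones, so genes with accidentally tiny variances no longer produce huge t-statistics.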

3 Measurement Error Model With Additive Background
[Figure: Foreground (F), Background (B); Old Model, New Model]
- There are other models for accounting for the background signal
- Simple subtraction of the background intensities often introduces additional variability in the observed signal
- The problem is that we use a single-observation estimate for the background parameter B
- With this in mind, various strategies have been proposed to pool background information from more than one spot to estimate B
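
A small simulation sketch of why a single-observation background estimate adds variability. The signal level, background level and noise SDs are all assumed values, and the "pooled" estimate is just the mean of all background readings, which is only one of many possible pooling strategies.

```r
set.seed(42)
n           <- 5000
true_signal <- 100   # ASSUMED true signal, same for every spot
true_bg     <- 60    # ASSUMED true background, same for every spot

foreground <- true_signal + true_bg + rnorm(n, sd = 10)  # noisy foreground reading
bg_single  <- true_bg + rnorm(n, sd = 10)                # one noisy background reading per spot
bg_pooled  <- mean(bg_single)                            # background pooled over all spots

signal_single <- foreground - bg_single   # simple spot-by-spot subtraction
signal_pooled <- foreground - bg_pooled   # subtraction of the pooled estimate

print(c(var_single = var(signal_single), var_pooled = var(signal_pooled)))
```

Both versions are centered on the true signal, but the spot-by-spot subtraction carries the noise of both readings, so its variance is roughly doubled in this setup.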

4 limma
Data to import: http://eh3.uc.edu/data/51-C1-3-vs-W1-5.gpr
File descriptions: http://eh3.uc.edu/data/WTargets.txt
Spot descriptions: http://eh3.uc.edu/data/WSpotTypes.txt
Importing data:
    source("http://eh3.uc.edu/LimmaDataImport.R")

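5 limma
    library(limma)
    data.directory <- "http://eh3.uc.edu/data/"
    # Array descriptions (one row per file) and spot-type definitions
    targets <- readTargets("http://eh3.uc.edu/data/WTargets.txt")
    spottypes <- readSpotTypes("http://eh3.uc.edu/data/WSpotTypes.txt")
    # Read the GenePix files: median foreground and background for the
    # green (532 nm) and red (635 nm) channels; wtflags(0) gives weight 0
    # to spots flagged as bad
    LimmadataC <- read.maimages(files = targets$FileName, source = "genepix",
                                path = data.directory,
                                columns = list(Gf = "F532 Median", Gb = "B532 Median",
                                               Rf = "F635 Median", Rb = "B635 Median"),
                                annotation = c("Name", "ID", "Block", "Row", "Column"),
                                wt.fun = wtflags(0))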

6 RGList class
    > attributes(LimmadataC)
    $names
    [1] "R"       "G"       "Rb"      "Gb"      "weights" "targets" "genes"

    $class
    [1] "RGList"
    attr(,"package")
    [1] "limma"

7 RGList class
    > LimmadataC$genes[1,]
         Name         ID Block Row Column
    1 no name Rn30000100     1   1      1
    > LimmadataC$R[1:3,]
         51-C1-3-vs-W1-5 60-W2-3-vs-C2-5 72-C3-3-vs-W3-5 79-W4-3-vs-C4-5 82-C5-3-vs-W5-5 97-W6-3-vs-C6-5
    [1,]              85              57              91              71              67             111
    [2,]             358            1102            2394            1685             575             882
    [3,]             168             376             620             670             206             293
    > LimmadataC$Rb[1:3,]
         51-C1-3-vs-W1-5 60-W2-3-vs-C2-5 72-C3-3-vs-W3-5 79-W4-3-vs-C4-5 82-C5-3-vs-W5-5 97-W6-3-vs-C6-5
    [1,]              81              55              65              51              52              72
    [2,]              81              56              65              51              51              72
    [3,]              82              57              64              50              48              69

8 RGList class
    LimmadataC$genes$Status<-controlStatus(spottypes,LimmadataC)
    LimmadataC$weights[LimmadataC$genes$ID=="Blank"]<-0
    LimmadataC$printer<-getLayout(LimmadataC$genes)

    > attributes(LimmadataC)
    $names
    [1] "R"       "G"       "Rb"      "Gb"      "weights" "targets" "genes"   "printer"

    $class
    [1] "RGList"
    attr(,"package")
    [1] "limma"

    > LimmadataC$genes[1,]
         Name         ID Block Row Column Status
    1 no name Rn30000100     1   1      1   cDNA

9 Plotting data in an RGList object
    > plotMA(LimmadataC,array=1,xlim=c(-1,16),ylim=c(-3,8))

10 limma
- plotMA automatically subtracts the background intensities before plotting data
- It does not plot data with weight 0
- If you want to plot all data, or the data without subtracting background, you need to do a little work
    source("http://eh3.uc.edu/BackgroundEffects.R")
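
A minimal base R sketch of the quantities on an MA plot: M = log2(R) - log2(G) and A = (log2(R) + log2(G))/2. The red-channel values are the first three spots of array 1 shown earlier; the green-channel values are HYPOTHETICAL, chosen so that one spot falls below background. The helper ma_values() is made up for this illustration and is not a limma function.

```r
Rf <- c(85, 358, 168)   # red foreground, array 1, spots 1-3 (from the slides)
Rb <- c(81, 81, 82)     # red background, array 1, spots 1-3 (from the slides)
Gf <- c(70, 300, 200)   # green foreground: HYPOTHETICAL values
Gb <- c(75, 70, 72)     # green background: HYPOTHETICAL values

# Illustrative helper: log-ratio (M) and average log-intensity (A)
ma_values <- function(R, G) {
  R[R <= 0] <- NA   # the log of a non-positive intensity is undefined
  G[G <= 0] <- NA
  data.frame(M = log2(R) - log2(G), A = (log2(R) + log2(G)) / 2)
}

raw        <- ma_values(Rf, Gf)             # no background subtraction
subtracted <- ma_values(Rf - Rb, Gf - Gb)   # simple background subtraction

print(raw)
print(subtracted)
```

Spot 1 has a green foreground below its background, so after subtraction its M and A are missing even though the red channel is above background. Without subtraction all three spots can be plotted.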

11 limma
    > NBLimmadataC<-backgroundCorrect(LimmadataC,method="none")
    > attributes(NBLimmadataC)
    $names
    [1] "R"       "G"       "weights" "targets" "genes"   "printer"

    $class
    [1] "RGList"
    attr(,"package")
    [1] "limma"

Note that background measurements are gone

12 Scatter with and without background subtraction
- Background-subtracted data are more spread out
- More data points without background subtraction

13 Plotting all data points
- Want to plot data points with weight 0 as well
- Create datasets with and without subtracting background and set all weights to 1
    SpotsPerArray<-dim(LimmadataC$R)[1]
    Narrays<-dim(LimmadataC$R)[2]
    Limmadata<-LimmadataC
    Limmadata$weights[1:SpotsPerArray,1:Narrays]<-1
    NBLimmadata<-NBLimmadataC
    NBLimmadata$weights[1:SpotsPerArray,1:Narrays]<-1

14 Plotting all data points
[Figure: panels Background Subtracted vs. Raw; All data vs. Zero-weight data removed]

15 Which one to use?
- Removing points with weight zero seems reasonable
- Subtracting background costs us some data points even when one channel is above background, because only differences of log-transformed measurements are used (the log of a non-positive value is undefined)
- Subtracting background seems to increase the variability, but it is unclear how this would affect the results
- For now, proceed without background subtraction, but compare results at the end
- Exploring other proposed background-adjustment methods also seems like a good idea

16 Data Analysis
Loess normalization
    source("http://eh3.uc.edu/LimmaLoess.R")

    > NNBLimmadataC<-normalizeWithinArrays(NBLimmadataC, method="loess")
    > attributes(NNBLimmadataC)
    $names
    [1] "weights" "targets" "genes"   "printer" "M"       "A"

    $class
    [1] "MAList"
    attr(,"package")
    [1] "limma"
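
The idea behind loess normalization, sketched in base R on simulated data: fit a smooth curve of M against A and subtract it, so that the log-ratios no longer depend on intensity. This uses stats::loess() on a single simulated array with an assumed bias curve; it is a simplified illustration and not limma's exact algorithm (which, for example, also takes the spot weights into account).

```r
set.seed(7)
n <- 2000
A <- runif(n, 6, 14)                       # simulated average log-intensities
bias <- 1.5 * exp(-(A - 6) / 3) - 0.3      # ASSUMED intensity-dependent dye bias
M <- bias + rnorm(n, sd = 0.3)             # simulated log-ratios with no true differences

# Local linear fit of M on A; span = 0.3 is chosen for illustration
fit <- loess(M ~ A, span = 0.3, degree = 1)

# Normalized log-ratios are the residuals from the fitted curve
M_norm <- residuals(fit)

print(c(cor_before = cor(A, M), cor_after = cor(A, M_norm)))
```

Before normalization M trends strongly with A; afterwards the trend is gone and the normalized values are centered near zero.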

17 Loess-normalized data

18 Paired t-test using limma
    source("http://eh3.uc.edu/LimmaTTest.R")

    > design<-modelMatrix(targets, ref="C")
    Found unique target names:
     C W
    > design
                     W
    51-C1-3-vs-W1-5  1
    60-W2-3-vs-C2-5 -1
    72-C3-3-vs-W3-5  1
    79-W4-3-vs-C4-5 -1
    82-C5-3-vs-W5-5  1
    97-W6-3-vs-C6-5 -1
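
Where the +1/-1 pattern comes from, as a base R sketch. The targets table below is an ASSUMED reconstruction: it supposes that a file name such as 51-C1-3-vs-W1-5 means sample C1 on Cy3 and sample W1 on Cy5, which matches the design printed above but is not confirmed by the targets file itself. Since M = log2(R/G) is Cy5 relative to Cy3, an array with W on Cy5 estimates W - C (+1) and a dye-swapped array estimates C - W (-1).

```r
# ASSUMED targets table, reconstructed from the file names for illustration
targets <- data.frame(
  FileName = c("51-C1-3-vs-W1-5", "60-W2-3-vs-C2-5", "72-C3-3-vs-W3-5",
               "79-W4-3-vs-C4-5", "82-C5-3-vs-W5-5", "97-W6-3-vs-C6-5"),
  Cy3 = c("C", "W", "C", "W", "C", "W"),
  Cy5 = c("W", "C", "W", "C", "W", "C"),
  stringsAsFactors = FALSE
)

# +1 when W is on Cy5 (red), -1 when W is on Cy3 (green); C is the reference
w_coef <- (targets$Cy5 == "W") - (targets$Cy3 == "W")
design <- matrix(w_coef, ncol = 1, dimnames = list(targets$FileName, "W"))

print(design)
```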

19 Paired t-test using limma
    > LimmaPTT<-lmFit(MA,design)
    Error in .class1(object) : Object "MA" not found
    > LimmaPTT<-lmFit(NNBLimmadataC,design)
    >
    > attributes(LimmaPTT)
    $names
     [1] "coefficients"     "stdev.unscaled"   "sigma"            "df.residual"
     [5] "cov.coefficients" "pivot"            "method"           "design"
     [9] "genes"            "Amean"

    $class
    [1] "MArrayLM"
    attr(,"package")
    [1] "limma"

20 Paired t-test using limma
    > mean(c(1,-1,1,-1,1,-1)*NNBLimmadataC$M[2,])
    [1] -0.03068036
    > var(c(1,-1,1,-1,1,-1)*NNBLimmadataC$M[2,])^0.5
    [1] 0.3176068
    > 1/(6^0.5)
    [1] 0.4082483
    > mean(c(1,-1,1,-1,1,-1)*NNBLimmadataC$M[2,])/((var(c(1,-1,1,-1,1,-1)*NNBLimmadataC$M[2,])^0.5)*(1/(6^0.5)))
    [1] -0.2366172
    >
    > LimmaPTT$coefficients[2]
    [1] -0.03068036
    > LimmaPTT$stdev.unscaled[2]
    [1] 0.4082483
    > LimmaPTT$sigma[2]
    [1] 0.3176068
    > LimmaPTT$coefficients[2]/(LimmaPTT$sigma[2]*LimmaPTT$stdev.unscaled[2])
    [1] -0.2366172
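
The same agreement can be checked without limma. This base R sketch uses HYPOTHETICAL log-ratios for a single spot and compares the by-hand paired t-statistic with an ordinary lm() fit on the +1/-1 design; the three pieces of the lm() fit correspond to what limma stores as coefficients, sigma and stdev.unscaled.

```r
design <- c(1, -1, 1, -1, 1, -1)                 # dye-swap design from the slides
M <- c(0.35, 0.10, -0.42, 0.28, -0.05, 0.44)     # HYPOTHETICAL log-ratios for one spot

# By hand: one-sample t-test on the sign-adjusted log-ratios
flipped   <- design * M
t_by_hand <- mean(flipped) / (sd(flipped) / sqrt(length(flipped)))

# Linear model without intercept: M = design * beta + error
fit <- lm(M ~ 0 + design)
s   <- summary(fit)

coefficient    <- unname(coef(fit))            # equals mean(flipped)
sigma          <- s$sigma                      # equals sd(flipped)
stdev_unscaled <- sqrt(s$cov.unscaled[1, 1])   # equals 1/sqrt(6)

t_lm <- coefficient / (sigma * stdev_unscaled)

print(c(t_by_hand = t_by_hand, t_lm = t_lm))
```

Because every design entry is +1 or -1, the least-squares estimate is just the mean of the sign-adjusted values, and the residual standard deviation is their ordinary standard deviation.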

21 Paired t-test using limma
    > mean(c(1,-1,1,-1,1,-1)*NNBLimmadataC$M[1,])
    [1] 0.1425021
    > var(c(1,-1,1,-1,1,-1)*NNBLimmadataC$M[1,])^0.5
    [1] 0.2395690
    > 1/(6^0.5)
    [1] 0.4082483
    > mean(c(1,-1,1,-1,1,-1)*NNBLimmadataC$M[1,])/((var(c(1,-1,1,-1,1,-1)*NNBLimmadataC$M[1,])^0.5)*(1/(6^0.5)))
    [1] 1.457023
    >
    > LimmaPTT$coefficients[1]
    [1] 0.1875361
    > LimmaPTT$stdev.unscaled[1]
    [1] 0.5
    > LimmaPTT$sigma[1]
    [1] 0.2831248
    > LimmaPTT$coefficients[1]/(LimmaPTT$sigma[1]*LimmaPTT$stdev.unscaled[1])
    [1] 1.324760
    > NNBLimmadataC$weights[1,]
    51-C1-3-vs-W1-5 60-W2-3-vs-C2-5 72-C3-3-vs-W3-5 79-W4-3-vs-C4-5 82-C5-3-vs-W5-5 97-W6-3-vs-C6-5
                  0               0               1               1               1               1

Note: for spot 1 the first two arrays have weight 0, so limma fits only the remaining four arrays (stdev.unscaled = 1/sqrt(4) = 0.5). Its results therefore differ from the by-hand values, which use all six arrays.
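
The effect of zero weights can be reproduced in base R with a weighted lm() fit. The weights are those shown for spot 1; the log-ratios are HYPOTHETICAL, since the actual M values for this spot are not printed in the slides.

```r
design  <- c(1, -1, 1, -1, 1, -1)                # dye-swap design from the slides
weights <- c(0, 0, 1, 1, 1, 1)                   # spot 1 weights from the slides
M <- c(0.50, 0.20, 0.31, -0.02, 0.45, -0.10)     # HYPOTHETICAL log-ratios

# Weighted fit: observations with weight 0 do not contribute
fit <- lm(M ~ 0 + design, weights = weights)

stdev_unscaled <- sqrt(summary(fit)$cov.unscaled[1, 1])   # 1/sqrt(number of arrays used)
df_residual    <- fit$df.residual                         # arrays used minus one coefficient

# Same estimate by hand, using only the arrays with non-zero weight
kept         <- weights > 0
coef_by_hand <- mean(design[kept] * M[kept])

print(c(stdev_unscaled = stdev_unscaled, df_residual = df_residual,
        coef_lm = unname(coef(fit)), coef_by_hand = coef_by_hand))
```

With four usable arrays the unscaled standard deviation is 1/sqrt(4) = 0.5 and the residual degrees of freedom drop from 5 to 3.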

22 Paired t-test using limma
    > # Spots with positive residual degrees of freedom
    > dfp<-LimmaPTT$df.residual>0
    > LimmaPTT$LimmaTStat<-LimmaPTT$coefficients/(LimmaPTT$sigma*LimmaPTT$stdev.unscaled)
    > LimmaPTT$LimmaTPvalue<-rep(NA,SpotsPerArray)
    > # Two-sided p-value: use abs() so that negative t-statistics are handled correctly
    > LimmaPTT$LimmaTPvalue[dfp]<-2*pt(abs(LimmaPTT$LimmaTStat[dfp]),LimmaPTT$df.residual[dfp],lower.tail=FALSE)
    > attributes(LimmaPTT)
    $names
     [1] "coefficients"     "stdev.unscaled"   "sigma"            "df.residual"
     [5] "cov.coefficients" "pivot"            "method"           "design"
     [9] "genes"            "Amean"            "LimmaTStat"       "LimmaTPvalue"

    $class
    [1] "MArrayLM"
    attr(,"package")
    [1] "limma"
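
A short base R check of the two-sided p-value calculation, using the t-statistic reported earlier for spot 2 (six arrays with non-zero weight, so 5 residual degrees of freedom). A two-sided p-value doubles the tail area beyond |t|; doubling the upper tail of the signed statistic gives a value above 1 whenever t is negative.

```r
t_stat <- -0.2366172   # t-statistic for spot 2, from the slides
df     <- 5            # six arrays, one coefficient

# Doubling the upper tail of the signed statistic: not a valid p-value for negative t
p_signed <- 2 * pt(t_stat, df, lower.tail = FALSE)

# Doubling the tail beyond |t|: valid two-sided p-value
p_two_sided <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)

print(c(p_signed = p_signed, p_two_sided = p_two_sided))
```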

23 limma so far
- Facilitates easy data import and normalization
- Keeps track of "bad" spots
- Running the basic t-test takes a bit of additional work
- If we were to use the empirical Bayes statistics as implemented in limma, it would be even easier
- Empirical Bayes is generally BETTER than the simple t-test
- Will talk about this type of analysis next week
- Limma also allows fitting models with multiple factors, which we will also talk about next week
- Next time: multiple hypothesis testing and p-value adjustments

