Presentation is loading. Please wait.

Presentation is loading. Please wait.

The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop RNA-Seq visualization with cummeRbund.

Similar presentations


Presentation on theme: "The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop RNA-Seq visualization with cummeRbund."— Presentation transcript:

1 The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop RNA-Seq visualization with cummeRbund

2 Papers and source materials Useful References *Graphics taken from these publications

3 Tuxedo Workflow Differential expression *TopHat and Cufflinks require a sequenced genome

4 Discovery Environment Using a GUI Tophat (bowtie) Cufflinks Cuffmerge Cuffdiff CummeRbund Your Data iPlant Data Store FASTQ Discovery Environment Atmosphere

5 CummeRbund Bioconductor R library; Getting started in Atmosphre “Allows for persistent storage, access, exploration, and manipulation of Cufflinks high-throughput sequencing data. In addition, provides numerous plotting functions for commonly used visualizations.” Any image w/R can work, and you could also search for an image with cummeRbund installed

6 Bring your Data into Atmosphere Using iCommands or iDrop

7 Connect with VNC Visualization use case for Atmosphere VNC Viewer

8 Installing CummeRbund Available via Bioconductor

9 Examine the cufflinks data > cuff <- readCufflinks() > cuff CuffSet instance with: 2 samples 33714 genes 43481 isoforms 35113 TSS 32924 CDS 33621 promoters 35113 splicing 27350 relCDS

10 Visualize sample dispersion >disp<-dispersionPlot(genes(cuff)) >disp Counts vs. dispersion Overdispersion greater variability in a data set than would be expected based on a given model ( in our case extra-Poisson variation) If you use Poisson model, you will overestimate differential expression

11 Variation matters http://www.fgcz.ch/education/StatMethodsExpression/03_Count_data_analysis.pdf Poisson adequately describes technical variation

12 Overdispersed Data

13 Squared Coefficient of Variation >genes.scv<-fpkmSCVPlot(genes(cuff)) >genes.scv Normalized measure of cross-replicate variability Represents the relationship of the standard deviation to the mean Differences in SCV can result in lower numbers of differentially expressed genes due to a higher degree of variability between replicate fpkm estimates

14 Distributions of FPKM scores across samples >dens<-csDensity(genes(cuff)) >dens >densRep<-csDensity(genes(cuff),replicates=T) >densRep Non-parametric estimate of pdf

15 FPKM Pairwise Scatter Plots > csScatter(genes(cuff),‘WT’,‘hy5’,smooth=T)

16 Saving your Plots Just in case you are not working in R studio 1. Plot type: >(e.g. jpeg, png, pdf) (file_path_and_file_name) 2. Plot function 3. dev.off() > png (‘csScatter.png’) #Will save in working directory > csScatter(genes(cuff),‘WT’,‘hy5’,smooth=T) >dev.off

17 Selecting and Filtering Gene Sets Using the ‘getSig’ function # Enables you to get genes at significance n >sig <-getSig(cuff, alpha=0.05, level =‘genes’) # genes of significance 0.05 >length(sig) #returns the number of genes in the sig object >sig <-getSig(cuff, alpha=0, level=‘genes’) >tail(sig,100) #displays the last 100 genes in the sig object you just made

18 Selecting and Filtering Gene Sets Using the ‘getGenes’ function # Get the gene information >sigGenes <- getGenes(cuff,sig) Plot this in another scatter plot >csScatter(sigGenes, ‘WT’, ‘hy5’)

19 Heat mapping Similar Expression Values >sigGenes <-getGenes(cuff,tail(sig,50)) #last 50 genes in the list we created >csHeatmap(sigGenes,cluster=‘both’)

20 Heat mapping Similar Expression Values >csHeatmap(sigGenes,cluster=‘both’,replicates=‘T’)

21 Expression Plots by Genes > myGeneId<-”AT5G41471" > myGene<-getGene(cuff,myGeneId) > myGene

22 Expression Plots by Genes > expressionPlot(myGene,replicates=‘T’)

23 Keep asking: ask.iplantcollabortive.org

24 The iPlant Collaborative is funded by a grant from the National Science Foundation Plant Cyberinfrastructure Program (#DBI-0735191).


Download ppt "The iPlant Collaborative Community Cyberinfrastructure for Life Science Tools and Services Workshop RNA-Seq visualization with cummeRbund."

Similar presentations


Ads by Google