Applied Bioinformatics Introduction to R, continued Bing Zhang Department of Biomedical Informatics Vanderbilt University


1 Applied Bioinformatics Introduction to R, continued Bing Zhang Department of Biomedical Informatics Vanderbilt University bing.zhang@vanderbilt.edu

2 Matrix subsetting and combining
Task | R code
Import data from a tabular file | data <- read.table("GSE8671_exp.txt", header=TRUE, sep="\t")
Convert data frame to matrix | data0 <- as.matrix(data)
Get dimensions of the matrix | dim(data0)
Select discrete rows by index | data0[c(1,3,5,7,9),]
Select continuous rows by index | data0[5:10,]
Select discrete columns by index | data0[,c(1,3,5,7,9)]
Select continuous columns by index | data0[,5:10]
Select both rows and columns by index | data0[1:10,1:5]
Select one row by name | data0["1438_at",]
Select both rows and columns by name | data0[c("1438_at","117_at"), c("GSM215052","GSM215079")]
Calculate variances for all rows | gene_variances <- apply(data0,1,var)
Calculate means for all rows | gene_means <- apply(data0,1,mean)
Combine columns (same number of rows) | combined <- cbind(data0,gene_means,gene_variances)
Select rows by output of a comparison | combined[gene_means>60000,]
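The subsetting and combining operations on this slide can be tried without the GSE8671 file. The following is a minimal sketch on a small made-up matrix; the probe and sample names (`probe1`, `sample1`, ...) and the cutoff of 12 are assumptions chosen only for illustration.

```r
# Toy expression matrix: 6 probes (rows) x 4 samples (columns).
# All names and values are invented for illustration.
expr <- matrix(1:24, nrow = 6,
               dimnames = list(paste0("probe", 1:6),
                               paste0("sample", 1:4)))

dim(expr)                                  # number of rows, number of columns
expr[c(1, 3, 5), ]                         # discrete rows by index
expr[2:4, 1:2]                             # continuous rows and columns by index
expr["probe2", ]                           # one row by name (returned as a vector)
expr[c("probe2", "probe5"),
     c("sample1", "sample4")]              # rows and columns by name

# Per-row summaries: the second argument 1 means "apply over rows".
probe_means <- apply(expr, 1, mean)
probe_vars  <- apply(expr, 1, var)

# Append the summaries as extra columns (row counts must match).
combined <- cbind(expr, probe_means, probe_vars)

# Keep only the rows whose mean exceeds the cutoff.
high <- combined[probe_means > 12, ]
```

Selecting a single row or column drops the result to a plain vector; adding `drop = FALSE` inside the brackets keeps it as a matrix.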

3 Save your work
The R environment is controlled by hidden files in the startup directory: .RData and .Rhistory
Save before quitting
  > q()
  Save workspace image? [y/n/c]:
Save during a session
  > save.image()
Save your code to a file (e.g. diff.r), which can be executed in batch mode
  $ R CMD BATCH diff.r &
  &: runs the program in the background
  Screen output is written to diff.r.Rout
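Besides saving the whole workspace with `save.image()`, selected objects can be written to a file and restored later. The sketch below assumes an illustrative object `gene_means` and a temporary file path; neither comes from the slides.

```r
# Illustrative object and file name (assumptions, not from the lecture data).
gene_means <- c(a = 1.5, b = 2.5)
image_file <- tempfile(fileext = ".RData")

save(gene_means, file = image_file)  # write only the named object(s)
rm(gene_means)                       # simulate starting a fresh session
load(image_file)                     # restores gene_means under its old name
```

`save.image(file = image_file)` would write every object in the workspace to the same kind of file instead of a chosen few.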

4 Install and load packages
CRAN packages
  http://cran.r-project.org/web/packages/
  >6000 packages
Bioconductor packages
  http://www.bioconductor.org/
  ~1000 packages for the analysis of high-throughput genomics data
Task | R code
Install a CRAN package | install.packages("package name")
Install a Bioconductor package | source("http://www.bioconductor.org/biocLite.R"); biocLite("package name")
Load a package/library | library("package name")
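In scripts it is common to install a package only when it is missing and then load it. The helper below is a hypothetical convenience function (`load_or_install` is not part of R, CRAN, or Bioconductor); it is a sketch for CRAN packages only.

```r
# Hypothetical helper: attach a package, installing it from CRAN first
# only if it is not already available on this machine.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)            # needs network access and a CRAN mirror
  }
  library(pkg, character.only = TRUE)  # pkg is a string, hence character.only
  invisible(TRUE)
}

# "stats" ships with R, so this call loads it without installing anything.
load_or_install("stats")
```

For Bioconductor packages, releases from 3.8 onward install through `BiocManager::install("package name")` rather than the `biocLite()` script shown on the slide.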

5 Graphics in R
R has very strong graphics capabilities
High quality, high reproducibility, lots of packages
On-screen graphics
  Works in the R GUI (both Windows and Mac)
  In Linux, requires X11 (windowing system for bitmap displays)
Output to a file
  postscript, pdf, svg
  jpeg, png, tiff, …
Start a pdf file | pdf("gse4183_clustering.pdf", width=10, height=15)
Generate a heatmap | heatmap.plus(data3, Rowv=as.dendrogram(rhc), Colv=as.dendrogram(hc), ColSideColors=ann, cexRow=0.5, cexCol=0.5, col=greenred(256))
Close the file | dev.off()
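The open-device, plot, close-device pattern can be tried without the objects `data3`, `rhc`, `hc`, and `ann` used on the slide. The sketch below swaps in base R's `heatmap()` for `heatmap.plus()` and uses random toy data; the matrix, its names, and the output file name are assumptions for illustration.

```r
# Random toy matrix standing in for real expression data.
set.seed(42)
toy <- matrix(rnorm(60), nrow = 10,
              dimnames = list(paste0("gene", 1:10), paste0("s", 1:6)))

out_file <- file.path(tempdir(), "toy_heatmap.pdf")

pdf(out_file, width = 7, height = 7)      # open the PDF device (size in inches)
heatmap(toy, cexRow = 0.8, cexCol = 0.8)  # base R heatmap with default clustering
dev.off()                                 # close the device so the file is written
```

Nothing appears on screen while the PDF device is open; the plot is complete on disk only after `dev.off()`.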

