1
Overview of Bioconductor
Aedín Culhane
2
Bioconductor
Biannual release (normally April and October), timed to coincide with the R release.
Current: Bioconductor 2.9 (released to coincide with R 2.14).
To install, source() the installation script from the Bioconductor website, then run biocLite().
3
Packages Overview
Bioconductor web site
BiocViews task view: Software, Annotation Data, Experiment Data
4
What packages do I need?
This is specific to your data and analysis pipeline, but for examples see:
Bioconductor Workshops
Bioconductor Workflows
5
Main types of Annotation Packages
Gene-centric AnnotationDbi packages:
Organism: org.Mm.eg.db
Technology/platform: hgu133plus2.db
Gene sets and pathways (biology level): GO.db or KEGG.db
.db packages can be queried with SQL or accessed through the annotation package functions (toTable, get, mget).
Genome-centric GenomicFeatures packages:
Transcriptome level: TxDb.Hsapiens.UCSC.hg19.knownGene
Generic features: can be generated via GenomicFeatures
biomaRt: query web-based 'biomart' resources for genes, sequence, SNPs, etc.
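As a rough illustration of the mget and toTable access mentioned above, the sketch below looks up gene symbols in the mouse organism package. It assumes org.Mm.eg.db is installed and does nothing otherwise. The two Entrez Gene IDs are arbitrary inputs chosen for illustration, not values from the slides.

```r
# Sketch only: assumes the Bioconductor package org.Mm.eg.db is installed.
# The Entrez Gene IDs below are arbitrary illustrative inputs.
ids <- c("11287", "11298")
symbols <- NULL

if (requireNamespace("org.Mm.eg.db", quietly = TRUE)) {
  library(org.Mm.eg.db)

  # mget() looks up several keys at once; IDs not in the map come back as NA
  symbols <- mget(ids, org.Mm.egSYMBOL, ifnotfound = NA)

  # toTable() flattens a whole map into a data frame
  print(head(toTable(org.Mm.egSYMBOL)))
}

symbols  # a named list (one entry per ID), or NULL if the package is missing
```

The same pattern should carry over to platform packages such as hgu133plus2.db, where the keys are probe set IDs rather than Entrez Gene IDs.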
6
Bioconductor resources
Mailing list (sign up for the daily digest)
Documentation and workshop/course material online: slides from talks, PDFs of tutorials, R code
Help is available for each software package
Each package MUST contain a vignette (how-to)
Other resources: www.Rseek.org
7
Vignette
Tutorials that provide a worked example of a package.
Required in Bioconductor packages.
Written in Sweave (Leisch, 2002): LaTeX dynamic reports in which R code is embedded and executable.
All R code in a vignette is checked (and executed) by R CMD check.
library("Biobase")
library("GOstats")   # Load package of interest
openVignette()
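openVignette() comes from Biobase and works interactively. As a non-interactive alternative, base R's vignette() can list what a package ships. The sketch below uses the grid package only because it is distributed with R; substitute any installed Bioconductor package name.

```r
# List the vignettes shipped with an installed package.
# "grid" is used here only because it is distributed with R itself.
v <- vignette(package = "grid")

# One row per vignette: its name ("Item") and its title
v$results[, c("Item", "Title"), drop = FALSE]

# To open one, pass an "Item" value from the listing above:
# vignette("<Item>", package = "grid")
```

browseVignettes("Biobase") gives a similar listing in a web browser, assuming Biobase is installed.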
8
S4 classes and ExpressionSet
Within Bioconductor, you will encounter packages that are structured around S4 object-oriented programming, proposed by John Chambers (developer of S).
A class provides a software abstraction of a real-world object.
A method performs an action on a class.
(Think of a class as a noun and a method as a verb.)
9
Object (S4)
An object is an instance of a class.
Descriptions are stored in slots.
slotNames(ob1) lists all slots in an object, or use str(ob1).
To access a slot, use an accessor function if the class provides one, or ob1@slotname, or slot(ob1, "slotname").
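The slot functions above can be tried without any Bioconductor package. This minimal sketch defines a toy S4 class; the class name "Experiment" and its slots are hypothetical and chosen only for illustration.

```r
library(methods)  # S4 support (attached by default in interactive R)

# A hypothetical class with two slots
setClass("Experiment",
         slots = c(name = "character", values = "numeric"))

# An object is an instance of the class
ob1 <- new("Experiment", name = "demo", values = c(1.5, 2.5, 4.0))

slotNames(ob1)        # names of all slots in the object
str(ob1)              # compact view of the object and its slots

ob1@name              # direct slot access with @
slot(ob1, "values")   # the same kind of access, by slot name
```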
10
Example: ExpressionSet
> ALL
ExpressionSet (storageMode: lockedEnvironment)
assayData: features, 128 samples
  element names: exprs
protocolData: none
phenoData
  sampleNames: LAL4 (128 total)
  varLabels: cod diagnosis ... date last seen (21 total)
  varMetadata: labelDescription
featureData: none
experimentData: use 'experimentData(object)'
  pubMedIds:
Annotation: hgu95av2

library(ALL)
data(ALL)
slotNames(ALL)
phenoData(ALL)
class(ALL)
?ExpressionSet
11
Methods which act on an S4 class
showMethods(classes = "ExpressionSet")
getMethod("write.exprs", "ExpressionSet")
Or, if you wish to see how the package really works, download and look at the source code.
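To see showMethods() and getMethod() in action without Biobase, here is a small self-contained sketch. The class "Experiment" and the generic "meanValue" are hypothetical names invented for this example.

```r
library(methods)

# Hypothetical class (the "noun")
setClass("Experiment",
         slots = c(name = "character", values = "numeric"))

# Hypothetical generic (the "verb") and its method for the class
setGeneric("meanValue", function(object) standardGeneric("meanValue"))
setMethod("meanValue", "Experiment", function(object) {
  mean(object@values)
})

ob1 <- new("Experiment", name = "demo", values = c(1, 2, 3))
meanValue(ob1)                          # dispatches on the class of ob1

showMethods(classes = "Experiment")     # methods that involve this class
getMethod("meanValue", "Experiment")    # definition of one specific method
existsMethod("meanValue", "Experiment") # TRUE if such a method is defined
```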
12
Getting Data into R & Bioconductor
Aedín Culhane
13
Simple Excel SpreadSheet data
Simple tables: read.table(), read.csv(), scan()
However, more specialized import functions exist for specific data types. See Technologies on BiocViews.
Large data files.
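A minimal sketch of reading a spreadsheet exported as CSV. The table is inlined as text so the example is self-contained. The probe and sample names are made up, and with a real file you would pass its path instead of text =.

```r
# Made-up expression values, as they might look after "Save as CSV" in Excel
csv_text <- "probe,sample1,sample2
g1,7.2,6.9
g2,5.1,5.4
g3,8.0,8.3"

# header = TRUE is the default for read.csv(); row.names = 1 uses the
# first column (probe IDs) as row names
expr <- read.csv(text = csv_text, row.names = 1, stringsAsFactors = FALSE)

# Many downstream functions expect a numeric matrix (features x samples)
expr_matrix <- as.matrix(expr)
dim(expr_matrix)
```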
14
Some common data types: Microarray, SNP, NGS
May 2011
15
A Microarray Overview
16
Reading Affymetrix Data
library(affy)
require(affy)   # Alternative
affybatch <- ReadAffy(celfile.path="[Location of your data]")
eSet <- justRMA()
17
Sample R code
18
ExpressionSet Class in R
19
Assessing Data Quality
20
Public Microarray Data
ArrayExpress: 21,997 studies (622,617 profiles)
GEO: 22,735 studies (558,074 profiles)
Statistics as of May 2011
21
R Code
22
More on GEOquery
require(GEOquery)
Let's try to load the GDS810 dataset, which contains data on Alzheimer's disease at various stages of severity.
GDS810 <- getGEO("GDS810")
The getGEO function returns an object of class GEOData. You can get a description of this class like this:
help("GEOData-class")
Meta(GDS810)
Columns(GDS810)
head(Table(GDS810))
23
Affy SNP Arrays
24
Process – Affy SNP Arrays (Oligo package)
25
Other Arrays
Illumina: lumi package
2-color spotted arrays: limma package
Other arrays
26
Next Generation Sequencing Data
27
R Code
28
Exercise
Install the package GEOquery.
Download the dataset GSE1297 using getGEO.
These data will be downloaded as an eSet, so to see the expression data and phenoData, use exprs and pData.
Use arrayQualityMetrics to assess the quality of these data.
29
R basics: Getting help
To get help:
help(mean)
help.search("mean")
apropos("mean")
example(mean)
30
With thanks to