Developed at the Broad Institute of MIT and Harvard
Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, and Mesirov JP. GenePattern 2.0. Nature Genetics 38, no. 5 (2006): pp. 500-501.
GenePattern is supported by funding from the NIH
Today…
  Introduction to GenePattern
    – Why
    – What
    – How
  Demonstration
  Summary
Challenges
  Modern research methods follow a more integrative approach
  Tools are not available to biomedical researchers
  Tools are difficult to use
  Results are difficult to interpret correctly
Purpose
  Create tools that are easily accessible to biomedical researchers
  Allow multiple data sources and methods to be combined
  Allow for “reproducible research”
GenePattern
  1. Offers a repository of analytic and visualization tools: Modules
  2. Enables easy creation of complex methods from these tools: Pipelines
  3. Supports rapid development and dissemination of new methods: Programming Environment
1. Modules
  Point and click
  ~60 analysis modules (handout)
  Documentation
  Designed for Affymetrix data
  14 different file extensions
2. Pipelines
  Golub et al. illustrates the need
  Records the methods, parameters and data to ensure reproducibility
  Allows methods to be “chained”
  Use published pipelines or create new ones
  Easily shared
  Assigns version numbers
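
A minimal Python sketch of the pipeline idea above: steps are chained, and the methods and parameters are recorded so that a run can be repeated. The class names `Step` and `Pipeline`, the toy `threshold` and `scale` functions, and the rule for bumping the version number are hypothetical illustrations. This is not the GenePattern API or its pipeline file format.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One module invocation: a name, the function to run, and its parameters."""
    name: str
    func: Callable[..., Any]
    params: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Pipeline:
    """Hypothetical pipeline record (an assumption, not GenePattern's data model)."""
    name: str
    version: int = 1
    steps: List[Step] = field(default_factory=list)

    def add_step(self, step: Step) -> None:
        # Assumed rule: every change to the method bumps the version number.
        self.steps.append(step)
        self.version += 1

    def run(self, data: Any) -> Any:
        # Chain the steps: each step's output becomes the next step's input.
        for step in self.steps:
            data = step.func(data, **step.params)
        return data

    def describe(self) -> List[Dict[str, Any]]:
        # The recorded methods and parameters, enough to repeat the analysis.
        return [{"module": s.name, "params": dict(s.params)} for s in self.steps]


def threshold(values, floor):
    """Raise every value below `floor` up to `floor`."""
    return [max(v, floor) for v in values]


def scale(values, factor):
    """Multiply every value by `factor`."""
    return [v * factor for v in values]


pipeline = Pipeline(name="demo")
pipeline.add_step(Step("threshold", threshold, {"floor": 20}))
pipeline.add_step(Step("scale", scale, {"factor": 2}))
result = pipeline.run([5, 50, 100])
```

Calling `pipeline.describe()` returns the ordered list of module names and parameters, which is the part that would need to be stored and shared for someone else to reproduce the run.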
3. Programming environment
  Libraries allow transparent access to GenePattern modules from R, Matlab and Java
  Language-independent mechanism to add new tools to the module repository
  Tools can be your own or public (e.g. from Bioconductor)
Functional Architecture
  Taken from Reich et al., Nature Genetics 2006
Components
  1. The GenePattern server
  2. The Java Client
  3. The Web Client
Software Architecture
  Reich et al., Nature Genetics 2006
GenePattern
  Current version
    – Release: 2.0.1, release date 3/2/2006
  OS compatibility:
    – Windows: XP, 2000, 2003
    – Mac: OS X or later
    – Unix: Linux, Solaris, Tru64
  Hardware requirements:
    – 256MB RAM
    – 500MB disk space
Demonstration
Gene Expression Analysis
  Four broad categories
    1. Differential analysis/Marker selection
    2. Prediction
    3. Class discovery
    4. Pathway analysis
  Data Formats
  Annotations
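
To make the first category concrete: one common approach to two-class marker selection is to score each gene with a test statistic and rank genes by that score. The sketch below uses Welch's t-statistic. That choice is an assumption made for illustration and is not necessarily what any GenePattern module computes. The function names and the tiny expression table are made up.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se else 0.0


def rank_markers(expression, labels):
    """Rank genes by |t| between class 0 and class 1 samples.

    expression: dict mapping gene name -> list of values, one per sample
    labels: list of 0/1 class labels, one per sample
    """
    scores = {}
    for gene, values in expression.items():
        class0 = [v for v, lab in zip(values, labels) if lab == 0]
        class1 = [v for v, lab in zip(values, labels) if lab == 1]
        scores[gene] = welch_t(class0, class1)
    # Largest absolute score first: the strongest candidate markers.
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)


# Made-up data: geneA differs between the classes, geneB does not.
expression = {
    "geneB": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "geneA": [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],
}
labels = [0, 0, 0, 1, 1, 1]
ranked = rank_markers(expression, labels)
```

A real analysis would also need a significance estimate and a correction for testing many genes at once, which this sketch leaves out.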
Proteomics
  SELDI, MALDI and LC-MS in mzXML format
  Quality assessment
  Peak detection
  Spectra comparison
  Proteomic analysis pipeline
  Data conversion
SNP analysis
  In alpha testing
  Uses high-density SNP microarray data
  Copy number alterations
  Loss of heterozygosity (LOH) detection
Data preprocessing and conversion
  Importing, exporting and file conversion
  Normalization, filtering and imputing
  ID conversion and annotation
  Row and column extraction, transpose, reorder and split data
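
A few of the operations listed above (filtering, normalization, transpose) can be sketched in plain Python on a small genes-by-samples matrix. The function names, the variation filter (max minus min) and the row-wise z-score are assumptions chosen for illustration. They are not the behavior of specific GenePattern modules.

```python
from statistics import mean, pstdev


def transpose(matrix):
    """Swap rows and columns of a rectangular list-of-lists matrix."""
    return [list(col) for col in zip(*matrix)]


def filter_rows(row_ids, matrix, min_range):
    """Keep rows whose (max - min) is at least `min_range`."""
    kept = [(rid, row) for rid, row in zip(row_ids, matrix)
            if max(row) - min(row) >= min_range]
    return [rid for rid, _ in kept], [row for _, row in kept]


def normalize_rows(matrix):
    """Scale each row to zero mean and unit (population) standard deviation."""
    out = []
    for row in matrix:
        mu, sd = mean(row), pstdev(row)
        # A constant row has sd == 0; map it to zeros instead of dividing by zero.
        out.append([(v - mu) / sd if sd else 0.0 for v in row])
    return out


# Made-up matrix: rows are genes, columns are samples.
row_ids = ["g1", "g2", "g3"]
matrix = [[1, 2, 3], [5, 5, 5], [10, 20, 30]]

kept_ids, kept = filter_rows(row_ids, matrix, min_range=1)  # drops the flat row g2
normalized = normalize_rows(kept)
samples_by_genes = transpose(kept)
```

Filtering before normalizing avoids rescaling rows that carry no variation, which is why the flat row is removed first here.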
Comparison of Selected Microarray Analysis Software Platforms
  Reich et al., Nature Genetics 2006
Summary
  Has a few minor problems
  Is it something MIBLab can use?
    – Who is the user?
    – What is it missing?
  Should be easily added
Sources
  Gould J, Getz G, Monti S, Reich M, Mesirov JP. Comparative Gene Marker Selection suite. Bioinformatics May 18;
  Liefeld T, Reich M, Gould J, Zhang P, Tamayo P, Mesirov JP. GeneCruiser: a web service for the annotation of microarray data. Bioinformatics Sep 15;21(18):
  Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP. GenePattern 2.0. Nature Genetics 2006 May;38(5):500-1.