geWorkbench John Watkinson Columbia University
geWorkbench The bioinformatics platform of the National Center for the Multi-scale Analysis of Genomic and Cellular Networks (MAGNet). Also part of the NCI’s cancer Biomedical Informatics Grid (caBIG) initiative. The project was formerly called caWorkbench.
geWorkbench (cont.) A desktop application for integrative genomics. Runs on Windows, Linux and Macintosh. Includes a variety of informatics tools, but specializes in microarray analysis. Open-source and free for non-commercial use. Includes an API for plugin development.
Integrative Genomics Increasingly, researchers need to combine several data sources (microarray assays, DNA/RNA/protein sequences, protein structure, gene ontology, clinical data, etc.). geWorkbench attempts to move past simple microarray analysis to include integrative methods. The plugin framework allows geWorkbench to interact with other major software packages, including BioConductor, GenePattern and Cytoscape.
Data Support Microarray assays (one-color and two-color, as well as caARRAY assays). Sequence files. BLAST queries. Gene-Gene interaction networks (Interactomes). Gene Ontology Terms. caBIO pathways and annotations. Protein structure files (PDB).
Components geWorkbench has a plugin interface for the development of third-party components. Documentation and developer support are available from the geWorkbench team. All visualizations and analyses have been written using the API. Several groups at Columbia are developing for the platform.
Microarray Analysis Summarization of raw chip data (via BioConductor). Normalization and Filtering. Differential expression analysis. Clustering (Hierarchical and Self-Organizing Maps). Classification (SVM and SMLR). Many visualization tools.
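As an illustration of the differential expression step, here is a minimal Python sketch that ranks genes by Welch's t-statistic between two phenotypes. It is not geWorkbench code: the function name, the gene names and the expression values are all hypothetical, and a real analysis would also compute p-values and correct for multiple testing.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t-statistic for two samples of expression values."""
    standard_error = sqrt(
        variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    )
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical log-expression values: gene -> (phenotype A arrays, phenotype B arrays).
expression = {
    "geneA": ([8.1, 8.3, 7.9, 8.2], [5.0, 5.2, 4.9, 5.1]),
    "geneB": ([6.0, 6.4, 5.8, 6.1], [6.1, 5.9, 6.3, 6.0]),
}

t_scores = {gene: welch_t(a, b) for gene, (a, b) in expression.items()}

# Genes with the largest absolute t-statistic come first.
ranked = sorted(t_scores, key=lambda gene: abs(t_scores[gene]), reverse=True)
```

In this toy data set geneA separates the two phenotypes clearly, while geneB has nearly identical means in both groups and ranks last.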
Hierarchical Clustering
Scatter Plot Visualization
caBIO Pathway Viewer
Sequence Analysis BLAST and HMM search interface. Pattern discovery. Synteny analysis. Promoter region analysis. A variety of sequence viewers.
Pattern Discovery Viewer
Promoter Viewer
GO Term Enrichment Traditional t-tests on microarray data identify genes that are differentially expressed between two phenotypes. Gene Ontology (GO) term enrichment can then determine which functional or structural categories are significantly over-represented among those genes. Supported in geWorkbench’s GO Panel component. A similar technique can be applied to other gene sets, such as KEGG pathways.
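One common way to score enrichment is a one-sided hypergeometric test. The sketch below assumes that approach; the slides do not say which test the GO Panel uses. The function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """Probability of seeing at least `overlap` annotated genes among the
    `selected` genes, when `annotated` of the `population` genes carry the
    term (upper tail of the hypergeometric distribution)."""
    upper = min(annotated, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(population, selected)


# Hypothetical counts: 1000 genes on the array, 50 annotated with the term,
# 100 differentially expressed, 20 of which carry the term (5 expected by chance).
p_value = enrichment_p_value(population=1000, annotated=50, selected=100, overlap=20)
```

The same function applies unchanged to other gene sets, such as pathway memberships, because it only depends on the four counts.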
Reverse Engineering Microarray data can be used to infer biological pathways. geWorkbench’s Reverse Engineering component uses the ARACNE algorithm to build gene-gene interaction networks. These can be compared and combined with an online database of interactions, curated by Columbia.
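The core idea of ARACNE, scoring gene pairs by mutual information and then pruning likely indirect interactions with the data processing inequality, can be sketched in a few lines of Python. This is a simplified illustration and not the geWorkbench component: it estimates mutual information from equal-frequency bins rather than the estimator used by the published algorithm, it omits significance testing and the pruning tolerance, and every name and value below is hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log


def discretize(values, bins):
    """Assign each value to an equal-frequency bin based on its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, index in enumerate(order):
        labels[index] = rank * bins // len(values)
    return labels


def mutual_information(x, y):
    """Mutual information (in nats) of two discrete label sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items()
    )


def infer_network(profiles, threshold, bins):
    """Keep gene pairs whose mutual information exceeds `threshold`, then
    drop the weakest edge of every fully connected gene triplet."""
    genes = sorted(profiles)
    binned = {g: discretize(profiles[g], bins) for g in genes}
    mi = {
        frozenset(pair): mutual_information(binned[pair[0]], binned[pair[1]])
        for pair in combinations(genes, 2)
    }
    edges = {pair for pair, score in mi.items() if score > threshold}
    pruned = set(edges)
    for a, b, c in combinations(genes, 3):
        triangle = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if all(edge in edges for edge in triangle):
            # Data processing inequality: the weakest edge is treated as indirect.
            pruned.discard(min(triangle, key=lambda edge: mi[edge]))
    return pruned


# Hypothetical expression profiles across 12 arrays; g3 follows g2, which follows g1.
profiles = {
    "g1": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
    "g2": [1, 2, 3, 4, 5, 7, 6, 8, 9, 10, 11, 12],
    "g3": [1, 2, 3, 4, 8, 7, 6, 5, 9, 10, 11, 12],
}
network = infer_network(profiles, threshold=0.05, bins=2)
```

In this toy example all three pairs pass the threshold, and the g1-g3 pair, which has the lowest mutual information, is removed as an indirect interaction, leaving the chain g1-g2-g3.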
Matrix REDUCE Given microarray data and upstream sequences for genes, transcription factor binding sites can be inferred. The Matrix REDUCE component in geWorkbench provides this analysis and tools to visualize the results.
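The general idea of relating expression to upstream sequence can be illustrated with a much simpler model than Matrix REDUCE itself: count occurrences of one candidate motif upstream of each gene and fit expression as a linear function of that count. The sketch below assumes this simplified count-based regression, not the affinity-matrix fit of the actual algorithm, and the sequences, expression values and function names are hypothetical.

```python
def count_motif(sequence, motif):
    """Count occurrences of `motif` in `sequence`, overlaps included."""
    return sum(
        sequence.startswith(motif, i)
        for i in range(len(sequence) - len(motif) + 1)
    )


def fit_motif(upstream, expression, motif):
    """Least-squares fit of expression = intercept + slope * motif count."""
    genes = sorted(upstream)
    counts = [count_motif(upstream[g], motif) for g in genes]
    values = [expression[g] for g in genes]
    n = len(genes)
    mean_count, mean_value = sum(counts) / n, sum(values) / n
    sxx = sum((c - mean_count) ** 2 for c in counts)
    sxy = sum((c - mean_count) * (v - mean_value) for c, v in zip(counts, values))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_value - slope * mean_count


# Hypothetical upstream sequences and log-expression ratios.
upstream = {
    "g1": "TTACGCGTTTACGCGTAA",
    "g2": "GGGACGCGTCCCTTTAAA",
    "g3": "TTTTAAAACCCCGGGGTT",
    "g4": "CATCATCATGGATCCAAT",
}
expression = {"g1": 3.5, "g2": 2.0, "g3": 0.5, "g4": 0.5}

slope, intercept = fit_motif(upstream, expression, "ACGCGT")
```

A clearly non-zero slope suggests that the motif contributes to expression; scanning many candidate motifs and keeping those that explain the most variance would be the natural next step in this simplified setting.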
For More Information Mailing List: John Watkinson:
Acknowledgements ARACNE algorithm by Califano et al. Matrix REDUCE algorithm by Bussemaker et al. geWorkbench team: Aris Floratos, Eileen Daly, Kenneth Smith, Kiran Keshav, Xiaoqing Zhang, Manjunath Kustagi, Matthew Hall, Bernd Jagla, Mary VanGinhoven, John Watkinson.