1
Divining Systems Biology Knowledge from High-throughput Experiments Using EGAN
Jesse Paquette
ISMB 2010
Biostatistics and Computational Biology Core
Helen Diller Family Comprehensive Cancer Center
University of California, San Francisco
(AKA BCBC HDFCCC UCSF)
2
High-throughput experiments
This talk applies to
–Expression microarrays
–aCGH
–SNP/CNV arrays
–MS/MS Proteomics
–DNA methylation
–ChIP-Seq
–RNA-Seq
–In-silico experiments
If parts of the output can be mapped to gene IDs
–You can use EGAN
3
What do you hope to accomplish?
Collect data
Process data
Differential analysis
Publish!
Clusters and/or gene lists
New testable hypotheses
Produce insight about the underlying biology
New grants!
New papers!
Drug targets!
4
Leverage organic intelligence
Clusters and/or gene lists
New testable hypotheses
Produce insight about the underlying biology
–Summarize
–Visualize
–Contextualize
5
Producing insight from clusters and gene lists
Summarize: find enriched pathways (and other gene sets)
–Hypergeometric over-representation (DAVID)
–Global trends (GSEA)
Visualize: gene relationships in a graph
–Protein-protein interactions (Cytoscape)
–Network module discovery (Ingenuity IPA)
–Literature co-occurrence (PubGene)
Contextualize: pertinent literature
–PubMed
–Google
–iHOP
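As a rough illustration of the hypergeometric over-representation test named above, the sketch below computes the probability of seeing at least a given overlap between a gene list and a gene set by chance. It is not taken from EGAN or DAVID; the function name and parameter names are hypothetical, and it assumes every gene in the list and in the set belongs to one shared background universe.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, set_size, list_size, overlap):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    universe_size: number of genes in the background (assumed shared)
    set_size:      number of background genes annotated to the gene set
    list_size:     number of genes in the list being tested
    overlap:       number of list genes that are also in the gene set
    """
    max_overlap = min(set_size, list_size)
    total = comb(universe_size, list_size)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, list_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Illustrative numbers only: 10 background genes, 4 in the set,
# a list of 3 genes, 2 of which fall in the set.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
```

A small p-value suggests the gene set is over-represented in the list. Testing many gene sets at once would normally call for a multiple-testing correction, which this sketch leaves out.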
6
EGAN: Exploratory Gene Association Networks
Methods: state-of-the-art analysis of clusters and gene lists
–Hypergeometric enrichment of gene sets
–Global statistical trends of gene sets
–Hypergraph visualization (via Cytoscape libraries)
–Literature identification
–Network module discovery
User Interface: responds quickly to new queries from the biologist
–Sandbox-style functionality
–Dynamic adjustment of p-value cutoffs
–Point-and-click interface
–All data in-memory for immediate access
–Links to external websites
Modular: integrates as a flexible plug-and-play cog
–All data is customizable
–Proprietary data can be restricted to the client location
–Java runs on almost every OS (PC, Mac, Linux)
–Can be configured and launched from a different application (e.g. GenePattern)
–Analyses can be scripted for automation
7
Gene sets
A gene set is a set of semantically related genes
–e.g. Wnt signaling pathway
EGAN contains a database of gene sets
–More than 100k gene sets by default: KEGG, Reactome, NCI-Nature, Gene Ontology, MeSH, Conserved Domain, Cytoband, miRNA targets
–You can easily add your own: simple file format, or download from MSigDB (Broad Institute)
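The slide does not spell out EGAN's own gene set file format, so the sketch below assumes the tab-delimited GMT layout that MSigDB distributes: one gene set per line, with the set name, a description, and then the member genes. The function name and the example gene set and gene names are hypothetical placeholders.

```python
def parse_gmt(text):
    """Parse GMT-style text into {gene_set_name: set_of_gene_ids}.

    Assumes each line is: name <TAB> description <TAB> gene1 <TAB> gene2 ...
    """
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank lines and lines with no genes
        name, _description, *genes = fields
        gene_sets[name] = {gene for gene in genes if gene}
    return gene_sets


# Placeholder content, not real annotation data.
example_gmt = "HYPOTHETICAL_SET_1\tillustrative description\tGENE_A\tGENE_B\tGENE_C\n"
example_sets = parse_gmt(example_gmt)
```

Holding each gene set as a Python set keeps overlap with a gene list down to a single intersection, which is the count an enrichment test needs.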
10
Gene-gene relationships
EGAN also contains
–Protein-protein interactions (PPI)
–Literature co-occurrence
–Chromosomal adjacency
–Kinase-target relationships
Other possibilities
–Sequence homology
–Expression correlation
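One way to picture several relationship types living side by side is a graph whose edges carry a label for the kind of evidence. The sketch below is an assumption for illustration, not EGAN's internal data model: the class name, method names, and gene names are hypothetical, and every relationship is treated as undirected, which simplifies kinase-target pairs.

```python
from collections import defaultdict


class GeneGraph:
    """Undirected gene graph whose edges are labeled with relationship kinds."""

    def __init__(self):
        # gene -> neighbor gene -> set of relationship kinds
        self._edges = defaultdict(lambda: defaultdict(set))

    def add_relationship(self, gene_a, gene_b, kind):
        """Record one relationship of the given kind between two genes."""
        self._edges[gene_a][gene_b].add(kind)
        self._edges[gene_b][gene_a].add(kind)

    def neighbors(self, gene, kinds=None):
        """Return neighbors of a gene, optionally limited to some kinds."""
        wanted = None if kinds is None else set(kinds)
        found = set()
        for other, edge_kinds in self._edges.get(gene, {}).items():
            if wanted is None or edge_kinds & wanted:
                found.add(other)
        return found


# Placeholder genes and relationships, not real interaction data.
graph = GeneGraph()
graph.add_relationship("GENE_A", "GENE_B", "ppi")
graph.add_relationship("GENE_A", "GENE_C", "literature")
ppi_partners = graph.neighbors("GENE_A", kinds=["ppi"])
```

Filtering by kind mirrors the idea of switching relationship types on and off while exploring a gene list.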
12
Example with microarray and aCGH results
Mirzoeva et al. (2009) Cancer Research
–UCSF-LBL collaboration
–Analysis of breast cancer cell lines: basal vs. luminal
Discoveries in this presentation
–miRNA regulator of subtype (mir-200)
–Annexin (ANXA1) as potential regulator of ER, glucocorticoid and EGFR signaling
13
Gene list - higher expression in basal cell lines
19
Gene set/pathway enrichment
36
Importing gene lists from publications
39
Combining expression with aCGH
51
Finding network modules
57
Where to find EGAN
Website
–http://akt.ucsf.edu/EGAN/
2010 paper in Bioinformatics
–http://www.ncbi.nlm.nih.gov/pubmed/19933825
58
Acknowledgements
BCBC HDFCCC UCSF
–Taku Tokuyasu
–Adam Olshen
–Ritu Roy
–Ajay Jain
LBNL
–Debopriya Das
–Joe Gray
Funding
–UCSF Cancer Center Support Grant
UCSF
–Early adopters: Ingrid Revet, Antoine Snijders, Stephan Gysin, Sook Wah Yee, Joachim Silber
–Cytoscape gurus: David Quigley, Scooter Morris
–OTM: David Eramian, Ha Nguyen
–Laura van ’t Veer
–Donna Albertson
–Graeme Hodgson