BiNGO: a Cytoscape plugin to assess enrichment of GO categories in biological networks
Steven Maere
5th Annual Cytoscape Retreat, Amsterdam, November 2007
Gene Ontology (GO)
GO provides three structured vocabularies to describe gene function:
- biological process
- molecular function
- cellular component
Figure: GO hierarchy example (Biological Process > Cell cycle > M phase)
- cross-linked, hierarchical tree structure
- level of detail increases with depth in the hierarchy
- multiple annotations per gene
www.geneontology.org
What is BiNGO?
Biological Network Gene Ontology tool: assesses whether a given set of genes is significantly enriched in genes involved in particular functions, processes or pathways.
More than 40 tools with similar functionality are listed on the GO website.
Why BiNGO?
Its integration in Cytoscape provides several advantages:
- interactive use: BiNGO can be used interactively on molecular interaction graphs visualized in Cytoscape
- integrated use of tools: BiNGO can be used in combination with other Cytoscape plugins (e.g. MCODE)
- intuitive and versatile visualization of results
- flexibility
BiNGO 2.0 Features
- assesses overrepresentation or underrepresentation of GO categories in gene sets
- Cytoscape or gene list input
- batch mode: analyze several clusters simultaneously using the same settings
- GO and GOSlim ontologies, automated remapping of annotations
- wide range of organisms and gene identifiers
- evidence code filtering
- hypergeometric or binomial test for overrepresentation
- multiple testing correction using Bonferroni (FWER) or Benjamini & Hochberg (FDR) correction
- Cytoscape visualization of results mapped on the GO hierarchy
- extensive results in tab-delimited text file format
- making and using custom annotation/ontology files is easy
- open source
GO overrepresentation
x out of X genes in group A belong to GO category B, which is shared by n out of N genes in the whole genome.
Figure: genome (N genes), group A (X genes), GO category B (n genes), overlap of A and B (x genes)
We can use the hypergeometric test (equivalent to a one-sided Fisher's exact test on the 2x2 table) to determine whether category B is significantly overrepresented in group A.
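The overrepresentation p-value described on this slide is the upper tail of the hypergeometric distribution: the probability of seeing x or more category-B genes when X genes are drawn at random from a genome of N genes, n of which belong to B. A minimal Python sketch follows, using the slide's symbols x, X, n, N. The function name and the example numbers are illustrative assumptions, and this is not BiNGO's own code.

```python
from math import comb


def hypergeom_overrep_pvalue(x, X, n, N):
    """Upper-tail hypergeometric p-value P(overlap >= x).

    x: genes in group A annotated to GO category B
    X: genes in group A
    n: genes in the genome annotated to GO category B
    N: genes in the genome
    (Hypothetical helper for illustration; not part of BiNGO.)
    """
    if not (0 <= X <= N and 0 <= n <= N and 0 <= x <= min(n, X)):
        raise ValueError("need 0 <= X, n <= N and 0 <= x <= min(n, X)")
    # Number of ways to draw X genes with exactly i of them in category B,
    # summed over every overlap i that is at least as large as observed.
    tail = sum(comb(n, i) * comb(N - n, X - i) for i in range(x, min(n, X) + 1))
    # Divide by the number of ways to draw any X genes out of N.
    return tail / comb(N, X)


if __name__ == "__main__":
    # Made-up numbers: 10 of 50 cluster genes fall in a category
    # that covers 100 of 6000 genes in the genome.
    print(hypergeom_overrep_pvalue(x=10, X=50, n=100, N=6000))
```

Exact integer arithmetic with `math.comb` avoids rounding problems in the tail sum; for very large genomes a library routine for the hypergeometric survival function would be faster.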
Multiple testing correction
When testing multiple GO categories, a multiple testing correction has to be applied in order to control the false positive rate.
Several strategies for adjusting p-values:
- controlling the FWER (family-wise error rate) = probability of at least one type I error
  e.g. Bonferroni correction
  very conservative
- controlling the FDR (false discovery rate) = expected proportion of false positives among the categories called significant
  e.g. Benjamini & Hochberg correction
  valid for independent or positively correlated tests
  less conservative
Benjamini, Y. & Hochberg, Y., J. R. Statist. Soc. B 57(1), 289-300 (1995)
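The two corrections named on this slide can be sketched in a few lines of Python. Both functions take one raw p-value per tested GO category and return adjusted p-values in the same order. The function names and the example p-values are illustrative assumptions, and BiNGO's own implementation may differ in detail.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values (controls the FWER)."""
    m = len(pvalues)
    # Multiply each p-value by the number of tests, capped at 1.
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini & Hochberg adjusted p-values (controls the FDR)."""
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, scaling by m / rank and
    # enforcing that adjusted values never increase as rank decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # made-up p-values for four categories
    print(bonferroni(raw))
    print(benjamini_hochberg(raw))
```

A category is then called significant when its adjusted p-value falls below the chosen level (e.g. 0.05). The Benjamini & Hochberg values are never larger than the Bonferroni ones, which is why the slide describes the method as less conservative.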
BiNGO DEMO
BiNGO
Maere S, Heymans K and Kuiper M (2005) BiNGO: a Cytoscape plugin to assess over-representation of Gene Ontology categories in biological networks. Bioinformatics 21, 3448-9.
Availability:
- GO Tools at www.geneontology.org
- Plugins at www.cytoscape.org
- Plugin Manager in Cytoscape
- www.psb.ugent.be/cbd/papers/BiNGO
- or Google ‘bingo gene ontology’
Thanks!
- Karel Heymans, Martin Kuiper (Ghent University)
- Andrew Markiel, Iliana Avila (Institute for Systems Biology)
- Rowan Christmas
- Ruth Isserlin, Gary Bader (University of Toronto)
- Mike Smoot, Trey Ideker (UC San Diego)
- Benno Schwikowski (Institut Pasteur)