GO-based tools for functional modeling. GO Workshop, 3-6 August 2010
Functional Modeling: Grouping by function (GO Slim sets; GO browser tools; GOSlimViewer). GO enrichment analysis (DAVID; EasyGO/agriGO; Onto-Express; Funcassociate 2.0). Pathway & network analysis. Hypothesis testing.
Grouping by function
GO Slim Sets: slim sets are abbreviated versions of the GO that contain broader functional terms. They are made by different GO Consortium groups for different purposes (e.g. plant, yeast), so you need to cite which one you used! More information about the GO terms in each slim set can be found at EBI QuickGO: GO Slim and Subset Guide.
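To make the idea of "broader functional terms" concrete, here is a minimal sketch of how a specific term can be rolled up to a slim term by walking up parent links. The ontology fragment, the `TOY:` identifiers, the slim set and the function names are all hypothetical placeholders for illustration; real slimming tools may handle multiple parents and relationship types differently.

```python
from collections import deque

# Toy ontology fragment: child term -> list of parent terms.
# These IDs are hypothetical placeholders, not real GO accessions.
PARENTS = {
    "TOY:0005": ["TOY:0003"],
    "TOY:0004": ["TOY:0002"],
    "TOY:0003": ["TOY:0002"],
    "TOY:0002": ["TOY:0001"],
    "TOY:0006": ["TOY:0001"],
    "TOY:0001": [],
}

# A hypothetical slim: a small set of broad terms chosen from the ontology.
SLIM = {"TOY:0002", "TOY:0006"}


def map_to_slim(term, parents, slim):
    """Return the nearest slim terms reachable from `term` via parent links."""
    found = set()
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        if current in slim:
            found.add(current)
            continue  # do not climb past a slim term
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return found


def slim_counts(annotations, parents, slim):
    """Count distinct gene products per slim term from (accession, term) pairs."""
    members = {}
    for accession, term in annotations:
        for slim_term in map_to_slim(term, parents, slim):
            members.setdefault(slim_term, set()).add(accession)
    return {slim_term: len(accs) for slim_term, accs in members.items()}


if __name__ == "__main__":
    toy_annotations = [("P1", "TOY:0005"), ("P2", "TOY:0004"), ("P1", "TOY:0006")]
    print(slim_counts(toy_annotations, PARENTS, SLIM))
```

Counting distinct gene products per slim term, rather than raw annotations, avoids counting a product twice when several of its specific terms roll up to the same broad term.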
QuickGO: Create your own subset/slim of GO terms. A GO slims tutorial is available. This tutorial will describe GO slims, what they are used for and how to use QuickGO for:
* creating a custom GO slim
* using a pre-defined GO slim
* obtaining GO annotations to a GO slim
* customising a set of slimmed annotations
* using statistics calculated by QuickGO to generate graphical representations of the data
AmiGO: GO Slimmer bin/amigo/slimmer?session_id=4878amig o
GOSlimViewer input file: Input is a text file containing 3 tab-separated columns: 1. accession 2. GO:ID 3. aspect (P, F or C). The file is provided by GORetriever and GOanna2ga. You can manually add to it from the GOanna Excel file, which allows you to include your additional GO annotations in the analysis.
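When adding annotations to this file by hand, it can help to check the three columns before uploading. The sketch below is one possible check, not part of GOSlimViewer itself: the function names and the accessions are hypothetical, and it assumes the file has no header row and that GO IDs follow the usual `GO:` plus seven digits pattern.

```python
import csv
import io
import re

GO_ID = re.compile(r"^GO:\d{7}$")  # "GO:" followed by seven digits
ASPECTS = {"P", "F", "C"}          # process, function, component


def check_rows(rows):
    """Split rows into (valid, problems); problems hold (line number, reason)."""
    valid, problems = [], []
    for number, row in enumerate(rows, start=1):
        if len(row) != 3:
            problems.append((number, "expected 3 columns"))
            continue
        accession, go_id, aspect = (field.strip() for field in row)
        if not accession:
            problems.append((number, "empty accession"))
        elif not GO_ID.match(go_id):
            problems.append((number, "malformed GO ID"))
        elif aspect not in ASPECTS:
            problems.append((number, "aspect must be P, F or C"))
        else:
            valid.append((accession, go_id, aspect))
    return valid, problems


def read_input(text):
    """Parse tab-separated text and check every row."""
    return check_rows(csv.reader(io.StringIO(text), delimiter="\t"))


def write_input(rows):
    """Serialise (accession, GO ID, aspect) rows as tab-separated text."""
    buffer = io.StringIO()
    csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical accessions; the second and third rows are deliberately wrong.
    sample = "ACC0001\tGO:0008152\tP\nACC0002\tGO:5634\tC\nACC0003\tGO:0003677\tX\n"
    good, bad = read_input(sample)
    print(write_input(good), bad)
```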
GOSlimViewer output
GO Enrichment analysis
Determining which classes of gene products are over-represented or under-represented.
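As an illustration of what over- and under-representation mean numerically, here is a minimal sketch using a hypergeometric test, which is one common choice for this kind of analysis. It is not claimed to be the method of any particular tool; the function and parameter names are assumptions made for this example, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeom_pmf(k, population, annotated, sample):
    """Probability of exactly k annotated genes in a random sample."""
    return (comb(annotated, k) * comb(population - annotated, sample - k)
            / comb(population, sample))


def enrichment_p_values(k, population, annotated, sample):
    """Return (p_over, p_under) for observing k annotated genes in the sample.

    population: genes in the background set
    annotated:  background genes annotated to the GO term
    sample:     genes in the list of interest
    k:          genes in the list of interest annotated to the GO term
    """
    lowest = max(0, sample - (population - annotated))
    highest = min(annotated, sample)
    if not lowest <= k <= highest:
        raise ValueError("k is outside the possible range for these counts")
    p_over = sum(hypergeom_pmf(i, population, annotated, sample)
                 for i in range(k, highest + 1))
    p_under = sum(hypergeom_pmf(i, population, annotated, sample)
                  for i in range(lowest, k + 1))
    return p_over, p_under


if __name__ == "__main__":
    # Made-up counts: 10 background genes, 4 annotated to the term,
    # 3 genes in the list of interest, 2 of them annotated.
    print(enrichment_p_values(2, 10, 4, 3))
```

A small `p_over` suggests the term is over-represented in the list; a small `p_under` suggests it is under-represented.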
However, many of these tools do not support agricultural species, and the tools have different computing requirements. A list of the tools that can be used for agricultural species is available on the workshop website at the “Summary of Tools for gene expression analysis” link.
Evaluating GO tools. Some criteria for evaluating GO tools:
1. Does it include my species of interest (or do I have to “humanize” my list)?
2. What does it require to set up (computer usage/online)?
3. What was the source for the GO (primary or secondary) and when was it last updated?
4. Does it report the GO evidence codes (and is IEA included)?
5. Does it report which of my gene products have no GO?
6. Does it report both over/under represented GO groups and how does it evaluate this?
7. Does it allow me to add my own GO annotations?
8. Does it represent my results in a way that facilitates discovery?
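Criteria 4 and 5 can also be checked independently of any tool. The sketch below assumes you have your annotations as (accession, GO ID, evidence code) tuples; the function name, the gene identifiers and the pairing of IDs with evidence codes are hypothetical, made up for this example.

```python
def annotation_coverage(gene_list, annotations):
    """Report genes with no GO annotation and genes supported only by IEA.

    annotations: iterable of (accession, go_id, evidence_code) tuples.
    """
    codes = {}
    for accession, _go_id, evidence in annotations:
        codes.setdefault(accession, set()).add(evidence)
    no_go = [gene for gene in gene_list if gene not in codes]
    iea_only = [gene for gene in gene_list if codes.get(gene) == {"IEA"}]
    return no_go, iea_only


if __name__ == "__main__":
    # Hypothetical gene identifiers and annotations.
    genes = ["G1", "G2", "G3"]
    toy_annotations = [
        ("G1", "GO:0008152", "IEA"),
        ("G2", "GO:0005634", "IDA"),
        ("G2", "GO:0003677", "IEA"),
    ]
    print(annotation_coverage(genes, toy_annotations))
```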
Some useful expression analysis tools:
* Database for Annotation, Visualization and Integrated Discovery (DAVID)
* agriGO: GO Analysis Toolkit and Database for Agricultural Community (used to be EasyGO; chicken, cow, pig, mouse, cereals, dicots; includes Plant Ontology (PO) analysis)
* Onto-Express (can provide your own gene association file)
* Funcassociate 2.0: The Gene Set Functionator (can provide your own gene association file)
functional grouping (including GO, pathways, gene-disease association); ID conversion; search for functionally related genes; regular updates; online support & publications
May 2010: EasyGO replaced by agriGO
enrichment analysis using either GO or Plant Ontology (PO); 40 species: chicken, cow, pig, mouse, cereals, poplar, fruits; GenBank, EMBL, UniProt; Affymetrix, Operon, Agilent arrays
Onto-Express: Onto-Express analysis instructions are available in onto-express.ppt
Species represented in Onto-Express
Can upload your own annotations using OE2GO
Pathway & network analysis
GO, Pathway, Network Analysis: Many GO analysis tools also include pathway & network analysis:
* Ingenuity Pathways Analysis (IPA) and Pathway Studios: commercial software
* DAVID: includes multiple functional categories
* Onto-Tools: includes Pathways Express tool
Pathways & Networks: A network is a collection of interactions. Pathways are a subset of networks: a pathway is a network of interacting proteins that carries out a biological function such as metabolism or signal transduction. All pathways are networks of interactions, but not all networks are pathways.
Pathways Resources: KEGG, BioCyc, Reactome, GenMAPP, BioCarta, Pathguide (the pathway resource list)
Biological Networks: Networks are often represented as graphs. Nodes represent proteins or genes that code for proteins. Edges represent the functional links between nodes (e.g. regulation). Small changes in a graph’s topology/architecture can result in the emergence of novel properties.
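The graph representation can be sketched in a few lines. The example below stores an undirected network as an adjacency map and shows how removing a single edge changes which nodes remain connected. The node names and helper functions are hypothetical, and real interaction networks usually also carry edge types and direction.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degrees(graph):
    """Number of interaction partners for every node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def connected_component(graph, start):
    """All nodes reachable from `start` by following edges."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


if __name__ == "__main__":
    # Hypothetical proteins A-D linked in a chain.
    chain = build_graph([("A", "B"), ("B", "C"), ("C", "D")])
    print(degrees(chain), connected_component(chain, "A"))
    # Dropping the B-C link splits the network into two parts.
    split = build_graph([("A", "B"), ("C", "D")])
    print(connected_component(split, "A"))
```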
Types of interactions: protein (enzyme) – metabolite (ligand): metabolic pathways; protein – protein: cell signaling pathways, protein complexes; protein – gene: genetic networks
Network example: STRING Database (Sod1, Mus musculus)
Database/URL/FTP: DIP, BIND, MPact/MIPS, STRING, MINT, IntAct, BioGRID, HPRD, ProtCom, 3did, Interprets, Pibase, Modbase, CBM (ftp://ftp.ncbi.nlm.nih.gov/pub/cbm), SCOPPI, iPfam, InterDom, DIMA, Prolinks (mbi.ucla.edu/cgibin/functionator/pronav/), Predictome. Source: PLoS Computational Biology March 2007, Volume 3 e42
Some comments on analysis tools:
* more than 68 GO-based analysis tools are listed on the GO Consortium website (not a comprehensive list!)
* several tools combine GO, pathway and network functional analysis
* there are many different ways of visualizing the results
* the species supported by analysis tools are expanding; check with tool developers
* check for last updates & user support information
Tutorial 5: In this tutorial we will use several GO modeling tools. We will use GOSlimViewer to summarize the GO function from the cassava data set. Next we will use two GO enrichment analysis tools, DAVID and agriGO, to analyze a maize data set and compare the results from the two tools.
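One simple way to compare the results from two tools is to treat each tool's list of significant GO terms as a set. The sketch below is only an illustration: the function name is an assumption, the term lists are made up rather than real output from DAVID or agriGO, and it ignores p-values and differences in background sets.

```python
def compare_term_sets(first, second):
    """Summarise the overlap between two lists of enriched GO term IDs."""
    first, second = set(first), set(second)
    shared = first & second
    union = first | second
    return {
        "shared": sorted(shared),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
        # Jaccard index: shared terms divided by all distinct terms.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


if __name__ == "__main__":
    # Made-up result lists, for illustration only.
    tool_a = ["GO:0008152", "GO:0006629"]
    tool_b = ["GO:0008152", "GO:0009987"]
    print(compare_term_sets(tool_a, tool_b))
```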