Exploring and Presenting Results Laura Biggins laura.biggins@babraham.ac.uk v2017-07
I have my results… what next? 1. Interpret and explore the results 2. Present the results
Interpreting and exploring results
- How can the results be displayed so that I can interpret and explore them most easily?
- Understanding the functional terms (including the GO hierarchy)
- Finding relevant information amongst the masses (GOslim, redundant terms, clustering)
Presenting results
- How should I present my results?
- What information should I include?
There are 2 main things here:
- How should we present results in a way that makes sense, graphically or textually? How should they be displayed, and what information is relevant enough to include?
- How should results be presented to ourselves, so that we can interpret them? Assuming we've got a large table of results, the issue is selecting and condensing the relevant information.
Small lists: think about what makes sense to display, e.g. when comparing 2 datasets and showing a comparison between one or two categories.
Interpreting and exploring results Sometimes tables may be sufficient.
…but large tables are often difficult to interpret
Overlaps in gene lists Gene Ontology is hierarchical: terms run from general (parent) to specific (child). [Figure: root ontology terms 1, 2, 3; parent and child terms ordered from general to specific]
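The parent/child structure can be made concrete with a small sketch. The term names and links below are a simplified toy example, not the real GO graph, and the helper names `ancestors` and `depth` are invented for illustration.

```python
# Toy ontology: each term maps to its direct parents.
# ASSUMPTION: names and links are illustrative only, not the real GO graph.
PARENTS = {
    "root": [],
    "metabolic process": ["root"],
    "cellular process": ["root"],
    "lipid metabolic process": ["metabolic process"],
    "cellular lipid metabolic process": [
        "lipid metabolic process",
        "cellular process",
    ],
}


def ancestors(term, parents=PARENTS):
    """Return every more general term reachable by following parent links."""
    seen = set()
    stack = list(parents[term])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def depth(term, parents=PARENTS):
    """Longest path from the term up to the root (0 for the root itself)."""
    if not parents[term]:
        return 0
    return 1 + max(depth(p, parents) for p in parents[term])


if __name__ == "__main__":
    term = "cellular lipid metabolic process"
    print(sorted(ancestors(term)))
    print(depth(term))
```

A term can have more than one parent, so the structure is a graph rather than a simple tree. This is one reason the same genes turn up under several related terms.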
Overlaps in gene lists Most genes are associated with more than one GO term. There are many annotation sources, e.g. g:Profiler covers Gene Ontology terms, biological pathways, regulatory motifs of transcription factors and microRNAs, human disease annotations and protein-protein interactions.
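One way to get a feel for these overlaps is to count how many terms each gene hits and how similar the gene sets of two terms are. The sketch below uses made-up terms and genes, and the Jaccard index is just one possible overlap measure chosen for illustration.

```python
from itertools import combinations

# ASSUMPTION: hypothetical annotations, term -> genes from the hit list.
TERM_GENES = {
    "term A": {"g1", "g2", "g3", "g4"},
    "term B": {"g2", "g3", "g4"},
    "term C": {"g5", "g6"},
}


def jaccard(a, b):
    """Shared genes divided by all genes in either set (0 = disjoint, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_overlaps(term_genes):
    """Jaccard overlap for every pair of terms, keyed by the sorted term pair."""
    return {
        (t1, t2): jaccard(term_genes[t1], term_genes[t2])
        for t1, t2 in combinations(sorted(term_genes), 2)
    }


def terms_per_gene(term_genes):
    """Count how many terms each gene is annotated to."""
    counts = {}
    for genes in term_genes.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    return counts


if __name__ == "__main__":
    for pair, score in pairwise_overlaps(TERM_GENES).items():
        print(pair, round(score, 2))
    print(terms_per_gene(TERM_GENES))
```

Pairs with a high score are largely describing the same genes, so they are candidates for merging or filtering before presentation.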
Exploring overlaps
Exploring hierarchy - PANTHER Interactive pie charts for exploring the lower-level categories: useful if you're interested in some of the categories. This is separate from the overrepresentation test.
Exploring ontology structure - GOrilla A good thing about GOrilla is that it shows the hierarchy of the GO terms. It is also a way of presenting results. cbl-gorilla.cs.technion.ac.il/
Exploring ontology structure - GOrilla But if there is a lot of information, the graph is difficult to interpret. cbl-gorilla.cs.technion.ac.il/
g:Profiler - filtering
DAVID – functional annotation clustering DAVID clusters annotation terms by their overlapping genes.
GOrilla - REVIGO GOrilla has an option that allows the results to be displayed in REVIGO.
GOrilla - REVIGO http://revigo.irb.hr/
Reducing redundancy
- Use a filtering tool
- Use a GOslim (may lose detail)
- Use a clustering tool
- Select non-redundant terms yourself, but be consistent
  - P-value filter, top x number of categories, largest categories, most enriched
Could be worth trying a GOslim.
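If you select terms yourself, writing the rule down as code is one way to stay consistent. The sketch below applies a p-value filter, drops terms whose genes largely overlap a better-ranked term, and keeps the top few. The thresholds, the Jaccard overlap measure and the example results are all assumptions for illustration, not the method used by any of the tools above.

```python
# ASSUMPTION: hypothetical enrichment results as (term, p-value, genes).
RESULTS = [
    ("term A", 0.001, {"g1", "g2", "g3", "g4"}),
    ("term B", 0.004, {"g2", "g3", "g4"}),
    ("term C", 0.02, {"g5", "g6"}),
    ("term D", 0.3, {"g7"}),
]


def overlap(a, b):
    """Jaccard overlap between two gene sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reduce_redundancy(results, max_p=0.05, max_overlap=0.5, top_n=10):
    """Keep the best-ranked terms that are significant and not near-duplicates."""
    if top_n < 1:
        return []
    # 1. P-value filter, then rank from most to least significant.
    significant = sorted((r for r in results if r[1] <= max_p), key=lambda r: r[1])
    kept = []
    for term, p_value, genes in significant:
        # 2. Skip a term if it mostly repeats the genes of a term already kept.
        if any(overlap(genes, k_genes) > max_overlap for _, _, k_genes in kept):
            continue
        kept.append((term, p_value, genes))
        # 3. Stop once the top x categories have been collected.
        if len(kept) == top_n:
            break
    return kept


if __name__ == "__main__":
    for term, p_value, _ in reduce_redundancy(RESULTS):
        print(term, p_value)
```

Because the rule is greedy, the outcome depends on the ranking: ranking by enrichment or by category size instead of p-value would keep a different set of terms, so the choice should be stated alongside the results.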
Displaying results What information should I include? How should I present my results?
What information should be included? Choosing what to display
Figure examples - bad A chart from PANTHER: ugly and not suitable for presentation, because large categories may have much larger gene numbers simply because of their size.
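A quick calculation shows why raw counts can mislead. Fold enrichment is used here as one common size-adjusted measure; the slide does not name a specific measure, so treat this choice and the numbers below as assumptions for illustration.

```python
def fold_enrichment(hits_in_term, list_size, term_size, background_size):
    """Observed fraction of the gene list in a term divided by the expected fraction.

    Written as a single ratio of products, which is algebraically the same as
    (hits_in_term / list_size) / (term_size / background_size).
    """
    return (hits_in_term * background_size) / (list_size * term_size)


if __name__ == "__main__":
    # ASSUMPTION: made-up numbers for a 200-gene list and a 20000-gene background.
    large = fold_enrichment(hits_in_term=40, list_size=200,
                            term_size=4000, background_size=20000)
    small = fold_enrichment(hits_in_term=10, list_size=200,
                            term_size=100, background_size=20000)
    print("large category: 40 genes, fold enrichment", large)
    print("small category: 10 genes, fold enrichment", small)
```

In this example the large category has four times as many genes but is no more frequent than expected by chance, while the small category is ten times more frequent than expected.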
Figure examples – better bar charts The number of genes should not be displayed. DAVID enrichment score: the geometric mean (on a -log scale) of the members' p-values in the corresponding annotation cluster, used to rank the clusters by biological significance. The top-ranked annotation groups are therefore most likely to have consistently lower p-values for their annotation members.
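The enrichment score definition can be sketched as a short calculation. Taking the geometric mean of the p-values on a -log scale is the same as averaging the -log values. Base 10 is assumed here, and the function name and example p-values are invented for illustration; this is not DAVID's own code.

```python
import math


def enrichment_score(p_values):
    """Mean of -log10(p), i.e. -log10 of the geometric mean of the p-values.

    ASSUMPTION: log base 10 and p-values strictly greater than 0.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    return sum(-math.log10(p) for p in p_values) / len(p_values)


if __name__ == "__main__":
    # Hypothetical cluster whose member terms have these p-values.
    cluster = [0.01, 0.0001]
    print(round(enrichment_score(cluster), 3))
```

A higher score means the member terms of the cluster have consistently small p-values, so one weakly significant member pulls the score down.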
Figure examples - table If you want to include more information than can be easily interpreted on a graph, include a table. How should key terms be selected for the table? What numbers should be included?
Figure examples - GOplot An R package for displaying results from ontology analyses. Circle area is proportional to the number of genes.
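Scaling by area rather than radius matters when reading or drawing such plots. The sketch below is a plain Python illustration of the arithmetic, not GOplot's API; the function name and the `area_per_gene` scaling factor are hypothetical.

```python
import math


def circle_radius(gene_count, area_per_gene=1.0):
    """Radius of a circle whose area equals gene_count * area_per_gene."""
    return math.sqrt(gene_count * area_per_gene / math.pi)


if __name__ == "__main__":
    for count in (10, 40):
        radius = circle_radius(count)
        print(count, "genes -> radius", round(radius, 2),
              "area", round(math.pi * radius ** 2, 2))
```

A category with four times as many genes gets a circle with twice the radius. Scaling the radius directly would make it look sixteen times larger by area.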
Figure examples - GOplot
ClueGO An app within Cytoscape. ClueGO integrates GO terms as well as pathways and creates a functionally organized GO/pathway term network.
Summary
- Explore the data – use common sense
- Do not try and plot absolutely everything
- Choose a method to deal with redundant terms
- Think about what you’re plotting and whether it makes sense
- Do not be afraid of including tables
Practical