
1 Gene Ontology

2 GO Terms Data
- Full tree available from www.geneontology.org in text format
- OBO file format is current; contains all terms in 1 file
- GO format is deprecated; 3 separate files
- Gene-to-GO category mapping available in Affymetrix annotations, e.g.:
  6118 // electron transport // inferred from electronic annotation /// 6810 // transport // inferred from electronic annotation
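The Affymetrix annotation string on this slide appears to pack one GO entry per `///`-separated chunk, with `//` separating the numeric ID, the term name, and the evidence. A minimal Java sketch of splitting such a field follows, assuming that layout holds and that the number is the GO accession without its zero padding. The class and field names are hypothetical and not taken from the WorkBench code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of the WorkBench code base.
class AffyGoAnnotation {
    /** One GO entry from an Affymetrix annotation field. */
    static final class Entry {
        final String goId;      // e.g. "GO:0006118"
        final String name;      // e.g. "electron transport"
        final String evidence;  // e.g. "inferred from electronic annotation"

        Entry(String goId, String name, String evidence) {
            this.goId = goId;
            this.name = name;
            this.evidence = evidence;
        }
    }

    /** Splits "id // name // evidence /// id // name // evidence ..." into entries. */
    static List<Entry> parse(String field) {
        List<Entry> entries = new ArrayList<>();
        if (field == null || field.trim().isEmpty()) {
            return entries;
        }
        for (String chunk : field.split("\\s*///\\s*")) {
            String[] parts = chunk.trim().split("\\s*//\\s*");
            if (parts.length < 3) {
                continue; // skip chunks that do not have all three fields
            }
            // Assumption: the number is a GO accession without zero padding.
            String goId = String.format("GO:%07d", Integer.parseInt(parts[0].trim()));
            entries.add(new Entry(goId, parts[1].trim(), parts[2].trim()));
        }
        return entries;
    }
}
```

Returning full `GO:` accessions keeps the gene-to-category mapping directly comparable with the IDs used in the OBO file.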

3 WorkBench Model
1. Select a set of genes (by hand or by a process, e.g. a t-test)
2. The GO panel automatically identifies the chip type by the presence of specific markers (or the user overrides it via the drop-down)
3. Check for a serialized tree model; if none exists, build it (from the deprecated “GO format”) and serialize it
4. Map the selected genes’ GO category membership onto the GO tree
5. Calculate the p-value of the selected genes’ GO category membership
6. Compare the selected genes’ expression levels in a given category to the reference genes’ expression levels; this determines whether a gene is considered “enriched” or not
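The slides do not say which statistic the p-value step uses. A common choice for GO over-representation is the one-sided hypergeometric test, so the sketch below shows that test purely as an assumption. The class, method, and parameter names are hypothetical.

```java
// Hypothetical sketch; the test actually used by the GO panel is not stated in the slides.
class GoEnrichment {
    /** Table of ln(i!) for i = 0..max, built by summing logs. */
    private static double[] logFactorials(int max) {
        double[] lf = new double[max + 1];
        for (int i = 1; i <= max; i++) {
            lf[i] = lf[i - 1] + Math.log(i);
        }
        return lf;
    }

    /** ln of the binomial coefficient C(n, k). */
    private static double logChoose(double[] lf, int n, int k) {
        return lf[n] - lf[k] - lf[n - k];
    }

    /**
     * P(X >= hits) for X ~ Hypergeometric, assuming valid non-negative inputs.
     *
     * @param population number of genes in the reference list
     * @param inCategory reference genes annotated to the GO category
     * @param selected   number of genes in the selected set
     * @param hits       selected genes annotated to the GO category
     */
    static double pValue(int population, int inCategory, int selected, int hits) {
        double[] lf = logFactorials(population);
        int maxHits = Math.min(inCategory, selected);
        double logDenominator = logChoose(lf, population, selected);
        double p = 0.0;
        for (int k = hits; k <= maxHits; k++) {
            int misses = selected - k;
            if (misses > population - inCategory) {
                continue; // not enough out-of-category genes for this draw
            }
            p += Math.exp(logChoose(lf, inCategory, k)
                    + logChoose(lf, population - inCategory, misses)
                    - logDenominator);
        }
        return Math.min(1.0, p); // guard against rounding slightly above 1
    }
}
```

Working in log space avoids overflow in the binomial coefficients for reference lists the size of a whole chip.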

4 Issues - Structural
- Swing's DefaultTreeModel is used as the data structure representing the tree, which limits use by other components (1624)
- The GO tree is not accessible to other components; it is contained only within the GO Panel

5 Issues - Programmatic
- Of the 2.8 seconds it takes to build an in-memory representation of the GO tree, 2 seconds are spent serializing the tree
- It takes 3.9 seconds to deserialize the tree (versus 390 ms to parse the original file)
- ComputeCumulative() is recursive and can overflow the stack for deep trees (1502)
- The linear search caused by Vector.contains() at GoTerm:190 approached 25% of runtime for large gene lists; replace it with something like TreeSet
- PValue trends rendering causes OutOfMemory for larger sets of genes, crashing the interface
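Two of the issues on this slide, the recursive ComputeCumulative() and the Vector.contains() scan, can be addressed together. The sketch below assumes ComputeCumulative() rolls each term's genes up from its descendants, which the slides do not spell out, and that the structure is a tree. `TermNode` is a hypothetical stand-in for the real GoTerm class.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for the real GoTerm class, whose fields are not shown in the slides.
class TermNode {
    final String id;
    final List<TermNode> children = new ArrayList<>();
    final Set<String> directGenes = new TreeSet<>();     // O(log n) contains(), unlike Vector
    final Set<String> cumulativeGenes = new TreeSet<>(); // filled by computeCumulative

    TermNode(String id) {
        this.id = id;
    }

    /** Fills cumulativeGenes for every node under root without using recursion. */
    static void computeCumulative(TermNode root) {
        // Pass 1: record nodes so that each parent appears before its children.
        List<TermNode> order = new ArrayList<>();
        Deque<TermNode> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            TermNode node = stack.pop();
            order.add(node);
            for (TermNode child : node.children) {
                stack.push(child);
            }
        }
        // Pass 2: walk in reverse, so every child is finished before its parent.
        for (int i = order.size() - 1; i >= 0; i--) {
            TermNode node = order.get(i);
            node.cumulativeGenes.clear();
            node.cumulativeGenes.addAll(node.directGenes);
            for (TermNode child : node.children) {
                node.cumulativeGenes.addAll(child.cumulativeGenes);
            }
        }
    }
}
```

The explicit stack lives on the heap, so tree depth is bounded by available memory rather than by the thread's call stack.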

6 Issues - GUI

7 Issues - GUI
- Confusing layout of components, e.g. when the checkbox beside Reference List is checked, that means ignore the reference list
- The list of genes for a given category has no functionality (at the least, it should be possible to add to selections from that list)
- No way to see a gene's position in the GO tree from the table view
- Multiple progress bars, sometimes with long pauses in between, confuse the user
- Is the PValue trends graph useful?

8 Proposed fixes
- Switch to the OBO file format for the GO tree
- Define an interface for GO terms usage across the entire application and back it with standard data structures (not Swing support classes)
- Move GO terms into the global annotations service
- Do not serialize the GO terms tree; parse the file every time, which is faster, needs less code, and leaves less chance of error (e.g. if the OBO source file was updated)
- Change the GUI to a split pane with a table view on the left and a dynamic tree view on the right that changes to display the tree for the selected gene
- Make it clear whether all the genes from the specified annotation are being used as the reference list, or whether a list of genes has been specified to override that list
- Unless the P-Value trends graph provides important functionality, remove it, or replace it by exposing the results of the analysis and displaying them in a more general graphing component
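To illustrate parsing the OBO file on every start into plain, Swing-free data structures, here is a minimal sketch. It reads only the `id`, `name`, `namespace`, and `is_a` tags of `[Term]` stanzas and ignores everything else, including obsolete flags and other relationship types. `OboTerm` and `OboParser` are hypothetical names, not existing WorkBench classes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, Swing-free model of a GO term; names are illustrative only.
class OboTerm {
    String id;
    String name;
    String namespace;
    final List<String> parentIds = new ArrayList<>(); // targets of is_a lines
}

class OboParser {
    /** Reads [Term] stanzas and returns the terms keyed by ID, in file order. */
    static Map<String, OboTerm> parse(Reader source) {
        Map<String, OboTerm> terms = new LinkedHashMap<>();
        OboTerm current = null;
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (line.startsWith("[")) {
                    // New stanza: keep [Term], skip others such as [Typedef].
                    current = line.equals("[Term]") ? new OboTerm() : null;
                    continue;
                }
                int colon = line.indexOf(':');
                if (current == null || colon < 0) {
                    continue; // header line, blank line, or skipped stanza
                }
                String tag = line.substring(0, colon);
                String value = line.substring(colon + 1).trim();
                switch (tag) {
                    case "id":
                        current.id = value;
                        terms.put(value, current);
                        break;
                    case "name":
                        current.name = value;
                        break;
                    case "namespace":
                        current.namespace = value;
                        break;
                    case "is_a":
                        // Drop the trailing "! parent name" comment.
                        int bang = value.indexOf('!');
                        current.parentIds.add(bang < 0 ? value : value.substring(0, bang).trim());
                        break;
                    default:
                        break; // other tags are ignored in this sketch
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return terms;
    }
}
```

A shared annotations service could hold the returned map and hand it to any component, with the tree view building its own Swing model from it only for display.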

