WebApollo: A Web-Based Sequence Annotation Editor for Community Annotation
Ed Lee, Gregg Helt, Nomi Harris, Mitch Skinner, Christopher Childers, Justin Reese, Jay Sundaram, Christine Elsik, Ian Holmes, Suzanna Lewis
Bioinformatics Open Source Conference (BOSC 2011), July 15, 2011
“Old” Apollo
What is Apollo?
- Popular open source genome annotation editing tool
- Tool for visually inspecting computational analyses and experimental evidence for genomic features and building a manually curated consensus
- Standalone Java application
In The Olden Days
- Users were required to download and install Apollo
- Customized configurations needed to be pushed to users
- Annotations saved locally in flat files
- Sharing done by emailing files
Apollo in the Olden Days (Flat file)
Starting To Get Better
- Annotations saved directly to a centralized database
- Edits made by other users not visible until you actively reloaded
- Potential issues with stale annotation data
- Apollo software downloaded more transparently by Java Web Start
- Java versioning still an issue
Starting To Get Better
Now We’re Talking
- Web-based (runs in browser)
- No software download required
- Customized configurations automatically pushed to users
- Users automatically see any new data tracks
- Annotations saved to centralized database
- Edit server mediates annotation changes made by multiple users
Now We’re Talking
WebApollo Framework
Web-based Client
- JBrowse
  - JavaScript-based annotation browser
  - Fast
  - Highly interactive
- WebApollo extensions to JBrowse
  - Provide the gestures needed for editing annotations
  - Communicate with the annotation editing engine and the data-providing service
  - HTML5 Canvas rendering of quantitative data
Annotation Editing Engine
- Java
- Handles all the logic for editing
- Edits stored persistently in the server
  - BerkeleyDB JE for fast access
  - Able to restore data if either client or server crashes
- Once annotators are satisfied with annotations, they are stored in a centralized Chado database
- Per-sequence (contig, chromosome, etc.) user permissions (none/read/write)
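The per-sequence permission levels (none/read/write) listed on this slide can be illustrated with a minimal Python sketch. It is not the WebApollo implementation (the engine itself is written in Java); the class and method names are hypothetical, and it assumes that write access implies read access and that a user with no explicit grant has no access.

```python
from enum import IntEnum


class Permission(IntEnum):
    """Permission levels named on the slide; the ordering is an assumption."""
    NONE = 0
    READ = 1
    WRITE = 2


class SequencePermissions:
    """Hypothetical in-memory table of (user, sequence) -> permission."""

    def __init__(self):
        self._grants = {}

    def grant(self, user, sequence, level):
        self._grants[(user, sequence)] = level

    def level(self, user, sequence):
        # Assumption: users without an explicit grant get no access.
        return self._grants.get((user, sequence), Permission.NONE)

    def can_read(self, user, sequence):
        # Assumption: WRITE implies READ.
        return self.level(user, sequence) >= Permission.READ

    def can_write(self, user, sequence):
        return self.level(user, sequence) >= Permission.WRITE


perms = SequencePermissions()
perms.grant("alice", "chr1", Permission.WRITE)
perms.grant("bob", "chr1", Permission.READ)
```

With these grants, "alice" may edit annotations on "chr1", "bob" may only view them, and neither has any access to other sequences.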
Multiple Client Synchronization
- Comet model
- Server pushes annotation updates to all clients in real time
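The Comet idea on this slide, where each client keeps a request open and the server answers as soon as an update exists, can be sketched in Python with one in-memory queue per client. This is only an illustration under stated assumptions: `EditServer`, `connect`, `apply_edit`, and `wait_for_update` are hypothetical names, there is no HTTP layer, and a blocking queue read stands in for the long-lived request.

```python
import queue


class EditServer:
    """Hypothetical edit server that fans out each edit to every client."""

    def __init__(self):
        self._clients = {}  # client id -> queue of pending updates

    def connect(self, client_id):
        self._clients[client_id] = queue.Queue()

    def apply_edit(self, author, edit):
        # A real server would validate and persist the edit first.
        update = {"author": author, "edit": edit}
        for pending in self._clients.values():
            pending.put(update)

    def wait_for_update(self, client_id, timeout=1.0):
        """Block until an update arrives; return None if the wait times out."""
        try:
            return self._clients[client_id].get(timeout=timeout)
        except queue.Empty:
            return None


server = EditServer()
server.connect("client-a")
server.connect("client-b")
server.apply_edit("client-a", "add exon to gene1")
update_for_b = server.wait_for_update("client-b")
```

In this sketch every connected client, including the author, receives the update, so no client has to reload to see another user's change.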
Enabling WebApollo to access as many types of genomic data as possible
- Efficient access to public data from UCSC, Ensembl, and GMOD Chado databases
- Unified strategy like DAS is preferred
- Solution: Trellis, a DAS server framework that:
  - Supports modular back-end plugins
    - Poka Trellis plugin for direct UCSC database access
    - DAS plugin to support Ensembl servers
  - Supports modular front-end content formats
    - JBrowse JSON plugin
UCSC To JBrowse Via Trellis
UCSC MySQL genome database -> Poka plugin -> DAS data model -> JBrowse JSON plugin -> JBrowse client
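The plugin arrangement described for Trellis (a back-end plugin reads a data source into a shared data model, and a front-end plugin serializes that model for the client) can be sketched in Python as follows. This is not Trellis code: `Feature`, `InMemoryBackend`, `JsonFrontend`, and `PluginServer` are hypothetical names, the in-memory list stands in for a database, the JSON layout is not the actual JBrowse format, and coordinates are assumed to be half-open intervals.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Feature:
    """Hypothetical shared data model passed between plugins."""
    seq: str
    start: int
    end: int
    name: str


class InMemoryBackend:
    """Stand-in for a back-end plugin such as a database reader."""

    def __init__(self, features):
        self._features = features

    def fetch(self, seq, start, end):
        # Return features on `seq` that overlap the half-open range [start, end).
        return [f for f in self._features
                if f.seq == seq and f.start < end and f.end > start]


class JsonFrontend:
    """Stand-in for a front-end plugin that emits JSON for a browser client."""

    def render(self, features):
        return json.dumps([asdict(f) for f in features])


class PluginServer:
    """Connects one back-end plugin to one front-end plugin."""

    def __init__(self, backend, frontend):
        self.backend = backend
        self.frontend = frontend

    def query(self, seq, start, end):
        return self.frontend.render(self.backend.fetch(seq, start, end))


backend = InMemoryBackend([
    Feature("chr1", 100, 500, "geneA"),
    Feature("chr1", 900, 1200, "geneB"),
    Feature("chr2", 100, 500, "geneC"),
])
server = PluginServer(backend, JsonFrontend())
response = server.query("chr1", 0, 600)
```

Because the server only depends on the `fetch` and `render` methods, a different data source or output format could be swapped in without touching the other side, which is the point of keeping both ends modular.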
WebApollo Demo
Source Code (BSD License)
- Client source code
- Annotation editing engine
- Data model and I/O layer
- Trellis server code
Acknowledgements
LBNL: Ed Lee, Gregg Helt, Nomi Harris, Suzanna Lewis
UC Berkeley: Mitch Skinner, Ian Holmes
Georgetown University: Christopher Childers, Justin Reese, Jay Sundaram, Christine Elsik
Demo: