wFleaBase: Daphnia Genome Database from Common Components
Daphnia Genomic Consortium Meeting, Sept
Don Gilbert,
A Replicable Genome infOrmation System (Argos) | flybase.net/flybase-ng
common/
    java/ ; perl/ -- program libraries and packages
    servers/ -- major programs (BLAST, MySQL/PostgreSQL, others)
    systems/ -- OS executables of programs
daphnia/ .. implemented organism genome systems
eugenes/
flybase/
docs/ & install/ -- Argos instructions and usage
template/ -- structure for new projects
ROOT/ -- common directory of installed projects
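As a rough illustration of how template/ could seed a new project under ROOT/, here is a minimal Python sketch. It assumes a new project starts as a plain copy of template/; the function name `new_project` and the project name `newflea` are hypothetical, and the Argos install instructions in docs/ and install/ may prescribe a different procedure.

```python
import shutil
import tempfile
from pathlib import Path


def new_project(root: Path, name: str, template: str = "template") -> Path:
    """Copy ROOT/template/ to ROOT/<name>/ and return the new project path.

    Assumption: a new project begins as a copy of the template directory.
    """
    src = root / template
    dst = root / name
    if not src.is_dir():
        raise FileNotFoundError(f"no template directory at {src}")
    if dst.exists():
        raise FileExistsError(f"project already exists: {dst}")
    # symlinks=True keeps links (for example a link to common/) as links
    # instead of copying the shared tools into every project.
    shutil.copytree(src, dst, symlinks=True)
    return dst


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than a real ROOT/.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "template" / "conf").mkdir(parents=True)
        print(new_project(root, "newflea"))
```

Refusing to overwrite an existing project directory is a deliberate choice in this sketch, so an established organism system such as daphnia/ cannot be clobbered by accident.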
Argos features
Common genome tool set
- Share benefits of “best of breed” genome tools
- Common parts are tested & maintained by others
- Minimal IT expertise (no compiles or system management)
- Choice of tools (existing or new genome DB use parts desired)
Flexible project packages
- Project needs specify tool set (compare EnsEMBL where all use one set)
- Own look’n’feel web pages, contents, functions
- Security for protected and public sections
Easy replication to any Unix computer
- ‘Live’ database system replication using rsync
- Keep remote servers up-to-date every day
- Local cluster/grid for high-volume traffic
- Works on common workstations, laptops
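The rsync-based replication mentioned above might look roughly like the following Python sketch. The flags (`-a`, `-z`, `--delete`, `--dry-run`) are standard rsync options, but the choice of flags, the paths, and the host `mirror.example.org` are assumptions for illustration, not the actual Argos replication script.

```python
import subprocess


def rsync_command(source: str, dest: str, dry_run: bool = False) -> list[str]:
    """Build an rsync command line that mirrors `source` into `dest`."""
    # -a preserves permissions, times and symlinks; -z compresses in transit;
    # --delete removes files on the mirror that no longer exist at the source.
    cmd = ["rsync", "-az", "--delete"]
    if dry_run:
        cmd.append("--dry-run")
    # A trailing slash on the source copies the directory's contents
    # rather than nesting the directory itself inside dest.
    cmd += [source.rstrip("/") + "/", dest]
    return cmd


def replicate(source: str, dest: str, dry_run: bool = False) -> int:
    """Run rsync and return its exit status (0 means success)."""
    return subprocess.run(rsync_command(source, dest, dry_run), check=False).returncode


if __name__ == "__main__":
    # Hypothetical paths and host; print the command instead of running it.
    print(" ".join(rsync_command("/ROOT/daphnia", "mirror.example.org:/ROOT/daphnia/")))
```

Running `replicate` once a day from a scheduler such as cron would be one way to keep remote servers current; the slides do not say which scheduler is used.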
Argos common parts
Java
- common library, Ant builds, XML Tools, Web Services (Axis), Lucene for “Google”-like searches
Perl
- common library of BioPerl, GBrowse, others
Servers include
- Apache, Tomcat web servers
- MySQL, PostgreSQL databases
- BLAST (NCBI)
Systems
- compiled for apple-powerpc-darwin, intel-linux, sun-sparc-solaris
wFleaBase structure
Cgi-bin -- Web programs (Perl)
Common -- Link to common, shared tools
Conf -- Site configurations for web, data
Data -- Bulk data & FTP site folder
Dbs -- Project databases: blast, lucene, mysql
Indices -- Database indices
Lib -- Program libraries
Web -- Web structure and documents
- Genomics, Sequences, Maps, Literature, Stocks, Docs, other
- includes Public and Protected (project member only) parts
Webapps -- Web programs (Java)
- includes Search system, Secure web and editing
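A replicated copy of this layout can be sanity-checked with a short script. The sketch below is only an illustration: it assumes the nine top-level folders listed above are all required, compares names case-insensitively (the on-disk capitalization is not stated in the slides), and uses the hypothetical helper names `missing_parts` and `check_project`.

```python
from pathlib import Path

# Top-level parts taken from the structure list above, lowercased for comparison.
EXPECTED_PARTS = (
    "cgi-bin", "common", "conf", "data", "dbs",
    "indices", "lib", "web", "webapps",
)


def missing_parts(present) -> list[str]:
    """Return the expected parts that are absent from `present` (case-insensitive)."""
    have = {name.lower() for name in present}
    return [part for part in EXPECTED_PARTS if part not in have]


def check_project(project_dir: str) -> list[str]:
    """List the expected parts missing from a project directory on disk."""
    return missing_parts(entry.name for entry in Path(project_dir).iterdir())


if __name__ == "__main__":
    # Example: a copy that has everything except the Java web applications.
    print(missing_parts(["Cgi-bin", "Common", "Conf", "Data", "Dbs", "Indices", "Lib", "Web"]))
```

Keeping `missing_parts` independent of the filesystem makes it easy to test, while `check_project` applies it to a real directory after a replication run.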
Search wFleaBase
BLAST wFleaBase
Edit wFleaBase
Where to put Daphnia Genome?
Database needs
- Automated annotation and curated updates
- Search and retrieve data subsets
Choices
- EnsEMBL - working now, Gramene & others use
- GMOD:Chado - in development (FlyBase, WormBase, ChlamyGenome, TIGR, others will use)
- Other choices?
Generic Model Organism Database Construction Set
- Genome+ Database (more than annotations)
- Genome visualization tools
- Genome annotation pipeline planned
- Literature curation and Gene Ontology tools
- Component system (pick and choose)
- Developing - more complete in
EnsEMBL Genome Database
- Genome annotation database
- Genome visualization tools
- Genome annotation pipeline
- Comprehensive system (all or none)
- Production - usable now
From Shawn Hoon, Fugu Informatics Group
wFleaBase issues
- Basic web system ready for genome data?
- Start with EnsEMBL for management; move to GMOD:Chado if better choice?
- Add GMOD GBrowse; Apollo Editor with genome
- Add “Self-service” database features for?
    - Easy management by scientists
    - Genome data; stocks; research literature
- Add evolutionary, ecological, environmental data
- Prototype at
GBrowse Maps
Apollo Annotator