Genome Workbench
Chuong Huynh, NIH/NLM/NCBI
New Delhi, India, October 1, 2004
huynh@ncbi.nlm.nih.gov
Slides from Michael Dicuccio’s Genome Workbench talk, June 21, 2004
Obtaining Genome Workbench
- Not officially released to the public
  - Release expected around the end of October 2004 (tentative)
- Beta version snapshots: ftp://ftp.ncbi.nih.gov/toolbox/gbench/
NCBI Genome Workbench
Genome Workbench: Goals
- Provide an interactive, client-side GUI
- Provide a full suite of annotation tools
  - Sequin does a lot of this, but it is older code and primarily a submission tool
- Provide a platform for visualization and analysis
- Provide a platform that offers easy extensibility
Why Client-Side?
- Clients are now pretty fast
  - You can actually BLAST genomes on the client side!
- Access to private data
  - “If you can’t bring the data to GenBank, bring GenBank to the data!”
  - Not just private data: extends to private data sources and data management
- Ability to mix and match analytical methods
Application Architecture
- Core application provides application services, data management, and standard dialogs and components
- Plug-ins handle most of the requests
  - Everything is a plug-in
Plugin Manager
Core Application: MVC
- MVC = Model / View / Controller
  - 30+ year old paradigm for applications
  - Separates the responsibilities of the application into discrete components
- Genome Workbench uses this extensively
  - Model = data being viewed
  - View = viewers on this data
  - Controller = application, editing framework (under construction)
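The split described on this slide can be sketched in a few lines of Python. This is only an illustration of the Model / View / Controller pattern, not Genome Workbench code: the class names (`SequenceModel`, `LengthView`, `GcView`, `Controller`) and the sequence data are hypothetical.

```python
class SequenceModel:
    """Model: holds the data and notifies attached views when it changes."""

    def __init__(self, seq_id, residues):
        self.seq_id = seq_id
        self.residues = residues
        self._views = []

    def attach(self, view):
        self._views.append(view)

    def set_residues(self, residues):
        self.residues = residues
        for view in self._views:
            view.refresh(self)


class LengthView:
    """View: one rendering of the model (here, a text summary of its length)."""

    def __init__(self):
        self.text = ""

    def refresh(self, model):
        self.text = f"{model.seq_id}: {len(model.residues)} bases"


class GcView:
    """View: a second, independent rendering of the same model."""

    def __init__(self):
        self.text = ""

    def refresh(self, model):
        gc = sum(1 for base in model.residues.upper() if base in "GC")
        pct = 100.0 * gc / len(model.residues) if model.residues else 0.0
        self.text = f"{model.seq_id}: {pct:.1f}% GC"


class Controller:
    """Controller: the component through which edits reach the model."""

    def __init__(self, model):
        self.model = model

    def append(self, residues):
        self.model.set_residues(self.model.residues + residues)


# Hypothetical data: two views observe one model; one edit updates both.
model = SequenceModel("seq1", "ATGC")
length_view, gc_view = LengthView(), GcView()
model.attach(length_view)
model.attach(gc_view)
Controller(model).append("GGCC")
```

Because the views only read the model, a new viewer can be added without touching the data or the editing code.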
Extensibility: Plug-Ins
- Framework provides standard interfaces for defining and manipulating plug-ins
- Dynamically loaded at runtime; only loaded when needed
- Plug-ins live in shared libraries
  - A library can hold more than one plug-in
- No need to rebuild the entire application to add new features
- Three types: data sources, viewers, algorithms
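A minimal sketch of load-on-demand plug-ins follows, assuming a simple registry that stores a loader per plug-in and calls it only on first use. The names `PluginRegistry`, `PluginKind`, and `ReverseComplement` are hypothetical and do not come from the Genome Workbench API; in a real application the loader would open a shared library rather than call a Python class.

```python
from enum import Enum


class PluginKind(Enum):
    """The three plug-in types named on the slide."""
    DATA_SOURCE = "data source"
    VIEWER = "viewer"
    ALGORITHM = "algorithm"


class PluginRegistry:
    """Registers plug-ins by name and instantiates each one on first request."""

    def __init__(self):
        self._loaders = {}  # name -> (kind, zero-argument loader)
        self._loaded = {}   # name -> plug-in instance

    def register(self, name, kind, loader):
        # Registration is cheap: nothing is loaded yet.
        self._loaders[name] = (kind, loader)

    def names(self, kind):
        return sorted(n for n, (k, _) in self._loaders.items() if k is kind)

    def is_loaded(self, name):
        return name in self._loaded

    def get(self, name):
        # Load lazily, then reuse the same instance on later requests.
        if name not in self._loaded:
            _, loader = self._loaders[name]
            self._loaded[name] = loader()
        return self._loaded[name]


class ReverseComplement:
    """Hypothetical algorithm plug-in."""
    _PAIRS = str.maketrans("ACGTacgt", "TGCAtgca")

    def run(self, residues):
        return residues.translate(self._PAIRS)[::-1]


registry = PluginRegistry()
registry.register("revcomp", PluginKind.ALGORITHM, ReverseComplement)
registry.register("text_view", PluginKind.VIEWER, lambda: object())
```

New features are added by registering another entry; the registry itself does not change, which mirrors the "no rebuild" point above.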
Extensibility: Scripting
- Wrap C++ interfaces with a bit of glue to make them available to scripting languages
- Goals are two-fold:
  - Obtain a command console for the scripting language
  - Write plug-ins entirely in a scripting language
- Initial focus on Perl and Python; others to be added
  - Python completed; Perl almost completed
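The command-console goal might look roughly like the sketch below, which uses Python's standard `code` module to run script lines against an application object. The `Workbench` class and its `open_document` method are hypothetical stand-ins for a wrapped C++ interface; the actual Genome Workbench bindings are not shown in these slides.

```python
import code


class Workbench:
    """Hypothetical stand-in for an application object wrapped from C++."""

    def __init__(self):
        self.opened = []

    def open_document(self, seq_id):
        self.opened.append(seq_id)
        return len(self.opened)


app = Workbench()

# Expose the application object to scripts under the name "app".
console = code.InteractiveConsole(locals={"app": app})


def run_script(lines):
    """Feed script lines to the console one at a time, as a user would type them."""
    for line in lines:
        console.push(line)


run_script([
    "n = app.open_document('seq1')",
    "n = app.open_document('seq2')",
])
```

Calling `console.interact()` instead of `run_script` would start an interactive prompt with the same `app` object in scope.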
Client-Side Benefits
- Data caching
  - Data in GenBank is updated, but updates to individual sequences are infrequent
  - This pattern of use is frequently optimal for caching
- BLAST request caching
  - BLAST requests are valid for 24 hours
  - Request IDs are unique, so they can be cached on the client side
- Directory indexing
  - Can index directories of files
  - Can search by content, molecule type, IDs, etc.
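The BLAST request caching idea can be illustrated with a small time-limited cache keyed by request ID. This is a sketch under stated assumptions, not the Genome Workbench implementation: `RequestCache` is a hypothetical name, the request ID is an arbitrary string, and the only detail taken from the slide is the 24-hour validity window.

```python
import time

DAY_SECONDS = 24 * 60 * 60  # validity window stated on the slide


class RequestCache:
    """Client-side cache of results keyed by unique request ID, with expiry."""

    def __init__(self, ttl=DAY_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock   # injectable so expiry can be tested without waiting
        self._entries = {}    # request_id -> (stored_at, result)

    def put(self, request_id, result):
        self._entries[request_id] = (self._clock(), result)

    def get(self, request_id):
        """Return the cached result, or None if it is missing or has expired."""
        entry = self._entries.get(request_id)
        if entry is None:
            return None
        stored_at, result = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[request_id]
            return None
        return result
```

A caller would check `get` before contacting the server and call `put` once results arrive, so a repeated request within the window costs no network traffic.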
Some Functionality NOT Enabled
- Only blastn works over the network
- Choose From Other Documents
  - Does not work in the 20040712 build
- Access to the NCBI network through a proxy