Worldwide Protein Data Bank www.wwpdb.org wwPDB Common D&A Project November 24, 2009 Steering Committee Project Update.


1 Worldwide Protein Data Bank www.wwpdb.org wwPDB Common D&A Project November 24, 2009 Steering Committee Project Update

2 Worldwide Protein Data Bank Common D&A Project January 2010 Deliverables
Update report
D&A Team Charge for end of January 2010: Deliver production functionality that will have a significant impact on the annotation workflow.
Agenda:
1. Deliverables
2. Accomplishments
3. What’s keeping us/you up at night
4. Timeline overview

3 Functional Deliverables
• Implement the chemical and model coordinate sequences issue resolution and integration using the Master Format.
• Provide an annotator graphical interface to resolve discrepancies.
• Implement the capability to repeat an incremental process step (GO BACK) under conditions such as:
  – Replacement coordinates packaged in mmCIF or PDB formats
  – Replacement coordinates with updated sequence
  – Replacement chemical sequence
• Integration of these new functionalities into the existing workflows.
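
The GO BACK requirement on this slide can be pictured as a list of steps with a cursor that is rewound when replacement data arrives. The sketch below is illustrative only: the `Workflow` class, its methods and the step names are hypothetical and are not taken from the project's code.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical incremental workflow whose steps can be repeated."""
    steps: list
    position: int = 0
    history: list = field(default_factory=list)

    def run_next(self):
        """Execute the next step and record it in the history."""
        if self.position >= len(self.steps):
            raise IndexError("workflow already complete")
        step = self.steps[self.position]
        self.history.append(step)
        self.position += 1
        return step

    def go_back(self, step):
        """Rewind so that `step` is the next one to be executed."""
        # list.index raises ValueError if the step name is unknown.
        self.position = self.steps.index(step)


# Assumed step names, for illustration only.
wf = Workflow(["upload", "sequence_check", "annotate", "release"])
wf.run_next()
wf.run_next()
wf.run_next()
wf.go_back("sequence_check")  # e.g. replacement coordinates arrived
repeated = wf.run_next()
```

Keeping a history separate from the cursor means a repeated step is recorded twice, which is one plausible way to support the tracking the deck asks for.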

4 Deliverable Details
• Finalization of Physical Data Exchange
• Annotator graphical interface for sequence functionality
• Master Format
• Extended API
• Tracking DB support
• Extended Work Flow Engine (WFE)
• Work Flow Manager (WFM)
• Work Flow Manager User Interface (WFM UI)
• Integration of this “module” of new functionalities into the existing workflows.

5 Physical Data Exchange
• All sites have acquired NetApp hardware for this project.
• The version of NetApp software compatible with all sites has been determined.
• A simplified secure protocol for NetApp communication has been found which avoids the need for extra networking hardware.
• When the release candidate for the NetApp operating system is finalized as general release, in December, all sites will be on the same page for data exchange.

6 Process Overview With GO BACK functionality

7 Deliverable: Annotator Interface
A graphical interface for resolution of structural features
Requirements for display and editing by Annotation staff, including 3D visualization
Resource allocation: RCSB
Technical design: JavaScript/AJAX + CSS
User prototype review
Stress tested prototype with very large sequences
• User testing functional prototype (begins Dec 15)
• Integration with current systems using Master Format (Jan 15)
• In use by annotators by Jan 28
• Integrate with new system (WFE, WFM, API): March

8 Design Convergence – Master Format, API, WFM, WFE, UI
• Distributed development on a complex project is challenging, but we are managing.
• Reached consensus on critical project technologies:
  – Master format & workflow schema
  – Project identifiers
  – Python implementation
  – Division of effort among programming layers
  – Passing communication and control between computational and interactive workflows
  – Requirements and technology platform for sequence editor + 3D viewer

9 Deliverable update: MASTER FORMAT
• A single data dictionary for the project based on the PDB Exchange Dictionary (PDBx). (John)
  – PDBx extended for Common D&A project (deposition data set identifier, WF class ID, WF instance ID, Site ID, Version ID)
• PDBx (mmCIF syntax) data file format will be used as a working format for PDB annotation. (Zukang)
  – Translation between RCSB and PDBx tested with Maxit
  – Conversion tool for PDB to PDBx completed
  – PDBx mapping CIF to PDB within Maxit – ready for testing
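
As a rough illustration of the five project identifiers added to PDBx, the Python sketch below groups them in one record and renders them as mmCIF-style `_category.item value` lines. The category name `_hypothetical_da`, the field names and all sample values are assumptions made for this example; the actual dictionary extension is not shown in the deck.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ProjectIdentifiers:
    """The five identifier kinds listed on the slide; field names are assumed."""
    deposition_data_set_id: str
    wf_class_id: str
    wf_instance_id: str
    site_id: str
    version_id: str

    def to_cif_items(self, category="_hypothetical_da"):
        """Render the identifiers as mmCIF-style `_category.item value` lines."""
        return [f"{category}.{name} {value}"
                for name, value in asdict(self).items()]


# Sample values are invented for illustration.
ids = ProjectIdentifiers("DEP_0001", "sequence", "WF_001", "SITE_A", "1")
lines = ids.to_cif_items()
```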

10 Data and Application API Design
• Unified Python language implementation
• Provides all access to data and applications for the workflow manager and workflow engine
• Subcomponents of the API provide access to:
  – Data objects and data values
  – Applications and tools
  – Tracking and status information
  – Site level configuration information
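
One way to read this design is as a single entry point that exposes the four subcomponents to the workflow manager and engine. The following Python sketch assumes that reading; every class and method name here (`ProjectApi`, `DataAccess`, `ToolRunner`, `Tracking`, `SiteConfig`) is hypothetical, and the in-memory dictionaries stand in for whatever storage the real API uses.

```python
class DataAccess:
    """Data objects and data values (in-memory stand-in)."""
    def __init__(self):
        self._values = {}

    def set_value(self, key, value):
        self._values[key] = value

    def get_value(self, key):
        return self._values[key]


class ToolRunner:
    """Applications and tools, registered by name."""
    def __init__(self):
        self._tools = {}

    def register(self, name, func):
        self._tools[name] = func

    def run(self, name, *args):
        return self._tools[name](*args)


class Tracking:
    """Tracking and status information per deposition."""
    def __init__(self):
        self._status = {}

    def set_status(self, deposition_id, status):
        self._status[deposition_id] = status

    def get_status(self, deposition_id):
        return self._status.get(deposition_id, "unknown")


class SiteConfig:
    """Site level configuration information."""
    def __init__(self, settings):
        self._settings = dict(settings)

    def get(self, name):
        return self._settings[name]


class ProjectApi:
    """Hypothetical facade bundling the four subcomponents."""
    def __init__(self, site_settings):
        self.data = DataAccess()
        self.tools = ToolRunner()
        self.tracking = Tracking()
        self.site = SiteConfig(site_settings)


api = ProjectApi({"site_id": "SITE_A"})
api.data.set_value("sequence", "MKV")
api.tools.register("length", len)
api.tracking.set_status("DEP_0001", "in_progress")
seq_len = api.tools.run("length", api.data.get_value("sequence"))
```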

11 Deliverable update: Extended API
• Site Configuration API Configuration: Division of processing responsibility between the workflow engine and the API decided.
• Workflow Engine/Manager (12/15, Luana)
• Add sequence data methods (11/25, Vladimir, John)
• Solution for identifying and finding things
  – Archival data files
  – Transient files required by workflows for data processing
  – Versioning of data files and key data values within files
  – Progress and tracking workflows
• MySQL support of tracking (12/4, Li)
• Application integration with API and WFE (12/4, Vladimir)

12 Deliverable update: WFE
• Final design – core API communication protocols
• Internal object representation
• Final design – XML schema created (description of WF)
• WFE can process revised WF definitions
• Test suites
• Engine development (12/23, Tom)
• Integration with API, data model, WFM (12/23, Tom)
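
The slide says workflows are described in XML and processed by the engine. The project's schema is not reproduced in the deck, so the Python sketch below uses an invented XML layout (`<workflow>`, `<task name=... next=...>`) purely to show the idea of loading a definition and walking it with the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Invented workflow description; element and attribute names are assumptions.
WORKFLOW_XML = """
<workflow id="example">
  <task name="load" next="check"/>
  <task name="check" next="finish"/>
  <task name="finish"/>
</workflow>
"""


def load_workflow(xml_text):
    """Return a mapping of task name to the name of the task that follows it."""
    root = ET.fromstring(xml_text.strip())
    return {task.get("name"): task.get("next") for task in root.findall("task")}


def execution_order(transitions, start):
    """Follow the `next` links from `start` until a task has no successor."""
    order = []
    current = start
    while current is not None:
        if current in order:
            raise ValueError(f"cycle detected at task {current!r}")
        order.append(current)
        current = transitions[current]
    return order


transitions = load_workflow(WORKFLOW_XML)
order = execution_order(transitions, "load")
```

Because the definition is data, a revised workflow only requires a new XML document, which matches the slide's point that the WFE can process revised WF definitions.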

13 Deliverable update: WFM
Design Functional Architectural design
• Will present progress and tracking information
• Will start/stop and restart the workflow engine in executing data processing tasks
• Will work in a fully distributed web-based mode
• Will provide a launch point for tasks requiring interactive or graphical interactions. Two modes defined:
  – Immediate mode – all processing occurs in a single session (simple case).
  – Deferred mode – requests for input are registered with the workflow manager for later processing by annotator
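
The difference between the two modes can be sketched in a few lines of Python. This is a minimal illustration under assumed names (`WorkflowManager`, `request_input`, `process_pending`), not the project's interface: immediate mode answers within the current session, while deferred mode queues the request until an annotator handles it.

```python
from collections import deque


class WorkflowManager:
    """Hypothetical manager showing immediate versus deferred input requests."""

    def __init__(self):
        self.pending = deque()

    def request_input(self, prompt, mode, answer=None):
        """Return the answer now (immediate) or register the request (deferred)."""
        if mode == "immediate":
            return answer
        if mode == "deferred":
            self.pending.append(prompt)
            return None
        raise ValueError(f"unknown mode: {mode!r}")

    def process_pending(self, answer_for):
        """Later session: the annotator answers every registered request."""
        results = {}
        while self.pending:
            prompt = self.pending.popleft()
            results[prompt] = answer_for(prompt)
        return results


wfm = WorkflowManager()
now = wfm.request_input("confirm sequence", "immediate", answer="yes")
later = wfm.request_input("resolve mismatch", "deferred")
queued = len(wfm.pending)
resolved = wfm.process_pending(lambda prompt: "accepted")
```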

14 Deliverable update: WFM UI
• WFM – Annotator UI (Luana)
• Requirements (12/3, annotator team)
• Design (12/10)
• Development (1/15)
• WFM Development (1/21, Luana)
• Integration with WFE, API (2/4, Vladimir, Luana)
• User Testing (2/28, all)

15 Deliverable: GO BACK FUNCTIONALITY
• Master Format
• Workflow execution environment (WFE, WFM)
• Session management and tracking infrastructure

16 Things that have kept us up at night
• These are cornerstone deliverables requiring intense study and design consideration, beyond the proof of concept.
  – Organization of data, communication protocols, etc.
  – Clear consensus on design features has required an evolution of understanding, which in turn required hands-on work
• Ramp up of skill sets: Python, mmCIF (PDBe)
• EBI external services: web-service set up
• Site specific integration challenges
• Resource issues

17 Good News (from your local PM)
• Team is VERY FUNCTIONAL
  – A lot has been accomplished despite distributed team members and multi-tasking resources
• Consensus on difficult issues, starting from considerable philosophical distances, has been achieved!
  – No bloodshed to date – all limbs intact
• Team is still highly motivated to succeed with this project!

18 Timeline Summary
• Functional Interface
  – Integrated in existing systems: January 15, 2010
  – In use by annotators by January 28, 2010
• Full integration of WFM UI with WFE, WFM and API: February 4, 2010
• Testing completed by February 28, 2010

19 PDBe integration
• There are significant changes to the PDBe annotation
  – PDBe data model -> D&A data model – import
  – Load D&A data model with status and domain data
  – Start web services/connect to web resources
• External services at EBI
  – Run workflows
• Implement programs at PDBe
  – Export data from D&A data model to PDBe data model
  – Requires Glen, who will be away for December, to integrate path

