Published by Frederick Strickland. Modified over 9 years ago.
1
November 6, 2009 www.wwpdb.org
2
November 6, 2009 Agenda. Summary Overview: Haruki Nakamura; Common Deposition and Annotation Tool: Martha Quesada; Method- and Molecule-specific Activities: John Markley, Gerard Kleywegt; Policy and New Ventures: Helen Berman; PDB Journal: Matt Day
3
November 6, 2009 Summary Overview Haruki Nakamura Worldwide Protein Data Bank www.wwpdb.org
4
November 6, 2009 wwPDBAC 2008 Recommendations Establish validation procedures for X-ray and NMR –X-ray Task Force draft report; NMR Task Force established Establish recommendations for additional data deposition and release requirements –Policy document updated Work with SAXS/SANS community –First report completed Definition of purview of the PDB –For approval at this meeting Establish feasibility of chemical shift depositions –Implementation plan Independent funding for wwPDB –wwPDB Foundation established Develop wwPDBAC membership plan –Done Establish EM Task Force –First EMDB AC Meeting held March 2009 Broaden wwPDBAC to include China and India –Associate Members: Zihe Rao (China) and Manju Bansal (India)
5
November 6, 2009 wwPDB October 2008 - September 2009: New leadership at PDBe (Gerard Kleywegt); Funding stable at 4 sites; Continued growth of archive; Increased use of data; PDB Archive Version 3.15 released: archival snapshots; Further coordination of FTP and web updates; Common Tool Project underway; Establishment of wwPDB Foundation; Continued outreach
7
November 6, 2009 Funding RCSB PDB competitive renewal funded by NSF –January 2009 - December 2013 PDBe competitive grant from Wellcome Trust –December 2009 - November 2014 –Additional new permanent EMBL posts from 2013: stable core of at least 15 permanent posts (up from 6 in 2008) PDBj competitive renewal funded by JST (Japan Science & Technology Agency) –April 2006 - March 2011: Current program –April 2011 - March 2014: A new funding system for life-science databases is planned BMRB competitive renewal funded by the National Library of Medicine –September 2009 – August 2014 (parent grant) –September 2009 – August 2011 (administrative supplement – US Recovery Act funding) –September 2009 – August 2011 (competitive renewal – US Recovery Act funding)
8
November 6, 2009 PDB Depositions. [Charts: by experimental type; by deposition and processing site. * projected (8322)]
9
November 6, 2009 Increase of PDB data depositions from Asia and Oceania regions China Hong Kong Taiwan Australia India New Zealand
10
November 6, 2009 PDB Downloads. [Chart annotations: * 1st month after version 3.0/3.1 files released; * 1st month after version 3.15 files released]
11
November 6, 2009 PDB FTP Traffic (July 2008 - June 2009) RCSB PDB 200 million data downloads PDBe 37 million data downloads PDBj 14 million data downloads
12
November 6, 2009 Remediation Rollout Complete PDB File Format Contents Guide Version 3.20 released (Sep 2008) PDB format version 3.15 files released (March 2009) –New records/remarks: describe models, ligands and zero occupancy atoms/residues –Enhancements: assemblies, SITE records, chain IDs, database references (including taxonomy id, PubMed, DOI IDs) added for all files –Various corrections (including sequence, beamline, wavelength, atom connectivities, atom nomenclature, mmCIF consistency) –Missing remarks restored
13
November 6, 2009 Common Tool for Deposition and Annotation Manage increased data load without an increase in resources Create global deposition and annotation tools Proof of concept delivered July 2009 First test system due June 2010
14
November 6, 2009 wwPDB Foundation Will enable funding for wwPDBAC meetings and outreach activities Bylaws created Paperwork filed
15
November 6, 2009 Outreach: Joint publications; Meeting posters, exhibit booths, and presentations –eChemInfo, EMBO Practical Course: Macromolecular Crystallography, Biophysical Society, Experimental Biology, Biocuration Meeting, ISMB/ECCB, Protein Society, ACA, ACS, AsCA; Website; Helpdesk. [Photo: at AsCA’09, Beijing: Se Won Suh, Mitchell Guss]
16
November 6, 2009 Formalized Internal Communications Phone conferences among site directors Regular exchange visits Weekly VTCs among staff –Common Tool –EM –NMR –Data annotation Electronic reporting (validation, structure assignments, FTP updates, etc.) wwPDB Retreat 2009
17
November 6, 2009 Common Deposition and Annotation (D&A) Tool Martha Quesada Worldwide Protein Data Bank www.wwpdb.org
18
November 6, 2009 2007 wwPDB Retreat: What strategic objectives should we start to address now to meet our goals in the next 5 – 10 years? [Chart axes: impact, cost]
19
November 6, 2009 Drivers and Opportunities wwPDB Common D & A Tool Project New deposition types: multiple methods and new data New validation procedures Need to support higher throughput Limited resources for new development and maintenance Process inefficiencies –Redundant tools in use: AutoDep and ADIT, data harmonization required Good collaborations among sites –Possibility of sharing of workload –Precedent for common tools for NMR and EM
20
November 6, 2009 Inflection Point. [Chart of usage and impact over time; labels: Business as usual, Ready for today, New capabilities, Change the game] Decision to come together as the wwPDB in developing the tools that will support the shared functions of the wwPDB for the next 10 years
21
November 6, 2009 wwPDB Common D&A Tool Project The goal is to implement a set of common deposition and annotation processes and tools that will enable the wwPDB to deliver a resource of increasingly high quality and dependability over the next 10 years. The tools will address: the increase in complexity and experimental variety of submissions and the increase in deposition throughput The processes and tools will maximize the efficiency and effectiveness of data handling and support for the scientific community
22
November 6, 2009 Project Scope Common deposition interface and processing –Coordinates (x,y,z) regardless of experiment origin (X-ray, NMR, EM) –NMR restraints, chemical shifts –X-ray structure factors –3D-EM maps Data processing: Validation Tools –Fit of model to data –Coordinates for polymers and ligands –NMR Restraints –Structure factor validation –EM map validation Partly developed by External Task Forces To be integrated over time
23
November 6, 2009 Assumptions & Constraints Functional requirements –Deposition tool must handle all current, agreed upon data entry and report formats from the user community –All data elements covered within the PDB annotation manual must be included Technical requirement –Design will enable flexibility for growth and evolution –Technical level reasonable standard, not bleeding edge or declining –Design must enable integration with community data capture
24
November 6, 2009 For example Deposition will capture all currently deposited experimental data for each method The tool will support all data formats and validation requirements for all deposition types The system will allow for workload balancing during deposition
25
November 6, 2009 Common D&A Project Characteristics: Large and complex – Break into smaller bits; Distributed developers – Establish development controls and communication; Some requirements well understood – Policies in place; Some requirements evolving; Potentially novel designs require early experimentation. [Diagram: Iteration 1, Iteration 2, … Iteration n, each cycling through steps 1–6; after each iteration an increment is delivered and lessons learned are reviewed]
26
November 6, 2009 Modern Requirements Analysis Process (workflow) driven: As is vs. To be –What exactly do we actually do? –Analyze current processes –What works and where are the opportunities? –What would be the ideal process? Alignment of requirements to workflow –Functional requirements calculations, decision trees, reports, communication –Data flow requirements –Technical (strategic and tactical) requirements
27
November 6, 2009 Project Phases, Structure and Roles Steering Committee –Governance –Milestone reviews and guidance Concept Team –Initial requirements and design Core Team (Functional leaders) –Plan and manage the project Project Teams –Design, develop and test component solutions –Deliver the solution. [Phase diagram: Initiation, Concept, Reqmt, Design, Develop & Test, Delivery]
28
November 6, 2009 Communication and Coordination Among all of the project stakeholders –Inward facing –Outward facing Between distributed functional and development groups
29
November 6, 2009 Initiation and Concept. Steering Committee (wwPDB Directors), October 2007: Sets project goal, scope, assumptions and constraints, and initial timelines; Approves project at each milestone. Concept Team, November 2007: Objectives, strategies and metrics; Stakeholder analysis & risk assessment; New system requirements; Concept process maps; Approved May 08
30
November 6, 2009 Objectives & Strategies Improve data quality beginning at data capture –Provide for interactive feedback and value to the depositors during the deposition process –Employ community-driven validation methods Improve efficiency –Standardization, automation and more flexible data sharing Improve existing tools –Use “best of breed” existing tools where possible –Free resources to redevelop/develop new common tools Enable system maintenance and evolution –system modularity
31
November 6, 2009 Initiation of Design and Planning The Core Team, representing the functional groups and sites, leads the project through design and implementation in conversation with the Steering Committee RCSB PDB: John Westbrook, Jasmine Young; PDBe: Tom Oldfield, Sameer Velankar, Jawahar Swaminathan; BMRB: Steve Mading, Eldon Ulrich; PDBj: Takanori Matsuura. [Phases: Reqmt, Design, Develop & Test]
32
November 6, 2009 Project Team: Distributed Delivery. Subject and technical experts from all sites; Quarterly face-to-face meetings; Weekly VTC team working meetings; On-going teleconferences and email; Shared web-based document and code management tools. [Phases: Reqmt, Design, Develop & Test]
33
November 6, 2009 Key Design Elements Modular construction through an API Reuse of “best of breed” existing tools; redevelop tools as time and need dictate. Enable system maintenance and evolution Improved workflow efficiency for faster processing Workflow automation - workflow engine and manager Improved collaboration More flexible data sharing Proposed technical design and deliverables reviewed and approved by the Steering Committee
34
November 6, 2009 July 2009 Technical Design Proof Of Concept Application Programming Interface: API –“wrapped” application functionality Faster processing through improved efficiency –workflow automation implementation Improved collaboration –SnapMirror tested Potentially novel designs require early experimentation. [Architecture diagram components: Python Core API Layer; C/C++ Apps; Fortran Apps; RDBMS; Other Services; Workflow Engine]
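The slide's idea of "wrapped" application functionality behind a Python core API layer can be illustrated with a minimal sketch. Everything named here (the `WrappedApp` class, its `run` method, and the stand-in command) is hypothetical and not taken from the wwPDB code base; the sketch only assumes that legacy C/C++ or Fortran programs are invoked as command-line executables.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class WrappedApp:
    """Hypothetical wrapper exposing a command-line program as a Python call."""

    name: str
    command: List[str]  # base command line, e.g. a compiled C/C++ or Fortran binary

    def run(self, *args: str) -> str:
        # Run the external program and return its standard output as text.
        # check=True raises CalledProcessError if the program exits non-zero.
        result = subprocess.run(
            self.command + list(args),
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


# Stand-in for a legacy application: the Python interpreter upper-casing its argument.
echo = WrappedApp(
    name="echo",
    command=[sys.executable, "-c", "import sys; print(sys.argv[1].upper())"],
)
```

Calling `echo.run("1abc")` goes through the same interface a workflow engine could use for any wrapped tool, which is one way modular construction "through an API" might look in practice.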
35
November 6, 2009 January 2010 - Production Deliverable Implementation of an annotation module With GO BACK functionality Expansion of workflow proof of concept Implementation of the API using existing functionality and the “Master Format” Introduction of “Go Back” functionality Improved user interface Integrated with existing workflows.
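The slides do not define how "Go Back" behaves, so the following is only a sketch under an assumed reading: an annotator can step back to the previous stage of a linear workflow. The `Workflow` class and the stage names are hypothetical.

```python
class Workflow:
    """Hypothetical linear workflow that supports stepping back one stage."""

    def __init__(self, steps):
        if not steps:
            raise ValueError("a workflow needs at least one step")
        self.steps = list(steps)
        self.position = 0  # index of the current step

    def current(self):
        return self.steps[self.position]

    def advance(self):
        # Move to the next step; stay on the last step if already there.
        if self.position < len(self.steps) - 1:
            self.position += 1
        return self.current()

    def go_back(self):
        # Return to the previous step; stay on the first step if already there.
        if self.position > 0:
            self.position -= 1
        return self.current()


# Assumed stage names, for illustration only.
flow = Workflow(["deposit", "validate", "annotate", "release"])
flow.advance()   # now at "validate"
flow.advance()   # now at "annotate"
flow.go_back()   # back at "validate"
```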
36
November 6, 2009 Project Progress. July 2008: Initiate Design and Development Planning (Core Team); Nov 2008: Define Data Model Requirements (Project Team Meeting), Flesh out Design elements; March 2009: Finalized design elements and initiate development of “proof of concept”; July 2009: Deliver design “proof of concept”; January 2010: First production deliverable. [Timeline, 4Q 2007 to 2011: Initiation, Concept, Requirements, Design, Development, Test, Delivery]
37
November 6, 2009 wwPDB Common D&A Tool Project Timeline Going Forward. [Timeline, 4Q 2007 to 2011. Phases: Initiation, Concept, Requirements, Design, Development, Test, Delivery. Activities: Concept; Define deliverables; Initial design; Process definition; Data model definition; Requirements elaboration; Data flow documentation; Technical Design; Data Sharing & Replication; API, Master Format; Automated Workflow; Technical Proof of Concept; Development of initial production deliverable; Communication design production deliverables; D&A system delivery]
38
November 6, 2009 Ultimate Project Deliverables For Depositors –Interactive and informative deposition interface –Value added validation input and annotation during deposition –Faster processing For Annotators –Improve efficiency, freeing time for more advanced annotation Improved quality early in the process Automation of appropriate processing steps Best of breed tools Expanded functionality –Enable system maintenance and evolution through system modularity For Data Users –Higher Quality Archive
39
November 6, 2009 Method and Molecule-specific Activities John Markley Gerard Kleywegt Worldwide Protein Data Bank www.wwpdb.org
40
November 6, 2009 NMR
41
November 6, 2009 NMR Update: Remediated NMR restraints project is near completion; NMR Validation Task Force established –First meeting held Sept 21, 2009, in Paris, France; Implementation plan for Chemical Shifts requirement in progress; Status of SMSDep
42
November 6, 2009 NMR Validation Task Force: Charge Advise on validation of new NMR data depositions Provide a report for the wwPDB AC Provide recommendations for structure validation criteria and tools –Tools and procedures recommended should be freely available and simple to install and maintain so users can easily use in own laboratories –Tools should not be used as a basis to “reject” structures, but to flag potential problems for the depositor/user to be aware of –Recommendations should be assembled into a “white paper” for publication –Recommendations should be targeted to software developers, depositors, journal editors, and PDB users
43
November 6, 2009 NMR Validation Task Force. Committee Members: Gaetano Montelione (Co-Chair, Rutgers); Michael Nilges (Co-Chair, Institut Pasteur); Ad Bax (NIH)*; Peter Guentert (University Frankfurt); Torsten Herrmann (CNRS/ENS Lyon); Jane Richardson (Duke University); Charles Schwieters (NIH); Geerten Vuister (Radboud University)*; David Wishart (University of Alberta). Meeting Observers: Naohiro Kobayashi (PDBj-BMRB); John Markley (NMR VTF Organizer); Randy Read (Chair, X-ray VTF); Eldon Ulrich (BMRB)*; Wim Vranken (PDBe); John Westbrook (RCSB PDB). * Notes on the Paris meeting: Ad Bax and Eldon Ulrich were unable to attend the meeting; Jurgen Doreleijers attended as a substitute for Geerten Vuister
44
November 6, 2009 NMR VTF: Outcome of first meeting September 21, 2009; Paris, France General consensus on the value of expanded NMR validation for the scientific community. Consensus on coordination with X-ray VTF on common validation issues. Requirements and available tools for validation were assessed during the meeting. Areas targeted for further research: format consistency for restraints, treatment of internal dynamics and ensemble averaging. Website and mail archive created to support task force communication
45
November 6, 2009 Chemical Shifts: Progress in implementation BMRB has been the primary deposition and processing site for NMR chemical shift (CS) data Mandatory chemical shift and reference data items have been defined, and a prototype mandatory CS system is in place wwPDB to perform minimal processing: –format and sanity checks at deposition –substitute explicit atoms for pseudo-atoms –maintain nomenclature correspondence during annotation Data files are to be transferred to BMRB for further annotation PDB will release chemical shift files in NMR-STAR format along with coordinate data files Download statistics for chemical shift data files will be maintained for BMRB (needed for grant reporting)
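The step "substitute explicit atoms for pseudo-atoms" could look roughly like the sketch below. The lookup table is a tiny illustrative subset using common IUPAC-style names, not the mapping the wwPDB actually uses, and the function name `expand_shift` is hypothetical. The sketch assumes the shift reported for a pseudo-atom is simply copied to each explicit atom it stands for.

```python
# Illustrative subset only: (residue, pseudo-atom) -> explicit atom names.
PSEUDO_ATOMS = {
    ("ALA", "MB"): ["HB1", "HB2", "HB3"],      # alanine methyl group
    ("GLY", "QA"): ["HA2", "HA3"],             # glycine alpha protons
    ("VAL", "MG1"): ["HG11", "HG12", "HG13"],  # first valine gamma methyl
}


def expand_shift(residue, atom, shift_ppm):
    """Return (atom, shift) pairs with any known pseudo-atom replaced by explicit atoms."""
    explicit = PSEUDO_ATOMS.get((residue, atom), [atom])
    return [(name, shift_ppm) for name in explicit]
```

Atoms that are not in the table pass through unchanged, so `expand_shift("ALA", "HA", 4.3)` returns a single pair.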
46
November 6, 2009 SMSDep Deposition system is in place and accepting structures and associated NMR chemical shifts Current policy is to accept data only for small peptides or nucleic acids (processing and annotation is carried out at PDBj-BMRB) We need to monitor the level of activity to determine whether this site should be maintained
47
November 6, 2009 X-ray
48
November 6, 2009 wwPDB X-ray Validation Task Force Initial meeting –April 14-16, 2008 EBI, Hinxton, UK –R. Read (Chair), P. Adams, A. Brunger, P. Emsley, R. Joosten, G. Kleywegt, E. Krissinel, T. Luetteke, Z. Otwinowski, T. Perrakis, J. Richardson, W. Sheffler, J. Smith, I. Tickle, G. Vriend Goal –Gather recommendations and consensus on additional validation for PDB entries, and identify software applications for these validation tasks –Provide code/algorithms for the validation-software pipeline Preliminary Outcome –Candidate global and local validation measures were identified –These measures were reviewed in terms of the requirements of depositors, reviewers, and users
49
November 6, 2009 X-ray Validation Task Force: Next Steps May 2008 - September 2009: discussions (e-mail, Gordon Conference) and report writing October 2009: Meeting to complete report during Cold Spring Harbor Laboratory Crystallography Course November 2009: Report presented at wwPDBAC wwPDB partners are pooling manpower to implement Task Force recommendations –One dedicated programmer to implement the validation-software pipeline (Swanand Gore) Validation tools and procedures will also be incorporated in the new wwPDB Common Deposition and Annotation system
50
November 6, 2009 wwPDB X-ray Validation Task Force Apply new knowledge of structure –proteins, nucleic acids, carbohydrates, ligands New opportunities from mandatory data –fit to data, quality of data, pathologies Exploit new technologies –machine-readable annotation Serve the different communities –users, depositors, editors/referees
51
November 6, 2009 Ramachandran revisited
52
November 6, 2009 Clashes and holes
53
November 6, 2009 One intuitive summary of quality
54
November 6, 2009 Members of X-ray VTF: Paul Adams, Axel Brunger, Paul Emsley, Robbie Joosten, Gerard Kleywegt*, Eugene Krissinel, Thomas Lütteke, Zbyszek Otwinowski, Tassos Perrakis, Jane Richardson, Will Sheffler, Janet Smith, Ian Tickle, Gert Vriend, Randy Read
55
November 6, 2009 SAXS/SANS
56
November 6, 2009 SAXS/SANS Increase of SAS publications for structural biology –Higher intensity sources –Advances in data analysis and modeling tools Increase in number of deposition requests since 2005 Number of hits from simple search of publications by using protein structure and small angle scattering
57
November 6, 2009 SAXS/SANS: Current Status Two types of models –Model 1: atomic model with a directed sequence –Model 2: dummy residue model 41 structures have been deposited since 1998 –7 Model 1 –2 Model 2 (withdrawn) –32 are between Model 1 and 2 (chain of Cα). [Figure panels: Model 1, Model 2]
58
November 6, 2009 wwPDB proposed requirements for a SAXS/SANS PDB entry Model is derived and fully defined by the experimental data Model is a folded chain of residues with directionality COMPND, SOURCE, SEQRES and external sequence reference (DBREF) are included x,y,z coordinates per atom. Cα or P model allowed Has acceptable geometry (bond-length, bond-angle, torsion-angle, non-bonded contacts, etc.) Experimental and refinement details recorded in appropriate REMARK records Parameters directly derived from the scattering profile should be supplied and appropriately recorded (radius of gyration, Dmax in distance distribution function, mass, etc.) Reduced 1D experimental profile Family of models should be superimposed
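The slide asks for radius of gyration and Dmax as derived from the scattering profile. As a possible cross-check (an assumption of this sketch, not something the slides propose), the same two quantities can be computed from the deposited Cα or P coordinates and compared with the profile-derived values. The function names are hypothetical, and all atoms are weighted equally.

```python
import math
from itertools import combinations


def radius_of_gyration(coords):
    """Root-mean-square distance of points from their centroid (equal weights)."""
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    mean_sq = sum(math.dist(c, centroid) ** 2 for c in coords) / n
    return math.sqrt(mean_sq)


def max_dimension(coords):
    """Largest pairwise distance between points (needs at least two points)."""
    return max(math.dist(a, b) for a, b in combinations(coords, 2))


# Two points 2 units apart: the centroid is midway, so each lies 1 unit from it.
toy_model = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
```

`max_dimension` compares every pair of points, which is adequate for chain-of-Cα models of modest size but scales quadratically with the number of atoms.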
59
November 6, 2009 SAXS/SANS: Next Steps Create a SAXS/SANS Task Force to advise the wwPDB –Which if any SAXS/SANS models should be in PDB? –Template for PDB file –Validation standards
60
November 6, 2009 Electron Microscopy
61
November 6, 2009 Electron Microscopy Collaborative project between RCSB PDB, PDBe, and Baylor-NCMI is funded by NIH, BBSRC, and EMBL Unified tool for collecting model coordinates and map files in a one-stop shop Merge deposition and annotation with PDB as part of Common Tool by 2011
62
November 6, 2009 EM Coordinate and Map Depositions
63
November 6, 2009 EMDatabank.org: Joint map + coordinate deposition service; EMDatabank.org: news, EM software list, information about dictionaries, conventions, FAQ page, community links; EMSEARCH: search by ID, author, sample type, keyword, deposition date; EMViewer: simple map viewer
64
November 6, 2009 EMDB Annotation 1 EBI annotator in 2002 1 RCSB PDB annotator added in 2008 Joint deposition of map and model enables joint validation Annotation document has been written Remediation underway to improve uniformity of maps and header XML files Letters sent to journals about deposition requirements
65
November 6, 2009 EM Navigator Service provided by PDBj: Search for EM data through EMDB with the corresponding PDB data; Views of EM 3D structures, with or without the related atomic structures in PDB, as movies and in molecular viewers; Will test the newly remediated EM map files to ensure their quality
66
November 6, 2009 Next steps: Community input on modeling criteria for EM maps (first meeting January 2010); Set up EM Task Force to set deposition and validation standards; Set requirements for EM for the Common Tool project; Target date for incorporation of EM maps into the PDB: 2011
67
November 6, 2009 Complex Chemistry of Peptide-Like Molecules in the PDB: Antibiotics and Inhibitors
68
November 6, 2009 Challenges Inclusion of non-standard amino acids, nucleotides, or other chemical groups in sequence Non-linear (cyclic or branched) sequences Microheterogeneity (some cases) Non-uniform annotation of the same molecule in different PDB entries Lack of annotation regarding the source and function of these molecules
69
November 6, 2009 Solutions Analyze and classify –Group antibiotics and inhibitors into polymeric molecules or single components Chemical Component Dictionary updates Remediate files Establish rules for future processing Create Peptide Reference Dictionary
70
November 6, 2009 Peptide-Like Molecules: Inhibitors. Combined (single component) (406); Retained as polymer (350); Split to polymer (23). Examples: FFRCK; PPACK II, subcomponents DPN PHE ARG 0QE; Pepstatin, residues 1–6: IVA VAL VAL STA ALA STA; Leupeptin, residues 1–4: Ace-Leu-Leu-Argal
71
November 6, 2009 Peptide-Like Molecules: Antibiotics. Split for clear and correct representation (180). Actinomycin D (non-ribosomal product), residues 1–11: THR DVA PRO SAR MVA PXZ THR DVA PRO SAR MVA; Thiostrepton (gene product), residues 1–17: I A S A S C T T C I C T C S C S S
72
November 6, 2009 Peptide Reference Dictionary (PRD) Initial research completed (summer interns) Checking and corrections (in progress) Information to be incorporated in remediated files Dictionary to be made available to all
73
November 6, 2009 Status of Peptide-Like Molecules Inhibitor remediations (RCSB PDB) Antibiotic remediations (PDBe) Expected release: Spring 2010
74
November 6, 2009 Policy Issues and New Ventures Helen Berman Worldwide Protein Data Bank www.wwpdb.org
75
November 6, 2009 Molecules Accepted by wwPDB Current Requirement: Polypeptide structures with 24 or more residues Polynucleotide structures with 4 or more residues Polysaccharide structures with 4 or more residues
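The current size thresholds can be written as a small lookup, sketched below. The function name and the polymer-type keys are hypothetical; only the minimum residue counts (24, 4, 4) come from the slide.

```python
# Minimum residue counts from the current requirement on the slide.
MIN_RESIDUES = {
    "polypeptide": 24,
    "polynucleotide": 4,
    "polysaccharide": 4,
}


def meets_current_requirement(polymer_type, n_residues):
    """Return True if a structure of this type and length is within the current purview."""
    try:
        minimum = MIN_RESIDUES[polymer_type]
    except KeyError:
        raise ValueError(f"unknown polymer type: {polymer_type!r}")
    return n_residues >= minimum
```

A pure length test like this cannot express the proposed rules, which also depend on whether a peptide is a gene product or of clear biological relevance.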
76
November 6, 2009 Molecules Accepted by wwPDB Proposed Requirement Polypeptide structures –Gene products –Naturally-occurring non-ribosomally synthesized peptides –Peptidic repeat units of large fibrous polymers –Synthetic peptides of at least 24 residues unless there is a clear case that it is of biological relevance Polynucleotide structures –With 4 or more residues Polysaccharide structures –With 4 or more residues
77
November 6, 2009 PDB Format Problems Large structures not accommodated in current format –more than 62 chains –more than 99,999 atoms Level of experimental detail Multiple models Non-homogeneous models Non-linear sequences (e.g., carbohydrates)
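The two size limits follow from fixed-width fields in the PDB format: a one-character chain identifier (26 upper-case letters, 26 lower-case letters, and 10 digits give 62) and a five-digit atom serial number. A minimal check, with hypothetical names, might look like this:

```python
MAX_CHAINS = 62      # one-character chain IDs: A-Z, a-z, 0-9
MAX_ATOMS = 99_999   # five-digit atom serial number field


def pdb_format_problems(n_chains, n_atoms):
    """List the reasons a structure of this size does not fit the PDB format."""
    problems = []
    if n_chains > MAX_CHAINS:
        problems.append(f"{n_chains} chains exceeds the limit of {MAX_CHAINS}")
    if n_atoms > MAX_ATOMS:
        problems.append(f"{n_atoms} atoms exceeds the limit of {MAX_ATOMS}")
    return problems
```

An empty list means the structure fits on both counts; the other problems on the slide (experimental detail, multiple or non-homogeneous models, non-linear sequences) are not size limits and are not covered by this check.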
78
November 6, 2009 Proposal Create a modified PDB format for ATOM and other records such that it is possible to –calculate electron-density maps –validate the model and its fit to the experimental data –continue or re-do the refinement Meta data would be represented in PDBx Small group of key community members would be consulted New format in place by Q4 2010
79
November 6, 2009 Status Code Clarification. Cases: (a) author approves the annotation report; (b) no author response and no issues; (c) no author response, entry has issues, paper is published; (d) no author response, entry has issues, no paper is published.
REL: (a) release immediately; (b) release after 3 weeks; (c) release with CAVEAT record upon electronic publication of corresponding paper; (d) withdrawn 12 months after deposition.
HPUB: (a) and (b) release upon electronic publication; (c) release with CAVEAT record upon electronic publication of corresponding paper; (d) withdrawn 12 months after deposition.
HOLD: (a) at most a year after deposition; entry is released upon electronic publication based upon journal policy; (b) at most a year after deposition; entry may be released upon electronic publication based upon journal policy; (c) release with CAVEAT record upon electronic publication of corresponding paper; (d) withdrawn 12 months after deposition.
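The status-code rules above can be read as a decision function. The sketch below is one interpretation of the slide's table, not wwPDB software: the function name and argument names are hypothetical, and it assumes that for HPUB the same action applies whether the author approves or simply does not respond on an entry without issues.

```python
CAVEAT = "release with CAVEAT record upon electronic publication of corresponding paper"
WITHDRAW = "withdraw 12 months after deposition"


def release_action(status, author_approved, has_issues, paper_published):
    """Map a status code and the author/entry situation to a release action."""
    if status not in {"REL", "HPUB", "HOLD"}:
        raise ValueError(f"unknown status code: {status!r}")
    # No author response on an entry with issues: same outcome for every status code.
    if not author_approved and has_issues:
        return CAVEAT if paper_published else WITHDRAW
    if status == "REL":
        return "release immediately" if author_approved else "release after 3 weeks"
    if status == "HPUB":
        # Assumption: one action covers both approval and silent, issue-free entries.
        return "release upon electronic publication"
    # HOLD: wording differs only in "is" versus "may be".
    verb = "is" if author_approved else "may be"
    return (
        "at most a year after deposition; entry "
        f"{verb} released upon electronic publication based upon journal policy"
    )
```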
80
November 6, 2009 Obsolete Entries. The OBSLTE record appears in entries that have been removed from the main distribution into ftp://ftp.wwpdb.org/pub/pdb/data/structures/obsolete/. It is PDB policy that only the principal investigator and/or the primary author who submitted an entry has the authority to obsolete it. OBSLTE indicates which, if any, new entries have replaced the obsoleted entry (the replacing entry carries a SPRSDE record). An explanation should be included in the files of obsolete entries that have no replacement entry
81
November 6, 2009 Fabricated Structures Problem: A PDB entry is found to be fabricated Proposed Action: If the author does not agree to OBSLTE the entry, the employer of the author may request that PDB entry be made OBSLTE. This request must be appropriately documented. The citation for the obsolete entry must be a published explanation of the circumstances that led to retraction of the paper(s) and associated PDB entry or entries.
82
November 6, 2009 Worldwide Protein Data Bank Foundation, Inc. Organized to benefit the public and to support and ensure the functioning of the wwPDB AC Conducts the annual review meeting to ensure the continued effectiveness of the collaboration Conducts seminars and workshops with the purpose of educating the public and promoting the goals and activities of the wwPDB in curating, archiving, and disseminating the common data archive of biological macromolecules Fundraising initiatives include a journal and donations
83
November 6, 2009 Worldwide Protein Data Bank Foundation, Inc. Bylaws filed with the state of New Jersey Board of Trustees –4 members, one elected by each wwPDB site Officers –Elected by the Board of Trustees –President, Secretary, Treasurer
84
November 6, 2009 Industrial Interactions. Increased interactions/collaborations with software companies (OpenEye, CCG, Schrödinger), leading to improved data files; US: CSAR (Community Structure Activity Resource) is collecting model and binding data from pharma, and refined structures will be sent to PDB; Europe: Innovative Medicines Initiative's Open Pharma Space may provide a source of industrial models