Real World Experiences in Operating a Collaboratory: The Protein Data Bank Helen M. Berman Board of Governors Professor of Chemistry.

Presentation transcript:

Real World Experiences in Operating a Collaboratory: The Protein Data Bank Helen M. Berman Board of Governors Professor of Chemistry & Chemical Biology Director, Research Collaboratory for Structural Bioinformatics and the Protein Data Bank

What is the PDB? Single international repository for all information about the structure of large biological molecules Archival database with hundreds of thousands of users who depend on the data

Figure: number of released entries by year

1970’s Grass roots community efforts to archive data Protein crystallographers discuss how to archive data June 1971 –Cold Spring Harbor meeting brings groups together (Cold Spring Harbor Symposia on Quantitative Biology, vol. XXXVI, 1972.) October 1971 –PDB is announced in Nature New Biology (7 structures; vol 233, 1971, page 223) 1975 –PDB receives first funding from NSF (~32 structures)

Nature New Biology CHAD

1980’s Technology takes off –molecular biology, instrumentation, computer hardware and software Structural biology is able to focus on medical problems Community efforts to promote data sharing IUCr guidelines requiring data deposition in the PDB are published

1990’s Number of structures increases exponentially Complexity of structures increases New databases begin to emerge More structures determined by cryo-electron microscopy Plans for structural genomics emerge User community for the PDB expands dramatically RCSB awarded contract for the PDB

Who does what? Rutgers –Data in: standards, validation, annotation UCSD/SDSC –Data out: search engine, Web site, data distribution

Communication VTC Electronic: forums, wikis Procedures Internal newsletter Retreats

Retreats Team building exercises Management training Technical discussions Time to get to know one another

VTCs Two formal ones per week Ad hoc when there are issues to discuss

2000’s Continued growth in structure studies Structural genomics takes off RCSB PDB contract renewed Release of new database and website BMRB joins RCSB 2bus: Kurt Wüthrich, who with coworkers determined the first three-dimensional protein structure by NMR spectroscopy (proteinase IIa inhibitor from bull seminal plasma), was awarded the Nobel Prize in Chemistry in 2002

The PDB is Global

Worldwide Protein Data Bank

Mission Maintain a single archive of macromolecular structural data that is freely and openly available to the global community

wwPDB Formalization of current working practice Members –RCSB PDB (Research Collaboratory for Structural Bioinformatics) –PDBj (Osaka University) –Macromolecular Structure Database (EBI) MOU signed July 1, 2003 Announced in Nature Structural Biology November 21, 2003

Guidelines and Responsibilities All members issue PDB IDs and serve as distribution sites for data One member is the archive keeper (RCSB) All format documentation publicly available Strict rules for redistribution of PDB files All sites can create their own web sites

Future 60,000 structures by ,000 depositions per year in 2010 Complexity will increase dramatically New methods will yield new structures

Scientific Challenges Number of data files continues to increase Information content of each data file is increasing Many more very large macromolecular complexes New structure determination methods Structural genomics

Technical Challenges How do we represent diverse data? How do we make a searchable database? How do we integrate with other data resources? How do we make a scalable system? How do we meet the needs of a diverse community?

Structural Genomics “The next step beyond the human genome project” From the NIH Request for Proposals for Structural Genomics Centers: “These studies should lead to an understanding of structure/function relationships and the ability to obtain structural models of all proteins identified by genomics. This project will require the determination of a large number of protein structures in a high-throughput mode.”

PSI - Structures (Sep images)

Community Depositors –Different methods: X-ray, NMR, cryo-EM Users –Specialists (structural biologists) –Generalists –Educators –Students –Lay community

Active Outreach Electronic Meetings Publications One on one Many workshops

Issues Standards: What is the role of the centers? What should it be? Long term preservation: How long? What are the options? Stability: Strong dependency of research community demands a more stable model

Bottom line All the interdependencies within wwPDB and between the scientific community and wwPDB call for a new funding model that will ensure the long term preservation and availability of the research data contained within these resources

Acknowledgements Operated by two members of the RCSB: Supported by: NIGMS The RCSB PDB is a member of the