Economics and Impact of the Protein Data Bank (PDB) Archive

Slides:



Advertisements
Similar presentations
Genome Annotation: A Protein-centric Perspective.
Advertisements

Maines Sustainability Solutions Initiative (SSI) Focuses on research of the coupled dynamics of social- ecological systems (SES) and the translation of.
Depository Institutions
Regional Innovation Strategy RIS OF SOUTH MORAVIA & MASARYK UNIVERSITY Prof. Martin Bareš vice-rector of Masaryk University Petr Chládek RIS manager, JIC.
Society for Endocrinology Society for Endocrinology BES March 2007 Steve Byford Society for Endocrinology
Global Alignment and Collaboration Jo
Implementing Open Acces in Denmark UNESCO Regional Consultations on Open Access, Berlin, November 2013 forfatter.
Open Access and Scholarly Communications Tyler Walters Julie G. Speer Library Faculty Advisory Board November 20, 2009.
The LHC Computing Grid – February 2008 The Worldwide LHC Computing Grid Dr Ian Bird LCG Project Leader 15 th April 2009 Visit of Spanish Royal Academy.
Management and Distribution of Chemical Data in the Protein Data Bank John Westbrook, Dimitris Dimitropoulos, Jasmine Young, Peter Rose, Philip E. Bourne.
‘ Social injustice is killing people on a grand scale ’ ‘ A new approach to development ’ economic growth is without question important, particularly for.
New Crossroads Transitions & Transformations Science Librarians in the 21st Century Mary M. Case University of Illinois at Chicago.
Erice 2008 Introduction to PDB Workshop From Molecules to Medicine: Integrating Crystallography in Drug Discovery Erice, 29 May - 8 June Peter Rose
World Food Prize International Symposium October 12 – 14, 2005 NASULGC Food and Society Initiative Mortimer H. Neufville.
Social Science Data and ETDs: Issues and Challenges Joan Cheverie Georgetown University Myron Gutmann ICPSR – University of Michigan Austin McLean ProQuest.
Introduction to databases Tuomas Hätinen. Topics File Formats Databases -Primary structure: UniProt -Tertiary structure: PDB Database integration system.
Session Chair: Peter Doorn Director, Data Archiving and Networked Services (DANS), The Netherlands.
On the organization and conduct of expert examination in science and technology in the USA and the European Union Scientific Research.
Worldwide Protein Data Bank Worldwide Protein Data Bank History of the PDB  1970s  Community discussions about how to establish.
EBI is an Outstation of the European Molecular Biology Laboratory. Annotation Procedures for Structural Data Deposited in the PDBe at EBI.
EMBL-EBI EMBL-EBI EMBL-EBI What is the EBI's particular niche? Provides Core Biomolecular Resources in Europe –Nucleotide; genome, protein sequences,
2004: 39.4 (35.9 – 44.3) million Western & Central Europe [ – ] North Africa & Middle East [ – 1.5 million] Sub-Saharan.
24 September 2009National Academies - BRDI1 Research Data and Information: New Developments at NIH Jerry Sheehan Assistant Director for Policy Development.
Data Integration and Management A PDB Perspective.
A Data Centre for Science and Industry Roadmap. INNOVATION NETWORKING DATA PROCESSING DATA REPOSITORY.
Kiichiro Fukasaku Development Centre
Real World Experiences in Operating a Collaboratory: The Protein Data Bank Helen M. Berman Board of Governors Professor of Chemistry.
China July 2004 The European Union Programmes for EU-China Cooperation in ICT.
Open Access and the ESRC New directions in scholarly communications in the social sciences.
Afternoon session: The archival problem and infrastructure for solutions Prof John R Helliwell Interactive Publications.
SOCIAL INCLUSION IN EASTERN EUROPE AND CENTRAL ASIA TOWARDS MAINSTREAMING AND RESULTS SOCIAL INCLUSION IN EASTERN EUROPE AND CENTRAL ASIA TOWARDS MAINSTREAMING.
ICPSR Data Fair November 8, 2010 Katherine McNeill, MIT Libraries
Hawai’i THE NATIONAL SCIENCE FOUNDATION (NSF) is the only federal agency whose mission includes support for all fields of fundamental science and engineering.
NRF Open Access Statement
Washington THE NATIONAL SCIENCE FOUNDATION (NSF) is the only federal agency whose mission includes support for all fields of fundamental science and engineering.
Perspectives from a GEF Implementing Agency
South Big Data Innovation Hub
Ian Bruno, Suzanna Ward The Cambridge Crystallographic Data Centre
Introduction to Business (MRK 151)
ELIXIR Core Data Resources and Deposition Databases
HELLASCO CONFERENCE 2017 The recent developments in the external aid consulting sector September 29, 2017 Ines Ferguson, Chair of EFCA European.
Briefing on the oversight role of the DPW over IDT
Federal STEM Education Inventory and Strategic Plan
Optimizing Outcomes for Equitable, Efficient, Safe, and Green Mobility
Susan Veldsman Director: Scholarly Publishing Unit October 2010
Long-term Grid Sustainability
The Challenge.
KP to add NSF Logo and Grant #
What is €5 billion worth? Magda Gunn, IMI Scientific Project Manager.
The Protein Data Bank: Evolution of a key resource in biology
Investment in Skills and Labour Force for Human Development 2017 Round Table Implementation Meeting Pre-Consultation National Human Resource Development.
Funding Sustainability and Domain Repositories
Functional Annotation of the Horse Genome
"3 by 5" progress December 2005.
An Industry Perspective Nicole Denjoy COCIR Secretary General
Introduction to Bioinformatics
EU Social Dialogue in the Food & Drink Industry
Institutional Structure of the GEF
Showing throughout the event
International Strategy
Nicola Perrin The Wellcome Trust
National data services for South African research
Some Approaches to Faculty Assignments
Some Approaches to Faculty Assignments
Wrap-Up – NSF Site Visit 8 February 2010
The Biodiversity and Protected Areas Management (BIOPAMA) Programme
Incorporating Scientific Practices into the BBNJ ILBI
SUBMITTED BY: DEEPTI SHARMA BIOLOGICAL DATABASE AND SEQUENCE ANALYSIS.
Yoichiro Ishihara Resident Representative
Some Approaches to Faculty Assignments
Presentation transcript:

Economics and Impact of the Protein Data Bank (PDB) Archive R. Andrew Byrd Chair, wwPDB Advisory Committee Chief, Structural Biophysics Laboratory, National Cancer Institute, NIH wwpdb.org

Protein Data Bank First open access digital resource for biology data (est. 1971 with 7 entries) Single global archive of experimental 3D structures of biological macromolecules (>121,000 entries) Primary data => structural biology, computational biology, drug discovery, … Complements GenBank and UniProt sequence databases Data Management Plan for all biomedical grants in US All data freely available (scientists and educators – world-wide) Global archive of experimental macromolecular structure data central to biomedical research ABL tyrosine-kinase inhibited by Imatinib for treatment of chronic myeloid leukemia (CML). PDB ID 2hyy Cowan-Jacob et al. (2007) Acta Crystallographica D63: 80-93. HIV-1 reverse transcriptase complex with DNA and nevirapine PDB ID 3v81 Das et al. (2012) Nature Structural and Molecular Biol ogy19: 253-259.

Organizational Structure/Funding Partners share “Data In” responsibilities Biocurate new depositions Define deposition and annotation policies Resolve data representation issues Implement community validation standards Partners independently funded by each region Overseen by a wwPDB Advisory Committee Partners compete on “Data Out” resources wwpdb.org US: NSF, NIH, DOE Europe: EMBL-EBI, Wellcome Trust, BBSRC, EU, NIH, CCP4 Asia: NBDC-JST US: NIGMS

Impact Metrics ~11,000 new structure depositions/year Biocuration responsibilities distributed by geographic location ~1.5 million data files downloaded/day Pharma Industry use PDB archive behind company firewalls daily >75% of use by non- depositors Depositor Locations Biocuration Workload Distribution Data Download Locations

Sustainability wwPDB established in 2003 Goals: (1) protect PDB archive and prevent fragmentation (2) enable global cooperation on: Increased “Data In” productivity: common OneDep system for deposition/biocuration/validation Geographical Distribution-Load Balancing of “Data In” Preparations to extend the wwPDB Franchise: Consideration for sites in PRC, South Asia, South America

Evaluation of ICSPR* Funding Models Only 1 of the 8 funding models evaluated was deemed acceptable for wwPDB to ensure: Economic Stability/Long-term Sustainability Global Open Access Equity for Data Depositors Equity for Research/Teaching Institutions Infrastructure Model! *Inter-university Consortium for Political and Social Research Sustaining Domain Repositories for Digital Data: A White Paper

What is the Infrastructure Model? Funding agencies commit to direct payment of the costs of archiving experimental data/metadata generated with the research support they provide Data resource funding comes in the form of strategic, long-term infrastructure investments (divorced from typical 3-5 year grant cycles) Ensures economic stability/sustainability for an open access data resource ecosystem with equity for data depositors and consumers

Infrastructure Model and the wwPDB wwPDB partners endorse the Infrastructure Model (i.e., a model in which research funders reserve a percentage of annual expenditure for digital data archiving and preservation across the sciences) Estimated annual cost ~1-2% of the cost of data generation wwPDB estimates for archiving experimental macromolecular structure data/metadata in the Protein Data Bank Conservative cost of replicating the PDB archive (assuming average unit cost of US$100,000) equals US$12 billion Impacts >80% of biomedical research grants

Thank you