Published by Monica Craig. Modified over 9 years ago.
1
Board on Research Data and Information, National Research Council
“Changing Roles of Libraries in Support of Scientific Data Activities”
June 3, 2010
More Data, More Use, Less Lead Time: Scientific Data Activities at the National Library of Medicine
Betsy L. Humphreys, Deputy Director, National Library of Medicine
www.nlm.nih.gov
2
NLM & Scientific Data
Data categories
– Substances
– Sequences
– Clinical Research
– Taxonomies/Nomenclatures/Ontologies
3
NLM & Scientific Data
Challenges (aka Problems)
– Much more data
    Greater NIH/other investment in generating data
    High-throughput methods
    New, unfunded mandate(s)
– Much less lead time
    Need to achieve standardization more rapidly
5
Growth In PubChem Tested Substances
7
Number of Studies Registered at ClinicalTrials.gov since May 1, 2005
[Chart annotations: ICMJE; FDAAA 801; registration rates of ~25-30 / wk, ~250 / wk, ~320 / wk]
2,317 Results Records submitted (Sept 2008 – March 2010)
– About 30 new results records per week; 80 re-submissions per week
– Anticipate increase in rate as rules become clear and outreach continues
9
UMLS Metathesaurus – May 2010 version
10
NLM & Scientific Data
Strengths
– Mission & Track Record
    Curation, Storage, Permanent Access, Standards, R & D
– Robust Infrastructure
    Staff Expertise, Advisory Structure, Computing, Communications
– Connections between different kinds of data, information
– Strong US partnerships and international collaborations
– Heavy use
Weaknesses
– The “defects of our qualities”
– Limited resources
– Less user outreach/training than desirable
12
Hazardous Substances Data, 1978-
13
Toxic Release Inventory Data, 1987-
14
National Center for Biotechnology Information, 1988-
– Design, develop, implement, and manage automated systems for collection, storage, retrieval, analysis, & dissemination of knowledge concerning molecular biology, biochemistry, & genetics
– Perform research into advanced methods of computer-based information processing capable of representing and analyzing the vast number of biologically important molecules and compounds
– Enable persons engaged in biotechnology research and medical care to use these systems & methods
– Coordinate, as much as is practicable, efforts to gather biotechnology information on an international basis
15
Benzene – PubChem Bioassay Results
17
- ~2 million users a day
- 100 million hits a day
- 5 terabytes of data a day
- 3,500 web hits a second (peak)
18
PubChem Users per Day
19
Current Activities/Future Plans
Continued emphasis on:
– Improving the input
    Tagging, standardization, explicit links (e.g., GenBank #s, NCT #s)
– Increasing data curation efficiency
– Use of “influentials” to promote standards, best practices
– US Partnerships & International collaborations
– Computer center efficiency, security
– Better discovery, retrieval, display methods