UK National Chemical Database Service: An integration of commercial and public chemistry services to support chemists in the United Kingdom Antony Williams,

Presentation transcript:

UK National Chemical Database Service: An integration of commercial and public chemistry services to support chemists in the United Kingdom
Antony Williams, Valery Tkachenko and Richard Kidd
ACS Dallas, March 2014

UK Chemical Database Service
- The National Chemical Database Service is for UK academics (see later for Rest of World)

Vision for the Service PART 1
- Provide access to databases and services of interest to the academic community to serve their needs.
- Access to services to include:
  - Crystallography data: organic and inorganic materials
  - Thermophysical data
  - Reactions data, including retrosynthetic analysis
  - Prediction technologies: name generation, physicochemical parameters, NMR prediction

Service Rollout
- Many services are hosted in the cloud
- Access through login/password, IP authentication or Shibboleth authentication
- Lots of hard work in a very short time, so many thanks to all of the service providers
- More providers stepped up to help: ChemAxon
- Crystallography concern (understatement!)
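To illustrate the IP authentication option on this slide, here is a minimal sketch of an allow-list check. It is not the service's actual implementation: the network ranges are reserved documentation addresses, and the name `is_ip_authorized` is hypothetical.

```python
import ipaddress

# Hypothetical allow-list. These are reserved documentation ranges,
# not real institutional networks.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_ip_authorized(client_ip: str) -> bool:
    """Return True if client_ip falls inside any allow-listed network."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Malformed addresses are never authorized.
        return False
    return any(address in network for network in ALLOWED_NETWORKS)
```

In a setup like this, a request from outside the listed ranges would presumably fall back to login/password or Shibboleth.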

Feedback from Community
- Converted initial public negativity spike on Twitter pre-release to very positive feedback post-release
- Training required: onsite training sessions organized
- Available Chemicals Directory is a big plus!
- Concerns with Retrosynthetic Analysis tool

Usage
- Majority of usage is for crystallography data (previous provider had the same bias)
- Usage is increasing month by month
- Still heavily underused, and in many cases awareness is low

Vision for the Service PART 2
- Response to the call for proposals included our vision for a 21st-century data repository
- At a time of Open Access, Open Data and funding agency requirements to make data public: build a data repository
- Funding is split between licensing content and services (VAST MAJORITY) and some funding for research and development

An Initial “Vague” Vision Set
- Manage “all” of the chemistry data associated with chemical substances
- Data to be downloadable, reusable, interactive
- Build a platform that enables the scientist
- Data storage, validation, standardization and curation
- Collaborative data sharing
- Provide a data platform that can enable and enhance publishing of scientific papers

Data Repository
- Registration of chemical compounds
- Deposition of chemical syntheses
- Addition of analytical data
- Integration with electronic notebooks
- Rewards and recognition for data sharing
- Document processing
- Hosting of data as private, embargoed or public
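The three hosting states in the last item can be sketched as a simple access rule. This is an assumption about how such states might behave, not the repository's data model. The `Deposition` class, its fields and `can_view` are hypothetical names, and treating an embargo as "visible to everyone once the date has passed" is one possible reading.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Deposition:
    owner: str
    visibility: str  # "private", "embargoed" or "public"
    embargo_until: Optional[date] = None  # only meaningful when embargoed


def can_view(dep: Deposition, user: str, today: date) -> bool:
    """Decide whether user may see dep on the given day."""
    if dep.visibility == "public":
        return True
    if user == dep.owner:
        # Owners always see their own data.
        return True
    if dep.visibility == "embargoed" and dep.embargo_until is not None:
        # Embargoed data opens up once the embargo date is reached.
        return today >= dep.embargo_until
    return False
```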

What we will deliver for all data
- Simple interfaces for uploading of data
- Embeddable widgets and programming interfaces to utilize in in-house systems, ELNs
- Automated harvesting approaches: sweeping directories for data
- Data validation where possible
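"Sweeping directories for data" could look something like the sketch below, which walks a folder tree and reports spectral files that have not been seen before. The function name `sweep_for_data` and the suffix list are assumptions chosen for illustration; the slides do not describe the actual harvester.

```python
from pathlib import Path

# Assumed file suffixes for the formats named later in the talk
# (JCAMP, mzML, SPC); a real harvester may use a different list.
SPECTRAL_SUFFIXES = {".jdx", ".dx", ".mzml", ".spc"}


def sweep_for_data(root, already_seen):
    """Return new spectral files under root, skipping paths in already_seen."""
    found = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if path.suffix.lower() not in SPECTRAL_SUFFIXES:
            continue
        if path in already_seen:
            continue
        found.append(path)
    return found
```

A scheduled job could call this repeatedly, adding each returned path to `already_seen` once the file has been uploaded.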

Input data pipeline

Compounds upload
- Draw chemicals in the interface (JavaScript editors: PC, Mac, tablets, phones)
- Drag and drop of compounds
- Automated generation of properties: formulae, Mw, Mi, physchem properties
- Metadata input forms
- Bulk upload
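As a rough illustration of automated property generation, the sketch below derives two values from a molecular formula. It reads Mw and Mi as average molecular weight and monoisotopic mass, which is an assumption. The mass table covers only four elements with rounded values, the parser handles only simple formulae without brackets or charges, and the function names are hypothetical.

```python
import re

# (average atomic mass, monoisotopic mass) for a few common elements,
# rounded; a production system would use a complete reference table.
MASSES = {
    "H": (1.008, 1.007825),
    "C": (12.011, 12.000000),
    "N": (14.007, 14.003074),
    "O": (15.999, 15.994915),
}


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C2H6O' (no brackets)."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def formula_masses(formula):
    """Return (average molecular weight, monoisotopic mass) for formula."""
    counts = parse_formula(formula)
    # An element missing from MASSES raises KeyError rather than guessing.
    mw = sum(MASSES[symbol][0] * n for symbol, n in counts.items())
    mi = sum(MASSES[symbol][1] * n for symbol, n in counts.items())
    return mw, mi
```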

Depositions Gateway User Interface

Chemical Validation and Standardization

Reactions
- Hosting of reaction data in standard “document formats”: full flexibility but limiting, requires extraction of data from embedded objects
- Encourage template formats: using ELNs for example, community-agreed templates

Electronic Notebook Data
- Development work integrating chemistry into the Southampton Labtrove notebook
- Stoichiometry table development
- Analytical data integration
- “ChemTrove” rolled out to a small test group in January

Micropublishing Syntheses

ChemSpider SyntheticPages

Requirements
- Community agreement on acceptable templates for CSSP/Reactions deposition
- Data Model deposition based on mappings between template and CSSP model
- Adoption of Labtrove interface for deposition

What we will deliver
- Micropublishing platform for submission of:
  - Protocols and Procedures
  - Reactions
  - Safety and Hazard data (LATER)
- Template-based submissions of procedures
- Matched to ELN submissions
- Full details for user submission versus mapped submission into database

Reaction Deposition/Validation

Spectral Data
- Support for “structure identification” is a must: “greatest value” for reference and lookup
- Support for data standards primarily: JCAMP, mzML, SPC
- Want to support ASSIGNED data formats
- Hold binary files but prefer standards. WHY?
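One practical reason to prefer a standard such as JCAMP-DX is that its files are plain text made of labelled records of the form `##LABEL= value`, so metadata can be read without any vendor software. The sketch below reads those header records from a small made-up sample. The sample content and the name `read_jcamp_header` are illustrative assumptions, and the parser ignores the data table and most details of the specification.

```python
# A tiny, made-up JCAMP-DX style sample used only for illustration.
SAMPLE = """##TITLE= ethanol
##JCAMP-DX= 4.24
##DATA TYPE= INFRARED SPECTRUM
##XUNITS= 1/CM
##YUNITS= TRANSMITTANCE
##NPOINTS= 3
##XYDATA= (X++(Y..Y))
4000 0.91 0.90 0.88
##END=
"""


def read_jcamp_header(text):
    """Collect '##LABEL= value' records into a dict keyed by upper-case label."""
    header = {}
    for line in text.splitlines():
        # '$$' starts a comment in JCAMP-DX; drop anything after it.
        line = line.split("$$", 1)[0].strip()
        if not line.startswith("##") or "=" not in line:
            continue  # data rows and blank lines are skipped
        label, value = line[2:].split("=", 1)
        header[label.strip().upper()] = value.strip()
    return header
```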

Raw Spectral Data

10 years from now…
- Binary file formats generally need the original data processing software to deal with them: Bruker, Agilent, Jeol, Thermo, Waters, and so on
- While we can store the original raw data files for posterity, should we?
- This has been one focus for data repositories

This is way more useful

Processed data…
- Spectral searching is made possible
- Spectral matching is possible
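To make the point about matching concrete, here is a minimal sketch of one common way to compare two processed spectra: bin the peaks and take the cosine similarity of the binned intensities. The slides do not say which matching method the service uses, so this is a stand-in technique, and the function names and default bin width are assumptions.

```python
import math


def bin_peaks(peaks, bin_width=1.0):
    """Sum intensities of (position, intensity) peaks into fixed-width bins."""
    binned = {}
    for position, intensity in peaks:
        key = round(position / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_similarity(peaks_a, peaks_b, bin_width=1.0):
    """Return a score from 0 (no shared peaks) to 1 (same relative pattern)."""
    a = bin_peaks(peaks_a, bin_width)
    b = bin_peaks(peaks_b, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)
```

A search over a library would score the query against every stored spectrum and return the highest-scoring entries. None of this is possible if only opaque binary files are held.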

This is what we really want…

Addition of Analytical Data
- Spectral Container is in development using componentized widgets for display
- NIST spectra converted into standardized JCAMP format for deposition - 296,103 spectra deposited
- 10% of remaining NIST spectra need to be curated as there are obvious structure issues

Javascript viewer NMR, MS, IR

Depositions Gateway User Interface

Document processing

Depositions Gateway User Interface

User Interface Approach

Present activities for ACS Fall
- Deposition process development of compounds, reactions and spectral data by end of Spring
- FTP, Dropbox, Web-upload, ELN integration
- Compounds, Reactions, Spectral data search, display, download
- Data sharing: private, public, collaborative
- Metadata, metadata, metadata standards!
- Open Sourcing Chemical Registry System including CVSP

UK Chemical Database Service
- The National Chemical Database Service is for UK academics
- What would be necessary to make this available for “Rest of World”, a single institution, an organization?
- It’s not really technology…that’s scale out and can be handled
- It’s negotiation with database providers, pricing, login/authentication, localization?

Acknowledgments
- Jeremy Frey and Simon Coles, University of Southampton
- Will Dichtel and Leah McEwan, Cornell University
- Stuart Chalk, University of North Florida
- Bob Hanson and Bob Lancashire, Jmol and JSpecView

Thank you