Overview of open resources to support automated structure verification

Slides:



Advertisements
Similar presentations
An Introduction to the IMSLP Petrucci Music Library An online music resource
Advertisements

UK National Chemical Database Service: An integration of commercial and public chemistry services to support chemists in the United Kingdom Antony Williams,
Royal Society of Chemistry developments to support open drug discovery Antony Williams, Ken Karapetyan, Valery Tkachenko, Colin Batchelor Alexey Pshenichnov.
Robust Tools for Archiving and Preserving Digital Data Joseph JaJa, Mike Smorul, and Mike McGann Institute for Advanced Computer Studies Department of.
How community crowdsourcing and social networking is helping to build a quality online resource for chemists.
Crowdsourced Curation of Chemistry Data. How Bad is Online Chemistry Data? Antony Williams Wolfram Summit, September 2010.
Crowdsourcing Chemistry for the Community – 5 Years of Experiences Antony Williams NFAIS, February 28 th 2012.
Moving forward our shared data agenda: a view from the publishing industry ICSTI, March 2012.
The Value of a Unique Researcher Identifier to ChemSpider Projects Antony Williams ORCID Meeting, Boston, May 18 th 2011.
FPDS- NG Reports Overview December 16, Today’s Goals Provide an overview of the FPDS-NG reporting capability Demonstrate each of the reporting tools.
ChemSpider – A Crowdsourcing Environment for Hosting and Validating Chemistry Resources (and lessons from President Bush) Antony Williams 5th Meeting on.
Natural Resource Program Center NPSpecies Update Alison Loar and Michelle Flenner 4/21/2010.
Royal Society of Chemistry activities to develop a data repository for chemistry-specific data Aileen Day, Alexey Pshenichnov, Ken Karapetyan, Colin Batchelor,
CSED Computational Science & Engineering Department CHEMICAL DATABASE SERVICE The Current Service is Well Regarded The CDS has a long and distinguished.
ChemSpider – A Combination Platform of Free Chemistry Database, Free Prediction Engines and Crowdsourcing Environment Antony Williams University of Oregon,
Microsoft Excel 2007 © Wiley Publishing All Rights Reserved. The L Line The Express Line to Learning L Line.
ChemModLab: A Web-based Cheminformatics Modeling Laboratory S. Stanley Young + ECCR and ChemSpider Teams.
Marrying ACD/Labs technologies to eScience Projects at the Royal Society of Chemistry Antony Williams ACD/Labs User Meeting June 2013.
The Benefits of Participation in the Social Web of Science Antony Williams Research Square October 30 th 2014.
Website established ( Concentrating initially on establishing a mass spectral database Data will be collected in JCAMP-DX format to.
Environmental and Health Data Health Datapalooza III Data Lab June 5, 2012 Steve Young, Senior Advisor, EPA (This document does not.
How* to Win the #BestMicrosoftHack Shahed Chowdhuri Sr. Technical WakeUpAndCode.com *Hint: Use the Cloud.
Vendor Session: ChemSpider, from Royal Society of Chemistry.
One publisher’s perspectives on an evolving industry Grace Baynes Nature Publishing Group October 2009.
Data enhancing the Royal Society of Chemistry publication archive Antony Williams, Colin Batchelor, Peter Corbett, Ken Karapetyan and Valery Tkachenko.
Thin Client Collaboration Web Services Minjun Wang Department of Electrical Engineering and Computer Science Syracuse University, U.S.A
Reaxys – The Highlights. Slide 2 What is Reaxys? A brand new workflow solution for research chemists and scientists from related disciplines An extensive.
Structure verification and elucidation using the ChemSpider database Antony J Williams, Valery Tkachenko and Alexey Pshenichnov SERMACS, November 16 th.
MS Libraries for Forensics: DART-MS and GC-MS
General & Background InformationPractical & Useful DataDetailed, Original Research Encyclopedias Dictionaries Reference Texts Books Safety Information.
Digital Certificates Presented by: Matt Weaver. What is a digital certificate? Trusted ID cards in electronic format that bind to a public key; ex. Drivers.
Who is NCCT? National Center for Computational Toxicology – part of EPA’s Office of Research and Development Research driven by EPA’s Chemical Safety for.
The CompTox Chemistry Dashboard: an informational data hub at the
The KNIME workflow for automated processing of PHYSPROP data
US EPA’s CompTox Chemistry Dashboard
The ACT with Writing and ACT WorkKeys - Alabama
Applying Royal Society of Chemistry Cheminformatics Skills to Support the PharmaSea Project Antony Williams, Alexey Pshenichnov, Valery Tkachenko, Ken.
ALTERNATIVE CLIENT CONTACT CHANELS
Dealing with the complex challenge of managing diverse chemistry data online Antony Williams, Valery Tkachenko, Alexey Pshenichnov and Ken Karapetyan.
SHARE: A Public Good to Increase Scholarly Innovation
EBSCO eBooks.
ORCID ID: Driving needs for analytical data exchange standards and the potential impacts on the chemical sciences Antony Williams.
Partner Enables MS Office 365 Users to Create Code-Free Mobile Applications from Excel Data “Office 365 is the premier productivity solution for businesses.
RSC电子平台使用介绍 联系人:孙燕 Tel:
CREATIVE COMMONS FOR CULTURAL HERITAGE
User guide to books at jstor
Five years of helping chemists to create an online presence using freely available resources Antony Williams National.
Who knew I would get here from there: How I became the ChemConnector
Authorship initiatives
MTM Tools key to running
How to use Google Scholar for Academic Research?.
Power BI Overview  SV Trainings Power BI Course allows you to master Power BI Tool. We provide best Power BI Online Training classes to help you learn.
Module 6: Preparing for RDA ...
Securely run and grow your business with Microsoft 365 Business
Using SDMX structures to facilitate data reporting
Teacher Academy Workshops
Mobilizing EPA’s CompTox Chemistry Dashboard Data on Mobile Devices
Workshop on the data collection of occupational data
Automating Profitable Growth™
PREMIS Tools and Services
State Technology Updates
SharePoint 2019 Overview and Use SPFx Extensions
Increase productivity
Ex Libris Leganto : Sharing the Love of Reading Lists
Power BI Streaming Datasets with MS Flow
From Local Catalog to World Wide Web
Digital Resources.
Automating Profitable Growth
The single digital gateway
Presentation transcript:

Overview of open resources to support automated structure verification http://www.orcid.org/0000-0002-2668-4821 Overview of open resources to support automated structure verification and elucidation Antony Williams1 and Emma Schymanski2 National Center for Computational Toxicology, US EPA, Research Triangle Park, Durham, NC, USA. Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Campus Belval, Luxembourg. The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA March 2018 ACS Spring Meeting, New Orleans

Today’s Session Our focus for the session: Access to data to support automated structure verification and elucidation – NMR and MS Data quality, curation and validation – and a call to action OPENness is here – Open Access, Data, Source Data standards – we already have them and there are more coming Vendors and scientists providing and using available data There are tools USING these data for Structure Elucidation Cannot be an exhaustive review…but at least a good start

An Ideal Scenario… All published structures and spectra will be available from all published articles for repurposing and reuse in standard formats (preferably not necessarily Open!) Scientists are building open approaches MS fragmentation NMR shift prediction Structure generators Computer-Assisted Structure Elucidation (CASE) Are we there yet???

Publishers sharing data We have achieved ideal scenario right? No – PDF figures in Supplementary Info is still the default There is a need for public databases of spectral data. There ARE some out there. Analogous to Wikipedia, we are primarily consumers rather than contributors…

Sites sharing data There are many sites that “share” spectral data. Generally in non-open formats There are rich resources Cannot easily be used to serve automated structure verification and elucidation

PubChem – Spectral Links

PubChem - Spectral Links

Spectral Links to Partial Data

SDBS – Free Not Open

SDBS – Free Not Open

SDBS – Free Not Open

ChemSpider

ChemSpider

ChemSpider

NIST WebBook https://webbook.nist.gov/chemistry/

NIST WebBook https://webbook.nist.gov/chemistry/

Focused Databases Focused databases Compiled focused databases of Open Data are preferable Spectral data for structure verification and elucidation – Open Mass Spec Data especially useful (Emma’s talks!) Data can be brought in-house and integrated Algorithms can be derived – e.g. NMR shift prediction

NMRShiftDB https://nmrshiftdb.nmr.uni-koeln.de/

NMRShiftDB https://nmrshiftdb.nmr.uni-koeln.de/

NMRShiftDB https://nmrshiftdb.nmr.uni-koeln.de/

Open Databases offer more value Open Resources Open Databases offer more value Bring the data in-house, integrate, link Ingest and train algorithms

CSEARCH/NMRPREDICT http://nmrpredict.orc.univie.ac.at/

MassBank https://massbank.eu/MassBank/

MassBank https://massbank.eu/MassBank/

m/z CLOUD https://www.mzcloud.org/

Integrating Data and Services Integration: Use simple URL linking for navigation Provide simple services for real time prediction

Example Integration via the CompTox Chemistry Dashboard

Link-Based Access

Link Access

Open Data For Bulk Predictions Open Data for apps Structures CAS Registry Numbers Names Formulae Mass iOS app including predicted C13 NMR

Mass Searching

Mass and CNMR Searching

Important Standards in our efforts Structures – Molfile, SDF file, InChIs (standard and non-standard) NMR – JCAMP and all its variants MS – mz(X)ML, MSP (and all its variants), MGF, MassBank

There are more coming

Conclusion The abundance of online data continues to grow There are “integrated data”, there are databases, there are online tools, there are mobile apps Data Quality is critical and OPENness is enabling Open Data Open Standards Open Source The rest of the day will expand on these efforts…

Contact Antony Williams US EPA Office of Research and Development National Center for Computational Toxicology (NCCT) Williams.Antony@epa.gov ORCID: https://orcid.org/0000-0002-2668-4821