Applying Royal Society of Chemistry Cheminformatics Skills to Support the PharmaSea Project. Antony Williams, Alexey Pshenichnov, Valery Tkachenko, Ken Karapetyan, David Sharpe. ACS San Francisco, August 2014.
Cancer Deaths Worldwide
Top Treatments for Cancer
Importance of Natural Products: over half of all drugs introduced between 1940 and 2006 were of natural origin or inspired by natural compounds.
Natural Products for all of us!
We Are Doomed I Tell You!!!
The Dangers of Algal Blooms!
Nature’s Little Pharmacy
We Are Doomed I Tell You!!!
Antibiotic resistance
Discovery Curve Decay
RSC and Natural Products
Focus on Marine Natural Products. RSC cheminformatics support to include: deliver the “PharmaSea website”; provide access to a natural products subset; develop “dereplication techniques”; search NMR features against a database; develop advanced searches for MS data; host Open Data from the PharmaSea project and make it available to the community.
http://www.pharma-sea.eu/
The PharmaSea Website: RSC is open-sourcing a chemical registry system as a result of Open PHACTS. The chemical registry system underpins the PharmaSea website (behind a login). It will be enhanced with data deposition capabilities and “dereplication”.
New Repository Architecture doi: 10.1007/s10822-014-9784-5
Compounds
Reactions
Analytical data
Crystallography data
Deposition of Data
Extending the PharmaSea Site: the PharmaSea website will be extended with spectral data handling to support dereplication.
Identifying novel compounds: compounds are collected from the ocean; extraction via chromatography; analytical sciences, including UV-Vis data (lambda-max), mass spectrometry (formula/mass) and NMR spectroscopy (1H NMR/2D), are utilized for dereplication.
Is this already known or not??
Identifying novel compounds: 4 Me singlets; 4 Me doublets; 1 OMe singlet; aromatic protons.
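A feature list like this one can be treated as a query against stored feature counts. The following is a minimal sketch of that idea, not the PharmaSea or MarinLit implementation: the record names, the feature keys (`Me_singlet`, `Me_doublet`, `OMe_singlet`) and the exact-count matching rule are all assumptions made for illustration.

```python
def matches(query, record):
    """True if the record has exactly the queried count for every queried feature."""
    return all(record.get(feature, 0) == count for feature, count in query.items())


def dereplicate(query, db):
    """Return the names of all records consistent with the observed feature counts."""
    return [name for name, features in db.items() if matches(query, features)]


# Hypothetical feature records; none of these come from a real database.
DB = {
    "known_compound_A": {"Me_singlet": 4, "Me_doublet": 4, "OMe_singlet": 1},
    "known_compound_B": {"Me_singlet": 2, "Me_doublet": 4, "OMe_singlet": 1},
    "known_compound_C": {"Me_singlet": 4, "Me_doublet": 4, "OMe_singlet": 0},
}

# Observed counts from the 1H NMR spectrum. Features left out of the query
# (here, the aromatic protons, whose count is not given) are not constrained.
query = {"Me_singlet": 4, "Me_doublet": 4, "OMe_singlet": 1}

print(dereplicate(query, DB))
```

An empty result would suggest the compound is not among the known records; one or more hits would be candidates to check against the remaining spectral data.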
Identifying novel compounds: 2D NMR data give details of the substitution pattern, and this information can be used in the dereplication process.
What we need is… If we could have: a DB containing known marine natural products, which would give formula and mass for searching; all spectral data available in the DB for each compound; and, if experimental data are not available, the compound itself used to COMPUTE spectral features.
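The formula/mass search described here can be sketched in a few lines. This is an illustration under stated assumptions, not the project's code: the three records are ordinary small molecules chosen only because their formulas are well known, the observed value is assumed to be a neutral monoisotopic mass (adduct already subtracted), and the 10 ppm default tolerance is arbitrary.

```python
import re

# Monoisotopic masses (u) of the most abundant isotope of each element.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
    "S": 31.97207100,
}


def monoisotopic_mass(formula):
    """Sum monoisotopic atomic masses for a simple formula such as 'C9H8O4'."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass


def search_by_mass(db, observed_mass, tol_ppm=10.0):
    """Return (name, formula) pairs whose mass is within tol_ppm of observed_mass."""
    hits = []
    for name, formula in db.items():
        mass = monoisotopic_mass(formula)
        if abs(mass - observed_mass) / mass * 1e6 <= tol_ppm:
            hits.append((name, formula))
    return hits


# Illustrative records only; not taken from MarinLit or ChemSpider.
KNOWN = {
    "aspirin": "C9H8O4",
    "theobromine": "C7H8N4O2",
    "caffeine": "C8H10N4O2",
}

hits = search_by_mass(KNOWN, 180.0423)
print(hits if hits else "no match: possibly novel")
```

Aspirin and theobromine share a nominal mass of 180, so the example also shows why accurate mass matters: only the first falls inside the ppm window.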
RSC Acquires MarinLit: all MarinLit chemical compounds in ChemSpider; the MarinLit developers are dereplication experts.
Structure searchable database: indexes literature related to marine natural products (26K articles and growing); data include taxonomy, location and literature; “spectral features” are generated algorithmically and utilized for dereplication. MarinLit is ‘article-centric’, not compound-centric: compounds are only indexed when they are newly discovered, revised, or new to marine sources. All compound records link to the paper in which they were first mentioned; they are not linked to subsequent articles that describe them.
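How spectral features might be generated algorithmically from a structure can be illustrated with methyl groups. The slides do not describe MarinLit's actual algorithm, so the sketch below is an assumption throughout: it uses a hand-built hydrogen-suppressed graph rather than a cheminformatics toolkit, applies only the first-order n+1 rule, and invents the feature labels.

```python
from collections import Counter

MULTIPLICITY = {0: "singlet", 1: "doublet", 2: "triplet"}


def methyl_features(atoms, bonds):
    """Count methyl-group features from a hydrogen-suppressed molecular graph.

    atoms: list of (element, attached_hydrogen_count) tuples
    bonds: list of (i, j) index pairs between heavy atoms
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    features = Counter()
    for i, (element, n_h) in enumerate(atoms):
        if element != "C" or n_h != 3 or len(neighbours[i]) != 1:
            continue  # not a methyl group
        nb_element, nb_h = atoms[neighbours[i][0]]
        if nb_element == "C":
            # First-order n+1 rule: splitting by hydrogens on the adjacent carbon.
            label = "Me_" + MULTIPLICITY.get(nb_h, "multiplet")
        else:
            # Methyl on a heteroatom (e.g. OMe, NMe): treated as a singlet.
            label = nb_element + "Me_singlet"
        features[label] += 1
    return dict(features)


# Isopropyl methyl ether, CH3-O-CH(CH3)2, as a hand-built example graph.
atoms = [("C", 3), ("O", 0), ("C", 1), ("C", 3), ("C", 3)]
bonds = [(0, 1), (1, 2), (2, 3), (2, 4)]
print(methyl_features(atoms, bonds))
```

For this example the function reports one OMe singlet and two Me doublets. Real spectra depart from first-order behaviour, so such computed features are best seen as search keys rather than predictions.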
PharmaSea Dereplication (work in progress): produce a “dereplication widget” to embed in the PharmaSea website; generate a “structure features” file for every new compound deposited to PharmaSea. The ideal would be to utilize spectral data directly to elucidate structures: “Computer Assisted Structure Elucidation”. ACD/Labs…
CASE-based Elucidation: computers can elucidate structures today with greater efficiency and success than many scientists (see Patrick Wheeler’s talk). Natural products specifically can be very challenging, and CASE is well-proven. ACD/Labs have delivered their CASE system (ACD/Structure Elucidator) to the project.
1D & 2D NMR Synchronized Processing: the software displays correlations for assigned spectra and structures, and highlights correlations that are likely to be erroneous.
ChemSpider supporting CASE: RSC delivered the entire ChemSpider structure dataset for inclusion in the Structure Elucidator software.
CASE vs Microscopy? DOI: 10.1002/anie.201203960
Single Molecule AFM
Next: Tagging Natural Products
Future Plans: roll out tagging on ChemSpider to crowdsource a marine natural products subset; implement tagging for further details on the PharmaSea website; collaborate with other natural product sources; mass spectrometry fragmentation prediction.
Future Plans – MS Fragmenter
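The slides give no detail on how the planned MS fragmenter would work. As a rough illustration of the simplest kind of fragmentation prediction, the sketch below enumerates single-bond cleavages of a hydrogen-suppressed graph and reports the masses of the two neutral pieces. Everything here is an assumption: it ignores charge, hydrogen rearrangements and bond orders, skips ring bonds, and expects a connected molecule.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}


def component(start, n_atoms, bonds):
    """Return the set of atom indices reachable from start via the given bonds."""
    adjacent = {i: set() for i in range(n_atoms)}
    for i, j in bonds:
        adjacent[i].add(j)
        adjacent[j].add(i)
    seen, stack = {start}, [start]
    while stack:
        for nb in adjacent[stack.pop()]:
            if nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return seen


def fragment_mass(atoms, indices):
    """Mass of the heavy atoms in indices plus their attached hydrogens."""
    return sum(MASS[atoms[i][0]] + atoms[i][1] * MASS["H"] for i in indices)


def single_bond_fragments(atoms, bonds):
    """For each acyclic bond, return (bond, mass of one side, mass of the other)."""
    results = []
    for k, (i, j) in enumerate(bonds):
        remaining = bonds[:k] + bonds[k + 1:]
        left = component(i, len(atoms), remaining)
        if j in left:
            continue  # ring bond: breaking it does not split the molecule
        right = set(range(len(atoms))) - left
        results.append(((i, j), fragment_mass(atoms, left), fragment_mass(atoms, right)))
    return results


# Ethanol, CH3-CH2-OH, as (element, attached hydrogens) plus heavy-atom bonds.
ethanol_atoms = [("C", 3), ("C", 2), ("O", 1)]
ethanol_bonds = [(0, 1), (1, 2)]
for bond, m1, m2 in single_bond_fragments(ethanol_atoms, ethanol_bonds):
    print(bond, round(m1, 4), round(m2, 4))
```

Candidate masses produced this way could be compared with observed fragment peaks, but a practical fragmenter would need ion chemistry that this sketch leaves out.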
Future Plans
To be published: 2015 (RSC). Modern NMR Approaches To The Structure Elucidation of Natural Products. Volume 1: Instrumentation and Software. Volume 2: Data Acquisition and Applications to Compound Classes. Edited by Antony Williams (RSC), Gary Martin (Merck) and David Rovnyak (Bucknell University).
To be published: 2015 (Springer). Computer-based Structure Elucidation from Spectral Data. Will include a functional demo version of the ACD/Structure Elucidator software to teach the basic approaches to computer-assisted structure elucidation. Authored by Mikhail Elyashberg, Kirill Blinov and Antony Williams.
Acknowledgments: Alexey Pshenichnov, Ken Karapetyan and Valery Tkachenko (RSC – US Cheminformatics); Marcel Jaspars (University of Aberdeen); John Blunt and Murray Munro (MarinLit); Serin Dabb (RSC, MarinLit); Patrick Wheeler and David Hardy (ACD/Labs).
Thank you. Email: williamsa@rsc.org; ORCID: 0000-0002-2668-4821; Twitter: @ChemConnector; Personal Blog: www.chemconnector.com; Slides: www.slideshare.net/AntonyWilliams