Structure verification and elucidation using the ChemSpider database
  Antony J Williams, Valery Tkachenko and Alexey Pshenichnov
  SERMACS, November 16th 2012
Mass Spectrometry for Structure ID
  - Many applications of mass spectrometry involve the identification of “knowns”
  - Known structures: previously characterized, previously identified and, increasingly, online
  - Dereplication, identification of “other manufacturers’” materials, metabolites, lipid analysis – can be supported by existing databases
  - What large database could serve mass spec?
ChemSpider 28 million chemicals with associated data…linked out to 400 data sources…
ChemSpider
What will ChemSpider give us?
Spectra Linked: e.g. Cholesterol
Spectra Linked
For Mass Spectrometrists
  Valuable searches for Mass Spec would be:
  - Search the database by mass or formula for structure identification
  - Search subsets of data – e.g. “metabolism”, pesticides etc.
  - Link structure-based data across the internet
  - Provide “programming interfaces” to integrate
  Does ChemSpider provide value to Mass Spectrometrists?
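A search "by mass or formula" depends on converting a molecular formula into a monoisotopic mass. The following is a minimal sketch of that conversion, not ChemSpider's own code. The function names `parse_formula` and `monoisotopic_mass` are hypothetical, the element table is deliberately short, and only simple formulas without brackets, charges or isotope labels are handled.

```python
import re

# Monoisotopic masses (Da) of the most abundant isotope of a few common elements.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307400,
    "O": 15.99491462,
    "S": 31.97207100,
    "P": 30.97376163,
    "Cl": 34.96885268,
}


def parse_formula(formula):
    """Return {element: count} for a simple formula such as 'C27H46O'."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # A missing number means a count of one, e.g. the 'O' in C27H46O.
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def monoisotopic_mass(formula):
    """Sum the monoisotopic masses of all atoms in the formula."""
    total = 0.0
    for element, count in parse_formula(formula).items():
        if element not in MONOISOTOPIC:
            raise ValueError(f"no mass tabulated for element {element!r}")
        total += MONOISOTOPIC[element] * count
    return total


if __name__ == "__main__":
    for name, formula in [("Cholesterol", "C27H46O"), ("Tinuvin 328", "C22H29N3O")]:
        print(f"{name}: {formula} -> {monoisotopic_mass(formula):.4f} Da")
```

The two example compounds are the ones named on the slides. Their formulas are standard literature values, not taken from the deck.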
Pre-calculated data
Mass Spec Analysis Jim Little, Eastman Chemical
ChemSpider Interface
Tinuvin 328
Position sorted by references
Position 1 only
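The idea behind sorting by references is that, among candidates sharing a mass or formula, the most frequently cited compound is often the right answer. It can be sketched as below. This is an illustrative sketch only: the `Hit` class, the helper names and especially the reference counts are made-up placeholders, not values retrieved from ChemSpider.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str
    formula: str
    references: int  # number of associated references (placeholder values below)


def rank_by_references(hits):
    """Order candidate hits so the most-referenced compound comes first."""
    return sorted(hits, key=lambda hit: hit.references, reverse=True)


def position_of(name, ranked_hits):
    """Return the 1-based position of a named compound, or None if absent."""
    for position, hit in enumerate(ranked_hits, start=1):
        if hit.name == name:
            return position
    return None


# Hypothetical hit list for one formula; the counts are invented for illustration.
EXAMPLE_HITS = [
    Hit("isomer A (placeholder)", "C22H29N3O", 3),
    Hit("Tinuvin 328", "C22H29N3O", 120),
    Hit("isomer B (placeholder)", "C22H29N3O", 0),
]

if __name__ == "__main__":
    ranked = rank_by_references(EXAMPLE_HITS)
    for position, hit in enumerate(ranked, start=1):
        print(position, hit.name, hit.references)
```

With these placeholder counts the expected compound lands at position 1, which is the outcome the "Position 1 only" slide title refers to.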
Searching by Monoisotopic Mass
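A monoisotopic mass search amounts to keeping every record whose theoretical mass lies within a tolerance window of the measured value. The sketch below runs that filter over a small in-memory list rather than the ChemSpider database. The record list, the function names and the 5 ppm default are assumptions made for illustration, and the adduct helper assumes a singly charged [M+H]+ ion.

```python
PROTON_MASS = 1.007276  # Da; used to convert an [M+H]+ m/z to a neutral mass

# Tiny local stand-in for a database: (name, neutral monoisotopic mass in Da).
RECORDS = [
    ("Cholesterol", 386.3549),
    ("Tinuvin 328", 351.2311),
    ("Caffeine", 194.0804),
]


def neutral_mass_from_protonated(mz):
    """Convert the m/z of a singly charged [M+H]+ ion to a neutral mass."""
    return mz - PROTON_MASS


def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def search_by_mass(records, measured_mass, tolerance_ppm=5.0):
    """Return (name, mass, ppm error) for records inside the tolerance window."""
    matches = []
    for name, mass in records:
        error = ppm_error(measured_mass, mass)
        if abs(error) <= tolerance_ppm:
            matches.append((name, mass, error))
    # Closest match first.
    return sorted(matches, key=lambda match: abs(match[2]))


if __name__ == "__main__":
    measured = neutral_mass_from_protonated(352.2388)
    for name, mass, error in search_by_mass(RECORDS, measured):
        print(f"{name}: {mass:.4f} Da ({error:+.1f} ppm)")
```

In practice a real database returns many isobaric candidates for one mass, so the window search is usually followed by some ranking step.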
Identification of “Known Unknowns”
  - “Known Unknowns” can be identified by searching in ChemSpider
  - Searching of “segregated” datasets can be performed
  - Datasets can be expanded for specific projects – for example, natural products ID…
Web Services Open Up Collaboration
  - Agilent, Bruker, Waters and Thermo all use our web-based services for compound lookup
  - Many academic sites are integrating directly – metabonomics, name lookup, semantic markup
Web Services
Results of the ChemSpider Search in the MarkerLynx Worksheet
Hit Details in ChemSpider
Calculation of Elemental Composition & ChemSpider Search of Lipid Maps Database Performed via MarkerLynx
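Elemental composition calculation means proposing formulas whose theoretical mass matches a measured accurate mass. The slides do not describe how MarkerLynx does this, so the following is only a generic brute-force sketch. It is restricted to C, H, N and O, and the atom limits, the 5 ppm default and the function names are all assumptions. The only chemical filter applied is the saturation limit H ≤ 2C + N + 2.

```python
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307400, "O": 15.99491462}


def formula_string(c, h, n, o):
    """Format atom counts as a formula, omitting zero counts and a count of one."""
    parts = []
    for symbol, count in (("C", c), ("H", h), ("N", n), ("O", o)):
        if count == 1:
            parts.append(symbol)
        elif count > 1:
            parts.append(f"{symbol}{count}")
    return "".join(parts)


def candidate_formulas(target_mass, tolerance_ppm=5.0, max_c=40, max_n=6, max_o=10):
    """Enumerate CHNO formulas whose monoisotopic mass matches target_mass."""
    tolerance = target_mass * tolerance_ppm / 1e6
    results = []
    for c in range(max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MASS["C"] + n * MASS["N"] + o * MASS["O"]
                remainder = target_mass - heavy
                if remainder < -tolerance:
                    break  # adding more oxygen only overshoots further
                # Solve for the hydrogen count instead of looping over it.
                h = round(remainder / MASS["H"])
                if h < 0 or h > 2 * c + n + 2:
                    continue  # more hydrogens than a saturated molecule allows
                mass = heavy + h * MASS["H"]
                if abs(mass - target_mass) <= tolerance:
                    results.append((formula_string(c, h, n, o), mass))
    # Best mass match first.
    results.sort(key=lambda item: abs(item[1] - target_mass))
    return results


if __name__ == "__main__":
    for formula, mass in candidate_formulas(351.2311):
        print(f"{formula}: {mass:.4f} Da")
```

Each proposed formula could then be used as the query for a formula search against a chosen data source, which is the second step named in the slide title.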
Commercial Database Access
  - Recently deposited to ChemSpider: EPA/NIST IR Database, >5000 spectra
  - Presently under development: NIST MS database, >200,000 MS spectra
Coming Soon – NIST DB in ChemSpider
Where next with Analytical Support?
  - PharmaSea project for the identification of natural products – dereplication approaches
  - Use mass spectrometry searches of natural product slices to identify
  - Natural product data included from RSC databases (NPU) and ChemSpider data sources – MarinLit for example
  - Pre-fragment compounds and develop searches
  - Dereplication using NMR data
    - NMR features
    - Predicted spectra and “Verification approaches”
SpectraSchool
Coming Soon
  - Storage and display of ASSIGNED spectra – already started with NMR spectral assignment
Crowdsourcing ChemSpider
  - ChemSpider is crowdsourced
  - Community deposition, annotation and curation
  - Anyone can “Leave Feedback”
  - Registered users can add data
Ideas for Future Work
  - Extended search capabilities
  - Expand existing databases
  - Integrate with metabolic pathway tools
How long until Mobile StructureID?
Acknowledgments
  - RSC eScience Team
  - James Little, Eastman Chemical Company
  - Alexey Pshenichnov, University of Leicester – SpectraSchool
  - ACD/Labs – Assigned Spectra Display Widget
  - Depositors of data – there are many!
Thank you
  Twitter: ChemConnector
  Personal Blog:
  SLIDES: