US EPA’s CompTox Chemistry Dashboard

Mass-Spectrometry Based Structure Identification of “Known-Unknowns” Using the US EPA’s CompTox Chemistry Dashboard

Andrew D. McEachran1*, Jon R. Sobus2, and Antony J. Williams3
1 Oak Ridge Institute for Science and Education (ORISE) Research Participant, Research Triangle Park, NC
2 U.S. Environmental Protection Agency, Office of Research and Development, National Exposure Research Laboratory, Research Triangle Park, NC
3 U.S. Environmental Protection Agency, Office of Research and Development, National Center for Computational Toxicology, Research Triangle Park, NC
ANYL 77, ACS Spring 2017, San Francisco, CA, April 2-6, 2017

Problem Definition and Goals

Problem: Structure identification workflows in non-targeted analyses (NTA) have historically identified fewer than 10% of the chemical features observed in environmental samples. Improving these workflows requires incorporating high-quality data from a variety of resources so that structures can be identified with confidence. Chemistry databases, most of them available online, are one such resource. Previous research documented the identification of ‘known unknowns’, structures unknown to an investigator but known within a chemical resource or database, by rank-ordering candidates by their number of references and/or data sources (Little et al., 2012). In this manner, the most likely candidate chemicals rise to the top of a search list.

Goals:
To develop NTA identification tools and provide functionality within the US EPA’s CompTox Chemistry Dashboard.
To determine the efficacy of searching for ‘known unknowns’ in the CompTox Chemistry Dashboard and of using data source ranking techniques to assist in structure identification.

Batch Searching of Unknowns

Unknown features: DB matching for formula(e) C18H34N2O6S, C10H12N2O, etc.

Mass Spectrometry Based Searches in the Dashboard

1. Searching a single monoisotopic mass in the Advanced Search options.
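The idea of searching by monoisotopic mass and then rank-ordering the hits by data source count can be illustrated with a minimal Python sketch. It works on a small in-memory list and does not call the Dashboard. The `Candidate` record, the `search_by_mass` function, the 5 ppm tolerance, and every name, mass, and count in `library` are assumptions made up for illustration, not values from the poster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    monoisotopic_mass: float  # in Da
    data_sources: int         # number of data sources listing the chemical


def search_by_mass(candidates, query_mass, ppm=5.0):
    """Return candidates within +/- ppm of query_mass, most data sources first."""
    tolerance = query_mass * ppm / 1e6
    hits = [c for c in candidates
            if abs(c.monoisotopic_mass - query_mass) <= tolerance]
    # Rank-order: the candidate with the most data sources rises to the top.
    return sorted(hits, key=lambda c: c.data_sources, reverse=True)


# Hypothetical records: names, masses and counts are invented for illustration.
library = [
    Candidate("chemical A", 176.0950, 12),
    Candidate("chemical B", 176.0951, 47),
    Candidate("chemical C", 176.0948, 3),
    Candidate("chemical D", 180.0634, 60),
]

ranked = search_by_mass(library, 176.0950)
for position, hit in enumerate(ranked, start=1):
    print(position, hit.name, hit.data_sources)
```

In this toy list, chemical D has the most data sources but falls outside the mass window, so it never enters the ranking; the ordering only applies to candidates that already match the query mass.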
Excel export of batch search of molecular formulae. Data included in the download consist of CASRN, formula, mass, bioactivity, etc.

Abstract

The CompTox Chemistry Dashboard is a publicly accessible database provided by the National Center for Computational Toxicology at the US EPA. The Dashboard provides access to a database containing ~750,000 chemicals and integrates a number of our public-facing projects (e.g. ToxCast and ExpoCast). The available data provide a valuable foundation for mass-spectrometry based structure identification, and the Dashboard has already been used to assist in identifying chemicals present in environmental media including house dust and water. This poster will review the data and functionality available in the CompTox Dashboard to support structure identification using mass spectrometry data. Specifically, we have developed new approaches to rank-order hit lists of chemicals based on mass and formula-based searching and have demonstrated the value of functional use filtering as an additional confirmation criterion.

Future Work

Rank-ordering using PubMed count & PubChem sources: Occurrence counts of chemicals within PubMed and PubChem will be added to internal data source counts to improve the assessment of occurrence in NTA samples.

Retention Time Prediction: Incorporating accurate retention time predictions allows unlikely candidate chemicals to be screened out based on their chromatographic behavior. Recent work has demonstrated RT prediction utility on chemicals within DSSTox (McEachran et al., in prep).

2. Search results from a mass search, sorted by data sources.
3. Searching a single molecular formula in the Advanced Search options.
4. Search results from a formula search, sorted by data sources.
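Formula-based and mass-based searches are linked by the monoisotopic mass implied by a molecular formula. As a rough sketch of that relationship, the Python snippet below computes monoisotopic masses for a small batch of formulae. The function `monoisotopic_mass`, the `MONOISOTOPIC` table (limited to C, H, N, O, and S), and the restriction to flat formulae without parentheses or charges are assumptions of this example; it is not how the Dashboard computes masses.

```python
import re

# Masses (Da) of the most abundant isotope of each element.
# Only the elements needed for this example are listed.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.007825032,
    "N": 14.003074004,
    "O": 15.994914620,
    "S": 31.972071174,
}


def monoisotopic_mass(formula):
    """Monoisotopic mass of a flat formula such as 'C10H12N2O'."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"unsupported formula: {formula!r}")
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        # An element without a trailing number occurs once.
        total += MONOISOTOPIC[symbol] * (int(count) if count else 1)
    return total


# A batch of formulae, processed in one pass.
batch = ["C18H34N2O6S", "C10H12N2O"]
masses = {formula: monoisotopic_mass(formula) for formula in batch}
for formula, mass in masses.items():
    print(f"{formula}\t{mass:.4f}")
```

An element missing from the table raises `KeyError`, which is deliberate here: silently skipping an unknown element would give a wrong mass.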
Identifying Known Unknowns

                              Mass-based Searching       Formula-based Searching
                              Dashboard    ChemSpider    Dashboard    ChemSpider
Average Rank Position         1.3          2.2           1.2          1.4
Percentage in Position #1     85%          70%           88%          80%

162 unique chemicals were searched by both monoisotopic mass and formula, and the results were rank-ordered by data sources in the Dashboard and in ChemSpider. The Dashboard outperformed ChemSpider in terms of the average rank position within each search method. Additionally, the percentage of chemicals in position #1 was higher for the Dashboard in both cases (McEachran et al., 2017).

Very little non-targeted representation in poster session, provide background

Comparison of results of searching individual chemicals and rank-ordering by number of data sources between the Dashboard (left) and ChemSpider (right), divided up by compound classes.

Columns in both tables: Compound class; Number in class; Average Rank; Number of compounds in each rank-ordered position (#1, #2, #3, #4, #5+)

Dashboard
Pharmaceutical Drug  72  1.3  59  8  3  2
Manufacturing Chemicals  42  1.2  38  1
Personal Care Products  2.6  6
Steroid Hormones  7  1.0
Perfluorochemicals  5
Pesticides  12  10
Veterinary Drugs
Dyes
Food product/natural compounds  4  1.5
Illicit Drugs
Misc. Molecules

ChemSpider
Pharmaceutical Drug  72  1.4  55  9  6  2
Manufacturing Chemicals  42  5.5  28  3  5
Personal Care Products  8  6.1  1  4
Steroid Hormones  7  1.0
Perfluorochemicals  1.2
Pesticides  12  2.3
Veterinary Drugs  1.3
Dyes
Food product/natural compounds  3.8
Illicit Drugs  2.0
Misc. Molecules

Predicted environmental media occurrence and functional use

Functional uses (e.g. dye, preservative, antioxidant) allow for enhanced identification capabilities and can help inform the likelihood of occurrence in environmental media. Occurrence in 22 different media has been predicted for the chemicals in the database and will be incorporated into future work.

Non-Targeted Analysis Collaboration Research Trial

The US EPA is conducting a collaborative research trial with more than 20 labs and institutions from around the world in an effort to develop, improve, and standardize analytical and data processing methods in NTA and suspect screening analysis.

CompTox Chemistry Dashboard

The Dashboard includes:
Dashboard Homepage: provides a simple text entry box allowing a type-ahead search for systematic, trade and trivial names, CAS Registry Numbers and InChI identifiers. Initial simple searches can be filtered to return single-component chemicals (not mixtures) and to ignore isotopes.
Single Chemical Record Page: chemical records associated with structure representations are displayed along with intrinsic properties (e.g. molecular formula, mass and systematic name), structural identifiers (e.g. SMILES, InChI strings and keys), and, where possible, links to related information on Wikipedia.
Chemical Properties Panel: experimental and predicted values are displayed for a number of properties including LogP, water solubility, melting point, etc. Predictions are based on the Open Structure Activity Relationships Application (OPERA) models developed from curated datasets (Mansouri et al., 2016).

Conclusions

The CompTox Chemistry Dashboard outperformed ChemSpider using single searches of known unknowns and ranking by the number of data sources.
Unlike ChemSpider's data, our Open Data are available for download.
NTA workflows that advance exposure science will be incorporated into the Dashboard.

References

Little JL, Williams AJ, Pshenichnov A, Tkachenko A (2012) Identification of “known unknowns” utilizing accurate mass data and ChemSpider. J Amer Soc Mass Spectrom. 23(1).
Mansouri K, Grulke CM, Richard AM, Judson RS, Williams AJ (2016) An automated curation procedure for addressing chemical errors and inconsistencies in public datasets used in QSAR modelling. SAR QSAR Environ. Res. 27(11).
McEachran AD, Sobus JR, Williams AJ (2017) Identifying known unknowns using the US EPA's CompTox Chemistry Dashboard. Anal. Bioanal. Chem. 409(7).

Acknowledgements

The authors would like to acknowledge specific members of the development team within NCCT (Jennifer Smith, Chris Grulke, Jeff Edwards) and collaborators in NERL (Kristin Isaacs, Katherine Phillips, Kathie Dionisio) for their ongoing contributions to the Dashboard and this research.

This presentation does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
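The two summary statistics reported under Identifying Known Unknowns, average rank position and percentage in position #1, can be computed from the rank at which each correct chemical appeared in its hit list. The Python sketch below shows one way to do this; the function `rank_summary` and the `example_ranks` list are hypothetical and are not the poster's data.

```python
from statistics import mean


def rank_summary(ranks):
    """Return (average rank position, percentage of chemicals at rank #1)."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    top_hits = sum(1 for rank in ranks if rank == 1)
    return mean(ranks), 100.0 * top_hits / len(ranks)


# Hypothetical ranks of the correct structure for ten searched chemicals.
example_ranks = [1, 1, 2, 1, 3, 1, 1, 1, 2, 1]

average_rank, percent_top = rank_summary(example_ranks)
print(f"Average rank position: {average_rank:.1f}")
print(f"Percentage in position #1: {percent_top:.0f}%")
```

A lower average rank and a higher percentage at #1 both indicate that the correct structure tends to sit nearer the top of the candidate list.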

