US EPA’s CompTox Chemistry Dashboard


Mass-Spectrometry Based Structure Identification of “Known-Unknowns” Using the US EPA’s CompTox Chemistry Dashboard

Andrew D. McEachran1*, Jon R. Sobus2, and Antony J. Williams3
1 Oak Ridge Institute for Science and Education (ORISE) Research Participant, Research Triangle Park, NC
2 U.S. Environmental Protection Agency, Office of Research and Development, National Exposure Research Laboratory, Research Triangle Park, NC
3 U.S. Environmental Protection Agency, Office of Research and Development, National Center for Computational Toxicology, Research Triangle Park, NC

ANYL 77, ACS Spring 2017, San Francisco, CA, April 2-6, 2017
mceachran.andrew@epa.gov | ORCiD: 0000-0003-1423-330X

Abstract

The CompTox Chemistry Dashboard is a publicly accessible database provided by the National Center for Computational Toxicology at the US EPA. The Dashboard provides access to a database containing ~750,000 chemicals and integrates a number of our public-facing projects (e.g. ToxCast and ExpoCast). The available data provide a valuable foundation for mass-spectrometry based structure identification, and the Dashboard has already been used to assist in identifying chemicals present in environmental media including house dust and water. This poster reviews the data and functionality available in the CompTox Chemistry Dashboard to support structure identification using mass spectrometry data. Specifically, we have developed new approaches to rank-order hit lists of chemicals based on mass and formula-based searching and have demonstrated the value of functional use filtering as an additional confirmation criterion.

Problem Definition and Goals

Problem: Structure identification workflows in non-targeted analyses (NTA) have historically identified fewer than 10% of observed chemical features in environmental samples. Improving these workflows requires incorporating high-quality data from a variety of resources to confidently identify structures. One such resource is chemistry databases, which are generally available online. Previous research documented the identification of ‘known unknowns,’ or those structures unknown to an investigator but known within a chemical resource or database, by rank-ordering of references and/or data sources (Little et al., 2012). In this manner, the most likely candidate chemicals rise to the top of a search list.

Goals: To develop NTA identification tools and provide functionality within the US EPA’s CompTox Chemistry Dashboard. To determine the efficacy of searching for ‘known unknowns’ in the CompTox Chemistry Dashboard and using data source ranking techniques to assist in structure identification.

Mass Spectrometry Based Searches in the Dashboard

1. Searching a single monoisotopic mass in the Advanced Search options.
2. Search results from a mass search, sorted by data sources.
3. Searching a single molecular formula in the Advanced Search options.
4. Search results from a formula search, sorted by data sources.

Batch Searching of Unknowns

Unknown features: DB matching for formula(e) C18H34N2O6S, C10H12N2O, etc.
Excel export of a batch search of molecular formulae. Data included in the download consist of CASRN, formula, mass, bioactivity, etc.

Future Work

Rank-ordering using PubMed count and PubChem sources: Occurrence counts of chemicals within PubMed and PubChem will be added to internal data source counts to improve the assessment of occurrence in NTA samples.

Retention time prediction: Incorporating accurate retention time (RT) predictions allows unlikely candidate chemicals to be screened out based on their chromatographic behavior. Recent work has demonstrated the utility of RT prediction on chemicals within DSSTox (McEachran et al., in prep).
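
The mass-based search and data-source rank-ordering described in this poster can be sketched in a few lines of Python. This is only an illustrative sketch, not the Dashboard's implementation: the `Candidate` record, the 5 ppm tolerance, and the placeholder candidates with their masses and data source counts are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical database record; the field names are assumptions."""
    name: str
    monoisotopic_mass: float  # in Da
    data_sources: int         # number of data sources listing the chemical


def ppm_window(mass, ppm):
    """Return the (low, high) mass window for a tolerance given in ppm."""
    delta = mass * ppm / 1e6
    return mass - delta, mass + delta


def rank_candidates(candidates, query_mass, ppm=5.0):
    """Keep candidates inside the mass window, most data sources first."""
    low, high = ppm_window(query_mass, ppm)
    hits = [c for c in candidates if low <= c.monoisotopic_mass <= high]
    return sorted(hits, key=lambda c: c.data_sources, reverse=True)


# Placeholder candidates (made-up values, not Dashboard data).
candidates = [
    Candidate("candidate A", 200.0000, 3),
    Candidate("candidate B", 200.0005, 41),
    Candidate("candidate C", 200.0100, 120),  # outside the 5 ppm window
    Candidate("candidate D", 199.9995, 12),
]

ranked = rank_candidates(candidates, query_mass=200.0000, ppm=5.0)
print([c.name for c in ranked])
# prints ['candidate B', 'candidate D', 'candidate A']
```

Candidate C has the most data sources but falls outside the mass window, so it is excluded before ranking. The same ordering step would apply to formula-based hits, with an exact formula match in place of the mass window.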
Identifying Known Unknowns

162 unique chemicals were searched by both monoisotopic mass and formula, and the results were rank-ordered by data sources in the CompTox Chemistry Dashboard and in ChemSpider. The Dashboard outperformed ChemSpider in terms of the average rank position within each search method. Additionally, the percentage of chemicals in position #1 was higher for the Dashboard in both cases (McEachran et al., 2017).

                             Mass-based Searching      Formula-based Searching
                             Dashboard   ChemSpider    Dashboard   ChemSpider
Average Rank Position        1.3         2.2           1.2         1.4
Percentage in Position #1    85%         70%           88%         80%

Comparison of results of searching individual chemicals and rank-ordering by number of data sources between the Dashboard (left) and ChemSpider (right), divided up by compound classes.

Dashboard (left)
Columns: Compound class; Number in class; Average Rank; Number of compounds in each position rank-ordered (#1, #2, #3, #4, #5+)
Pharmaceutical Drug 72 1.3 59 8 3 2
Manufacturing Chemicals 42 1.2 38 1
Personal Care Products 2.6 6
Steroid Hormones 7 1.0
Perfluorochemicals 5
Pesticides 12 10
Veterinary Drugs
Dyes
Food product/natural compounds 4 1.5
Illicit Drugs
Misc. Molecules

ChemSpider (right)
Columns: Compound class; Number in class; Average Rank; Number of compounds in each position rank-ordered (#1, #2, #3, #4, #5+)
Pharmaceutical Drug 72 1.4 55 9 6 2
Manufacturing Chemicals 42 5.5 28 3 5
Personal Care Products 8 6.1 1 4
Steroid Hormones 7 1.0
Perfluorochemicals 1.2
Pesticides 12 2.3
Veterinary Drugs 1.3
Dyes
Food product/natural compounds 3.8
Illicit Drugs 2.0
Misc. Molecules

Predicted environmental media occurrence and functional use

Functional uses (e.g. dye, preservative, antioxidant) allow for enhanced identification capabilities and can help inform the likelihood of occurrence in environmental media. Occurrence of chemicals in 22 different media has been predicted and will be incorporated into future work.

Non-Targeted Analysis Collaboration Research Trial

The US EPA is conducting a collaborative research trial with more than 20 labs and institutions from around the world in an effort to develop, improve, and standardize analytical and data processing methods in NTA and suspect screening analysis.

The Dashboard includes:

Dashboard Homepage: provides a simple text entry box allowing a type-ahead search for systematic, trade and trivial names, CAS Registry Numbers and InChI identifiers. Initial simple searches can be filtered to return single-component chemicals (not mixtures) and to ignore isotopes. https://comptox.epa.gov

Single Chemical Record Page: Chemical records associated with structure representations are displayed along with intrinsic properties (e.g. molecular formula, mass and systematic name), structural identifiers (e.g. SMILES, InChI strings and keys), and, where possible, links to related information on Wikipedia.

Chemical Properties Panel: Experimental and predicted values are displayed for a number of chemical properties including LogP, water solubility, melting point, etc. Predictions are based on the Open Structure Activity Relationships Application (OPERA) models developed from curated datasets (Mansouri et al., 2016).

Conclusions

The CompTox Chemistry Dashboard outperformed ChemSpider using single searches of known unknowns and ranking by the number of data sources. Unlike ChemSpider's data, our Open Data are available for download. NTA workflows that advance exposure science will be incorporated into the Dashboard.

References

Little JL, Williams AJ, Pshenichnov A, Tkachenko V. 2012. Identification of “known unknowns” utilizing accurate mass data and ChemSpider. J. Am. Soc. Mass Spectrom. 23(1): 179-185. doi:10.1007/s13361-011-0265-y

Mansouri K, Grulke CM, Richard AM, Judson RS, Williams AJ. 2016. An automated curation procedure for addressing chemical errors and inconsistencies in public datasets used in QSAR modelling. SAR QSAR Environ. Res. 27(11): 939-965. doi:10.1080/1062936X.2016.1253611

McEachran AD, Sobus JR, Williams AJ. 2017. Identifying known unknowns using the US EPA's CompTox Chemistry Dashboard. Anal. Bioanal. Chem. 409(7): 1729-1735. doi:10.1007/s00216-016-0139-z

Acknowledgements

The authors would like to acknowledge specific members of the development team within NCCT (Jennifer Smith, Chris Grulke, Jeff Edwards) and collaborators in NERL (Kristin Isaacs, Katherine Phillips, Kathie Dionisio) for their ongoing contributions to the Dashboard and this research. This presentation does not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
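
The two summary statistics reported in the results, average rank position and percentage in position #1, can be derived from counts of how many chemicals landed at each rank. The sketch below is a hedged illustration, not the authors' analysis code. It assumes that the four counts listed for the Pharmaceutical Drug class under the Dashboard (59, 8, 3, 2) correspond to positions #1 through #4, which the flattened table does not confirm, and it ignores the open-ended #5+ bucket because that bucket has no single rank value.

```python
def rank_summary(counts_by_position):
    """Return (average rank, percent at position #1).

    counts_by_position maps a rank position (1, 2, 3, ...) to the number
    of chemicals whose correct structure appeared at that position.
    """
    total = sum(counts_by_position.values())
    weighted = sum(pos * n for pos, n in counts_by_position.items())
    average_rank = weighted / total
    pct_first = 100.0 * counts_by_position.get(1, 0) / total
    return average_rank, pct_first


# Assumed position assignment for the Pharmaceutical Drug row (72 chemicals).
pharma = {1: 59, 2: 8, 3: 3, 4: 2}

avg, pct = rank_summary(pharma)
print(f"average rank {avg:.1f}, position #1 {pct:.0f}%")
# prints: average rank 1.3, position #1 82%
```

Under that assumption the computed average of 1.3 agrees with the average rank listed for the class.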