Strategies to build toxicity databases for data mining
Chihae Yang
June 28, 2007
Leadscope, Inc.


Overview
  Definitions of databases
  Landscape of current public toxicity databases
  Relational database
  Simple examples
  Strategies

Types of databases
  Literature
    – Bibliography
  Factual
    – Primary source
    – Secondary and tertiary source
  Curated
  Metadatabases

Types of information formats
  Monographs
  Simple relational tables
    – compound name
    – toxicity endpoints
  Structure-integrated databases

Primary sources
  US National Toxicology Program (NTP)
  US Environmental Protection Agency, Mid-Continent Ecology Division
  Tokyo Eiken (Tokyo Metropolitan Institute of Public Health)

Curated secondary and tertiary databases
  Chemical Carcinogenesis Research Information System (CCRIS)
  US EPA ECOTOX
  Metadatabase
    – ToxNet
  QSAR database
    – Danish EPA

Examples of monographs and risk assessment
  International Programme on Chemical Safety, INCHEM
  US CDC Agency for Toxic Substances & Disease Registry, ATSDR
  International Agency for Research on Cancer
  International Toxicity Estimates for Risk (ITER)
  Integrated Risk Information System (IRIS)

Toxicity                      Database sources
Carcinogenicity               CCRIS, IRIS (EPA), ITER (TERA), NTP, PAN Pesticide, IARC, ATSDR, …
Genetic toxicity              CCRIS, GeneTox (EPA), NTP, The Mutants - Japan, Tokyo Eiken, …
Target organs                 NTP, ATSDR, IRIS (EPA), …
Reproductive/developmental    DART, NTP, PAN Pesticide
Immunology                    NTP
Skin sensitization            No public database
Environmental endpoints       ECOTOX (EPA)

Issues with toxicity databases
  Narrow use scenarios
    – Mainly designed for information look-up
    – Limited usefulness for data mining and SAR analysis
  Reasons
    – Fragmented, inconsistent, conflicting information
    – Disparate formats
    – Non-standard or questionable data quality
    – Lack of accessibility
    – …

In silico toxicology workflow
  Sources: reports, databases, spreadsheets, microfiche, …
  Stages: integration; management and storage; data mining
  Steps: standardize, assess, search, visualize, data mine, predict, publish

Trends in toxicity databases
  Relational databases
  Standardization
  Data mining
  Link to chemical structures
  Link to genomics, proteomics, metabonomics (metabolomics) data

Relational database - Benefits
  Search across fields and domains
    – Precise searching
    – Asking complex questions
    – Hypothesis-driven queries
  Basis for data integration
  Read across

Relational database - Requirements
  Data model
    – Standardized fields
    – Relationships between the fields
  Database platform
    – Oracle
    – MySQL
    – BerkeleyDB
    – PointBase
    – …
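The two data-model requirements above (standardized fields and relationships between them) can be sketched with a small schema. This is a minimal illustration using Python's built-in sqlite3 module rather than any of the platforms listed on the slide; the table names, column names, and allowed result terms are all hypothetical and are not taken from the presentation.

```python
import sqlite3

# Hypothetical two-table data model: one row per compound, many studies per compound.
SCHEMA = """
CREATE TABLE compound (
    compound_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE study (
    study_id    INTEGER PRIMARY KEY,
    compound_id INTEGER NOT NULL REFERENCES compound(compound_id),
    endpoint    TEXT NOT NULL,  -- standardized field, e.g. 'carcinogenicity'
    species     TEXT NOT NULL,
    result      TEXT NOT NULL CHECK (result IN ('positive', 'negative', 'equivocal'))
);
"""


def build_database():
    """Create an in-memory database with the illustrative schema."""
    connection = sqlite3.connect(":memory:")
    connection.execute("PRAGMA foreign_keys = ON")  # enforce the relationship
    connection.executescript(SCHEMA)
    return connection


conn = build_database()
conn.execute("INSERT INTO compound (compound_id, name) VALUES (1, 'compound A')")
conn.execute(
    "INSERT INTO study (compound_id, endpoint, species, result) VALUES (?, ?, ?, ?)",
    (1, "carcinogenicity", "rat", "positive"),
)

# The relationship lets one query span both tables.
rows = conn.execute(
    "SELECT c.name, s.endpoint, s.species, s.result "
    "FROM compound c JOIN study s ON s.compound_id = c.compound_id"
).fetchall()
print(rows)
```

The CHECK constraint stands in for a standardized field: a result outside the agreed terms is rejected at entry time instead of surfacing later as inconsistent data.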

Structure-integrated databases - PubChem

Structure-integrated databases - DSSTox
  Link-out to PubChem CID

ToxML methodology
  Open standard XML format for representing toxicity data
    – Standardized fields
    – Controlled vocabulary
  Extensible through a schema
  Independent of any particular database schema and independent of any application
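To make the idea of standardized fields plus a controlled vocabulary concrete, here is a minimal sketch that builds one XML record and rejects terms outside a fixed vocabulary. The element names, attribute names, and vocabulary terms are hypothetical; they are not the actual ToxML schema, which is not reproduced in this presentation.

```python
import xml.etree.ElementTree as ET

# Hypothetical controlled vocabulary; a real standard would publish its own terms.
SPECIES_TERMS = {"rat", "mouse", "dog", "monkey"}
RESULT_TERMS = {"positive", "negative", "equivocal"}


def make_record(compound_name, endpoint, species, result):
    """Build one XML toxicity record, rejecting terms outside the vocabulary."""
    if species not in SPECIES_TERMS:
        raise ValueError(f"unknown species term: {species!r}")
    if result not in RESULT_TERMS:
        raise ValueError(f"unknown result term: {result!r}")
    compound = ET.Element("Compound", {"name": compound_name})
    study = ET.SubElement(compound, "Study", {"endpoint": endpoint})
    test = ET.SubElement(study, "Test", {"species": species})
    ET.SubElement(test, "Result").text = result
    return compound


record = make_record("compound A", "genetic toxicity", "mouse", "positive")
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Because the record is plain XML, it can be exchanged between tools without either side knowing the other's database schema, which is the independence the slide describes.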

Flexible data model for different uses
  Levels: compound, study, test, treatment, individual
  Uses: QSAR, signal detection
  Yang, CODDD, 9(1), 124,

Example: Look up lansoprazole (Prevacid)
  Search term: lansoprazole
  Search
  Result columns: ID, Name

Example: Look up lansoprazole (Prevacid)
  Subacute toxicity
  Bacterial mutation
  Mammalian mutation
  Chromosome aberration (in vivo)
  Repeated dose
    – Chronic
    – Subacute
  Acute toxicity
  Reproductive toxicity

Example: Find analogs of lansoprazole
  Chemical structure
    – substructure
    – similarity
  Degree of similarity
  Search

Example: analogs of lansoprazole
  Listed in order of similarity
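The ranking of analogs by similarity can be illustrated with a small sketch. The slides do not say which similarity measure the tool uses, so the Tanimoto coefficient below is an assumption, chosen because it is a common measure for comparing structural feature sets. The feature sets and analog names are hypothetical placeholders, not output from any real fingerprinting method.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_analogs(query, library):
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(query, features)) for name, features in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature sets; real systems derive these from the chemical structure.
query_features = {"benzimidazole", "pyridine", "sulfoxide", "trifluoroethoxy"}
library = {
    "analog 1": {"benzimidazole", "pyridine", "sulfoxide", "methoxy"},
    "analog 2": {"benzimidazole", "pyridine", "thioether"},
    "analog 3": {"phenol", "carboxylic acid"},
}

ranking = rank_analogs(query_features, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

A similarity threshold (the "degree of similarity" on the search form) would simply filter this ranked list before the toxicity profiles of the remaining analogs are retrieved.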

Profile of toxicity of lansoprazole analogs
  Pathology results
  Clinical results
  Genetic toxicity results
  Toxicity study type
  Route of administration

Traditional paradigm of chemical analog searching
  Diagram labels: structural descriptions, chemical stressor, analogs, profile, biological/environmental fate
  Current Computer-Aided Drug Design, 2006, 2,

Finding biological analogs
  Diagram labels: structural descriptions, analogs, profile, biological/environmental fate

Example: Hypothesis-driven query
  Compounds that are genotoxic may give reproductive-developmental effects.
    – Positives in clastogenicity
    – Positives in reproductive-developmental effects
  For example, in the US FDA PAFA database, most of the direct food additives that are clastogens resulted in some level of reproductive-developmental effects.
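The hypothesis above translates naturally into a relational query: select compounds that are positive for both endpoints. The sketch below uses Python's built-in sqlite3 module with a single hypothetical table; the table layout, endpoint labels, and rows are invented for illustration and are not PAFA data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (compound TEXT, endpoint TEXT, outcome TEXT)")
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [
        # Invented rows, for illustration only.
        ("compound A", "clastogenicity", "positive"),
        ("compound A", "repro-dev", "positive"),
        ("compound B", "clastogenicity", "positive"),
        ("compound B", "repro-dev", "negative"),
        ("compound C", "clastogenicity", "negative"),
        ("compound C", "repro-dev", "positive"),
    ],
)

# Compounds positive in clastogenicity AND positive in repro-dev effects.
QUERY = """
SELECT compound FROM result
 WHERE endpoint = 'clastogenicity' AND outcome = 'positive'
INTERSECT
SELECT compound FROM result
 WHERE endpoint = 'repro-dev' AND outcome = 'positive'
ORDER BY compound
"""
both_positive = [row[0] for row in conn.execute(QUERY)]

# Fraction of clastogens that also show repro-dev effects.
clastogen_count = conn.execute(
    "SELECT COUNT(DISTINCT compound) FROM result "
    "WHERE endpoint = 'clastogenicity' AND outcome = 'positive'"
).fetchone()[0]
fraction = len(both_positive) / clastogen_count
print(both_positive, fraction)
```

The fraction is the quantity that supports or weakens the hypothesis; with look-up-only databases it cannot be computed without first collecting the records by hand.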

Setting up the query: Clastogenic and repro-dev effects
  clastogenicity: chromosome aberration
  repro-dev: reproductive-developmental toxicity

Example of query results
  Repro-developmental
  Clastogenicity

Components of data mining
  Relational database, searching, visualization, data analysis, categorization/ranking, SAR & QSAR
  Chemical structure, toxicity studies
  Look up, asking questions, read across
  Data, structures, relationships

Example of data mining the database
  Search and results
    – Hypothesis-driven queries
  Analysis
    – Toxicity database analysis for linking profiles of different endpoints
  Visualization
    – Profiles and correlation

Example: Are there correlations between pathological lesions?
  Table columns: target site, species, chemicals that induce tumors at each site
  CPDB using NTP pathological terms
  Current Computer-Aided Drug Design, 2006, 2, 1-19.

Transformed information – need for database model
  Columns: Name; Species; Adrenal; Bone; Liver; Lung; Thyroid
  1,4-Dioxane; rat, mouse; absent, present, absent
  1,5-Naphthalenediamine; rat, mouse; absent, present
  Estradiol mustard; rat; absent, present, absent
  Mirex; rat, mouse; absent, present, absent
  C.I. direct blue 15; rat, mouse; absent, present, absent
  11-Aminoundecanoic acid; rat; absent, present, absent
  Malonaldehyde, sodium salt; rat; absent, present
  Trimethylthiourea; rat; absent, present
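The transformation this slide describes, from per-site tumor records to one presence/absence row per compound, amounts to a pivot. The sketch below shows one way to do it in plain Python; the function name and the input records are hypothetical and do not reproduce the values in the table above.

```python
ORGANS = ["adrenal", "bone", "liver", "lung", "thyroid"]


def to_profile_table(observations):
    """Pivot (compound, organ) lesion records into one presence/absence row per compound."""
    table = {}
    for compound, organ in observations:
        if organ not in ORGANS:
            raise ValueError(f"unknown target site: {organ!r}")
        # Every site starts as 'absent' until a lesion record says otherwise.
        row = table.setdefault(compound, {site: "absent" for site in ORGANS})
        row[organ] = "present"
    return table


# Hypothetical records: each tuple says a tumor was reported at that site.
observations = [
    ("compound A", "liver"),
    ("compound A", "thyroid"),
    ("compound B", "lung"),
]

profiles = to_profile_table(observations)
for compound, row in profiles.items():
    print(compound, [row[site] for site in ORGANS])
```

Once every compound has the same fixed set of columns, the rows can be compared, clustered, or correlated, which is not possible while the information sits in site-by-site lists.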

Figure labels: biological profile (lesions in target organs); compound classes

Correlations of lesions between target organ sites – qualifying read across
  Pearson correlation coefficients:
    liver – thyroid: 80%
    bone – thyroid: 82%
    lung – thyroid: 40%
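For reference, a Pearson correlation between two target sites can be computed directly from presence/absence columns coded as 1 and 0. The sketch below is a plain-Python illustration; the example vectors are hypothetical and are not the data behind the percentages on this slide.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical lesion columns for six compounds: 1 = present, 0 = absent.
liver = [1, 1, 0, 1, 0, 0]
thyroid = [1, 1, 0, 0, 0, 1]

liver_thyroid = pearson(liver, thyroid)
print(f"liver - thyroid: {liver_thyroid:.0%}")
```

A high coefficient between two sites suggests that a finding at one site carries information about the other, which is the sense in which the correlation qualifies a read-across.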

Role of toxicity databases
  Chemical categorization
  Environmental and human health risk assessment
    – PBT assessment: persistence, bioaccumulation, toxicity
  A data resource for QSAR (quantitative structure-activity relationship) analysis required for international regulatory initiatives

International regulatory initiatives
  Canadian DSL (Domestic Substances List)
    – Implemented
  EU REACH
    – Registration, Evaluation, Authorization and Restriction of Chemicals (REACH)
    – Originally targeted for June 1, 2007
  EU 7th Amendment
    – Limitation of animal testing for cosmetics

Required components of toxicity database for data mining
  Well-modeled relational database
    – Searching and retrieval
    – Search forms allowing hypothesis-driven queries
    – Database entry/build tools
  Knowledge base
    – Chemistry
    – Biology (toxicity study results)
  Foundation for analysis

Strategies - Technology
  Relational data
  Database "data model"
  Public standards
    – standardized field structures
    – harmonized controlled vocabulary
  Web-based public open source
  Link with existing public technology

Strategies - Content
  Provide or link with all available toxicity endpoints
  Link structures with toxicity data
  Link with international database efforts
    – PubChem
    – OECD toolbox
    – EU ECB JRC QSAR
    – Ambit
    – US EPA ToxCast
    – US EPA DSSTox
    – US FDA – Leadscope, ToxML

From data to database
  Vitamin A reproductive-developmental effects
  Data entry form
  Database tree view
  ToxML Editor (free download)

If you build it, they will come…
  W. P. Kinsella, Shoeless Joe