QSAR Application Toolbox: Step 12: Building a QSAR model


Objectives This presentation demonstrates building a QSAR model for predicting the acute toxicity of aldehydes to Tetrahymena pyriformis. It specifically addresses: predicting acute toxicity for a target chemical; building a QSAR model based on the prediction; applying the model to other aldehydes; and exporting the predictions to a file.

The Exercise This exercise includes the following steps: select a target chemical – Furfural, CAS 98-01-1; extract available experimental results; search for analogues; estimate the 48h-IGC50 for Tetrahymena pyriformis by using trend analysis; improve the data set by either subcategorising by “Protein binding” mechanisms or assessing the difference between outliers and the target chemical; evaluate and save the model; use the model to display its training set, visualize its applicability domain and perform predictions.

Chemical Input After launching the Toolbox, select the “Flexible Track”. This takes you to the first module, which is “Chemical input”. Enter the target chemical by its CAS number (98-01-1).
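Because the target is entered by CAS number, a quick check-digit test can catch typing errors before the search. The sketch below is not part of the Toolbox; the function names cas_is_valid and format_cas are illustrative. It applies the standard CAS check-digit rule: the digits before the check digit are weighted 1, 2, 3, ... from right to left, and the weighted sum modulo 10 must equal the check digit.

```python
def format_cas(digits: str) -> str:
    """Insert hyphens into an unhyphenated CAS number, e.g. '98011' -> '98-01-1'."""
    if not digits.isdigit() or len(digits) < 5:
        raise ValueError("a CAS number needs at least 5 digits")
    return f"{digits[:-3]}-{digits[-3:-1]}-{digits[-1]}"


def cas_is_valid(cas: str) -> bool:
    """Return True if a hyphenated CAS number has a correct check digit."""
    parts = cas.split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return False
    first, second, check = parts
    if not 2 <= len(first) <= 7 or len(second) != 2 or len(check) != 1:
        return False
    body = first + second
    # Weight the body digits 1, 2, 3, ... starting from the rightmost digit.
    total = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    return total % 10 == int(check)


if __name__ == "__main__":
    cas = format_cas("98011")
    print(cas, cas_is_valid(cas))
```

For furfural the weighted sum is 1*1 + 0*2 + 8*3 + 9*4 = 61, and 61 mod 10 = 1, which matches the final digit of 98-01-1.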

Select target chemical – Furfural, CAS 98-01-1

Substance Information

Profiling the Target Chemical Select the “Profiling methods” you wish to use by clicking on the box before the name of the profiler. For this example check all mechanistic methods. Click on “Apply”.

Profiling

Target interaction with proteins Double-clicking shows the profiling scheme. The chemical could interact with protein by Schiff-base formation.

Target interaction with proteins

Endpoints “Endpoints” refer to the electronic process of retrieving the environmental fate, ecotoxicity and toxicity data that are stored in the Toolbox database. Data gathering can be executed in a global fashion (i.e., collecting all data of all endpoints) or on a more narrowly defined basis (e.g., collecting data for a single or limited number of endpoints).

Extracting endpoint values

Redundancy table Lists reports of the same endpoint value across databases.

Reproducing endpoint value In this exercise we will build a QSAR model to estimate the following endpoint: Ecotoxicological Information > Aquatic Toxicity > Protozoa > Tetrahymena pyriformis > IGC50 > 48 h.

Defining a Category The initial search for analogues is based on structural similarity, in this example: - US EPA categorization

Category Definition

Set Category Name

Analogues The data is automatically collated. Based on the defined category (Aldehydes US EPA categorisation) 274 analogues have been identified. These 274 compounds along with the target chemical form a category (Aldehydes), which can be used for data gap filling (see next slide).

Analogues

Extracting experimental results for analogues Highlight the [274] Aldehydes (US EPA categorisation). The inserted window entitled “Read Data?” appears (see next slide). Click OK.

Extracting experimental results for analogues

Applying Trend-analysis Move to the module “Filling data gap”. Open the data tree to: Ecotoxicological information > Protozoa > Tetrahymena pyriformis > IGC50 > 48 h. Highlight the data endpoint box under the target chemical. It already contains an experimental result, which we are going to reproduce by trend analysis. Next, with the “trend analysis” box highlighted, click “Apply” (see next slide).

Apply Trend-analysis

Results of Trend-analysis

Interpreting the Trend-analysis The resulting plot outlines the available experimental results of all analogues (Y axis) according to a default descriptor Log Kow (X axis). The RED dot represents the target chemical. The BLUE dots represent the experimental results available for the analogues. The GREEN dots represent the analogues belonging to a different subcategory (see following slides).
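The plot described above can be read as a straight-line fit of the endpoint against Log Kow. The following is a minimal sketch of such a fit, assuming ordinary least squares with a single descriptor; it does not reproduce the Toolbox's internal calculation. The function names and the numbers in the usage section are made-up placeholders, not Toolbox data.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x_new):
    """Read the fitted line at the target chemical's descriptor value."""
    return intercept + slope * x_new


if __name__ == "__main__":
    # Placeholder values only: Log Kow of analogues (X) and their endpoint (Y).
    log_kow = [0.5, 1.0, 2.0, 3.0]
    endpoint = [0.2, 0.6, 1.3, 2.1]
    slope, intercept = fit_line(log_kow, endpoint)
    print(predict(slope, intercept, 1.5))
```

In this reading, the analogues supply the (X, Y) pairs and the target chemical's prediction is the value of the line at its own Log Kow.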

An Accurate Trend Analysis of the Data set (1) In this example, the mechanistic properties of the analogues are not consistent. Subcategorization can be performed based on protein binding mechanisms. This is the second stage of the analogue search, which requires the same interaction mechanism. Acute effects are indeed associated with interaction of chemicals with the lipid cell membrane, i.e. with protein binding. Chemicals with a different protein binding mechanism compared to the target chemical will be removed.

Subcategorization To improve the data by subcategorizing, follow these steps: Click on Subcategor. Select Protein binding from the Grouping methods list. All chemicals which have a potential protein binding mechanism different from the target chemical are highlighted (GREEN dots). Click on Remove.
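The logic of this step can be sketched as a simple filter. The sketch assumes that an analogue is kept only when its set of protein binding mechanisms equals the target's; the Toolbox may apply a different matching rule. The Chemical class, the subcategorize function and the analogue names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chemical:
    name: str
    protein_binding: frozenset  # profiling result, e.g. {"Schiff-base formation"}


def subcategorize(target, analogues):
    """Split analogues into those matching the target's profile and the rest."""
    kept = [a for a in analogues if a.protein_binding == target.protein_binding]
    removed = [a for a in analogues if a.protein_binding != target.protein_binding]
    return kept, removed


if __name__ == "__main__":
    target = Chemical("Furfural", frozenset({"Schiff-base formation"}))
    # Hypothetical analogues, for illustration only.
    analogues = [
        Chemical("analogue A", frozenset({"Schiff-base formation"})),
        Chemical("analogue B", frozenset({"Michael type nucleophilic addition"})),
        Chemical("analogue C", frozenset({"No binding"})),
    ]
    kept, removed = subcategorize(target, analogues)
    print([c.name for c in kept], [c.name for c in removed])
```

The removed list corresponds to the highlighted (GREEN) chemicals, and the kept list to the refined category used for the trend analysis.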

Subcategorization

Result after Subcategorization

An Accurate Trend Analysis of the Data set (2) The chemicals which differ from the target are: Michael type nucleophilic addition (23); No binding (48); Nucleophilic addition to azomethynes (1); Nucleophilic substitution of haloaromatics (1). Another way of refining the data set is to ask what makes the obvious outliers different from the target.

Subcategorization Right-click on any of the outlying results from the analogues (BLUE dots). Select Differences to target from the menu. Select Protein binding from the Grouping methods list. Click on Remove (see next slide).

Subcategorization

Result after Subcategorization

QSAR Model evaluation To assess the model accuracy use: - Adequacy (predictions after leave-one-out) - Statistics - Cumulative frequency

QSAR Model evaluation

QSAR Model evaluation The residuals, abs(obs - predicted), for 95% of analogues are comparable with the variation of experimental data.
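The leave-one-out idea behind the adequacy check can be sketched as follows: each analogue is left out in turn, the line is refitted on the rest, and the absolute residual abs(obs - predicted) is recorded for the omitted chemical. This is a minimal illustration assuming a single-descriptor least-squares model; the function names, the tolerance and the numbers are hypothetical and do not come from the Toolbox.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("descriptor values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_abs_residuals(x, y):
    """abs(obs - predicted) for each point, predicted without that point."""
    if len(x) < 3 or len(x) != len(y):
        raise ValueError("need at least three paired observations")
    residuals = []
    for i in range(len(x)):
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_rest, y_rest)
        residuals.append(abs(y[i] - (intercept + slope * x[i])))
    return residuals


def fraction_within(residuals, tolerance):
    """Share of residuals that do not exceed the given tolerance."""
    return sum(r <= tolerance for r in residuals) / len(residuals)


if __name__ == "__main__":
    # Placeholder values only; the tolerance stands in for experimental variation.
    log_kow = [0.5, 1.0, 2.0, 3.0, 4.0]
    endpoint = [0.2, 0.6, 1.3, 2.1, 2.8]
    res = loo_abs_residuals(log_kow, endpoint)
    print(fraction_within(res, tolerance=0.5))
```

A fraction of at least 0.95 within the chosen tolerance would correspond to the statement on the slide, with the tolerance taken from the known variation of the experimental data.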

Saving the Derived QSAR Model To save the new regression model follow these steps: - Click on Save model button - Enter the model name “Acute tox” - Click on OK and - Accept the value

QSAR Model evaluation

Apply QSAR model The derived model can be used to: (1) List training set chemicals: right-click on the QSAR model Acute tox and select Training set from the context menu. (2) Visualize whether a chemical is in the applicability domain of the model: in the data matrix, highlight the empty cell of one of the analogues (e.g. chemical no. 2 in the matrix) for the endpoint 48-h IGC50 Tetrahymena pyriformis and select Display domain. (3) Perform predictions for the chemicals in the matrix: select Predict endpoint and All Chemicals in domain.

Apply QSAR model Training set

Apply QSAR model Visualize whether a chemical is in the applicability domain of the model The chemical is an aldehyde, as required by the model. It can react with protein by Schiff-base formation and does not react with protein by any of the eliminated mechanisms: Michael-type nucleophilic addition; No binding; Nucleophilic addition to azomethynes; Nucleophilic substitution of haloaromatics. Another requirement is that Log Kow be >= 0.3210 and <= 4.75. This last requirement is slightly violated (Log Kow = 4.87), and therefore the chemical is outside the applicability domain of the model.
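The three domain requirements listed on this slide can be written as a single check. The sketch below uses the Log Kow bounds and mechanism names from the slide; the function name in_domain and its parameters are hypothetical and do not correspond to any Toolbox interface.

```python
LOG_KOW_MIN = 0.3210
LOG_KOW_MAX = 4.75
REQUIRED_MECHANISM = "Schiff-base formation"
ELIMINATED_MECHANISMS = frozenset({
    "Michael-type nucleophilic addition",
    "No binding",
    "Nucleophilic addition to azomethynes",
    "Nucleophilic substitution of haloaromatics",
})


def in_domain(is_aldehyde, mechanisms, log_kow):
    """Return True only if all three domain requirements are met."""
    mechanisms = set(mechanisms)
    structural_ok = is_aldehyde
    mechanistic_ok = (
        REQUIRED_MECHANISM in mechanisms
        and not mechanisms & ELIMINATED_MECHANISMS
    )
    parametric_ok = LOG_KOW_MIN <= log_kow <= LOG_KOW_MAX
    return structural_ok and mechanistic_ok and parametric_ok


if __name__ == "__main__":
    # The chemical on the slide: aldehyde, Schiff-base former, Log Kow = 4.87.
    print(in_domain(True, {"Schiff-base formation"}, 4.87))
```

For the chemical on the slide the structural and mechanistic requirements hold, but Log Kow = 4.87 exceeds the upper bound of 4.75, so the check returns False.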

Apply QSAR model Visualize whether a chemical is in the applicability domain of the model

Apply QSAR model Perform predictions

Export QSAR results The predictions for the chemicals in the matrix can be exported into a text file. In the data tree right-click on 48 h (for the endpoint IGC50 for Tetrahymena pyriformis) and select Export endpoint data from the menu.

Export QSAR results Click the right mouse button.

Export QSAR results

Export QSAR results The resulting text file can be loaded into a spreadsheet and further analysed.
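As an alternative to a spreadsheet, the exported file could also be read with a short script. The slides do not describe the layout of the export, so this sketch assumes a tab-delimited text file with a header row; the column names and values in the sample are invented placeholders, and read_export is a hypothetical helper.

```python
import csv
import io


def read_export(text, delimiter="\t"):
    """Parse delimited text with a header row into a list of dictionaries."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


if __name__ == "__main__":
    # Invented sample; a real export may use other columns and another delimiter.
    sample = "CAS\tName\tValue\n98-01-1\tFurfural\t1.0\n"
    for row in read_export(sample):
        print(row["CAS"], row["Name"], float(row["Value"]))
```

For a real file, the text would come from open(path, newline="").read(), and the delimiter and column names would need to be adjusted to match the actual export.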