Sung Kyu (Andrew) Maeng. Contents  QSAR Introduction  QSBR Introduction  Results and discussion  Current QSAR project in UNESCO-IHE.


Introduction to the (Q)SAR concept
• Chemicals with similar molecular structures have similar effects in physical and biological systems → qualitative model (SAR)
• The extent of an effect varies in a systematic way with variations in molecular structure → quantitative model (QSAR)
Activity depends on chemical structure
Biodegradation index = MW-0.314H/C
r = 0.866, r² = 0.750, Sig. < 0.005, n = 156
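The quantitative model on this slide is a linear regression of an activity index on molecular descriptors. As a minimal sketch of how such a fit and its r and r² can be obtained, the Python below runs ordinary least squares on synthetic data. The function name `fit_linear_qsar`, the descriptor ranges and the "true" coefficients are all assumptions made for illustration; none of them come from the presentation.

```python
import numpy as np

def fit_linear_qsar(descriptors, activity):
    """Least-squares fit of activity = b0 + b1*x1 + b2*x2 + ...

    Returns (coefficients, r, r2), where r is the multiple correlation
    coefficient and r2 the coefficient of determination.
    """
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(activity)), descriptors])
    coef, *_ = np.linalg.lstsq(design, activity, rcond=None)
    predicted = design @ coef
    ss_res = np.sum((activity - predicted) ** 2)
    ss_tot = np.sum((activity - activity.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    return coef, float(np.sqrt(r2)), float(r2)

# Synthetic data: every number here is made up for illustration.
rng = np.random.default_rng(0)
n = 156                                   # same sample size as the slide
mw = rng.uniform(100.0, 500.0, n)         # hypothetical molecular weights
hc = rng.uniform(0.5, 2.5, n)             # hypothetical H/C ratios
index = 3.0 - 0.002 * mw - 0.3 * hc + rng.normal(0.0, 0.1, n)

coef, r, r2 = fit_linear_qsar(np.column_stack([mw, hc]), index)
print("intercept, b_MW, b_HC:", np.round(coef, 4))
print("r = %.3f, r2 = %.3f" % (r, r2))
```

For a least-squares fit with an intercept, r² is the square of the multiple correlation coefficient, which matches the values on the slide: 0.866² ≈ 0.750.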

SAR vs QSAR
• SAR is based on the “similarity” principle
• The principle is assumed, but in reality it does not always hold:
  - Similarity of structures
  - Similarity of descriptors
• The validity depends on the type of relationship between descriptors (numerical representations of chemicals) and activity
• The type of relationship should be known (or derived)

SAR vs. QSAR: how can we say there is a difference?
The two methods have three things in common:
• Both use a numerical representation of chemical compounds
• Both need to decide which representation to use
• Both need to derive the relationship between the numerical representation (descriptors, etc.) and activity

QSAR in water treatment processes
Results obtained from valid qualitative or quantitative structure-activity relationship models can provide estimates of the removal of PhACs in drinking water treatment and support process selection for target compounds.
Results of QSAR may be used instead of testing if they are derived from a QSAR model whose scientific validity has been established.

QSAR in water treatment processes
• In principle, QSARs can be used to:
  - provide information for prioritizing treatments for target compounds
  - guide the experimental design of a test or testing strategy
  - improve the evaluation of existing test data
  - provide mechanistic information (e.g. to support the grouping of chemicals into categories)
  - fill a data gap needed for classification

OECD Principles for QSAR Validation
• A QSAR should be associated with the following information:
  - a defined endpoint
  - an unambiguous algorithm
  - a defined domain of applicability
  - appropriate measures of goodness-of-fit, robustness and predictivity
  - a mechanistic interpretation, if possible
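The slide asks for measures of goodness-of-fit and robustness without naming any. Two commonly used ones are the fitted R² (goodness-of-fit) and the leave-one-out cross-validated Q² (robustness). The sketch below computes both for a linear model on synthetic data. The helper names and the data are assumptions for illustration, and the presentation does not state which statistics it used.

```python
import numpy as np

def ols_fit(X, y):
    """Return [intercept, b1, b2, ...] from an ordinary least-squares fit."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def ols_predict(coef, X):
    return coef[0] + X @ coef[1:]

def r_squared(X, y):
    """Goodness-of-fit: R2 of the model fitted on all compounds."""
    residuals = y - ols_predict(ols_fit(X, y), X)
    return 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

def q_squared_loo(X, y):
    """Robustness: leave-one-out Q2 = 1 - PRESS / total sum of squares."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i          # drop compound i
        coef = ols_fit(X[keep], y[keep])       # refit without it
        press += (y[i] - ols_predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic descriptors and activities, made up for illustration.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 2))
y = 1.0 + 0.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(0.0, 0.2, 30)

r2 = r_squared(X, y)
q2 = q_squared_loo(X, y)
print("R2 = %.3f, Q2(LOO) = %.3f" % (r2, q2))
```

For least-squares models Q² cannot exceed R², so a large gap between the two suggests overfitting. Predictivity in the stricter sense is usually checked on an external test set that played no part in model building.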

QSBR
• Development of Quantitative Structure-Biodegradation Relationships (QSBRs)
  - QSBRs have been developed to predict the biodegradability of chemicals released to natural systems from their structure-activity relationships (SAR)
  - The development of QSBRs has been relatively slow compared with the proliferation of QSARs because of the nature of the biodegradability endpoint
  - QSBR is very complex because biodegradability depends on:
    1. Chemical structure
    2. Environmental conditions
    3. Bioavailability of the chemical

QSBR
- Limitations often associated with developing QSBRs:
  1. Applicable only within congeneric series of chemicals
  2. The absence of standardised and uniform biodegradation databases
- In recent years, there has been very intensive development of new and better qualitative and quantitative biodegradability models
- How many QSBRs have been developed? A literature search found more than 84 published models
- However, only a few models provided an acceptable level of agreement between estimated and experimental data

QSBR
- All QSBR models until 1994 were reviewed by several researchers for their applicability:
  1. Group contribution methods (OECD, PLS, BIOWIN, MultiCASE)
  2. Chemometric methods (CART)
  3. Expert systems (BESS, CATABOL, TOPKAT)
- According to previous studies, the group contribution method seems to be the most widely applied and successful way of modeling biodegradation

Group Contribution Method
• OECD hierarchical model approach
• Multivariate Partial Least Squares (PLS) model
• BIOWIN
• MultiCASE anaerobic program

What Does the BIOWIN Model Do?
• Provides estimates of biodegradability useful in chemical screening under aerobic conditions (BIOWIN1, 2, 5, 6)
• Provides the approximate time required to biodegrade in a stream (BIOWIN3, 4)
• Recently, BIOWIN was updated and can now estimate anaerobic biodegradation potential (BIOWIN7)

BIOWIN has 7 models (U.S. EPA, 2007):
- BIOWIN1 (linear), BIOWIN2 (non-linear): based on regressions of experimental biodegradation data for 295 compounds (BIODEG) against 36 preselected chemical substructures plus molecular weight
- BIOWIN3 (ultimate), BIOWIN4 (primary): based on regressions of biodegradability estimates from a survey of experts for a suite of 200 organic chemicals against the same chemical substructures plus molecular weight
- BIOWIN5 (linear), BIOWIN6 (non-linear): based on regressions of data from the Japanese MITI database against a modified set of chemical substructures plus molecular weight
- BIOWIN7: based on the BIOWIN fragment contribution approach
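The general shape of a fragment (group) contribution model can be sketched as follows. Each substructure adds a fixed contribution per occurrence, molecular weight adds a further term, and a non-linear variant passes the same sum through a logistic function. The fragment names, coefficients and intercept below are hypothetical placeholders and are not actual BIOWIN values.

```python
import math

# Hypothetical coefficients for illustration only; NOT actual BIOWIN values.
FRAGMENT_COEFFS = {
    "ester": 0.30,
    "hydroxyl": 0.15,
    "aromatic_chloride": -0.25,
}
MW_COEFF = -0.001
INTERCEPT = 0.6

def linear_score(fragment_counts, mol_weight):
    """Linear model: intercept + MW term + sum(coefficient * fragment count)."""
    score = INTERCEPT + MW_COEFF * mol_weight
    for name, count in fragment_counts.items():
        score += FRAGMENT_COEFFS[name] * count
    return score

def logistic_probability(fragment_counts, mol_weight):
    """Non-linear variant: the same sum mapped into (0, 1) by a logistic function."""
    z = linear_score(fragment_counts, mol_weight)
    return 1.0 / (1.0 + math.exp(-z))

# Two made-up molecules of equal weight, differing only in their fragments.
with_ester = {"ester": 1}
with_chlorides = {"aromatic_chloride": 2}
print(linear_score(with_ester, 200.0), linear_score(with_chlorides, 200.0))
print(logistic_probability(with_ester, 200.0))
```

In a model of this form, fragments with positive coefficients raise the predicted biodegradability and fragments with negative coefficients lower it, so the coefficients can be read mechanistically.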

Materials and method
• Finding molecular descriptors: software such as Delft Chemtech, Dragon, Chem3D, etc.
• Selection of molecular descriptors:
  1. PCA (SPSS)
  2. Genetic Algorithm-Variable Subset Selection (MobyDigs)

Principal Component Analysis

Principal Component Analysis (PCA)
• Variables: MW, MV, log Kow, dipole, length, width, depth, equiv width, % HL surface, polar surface area
• Assessment of the suitability of the data for PCA:
  - KMO > 0.6 (KMO = 0.6), Bartlett's Test of Sphericity < 0.05 (< 0.005)
• Determination of the number of factors by the Kaiser criterion, scree plot and Monte Carlo parallel analysis
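The suitability checks and the Kaiser criterion named on this slide can all be computed from the correlation matrix of the descriptors. The sketch below uses the standard textbook formulas on synthetic data with two underlying factors. The function names and the data are assumptions for illustration, and the results will not match SPSS output for the real descriptor set.

```python
import numpy as np

def kmo(R):
    """Kaiser-Meyer-Olkin measure from a correlation matrix R."""
    inv = np.linalg.inv(R)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale                    # partial correlations
    off_diag = ~np.eye(R.shape[0], dtype=bool)
    r2 = np.sum(R[off_diag] ** 2)
    p2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + p2)

def bartlett_statistic(R, n_samples):
    """Bartlett's test of sphericity: chi-square statistic and degrees of freedom."""
    p = R.shape[0]
    chi2 = -(n_samples - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    return chi2, p * (p - 1) // 2

def kaiser_components(R):
    """Eigenvalues in descending order and the number exceeding 1."""
    eigenvalues = np.linalg.eigvalsh(R)[::-1]
    return eigenvalues, int(np.sum(eigenvalues > 1.0))

# Synthetic data: six variables driven by two hidden factors plus noise.
rng = np.random.default_rng(2)
n = 100
factors = rng.normal(size=(n, 2))
loadings = np.array([[1, 1, 1, 0, 0, 0],
                     [0, 0, 0, 1, 1, 1]], dtype=float)
X = factors @ loadings + 0.5 * rng.normal(size=(n, 6))
R = np.corrcoef(X, rowvar=False)

kmo_value = kmo(R)
chi2, dof = bartlett_statistic(R, n)
eigenvalues, n_keep = kaiser_components(R)
explained = eigenvalues / eigenvalues.sum()   # variance share per component
print("KMO = %.2f, Bartlett chi2 = %.1f (df = %d)" % (kmo_value, chi2, dof))
print("components kept:", n_keep, "explained:", np.round(explained[:n_keep], 2))
```

The significance of Bartlett's test is the upper-tail probability of the statistic under a chi-square distribution with the returned degrees of freedom. The explained-variance shares are the eigenvalues divided by the number of variables.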

Classification of PhACs by PCA
The two-component solution explained a total of 67% of the variance, with Component 1 contributing 46% and Component 2 contributing 21%.
Component 1: size. Component 2: hydrophobicity/hydrophilicity.
Classes shown in the plot: HP-neu, HP-ion, HL-neu, HL-ion

Biodegradation (Aerobic)
Dependent variable: BIOWIN3
Independent variables (indices, chemical descriptors): MW, MV, log Kow, dipole, length, width, depth, equiv width, % HL surface, polar surface area
Table columns: class; R²; std. error; sig. (p); rej. range (%); BIOWIN3 range; equation to predict biodegradation
- HL: < (75) (2.8) logKow-0.008MV+1.039length ( width)
- HP: (86) (2.5) log_Kow-42.75length-94.09eqwidth
- HL-ionic: < (91) (2.6) MW+0.934length ( logKow-13.84length-94.09HL_surf)
- HP-ionic (note 1): --< (95) (2.7) - ( logKow-42.57length-94.09eqwidth)
- HL-neutral: < (60) (2.9) logKow-0.004MV ( logKow+27704eqwidth)
- HP-neutral: < (79.7) (2.3) logKow ( logKow eqwidth- 0.78HL_surf)
Note 1: For HP and HP-ionic compounds it was not feasible to derive an equation because of collinearity among the variables (a violation of MLR assumptions).
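The note on collinearity can be illustrated with variance inflation factors (VIFs), one common diagnostic for multiple linear regression. The presentation does not say how collinearity was detected, so this is a sketch of one possible check rather than the method used. The descriptor names and data are hypothetical.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF of each column of X: the diagonal of the inverse correlation matrix.

    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on the
    other columns. Values above roughly 5 to 10 are often read as a warning.
    """
    R = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(R))

# Hypothetical descriptors: "eq_width" is almost a copy of "length",
# while "dipole" is unrelated to both.
rng = np.random.default_rng(3)
length = rng.normal(size=60)
eq_width = length + 0.1 * rng.normal(size=60)
dipole = rng.normal(size=60)

vif = variance_inflation_factors(np.column_stack([length, eq_width, dipole]))
for name, value in zip(["length", "eq_width", "dipole"], vif):
    print("%-8s VIF = %.1f" % (name, value))
```

When two descriptors carry nearly the same information, their coefficients become unstable. Dropping one of them or combining them, for example through PCA, are typical remedies.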

Innovative system for removal of micropollutants: RBF and NF membrane
[Figure: RBF followed by a membrane stage; time-scale labels: days, days - weeks, weeks, weeks - months, months, longer]

Decision Support Framework
[Diagram: Organic micropollutants → QSAR → treatment options]
• Biological treatment: ARR, RBF/DUNE
• Physical/chemical treatment: Membrane (NF, RO), GAC, AOP (Cl2, O3)
• QSAR models: BIOWIN, Kow, K_O3, MW
• Process selection and comparative performance assessment

Current QSAR project
[Flow chart, boxes in extracted order]
• GIST: analysis of PhACs (LC-MS / auto SPE)
• Selection of target compounds
• Physical-chemical characteristics vs. water treatments
• QSAR tools
• Selection of water treatments
• Selected water treatments
• Classification, database, model development
• PhACs removal using selected water treatments by GIST
• PhACs removal using selected water treatments by UNESCO-IHE
• A decision support tool for PhACs removal for water utilities