Sung Kyu (Andrew) Maeng
Contents
- QSAR Introduction
- QSBR Introduction
- Results and discussion
- Current QSAR project in UNESCO-IHE
Introduction to the (Q)SAR concept
- Chemicals with similar molecular structures have similar effects in physical and biological systems → qualitative model (SAR)
- The extent of an effect varies in a systematic way with variations in molecular structure → quantitative model (QSAR)
- Activity depends on chemical structure
- Biodegradation index = MW-0.314H/C (r = 0.866, r² = 0.750, Sig. < 0.005, n = 156)
SAR vs. QSAR
- SAR is based on the “similarity” principle.
- The principle is assumed, but in reality it does not always hold:
  - similarity of structures
  - similarity of descriptors
- The reliability of the result depends on the type of relationship between the descriptors (numerical representations of chemicals) and the activity.
- The type of relationship should be known (or derived).
SAR vs. QSAR: how can we say there is a difference?
The two methods have three things in common:
- Both use a numerical representation of chemical compounds.
- Both need to decide which representation to use.
- Both need to derive the relationship between the numerical representation (descriptors, etc.) and activity.
QSAR in water treatment processes
- Results from valid qualitative or quantitative structure-activity relationship models can be used to estimate the removal of pharmaceutically active compounds (PhACs) in drinking water treatment and to support process selection for target compounds.
- QSAR results may be used instead of testing if they are derived from a QSAR model whose scientific validity has been established.
QSAR in water treatment processes
In principle, QSARs can be used to:
- provide information for setting treatment priorities for target compounds
- guide the experimental design of a test or testing strategy
- improve the evaluation of existing test data
- provide mechanistic information (e.g. to support the grouping of chemicals into categories)
- fill a data gap needed for classification
OECD Principles for QSAR Validation
A QSAR should be associated with the following information:
- a defined endpoint
- an unambiguous algorithm
- a defined domain of applicability
- appropriate measures of goodness-of-fit, robustness and predictivity
- a mechanistic interpretation, if possible
QSBR
Development of Quantitative Structure-Biodegradation Relationships (QSBRs)
- QSBRs have been developed to predict the biodegradability of chemicals released to natural systems using their structure-activity relationships (SAR).
- The development of QSBRs has been relatively slow compared with the proliferation of QSARs because of the nature of the biodegradability endpoint.
- QSBR modelling is very complex because biodegradation depends on:
  1. Chemical structure
  2. Environmental conditions
  3. Bioavailability of the chemical
QSBR
- Limitations often encountered in developing QSBRs:
  1. Models apply only within congeneric series of chemicals
  2. Absence of standardised and uniform biodegradation databases
- In recent years, new and better qualitative and quantitative biodegradability models have been developed intensively.
- How many QSBRs have been developed? A literature search on QSBRs found more than 84 models.
- However, only a few models gave an acceptable level of agreement between estimated and experimental data.
QSBR
- All QSBR models published until 1994 were reviewed by several researchers for their applicability:
  1. Group contribution methods (OECD, PLS, BIOWIN, MultiCASE)
  2. Chemometric methods (CART)
  3. Expert systems (BESS, CATABOL, TOPKAT)
- According to previous studies, the group contribution method appears to be the most widely applied and successful way of modelling biodegradation.
Group Contribution Method
- OECD hierarchical model approach
- Multivariate partial least squares (PLS) model
- BIOWIN
- MultiCASE anaerobic program
What Does the BIOWIN Model Do?
- Provides estimates of biodegradability useful in chemical screening under aerobic conditions (models 1, 2, 5 and 6)
- Provides the approximate time required to biodegrade in a stream (models 3 and 4)
- BIOWIN was recently updated and can now estimate anaerobic biodegradation potential (model 7)
BIOWIN has 7 models (U.S. EPA, 2007):
- BIOWIN1 (linear) and BIOWIN2 (non-linear): based on regressions of experimental biodegradation data for 295 compounds (BIODEG) against 36 preselected chemical substructures plus molecular weight
- BIOWIN3 (ultimate) and BIOWIN4 (primary): based on regressions of biodegradability estimates from a survey of experts for a suite of 200 organic chemicals against the same chemical substructures plus molecular weight
- BIOWIN5 (linear) and BIOWIN6 (non-linear): based on regressions of data from the Japanese MITI database against a modified set of chemical substructures plus molecular weight
- BIOWIN7 (anaerobic): based on the BIOWIN fragment contribution approach
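The fragment (group) contribution idea behind the linear and non-linear model pairs can be sketched as follows. This is a minimal illustration only: the fragment names, coefficients and intercept are hypothetical placeholders and are NOT the values used by BIOWIN, and the logistic form for the non-linear variant is assumed here. For brevity the sketch reuses one coefficient set for both forms, whereas separately fitted models would each have their own.

```python
import math

# Hypothetical coefficients for illustration only; these are NOT BIOWIN values.
FRAGMENT_COEFFS = {
    "ester": 0.2,
    "aromatic_chloride": -0.2,
    "hydroxyl": 0.1,
}
MW_COEFF = -0.0005   # hypothetical contribution per unit of molecular weight
INTERCEPT = 0.7      # hypothetical regression intercept


def linear_score(fragment_counts, mol_weight):
    """Linear form: intercept + sum(coefficient * fragment count) + MW term."""
    total = INTERCEPT + MW_COEFF * mol_weight
    for name, count in fragment_counts.items():
        # An unknown fragment raises KeyError rather than being silently ignored.
        total += FRAGMENT_COEFFS[name] * count
    return total


def logistic_probability(fragment_counts, mol_weight):
    """Non-linear form (assumed logistic): maps the linear score into (0, 1)."""
    x = linear_score(fragment_counts, mol_weight)
    return 1.0 / (1.0 + math.exp(-x))


# Made-up compound: one ester group, two hydroxyl groups, MW = 200.
example = {"ester": 1, "hydroxyl": 2}
print(linear_score(example, 200.0))
print(logistic_probability(example, 200.0))
```

The linear form can return values outside the 0 to 1 range, which is one reason a bounded non-linear variant is useful when the output is read as a probability.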
Materials and method
Finding molecular descriptors
- Software: Delft Chemtech, Dragon, Chem3D, etc.
Selection of molecular descriptors
1. PCA (SPSS)
2. Genetic Algorithm-Variable Subset Selection (MobyDigs)
Principal Component Analysis
- Variables: MW, MV, log Kow, dipole, length, width, depth, equiv width, % HL surface, polar surface area
- Assessment of the suitability of the data for PCA: KMO > 0.6 (KMO = 0.6); Bartlett’s Test of Sphericity < 0.05 (< 0.005)
- Determination of the number of factors by the Kaiser criterion, scree plot and Monte Carlo parallel analysis
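The suitability check and the Kaiser criterion can be sketched with NumPy as below. This is an illustrative sketch, not the SPSS procedure used in the study: the data are synthetic (two made-up latent factors driving six made-up descriptors), and the function names `kmo_overall`, `pca_kaiser` and `make_synthetic_descriptors` are hypothetical. The Kaiser criterion retains components whose eigenvalue of the correlation matrix exceeds 1.

```python
import numpy as np


def kmo_overall(corr):
    """Overall Kaiser-Meyer-Olkin measure computed from a correlation matrix."""
    inv = np.linalg.inv(corr)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)             # partial correlations
    off = ~np.eye(corr.shape[0], dtype=bool)    # off-diagonal mask
    r2 = np.sum(corr[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + p2)


def pca_kaiser(X):
    """PCA on the correlation matrix, keeping components with eigenvalue > 1."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardise columns
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)            # ascending order
    order = np.argsort(eigvals)[::-1]                  # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    n_keep = int(np.sum(eigvals > 1.0))                # Kaiser criterion
    explained = eigvals / eigvals.sum()                # variance fractions
    scores = Z @ eigvecs[:, :n_keep]                   # component scores
    return corr, eigvals, explained, n_keep, scores


def make_synthetic_descriptors(n=60, seed=0):
    """Synthetic stand-in data: three columns per made-up latent factor."""
    rng = np.random.default_rng(seed)
    size = rng.normal(size=n)
    polarity = rng.normal(size=n)
    cols = [size + 0.3 * rng.normal(size=n) for _ in range(3)]
    cols += [polarity + 0.3 * rng.normal(size=n) for _ in range(3)]
    return np.column_stack(cols)


X = make_synthetic_descriptors()
corr, eigvals, explained, n_keep, scores = pca_kaiser(X)
print("KMO:", round(float(kmo_overall(corr)), 3))
print("components kept:", n_keep)
print("explained variance:", np.round(explained[:n_keep], 3))
```

The scree plot and parallel analysis are complementary checks on the same eigenvalues and are not shown here.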
Classification of PhACs - PCA
- The two-component solution explained a total of 67% of the variance, with Component 1 contributing 46% and Component 2 contributing 21%.
- Component 1: size; Component 2: hydrophobicity/hydrophilicity
- Classes: HP-neu, HP-ion, HL-neu, HL-ion (HP = hydrophobic, HL = hydrophilic; neu = neutral, ion = ionic)
Biodegradation (Aerobic)
Dependent variable: BIOWIN3
Independent variables (indices, chemical descriptors): MW, MV, log Kow, dipole, length, width, depth, equiv width, % HL surface, polar surface area
Class | R² | Std. error | Sig. (p) | Rej. range (%) | BIOWIN3 range | Equation to predict biodegradation
HL | … | … | < … | … (75) | … (2.8) | … logKow - 0.008 MV + 1.039 length (… width)
HP | … | … | … | … (86) | … (2.5) | … logKow - 42.75 length - 94.09 eqwidth
HL-ionic | … | … | < … | … (91) | … (2.6) | … MW + 0.934 length (… logKow - 13.84 length - 94.09 HL_surf)
HP-ionic (1) | - | - | < … | … (95) | … (2.7) | - (… logKow - 42.57 length - 94.09 eqwidth)
HL-neutral | … | … | < … | … (60) | … (2.9) | … logKow - 0.004 MV (… logKow + 27704 eqwidth)
HP-neutral | … | … | < … | … (79.7) | … (2.3) | … logKow (… logKow … eqwidth - 0.78 HL_surf)
(1) For HP and HP-ionic compounds it was not feasible to derive an equation because of collinearity among the variables (a violation of MLR assumptions).
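The collinearity problem mentioned in the footnote can be illustrated with a small multiple linear regression (MLR) and a variance inflation factor (VIF) check. This is a sketch on synthetic data, not the study's data or equations: the descriptor values and coefficients are made up, and the function names are hypothetical. A VIF well above about 10 is a common rule of thumb for problematic collinearity.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least squares with intercept; returns coefficients and R^2."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    pred = A @ coef
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot


def vif(X):
    """VIF per column: 1 / (1 - R^2) from regressing it on the other columns."""
    factors = []
    for j in range(X.shape[1]):
        others = np.delete(X, j, axis=1)
        _, r2 = fit_mlr(others, X[:, j])
        factors.append(1.0 / (1.0 - r2))
    return np.array(factors)


def make_synthetic_data(n=50, seed=1):
    """Made-up descriptors: width is almost a linear function of length."""
    rng = np.random.default_rng(seed)
    length = rng.normal(10.0, 2.0, size=n)
    width = 0.5 * length + rng.normal(0.0, 0.05, size=n)   # near-collinear
    log_kow = rng.normal(2.0, 1.0, size=n)                 # independent
    X = np.column_stack([length, width, log_kow])
    # Made-up response standing in for a biodegradation index.
    y = 3.0 + 0.4 * log_kow - 0.1 * length + rng.normal(0.0, 0.1, size=n)
    return X, y


X, y = make_synthetic_data()
coef, r2 = fit_mlr(X, y)
print("coefficients (intercept, length, width, log_kow):", np.round(coef, 3))
print("R^2:", round(float(r2), 3))
print("VIF (length, width, log_kow):", np.round(vif(X), 1))
```

When two descriptors carry nearly the same information, their individual coefficients become unstable even if the overall fit is good, so dropping or combining one of them is a common remedy.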
Innovative system for removal of micropollutants: RBF and NF membrane
[Figure: RBF followed by membrane; time-scale labels: days, days - weeks, weeks, weeks - months, months, longer]
QSAR Models Decision Support Framework
[Diagram labels]
- Organic micropollutants
- QSAR: BIOWIN, Kow, K O3, MW
- Biological treatment: ARR, RBF/DUNE
- Physical/chemical treatment: membrane (NF, RO), GAC, AOP, Cl2, O3
- Process selection and comparative performance assessment
Current QSAR project
- GIST: analysis of PhACs (LC-MS / auto SPE)
- Selection of target compounds
- Physical-chemical characteristics vs. water treatments
- QSAR tools
- Selection of water treatments
- Classification, database, model development
- PhACs removal using selected water treatments by GIST
- PhACs removal using selected water treatments by UNESCO-IHE
- A decision support tool for PhACs removal for water utilities