Quantitative Structure-Activity Relationships (QSAR) Comparative Molecular Field Analysis (CoMFA) Gijs Schaftenaar.

Outline
Introduction
Structures and activities
Analysis techniques: Free-Wilson, Hansch
Regression techniques: PCA, PLS
Comparative Molecular Field Analysis

QSAR: The Setting
Quantitative structure-activity relationships are used when there is little or no receptor information, but measured activities are available for (many) compounds.

From Structure to Property (figure slides; properties shown: EC50 and LD50)

QSAR: Which Relationship?
Quantitative structure-activity relationships correlate chemical/biological activities with structural features or with atomic, group or molecular properties, within a range of structurally similar compounds.

Free Energy of Binding and Equilibrium Constants
The free energy of binding is related to the reaction constants of ligand-receptor complex formation:
$\Delta G_{\text{binding}} = -2.303\,RT \log K = -2.303\,RT \log (k_{\text{on}} / k_{\text{off}})$
$K$: equilibrium constant
$k_{\text{on}}$ (association) and $k_{\text{off}}$ (dissociation): rate constants
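As a rough numerical illustration of the relation above, the following Python sketch converts rate constants into an equilibrium constant and a binding free energy. The rate constants and the temperature of 298.15 K are hypothetical values chosen for the example, not data from the presentation.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ/(mol*K)


def equilibrium_constant(k_on, k_off):
    """K = k_on / k_off for ligand-receptor complex formation."""
    return k_on / k_off


def binding_free_energy(K, temperature=298.15):
    """Delta G (kJ/mol) = -2.303 * R * T * log10(K)."""
    return -math.log(10) * R_KJ * temperature * math.log10(K)


# Hypothetical rate constants (assumed values for illustration only).
K = equilibrium_constant(k_on=1.0e6, k_off=1.0e-3)
dg = binding_free_energy(K)
print(f"K = {K:.1e}, dG_binding = {dg:.1f} kJ/mol")
```

With these assumed inputs K is about 1e9 and the free energy comes out near -51 kJ/mol; each additional factor of ten in K contributes roughly -5.7 kJ/mol at this temperature.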

Concentration as Activity Measure
A critical molar concentration C that produces the biological effect is related to the equilibrium constant K.
Usually log (1/C) is used (cf. pH).
For meaningful QSARs, activities need to be spread out over at least 3 log units.

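Free Energy of Binding
$\Delta G_{\text{binding}} = \Delta G_0 + \Delta G_{\text{hb}} + \Delta G_{\text{ionic}} + \Delta G_{\text{lipo}} + \Delta G_{\text{rot}}$
Term                          Contribution                                     Energy
$\Delta G_0$                  entropy loss (translational + rotational)        +5.4
$\Delta G_{\text{hb}}$        ideal hydrogen bond                              –4.7
$\Delta G_{\text{ionic}}$     ideal ionic interaction                          –8.3
$\Delta G_{\text{lipo}}$      lipophilic contact                               –0.17
$\Delta G_{\text{rot}}$       entropy loss (rotatable bonds)                   +1.4
(Energies in kJ/mol per unit feature)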
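The additive scheme above can be sketched in a few lines of Python. The per-feature energies are the ones listed on the slide; the feature counts for the example ligand are hypothetical, and the unit of "lipophilic contact" is left as on the slide (one unit feature).

```python
# Per-feature contributions in kJ/mol, as listed on the slide.
DG_0 = 5.4       # entropy loss (translational + rotational), counted once
DG_HB = -4.7     # per ideal hydrogen bond
DG_IONIC = -8.3  # per ideal ionic interaction
DG_LIPO = -0.17  # per unit of lipophilic contact
DG_ROT = 1.4     # per rotatable bond


def estimate_binding_energy(n_hbonds, n_ionic, lipo_contact, n_rot_bonds):
    """Sum the additive terms of the binding free energy (kJ/mol)."""
    return (DG_0
            + n_hbonds * DG_HB
            + n_ionic * DG_IONIC
            + lipo_contact * DG_LIPO
            + n_rot_bonds * DG_ROT)


# Hypothetical ligand: 2 H-bonds, 1 ionic interaction,
# 100 units of lipophilic contact, 3 rotatable bonds.
dg_estimate = estimate_binding_energy(2, 1, 100, 3)
print(f"Estimated dG_binding = {dg_estimate:.1f} kJ/mol")
```

For this made-up ligand the terms sum to about -25.1 kJ/mol; the point is only that each feature enters linearly with its count.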

Molecules Are Not Numbers! Where are the numbers? Numerical descriptors

Basic Assumption in QSAR
The structural properties of a compound contribute in a linearly additive way to its biological activity, provided there are no non-linear dependencies of transport or binding on some properties.

An Example: Capsaicin Analogs
X        EC50 (µM)   log(1/EC50)
H        11.80       4.93
Cl        1.24       5.91
NO2       4.58       5.34
CN       26.50       4.58
C6H5      0.24       6.62
NMe2      4.39       5.36
I         0.35       6.46
NHCHO     ?          ?
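The log(1/EC50) column follows directly from the EC50 values once they are converted from micromolar to molar. A minimal Python sketch of that conversion, using the seven measured values from the table (the function name is an illustrative choice):

```python
import math

# EC50 values in micromolar, from the capsaicin analog table.
ec50_um = {
    "H": 11.80, "Cl": 1.24, "NO2": 4.58, "CN": 26.50,
    "C6H5": 0.24, "NMe2": 4.39, "I": 0.35,
}


def log_inverse_conc(c_micromolar):
    """Return log10(1/C) with C converted from micromolar to molar."""
    return -math.log10(c_micromolar * 1e-6)


activities = {x: round(log_inverse_conc(c), 2) for x, c in ec50_um.items()}
spread = max(activities.values()) - min(activities.values())

for x, value in activities.items():
    print(f"{x:5s} {value:.2f}")
print(f"spread = {spread:.2f} log units")
```

The rounded values reproduce the table, and the spread of this series comes out at about 2 log units.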

An Example: Capsaicin Analogs
X        log(1/EC50)   MR       π        σ        Es
H        4.93           1.03     0.00
Cl       5.91           6.03     0.71     0.23    -0.97
NO2      5.34           7.36    -0.28     0.78    -2.52
CN       4.58           6.33    -0.57     0.66    -0.51
C6H5     6.62          25.36     1.96    -0.01    -3.82
NMe2     5.36          15.55     0.18    -0.83    -2.90
I        6.46          13.94     1.12     0.18    -1.40
NHCHO    ?             10.31    -0.98     0.00    -0.98
MR = molar refractivity (polarizability) parameter; π = hydrophobicity parameter; σ = electronic sigma constant (para position); Es = Taft size parameter

An Example: Capsaicin Analogs
$\log(1/\mathrm{EC}_{50}) = -0.89 + 0.019\,\mathrm{MR} + 0.23\,\pi - 0.31\,\sigma - 0.14\,E_s$
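An equation of this form is typically obtained by multiple linear regression of the activities on the descriptors. The sketch below shows one way to set that up with NumPy least squares on the seven measured analogs. It assumes σ = 0.00 and Es = 0.00 for X = H (these two values are not given in the transcript), and with only seven compounds for five parameters the fit is heavily underdetermined, so the coefficients it returns need not match the ones printed above.

```python
import numpy as np

# Descriptor columns: MR, pi, sigma, Es (values from the descriptor table).
# ASSUMPTION: sigma and Es for X = H are set to 0.00; the transcript omits them.
descriptors = np.array([
    [1.03, 0.00, 0.00, 0.00],     # H
    [6.03, 0.71, 0.23, -0.97],    # Cl
    [7.36, -0.28, 0.78, -2.52],   # NO2
    [6.33, -0.57, 0.66, -0.51],   # CN
    [25.36, 1.96, -0.01, -3.82],  # C6H5
    [15.55, 0.18, -0.83, -2.90],  # NMe2
    [13.94, 1.12, 0.18, -1.40],   # I
])
activity = np.array([4.93, 5.91, 5.34, 4.58, 6.62, 5.36, 6.46])

# Design matrix with a leading column of ones for the constant term.
design = np.column_stack([np.ones(len(activity)), descriptors])
coef, *_ = np.linalg.lstsq(design, activity, rcond=None)

fitted = design @ coef
rss = float(np.sum((activity - fitted) ** 2))
tss = float(np.sum((activity - activity.mean()) ** 2))
r_squared = 1.0 - rss / tss

# Apply the fitted model to the unmeasured analog NHCHO.
nhcho = np.array([1.0, 10.31, -0.98, 0.00, -0.98])
predicted_nhcho = float(nhcho @ coef)

print("constant, MR, pi, sigma, Es:", np.round(coef, 3))
print("R^2 =", round(r_squared, 3), " predicted NHCHO =", round(predicted_nhcho, 2))
```

The same design-matrix pattern carries over to larger series, where the ratio of compounds to coefficients is more favourable.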

First Approaches: The Early Days
Free-Wilson Analysis
Hansch Analysis

Free-Wilson Analysis
$\log(1/C) = \sum_i a_i x_i + \mu$
$x_i$: presence of group $i$ (0 or 1)
$a_i$: activity contribution of group $i$
$\mu$: activity value of the unsubstituted compound
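A Free-Wilson model can be fitted by ordinary least squares on a matrix of 0/1 indicator variables. The sketch below uses entirely synthetic data: the substituent names, group contributions and activity of the unsubstituted compound are hypothetical, and the activities are generated noise-free so that the fit recovers the inputs.

```python
import numpy as np

groups = ["Cl", "Me", "OH"]  # hypothetical substituents

# Each row holds the indicator variables x_i of one compound.
x = np.array([
    [0, 0, 0],   # unsubstituted compound
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
], dtype=float)

# Synthetic "measurements" built from assumed contributions.
true_mu = 5.0
true_a = np.array([0.8, 0.3, -0.4])
log_inv_c = true_mu + x @ true_a

# Append a column of ones so the last coefficient is the constant term.
design = np.column_stack([x, np.ones(len(x))])
solution, *_ = np.linalg.lstsq(design, log_inv_c, rcond=None)
a_fit, mu_fit = solution[:-1], solution[-1]

for name, a in zip(groups, a_fit):
    print(f"a({name}) = {a:+.2f}")
print(f"constant term = {mu_fit:.2f}")
```

Because every coefficient belongs to one specific group, the fitted model can only score combinations of substituents that already occur in the training matrix.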

Free-Wilson Analysis
+ Computationally straightforward
– Predictions only for substituents already included
– Requires large number of compounds

Hansch Analysis
Drug transport and binding affinity depend nonlinearly on lipophilicity:
$\log(1/C) = a (\log P)^2 + b \log P + c\,\sigma + k$
$P$: n-octanol/water partition coefficient
$\sigma$: Hammett electronic parameter
$a, b, c$: regression coefficients
$k$: constant term
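The Hansch equation is linear in its coefficients, so it can be fitted by least squares once (log P)² is added as its own column. The following sketch uses synthetic, noise-free data generated from assumed coefficients; all numbers are hypothetical. It also evaluates the lipophilicity optimum, log P = -b / (2a), which follows from setting the derivative of the parabola to zero when a is negative.

```python
import numpy as np

# Hypothetical descriptor values for seven compounds.
log_p = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5])
sigma = np.array([0.00, 0.23, -0.17, 0.78, 0.00, 0.66, -0.27])

# Synthetic activities from assumed coefficients (illustration only).
a_true, b_true, c_true, k_true = -0.5, 2.0, 0.4, 3.0
log_inv_c = a_true * log_p**2 + b_true * log_p + c_true * sigma + k_true

# Columns: (log P)^2, log P, sigma, constant.
design = np.column_stack([log_p**2, log_p, sigma, np.ones_like(log_p)])
(a, b, c, k), *_ = np.linalg.lstsq(design, log_inv_c, rcond=None)

# Vertex of the parabola in log P (a maximum when a < 0).
log_p_opt = -b / (2.0 * a)

print(f"a = {a:.2f}, b = {b:.2f}, c = {c:.2f}, k = {k:.2f}")
print(f"optimal log P = {log_p_opt:.2f}")
```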

Hansch Analysis
+ Fewer regression coefficients needed for correlation
+ Interpretation in physicochemical terms
+ Predictions for other substituents possible

Molecular Descriptors
Simple counts of features, e.g. of atoms, rings, H-bond donors, molecular weight
Physicochemical properties, e.g. polarisability, hydrophobicity (logP), water-solubility
Group properties, e.g. Hammett and Taft constants, volume
2D fingerprints based on fragments
3D screens based on fragments

2D Fingerprints
C   N   O   P   S   X   F   Cl  Br  I   Ph  CO  NH  OH  Me  Et  Py  CHO SO  C=C C≡C C=N Am  Im
1   1   1   0   0   1   0   0   1   0   1   1   1   1   1   0   0   0   0   1   0   0   1   0
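A fragment-based fingerprint of this kind is simply one bit per key, set to 1 when the fragment occurs in the molecule. The sketch below rebuilds the bit string shown on the slide from the list of keys. The set of fragments marked as present is read off the slide's bits; how the fragments would be detected in a real structure (substructure search) is outside the scope of this example.

```python
# Fragment keys in the order used on the slide.
KEYS = ["C", "N", "O", "P", "S", "X", "F", "Cl", "Br", "I", "Ph", "CO",
        "NH", "OH", "Me", "Et", "Py", "CHO", "SO", "C=C", "C≡C", "C=N",
        "Am", "Im"]


def fingerprint(present_fragments):
    """Return a bit string with one position per key (1 = fragment present)."""
    return "".join("1" if key in present_fragments else "0" for key in KEYS)


# Fragments present in the slide's example molecule (taken from its bits).
present = {"C", "N", "O", "X", "Br", "Ph", "CO", "NH", "OH", "Me", "C=C", "Am"}
bits = fingerprint(present)
print(bits)
```

Storing the result as a string keeps the example readable; an integer bit mask or a boolean array would serve equally well.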

Regression Techniques Principal Component Analysis (PCA) Partial Least Squares (PLS)

Principal Component Analysis (PCA)
Many (>3) variables to describe objects = high dimensionality of descriptor data
PCA is used to reduce dimensionality
PCA extracts the most important factors (principal components or PCs) from the data
Useful when correlations exist between descriptors
The result is a new, small set of variables (PCs) which explain most of the data variation

PCA – From 2D to 1D

PCA – From 3D to 3D-

Different Views on PCA
Statistically, PCA is a multivariate analysis technique closely related to eigenvector analysis
In matrix terms, PCA is a decomposition of matrix $X$ into two smaller matrices plus a set of residuals: $X = T P^{T} + R$
Geometrically, PCA is a projection technique in which $X$ is projected onto a subspace of reduced dimensions
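The matrix view, X = T Pᵀ + R, can be reproduced with a singular value decomposition. In the sketch below the descriptor matrix is random, hypothetical data built from two underlying factors, and the columns are mean-centered before the decomposition (a common convention that the slide does not state explicitly).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical descriptor matrix: 10 compounds x 4 correlated descriptors,
# generated from 2 latent factors plus a little noise.
latent = rng.normal(size=(10, 2))
mixing = rng.normal(size=(2, 4))
X = latent @ mixing + 0.01 * rng.normal(size=(10, 4))

# ASSUMPTION: columns are mean-centered before PCA.
Xc = X - X.mean(axis=0)

U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

n_pc = 2
T = U[:, :n_pc] * s[:n_pc]   # scores (compounds in PC space)
P = Vt[:n_pc].T              # loadings (descriptors -> PCs)
R = Xc - T @ P.T             # residuals not captured by the PCs

explained = float((s[:n_pc] ** 2).sum() / (s ** 2).sum())
print(f"variance explained by {n_pc} PCs: {explained:.4f}")
print("largest residual:", float(np.abs(R).max()))
```

Here four correlated descriptors collapse onto two components with almost no loss, which is the dimensionality reduction described above.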

Partial Least Squares (PLS)
$y_1 = a_0 + a_1 x_{11} + a_2 x_{12} + a_3 x_{13} + \dots + e_1$   (compound 1)
$y_2 = a_0 + a_1 x_{21} + a_2 x_{22} + a_3 x_{23} + \dots + e_2$   (compound 2)
$y_3 = a_0 + a_1 x_{31} + a_2 x_{32} + a_3 x_{33} + \dots + e_3$   (compound 3)
…
$y_n = a_0 + a_1 x_{n1} + a_2 x_{n2} + a_3 x_{n3} + \dots + e_n$   (compound n)
$Y = XA + E$
$X$ = independent variables; $Y$ = dependent variables

PLS – Cross-validation
Squared correlation coefficient $R^2$: value between 0 and 1 (> 0.9), indicating the explanatory power of the regression equation
With cross-validation, squared correlation coefficient $Q^2$: value between 0 and 1 (> 0.5), indicating the predictive power of the regression equation
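The difference between the two statistics can be illustrated with a leave-one-out loop. The sketch below assumes the common definitions R² = 1 - RSS/SS and Q² = 1 - PRESS/SS, where PRESS sums the squared errors of predictions made for each compound by a model fitted without it; the slide's own formula is not preserved in the transcript. For brevity a one-descriptor linear regression stands in for PLS, and the data are hypothetical. Under this definition Q² can also turn negative for a poorly predictive model.

```python
import numpy as np

# Hypothetical data: one descriptor and the measured activity.
x = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])
y = np.array([4.9, 5.3, 5.4, 6.0, 6.1, 6.6])

ss_total = float(np.sum((y - y.mean()) ** 2))


def r_squared(x, y):
    """R^2 of a straight-line fit using all compounds."""
    slope, intercept = np.polyfit(x, y, 1)
    rss = float(np.sum((y - (slope * x + intercept)) ** 2))
    return 1.0 - rss / ss_total


def q_squared_loo(x, y):
    """Leave-one-out Q^2: each compound is predicted by a model
    fitted to all the other compounds."""
    press = 0.0
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        slope, intercept = np.polyfit(x[keep], y[keep], 1)
        press += float((y[i] - (slope * x[i] + intercept)) ** 2)
    return 1.0 - press / ss_total


r2 = r_squared(x, y)
q2 = q_squared_loo(x, y)
print(f"R^2 = {r2:.3f}, Q^2 = {q2:.3f}")
```

Q² never exceeds R² for a least-squares fit, because every left-out prediction error is at least as large as the corresponding fitted residual.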

PCA vs PLS
PCA: The principal components describe the variance in the independent variables (descriptors)
PLS: The principal components describe the variance in both the independent variables (descriptors) and the dependent variable (activity)
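To make the contrast concrete, here is a minimal sketch of PLS for a single activity column (often called PLS1), written with the NIPALS-style deflation scheme. Each component's weight vector is taken along Xᵀy, so the components are steered by the activity as well as by the descriptors. The data are random and hypothetical, and this is an illustrative implementation rather than the procedure of any particular QSAR package.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """PLS1 by successive deflation; returns (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    weights, loadings, y_loadings = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                 # direction of maximal covariance with y
        w = w / np.linalg.norm(w)
        t = Xr @ w                    # scores of this component
        tt = t @ t
        p = Xr.T @ t / tt             # descriptor loadings
        c = yr @ t / tt               # activity loading
        Xr = Xr - np.outer(t, p)      # remove what the component explains
        yr = yr - c * t
        weights.append(w)
        loadings.append(p)
        y_loadings.append(c)
    W = np.array(weights).T
    P = np.array(loadings).T
    q = np.array(y_loadings)
    coef = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


# Hypothetical data: 20 compounds, 3 descriptors, noisy linear activity.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)

coef_1, _ = pls1_fit(X, y, n_components=1)
coef_3, intercept_3 = pls1_fit(X, y, n_components=3)

# Reference: ordinary least squares with an intercept column.
design = np.column_stack([X, np.ones(len(y))])
ols, *_ = np.linalg.lstsq(design, y, rcond=None)
print("PLS, 1 component :", np.round(coef_1, 3))
print("PLS, 3 components:", np.round(coef_3, 3))
print("OLS              :", np.round(ols[:3], 3))
```

With as many components as descriptors the PLS coefficients coincide with ordinary least squares; using fewer components is what gives PLS its regularizing effect when descriptors are numerous and correlated.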

Comparative Molecular Field Analysis (CoMFA)
Set of chemically related compounds
Common substructure required
3D structures needed (e.g., Corina-generated)
Bioactive conformations of the active compounds are to be aligned

CoMFA Alignment

CoMFA Grid and Field Probe (Only one molecule shown for clarity)

Electrostatic Potential Contour Lines

CoMFA Model Derivation
Molecules are positioned in a regular grid according to alignment
Probes are used to determine the molecular field:
Van der Waals field (probe is neutral carbon): $E_{\text{vdw}} = \sum (A_i r_{ij}^{-12} - B_i r_{ij}^{-6})$
Electrostatic field (probe is charged atom): $E_c = \sum q_i q_j / (D r_{ij})$
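The two field expressions can be evaluated on a regular grid with a few lines of NumPy. In this sketch the atom coordinates, charges and the A and B parameters are hypothetical placeholders, the unit-conversion constant of the Coulomb term is omitted as in the formula above, and no energy cutoff is applied; grid points are chosen so that none coincides with an atom.

```python
import numpy as np


def comfa_fields(coords, charges, A, B, grid, probe_charge=1.0, dielectric=1.0):
    """Return (E_vdw, E_c) at every grid point.

    coords: (n_atoms, 3) atom positions; charges, A, B: (n_atoms,) arrays;
    grid: (n_points, 3) probe positions.
    """
    # r[j, i] = distance between grid point j and atom i.
    r = np.linalg.norm(grid[:, None, :] - coords[None, :, :], axis=2)
    e_vdw = np.sum(A / r**12 - B / r**6, axis=1)
    e_c = np.sum(charges * probe_charge / (dielectric * r), axis=1)
    return e_vdw, e_c


# Regular grid with 2-unit spacing: coordinates -3, -1, 1, 3 on each axis.
axis = np.arange(-3.0, 4.0, 2.0)
grid = np.array(np.meshgrid(axis, axis, axis, indexing="ij")).reshape(3, -1).T

# Hypothetical two-atom "molecule" (all parameters are made up).
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
charges = np.array([0.4, -0.4])
A = np.array([1.0, 1.0])
B = np.array([1.0, 1.0])

e_vdw, e_c = comfa_fields(coords, charges, A, B, grid)
print("grid points:", len(grid))
print("electrostatic field range:", float(e_c.min()), float(e_c.max()))
```

For a set of aligned molecules, the field values at each grid point become the descriptor columns, one row per molecule, that are then passed to the regression step.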

3D Contour Map for Electronegativity

CoMFA Pros and Cons
+ Suitable to describe receptor-ligand interactions
+ 3D visualization of important features
+ Good correlation within related set
+ Predictive power within scanned space
– Alignment is often difficult
– Training required
