1
SAR vs QSAR or “is QSAR different from SAR”
Joanna Jaworska, Procter & Gamble, Brussels, Belgium, and Nina Jeliazkova, IPP, Bulgarian Academy of Sciences, Sofia, Bulgaria
2
SAR vs. QSAR: how could we say there is no difference?
SAR is supposed to be a non-quantitative concept.
SAR is based on the notion of “similarity”:
- “Similar compounds have similar activity”
- “Dissimilar compounds have dissimilar activity”
QSAR aims to derive a quantitative model of the activity.
How could we say there is no difference? Let’s see what similarity is.
3
SAR vs. QSAR: Roadmap
- What does “similarity” mean? A philosophers’ view and its implications for toxicology
- Are the basic tenets of SAR true?
- What do similarity measures measure?
- How does the similarity measure relate to QSAR modeling?
4
Similarity : philosophers’ view
- Exploiting the similarity concept is a sign of immature science (Quine).
- It is ill-defined to say “A is similar to B”; it is only meaningful to say “A is similar to B with respect to C”.
(1) W. V. Quine, Natural kinds. In Ontological relativity and other essays, Columbia University Press, New York, NY, 1977.
- The notion of similarity is used mainly in the early stages of the development of a particular science; it may be quantified and explained accurately later, as the theory of that science develops.
(2) N. Goodman (Ed.), Seven strictures on similarity. Problems and Projects, 437-447. Bobbs-Merrill, New York, 1972.
Implications for toxicology: a chemical “A” cannot be similar to a chemical “B” in absolute terms, but only with respect to some measurable key feature.
5
Chemical Grouping by Similarity
[Figure: similarity between structures leads to selected similar compounds; similarity between points (numerical values)?]
6
Structural similarity
- Does not always imply similarity in activity (Martin et al., J. Med. Chem. 45)
- Does not always imply similarity in descriptors (Kubinyi, H., Chemical Similarity and Biological Activity; with permission of the author)
As illustrated in [Kubinyi], structurally similar compounds (in this example, eight compounds with the same connectivity that differ in only one or two substituents) can have very different volume and surface potentials, hydrophobic and polar regions, hydrogen bond donor potentials, hydrogen bond acceptor potentials and molecular electrostatic potentials. This also contradicts the long-repeated “basics of QSAR”, which assert that similar compounds have similar properties and dissimilar compounds have dissimilar properties.
Kubinyi, H., Chemical Similarity and Biological Activity. Hugo Kubinyi Lectures
7
Structurally similar compounds can have very different properties
Usually the modeller resorts to similarity in structures in the hope that structurally similar compounds will also share the same mechanism of action [[i]]. This is a widely used approach, but that hope is not always fulfilled. Several surprising structure-activity relationships demonstrate that chemically similar compounds may have significantly different biological actions and activities, and that different molecules can be very similar in their biological activities. Applying the results from one congeneric series to another may lead to completely wrong conclusions [54, [ii], [iii], [iv], [v]].
[[i]] Barratt, M.D., Castell, J.V., Chamberlain, M., Combes, R.D., Dearden, J.C., Fentem, J.H., Gerner, I., Giuliani, A., Gray, T.J.B., Livingstone, D.J., McLean Provan, W., Rutten, F.J.J.A.L., Verhaar, H.J.M. and Zbinden, P., The Integrated Use of Alternative Approaches for Predicting Toxic Hazard. The Report and Recommendations of ECVAM Workshop 8.
[[ii]] Burger, A., Isosterism and bioisosterism in drug design, Prog. Drug. Res., 37, (1991).
[[iii]] Patani, G.A. and LaVoie, E.J., Bioisosterism: A rational approach in drug design, Chem. Rev., 96, (1996).
[[iv]] Kubinyi, H., Similarity and Dissimilarity - A Medicinal Chemist’s View, in: 3D QSAR in Drug Design. Volume II. Ligand-Protein Interactions and Molecular Similarity, H. Kubinyi, G. Folkers and Y. C. Martin, Eds., Kluwer/ESCOM, Dordrecht (1998); also published in: Persp. Drug Design Discov. 9/10/11, (1998).
[[v]] Kubinyi, H., Chemical Similarity and Biological Activity, 3rd Workshop on Chemical Structure and Biological Activity: Perspectives on QSAR 2001 (November 8-10, 2001), São Paulo, Brazil.
Kubinyi, H., Chemical Similarity and Biological Activity. Hugo Kubinyi Lectures
8
Example: Y. Martin et al. (2002): Do structurally similar molecules have similar biological activity?
- Set of 1645 chemicals with IC50s for monoamine oxidase inhibition
- Daylight fingerprints, 1024 bits long (0-7 bonds)
- Using the Tanimoto coefficient with a cutoff value of 0.85, only 30% of actives were detected
[Table: cutoff values, % of actives detected, % false positives]
J. Med. Chem. 2002, 45
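The similarity search described on this slide can be sketched as follows. This is only an illustration: the fingerprints are hypothetical sets of on-bit positions, not real Daylight output, and the function names `tanimoto` and `similar_to_query` are made up for this example. Only the 0.85 cutoff comes from the slide.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention assumed here: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / union


def similar_to_query(query, library, cutoff=0.85):
    """Names of library compounds whose similarity to the query is >= cutoff."""
    return [name for name, bits in library.items()
            if tanimoto(query, bits) >= cutoff]


# Hypothetical toy fingerprints (on-bit positions).
query = {1, 4, 9, 16, 25, 36, 49}
library = {
    "close_analogue": {1, 4, 9, 16, 25, 36, 49, 64},  # 7 shared bits, 8 in union
    "distant_analogue": {1, 4, 9, 100, 200},          # 3 shared bits, 9 in union
}

hits = similar_to_query(query, library)
print(hits)
```

A compound passes the cutoff purely on shared bits; nothing in the calculation refers to activity, which is the gap the Martin et al. result points at.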
9
How else to measure chemical similarity ?
- Describe chemical compounds with a set of numerical values (fingerprints, diverse descriptors, field values, etc.)
- Set up some measure between the values (Euclidean distance, Tanimoto distance, Carbo similarity index, etc.)
What do we actually measure? And how is it related to the activity?
10
The distance between numerical representations of chemical compounds
What do we measure? The distance between numerical representations of chemical compounds.
A few warnings:
- The numerical representation is not unique
- The numerical representation includes only part of all the information about the compound
- A distance measure reflects “closeness” only if the data satisfy specific assumptions (next slide: example)
11
Distances - example
By Euclidean distance we would decide that the red point is closer to data set 2, while a human would note that it belongs to data set 1.
[Figure: a red point plotted together with data set 1 and data set 2]
- Distances give results that are not always what intuition expects
- Be aware of the assumptions behind distances (e.g. Euclidean distance gives good results with normally distributed data in an orthogonal space)
- Any distance scheme or classification technique relies on certain assumptions and can, in principle, give low-error classification results only if the underlying data distributions comply with these assumptions.
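A small numerical sketch of this effect, under assumptions that are not in the slides: the coordinates are invented, "closeness to a data set" is taken as distance to its centroid, and the Mahalanobis distance is swapped in as one example of a measure that accounts for the spread of each set. It is a stand-in for the human judgement, not the authors' method.

```python
import math


def mean_and_cov(points):
    """Centroid and sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def euclidean_to_centroid(p, points):
    (mx, my), _ = mean_and_cov(points)
    return math.hypot(p[0] - mx, p[1] - my)


def mahalanobis_to_centroid(p, points):
    """Distance to the centroid, scaled by the inverse 2x2 covariance matrix."""
    (mx, my), (sxx, sxy, syy) = mean_and_cov(points)
    dx, dy = p[0] - mx, p[1] - my
    det = sxx * syy - sxy ** 2
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)


# Hypothetical data: set 1 is long and thin, set 2 is small and compact.
data_sets = {
    "set1": [(-10, 0.0), (-5, 0.5), (0, -0.5), (5, 0.5), (10, -0.5)],
    "set2": [(11.5, 3.5), (12.5, 3.5), (11.5, 4.5), (12.5, 4.5), (12.0, 4.0)],
}
red_point = (9.0, 0.0)  # lies on the long axis of set 1

nearest_euclidean = min(
    data_sets, key=lambda k: euclidean_to_centroid(red_point, data_sets[k]))
nearest_mahalanobis = min(
    data_sets, key=lambda k: mahalanobis_to_centroid(red_point, data_sets[k]))
print(nearest_euclidean, nearest_mahalanobis)
```

With these numbers the two measures disagree about which set the red point belongs to, so the answer depends on the assumptions built into the distance rather than on the data alone.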
12
How do we represent a chemical compound ?
- Fingerprints, descriptors (more than 3000 available), electron density, various fields, etc.
- All representations lose information. We should ensure that the lost information is not important. How?
13
Finding important information
- A problem not unique to (Q)SAR
- Many methods are available
- The most popular (e.g. PCA) are not the best
- Possible solution: look for the most discriminative information (example: descriptors that best discriminate between active and inactive compounds)
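One way to read "most discriminative" is sketched below. The choice of a Fisher-style score (squared difference of class means over the sum of class variances) is an assumption made for illustration; the slides do not name a specific criterion. The descriptor names `d_wide` and `d_sharp` and all values are hypothetical. Ranking by total variance is included only as a rough proxy for what a variance-driven method such as PCA emphasises.

```python
from statistics import mean, variance


def fisher_score(active_values, inactive_values):
    """Squared gap between class means, relative to the within-class spread."""
    spread = variance(active_values) + variance(inactive_values)
    return (mean(active_values) - mean(inactive_values)) ** 2 / spread


# Hypothetical descriptor values for four active and four inactive compounds.
actives = {"d_wide": [10, 50, 90, 30], "d_sharp": [1.0, 1.1, 0.9, 1.0]}
inactives = {"d_wide": [20, 60, 80, 40], "d_sharp": [0.0, 0.1, -0.1, 0.0]}

fisher = {name: fisher_score(actives[name], inactives[name]) for name in actives}
total_variance = {name: variance(actives[name] + inactives[name]) for name in actives}

best_by_fisher = max(fisher, key=fisher.get)
best_by_variance = max(total_variance, key=total_variance.get)
print(best_by_fisher, best_by_variance)
```

Here the descriptor with the largest overall variance barely separates the classes, while the low-variance one separates them cleanly, which is one reason a variance-based selection can miss the information that matters for activity.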
14
SAR vs. QSAR: how could we say there is no difference?
Two things in common so far:
- Both methods use a numerical representation of chemical compounds;
- Both methods need to decide which representation to use.
One remaining difference: “SAR is a qualitative, not a quantitative, relationship”. Is this really true?
15
Similarity and Activity
- Proximity with respect to descriptors does not necessarily mean proximity with respect to activity (example)
- This is only true if a linear relationship holds between descriptors and activity (examples)
- The linear relationship is only a special case, given the complexity of biochemical interactions. Its use should be justified in every specific case
- Structural similarity should be used with care (examples)
Second, similarity searching in descriptor space can be deceptive. Proximity with respect to descriptors does not necessarily imply proximity with respect to activity. In fact, this is only true if a linear relationship holds between descriptors and activity, as shown in the review. However, the linear relationship is only a special case, given the complexity of biochemical interactions, and its frequent use is not always justified. Structural similarity should also be used with care, since it does not always imply similarity in activity, as shown by the examples.
16
“Neighbourhood principle”
- Molecules in the same local region (“neighbourhood”) of a descriptor space tend to have similar values of a desired property
- Contradictory evidence exists, both supporting and rejecting the principle
The efforts to computerize similarity assessment have resulted in a number of different methods. One of the most popular is the search for compounds with similar activity in descriptor space. This approach presupposes the existence of a set of descriptors such that molecules in the same local region (“neighbourhood”) of this descriptor space tend to have similar values of a desired property [[i]]. This is assumed to be the fundamental axiom of molecular similarity in descriptor space and is often called the “neighbourhood principle” or “neighbourhood behaviour axiom”. Similarity according to the neighbourhood axiom is defined with respect to a molecular property of interest, which leads to multiple definitions of similarity, one for each property. As a result, it allows “similarity” to be defined in an objective way, well suited to computer analysis. The axiom allows one to conclude that two chemicals have close values of a certain property if they have close descriptor values. There are numerous methods for exploiting this idea, and statements that it is supported by the experience of synthetic chemists, but some publications claiming the opposite also exist [[ii]]. A formal analysis of whether (and when) this assumption holds is necessary before it is applied.
[[i]] Martin, Y.C., Brown, R.D. and Bures, M.G., Quantifying diversity, in Combinatorial Chemistry and Molecular Diversity in Drug Discovery, pp , Gordon, E.M. and Kerwin Jr., J.F., editors, Wiley (1998).
[[ii]] Cramer, R., Patterson, D., Clark, R., Soltanshahi, F. and Lawless, M., Virtual Compound Libraries: A New Approach to Decision Making in Molecular Discovery Research, J. Chem. Inf. Comput. Sci., 38, (1998).
17
“Neighbourhood principle” Analysis
[Figure: does a neighbourhood in descriptor space imply similar activity values? Axes: descriptor, activity]
It depends on the relationship between the descriptors and the activity!
18
“Neighbourhood principle” Lessons
- To apply the “neighbourhood principle”, the TYPE of the relationship between descriptor and activity should be known;
- The “neighbourhood principle” holds only if the relationship is LINEAR;
- The linear relationship is only a simple special case, given the complexity of biochemical interactions. Its use SHOULD BE JUSTIFIED in every specific case.
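The dependence on the type of relationship can be illustrated with a toy calculation. Everything below is hypothetical: a single descriptor `x` and three invented descriptor-activity relationships (linear, parabolic with an optimum, and a steep threshold). It is a sketch of the argument, not an analysis taken from the presentation.

```python
import math


def linear_activity(x):
    return 1.0 * x


def parabolic_activity(x):
    # Activity with an optimum at x = 0.
    return 4.0 - x * x


def threshold_activity(x):
    # Steep sigmoid: activity switches on around x = 1.
    return 1.0 / (1.0 + math.exp(-100.0 * (x - 1.0)))


def activity_gap(model, x1, x2):
    return abs(model(x1) - model(x2))


near_pair = (0.95, 1.05)  # neighbours in descriptor space
far_pair = (-2.0, 2.0)    # far apart in descriptor space

models = {
    "linear": linear_activity,
    "parabolic": parabolic_activity,
    "threshold": threshold_activity,
}
gaps = {
    name: {"near": activity_gap(f, *near_pair), "far": activity_gap(f, *far_pair)}
    for name, f in models.items()
}
for name, gap in gaps.items():
    print(name, gap)
```

In the linear case the activity gap simply tracks the descriptor gap. The parabolic case gives two distant compounds identical activity, and the threshold case gives two neighbours almost opposite activity, so neither half of the similarity principle can be taken for granted without knowing the relationship.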
19
SAR vs QSAR
- SAR is based on the “similarity” principle;
- The principle is assumed, but in reality it is not always true:
  - Similarity of structures
  - Similarity of descriptors
- Whether it holds depends on the type of the relationship between descriptors (the numerical representation of chemicals) and activity;
- The type of the relationship should be known (or derived).
20
SAR vs. QSAR: how could we say there is a difference?
Three things in common so far:
- Both methods use a numerical representation of chemical compounds;
- Both methods need to decide which representation to use;
- Both methods need to derive the relationship between the numerical representation (descriptors, etc.) and activity.
21
Thank you!
“When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind: it may be the beginning of knowledge, but you have scarcely advanced to the stage of science.”
William Thomson, Lord Kelvin