1
Toxicity vs. chemical space
2
Can we predict toxicity of a chemical compound?
To predict the toxicity of a compound A, we assume there exists another compound B with known toxicity and properties similar to those of A.
Based on the similarity between A and B, we expect: toxicity of A = toxicity of B + a small difference.
The result depends on the definition of similarity, and there are many similarity coefficients.
Structure-based similarity searches (SAR, QSAR) usually rest on a trivial assumption: a tiny change in the chemical gives a tiny change in activity.
In general, they fail to give the small difference. Activity (for example, toxicity) is regarded as following a non-smooth trend, known as non-smooth chemical space.
There is no rule for where to put a chemical: how do we know the right position of a chemical in the space?
Mendeleev solved the positioning of the chemical elements.
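As an illustration of one such similarity coefficient, the sketch below computes the Tanimoto (Jaccard) coefficient between two compounds described as sets of structural features. The feature names and the two compounds are hypothetical, and the slides do not say which coefficient the authors prefer at this point.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        # Convention chosen for this sketch: two empty feature sets count as identical.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical feature sets for compounds A and B (illustration only).
compound_a = {"aromatic_ring", "hydroxyl", "methyl"}
compound_b = {"aromatic_ring", "hydroxyl", "chloro"}

# 2 shared features out of 4 distinct features overall.
similarity = tanimoto(compound_a, compound_b)
```

A different coefficient applied to the same two feature sets would generally give a different value, which is why the predicted "small difference" depends on the definition of similarity.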
3
How to position a chemical? An example
Reymond & Awale (2012) proposed a 42-dimensional space in which to position all known and unknown small molecules (chemicals).
Every chemical is assigned 42 integer-valued molecular quantum numbers (MQN) and positioned on a 42-dimensional grid in R^42 (each grid point can be treated as a bin in R^42).
They conducted principal component analysis (PCA) to extract the most relevant dimensions (in the form of principal components, PCs) and to represent projections of the chemical space in a PC-plane.
For most databases, more than 70% of the variance of MQN-space is covered by PC1, PC2 and PC3.
MQN-maps are derived from projections in the PC-planes and give an overview of a database's chemical space.
MQN-space was in line with already defined groups, with a proper direction.
Groups of related bioactive compounds often cluster together on MQN-maps.
MQN-similarity does not select for substructure similarity, so it can reveal non-trivial relationships between actives.
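To make the "variance covered by a principal component" statement concrete, here is a minimal sketch that estimates the share of total variance explained by PC1 for a small table of integer descriptors. It uses plain power iteration on the covariance matrix rather than a numerical library, and the 3-dimensional toy data stand in for real 42-dimensional MQN vectors; both choices are assumptions made for illustration, not the procedure of Reymond & Awale.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length descriptor vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def leading_component(cov, iterations=200):
    """Largest eigenvalue and eigenvector of cov, found by power iteration."""
    d = len(cov)
    # Non-uniform start vector; it must not be orthogonal to the leading eigenvector.
    v = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return eigenvalue, v


# Toy integer descriptors (3 dimensions instead of 42), purely illustrative.
toy_descriptors = [[1, 2, 0], [2, 4, 1], [3, 6, 0], [4, 8, 1], [5, 10, 0]]

cov = covariance_matrix(toy_descriptors)
pc1_variance, pc1_direction = leading_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
pc1_share = pc1_variance / total_variance
```

Summing the corresponding shares for PC1, PC2 and PC3 gives the kind of figure quoted above (more than 70% for most databases); in practice a library PCA routine would normally replace the hand-written power iteration.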
4
Does position matter?
A figure from Nguyen et al. (2009)
5
A drawback of MQN-space
MQN-space (1 billion compounds): known/synthesized, <0.01%; unknown/generated, >99.9%
Table: ZINC database (21M) vs. GDB-11 database (26.4M), compared by:
no. of compounds
no. of MQN-bins
no. of singly occupied MQN-bins
no. of compounds in the most occupied MQN-bin
no. of shared MQN-bins
no. of compounds in shared MQN-bins
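The occupancy quantities listed above can be computed directly from a list of MQN vectors. The sketch below does this for a toy list of 3-dimensional integer vectors; it assumes that a "shared" bin means a bin holding more than one compound, which is an interpretation of the slide rather than a confirmed definition, and the function name `bin_statistics` is hypothetical.

```python
from collections import Counter


def bin_statistics(mqn_vectors):
    """Occupancy statistics for compounds placed on an integer grid of bins."""
    occupancy = Counter(tuple(v) for v in mqn_vectors)
    # Assumption: a "shared" bin is one that holds more than one compound.
    shared = {b: n for b, n in occupancy.items() if n > 1}
    return {
        "compounds": len(mqn_vectors),
        "bins": len(occupancy),
        "singly_occupied_bins": sum(1 for n in occupancy.values() if n == 1),
        "max_bin_occupancy": max(occupancy.values(), default=0),
        "shared_bins": len(shared),
        "compounds_in_shared_bins": sum(shared.values()),
    }


# Toy data: six compounds, 3 descriptor values each (real MQN vectors have 42).
toy_vectors = [(1, 0, 2), (1, 0, 2), (3, 1, 0), (0, 0, 5), (1, 0, 2), (3, 1, 0)]
stats = bin_statistics(toy_vectors)
```

For the toy data, three compounds fall in one bin and two in another, so five of the six compounds cannot be told apart from at least one other compound by grid position alone.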
6
A need to improve MQN-space
Do compounds in shared MQN-bins have the same or similar biological activity (for example, toxicity)?
If yes, MQN-space works perfectly, but this is probably not the case (with some probability).
If no, we need to distinguish between two compounds in the same bin.
Re-position them in the MQN-space, using structural similarity, without disturbing the positions of other chemicals.
7
Our proposal
Each compound in a bin in MQN-space has 42 neighbours.
For simplicity, suppose we have two compounds in a particular bin.
We know that one compound will definitely stay at the grid point, but the other (X) must shift.
We know that all 42 neighbours of X will definitely stay at their grid points.
We compute the structure-based distance (SD) of X from all 42 neighbours, for example using the Soergel distance based on 2D structural similarity (0 < SD < 1).
Choose the direction with the minimum SD, and shift X in this direction by a shift length of SD.
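The shifting step can be sketched as follows, under several assumptions that the slides do not spell out: compounds are described by sets of 2D structural features, the Soergel distance on such binary features is taken as 1 minus the Tanimoto coefficient, each neighbour sits one grid step away from X, and ties are broken by taking the first minimum. The function names and the 3-dimensional toy grid (instead of 42 dimensions) are hypothetical.

```python
def soergel_distance(a: set, b: set) -> float:
    """Soergel distance for binary features: 1 - |a & b| / |a | b|."""
    union = a | b
    if not union:
        return 0.0  # convention for this sketch: two empty feature sets coincide
    return 1.0 - len(a & b) / len(union)


def shift_compound(grid_point, x_features, neighbours):
    """Shift X from grid_point towards its structurally closest neighbour.

    neighbours is a list of (neighbour_grid_point, neighbour_features) pairs.
    Returns the new (non-integer) position of X and the SD that was used.
    """
    best_point, best_sd = min(
        ((pos, soergel_distance(x_features, feats)) for pos, feats in neighbours),
        key=lambda item: item[1],
    )
    # Move a fraction SD of the way from the bin towards the chosen neighbour.
    new_position = tuple(
        g + best_sd * (p - g) for g, p in zip(grid_point, best_point)
    )
    return new_position, best_sd


# Toy example on a 3-dimensional grid with hypothetical feature sets.
x_bin = (2, 3, 1)
x_features = {"a", "b", "d"}
toy_neighbours = [
    ((3, 3, 1), {"a", "b", "c"}),  # SD = 1 - 2/4
    ((2, 4, 1), {"a", "d"}),       # SD = 1 - 2/3
    ((2, 3, 2), {"e"}),            # SD = 1 - 0/4
]
new_position, sd = shift_compound(x_bin, x_features, toy_neighbours)
```

Because 0 < SD < 1, the shifted compound lands strictly between its original grid point and the chosen neighbour, so no other chemical's position is disturbed.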
8
Thank you!
Further directions:
Defining fixed points
Checking consistency with existing mathematical spaces (e.g. R^42)