P. Gramatica and F. Consolaro QSAR Research Unit, Dept. of Structural and Functional Biology, University of Insubria, Varese, Italy.


E-mail: gramati@imiucca.csi.unimi.it
Web-site: http://andromeda.varbio.unimi.it/~QSAR/

EEC PRIORITY LIST 1

The so-called “EEC Priority List 1” includes a large number of commercial chemicals dangerous to man and the environment. These chemicals, selected under Directive 76/464/EEC because of their environmental impact and diffusion, are very heterogeneous: they have unrelated structures, and for most of them the mechanism of action and type of effect are unknown. A final list of 202 compounds was obtained (by excluding chemicals that cannot be studied by our approach, e.g. inorganics and salts, and by adding some isomers of listed chemicals) and was studied for structural similarity.

DESCRIPTION OF MOLECULAR STRUCTURE

A wide set of molecular descriptors is used here to describe the chemical structure of these compounds, the aim being to find an objective method to group such compounds on the basis of their structural features. In particular we have used:

Count descriptors: the numbers of different kinds of atoms, functional groups, rings of different size, and H-bond acceptor and donor atoms.
Graph-invariant descriptors: topological and information indices that give information about the 2D molecular structure and connectivity. Molecular weight (MW) is always used.
WHIM descriptors (1-2): molecular indices that represent different sources of chemical information about the whole 3D molecular structure in terms of size, shape, symmetry and atom distribution.

(1) R. Todeschini and P. Gramatica, 3D-modelling and prediction by WHIM descriptors. Part 5. Theory development and chemical meaning of the WHIM descriptors, Quant. Struct.-Act. Relat., 16 (1997) 113-119.
(2) R. Todeschini, WHIM-3D/QSAR - Software for the calculation of the WHIM descriptors, rel. 4.1 for Windows, Talete srl, Milano (Italy) 1996. Download: http://www.disat.unimi.it/chm.

PREDICT PROJECT

This work is part of the EEC PREDICT (Prediction and Assessment of the Aquatic Toxicity of Mixtures of Chemicals) project. Its objective is to provide a suitable means for the early identification of environmental risks resulting from the combined effects of chemical mixtures, focusing specifically on aquatic pollutants of concern, such as the List 1 chemicals. One need of the PREDICT project is an objective method to group compounds according to their structural features alone, and then to identify the most representative compounds of each group. For this reason we applied a chemometric approach to the 202 studied compounds, our aim being to identify 15 to 20 different structural groups.

CHEMOMETRIC METHODS

Several chemometric analyses have been applied to the compounds (represented by molecular descriptors) to group the most similar ones, following a multivariate structural approach. The analyses performed are:

Hierarchical Cluster Analysis: hierarchical clustering was performed with the aim of finding clusters of the studied compounds in high-dimensional space, using molecular descriptors as variables. Different distance metrics (Euclidean, Manhattan, Pearson) and different linkages (complete, average, single, etc.) were used and compared to find the best way to cluster these compounds.
Principal Component Analysis (PCA): this analysis was used to calculate just a few components from a large number of variables. These components highlight the distribution of the compounds according to structure and show the similarity between compounds assigned to the same cluster.
Kohonen Maps: this is an additional way of mapping similar compounds by using the so-called “self-organized topological feature maps”, which are maps that preserve the topology of a multidimensional representation within a toroidal two-dimensional representation. The position of the compounds in this map shows the level of structural similarity among the List 1 compounds.

Dendrogram of hierarchical cluster analysis. Euclidean distance, complete linkage. Variables = first 10 structural principal components.

RANKING

The reported structural analyses allowed the identification of some groups of the most similar List 1 compounds. These groups reflect only the structural patterns of the molecules, without any evaluation of their activities. Since the aim of this work was to find the most representative compounds, with different structural features, among the 202 List 1 molecules, we have proposed (to Rolf Altenburger, our PREDICT project partner) some possible candidate compounds:

1) dieldrin [71] or endrin [77]
2) 2,4,5-trichlorophenol [122]
3) naphthalene [96]
4) phoxime [103]
5) biphenyl [11] or benzidine [8]
6)  or  -hexachlorocyclohexane [respectively 85 and 85b]
7) chloroacetic acid [16] or 1,3-dichloropropane-2-ol [66] or epichlorohydrins [78]
8) 2,4-D [45] or simazine [106] or atrazine [130suppl]
9) 1,1,2,2-tetrachloroethane [110] or 1,2-dichloroethane [59]
10) fluoranthene [99] or benzo(b)fluoranthene [99c]
11) one of the 17 PCB
12) triphenyltin chloride [126] or triphenyltin acetate [125]
13) tributyltin oxide [115]

DISCUSSION

The cluster analysis considered here was performed with Euclidean distance and complete linkage on the first 10 principal components of our molecular descriptors. These components explain about 84% of the total information on structural variability, so noise and less important information are excluded.
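The clustering step described above (Euclidean distance, complete linkage, cut into a chosen number of groups) can be sketched in a few lines. This is only an illustration under stated assumptions, not the software used for the poster: the `scores` array is synthetic stand-in data for principal-component scores, and the function names `euclidean` and `complete_linkage` are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(points, n_clusters):
    """Agglomerative clustering with complete linkage.

    At each step, merge the two clusters whose FARTHEST members are
    closest, until only n_clusters remain. Returns one label per point.
    """
    dist = [[euclidean(p, q) for q in points] for p in points]
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance = largest pairwise distance.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Synthetic stand-in for scores on 10 principal components:
# 3 well-separated groups of 10 "compounds" each (assumed data, not List 1).
rng = random.Random(0)
centers = [0.0, 10.0, 20.0]
scores = [
    [c + rng.gauss(0, 0.5) for _dim in range(10)]
    for c in centers
    for _point in range(10)
]

labels = complete_linkage(scores, n_clusters=3)
print(labels)
```

For real descriptor data one would more likely rely on a library routine, for example `scipy.cluster.hierarchy.linkage` with `method="complete"` and `metric="euclidean"`; the pure-Python loop here is kept only so that the merging rule is visible.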
Analogously, a PCA has been performed on all molecular descriptors, and the compounds belonging to different clusters have been highlighted with different colours. This analysis highlights the distribution of all compounds and the relationships between the different structural clusters identified. The same approach, with the first 10 principal components as input variables, was used to build a Kohonen map that shows the position of all 202 compounds according to their structural features. Here too, different colours were used to distinguish the different structural clusters.

Plot labels: Benzene derivatives (2); Chloroaliphatic compounds (7); DDT - PCBs (11); Organo-phosphates (12); Phen.-Triaz. (10); PAH (15); Chlorinated aliphatics (9)

2s/P005
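To make the idea of a toroidal Kohonen map concrete, here is a minimal sketch of a self-organizing map whose grid wraps around at the edges. Everything specific in it is an assumption for illustration: the 6 x 6 grid size, the learning-rate and neighbourhood schedules, the synthetic `data`, and the function names `toroidal_dist2`, `best_matching_unit` and `train_som` are not taken from the poster.

```python
import math
import random


def toroidal_dist2(a, b, rows, cols):
    """Squared grid distance between nodes a and b on a torus (edges wrap)."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    dr = min(dr, rows - dr)
    dc = min(dc, cols - dc)
    return dr * dr + dc * dc


def best_matching_unit(weights, x):
    """Grid node whose weight vector is closest to x (squared Euclidean)."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small Kohonen map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise every node with a copy of a randomly chosen sample.
    weights = {
        (r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    order = list(range(len(data)))
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # shrinking neighbourhood
        rng.shuffle(order)
        for i in order:
            x = data[i]
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Gaussian neighbourhood measured on the wrapped grid.
                h = math.exp(
                    -toroidal_dist2(node, bmu, rows, cols) / (2.0 * sigma * sigma)
                )
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Synthetic stand-in for 10 principal-component scores per compound.
rng = random.Random(1)
data = [
    [c + rng.gauss(0, 0.5) for _dim in range(10)]
    for c in (0.0, 10.0, 20.0)
    for _point in range(10)
]

som = train_som(data, rows=6, cols=6)
positions = [best_matching_unit(som, x) for x in data]
print(positions)
```

Each entry of `positions` is the map cell assigned to one sample; plotting those cells and colouring them by cluster label gives the kind of map described above. Because the grid wraps around, cells on opposite edges count as neighbours, which avoids border effects.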

