Published by Mabel Wright. Modified over 9 years ago.
1
[vermeer]slides/IR/DataMining.ppt © Gasteiger et al. C3C3
Data Mining in Chemistry
Markus C. Hemmer
Computer-Chemie-Centrum, Universität Erlangen-Nürnberg, D-91054 Erlangen, Germany
2
What is Data Mining?
Data Mining is an analytical process designed to explore large amounts of data in search of consistent patterns and systematic relationships.
"...a non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data" (Srikant, Agrawal, 1996)
3
Amount of Information in Chemistry
Figure: yearly number of documents in Chemical Abstracts (thousands, axis 100 to 800) for the years 1920 to 2000; number of registered substances (millions, axis 4 to 24) for the years 1970 to 2000.
4
The Chemical Language
Molecular formula: C10H13Cl2O3PS
Trivial name: Dichlophenthion
Systematic name: Phosphorothioic acid O-2,4-dichlorophenyl O,O-diethyl ester
SMILES: ClC(C(=C1)OP(=S)(OCC)OCC)=CC(=C1)Cl
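The slide shows one compound written in several notations. As a small sketch of how two of them can be cross-checked by machine, the Python below parses the molecular formula, counts the non-hydrogen atoms in the SMILES string, and sums a molar mass. The function names and the rounded atomic-weight table are illustrative assumptions, not part of the presentation, and the SMILES handling covers only simple strings without bracket atoms.

```python
import re
from collections import Counter

# Rounded standard atomic weights; only the elements needed here (assumption).
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "Cl": 35.45,
                 "O": 15.999, "P": 30.974, "S": 32.06}


def parse_formula(formula: str) -> Counter:
    """Count atoms in a flat formula such as 'C10H13Cl2O3PS' (no brackets)."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def heavy_atoms_in_smiles(smiles: str) -> Counter:
    """Count non-hydrogen atoms in a simple SMILES string.

    Two-letter symbols are tried first so that 'Cl' is not read as 'C'.
    Bracket atoms, aromatic lowercase atoms and charges are not handled.
    """
    return Counter(re.findall(r"Cl|Br|[BCNOPSFI]", smiles))


def molar_mass(counts: Counter) -> float:
    """Sum atomic weights for the given atom counts."""
    return sum(ATOMIC_WEIGHT[symbol] * n for symbol, n in counts.items())


formula_counts = parse_formula("C10H13Cl2O3PS")
smiles_counts = heavy_atoms_in_smiles("ClC(C(=C1)OP(=S)(OCC)OCC)=CC(=C1)Cl")

# The SMILES string carries hydrogens implicitly, so compare heavy atoms only.
heavy_from_formula = Counter({s: n for s, n in formula_counts.items() if s != "H"})
print(heavy_from_formula == smiles_counts)
print(round(molar_mass(formula_counts), 2))
```

If the two notations describe the same compound, the heavy-atom counts agree; a mismatch would point to a transcription error in one of them.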
5
Search for Cancerostatic Drugs
Figure panels: similar substrates; protein/substrate complex
6
Representation of Properties
Figure labels: chemical reactivity; biological activity
7
Non-linear Projection onto a Torus
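The slide gives only its title. Assuming it refers to a Kohonen (self-organizing) map whose rectangular grid wraps around at the edges, so that the map surface is a torus, the following sketch shows the two ingredients that make the topology toroidal: a wrap-around grid distance and a neighbourhood update that uses it. All names, the Gaussian neighbourhood, and the learning parameters are assumptions for illustration, not the presenters' implementation.

```python
import math


def torus_distance(a, b, rows, cols):
    """Euclidean grid distance between neurons a and b on a wrap-around map."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    # Going "around the edge" may be shorter than going across the map.
    dr = min(dr, rows - dr)
    dc = min(dc, cols - dc)
    return math.hypot(dr, dc)


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda pos: sum((w - xi) ** 2 for w, xi in zip(weights[pos], x)),
    )


def train_step(weights, x, rows, cols, rate=0.5, radius=1.5):
    """Move the winner and its toroidal neighbours towards the input x."""
    winner = best_matching_unit(weights, x)
    for pos, w in weights.items():
        d = torus_distance(pos, winner, rows, cols)
        if d <= radius:
            h = rate * math.exp(-d * d / (2 * radius * radius))
            weights[pos] = [wi + h * (xi - wi) for wi, xi in zip(w, x)]
    return winner


# Toy 3x3 map with 2-dimensional weights (hypothetical data).
rows, cols = 3, 3
weights = {(r, c): [float(r), float(c)] for r in range(rows) for c in range(cols)}
winner = train_step(weights, [0.2, 0.1], rows, cols)
print(winner)
```

On a torus no neuron sits at a border, so every neuron has the same number of neighbours; in this toy map the neuron at (2, 0) is a direct neighbour of the winner at (0, 0) and is updated along with it.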
8
Comparison of Steroid Surfaces
3,20-Allopregnanedione; 3,20-Pregnanedione
9
Descriptor of a Polycyclic System
10
Visualization of Multidimensional Data
11
Research and Projects at the CCC
TeleSpec; Evaluation of Reactions; Drug Design; Synthesis Design; Structure/Spectrum Correlation; Dissertation online; SOL; Biochemical Pathways; ChemVis; QSAR/QSPR; VS-C
12
Software Development at the CCC
CORINA: 3D structure generator
PETRA: atomic property calculator
ARC: descriptor generator
KMAP: Kohonen network generator
CACTVS: chemical information system
EROS: reaction prediction expert system
CORA: reaction classification system
WODCA: synthesis design expert system
13
Data Mining Service – Chemistry (German: Data Mining Dienst – Chemie)
Pattern Recognition; Substructure Search; Similarity Search; Diversity Search; Pattern Analysis; Property Search
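The slide names similarity search and diversity search without saying how they are computed. One common approach, used here purely as an assumption, is to describe each structure by a set of fingerprint features and compare sets with the Tanimoto coefficient. The molecule names and feature numbers below are made up for illustration.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient of two feature sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0  # convention for two empty fingerprints
    return len(a & b) / len(a | b)


def similarity_search(query, library, threshold=0.5):
    """Return (name, score) pairs at or above the threshold, best first."""
    hits = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


def most_diverse_pair(library):
    """Return the pair of names with the lowest mutual similarity."""
    return min(
        combinations(library, 2),
        key=lambda pair: tanimoto(library[pair[0]], library[pair[1]]),
    )


# Hypothetical fingerprints: each integer stands for one structural feature.
library = {
    "mol_a": {1, 2, 3, 4},
    "mol_b": {1, 2, 3, 5},
    "mol_c": {7, 8, 9},
}

hits = similarity_search({1, 2, 3, 4}, library)
pair = most_diverse_pair(library)
print(hits)
print(pair)
```

Similarity search keeps the entries closest to the query, while diversity search looks for entries that are as unlike each other as possible; both reuse the same pairwise measure.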
14
Information Sources
Simulation; Analysis; Databases; Calculation
15
The Concept of Data Mining Service - Chemistry
16
Descriptor Software
17
Searching a Substructure
Figure label: substructure search
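The slide itself contains only a figure. As a minimal sketch of what a substructure search does, the code below treats molecules as graphs with element labels on the atoms and looks for a query graph inside a target graph by backtracking. It ignores bond orders, hydrogens, and stereochemistry, and the data layout and function names are assumptions for illustration; production systems use far more refined matching.

```python
def adjacency(atoms, bonds):
    """Build a neighbour table {atom: set of bonded atoms} from a bond list."""
    adj = {a: set() for a in atoms}
    for i, j in bonds:
        adj[i].add(j)
        adj[j].add(i)
    return adj


def find_substructure(q_atoms, q_bonds, m_atoms, m_bonds):
    """Return one mapping {query atom: molecule atom}, or None if no match.

    Atoms are dicts {index: element symbol}; bonds are lists of index pairs.
    """
    q_adj = adjacency(q_atoms, q_bonds)
    m_adj = adjacency(m_atoms, m_bonds)
    order = list(q_atoms)

    def extend(mapping):
        if len(mapping) == len(order):
            return dict(mapping)
        q = order[len(mapping)]
        for m in m_atoms:
            if m in mapping.values() or m_atoms[m] != q_atoms[q]:
                continue
            # Every already mapped neighbour of q must be bonded to m.
            if all(mapping[n] in m_adj[m] for n in q_adj[q] if n in mapping):
                mapping[q] = m
                result = extend(mapping)
                if result is not None:
                    return result
                del mapping[q]  # backtrack and try the next candidate
        return None

    return extend({})


# Toy target: the heavy-atom skeleton C-C-O (hypothetical example data).
mol_atoms = {0: "C", 1: "C", 2: "O"}
mol_bonds = [(0, 1), (1, 2)]

match = find_substructure({0: "C", 1: "O"}, [(0, 1)], mol_atoms, mol_bonds)
no_match = find_substructure({0: "C", 1: "N"}, [(0, 1)], mol_atoms, mol_bonds)
print(match)
print(no_match)
```

The first query (a carbon bonded to an oxygen) is found only after the search backtracks from the terminal carbon, which has no oxygen neighbour; the second query fails because the target has no nitrogen.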
18
Acknowledgements
Chemical Information: Dr. Thomas Engel
Databases & Visualization: Dr. Wolf-Dietrich Ihlenfeldt, Frank Oellien
Expert Systems: Achim Herwig
Genetic Algorithms: Dr. Sandra Handschuh
Neural Networks: Dr. Andreas Teckentrup, Dr. Lothar Terfloth
Spectroscopy: Dr. Paul Selzer, Thomas Kostka
Structures & Properties: Thomas Kleinöder, Christof Schwab
Structure Coding: Dr. Joao Aires de Sousa, Dr. Valentin Steinhauer
Synthesis Planning: Dr. Matthias Pförtner, Markus Sitzmann
Team Coordination: Prof. Dr. Johann Gasteiger
19
Contact Information
Email: Johann.Gasteiger@ccc.chemie.uni-erlangen.de, Markus.Hemmer@chemie.uni-erlangen.de
WWW: http://www2.chemie.uni-erlangen.de