[vermeer]slides/IR/DataMining.ppt © Gasteiger et al. C3C3 Data Mining in Chemistry Markus C. Hemmer Computer-Chemie-Centrum, Universität Erlangen-Nürnberg.


1 Data Mining in Chemistry. Markus C. Hemmer, Computer-Chemie-Centrum, Universität Erlangen-Nürnberg, D-91054 Erlangen, Germany

2 What is Data Mining? Data mining is an analytical process designed to explore large amounts of data in search of consistent patterns and systematic relationships. "...a non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data" (Srikant, Agrawal, 1996)

3 Amount of Information in Chemistry. [Figure: yearly number of documents in Chemical Abstracts, in thousands (axis 100 to 800), 1920 to 2000; number of registered substances, in millions (axis 4 to 24), 1970 to 2000]

4 The Chemical Language. Molecular formula: C10H13Cl2O3PS. Common name: Dichlophenthion. Systematic name: Phosphorothioic acid O-2,4-dichlorophenyl O,O-diethyl ester. SMILES: ClC(C(=C1)OP(=S)(OCC)OCC)=CC(=C1)Cl
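
The slide above shows one compound written in several notations. As a minimal sketch of how two of them can be cross-checked, the Python below parses the molecular formula into element counts and counts the heavy (non-hydrogen) atoms in the SMILES string. The function names `parse_formula` and `smiles_heavy_atoms` are hypothetical and not part of any CCC software, and the SMILES handling is deliberately naive: it assumes uppercase atom symbols with implicit hydrogens, and no bracket atoms or aromatic lowercase symbols.

```python
import re
from collections import Counter

# Element symbols this sketch recognises. Two-letter symbols come first so
# that "Cl" is matched as chlorine rather than as carbon plus a stray "l".
ELEMENTS = ["Cl", "Br", "C", "N", "O", "P", "S", "F", "I"]
_ATOM_RE = re.compile("|".join(ELEMENTS))


def parse_formula(formula: str) -> Counter:
    """Turn a molecular formula such as 'C10H13Cl2O3PS' into element counts."""
    counts = Counter()
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] += int(number) if number else 1
    return counts


def smiles_heavy_atoms(smiles: str) -> Counter:
    """Count heavy atoms in a simple SMILES string (assumption: no brackets,
    no lowercase aromatic atoms, hydrogens implicit)."""
    return Counter(_ATOM_RE.findall(smiles))


formula_counts = parse_formula("C10H13Cl2O3PS")
smiles_counts = smiles_heavy_atoms("ClC(C(=C1)OP(=S)(OCC)OCC)=CC(=C1)Cl")

# Hydrogens are implicit in the SMILES, so compare heavy atoms only.
heavy_from_formula = {el: n for el, n in formula_counts.items() if el != "H"}
print(heavy_from_formula == dict(smiles_counts))
```

A real cheminformatics toolkit would build the full molecular graph instead of counting symbols; the point here is only that the formula and the line notation describe the same atom inventory.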

5 Search for Cancerostatic Drugs. [Figure panels: similar substrates; protein/substrate complex]

6 Representation of Properties: chemical reactivity, biological activity

7 Non-linear Projection onto a Torus
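
The slide gives only the title, so the following is an illustrative sketch and not the method used in the presentation. It assumes the projection is a two-dimensional map whose edges wrap around (left to right, top to bottom), which is what makes the surface a torus. Under that assumption, the grid distance between two map cells has to take the shorter way around each axis. The function name `torus_grid_distance` and the choice of the Chebyshev metric are hypothetical.

```python
def torus_grid_distance(a, b, width, height):
    """Chebyshev distance between grid cells a and b on a wrap-around map.

    a and b are (column, row) tuples; width and height are the map size.
    """
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    # On a torus each axis is a ring: going the other way round may be shorter.
    dx = min(dx, width - dx)
    dy = min(dy, height - dy)
    return max(dx, dy)


# Opposite corners of a 10 x 10 map are direct neighbours on a torus.
print(torus_grid_distance((0, 0), (9, 9), 10, 10))
```

On a flat map the two corner cells would be nine steps apart; the wrap-around removes the edge effects that otherwise distort neighbourhoods near the border.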

8 Comparison of Steroid Surfaces: 3,20-allopregnanedione and 3,20-pregnanedione

9 Descriptor of a Polycyclic System

10 Visualization of Multidimensional Data

11 Research and Projects at the CCC: TeleSpec; Evaluation of Reactions; Drug Design; Synthesis Design; Structure/Spectrum Correlation; Dissertation online; SOL; Biochemical Pathways; ChemVis; QSAR/QSPR; VS-C

12 Software Development at the CCC: CORINA (3D structure generator); PETRA (atomic property calculator); ARC (descriptor generator); KMAP (Kohonen network generator); CACTVS (chemical information system); EROS (reaction prediction expert system); CORA (reaction classification system); WODCA (synthesis design expert system)

13 Data Mining Dienst – Chemie (Data Mining Service – Chemistry): Pattern Recognition; Substructure Search; Similarity Search; Diversity Search; Pattern Analysis; Property Search
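
The slide lists similarity search without saying how similarity is measured. As a hedged illustration only, the sketch below uses the Tanimoto coefficient on sets of structural features, a common choice in chemical similarity searching but an assumption here. The names `tanimoto` and `similarity_search`, the compound labels, and the feature numbers are all hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared features divided by all features."""
    if not a and not b:
        return 1.0  # two empty feature sets are treated as identical
    return len(a & b) / len(a | b)


def similarity_search(query: set, library: dict, threshold: float) -> list:
    """Return (name, score) pairs scoring at least threshold, best first."""
    scored = [(name, tanimoto(query, features))
              for name, features in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Made-up feature sets standing in for structural fingerprints.
library = {
    "compound_A": {1, 2, 3, 4},
    "compound_B": {1, 2, 5},
    "compound_C": {7, 8},
}
print(similarity_search({1, 2, 3, 4}, library, threshold=0.4))
```

Lowering the threshold widens the hit list; a diversity search can be seen as the opposite goal, selecting compounds whose mutual scores are low.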

14 Information Sources: Simulation; Analysis; Databases; Calculation

15 The Concept of Data Mining Service - Chemistry

16 Descriptor Software

17 Searching a Substructure. [Figure: substructure search]

18 Acknowledgements. Chemical Information: Dr. Thomas Engel. Databases & Visualization: Dr. Wolf-Dietrich Ihlenfeldt, Frank Oellien. Expert Systems: Achim Herwig. Genetic Algorithms: Dr. Sandra Handschuh. Neural Networks: Dr. Andreas Teckentrup, Dr. Lothar Terfloth. Spectroscopy: Dr. Paul Selzer, Thomas Kostka. Structures & Properties: Thomas Kleinöder, Christof Schwab. Structure Coding: Dr. Joao Aires de Sousa, Dr. Valentin Steinhauer. Synthesis Planning: Dr. Matthias Pförtner, Markus Sitzmann. Team Coordination: Prof. Dr. Johann Gasteiger

19 Contact Information. Email: Johann.Gasteiger@ccc.chemie.uni-erlangen.de, Markus.Hemmer@chemie.uni-erlangen.de. WWW: http://www2.chemie.uni-erlangen.de

