
1 © Oellien, Ihlenfeldt, Engel, Ertl C3C3 MMWS 2002
Interactive Datamining of Large-Scale Screening Datasets
Klaus Engel, Thomas Ertl: Visualization and Interactive Systems Group, University of Stuttgart
Frank Oellien, Wolf D. Ihlenfeldt: Computer-Chemie-Centrum, University of Erlangen-Nuremberg

2 Overview
- Multi-variate and multi-dimensional datasets
- Motivation
- Information Visualization Techniques
- Examples (ChemCodes Inc., NCI)
- Demo

3 Overview
- Multi-variate and multi-dimensional datasets
- Motivation
- Information Visualization Techniques
- Examples (ChemCodes Inc., NCI)
- Demo

4 Chemical data: current datasets
[Bar chart, axis from 0 to 18,000,000: Merck Katalog, Synopsys PG, ACX, NCI DTP, ChemInform, Spresi, Beilstein, CAS]

5 Multi-Variate and Multi-Dimensional Numeric Datasets Today
- Change in chemical synthesis technology: new technologies (HTS, combinatorial synthesis)
  => experiments generate terabytes of data per year
- Development of data mining and visualization tools could not keep pace: most critical bottleneck in R&D today!
  => tools for interactive mining and information visualization are needed

6 Tools for Interactive Visualization of Multi-Variate and Multi-Dimensional Data
Standard applications:
- barchart, 2D and pseudo 3D scatter plots, molecular spreadsheets
- limited to small subsets
- platform-dependent
Our goal:
- applications that are simple to use
- allow straightforward interpretation of results
- generalized access to tabular numeric data
- platform-independent

7 Overview
- Multi-variate and multi-dimensional datasets
- Motivation
- Information Visualization Techniques
- Examples (ChemCodes Inc., NCI)
- Demo

8 3D Tools for Interactive Information Visualization
Information Visualization applications that use the 3D capabilities of modern clients:
- Glyph-based InfVis approaches
- Volume-based InfVis approaches

9 Glyph-based InfVis Tools
- 3 orthogonal axes
- color
- shape
- size
- transparency
- surface effects
- animation
- up to ~100 Glyphs
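
The glyph attributes listed on slide 9 can be read as a mapping from table columns to visual channels. The following is a minimal sketch of that idea in Python, not the authors' Java3D applet code: the `Glyph` class, the `make_glyphs` function, the attribute names and the blue-to-red color ramp are all assumptions made for illustration. It uses six channels (three axes, color, size, transparency), so each glyph carries six variables of one record.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """Hypothetical glyph record; the field names are illustrative only."""
    x: float
    y: float
    z: float
    color: tuple   # (r, g, b), each component in [0, 1]
    size: float
    alpha: float   # 0 = fully transparent, 1 = opaque


def normalize(value, low, high):
    """Scale value linearly into [0, 1]; a constant column maps to 0.5."""
    if high == low:
        return 0.5
    return (value - low) / (high - low)


def make_glyphs(rows, mapping):
    """Turn tabular records into glyphs.

    rows    : list of dicts holding numeric values
    mapping : dict from glyph attribute ('x', 'y', 'z', 'color', 'size',
              'alpha') to the name of the column that drives it
    """
    # Per-column value range, needed to normalize each channel.
    ranges = {
        col: (min(r[col] for r in rows), max(r[col] for r in rows))
        for col in mapping.values()
    }
    glyphs = []
    for r in rows:
        n = {attr: normalize(r[col], *ranges[col]) for attr, col in mapping.items()}
        glyphs.append(Glyph(
            x=n["x"], y=n["y"], z=n["z"],
            color=(n["color"], 0.0, 1.0 - n["color"]),  # blue (low) to red (high)
            size=0.2 + 0.8 * n["size"],                 # small values stay visible
            alpha=0.3 + 0.7 * n["alpha"],               # never fully transparent
        ))
    return glyphs
```

Shape, surface effects and animation would be further channels of the same kind; they are left out here to keep the sketch short.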

10 Java/Java3D InfVis Applet
- Tool Panel (filters, selection tools, details)
- Java3D Canvas
- Control Panel

11 Java/Java3D InfVis Applet: 3D Render Panel
- 3D Barchart
- 3D Glyphs

12 Java/Java3D InfVis Applet: 3D Tool Panel
- Dynamic Filter Tools
- Selection Tools
- Detail Tools
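
A dynamic filter of the kind named on slide 12 can be thought of as a set of range sliders, one per column, that decide which records stay visible. The sketch below is one possible reading of that idea, not the applet's actual implementation; `visibility_mask`, the column names and the toy values are hypothetical.

```python
def visibility_mask(rows, ranges):
    """Return one boolean per row: True if the row passes every range filter.

    rows   : list of dicts holding numeric values
    ranges : dict mapping a column name to an inclusive (low, high) interval
    """
    return [
        all(low <= row[col] <= high for col, (low, high) in ranges.items())
        for row in rows
    ]


# Hypothetical toy records; the column names and values are made up.
rows = [
    {"temperature": 20, "time": 1.0, "yield": 40.0},
    {"temperature": 60, "time": 2.0, "yield": 96.5},
    {"temperature": 80, "time": 0.5, "yield": 97.0},
]

# Two sliders: keep high-yield runs at moderate temperature.
mask = visibility_mask(rows, {"yield": (95.0, 100.0), "temperature": (0, 70)})
print(mask)  # [False, True, False]
```

Returning a mask instead of a filtered copy lets a renderer hide or fade glyphs without rebuilding the scene each time a slider moves.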

13 Java/Java3D InfVis Applet: 3D Control Panel

14 Advantages of Volume-based InfVis Tools
Databases with millions of data points:
- Glyph-based InfVis approaches produce millions of geometric primitives; interactive visualization is not possible
- Volume-based InfVis approaches can handle a large number of data points; interactive visualization using low-cost graphics hardware is possible
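
One way to see why the volume-based approach scales is to accumulate the data points into a fixed-size 3D grid: the amount of data handed to the renderer then depends on the grid resolution, not on the number of records. The sketch below illustrates that general binning idea under stated assumptions (coordinates already normalized to [0, 1], a plain count per cell); `bin_to_volume` is a hypothetical helper and is not taken from the presentation.

```python
def bin_to_volume(points, resolution):
    """Count 3D points per cell of a resolution x resolution x resolution grid.

    points     : iterable of (x, y, z) tuples with coordinates in [0, 1]
    resolution : number of cells along each axis
    """
    n = resolution
    volume = [[[0] * n for _ in range(n)] for _ in range(n)]
    for x, y, z in points:
        # Clamp so that a coordinate of exactly 1.0 lands in the last cell.
        i = min(int(x * n), n - 1)
        j = min(int(y * n), n - 1)
        k = min(int(z * n), n - 1)
        volume[i][j][k] += 1
    return volume


# 1001 points collapse into a 4 x 4 x 4 grid of 64 counts.
points = [(0.1, 0.1, 0.1)] * 1000 + [(0.9, 0.9, 0.9)]
volume = bin_to_volume(points, 4)
print(volume[0][0][0], volume[3][3][3])  # 1000 1
```

A glyph renderer would have to draw one primitive per point here, whereas the grid stays at 64 cells no matter how many points are added.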

15 Overview
- Multi-variate and multi-dimensional datasets
- Motivation
- Information Visualization Techniques
- Examples (ChemCodes Inc., NCI)
- Demo

16 ChemCodes Reaction Database
- 100 most important FGs
- ~75% chemistry
- 100 standard reactions
- Limits of standard reactions
- Functional Group Compatibility
- Generating Rules
- Goal: Analysis of the reaction space

17 ChemCodes - Reaction Optimization I
- Goal: Reaction Optimization: > 95% Yield
- 7 Dimensions: reagent, solvent, time, temperature, stoichiometry, reagent order, FG-compatibility

18 ChemCodes - Reaction Optimization II

19 ChemCodes - Reaction Planning: Functional Group Compatibility Check

20 Example 2: NCI Anti-tumor / Anti-viral Database
- Initiated in April 1990 (modified 1994)
- ~ 250,000 compounds
- ~ 30,000 with anti-tumor screening data
Enhanced NCI Database Browser:
- > 30 different molecular properties
- up to 23 3D conformers per compound

21 Lead Compound Discovery II

22 Lead Compound Discovery II

23 Overview
- Multi-variate and multi-dimensional datasets
- Motivation
- Information Visualization Techniques
- Examples (ChemCodes Inc., NCI)
- Demo

24 Acknowledgment
- Prof. Johann Gasteiger, Computer-Chemie-Centrum, University of Erlangen-Nuremberg
- Prof. Thomas Ertl, Dipl. Inf. Klaus Engel, Visualization and Interactive Systems, University of Stuttgart
- Dr. Patrick Kiser, Dr. Gary Eichenbaum, ChemCodes Inc.
- Marc Nicklaus, Laboratory of Medicinal Chemistry, NCI, NIH
- Deutsche Forschungsgemeinschaft

