Published by Kory Craig. Modified over 9 years ago.
1
Structural Browsing Indices, Spotfire and Drug Discovery
Mark Johnson (1) and Yong-jin Xu (2)
(1) Pannanugget Consulting; (2) Pharmacia, Inc.
Spotfire Users Conference, Philadelphia, May 2001
mark@pannanugget.com
© 2001, Pannanugget and Pharmacia, Inc. - All Rights Reserved.
2
Pulling Nuggets out of the Avalanche of Data
  High-throughput screening
  Larger project teams
  Combinatorial chemistry
  Microarray technology
  External databases
  Internal databases
  Predicted parameters
  Data mergers & collaborations
4
A Low-Content View of the ACD Based on the Number of Atoms and Colored by the Number of Cyclic-System Hetero Atoms
5
A High-Content Cyclic-System View of the ACD
6
The Distinction between Low and High-Content Views is Gained or Lost in the First Step of Data Visualization
  Stages: Raw data, Data tables, Visual structures, Views
  Transformations between stages: Data transformations, Visual mappings, View transformations
  Diagram labels: Set of complex objects; Imposed space of points; Scatter plot or histogram; Integrated Visualization
  Source: Card, Mackinlay, & Shneiderman, “Readings in Information Visualization”, p. 17
7
Viewing High-Dimensional Binary-Vector Spaces in Spotfire using Keyword-List Variables
  Compound number: complex object identifier
  Functional-group list: single high-content keyword-list variable
  Acid, Amide, Ketone, Sulfide, Amine: 5-dimensional vector of 5 low-content variables
  Compound number  Functional-group list  Acid  Amide  Ketone  Sulfide  Amine
  1                ketone                 0     0      1       0        0
  2                sulfide                0     0      0       1        0
  3                amide amine            0     1      0       0        1
  4                acid                   1     0      0       0        0
  5                acid amine             1     0      0       0        1
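The relationship between the single keyword-list variable and the five indicator variables on this slide can be sketched in a few lines of Python. This is only an illustration: the data values are the ones in the slide's table, while the names `keyword_lists`, `KEYWORDS`, and `expand` are assumptions made for the example and say nothing about how Spotfire handles keyword lists internally.

```python
# Compound number -> functional-group keyword list (values from the slide's table).
keyword_lists = {
    1: "ketone",
    2: "sulfide",
    3: "amide amine",
    4: "acid",
    5: "acid amine",
}

# Column order of the five low-content indicator variables on the slide.
KEYWORDS = ["acid", "amide", "ketone", "sulfide", "amine"]


def expand(keyword_list, keywords=KEYWORDS):
    """Turn one space-separated keyword list into a 0/1 indicator vector."""
    present = set(keyword_list.split())
    return [1 if keyword in present else 0 for keyword in keywords]


# One binary vector per compound, in the same column order as KEYWORDS.
binary_table = {cid: expand(kl) for cid, kl in keyword_lists.items()}

for cid, vector in binary_table.items():
    print(cid, keyword_lists[cid], vector)
```

The point of the sketch is that the keyword list and the binary vector carry the same information; the keyword list simply packs it into one column that a viewer can read directly.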
8
The Three Ways of Organizing Molecular Structures
  Substructure Partial Orderings
  Molecular Similarity Spaces
  Structural Browsing Indices
9
Questions Associated with Substructure Partial Orderings
  What structures contain a particular substructure?
  What structures contain a particular generic substructure?
  The basic spatial representation is a single indicator variable containing 1 for those structures satisfying the request and 0 otherwise.
10
Questions Associated with Molecular Similarity Spaces
  What structures are similar to a particular structure?
  How diverse is a collection of structures?
  Will a marketed collection of structures add significant diversity to our collection?
  How much do two collections overlap?
  The basic spatial representation is a high-dimensional table of low-content variables and/or possibly a matrix of pairwise similarities (for collections of 100 or fewer).
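A matrix of pairwise similarities can be sketched as follows. The slides do not say which similarity measure is meant; the Tanimoto (Jaccard) coefficient on keyword sets is used here purely as an assumed, commonly used stand-in, and the five compounds are the ones from the functional-group table on slide 7. The names `fingerprints`, `tanimoto`, and `similarity` are hypothetical.

```python
from itertools import combinations

# Functional-group sets for the five compounds on slide 7.
fingerprints = {
    1: {"ketone"},
    2: {"sulfide"},
    3: {"amide", "amine"},
    4: {"acid"},
    5: {"acid", "amine"},
}


def tanimoto(a, b):
    """Assumed measure: shared keywords divided by all keywords in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Upper triangle of the pairwise similarity matrix, keyed by (i, j) with i < j.
similarity = {
    (i, j): tanimoto(fingerprints[i], fingerprints[j])
    for i, j in combinations(sorted(fingerprints), 2)
}

for pair, value in similarity.items():
    print(pair, round(value, 2))
```

The number of pairs grows quadratically with the collection size (n(n-1)/2 pairs for n structures), which is one plausible reason the slide restricts this representation to small collections.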
13
Questions Associated with Structural Browsing Indices
  Which structural classes are represented in a collection?
  Where is the overlap in two collections?
  Which classes of structures turned up active in a high-throughput screening program?
  What templates and positions have been investigated in a lead-optimization program and which are critical?
  The basic spatial representation is a small number of high-content variables with locally-related values.
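The first two questions can be illustrated with a single class-valued variable per compound. The sketch below is a toy under stated assumptions: the compound identifiers and ring-system labels are invented for illustration, and a real browsing index would be computed from the structures themselves rather than typed in by hand.

```python
from collections import Counter

# Hypothetical collections: compound id -> ring-system class label (invented data).
collection_a = {"A1": "benzene", "A2": "pyridine", "A3": "benzene", "A4": "indole"}
collection_b = {"B1": "pyridine", "B2": "piperidine", "B3": "indole", "B4": "indole"}


def class_counts(collection):
    """Count how many compounds fall into each structural class."""
    return Counter(collection.values())


counts_a = class_counts(collection_a)
counts_b = class_counts(collection_b)

# Classes present in both collections, and classes unique to each one.
shared = sorted(set(counts_a) & set(counts_b))
only_a = sorted(set(counts_a) - set(counts_b))
only_b = sorted(set(counts_b) - set(counts_a))

print("classes in A:", dict(counts_a))
print("classes in B:", dict(counts_b))
print("shared:", shared, "only A:", only_a, "only B:", only_b)
```

Because each compound carries a readable class label instead of a row of 0/1 flags, the overlap question reduces to comparing two sets of labels.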
14
Demos
  Exploring the ACD with a cyclic-system ordering.
  Browsing lead-optimization synthetic efforts using a cyclic-system ordering and a side-chain ordering.
  Showing ACE inhibitors are aggregated in a maximal functional-group space.
15
Some Types of Keyword-List Variables in Drug Discovery
  Discipline       Keyword lists
  Cheminformatics  Functional groups; Ring systems; Side chains; Similar compounds; Activity profiles
  Toxicology       Toxicity profile; Metabolic reactions
  Genomics         Molecular function; Biological process; Cellular component; Similar genes or proteins
16
Structural Browsing Indices: What, if anything besides the name, is new?
  Ring systems, functional groups, side chains are almost as old as chemistry.
  Adamson in the early 70s perceived and tabulated ring systems.
  Carhart et al. (1975) explored the reduced skeleton of a ring system.
  Lynch’s Sheffield group (1987) explored generic structure representations.
  Lawson’s similarity number (1990) is a “set-valued” browsing variable.
  Bemis and Murcko (1996, 1999) tabulated side chains, cyclic systems, cyclic skeletons.
  LeadScope (2000) hierarchically structures overlapping classes based on chemical functionality and means of relating these to chemical properties.
  The need for the systematic development of molecular-equivalence-based browsing indices.
  The concept of a keyword-list variable and its importance in visual data analysis.
17
Where will I be heading with Spotfire applications?
  Help shape and implement a vision of distributed visual data mining through publications, consulting, and workshops.
  Develop and distribute programs for constructing structural browsing indices and other keyword-list variables.
  Write a book on visual data mining.
18
Acknowledgements
  Yong-jin Xu
  CADD group
  Pharmacia & Upjohn chemists and biologists
  Research Informatics Team
  Bob Pearlman
  Molecular Design, Inc.
  Spotfire Inc.