Structural Browsing Indices, Spotfire and Drug Discovery Mark Johnson 1 and Yong-jin Xu 2 1 Pannanugget Consulting; 2 Pharmacia, Inc. Spotfire Users Conference.

Presentation transcript:

Structural Browsing Indices, Spotfire and Drug Discovery Mark Johnson 1 and Yong-jin Xu 2 1 Pannanugget Consulting; 2 Pharmacia, Inc. Spotfire Users Conference Philadelphia, May, 2001 © 2001, Pannanugget and Pharmacia, Inc. - All Rights Reserved.

Pulling Nuggets out of the Avalanche of Data
High-throughput screening
Larger project teams
Combinatorial chemistry
Microarray technology
External databases
Internal databases
Predicted parameters
Data mergers & collaborations

A Low-Content View of the ACD Based on the Number of Atoms and Colored by the Number of Cyclic-System Hetero Atoms

A High-Content Cyclic-System View of the ACD

The Distinction between Low and High-Content Views is Gained or Lost in the First Step of Data Visualization
Raw data -> [data transformations] -> Data tables -> [visual mappings] -> Visual structures -> [view transformations] -> Views
Slide annotations: set of complex objects; imposed space of points; scatter plot or histogram; integrated visualization
Source: Card, Mackinlay, & Shneiderman, "Readings in Information Visualization", p. 17

Viewing High-Dimensional Binary-Vector Spaces in Spotfire using Keyword-List Variables
Complex object identifier: Compound number
Single high-content keyword-list variable: Functional-group list
5-dimensional vector of 5 low-content variables: Acid, Amide, Ketone, Sulfide, Amine
Table values as transcribed: 1 ketone sulfide amide amine acid acid amine 10001
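The relationship on this slide between one keyword-list variable and five binary variables can be sketched in a few lines of Python. This is only an illustration, not Spotfire's own mechanism: the function names are hypothetical, and the only data taken from the slide are the five column names and the "acid amine" example.

```python
# Column order follows the slide: Acid, Amide, Ketone, Sulfide, Amine.
FUNCTIONAL_GROUPS = ["acid", "amide", "ketone", "sulfide", "amine"]


def keyword_list_to_vector(keyword_list, vocabulary=FUNCTIONAL_GROUPS):
    """Expand a space-separated keyword list into a 0/1 vector (hypothetical helper)."""
    present = set(keyword_list.split())
    return [1 if keyword in present else 0 for keyword in vocabulary]


def vector_to_keyword_list(vector, vocabulary=FUNCTIONAL_GROUPS):
    """Collapse a 0/1 vector back into a space-separated keyword list (hypothetical helper)."""
    return " ".join(keyword for keyword, bit in zip(vocabulary, vector) if bit)


acid_amine = keyword_list_to_vector("acid amine")
print(acid_amine)                          # one bit per functional-group column
print(vector_to_keyword_list(acid_amine))  # back to the keyword list
```

The point of the sketch is that both forms carry the same information; the keyword list simply packs it into one high-content column instead of five low-content ones.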

The Three Ways of Organizing Molecular Structures
Substructure Partial Orderings
Molecular Similarity Spaces
Structural Browsing Indices

Questions Associated with Substructure Partial Orderings
What structures contain a particular substructure?
What structures contain a particular generic substructure?
The basic spatial representation is a single indicator variable containing 1 for those structures satisfying the request and 0 otherwise.
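A minimal sketch of such an indicator variable follows. Real substructure matching needs a cheminformatics toolkit; here, as a stated simplification, each structure is stood in for by a set of functional-group keywords, and "contains the substructure" is approximated by set containment. The compound identifiers and data are hypothetical.

```python
def indicator_variable(compounds, required_groups):
    """Return {compound_id: 1 or 0}; 1 when every required keyword is present.

    Keyword-set containment is an assumed stand-in for true substructure search.
    """
    required = set(required_groups)
    return {
        compound_id: int(required <= set(groups))
        for compound_id, groups in compounds.items()
    }


# Hypothetical collection, keyed by made-up compound identifiers.
compounds = {
    "cmpd-1": {"ketone", "sulfide"},
    "cmpd-2": {"amide", "amine", "acid"},
    "cmpd-3": {"acid", "amine"},
}

has_acid_and_amine = indicator_variable(compounds, ["acid", "amine"])
print(has_acid_and_amine)
```

The resulting column can be used directly as a filter or a color variable in a scatter plot.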

Questions Associated with Molecular Similarity Spaces
What structures are similar to a particular structure?
How diverse is a collection of structures?
Will a marketed collection of structures add significant diversity to our collection?
How much do two collections overlap?
The basic spatial representation is a high-dimensional table of low-content variables and/or possibly a matrix of pairwise similarities (for collections of 100 or fewer).
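For a small collection, a matrix of pairwise similarities might be built as sketched below. The slide does not name a similarity measure; the Tanimoto coefficient on binary vectors is used here as an assumption because it is a common choice in cheminformatics. The vectors are hypothetical, and treating two all-zero vectors as identical is a convention chosen for this sketch.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two equal-length 0/1 vectors (assumed measure)."""
    both = sum(x & y for x, y in zip(a, b))
    either = sum(x | y for x, y in zip(a, b))
    # Convention for this sketch: two empty vectors count as identical.
    return both / either if either else 1.0


def similarity_matrix(vectors):
    """Full pairwise matrix; quadratic in collection size, so only for small sets."""
    return [[tanimoto(u, v) for v in vectors] for u in vectors]


# Hypothetical functional-group vectors (Acid, Amide, Ketone, Sulfide, Amine).
vectors = [
    [1, 0, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
]

matrix = similarity_matrix(vectors)
for row in matrix:
    print([round(value, 2) for value in row])
```

The quadratic growth of this matrix is one reason to restrict it to small collections.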

Questions Associated with Structural Browsing Indices
Which structural classes are represented in a collection?
Where is the overlap in two collections?
Which classes of structures turned up active in a high-throughput screening program?
What templates and positions have been investigated in a lead-optimization program and which are critical?
The basic spatial representation is a small number of high-content variables with locally-related values.
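The first two questions can be illustrated with a small sketch in which every compound carries one class label, such as a cyclic-system key. The labels, compound identifiers, and function names below are hypothetical, and the sketch does not show how a real browsing index assigns or orders its classes.

```python
from collections import Counter


def class_counts(collection):
    """Count compounds per structural class in {compound_id: class_label}."""
    return Counter(collection.values())


def class_overlap(collection_a, collection_b):
    """Structural classes that occur in both collections, sorted by label."""
    return sorted(set(collection_a.values()) & set(collection_b.values()))


# Hypothetical collections; the class labels are illustrative only.
in_house = {"A1": "benzene", "A2": "pyridine", "A3": "benzene"}
vendor = {"V1": "pyridine", "V2": "indole"}

print(class_counts(in_house))
print(class_overlap(in_house, vendor))
```

Because the comparison is made on class labels rather than on individual compounds, it shows where two collections overlap, not only how much.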

Demos
Exploring the ACD with a cyclic-system ordering.
Browsing lead-optimization synthetic efforts using a cyclic-system ordering and a side-chain ordering.
Showing ACE inhibitors are aggregated in a maximal functional-group space.

Some Types of Keyword-List Variables in Drug Discovery
Discipline: Keyword lists
Cheminformatics: Functional groups, Ring systems, Side chains, Similar compounds, Activity profiles
Toxicology: Toxicity profile, Metabolic reactions
Genomics: Molecular function, Biological process, Cellular component, Similar genes or proteins

Structural Browsing Indices: What, if anything besides the name, is new?
Ring systems, functional groups, and side chains are almost as old as chemistry.
Adamson in the early 70s perceived and tabulated ring systems.
Carhart et al. (1975) explored the reduced skeleton of a ring system.
Lynch’s Sheffield group (1987) explored generic structure representations.
Lawson’s similarity number (1990) is a “set-valued” browsing variable.
Bemis and Murcko (1996, 1999) tabulated side chains, cyclic systems, and cyclic skeletons.
LeadScope (2000) hierarchically structures overlapping classes based on chemical functionality and means of relating these to chemical properties.
The need for the systematic development of molecular-equivalence-based browsing indices.
The concept of a keyword-list variable and its importance in visual data analysis.

Where will I be heading with Spotfire applications?
Helping to shape and implement a vision of distributed visual data mining through publications, consulting, and workshops.
Developing and distributing programs for constructing structural browsing indices and other keyword-list variables.
Writing a book on visual data mining.

Acknowledgements
Yong-jin Xu
CADD group
Pharmacia & Upjohn chemists and biologists
Research Informatics Team
Bob Pearlman
Molecular Design, Inc.
Spotfire Inc.