Genomics Research Institute University of Cincinnati Compound Library Wm. L. Seibel January 10, 2007
Overview Library Overview Compound Characteristics – Design Concepts – Drug-Like Library Screening Options Summary of Library Advantages
Compound Repository Haystack Neat Compound Storage – Capacity = 200,000 bottles – Current = 207,000 bottles – Freezer storage when appropriate Solar (Solution Archive) – DMSO solutions – Capacity = 1.8 million tubes, 10,000 deep well (96) plates, 13,600 shallow well (384) plates – Current = 325,000 unique compounds Related Compound Handling and Dissolution instruments. Housed at P&G’s Mason Business Center in ca sf lab space
Haystack ® Neat Chemical Storage
Solar ® Solution Storage
Library Design Principles
Target Identification Target Validation Compound Screening LEAD Compound Optimization Drug Lead Discovery …greatly simplified Compound Selection can greatly enhance Efficiency H2L Activity Confirmation
Library Compounds The UC/GRI Compound Library comprises compounds from four general categories: – 1. Compounds purchased from numerous sources, selected to provide a diverse representation across “drug-like” structural properties. – 2. Compounds purchased that specifically target kinases and GPCRs. – 3. Compounds prepared in-house specifically for projects in kinases, GPCRs, phosphatases, ion channels and proteases, donated by P&G Pharmaceuticals. – 4. Combinatorial chemistry contract syntheses (lower-priority compounds). This screening library is broadly diverse across drug-like space, with enhanced concentrations in areas of key biological relevance, notably kinases and GPCRs.
P&G Pharma Selected Compounds Chemically Diverse – Represented uniformly across drug-like space. – Want to ensure uniform, comprehensive and diverse representation of compounds across the structural & property types that are typical of drugs and lead structures. Compounds selected based on drug like properties (within “Drug-Like Space”) – Chemical and Property Filters – Lipinski, Veber etc. rules Total P&G investment to assemble repository = $22 M (over past 10 years)
Chemical Property Filters Vendor Database (26 databases, >4 million structures) – Remove duplicates – Remove reactives, unusual groups, & toxicophores (80 substructures) – MW filter – Solubility filter – Lipinski Rule of Five: H-bond donors ≤ 5, MW ≤ 500, cLogP ≤ 5, N's + O's ≤ 10 – Result: “Cleaned” database
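The filter cascade on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Compound` record, its field names, the example values, and duplicate removal by name are all hypothetical (a real pipeline would compare canonical structures and compute properties with a cheminformatics toolkit). The thresholds are the standard Rule of Five limits: no more than 5 H-bond donors, MW no more than 500, cLogP no more than 5, and no more than 10 N and O atoms.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    """Hypothetical record with precomputed properties (illustrative only)."""
    name: str
    mol_weight: float  # molecular weight in Da
    clogp: float       # calculated log P
    h_donors: int      # number of hydrogen-bond donors
    n_plus_o: int      # count of N and O atoms (acceptor proxy)


def passes_rule_of_five(c: Compound, max_violations: int = 0) -> bool:
    """Return True if the compound has at most `max_violations` violations."""
    violations = sum([
        c.h_donors > 5,
        c.mol_weight > 500,
        c.clogp > 5,
        c.n_plus_o > 10,
    ])
    return violations <= max_violations


def clean(compounds):
    """Drop duplicates (by name, an assumption) and Rule-of-Five failures."""
    seen = set()
    kept = []
    for c in compounds:
        if c.name in seen:
            continue  # duplicate entry, already handled
        seen.add(c.name)
        if passes_rule_of_five(c):
            kept.append(c)
    return kept


# Made-up vendor records, for illustration only.
vendor_db = [
    Compound("cmpd-A", 320.4, 2.1, 2, 5),
    Compound("cmpd-B", 612.7, 6.3, 6, 12),  # fails all four criteria
    Compound("cmpd-A", 320.4, 2.1, 2, 5),   # duplicate of the first record
    Compound("cmpd-C", 498.0, 4.9, 5, 10),  # sits at the limits, still passes
]
cleaned = clean(vendor_db)
print([c.name for c in cleaned])  # ['cmpd-A', 'cmpd-C']
```

The `max_violations` parameter is included because some groups tolerate one violation; the slide does not say which convention was used here.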
Diversity Analysis
Describing Molecular Structure Convert molecular structure into numerical values by making computations of specific structural features Structure Computations Numeric Descriptors Relevant to Binding Functions
Diversity Assessment Methodology Used BCUT descriptors * – R.S. Pearlman, UT at Austin – DiverseSolutions (now available from Tripos) Computed ~120 BCUTs Selected a best subset of 6 BCUTs – 6D space – visualization is a challenge * J. Chem. Inf. Comput. Sci. 1999, 39,
Pearlman’s BCUT descriptors * 6D Chem-space (structure-space) – 2 atomic partial-charge descriptors – 2 atomic polarizability descriptors – a hydrogen-bond acceptor descriptor – a hydrogen-bond donor descriptor
Concept of Chemistry Space Desc-1 Desc-2
Defining Drug Space Based on structures of “drug-like” compounds from – The World Drug Index (WDI) – The National Cancer Institute Open Database [Plots: Desc-1 vs. Desc-2]
Diverse Subset Selection Avoiding “redundant” representations
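One common way to pick a diverse subset while avoiding redundant representatives is greedy MaxMin selection. The sketch below is an assumption for illustration and not necessarily the method used for this library (the slides do not name the selection algorithm). The 2D points stand in for the 6D BCUT coordinates, and the function name and data are hypothetical.

```python
import math


def max_min_subset(points, k):
    """Greedy MaxMin: repeatedly add the point farthest from the chosen set.

    Returns indices into `points`. The first point seeds the selection,
    which is an arbitrary choice made for this sketch.
    """
    if k <= 0 or not points:
        return []
    chosen = [0]
    # Distance from every point to its nearest already-chosen point.
    min_dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < min(k, len(points)):
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


# Made-up 2D descriptor coordinates: two tight clusters plus one outlier.
library = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),  # first cluster
    (5.0, 5.0), (5.1, 5.0),              # second cluster
    (0.0, 5.0),                          # isolated compound
]
picked = max_min_subset(library, 3)
print(picked)  # [0, 4, 5]
```

With three picks, the selection takes one compound from each occupied region rather than three near-identical members of the first cluster.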
Compound Supply External Suppliers (20+ vendors) – Brokerage Houses Individual Compounds (Diversity) Target Directed Libraries – Combinatorial Chemistry Companies Corporate Suppliers – P&G Pharmaceuticals Focus Areas - Medicinal Chemistry Kinase, GPCR, Phosphatase, Ion Channel, proteases Lead ID – Combinatorial Chemistry
Vendor “Dependability” [Chart comparing vendors A through O]
On the Other Hand… Even within Drug-like space, certain classes can be somewhat clustered. This library therefore has added “focused libraries” from internal synthesis and external vendors emphasizing compounds relevant to: – GPCRs – Kinases
P&G Pharma Selected Compounds Defined by experienced medicinal chemists – Broad, uniform distribution across Drug Space with concentrations of density in key areas from directed purchase and in-house synthesis. Compared to 5 vendor screening collections – 3,000 to 500,000 compounds – 27% to 56% of vendors’ compound collections do NOT meet the drug-like criteria – UC compound collection is 2X to 100X more chemically diverse across Drug Space. Vendor libraries are inherently predisposed to clustered groupings. We can pick the best, most relevant compounds from each.
Screening Library Options
Screening Library Design Options Diverse broad collections – Comprehensive screening against all available compounds (ca. 250,000 cmpds) – Screening against a representative subset of available compounds (e.g cmpds) Class-associated compounds – Compounds with structural features often associated with a particular target (e.g. kinases). Structure-based compound selection – Virtual Screening of a crystal structure or high quality homology model to identify the most likely inhibitors (ca cmpds), followed by assay of these compounds. – Virtual Screening as above based on pharmacophore models from known ligands of the target.
Diverse Subset Selection Same Concept as Previously Diversity Analysis 5,000 Cmpd Abstract
Diverse Subset Selection Execute Assay on subset of compounds MTS Assay Identify Hits in Assay 5,000 Cmpd Abstract
Diverse Subset Selection Pull Similar Compounds from original 250K Set Similarity Search 300 Cmpd Similarity Library
Diverse Subset Selection Pull Similar Compounds from original 250K Set MTS Assay Identify Hits in Assay 300 Cmpd Similarity Library This Cycle can be repeated several times until no new actives are found
Selection of Nearest Neighbors of Hits Biological hit Near neighbor
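A nearest-neighbor pull around a biological hit can be sketched as a radius search in descriptor space. This is a minimal illustration: the compound IDs, coordinates, radius, and the use of Euclidean distance are all assumptions, and a production search might instead use fingerprint similarity.

```python
import math


def nearest_neighbors(hit, library, radius):
    """Return IDs of library compounds within `radius` of `hit`, closest first.

    `library` maps a compound ID to its descriptor coordinates.
    """
    within = [
        cid for cid, coords in library.items()
        if math.dist(coords, hit) <= radius
    ]
    return sorted(within, key=lambda cid: math.dist(library[cid], hit))


# Hypothetical IDs and 2D coordinates standing in for the real descriptor space.
library = {
    "GRI-001": (1.1, 1.0),
    "GRI-002": (1.0, 1.4),
    "GRI-003": (3.0, 3.0),
    "GRI-004": (0.8, 1.0),
}
hit_coords = (1.0, 1.0)
neighbors = nearest_neighbors(hit_coords, library, radius=0.5)
print(neighbors)  # ['GRI-001', 'GRI-004', 'GRI-002']
```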
Iterative Cycling Assay 5000 Cmpd Representative Library ~ 1000 Cmpd NN Library ~ 20 Cmpd Hit List NN Search of UC/GRI Library NN Search of Commercial Compounds ~ 50 Cmpd Hit List Assay 2-3 Iterations Final Set Final Hit List ~ 1000 Cmpd NN Library
Diverse Subset Selection Pull Similar Compounds from Commercial 4.8M Set Similarity Search 4.8 M Commercial Library 300 Cmpd Similarity Library Assay for actives, and cycle hits back through similarity search loop.
Class-Associated Compounds Select compounds similar to compounds known to interact with target class Similarity Analysis 250,000 Cmpd Library 15,000 Cmpd Library Target Active
Virtual Screening Screen GRI/UC library Screen Commercial Cmpds
Iterative Cycling Assay 5000 Cmpd Representative Library ~ 1000 Cmpd NN Library ~ 20 Cmpd Hit List NN Search of UC/GRI Library NN Search of Commercial Compounds ~ 50 Cmpd Hit List Assay 2-3 Iterations Final Set Final Hit List ~ 1000 Cmpd NN Library Hits of any origin can enter the cycle at this point.
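The assay, nearest-neighbor search, and re-assay cycle can be sketched as a loop that stops when a round yields no new actives or after a fixed number of iterations. Everything here is illustrative: the `assay` callable, the compound IDs, the coordinates, and the distance cutoff are hypothetical stand-ins, and the default of three rounds simply mirrors the "2-3 Iterations" on the slide.

```python
import math


def iterate_screen(representative, library, assay, radius, max_rounds=3):
    """Assay a representative set, then cycle neighbors of hits back in.

    `library` maps compound ID to descriptor coordinates; `assay` returns
    True for an active compound. Returns the set of all hits found.
    """
    tested = set(representative)
    hits = {cid for cid in representative if assay(cid)}
    frontier = set(hits)  # hits whose neighbors have not been explored yet
    for _ in range(max_rounds):
        neighbors = {
            cid for cid, coords in library.items()
            if cid not in tested
            and any(math.dist(coords, library[h]) <= radius for h in frontier)
        }
        if not neighbors:
            break  # nothing left to test near the current hits
        tested |= neighbors
        frontier = {cid for cid in neighbors if assay(cid)}
        if not frontier:
            break  # no new actives, so the cycle ends
        hits |= frontier
    return hits


# Made-up data: a chain of three actives, one inactive neighbor, one outlier.
library = {
    "c1": (0.0, 0.0), "c2": (0.3, 0.0), "c3": (0.6, 0.0),
    "c4": (5.0, 5.0), "c5": (0.9, 0.0),
}
actives = {"c1", "c2", "c3"}
hits = iterate_screen(["c1", "c4"], library, lambda cid: cid in actives, radius=0.4)
print(sorted(hits))  # ['c1', 'c2', 'c3']
```

Hits from another source could enter the cycle by including them in the `representative` list, provided they also have coordinates in `library`.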
Hit to Lead Follow-up (H2L)
How to determine optimal hits for follow-up – Confirm identity and activity of hits – Cluster into groups of related compounds – Develop preliminary SAR info on each cluster: identify key features for binding & selectivity – Assess each cluster for optimization: “Which compounds have fewest problems?” Synthetic Ease, Proprietary Assessment, Selectivity Issues, Physical Properties, Metabolic Handles, Cellular Activity
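The clustering step can be illustrated with a simple leader algorithm: each hit joins the first cluster whose leader lies within a distance threshold, or else starts a new cluster. This is a sketch under assumptions, not the clustering method used by the group; the hit IDs, coordinates, and threshold are hypothetical.

```python
import math


def leader_cluster(hits, threshold):
    """Group hits by distance to cluster leaders (single pass, order-dependent).

    `hits` maps a hit ID to its descriptor coordinates. Returns a list of
    clusters, each a list of hit IDs with the leader first.
    """
    clusters = []  # each entry is (leader_coords, [member IDs])
    for cid, coords in hits.items():
        for leader, members in clusters:
            if math.dist(coords, leader) <= threshold:
                members.append(cid)
                break
        else:
            # No existing leader is close enough, so this hit leads a new cluster.
            clusters.append((coords, [cid]))
    return [members for _, members in clusters]


# Made-up confirmed hits in a 2D stand-in for descriptor space.
confirmed_hits = {
    "h1": (0.0, 0.0), "h2": (0.2, 0.1),
    "h3": (4.0, 4.0), "h4": (4.1, 3.9),
    "h5": (9.0, 0.0),
}
groups = leader_cluster(confirmed_hits, threshold=0.5)
print(groups)  # [['h1', 'h2'], ['h3', 'h4'], ['h5']]
```

Each resulting group can then be assessed as a unit for preliminary SAR and for the follow-up criteria listed on the slide.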
Summary
UC/P&GP Library Advantages Quality Advantages – Library carefully constructed to span drug-like space. – Compounds restricted to those with properties consistent with clinical materials. – Proven to produce viable hits for follow-up programs. – Comparisons have uniformly been favorable relative to commercial vendor sets. – Includes targeted subsets of compounds for key areas: GPCRs, Kinases, Phosphatases, Ion channels. Practical Advantages – SD file of structures and ID tags furnished for unrestricted use. – Many compounds from commercial sources, so resupply likely to be easy. – Materials supplied in microtiter plates (96 or 384) as requested. – Solution Stores made from local dry stores, so follow-up assays will be rapid. Technical Advantages – Act as Liaison with screening group (internal or external). – Participate in advisory committee for compound acquisition decisions.
Library Use and Data Interpretation Library Design Assistance – Computational assistance in selecting diverse subsets or directed subsets. – Computational assistance in selecting compounds similar to known leads (Nearest Neighbor). – Computational assistance in virtual screening by pharmacophore or protein docking.
Library Use and Data Interpretation Follow-up Assistance – Resupply assistance, synthesis info, supplier info – Assistance in obtaining related available compounds (Similarity, substructure, Unity, Pharmacophore). – Provide preliminary lit search info (known info, IP, etc) on prominent hits. – Clustering of hits into chemical/pharmacophore classes included. – Provide help identifying chemistry groups with related interests for collaborations – Provide assistance in connecting with contract chemistry services (consult).
Questions
Thank you
Acknowledgements Operations – Stacey Frazier – Kathy Gibboney Computational – Matt Wortman – David Stanton – Prakash Madhav Management – Ruben Papoian – Sandra Nelson – Joseph Gardner – Kenny Morand