Screen: Ligand-based virtual screening. Presented by …, maintained by Miklós Vargyas. Last update: 13 April 2010
Screen Virtual screening by topological descriptors
Description of the product: Screen performs high-throughput virtual screening of compound libraries using similarity comparisons based on various molecular descriptors. Screen availability: JChemBase, JChem Oracle cartridge, Instant JChem, server version, standalone command-line application programs, KNIME, PipelinePilot.
Key features. Various 2D descriptors: ChemAxon chemical fingerprint (CCFP); PipelinePilot ECFP/FCFP; ChemAxon pharmacophore fingerprint (CPFP); BCUT; scalars (logP, logD, Szeged index, …); custom descriptors and in-house fingerprints. Optimized similarity measures: improve similarity prediction; depend on the set of known actives; give high enrichment ratios in virtual screening. Multiple queries: 3 types of hypotheses; combined hit lists.
Benefits. Versatile: use various descriptors in your well-established model; access your trusted in-house fingerprint in IJC, JCB, JCART; easy integration in corporate discovery pipelines; search chemical files directly, with no need to import structures into a database; new descriptors are pluggable in deployed systems. Optimal: consistent similarity scores; smaller hit set; more focused library.
Benefits: more consistent similarity scores [figure panels: regular Tanimoto, optimized Tanimoto]
Benefits: high enrichment ratio; fewer false hits; known actives are true positive hits (ACE inhibitors).
Results NPY-5 (pharmacophore similarity)
β2-adrenoceptor (pharmacophore similarity) Results
Case study at Axovan: GPCR activity prediction; distinguishing between GPCR subclasses. Reference: Modest von Korff and Matthias Steger, GPCR-Tailored Pharmacophore Pattern Recognition of Small Molecular Ligands, JCICS 2004, 44.
Screen roadmap
New molecular descriptors
– ECFP/FCFP (in 5.4)
– Shape descriptors (in 5.4)
Hidden use of the optimiser
– No-pain black-box approach
– Simultaneous multi-descriptor search
Enhanced IJC integration
– Easy descriptor configuration and generation
– Similarity search type instead of descriptors, metrics and other unfriendly concepts
Screen roadmap
GUI
– New web interface (HTML/AJAX)
– Desktop application for descriptor generation
3D shape similarity
– Fast pre-filtering by 3D fingerprint
– Alignment-based volumetric Tanimoto calculation
– Scaffold hopping by maximizing topological dissimilarity and spatial similarity
Supplementary slides
A typical approach [diagram: query → query fingerprint; targets → target fingerprints; the fingerprints are compared by a metric → hits]
ChemAxon's approach [diagram: queries → hypothesis fingerprint; targets → target fingerprints; the fingerprints are compared by an optimized metric, produced by an optimization step → hits]
Performance. Chemical fingerprint generation: 500/s. Pharmacophore fingerprint generation: calculated 80/s, rule-based 200/s. Screening: 12000/s. Optimization: 10 s/metric. Hardware/software environment: P4 3 GHz, 1 GB RAM, Red Hat Linux 9, Java.
Implementations: Use of various fingerprints and metrics in JSP. UGM presentation by Aureus Pharma: Improved Virtual Screening Strategies and Enrichment of Focused Libraries in Active Compounds Using Target-Oriented Databases.
Molecular similarity: two compounds are similar when their chemical, pharmacological or biological properties match. The more features two molecules have in common, the higher their similarity. [figure labels: Chemical, Pharmacophore]
Similarity measures: quantitative assessment of the similarity of structures needs a numerically tractable form: molecular descriptors, fingerprints, structural keys. These are sequences/vectors of bits or numeric values that can be compared by distance functions or similarity metrics.
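Standard metrics [figure: two pairs of structures with their similarity values; the first value is 0.68, the structures and the second value are not recoverable]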
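The formulas on this slide did not survive extraction. As a hedged illustration, two metrics widely used with fingerprints are the Tanimoto coefficient (for binary fingerprints) and the Euclidean distance (for count or scalar descriptors); that these are exactly the metrics shown on the slide is an assumption, and the function names below are illustrative, not part of Screen.

```python
import math


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient of two binary fingerprints.

    Each fingerprint is given as the set of positions of its on-bits.
    """
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def euclidean(x: list, y: list) -> float:
    """Euclidean distance of two numeric descriptor vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("descriptor vectors must have the same length")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
```

Tanimoto is a similarity (1 means identical bit sets), while the Euclidean distance is a dissimilarity (0 means identical vectors), so the two are ranked in opposite directions.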
Topological chemical fingerprint: a hashed binary fingerprint that encodes topological properties of the chemical graph: connectivity, edge label (bond type), node label (atom type). It allows the comparison of two molecules with respect to their chemical structure. Construction: 1. find all 0, 1, …, n step walks in the chemical graph; 2. generate a bit array for each walk with a given number of bits set; 3. merge the bit arrays with a logical OR operation.
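Construction of chemical fingerprint [table columns: length, walk, bit array; example structure drawn with atoms C, C, O and six H]. Walks by length: 0: C; 1: C–H, C–C; 2: C–C–H, C–C–O; 3: C–C–O–H; final row ALL: the bit arrays of all walks merged. The bit array values themselves are not recoverable.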
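The three construction steps can be sketched as follows. This is a minimal illustration, not ChemAxon's implementation: it enumerates simple paths (no atom visited twice) rather than general walks, treats a path and its reverse as the same pattern, and uses SHA-256 as the hash; the function names, the 64-bit length and the two bits per pattern are all assumptions.

```python
import hashlib


def enumerate_paths(atoms, bonds, max_bonds):
    """Yield a label such as 'C-C-O' for every simple path of 0..max_bonds bonds.

    atoms: list of atom symbols; bonds: dict {(i, j): bond_symbol}.
    """
    adjacency = {i: [] for i in range(len(atoms))}
    for (i, j), bond_symbol in bonds.items():
        adjacency[i].append((j, bond_symbol))
        adjacency[j].append((i, bond_symbol))

    def extend(path, tokens):
        # A path and its reverse describe the same pattern: keep one form.
        yield "".join(min(tokens, tokens[::-1]))
        if len(path) - 1 == max_bonds:
            return
        for neighbor, bond_symbol in adjacency[path[-1]]:
            if neighbor not in path:
                yield from extend(path + [neighbor],
                                  tokens + [bond_symbol, atoms[neighbor]])

    for start in range(len(atoms)):
        yield from extend([start], [atoms[start]])


def pattern_bits(pattern, length, bits_per_pattern):
    """Map a pattern to at most bits_per_pattern (max 8) bit positions."""
    digest = hashlib.sha256(pattern.encode()).digest()
    return {int.from_bytes(digest[4 * k:4 * k + 4], "big") % length
            for k in range(bits_per_pattern)}


def chemical_fingerprint(atoms, bonds, max_bonds=3, length=64, bits_per_pattern=2):
    """Merge the bit arrays of all patterns with logical OR (as a Python int)."""
    fingerprint = 0
    for pattern in set(enumerate_paths(atoms, bonds, max_bonds)):
        for bit in pattern_bits(pattern, length, bits_per_pattern):
            fingerprint |= 1 << bit
    return fingerprint
```

Because every pattern of a substructure also occurs in the full structure, the substructure's on-bits are always a subset of the full structure's on-bits, which is what makes such fingerprints usable for pre-filtering.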
Chemical similarity
Topological pharmacophore fingerprint: encodes pharmacophore properties of molecules as frequency counts of pharmacophore point pairs at a given topological distance. It allows the comparison of two molecules with respect to their pharmacophore. Construction: 1. perceive pharmacophoric features; 2. map pharmacophore point types to atoms; 3. calculate the length of the shortest path between each pair of atoms; 4. assign a histogram to every pharmacophore point pair and count the frequency of the pair with respect to its distance.
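Steps 3 and 4 of the construction can be sketched as below, assuming the pharmacophore types have already been perceived and mapped to atoms (steps 1 and 2). The type letters ("A" for acceptor, "D" for donor), the data layout and the function names are illustrative assumptions; the slides do not specify Screen's internal representation.

```python
from collections import Counter, deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: number of bonds from source to every reachable atom."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in distance:
                distance[neighbor] = distance[atom] + 1
                queue.append(neighbor)
    return distance


def pharmacophore_pair_counts(types, adjacency, max_distance=6):
    """Count pharmacophore point pairs per topological distance.

    types: dict atom index -> set of type letters (empty set for 'none').
    Returns a Counter keyed by (pair, distance), e.g. ('AD', 3).
    """
    counts = Counter()
    atoms = sorted(types)
    for position, first in enumerate(atoms):
        distance = shortest_path_lengths(adjacency, first)
        for second in atoms[position + 1:]:
            d = distance.get(second)
            if d is None or d > max_distance:
                continue  # disconnected, or too far apart to be recorded
            for type_a in types[first]:
                for type_b in types[second]:
                    pair = "".join(sorted((type_a, type_b)))
                    counts[(pair, d)] += 1
    return counts
```

The histogram of one pair type, for example the counts for AA at distances 1 to 6, is then one block of the final fingerprint.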
Pharmacophore perception: rule-based approach. Rule 1: The pharmacophore type of an atom is an acceptor if it is a nitrogen, oxygen or sulfur, and it is not an amide nitrogen or sulfur, and it is not an aniline nitrogen, and it is not a sulfonyl sulfur, and it is not a nitro group nitrogen. [figure labels: donor, acceptor]
Exceptions to simple rules [figure: n-cyano-methyl piperidine; labels: sp2 atom, donor, exception]. Exceptions need extra rules, which leads to a large number of rules and to maintenance and performance problems.
Effect of pH [figure: acceptor and donor assignment at pH = 7 and at pH = 1]. pH-specific rules lead to a large number of rules and to maintenance and performance problems.
Pharmacophore perception: calculation-based approach. Step 1: estimation of pKa, which allows the determination of the protonation state for ionizable groups at the given pH. Step 2: partial charge calculation.
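How an estimated pKa determines the protonation state at a given pH can be illustrated with the Henderson-Hasselbalch relation. This is a textbook sketch for a single ionizable group, not the pKa model used by Screen; the function name and the example pKa values are hypothetical.

```python
def protonated_fraction(pka: float, ph: float) -> float:
    """Fraction of a single ionizable group that is protonated at the given pH.

    pka is the pKa of the protonated (acid) form of the group.
    Follows from Henderson-Hasselbalch: [A-]/[HA] = 10 ** (pH - pKa).
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Hypothetical groups: an amine whose protonated form has pKa 10,
# and a carboxylic acid with pKa 4.
for name, pka in (("amine", 10.0), ("acid", 4.0)):
    for ph in (1.0, 7.0):
        print(f"{name} at pH {ph}: {protonated_fraction(pka, ph):.3f} protonated")
```

A group that is mostly protonated at one pH and mostly deprotonated at another changes its donor or acceptor character accordingly, which is why a calculated protonation state can replace a set of pH-specific rules.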
Pharmacophore perception: calculation-based approach. Step 3: hydrogen bond donor/acceptor recognition. Step 4: aromatic perception. Step 5: pharmacophore property assignment (acceptor; negatively charged acceptor; acceptor and donor; hydrophobic; none).
Pharmacophore type coloring: acceptor, donor, hydrophobic, none. Pharmacophore fingerprint
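Fuzzy smoothing [figure: histograms of acceptor-acceptor pair counts over the bins AA1 to AA6 for two structures, shown twice, each with the Euclidean distance D_E between the two histograms; the second comparison gives D_E = 0.45, the first value is not recoverable]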
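The idea behind fuzzy smoothing can be sketched as follows: part of each histogram count is spread to the neighboring distance bins, so that two structures whose features sit one bond apart no longer look completely different. The slide does not state the smoothing scheme used by Screen; the kernel below (a fixed fraction moved to each adjacent bin) and the parameter name `spread` are assumptions.

```python
import math


def fuzzy_smooth(histogram, spread=0.25):
    """Move a fraction `spread` of every count to each adjacent bin.

    The total count is preserved; edge bins have only one neighbor.
    """
    smoothed = [0.0] * len(histogram)
    for i, count in enumerate(histogram):
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < len(histogram)]
        smoothed[i] += count * (1.0 - spread * len(neighbors))
        for j in neighbors:
            smoothed[j] += count * spread
    return smoothed


# Two hypothetical AA histograms whose only pair sits in adjacent bins.
first = [0, 2, 0, 0, 0, 0]
second = [0, 0, 2, 0, 0, 0]
before = math.dist(first, second)
after = math.dist(fuzzy_smooth(first), fuzzy_smooth(second))
print(before, after)  # the distance shrinks once the counts overlap
```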
Virtual screening using fingerprints [diagram: query → query fingerprint; targets → target fingerprints; the fingerprints are compared by a metric → hits]
Multiple query structures [diagram: queries → hypothesis fingerprint; targets → target fingerprints; the fingerprints are compared by a metric → hits]
Hypothesis fingerprints [diagram: the fingerprints of several actives are combined into one hypothesis]. Hypothesis types: Minimum, Average, Median. Advantages: allows faster operation; compiles features common to the individual actives; reduces noise.
Hypothesis fingerprints: advantages and disadvantages. Minimum. Advantages: strict conditions for hits if actives are fairly similar. Disadvantages: false results with asymmetric metrics; misses common features of highly diverse sets; very sensitive to one missing feature. Average. Advantages: captures common features of more diverse active sets. Disadvantages: less selective if actives are very similar. Median. Advantages: captures common features of more diverse active sets; specific treatment of the absence of a feature; less sensitive to outliers. Disadvantages: less selective if actives are very similar.
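The three hypothesis types can be sketched as column-wise statistics over the fingerprints of the known actives. This assumes count-based fingerprints of equal length stored as lists; the function name and the data are illustrative, not Screen's API.

```python
from statistics import mean, median


def hypothesis(fingerprints, kind):
    """Combine several fingerprints into one hypothesis fingerprint.

    kind is 'minimum', 'average' or 'median'; the statistic is taken
    separately for each fingerprint position.
    """
    combine = {"minimum": min, "average": mean, "median": median}[kind]
    return [combine(column) for column in zip(*fingerprints)]


# Three hypothetical actives; the second feature is missing in two of them.
actives = [
    [2, 0, 4],
    [4, 1, 4],
    [6, 0, 1],
]
for kind in ("minimum", "average", "median"):
    print(kind, hypothesis(actives, kind))
```

The example shows the trade-offs listed above: one active with a low count pulls the minimum down for the whole set, the average keeps a trace of a feature present in only one active, and the median ignores that single outlier.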
The need for optimization: too many hits
The need for optimization: inconsistent dissimilarity values
Parametrized metrics [formulas not recoverable; labeled parameters: asymmetry factor, scaling factor, weights]
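Since the formulas of this slide are lost, the sketch below only illustrates what asymmetry factors and weights can look like in a metric. It uses the Tversky index, a well-known asymmetric generalization of Tanimoto, and a per-position weighted Euclidean distance; these are stand-ins chosen for illustration, not the parametrization used by Screen, and all names are assumptions.

```python
import math


def tversky(query: set, target: set, alpha: float = 1.0, beta: float = 1.0) -> float:
    """Tversky index of two binary fingerprints given as sets of on-bits.

    alpha penalizes bits present only in the query, beta bits present only
    in the target; alpha == beta == 1 gives the Tanimoto coefficient.
    """
    common = len(query & target)
    denominator = common + alpha * len(query - target) + beta * len(target - query)
    return 1.0 if denominator == 0 else common / denominator


def weighted_euclidean(x, y, weights):
    """Euclidean distance in which every position has its own weight."""
    return math.sqrt(sum(w * (xi - yi) ** 2 for w, xi, yi in zip(weights, x, y)))
```

With unequal alpha and beta the result depends on which structure is the query, which is the kind of behavior an asymmetry factor introduces; a weight of zero removes a fingerprint position from the comparison altogether.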
Optimization of metrics [diagram: selected targets are split into a training set and a test set; known actives are split into a query set, a training set and a test set]. Step 1: optimize parameters for maximum enrichment. Step 2: validate metrics over an independent test set.
Optimization of metrics. Step 1: optimize parameters for maximum enrichment [diagram: query set → query fingerprint; parametrized metric applied to the training set]
Optimization of metrics [figure: parameters v1, v2, v3, …, vi, …, vn; legend: potential variable value, temporarily fixed value, running variable value, final value]
Optimization of metrics. Step 2: validate metrics over an independent test set [diagram: query set → query fingerprint; optimized metric applied to the test set]
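The two steps can be sketched with a simple one-variable-at-a-time search, in the spirit of the legend on the parameter slide (one running variable while the others are temporarily fixed). The slides do not describe the actual optimizer or the exact enrichment measure, so everything below is an assumption: the objective is the number of known actives among the top-ranked structures, the metric is a weighted Euclidean distance, and all names are hypothetical.

```python
import math


def weighted_distance(query, target, weights):
    """Weighted Euclidean dissimilarity between two descriptor vectors."""
    return math.sqrt(sum(w * (q - t) ** 2 for w, q, t in zip(weights, query, target)))


def actives_in_top(query, actives, decoys, weights, top_n):
    """Number of known actives among the top_n least dissimilar structures.

    On equal distance a decoy is ranked first (pessimistic tie-breaking).
    """
    scored = [(weighted_distance(query, fp, weights), True) for fp in actives]
    scored += [(weighted_distance(query, fp, weights), False) for fp in decoys]
    return sum(is_active for _, is_active in sorted(scored)[:top_n])


def coordinate_search(query, actives, decoys, candidates, top_n, sweeps=2):
    """Step 1: vary one weight at a time, keeping the others fixed."""
    weights = [1.0] * len(query)
    best = actives_in_top(query, actives, decoys, weights, top_n)
    for _ in range(sweeps):
        for i in range(len(weights)):          # running variable
            for value in candidates:           # potential values
                trial = weights[:i] + [value] + weights[i + 1:]
                score = actives_in_top(query, actives, decoys, trial, top_n)
                if score > best:
                    weights, best = trial, score
    return weights, best
```

Step 2 then calls `actives_in_top` with the returned weights on actives and decoys that were not used during the search; a large drop in the score on that independent set would suggest the weights were overfitted to the training set.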
Results of optimization: 1. Similar structures get closer
Results of optimization: 2. Hit set size reduced. Active set: 18 mGlu-R1 antagonists. Target set: randomly selected drug-like structures.
Results of optimization: 3. Higher enrichment
Results of optimization: 4. Top ranked structures are spikes. This offers a more intuitive way to evaluate the efficiency of screening: random set hits and known actives are sorted by dissimilarity value, and the number of random set hits preceding each active in the sorted list is counted. [plot axis labels: number of spikes retrieved, number of virtual hits]
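The counting procedure described above can be sketched as follows, assuming each structure is represented only by its dissimilarity to the query. The function name and the example values are hypothetical.

```python
def random_hits_preceding_actives(active_scores, random_scores):
    """For each known active, count the random set hits ranked before it.

    Structures are sorted by ascending dissimilarity; on equal values a
    random set hit is ranked before an active (pessimistic tie-breaking).
    Returns one count per active, in rank order.
    """
    ranked = sorted([(score, True) for score in active_scores]
                    + [(score, False) for score in random_scores])
    preceding = []
    random_seen = 0
    for _, is_active in ranked:
        if is_active:
            preceding.append(random_seen)
        else:
            random_seen += 1
    return preceding


# Hypothetical dissimilarity values for two actives and three random hits.
print(random_hits_preceding_actives([0.1, 0.4], [0.2, 0.3, 0.5]))
```

Plotting the position of each active in this list against its count gives a curve of retrieved actives versus virtual hits; the flatter the counts stay, the better the metric separates actives from the random set.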
ACE (pharmacophore similarity) Results
3D flexible search: expected top performance 200 structures/s