Screening a Virtual Compound Space
Szabolcs Csepregi, Ferenc Csizmadia, Szilárd Dóránt, Nóra Máté, György Pirok, Zsuzsanna Szabó, Jenő Varga, Miklós Vargyas
ChemAxon Ltd., Máramaros köz 3/a, 1037 Budapest, Hungary
www.chemaxon.com
Drug research: finding or making a needle in the haystack?
Virtual screening (JChem Screen)
  Advantages: fast; hits are readily available for in vitro screening
  Disadvantages: limited number of available compounds
De novo design (JChem AnalogMaker)
  Advantages: practically unlimited virtual compound space; structural novelty
  Disadvantages: synthetic accessibility of virtual hits is a problem
Virtual screening: find something similar to a fistful of needles
[Figure: known actives, corporate database, structures found]
Molecular similarity: how to tackle it?
Quantitative assessment of the similarity or dissimilarity of structures needs a numerically tractable form: molecular descriptors, fingerprints, structural keys.
These are sequences (vectors) of bits or numeric values that can be compared by distance functions and similarity metrics.
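To make the idea of comparing descriptors concrete, here is a minimal sketch of two common comparison functions: Tanimoto similarity for bit fingerprints and Euclidean distance for numeric descriptor vectors. The function names and the toy fingerprints are illustrative assumptions, not part of JChem.

```python
from math import sqrt


def tanimoto(a: str, b: str) -> float:
    """Tanimoto similarity of two equal-length bit strings ("0"/"1")."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    either = sum(1 for x, y in zip(a, b) if x == "1" or y == "1")
    # Two empty fingerprints are treated as identical (a convention chosen here).
    return both / either if either else 1.0


def euclidean(u, v) -> float:
    """Euclidean distance of two equal-length numeric descriptor vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Toy 8-bit fingerprints (hypothetical data, for illustration only).
fp_a = "01000101"
fp_b = "01001100"
print(tanimoto(fp_a, fp_b))
print(euclidean([1.0, 0.0, 2.0], [1.0, 2.0, 0.0]))
```

Tanimoto is a similarity (larger means more alike), while Euclidean is a distance (smaller means more alike), so a screening threshold has to be read in the matching direction.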
Virtual screening using fingerprints: multiple query structures
[Figure: bit-string fingerprints of the queries are combined into a hypothesis fingerprint; a metric compares it with the target fingerprints of the targets to select the hits.]
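The screening scheme on this slide can be sketched as follows. How the hypothesis fingerprint is derived from the queries is not stated in the slide text, so the bitwise OR/AND combination below is an assumption, as are the function names, the Tanimoto metric, and the toy data.

```python
def tanimoto(a: str, b: str) -> float:
    """Tanimoto similarity of two equal-length bit strings."""
    both = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    either = sum(1 for x, y in zip(a, b) if x == "1" or y == "1")
    return both / either if either else 1.0


def hypothesis_fingerprint(queries, mode="or"):
    """Combine query fingerprints bit by bit (assumed rule, not JChem's).

    mode="or":  a bit is set if any query sets it.
    mode="and": a bit is set only if every query sets it.
    """
    if mode not in ("or", "and"):
        raise ValueError("mode must be 'or' or 'and'")
    if len({len(q) for q in queries}) != 1:
        raise ValueError("all query fingerprints must have the same length")
    combine = any if mode == "or" else all
    length = len(queries[0])
    return "".join(
        "1" if combine(q[i] == "1" for q in queries) else "0"
        for i in range(length)
    )


def screen(queries, targets, threshold, mode="or"):
    """Return (name, score) pairs of targets scoring at least threshold."""
    hyp = hypothesis_fingerprint(queries, mode)
    hits = [(name, tanimoto(hyp, fp)) for name, fp in targets.items()]
    hits = [h for h in hits if h[1] >= threshold]
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Toy data (hypothetical): two queries and three targets.
queries = ["01100100", "01000110"]
targets = {"t1": "01100110", "t2": "01000100", "t3": "10011001"}
print(screen(queries, targets, threshold=0.5))
```

With the OR rule the hypothesis covers every feature seen in any query, which tends to favour recall; the AND rule keeps only the shared features and is stricter.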
Optimized virtual screening: parameterized metrics
[Formulas: one metric parameterized by an asymmetry factor and a scaling factor; one metric parameterized by an asymmetry factor and weights]
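The slide text names the parameters but not the formulas, so the following is only a sketch of what a parameterized metric can look like: a weighted Euclidean distance in which an asymmetry factor `alpha` treats features the query has in excess of the target differently from features the target has in excess of the query. The exact functional form, the name `asymmetric_weighted_euclidean`, and the toy vectors are assumptions; the scaling factor is not modelled here.

```python
from math import sqrt


def asymmetric_weighted_euclidean(query, target, weights, alpha):
    """Weighted Euclidean distance with an asymmetry factor (assumed form).

    Each squared difference is multiplied by its weight and by
    alpha        where the query value exceeds the target value,
    (1 - alpha)  otherwise.
    alpha = 0.5 weighs both directions equally (a plain weighted Euclidean
    distance scaled by sqrt(0.5)).
    """
    if not (len(query) == len(target) == len(weights)):
        raise ValueError("query, target and weights must have equal length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    total = 0.0
    for q, t, w in zip(query, target, weights):
        factor = alpha if q > t else 1.0 - alpha
        total += w * factor * (q - t) ** 2
    return sqrt(total)


# Toy pharmacophore-style count vectors (hypothetical data).
query = [2, 0, 1]
target = [0, 1, 1]
ones = [1, 1, 1]
print(asymmetric_weighted_euclidean(query, target, ones, alpha=1.0))
print(asymmetric_weighted_euclidean(target, query, ones, alpha=1.0))
```

Swapping query and target changes the result whenever `alpha` differs from 0.5, which is the point of the asymmetry factor. Optimizing such a metric then means searching for the `alpha` and weights that best separate known actives from the rest of a training set.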
How good is optimized virtual screening? β2-adrenoceptor antagonist
Is virtual screening a discovery tool? Scaffold hopping
JChem AnalogMaker: workflow
[Figure: Lead, Candidates]
Fragmentation: examples
[Figure: the amide and ester fragmentation rules cut the original molecule into three generated fragments. Fragment 1 carries the label amide 2; Fragment 2 carries amide 1 and ester 1; Fragment 3 carries ester 2.]
Fragmentation: RECAP rules
1 = amide
2 = ester
3 = amine
4 = urea
5 = ether
6 = olefin
7 = quaternary nitrogen
8 = aromatic N – carbon
9 = lactam N – carbon
10 = aromatic carbon – aromatic carbon
11 = sulphonamide
Xiao Qing Lewell, Duncan B. Judd, Stephen P. Watson, Michael M. Hann: RECAP – retrosynthetic combinatorial analysis procedure: a powerful new technique for identifying privileged molecular fragments with useful applications in combinatorial chemistry. J. Chem. Inf. Comput. Sci. 1998, 38, 511–522
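Rule-based fragmentation can be illustrated on a toy molecular graph. In this sketch each bond is assumed to be already classified by rule name (a real implementation would recognize the bond types by substructure matching); cleavable bonds are cut, and the two atoms of a cut bond are tagged "rule 1" and "rule 2" in the spirit of the labelled fragments in the example slide. The data layout and the function name `fragment` are hypothetical.

```python
def fragment(atoms, bonds, rules):
    """Cut every bond whose rule name is in `rules`.

    atoms: list of atom ids
    bonds: list of (a, b, rule) where rule is a rule name or None
    Returns a list of (sorted atom ids, sorted attachment labels) per fragment.
    """
    kept = {a: set() for a in atoms}      # adjacency over bonds that survive
    labels = {a: [] for a in atoms}       # attachment labels per atom
    for a, b, rule in bonds:
        if rule in rules:
            labels[a].append(f"{rule} 1")  # first atom of the cut bond
            labels[b].append(f"{rule} 2")  # second atom of the cut bond
        else:
            kept[a].add(b)
            kept[b].add(a)

    seen, fragments = set(), []
    for start in atoms:
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            component.append(atom)
            for neighbour in kept[atom]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        component.sort()
        tags = sorted(t for atom in component for t in labels[atom])
        fragments.append((component, tags))
    return fragments


# Toy six-atom chain with one amide and one ester bond (hypothetical).
atoms = [0, 1, 2, 3, 4, 5]
bonds = [(0, 1, None), (2, 1, "amide"), (2, 3, None),
         (3, 4, "ester"), (4, 5, None)]
for part in fragment(atoms, bonds, {"amide", "ester"}):
    print(part)
```

Keeping the attachment labels on the fragments is what later allows building blocks to be recombined only along bond types that a chemist could plausibly form.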
JChem AnalogMaker: general algorithm
1. Start: create the building block library.
2. Generate a pharmacophore hypothesis of the active compounds.
3. Create several starting compounds by random combination of some building blocks.
4. Select the parent structure.
5. On convergence, or at the end of optimization, stop.
6. Otherwise generate variants of the parent and return to step 4.
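The loop above can be sketched as a small evolutionary search. Everything concrete here is a toy assumption: building blocks are named tuples of feature counts, a compound is a tuple of block names, the "pharmacophore hypothesis" is a target feature vector, and a variant replaces one block at random. This is not the AnalogMaker implementation, only an instance of the same select-parent / generate-variants cycle.

```python
import random
from math import sqrt

# Hypothetical building block library: name -> feature counts.
BLOCKS = {"A": (1, 0, 0), "B": (0, 1, 0), "C": (0, 0, 1),
          "D": (1, 1, 0), "E": (0, 1, 1)}


def descriptor(compound):
    """Feature vector of a compound = sum of its blocks' feature counts."""
    return tuple(sum(BLOCKS[b][i] for b in compound) for i in range(3))


def distance(compound, hypothesis):
    """Euclidean distance between a compound and the hypothesis vector."""
    return sqrt(sum((x - y) ** 2
                    for x, y in zip(descriptor(compound), hypothesis)))


def variants(parent, rng, count):
    """Generate `count` variants by replacing one random block each."""
    result = []
    for _ in range(count):
        child = list(parent)
        child[rng.randrange(len(child))] = rng.choice(sorted(BLOCKS))
        result.append(tuple(child))
    return result


def optimize(hypothesis, size=3, starts=4, children=8,
             max_generations=50, patience=5, seed=0):
    rng = random.Random(seed)
    names = sorted(BLOCKS)
    # Several starting compounds by random combination of building blocks.
    population = [tuple(rng.choice(names) for _ in range(size))
                  for _ in range(starts)]
    parent = min(population, key=lambda c: distance(c, hypothesis))
    stalled = 0
    for _ in range(max_generations):
        # Convergence: perfect match, or no improvement for `patience` rounds.
        if stalled >= patience or distance(parent, hypothesis) == 0:
            break
        best = min(variants(parent, rng, children),
                   key=lambda c: distance(c, hypothesis))
        if distance(best, hypothesis) < distance(parent, hypothesis):
            parent, stalled = best, 0
        else:
            stalled += 1
    return parent, distance(parent, hypothesis)


print(optimize((2, 2, 1)))
```

The parent is replaced only by a strictly better variant, so the distance to the hypothesis never increases from one generation to the next.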
Variant generation example: TOPAS modifier
G. Schneider et al., J. Comput.-Aided Mol. Design, 14 (2000): 487-494
G. Schneider et al., Angew. Chem. Int. Ed., 39 (2000): 4130-4133
Drug research: finding or making a needle in the haystack?
Virtual screening (JChem Screen), de novo design (JChem AnalogMaker), or something in between?
Drug research: screening a virtual compound space
Virtual screening (JChem Screen)
  Advantages: fast; hits are readily available for in vitro screening
  Disadvantages: limited number of available compounds
Random virtual synthesis (JChem Synthesizer)
  Advantages: fast; virtual molecules are likely to be synthetically available; practically infinite virtual compound space; structural novelty
De novo design (JChem AnalogMaker)
  Advantages: practically unlimited virtual compound space; structural novelty
  Disadvantages: synthetic accessibility of virtual hits is a problem
Screening a virtual compound space: smart reactions
Generic (simple)
  the equation describes the transformation only
  a few hundred generic reactions can form the basic armory of a preparative chemist
Specific (complex)
  chemoselective: recognizes reactive and inactive functional groups
  regioselective: "knows" directing rules
  stereoselective: inversion/retention
Customizable
  to improve reaction model quality
Smart reactions: chemoselectivity
REACTIVITY: !match(ratom(3), "[#6][N,O,S:1][N,O,S]", 1)
REACTIVITY: !match(ratom(3), "[#6][N,O,S:1][N,O,S]", 1) && !match(ratom(3), "[N,O,S:1][C,P,S]=[N,O,S]", 1)
Smart reactions: regioselectivity
SELECTIVITY: -charge(ratom(1))
TOLERANCE: 0.0045
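One plausible reading of a SELECTIVITY/TOLERANCE pair is sketched below: each candidate reaction site gets a score from the selectivity expression (here the negated partial charge of the first reacting atom), the highest score wins, and any site whose score lies within the tolerance of the best one is accepted as well. This interpretation, the function name `select_sites`, and the charge values are assumptions; the slide text itself only gives the rule.

```python
def select_sites(charges, tolerance):
    """Pick the preferred reaction sites (assumed semantics).

    charges:   dict mapping a site name to the partial charge on ratom(1)
    tolerance: sites scoring within this margin of the best are also kept
    """
    # SELECTIVITY: -charge(ratom(1)), i.e. more negative charge scores higher.
    scores = {site: -charge for site, charge in charges.items()}
    best = max(scores.values())
    return sorted(site for site, score in scores.items()
                  if best - score <= tolerance)


# Hypothetical partial charges for three candidate ring positions.
charges = {"C2": -0.0810, "C3": -0.0300, "C4": -0.0790}
print(select_sites(charges, tolerance=0.0045))
print(select_sites(charges, tolerance=0.0010))
```

Under this reading, a small tolerance yields a single regioisomer, while a larger one yields product mixtures wherever the electronic preference is too weak to discriminate.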
Smart reaction library example: Baeyer-Villiger ketone oxidation
SELECTIVITY: charge(ratom(2), "sigma")
[Figure: generic reaction]
[Figure: example]
JChem Synthesizer: workflow
[Figure: Available chemicals and the Smart reaction library feed the Synthesizer, which produces the Virtual compound space; Screen compares it with Active set 1 ... Active set n and returns the Hits.]
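The workflow can be sketched end to end with toy data: pick a reaction and two available chemicals at random, apply the reaction if the required functional groups are present, and screen the product against each active set. The chemicals, reactions, fingerprints, the bitwise-OR product fingerprint, and all function names are hypothetical stand-ins; a real system would apply reactions to structures and compute fingerprints from the products.

```python
import random

# Hypothetical available chemicals: name -> (functional groups, fingerprint).
CHEMICALS = {
    "aldehyde_1": ({"CHO"}, "11000000"),
    "aldehyde_2": ({"CHO"}, "10100000"),
    "malonate_1": ({"CH2"}, "00011000"),
    "acrylate_1": ({"C=C"}, "00000110"),
}
# Hypothetical reaction library: name -> groups required on reactants 1 and 2.
REACTIONS = {
    "condensation": ("CHO", "CH2"),
    "vinyl_alkylation": ("CHO", "C=C"),
}
# Hypothetical active sets, each a list of fingerprints.
ACTIVE_SETS = {
    "set_1": ["11011000"],
    "set_2": ["10100110", "11000110"],
}


def tanimoto(a, b):
    both = sum(1 for x, y in zip(a, b) if x == "1" and y == "1")
    either = sum(1 for x, y in zip(a, b) if x == "1" or y == "1")
    return both / either if either else 1.0


def react(rxn, first, second):
    """Return (product name, fingerprint), or None if groups do not match."""
    need_1, need_2 = REACTIONS[rxn]
    groups_1, fp_1 = CHEMICALS[first]
    groups_2, fp_2 = CHEMICALS[second]
    if need_1 not in groups_1 or need_2 not in groups_2:
        return None
    # Toy assumption: the product fingerprint is the OR of the reactants'.
    fp = "".join("1" if x == "1" or y == "1" else "0"
                 for x, y in zip(fp_1, fp_2))
    return f"{first}+{second}[{rxn}]", fp


def synthesize_and_screen(attempts, threshold, seed=0):
    """Random one-step synthesis followed by screening; returns hit scores."""
    rng = random.Random(seed)
    names, rxns = sorted(CHEMICALS), sorted(REACTIONS)
    hits = {}
    for _ in range(attempts):
        product = react(rng.choice(rxns), rng.choice(names), rng.choice(names))
        if product is None:
            continue  # reactants lack the required functional groups
        name, fp = product
        for set_name, actives in ACTIVE_SETS.items():
            score = max(tanimoto(fp, active) for active in actives)
            if score >= threshold:
                hits[(name, set_name)] = score
    return hits


for key, score in sorted(synthesize_and_screen(200, 0.9).items()):
    print(key, score)
```

This sketch performs single-step syntheses only; feeding products back into the pool of available chemicals would extend it to multi-step paths.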
JChem Synthesizer example: dopamine D2 actives
Active sets were kindly provided by Aureus Pharma within a research collaboration between Aureus and ChemAxon.
JChem Synthesizer example: virtual hits
Similarity: 2D pharmacophore fingerprint, weighted Euclidean metric optimized for 20 random D2 actives
JChem Synthesizer example: best virtual hits
[Figure: four structures labelled 9.88, 9.82, 9.53, 9.73]
JChem Synthesizer example: synthesis path
Step 1: Knoevenagel-Doebner condensation
Step 2: Baylis-Hillman vinyl alkylation
Step 3: Lawesson thiacarbonylation
Step 4: Dess-Martin alcohol oxidation
JChem Synthesizer example: software and performance data
virtual reactions: 500-1000 reactions/s
random synthesis: 10-20 structures/s
pharmacophore fingerprint generation: 100 structures/s (includes pharmacophore point perception)
metric optimization: 57 s (13 parameterized metrics, 20 structures in training set, 50 spikes)
virtual screening: 7500 structures/s
pure Java client: P4 1.6 GHz, RH Linux, Java 1.4.2
database server: P4 2.4 GHz, Windows XP, MySQL
Acknowledgements
Jean-Michael Drancourt, François Petitet
Modest von Korff, Matthias Steger (Axovan is now part of Actelion.)
Alex Allardyce, ChemAxon
Contact
Miklós Vargyas
mvargyas@chemaxon.hu
office: +36 1 453 2661
mobile: +36 70 381 3205