Structural Search Using ChemAxon Tools Szabolcs Csepregi JChem version 5.3, April 2010
Structural Search Using ChemAxon Tools
  Interfaces
  Search types and options
  Query features, stereo searching
  Special search types: reaction, R-group search, Chemical Terms filters
  Searching against combinatorial Markush structures
  Fingerprint screening
  Performance
  Applications of structural search: R-group decomposition, Standardizer, Reactor, Pmapper, Fragmenter
  Future plans
All examples were generated by Marvin
Structural search interfaces
Example web GUIs:
  JSP (Java Server Pages)
  AJAX example: JavaScript and JChem Web Services
Command line: jcsearch
Java and .NET API:
  MolSearch class: in memory
  JChemSearch class: in database
Cartridge: Oracle SQL
JChem Web Services
JChem Base JSP example: http://www.jchem.com/examples/jsp1_x/index.jsp
Jcsearch user’s guide: http://www.jchem.com/doc/user/Jcsearch.html
API documentation: http://www.jchem.com/doc/api/index.html
Search classes:
  chemaxon.sss.search.Search
  chemaxon.sss.search.MolSearch
  chemaxon.jchem.db.JChemSearch
Instant JChem
JChem for Excel
Search types in JChem
Atom-by-atom search (structural search):
  Substructure
  Superstructure
  Full structure
  Duplicate
Similarity search:
  Different descriptors
  Different metrics
MC(E)S: maximum common (edge) substructure
[Figure: table of structural search types with an example query and result for each]
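Similarity search pairs a descriptor with a metric. As a minimal sketch of one widely used combination (binary fingerprints compared with the Tanimoto coefficient), the following may help; it is not taken from JChem, and the function name and the toy 8-bit fingerprints are illustrative assumptions.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto coefficient of two binary fingerprints stored as ints."""
    common = bin(fp_a & fp_b).count("1")  # bits set in both fingerprints
    total = bin(fp_a | fp_b).count("1")   # bits set in at least one
    # Two empty fingerprints are treated as identical (an assumption).
    return common / total if total else 1.0


# Toy fingerprints (hypothetical, not produced by a real descriptor).
query_fp = 0b10110010
target_fp = 0b10100110

similarity = tanimoto(query_fp, target_fp)
dissimilarity = 1.0 - similarity
```

A similarity search would keep the targets whose dissimilarity to the query stays below a chosen threshold; swapping the descriptor or the metric changes only the two inputs and the function body.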
Search options
Some selected structure search options:
  Stereo on/off/diastereomers
  Ignore charge/isotope/radical/valence/polymers, etc.
  Vague bond matching options
  Chemical Terms filter
  Tautomer search (even in substructure search)
  Inverse hit list
  Maximum search time / number of hits
  Combine with non-structure conditions
  Ordering of results
  Similarity type / metric
Hit coloring and alignment
Query features 1. Atomic features
Query atom types:
  any (A, AH)
  hetero (Q, QH)
  list, not list
  metal (M, MH)
  halogen (X, XH)
  periodic table groups (G1-18)
Pseudo atoms, e.g. “Resin”
Explicit lone pairs (also match implied lone pairs)
Charge, isotope, radical
Link nodes (repeatable)
http://www.jchem.com/doc/user/Query.html#qatoms
Query features 2. Query properties
  Symbol       Description
  H<n>         Total hydrogen count
  a            Aromatic
  A            Aliphatic
  R<n>         Ring count in SSSR
  r<n>         Ring size in SSSR
  v<n>         Valence
  X<n>         Connectivity
  D<n>         Degree
  h<n>         Implicit H count
  rb<n>, rb*   Ring bond count (*: as drawn)
  s<n>, s*     Substitution count (*: as drawn)
  u            Unsaturated atom
http://www.jchem.com/doc/user/Query.html#qatoms
Query features 3. Atomic SMARTS features
SMARTS atoms
Additional query properties:
  Symbol        Description
  & ; , !       Logical operators
  $(<smarts>)   Recursive SMARTS
  +0, -0        Zero charge
Example: carbonyl C, but not amide
http://www.jchem.com/doc/user/Query.html
Query features 4. Homology atoms
Can be used:
  In queries against molecule and reaction tables
  In Markush structures
Built-in and user-defined groups
http://www.jchem.com/doc/user/Query.html
Query features 5. Bond features & components
Query bond types: any, single or double, single or aromatic, double or aromatic
Bond topology: chain/ring
SMARTS bonds:
  Symbol        Description
  - = #         Single, double, triple
  :             Aromatic
  & , ; !       Logical operators
  @             Ring bond
  / \ /? \?     Directional bond (cis/trans)
Component level grouping:
  Symbol        Description
  (C.C)         Same component
  (C).(C)       Different component
  C.C           No component restrictions
http://www.jchem.com/doc/user/Query.html#qbonds
http://www.jchem.com/doc/user/Query.html#compLevGr
Coordination compounds
Atom-to-atom (dative) and multicenter coordinate bonds
Alternative representations:
Position variation bond
http://www.jchem.com/doc/user/Query.html#qbonds
http://www.jchem.com/doc/user/Query.html#compLevGr
Hydrogens
H representations:
  Explicit
  Implicit
  Query H count: total (H<n>), implicit (h<n>)
[Figure: query and target examples showing how explicit H, implicit H and query H count are considered in atom-by-atom search (ABAS)]
http://www.jchem.com/doc/user/Query.html#explH
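The difference between the total (H<n>) and implicit (h<n>) query hydrogen counts can be illustrated with a toy atom model. This is only a sketch of the counting idea under a simplified data model; the Atom class and both helper functions are hypothetical and are not part of the JChem API.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    symbol: str
    explicit_h: int  # hydrogens drawn as separate atoms
    implicit_h: int  # hydrogens implied by valence, not drawn


def matches_total_h(atom: Atom, n: int) -> bool:
    """H<n>: explicit plus implicit hydrogens must add up to n."""
    return atom.explicit_h + atom.implicit_h == n


def matches_implicit_h(atom: Atom, n: int) -> bool:
    """h<n>: only the implicit hydrogens are counted."""
    return atom.implicit_h == n


# A CH2 carbon drawn with one hydrogen explicit and one left implicit.
carbon = Atom("C", explicit_h=1, implicit_h=1)
```

With this atom, a total count of 2 matches while an implicit count of 2 does not, which is why the two query properties can give different hit lists for the same target drawn in different ways.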
Stereo searching 1. Double bonds
Query double bond meanings (each shown with its depiction):
  Cis
  Trans
  Cis or trans (unknown)
  Not trans
  Not cis
Levels of check:
  All
  Only marked double bonds (MDL: stereo care flag)
  None
http://www.jchem.com/doc/user/Query.html#stereobond
Stereo searching 2. Tetrahedral chirality
Stereo bond types: up, down, up or down
Relative stereo configuration
Chiral flag model
Enhanced stereo representation: AND<n>, OR<n>, ABS groups
http://www.jchem.com/doc/user/Query.html#chirality
Groups integration (query & target)
Both sides are treated similarly by the search:
  Abbreviations (super-atom S-groups)
  Multiple groups
  Other S-groups supported: component, mixture, formulation, many polymer brackets
Reaction search
  Reactants, agents, products
  Transformation recognition (mapping)
  Stereospecific reactions (inversion, retention)
  Reactant grouping
  Reacting center
http://www.jchem.com/doc/user/Query.html#reaction
R-group search
  Scaffold, R-group definitions
  Monovalent, divalent R-groups
  R-logic: occurrence, if-then, rest H
http://www.jchem.com/doc/user/Query.html#markush
Undefined R-atoms - No substitution elsewhere retrieves:
Polymer storage and search
Comprehensive representation:
  Source based and structure based
  Copolymer types, mixtures, ladder-type polymers, etc.
  Phase shifting
  End groups: specific, undefined, etc.
Flexible Attached data search
Wide range of polymer search options
Chemical Terms filter
Chemically aware filtering for structure and similarity searches
Elements of the Chemical Terms language:
  structure matching functions (describing functional groups, reaction sites, similarity, etc.)
  property calculations (partial charge distribution, pKa, logP, HB donors, acceptors, topological descriptors, etc.)
  arithmetic and logic operators
Examples:
  Lipinski rule of 5
    (mass() <= 500) && (logP() <= 5) && (donorCount() <= 5) && (acceptorCount() <= 10);
  Veber filter
    (rotatableBondCount() <= 10) && (PSA() <= 140);
http://www.jchem.com/doc/user/Evaluator.html
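The logic of the two example filters can be mirrored outside Chemical Terms. The sketch below assumes the properties have already been calculated and stored in a dictionary keyed by the Chemical Terms function names; the property values and the compound names are hypothetical, and nothing here calls ChemAxon code.

```python
def lipinski_rule_of_5(props: dict) -> bool:
    """Same thresholds as the Lipinski expression in the slide."""
    return (props["mass"] <= 500
            and props["logP"] <= 5
            and props["donorCount"] <= 5
            and props["acceptorCount"] <= 10)


def veber_filter(props: dict) -> bool:
    """Same thresholds as the Veber expression in the slide."""
    return props["rotatableBondCount"] <= 10 and props["PSA"] <= 140


# Hypothetical precomputed properties for two made-up compounds.
compound_a = {"mass": 320.4, "logP": 2.1, "donorCount": 2,
              "acceptorCount": 5, "rotatableBondCount": 4, "PSA": 78.0}
compound_b = {"mass": 612.7, "logP": 6.3, "donorCount": 3,
              "acceptorCount": 9, "rotatableBondCount": 14, "PSA": 155.0}

# Keep only the compounds that pass both filters.
library = {"compound_a": compound_a, "compound_b": compound_b}
passing = [name for name, props in library.items()
           if lipinski_rule_of_5(props) and veber_filter(props)]
```

In a JChem search the same conditions would be supplied as a Chemical Terms filter alongside the structural query, so that property calculation and structure matching happen in one search.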
Markush structure registration and search
Markush structures
Markush features:
  R-groups
  Atom lists, bond lists
  Position variation bond
  Link nodes and repeating units
  Homology groups
Compatible enumeration plugin
http://www.jchem.com/doc/user/Query.html#explH
Fingerprint screening in the database
JChem database searches use fingerprint screening to speed up searching. Screening rapidly* filters out most non-hits: usually more than 99% of them are rejected.
Supported fingerprint types:
  Chemical hashed fingerprints
  User-defined additional structural keys
* Average screening time in a 3-million cached table: ~0.1 s
[Figure: search workflow. The search query is screened against the fingerprints of the JChem table; structures that are screened out are discarded, and those that still need to be searched go through atom-by-atom search to give the results (hits for the query).]
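The two-stage idea (cheap fingerprint screen first, expensive atom-by-atom search only on the survivors) can be sketched as follows. This is a generic illustration of substructure screening with bit fingerprints, not JChem's implementation; the fingerprints, the molecule IDs and the stand-in atom-by-atom check are all hypothetical.

```python
from typing import Callable, Dict, List


def passes_screen(query_fp: int, target_fp: int) -> bool:
    """A target can only contain the query if every bit set in the
    query fingerprint is also set in the target fingerprint."""
    return (query_fp & target_fp) == query_fp


def screened_search(query_fp: int,
                    table: Dict[str, int],
                    atom_by_atom: Callable[[str], bool]) -> List[str]:
    """Run the expensive check only on structures that pass the screen."""
    candidates = [cid for cid, fp in table.items()
                  if passes_screen(query_fp, fp)]
    return [cid for cid in candidates if atom_by_atom(cid)]


# Toy 4-bit fingerprints keyed by hypothetical molecule IDs.
table = {"mol1": 0b1111, "mol2": 0b0101, "mol3": 0b1010, "mol4": 0b0111}
query_fp = 0b0101

checked = []  # records which structures reached the expensive stage


def fake_atom_by_atom(cid: str) -> bool:
    """Stand-in for a real graph match; mol4 plays a screen false positive."""
    checked.append(cid)
    return cid != "mol4"


hits = screened_search(query_fp, table, fake_atom_by_atom)
```

The screen can let false positives through (mol4 here) but never rejects a true hit, so the atom-by-atom stage is still required for the final answer; it just runs on far fewer structures.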
Application: R-group decomposition
JChem can identify the ligands attached to a given scaffold at specified substitution positions.
[Figure: R-group decomposition of a library with a query (scaffold), and the result]
Further applications of structural search in JChem
Transformations - Standardizer & Reactor
Identification of pharmacophoric groups - Pmapper
Identification of bond cleavage - Fragmenter
Examples shown as structure images: nitro; amidine; ether cut; converting covalent form of alcoholates to ionic form; enamine-amine tautomerism
Performance
Substructure searching in 19.5 million structures (PubChem):
  Number of hits   Search time
  2                0.91 s
  93               0.98 s
  6,001            1.30 s
  146,256          5.66 s
(Each row corresponds to one query structure, shown as an image.)
JChem Base 5.2.2, Intel Quad Q6600 2.4 GHz, 8 GB RAM; Oracle 10.2.0.3
Compound registration:
  Number of compounds   Elapsed time (duplicates not checked)   Elapsed time (duplicates checked)
  10,000                21 s                                    26 s
  100,000               2 min 4 s                               2 min 34 s
  200,000               4 min 24 s                              5 min 13 s
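The registration timings can be turned into throughput figures with simple arithmetic. The sketch below uses only the numbers reported in the compound registration table; the variable and function names are illustrative.

```python
# Compounds -> (seconds without duplicate check, seconds with duplicate check),
# taken from the compound registration timings reported in the slide.
registration = {
    10_000: (21, 26),
    100_000: (2 * 60 + 4, 2 * 60 + 34),
    200_000: (4 * 60 + 24, 5 * 60 + 13),
}


def rates(timings: dict) -> dict:
    """Compounds registered per second, without and with duplicate checking."""
    return {n: (n / unchecked, n / checked)
            for n, (unchecked, checked) in timings.items()}


def duplicate_check_overhead(timings: dict) -> dict:
    """Relative extra time spent when duplicates are checked."""
    return {n: checked / unchecked - 1
            for n, (unchecked, checked) in timings.items()}


throughput = rates(registration)
overhead = duplicate_check_overhead(registration)

for n in registration:
    unchecked_rate, checked_rate = throughput[n]
    print(f"{n:>7} compounds: {unchecked_rate:.0f}/s unchecked, "
          f"{checked_rate:.0f}/s checked, overhead {overhead[n]:.0%}")
```

On these figures, duplicate checking adds roughly 19% to 24% to the elapsed time, and throughput for the larger batches is higher than for the 10,000-compound batch.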
Future plans
  R-group decomposition GUI in client applications
  Visualization of similarity search results using MCS
  Diastereomer search
  Markush search enhancements (homology variation conditions, maximum common substructure, etc.)
Summary
The JChem suite contains a broad range of chemical search facilities, including Markush structure analysis.
Structural search is a useful tool for many applications.
References
JChem Query Guide: http://www.chemaxon.com/jchem/doc/user/Query.html
Chemical Terms reference: http://www.chemaxon.com/jchem/marvin/help/chemicalterms/ChemicalTerms.html
JChem Base JSP demo page: http://www.chemaxon.com/jchem/examples/db_search/index.jsp
Jcsearch command line tool: http://www.chemaxon.com/jchem/doc/user/Jcsearch.html
API documentation (chemaxon.sss.search.MolSearch, chemaxon.jchem.db.JChemSearch): http://www.chemaxon.com/jchem/doc/api/index.html
JChem Base: http://www.chemaxon.com/product/jc_base.html
JChem Cartridge: http://www.chemaxon.com/product/jc_cart.html
Instant JChem: http://www.chemaxon.com/product/ijc.html
JChem for Excel: http://www.chemaxon.com/products/jchem-for-excel/
Thank you for your attention Máramaros köz 3/a Budapest, 1037 Hungary info@chemaxon.com www.chemaxon.com