Standardizer: Molecular Cosmetics for Chemoinformatics
György Pirok
Java Solutions for Cheminformatics
Why standardize structures?
Canonicalisation: Uniformization of structures without changing the chemical content, in order to recognize duplicates and functional groups (aromatization, mesomers, tautomers, ...).
Beautification: Making the structures visually more attractive (dearomatization, cleaning coordinates, wedge orientation, ...).
Modification: Conversion of structures by modifying their original content as a preparation step for further chemoinformatics tasks (transformations, removing stereo, removing R-groups, ...).
Canonicalisation
Hydrogens: making hydrogens explicit; making hydrogens implicit
Resonant structures: converting to canonical mesomer form; transforming to user defined mesomer form; aromatizing Kekulé rings
Tautomers: converting to canonical tautomer form; transforming to user defined tautomer form
Other: removing user defined fragments; removing small fragments; expanding stoichiometry; setting the chiral flag
Mesomers
Tautomers: oxo-enol, enamine-imine
Fragment removal
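The idea of removing small fragments can be sketched in plain Java without any chemistry toolkit. This is only an illustration, not how Standardizer implements it: the class `LargestFragment` and both method names are hypothetical, fragments are taken to be the dot-separated parts of a SMILES string, and fragment size is estimated with a simplified heavy-atom count that is an assumption of this sketch.

```java
class LargestFragment {

    // Simplified heavy-atom count for one SMILES fragment (assumption of this sketch).
    // Bracket atoms count once unless the symbol is hydrogen; outside brackets only
    // the organic subset (B, C, N, O, P, S, F, Cl, Br, I and aromatic b, c, n, o, p, s)
    // is recognized.
    static int heavyAtomCount(String fragment) {
        int count = 0;
        int i = 0;
        while (i < fragment.length()) {
            char ch = fragment.charAt(i);
            if (ch == '[') {
                int close = fragment.indexOf(']', i);
                if (close < 0) {
                    close = fragment.length(); // unbalanced bracket: read to the end
                }
                // Drop a leading isotope number, e.g. "2H" becomes "H".
                String inside = fragment.substring(i + 1, close).replaceFirst("^\\d+", "");
                boolean hydrogen = inside.startsWith("H")
                        && (inside.length() == 1 || !Character.isLowerCase(inside.charAt(1)));
                if (!hydrogen) {
                    count++;
                }
                i = close + 1;
            } else if (fragment.startsWith("Cl", i) || fragment.startsWith("Br", i)) {
                count++;
                i += 2;
            } else if ("BCNOPSFI".indexOf(ch) >= 0 || "bcnops".indexOf(ch) >= 0) {
                count++;
                i++;
            } else {
                i++; // bonds, ring closures, branches
            }
        }
        return count;
    }

    // Keeps the fragment with the most heavy atoms; on a tie the first one wins.
    static String keepLargestFragment(String smiles) {
        String best = null;
        int bestCount = -1;
        for (String fragment : smiles.split("\\.")) {
            int n = heavyAtomCount(fragment);
            if (n > bestCount) {
                best = fragment;
                bestCount = n;
            }
        }
        return best == null ? smiles : best;
    }

    public static void main(String[] args) {
        System.out.println(keepLargestFragment("CC(=O)[O-].[Na+]"));
    }
}
```

A real standardizer works on the parsed molecular graph rather than on the SMILES text, so this sketch should be read only as a picture of the "keep the main component" step.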
Specific counterion removal
Solvent removal
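Removing specific counterions or solvents amounts to filtering fragments against a user-defined list. The sketch below illustrates that idea under loud assumptions: the class `FragmentFilter`, the method `removeListedFragments` and the contents of the `UNWANTED` list are all hypothetical, and fragments are compared as plain SMILES strings, so a fragment is matched only when it is written exactly like the list entry. A structure-based tool would compare molecular graphs instead.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

class FragmentFilter {

    // Hypothetical removal list: a few counterions and solvents as SMILES
    // (sodium, potassium, chloride, water, methanol, dichloromethane).
    static final Set<String> UNWANTED =
            Set.of("[Na+]", "[K+]", "[Cl-]", "O", "CO", "ClCCl");

    // Drops every dot-separated fragment that appears verbatim in the list.
    static String removeListedFragments(String smiles, Set<String> unwanted) {
        String kept = Arrays.stream(smiles.split("\\."))
                .filter(fragment -> !unwanted.contains(fragment))
                .collect(Collectors.joining("."));
        // If every fragment is on the list, return the input unchanged
        // instead of an empty structure.
        return kept.isEmpty() ? smiles : kept;
    }

    public static void main(String[] args) {
        System.out.println(removeListedFragments("CC(=O)[O-].[Na+]", UNWANTED));
        System.out.println(removeListedFragments("CCN.O.O", UNWANTED));
    }
}
```

Returning the input when nothing would remain is a design choice of this sketch, made so that a record consisting only of a solvent is not silently emptied.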
Beautification
Hydrogens: making hydrogens implicit
Resonant structures: converting aromatic rings to Kekulé format
Cleaning: calculating 2D coordinates; reallocating wedge bonds; template based cleaning; 3D geometry optimization
Groups: contracting/expanding/ungrouping abbreviated and multiple groups
Template-based Cleaning: 2D-coordinate calculation of macrocycles or bridged systems
Template-based Cleaning: orienting search results to the query
Canonicalization During Database Import
[Diagram: the client sends input structures to the server, where Standardizer (JChem Base / Cartridge) applies the canonicalization configuration; the relational database holds the original structures and the canonicalized structures.]
Sending Query to the Database
[Diagram: the client sends the query structure to the server, where Standardizer (JChem Base / Cartridge) applies the canonicalization configuration; the canonicalized query is compared to the canonicalized structures in the relational database.]
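The import and query slides rely on one principle: the same canonicalization is applied to stored structures and to the query, and the comparison happens on the canonicalized forms while the originals are kept. A minimal sketch of that principle follows. Everything in it is an assumption for illustration: `CanonicalIndex` is a hypothetical in-memory stand-in for the database, and `canonicalize` is a toy rule (sorting dot-separated SMILES fragments) standing in for a real canonicalization configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CanonicalIndex {

    // Canonical form -> original structures that share it.
    private final Map<String, List<String>> byCanonicalForm = new HashMap<>();

    // Toy canonicalization (assumption): order the fragments alphabetically so
    // that "[Na+].[Cl-]" and "[Cl-].[Na+]" map to the same key.
    static String canonicalize(String smiles) {
        String[] fragments = smiles.split("\\.");
        Arrays.sort(fragments);
        return String.join(".", fragments);
    }

    // Import: store the original under its canonicalized form.
    void importStructure(String original) {
        byCanonicalForm
                .computeIfAbsent(canonicalize(original), key -> new ArrayList<>())
                .add(original);
    }

    // Query: canonicalize the query the same way, then look up the originals.
    List<String> search(String query) {
        return byCanonicalForm.getOrDefault(canonicalize(query), List.of());
    }

    public static void main(String[] args) {
        CanonicalIndex index = new CanonicalIndex();
        index.importStructure("[Na+].[Cl-]");
        index.importStructure("[Cl-].[Na+]");
        System.out.println(index.search("[Na+].[Cl-]"));
    }
}
```

Because the originals are stored next to the canonical key, they stay available for later display, which matches the separation between canonicalized and original structures shown in the diagrams.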
Displaying Result Structures
[Diagram: on the server, Standardizer (JChem Base / Cartridge) applies the beautification configuration to the original structures from the relational database; the client receives the beautified structures.]
Modification: custom transformations
API and command line interface
API (Java):
    Standardizer st = new Standardizer(new File("standardize.xml"));
    st.standardize(mol);
Command line:
    standardize input.sdf -c config.xml -o output.smiles
Standardizer GUI
Applications: Virtual Synthesis
Applications: Structure Databases
Acknowledgments
Ferenc Csizmadia Nóra Máté István Cseh Szabó Attila Alex Allardyce Szilárd Dóránt Péter Kovács Szabolcs Csepregi