What's new in JChem back-end and Markush storage, search and enumeration. Szabolcs Csepregi, Solutions for Cheminformatics.



Contents
– ChemAxon chemical database tools
– Main features of JChem Base, Cartridge
– Example interfaces: JSP, ASP, AJAX examples
– Integration with other CXN products
– Markush structure storage, search and enumeration
– Recent developments, plans

Chemical database products
JChem Base
  – A library for adding chemical structures into relational database systems. Available in Java, JSP and .NET
  – Open-source web application example is available.
JChem Cartridge for Oracle
  – Extends Oracle SQL with chemical operators and index.
  – SQL interface for ChemAxon functionality
Instant JChem
  – An all-in-one desktop chemical database application.
JChem Web Services
  – SOAP interface to JChem Base
JC4XL
  – Excel integration (coming)

Compatibility and integration
Supported chemical file formats:
  – SMILES
  – MDL MOL/RXN/SDF/RDF (v2000 and v3000)
  – CML, MRV
  – IUPAC and traditional names
  – InChI, mol2, PDB, etc.
Database engines:
  – Oracle, MySQL, MS SQL Server, MS Access, PostgreSQL, IBM DB2, Derby, etc.
All operating systems through:
  – Java API (JChem Base)
  – .NET API (JChem Base + IKVM), for Windows
  – SQL (Cartridge)

Structure searching: features
– Substructure, Similarity, Full, Full fragment, etc. search types
– Wide range of query atoms
– Query properties
– R-group queries
– Full SMARTS support
– Coordination compounds
– Link nodes
– Pseudo atoms, Lone pairs
– Relative stereo
– Reaction search features
– Polymers
– Position variation
– Hit coloring

Structure searching: options
Some selected structure search options:
  – Chemical Terms filter constraint
  – Tautomer search
  – Stereo on/off
  – Ignore charge/isotope/radical/valence/polymers
  – Vague bond matching modes: or aromatic; ignore bond types
  – Inverse hit list
  – Maximum search time / number of hits
  – SQL SELECT statement for pre-filtering
  – Ordering of results
  – etc.
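
A toy sketch of how three of the listed options could interact: inverse hit list, maximum number of hits, and maximum search time. Plain substring matching stands in for real structure matching, and the function and parameter names are hypothetical illustrations, not JChem's API.

```python
import time
from typing import Callable, Iterable, List, Optional


def search(targets: Iterable[str],
           matches: Callable[[str], bool],
           inverse: bool = False,
           max_hits: Optional[int] = None,
           max_seconds: Optional[float] = None) -> List[int]:
    """Return the indices of hits, honouring the three options.

    Hypothetical helper for illustration only.
    """
    start = time.monotonic()
    hits: List[int] = []
    for i, target in enumerate(targets):
        # Stop scanning once the time budget is used up.
        if max_seconds is not None and time.monotonic() - start > max_seconds:
            break
        # With inverse=True, the non-matching targets are reported instead.
        if matches(target) != inverse:
            hits.append(i)
            # Stop as soon as enough hits have been collected.
            if max_hits is not None and len(hits) >= max_hits:
                break
    return hits


smiles = ["CCO", "c1ccccc1", "CCN", "CC(=O)O"]


def has_oxygen(s: str) -> bool:
    # Stand-in for a substructure match: "contains an O character".
    return "O" in s


print(search(smiles, has_oxygen))                # [0, 3]
print(search(smiles, has_oxygen, inverse=True))  # [1, 2]
print(search(smiles, has_oxygen, max_hits=1))    # [0]
```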

Structure search: performance
JChem Base 5.2.0, Intel Quad Q GHz, 8GB RAM; Oracle

Compound registration:
Number of compounds   Elapsed time (duplicates not checked)   Elapsed time (duplicates checked)
10,000                21 s                                    26 s
100,000               2 min                                   2 min 36 s
200,000               3 min 45 s                              5 min 5 s

Substructure search in PubChem (19.5 million compounds):
Query   Number of hits   Search time
[query structures and result values missing]

Table types
Control allowed chemical structures and available operations
  – Molecule
  – Reaction
  – Markush
  – Query
  – Any structure

Example web applications
Open source JSP, ASP examples
  – Marvin applets are used for query drawing and structure visualization
AJAX example
  – Back-end is JChem Web Services
  – No Java is needed for browsing
Demo

Integration
Integration with other ChemAxon tools:
  – Custom, uniform chemical representation (Standardizer; see separate presentation today)
  – Automatically calculated properties by Chemical Terms
      Calculated columns (Calculator plugins)
  – Additional similarity calculations (Screen, JChem Base only)
  – Tautomer handling:
      Tautomer search
      Tautomer duplicate filter table/index option
      Custom tautomer transforms or canonical tautomer using Standardizer
  – Query drawing and structure visualization (Marvin)
Provides the most consistent interface and back-end.

Integration
Additional Cartridge functionality:
  – JChem index (for non-JChem tables)
  – Communication with Oracle optimizer
  – Reaction based enumeration (Reactor)
  – Format conversions, including image generation
  – Markush enumeration (Calculator plugins)
  – Property predictions through Chemical Terms (Calculator plugins)

Registration system
A new component for the registration system is under development (API only)
Main features:
  – Customizable business logic
      Multilevel duplication control
      Customizable corporate registration ID
      Handling of salts, batches, lots, samples, and mixtures
  – Identification, split and registration of salt and solvent structures
      Storage of input structures in original format
  – Mock registration (dry run)
  – Pre-registration through a transitory area
  – Basic, customizable implementation examples
      Separate examples for chemists and registrars
Web and Instant JChem interfaces will follow later

Handling of Markush structures

Markush structures
Combinatorial Markush structure registration and search
Features handled in search and enumeration:
  – R-groups (nesting to any depth)
  – Atom lists, bond lists
  – Position variation bond
  – Link nodes
  – Repeating units
  – Homology groups (aryl, alkyl, etc.)
      Built-in
      User-defined
Compatible Markush enumeration plugin

Markush Enumeration
Markush enumeration plugin
  – Full enumeration
  – Selected parts only
  – Random enumeration
  – Calculate library size: exact size of huge Markush libraries (arbitrary precision) or magnitude
  – Scaffold alignment and coloring
  – Markush code
  – Optional example homology group enumeration
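
To make the difference between full and random enumeration concrete, here is a minimal sketch on a toy Markush structure. It treats the scaffold as a text template and the R-group definitions as lists of SMILES fragments; this is plain string substitution with hypothetical helper names, not the Markush enumeration plugin and not chemistry-aware.

```python
import itertools
import random
from typing import Dict, Iterator, List, Optional


def enumerate_all(scaffold: str, r_groups: Dict[str, List[str]]) -> Iterator[str]:
    """Full enumeration: every combination of R-group members."""
    names = list(r_groups)
    for combo in itertools.product(*(r_groups[n] for n in names)):
        yield scaffold.format(**dict(zip(names, combo)))


def enumerate_random(scaffold: str, r_groups: Dict[str, List[str]],
                     count: int, seed: Optional[int] = None) -> List[str]:
    """Random enumeration: pick one member per R-group, `count` times."""
    rng = random.Random(seed)
    return [
        scaffold.format(**{n: rng.choice(members) for n, members in r_groups.items()})
        for _ in range(count)
    ]


# Toy example: a benzene ring with two variable substituents.
scaffold = "c1cc({R1})ccc1{R2}"
r_groups = {"R1": ["F", "Cl", "Br"], "R2": ["C", "CC"]}

full = list(enumerate_all(scaffold, r_groups))
sample = enumerate_random(scaffold, r_groups, count=4, seed=0)
print(len(full))   # 6
print(full[0])     # c1cc(F)ccc1C
```

Full enumeration grows as the product of the R-group sizes, which is why random sampling or enumerating selected parts only becomes the practical choice for large libraries.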

Markush storage & search
– Available in JChem Base and Instant JChem
– No enumeration involved, so very complex Markush structures can be handled (tested up to 10^40; no explicit limits were built in)
– Substructure and Full structure search
– Basic query features supported
– Substructure hit visualization: Markush structure reduction
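
A library of this size can be counted without enumerating anything. The sketch below assumes a simplified model in which each R-group is a list of alternatives and each alternative may itself reference nested R-groups; the library size is then a product of sums. The data layout and function name are hypothetical, cyclic definitions are not handled, and duplicates arising from symmetry are ignored.

```python
from functools import lru_cache
from math import prod
from typing import Dict, List


def library_size(scaffold_sites: List[str],
                 definitions: Dict[str, List[List[str]]]) -> int:
    """Exact number of R-group combinations, using Python's big integers.

    definitions[name] is a list of alternatives; each alternative is the
    list of nested R-group names it contains (empty for a plain fragment).
    """
    @lru_cache(maxsize=None)
    def size(name: str) -> int:
        # Each alternative contributes the product of its nested sizes;
        # the product over an empty list is 1 (a plain fragment).
        return sum(prod(size(n) for n in nested) for nested in definitions[name])

    return prod(size(site) for site in scaffold_sites)


# R1 has two plain fragments plus one fragment that carries a nested R3.
small = library_size(
    ["R1", "R2"],
    {"R1": [[], [], ["R3"]], "R2": [[] for _ in range(10)], "R3": [[], [], [], []]},
)
print(small)  # 60

# Twenty variable sites with 100 alternatives each.
huge = library_size(["R"] * 20, {"R": [[] for _ in range(100)]})
print(huge == 10 ** 40)    # True
print(len(str(huge)) - 1)  # 40, the order of magnitude
```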

Markush demo

What's new

What's new: JChem Base
5.1
  – Position variation in queries
  – New fast & reliable tautomer duplicate search
5.2
  – .NET API
  – Polymer storage and search
  – New query options and features including searching of attached data, group matching of undefined R-atoms, repeating units.
  – Improved substructure search performance
  – JChem Web Services
  – New metrics for similarity search (Tversky, etc.) (5.2.2)
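
For reference, the Tversky index of two fingerprints A and B is |A∩B| / (|A∩B| + α|A\B| + β|B\A|). The sketch below computes it on fingerprints modelled as sets of "on" bit positions; it is a generic illustration rather than JChem's implementation, and returning 1.0 for two empty fingerprints is an assumption made here.

```python
from typing import Set


def tversky(a: Set[int], b: Set[int], alpha: float, beta: float) -> float:
    """Tversky similarity of two fingerprints given as sets of on-bits."""
    common = len(a & b)
    denominator = common + alpha * len(a - b) + beta * len(b - a)
    # Assumption: two empty fingerprints are treated as identical.
    return common / denominator if denominator else 1.0


query = {1, 2, 3, 4}
target = {3, 4, 5}

print(tversky(query, target, 1.0, 1.0))  # 0.4 (alpha = beta = 1 gives Tanimoto)
print(tversky(query, target, 1.0, 0.0))  # 0.5 (only bits unique to the query are penalized)
```

With α = β = 1 the index reduces to Tanimoto and with α = β = 0.5 to Dice; unequal weights make the measure asymmetric, which lets a search lean toward substructure-like or superstructure-like similarity.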

What's new: JChem Base
Polymer support details
– Polymer brackets and properties (type, connectivity, etc.) considered during search and registration
– Attached data search (optional): attached to atoms/bonds/brackets
– Source- and structure-based representation equivalence is checked (but can be switched off)
  – Addition to a double bond. E.g. polystyrene.
  – Polymerization through elimination of water or HCl. E.g. polyester, polyamide.

What's new: JChem Base
Polymer support details (cont.)
– Ladder type polymers
– Phase-shifting (for ht SRU) (can be switched off)
– End group matching:
  – * atoms: unspecified end groups
  – Search option to switch on/off end group matching
– Copolymer types: co, alt, rnd, blk, grf, xl, mer, mod
– Polymer mixtures
– New search options
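
Phase-shifting can be illustrated on a simplified model: if a head-to-tail repeating unit is written as a linear sequence of backbone groups, two drawings describe the same chain when one sequence is a cyclic rotation of the other. The sketch below checks exactly that and nothing more; the token representation and function name are assumptions, and reversed or multiplied units are not considered.

```python
from typing import Sequence


def same_repeat_unit(u: Sequence[str], v: Sequence[str]) -> bool:
    """True if v is a cyclic rotation (phase shift) of u."""
    u, v = list(u), list(v)
    if len(u) != len(v):
        return False
    if not u:
        return True
    # Try every possible starting point of the repeating unit.
    return any(u[i:] + u[:i] == v for i in range(len(u)))


# Poly(ethylene oxide) drawn with two different bracket positions.
print(same_repeat_unit(["CH2", "CH2", "O"], ["CH2", "O", "CH2"]))  # True
print(same_repeat_unit(["CH2", "CH2", "O"], ["CH2", "CH2", "S"]))  # False
```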

What's new: Cartridge-specific
5.1
  – Tautomer duplicate filtering index option
  – Alter index option
  – Improved import speed (5.1.3)
  – Improved upgrade: no need to remove/recreate indices (5.1.4)
5.2
  – Interactive installer
  – Increased substructure search performance (5.2.2)
  – Tversky similarity search (5.2.2)

What's new: Markush
New features
  – Homology groups
      19 built-in groups
      Customizable:
        – Examples (for built-in groups, enumeration only)
        – Full user-defined homology groups defined by R-group definition
      Marvin templates for easier sketching
  – Import reagent files as R-groups
  – Position variation and Repeating units

Plans

Plans: JChem Base & Cartridge
JChem Base
  – Further speed improvements (SSS, similarity)
  – New vague bond level options
  – R-group decomposition integration
  – Improved support for Screen molecular descriptors
Cartridge
  – Screen molecular descriptors (BCUT, pharmacophore similarity, chemical hashed fp, etc.) and metrics (Euclidean, Dice, etc.) for similarity search
  – User-defined descriptor fingerprints
  – Markush tables and search
JChem Server, JChem cluster

Plans: Markush
– .VMN import (format used by Merged Markush Service & Derwent World Patent Index)
– Multiple graphical attachment points of R-groups
– Homology variation queries
– Overlap analysis of Markush structures
– Homology group properties (# of atoms, branching points, # of heteroatoms, etc.)
– Conditions for Markush variables

Summary
– JChem Base and Cartridge are comprehensive and efficient
– Markush structure storage, search and enumeration are now reaching coverage of patent features
– Continuous development, improvements in the pipeline

Find out more
– Product descriptions & links
– Forum
– Presentations and posters
– Download