1 EBI is an Outstation of the European Molecular Biology Laboratory. Chemoinformatics and Metabolism Paula de Matos

2 Chemoinformatics and Metabolism Group Research: indexing, searching and dissemination of chemical information; cheminformatics algorithms and toolkits; natural products and metabolomics.

3 Chemical Entities of Biological Interest (17.05.2015)
  A database containing a freely available, manually annotated dictionary of molecular entities focused on 'small' chemical compounds.
  Provides a method to navigate the chemical space via an ontology.
  ChEBI aims to provide a central, definitive reference of chemical nomenclature.

4 Dictionary Resource for Nomenclature http://www.ebi.ac.uk/chebi

5 What does ChEBI cover? Mostly small entities; big entities too, like alumina, amylose and metaborate; excludes proteins and nucleic acids.

8 ChEBI Web Services
  Programmatic access to a ChEBI entry.
  SOAP-based Java implementation; clients currently available in Java and Perl.
  Four methods for accessing data: getLiteEntity, getCompleteEntity, getOntologyParents, getOntologyChildren.
  Documented at http://www.ebi.ac.uk/chebi/webServices.do
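To make the SOAP access concrete, here is a minimal sketch of what a getCompleteEntity request body might look like. It only builds the XML and sends nothing. The ChEBI namespace URI and the chebiId parameter name are assumptions that are not stated on the slide, and the class and helper names are purely illustrative; the service WSDL is the authority on all of them.

```java
/**
 * Sketch only: builds (but does not send) a SOAP 1.1 request for getCompleteEntity.
 * The ChEBI namespace URI and the chebiId parameter name are assumptions.
 */
class ChebiSoapSketch {

    // Standard SOAP 1.1 envelope namespace.
    static final String SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/";
    // Assumed target namespace of the ChEBI service (not given on the slide).
    static final String CHEBI_NS = "https://www.ebi.ac.uk/webservices/chebi";

    /** Escapes the characters that are unsafe inside XML text content. */
    static String escapeXml(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Returns the request envelope for one ChEBI identifier, e.g. "CHEBI:15377". */
    static String buildGetCompleteEntityRequest(String chebiId) {
        return "<soapenv:Envelope xmlns:soapenv=\"" + SOAP_NS + "\""
             + " xmlns:chebi=\"" + CHEBI_NS + "\">"
             + "<soapenv:Body>"
             + "<chebi:getCompleteEntity>"
             + "<chebi:chebiId>" + escapeXml(chebiId) + "</chebi:chebiId>"
             + "</chebi:getCompleteEntity>"
             + "</soapenv:Body>"
             + "</soapenv:Envelope>";
    }

    public static void main(String[] args) {
        // CHEBI:15377 is the ChEBI identifier for water.
        System.out.println(buildGetCompleteEntityRequest("CHEBI:15377"));
    }
}
```

In practice a generated client (the slide mentions Java and Perl clients) would hide this envelope behind ordinary method calls.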

9 ChEBI Status

10 ChEBI further info
  Home page: http://www.ebi.ac.uk/chebi
  Mailing lists: chebi-help@ebi.ac.uk, chebi-announce@lists.sourceforge.net, chebi-ontology@lists.sourceforge.net
  Submitting data: http://www.ebi.ac.uk/chebi/submissions

11 The Chemistry Development Kit (CDK): An Open Source Java-Library for Structural Chemo- and Bioinformatics
  >90,000 lines of code, >900 classes, >9,000 methods
  Library generation; virtual screening; molecular property prediction; visualization
  http://cdk.sourceforge.net
  (1) Steinbeck, C.; Hoppe, C.; Kuhn, S.; Guha, R.; Willighagen, E. L. Current Pharmaceutical Design 2006, 12, 2111-2120.
  (2) Steinbeck, C.; Han, Y. Q.; Kuhn, S.; Horlacher, O.; Luttmann, E.; Willighagen, E. Journal of Chemical Information and Computer Sciences 2003, 43, 493-500.

12 The Chemistry Development Kit (CDK)
  Input/Output: I/O (CML, MDL Molfile, SDF, PDB); SMILES; InChI
  Visualization: Structure-Diagram-Layout (SDG); 2D rendering; 3D rendering
  Modelling: 3D model builder; atom typing; force field; representation of biomolecular structures
  Chemical Graphs: isomorphism detection; maximum-common-substructure searches; SMARTS and substructure searches; ring searches; aromaticity detection
  Structure Generators: deterministic isomer generator; stochastic structure generators via simulated annealing and genetic algorithms; library enumeration
  Properties: fingerprinting; >70 QSAR descriptors; QSAR model building

13 Example: Structure Diagram Generation

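14 Example: Fingerprinting
  Figure: bitscreen coding for structural features (-COOH, heteroaryl, O-alkyl, NH2, alkyl), e.g. 0011010010.
    import java.util.BitSet;
    import org.openscience.cdk.fingerprint.Fingerprinter;
    import org.openscience.cdk.fingerprint.FingerprinterTool;
    import org.openscience.cdk.interfaces.IMolecule;
    import org.openscience.cdk.templates.MoleculeFactory;

    // Indole contains a pyrrole ring, so every bit set for pyrrole should also be set for indole.
    IMolecule superstructure = MoleculeFactory.makeIndole();
    IMolecule substructure = MoleculeFactory.makePyrrole();
    Fingerprinter fingerprinter = new Fingerprinter();
    BitSet superBS = fingerprinter.getFingerprint(superstructure);
    BitSet subBS = fingerprinter.getFingerprint(substructure);
    boolean isSubset = FingerprinterTool.isSubset(superBS, subBS);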
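The subset test behind this kind of fingerprint screening can be sketched with nothing but java.util.BitSet. The following is an illustration of the idea and not CDK's own code: the class name, the helper methods and the second bit pattern are hypothetical, and only the pattern 0011010010 comes from the slide.

```java
import java.util.BitSet;

/** Sketch of fingerprint pre-screening; class and method names are illustrative. */
class BitScreenSketch {

    /** True if every bit set in the query is also set in the candidate. */
    static boolean isSubset(BitSet candidate, BitSet query) {
        BitSet common = (BitSet) candidate.clone(); // copy so the input is not modified
        common.and(query);                          // keep only the bits shared with the query
        return common.equals(query);                // all query bits survived the AND
    }

    /** Builds a BitSet from a string such as "0011010010" (leftmost character is bit 0). */
    static BitSet fromString(String bits) {
        BitSet bs = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                bs.set(i);
            }
        }
        return bs;
    }

    public static void main(String[] args) {
        BitSet candidate = fromString("0011010010"); // bit pattern shown on the slide
        BitSet query = fromString("0010010000");     // hypothetical query pattern
        System.out.println(isSubset(candidate, query));
    }
}
```

A failed subset test rules a candidate out cheaply. A passed test only means the candidate may contain the query, so a full substructure match is still needed to confirm it.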

15 CDK in numbers: 67 registered developers on SourceForge; 86 people subscribed to the cdk-devel list; 111 people subscribed to the cdk-user list.

16 CDK in numbers: 80,966 downloads since 2001.

17 CDK in numbers: CDK article (2003) cited 68 times.

18 CDK info
  Project home page: http://cdk.sourceforge.net/
  Mailing lists: cdk-user@lists.sourceforge.net, cdk-devel@lists.sourceforge.net
  Documentation: http://pele.farmbio.uu.se/nightly/

19 OrChem
  An Oracle chemistry plug-in that uses the Chemistry Development Kit (CDK) to provide substructure and similarity searches for chemical graphs.
  OrChem is suitable for Oracle 11g and onwards.
  It is not an Oracle data cartridge: it does not need Oracle's extensibility architecture, because its Java components run as Java stored procedures inside the standard Oracle JVM (Aurora).

20 Problem: chemical substructure and similarity searching is computationally expensive, especially on a large dataset.

21 OrChem database structure

22 Example OrChem Queries
  Similarity search:
    select * from table(
      orchem_simsearch.search(
        'OC4=C(C(=C3OC(C)(COC=1C=CC(=CC=1)CC2C(=O)NC(=O)S2)CCC3=C4C)C)C', 'SMILES', 0.8, null, 'N')
    );
  Substructure search:
    select orchem_subsearch.search(molfile, 'MOL', 50, 'Y')
    from compounds
    where molregno = 12345;
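The similarity query passes 0.8 as its third argument. Assuming this is a minimum similarity cutoff, and that similarity is measured as the Tanimoto coefficient over fingerprint bits (neither is stated on the slide), the comparison could be sketched as follows. The class and method names are hypothetical and this is not OrChem's implementation.

```java
import java.util.BitSet;

/** Sketch of a Tanimoto cutoff test on fingerprints; names are illustrative. */
class TanimotoSketch {

    /** Tanimoto coefficient: shared bits divided by bits set in either fingerprint. */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet intersection = (BitSet) a.clone();
        intersection.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        int unionBits = union.cardinality();
        if (unionBits == 0) {
            return 0.0; // convention chosen here for two empty fingerprints
        }
        return (double) intersection.cardinality() / unionBits;
    }

    /** True if the candidate is at least as similar to the query as the cutoff requires. */
    static boolean passesCutoff(BitSet query, BitSet candidate, double cutoff) {
        return tanimoto(query, candidate) >= cutoff;
    }

    public static void main(String[] args) {
        BitSet query = new BitSet();
        query.set(0, 5);     // bits 0..4
        BitSet candidate = new BitSet();
        candidate.set(0, 4); // bits 0..3
        System.out.println(tanimoto(query, candidate));
    }
}
```

Because the coefficient depends only on bit counts, candidates can be scored from stored fingerprints without touching the full structures.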

23 Fingerprint distribution

24 Parallel vs. non-parallel: performance of substructure search on 3.5 million compounds

25 Substructure benchmarking: performance of substructure search on 3.5 million compounds

26 Similarity benchmarking

27 OrChem info
  Project home page: http://orchem.sourceforge.net/
  Mailing list: orchem-devel@lists.sourceforge.net
