Published by Bryce Anthony Webster. Modified over 9 years ago.
1
Chemoinformatics and Metabolism
Paula de Matos
EBI is an Outstation of the European Molecular Biology Laboratory.
2
Chemoinformatics and Metabolism Group: Research
Indexing, searching and dissemination of chemical information
Cheminformatics algorithms and toolkits
Natural products and metabolomics
3
17.05.2015
Chemical Entities of Biological Interest
A database containing a freely available, manually annotated dictionary of molecular entities focused on 'small' chemical compounds.
Provides a method to navigate the chemical space via an ontology.
ChEBI aims to provide a central, definitive reference of chemical nomenclature.
4
Dictionary Resource for Nomenclature
http://www.ebi.ac.uk/chebi
5
What does ChEBI cover?
Mostly small entities
Big entities too, like alumina, amylose and metaborate
Excludes proteins and nucleic acids
8
ChEBI Web Services
Programmatic access to a ChEBI entry
SOAP-based Java implementation
Clients currently available in Java and Perl
Four methods with which to access data:
    getLiteEntity
    getCompleteEntity
    getOntologyParents
    getOntologyChildren
Documented at http://www.ebi.ac.uk/chebi/webServices.do
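To make the SOAP access pattern concrete, here is a minimal sketch of how a request body for one of the four methods might be assembled. Only the method name `getCompleteEntity` comes from the slide. The target namespace, the `chebiId` parameter element, and the class `ChebiSoapRequestSketch` are assumptions for illustration; the WSDL linked from the documentation page is the authority on the real names. The sketch only builds the XML and does not send it.

```java
/** Sketch: builds a SOAP 1.1 request body for a ChEBI-style web service call. */
final class ChebiSoapRequestSketch {

    // ASSUMPTION: illustrative namespace; take the real value from the service WSDL.
    static final String ASSUMED_NAMESPACE = "http://www.ebi.ac.uk/webservices/chebi";

    /** Wraps a single-parameter method call in a SOAP 1.1 envelope. */
    static String buildRequest(String method, String paramName, String paramValue) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<soapenv:Envelope"
            + " xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\""
            + " xmlns:chebi=\"" + ASSUMED_NAMESPACE + "\">"
            + "<soapenv:Body>"
            + "<chebi:" + method + ">"
            + "<chebi:" + paramName + ">" + escapeXml(paramValue)
            + "</chebi:" + paramName + ">"
            + "</chebi:" + method + ">"
            + "</soapenv:Body>"
            + "</soapenv:Envelope>";
    }

    /** Escapes the characters that would break element content. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // CHEBI:15377 is the ChEBI identifier for water.
        // ASSUMPTION: "chebiId" as the parameter element name.
        System.out.println(buildRequest("getCompleteEntity", "chebiId", "CHEBI:15377"));
    }
}
```

In practice a generated client (for example from the WSDL) would replace this hand-built string; the sketch is only meant to show what travels over the wire.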
9
ChEBI Status
10
ChEBI further info
http://www.ebi.ac.uk/chebi
Mailing lists:
    chebi-help@ebi.ac.uk
    chebi-announce@lists.sourceforge.net
    chebi-ontology@lists.sourceforge.net
Submitting data: http://www.ebi.ac.uk/chebi/submissions
11
The Chemistry Development Kit (CDK): An Open Source Java-Library for Structural Chemo- and Bioinformatics
http://cdk.sourceforge.net
>90,000 lines of code, >900 classes, >9,000 methods
Library Generation
Virtual Screening
Molecular Property Prediction
Visualization
(1) Steinbeck, C.; Hoppe, C.; Kuhn, S.; Guha, R.; Willighagen, E. L. Current Pharmaceutical Design 2006, 12, 2111-2120.
(2) Steinbeck, C.; Han, Y. Q.; Kuhn, S.; Horlacher, O.; Luttmann, E.; Willighagen, E. Journal of Chemical Information and Computer Sciences 2003, 43, 493-500.
12
The Chemistry Development Kit (CDK)
Input/Output: I/O (CML, MDL Molfile, SDF, PDB); SMILES; InChI
Visualization: Structure-Diagram-Layout (SDG); 2D rendering; 3D rendering
Modelling: 3D model builder; atom typing; force field; representation of biomolecular structures
Chemical Graphs: isomorphism detection; maximum-common-substructure searches; SMARTS and substructure searches; ring searches; aromaticity detection
Deterministic isomer generator; stochastic structure generators via simulated annealing, genetic algorithms; library enumeration
Properties: fingerprinting; >70 QSAR descriptors; QSAR model building
13
Example: Structure Diagram Generation
14
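Example: Fingerprinting
Bitscreen coding for structural features (figure: feature labels -COOH, heteroaryl, O-alkyl, -NH2 and alkyl above the bit string 0011010010)

    // Imports assume the CDK 1.x package layout.
    import java.util.BitSet;
    import org.openscience.cdk.fingerprint.Fingerprinter;
    import org.openscience.cdk.fingerprint.FingerprinterTool;
    import org.openscience.cdk.interfaces.IMolecule;
    import org.openscience.cdk.templates.MoleculeFactory;

    IMolecule superstructure = MoleculeFactory.makeIndole();   // candidate molecule
    IMolecule substructure = MoleculeFactory.makePyrrole();    // query fragment
    Fingerprinter fingerprinter = new Fingerprinter();
    BitSet superBS = fingerprinter.getFingerprint(superstructure);
    BitSet subBS = fingerprinter.getFingerprint(substructure);
    boolean isSubset = FingerprinterTool.isSubset(superBS, subBS);  // true if every query bit is set in the candidate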
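The subset test at the end of the fingerprinting example can be reproduced with nothing but `java.util.BitSet`, which may help when reading the slide without CDK on the classpath. The following is a sketch of the general bit-screening idea, not CDK's implementation: a query can only be a substructure of a candidate if every bit set in the query fingerprint is also set in the candidate fingerprint. The class `FingerprintScreen`, the helper `fromBits`, and the query bit strings are hypothetical; only the candidate pattern 0011010010 is taken from the slide.

```java
import java.util.BitSet;

/** Sketch: fingerprint pre-screening with plain java.util.BitSet. */
final class FingerprintScreen {

    /** Returns true if every bit set in query is also set in candidate. */
    static boolean isSubset(BitSet candidate, BitSet query) {
        BitSet common = (BitSet) candidate.clone(); // keep the caller's fingerprint unchanged
        common.and(query);                          // bits present in both fingerprints
        return common.equals(query);                // nothing from the query was lost
    }

    /** Builds a BitSet from a string such as "0011010010" (leftmost character is bit 0). */
    static BitSet fromBits(String bits) {
        BitSet bs = new BitSet(bits.length());
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                bs.set(i);
            }
        }
        return bs;
    }

    public static void main(String[] args) {
        BitSet candidate = fromBits("0011010010");   // bit pattern shown on the slide
        BitSet matching = fromBits("0010010000");    // hypothetical query, bits 2 and 5
        BitSet nonMatching = fromBits("1000000000"); // hypothetical query, bit 0

        System.out.println(isSubset(candidate, matching));    // passes the screen
        System.out.println(isSubset(candidate, nonMatching)); // rejected by the screen
    }
}
```

A passing screen does not prove a substructure match; it only means the candidate is worth the more expensive graph comparison.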
15
CDK in numbers
67 registered developers on SF
86 people subscribed to cdk-devel list
111 people subscribed to cdk-user list
16
CDK in numbers
80,966 downloads since 2001
17
CDK in numbers
CDK article (2003) cited 68 times
18
CDK info
Project home page: http://cdk.sourceforge.net/
Mailing lists:
    cdk-user@lists.sourceforge.net
    cdk-devel@lists.sourceforge.net
Documentation: http://pele.farmbio.uu.se/nightly/
19
OrChem
Oracle chemistry plug-in using the Chemistry Development Kit (CDK), providing substructure and similarity searches for chemical graphs.
OrChem is suitable for Oracle 11g and onwards.
Not an Oracle data cartridge: it doesn't need Oracle's extensibility architecture because its Java components run as Java stored procedures inside the Oracle standard JVM (Aurora).
20
Problem
Chemical substructure or similarity searching is computationally expensive, especially on a large dataset.
21
OrChem database structure
22
Example OrChem Queries

Similarity search:

    select *
    from table(
      orchem_simsearch.search(
        'OC4=C(C(=C3OC(C)(COC=1C=CC(=CC=1)CC2C(=O)NC(=O)S2)CCC3=C4C)C)C',
        'SMILES', 0.8, null, 'N')
    );

Substructure search:

    select orchem_subsearch.search(molfile, 'MOL', 50, 'Y')
    from compounds
    where molregno = 12345;
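The similarity query passes a cutoff of 0.8. The slides do not say which similarity measure is behind that number; a common choice for fingerprint-based similarity is the Tanimoto coefficient, and the sketch below assumes it. The class `TanimotoSketch` and its methods are hypothetical helpers for illustration and are not part of OrChem or CDK.

```java
import java.util.BitSet;

/** Sketch: Tanimoto similarity between two fingerprints, with a cutoff test. */
final class TanimotoSketch {

    /** Tanimoto coefficient: |A and B| / |A or B|, defined here as 0 for two empty fingerprints. */
    static double tanimoto(BitSet a, BitSet b) {
        BitSet intersection = (BitSet) a.clone();
        intersection.and(b);
        BitSet union = (BitSet) a.clone();
        union.or(b);
        int unionCount = union.cardinality();
        return unionCount == 0 ? 0.0 : (double) intersection.cardinality() / unionCount;
    }

    /** A candidate is a hit when its similarity to the query reaches the cutoff. */
    static boolean isHit(BitSet query, BitSet candidate, double cutoff) {
        return tanimoto(query, candidate) >= cutoff;
    }

    public static void main(String[] args) {
        BitSet query = new BitSet();
        query.set(0, 5);       // bits 0..4 set
        BitSet candidate = new BitSet();
        candidate.set(0, 4);   // bits 0..3 set

        // 4 shared bits out of 5 in the union
        System.out.println(tanimoto(query, candidate));
        System.out.println(isHit(query, candidate, 0.8));
    }
}
```

Because the score depends only on bit counts, it can be computed without touching the molecular graph, which is what makes a cutoff-based search feasible on a large table.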
23
Fingerprint distribution
24
Parallel vs. non-parallel
Performance of substructure search on 3.5 million compounds
25
Substructure benchmarking
Performance of substructure search on 3.5 million compounds
26
Similarity benchmarking
27
OrChem info
http://orchem.sourceforge.net/
Mailing list: orchem-devel@lists.sourceforge.net