Published by Baldric Fleming. Modified over 9 years ago.
1
ChEBI: The story so far
Paula de Matos
EBI is an Outstation of the European Molecular Biology Laboratory.
2
Private Data / Public Data
3
The state of affairs of bioinformatics in 2002
Bioinformatics is booming
Human Genome sequence rough draft published June 2000
Free resources and free data
4
A different story for chemoinformatics
Private data and private software
5
Too hard to solve… let's put our heads in the sand
6
Bioinformatics data too large to keep track of chemical compounds
100,000 protein entries in SwissProt (2002)
20 million entries in EMBL Database (2002)
Small databases unable to keep track
ENZYME resources: ~3500 enzymatic reactions
7
New initiatives start up
PubChem: chemical repository, millions of entries, focus on screening assays
ChEBI: manually annotated database, nomenclature reference and compound database, tens of thousands of entries
8
Founding principles
December 2002: email exchanges within the EBI to address the issue of chemistry
Three principles outlined
9
“Nothing held in the database must be proprietary or derived from a proprietary source that would limit its free distribution/availability to anyone.”
10
“Every data item in the database should be fully traceable and explicitly referenced to the original source/version.”
11
“Although the EBI will provide a web interface, the entirety of the data should be available to all without constraint as, for example, SQL table dumps, ASCII tables, and XML (e.g. DAML+OIL)”
12
We make a start using existing resources
Integrate three resources: KEGG Compound, IntEnz, Chemical Ontology
Annotation starts summer 2003
Focus on nomenclature
13
Our first release was modest, but it was a start
21 July 2004
2783 annotated entities
Data: ChEBI Name, ChEBI Id, IUPAC Names, Synonyms, Formula, Cross-references
14
We introduce structures - Sep 2005
Molfiles
InChI (IUPAC International Chemical Identifier)
SMILES (Simplified Molecular Input Line Entry System)
Image (PNG)
15
Marvin in ChEBI
16
We start editing the chemical ontology – Dec 2005
17
Internationalisation of web pages – March 2006
18
Internationalisation of data – Feb 2008
19
Web Services - Oct 2006
Programmatic access to a ChEBI entry
SOAP-based Java implementation
Clients currently available in Java and Perl
Four methods with which to access data: getLiteEntity, getCompleteEntity, getOntologyParents, getOntologyChildren
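The programmatic access described on this slide can be illustrated with a minimal sketch of the kind of SOAP request a client might send for getCompleteEntity. It only builds the XML envelope and makes no network call. The `CHEBI_NS` namespace, the `chebiId` parameter element, and the helper name `build_get_complete_entity_request` are assumptions for illustration and are not taken from the slides; the service WSDL is the authority for the real names.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Assumption: illustrative target namespace; read the real one from the WSDL.
CHEBI_NS = "https://www.ebi.ac.uk/webservices/chebi"


def build_get_complete_entity_request(chebi_id):
    """Return a SOAP envelope (as a string) asking for one ChEBI entry."""
    ET.register_namespace("soapenv", SOAP_NS)
    ET.register_namespace("chebi", CHEBI_NS)

    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The operation name comes from the slide; the parameter name is assumed.
    call = ET.SubElement(body, f"{{{CHEBI_NS}}}getCompleteEntity")
    ET.SubElement(call, f"{{{CHEBI_NS}}}chebiId").text = chebi_id

    return ET.tostring(envelope, encoding="unicode")


request_xml = build_get_complete_entity_request("CHEBI:15422")
print(request_xml)
```

In practice a SOAP client library would generate this envelope from the WSDL, which is presumably what the Java and Perl clients mentioned above do; the hand-built version only shows what travels over the wire.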
20
Automated Cross References – Aug 2007
Current databases: UniProtKB, Reactome, BioModels, IntAct, SABIO-RK, PubChem and ArrayExpress
21
Chemical Structure Searching – May 2008
22
After all this, where are we?
25
Annotation is linear
26
Number of web hits grows
Total pure entry hits in April: 42,612 / 273,219
Total web services hits in April: 88,226
Web hits for 2007:
27
Diversity of users
Constant challenge of balancing our users' varied interests.
28
Our positives
Nomenclature database
Manually annotated data
Attention to detail
Free and accessible
Loyal users
29
Our not so positives
Size for some people
Not well integrated into other bioinformatics resources
Community interaction
No software publicly available to manipulate the database
30
Involve the community
Create a web-based submission tool
Users can easily submit their entities on a one-to-one basis
Also allow bulk submission from other resources
31
Improvements to data depth
Addition of more Xrefs: PDB, MACIE ???
Addition of more chemical attributes? What chemical attributes?
Text mining projects to extract relevant chemical information from patents, journals
European Patent Office
32
Going Open Source
Commercial software packages will be replaced with Open Source
Long term goal: allow people to create a free local instance of ChEBI
Distribution of data in useful formats: CML, SDF
33
Proposed changes to the ontology
New relationships
“Is disjoint from”
Diagram labels: molecular entities, organic molecular entities, organic ions, inorganic molecular entities
34
Is alloprote of
succinate(2−) CHEBI:30031
succinic acid CHEBI:15741
35
Has biological role and Has application
Has biological role
36
Currently working with the Swiss Institute of Bioinformatics to build Rhea, a database of biochemical reactions
All reactions mapped to ChEBI
Encourage use of ChEBI nomenclature
EC 2.7.4.3: “ATP + AMP = 2 ADP”
CHEBI:15422 C10H16N5O13P3
CHEBI:16027 C10H14N5O7P
CHEBI:16761 C10H15N5O10P2
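Mapping each reaction participant to a ChEBI entry with a formula makes simple consistency checks possible. As a minimal sketch, the formulas on this slide can be used to confirm that “ATP + AMP = 2 ADP” is atom-balanced. The pairing of names to formulas is inferred from the phosphorus counts (3, 1 and 2), and the helper names `parse_formula` and `side_total` are hypothetical. The parser handles only flat formulas without brackets or charges.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a flat formula such as 'C10H14N5O7P'."""
    counts = Counter()
    # Each match is an element symbol followed by an optional count.
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def side_total(terms):
    """Sum atom counts over (coefficient, formula) pairs on one side."""
    total = Counter()
    for coefficient, formula in terms:
        for element, n in parse_formula(formula).items():
            total[element] += coefficient * n
    return total


# Formulas as given on the slide; the names are inferred, see the lead-in.
ATP = "C10H16N5O13P3"  # CHEBI:15422
AMP = "C10H14N5O7P"    # CHEBI:16027
ADP = "C10H15N5O10P2"  # CHEBI:16761

left = side_total([(1, ATP), (1, AMP)])
right = side_total([(2, ADP)])
balanced = left == right
print(dict(left), dict(right), balanced)
```

Both sides come to C20H30N10O20P4, so the reaction balances as written.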
37
Acknowledgements
ChEBI Team: Paula de Matos, Kirill Degtyarenko, Marcus Ennis, Janna Hastings, Christoph Steinbeck
Alumni: Michael Darsow, Mickael Guedj, Alan McNaught, Martin Zbinden
ChEBI supporters: Rolf Apweiler, Michael Ashburner, Henning Hermjakob, Janet Thornton
IntEnz Team: Rafael Alcantara, Volker Ast, Kristian Axelsen, Anne Morgat
EPO Collaborators: Helene Courrier, Stephane Nauche, Jeremy Parsons
Database supporters: ArrayExpress, IntAct, Reactome, SABIO-RK, RSC, GO, RESID etc…
38
Discussion Points
Data Depth
New Relationships
Encourage Nomenclature
Community