1
ChEBI Kirill Degtyarenko, EMBL-EBI / EPO
2
The team
Rafael Alcántara, Michael Ashburner *, Volker Ast *, Michael Darsow *, Paula de Matos, Marcus Ennis, Janna Hastings, Alan McNaught *, Inma Spiteri, Christoph Steinbeck, Martin Zbinden *
3
ChEBI: What is it?
Chemical Entities of Biological Interest, an EBI database/dictionary of ‘biochemical compounds’
4
What are the ‘biochemical compounds’?
Can be defined as consisting of “molecules not directly encoded by the genome... that are either the products of nature or are synthetic products used... to intervene in the processes of living organisms” [Michael Ashburner]
5
Molecular entity
“Any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer etc., identifiable as a separately distinguishable entity” [IUPAC “Gold Book”]
6
In fact, ChEBI contains
Molecular entities: trans-vaccenic acid
Groups: trans-vaccenoyl group
Classes: fatty acids
7
‘Small molecules’? Yes, but big molecules as well!
alumina, amylose, metaborate, poly(vinyl alcohol)
8
Current status (17.12.08)
9
1-D ChEBI
Numeric ID
Carefully checked terminology
Unambiguous ChEBI name
IUPAC names
Cross-references to free resources
10
Unambiguous ChEBI name
CHEBI:28918 L-adrenaline, not just ‘adrenaline’
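The pairing of a numeric ID with one unambiguous name can be sketched in a few lines. This is a minimal illustration, not ChEBI's own software: the regular expression assumes an identifier is the prefix `CHEBI:` followed by a positive integer, and `parse_chebi_id` and `NAMES` are hypothetical names that hold only the entry shown on the slide.

```python
import re

# Assumed pattern: the prefix "CHEBI:" followed by a positive integer.
CHEBI_ID = re.compile(r"^CHEBI:([1-9][0-9]*)$")


def parse_chebi_id(text):
    """Return the numeric part of a ChEBI identifier, or raise ValueError."""
    match = CHEBI_ID.match(text.strip())
    if match is None:
        raise ValueError(f"not a ChEBI identifier: {text!r}")
    return int(match.group(1))


# Hypothetical lookup table holding only the entry named on the slide.
NAMES = {28918: "L-adrenaline"}

print(NAMES[parse_chebi_id("CHEBI:28918")])
```

Keying on the number rather than on the name means that a loose synonym such as ‘adrenaline’ never has to act as an identifier.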
11
Systematic Name (IUPAC)
2-{[3-(trifluoromethyl)phenyl]amino}benzoic acid
12
Common Name
flufenamic acid (INN English)
acide flufénamique (INN French)
ácido flufenámico (INN Spanish)
acidum flufenamicum (INN Latin)
Flufenaminsäure (German)
13
The Unpronounceables
CHEBI:48935 (E)-roxithromycin
IUPAC name: (3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)-4-(2,6-dideoxy-3-C-methyl-3-O-methyl-α-L-ribo-hexopyranosyloxy)-14-ethyl-7,12,13-trihydroxy-10-{[(2-methoxyethoxy)methoxy]imino}-6-[3,4,6-trideoxy-3-(dimethylamino)-β-D-xylo-hexopyranosyloxy]-3,5,7,9,11,13-hexamethyloxacyclotetradecan-2-one
14
What is the common name of roxithromycin?
CHEBI:32109 (Z)-roxithromycin
CHEBI:48935 (E)-roxithromycin
INN: roxithromycin
15
Roxithromycin (2)
CHEBI:48844 roxithromycin
(E)-roxithromycin, (Z)-roxithromycin
16
What is thiamine?
CHEBI:18385 thiamine(1+), aka thiamine
CHEBI:33283 thiamine(1+) chloride, INN: thiamine
CHEBI:49105 thiamine(2+) dichloride, aka thiamine chloride hydrochloride, aka thiamine hydrochloride
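The thiamine example shows why a common name cannot serve as a key. The sketch below builds an inverted index from names to ChEBI IDs using only the three entries on the slide. The `ENTRIES` layout and the `build_name_index` helper are assumptions made for illustration, not ChEBI's data model.

```python
from collections import defaultdict

# Entries transcribed from the slide: ChEBI ID -> (ChEBI name, other names).
ENTRIES = {
    "CHEBI:18385": ("thiamine(1+)", ["thiamine"]),
    "CHEBI:33283": ("thiamine(1+) chloride", ["thiamine"]),
    "CHEBI:49105": (
        "thiamine(2+) dichloride",
        ["thiamine chloride hydrochloride", "thiamine hydrochloride"],
    ),
}


def build_name_index(entries):
    """Map every lower-cased name to the set of IDs that carry it."""
    index = defaultdict(set)
    for chebi_id, (chebi_name, synonyms) in entries.items():
        for name in [chebi_name, *synonyms]:
            index[name.lower()].add(chebi_id)
    return index


index = build_name_index(ENTRIES)
print(sorted(index["thiamine"]))                # two different entities
print(sorted(index["thiamine hydrochloride"]))  # one entity
```

A lookup by ‘thiamine’ returns two distinct entities, which is the ambiguity the unambiguous ChEBI names are meant to remove.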
17
Need for 2-D
“Better to see the face than to hear the name” (Zen proverb)
Structures and identifiers based on structures offer new ways of crosslinking to other databases
Structure search
18
Connection table
ChEBI
9 10 0 0 0 0 999 V2000
11.8219 -7.2713 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
11.8219 -8.0922 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.6074 -7.0165 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
11.1072 -6.8574 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.6039 -8.3505 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
11.1072 -8.5027 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
13.0886 -7.6818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
10.3923 -7.2713 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
10.3888 -8.0922 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0 0 0 0
1 3 1 0 0 0 0
1 4 1 0 0 0 0
2 5 1 0 0 0 0
2 6 1 0 0 0 0
3 7 1 0 0 0 0
4 8 2 0 0 0 0
6 9 2 0 0 0 0
5 7 2 0 0 0 0
8 9 1 0 0 0 0
M END
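To show what a connection table encodes, here is a minimal sketch that reads the counts line, the atom block and the bond block of the table above. It assumes whitespace-separated fields, which is enough for this small example; a real V2000 molfile uses fixed-width columns and header lines, so a production reader should follow the CTfile specification. `parse_ctab` is a hypothetical helper.

```python
from collections import Counter

# Counts line, atom block and bond block from the slide (header line omitted).
CTAB = """\
9 10 0 0 0 0 999 V2000
11.8219 -7.2713 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
11.8219 -8.0922 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.6074 -7.0165 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
11.1072 -6.8574 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
12.6039 -8.3505 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
11.1072 -8.5027 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
13.0886 -7.6818 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
10.3923 -7.2713 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
10.3888 -8.0922 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
1 2 2 0 0 0 0
1 3 1 0 0 0 0
1 4 1 0 0 0 0
2 5 1 0 0 0 0
2 6 1 0 0 0 0
3 7 1 0 0 0 0
4 8 2 0 0 0 0
6 9 2 0 0 0 0
5 7 2 0 0 0 0
8 9 1 0 0 0 0
M END
"""


def parse_ctab(text):
    """Return (element symbols, bonds) from a whitespace-separated table."""
    rows = [line.split() for line in text.strip().splitlines()]
    n_atoms, n_bonds = int(rows[0][0]), int(rows[0][1])
    atoms = [row[3] for row in rows[1:1 + n_atoms]]  # x, y, z, symbol, ...
    bonds = [
        (int(row[0]), int(row[1]), int(row[2]))      # atom, atom, bond order
        for row in rows[1 + n_atoms:1 + n_atoms + n_bonds]
    ]
    return atoms, bonds


atoms, bonds = parse_ctab(CTAB)
print(Counter(atoms), len(bonds))
```

The table lists only heavy atoms: five carbons and four nitrogens joined by ten bonds. Hydrogens are implicit.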
19
2-D ChEBI
One or more 2-D (or 3-D) connection tables
One is default
Autogenerated images (PNG)
Default diagrams should be unambiguous
20
The Fine Art of chemical drawing
21
Linear forms of monosaccharides
22
Pyranose forms of monosaccharides
23
Fused systems
(R)-camphor: ambiguous, unambiguous
24
Square planar geometry
cisplatin, transplatin
25
From 2-D back to 1-D
SMILES
InChI
26
SMILES (1)
Simplified Molecular Input Line Entry Specification
Developed by David Weininger in 1988
Extended by others (e.g. Daylight)
String of standard ASCII characters
A number of valid SMILES can be produced for the same molecule
27
SMILES (2)
N1C=NC2=C1C=NC=N2
c1ncc2ncnc2n1
C=1N\C=N/C\2=N/C=N\C=1/2
c1ncnc2/N=C\Nc12
n1cc2c(nc1)ncn2
[H]c1nc([H])c2n([H])c([H])nc2n1
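The six strings above look unrelated, yet they are offered as SMILES for one molecule. A full equivalence check needs a cheminformatics toolkit with canonicalisation. The sketch below performs only a necessary check with the standard library: it counts heavy atoms in each string. The `ATOM` pattern is an assumption that covers just a small subset of SMILES syntax (bracket atoms, the organic subset and aromatic lower-case atoms).

```python
import re
from collections import Counter

# Raw strings, because SMILES uses backslashes for bond directions.
SMILES = [
    r"N1C=NC2=C1C=NC=N2",
    r"c1ncc2ncnc2n1",
    r"C=1N\C=N/C\2=N/C=N\C=1/2",
    r"c1ncnc2/N=C\Nc12",
    r"n1cc2c(nc1)ncn2",
    r"[H]c1nc([H])c2n([H])c([H])nc2n1",
]

# Group 1: symbol inside a bracket atom; group 2: a bare atom symbol.
ATOM = re.compile(
    r"\[\d*([A-Z][a-z]?|[a-z]{1,2})[^\]]*\]|(Cl|Br|[BCNOPSFI]|[bcnops])"
)


def heavy_atoms(smiles):
    """Count non-hydrogen atoms per element in a simple SMILES string."""
    counts = Counter()
    for match in ATOM.finditer(smiles):
        symbol = (match.group(1) or match.group(2)).capitalize()
        if symbol != "H":
            counts[symbol] += 1
    return counts


for s in SMILES:
    print(s, dict(heavy_atoms(s)))
```

Every string yields five carbons and four nitrogens. Equal composition does not prove identity, but it illustrates why a single canonical identifier is attractive.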
28
InChI (1)
IUPAC International Chemical Identifier, or InChI
Open source
Developed by Stein, Heller, Tchekhovskoi and McNaught
Used by NIST, PubChem, CML… and ChEBI
29
InChI (2)
InChI=1/C5H4N4/c1-4-5(8-2-6-1)9-3-7-4/h1-3H,(H,6,7,8,9)/f/h7H
InChIKey=KDCGOANMDULRCW-QDQILVOLCG
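An InChI is a layered string: after the `InChI=` prefix come a version, a chemical formula and a series of `/`-separated layers, each introduced by a one-letter prefix. The following sketch splits the string above into those parts. `split_inchi` is a hypothetical helper, and it assumes the second field is always a formula, which holds for this example but not for every InChI.

```python
def split_inchi(inchi):
    """Split an InChI into version, formula and (prefix, content) layers."""
    prefix, _, body = inchi.partition("=")
    if prefix != "InChI":
        raise ValueError("expected a string starting with 'InChI='")
    version, formula, *layers = body.split("/")
    return {
        "version": version,
        "formula": formula,
        # First character names the layer, the rest is its content.
        "layers": [(layer[:1], layer[1:]) for layer in layers],
    }


parts = split_inchi(
    "InChI=1/C5H4N4/c1-4-5(8-2-6-1)9-3-7-4/h1-3H,(H,6,7,8,9)/f/h7H"
)
print(parts["formula"])
for layer_prefix, content in parts["layers"]:
    print(layer_prefix, content)
```

Here `c` carries the connectivity and the first `h` the hydrogens, including the mobile group `(H,6,7,8,9)`. The `f` marker opens the fixed-hydrogen part, where the second `h` pins the mobile hydrogen to one atom.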
30
Limitations (1)
Stereochemistry other than sp3 tetrahedral and sp2 trigonal planar
Polymers
Conformers
Radicals/different spin state
Topological isomers
Mixtures
Markush structures
31
Limitations (2)
InChI=1/2ClH.2H3N.Pt/h2*1H;2*1H3;/q;;;;+2/p-2
cisplatin, transplatin
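The practical consequence of this limitation is an identifier collision: two different compounds share one string. The sketch below groups names by identifier and reports any identifier that is attached to more than one name. The record list simply restates the slide, and `find_collisions` is a hypothetical helper.

```python
from collections import defaultdict

PT_INCHI = "InChI=1/2ClH.2H3N.Pt/h2*1H;2*1H3;/q;;;;+2/p-2"

# (name, identifier) pairs as shown on the slide: both isomers carry
# the same string.
RECORDS = [("cisplatin", PT_INCHI), ("transplatin", PT_INCHI)]


def find_collisions(records):
    """Return identifiers that are shared by more than one name."""
    by_identifier = defaultdict(list)
    for name, identifier in records:
        by_identifier[identifier].append(name)
    return {
        identifier: names
        for identifier, names in by_identifier.items()
        if len(names) > 1
    }


print(find_collisions(RECORDS))
```

A database that used this string as its only key could not keep the two square planar isomers apart, so a separate identifier or a drawn structure is still needed.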
32
3-D ChEBI
cisplatin
33
Uncertainty and ambiguity in chemistry
Compositional uncertainty
Positional uncertainty
Configurational uncertainty
Conformational uncertainty
34
Compositional uncertainty
Examples: an alkali metal cation; vanadate(V) anion; [2H]ethanol
35
Positional uncertainty
Examples: L-bromohistidine residue; pteroic acid (several tautomers)
36
Configurational uncertainty
Examples: androstane; rel-(2R,3R)-2-amino-3-methylpentanoic acid; tetradec-11-enoic acid
37
Conformational uncertainty
Examples: cyclohexane (chair, boat, twist); protein secondary structure (α, β, …)
38
ChEBI ontology
Molecular structure ontology
Subatomic particle ontology
Role ontology: biological role, application
39
L-adrenaline
Molecular structure ontology: catecholamines
Biological role: hormone
Application: antiglaucoma, bronchodilator, cardiostimulant
40
The family relations
L-cysteine, L-cysteine(), L-cysteinate(2–), L-cysteinate(1–), L-cysteinyl, L-cysteinium, L-cysteino, L-cystein-S-yl, L-cysteine residue, L-cysteinate residue, D-cysteine, cysteine, L-cysteine zwitterion
41
Relationships in ChEBI
∆ Is A: generic
⋄ Has Part: generic
♯ Is Conjugate Acid Of: specific
♭ Is Conjugate Base Of: specific
Is Enantiomer Of: specific
Is Tautomer Of: specific
ℛ Is Substituent Group From: specific
ℋ Has Parent Hydride: specific
ℱ Has Functional Parent: specific
Has Role: generic?
42
Is A relationship
∆ L-cysteine is a cysteine
43
Is Enantiomer Of
L-cysteine is enantiomer of D-cysteine
44
Has Part
⋄ L-cysteine hydrochloride has part L-cysteinium
L-cysteinium is part of L-cysteine hydrochloride
45
Is Conjugate Acid Of
♯ L-cysteinium is conjugate acid of L-cysteine
♯ L-cysteine is conjugate acid of L-cysteinate(1–)
♯ L-cysteinate(1–) is conjugate acid of L-cysteinate(2–)
46
Is Conjugate Base Of
♭ L-cysteine is conjugate base of L-cysteinium
♭ L-cysteinate(1–) is conjugate base of L-cysteine
♭ L-cysteinate(2–) is conjugate base of L-cysteinate(1–)
47
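Acid/base relationships
L-cysteinium, L-cysteine, L-cysteinate(1–), L-cysteinate(2–)
Each adjacent pair is linked by ♯ (is conjugate acid of) in one direction and by ♭ (is conjugate base of) in the other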
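Because the acid and base relationships mirror each other, only one direction needs to be asserted and the other can be derived. The sketch below stores relationships as (subject, relation, object) triples and adds the reciprocal for each. The triple layout, the `INVERSE` table and `close_under_inverse` are assumptions for illustration, and the sketch does not claim to describe how ChEBI stores its ontology.

```python
# Each relation paired with its reciprocal; symmetric relations map to
# themselves.
INVERSE = {
    "is conjugate acid of": "is conjugate base of",
    "is conjugate base of": "is conjugate acid of",
    "is enantiomer of": "is enantiomer of",
    "is tautomer of": "is tautomer of",
}

# One direction only, taken from the L-cysteine slides.
ASSERTED = [
    ("L-cysteinium", "is conjugate acid of", "L-cysteine"),
    ("L-cysteine", "is conjugate acid of", "L-cysteinate(1–)"),
    ("L-cysteinate(1–)", "is conjugate acid of", "L-cysteinate(2–)"),
    ("L-cysteine", "is enantiomer of", "D-cysteine"),
    ("L-cysteine", "is tautomer of", "L-cysteine zwitterion"),
]


def close_under_inverse(triples):
    """Return the triples together with the reciprocal of each one."""
    closed = set(triples)
    for subject, relation, obj in triples:
        closed.add((obj, INVERSE[relation], subject))
    return closed


closed = close_under_inverse(ASSERTED)
for triple in sorted(closed):
    print(*triple)
```

Five asserted triples become ten, including the statement that L-cysteine is conjugate base of L-cysteinium.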
48
Is Tautomer Of
L-cysteine is tautomer of L-cysteine zwitterion
49
Is Tautomer Of
3H-pyrrole, 2H-pyrrole, 1H-pyrrole
50
Has Parent Hydride
ℋ salutaridinol has parent hydride morphinan
morphinan is parent hydride of salutaridinol
51
Has Functional Parent
ℱ 7-O-acetylsalutaridinol has functional parent salutaridinol
salutaridinol is functional parent of 7-O-acetylsalutaridinol
52
Is Substituent Group From
ℛ L-cysteinyl, L-cysteino and L-cysteine residue are each a substituent group from L-cysteine
53
The family relations
L-cysteine, L-cysteine(), L-cysteinate(2–), L-cysteinate(1–), L-cysteinyl, L-cysteinium, L-cysteino, L-cystein-S-yl, L-cysteine residue, L-cysteinate residue, D-cysteine, cysteine, L-cysteine zwitterion
Linked in the diagram by the relationships ♯, ♭, ℛ, ℱ and ∆
54
Ontology of L-cysteine
55
Ontology of L-cysteine (1)
56
Ontology of L-cysteine (2)
57
Thank you