“Good annotation practice” for chemical data: ChEBI experience
  Kirill Degtyarenko, European Patent Office
Good anNODation practice
  good Naming practice: how to give most appropriate names
  good Ontology practice: how to link the entity of interest by defined logical relationships to other entities
  good Drawing practice: how to draw unambiguous 2-D diagrams
Good Naming Practice, or How to Give Most Appropriate Names
Systematic Name (IUPAC): 2-{[3-(trifluoromethyl)phenyl]amino}benzoic acid
Common Name: flufenamic acid (INN English); acide flufénamique (INN French); ácido flufenámico (INN Spanish); acidum flufenamicum (INN Latin); Flufenaminsäure (German)
The Unpronounceables: CHEBI:48935 (E)-roxithromycin
  IUPAC name: (3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)-4-(2,6-dideoxy-3-C-methyl-3-O-methyl-α-L-ribo-hexopyranosyloxy)-14-ethyl-7,12,13-trihydroxy-10-{[(2-methoxyethoxy)methoxy]imino}-6-[3,4,6-trideoxy-3-(dimethylamino)-β-D-xylo-hexopyranosyloxy]-3,5,7,9,11,13-hexamethyloxacyclotetradecan-2-one
What is the common name of roxithromycin?
  CHEBI:32109 (Z)-roxithromycin
  CHEBI:48935 (E)-roxithromycin, INN: roxithromycin
Roxithromycin (2)
  CHEBI:48844 roxithromycin
  (E)-roxithromycin, (Z)-roxithromycin
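The grouping on this slide, one unspecified entry with its (E) and (Z) forms beneath it, can be imitated with a few lines of string handling. This is only a naive sketch: the function name `group_by_parent` is hypothetical, and stripping a leading "(E)-" or "(Z)-" prefix is an assumption that works for these three names, not a general nomenclature parser.

```python
import re

# Matches a single leading E/Z stereodescriptor such as "(E)-" or "(Z)-".
_EZ_PREFIX = re.compile(r"^\([EZ]\)-")

def group_by_parent(names):
    """Group names under the name obtained by dropping a leading (E)-/(Z)- prefix."""
    groups = {}
    for name in names:
        parent = _EZ_PREFIX.sub("", name)
        members = groups.setdefault(parent, [])
        if parent != name:
            # Only stereo-specified names are listed under the parent.
            members.append(name)
    return groups

names = ["roxithromycin", "(E)-roxithromycin", "(Z)-roxithromycin"]
print(group_by_parent(names))
```

Here the unspecified name becomes the key and the two stereoisomers are listed as its members.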
What is thiamine?
  CHEBI:18385 thiamine(1+), aka thiamine
  CHEBI:33283 thiamine(1+) chloride, INN: thiamine
  CHEBI:49105 thiamine(2+) dichloride, aka thiamine chloride hydrochloride, aka thiamine hydrochloride
Plurals and singulars
  Problem is not unique to ChEBI
  Cf. phenol vs phenols; phenol metabolism vs phenols metabolism
  Bad solution: article use ("a phenol metabolism"?)
  Solution: prepositional phrases ("metabolism of phenols")
Good Drawing Practice, or How to Draw Unambiguous 2-D Diagrams
Linear forms of monosaccharides
Pyranose forms of monosaccharides
Fused systems: (R)-camphor, ambiguous vs unambiguous drawing
Square planar geometry: cisplatin vs transplatin
  InChI=1/2ClH.2H3N.Pt/h2*1H;2*1H3;/q;;;;+2/p-2
  SMILES: [H][N]([H])([H])[Pt](Cl)(Cl)[N]([H])([H])[H]
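The InChI and SMILES strings on this slide carry no stereo information, so on their own they cannot say which of the two isomers is meant; the order of the ligands around the metal can. A minimal sketch of that idea follows. The function name and the encoding of the square as a four-element tuple are assumptions made for illustration, not anything defined by ChEBI, InChI or SMILES.

```python
from collections import Counter

def square_planar_isomer(ligands):
    """Classify an MA2B2 square planar complex as 'cis' or 'trans'.

    `ligands` lists the four ligands in order around the square, so
    positions i and i+2 are opposite (trans) to each other.
    This encoding is a hypothetical one chosen for the example.
    """
    if len(ligands) != 4:
        raise ValueError("a square planar centre needs exactly four ligands")
    if sorted(Counter(ligands).values()) != [2, 2]:
        raise ValueError("only MA2B2 complexes are handled in this sketch")
    # Identical ligands opposite each other means trans, otherwise cis.
    return "trans" if ligands[0] == ligands[2] else "cis"

cisplatin = ("Cl", "Cl", "NH3", "NH3")
transplatin = ("Cl", "NH3", "Cl", "NH3")

# Same composition, so a composition-only description cannot separate them...
assert Counter(cisplatin) == Counter(transplatin)
# ...but the order around the square does.
print(square_planar_isomer(cisplatin))    # cis
print(square_planar_isomer(transplatin))  # trans
```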
Uncertainty and ambiguity in chemistry
  Compositional uncertainty
  Positional uncertainty
  Configurational uncertainty
  Conformational uncertainty
Compositional uncertainty
  Examples: an alkali metal cation; vanadate(V) anion; [2H]ethanol
Positional uncertainty
  Examples: L-bromohistidine residue; pteroic acid (several tautomers)
Configurational uncertainty
  Examples: androstane; rel-(2R,3R)-2-amino-3-methylpentanoic acid; tetradec-11-enoic acid
Conformational uncertainty
  Examples: cyclohexane: chair, boat, twist; protein secondary structure: , , …
Good Ontology Practice, or How to Link the Entity of Interest by Defined Logical Relationships to Other Entities
ChEBI ontology
  Molecular structure ontology
  Subatomic particle ontology
  Biological role ontology
  Application ontology
Relationships in ChEBI
  ∆ Is A (generic)
  ⋄ Is Part Of (generic)
  ♯ Is Conjugate Acid Of (specific)
  ♭ Is Conjugate Base Of (specific)
  Is Enantiomer Of (specific)
  Is Tautomer Of (specific)
  ℛ Is Substituent Group From (specific)
  ℋ Has Parent Hydride (specific)
  ℱ Has Functional Parent (specific)
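The list of relationship types and their generic/specific tags can be written down as a small lookup structure. The sketch below is illustrative only: the class name `Relationship`, the attribute names `label` and `kind`, and the helper `by_kind` are assumptions for this example, not identifiers from ChEBI itself.

```python
from enum import Enum

class Relationship(Enum):
    """Relationship types from the slide, each tagged generic or specific."""
    IS_A = ("is a", "generic")
    IS_PART_OF = ("is part of", "generic")
    IS_CONJUGATE_ACID_OF = ("is conjugate acid of", "specific")
    IS_CONJUGATE_BASE_OF = ("is conjugate base of", "specific")
    IS_ENANTIOMER_OF = ("is enantiomer of", "specific")
    IS_TAUTOMER_OF = ("is tautomer of", "specific")
    IS_SUBSTITUENT_GROUP_FROM = ("is substituent group from", "specific")
    HAS_PARENT_HYDRIDE = ("has parent hydride", "specific")
    HAS_FUNCTIONAL_PARENT = ("has functional parent", "specific")

    def __init__(self, label, kind):
        # Unpack the tuple value into readable attributes.
        self.label = label
        self.kind = kind

def by_kind(kind):
    """Return the labels of all relationship types of the given kind."""
    return [r.label for r in Relationship if r.kind == kind]

print(by_kind("generic"))
print(len(by_kind("specific")))
```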
Is A relationship ∆: L-cysteine is a cysteine
Is Part Of ⋄: L-cysteinium is part of L-cysteine hydrochloride; L-cysteine hydrochloride has part L-cysteinium
Is Enantiomer Of: L-cysteine is enantiomer of D-cysteine
Is Tautomer Of: 3H-pyrrole, 2H-pyrrole, 1H-pyrrole
Is Conjugate Acid Of ♯: L-cysteinium ♯ L-cysteine ♯ L-cysteinate(1–) ♯ L-cysteinate(2–), where each species is conjugate acid of the next
Is Conjugate Base Of ♭: L-cysteinate(2–) ♭ L-cysteinate(1–) ♭ L-cysteine ♭ L-cysteinium, where each species is conjugate base of the next
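Acid/base relationships: in the series L-cysteinium, L-cysteine, L-cysteinate(1–), L-cysteinate(2–), each species is conjugate acid of (♯) the one after it and conjugate base of (♭) the one before it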
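Because the two acid/base relationships are inverses of each other, a curator could check that every recorded statement has its reciprocal. The following is a minimal sketch of such a check, assuming statements are held as plain (subject, object) pairs; the function name `missing_reciprocals` and the data layout are hypothetical and do not describe the ChEBI curator tool.

```python
def missing_reciprocals(acid_of, base_of):
    """List the inverse statements that are absent.

    acid_of: (acid, base) pairs meaning "acid is conjugate acid of base".
    base_of: (base, acid) pairs meaning "base is conjugate base of acid".
    """
    acid_set, base_set = set(acid_of), set(base_of)
    missing = []
    for acid, base in acid_of:
        if (base, acid) not in base_set:
            missing.append((base, "is conjugate base of", acid))
    for base, acid in base_of:
        if (acid, base) not in acid_set:
            missing.append((acid, "is conjugate acid of", base))
    return missing

# Successive deprotonation of L-cysteinium, as on the slide.
series = ["L-cysteinium", "L-cysteine", "L-cysteinate(1–)", "L-cysteinate(2–)"]
acid_of = list(zip(series, series[1:]))
# Deliberately leave out the last reciprocal to show what the check reports.
base_of = [(base, acid) for acid, base in acid_of[:-1]]

print(missing_reciprocals(acid_of, base_of))
```

With the last reciprocal left out on purpose, the check reports that the statement "L-cysteinate(2–) is conjugate base of L-cysteinate(1–)" is missing.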
Is Substituent Group From ℛ: L-cysteinyl, L-cysteine residue and L-cysteino are each a substituent group from L-cysteine
Has Parent Hydride ℋ: salutaridinol has parent hydride morphinan; morphinan is parent hydride of salutaridinol
Has Functional Parent ℱ: 7-O-acetylsalutaridinol has functional parent salutaridinol; salutaridinol is functional parent of 7-O-acetylsalutaridinol
Live annotation demo
Going to SourceForge…
Reading a request…
Going to curator tool…
Search result…
Adding new entry…
Editing new entry…
Success!
Let’s draw
Approving structure
Success again!
Using ACD/Name (1)
Using ACD/Name (2)
Adding IUPAC name (1)
Adding IUPAC name (2)
Classifying (1)
Classifying (2)
Classifying (3)
Classifying (4)
The last touch (1)
The last touch (2)
Responding to request…
A job well done…
The team: Rafael Alcántara, Michael Ashburner, Volker Ast *, Michael Darsow *, Paula de Matos, Marcus Ennis, Janna Hastings, Alan McNaught *, Chris Steinbeck, Martin Zbinden *
Thanks
  Kristian Axelsen, Hélène Courrier, Anne Morgat, Ian Unwin
  Our faithful Users
  EU: funding