“Good annotation practice” for chemical data: ChEBI experience
Kirill Degtyarenko, European Patent Office



Good anNODation practice
• good Naming practice
  ◦ how to give most appropriate names
• good Ontology practice
  ◦ how to link the entity of interest by defined logical relationships to other entities
• good Drawing practice
  ◦ how to draw unambiguous 2-D diagrams

Good Naming Practice, or How to Give Most Appropriate Names

Systematic Name (IUPAC)
2-{[3-(trifluoromethyl)phenyl]amino}benzoic acid

Common Name
• flufenamic acid (INN English)
• acide flufénamique (INN French)
• ácido flufenámico (INN Spanish)
• acidum flufenamicum (INN Latin)
• Flufenaminsäure (German)
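
The two naming slides above can be pictured as one entity carrying several typed names. The sketch below is only an illustration: the `Name` and `Entity` classes and their fields are hypothetical and are not ChEBI's actual schema. The names themselves are taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Name:
    """One name for a chemical entity (hypothetical structure)."""
    text: str
    kind: str                       # e.g. "IUPAC", "INN" or "common"
    language: Optional[str] = None  # only set where the slide states one


@dataclass
class Entity:
    """A chemical entity with any number of names (hypothetical structure)."""
    names: List[Name] = field(default_factory=list)

    def names_of_kind(self, kind: str) -> List[str]:
        """Return the text of every name of the given kind, in insertion order."""
        return [n.text for n in self.names if n.kind == kind]


flufenamic_acid = Entity(names=[
    Name("2-{[3-(trifluoromethyl)phenyl]amino}benzoic acid", "IUPAC"),
    Name("flufenamic acid", "INN", "English"),
    Name("acide flufénamique", "INN", "French"),
    Name("ácido flufenámico", "INN", "Spanish"),
    Name("acidum flufenamicum", "INN", "Latin"),
    Name("Flufenaminsäure", "common", "German"),
])

print(flufenamic_acid.names_of_kind("INN"))
```

Keeping the name type and language next to each string is one way to let a single systematic name coexist with several language-specific common names without ambiguity about which is which.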

The Unpronounceables
CHEBI:48935 (E)-roxithromycin
IUPAC name: (3R,4S,5S,6R,7R,9R,10E,11S,12R,13S,14R)-4-(2,6-dideoxy-3-C-methyl-3-O-methyl-α-L-ribo-hexopyranosyloxy)-14-ethyl-7,12,13-trihydroxy-10-{[(2-methoxyethoxy)methoxy]imino}-6-[3,4,6-trideoxy-3-(dimethylamino)-β-D-xylo-hexopyranosyloxy]-3,5,7,9,11,13-hexamethyloxacyclotetradecan-2-one

What is the common name of roxithromycin?
• CHEBI:32109 (Z)-roxithromycin
• CHEBI:48935 (E)-roxithromycin (INN: roxithromycin)

Roxithromycin (2)
CHEBI:48844 roxithromycin
• (E)-roxithromycin
• (Z)-roxithromycin

What is thiamine?
• CHEBI:18385 thiamine(1+), aka thiamine
• CHEBI:33283 thiamine(1+) chloride, INN: thiamine
• CHEBI:49105 thiamine(2+) dichloride, aka thiamine chloride hydrochloride, aka thiamine hydrochloride

Plurals and singulars
• Problem is not unique to ChEBI
• Cf. phenol vs phenols
  ◦ phenol metabolism vs phenols metabolism
• Bad solution: article use
  ◦ a phenol metabolism?
• Solution: prepositional phrases
  ◦ metabolism of phenols

Good Drawing Practice, or How to Draw Unambiguous 2-D Diagrams

Linear forms of monosaccharides

Pyranose forms of monosaccharides

Fused systems
(R)-camphor: ambiguous / unambiguous

Square planar geometry
cisplatin / transplatin
InChI=1/2ClH.2H3N.Pt/h2*1H;2*1H3;/q;;;;+2/p-2
SMILES: [H][N]([H])([H])[Pt](Cl)(Cl)[N]([H])([H])[H]
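
As far as the transcript shows, the point of this slide is that the InChI and SMILES strings quoted carry no cis/trans information, so they fit cisplatin and transplatin equally well. The sketch below illustrates that with plain Python. The function names and the list-based representation (ligands listed in order around the square) are assumptions made for this example, not part of any cheminformatics library.

```python
from collections import Counter
from typing import Sequence, Tuple


def composition_key(ligands: Sequence[str]) -> Tuple[Tuple[str, int], ...]:
    """Key that records only which ligands are bonded to the metal, not where."""
    return tuple(sorted(Counter(ligands).items()))


def geometry(ligands: Sequence[str]) -> str:
    """Classify an MA2B2 square planar complex as 'cis' or 'trans'.

    `ligands` lists the four ligands in order around the square, so
    positions 0 and 2 (and 1 and 3) are opposite each other.
    """
    if len(ligands) != 4 or sorted(Counter(ligands).values()) != [2, 2]:
        raise ValueError("expected four ligands of exactly two kinds, two each")
    return "trans" if ligands[0] == ligands[2] else "cis"


cisplatin = ["Cl", "Cl", "NH3", "NH3"]    # like ligands adjacent
transplatin = ["Cl", "NH3", "Cl", "NH3"]  # like ligands opposite

# Same connectivity-only key, different geometry.
print(composition_key(cisplatin) == composition_key(transplatin))
print(geometry(cisplatin), geometry(transplatin))
```

A representation that stores only connectivity cannot tell the two isomers apart. The ligand positions, or an equivalent stereo descriptor, have to be recorded explicitly.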

Uncertainty and ambiguity in chemistry
• Compositional uncertainty
• Positional uncertainty
• Configurational uncertainty
• Conformational uncertainty

Compositional uncertainty
Examples
• an alkali metal cation
• vanadate(V) anion
• [2H]ethanol

Positional uncertainty
Examples
• L-bromohistidine residue
• pteroic acid (several tautomers)

Configurational uncertainty
Examples
• androstane
• rel-(2R,3R)-2-amino-3-methylpentanoic acid
• tetradec-11-enoic acid
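
To make the last example concrete: a name such as tetradec-11-enoic acid leaves the double-bond configuration open, so it stands for more than one fully specified structure. The sketch below enumerates the E/Z possibilities by string manipulation only. The helper `expand_double_bonds` is hypothetical, and it assumes every listed double bond is stereogenic; it does no chemical checking.

```python
from itertools import product
from typing import List


def expand_double_bonds(stem: str, locants: List[int]) -> List[str]:
    """List every E/Z assignment for a name with unspecified double bonds.

    `stem` is the name without stereodescriptors; `locants` are the
    positions of the double bonds whose configuration is not given.
    """
    names = []
    for combo in product("EZ", repeat=len(locants)):
        descriptors = ",".join(f"{loc}{d}" for loc, d in zip(locants, combo))
        names.append(f"({descriptors})-{stem}")
    return names


print(expand_double_bonds("tetradec-11-enoic acid", [11]))
# ['(11E)-tetradec-11-enoic acid', '(11Z)-tetradec-11-enoic acid']
```

Each additional unspecified double bond doubles the number of candidates, which is one reason an entry with undefined configuration is better treated as a class than as a single structure.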

Conformational uncertainty
Examples
• cyclohexane: chair, boat, twist
• protein secondary structure: α, β, …

Good Ontology Practice, or How to Link the Entity of Interest by Defined Logical Relationships to Other Entities

ChEBI ontology
• Molecular structure ontology
• Subatomic particle ontology
• Biological role ontology
• Application ontology

Relationships in ChEBI
• ∆ Is A (generic)
• ⋄ Is Part Of (generic)
• ♯ Is Conjugate Acid Of (specific)
• ♭ Is Conjugate Base Of (specific)
• Is Enantiomer Of (specific)
• Is Tautomer Of (specific)
• ℛ Is Substituent Group From (specific)
• ℋ Has Parent Hydride (specific)
• ℱ Has Functional Parent (specific)

Is A relationship (∆)
L-cysteine is a cysteine

Is Part Of (⋄)
L-cysteinium is part of L-cysteine hydrochloride; L-cysteine hydrochloride has part L-cysteinium

Is Enantiomer Of
L-cysteine is enantiomer of D-cysteine (each shown with an Is A link, ∆)

Is Tautomer Of
3H-pyrrole, 2H-pyrrole, 1H-pyrrole

Is Conjugate Acid Of (♯)
L-cysteinium ♯ L-cysteine ♯ L-cysteinate(1–) ♯ L-cysteinate(2–), where each species is conjugate acid of the next

Is Conjugate Base Of (♭)
L-cysteinate(2–) ♭ L-cysteinate(1–) ♭ L-cysteine ♭ L-cysteinium, where each species is conjugate base of the next

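Acid/base relationships
L-cysteinium, L-cysteine, L-cysteinate(1–) and L-cysteinate(2–), with neighbouring species linked in both directions by ♯ (is conjugate acid of) and ♭ (is conjugate base of)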

Is Substituent Group From (ℛ)
L-cysteinyl, L-cysteine residue and L-cysteino are each a substituent group from L-cysteine

Has Parent Hydride (ℋ)
salutaridinol has parent hydride morphinan; morphinan is parent hydride of salutaridinol

Has Functional Parent (ℱ)
7-O-acetylsalutaridinol has functional parent salutaridinol; salutaridinol is functional parent of 7-O-acetylsalutaridinol
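
Several of the relationship slides name a link in both directions (is part of / has part, has parent hydride / is parent hydride of, and so on). A minimal sketch of how such reciprocal links could be derived from one-directional statements follows. The triple representation, the `INVERSE` table and `with_inverses` are assumptions for illustration, not ChEBI's implementation, and treating the enantiomer and tautomer links as symmetric is likewise an assumption of this sketch.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relationship, object)

# Reciprocal relationship names as they appear on the slides.
INVERSE: Dict[str, str] = {
    "is part of": "has part",
    "has part": "is part of",
    "is conjugate acid of": "is conjugate base of",
    "is conjugate base of": "is conjugate acid of",
    "has parent hydride": "is parent hydride of",
    "is parent hydride of": "has parent hydride",
    "has functional parent": "is functional parent of",
    "is functional parent of": "has functional parent",
    # Assumed symmetric: each is its own inverse.
    "is enantiomer of": "is enantiomer of",
    "is tautomer of": "is tautomer of",
}


def with_inverses(triples: Iterable[Triple]) -> Set[Triple]:
    """Return the triples plus the reciprocal of each one that has an inverse."""
    closed = set(triples)
    for subject, relationship, obj in list(closed):
        if relationship in INVERSE:
            closed.add((obj, INVERSE[relationship], subject))
    return closed


TRIPLES = [
    ("L-cysteine", "is a", "cysteine"),  # no inverse listed for "is a"
    ("L-cysteinium", "is part of", "L-cysteine hydrochloride"),
    ("L-cysteine", "is enantiomer of", "D-cysteine"),
    ("L-cysteine", "is conjugate acid of", "L-cysteinate(1-)"),
    ("salutaridinol", "has parent hydride", "morphinan"),
    ("7-O-acetylsalutaridinol", "has functional parent", "salutaridinol"),
]

closed = with_inverses(TRIPLES)
for triple in sorted(closed):
    print(triple)
```

Deriving the reverse link rather than entering it by hand keeps the two directions from drifting apart, at the cost of maintaining the inverse table.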

Live annotation demo

Going to SourceForge…

Reading a request…

Going to curator tool…

Search result…

Adding new entry…

Editing new entry…

Success!

Let’s draw

Approving structure

Success again!

Using ACD/Name (1)

Using ACD/Name (2)

Adding IUPAC name (1)

Adding IUPAC name (2)

Classifying (1)

Classifying (2)

Classifying (3)

Classifying (4)

The last touch (1)

The last touch (2)

Responding to request…

A job well done…

The team
Rafael Alcántara, Michael Ashburner, Volker Ast *, Michael Darsow *, Paula de Matos, Marcus Ennis, Janna Hastings, Alan McNaught *, Chris Steinbeck, Martin Zbinden *

Thanks
• Kristian Axelsen, Hélène Courrier, Anne Morgat, Ian Unwin
• Our faithful Users
• EU: funding