MGED Ontology: An Ontology of Biomaterial Descriptions for Microarrays

1 MGED Ontology: An Ontology of Biomaterial Descriptions for Microarrays
Microarray Data Analysis and Management: Bio-ontologies for Microarrays
EMBL-EBI, Hinxton, Cambridge, UK
Dec. 5, 2001
Chris Stoeckert, U. Penn

2 Ontology Usage for Genes in EpoDB
EpoDB is a prototype system of genes expressed during erythropoiesis
Built before microarrays were readily available
Illustrates usage of an ontology of gene parts and controlled vocabularies of gene (and gene family) names

3 Stoeckert, Salas, Brunk, Overton (1999) Nucl. Acids Res. 26:288

4 EpoDB “Gene Ontology”

5 EpoDB Gene Landmark Query

6 Standardisation of Microarray Data and Annotations - MGED Group
The MGED group is a grass roots movement initially established at the Microarray Gene Expression Database meeting MGED 1 (14-15 November, 1999, Cambridge, UK). The goal of the group is to facilitate the adoption of standards for DNA-array experiment annotation and data representation, as well as the introduction of standard experimental controls and data normalisation methods. Members are from around the world in academia, government, and industry.

7 MGED Working Groups
Annotation: Experiment description and data representation standards (Alvis Brazma, EMBL-EBI)
Format: Microarray data XML exchange format (Paul Spellman, UC Berkeley)
Ontology: Ontologies for sample description (Chris Stoeckert, U Penn)
Normalization: Normalization, quality control and cross-platform comparison (Gavin Sherlock, Stanford U)

8 MGED Documents
Annotation -> Minimal Information About a Microarray Experiment (MIAME)
  What should go into a microarray database
  Brazma et al. Nature Genetics 29: , 2001
Format -> Microarray Gene Expression (MAGE) Object Model and XML DTD
  How microarray databases will talk to each other

9 Relationship of MGED Efforts
[Diagram labels: MIAME, DB, MAGE, Annotation, Format, Ontologies, External, Internal, MGED Ontology, External Ontologies/CVs]
Ontologies provide common terms and their definitions for describing microarray experiments.

10 What is an ontology? (In the computer science not philosophy sense)
An ontology is a specification of concepts that includes the relationships between those concepts.
Removes ambiguity.
Provides semantics and constraints.
Allows for computational inferences and reliable comparisons.

11 Types of Ontologies
Taxonomy
  Tree structure. IS-A hierarchy
  Variants - Gene Ontology (DAG)
Frame-based (object-oriented)
  Classes and attributes
  EcoCyc
Description logic (DL)
  Reasoning about concept (class) relationships
  Combine terms with constraints (sanctioning)
  GRAIL (GALEN, TAMBIS)
Ontology Inference Layer (OIL)
  Combines Frames and DLs
  Uses Web standards XML and RDF
This talk does not address the debate over what counts as an ontology, nor is it a complete primer on ontologies.

12 Taxonomy
Terms for common usage
  Homo sapiens, not human, not homo sapeins
  NCBI ID = 9606
Hierarchy provides unambiguous levels of equivalence
  Homo sapiens and Mus musculus are of the class Mammalia but Drosophila melanogaster is not.
Can use taxonomic hierarchies for other types of information
  e.g., Human Developmental Anatomy (U. of Edinburgh)
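
The equivalence claim on this slide can be checked mechanically once the hierarchy is stored as child-to-parent links. The following is a minimal sketch, not the NCBI data model: the lineage table is a hand-written, heavily abbreviated subset chosen for illustration, and the function names are hypothetical.

```python
# Simplified, hand-written IS-A links (child -> parent). The real NCBI
# lineage has many more ranks; this subset is for illustration only.
PARENT = {
    "Homo sapiens": "Primates",
    "Primates": "Mammalia",
    "Mus musculus": "Rodentia",
    "Rodentia": "Mammalia",
    "Mammalia": "Chordata",
    "Chordata": "Metazoa",
    "Drosophila melanogaster": "Diptera",
    "Diptera": "Insecta",
    "Insecta": "Arthropoda",
    "Arthropoda": "Metazoa",
}


def ancestors(term):
    """Return every ancestor of `term`, nearest first."""
    chain = []
    while term in PARENT:
        term = PARENT[term]
        chain.append(term)
    return chain


def is_a(term, ancestor):
    """True if `term` is `ancestor` or descends from it."""
    return term == ancestor or ancestor in ancestors(term)


def common_ancestor(a, b):
    """Nearest term that both `a` and `b` descend from, or None."""
    b_line = [b] + ancestors(b)
    for candidate in [a] + ancestors(a):
        if candidate in b_line:
            return candidate
    return None
```

With this table, `is_a("Mus musculus", "Mammalia")` holds while `is_a("Drosophila melanogaster", "Mammalia")` does not, and `common_ancestor` gives the level at which two organisms become equivalent.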

13 CBIL Anatomy Hierarchy

14 Microarray Information to be Captured
Figure from: David J. Duggan et al. (1999) Expression Profiling using cDNA microarrays. Nature Genetics 21: 10-14

15 Tables Describing Samples in RAD (RNA Abundance Database)
[Schema diagram. Table labels as extracted: Hybridization Conditions Label Sample Treatment Disease Devel. Stage ExperimentSample Taxon Anatomy RelExperiments Exp.ControlGenes ControlGenes Experiment ExpGroups Groups]

16 Anatomy Table Used by RAD

17 Usage of Anatomy Hierarchy to Query RAD

18 MGED Ontology Working Group Goals
Identify concepts
Collect available controlled vocabularies and ontologies for concepts
Define concepts
Formalize concept relationships

19

20 Species Resources

21

22 Concept Definitions

23

24 MGED Ontology Working Group Goals
Identify concepts
Collect available controlled vocabularies and ontologies for concepts
Define concepts
Formalize concept relationships

25 Usage of Concepts and Resources for Microarrays
MIAME glossary
  Provide definitions for types of information (concepts) listed in MIAME
MIAME qualifier, value, source
  Provide pointers to relevant sources that can be used to
What can be done without structured concepts

26 MIAME Section on Sample Source and Treatment
sample source and treatment ID as used in section 1
organism (NCBI taxonomy)
additional "qualifier, value, source" list; the list includes:
  cell source and type (if derived from primary sources (s))
  sex
  age
  growth conditions
  development stage
  organism part (tissue)
  animal/plant strain or line
  genetic variation (e.g., gene knockout, transgenic variation)
  individual
  individual genetic characteristics (e.g., disease alleles, polymorphisms)
  disease state or normal
  target cell type
  cell line and source (if applicable)
  in vivo treatments (organism or individual treatments)
  in vitro treatments (cell culture conditions)
  treatment type (e.g., small molecule, heat shock, cold shock, food deprivation)
  compound
  is additional clinical information available (link)
  separation technique (e.g., none, trimming, microdissection, FACS)
laboratory protocol for sample treatment

27 Excerpts from a Sample Description
Courtesy of M. Hoffman, S. Schmidtke, Lion BioSciences
Organism: mus musculus [ NCBI taxonomy browser ]
Cell source: in-house bred mice (contact:
Sex: female [ MGED ]
Age: weeks after birth [ MGED ]
Growth conditions: normal controlled environment
  °C average temperature
  housed in cages according to German and EU legislation
  specified pathogen free conditions (SPF)
  14 hours light cycle
  10 hours dark cycle
Developmental stage: stage 28 (juvenile (young) mice) [ GXD "Mouse Anatomical Dictionary" ]
Organism part: thymus [ GXD "Mouse Anatomical Dictionary" ]
Strain or line: C57BL/6 [ International Committee on Standardized Genetic Nomenclature for Mice ]
Genetic Variation: Inbr (J) 150. Origin: substrains 6 and 10 were separated prior to This substrain is now probably the most widely used of all inbred strains. Substrain 6 and 10 differ at the H9, Igh2 and Lv loci. Maint. by J,N, Ola. [ International Committee on Standardized Genetic Nomenclature for Mice ]
Treatment: in vivo [ MGED ] intraperitoneal injection of Dexamethasone into mice, 10 microgram per 25 g bodyweight of the mouse
Compound: drug [ MGED ] synthetic glucocorticoid Dexamethasone, dissolved in PBS
Use for MIAME qualifier, value, source triplet
Use for MAGE ontology entry
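
The bracketed sources in this sample description map naturally onto the MIAME "qualifier, value, source" triplet. As a rough sketch of how such triplets might be held in code, the example below uses a small record type; the `Annotation` class and helper names are hypothetical and are not part of MIAME or MAGE, and only fields whose values are complete in the slide are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    qualifier: str  # what is being described, e.g. "Sex"
    value: str      # the term chosen for it
    source: str     # the vocabulary or ontology the term comes from


# Values copied from the sample description on this slide.
sample = [
    Annotation("Organism", "mus musculus", "NCBI taxonomy browser"),
    Annotation("Sex", "female", "MGED"),
    Annotation("Developmental stage", "stage 28",
               'GXD "Mouse Anatomical Dictionary"'),
    Annotation("Organism part", "thymus",
               'GXD "Mouse Anatomical Dictionary"'),
    Annotation("Treatment", "in vivo", "MGED"),
    Annotation("Compound", "drug", "MGED"),
]


def values_for(annotations, qualifier):
    """All values recorded under one qualifier."""
    return [a.value for a in annotations if a.qualifier == qualifier]


def by_source(annotations):
    """Group qualifiers by the source that defines their terms."""
    grouped = {}
    for a in annotations:
        grouped.setdefault(a.source, []).append(a.qualifier)
    return grouped
```

Grouping by source makes it easy to see which terms are internal (MGED) and which are borrowed from external vocabularies.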

28 MGED Ontology Working Group Goals
Identify concepts Collect available controlled vocabularies and ontologies for concepts Define concepts Formalize concept relationships

29 MGED Biomaterial Ontology
Under construction
Using OILed (Not wedded to any one tool)
Generate multiple formats: RDFS, DAML+OIL
Define classes, provide relations and constraints, identify instances
Motivated by MIAME and coordinated with MAGE
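
To give a feel for what class definitions look like once exported as RDFS, here is a minimal sketch that writes class and subclass statements as N-Triples. It is not the output of OILed: the namespace is a placeholder, and the parent link between `CompoundBasedTreatment` and `Treatment` is assumed for illustration. Only the RDF and RDFS vocabulary URIs are standard.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Placeholder namespace, not the URI of the actual MGED ontology.
NS = "http://example.org/biomaterial#"

# (class, parent) pairs; the parent links are assumed for illustration.
CLASSES = [
    ("Treatment", None),
    ("CompoundBasedTreatment", "Treatment"),
    ("Compound", None),
]


def to_ntriples(classes, ns=NS):
    """Serialize class declarations and subclass links as N-Triples."""
    lines = []
    for name, parent in classes:
        # Declare the term as an RDFS class.
        lines.append(f"<{ns}{name}> <{RDF_TYPE}> <{RDFS_CLASS}> .")
        if parent is not None:
            # Record the IS-A link to its parent class.
            lines.append(
                f"<{ns}{name}> <{RDFS_SUBCLASS_OF}> <{ns}{parent}> ."
            )
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_ntriples(CLASSES))
```

Each class yields one type statement, plus one `subClassOf` statement when it has a parent.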

30 MAGE BioMaterial Model

31 Building a Microarray Ontology

32 Ontology Available as RDFS

33 Ontology in Browseable Form

34 Example of Internal Terms

35 Example of External Terms

36 Example of Combined Internal and External: Treatment

37 OWG Use Cases
Return a summary of all experiments that use a specified type of biosource.
  Use “age” to select and order experiments
  Use Mouse Anatomical Dictionary Stage 28 to pick experiments according to “organism part”
Return a summary of all experiments done examining effects of a specified treatment
  E.g., Look for “CompoundBasedTreatment”, “in vivo”
  Select “Compound” based on CAS registry number
  Order based on “CompoundMeasurement”
Build gene networks based on biomaterial description
  Generate a distance metric based on biosource and use in calculation of correlation with gene expression level
  Generate an error estimation based on biosample (i.e., even when biosources are identical, there will be variation resulting from different treatments)
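
The first two use cases amount to filtering experiments on ontology terms and ordering by a measured value. The sketch below shows one way this could look over in-memory records. Everything in it is hypothetical: the record fields, the experiment IDs, the ages and doses, and the compound identifiers (`CAS-A`, `CAS-B` are placeholders, not real CAS registry numbers).

```python
# Hypothetical experiment records; all values are invented for illustration.
EXPERIMENTS = [
    {"id": "exp1", "organism_part": "thymus", "age_weeks": 6,
     "treatment": "CompoundBasedTreatment", "setting": "in vivo",
     "compound": "CAS-A", "dose": 10.0},
    {"id": "exp2", "organism_part": "thymus", "age_weeks": 4,
     "treatment": "CompoundBasedTreatment", "setting": "in vivo",
     "compound": "CAS-A", "dose": 2.5},
    {"id": "exp3", "organism_part": "liver", "age_weeks": 6,
     "treatment": "CompoundBasedTreatment", "setting": "in vitro",
     "compound": "CAS-A", "dose": 5.0},
    {"id": "exp4", "organism_part": "thymus", "age_weeks": 8,
     "treatment": "CompoundBasedTreatment", "setting": "in vivo",
     "compound": "CAS-B", "dose": 1.0},
]


def by_biosource(experiments, organism_part):
    """Experiments on one organism part, ordered by age."""
    hits = [e for e in experiments if e["organism_part"] == organism_part]
    return sorted(hits, key=lambda e: e["age_weeks"])


def by_treatment(experiments, treatment, setting, compound):
    """Experiments matching a treatment type, setting and compound,
    ordered by the compound measurement (here, dose)."""
    hits = [
        e for e in experiments
        if e["treatment"] == treatment
        and e["setting"] == setting
        and e["compound"] == compound
    ]
    return sorted(hits, key=lambda e: e["dose"])
```

The queries only work because every record uses the same terms for organism part, treatment type and compound, which is the point of agreeing on an ontology first.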

38 Ontology Working Group Highlights
First pass ontology of biomaterial descriptions
Participated in Bio-ontologies Consortium Meeting at ISMB 2001.
Mail list of about 200 subscribers

39 Ontology Working Group Plans
Finish building biomaterial description ontology
Expand efforts to include remaining parts of a microarray experiment
Demonstrate usage to the microarray community

40 Acknowledgements
The members of the MGED Ontology Working Group for their contributions
The Bio-Ontologies Consortium for encouragement and guidance
This presentation is available at

