Published by Juniper Dean. Modified over 9 years ago.
1
1 MIAME The MIAME website: http://www.mged.org © 2002 Norman Morrison for Manchester Bioinformatics.
2
2 Overview
Why capture meta-data?
The data capture challenges
–What to capture?
–How to capture it?
–Who agrees what to capture?
3
3 Post-genome data bioinformatics genome transcriptome proteome interactome metabolome textome mobileome phenome
4
4 Why meta-data?
Genome data is static
Post-genome data is very state-dependent
–Transcriptome = no. of cell types × no. of environmental conditions
–Annotation matters
–Data comparisons matter
–Learn from the gene debacle
Protein-tyrosine phosphatase, non-receptor type 6, Protein-tyrosine phosphatase 1C, PTP-1C, Hematopoietic cell protein-tyrosine phosphatase, SH-PTP1, Protein-tyrosine phosphatase SHP-1
LARD, death receptor 3 beta, WSL-1R protein, lymphocyte associated receptor of death, death receptor 3
We need repositories
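The naming problem on this slide can be sketched as a synonym lookup. This is only an illustration: the names are copied from the slide, splitting them into two groups is a reading of the slide, and using the first name in each group as the canonical label is an arbitrary assumption, not an official nomenclature.

```python
# Synonym groups copied from the slide. The "canonical" key for each group is
# simply the first name listed (an assumption made for this illustration).
SYNONYMS = {
    "Protein-tyrosine phosphatase, non-receptor type 6": [
        "Protein-tyrosine phosphatase 1C",
        "PTP-1C",
        "Hematopoietic cell protein-tyrosine phosphatase",
        "SH-PTP1",
        "Protein-tyrosine phosphatase SHP-1",
    ],
    "LARD": [
        "death receptor 3 beta",
        "WSL-1R protein",
        "lymphocyte associated receptor of death",
        "death receptor 3",
    ],
}


def build_lookup(synonyms):
    """Map every name (lower-cased) to the canonical label of its group."""
    lookup = {}
    for canonical, aliases in synonyms.items():
        for name in [canonical, *aliases]:
            lookup[name.lower()] = canonical
    return lookup


LOOKUP = build_lookup(SYNONYMS)


def canonical_name(name):
    """Return the canonical label for a name, or None if it is unknown."""
    return LOOKUP.get(name.strip().lower())
```

Two data sets annotated with "SH-PTP1" and "PTP-1C" only become comparable once both names resolve to the same label, which is the kind of agreement a repository has to enforce.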
5
5 Microarray Repositories A repository is a primary source of data generated by experimentalists. Its main role is to enforce standards and quality thresholds and to make data widely available. Needs standards.
6
6 Microarray Repositories II
Repositories allow for easier data exchange between groups
Ensure that key details are kept
BUT:
–What should be captured and how
Requires international cooperation
–Minimum Information About a Microarray Experiment (MIAME)
–Developed within MGED
7
7 MIAME – Major Sections
Array design
–Reporters
–Features
–Control elements
Experimental design
–Experiment type
–Sample details
–Hybridisations
–Measurements
8
8 The Six Parts of MIAME
1. Experimental design: the set of hybridization experiments as a whole
2. Array design: each array used and each element (spot, feature) on the array
3. Samples: samples used, extract preparation and labeling
4. Hybridizations: procedures and parameters
5. Measurements: images, quantification and specifications
6. Normalization controls: types, values and specifications
9
9 MIAME Glossary
10
10 Value of audit
Based on (qualifier, value, source)
Qualifier: cell type
Value: epithelial
Source: Gray’s Anatomy (38th ed.)
or
Qualifier: treatment
Value: 15heat shock
Source: Smith and Jones, Nature Genet. (1992)
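The (qualifier, value, source) triplet maps naturally onto a small record type. The sketch below is one possible representation; the class name `Annotation` and the method `is_auditable` are hypothetical and not taken from MIAME or any MGED software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One audited annotation: what was measured, its value, and who says so."""
    qualifier: str
    value: str
    source: str

    def is_auditable(self) -> bool:
        # An annotation can only be audited if all three parts are filled in.
        return all(part.strip() for part in (self.qualifier, self.value, self.source))


# First example from the slide.
cell_type = Annotation("cell type", "epithelial", "Gray's Anatomy (38th ed.)")

# The same statement without a source cannot be traced back to an authority.
unsourced = Annotation("cell type", "epithelial", "")
```

Keeping the source alongside the value is what makes the annotation checkable later: a reader can look up what "epithelial" meant to the person who recorded it.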
11
11 MIAME definitions
Available from www.mged.org
A minimum document to be read
All details mentioned in MIAME should be captured somewhere
–Know where they are
Latest draft: Version 1.1 (Draft 5, March 5, 2002)
–Discussed at MGED IV
See also: A. Brazma et al., Nature Genetics, vol. 29 (December 2001), pp. 365–371
12
12 MIAME part 1: Array description
In principle this is someone else’s problem
–(e.g. Affymetrix, Clontech, etc.)
Three levels of array design elements:
–feature – the location on the array
–reporter – the nucleotide sequence present in a particular location on the array
–composite sequence – a set of reporters used collectively to measure the expression of a particular gene, exon, or splice-variant
Array design has 5 parts:
1.1 Array related information
1.2 Reporter information
1.3 Feature information
1.4 Composite sequences
1.5 Control elements
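The three levels of array design element can be sketched as three linked record types. The class and field names below are illustrative assumptions, not the MAGE-OM class definitions, and the reporter names and sequences are made up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Reporter:
    """The nucleotide sequence placed at one or more locations on the array."""
    name: str
    sequence: str


@dataclass(frozen=True)
class Feature:
    """A physical location on the array and the reporter spotted there."""
    row: int
    column: int
    reporter: Reporter


@dataclass
class CompositeSequence:
    """A set of reporters that together measure one gene, exon or splice variant."""
    target: str
    reporters: List[Reporter] = field(default_factory=list)

    def features_on(self, features):
        """Return the features whose reporter belongs to this composite sequence."""
        wanted = set(self.reporters)
        return [f for f in features if f.reporter in wanted]


# Made-up example: two reporters for one target plus a control reporter.
probe_a = Reporter("probe_a", "ACGTACGT")
probe_b = Reporter("probe_b", "TTGACCAA")
control = Reporter("control", "GGGGCCCC")

features = [
    Feature(0, 0, probe_a),
    Feature(0, 1, probe_b),
    Feature(1, 0, probe_a),  # the same reporter may be spotted more than once
    Feature(1, 1, control),
]

gene_x = CompositeSequence("hypothetical gene X", [probe_a, probe_b])
gene_x_features = gene_x.features_on(features)
```

The separation matters because one reporter can occupy several features, and one gene can be measured by several reporters.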
13
13 MIAME part 2: Experimental design
This is your problem
Experimental design has four parts
2.1 Experimental design
2.2 Sample
2.3 Hybridisation
2.4 Measurements
14
14 2.1 Experimental design
Design and purpose of the set of hybridisations
Author, lab and contact
Experiment type
Experimental factors
Number of hybridisations
Common reference
QC steps
Experiment description (plus refs)
Anything else
15
15 2.2 Sample
Biosource properties – organism, contact, cell type, sex, …
Biomaterial manipulation – growth conditions, in vivo treatment, compound
Sample labelling – label used, amount, method
Spiked controls – feature, type
Anything else
16
16 2.3 Hybridisation
Relationship between samples and arrays
Protocol – full description
Anything else
17
17 2.4 Measurement
Raw data – scanner files, scanning protocol
Scanning protocol – parameter settings
Analysis and quantification – analysis output, protocol (e.g. algorithms)
Normalisation – strategies and algorithms, final gene expression table
Anything else
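Sections 2.1 to 2.4 amount to a checklist, so a submission can be checked mechanically for gaps. The sketch below assumes a submission is held as a nested dictionary; the key names are hypothetical paraphrases of the slide headings, not field names from the MIAME document, and the example record is invented.

```python
# Items paraphrased from slides 2.1-2.4; the key names are illustrative only.
REQUIRED = {
    "experimental_design": [
        "author", "experiment_type", "experimental_factors",
        "number_of_hybridisations", "description",
    ],
    "sample": ["biosource_properties", "biomaterial_manipulation", "sample_labelling"],
    "hybridisation": ["sample_array_relationship", "protocol"],
    "measurement": ["raw_data", "scanning_protocol", "quantification", "normalisation"],
}


def missing_items(record):
    """List every 'section.item' that is absent or empty in the record.

    Note: empty strings, empty lists and 0 all count as missing here.
    """
    missing = []
    for section, keys in REQUIRED.items():
        supplied = record.get(section, {})
        for key in keys:
            if not supplied.get(key):
                missing.append(f"{section}.{key}")
    return missing


# Invented, deliberately incomplete submission.
record = {
    "experimental_design": {
        "author": "A. Researcher",
        "experiment_type": "time course",
        "experimental_factors": ["time"],
        "number_of_hybridisations": 6,
        "description": "Illustrative record only",
    },
    "sample": {
        "biosource_properties": {"organism": "Saccharomyces cerevisiae"},
        "sample_labelling": "Cy3",
    },
    "hybridisation": {
        "sample_array_relationship": "one sample per array",
        "protocol": "see lab protocol",
    },
}

gaps = missing_items(record)
```

A check like this only confirms that something was supplied for each item. Whether the supplied text is meaningful is a separate problem.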
18
18 MAGE, ontologies and maxd The MAGE website: http://www.mged.org © 2002 Norman Morrison for Manchester Bioinformatics.
19
19 Outline
MIAME is useful, but …
–How can we represent it computationally?
–How can we use it to share and exchange data?
–The wonderful world of XML
–The evil that is free text: ontologies and controlled vocabularies
maxd – MIAME-supportive, MAGE-ML-compliant analysis of microarray data
20
MAGE-ML, MAGE-OM
MIAME sets a standard for what knowledge (meta-data) to capture
But how to do it?
Need a knowledge model – a schema to represent the concepts and the relationships between them
21
Knowledge capture
UML (Unified Modeling Language) provides a methodology for capturing knowledge in ways that are computationally tractable (cf. database schemas)
MAGE-OM is the MGED-approved UML model, which attempts to capture the concepts in MIAME
22
XML
A UML diagram is not useful by itself
MAGE-ML is an attempt to capture MAGE-OM in XML (eXtensible Markup Language) – the next generation HTML
MAGE-ML provides a structure for a text document (marked up with tags) which describes a microarray experiment
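To give a feel for what "a text document marked up with tags" looks like, the sketch below builds a tiny XML description of an experiment with Python's standard library. The element and attribute names (`Experiment`, `Sample`, `Annotation`, `qualifier`) are invented for illustration and are not the real MAGE-ML element names, which are considerably richer.

```python
import xml.etree.ElementTree as ET


def build_experiment_xml(name, samples):
    """Serialise an experiment as XML using made-up (non-MAGE-ML) tag names.

    `samples` maps a sample id to a dict of qualifier -> value annotations.
    """
    root = ET.Element("Experiment", {"name": name})
    for sample_id, annotations in samples.items():
        sample = ET.SubElement(root, "Sample", {"id": sample_id})
        for qualifier, value in annotations.items():
            annotation = ET.SubElement(sample, "Annotation", {"qualifier": qualifier})
            annotation.text = value
    return ET.tostring(root, encoding="unicode")


xml_text = build_experiment_xml("demo", {"s1": {"cell type": "epithelial"}})

# Any XML-aware tool can read the document back without custom parsing code.
parsed = ET.fromstring(xml_text)
```

The point of the markup is the round trip: the structure survives being written out and read back in by a different program.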
23
MAGE-ML
MAGE-ML is not nice!
–Complex
–Not easily human readable
–Needs software tools to help create it
–Very rich
MAGE-ML is the standard we have to work to.
24
ArrayExpress
ArrayExpress is the new public microarray data repository based at the EBI
Provides tools to help create MAGE-ML
Experiments will not be entered unless the annotation is of a high quality
25
Making MAGE usable
For a repository we need a relational database – not an object model
We have created a relational implementation of the MAGE-OM which is MIAME compliant (based on an early UML diagram for ArrayExpress) – maxdSQL
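Moving from an object model to a relational one means turning classes into tables and object references into foreign keys. The sketch below shows the idea with an in-memory SQLite database. The three tables and their columns are invented for illustration and are not the maxdSQL schema.

```python
import sqlite3

# Hypothetical three-table schema: objects become rows, references become keys.
SCHEMA = """
CREATE TABLE experiment (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE sample (
    id            INTEGER PRIMARY KEY,
    experiment_id INTEGER NOT NULL REFERENCES experiment(id),
    name          TEXT NOT NULL
);
CREATE TABLE hybridisation (
    id         INTEGER PRIMARY KEY,
    sample_id  INTEGER NOT NULL REFERENCES sample(id),
    array_name TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Invented example data.
conn.execute("INSERT INTO experiment (id, name) VALUES (1, 'demo experiment')")
conn.executemany(
    "INSERT INTO sample (id, experiment_id, name) VALUES (?, ?, ?)",
    [(1, 1, "control"), (2, 1, "treated")],
)
conn.executemany(
    "INSERT INTO hybridisation (sample_id, array_name) VALUES (?, ?)",
    [(1, "array A"), (2, "array B")],
)

# Which sample went onto which array, per experiment?
rows = conn.execute(
    """
    SELECT e.name, s.name, h.array_name
    FROM experiment e
    JOIN sample s ON s.experiment_id = e.id
    JOIN hybridisation h ON h.sample_id = s.id
    ORDER BY s.id
    """
).fetchall()
```

A query like the last one is the practical payoff of the relational form: the relationship between samples and arrays can be asked for directly instead of being walked through an object graph.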
26
Data repositories Relational version of MAGE-OM
27
Outstanding issues – free text
MAGE provides a structure for the knowledge – not a prescription for what gets put in
How to control what people put in the free text areas of MIAME/MAGE (the Mickey Mouse /. problem)
How do we define what is meant in ways that other people/software understand
28
Solution 1
Controlled vocabularies
–Agreed lists of terms (and definitions) that a community agrees to use
Pros: technically simple, easy to implement
Cons: limiting, how to get agreement?, terms on their own are not very descriptive
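A controlled vocabulary is simple to enforce in software, which is the main point in its favour. The sketch below checks a free-text value against an agreed list; the vocabularies shown are invented placeholders, not an MGED-approved list.

```python
# Hypothetical agreed term lists, keyed by qualifier.
CONTROLLED_VOCABULARY = {
    "cell type": {"epithelial", "fibroblast", "lymphocyte"},
    "sex": {"female", "male", "unknown"},
}


def check_term(qualifier, value):
    """Return True if the value is an agreed term for this qualifier.

    Raises KeyError if no vocabulary has been agreed for the qualifier.
    """
    allowed = CONTROLLED_VOCABULARY.get(qualifier)
    if allowed is None:
        raise KeyError(f"no controlled vocabulary for qualifier {qualifier!r}")
    return value.strip().lower() in allowed
```

For example, `check_term("cell type", "Epithelial")` passes while `check_term("cell type", "mickey mouse")` is rejected. The limitation listed under "Cons" is also visible: the check says nothing about how the accepted terms relate to each other.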
29
Solution 2
Ontologies
–Can be thought of as a set of agreed terms and the relationships between them (a taxonomy is a simple ontology in which the only relationship allowed is an is-a relationship)
Pros: a very rich and powerful infrastructure
Cons: complex
Many developments – a space to watch
–Chris Stoeckert and Helen Parkinson http://www.cbil.upenn.edu/Ontology/MGED_ontology.html
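The simplest case mentioned above, a taxonomy with only is-a relationships, can be sketched as a mapping from each term to its parent. The terms below are invented for illustration and are not taken from the MGED ontology.

```python
# Hypothetical taxonomy: each term points to its single is-a parent.
IS_A = {
    "epithelial cell": "animal cell",
    "lymphocyte": "animal cell",
    "animal cell": "cell",
}


def ancestors(term, is_a=IS_A):
    """Follow is-a links upward and return the chain of more general terms."""
    chain = []
    seen = {term}
    while term in is_a:
        term = is_a[term]
        if term in seen:
            raise ValueError("cycle in is-a relationships")
        seen.add(term)
        chain.append(term)
    return chain


def is_kind_of(term, general_term, is_a=IS_A):
    """True if `term` is the same as, or a specialisation of, `general_term`."""
    return term == general_term or general_term in ancestors(term, is_a)
```

Unlike a flat term list, the relationships let software answer questions nobody annotated explicitly: a search for "cell" can also return samples annotated as "lymphocyte".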