Published by Stella Barrett. Modified over 8 years ago.
1
ArrayExpress
Ugis Sarkans
EMBL-EBI
www.ebi.ac.uk/arrayexpress
2
Outline
– why the domain model is not simple
– ArrayExpress object model
– ArrayExpress implementation status
– future developments
3
Underlying principles
– must accommodate the needs of a technology that is under constant development
– must manage data in the absence of standard measurement units and of standards for reliability information
– gene expression data are meaningful only in the context of the experimental conditions
  – controlled vocabularies and ontologies are needed for unambiguous sample annotation
– MIAME-compliant
4
ArrayExpress - conceptual overview
5
Simple version of AE object model - ArrayExpressBasic
6
Motivation for 2 object models
– many spots - one gene
– raw data - cleaned-up data - ratios - normalizations - higher-level analysis
– how detailed a sample description is needed?
– for data mining we need ways to unify several datasets:
  – array features across different array platforms
  – samples from different experiments
  – various raw and derived measurements
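The "many spots - one gene" point can be illustrated with a minimal Python sketch. It is not taken from ArrayExpress: the spot identifiers, gene names, values, and the function `summarize_by_gene` are all hypothetical, and averaging is only one possible way to combine replicate spots.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical annotation: several spots on one array may report on the same gene.
spot_to_gene = {"spot_001": "geneA", "spot_002": "geneA", "spot_003": "geneB"}

# Hypothetical per-spot measurements (arbitrary units).
spot_values = {"spot_001": 2.0, "spot_002": 4.0, "spot_003": 5.0}


def summarize_by_gene(spot_values, spot_to_gene):
    """Average all spot values that map to the same gene."""
    grouped = defaultdict(list)
    for spot, value in spot_values.items():
        gene = spot_to_gene.get(spot)
        if gene is not None:  # spots without an annotation are skipped
            grouped[gene].append(value)
    return {gene: mean(values) for gene, values in grouped.items()}


gene_values = summarize_by_gene(spot_values, spot_to_gene)
```

Keying the summary on a gene identifier rather than a spot identifier is also what would let values from different array platforms be compared, provided each platform supplies its own spot-to-gene mapping.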
7
ArrayExpressComplete
8
Scope of ArrayExpress object models
– usable for a public repository as well as a laboratory database (e.g., as part of a LIMS)
– implementation of “intermediate” models possible
– mapping to RDBMS tables - not necessarily straightforward
– models and documentation available at www.ebi.ac.uk/arrayexpress
9
ArrayExpress - features
– able to import MAML format
– can deal with both raw and processed data
– independence of:
  – experimental platforms
  – image analysis methods
  – data normalization methods
– object model-based query mechanism
– will support upcoming OMG standard for expression data
10
Key constructs in the AE object model
– structured sample descriptions
– notion of ExpressionValueSet
– several dimensions for ExpressionValues
– Transformations working on ExpressionValueSets and their dimensions
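As a rough illustration of a value set with several dimensions and a transformation acting on it, consider the Python sketch below. Only the name ExpressionValueSet comes from the slide. The three dimensions chosen here (feature, hybridization, value type), the field names, the `transform` helper, and the background-subtraction example are assumptions made for illustration, not the actual ArrayExpress class definitions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumed dimensions: (feature, hybridization, value type).
Key = Tuple[str, str, str]


@dataclass
class ExpressionValueSet:
    features: List[str]
    hybridizations: List[str]
    value_types: List[str]
    values: Dict[Key, float]


def transform(source: ExpressionValueSet,
              input_types: List[str],
              output_type: str,
              func: Callable[..., float]) -> ExpressionValueSet:
    """Derive a new value set by applying func along the value-type dimension."""
    derived: Dict[Key, float] = {}
    for feature in source.features:
        for hyb in source.hybridizations:
            args = [source.values[(feature, hyb, t)] for t in input_types]
            derived[(feature, hyb, output_type)] = func(*args)
    return ExpressionValueSet(source.features, source.hybridizations,
                              [output_type], derived)


# Hypothetical raw data: signal and background for two features, one hybridization.
raw = ExpressionValueSet(
    features=["f1", "f2"],
    hybridizations=["hyb1"],
    value_types=["signal", "background"],
    values={
        ("f1", "hyb1", "signal"): 100.0,
        ("f1", "hyb1", "background"): 20.0,
        ("f2", "hyb1", "signal"): 55.0,
        ("f2", "hyb1", "background"): 5.0,
    },
)

# A transformation that replaces two value types with one derived type.
net = transform(raw, ["signal", "background"], "net_signal", lambda s, b: s - b)
```

The transformation returns a new set and leaves the source unchanged, so raw and derived values can be stored side by side.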
11
Structured representation of sample and treatment relations
[Diagram. Nodes: Sample source; A new state of sample source; Primary sample 1; Primary sample 2; Derived sample 1; Derived sample 2; Extract 1; Extract 2; Labeled extract 1; Labeled extract 2; Hybridization. Edge labels: treatment, extraction, labeling.]
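One way to picture the sample relations on this slide is as a chain of materials, each pointing back to the material it was produced from. The Python sketch below is an assumption-laden reading of the diagram: the `Material` class and `lineage` function are hypothetical, and the assignment of step labels to particular links is a guess, since the flattened slide does not show which edge carries which label.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Material:
    name: str
    parent: Optional["Material"] = None  # material this one was produced from
    step: Optional[str] = None           # step that produced it, if known


def lineage(material: Material) -> List[str]:
    """Return the names from the original source down to the given material."""
    chain = []
    node: Optional[Material] = material
    while node is not None:
        chain.append(node.name)
        node = node.parent
    return list(reversed(chain))


# Node names are from the slide; the step on each link is an assumed reading.
source = Material("Sample source")
primary = Material("Primary sample 1", parent=source)
derived = Material("Derived sample 1", parent=primary, step="treatment")
extract = Material("Extract 1", parent=derived, step="extraction")
labeled = Material("Labeled extract 1", parent=extract, step="labeling")

path = lineage(labeled)
```

Walking the parent links from a labeled extract recovers the full history of the material that went into a hybridization.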
12
Microarray expression value representation
[Diagram labels: expression value types; primary images; composite images (e.g., green/red ratios); primary spots; composite spots; primary measurements; derived values.]
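The distinction between primary measurements and derived values can be sketched with the green/red ratio mentioned on the slide. In the Python example below the spot names and intensities are made up, and taking the base-2 logarithm of the ratio is an assumption added here (a common convention); the slide itself only mentions ratios.

```python
import math


def log2_ratio(green, red):
    """Derived value: log2(green / red), or None if either intensity is not positive."""
    if green <= 0 or red <= 0:
        return None
    return math.log2(green / red)


# Hypothetical primary measurements: one intensity per channel per spot.
primary = {
    "spot_001": {"green": 800.0, "red": 200.0},
    "spot_002": {"green": 150.0, "red": 600.0},
    "spot_003": {"green": 0.0, "red": 50.0},  # unusable: zero green intensity
}

# Derived values are kept separately so the primary measurements stay intact.
derived = {spot: log2_ratio(ch["green"], ch["red"]) for spot, ch in primary.items()}
```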
13
Current status
– object model - stable, supports current MIAME
– physical database schema
– MAML data loader
– populated with one dataset from EMBL
– currently accessible through SQL
14
In development
– data loader - changes following MAML evolution
– annotation & MAML export tool
– Web interface to ArrayExpress
  – programmatic interface will follow
15
Proposed architecture
[Diagram labels: data submission & curation database; data warehouse; application server; Web server; image server?; ArrayExpress curation pipeline; MAML data.]
16
Future developments
– will support upcoming OMG standard for gene expression data (XML, queries)
– diagrammatic interface to sample description submodel
– integration with other databases
– analytical tools running on top of ArrayExpress
– data curation pipeline development
17
Acknowledgements
– MGED - MIAME, MAML
– Incyte - Genomic Knowledge Platform
– OMG gene expression data proposal submitters - Rosetta & NetGenics