ArrayExpress - a Public Repository for Microarray Based Gene Expression Data
European Bioinformatics Institute - EMBL outstation and German Cancer Research Centre
The decision to establish a public repository for array based gene expression data
• Responding to the needs of EBI industry partners, EBI became involved in gene expression data analysis in 1997
• As a result of this work, the need for the repository became apparent about 18 months ago
• After consulting many of the major microarray laboratories worldwide, the decision was made to start designing the database (Nature 398, p. 646, March 99)
ArrayExpress database
• A public repository for array based gene expression data
• Draft conceptual design for a public repository for array based gene expression data in Rational Rose
• In close collaboration with DKFZ and Sanger Centre
• Discussions with other potential data submitters worldwide
Expression Profiler
• Internet-based tool for gene expression data analysis, with a database of yeast expression profiles for about 100 experimental conditions from Stanford and MIT
• Option for uploading users' own data
• Available online (first online tool as far as we know)
• Still under development; will be used as an “attractor” for data submissions
Motivation of the design
• When representing data from physical experiments, we have to decide on the level of pre-processing before storing
• Two perspectives
  – high level data analysis
  – not losing essential information
• One number per spot vs. image, gene centric vs. spot centric
Conservative compromise
• Raw data captured
• Information structured in a way that high level views can be easily precomputed
• ArrayExpress
  – image analysis output for each spot
  – “divide and conquer” approach in annotations
  – reusability of information already in the database
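To illustrate how a high level view might be precomputed from stored spot-level data, here is a minimal Python sketch. It is not the ArrayExpress implementation: the function name, the dictionary layout, the gene identifiers, and the choice of averaging replicate spots are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def gene_centric_view(spot_intensities, spot_to_gene):
    """Collapse spot-centric values into one number per gene.

    spot_intensities: hypothetical mapping of spot id -> quantitated intensity
    spot_to_gene:     hypothetical mapping of spot id -> gene id (may be partial)
    Averaging replicate spots is an assumption, not a documented rule.
    """
    per_gene = defaultdict(list)
    for spot_id, intensity in spot_intensities.items():
        gene = spot_to_gene.get(spot_id)
        if gene is not None:  # spots without a gene link are left out of this view
            per_gene[gene].append(intensity)
    return {gene: mean(values) for gene, values in per_gene.items()}


# Illustrative spot-level data; the raw values stay stored, the view is derived.
spots = {"s1": 100.0, "s2": 300.0, "s3": 50.0, "s4": 75.0}
links = {"s1": "YAL001C", "s2": "YAL001C", "s3": "YAL002W"}
view = gene_centric_view(spots, links)
```

Because the spot-level values are kept, a different summary (for example a median) could be computed later without resubmitting the data.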
Two types of submissions
• Experiment - a set of hybridisations
• Array description
“Divide and conquer” approach
• Experiment - a set of hybridisations
• A single hybridisation - array + sample
• Array description (grid + spots)
  – spots may be linked to genes
• Sample description
• Hybridisation analysis (scanning, quantitation, data)
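The decomposition above can be sketched as a set of Python data classes. This is only an illustration of the structure listed on the slide, not the actual ArrayExpress schema: every class and field name below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Spot:
    row: int
    column: int
    gene_id: Optional[str] = None  # spots may be linked to genes


@dataclass
class ArrayDescription:
    name: str
    grid_rows: int
    grid_columns: int
    spots: List[Spot] = field(default_factory=list)


@dataclass
class SampleDescription:
    name: str
    organism: str


@dataclass
class HybridisationAnalysis:
    scanning: str       # free-text description of the scanning step (assumed)
    quantitation: str   # free-text description of the quantitation step (assumed)
    data: Dict[Tuple[int, int], float] = field(default_factory=dict)  # (row, column) -> value


@dataclass
class Hybridisation:
    array: ArrayDescription    # a single hybridisation = array + sample
    sample: SampleDescription
    analysis: HybridisationAnalysis


@dataclass
class Experiment:
    title: str
    hybridisations: List[Hybridisation] = field(default_factory=list)


# One array description is submitted once and reused by several hybridisations.
array = ArrayDescription("demo array", 1, 2, [Spot(0, 0, "YAL001C"), Spot(0, 1)])
experiment = Experiment("demo experiment")
for sample_name in ("control", "treated"):
    sample = SampleDescription(sample_name, "Saccharomyces cerevisiae")
    analysis = HybridisationAnalysis("scan", "quantify", {(0, 0): 1.0, (0, 1): 2.0})
    experiment.hybridisations.append(Hybridisation(array, sample, analysis))
```

Both hybridisations point at the same array object, which mirrors the idea of reusing information already in the database rather than describing the array again for each experiment.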
Top level structure of ArrayExpress database
[Diagram. ArrayExpress: Experiment, Hybridisation, Array, Target, Analysis. External links: Publication (e.g., PubMedCentral), Gene (e.g., EMBL), Source (e.g., Taxonomy).]
ArrayExpress - the conceptual schema