Published by Brittney Waters. Modified over 9 years ago.
The BioInvestigation Index – Standards and Infrastructure for Omics Data

Philippe Rocca-Serra, Marco Brandizi, Nataliya Sklyar, Eamonn Maguire, Chris Taylor, Gabriella Rustici and Susanna-Assunta Sansone

EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
The European Bioinformatics Institute is an Outstation of the European Molecular Biology Laboratory.

The standards scenario - Introduction

1. Growing complexity of datasets

The marriage of traditional approaches with genomics, transcriptomics, proteomics and metabol/nomics technologies (hereafter referred to as ‘omics’) has created not only opportunities, but also substantial new informatics challenges. For example, consider the reporting of a complex multi-assay study looking at the effect, on a number of subjects, of a compound inducing liver damage by characterizing the urine metabolic profile (by mass spectroscopy), measuring liver protein and gene expression (by mass spectrometry and DNA microarrays, respectively), and conducting conventional histopathological analysis.

Figure 1. Example of a multi-assay study.

It is pivotal that such complex metadata (i.e., sample characteristics, study design, assay execution, sample-data relationships) are reported in a standard manner, so that the final results (or data) they contextualize can be interpreted correctly.

2. Fragmentation of reporting standards

Many groups have risen to this challenge; standards for collecting, describing, formatting, submitting and exchanging both metadata and data are either under development or have been released. Several standards initiatives addressing particular technologies or defined domains of application (e.g., genomics, microarray, proteomics, metabol/nomics and systems biology models) have emerged from the academic community, in many cases with the support of commercial organizations such as instrument vendors. Such initiatives are focused on supporting tool interoperability and data exchange among public and proprietary systems, by developing 3 kinds of (de facto) reporting standards: minimal information specification (checklists), semantics (ontologies) and syntax (file formats).

However, the focus on particular communities’ interests, be they individual ‘omics’ technologies or specific biological/biomedical disciplines, leads to duplication of effort and, more seriously, to the development of (largely arbitrarily) different standards. This fragmentation severely hinders the interoperability of databases and tools, of reporting standards, and ultimately the integration of datasets. ArrayExpress and PRIDE, the EBI production systems for microarray and proteomics data respectively, illustrate the implementation of such a scenario.

Figure 2. Independent databases, with different submission/exchange formats; diverse representations of the metadata; use of different terminologies.

3. Towards interoperable reporting standards

Fortunately, several synergistic activities foster the harmonization of the 3 kinds of standards being developed. Over 22 groups participate in the MIBBI project, which offers a one-stop shop for those exploring the range of extant ‘minimum information’ checklists, and which fosters collaborative and integrative development [1]. More than 60 groups participate in the OBO Foundry [2] to coordinate the development of orthogonal, interoperable ontologies, such as OBI [3], to support data integration. Several groups participate in the FuGE project to develop a generic data model to underpin a variety of XML-based file formats [4]. And recently, a growing number of communities have started to work collaboratively on ISA-TAB, a tabular framework for presenting metadata [5], which also serves as a user-friendly presentation layer for XML-based formats (via an XSLT).

Interoperable reporting standards facilitate the development of standards-compliant products by academic and commercial software developers, instrument vendors, etc. They do so by limiting the range and variability of standards for such parties to consider.

BioInvestigation Index – Overview

The BioInvestigation Index infrastructure aims to create a common structured representation and storage mechanism for metadata and the sample-data relationships of biological, biomedical and environmental studies, which commonly range from simple single-assay studies to complex multi-assay studies, as illustrated in Figure 1. The infrastructure relies on existing production systems, such as ArrayExpress and PRIDE, but avoids the fragmentation by leveraging the synergistic reporting standards described in section 3. The infrastructure’s main components and their use of the synergistic reporting standards are described in the figure below.

Prototype launch in Fall/Winter 2008: www.ebi.ac.uk/bioinvindex. Information and announcements at: www.ebi.ac.uk/net-project.

References

1. Taylor CF, Field D, Sansone SA,… Rocca-Serra P et al. (2008) The MIBBI Project. Nat Biotechnol. Aug;26(8):889-896. http://www.mibbi.org
2. Smith B, Ashburner M, Rosse C,… Rocca-Serra P, …Sansone SA et al. (2007) The OBO Foundry. Nat Biotechnol. Nov;25(11):1251-5. http://www.obofoundry.org
3. Ontology for Biomedical Investigations (OBI): http://obi-ontology.org
4. Jones AR, Miller M, Aebersold R,… Sansone SA et al. (2007) The Functional Genomics Experiment model (FuGE). Nat Biotechnol. Oct;25(10):1127-1133. http://fuge.sf.net
5. Sansone SA, Rocca-Serra P, Brandizi M,… Taylor CF et al. (2008) The First MGED RSBI (ISA-TAB) Workshop. OMICS. Jun;12(2):143-9. http://isatab.sf.net

Acknowledgements

The authors acknowledge the EU integrated project CarcinoGENOMICS (http://www.carcinogenomics.eu, LSHB-2006-037712), the EU Network of Excellence NuGO (http://www.nugo.org, NoE-503630), BBSRC grants (workshop on standards and ontology, BB/E025080/1, and MIBBI, BB/G000638/1), the UK NERC Bioinformatics Centre partnership fund and the EMBL-EBI. The authors also acknowledge the contributions of the ArrayExpress and PRIDE teams, and of the MIBBI, OBO Foundry, OBI, FuGE and ISA-TAB communities.
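
As an illustration of what a common, tabular representation of sample-data relationships might look like for the multi-assay liver-damage study described in section 1, the sketch below parses a small tab-delimited table and groups assay records by subject. It is not the ISA-TAB specification or the BioInvestigation Index schema: the column names, subject and sample identifiers, file names and the helper functions `index_by_subject` and `technologies_for` are all hypothetical, chosen only to show the idea of keeping every data file linked to the sample it came from.

```python
import csv
import io
from collections import defaultdict

# Hypothetical tab-delimited metadata table: one row per (sample, assay) pair.
# Column names, identifiers and file names are invented for illustration only.
TABLE = (
    "subject\tsample\tmaterial\tassay\ttechnology\tdata_file\n"
    "S1\tS1-urine\turine\tmetabolic profiling\tmass spectrometry\ts1_urine.raw\n"
    "S1\tS1-liver\tliver\tprotein expression\tmass spectrometry\ts1_liver_prot.raw\n"
    "S1\tS1-liver\tliver\tgene expression\tDNA microarray\ts1_liver_array.cel\n"
    "S1\tS1-liver\tliver\thistopathology\thistopathology\ts1_liver_histo.tif\n"
    "S2\tS2-urine\turine\tmetabolic profiling\tmass spectrometry\ts2_urine.raw\n"
    "S2\tS2-liver\tliver\tgene expression\tDNA microarray\ts2_liver_array.cel\n"
)


def index_by_subject(text):
    """Group assay rows by subject so each data file stays linked to its sample."""
    index = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        index[row["subject"]].append(row)
    return dict(index)


def technologies_for(index, subject):
    """Return the sorted, de-duplicated technologies applied to one subject."""
    return sorted({row["technology"] for row in index[subject]})


study = index_by_subject(TABLE)

# Print one line per sample-to-data relationship.
for subject, rows in study.items():
    for row in rows:
        print(f"{subject}: {row['sample']} -> {row['assay']} "
              f"({row['technology']}) -> {row['data_file']}")
```

Because every row carries both the sample and the data file, a single table can span several technologies without a separate format per assay type, which is the kind of fragmentation the poster argues against.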