The BioInvestigation Index – Standards and Infrastructure for Omics Data

Philippe Rocca-Serra, Marco Brandizi, Nataliya Sklyar, Eamonn Maguire, Chris Taylor, Gabriella Rustici and Susanna-Assunta Sansone

EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
The European Bioinformatics Institute is an Outstation of the European Molecular Biology Laboratory.

The standards scenario - Introduction

1. Growing complexity of datasets

The marriage of traditional approaches with genomics, transcriptomics, proteomics and metabol/nomics technologies (hereafter referred to as ‘omics’) has created not only opportunities, but also substantial new informatics challenges. For example, consider the reporting of a complex multi-assay study looking at the effect, on a number of subjects, of a compound inducing liver damage by characterizing the urine metabolic profile (by mass spectroscopy), measuring liver protein and gene expression (by mass spectrometry and DNA microarrays, respectively), and conducting conventional histopathological analysis. It is pivotal that such complex metadata (i.e., sample characteristics, study design, assay execution, sample-data relationships) are reported in a standard manner, so that the final results (or data) that they contextualize can be interpreted correctly.

Figure 1. Example of a multi-assay study.

2. Fragmentation of reporting standards

Many groups have risen to this challenge; standards for collecting, describing, formatting, submitting and exchanging both metadata and data are either under development or have been released. Several standards initiatives addressing particular technologies or defined domains of application (e.g., genomics, microarray, proteomics, metabol/nomics and systems biology models) have emerged from the academic community, in many cases with the support of commercial organizations such as instrument vendors. Such initiatives are focused on supporting tool interoperability and data exchange among public and proprietary systems, by developing 3 kinds of (de facto) reporting standards: minimal information specification (checklists), semantics (ontologies) and syntax (file formats).

However, this focus on particular communities’ interests, be they individual ‘omics’ technologies or specific biological/biomedical disciplines, leads to duplication of effort and, more seriously, to the development of (largely arbitrarily) different standards. This fragmentation severely hinders the interoperability of databases, tools and reporting standards, and ultimately the integration of datasets.

Figure 2. Independent databases, with different submission/exchange formats; diverse representations of the metadata; use of different terminologies. ArrayExpress and PRIDE (EBI production systems for microarray and proteomics data, respectively) illustrate the implementation of such a scenario.

3. Towards interoperable reporting standards

Fortunately, several synergistic activities foster the harmonization of the 3 kinds of standards being developed. Over 22 groups participate in the MIBBI project, which offers a one-stop shop for those exploring the range of extant ‘minimum information’ checklists, and which fosters collaborative and integrative development [1]. More than 60 groups participate in the OBO Foundry [2] to coordinate the development of orthogonal, interoperable ontologies, such as OBI [3], to support data integration. Several groups participate in the FuGE project to develop a generic data model to underpin a variety of XML-based file formats [4]. And recently, a growing number of communities have started to work collaboratively on ISA-TAB, a tabular framework for presenting metadata [5], which also serves as a user-friendly presentation layer for XML-based formats (via an XSLT).

Interoperable reporting standards facilitate the development of standards-compliant products by academic and commercial software developers, instrument vendors, etc. They do so by limiting the range and variability of standards for such parties to consider.

References

1. Taylor CF, Field D, Sansone SA, … Rocca-Serra P et al. (2008) The MIBBI Project. Nat Biotechnol. Aug;26(8).
2. Smith B, Ashburner M, Rosse C, … Rocca-Serra P, … Sansone SA et al. (2007) The OBO Foundry. Nat Biotechnol. Nov;25(11).
3. Ontology for Biomedical Investigations (OBI).
4. Jones AR, Miller M, Aebersold R, … Sansone SA et al. (2007) The Functional Genomics Experiment model (FuGE). Nat Biotechnol. Oct;25(10).
5. Sansone SA, Rocca-Serra P, Brandizi M, … Taylor CF et al. (2008) The First MGED RSBI (ISA-TAB) Workshop. OMICS. Jun;12(2).

Acknowledgements

The EU integrated project CarcinoGENOMICS (LSHB), EU Network of Excellence NuGO (NoE), BBSRC grants (workshop on standards and ontology, BB/E025080/1, and MIBBI BB/G000638/1), UK NERC Bioinformatics Centre partnership fund and the EMBL-EBI. The authors also acknowledge the contributions of the ArrayExpress and PRIDE teams, and of the MIBBI, OBO Foundry, OBI, FuGE and ISA-TAB communities.

BioInvestigation Index – Overview

The BioInvestigation Index infrastructure aims to create a common structured representation and storage mechanism for metadata and for the sample-data relationships of biological, biomedical and environmental studies, which commonly range from simple single-assay studies to complex multi-assay studies, as illustrated in Figure 1. The infrastructure relies on existing production systems, such as ArrayExpress and PRIDE, but avoids fragmentation by leveraging the synergistic reporting standards described in section 3. The infrastructure’s main components and their use of the synergistic reporting standards are shown in the BioInvestigation Index overview figure.

Prototype launch in Fall/Winter 2008: Information and announcements at:
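To make the idea of a common structured representation of metadata and sample-data relationships more concrete, the multi-assay liver-damage study used as the example in section 1 could be sketched roughly as follows. This is only an illustrative sketch, not the BioInvestigation Index data model or the ISA-TAB format: the class names (`Assay`, `Study`), the method `data_by_sample`, the sample identifiers, the file names and the "histology" technology label are all hypothetical and chosen for the illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Assay:
    """One assay run on one sample (hypothetical structure)."""
    measurement: str  # what is measured
    technology: str   # how it is measured
    sample: str       # identifier of the sample the assay was run on
    data_file: str    # resulting data file (made-up name)


@dataclass
class Study:
    """A study grouping several assays over shared samples."""
    title: str
    assays: List[Assay] = field(default_factory=list)

    def data_by_sample(self) -> Dict[str, List[str]]:
        """Return the sample-data relationships: sample -> data files."""
        links: Dict[str, List[str]] = {}
        for assay in self.assays:
            links.setdefault(assay.sample, []).append(assay.data_file)
        return links


# The four assay types follow the example in the text; everything else
# (identifiers, file names) is invented for illustration.
study = Study(
    title="Effect of a compound inducing liver damage",
    assays=[
        Assay("urine metabolic profile", "mass spectroscopy",
              "subject1.urine", "s1_urine_profile.raw"),
        Assay("liver protein expression", "mass spectrometry",
              "subject1.liver", "s1_liver_proteins.raw"),
        Assay("liver gene expression", "DNA microarray",
              "subject1.liver", "s1_liver_array.cel"),
        Assay("histopathological analysis", "histology",
              "subject1.liver", "s1_liver_section.tif"),
    ],
)

links = study.data_by_sample()
for sample, files in links.items():
    print(sample, "->", ", ".join(files))
```

Keeping the sample identifier on every assay is what allows data files from different technologies to be traced back to the same biological material, which is the kind of cross-assay link that independent, technology-specific databases tend to lose.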