The Functional Genomics Experiment Model (FuGE)
Andy Jones
School of Computer Science and Faculty of Life Sciences, University of Manchester


History
Data sharing for ‘omics data tackled by various groups:
  – MAGE format for microarrays (MGED 2002)
  – PEDRo for proteomics (U. Man 2003)
Problems for functional genomics:
  – Common parts modelled differently
  – Labs performing both techniques must create 2 complex applications to describe similar concepts
  – Difficult to integrate data
Two efforts to merge MAGE and PEDRo (2004)
  – Merged models even more complex
  – Did not cover other techniques e.g. metabolomics
  – But, significant advantages if upstream details can be described only once!

Introduction to FuGE
Functional Genomics Experiment model (FuGE)
Models common components across functional genomics experiments
  – Sample description, experimental variables, protocols, multidimensional data
Three uses of FuGE:
  1. A data format for representing laboratory workflows
  2. Supplement existing data formats with additional metadata to describe their context within a workflow
  3. A framework for building new data formats

FuGE structure
[Package diagram: FuGE is divided into Common and Bio]
  – Common: Measurement, Audit, Ontology, Protocol, Reference, Description
  – Bio: Investigation, Data, Material, Conceptual Molecule
Common:
  – General data format management
  – Auditing
  – Referencing external resources
  – Protocols
Bio:
  – Investigation structure
  – Data
  – Materials (organisms, solutions, compounds)
  – Theoretical molecules, e.g. sequences, metabolites stored in a database
FuGE exists as:
  1. Object model (UML)
  2. XML schema (UML → XML Schema)
  ...and Java STK, Hibernate relational DB binding etc.

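Use 1: Experiment Workflow
[Diagram: Material → Treatment → Material → Treatment → Material → Treatment → Material → Data Acquisition → Data → Data Transformation → Data]
Legend:
  – Material, Data = inputs and outputs
  – Treatment, Data Acquisition, Data Transformation = ProtocolApplication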
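
The workflow on this slide can be read as a chain of steps, each consuming and producing materials or data. The following is a minimal sketch of that idea in Python, not the FuGE object model itself: the class names `Item` and `ProtocolApplication`, the function `trace_back`, and the sample names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A Material or a Data object: anything a step consumes or produces."""
    name: str
    kind: str  # "Material" or "Data"


@dataclass
class ProtocolApplication:
    """One executed step, e.g. a treatment or a data acquisition (hypothetical class)."""
    protocol: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)


def trace_back(item, steps):
    """Return the protocols that led to `item`, earliest first.

    Follows only the first input of each step, which is enough for a linear chain.
    """
    lineage = []
    current = item
    while current is not None:
        # Find the step that produced the current item, if any.
        producer = next((s for s in steps if current in s.outputs), None)
        if producer is None:
            break
        lineage.append(producer.protocol)
        current = producer.inputs[0] if producer.inputs else None
    lineage.reverse()
    return lineage


# Illustrative workflow: sample names are invented for this example.
sample = Item("tissue sample", "Material")
extract = Item("protein extract", "Material")
raw = Item("raw spectra", "Data")
peaks = Item("peak list", "Data")

steps = [
    ProtocolApplication("Treatment", [sample], [extract]),
    ProtocolApplication("Data Acquisition", [extract], [raw]),
    ProtocolApplication("Data Transformation", [raw], [peaks]),
]

print(trace_back(peaks, steps))
```

Because every step records its inputs and outputs, the provenance of any data set can be recovered by walking the chain backwards, which is the property the slide's legend points at.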

Use 2: Tie Together External Formats
[Diagram: Material → (inputMaterial) → ProtocolApplication → (outputData) → ExternalData → mzData file, with a file format definition]
  – Parser will exist to extract data / parameters from mzData file
  – Material can be used to describe the sample. This connects the MS data with a separation workflow
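
As a rough sketch of this second use, the snippet below links a sample description to an external mzData file through a single step, then writes the link out as XML. Only the names `inputMaterial`, `outputData`, `ProtocolApplication`, `Material` and `ExternalData` come from the slide; the attribute names, identifiers, file path and XML layout are assumptions for illustration and do not follow the real FuGE XML schema.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Material:
    identifier: str
    description: str


@dataclass
class ExternalData:
    identifier: str
    location: str     # where the external file lives; the file itself is not copied
    file_format: str  # name of the file format definition, e.g. "mzData"


@dataclass
class ProtocolApplication:
    identifier: str
    input_material: Material
    output_data: ExternalData


def to_xml(pa):
    """Serialise the link between sample and external file (illustrative layout only)."""
    root = ET.Element("ProtocolApplication", identifier=pa.identifier)
    ET.SubElement(root, "inputMaterial", ref=pa.input_material.identifier)
    ET.SubElement(root, "outputData", ref=pa.output_data.identifier)
    ET.SubElement(
        root,
        "ExternalData",
        identifier=pa.output_data.identifier,
        location=pa.output_data.location,
        fileFormat=pa.output_data.file_format,
    )
    return ET.tostring(root, encoding="unicode")


# Hypothetical identifiers and path, invented for this example.
fraction = Material("m1", "gel band excised after separation")
spectra = ExternalData("d1", "runs/band_07.mzData.xml", "mzData")
ms_run = ProtocolApplication("pa1", fraction, spectra)

print(to_xml(ms_run))
```

The mass spectrometry file stays in its own format; the surrounding record only points at it and states which sample went in, so an existing parser for mzData can still be used on the file.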

Use 3: Build extension data formats
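
The slide gives only a title, so the following sketch rests on an assumption: that a technique-specific format is built by extending generic classes through inheritance, so that generic tools keep working on the extended objects. The classes `Material`, `Protocol`, `Gel` and `ElectrophoresisProtocol`, and their fields, are hypothetical and are not taken from FuGE or from any format that extends it.

```python
from dataclasses import dataclass


@dataclass
class Material:
    """Generic base class shared by all techniques (hypothetical)."""
    name: str


@dataclass
class Protocol:
    """Generic base class for a described procedure (hypothetical)."""
    name: str


@dataclass
class Gel(Material):
    """Technique-specific extension: adds a field only gel experiments need."""
    percent_acrylamide: float


@dataclass
class ElectrophoresisProtocol(Protocol):
    """Technique-specific extension of the generic protocol."""
    voltage: float


def list_materials(materials):
    """Generic tool: relies only on the base class, so it accepts any extension."""
    return [m.name for m in materials]


sample = Material("cell lysate")
gel = Gel("2D gel, run 3", percent_acrylamide=12.0)
run = ElectrophoresisProtocol("second dimension", voltage=200.0)

print(list_materials([sample, gel]))
```

Code written against the base classes never needs to know about the gel-specific fields, which is one way a shared core could let tools be reused across technique-specific formats.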

FuGE Status
Milestone 1 (Sept 2005)
Milestone 2 (Dec 2005)
Milestone 3 (May 2006)
Beta Java software toolkit
  – M2 (March 2006); M3 (Sept 2006)
FuGE v1 (candidate)
  – Currently in PSI standards process
  – Expected to stabilise through this process by March/April 07

Formats extending from FuGE
MAGE version 2 (MGED)
GelML and GelInfoML (PSI)
analysisXML (PSI)
spML (PSI / MSI)
NMR (FuGE being evaluated by MSI)
Planned migration for mzData and other PSI formats
Upstream workflow description for all groups
  – investigation structure and variables, sample description etc.
  – Allows assembly of studies that cross technology boundaries in one data format

Conclusions
FuGE accepted by MGED, PSI and MSI
  – for developing future data formats
  – for describing parts of experiments common across technologies
Moving toward convergence of data formats
Simplifies the process of developing new data standards
Will facilitate data integration and submission of data to public repositories
Improves the uniformity of data sets in public repositories, thus facilitating querying
Web:

Acknowledgements
FuGE development
  – Angel Pizarro (UPenn), Michael Miller (Rosetta), Paul Spellman (Lawrence Berkeley)
  – MGED, PSI, Fred Hutchinson CRC, Genologics
PSI
  – Chris Taylor, Henning Hermjakob, Randy Julian
MSI
  – Nigel Hardy and Helen Jenkins (Aber)
Work on FuGE in Manchester is funded by the BBSRC
Web: