Published by Lionel Dixon. Modified over 8 years ago.
1
The Functional Genomics Experiment Object Model (FuGE)
Andrew Jones, School of Computer Science, University of Manchester
MGED Society
2
What is FuGE?
Various groups have tried to fuse MAGE and PEDRo in the past
  – Such a model would be difficult to manage
FuGE is a model of the common components of functional genomics experiments
Aims to help the development of data standards
Should allow some cross-compatibility between different ‘omics experiments
Microarray & proteome standards will use parts of FuGE for some data formats
3
So, what is FuGE?
An object model in UML (close to 1st stable release)
An XML Schema (in development)
A software API (will be created from UML)
FuGE uses ontologies extensively, such as the MGED Ontology or its successor (FuGO)
Developed by members of MGED / PSI with input from cross-omics experimentalists, e.g. RSBI
4
What is FuGE not…?
Not an effort to create one data standard for all lab techniques
  – This problem is hard at a technical level, and getting agreement from all groups is very hard
Not a model for metabolomics metadata
  – But it might help in the development of one
  – …and we would like to encourage input from the metabolomics community
5
FuGE Structure
2 sections: Common and Bio
Common: components that aid the development of a rich data standard
  – Protocols, external references, auditing and security settings
Bio: biology-specific components
  – Biological (or chemical) materials, bio sequences
  – Summary of an investigation structure
  – References to the data models specific to each domain
6
Protocols
Protocols have a set of ordered atomic actions
  – Actions are user-entered text or ontology terms
Protocols can be associated with Software and Equipment
Protocols, Software and Equipment can have a set of defined Parameters
Mechanism for defining a standard protocol, and an instance of a protocol (date, operator…)
Nested protocols can be defined for representing complex procedures
  – An Action can be a reference to another Protocol
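The protocol structure described on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the FuGE API: the class names (`Parameter`, `Action`, `Protocol`, `ProtocolRun`), their fields, and the example values are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Parameter:
    """A defined parameter of a protocol (hypothetical shape)."""
    name: str
    unit: Optional[str] = None


@dataclass
class Action:
    """One step: free text / an ontology term, or a reference to another Protocol."""
    step: Union[str, "Protocol"]


@dataclass
class Protocol:
    """A standard protocol: an ordered list of actions plus optional extras."""
    name: str
    actions: List[Action] = field(default_factory=list)  # order matters
    parameters: List[Parameter] = field(default_factory=list)
    software: List[str] = field(default_factory=list)
    equipment: List[str] = field(default_factory=list)

    def flatten(self) -> List[str]:
        """Expand nested protocols into a single ordered list of atomic steps."""
        steps: List[str] = []
        for action in self.actions:
            if isinstance(action.step, Protocol):
                steps.extend(action.step.flatten())
            else:
                steps.append(action.step)
        return steps


@dataclass
class ProtocolRun:
    """One instance of a protocol: who ran it, when, and with which values."""
    protocol: Protocol
    operator: str
    date: str
    parameter_values: Dict[str, float] = field(default_factory=dict)


# Illustrative data only.
digest = Protocol(
    "digest",
    [Action("add trypsin"), Action("incubate overnight")],
    parameters=[Parameter("temperature", "degC")],
)
prep = Protocol("sample prep", [Action("lyse cells"), Action(digest), Action("desalt")])
run = ProtocolRun(prep, operator="A. N. Other", date="2006-01-15",
                  parameter_values={"temperature": 37.0})

print(prep.flatten())
# ['lyse cells', 'add trypsin', 'incubate overnight', 'desalt']
```

Keeping the definition (`Protocol`) separate from the run (`ProtocolRun`) mirrors the slide's distinction between a standard protocol and an instance of it.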
7
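FuGE Workflow
[Diagram: Material -> Treatment -> Material -> Treatment -> Material -> Treatment -> Material -> Data Acquisition -> Data -> Data Transformation -> Data]
Legend: Materials and Data = inputs and outputs of Protocols; Treatment, Data Acquisition and Data Transformation = instance of some Protocol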
8
FuGE Workflow
[Diagram: Material -> Treatment -> Material -> Treatment -> Material -> Treatment -> Material -> Data Acquisition -> Data -> Data Transformation -> Data]
Materials defined using terms from ontologies
Treatments defined by Protocols
Data represented in domain specific format
FuGE is the “glue” for sticking components together
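The workflow chain can be read as a small graph in which protocol instances connect their inputs to their outputs. The Python sketch below shows one way to trace which protocols led to a given data item. `Node`, `Step`, `provenance` and the example names are hypothetical and are not taken from the FuGE model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """An input or output of a protocol: a material or a data item."""
    name: str
    kind: str  # "material" or "data"


@dataclass
class Step:
    """An instance of some protocol, linking inputs to outputs."""
    protocol: str
    inputs: List[Node]
    outputs: List[Node]


def provenance(target: Node, steps: List[Step]) -> List[str]:
    """List, in order, the protocols that led to `target`."""
    for step in steps:
        if any(out is target for out in step.outputs):
            chain: List[str] = []
            for inp in step.inputs:
                chain.extend(provenance(inp, steps))
            return chain + [step.protocol]
    return []  # target is a starting material with no recorded history


# Illustrative chain: material -> treatment -> material -> acquisition -> data -> transformation -> data
tissue = Node("liver tissue", "material")
extract = Node("protein extract", "material")
raw = Node("raw gel image", "data")
spots = Node("spot list", "data")

steps = [
    Step("extraction", [tissue], [extract]),
    Step("gel scan", [extract], [raw]),
    Step("spot detection", [raw], [spots]),
]

print(provenance(spots, steps))
# ['extraction', 'gel scan', 'spot detection']
```

Only the links are modelled here; in the approach the slide describes, the data items themselves would stay in their domain-specific formats.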
9
Other useful components
Each object can be tagged with audit info:
  – Who made a change, when, what type of change
Security information:
  – Users, groups for accessing/changing data
Consistent mechanism for identifying objects
  – Life Science Identifiers (LSIDs) used to uniquely ID components
  – Objects can be referenced across documents
Mechanism for linking to external databases, literature refs and ontologies
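A rough Python sketch of identifiable, auditable objects follows. It assumes the general LSID layout `urn:lsid:<authority>:<namespace>:<object>`. The class names, the `registry` dictionary, and the authority `example.org` are placeholders invented for illustration, not part of FuGE.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class AuditEntry:
    """Who made a change, when, and what type of change."""
    who: str
    when: datetime
    action: str  # e.g. "creation", "modification"


@dataclass
class Identifiable:
    """Any object that carries a unique identifier and an audit trail."""
    identifier: str
    audit_trail: List[AuditEntry] = field(default_factory=list)

    def record(self, who: str, action: str) -> None:
        """Append an audit entry stamped with the current UTC time."""
        self.audit_trail.append(AuditEntry(who, datetime.now(timezone.utc), action))


def make_lsid(authority: str, namespace: str, object_id: str) -> str:
    """Build an LSID-style URN (no revision part)."""
    return f"urn:lsid:{authority}:{namespace}:{object_id}"


# A registry keyed by identifier lets other documents reference the object by ID.
registry: Dict[str, Identifiable] = {}

sample = Identifiable(make_lsid("example.org", "Material", "sample-001"))
registry[sample.identifier] = sample
sample.record("ajones", "creation")
sample.record("ajones", "modification")

print(sample.identifier)
# urn:lsid:example.org:Material:sample-001
```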
10
Investigation model
Stores a summary of the investigation to facilitate queries
Purpose of investigation (hypothesis)
Design of the investigation
  – e.g. strain differences, gene knockout, drug doses, time course
Stores the important variables
  – Values from ontology e.g. gene names, units etc…
Links from variables to relevant data items
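To make the "links from variables to relevant data items" idea concrete, here is a small Python sketch of an investigation summary that can be queried by variable value. The classes `Factor` and `Investigation`, the method `data_for`, and the identifiers are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Factor:
    """An important variable of the investigation and the data linked to each value."""
    name: str
    unit: str
    data_refs: Dict[str, List[str]] = field(default_factory=dict)  # value -> data item IDs


@dataclass
class Investigation:
    """Summary of an investigation: purpose, design, and key variables."""
    hypothesis: str
    design_types: List[str]
    factors: List[Factor] = field(default_factory=list)

    def data_for(self, factor_name: str, value: str) -> List[str]:
        """Return the IDs of data items recorded for one value of one factor."""
        for factor in self.factors:
            if factor.name == factor_name:
                return factor.data_refs.get(value, [])
        return []


# Illustrative content only.
dose = Factor("drug dose", "mg/kg", {"0": ["data:ctrl-1"], "10": ["data:d10-1", "data:d10-2"]})
study = Investigation(
    hypothesis="The drug changes liver protein expression",
    design_types=["drug doses"],
    factors=[dose],
)

print(study.data_for("drug dose", "10"))
# ['data:d10-1', 'data:d10-2']
```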
11
Benefits of shared components
Queries over common annotation
  – Samples, hypotheses, protocols
Shared software for experimental annotation and analysis
  – Microarrays, proteomics and metabolomics (and other experiments!) performed in same lab
Developing standards for each technique is a hard problem
  – Shared resources could alleviate the problems (audit, security, identifying objects, ontologies)
12
Using FuGE in Practice
1. Import parts of the UML or XML Schema and extend them with domain-specific components
  – Example: attempting to integrate FuGE with our Manchester metabolomics database
2. Reference a FuGE entry for investigation structure and bio samples
3. Define ontologies and use FuGE as it is for experimental metadata
  – This would not include a format for mass spec or NMR data, which would also be needed
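Option 1 above, extending shared components with domain-specific ones, might look like the following in Python. This is a loose analogy using subclassing. `Data` stands in for a generic shared class and `MassSpectrum` is an invented extension; neither reflects the actual FuGE UML or XML Schema.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Data:
    """Stand-in for a generic, shared data class (hypothetical)."""
    identifier: str
    produced_by: str  # name of the protocol that produced the data


@dataclass
class MassSpectrum(Data):
    """Hypothetical domain-specific extension adding mass spec content."""
    peaks: List[Tuple[float, float]] = field(default_factory=list)  # (m/z, intensity)

    def base_peak(self) -> Tuple[float, float]:
        """Return the most intense peak; raises ValueError if there are no peaks."""
        return max(self.peaks, key=lambda peak: peak[1])


spectrum = MassSpectrum(
    identifier="data:ms-001",
    produced_by="LC-MS acquisition",
    peaks=[(100.1, 20.0), (250.3, 95.5), (301.7, 40.2)],
)

print(spectrum.base_peak())
# (250.3, 95.5)
```

Generic tooling can still treat `spectrum` as plain `Data` (identifier, producing protocol), while domain software uses the extra fields.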
13
Conclusions
FuGE was created to solve the general problem:
  – What are the common requirements for a “functional genomics” data standard?
MGED will use FuGE for generating MAGE version 2
PSI is evaluating FuGE for a protein separation standard format
FuGE-based systems are being implemented by a number of organisations
FuGE could help develop a metabolome format
http://fuge.sourceforge.net
14
Acknowledgements
FuGE has been developed in collaboration with many groups, including:
  – Angel Pizarro (U Penn)
  – Paul Spellman (Lawrence Berkeley)
  – Michael Miller (Rosetta)
  – Members of Fred Hutchinson CRC, Seattle
  – RSBI
  – Various other members of MGED and PSI
http://fuge.sourceforge.net
15
Describable Identifiable
16
Common.Description
Many classes inherit from Describable
Link to Audit / Security details
URI and text description
17
Protocol
18
Audit
19
Investigation
20
Material
21
Common.Data
Ordered set of Dimensions
Data stored in Matrix
Matrix must be extended with subclasses
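The three points on this slide map naturally onto an abstract base class. The Python sketch below is one possible reading, with an ordered list of dimensions, an abstract `Matrix`, and a concrete subclass that supplies storage. `InMemoryMatrix`, the method names, and the row-major layout are assumptions for illustration, not the FuGE definitions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Dimension:
    """One axis of the data, e.g. samples or measured features."""
    name: str
    labels: List[str]


class Matrix(ABC):
    """Abstract matrix over an ordered set of dimensions; must be subclassed."""

    def __init__(self, dimensions: Sequence[Dimension]) -> None:
        self.dimensions = list(dimensions)  # order is significant

    @property
    def shape(self) -> Tuple[int, ...]:
        return tuple(len(d.labels) for d in self.dimensions)

    @abstractmethod
    def value(self, *index: int) -> float:
        """Return the value at the given position, one index per dimension."""


class InMemoryMatrix(Matrix):
    """Hypothetical subclass storing values in a flat, row-major list."""

    def __init__(self, dimensions: Sequence[Dimension], values: Sequence[float]) -> None:
        super().__init__(dimensions)
        expected = 1
        for size in self.shape:
            expected *= size
        if len(values) != expected:
            raise ValueError(f"expected {expected} values, got {len(values)}")
        self._values = list(values)

    def value(self, *index: int) -> float:
        if len(index) != len(self.shape):
            raise IndexError("one index is required per dimension")
        flat = 0
        for i, size in zip(index, self.shape):
            if not 0 <= i < size:
                raise IndexError("index out of range")
            flat = flat * size + i  # row-major offset
        return self._values[flat]


samples = Dimension("sample", ["control", "treated"])
features = Dimension("feature", ["spot1", "spot2", "spot3"])
m = InMemoryMatrix([samples, features], [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

print(m.shape)        # (2, 3)
print(m.value(1, 2))  # 6.0
```

Because `value` is abstract, `Matrix` itself cannot be instantiated, which matches the slide's point that the matrix must be extended with subclasses.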