FuGO: An Ontology for Functional Genomics Investigation

Presentation transcript:

FuGO: An Ontology for Functional Genomics Investigation
Susanna-Assunta Sansone (EBI): Overview
Trish Whetzel (University of Pennsylvania): Microarray
Daniel Schober (EBI): Metabolomics
Chris Taylor (EBI): Proteomics
On behalf of the FuGO working group

FuGO - Rationale
- Standardization activities in (single) domains
  - Reporting structures, CVs/ontologies and exchange formats
- Pieces of a puzzle
  - Standards should stand alone BUT also function together
  - Build them in a modular way, maximizing interactions
- Capitalize on synergies, where commonality exists
- Develop a common terminology for those parts of an investigation that are common across technological and biological domains
[Diagram labels: Source and Characteristics; Treatments; Collection; Sample Preparation; Instrumental Analysis (MS, NMR, array, etc.); Computational Analysis; Data Pre-Processing; Investigation Design]

FuGO - Overview
- Purpose
  - NOT to model biology, NOR the laboratory workflow
  - BUT to provide a core of ‘universal’ descriptors for its components
    - To be ‘extended’ by biological and technological domain-specific WGs
  - No dependency on any object model
    - Can be mapped to any object model, e.g. FuGE OM
- Open source approach
  - Protégé tool and Web Ontology Language (OWL)
[Diagram labels: Source and Characteristics; Treatments; Collection; Sample Preparation; Instrumental Analysis (MS, NMR, array, etc.); Computational Analysis; Data Pre-Processing; Investigation Design]

FuGO - Communities and Funds
- List of current communities
  - Omics technologies
    - HUPO Proteomics Standards Initiative (PSI)
    - Microarray Gene Expression Data (MGED) Society
    - Metabolomics Society, Metabolomics Standards Initiative (MSI)
  - Other technologies
    - Flow cytometry
    - Polymorphism
  - Specific domains of application
    - Environmental groups (crop science and environmental genomics)
    - Nutrition group
    - Toxicology group
    - Immunology groups
- List of current funds
  - NIH-NHGRI grant (C. Stoeckert, University of Pennsylvania) for workshops and an ontologist
  - BBSRC grant (S.A. Sansone, EBI) for an ontologist

FuGO - Processes
- Coordination Committee
  - Representatives of technological and biological communities
  - Monthly conference calls
- Developers WG
  - Representatives and members of these communities
  - Weekly conference calls
- Documentation
- Advisory Board
  - Advises on high-level design and best practices
  - Provides links to other key efforts
  - Members:
    - Barry Smith, University at Buffalo and IFOMIS
    - Frank Hartel, NIH-NCI
    - Mark Musen, Stanford University and Protégé Team
    - Robert Stevens, University of Manchester
    - Steve Oliver, University of Manchester
    - Suzi Lewis, University of California, Berkeley and GO
-> cBiO will also oversee the Open Biomedical Ontologies (OBO) initiative

FuGO - Strategy
- Use cases -> within-community activity
  - Collect real examples
- Bottom-up approach -> within-community activity
  - Gather terms and definitions
  - Each community works in its own domain
- Top-down approach -> collaborative activity
  - Develop a ‘naming convention’
  - Build a top-level ontology structure using is_a relationships
  - Other foreseen relationships
    - part_of (currently expressed in the taxonomy as cardinal_part_of)
    - participate_in (input) and derive_from (output)
    - describe or qualify
    - located_in and contained_in
- Binning terms in the top-level ontology structure
  - The higher-level semantics help to ‘bin’ terms faster
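The relationship types listed on this slide can be pictured as labelled edges between terms. The following is a minimal sketch only, not FuGO's actual OWL representation: the `Ontology` class and all example terms are hypothetical, and the transitive lookup is an assumption about how is_a and part_of would typically be queried.

```python
from collections import defaultdict


class Ontology:
    """Toy store of (subject, relation, object) statements; illustration only."""

    def __init__(self):
        # (subject, relation) -> set of objects
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj):
        self._edges[(subject, relation)].add(obj)

    def ancestors(self, term, relation="is_a"):
        """Return every term reachable from `term` by following `relation`."""
        seen = set()
        stack = [term]
        while stack:
            current = stack.pop()
            for parent in self._edges.get((current, relation), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Hypothetical example terms, written in the lower-case underscore style.
onto = Ontology()
onto.add("mass_spectrometer", "is_a", "instrument")
onto.add("instrument", "is_a", "object")
onto.add("ion_source", "part_of", "mass_spectrometer")
onto.add("sample", "located_in", "autosampler")

print(sorted(onto.ancestors("mass_spectrometer")))
```

Keeping the relation name as part of the key means further relations (participate_in, derive_from, contained_in) can be added without changing the structure.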

FuGO - Status and Plans
- Binning process - ongoing
  - Reconciliation into one canonical version
  - Iterative process
- Common working practices - established
  - Each class consists of: term ID, preferred term, synonyms, definition and comments
  - SourceForge tracker to send comments on terms, definitions and relationships
- Timeline for completion of core omics technologies
  - Two years, with several intermediate milestones
  - Interim solution: community-specific CVs posted under the OBO
- Ultimately FuGO will be part of the OBO Foundry (Core) Ontology
- Overview paper: “Special Issue on Data Standards”, OMICS journal
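The class structure named above (term ID, preferred term, synonyms, definition, comments) maps naturally onto a small record type. This is a sketch under assumptions: the `OntologyClass` name, the `EX:` identifier prefix and the example content are invented for illustration and are not taken from FuGO.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyClass:
    """One class record with the five fields listed on the slide."""
    term_id: str
    preferred_term: str
    definition: str
    synonyms: list = field(default_factory=list)
    comments: str = ""

    def labels(self):
        """Preferred term first, then all synonyms."""
        return [self.preferred_term, *self.synonyms]


def build_label_index(classes):
    """Map every lower-cased label (preferred or synonym) to its term ID."""
    index = {}
    for cls in classes:
        for label in cls.labels():
            index[label.lower()] = cls.term_id
    return index


# Hypothetical record; the ID and wording are placeholders.
ms = OntologyClass(
    term_id="EX:0000001",
    preferred_term="mass_spectrometry",
    definition="An analytical technique that measures mass-to-charge ratios.",
    synonyms=["MS", "mass spec"],
)
index = build_label_index([ms])
print(index["ms"])
```

Indexing synonyms alongside the preferred term lets annotators search by whichever label their community uses while still resolving to one stable ID.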

Transcriptomics Community Contributions to FuGO Trish Whetzel

Transcriptomics Community
- Represented by the MGED Society
  - Consists of those performing microarray experiments (technological domain)
- Current source of annotation terms for microarray experiments is the MGED Ontology
  - Scope includes experiment design, biomaterials, protocols (actions, hardware, software) and data analysis

Work Towards FuGO
- The MGED Ontology (MO) will be used as the source of terms to propose for inclusion in FuGO
  - Bin all terms according to the high-level containers of FuGO (bottom-up), identifying those that are universal and those that are community specific
  - Modify all term names and definitions to adhere to FuGO naming conventions
  - Propose universal terms to FuGO developers for review of term name, definition and location in FuGO by members of other communities (top-down)
  - Propose technology-specific terms to FuGO developers for review of the location of the term in FuGO AND to ensure that the terms are community specific

Additional Community Specific Work
- Add numeric identifiers to the MGED Ontology
- Generate a mapping file of terms from the MGED Ontology to FuGO
- Modify applications to account for numeric identifiers AND to identify the annotation source (MO vs FuGO)
- Result: ability to retrieve data annotated with either MO or FuGO
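One way to picture the mapping file and the "retrieve data annotated with either MO or FuGO" result is a lookup that expands a query across both sources. This is a sketch under assumptions: the tab-separated layout, the column names, the `EX:` identifiers and the record fields are all hypothetical, since the slides do not specify a file format.

```python
import csv
import io

# Hypothetical mapping file content: MO term name -> FuGO-style identifier.
MAPPING_TSV = (
    "mo_term\tfugo_id\n"
    "BioSample\tEX:0000010\n"
    "LabeledExtract\tEX:0000011\n"
)


def load_mapping(text):
    """Read the tab-separated mapping into a dict keyed by MO term."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return {row["mo_term"]: row["fugo_id"] for row in reader}


def find_records(records, mo_term, mapping):
    """Return records annotated with the MO term or its mapped FuGO term."""
    wanted = {("MO", mo_term)}
    if mo_term in mapping:
        wanted.add(("FuGO", mapping[mo_term]))
    return [r for r in records if (r["source"], r["term"]) in wanted]


mapping = load_mapping(MAPPING_TSV)

# Each record carries its annotation source, as the slide requires.
records = [
    {"id": "exp1", "source": "MO", "term": "BioSample"},
    {"id": "exp2", "source": "FuGO", "term": "EX:0000010"},
    {"id": "exp3", "source": "MO", "term": "LabeledExtract"},
]
hits = find_records(records, "BioSample", mapping)
print([r["id"] for r in hits])
```

Storing the source next to each term is what allows older MO annotations and newer FuGO annotations to coexist in one query.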

Metabolomics Standards Initiative Ontology Working Group (MSI-OWG)
Daniel Schober

MSI OWG - Activities
- Newly established group
- Develop our roadmap
  - Compile a list of agreed controlled vocabularies (CVs)
    - Leveraging existing resources and efforts (incl. PSI)
  - Identify a suitable ontology engineering method
    - Engage with FuGO
- Establish group infrastructure
  - Set up SourceForge website and mailing lists
  - Ontology web access: WebProtege
  - Collaborative ontology development and editing: pOWL

MSI OWG - CVs
- Develop CVs for instrument-dependent domains (NMR, MS, chromatography)
  - Reuse terms from existing resources, e.g.:
    - ArMet model and CVs
    - NMR-STAR group
    - PSI MS CVs
    - Human Metabolome Project (HMP), HUSERMET, MeT-RO
    - IUPAC terminology for analytical chemistry
  - Initiate collaboration for the chromatography component
    - PSI Sample Processing WG
  - Enrich the initial term list
    - Swoogle, Ontosearch and LexGrid for finding ontologies
    - Applied database schemata (vendors)
    - PubMed text mining

Naming Conventions for CV terms
- Evaluate the OBO and GO style guides
- Guidance document to name Knowledge Representation (KR) idioms
  - Synonym and acronym representation
  - KR idiom identifiers
  - Proper class definitions
  - Cross-referencing other terminologies
  - Ontology file names (versioning)
  - Naming terms and classes
    - Capitalisation (lower case), underscore as word separator
    - Singular instead of plural
    - No ellipses (be explicit)
    - Allowed character set
    - Consistent affix usage (prefix, suffix, infix and circumfix)
    - Avoid “taboo” words
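Two of the naming rules above (lower case with underscore separators, and an allowed character set) are mechanical enough to sketch in code. The function names and the exact character set are assumptions; the slides do not state which characters are permitted, and rules such as singular form or taboo words would need curated word lists that are not attempted here.

```python
import re

# Assumed allowed character set: lower-case letters, digits and underscores.
ALLOWED = re.compile(r"^[a-z0-9_]+$")


def normalize_term(name):
    """Lower-case a term and join its words with underscores."""
    # Insert a space at camelCase boundaries, e.g. "LabeledExtract".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name.strip())
    words = [w for w in re.split(r"[\s\-_]+", spaced) if w]
    return "_".join(word.lower() for word in words)


def is_valid_term(name):
    """Check a term against the assumed allowed character set."""
    return bool(ALLOWED.match(name))


print(normalize_term("LabeledExtract"))
print(normalize_term("  Sample-Temperature "))
print(is_valid_term("Sample temp."))
```

Running normalization before the validity check separates terms that can be fixed automatically from those that need a curator's attention.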

CV engineering approach
- Strategy
  - Use an existing CV as the starting point
  - Apply naming conventions (normalize), identify synonyms and definitions
  - Collect relationships (for a later phase)
  - Discuss the CV within the OWG
  - Circulate to practitioners, refine, add missing terms (iterative)
  - Integrate further CVs
  - Determine completeness and remove redundancy
- Challenges
  - Modelling mathematics/numbers
  - Atomic terms vs compound terms
    - Compound: ‘Sample temperature in autosampler’
    - Atomic: ‘Sample’ (object), ‘Temperature’ (characteristic), ‘in’ (located_in relation) and ‘Autosampler’ (object)
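The atomic-versus-compound example on this slide can be illustrated with a simple lookup that tags each word of a compound term. This is only a sketch: the `LEXICON` and `RELATION_WORDS` tables are hypothetical and hold just the words from the slide's example, and real decomposition would need curation rather than whitespace splitting.

```python
# Hypothetical lookup tables covering only the slide's example.
LEXICON = {
    "sample": "object",
    "temperature": "characteristic",
    "autosampler": "object",
}
RELATION_WORDS = {"in": "located_in"}


def decompose(compound):
    """Split a compound term into (atomic term, kind) pairs."""
    parts = []
    for token in compound.lower().split():
        if token in RELATION_WORDS:
            parts.append((RELATION_WORDS[token], "relation"))
        elif token in LEXICON:
            parts.append((token, LEXICON[token]))
        else:
            # Flag anything the lexicon does not cover for manual review.
            parts.append((token, "unknown"))
    return parts


for term, kind in decompose("Sample temperature in autosampler"):
    print(term, kind)
```

Words tagged as unknown mark where the CV is incomplete, which ties in with the "determine completeness" step of the strategy.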

PSI Ontology Chris Taylor

Synergy for (not so) Dummies™
[Diagram labels: Generic features (origin of biomaterial); Generic features (experimental design); Diverse community-specific extensions; Transcriptomics; Proteomics; Metabolomics; Arrays; Scanning; Arrays & Scanning; Columns; Gels; MS; FTIR; NMR]

PSI - CVs and FuGO
- PSI: MS controlled vocabulary generation
  - Term collection began some time ago
  - CV now available in OBO format
  - Includes IUPAC terms
- The next steps
  - Rebinning of the MS controlled vocabulary (in Excel)
  - Tracking the evolution of the ‘live’ OBO format
- Where we are going:
  1) CVs that support the use/implementation of formats
    - mzData, analysisXML, GelML, +++
    - Tied explicitly to the elements in the format
  2) Full-blown ontological structuring of those same terms
    - Insertion into FuGO
    - Linking through accessions back to the format-linked CV
    - Allows re-use of terms by other communities
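Since the CV is distributed in OBO format, a reader may find it helpful to see how term stanzas in that flat-file layout can be read. The sketch below handles only `[Term]` stanzas and plain `tag: value` lines; the sample text and its `EX:` accessions are invented placeholders, not entries from the PSI MS CV, and a production tool would use a dedicated OBO parser instead.

```python
# Invented sample in OBO flat-file style; accessions are placeholders.
SAMPLE_OBO = """format-version: 1.2

[Term]
id: EX:0000001
name: mass_spectrometer
is_a: EX:0000000 ! instrument

[Term]
id: EX:0000000
name: instrument
"""


def parse_obo_terms(text):
    """Collect each [Term] stanza as a dict of tag -> list of values."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # Start a new stanza; ignore stanza types other than [Term].
            current = {} if line == "[Term]" else None
            if current is not None:
                terms.append(current)
        elif current is not None and ": " in line:
            tag, value = line.split(": ", 1)
            current.setdefault(tag, []).append(value)
    return terms


terms = parse_obo_terms(SAMPLE_OBO)
accession_to_name = {t["id"][0]: t["name"][0] for t in terms}
print(accession_to_name)
```

Keying terms by accession rather than by name is what makes the "linking through accessions" step possible: a format-linked CV entry and its counterpart in a structured ontology can share the same identifier.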