The MGED Ontology Workshop MGED 7 September 8, 2004 Chris Stoeckert Center for Bioinformatics & Dept. of Genetics University of Pennsylvania.

MGED Ontology Workshop Agenda
- What is the MGED Ontology (MO)?
- Building MO: the process
- Using MO
- Future development of MO
- Joe White (TIGR): MO applications from MAGE Jamboree

MGED Standardization Efforts
- MIAME: The formulation of the minimum information about a microarray experiment required to interpret and verify the results. (Brazma et al. Nature Genetics 2001)
- MAGE-OM: The establishment of a data exchange format and object model for microarray experiments. (Spellman et al. Genome Biol. 2002)
- MGED Ontology: The development of an ontology for microarray experiment description and biological material (biomaterial) annotation in particular. (Stoeckert & Parkinson, Comp. Funct. Genom. 2003)
- Transformations: The development of recommendations regarding microarray data transformations and normalization methods.
- RSBI: Reporting Structure for Biological Investigations (toxicogenomics, environmental genomics, metabol/nomics)

MGED Ontology (MO)
- Purpose
  - Provide standard terms for the annotation of microarray experiments
  - Not to model biology but to provide descriptors for experiment components
- Benefits
  - Unambiguous description of how the experiment was performed
  - Structured queries can be generated
- Ontology concepts derived from the MIAME guidelines/MAGE-OM
- Also incorporating concepts from Transformations and RSBI

Relationship of MO to MAGE-OM
- MO class hierarchy follows that of MAGE-OM
- Association to OntologyEntry
- MO provides terms for these associations by:
  - Instances internal to MO
  - Instances from external ontologies
- Take advantage of existing ontologies
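As a rough illustration of the point above, the following Python sketch shows how an annotation might carry a term that comes either from MO itself or from an external ontology. It is not the MAGE-OM definition: the class layout, the field names (`category`, `value`, `reference`), the `is_internal` helper, and every accession string are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OntologyReference:
    """Hypothetical pointer to the ontology a term comes from."""
    ontology: str   # "MO", or the name of an external ontology
    accession: str  # identifier of the term within that ontology


@dataclass
class OntologyEntry:
    """Simplified stand-in for an OntologyEntry (an assumption, not MAGE-OM)."""
    category: str                                   # class the term is chosen from
    value: str                                      # the chosen term (an instance)
    reference: Optional[OntologyReference] = None   # where the term is defined
    associations: List["OntologyEntry"] = field(default_factory=list)

    def is_internal(self) -> bool:
        """True when the term is an instance defined inside MO itself."""
        return self.reference is not None and self.reference.ontology == "MO"


# Term supplied by an instance internal to MO (values are illustrative).
sex = OntologyEntry("Sex", "female", OntologyReference("MO", "MO:female"))

# Term supplied by an external ontology (name and accession are made up).
organism = OntologyEntry(
    "Organism", "Mus musculus", OntologyReference("ExternalTaxonomy", "EXT:0001")
)
```

Keeping the source ontology next to the term lets an application tell the two cases apart without hard-coding which categories are filled from outside MO.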

MGED Ontology Class Hierarchy
- MGEDCoreOntology
  - Coordinated development with MAGE-OM
  - Ease of locating the appropriate class to select terms from
- MGEDExtendedOntology
  - Classes for additional terms as the usage of genomics technologies expands

MAGE and MO

Main focus of MGED Ontology
- Structured and rich description of BioMaterials
[Diagram labels: BioMaterial, OntologyEntry, +characteristics, +associations]

MO and References to External Ontologies

MO and references to External Ontologies

Standards and Ontologies for Functional Genomics 2
October 23-26, 2004, held at the University of Pennsylvania Medical School
Funded in part by NHGRI, NCRR, NERC, GSK
Co-hosted by The Jackson Laboratory, University of Pennsylvania, European Bioinformatics Institute
Student scholarships available
Photo by R. Kennedy, B Trist, R. Tarver, for GPTMC

Use MGED Ontology for Structured Descriptions (MAGE-ML)

MGED Ontology development hp
- OILed
- File formats
  - DAML file
  - HTML file
- NCI DTS Browser
- Changes
- Notes
- Term Tracker

MGED Ontology Working Group
- Virtual Ontology Workshops
- Chris Stoeckert, Trish Whetzel (Penn)
- Helen Parkinson, Susanna Sansone (EBI)
- Joe White (TIGR)
- Gilberto Fragoso, Liju Fan, Mervi Heiskanen (NCI)
- Helen Causton, Laurence Game (ICL)
- Chris Taylor (PSI, EBI)
- Mged-ontologies mailing list

Desirable Microarray Queries
- Return all experiments with species X examined at developmental stage Y
  - Sort by platform type
  - Which are untreated? Treated?
  - Treated with what compound?
- How comparable are these?
- What can these experiments tell me?
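The first group of queries above can be sketched in a few lines of Python once experiments are annotated with consistent terms. This is only an illustration under stated assumptions: the `Experiment` record, its field names, the two helper functions, and all sample values are hypothetical and do not reflect any particular database schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Experiment:
    """Hypothetical, flattened view of one annotated experiment."""
    name: str
    species: str
    developmental_stage: str
    platform_type: str
    compound: Optional[str] = None  # None is read here as "untreated"


def find_experiments(
    experiments: List[Experiment], species: str, stage: str
) -> List[Experiment]:
    """Return experiments on a species at a stage, sorted by platform type."""
    hits = [
        e for e in experiments
        if e.species == species and e.developmental_stage == stage
    ]
    return sorted(hits, key=lambda e: e.platform_type)


def split_by_treatment(
    experiments: List[Experiment],
) -> Tuple[List[Experiment], List[Experiment]]:
    """Separate untreated experiments from treated ones."""
    untreated = [e for e in experiments if e.compound is None]
    treated = [e for e in experiments if e.compound is not None]
    return untreated, treated


# Illustrative records; every value below is made up.
EXPERIMENTS = [
    Experiment("exp1", "Mus musculus", "adult", "spotted array"),
    Experiment("exp2", "Mus musculus", "adult", "in situ oligo array", "compound A"),
    Experiment("exp3", "Mus musculus", "embryo", "spotted array"),
    Experiment("exp4", "Homo sapiens", "adult", "spotted array"),
]

hits = find_experiments(EXPERIMENTS, "Mus musculus", "adult")
untreated, treated = split_by_treatment(hits)
```

The exact-match filter only works because each field holds a controlled term; with free-text annotation, "mouse" and "Mus musculus" would not compare equal.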

MO and Structured Queries

RAD: RNA Abundance Database
- RAD is part of GUS (Genomics Unified Schema)
- The GUS platform maximizes the utility of stored data by warehousing them in a schema that integrates the genome, transcriptome, gene regulation and networks, ontologies and controlled vocabularies, and gene expression
- Relational schema (implemented in Oracle)
- Stores data from gene expression arrays and SAGE
- Comes with a suite of web-annotation forms (Study-Annotator)
- MAGE-RAD Translator (MR_T) generates MAGE-ML files for export
Manduchi et al Bioinformatics 20:

RAD Study-Annotator
- Covers all relevant parts of the MIAME checklist
- Exploits the MGED Ontology
- Allows entering of very specific details of an experiment
- Web-based forms:
  - Modular structure
  - Written in PHP
  - Front-end data integrity checks using JavaScript
- Manages data privacy based on Project/Group selections present in the GUS schema
- Available at installation.htm

BioMaterial Annotation: Conceptual View

RAD Study Annotator: BioMaterial Module

RAD Study Annotator: BioSource Form

Other Sites Using MO
See posters for more details on these!

Future Development of MO
- Areas of Development
  - Ongoing maintenance
  - Ontology language
  - Non-array technologies
  - Biological domain extensions
  - MO v2 development

Proposed methods for MO development
- Ongoing maintenance
  - Addition of new instance terms to existing classes
  - Fixing typographical errors
  - Adding missing associations
  - These represent minor changes that should largely not affect software applications that are based on the MO

Proposed methods for MO development
- Ontology language
  - Planned changes in the primary language format (from DAML to OWL).
  - Planned changes in the primary ontology editing tool (from OILed to Protégé).
  - These should represent fairly minor differences as far as applications based on the MO are concerned.
  - Some minor name changes will be needed to adjust for differences in allowed characters.
  - New functionalities such as the availability of synonyms may be used to enrich the MO further.

Proposed methods for MO development
- Non-array technologies
  - Standards efforts for proteomics (PSI) and metabol/nomics (SMRS) would like to add terms for their specific needs.
  - Classes that are needed for new technologies can be placed under the MGEDExtendedOntology and linked to MGEDCoreOntology classes through properties (i.e., MGEDExtendedOntologyClass has_property MGEDCoreOntologyClass).
  - Such development would not impact the MGEDCoreOntology and would therefore allow the addition of non-array technology classes.
  - Instances that are needed for new technologies may be most appropriate for existing classes in the MGEDCoreOntology.
  - The policy for adding and defining instances regarding technology-related terms is to provide a generic name and definition but to supply technology-specific examples (in the definition).
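The extension pattern described above (new classes live under the MGEDExtendedOntology and point at MGEDCoreOntology classes through properties) can be sketched as a small in-memory model. This is a toy illustration, not how MO is stored or edited: the `Ontology` and `OntologyClass` types, the `ExampleProteomicsClass` name, and the `has_biomaterial` property are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict

CORE = "MGEDCoreOntology"
EXTENDED = "MGEDExtendedOntology"


@dataclass
class OntologyClass:
    name: str
    branch: str  # CORE or EXTENDED
    # Maps a property name to the name of the class it points at.
    properties: Dict[str, str] = field(default_factory=dict)


class Ontology:
    """Toy container used only to illustrate the linking rule."""

    def __init__(self) -> None:
        self.classes: Dict[str, OntologyClass] = {}

    def add_class(self, name: str, branch: str) -> OntologyClass:
        if branch not in (CORE, EXTENDED):
            raise ValueError(f"unknown branch: {branch}")
        entry = OntologyClass(name, branch)
        self.classes[name] = entry
        return entry

    def link(self, extended_name: str, property_name: str, core_name: str) -> None:
        """Attach a property on an extended class that targets a core class."""
        source = self.classes[extended_name]
        target = self.classes[core_name]
        if source.branch != EXTENDED or target.branch != CORE:
            raise ValueError("links must run from an extended class to a core class")
        source.properties[property_name] = target.name

    def core_snapshot(self) -> Dict[str, Dict[str, str]]:
        """Core class names and their properties, for before/after comparison."""
        return {
            name: dict(c.properties)
            for name, c in self.classes.items()
            if c.branch == CORE
        }


mo = Ontology()
mo.add_class("BioMaterial", CORE)
before = mo.core_snapshot()

# Hypothetical class for a non-array technology, linked to a core class.
mo.add_class("ExampleProteomicsClass", EXTENDED)
mo.link("ExampleProteomicsClass", "has_biomaterial", "BioMaterial")
after = mo.core_snapshot()
```

Because the property is stored on the extended class, the core snapshot is identical before and after the addition, which is the sense in which such development leaves the core untouched.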

A Functional Genomics View Courtesy of Andy Jones

Proposed methods for MO development
- Biological domain extensions
  - Areas (e.g., toxicogenomics) where the current specification of Experiment and Biomaterial is not sufficient to fully capture descriptions of experiments
  - Extensions should fit within the MAGE-OM v1.1 and so ultimately could go into the MGEDCoreOntology.
  - However, as the new classes, subclasses, properties, and instances are under development (and therefore not stable), they should be placed in the MGEDExtendedOntology until mature enough to be migrated over to the MGEDCoreOntology.
  - The MGED Reporting Structure for Biological Investigations (RSBI) Working Group, representing biological domain extensions in toxicogenomics, environmental genomics, and nutrigenomics, will take this approach.
  - Hear more about this from Jennifer Fostel next!
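The staging policy above (new terms start in the MGEDExtendedOntology and move to the MGEDCoreOntology once mature) might be modelled as follows. This is a minimal sketch: the `Term` record, its `stable` flag, and the `migrate_if_mature` helper are assumptions for illustration, since the slides do not define how maturity is judged.

```python
from dataclasses import dataclass

CORE = "MGEDCoreOntology"
EXTENDED = "MGEDExtendedOntology"


@dataclass(frozen=True)
class Term:
    name: str
    branch: str = EXTENDED  # new terms start in the extended ontology
    stable: bool = False    # hypothetical maturity flag


def migrate_if_mature(term: Term) -> Term:
    """Return a core copy of a stable extended term; otherwise return it unchanged."""
    if term.branch == EXTENDED and term.stable:
        return Term(term.name, CORE, True)
    return term


# Term names below are placeholders, not actual MO terms.
draft = Term("ExampleToxicogenomicsTerm")
mature = Term("ExampleMatureTerm", stable=True)

still_extended = migrate_if_mature(draft)
now_core = migrate_if_mature(mature)
```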

Proposed methods for MO development
- MO v2 development
  - Reflect the reorganization planned for the MAGE-OM and its new major version (v2).
  - MAGE v2 will have major structural changes from MAGE v1.1 and is likely to require major changes in the MO.
  - With an MO v2 developed in parallel, this should not conflict with the stated plans of the MO to be consistent with MAGE, as it will be tied to the new version.

A Functional Genomics Object Model (FGE-OM)
- Separate out common components from technology-specific ones
- Allow new domains to be added as new modules to the model
- Incorporate ideas from SysBio-OM (Xirasagar et al. Bioinformatics in press)
Jones et al. Bioinformatics 2004

Proposed Development of MGED Ontology
[Timeline figure. Tracks: MO 1.x, MO 2.x. Time axis labels: Sept, Jan, March 2005, Sept. Milestones: Move to OWL/Protege; Proteomics in ExtendedOntology; RSBI in ExtendedOntology; RSBI in CoreOntology]