MGED Ontology Working Group MGED4 Boston, MA Feb. 15, 2002 Chris Stoeckert, Center for Bioinformatics, U. Penn Helen Parkinson, EBI.



Agenda
- Overview of ontologies
- Status of MGED Ontology
- Incorporating ontologies into microarray database annotation forms (Helen Parkinson)
- Discussion
  – Annotation experience
  – Use cases: needs besides retrieving experiments?
  – Issues:
      Missing concepts? (quick tour of ontology)
      Relationship between MAGE and MGED ontology

What Does an Ontology Do?
- Captures knowledge
- Creates a shared understanding – between humans and for computers
- Makes knowledge machine-processable
- Makes meaning explicit – by definition and context
From Building and Using Ontologies, Robert Stevens, U. of Manchester

What is an Ontology?
[Figure: spectrum of what counts as an ontology, in order of increasing formality: Catalog/ID; Terms/glossary; Thesauri (“narrower term” relation); Informal is-a; Formal is-a; Formal instance; Frames (properties); Value restrictions; Disjointness, inverse, part-of…; General logical constraints]
From Building and Using Ontologies, Robert Stevens, U. of Manchester

Uses of Ontology
- Community reference -- neutral authoring.
- Either defining database schema or defining a common vocabulary for database annotation -- ontology as specification.
- Providing common access to information.
- Ontology-based search by forming queries over databases.
- Understanding database annotation and technical literature.
- Guiding and interpreting analyses and hypothesis generation.
From Building and Using Ontologies, Robert Stevens, U. of Manchester

Components of an Ontology
- Concepts: a concept is a class of individuals
  – E.g., the concept Protein and the individual ‘human cytochrome C’
- Relationships between concepts
  – The “is a kind of” relationship forms a taxonomy
  – Other relationships give further structure, e.g. “is a part of”
- Axioms
  – Disjointness, covering, equivalence, …
From Building and Using Ontologies, Robert Stevens, U. of Manchester
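
The three components listed on this slide can be sketched in a few lines of Python. This is only an illustrative toy, not how OILed or any ontology language stores them: the `Ontology` class, its method names, and the concepts `Macromolecule` and `NucleicAcid` are assumptions added for the example; only `Protein` and `human cytochrome C` come from the slide.

```python
class Ontology:
    """Toy ontology: concepts, is-a links, individuals and disjointness axioms."""

    def __init__(self):
        self.parents = {}      # concept -> set of direct "is a kind of" parents
        self.disjoint = set()  # frozenset({a, b}): a and b share no individuals
        self.individuals = {}  # individual name -> set of asserted concepts

    def add_concept(self, name, is_a=()):
        self.parents.setdefault(name, set()).update(is_a)
        for parent in is_a:
            self.parents.setdefault(parent, set())

    def ancestors(self, name):
        """All concepts reachable by following is-a links upward."""
        seen, stack = set(), list(self.parents.get(name, ()))
        while stack:
            concept = stack.pop()
            if concept not in seen:
                seen.add(concept)
                stack.extend(self.parents.get(concept, ()))
        return seen

    def is_kind_of(self, name, other):
        return name == other or other in self.ancestors(name)

    def add_individual(self, name, concepts):
        self.individuals[name] = set(concepts)

    def disjointness_violations(self, individual):
        """Disjoint pairs that the individual falls under at the same time."""
        kinds = set()
        for concept in self.individuals[individual]:
            kinds |= {concept} | self.ancestors(concept)
        return [tuple(sorted(pair)) for pair in self.disjoint if pair <= kinds]


onto = Ontology()
onto.add_concept("Macromolecule")                        # hypothetical parent
onto.add_concept("Protein", is_a=["Macromolecule"])
onto.add_concept("NucleicAcid", is_a=["Macromolecule"])  # hypothetical sibling
onto.disjoint.add(frozenset({"Protein", "NucleicAcid"}))  # axiom
onto.add_individual("human cytochrome C", ["Protein"])
onto.add_individual("bad entry", ["Protein", "NucleicAcid"])

print(onto.is_kind_of("Protein", "Macromolecule"))
print(onto.disjointness_violations("bad entry"))
```

The is-a links give the taxonomy, and the disjointness axiom is what lets a tool flag the second individual as inconsistent.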

Languages
- Vocabularies using natural language
  – Hand crafted, flexible but difficult to evolve, maintain and keep consistent, with weak semantics
  – Gene Ontology
- Object-based KR: frames
  – Extensively used, good structuring, intuitive. Semantics defined by OKBC standard
  – EcoCyc (uses Ocelot) and RiboWeb (uses Ontolingua)
- Logic-based: Description Logics
  – Very expressive, model is a set of theories, well-defined semantics
  – Automatically derived classification taxonomies
  – Concepts are either defined or primitive
From Building and Using Ontologies, Robert Stevens, U. of Manchester

Microarray Information to be Captured
Figure from: David J. Duggan et al. (1999) Expression Profiling using cDNA microarrays. Nature Genetics 21: 10-14

MGED Ontology Working Group Goals
1. Identify concepts
2. Collect available controlled vocabularies and ontologies for concepts
3. Define concepts
4. Formalize concept relationships

Relationship of MGED Efforts
[Figure: diagram linking MAGE, MIAME, databases (DB), external ontologies/CVs, and the MGED Ontology; labels: Annotation, Format, Ontologies (External, Internal)]
Ontologies provide common terms and their definitions for describing microarray experiments.

Species Resources

Concept Definitions

Usage of Concepts and Resources for Microarrays
- MIAME glossary
  – Provide definitions for types of information (concepts) listed in MIAME
- MIAME qualifier, value, source
  – Provide pointers to relevant sources that can be used to annotate experiments

MIAME Section on Sample Source and Treatment
sample source and treatment
- ID as used in section 1
- organism (NCBI taxonomy)
- additional "qualifier, value, source" list; the list includes:
  – cell source and type (if derived from primary source(s))
  – sex
  – age
  – growth conditions
  – development stage
  – organism part (tissue)
  – animal/plant strain or line
  – genetic variation (e.g., gene knockout, transgenic variation)
  – individual
  – individual genetic characteristics (e.g., disease alleles, polymorphisms)
  – disease state or normal
  – target cell type
  – cell line and source (if applicable)
  – in vivo treatments (organism or individual treatments)
  – in vitro treatments (cell culture conditions)
  – treatment type (e.g., small molecule, heat shock, cold shock, food deprivation)
  – compound
  – is additional clinical information available (link)
  – separation technique (e.g., none, trimming, microdissection, FACS)
- laboratory protocol for sample treatment
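
The "qualifier, value, source" list in this MIAME section maps naturally onto a three-field record. The sketch below is one possible representation, not a MIAME or MAGE data format: the `Annotation` type and the two helper functions are hypothetical, and the sample values are illustrative only.

```python
from typing import NamedTuple, Optional


class Annotation(NamedTuple):
    qualifier: str          # type of information, e.g. "sex"
    value: str              # term chosen by the annotator
    source: Optional[str]   # vocabulary or resource the term came from, if any


# Illustrative sample description (values are examples, not real data).
sample = [
    Annotation("organism", "Mus musculus", "NCBI taxonomy"),
    Annotation("organism part", "Liver", "Mouse Anatomical Dictionary"),
    Annotation("sex", "Female", None),
]


def values_for(annotations, qualifier):
    """Return every value recorded under one qualifier."""
    return [a.value for a in annotations if a.qualifier == qualifier]


def unsourced(annotations):
    """Return qualifiers whose value is not tied to any external source."""
    return [a.qualifier for a in annotations if a.source is None]


print(values_for(sample, "organism part"))
print(unsourced(sample))
```

Keeping the source next to each value is what later allows a query or a compliance check to tell a controlled term from free text.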

An example of microarray sample annotation using the MGED ontology
Susanna A. Sansone, Helen Parkinson, Philippe Rocca-Serra, Chris Stoeckert and Alvis Brazma
MGED Ontology classes and instances:
- BioMaterialDescription
- Biosource Property
  – Organism: Mus musculus musculus id:
  – Age: weeks after birth
  – DevelopmentStage: Stage 28
  – Sex: Female
  – StrainOrLine: C57BL/6N
  – BiosourceProvider: Charles River, Japan
  – OrganismPart: Liver
- BioMaterialManipulation
- EnvironmentalHistory
- CultureCondition
  – Temperature: 22 ± 2 °C
  – Humidity: 55 ± 5%
  – Light: 12 hours light/dark cycle
  – PathogenTests: Specified pathogen free conditions
  – Water: ad libitum
  – Nutrients: MF, Oriental Yeast, Tokyo, Japan
- Treatment
- CompoundBasedTreatment
  – (Compound): Fenofibrate, CAS
  – (Treatment_application): in vivo, oral gavage
  – (Measurement): 100mg/kg body weight
External References:
- NCBI Taxonomy
- Mouse Anatomical Dictionary
- International Committee on Standardized Genetic Nomenclature for Mice
- ChemIDplus

MAGE BioMaterial Model

MGED Biomaterial Ontology
- Under construction
  – Using OILed (not wedded to any one tool)
  – Generate multiple formats: RDFS, DAML+OIL
- Define classes, provide relations and constraints, identify instances
- Motivated by MIAME and coordinated with MAGE
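
To give a feel for what an RDFS export of class definitions looks like, the sketch below writes a tiny class hierarchy as N-Triples using plain string formatting. It does not reproduce OILed's output: the `BASE` namespace is a placeholder, the `class_triples` helper is hypothetical, and treating `CompoundBasedTreatment` as a subclass of `Treatment` is an assumption made for the example. The RDF and RDFS property URIs are the standard W3C ones.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
BASE = "http://example.org/mged#"  # placeholder namespace, not the real one


def class_triples(subclass_of):
    """Turn {class: parent or None} into N-Triples lines."""
    lines = []
    for cls in sorted(subclass_of):
        # Declare the class itself.
        lines.append(f"<{BASE}{cls}> <{RDF_TYPE}> <{RDFS_CLASS}> .")
        parent = subclass_of[cls]
        if parent is not None:
            # Record the "is a kind of" link.
            lines.append(
                f"<{BASE}{cls}> <{RDFS_SUBCLASS_OF}> <{BASE}{parent}> ."
            )
    return lines


# Assumed hierarchy, for illustration only.
hierarchy = {"Treatment": None, "CompoundBasedTreatment": "Treatment"}

for line in class_triples(hierarchy):
    print(line)
```

RDFS can carry the class hierarchy this way; constraints such as disjointness need a richer language like DAML+OIL.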

Building a Microarray Ontology

Ontology in Browseable Form

Example of Internal Terms

Example of External Terms

Example of Combined Internal and External: Treatment

OWG Use Cases
- Make it easier and more accurate to annotate a microarray experiment.
  – Build forms that provide menus of terms and links to external resources.
  – Only ask for relevant terms and fill in terms that can be inferred.
- Return a summary of all experiments that use a specified type of biosource.
  – Use “age” to select and order experiments
  – Use Mouse Anatomical Dictionary Stage 28 to pick experiments according to “organism part”
- Return a summary of all experiments done examining effects of a specified treatment
  – E.g., look for “CompoundBasedTreatment”, “in vivo”
  – Select “Compound” based on CAS registry number
  – Order based on “CompoundMeasurement”
- ? Use to check if “MIAME-compliant.”
  – Assess only fields that are relevant
  – Check for proper use of terms
- ? Build gene networks based on biomaterial description
  – Generate a distance metric based on biosource and use in calculation of correlation with gene expression level
  – Generate an error estimation based on biosample (i.e., even when biosources are identical, there will be variation resulting from different treatments)
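
The two retrieval use cases above can be sketched as simple filters over annotated experiment records. Everything here is hypothetical: the record layout, the field names, the experiment ids and the values are invented for illustration and do not come from ArrayExpress or any MGED specification.

```python
# Hypothetical annotated experiments; field names and values are invented.
experiments = [
    {"id": "E1", "organism_part": "Liver", "age_weeks": 12,
     "treatment": {"class": "CompoundBasedTreatment",
                   "application": "in vivo", "dose": 100}},
    {"id": "E2", "organism_part": "Liver", "age_weeks": 8,
     "treatment": None},
    {"id": "E3", "organism_part": "Kidney", "age_weeks": 10,
     "treatment": {"class": "CompoundBasedTreatment",
                   "application": "in vivo", "dose": 30}},
]


def by_organism_part(records, organism_part):
    """Select experiments on one organism part, ordered by age."""
    hits = [r for r in records if r["organism_part"] == organism_part]
    return sorted(hits, key=lambda r: r["age_weeks"])


def by_treatment(records, treatment_class, application):
    """Select experiments with a given treatment, ordered by dose."""
    hits = [
        r for r in records
        if r["treatment"] is not None
        and r["treatment"]["class"] == treatment_class
        and r["treatment"]["application"] == application
    ]
    return sorted(hits, key=lambda r: r["treatment"]["dose"])


print([r["id"] for r in by_organism_part(experiments, "Liver")])
print([r["id"] for r in by_treatment(experiments, "CompoundBasedTreatment", "in vivo")])
```

Exact string matching only works because every record uses the same terms, which is the practical argument for annotating with a shared ontology in the first place.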

MGED Ontology Plans
- More concepts? Improve definitions?
  – Extend to other parts of MIAME
- More instances!
- Add identifiers to all classes (facilitate neutral authoring). Instances?
- Add constraints. Prevent nonsense associations (e.g., only time units for age)
- Write a paper describing and explaining MGED ontology by next meeting with example applications and datasets.
  – Mechanism to establish a consensus “standard.”
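
The planned constraint "only time units for age" could be enforced with a check along these lines. This is a minimal sketch, assuming a fixed list of permitted units: the `TIME_UNITS` set and the `check_age` function are hypothetical and are not part of the MGED ontology.

```python
# Hypothetical list of units accepted for an Age annotation.
TIME_UNITS = {"hours", "days", "weeks", "months", "years"}


def check_age(value, unit):
    """Return a list of problems with an age annotation (empty if none)."""
    problems = []
    if unit not in TIME_UNITS:
        problems.append(f"'{unit}' is not a time unit")
    if value < 0:
        problems.append("age cannot be negative")
    return problems


print(check_age(12, "weeks"))  # acceptable annotation
print(check_age(12, "kg"))     # nonsense association is reported
```

In an ontology language the same rule would be stated as a value restriction on the age property rather than as application code.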