POC tutorial #2: Ontology Development This tutorial will run automatically in Quicktime. To run the tutorial at your own pace use the internal controllers.


POC tutorial #2: Ontology Development
This tutorial will run automatically in QuickTime. To run the tutorial at your own pace, use the internal controllers within the tutorial.

Ontology Development
A. What are the organizing principles?
B. How are terms defined?
C. How are terms related to each other?
D. What is a directed acyclic graph (DAG)?
E. Elements and attributes of terms
F. What is the "True Path Rule"?
G. Species specificity: problems and solutions
H. How are ontologies maintained?

What are the organizing principles?
Keep it simple: strive for a robust, extensible structure rather than comprehensiveness. Where possible, rely on synonyms (equivalent terms) rather than creating a new term. The criteria for creating an anatomy term include location, morphology, derivation and spatial/positional organization. Include species-specific terminology to accommodate annotation and biological accuracy (i.e. to maintain the true path rule). All terms must be defined.

How are terms defined?
The precise definition of terms is critical to the integrity of the ontologies. Definitions are obtained primarily from standard references such as textbooks and glossaries. Definitions may be taken verbatim from references, or modified for clarity or to reflect common usage. Most definitions come from Plant Anatomy (K. Esau) and the Angiosperm Phylogeny Website (Missouri Botanical Garden).

How are terms related to each other?
Terms are related to each other as children to parents. Each child term can have one or more parents. There are three basic types of child-parent relationships used in the plant ontologies, which are illustrated in the following graph.
[Figure: graph of the terms plant structure, plant cell, tissue, organ, trichoblast, guard cell, root hair and root.]

The is a relationship
The is a relationship is a simple class-subclass relationship. For example, a trichoblast is a plant cell, which is a plant structure. A root is an organ, which is a plant structure.

The part of relationship
The part of relationship indicates a part/subpart relationship within a tissue or organ. It is used in a non-restrictive manner. For example, root hair is part of root: a root hair is always part of a root, but not all roots have root hairs.

The develops from relationship
The develops from relationship indicates that a cell, tissue or organ develops from its parent term. It covers both direct development (develops from) and a more indirect relationship (derives from). For example, the root hair develops from the trichoblast, which is a plant cell, which is a plant structure.
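The three relationship types can be made concrete with a small sketch. The encoding below is hypothetical (it is not the format used by the POC): each relationship stated in the slides is stored as a (child, relationship, parent) triple, and the names `is_a`, `part_of`, `develops_from`, `parents_of` and `children_of` are chosen only for illustration.

```python
# Hypothetical encoding of the relationships stated in the text,
# as (child, relationship, parent) triples.
EDGES = [
    ("plant cell", "is_a", "plant structure"),
    ("organ", "is_a", "plant structure"),
    ("trichoblast", "is_a", "plant cell"),
    ("root", "is_a", "organ"),
    ("root hair", "part_of", "root"),
    ("root hair", "develops_from", "trichoblast"),
]


def parents_of(term, edges=EDGES):
    """Return (relationship, parent) pairs for a term, in listed order."""
    return [(rel, parent) for child, rel, parent in edges if child == term]


def children_of(term, relationship, edges=EDGES):
    """Return the children linked to a term by one relationship type."""
    return [child for child, rel, parent in edges
            if parent == term and rel == relationship]


if __name__ == "__main__":
    # The root hair has two parents, reached by different relationship types.
    print(parents_of("root hair"))
    print(children_of("plant structure", "is_a"))
```

Storing the relationship type on the edge, rather than on the term, is what lets one child carry several parents of different kinds.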

What is a directed acyclic graph (DAG)?
A DAG is a collection of ordered nodes (e.g. parent-child) and edges (e.g. relationships) that flow in a specific direction. In the ontologies, nodes are terms. A path through the nodes cannot cycle, or double back on itself.

If every child node has no more than one parent node, then the DAG is a tree. If at least one child node has two parents, the DAG is a network. The Plant Ontology, like the Gene Ontology, can be represented as a network DAG.
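The distinction between a tree, a network and a graph that is not a DAG at all can be checked mechanically. The following is a minimal sketch under stated assumptions: edges are plain (child, parent) pairs taken from the example graph, and the function names `build_parent_map`, `is_acyclic` and `classify` are illustrative, not part of any ontology tool.

```python
def build_parent_map(edges):
    """Map every term to the set of its parents (empty set for roots)."""
    parent_map = {}
    for child, parent in edges:
        parent_map.setdefault(child, set()).add(parent)
        parent_map.setdefault(parent, set())
    return parent_map


def is_acyclic(parent_map):
    """Depth-first search; meeting a term still on the current path is a cycle."""
    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = {term: UNSEEN for term in parent_map}

    def visit(term):
        state[term] = ON_PATH
        for parent in parent_map[term]:
            if state[parent] == ON_PATH:
                return False
            if state[parent] == UNSEEN and not visit(parent):
                return False
        state[term] = DONE
        return True

    return all(visit(t) for t in parent_map if state[t] == UNSEEN)


def classify(edges):
    """Return 'tree', 'network' or 'not a DAG' for a list of (child, parent) pairs."""
    parent_map = build_parent_map(edges)
    if not is_acyclic(parent_map):
        return "not a DAG"
    if all(len(parents) <= 1 for parents in parent_map.values()):
        return "tree"
    return "network"


# Edges from the example graph; root hair has two parents.
EXAMPLE = [
    ("plant cell", "plant structure"),
    ("organ", "plant structure"),
    ("trichoblast", "plant cell"),
    ("root", "organ"),
    ("root hair", "root"),
    ("root hair", "trichoblast"),
]

if __name__ == "__main__":
    print(classify(EXAMPLE))
```

Because root hair has both root and trichoblast as parents, the example graph falls in the network case.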

Tree view in the AmiGO browser

What is the true path rule?
The true path rule states that the path from any node (term) all the way to the top node of the tree must be biologically correct. When violations of the true path rule are detected, the structure of the ontology must be modified.

Example: Maize lemmas
For example, a lemma is a type of bract that is a part of a maize floret but is not present in other flowers.
[Figure: schematic diagram of male florets of maize (Veit et al., Plant Cell Oct;5(10)), with the relationship: lemma part of flower (generic).]

Maintaining the true path rule
Lemmas are not present in all flowers; therefore it is necessary to create a special instance of a flower, specifically a maize floret.
Problem: lemma part of flower (generic) violates the true path rule, because a lemma is not part of a generic flower.
Solution: add floret as an instance of flower, and add an instance of a maize floret.
[Figure: before and after graphs of the terms lemma, floret (sensu Poaceae), floret and flower (generic), joined by part of and is a relationships.]

How does this affect queries?
The path to each parent is true. A query for all genes affecting the generic flower would still return genes affecting the lemma of the maize floret. It is therefore possible to find all flower mutations in maize without explicit knowledge of maize-specific terms such as lemma.
[Figure: representation of lemma in the plant structure ontology.]
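One way to see why a true path matters for queries is to follow annotations up the graph. The sketch below is illustrative only: the parent links are one plausible reading of the corrected lemma example, and the gene names `gene_A` and `gene_B`, like the function names, are hypothetical placeholders rather than real annotations.

```python
# Assumed parent links for the corrected lemma example (relationship types omitted).
PARENTS = {
    "lemma": {"floret (sensu Poaceae)"},
    "floret (sensu Poaceae)": {"floret"},
    "floret": {"flower"},
    "flower": set(),
}

# Hypothetical annotations: gene name -> term it is annotated to.
ANNOTATIONS = {"gene_A": "lemma", "gene_B": "flower"}


def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def genes_for(term, parents=PARENTS, annotations=ANNOTATIONS):
    """Genes annotated to the term itself or to any of its descendants."""
    return sorted(
        gene for gene, annotated in annotations.items()
        if annotated == term or term in ancestors(annotated, parents)
    )


if __name__ == "__main__":
    # A query on the generic flower also finds the gene annotated to lemma.
    print(genes_for("flower"))
    print(genes_for("lemma"))
```

The query on flower needs no knowledge of the term lemma; the result follows from every step on the path being biologically correct.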

Elements and attributes of terms
The following section defines the attributes of terms as they are shown in the AmiGO browser. Here, we show the term "inflorescence".

Accession
Each term has a unique identifier, its accession.

Aspect
This refers to the aspect of the Plant Ontology (structure or developmental stage) that includes the term.

Synonyms
The synonyms include a variety of alternate forms of the term, such as variations, broader/narrower terms, misnomers and equivalent terms.

Definition
Definition of the term as used in the Plant Ontologies. Definitions are primarily obtained from textbooks and glossaries.

Comments
Comments by curators/developers to provide clarity or additional information, such as usage.

Lineage
The diagram shows the relationship of the term to all of its parents.
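The attributes listed above can be pictured as the fields of a single record. This is only a sketch: the class `Term`, the helper `find_by_name`, the accession value, the definition text and the synonym are all placeholders invented for illustration and do not reproduce the actual Plant Ontology entry for "inflorescence".

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One ontology term with the attributes shown in the browser."""
    accession: str                                  # unique identifier
    name: str
    aspect: str                                     # structure or developmental stage
    definition: str
    synonyms: list = field(default_factory=list)
    comments: str = ""
    parents: list = field(default_factory=list)     # lineage: (relationship, parent name)


def find_by_name(terms, query):
    """Match a query against term names and synonyms, ignoring case."""
    q = query.lower()
    return [t for t in terms
            if t.name.lower() == q or q in (s.lower() for s in t.synonyms)]


# Placeholder values only; not the real Plant Ontology record.
EXAMPLE_TERMS = [
    Term(
        accession="PO:0000000",
        name="inflorescence",
        aspect="plant structure",
        definition="(placeholder definition)",
        synonyms=["example synonym"],
    ),
]

if __name__ == "__main__":
    print(find_by_name(EXAMPLE_TERMS, "Example Synonym")[0].accession)
```

Searching synonyms alongside names reflects the organizing principle of relying on synonyms rather than creating new terms.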

Species specificity: the problem
Where more specific instances of terms (sensu) are created, their child terms cannot be generic, because this would violate the true path rule. An Arabidopsis gene annotated to a generic anther term should NOT be retrieved in a search for genes expressed in a maize floret.
[Figure: the terms anther, floret (sensu Zea) and flower, joined by part of and is a relationships; the invalid relationship is marked with an X.]

Species specificity: the solution
The solution is to create specific (sensu Zea) instances for the parts of the maize floret. The new sensu terms are also added as instances of the more generic term, so that a query for mutants affecting the anther will include genes from maize as well as other species.
[Figure: the terms anther, anther (sensu Zea), floret (sensu Zea) and flower, joined by part of and is a relationships.]
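The effect of the sensu terms on retrieval can be sketched by comparing the two structures. Everything in the block is an assumption made for illustration: the parent links are one reading of the two diagrams, and `arabidopsis_gene` and `maize_gene` are hypothetical stand-ins for real annotations.

```python
def ancestors(term, parents):
    """Collect every term reachable by following parent links upward."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def genes_for(term, parents, annotations):
    """Genes annotated to the term itself or to any of its descendants."""
    return sorted(
        gene for gene, annotated in annotations.items()
        if annotated == term or term in ancestors(annotated, parents)
    )


# Problem: the generic anther is made part of the maize-specific floret.
BEFORE = {
    "anther": {"flower", "floret (sensu Zea)"},
    "floret (sensu Zea)": {"flower"},
}
BEFORE_ANNOTATIONS = {"arabidopsis_gene": "anther", "maize_gene": "anther"}

# Solution: a sensu Zea anther is part of the maize floret and is a (generic) anther.
AFTER = {
    "anther": {"flower"},
    "anther (sensu Zea)": {"anther", "floret (sensu Zea)"},
    "floret (sensu Zea)": {"flower"},
}
AFTER_ANNOTATIONS = {"arabidopsis_gene": "anther", "maize_gene": "anther (sensu Zea)"}

if __name__ == "__main__":
    # Before: the Arabidopsis gene is wrongly returned for the maize floret.
    print(genes_for("floret (sensu Zea)", BEFORE, BEFORE_ANNOTATIONS))
    # After: only the maize gene is returned for the maize floret ...
    print(genes_for("floret (sensu Zea)", AFTER, AFTER_ANNOTATIONS))
    # ... while a query on the generic anther still covers both species.
    print(genes_for("anther", AFTER, AFTER_ANNOTATIONS))
```

The generic query keeps working because the sensu term is itself an instance of the generic term, which is the second half of the solution described above.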

How are the ontologies maintained?
The ontologies are updated often. The most current versions of the ontologies can be downloaded from the POC CVS repository. The updated ontologies are then used to update the Plant Ontology (AmiGO) browser on a monthly basis. The ontologies are created and edited by curators using the DAG Edit ontology editor, which is freely available from SourceForge.

End of tutorial