1
Knowtator: A knowledge-based text annotation tool
2
Philip Ogren (Philip.Ogren@uchsc.edu)
Larry Hunter, PhD
Center for Computational Pharmacology
University of Colorado Health Sciences Center
Aurora, CO
3
Availability: bionlp.sourceforge.net/Knowtator
Source code will be available under the MPL soon.
Comments and suggestions welcome!
This work was supported by NIH grant R01-LM008111.
4
Knowtator is:
  A general-purpose text annotation tool
  A Protégé plugin
5
Knowtator screenshot
6
Synopsis
Knowtator facilitates the manual creation of training and evaluation corpora for a variety of biomedical language processing tasks. Knowtator's key strength is the ability to define an annotation schema using a Protégé knowledge base.
7
Features
Stand-off annotation
  Original text is not modified.
Simple API allows annotation of any arbitrary text source.
Inter-annotator agreement metrics
Annotation filters
  All annotations are assigned an annotator and (optionally) one or more annotation sets.
  Annotations of many types, from multiple annotators and annotation sets, can clutter the user interface.
  Filters allow viewing selected annotations only.
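The features above can be illustrated with a minimal Python sketch. It is not Knowtator's API: the `Annotation` class, the `annotate`, `filter_annotations` and `span_agreement` helpers, the example sentence and the annotator and set names are all assumptions made for illustration. The agreement measure shown (exact matches divided by matches plus non-matches) is one simple choice; the slides do not say which metrics Knowtator computes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Stand-off annotation: offsets into a text that is never modified."""
    start: int                      # offset of the first character of the span
    end: int                        # offset one past the last character
    label: str                      # annotated concept (hypothetical field name)
    annotator: str                  # every annotation is assigned an annotator
    sets: frozenset = frozenset()   # optional annotation sets

    def covered_text(self, text):
        return text[self.start:self.end]


def annotate(text, phrase, label, annotator, sets=()):
    """Build an annotation for the first occurrence of `phrase` in `text`."""
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), label, annotator, frozenset(sets))


def filter_annotations(annotations, annotator=None, annotation_set=None):
    """Keep only annotations matching the given annotator and/or annotation set."""
    return [
        a for a in annotations
        if (annotator is None or a.annotator == annotator)
        and (annotation_set is None or annotation_set in a.sets)
    ]


def span_agreement(first, second):
    """Exact span-and-label matches divided by matches plus non-matches."""
    keys_a = {(a.start, a.end, a.label) for a in first}
    keys_b = {(a.start, a.end, a.label) for a in second}
    total = len(keys_a | keys_b)
    return len(keys_a & keys_b) / total if total else 1.0


# Hypothetical example sentence and annotations.
TEXT = "Endocytosis moves the receptor from the cell surface to the endosome."
ANNOTATIONS = [
    annotate(TEXT, "Endocytosis", "endocytosis", "annotator_a", {"pilot"}),
    annotate(TEXT, "receptor", "molecule", "annotator_a", {"pilot"}),
    annotate(TEXT, "endosome", "cellular component", "annotator_a"),
    annotate(TEXT, "Endocytosis", "endocytosis", "annotator_b", {"pilot"}),
    annotate(TEXT, "receptor", "molecule", "annotator_b"),
]
```

Because each annotation stores only offsets, `TEXT` itself is never changed, and a filter is just a selection over the annotation list, for example `filter_annotations(ANNOTATIONS, annotator="annotator_a")`.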
8
Knowtator annotation schemas are defined by a Protégé knowledge base
Biological and linguistic concepts can be modeled in Protégé.
9
Entities in an annotation schema are defined by Protégé class definitions. Protégé slots and constraints on those slots can be used to relate annotations in meaningful ways.
[Figure: class definition for endocytosis]
10
Example: endocytosis annotation
Annotations of endocytosis are related to annotations of cellular components and molecules through the slots of the endocytosis class definition.
Six slots of endocytosis:
  location: filled by cellular component annotations
    origin: subslot of location
    destination: subslot of location
  transport participants: filled by molecule annotations
    transported entities: subslot of transport participants
    transporters: subslot of transport participants
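The slot structure listed above can be sketched as plain Python data. This is an illustration only and does not use the Protégé API: the `Slot` class and the `can_fill` helper are hypothetical, and the sketch assumes that a subslot accepts the same kind of filler as the slot it specializes, which the slide states only for the two top-level slots.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Slot:
    name: str
    filler_class: Optional[str] = None   # annotation class allowed as a filler
    parent: Optional["Slot"] = None      # set when this slot is a subslot

    def allowed_filler(self):
        """Walk up the subslot chain until a filler constraint is found.

        Assumption: subslots inherit the filler constraint of their parent.
        """
        slot = self
        while slot is not None:
            if slot.filler_class is not None:
                return slot.filler_class
            slot = slot.parent
        return None


# The six slots of endocytosis named on the slide.
location = Slot("location", filler_class="cellular component")
origin = Slot("origin", parent=location)
destination = Slot("destination", parent=location)
participants = Slot("transport participants", filler_class="molecule")
transported = Slot("transported entities", parent=participants)
transporters = Slot("transporters", parent=participants)

ENDOCYTOSIS_SLOTS = {
    slot.name: slot
    for slot in (location, origin, destination,
                 participants, transported, transporters)
}


def can_fill(slot_name, annotation_class):
    """True if an annotation of `annotation_class` may fill the named slot."""
    return ENDOCYTOSIS_SLOTS[slot_name].allowed_filler() == annotation_class
```

Under these assumptions, `can_fill("origin", "cellular component")` holds while `can_fill("transporters", "cellular component")` does not, which is the kind of constraint the class definition places on how annotations may be related.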
11
Example endocytosis annotation
12
Knowtator data model
The goal of Knowtator is to create mappings between concepts represented in a knowledge base and texts that talk about those concepts.
13
The Knowtator data model has three parts:
  1. Ontology/knowledge base of concepts and relationships (Protégé frames)
  2. Mentions of concepts and assertions about relationships between concepts found in text
  3. A mapping between the target text and members of 1 and 2 (annotations)
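A rough Python sketch of the three parts follows. The class names (`Concept`, `Mention`, `Annotation`), the `annotate` helper and the example sentence are assumptions chosen for illustration; Knowtator itself stores these as Protégé frames, and its actual classes are not shown in the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    """Part 1: a concept in the ontology/knowledge base."""
    name: str


@dataclass
class Mention:
    """Part 2: a mention of a concept, with assertions stored as slot values."""
    concept: Concept
    slot_values: dict = field(default_factory=dict)   # slot name -> list of Mentions


@dataclass
class Annotation:
    """Part 3: maps a span of the target text to a mention."""
    start: int
    end: int
    mention: Mention

    def covered_text(self, text):
        return text[self.start:self.end]


def annotate(text, phrase, mention):
    """Annotate the first occurrence of `phrase` in `text` with `mention`."""
    start = text.index(phrase)
    return Annotation(start, start + len(phrase), mention)


# Part 1: concepts (names follow the endocytosis example).
ENDOCYTOSIS = Concept("endocytosis")
MOLECULE = Concept("molecule")
CELLULAR_COMPONENT = Concept("cellular component")

# Hypothetical target text.
TEXT = "Endocytosis moves the receptor from the cell surface to the endosome."

# Part 2: mentions and the assertions that relate them.
receptor = Mention(MOLECULE)
cell_surface = Mention(CELLULAR_COMPONENT)
endosome = Mention(CELLULAR_COMPONENT)
event = Mention(ENDOCYTOSIS, {
    "transported entities": [receptor],
    "origin": [cell_surface],
    "destination": [endosome],
})

# Part 3: annotations tie spans of the text to the mentions.
annotations = [
    annotate(TEXT, "Endocytosis", event),
    annotate(TEXT, "receptor", receptor),
    annotate(TEXT, "cell surface", cell_surface),
    annotate(TEXT, "endosome", endosome),
]
```

Keeping mentions separate from annotations means the assertions (which mention fills which slot) live in one place, while the annotations only record where in the text each mention was found.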
14
[Figure: the three parts of the data model (I. Ontology/KB, II. Mentions/Assertions, III. Annotations), illustrated with the example "Endocytosis of molecule with thromboxane A2 receptor from endosome to cell surface"]
15
To do:
  Report on annotation efforts
  Mechanism for semi-automated annotation
  Import/export scripts for other annotation formats (e.g. ATLAS)