Knowtator: A Protégé plug-in for annotated corpus construction
Philip V. Ogren
Division of Biomedical Informatics, Mayo Clinic College of Medicine, Rochester, Minnesota, USA

Synopsis
Knowtator is a general-purpose text annotation tool.
Knowtator is a Protégé plug-in.
Knowtator is open source and available at: bionlp.sourceforge.net/Knowtator

Introduction
A general-purpose text annotation tool called Knowtator is presented. Knowtator facilitates the manual creation of annotated corpora that can be used for evaluating or training a variety of natural language processing systems. Building on the strengths of the widely used Protégé knowledge representation system, Knowtator has been developed as a Protégé plug-in that leverages Protégé’s knowledge representation capabilities to specify annotation schemas. Knowtator’s unique advantage over other annotation tools is the ease with which complex annotation schemas (e.g. schemas which have constrained relationships between annotation types) can be defined and incorporated into use.

Acknowledgements
Larry Hunter, PhD (1); Zhiyong Lu (1); Kevin Cohen (1); Mike Bada (1); Andrew Dolbey (1); Christopher G. Chute, MD DrPH (2); Guergana Savova, PhD (2); Serguei Pakhomov, PhD (2); Marcelline R. Harris, PhD (2)
1. University of Colorado Health Sciences Center, Aurora, CO.
2. Mayo Clinic College of Medicine, Rochester, MN.

Example
The following outlines an example of how Knowtator can be used to annotate problem statements, outcomes, and interventions that are found in clinical notes. The annotation schema shown in Knowtator is based on the International Classification for Nursing Practice (ICNP), a controlled vocabulary and data model created specifically for coding in this domain.

Annotation Schema Creation: The Protégé knowledge-base editor can be used to create new class (Figure 1), instance, slot (Figures 2 and 3), and facet frames for defining the annotation schema.
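To make the idea of an annotation schema with constrained slots more concrete, the following is a minimal sketch in Python. It does not use Protégé or Knowtator code; the `Schema`, `ClassDef`, and `SlotDef` names are hypothetical, as is the slot name `icnp_id`. The class names and the `action` and `location` constraints come from the ICNP example, while treating Problem Statement as the subclass of Statement is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class SlotDef:
    """A slot plus the constraint on the values it accepts."""
    name: str
    value_type: str  # name of a schema class, or "string" for a simple attribute


@dataclass
class ClassDef:
    """A class frame: a name, an optional parent class, and its own slots."""
    name: str
    parent: Optional[str] = None
    slots: Dict[str, SlotDef] = field(default_factory=dict)


class Schema:
    """Hypothetical stand-in for an annotation schema (not the Protégé API)."""

    def __init__(self) -> None:
        self.classes: Dict[str, ClassDef] = {}

    def add_class(self, name: str, parent: Optional[str] = None,
                  slots: Iterable[SlotDef] = ()) -> None:
        if parent is not None and parent not in self.classes:
            raise KeyError(f"unknown parent class: {parent}")
        self.classes[name] = ClassDef(name, parent, {s.name: s for s in slots})

    def is_a(self, name: str, ancestor: str) -> bool:
        """True if `name` is `ancestor` or inherits from it."""
        current: Optional[str] = name
        while current is not None:
            if current == ancestor:
                return True
            current = self.classes[current].parent
        return False

    def slots_of(self, name: str) -> Dict[str, SlotDef]:
        """Own and inherited slots; the nearest definition wins."""
        merged: Dict[str, SlotDef] = {}
        current: Optional[str] = name
        while current is not None:
            cls = self.classes[current]
            for slot_name, slot in cls.slots.items():
                merged.setdefault(slot_name, slot)
            current = cls.parent
        return merged


def build_example_schema() -> Schema:
    schema = Schema()
    schema.add_class("Statement")
    schema.add_class("Action")
    schema.add_class("Body Structure")
    # "icnp_id" is a hypothetical name for the string-valued identifier slot.
    schema.add_class("Artifact", slots=[SlotDef("icnp_id", "string")])
    # Assumption: Problem Statement is modeled as a subclass of Statement.
    schema.add_class(
        "Problem Statement",
        parent="Statement",
        slots=[SlotDef("action", "Action"),
               SlotDef("location", "Body Structure")],
    )
    return schema
```

Calling `build_example_schema().slots_of("Problem Statement")` returns the `action` and `location` slot definitions together with the class each one requires, which is the information an annotation tool needs in order to check slot values.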
Figure 1. The creation of a subclass of Statement in progress using the Protégé class editor is shown.
Figure 2. The class definition for Problem Statement is shown with its slots and the constraints on those slots (e.g. an action of a Problem Statement must be of type Action).
Figure 3. The only slot of the class Artifact is a simple attribute that accepts a string value corresponding to an identifier for a term in the ICNP controlled vocabulary.

Annotation of Text: Once an annotation schema has been created, it can be used immediately for text annotation. Figure 4 shows some text that is going to be annotated. On the left is the subsumption hierarchy of the available annotation types. A single annotation has been created for the span of text ‘pain’ and is annotated to the class Process.

Figure 4. The text ‘pain’ was highlighted with a mouse, the class Process was selected and an annotation was created.
Figure 5. The annotation corresponding to the text ‘pain’ has a slot that relates this annotation to a specific identifier in the ICNP terminology. A dialog that allows the entry of a string value for the identifier is shown.
Figure 6. An annotation corresponding to the class Problem Statement has been created. There is no span associated with the annotation. However, Problem Statement has several slots (shown in Figure 2) that correspond to other annotations in the text. The annotation for the span of text ‘parascapular thoracic’ with the class Body Structure becomes the value of the location slot of the Problem Statement annotation.

The slots of the class definitions in the annotation schema define what properties an annotation can have. Figure 5 shows an example of a simple slot that holds the value of an identifier from a controlled vocabulary for an annotation of the class Process.
Figure 6 shows an example of a complex slot that relates an annotation of type Problem Statement to an annotation of type Body Structure via the location slot. A key strength of Knowtator is its ability to relate annotations to each other via the slot definitions of the corresponding annotated classes. In the ICNP example above, the location slot of the class Problem Statement relates to the Body Structure annotation for the text extent ‘parascapular thoracic’. The constraints on the slot ensure that the relationships between annotations are consistent. Protégé is capable of representing much more sophisticated and complex conceptual models, which can in turn be used by Knowtator for text annotation. Also, because Protégé is often used to create conceptual models of domains relating to biomedical disciplines, Knowtator is especially well suited for capturing named entities and their relations in those domains.

Features
Merges annotations from multiple annotators
Computes a variety of inter-annotator agreement metrics and provides detailed error analysis data
Consensus set creation mode for consolidating differences between two or more annotators
Pluggable architecture for handling different text sources
Stand-off annotation (i.e. the annotated text is not modified)
XML import/export
Scalable: can run on a standalone laptop or with a database backend (or both)
Mozilla Public License version 1.1
Filters provide fine-grained control over display, annotation export, consensus set creation, and inter-annotator agreement
Display of annotations is highly configurable with respect to the text shown and highlight color
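Stand-off annotation, slot values that point at other annotations, and agreement between annotators can be illustrated with a short Python sketch. This is not Knowtator's implementation or file format: the `Annotation` class, the helper functions, the constraint table, and the sample sentence are all hypothetical, and the agreement score shown (exact match on span and class, combined as 2 x matches / total annotations) is just one simple choice, not necessarily one of the metrics Knowtator reports.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Span = Tuple[int, int]  # (start, end) character offsets into the source text


@dataclass
class Annotation:
    """A stand-off annotation: it stores offsets, never a modified copy of the text."""
    annotation_class: str
    span: Optional[Span] = None  # None models an annotation with no span
    slots: Dict[str, object] = field(default_factory=dict)


def find_span(text: str, phrase: str) -> Span:
    """Offsets of the first occurrence of `phrase` (raises ValueError if absent)."""
    start = text.index(phrase)
    return (start, start + len(phrase))


def covered_text(text: str, ann: Annotation) -> str:
    """Recover the annotated text from the untouched source."""
    if ann.span is None:
        return ""
    start, end = ann.span
    return text[start:end]


def set_slot(ann: Annotation, slot: str, value: Annotation,
             constraints: Dict[Tuple[str, str], str]) -> None:
    """Fill a complex slot, enforcing the class required by the (hypothetical) schema table."""
    required = constraints[(ann.annotation_class, slot)]
    if value.annotation_class != required:
        raise TypeError(
            f"{slot} of {ann.annotation_class} must be {required}, "
            f"got {value.annotation_class}")
    ann.slots[slot] = value


def agreement(a: List[Annotation], b: List[Annotation]) -> float:
    """Illustrative score: exact span-and-class matches, 2 * matches / (|a| + |b|)."""
    keys_a = {(x.span, x.annotation_class) for x in a if x.span is not None}
    keys_b = {(x.span, x.annotation_class) for x in b if x.span is not None}
    total = len(keys_a) + len(keys_b)
    if total == 0:
        return 1.0
    return 2 * len(keys_a & keys_b) / total


if __name__ == "__main__":
    # Hypothetical sentence built around the two spans mentioned in the example.
    text = "Patient reports pain in the parascapular thoracic region."
    constraints = {("Problem Statement", "location"): "Body Structure"}

    pain = Annotation("Process", find_span(text, "pain"))
    body = Annotation("Body Structure", find_span(text, "parascapular thoracic"))
    problem = Annotation("Problem Statement")  # spanless, as in Figure 6
    set_slot(problem, "location", body, constraints)

    print(covered_text(text, problem.slots["location"]))
```

Because annotations hold only offsets, the source text stays unchanged and several annotators can annotate the same document independently before their sets are compared or merged.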