Building Ontologies with Protégé-2000. Mark A. Musen, M.D., Ph.D., Stanford Medical Informatics, Stanford University
Conceptual building blocks for building intelligent systems. Domain ontologies: characterization of concepts and relationships in an application area, providing a domain of discourse. Problem-solving methods: abstract algorithms for achieving solutions to stereotypical tasks (e.g., constraint satisfaction, classification, planning, simulation).
http://protege.Stanford.EDU
What Protégé-2000 offers: an ontology editor for defining classes of concepts (e.g., esophagus); automated generation of tools for building knowledge bases that define instances of concepts (e.g., Cornelius’ esophagus); knowledge-visualization systems; lots of user-contributed “plug-ins”; the ability to archive ontologies and knowledge bases in a variety of formats.
Digital Anatomist Foundational Model of Anatomy. Diagram labels: Material Physical Anatomical Entity, Anatomical Spatial Entity, Anatomical Structure, Body Substance, Body Part, Organ System, Organism, Cell, Subdivision, Component, Tissue.
Classes of anatomical structures (is-a) and parts of the heart (part-of). Diagram labels, classes of anatomical structures: Spatial Entity, Anatomical Structure, Organ, Organ Part, Body Space, Anatomical Feature, Viscus, Organ Subdivision, Organ Component, Organ Cavity, Internal Feature, Hollow Viscus, Cardiac Chamber, Heart. Diagram labels, parts of the heart: Organ Cavity Subdivision, Cavity of Heart, Right Atrium, Wall of Heart, Cavity of Right Atrium, Fossa Ovalis, Wall of Right Atrium, Myocardium, Sinus Venarum, Myocardium of Right Atrium, SA Node.
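The difference between the two kinds of link on this slide can be sketched in a few lines of Python. This is only an illustration: the labels come from the slide, but the specific is-a and part-of links between them are assumptions made for the example, not a transcription of the Foundational Model of Anatomy, and the `ancestors` helper is hypothetical.

```python
# Illustrative sketch only. The labels appear on the slide; the links
# between them are ASSUMED here for the sake of the example.
IS_A = {
    "Heart": "Hollow Viscus",
    "Hollow Viscus": "Viscus",
    "Viscus": "Organ",
    "Organ": "Anatomical Structure",
}
PART_OF = {
    "SA Node": "Myocardium of Right Atrium",
    "Myocardium of Right Atrium": "Wall of Right Atrium",
    "Wall of Right Atrium": "Right Atrium",
    "Right Atrium": "Heart",
}


def ancestors(term, relation):
    """Follow a single relation upward from `term`, returning each target in order."""
    chain = []
    while term in relation:
        term = relation[term]
        chain.append(term)
    return chain


# Part-of answers "what is this a piece of?"; is-a answers "what kind of thing is this?"
sa_node_wholes = ancestors("SA Node", PART_OF)
heart_classes = ancestors("Heart", IS_A)
print(sa_node_wholes)
print(heart_classes)
```

Keeping the two relations in separate tables makes it hard to confuse them: the SA Node is part of the heart, but it is not a kind of heart.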
The Ontology in Protégé-2000
Visualizing relationships with JAMBALAYA: a hierarchy (nested graph) based on is-a and instance-of relationships. The user can zoom in and explore any of the subclasses in more detail (see next slide).
No hierarchical relationships are shown here: only a few slot types are visible, with a spring layout applied to those arc types to help the user explore structures (the filter panel shows which arc types are not filtered out). Classes are blue; instances are white.
Protégé uses the Open Knowledge-Base Connectivity (OKBC) model: classes define concepts in the domain; slots define attributes of classes; facets put constraints on the values of slots (e.g., type, cardinality, defaults). Protégé may exchange knowledge with any other OKBC-compatible server (e.g., Ontolingua, LOOM, Ocelot).
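A minimal Python sketch of the class / slot / facet idea follows. It is not Protégé's or OKBC's API: the `Slot` and `Concept` types, the `check_instance` helper, and the example slots (`owner`, `length_cm`) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Slot:
    """An attribute of a class, with facets that constrain its values."""
    name: str
    value_type: type = str                        # "type" facet
    min_cardinality: int = 0                      # "cardinality" facets
    max_cardinality: Optional[int] = None         # None means unbounded
    defaults: list = field(default_factory=list)  # "defaults" facet


@dataclass
class Concept:
    """A class in the ontology: a named concept with a list of slots."""
    name: str
    slots: list


def check_instance(concept, values):
    """Return facet violations for an instance given as {slot name: [values]}."""
    problems = []
    for slot in concept.slots:
        vals = values.get(slot.name, slot.defaults)
        if len(vals) < slot.min_cardinality:
            problems.append(f"{slot.name}: needs at least {slot.min_cardinality} value(s)")
        if slot.max_cardinality is not None and len(vals) > slot.max_cardinality:
            problems.append(f"{slot.name}: allows at most {slot.max_cardinality} value(s)")
        for v in vals:
            if not isinstance(v, slot.value_type):
                problems.append(f"{slot.name}: {v!r} is not a {slot.value_type.__name__}")
    return problems


# Hypothetical class with two slots; the slot names are assumptions.
esophagus = Concept("Esophagus", [
    Slot("owner", str, min_cardinality=1, max_cardinality=1),
    Slot("length_cm", float, max_cardinality=1),
])

good = check_instance(esophagus, {"owner": ["Cornelius"], "length_cm": [25.0]})
bad = check_instance(esophagus, {"length_cm": ["long", 25.0]})
print(good)
print(bad)
```

The point of the sketch is that the facets live on the slot, so the same checking code works for every class without knowing anything about anatomy.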
Additional features of Protégé’s knowledge model: metaclasses allow developers to define special-purpose facets of base classes that are “instances” of the metaclasses; the Protégé axiom language (PAL) allows developers to specify complex semantic constraints using logic.
Metaclasses allow developers to define new template slots (e.g., as in GO)
Evaluation of constraints can point out semantic errors
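As a rough illustration of constraint evaluation, the sketch below expresses two constraints as plain Python predicates and reports which instances violate them. This is not PAL syntax and not how Protégé evaluates constraints; the instance records, the constraint names, and the `evaluate` function are assumptions made for the example, and two of the records contain deliberate errors.

```python
# Hypothetical knowledge-base instances; two contain deliberate errors.
instances = [
    {"name": "Heart", "part_of": None},
    {"name": "Right Atrium", "part_of": "Heart"},
    {"name": "Fossa Ovalis", "part_of": "Fossa Ovalis"},  # part of itself
    {"name": "SA Node", "part_of": "Left Knee"},          # whole is not in the KB
]


def not_part_of_itself(inst, kb):
    """A structure may not be declared a part of itself."""
    return inst["part_of"] != inst["name"]


def whole_exists(inst, kb):
    """If a structure names a whole, that whole must be in the knowledge base."""
    return inst["part_of"] is None or inst["part_of"] in kb


CONSTRAINTS = {
    "not_part_of_itself": not_part_of_itself,
    "whole_exists": whole_exists,
}


def evaluate(instances, constraints):
    """Return (constraint name, instance name) for every violated constraint."""
    kb = {inst["name"]: inst for inst in instances}
    return [
        (cname, inst["name"])
        for cname, check in constraints.items()
        for inst in instances
        if not check(inst, kb)
    ]


violations = evaluate(instances, CONSTRAINTS)
for cname, iname in violations:
    print(f"{iname} violates {cname}")
```

Both errors pass a simple type check (each `part_of` value is a string), which is why they count as semantic errors: they are only found by checking values against each other and against the rest of the knowledge base.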
OntoViz tab: Another visualization plug-in that uses AT&T’s GraphViz system
PROMPT: A plug-in that allows users to compare and merge different versions of ontologies
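To give a feel for what comparing two versions of an ontology involves, here is a deliberately naive Python sketch. It is not PROMPT's algorithm: it assumes each version is just a mapping from class name to a set of slot names, and the `compare_versions` function and the sample data are hypothetical.

```python
# Two hypothetical versions of a small ontology: class name -> set of slot names.
v1 = {
    "Organ": {"name", "part_of"},
    "Organ Part": {"name"},
    "Body Space": {"name"},
}
v2 = {
    "Organ": {"name", "part_of", "weight_g"},
    "Organ Part": {"name"},
    "Organ Cavity": {"name"},
}


def compare_versions(old, new):
    """Report classes added, removed, or changed between two versions."""
    shared = set(old) & set(new)
    changed = {}
    for cls in sorted(shared):
        if old[cls] != new[cls]:
            changed[cls] = {
                "slots_added": sorted(new[cls] - old[cls]),
                "slots_removed": sorted(old[cls] - new[cls]),
            }
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "changed": changed,
    }


diff = compare_versions(v1, v2)
print(diff)
```

A name-based comparison like this cannot tell a renamed class from a removal plus an addition, which is one reason a dedicated tool is useful for this task.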
Building intelligent systems with Protégé-2000: build a domain ontology; Protégé-2000 generates a custom-tailored GUI for acquisition of content knowledge; elicit content knowledge from application specialists; map the domain ontology to appropriate problem-solving methods for automation of particular tasks.
Building knowledge bases: The Protégé methodology. A domain ontology provides the domain of discourse; a knowledge-acquisition tool supports entry of detailed content.
In Protégé-2000, ontologies can define an overarching structure for a domain knowledge base
Forms allow entry of instances and are generated directly from the ontology
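The idea of deriving an entry form from the ontology can be sketched as follows. This is an assumption-laden illustration rather than Protégé-2000's form generator: the slot description keys (`type`, `multiple`, `allowed`), the widget names, and the `widget_for` and `generate_form` functions are all invented for the example.

```python
# Hypothetical slot descriptions for one class; the keys are assumptions
# made for this sketch, not Protégé-2000's own representation.
ORGAN_SLOTS = [
    {"name": "name", "type": str},
    {"name": "weight_g", "type": float},
    {"name": "is_hollow", "type": bool},
    {"name": "parts", "type": str, "multiple": True},
    {"name": "laterality", "type": str, "allowed": ["left", "right", "midline"]},
]


def widget_for(slot):
    """Pick a form widget from the slot's declared type and constraints."""
    if slot.get("allowed"):
        return "dropdown"
    if slot["type"] is bool:
        return "checkbox"
    if slot.get("multiple"):
        return "list editor"
    if slot["type"] in (int, float):
        return "number field"
    return "text field"


def generate_form(class_name, slots):
    """Render a plain-text form layout with one line per slot."""
    lines = [f"Form: {class_name}"]
    for slot in slots:
        lines.append(f"  {slot['name']}: [{widget_for(slot)}]")
    return "\n".join(lines)


form = generate_form("Organ", ORGAN_SLOTS)
print(form)
```

Because the form is computed from the slot descriptions, adding a slot to the class changes the form without any separate user-interface work.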
Sometimes diagrams are more intuitive than forms: developers can use diagrams for entry of complex relationships; forms and diagrams can be freely mixed based on the requirements of the ontology.
Protégé-2000 supports special-purpose “plug-ins”: widgets to acquire and display special information; tabs that provide enhanced functionality; access to UMLS and WordNet; access to problem solvers (e.g., JESS, Prolog); access to visualization systems; tabs that support entire knowledge-based applications.
Current “back ends” for reading and writing knowledge: relational databases (via JDBC); CLIPS; OKBC; XML (using DTDs); XML (using XML Schema); Resource Description Framework (RDF/S); DAML+OIL (in progress).
Protégé-2000 is: available under an open-source license and freely downloadable from our Web site; used by more than 1000 investigators around the world; supported by Stanford Medical Informatics and by the Protégé-2000 user community.
The Protégé community keeps growing: 9.4 new downloads each day (mean); 4133 registered users as of July 1, 2002; 877 people subscribe to our “protege-discussion” mailing list.