BioHealth Informatics Group
Advanced OWL Tutorial 2005
Ontology Engineering in OWL
Alan Rector & Jeremy Rogers
Ontology Engineering: The ‘ontology’ is just the beginning
[Diagram labels: ‘Ontologies’; Software agents; Problem-solving methods; Domain-independent applications; Databases (declare structure); Knowledge bases (provide domain description); The “Semantic Web”]
We know it is wrong – but why?
►Do we really mean wrong?
►Many upper ontologies
  ►Some very abstract, some less so
  ►Dolce/OntoClean my favourite current compromise besides
    ►See Guarino and Welty:
    ►doc paper is a readable summary if you can get past the vocabulary
    ►Also Guarino’s home page
  ►Others
    ►SUO (Standard Upper Ontology)
    ►John Sowa’s work – see Google
    ►OpenCyc
    ►OpenGALEN
►There is no one way!
  ►No matter how much some people want to make it a matter of dogma
Ontology Layers: What’s it for?
►The Meta Ontology is to enable…
►Cooperation on the Upper Ontologies to enable…
►Cooperation on Top Domain Ontologies to enable…
►Cooperation on the Domain Content Ontologies to enable…
►Cooperation on Information systems & resources
Where do DLs fit in?
►Information systems & resources: databases, RDF instance stores, … (“individuals”)
►Domain Content Ontologies, Top Domain Ontologies, Upper Ontologies: DLs? (“classes”)
►Meta Ontology: FoL / HoL
Principles
►How to describe the things in a domain & how to arrange and maintain those descriptions
►Just enough to describe what needs to be described
  ►No distinction without a difference!
  ►Properties are as important as Classes/Entities/Concepts
  ►If an upper level category does not act as a domain or range constraint or have some other engineering effect, why represent it?
►Exclude things that will be dealt with by other means or given
  ►“Concrete domains”
  ►Time and place
    ►Designed to record what an observer has recorded at a given place and time
  ►Non-physical – e.g. agency
  ►Causation – except in the sense of “aetiology”
►Implemented ontology in a standard framework
  ►For today: OWL/DLs
  ►Must be implemented and support a large ontology
Principles 2
►Minimal commitment
  ►Don’t make a choice if you don’t have to
►Understandable
  ►Experts can make distinctions repeatably/reliably
►Able to infer classification of top domain concepts
  ►‘Twenty questions’ – to neighbourhood
►Upper ontology primarily composed of ‘open dichotomies’
  ►Open, to defer arguments such as whether Collectives of Physical things are physical
Issues for ‘ontology engineering’
►Utility
  ►What’s it for? Scope and limitations
  ►Application tools
►Understandability & reliability
  ►Can people use it consistently?
  ►Matching level of abstraction to human use
    ►“Patterns”
    ►“Intermediate representations”, “Macros”, …
►Soundness
  ►Logical consistency
  ►Sound inferences about domain
►Evolution and maintenance
  ►Modularisation
  ►Debugging
  ►Parsimony
►Collaboration and Standards
Fundamental issue
►Knowledge is fractal
►All terminologies are combinatorially large
  ►2 severities * 2 durations * 2 varieties * 2 circumstances
    ►2^4 = 16 leaf nodes
    ►3^4 = 81 potential entities
  ►Most problems have more than 2 at each step
►Can only catalogue a few of the descriptions to be used
  ►Can’t predict which in advance of use
  ►Experience shows we get at most 50% right
  ►And there is a Zipf distribution for the rest
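The arithmetic on this slide can be checked with a short sketch. Only the four axis names come from the slide; the values on each axis are hypothetical placeholders. The sketch also assumes one reading of the 3^4 figure: each axis takes one of its two values or is left unspecified.

```python
from itertools import product

# Hypothetical two-valued axes; only the axis names come from the slide.
AXES = {
    "severity": ["mild", "severe"],
    "duration": ["acute", "chronic"],
    "variety": ["typeA", "typeB"],
    "circumstance": ["atRest", "onExertion"],
}


def leaf_descriptions(axes):
    """Fully specified descriptions: exactly one value chosen on every axis."""
    return list(product(*axes.values()))


def potential_entities(axes):
    """Descriptions in which any axis may also be left unspecified (None)."""
    return list(product(*[[None, *values] for values in axes.values()]))


leaves = leaf_descriptions(AXES)
entities = potential_entities(AXES)
print(len(leaves), len(entities))
```

Adding a third value to every axis in this sketch raises the counts to 3^4 and 4^4, which is the sense in which enumerating everything in advance stops being practical.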
Limit combinatorial explosions
►“The Exploding Bicycle”
  ►ICD-9 (E826): 8
  ►READ-2 (T30..): 81
  ►READ-3: 87
  ►ICD-10 (V10-19): 587
    ►V31.22 Occupant of three-wheeled motor vehicle injured in collision with pedal cycle, person on outside of vehicle, nontraffic accident, while working for income
►and meanwhile elsewhere in ICD-10
  ►W65.40 Drowning and submersion while in bath-tub, street and highway, while engaged in sports activity
  ►X35.44 Victim of volcanic eruption, street and highway, while resting, sleeping, eating or engaging in other vital activities
Combinatorial explosion leads to complex polyhierarchies
►Humans find polyhierarchies hard to maintain
►Experience suggests errors increase with number of parents

  Number of parents   Errors
  1-2                 10%-15%
  2-4                 20%-25%
  >4                  >35%

►Only MEDDRA seems to have cracked it
►Compositional ‘ontologies’ with formal classification provide a ‘compiler’ to manage polyhierarchies as collections of mono-hierarchies
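The ‘compiler’ idea in the last bullet can be illustrated with a toy sketch. It is not a description logic reasoner and not the tutorial's own method: it assumes each axis is a hand-maintained mono-hierarchy, each entity is a description that names a value on some axes, and subsumption is checked structurally. All names (`HIERARCHIES`, `DESCRIPTIONS`, `direct_parents`, and the clinical terms) are hypothetical.

```python
# Hypothetical mono-hierarchies: each value has at most one parent per axis.
HIERARCHIES = {
    "kind": {"Fracture": "Injury", "Injury": "Disorder"},
    "site": {"Femur": "Bone", "Bone": "BodyPart"},
}

# Hypothetical composed descriptions: axis -> value. No parents are asserted.
DESCRIPTIONS = {
    "Disorder": {"kind": "Disorder"},
    "Injury": {"kind": "Injury"},
    "Fracture": {"kind": "Fracture"},
    "BoneDisorder": {"kind": "Disorder", "site": "Bone"},
    "FemurFracture": {"kind": "Fracture", "site": "Femur"},
}


def ancestors_or_self(axis, value):
    """Walk up one mono-hierarchy, collecting the value and all its ancestors."""
    seen = {value}
    parents = HIERARCHIES[axis]
    while value in parents:
        value = parents[value]
        seen.add(value)
    return seen


def subsumes(general, specific):
    """True if every constraint in `general` is met or refined by `specific`."""
    return all(
        axis in specific and value in ancestors_or_self(axis, specific[axis])
        for axis, value in general.items()
    )


def direct_parents(name):
    """Infer the most specific subsumers of a description."""
    target = DESCRIPTIONS[name]
    subsumers = {
        other
        for other, desc in DESCRIPTIONS.items()
        if other != name and subsumes(desc, target)
    }
    # Keep only subsumers that do not themselves subsume another subsumer.
    return sorted(
        s
        for s in subsumers
        if not any(
            t != s and subsumes(DESCRIPTIONS[s], DESCRIPTIONS[t])
            for t in subsumers
        )
    )


for name in DESCRIPTIONS:
    print(name, "->", direct_parents(name))
```

In this sketch `FemurFracture` ends up under both `Fracture` and `BoneDisorder` although neither link was written by hand; the multiple parents are computed from the two single-parent hierarchies.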
Issues for Today
A one-day selection from a two- to three-day tutorial
►Assume
  ►Introduction to OWL & Protégé-OWL
    ►At least the first part of the Protégé-OWL tutorial
    ►Will review the Value Partitions pattern
►Will deal with
  ►Engineering issues and combinatorial explosion for Domain Ontologies
    ►A common pattern and the use of debugging tools
  ►Basic architecture for an “Ontology Based Knowledge Resource”
  ►Modularisation and Normalisation
  ►Why and when to use a classifier
►General questions
►Additional material on the web at: