Reasoning with Expressive Description Logics Ian Horrocks University of Manchester Manchester, UK Logical Foundations for the Semantic Web
Talk Outline Introduction to Description Logics The Semantic Web: Killer App for (DL) Reasoning? –Semantic Web Background –Ontology Languages for the Semantic Web Reasoning with OWL –OilEd Demo (if time) Description Logic Reasoning Research Challenges
Summary 1 DLs are a family of object oriented KR formalisms related to frames and semantic networks –Distinguished by formal semantics and inference services Semantic Web aims to make web resources accessible to automated processes –Ontologies will play key role by providing vocabulary for semantic markup OWL is a DL based ontology language designed for the Web –Exploits existing standards: XML, RDF(S) –Adds KR idioms from object oriented and frame systems –W3C recommendation and already widely adopted in e-Science –DL provides formal foundations and reasoning support
Summary 2 Reasoning is important because –Understanding is closely related to reasoning –Essential for design, maintenance and deployment of ontologies Reasoning support based on DL systems –Sound and complete reasoning –Highly optimised implementations Challenges remain –Reasoning with full OWL language –(Convincing) demonstration(s) of scalability –New reasoning tasks –Development of (more) high quality tools and infrastructure
Introduction to Description Logics
What Are Description Logics? A family of logic based Knowledge Representation formalisms –Descendants of semantic networks and KL-ONE –Describe domain in terms of concepts (classes), roles (relationships) and individuals Distinguished by: –Formal semantics (typically model theoretic) Decidable fragments of FOL Closely related to Propositional Modal & Dynamic Logics –Provision of inference services Sound and complete decision procedures for key problems Implemented systems (highly optimised)
DL Architecture Knowledge Base Tbox (schema) –Man ≡ Human ⊓ Male –Happy-Father ≡ Man ⊓ ∃has-child.Female ⊓ … Abox (data) –John : Happy-Father –⟨John, Mary⟩ : has-child –John : ≤1 has-child Inference System Interface
Short History of Description Logics Phase 1: –Incomplete systems (Back, Classic, Loom,... ) –Based on structural algorithms Phase 2: –Development of tableau algorithms and complexity results –Tableau-based systems for Pspace logics (e.g., Kris, Crack) –Investigation of optimisation techniques Phase 3: –Tableau algorithms for very expressive DLs –Highly optimised tableau systems for ExpTime logics (e.g., FaCT, DLP, Racer) –Relationship to modal logic and decidable fragments of FOL
Latest Developments Phase 4: –Mature implementations –Mainstream applications and Tools Databases –Consistency of conceptual schemata (EER, UML etc.) –Schema integration –Query subsumption (w.r.t. a conceptual schema) Ontologies and Semantic Web, Grid and e-Science –Ontology engineering (design, maintenance, integration) –Reasoning with ontology-based markup (meta-data) –Service description and discovery –Commercial implementations Cerebra system from Network Inference Ltd
Semantic Web: Killer App for DL Reasoning?
History of the Semantic Web Web was “invented” by Tim Berners-Lee (amongst others), a physicist working at CERN His vision of the Web was much more ambitious than the reality of the existing (syntactic) Web: “… a plan for achieving a set of connected applications for data on the Web in such a way as to form a consistent logical web of data …” “… an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation …” This vision of the Web has become known as the Semantic Web
Scientific American, May 2001: Beware of the Hype! Realising the complete “vision” is too hard for now (probably) Can make a start by adding semantic annotation to web resources Already seeing exciting applications of technology in e-Science
Where we are Today: the Syntactic Web A place where computers do the presentation (easy) and people do the linking and interpreting (hard) Why not get computers to do more of the hard work?
Hard Work using the Syntactic Web… Find images of Peter Patel-Schneider, Frank van Harmelen and Alan Rector… [Image search result: Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois]
Impossible (?) using the Syntactic Web… Complex queries involving background knowledge –Find information about “animals that use sonar but are neither bats nor dolphins”, e.g., Barn Owl Locating information in data repositories –Travel enquiries –Prices of goods and services –Results of human genome experiments Finding and using “web services” –Visualise surface interactions between two proteins Delegating complex tasks to web “agents” –Book me a holiday next weekend somewhere warm, not too far away, and where they speak French or English
What is the Problem? Consider a typical web page: Markup consists of: –rendering information (e.g., font size and colour) –Hyper-links to related content Semantic content is accessible to humans, but not (easily) to computers… Requires (at least) NL understanding
Solution(?): Add “Semantic Markup” Annotations added to web pages (and other web accessible resources) “Semantics” given by ontologies –Ontologies provide a vocabulary of terms used in annotations –New terms can be formed by combining existing ones –Meaning (semantics) of such terms is formally specified –Need to agree on a standard web ontology language
Structure of an Ontology Ontologies typically have two distinct components: Names for important concepts in the domain –Elephant is a concept whose members are a kind of animal –Herbivore is a concept whose members are exactly those animals who eat only plants or parts of plants –Adult_Elephant is a concept whose members are exactly those elephants whose age is greater than 20 years Background knowledge/constraints on the domain –Adult_Elephants weigh at least 2,000 kg –All Elephants are either African_Elephants or Indian_Elephants –No individual can be both a Herbivore and a Carnivore
A Semantic Web: First Steps Extend existing rendering markup with semantic markup –Metadata annotations that describe content/function of web accessible resources Use Ontologies to provide vocabulary for annotations –“Formal specification” is accessible to machines A prerequisite is a standard web ontology language –Need to agree common syntax before we can share semantics –Syntactic web based on standards such as HTTP and HTML Make web resources more accessible to automated processes
Ontology Languages for the Semantic Web
RDF and RDFS RDF stands for Resource Description Framework It is a W3C candidate recommendation RDF is a graphical formalism (+ XML syntax + semantics) –for representing metadata –for describing the semantics of information in a machine-accessible way RDFS extends RDF with “schema vocabulary”, e.g.: –Class, Property –type, subClassOf, subPropertyOf –range, domain
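RDF Syntax: Triples [Figure: a triple has the form Subject, Property, Object. Subject: ex:subject or a blank node _:xxx. Property: ex:property. Object: ex:object, a blank node _:yyy, a « plain literal » or a typed literal « lexical »^^datatype. Figure credit: Jean-François Baget]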
RDF Syntax: Graphs [Figure: example graph. _:xxx ex:name « Ian Horrocks »; _:xxx rdf:type ex:Person; _:yyy ex:name « University of Manchester »; _:yyy rdf:type ex:Organisation; _:xxx ex:member-of _:yyy. Figure credit: Jean-François Baget]
RDFS RDFS vocabulary adds constraints on models, e.g.: –∀x,y,z: type(x,y) ∧ subClassOf(y,z) ⇒ type(x,z) [Figure: ex:John rdf:type ex:Person; ex:Person rdfs:subClassOf ex:Animal; hence ex:John rdf:type ex:Animal]
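The constraint on this slide can be read as a forward-chaining rule over a set of triples. The following is a minimal sketch under that reading; the function name `rdfs_type_closure` and the tuple encoding of triples are assumptions made for illustration, and only this one rule is applied, not the full RDFS rule set.

```python
# Triples are (subject, property, object) tuples; the data mirrors the slide's example.
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def rdfs_type_closure(triples):
    """Repeatedly apply: type(x, y) and subClassOf(y, z) => type(x, z)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot the current facts so the set can grow inside the loops.
        types = [(s, o) for (s, p, o) in inferred if p == TYPE]
        subclasses = [(s, o) for (s, p, o) in inferred if p == SUBCLASS]
        for x, y in types:
            for y2, z in subclasses:
                if y == y2 and (x, TYPE, z) not in inferred:
                    inferred.add((x, TYPE, z))
                    changed = True
    return inferred

graph = {
    ("ex:John", TYPE, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Animal"),
}
closure = rdfs_type_closure(graph)
print(("ex:John", TYPE, "ex:Animal") in closure)  # True
```

The loop runs to a fixpoint, so chains of subClassOf statements of any length are followed without needing a separate transitivity step.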
RDFS RDFS allows arbitrary use of schema vocabulary –Can be used/abused to say very strange things! [Figure: RDF graph whose nodes and arcs are labelled rdfs:subClassOf, rdfs:subPropertyOf, rdf:type and ex:Person]
RDF/RDFS Semantics RDF has “Non-standard” semantics given by RDF Model Theory (MT) –IR, a non-empty set of resources –IS, a mapping from V into IR –IP, a distinguished subset of IR (the properties) –IEXT, a mapping from IP into the powerset of IR × IR Class interpretation ICEXT induced by IEXT(IS(type)) –ICEXT(C) = {x | (x,C) ∈ IEXT(IS(type))} RDFS adds constraints on models –{(x,y), (y,z)} ⊆ IEXT(IS(subClassOf)) ⇒ (x,z) ∈ IEXT(IS(subClassOf))
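To make the definitions of ICEXT and the subClassOf constraint concrete, here is a small sketch over a hand-built toy interpretation. The dictionaries `IS` and `IEXT`, the resource names, and both function names are assumptions chosen for illustration; the sketch checks only the single constraint shown on the slide.

```python
def icext(cls, iext, is_map):
    """ICEXT(C) = {x | (x, C) in IEXT(IS(type))}."""
    type_resource = is_map["rdf:type"]
    return {x for (x, c) in iext[type_resource] if c == cls}

def subclass_is_transitive(iext, is_map):
    """Check: (x,y) and (y,z) in IEXT(IS(subClassOf)) implies (x,z) is too."""
    sub = iext[is_map["rdfs:subClassOf"]]
    return all((x, z) in sub
               for (x, y) in sub
               for (y2, z) in sub
               if y == y2)

# IS maps vocabulary names to resources; IEXT maps properties to sets of pairs.
IS = {"rdf:type": "type", "rdfs:subClassOf": "subClassOf"}
IEXT = {
    "type": {("john", "Person")},
    "subClassOf": {("Person", "Animal"), ("Animal", "LivingThing")},
}

print(icext("Person", IEXT, IS))         # {'john'}
print(subclass_is_transitive(IEXT, IS))  # False: (Person, LivingThing) is missing

# Adding the missing pair turns the structure into one that meets the constraint.
IEXT["subClassOf"].add(("Person", "LivingThing"))
print(subclass_is_transitive(IEXT, IS))  # True
```

Note that classes and properties are themselves resources in IR here, which is what allows the schema vocabulary to be applied to itself.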
Problems with RDFS RDFS too weak to describe resources in sufficient detail –No localised range and domain constraints Can’t say that the range of hasChild is person when applied to persons and elephant when applied to elephants –No existence/cardinality constraints Can’t say that all instances of person have a mother that is also a person, or that persons have exactly 2 parents –No transitive, inverse or symmetrical properties Can’t say that isPartOf is a transitive property, that hasPart is the inverse of isPartOf or that touches is symmetrical –… Difficult to provide reasoning support –No “native” reasoners for non-standard semantics –May be possible to reason via FO axiomatisation
From RDF to OWL Two languages developed by extending (part of) RDF –OIL: developed by group of (largely) European researchers (several from EU OntoKnowledge project) –DAML-ONT: developed by group of (largely) US researchers (in DARPA DAML programme) Efforts merged to produce DAML+OIL –Development was carried out by “Joint EU/US Committee on Agent Markup Languages” –Extends (“DL subset” of) RDF DAML+OIL submitted to W3C as basis for standardisation –Web-Ontology (WebOnt) Working Group formed –WebOnt group developed OWL language based on DAML+OIL –OWL language now a W3C Proposed Recommendation
OWL Language Three species of OWL –OWL full is union of OWL syntax and RDF –OWL DL restricted to FOL fragment (≈ DAML+OIL) –OWL Lite is “simpler” subset of OWL DL Semantic layering –OWL DL ≈ OWL full within DL fragment OWL DL based on SHIQ Description Logic –In fact it is equivalent to SHOIN(Dₙ) DL OWL DL benefits from many years of DL research –Well defined semantics –Formal properties well understood (complexity, decidability) –Known reasoning algorithms –Implemented systems (highly optimised)
OWL Class Constructors XMLS datatypes as well as classes in ∀P.C and ∃P.C –E.g., ∃hasAge.nonNegativeInteger (see work by Zhiming Pan) Arbitrarily complex nesting of constructors –E.g., Person ⊓ ∀hasChild.Doctor ⊔ ∃hasChild.Doctor
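One way to see what nested constructors mean is to compute the extension of a class expression in a small, explicitly given interpretation. The sketch below does that for ⊓, ⊔, ¬, ∃ and ∀ only; the nested-tuple encoding, the function `extension`, and the sample individuals are all assumptions made for illustration, and the expression used is the explicitly parenthesised form Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor). It evaluates one interpretation and is not a reasoner.

```python
def extension(concept, domain, concepts, roles):
    """Return the set of domain elements belonging to `concept`.

    concept  -- a class name (str) or a tuple such as ("and", C, D),
                ("or", C, D), ("not", C), ("some", role, C), ("all", role, C)
    concepts -- dict: class name -> set of individuals
    roles    -- dict: role name  -> set of (subject, object) pairs
    """
    def ext(c):
        return extension(c, domain, concepts, roles)

    if isinstance(concept, str):          # atomic class
        return set(concepts.get(concept, ()))
    op = concept[0]
    if op == "and":
        return ext(concept[1]) & ext(concept[2])
    if op == "or":
        return ext(concept[1]) | ext(concept[2])
    if op == "not":
        return set(domain) - ext(concept[1])
    if op in ("some", "all"):
        _, role, filler = concept
        filler_ext = ext(filler)

        def successors(x):
            return {y for (a, y) in roles.get(role, ()) if a == x}

        if op == "some":                  # at least one successor in the filler
            return {x for x in domain if successors(x) & filler_ext}
        return {x for x in domain if successors(x) <= filler_ext}  # all successors
    raise ValueError(f"unknown constructor: {op!r}")

domain = {"ann", "bob", "cat", "dan", "eve"}
concepts = {"Person": set(domain), "Doctor": {"bob", "dan"}}
roles = {"hasChild": {("ann", "bob"), ("ann", "cat"), ("cat", "dan"), ("dan", "eve")}}

expr = ("and", "Person",
        ("all", "hasChild", ("or", "Doctor", ("some", "hasChild", "Doctor"))))
result = extension(expr, domain, concepts, roles)
print(sorted(result))  # ['ann', 'bob', 'cat', 'eve']
```

Here dan is excluded because his only child, eve, is neither a Doctor nor the parent of one; individuals without children satisfy the ∀ restriction vacuously.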
RDFS Syntax E.g., Person ⊓ ∀hasChild.(Doctor ⊔ ∃hasChild.Doctor):
OWL Axioms Axioms (mostly) reducible to inclusion (⊑) –C ≡ D iff both C ⊑ D and D ⊑ C Obvious FOL equivalences –E.g., C ≡ D ⟺ ∀x.C(x) ↔ D(x), C ⊑ D ⟺ ∀x.C(x) → D(x)
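As a worked instance of these equivalences, the following applies the standard first-order translation to the Man definition from the DL Architecture slide and to an inclusion built from Happy-Father. The second axiom is a weakened, illustrative consequence of the Happy-Father definition rather than an axiom stated in the talk.

```latex
\begin{align*}
\text{Man} \equiv \text{Human} \sqcap \text{Male}
  &\iff \forall x.\,\bigl(\text{Man}(x) \leftrightarrow
        \text{Human}(x) \land \text{Male}(x)\bigr) \\
\text{Happy-Father} \sqsubseteq \exists\,\text{has-child}.\text{Female}
  &\iff \forall x.\,\bigl(\text{Happy-Father}(x) \rightarrow
        \exists y.\,(\text{has-child}(x,y) \land \text{Female}(y))\bigr)
\end{align*}
```

Classes become unary predicates, roles become binary predicates, and each ∃ or ∀ restriction introduces one fresh variable for the role successor.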
Reasoning with OWL
OWL and Description Logic OWL DL corresponds to SHOIN(Dₙ) Description Logic –Provides well defined semantics –Formal properties well understood (complexity, decidability) –Facilitates provision of reasoning services (using DL systems) Why do we want/need reasoning services for the Semantic Web?
Philosophical Reasons Semantic Web aims at “machine understanding” Understanding closely related to reasoning –Recognising semantic similarity in spite of syntactic differences –Drawing conclusions that are not explicitly stated
Practical Reasons Given key role of ontologies in e-Science and Semantic Web, it is essential to provide tools and services to help users: –Design and maintain high quality ontologies, e.g.: Meaningful: all named classes can have instances Correct: captured intuitions of domain experts Minimally redundant: no unintended synonyms Richly axiomatised: (sufficiently) detailed descriptions –Store (large numbers of) instances of ontology classes, e.g.: Annotations from web pages (or gene product data) –Answer queries over ontology classes and instances, e.g.: Find more general/specific classes Retrieve annotations/pages matching a given description –Integrate and align multiple ontologies
Why Decidable Reasoning? OWL constructors/axioms restricted so reasoning is decidable Consistent with Semantic Web's layered architecture –XML provides syntax transport layer –RDF(S) provides basic relational language and simple ontological primitives –OWL provides powerful but still decidable ontology language –Further layers (e.g. SWRL) will extend OWL Will almost certainly be undecidable Facilitates provision of reasoning services –“Practical” algorithms for sound and complete reasoning –Several implemented systems –Evidence of empirical tractability
Why Sound & Complete Reasoning? Important for ontology design –Ontologists need to have complete confidence in reasoner –Otherwise they will cease to trust results –Doubting unexpected results makes reasoner useless Important for ontology deployment –Many realistic web applications will be agent ↔ agent –No human intervention to spot glitches in reasoning Incomplete reasoning might be OK in 3-valued system –But “don’t know” typically treated as “no”
Basic Inference Tasks Knowledge is correct (captures intuitions) –Does C subsume D w.r.t. ontology O? (in every model I of O, D^I ⊆ C^I) Knowledge is minimally redundant (no unintended synonyms) –Is C equivalent to D w.r.t. O? (in every model I of O, C^I = D^I) Knowledge is meaningful (classes can have instances) –Is C satisfiable w.r.t. O? (there exists some model I of O s.t. C^I ≠ ∅) Querying knowledge –Is x an instance of C w.r.t. O? (in every model I of O, x^I ∈ C^I) –Is ⟨x, y⟩ an instance of R w.r.t. O? (in every model I of O, (x^I, y^I) ∈ R^I) Above problems can be solved using highly optimised DL reasoners
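The class-level tasks above are closely connected: an inclusion C ⊑ D holds exactly when C ⊓ ¬D is unsatisfiable, and equivalence is inclusion in both directions. The sketch below illustrates this for a deliberately tiny fragment, namely role-free concepts (only ⊓, ⊔, ¬) with an empty ontology, where satisfiability can be decided by enumerating truth assignments. The tuple encoding and all function names are assumptions for illustration; real DL reasoners use tableau algorithms rather than enumeration.

```python
from itertools import product

def atoms(concept):
    """Collect the atomic class names occurring in a role-free concept."""
    if isinstance(concept, str):
        return {concept}
    return set().union(*(atoms(arg) for arg in concept[1:]))

def holds(concept, assignment):
    """Evaluate a concept for one individual, described by a truth assignment."""
    if isinstance(concept, str):
        return assignment[concept]
    op = concept[0]
    if op == "not":
        return not holds(concept[1], assignment)
    if op == "and":
        return holds(concept[1], assignment) and holds(concept[2], assignment)
    if op == "or":
        return holds(concept[1], assignment) or holds(concept[2], assignment)
    raise ValueError(f"unsupported constructor: {op!r}")

def satisfiable(concept):
    """True if some individual could belong to the concept."""
    names = sorted(atoms(concept))
    return any(holds(concept, dict(zip(names, values)))
               for values in product([False, True], repeat=len(names)))

def is_subsumed_by(c, d):
    """C ⊑ D holds iff C ⊓ ¬D is unsatisfiable."""
    return not satisfiable(("and", c, ("not", d)))

def equivalent(c, d):
    """C ≡ D iff both C ⊑ D and D ⊑ C."""
    return is_subsumed_by(c, d) and is_subsumed_by(d, c)

man = ("and", "Human", "Male")
print(is_subsumed_by(man, "Human"))                           # True
print(is_subsumed_by("Human", man))                           # False
print(equivalent(man, ("and", "Male", "Human")))              # True
print(satisfiable(("and", "Herbivore", ("not", "Herbivore"))))  # False
```

Reducing every class-level question to satisfiability is the reason a single optimised satisfiability procedure can serve all of these tasks.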
E.g.: Reasoning Support for Ontology Design
E.g.: Reasoning Support for Instance Retrieval
DL Reasoning: Highly Optimised Implementations DL reasoning based on tableaux algorithms Naive implementation → effective non-termination Modern systems include MANY optimisations Optimised classification (compute partial ordering) –Enhanced traversal (exploits information from previous tests) –Use structural information to select classification order Optimised subsumption testing (search for models) –Normalisation and simplification of concepts –Absorption (simplification) of axioms –Dependency directed backtracking –Caching of satisfiability results and (partial) models –Heuristic ordering of propositional and modal expansion –…
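To give a feel for what a tableau-style satisfiability test does, here is a minimal sketch for the basic logic ALC with no TBox axioms, far simpler than the SHIQ systems discussed in the talk. Concepts must already be in negation normal form, using the assumed tuple encoding below; the use of `functools.lru_cache` is a crude stand-in for the "caching of satisfiability results" optimisation. Nothing here reflects how FaCT, DLP or Racer are actually implemented.

```python
from functools import lru_cache

# NNF concepts: "A", ("not", "A"), ("and", C, D), ("or", C, D),
#               ("some", role, C), ("all", role, C)

@lru_cache(maxsize=None)          # cache results for labels already tested
def satisfiable(label):
    """label: frozenset of NNF concepts that a single individual must satisfy."""
    # Clash: both A and ¬A are required of the same individual.
    for c in label:
        if isinstance(c, tuple) and c[0] == "not" and c[1] in label:
            return False
    # ⊓-rule: add both conjuncts to the label.
    for c in label:
        if isinstance(c, tuple) and c[0] == "and" and not {c[1], c[2]} <= label:
            return satisfiable(label | {c[1], c[2]})
    # ⊔-rule: branch on the disjuncts (this is where the search happens).
    for c in label:
        if (isinstance(c, tuple) and c[0] == "or"
                and c[1] not in label and c[2] not in label):
            return satisfiable(label | {c[1]}) or satisfiable(label | {c[2]})
    # ∃-rule with ∀-propagation: each ∃R.C needs an R-successor satisfying C
    # together with every D such that ∀R.D is in the label.
    for c in label:
        if isinstance(c, tuple) and c[0] == "some":
            role, filler = c[1], c[2]
            successor = {filler} | {d[2] for d in label
                                    if isinstance(d, tuple)
                                    and d[0] == "all" and d[1] == role}
            if not satisfiable(frozenset(successor)):
                return False
    return True

def concept_satisfiable(concept):
    return satisfiable(frozenset([concept]))

# ∃R.A ⊓ ∀R.¬A cannot have an instance.
print(concept_satisfiable(("and", ("some", "R", "A"), ("all", "R", ("not", "A")))))  # False
# ∃R.A ⊓ ∀R.(A ⊔ B) can.
print(concept_satisfiable(("and", ("some", "R", "A"), ("all", "R", ("or", "A", "B")))))  # True
```

Even in this toy version the ⊔-rule makes the search exponential in the worst case, which is why dependency directed backtracking, caching and ordering heuristics matter in practice.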
Research Challenges Increased expressive power –Existing DL systems implement (at most) SHIQ –OWL extends SHIQ with datatypes and nominals (SHOIN(Dₙ)) –Future (undecidable) extensions such as SWRL Scalability –Very large ontologies –Reasoning with (very large numbers of) individuals Other reasoning tasks –Querying –Matching –Least common subsumer –... Tools and Infrastructure –Support for large scale ontological engineering and deployment
Acknowledgements Thanks to the many people who I have worked with, in particular: –Dieter Fensel –Frank van Harmelen –Zhiming Pan –Peter Patel-Schneider –Alan Rector –Uli Sattler
Resources –Slides from this talk –FaCT system (open source) –OilEd (open source) –Protégé –W3C Web-Ontology (WebOnt) working group (OWL) –DL Handbook, Cambridge University Press