Semantic Web Ontologies & Data Models
CS 502 – 20030303
Carl Lagoze – Cornell University
Acknowledgements: Eric Miller, Dieter Fensel
Cornell CS 502 20020307

Components of the Semantic Web

Knowledge Representation and AI meeting the Web
  - Knowledge representation is not new
  - Formalisms
      - Semantic networks
      - Frame systems
          - Frames (objects)
          - Slots (attributes and properties)
  - Languages
      - LOOM
      - Classic
      - Cyc-L
  - Logics
      - Description Logics (DLs)

So why re-invent it for the Web?
  - Not re-invention
      - Same underlying formalisms (frames, slots, description logic)
  - But new factors
      - Lack of central control
          - Inconsistency, lies, re-interpretations, duplications
      - Continual change
          - New facts appear and existing ones are modified constantly
      - Massive scale
          - Tractability
          - Knowledge expressiveness must be limited or reasoning must be incomplete
      - Open world context
          - Contrast to most reasoning systems, which assume that anything absent from the knowledge base is not true
      - Need to build on existing standards
          - URI, XML, RDF

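The open world point can be illustrated with a minimal Python sketch. The facts and the `ex:` names are hypothetical, and the two functions are only an illustration of the two assumptions, not how any particular reasoner is implemented.

```python
# Hypothetical facts; the names are illustrative only.
facts = {("ex:alice", "ex:knows", "ex:bob")}

def holds_closed_world(facts, statement):
    """Closed world: anything absent from the knowledge base is false."""
    return statement in facts

def holds_open_world(facts, statement):
    """Open world: absence means unknown (None), not false."""
    return True if statement in facts else None

query = ("ex:alice", "ex:knows", "ex:carol")
print(holds_closed_world(facts, query))  # False
print(holds_open_world(facts, query))    # None
```

On the Web, a statement missing from one source may well be asserted by another, which is why the second reading is the safer default.
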
Why isn't XML Schema good enough?
  - XML Schema is designed primarily to describe structure and constraints on documents
      - Rich datatypes
      - Grammar for describing structure of elements (and attributes)
  - Incomplete modeling of is-a relationships
      - Top-down inheritance of attributes from superclasses to subclasses with extension and restriction is clumsy
          - E.g., student as person with a student # and age < 28
      - Bottom-up inheritance of instances to superclasses is not automatic
      - Multiple inheritance is missing

From modeling to implementation
  - Modeling primitives: Class; Slot
  [Figure: two levels, Modeling and Implementation. Modeling: ER model, OWL specification. Implementation: relational model, XML schema. Below: database, XML documents]

Brief review of RDF
  - Data model for defining machine-processable semantics of data
  - Three main object types:
      - Resources
          - Identified by a URI
      - Properties
          - Sub-class of resource
      - Statements (:s :p :o)
  - Graph model that is serializable in XML

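As a rough illustration of the statement model, a graph can be sketched in Python as a set of (subject, predicate, object) tuples. The `Statement` type, the `objects` helper, and the `example.org` resource are assumptions made for this sketch; only the Dublin Core property URIs are real identifiers.

```python
# A minimal sketch of the RDF data model: a graph is a set of
# (subject, predicate, object) statements. The example.org resource
# and the helper names are hypothetical.
from typing import NamedTuple, Set

class Statement(NamedTuple):
    s: str  # resource, identified by a URI
    p: str  # property, itself a resource
    o: str  # resource URI or literal value

DC = "http://purl.org/dc/elements/1.1/"

graph: Set[Statement] = {
    Statement("http://example.org/doc1", DC + "creator", "John Smith"),
    Statement("http://example.org/doc1", DC + "title", "An Example"),
}

def objects(graph, s, p):
    """Return every o such that the statement (s, p, o) is in the graph."""
    return {st.o for st in graph if st.s == s and st.p == p}

print(objects(graph, "http://example.org/doc1", DC + "creator"))
```

Because a graph is just a set of statements, merging two graphs is a set union, which is part of what makes the model suit decentralized data.
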
RDF Schemas
  - A building block for assembling ontology languages
  - Declaration of vocabularies
      - Classes and class hierarchies
      - Properties defined by a particular community
      - Characteristics of properties and/or constraints on corresponding values
  - Provides substructure for inferences based on existing triples
  - Schema language is an expression of basic RDF model
  - Schema Type System - Basic Types
      - Property, Class, SubClassOf, Domain, Range
      - Minimal (but extensible) at this time
  - Expressible in the RDF model and syntax

Schema Vocabularies
  - Enables communities to share machine-readable tokens and locally define human-readable labels.
  [Figure: dc:Creator with rdfs:label arcs to "Nom", "Author", and "$100 $a"]

Relationships among vocabularies
  [Figure: dc:Creator related to marc:100, ms:director, and bib:Author]

Relationships among vocabulary elements
  [Figure: ms:director rdfs:subPropertyOf dc:Creator; ms:director rdfs:label "Director"; URI:R linked to "John Smith" by both ms:director and dc:Creator]

RDF Schema: Specializing Properties
  - rdfs:subPropertyOf: allows specialization of relations
      - E.g., the property "father" is a subPropertyOf the property "parent"
  [Figure: subProperty semantics]

Sub-Property Semantics

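The sub-property rule, "if p is a subPropertyOf q and (s p o) holds, then (s q o) holds", can be sketched as a small fixpoint computation. This is an illustration only: the `ex:` names are hypothetical, and the function is not taken from any RDF toolkit.

```python
SUBPROP = "rdfs:subPropertyOf"

def infer_subproperty(triples):
    """Apply: (p subPropertyOf q) and (s p o) => (s q o), until nothing new appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        # Collect the (sub-property, super-property) pairs declared so far.
        pairs = {(p, q) for (p, pred, q) in inferred if pred == SUBPROP}
        for (s, p, o) in list(inferred):
            for (sub, sup) in pairs:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred

# Hypothetical data: "father" specializes "parent".
family = {
    ("ex:father", SUBPROP, "ex:parent"),
    ("ex:alice", "ex:father", "ex:bob"),
}
family_closure = infer_subproperty(family)
print(("ex:alice", "ex:parent", "ex:bob") in family_closure)  # True
```

Repeating the rule until a fixpoint also covers chains of sub-properties, since each derived statement can trigger the rule again.
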
Constraints on Properties
  - Force objects to be of a certain type
  - rdfs:domain
      - Restricts the type of resources that may have a specific property
  - rdfs:range
      - Restricts the type of resources that may be the value of a specific property

Inferences from Constraints
  [Figure: graph over the resources doris, betty, eve, alice, charles]

Inferences from Constraints

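Domain and range declarations can be read as inference rules: using a property implies a type for its subject (domain) and for its value (range). The sketch below assumes that reading. The property `ex:hasMother` and the classes are hypothetical, since the original figure is not available.

```python
def infer_types(triples):
    """Derive rdf:type statements from rdfs:domain and rdfs:range declarations."""
    inferred = set(triples)
    for (s, p, o) in triples:
        for (prop, pred, cls) in triples:
            if prop != p:
                continue
            if pred == "rdfs:domain":
                inferred.add((s, "rdf:type", cls))  # subject falls in the domain
            elif pred == "rdfs:range":
                inferred.add((o, "rdf:type", cls))  # value falls in the range
    return inferred

# Hypothetical schema and instance data.
data = {
    ("ex:hasMother", "rdfs:domain", "ex:Person"),
    ("ex:hasMother", "rdfs:range", "ex:Woman"),
    ("ex:charles", "ex:hasMother", "ex:alice"),
}
typed = infer_types(data)
print(("ex:charles", "rdf:type", "ex:Person") in typed)  # True
print(("ex:alice", "rdf:type", "ex:Woman") in typed)     # True
```

Note the difference from a validating schema: nothing is rejected here. New type facts are added instead.
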
Class Hierarchy
  - rdfs:Class
      - Resources denoting a set of resources; range of rdf:type
  - rdfs:subClassOf
      - Create class hierarchy
  [Figure: class hierarchy graph with rdf:type and rdfs:subClassOf arcs among rdfs:Class nodes]

Sub-Class Inferencing

Sub-class Inferencing Example

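Since the example figure is not available, here is a hedged stand-in. It sketches the two usual sub-class rules: rdfs:subClassOf is transitive, and an instance of a class is also an instance of its superclasses. The class and instance names are hypothetical.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

def infer_subclass(triples):
    """Close a triple set under sub-class transitivity and type propagation."""
    inferred = set(triples)
    while True:
        new = set()
        for (a, p, b) in inferred:
            if p != SUBCLASS:
                continue
            for (x, q, c) in inferred:
                if q == TYPE and c == a:
                    new.add((x, TYPE, b))      # x : A and A <= B  =>  x : B
                if q == SUBCLASS and x == b:
                    new.add((a, SUBCLASS, c))  # A <= B and B <= C  =>  A <= C
        if new <= inferred:
            return inferred
        inferred |= new

# Hypothetical hierarchy: Student <= Person <= Agent.
school = {
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
    ("ex:alice", TYPE, "ex:Student"),
}
school_closure = infer_subclass(school)
print(("ex:alice", TYPE, "ex:Agent") in school_closure)  # True
```

This is the bottom-up inheritance of instances to superclasses that the XML Schema slide describes as not automatic.
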
What is an Ontology?
  - A formal specification of conceptualization shared in a community
  - Vocabulary for defining a set of things that exist in a world view
  - Formalization allows communication across application systems and extension
  - Parallel concepts in other areas:
      - Domains: database theory
      - Types: AI
      - Classes: OO systems
      - Types/Sorts: Logic
  - Global vs. domain-specific

XML and RDF are ontologically neutral
  - No standard vocabulary, just primitives
      - Resource, Class, Property, Statement, etc.
  - Compare to classic first-order logic
      - Conjunction, disjunction, implication, existential and universal quantifiers

Components of an Ontology
  - Vocabulary (concepts)
  - Structure (attributes of concepts and hierarchy)
  - Logical characteristics of concepts & attributes
      - Domain and range restrictions
      - Properties of relations (symmetry, transitivity)

WordNet
  - On-line lexical reference system, domain-independent
  - >100,000 word meanings organized in a taxonomy with semantic relationships
      - Synonymy, meronymy, hyponymy, hypernymy
  - Useful for text retrieval, etc.
  - http://www.cogsci.princeton.edu/~wn/online/

CYC
  - Effort in AI community to accommodate all of human knowledge!!!
  - Formalizes concepts with logical axioms specifying constraints on objects and classes
  - Associated reasoning tools
  - Contents are proprietary, but there is OpenCyc
      - http://www.opencyc.org/

Ontologies for the Web
  - Lots of participants and $$$
      - Web Ontology Working Group
      - DARPA Agent Markup Language (DAML)
      - Ontology Inference Layer (OIL)
      - OntoWeb
      - Schemas Project
  - OWL (Web Ontology Language): develop standard for encoding ontologies on top of RDF Schema

Extending RDF(S) with OWL
  - OWL adds:
      - Class definition: conjunction, disjunction, negation
      - Property constraints: universality, existence, cardinality
      - Properties of properties: transitivity, symmetry
  - RDF(S) provides:
      - Class, sub-class definition
      - Property (attribute), sub-property definition
      - Domain, range constraints

DAML class building operations
  - disjointWith
      - No vegetarians are carnivores
  - sameClassAs (equivalence)
  - Enumerations (on instances)
      - The Ivy League is Cornell, Harvard, Yale, ….
  - Boolean set semantics (on classes)
      - Union (logical disjunction)
      - Intersection (logical conjunction)
      - complementOf (logical negation)
          - All non-carnivores are vegetarians

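The Boolean set semantics can be made concrete by treating each class as the set of its instances. The sketch below does this over a small, hypothetical, fully enumerated universe. That is a simplification: a real ontology language reasons under an open world and does not assume all individuals are listed.

```python
# Hypothetical universe of individuals; real ontologies are not closed like this.
universe = frozenset({"cow", "lion", "rabbit", "wolf"})
carnivore = frozenset({"lion", "wolf"})
predator = frozenset({"wolf", "lion"})

def complement_of(cls):
    """complementOf (logical negation): everything not in the class."""
    return universe - cls

def disjoint_with(a, b):
    """disjointWith: the two classes share no instance."""
    return a.isdisjoint(b)

def same_class_as(a, b):
    """sameClassAs: the two classes have exactly the same instances."""
    return a == b

# "All non-carnivores are vegetarians": vegetarian is the complement class.
vegetarian = complement_of(carnivore)

print(sorted(vegetarian))                    # ['cow', 'rabbit']
print(disjoint_with(vegetarian, carnivore))  # True
print(same_class_as(carnivore, predator))    # True
print(sorted(vegetarian | carnivore))        # union: the whole universe
print(sorted(vegetarian & carnivore))        # intersection: []
```
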
DAML property building operations & restrictions
  - TransitiveProperty
      - P(x,y) and P(y,z) -> P(x,z)
  - SymmetricProperty
      - P(x,y) iff P(y,x)
  - FunctionalProperty
      - P(x,y) and P(x,z) -> y=z
  - inverseOf
      - P1(x,y) iff P2(y,x)
  - InverseFunctionalProperty
      - P(y,x) and P(z,x) -> y=z
  - Cardinality

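The rules above can be sketched over a property represented as a set of (x, y) pairs. The function names are illustrative and not part of DAML or any toolkit. The individuals are assumed to be strings so that equalities can be reported once each.

```python
def transitive_closure(pairs):
    """TransitiveProperty: P(x,y) and P(y,z) -> P(x,z), applied to a fixpoint."""
    closure = set(pairs)
    while True:
        new = {(x, z) for (x, y1) in closure for (y2, z) in closure if y1 == y2}
        if new <= closure:
            return closure
        closure |= new

def symmetric_closure(pairs):
    """SymmetricProperty: P(x,y) iff P(y,x)."""
    return set(pairs) | {(y, x) for (x, y) in pairs}

def inverse_of(pairs):
    """inverseOf: P2(y,x) holds exactly when P1(x,y) holds."""
    return {(y, x) for (x, y) in pairs}

def functional_equalities(pairs):
    """FunctionalProperty: P(x,y) and P(x,z) -> y=z; return the implied equalities."""
    return {(y, z) for (x, y) in pairs for (x2, z) in pairs if x == x2 and y < z}

ancestor = transitive_closure({("a", "b"), ("b", "c")})
print(sorted(ancestor))  # [('a', 'b'), ('a', 'c'), ('b', 'c')]
print(sorted(functional_equalities({("x", "m1"), ("x", "m2")})))  # [('m1', 'm2')]
```

A functional property does not make the second value an error. It lets a reasoner conclude that the two names denote the same individual.
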
DAML+OIL DataTypes
  - Full use of XML schema data type definitions
  - Examples
      - Define a type age that must be a non-negative integer
      - Define a type clothing size that is an enumeration "small" "medium" "large"

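The two example types can be mimicked in Python to show what each constraint accepts and rejects. This is only an analogue of the idea, not XML Schema syntax. The names `parse_age` and `ClothingSize` are hypothetical.

```python
from enum import Enum

def parse_age(text):
    """Accept only a non-negative integer, like the 'age' type described above."""
    value = int(text)  # raises ValueError for non-integers such as "10.5"
    if value < 0:
        raise ValueError("age must be non-negative: %r" % text)
    return value

class ClothingSize(Enum):
    """Enumeration type: the value must be one of the listed literals."""
    SMALL = "small"
    MEDIUM = "medium"
    LARGE = "large"

print(parse_age("57"))         # 57
print(ClothingSize("medium"))  # ClothingSize.MEDIUM

for bad in ("-3", "10.5"):
    try:
        parse_age(bad)
    except ValueError:
        print("rejected age:", bad)
```
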
DAML+OIL Instance Creation
  - Create individual objects filling in slot/attribute/property definitions

    <Person rdf:ID="WilliamArms">
      <rdfs:label>Bill</rdfs:label>
      <age><xsd:integer rdf:value="57"/></age>
      <shoesize><xsd:decimal rdf:value="10.5"/></shoesize>
    </Person>

Language Comparison
  Languages compared (columns): DTD, XSD, RDF(S), OWL
  Features compared (rows):
  - Bounded lists ("X is known to have exactly 5 children"): X
  - Cardinality constraints (Kleene operators)
  - Class expressions (unionOf, complementOf)
  - Data types
  - Enumerations
  - Equivalence (properties, classes, instances)
  - Formal semantics (model-theoretic & axiomatic)
  - Inheritance
  - Inference (transitivity, inverse)
  - Qualified constraints ("all children are of type person")
  - Reification

Some useful RDF tools
  - JENA toolkit for manipulating RDF models
      - http://www.hpl.hp.com/semweb/jena-top.html
  - RDFSViz for visualizing ontologies expressed as RDF schema
      - http://www.dfki.uni-kl.de/frodo/RDFSViz/
  - W3C RDF validation service for parsing and viewing RDF instances
      - http://www.w3.org/RDF/Validator/