Week 7: Semantic Web and Semantic Search


Week 7: Semantic Web and Semantic Search
Arantza Aldea
Ontology developed by Dr. David Sutton

Some reading
David Amerland, Google Semantic Search
W3.org, Semantic Web section
Google, How Search Works
Google Knowledge Graph
Cambridge Semantics, Semantic Search and Semantic Web
Simon Penson, The Search Engine Watch Blog

Content
How Semantic Search Works
Google Knowledge Graph
Entity Extraction and the Semantic Web
Semantic Search and Semantic Web
The role of ontologies on the Semantic Web
How to develop an ontology

Semantic Search

Summary of Semantic Search steps
Crawling
Indexing
  Indexing looks for semantic tags to get as much information as possible about what the web page is about
  Entity Extraction
  The Knowledge Graph
Extraction of the most relevant information from the index
Semantic search means:
  Auto-completion of the query
  Understanding the meaning of the query
  Ranked list of results

The Knowledge Graph

Indexing: Semantic Entity Extraction
Entity Detection
  Convert raw data in web pages into a meaningful entity.
  Sentence analysis and segmentation
  Handling synonyms
Relation Detection
  Understand the meaning of the words and put them into context
  Linked data
A Semantic Entity is created

Semantic Document Parsing
Semantic tags can be added to web pages to facilitate Semantic Entity Extraction
Semantic tools available:
  XML
  Microdata
  RDF, RDFa
  Web Ontology Language (OWL)
Google recommends rich snippets using microformats, microdata, and RDFa
  Rich snippets are a few lines of text that appear under every search result
  www.schema.org
  Microdata is an HTML5 annotation syntax that serves a similar purpose to RDFa

Query Understanding
Understand what the user wants
Extract the meaningful words from the query
Context of the query
  User profile
  Location
  Current trends
Ontologies have an important role in semantic search
  The Knowledge Graph is based on them

The Semantic Search and the Semantic Web
Semantic Search aims to understand the query and return meaningful information, taking into account:
  Context
  Location
  User profile
  Meaning of the words
The Semantic Web is a set of technologies used to represent, query and store information
Semantic search uses some Semantic Web technologies
Schema.org represents a merger between semantic search and the Semantic Web

Semantic Web technologies
Linked Data: Web of Data (RDF, RDFa)
Inference: reasoning about data through rules (OWL, Rule Interchange Format)
Vocabularies: ontologies (RDF/OWL, Turtle)
Queries: SPARQL

Ontology Development What is an ontology and how to build one

What is an Ontology?
Some definitions:
  An ontology is a data model that represents a set of concepts within a domain and the relationships between those concepts (Wikipedia).
  An ontology is a specification of a conceptualisation (Gruber 1993).
  [Ontologies are] "explicit formal specifications of the terms in the domain and relations among them" (Noy and McGuinness)

How do we create an ontology?
Various languages and development environments have been created in order to allow the formal expression of ontologies. These include:
  Frame-based systems (e.g. Protégé Frames)
  Web-based systems (e.g. OWL)
  Systems based on logic (e.g. OWL DL)
Note that these terms are not mutually exclusive. For instance, OWL DL is a web-based language that uses a particular kind of logic known as description logic.

What Does An Ontology Describe?
Concepts (classes).
Properties of concepts and relationships between concepts.
Constraints on properties and relationships.
Instances (sometimes but not always).

Semantic Languages
XML
  From HTML to XML
  XML document structure
  Relationships to other web knowledge representations
RDF
  Uniform Resource Identifiers
  RDF statements
  Representation of sets of RDF statements
  Literals, types and datatypes
  Containers
RDFS
OWL

An HTML Document
    <html>
      <head>
        <title>David Sutton's Contact Details</title>
      </head>
      <body>
        <h1>Contact Details</h1>
        <p>My name is <strong>David Sutton</strong>. My office is <i>WHE – T2.13</i>, and my telephone number is <em>+44 (0) 1865 484576</em>.</p>
      </body>
    </html>
Angle brackets <..> enclose HTML tags.

Markup For Structure
HTML marks up text in order to help the browser display a document to a human user. An HTML document can only contain predefined HTML tags.
XML is one of a series of technologies that allow us to mark up text in ways that allow it to be processed by computers. It is extensible in that you can define your own tags.

An XML Document
    <?xml version = "1.0"?>
    <contact staffid="p0012345">
      <name>David Sutton</name>
      <address>WHE–T2.13</address>
      <phone>+44 (0) 1865484576</phone>
    </contact>
Contains user-defined tags
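Because the tags are user-defined, a program can pull out individual fields by name. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name parse_contact is an assumption made for illustration and is not part of the slides.

```python
import xml.etree.ElementTree as ET

# The contact record from the slide, with a well-formed XML declaration.
CONTACT_XML = """<?xml version="1.0"?>
<contact staffid="p0012345">
  <name>David Sutton</name>
  <address>WHE–T2.13</address>
  <phone>+44 (0) 1865484576</phone>
</contact>"""


def parse_contact(xml_text: str) -> dict:
    """Return the staffid attribute and each child element's text as a dict."""
    root = ET.fromstring(xml_text)
    record = {"staffid": root.get("staffid")}
    for child in root:
        # Each user-defined tag (name, address, phone) becomes a dictionary key.
        record[child.tag] = (child.text or "").strip()
    return record


contact = parse_contact(CONTACT_XML)
print(contact["name"], "|", contact["phone"])
```

The same code works for any flat record, whatever tag names its author chose, which is the practical benefit of marking up structure rather than appearance.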

Technologies for Representing Data and Knowledge on the Web
XML: The Extensible Markup Language (XML) allows users to define their own tags to describe structured data.
RDF: The Resource Description Framework (RDF) allows data to be described in terms of entities, attributes, and relationships.
RDFS: RDF Schema (RDFS) allows entities to be arranged into classes and subclasses.
OWL: The Web Ontology Language (OWL) provides a rich language for describing properties and classes.

Valid Documents. XML provides two mechanisms for defining the structure of an XML document, i.e. what tags are allowed and what attributes and content such tags can have. A Document Type Definition (DTD) defines the structure using an EBNF grammar. An XML Schema uses XML itself to define the structure. A document is valid if, and only if, it conforms to the structure defined by its schema or DTD.

Example
DTD:
    <!-- contact.dtd -->
    <!ELEMENT contact (name)>
    <!ELEMENT name (#PCDATA)>
Document:
    <?xml version = "1.0"?>
    <!DOCTYPE contact SYSTEM "contact.dtd">
    <kontact>
      <kname>Peter Marshall </kname>
    </kontact>
The document is invalid because the DTD does not define a "kontact" tag.
N.B. This example uses a DTD rather than a schema solely because it was easier to fit it on the slide that way!
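The difference between well-formed and valid can be sketched in code. Python's standard library parser checks well-formedness only and does not validate against a DTD, so the sketch below uses a hand-written dictionary, CONTENT_MODEL, as a stand-in for contact.dtd. That dictionary and the function names are assumptions for illustration; a real validating parser would read the DTD itself.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for contact.dtd (assumption, not a real DTD reader):
# element name -> the exact sequence of child elements it must contain.
CONTENT_MODEL = {"contact": ["name"], "name": []}


def is_well_formed(xml_text: str) -> bool:
    """True if the text obeys XML syntax (properly nested, closed tags)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def is_valid(xml_text: str, root_name: str = "contact") -> bool:
    """True if the text is well-formed AND matches CONTENT_MODEL."""
    if not is_well_formed(xml_text):
        return False
    root = ET.fromstring(xml_text)
    if root.tag != root_name:
        return False

    def check(element) -> bool:
        if element.tag not in CONTENT_MODEL:
            return False
        children = list(element)
        if [c.tag for c in children] != CONTENT_MODEL[element.tag]:
            return False
        return all(check(c) for c in children)

    return check(root)


good = "<contact><name>Peter Marshall</name></contact>"
bad = "<kontact><kname>Peter Marshall</kname></kontact>"   # well-formed, not valid
broken = "<contact><name>Peter Marshall</contact>"         # not even well-formed
```

Here `bad` passes is_well_formed but fails is_valid, which mirrors the slide: the "kontact" document is syntactically fine XML yet does not conform to the declared structure.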

From XML to RDF XML allows us to make statements about entities and their properties. It would be useful to be able to: Indicate that two separate statements are describing the same entity. Use properties to describe relationships between entities. Make statements about properties themselves, e.g. give them natural language descriptions. The Resource Description Framework (RDF) provides us with mechanisms for doing these things.
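A minimal sketch of the three capabilities listed above, modelling RDF statements as plain (subject, predicate, object) tuples in Python. The http://example.org/ namespace, the property names, and the match helper are all hypothetical and chosen only for illustration; this is not an RDF library.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


EX = "http://example.org/"  # hypothetical namespace
SUTTON = EX + "staff/p0012345"

graph = {
    # Two separate statements share the same subject URI, so they are
    # known to describe the same entity.
    Triple(SUTTON, EX + "name", "David Sutton"),
    Triple(SUTTON, EX + "office", "WHE-T2.13"),
    # A property whose object is another entity expresses a relationship.
    Triple(EX + "ontology/monsters", EX + "developedBy", SUTTON),
    # A property is itself a resource, so we can make statements about it.
    Triple(EX + "name", EX + "description", "The full name of a person"),
}


def match(graph, subject: Optional[str] = None,
          predicate: Optional[str] = None, obj: Optional[str] = None):
    """Return all triples matching the given pattern (None = wildcard)."""
    return sorted(
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


about_sutton = match(graph, subject=SUTTON)
pointing_at_sutton = match(graph, obj=SUTTON)
```

The pattern-matching function is a very reduced version of what a query language such as SPARQL does over a set of triples.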

RDF
RDFa can be used to describe (add meaning to) specific (HTML) information on the web.
RDFa is added to web pages in order to make them understandable for computers as well as people.
With RDFa, browsers, search engines, and other software can understand more about the pages, and thus deliver better results.
RDF is used to talk about entities and their properties: persons, products, events.
RDFa relies on a vocabulary (for example, one defined with RDF Schema) that defines the entities and their properties.
To indicate that a part of a web page is an instance of an entity, attributes are added to HTML tags such as span and div.

RDF example
    <div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
      My name is <span property="v:name">Muhammad Younas</span>
      People call me <span property="v:lastname">Younas</span>
      Here is my homepage:
      <a href="http://cms.brookes.ac.uk/staff/MYounas" rel="v:url">http://cms.brookes.ac.uk/staff/MYounas</a>
      I live in Oxford, Oxfordshire, and work as a <span property="v:title">lecturer</span>
      at <span property="v:affiliation">Oxford Brookes University</span>
    </div>
Extracted from week 5 slides
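To illustrate how software might read such annotations, here is a minimal sketch that collects the property and rel attributes from a shortened copy of the snippet using Python's standard html.parser module. The class name RDFaPropertyParser is an assumption for illustration, and this is not a conformant RDFa processor: it ignores typeof, prefix resolution, and nested markup.

```python
from html.parser import HTMLParser


class RDFaPropertyParser(HTMLParser):
    """Collect property="..." text values and rel="..." link targets."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # property whose text we are currently reading

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "property" in attrs:
            self._current = attrs["property"]
            self.properties[self._current] = ""
        elif "rel" in attrs and "href" in attrs:
            # For links, the value is the target URL rather than the text.
            self.properties[attrs["rel"]] = attrs["href"].strip()

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        self._current = None


SAMPLE = """<div xmlns:v="http://rdf.data-vocabulary.org/#" typeof="v:Person">
  My name is <span property="v:name">Muhammad Younas</span>
  Here is my homepage:
  <a href="http://cms.brookes.ac.uk/staff/MYounas" rel="v:url">homepage</a>
  I work as a <span property="v:title">lecturer</span>
</div>"""

parser = RDFaPropertyParser()
parser.feed(SAMPLE)
parser.close()
```

After parsing, parser.properties maps each annotated property to its value, which is the kind of structured record a search engine can build from an otherwise free-text page.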

From RDF to RDF Schema
RDF allows us to represent resources and their properties. What we cannot do yet is to:
  Group resources into classes.
  Specify the properties that we expect resources of a given class to have.
  Arrange classes into inheritance hierarchies.
These facilities are provided by RDF Schema (RDFS).

Properties
A property of a class is described by a resource whose type is the predefined URI rdf:Property.
To indicate that a property applies to a particular class we use the predicate rdfs:domain.
To indicate that the values of a property are instances of a particular class or datatype, we use the predicate rdfs:range.
Diagram (as triples):
  exs:Person rdf:type rdfs:Class
  exs:Book rdf:type rdfs:Class
  exs:author rdf:type rdf:Property
  exs:author rdfs:domain exs:Book
  exs:author rdfs:range exs:Person

Subclasses
We use the predicate rdfs:subClassOf to indicate that one class is a subclass of another.
Diagram (as triples):
  exs:Vehicle rdf:type rdfs:Class
  exs:Car rdf:type rdfs:Class
  exs:Bus rdf:type rdfs:Class
  exs:Car rdfs:subClassOf exs:Vehicle
  exs:Bus rdfs:subClassOf exs:Vehicle
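Because rdfs:subClassOf is transitive, an application can work out every superclass of a class by following the links repeatedly. A minimal sketch in Python, assuming the Vehicle hierarchy from the slide plus one hypothetical extra class, exs:SportsCar, added only to show a two-step chain:

```python
# (subclass, superclass) pairs; exs:SportsCar is a hypothetical addition.
SUBCLASS_OF = {
    ("exs:Car", "exs:Vehicle"),
    ("exs:Bus", "exs:Vehicle"),
    ("exs:SportsCar", "exs:Car"),
}


def superclasses(cls: str, subclass_of=SUBCLASS_OF) -> set:
    """Return every class reachable from cls via rdfs:subClassOf links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_of:
            if sub == current and sup not in found:
                found.add(sup)
                frontier.append(sup)
    return found


sports_car_supers = superclasses("exs:SportsCar")
```

An instance of exs:SportsCar can therefore be treated as an instance of exs:Car and of exs:Vehicle, even though only the direct link was stated.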

Interpreting RDF Schema
Although RDF Schema resembles the type systems of object-oriented programming languages, there are some important differences. The most important of these is that OO class definitions are interpreted as constraints on objects, whereas an RDF schema can be interpreted more freely.
For instance, suppose that an RDF Schema defines the range of property exs:author to be the class exs:Person. An application processing this RDF can interpret this as either:
  A constraint: the author of a book must be explicitly declared to be a Person.
  A rule of inference: the application will infer that the author of a book is a Person, even if no explicit statement to that effect has been made.
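The two readings can be contrasted in a small sketch. Triples are plain Python tuples; the individuals exs:book1 and exs:alice and both function names are hypothetical, introduced only for illustration.

```python
RDF_TYPE = "rdf:type"
RDFS_RANGE = "rdfs:range"

schema = {("exs:author", RDFS_RANGE, "exs:Person")}
data = {("exs:book1", "exs:author", "exs:alice")}  # hypothetical individuals


def range_violations(schema, data):
    """Constraint reading: report values not explicitly typed with the range class."""
    ranges = {s: o for s, p, o in schema if p == RDFS_RANGE}
    return sorted(
        (s, p, o) for s, p, o in data
        if p in ranges and (o, RDF_TYPE, ranges[p]) not in data
    )


def infer_types_from_range(schema, data):
    """Inference reading: add the rdf:type statement implied by the range."""
    ranges = {s: o for s, p, o in schema if p == RDFS_RANGE}
    inferred = set(data)
    for s, p, o in data:
        if p in ranges:
            inferred.add((o, RDF_TYPE, ranges[p]))
    return inferred


violations = range_violations(schema, data)
enriched = infer_types_from_range(schema, data)
```

Under the constraint reading the data is rejected because exs:alice was never declared a Person; under the inference reading the same data is accepted and gains one new statement.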

OWL: The Web Ontology Language
RDFS and RDF allow us to define classes, properties, and instances. However, it would be useful to be able to make more complex statements about classes and properties, and draw inferences from these statements. Examples:
  Restrict the cardinality of a property: e.g. "An instance of class Person has a mother property with exactly one value".
  State that one property is the inverse of another: e.g. "if X is the parent of Y then Y is the child of X".
  Indicate transitivity of properties: e.g. "if X is the ancestor of Y and Y is the ancestor of Z then X is the ancestor of Z".
The Web Ontology Language (OWL) defines RDF resources that allow us to make such statements, and defines the inferences that can be drawn from them.
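The inverse and transitivity examples amount to simple inference rules. The sketch below applies them by hand to a small set of tuples; the property names parentOf, childOf and ancestorOf and both function names are assumptions for illustration, and a real OWL reasoner does considerably more than this.

```python
def apply_inverse(facts, prop, inverse):
    """If (x prop y) holds, add (y inverse x), and vice versa."""
    result = set(facts)
    for s, p, o in facts:
        if p == prop:
            result.add((o, inverse, s))
        elif p == inverse:
            result.add((o, prop, s))
    return result


def apply_transitive(facts, prop):
    """Repeat 'x prop y and y prop z implies x prop z' until nothing new appears."""
    result = set(facts)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for s, p, o in result if p == prop]
        for a, b in pairs:
            for c, d in pairs:
                if b == c and (a, prop, d) not in result:
                    result.add((a, prop, d))
                    changed = True
    return result


facts = {
    ("X", "parentOf", "Y"),
    ("X", "ancestorOf", "Y"),
    ("Y", "ancestorOf", "Z"),
}

with_inverse = apply_inverse(facts, "parentOf", "childOf")
with_transitive = apply_transitive(facts, "ancestorOf")
```

In OWL the ontology author only declares that the property is transitive or has an inverse; deriving the extra statements is left to the reasoning software.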

Species of OWL
OWL comes in three varieties.
  OWL Full: Any set of RDF statements can be interpreted as an OWL Full ontology. However, there is no guarantee that any reasoning software will be able to work out all the inferences that can be drawn from an OWL Full ontology.
  OWL DL: Only some RDF statements are allowed in an OWL DL ontology. These restrictions mean that in principle it is possible to construct reasoning software that correctly processes all inferences of the ontology.
  OWL Lite: An entry-level version of OWL that provides simple classification and constraint facilities.

Which to use?
OWL Lite vs OWL DL
  Both can have full reasoning support, so the question is whether you need the more expressive constructs provided by OWL DL.
OWL DL vs OWL Full
  If you require full reasoning support, choose OWL DL.
  If you require the meta-modelling facilities of RDF Schema (defining classes of classes, attaching properties to classes), choose OWL Full.

OWL Lite Provides…
RDF Schema features:
  Class (Thing, Nothing)
  rdfs:subClassOf
  rdf:Property
  rdfs:subPropertyOf
  rdfs:domain
  rdfs:range
  Individual

Class
A class defines a group of individuals that belong together because they share some properties.
There is a built-in most general class: Thing.
There is a built-in most specific class: Nothing.
Here we define some root classes (owl:Class is a subclass of rdfs:Class).
OWL/XML:
    <Declaration>
      <Class IRI="#Monster"/>
    </Declaration>
RDF/XML:
    <owl:Class rdf:about="http://www.brookes.ac.uk/p0073862/Ontology1.owl#Monster">
    </owl:Class>

subClassOf
Used to create hierarchies.
OWL/XML:
    <SubClassOf>
      <Class IRI="#Dragon"/>
      <Class IRI="#Monster"/>
    </SubClassOf>
RDF/XML:
    <owl:Class rdf:about="http://www.brookes.ac.uk/p0073862/Ontology1.owl#Humanoid">
      <rdfs:subClassOf rdf:resource="http://www.brookes.ac.uk/p0073862/Ontology1.owl#Monster"/>
    </owl:Class>

Defining Classes in Protégé

Disjoint Classes In OWL classes are not assumed to be disjoint. In other words a given individual can be a member of two classes that are not related by inheritance (e.g. it could be both a Dragon and a Humanoid). We can make them disjoint by using the Disjoint widget in Protégé to add disjointWith statements.
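The effect of a disjointWith statement can be sketched as a consistency check. The individuals smaug and draco and the function name below are hypothetical; a real reasoner would report such an ontology as inconsistent rather than return a list.

```python
def disjointness_violations(types, disjoint_pairs):
    """Return (individual, class_a, class_b) for members of two disjoint classes."""
    violations = []
    for individual, classes in sorted(types.items()):
        for a, b in disjoint_pairs:
            if a in classes and b in classes:
                violations.append((individual, a, b))
    return violations


# Hypothetical individuals and their asserted classes.
types = {
    "smaug": {"Dragon"},
    "draco": {"Dragon", "Humanoid"},
}

# Without a disjointness statement nothing is wrong with draco.
before = disjointness_violations(types, [])
# After declaring Dragon disjointWith Humanoid, draco becomes a contradiction.
after = disjointness_violations(types, [("Dragon", "Humanoid")])
```

This shows why the statement has to be added explicitly: until it is, belonging to both classes is perfectly acceptable.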

Disjoint Classes Disjoints Widget

Disjoint Classes
OWL/XML:
    <DisjointClasses>
      <Class IRI="#Dragon"/>
      <Class IRI="#Humanoid"/>
    </DisjointClasses>
RDF/XML:
    <owl:Class rdf:about="http://www.brookes.ac.uk/p0073862/Ontology1.owl#Dragon">
      <rdfs:subClassOf rdf:resource="http://www.brookes.ac.uk/p0073862/Ontology1.owl#Monster"/>
      <owl:disjointWith rdf:resource="http://www.brookes.ac.uk/p0073862/Ontology1.owl#Humanoid"/>
    </owl:Class>

Properties
Two main types of property:
  Object properties link an individual to another individual.
  Datatype properties link an individual to a literal value (e.g. an integer or a string).
    <Declaration>
      <ObjectProperty IRI="#isWeapon"/>
    </Declaration>
    <Declaration>
      <DataProperty IRI="#hasIntelligence"/>
    </Declaration>

Properties

Characteristics of Properties
Properties in OWL can have a variety of different characteristics:
  A property can have an inverse (as in frame-based ontologies).
  A property can be functional, that is to say it can only have one value for a given individual. For instance, a student can have only one surname.
  A property can be inverse functional, that is to say that its inverse is functional.
  A property can be transitive, in the sense that if A is related to B and B is related to C then A is related to C. For example, if Babs is the sister of Joy, and Teddie is the sister of Babs, then Teddie is also the sister of Joy.
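As a rough illustration of what "functional" means, the sketch below finds individuals that have more than one distinct value for a property. The property name hasSurname, the student identifiers, and the function name are hypothetical. Note the simplification: for a functional object property, an OWL reasoner would normally conclude that the two values denote the same individual rather than flag an error, so this check is closest to the situation with distinct literal values.

```python
from collections import defaultdict


def functional_violations(facts, prop):
    """Map each subject to its values for prop when it has more than one."""
    values = defaultdict(set)
    for s, p, o in facts:
        if p == prop:
            values[s].add(o)
    return {s: v for s, v in values.items() if len(v) > 1}


# Hypothetical student records.
facts = {
    ("student1", "hasSurname", "Jones"),
    ("student2", "hasSurname", "Smith"),
    ("student2", "hasSurname", "Smyth"),
}

conflicts = functional_violations(facts, "hasSurname")
```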

Characteristics of Properties

Characteristics of Properties
    <owl:ObjectProperty rdf:ID="hasLair">
      <owl:inverseOf rdf:resource="#isLairOf"/>
    </owl:ObjectProperty>
    <owl:ObjectProperty rdf:ID="hasSibling">
      <rdf:type rdf:resource="&owl;TransitiveProperty"/>
    </owl:ObjectProperty>

Property Restrictions We can place restrictions on the individuals that belong to a class by imposing conditions on the values of their properties. These restrictions include: Existential restrictions, where we insist that a certain property must have at least one value of a certain kind. Universal restrictions, where we insist that all the values of a certain property must be of a certain kind.

Existential Restrictions
We can impose a restriction that all dragons must have at least one lair by:
  Creating an anonymous class consisting of all individuals with at least one lair, and
  Imposing a restriction that the named class Dragon is a subclass of this anonymous class.

Existential Restrictions

Existential Restrictions
    <owl:Class rdf:ID="Dragon">
      <rdfs:subClassOf rdf:resource="#Monster"/>
      <rdfs:subClassOf>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#hasLair"/>
          <owl:someValuesFrom rdf:resource="#Lair"/>
        </owl:Restriction>
      </rdfs:subClassOf>
      <owl:disjointWith rdf:resource="#Humanoid"/>
    </owl:Class>

Universal Restrictions We can create a class Warrior, and impose a restriction that the only possessions that a Warrior may have are his weapons by Creating an anonymous class consisting of all individuals whose only possessions are weapons, and Making the Warrior class a subclass of this anonymous class.
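The difference between existential (someValuesFrom) and universal (allValuesFrom) restrictions can be sketched as two membership tests over plain tuples. All individuals and the function names are hypothetical. The check is deliberately closed-world, treating the listed facts as complete; OWL reasoners use an open-world assumption and would not conclude that a dragon lacks a lair merely because none is stated.

```python
def satisfies_some_values_from(individual, prop, cls, facts, types):
    """Existential: at least one value of prop belongs to cls."""
    return any(
        s == individual and p == prop and cls in types.get(o, set())
        for s, p, o in facts
    )


def satisfies_all_values_from(individual, prop, cls, facts, types):
    """Universal: every value of prop belongs to cls (true if there are none)."""
    return all(
        cls in types.get(o, set())
        for s, p, o in facts
        if s == individual and p == prop
    )


# Hypothetical individuals.
facts = {
    ("smaug", "hasLair", "lonelyMountain"),
    ("conan", "hasPossession", "sword"),
    ("bilbo", "hasPossession", "ring"),
}
types = {
    "lonelyMountain": {"Lair"},
    "sword": {"Weapon"},
    "ring": {"Jewellery"},
}

smaug_has_lair = satisfies_some_values_from("smaug", "hasLair", "Lair", facts, types)
conan_only_weapons = satisfies_all_values_from("conan", "hasPossession", "Weapon", facts, types)
bilbo_only_weapons = satisfies_all_values_from("bilbo", "hasPossession", "Weapon", facts, types)
```

Note that the universal test is also satisfied by an individual with no possessions at all, which is why a universal restriction alone does not guarantee that a Warrior owns any weapon.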

Universal Restrictions Superclasses of Warrior

Universal Restrictions
    <owl:Class rdf:ID="Warrior">
      <rdfs:subClassOf rdf:resource="#Humanoid"/>
      <rdfs:subClassOf>
        <owl:Restriction>
          <owl:onProperty rdf:resource="#hasPossession"/>
          <owl:allValuesFrom rdf:resource="#Weapon"/>
        </owl:Restriction>
      </rdfs:subClassOf>
    </owl:Class>

Summary of XML XML can be used to mark up documents in ways that reveal their structure to applications that process them. XML documents are structured by dividing them into elements. Each element begins with a start tag and ends with an end tag. The start tag of an element may define values for attributes of that element. An XML document contains a single root element. An XML document is well-formed if it conforms to the syntax of XML. It is valid if it uses tags in ways that are allowed by an associated DTD or Schema. XSLT stylesheets may be used to convert XML documents into other forms (e.g. into HTML documents). RDF, RDFS, and OWL are technologies that use XML to transmit information over the Web in ways that allow this information to be processed by computers.

Summary of RDF An RDF document contains a set of statements. Each statement has a subject, a predicate, and an object. Statements can be represented as graphs, as triples, and in XML. RDF uses URIs to identify entities (which it refers to as “resources”). The object of an RDF statement can be a URI, a typed or plain literal, or a container. A container can be a Bag, a Seq, or an Alt.

Summary of RDFS and OWL
RDFS allows us to define classes of resources, to indicate subclass and instance relationships, and to describe properties of classes.
OWL allows us to make statements about classes from which reasoning software may draw inferences.
OWL comes in three varieties: OWL Full, OWL DL, and OWL Lite.