Web-Technology Lecture 13
Pre-quiz
Who knows what the Semantic Web is? Who knows what R, D and F stand for in RDF? Who knows why we will mention Dublin today? Who knows what an owl has to do with any of this?
Semantic Web
A Very General Motivation
Every language has its own syntax and semantics. Syntax is the study of grammar. It defines how to structure a message: how does one say something? Semantics is the study of meaning. It defines how to interpret a message: how does one understand what is said? Syntax vs. semantics: different syntaxes can have the same semantics, e.g. x += y and x = x + y.
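
As a small illustration of "different syntax, same semantics", the sketch below writes the slide's two statements as Python functions and compares their results. The function names are hypothetical, and the comparison is only meant for plain numbers.

```python
def add_augmented(x, y):
    # Augmented-assignment syntax.
    x += y
    return x


def add_explicit(x, y):
    # Explicit-assignment syntax.
    x = x + y
    return x


# Two different spellings, one meaning (for numbers).
same_meaning = add_augmented(2, 3) == add_explicit(2, 3)
```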
What does it have to do with WebTech?
Both syntax and semantics help to communicate information. The Web is the largest information system. The syntax of information on the Web is defined by… HTML. Where is its semantics? It does not have its own. By default, it did not need any: the WWW was created to be consumed by humans. Humans read and write in natural languages; they understand the texts and the links. Computers, not so much.
Can we make computers understand the Web?
"The Semantic Web" (2001), by Tim Berners-Lee, James Hendler and Ora Lassila. Not the first piece about the term, but the most influential. Main idea: information on the Web can be given meaning. This will allow computers (agents) to understand it and communicate with each other based on this information. This will allow many online activities to be automated by giving computers complex tasks and delegating the step-by-step execution of these tasks to them. A kind of distributed Artificial Intelligence. Where would the old Web go? Nowhere. The Semantic Web is not a substitute or an update, but an extension.
How does one build the Semantic Web?
One does not… not alone. Building semantic technologies involves lots of work and knowledge. SW is an evolution, not a revolution. Yet, it involves: many public data sets… …provided with metadata… …with values interlinked… …and elements defined by common ontologies… …and represented using open W3C SW standards.
Semantic Web Technologies
With an example. [Figure labels: Title, Author, Publisher, Genre]
RDF
RDF stands for: Resource: pages, dogs, ideas... everything that can have a URI. Description: attributes, features, and relations of the resources. Framework: model, languages and syntaxes for these descriptions. RDF statements expressing knowledge are triples: every piece of knowledge is broken down into (subject : predicate : object), e.g. Jane-Eyre : has-an-author : Charlotte-Bronte. RDF is a graph model that links descriptions of resources together into networks. Subjects and objects are nodes, predicates are links, e.g. Jane-Eyre : has-a-publisher : Penguin books and Jane-Eyre : has-an-author : Charlotte-Bronte.
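
The triple model can be sketched in plain Python without any RDF library. This is only an illustration: the identifiers are the informal names from the slide (with "Penguin-books" hyphenated as an assumption), not real URIs, and the helper names `objects` and `outgoing_edges` are hypothetical.

```python
from collections import defaultdict

# Each statement is a (subject, predicate, object) triple;
# a graph is simply a set of such triples.
triples = {
    ("Jane-Eyre", "has-an-author", "Charlotte-Bronte"),
    ("Jane-Eyre", "has-a-publisher", "Penguin-books"),
}


def objects(graph, subject, predicate):
    """Return all objects linked to `subject` through `predicate`."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


def outgoing_edges(graph):
    """View the triples as a graph: node -> list of (link, target node)."""
    edges = defaultdict(list)
    for s, p, o in sorted(graph):
        edges[s].append((p, o))
    return dict(edges)


authors = objects(triples, "Jane-Eyre", "has-an-author")
edges = outgoing_edges(triples)
```

Here `authors` holds the objects of the has-an-author statements, and `edges` shows the same statements as labelled links leaving the Jane-Eyre node.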
RDF (cont.)
Resources are identified by their URIs (IRIs). The subject is a URI or a blank node. The predicate is a URI. The object is a URI, a blank node or a literal. An RDF model has a unique namespace. URIs can be written relative to the namespace where they are defined: http://purl.org/dc/elements/1.1/creator => dc:creator. The RDF specification defines the rules for creating RDF graphs and datasets, as well as the basic RDF vocabulary (revised in 2014): https://www.w3.org/TR/rdf11-concepts/
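
The namespace shorthand can be sketched as a simple prefix table. The helper names `expand` and `shorten` are hypothetical; the namespace URIs for dc, rdf and rdfs are the standard ones.

```python
# Prefix -> namespace URI.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(name, prefixes=PREFIXES):
    """Turn a prefixed name such as 'dc:creator' into a full URI."""
    prefix, _, local = name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"unknown prefix: {prefix}")
    return prefixes[prefix] + local


def shorten(uri, prefixes=PREFIXES):
    """Turn a full URI into a prefixed name if a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # no known namespace: leave the URI untouched


full_uri = expand("dc:creator")
short_name = shorten("http://purl.org/dc/elements/1.1/creator")
```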
RDF-Schema (RDFS)
RDF-Schema extends RDF with a basic vocabulary. It allows defining new properties (predicates) and classes (types of resources), and provides basic means for defining and controlling the semantics of RDF models: rdf:type, rdfs:subClassOf, rdfs:subPropertyOf, rdfs:domain, rdfs:range.
Example graph:
http://www.gutenberg.org/files/1260/1260-h/1260-h.htm my:has-title “Jane Eyre”
http://www.gutenberg.org/files/1260/1260-h/1260-h.htm my:has-publisher http://www.penguin.com/
http://www.gutenberg.org/files/1260/1260-h/1260-h.htm my:has-author https://en.wikipedia.org/wiki/Charlotte_Bront%C3%AB
http://www.gutenberg.org/files/1260/1260-h/1260-h.htm rdf:type my:Book
https://en.wikipedia.org/wiki/Charlotte_Bront%C3%AB rdf:type my:Person
my:Book rdfs:subClassOf my:CreativeWork
https://www.w3.org/TR/rdf-schema/
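
One thing RDFS semantics buys is type inference: if a resource has rdf:type C and C is rdfs:subClassOf D, the resource also has rdf:type D. The sketch below applies that single rule to the slide's example in plain Python. It is a toy, not a full RDFS reasoner, and the function name `infer_types` is hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

BOOK = "http://www.gutenberg.org/files/1260/1260-h/1260-h.htm"
AUTHOR = "https://en.wikipedia.org/wiki/Charlotte_Bront%C3%AB"

graph = {
    (BOOK, "my:has-title", "Jane Eyre"),
    (BOOK, "my:has-author", AUTHOR),
    (BOOK, RDF_TYPE, "my:Book"),
    (AUTHOR, RDF_TYPE, "my:Person"),
    ("my:Book", SUBCLASS_OF, "my:CreativeWork"),
}


def infer_types(triples):
    """Add (x rdf:type D) whenever (x rdf:type C) and (C rdfs:subClassOf D)."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple appears
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for c, q, d in list(inferred):
                new = (s, RDF_TYPE, d)
                if q == SUBCLASS_OF and c == o and new not in inferred:
                    inferred.add(new)
                    changed = True
    return inferred


closure = infer_types(graph)
book_is_creative_work = (BOOK, RDF_TYPE, "my:CreativeWork") in closure
```

The statement that the book is a my:CreativeWork is never asserted in `graph`; it only appears in `closure` because of the subclass relation.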
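Standard RDF Vocabularies
DC and DCterms (Dublin Core): cataloguing digital resources. FOAF (Friend Of A Friend): linking people on the Web. vCard: information about people and organisations. geo: geographical information. …
Example graph:
http://www.gutenberg.org/files/1260/1260-h/1260-h.htm dc:title “Jane Eyre”
http://www.gutenberg.org/files/1260/1260-h/1260-h.htm dc:publisher http://www.penguin.com/
http://www.gutenberg.org/files/1260/1260-h/1260-h.htm dc:creator https://en.wikipedia.org/wiki/Charlotte_Bront%C3%AB
http://www.gutenberg.org/files/1260/1260-h/1260-h.htm rdf:type my:Book
https://en.wikipedia.org/wiki/Charlotte_Bront%C3%AB rdf:type foaf:Person
my:Book rdfs:subClassOf my:CreativeWork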
Ontologies
In Philosophy: “Ontology is the study of being or existence. It seeks to describe the basic categories and relationships existing in reality.” In AI and CS: “A formal ontology is an explicit intentional specification of a shared conceptualization.” In the Semantic Web: “A Web ontology is a document that formally defines the relations among terms in a sharable format.”
Ontologies
[Figure: example ontology. Classes: Animal, Plant, Carnivore, Herbivore, Lion, Crocodile, Antelope, Tree. Relations: subClassOf, not subClassOf, eat. One relation involving Crocodile and Lion is marked with "?".]
Web Ontology Language (OWL)
OWL is the main language for ontologies on the SW. It is built on top of RDF, RDF Schema and the RDF/XML syntax. It helps to define: class and property hierarchies; instances; axioms / constraints. It is based on formal description logic, which means: a proper OWL ontology does not have logical conflicts; new knowledge can be safely derived through formal inference and by querying the ontology.
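
To give a feel for what "no logical conflicts" means, the toy sketch below borrows the animal classes from the Ontologies slide. It is plain Python, not OWL and not a description-logic reasoner. The exact subclass links and the two disjointness axioms (the idea behind owl:disjointWith) are assumptions made for illustration, as are the function names.

```python
# Assumed class hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "Lion": "Carnivore",
    "Crocodile": "Carnivore",
    "Antelope": "Herbivore",
    "Carnivore": "Animal",
    "Herbivore": "Animal",
    "Tree": "Plant",
}

# Assumed axioms: pairs of classes that may not share an instance.
DISJOINT = {
    frozenset({"Carnivore", "Herbivore"}),
    frozenset({"Animal", "Plant"}),
}


def ancestors(cls):
    """Derive every class an instance of `cls` also belongs to."""
    result = {cls}
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        result.add(cls)
    return result


def conflicts(asserted_types):
    """Return the disjoint pairs violated by one individual's asserted types."""
    all_types = set()
    for cls in asserted_types:
        all_types |= ancestors(cls)
    return sorted(tuple(sorted(pair)) for pair in DISJOINT if pair <= all_types)


ok = conflicts({"Lion"})                # a consistent individual
clash = conflicts({"Lion", "Antelope"})  # typed as both carnivore and herbivore
```

The first step (`ancestors`) is inference: new class memberships are derived from the hierarchy. The second step (`conflicts`) is a consistency check against the assumed axioms.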
Criticism of Semantic Web
Practical feasibility: HTML is easy; the Semantic Web is way too complicated. Censorship and privacy: a semantic layer of Web information makes it easier for governments to discover knowledge and exert control. Doubling output formats: information has to be presented in both a regular and a semantic way.
The current state of the Web
The current state of Semantic Web
Linked Data Deployment
Ontological Agreement
RDF Links
Open Linked Data
HTML-embedded Data
Schema.org
Work started in August 2010. Google, Yahoo!, Microsoft & then Yandex. Goals: one vocabulary understood by all the search engines; make it very easy for the webmaster. It is A vocabulary, not THE vocabulary. Webmasters can use it together with other vocabs. We might not understand the other vocabs. Others might.
Principles
Simplicity: simple things should be simple (webmasters shouldn’t have to deal with N namespaces); complex things should be possible (advanced webmasters should be able to mix and match vocabularies); it has to fit in with existing workflows. Incremental: started simple (~100 categories at launch); applies to every area; add complexity after adoption (now >1200 vocab items); go back and fill in the blanks. Collaboration: partner with authoring platforms (Drupal, WordPress, Blogger, YouTube…); work with others to incorporate their vocabularies; any syntax possible (Microformats, RDFa, JSON-LD, …).
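
As a sketch of what the JSON-LD syntax looks like for a webmaster, the snippet below describes the lecture's running book example with Schema.org terms and wraps it in a script tag for embedding in an HTML page. The choice of properties and the helper name `to_script_tag` are assumptions for illustration; the types Book, Person and Organization and the properties name, author, publisher and url are part of the Schema.org vocabulary.

```python
import json

# Description of the book using Schema.org types and properties.
book = {
    "@context": "https://schema.org",
    "@type": "Book",
    "name": "Jane Eyre",
    "author": {"@type": "Person", "name": "Charlotte Brontë"},
    "publisher": {"@type": "Organization", "name": "Penguin Books"},
    "url": "http://www.gutenberg.org/files/1260/1260-h/1260-h.htm",
}


def to_script_tag(data):
    """Serialise the description as a JSON-LD block for an HTML page."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{payload}\n</script>'


snippet = to_script_tag(book)
```

The page's visible HTML stays unchanged; the `snippet` string is added alongside it, which is one way of fitting into existing workflows.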
Overall Adoption in 2015
Widely-used Classes
Adoption by E-Commerce Websites
Properties used to Describe Products
Adoption by Travel Websites
Properties used to Describe Hotels
Adoption by Job Portals
Properties used to Describe Job Postings
Cool Applications
Cool Applications
Post-quiz
Who knows what the Semantic Web is? Who knows what R, D and F stand for in RDF? Who knows why we have mentioned Dublin today? Who knows what an owl has to do with any of this?