Embedding Knowledge in HTML


Embedding Knowledge in HTML
Some content from a presentation by Ivan Herman of the W3C

Overview
- Why we want to embed structured data in HTML
- RDFa
- Microdata and schema.org
- RDFa Lite as an encoding for Microdata
- JSON-LD as an encoding for RDF and Microdata
- Use cases and examples

HTML is Everywhere
- We usually think of HTML as the language of Web pages
- But it’s also widely used on and for mobile devices and tablets
  - It readily adapts to different screen sizes and orientations
- And it is the basis of many ebook formats
  - E.g., Kindle’s formats, Mobi, EPUB, …
- How can we add knowledge to HTML pages?

Adding RDF-like data to HTML
- We would like to add semi-structured knowledge to HTML
  - Humans see and understand regular HTML content (text, images, videos, audio)
  - Machines see and understand data markup in XML, RDF or some other format
- Possibilities include
  - Add a link to a separate document with the knowledge
  - Add new HTML tags
  - Embed knowledge as comments, JavaScript, etc.
  - Distribute knowledge markup throughout the HTML as attributes of existing HTML tags

One page, not two
- Content providers prefer not to generate multiple pages: one for humans (HTML) and another for machines (RDF)
  - RDF serializations are complex
  - Two pages require separate mechanisms for storage, generation, etc.
  - Two pages introduce redundancy, which can lead to errors if we change one page but not the other
- A single page also simplifies the job of search engines

General approach
- Two embedding techniques
  - Provide or reuse tag attributes to encode the metadata; browsers and apps ignore attributes they don’t understand
  - Add a graph for the page in JSON-LD
- Several approaches have been developed
  - Microformats (~2005)
  - RDFa (~2007)
  - Microdata (with schema.org) (~2012)
- Status 2014/5 (IMHO)
  - Microformats are still used but their future is limited
  - RDFa and JSON-LD are the preferred encodings
  - Schema.org vocabularies are getting large uptake
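As a rough illustration of the second technique (a JSON-LD graph added to the page), the sketch below pulls JSON-LD out of an HTML string using only the Python standard library. The page content, the person and the organisation are invented for this example, and the class name `JsonLdExtractor` is hypothetical. It is not a full JSON-LD processor: it only parses the JSON and does no context expansion.

```python
import json
from html.parser import HTMLParser

# Hypothetical page; the names below are made up for illustration.
PAGE = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org",
 "@type": "Person",
 "name": "Alice Example",
 "worksFor": {"@type": "Organization", "name": "Example Corp"}}
</script>
</head><body><p>Alice works at Example Corp.</p></body></html>
"""

class JsonLdExtractor(HTMLParser):
    """Collects the parsed content of every application/ld+json script block."""

    def __init__(self):
        super().__init__()
        self.in_jsonld = False
        self.buffer = []
        self.graphs = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_jsonld = True
            self.buffer = []

    def handle_data(self, data):
        if self.in_jsonld:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.in_jsonld:
            # Parse as plain JSON; no JSON-LD expansion is attempted here.
            self.graphs.append(json.loads("".join(self.buffer)))
            self.in_jsonld = False

def extract_jsonld(html):
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.graphs

graphs = extract_jsonld(PAGE)
```

Humans see only the paragraph in the body, while a program can read the same facts from the script block without touching the visible markup.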

Microformats approach
- Earliest idea, supplanted by RDFa and Microdata
- Reuses HTML attributes like @class, @title
- Separate vocabularies developed for common use cases, e.g., address, CV, recipes, …
- Difficult to mix microformats (no concept of namespaces)
- Doesn’t define an RDF representation
  - It is possible to transform via, e.g., XSLT + GRDDL, but the transformations are vocabulary dependent
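To make the reuse of the class attribute concrete, here is a minimal sketch that reads two hCard properties from a made-up snippet. The markup, the person and the helper names (`HCardSketch`, `extract_hcard`) are assumptions for illustration. It handles only flat markup with the two class names listed, so it should not be read as a conforming microformats parser.

```python
from html.parser import HTMLParser

# Hypothetical hCard; the name and organisation are invented.
PAGE = """
<div class="vcard">
  <span class="fn">Alice Example</span>,
  <span class="org">Example Corp</span>
</div>
"""

# A small subset of hCard class names, enough for this sketch.
HCARD_PROPERTIES = {"fn", "org"}

class HCardSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        self.current = None  # hCard property whose text we are inside, if any
        self.card = {}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in HCARD_PROPERTIES:
                self.current = name
                self.card[name] = ""

    def handle_data(self, data):
        if self.current:
            self.card[self.current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self.current = None

def extract_hcard(html):
    parser = HCardSketch()
    parser.feed(html)
    parser.close()
    return {name: value.strip() for name, value in parser.card.items()}

card = extract_hcard(PAGE)
```

Note that the extractor has to know the hCard class names in advance. With no namespaces, nothing in the markup itself says which vocabulary `fn` or `org` belongs to, which is why mixing microformats is difficult.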

Microdata approach
- Defined by WHATWG; supported, together with schema.org, by Google, Bing, Yahoo and Yandex
- Adds new attributes to HTML5 to express metadata
- Works well for simple “single-vocabulary” cases, but is not well suited for mixing vocabularies or for complex vocabularies
- No notion of datatypes or namespaces
- Defines a generic mapping to RDF
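The Microdata attributes can be sketched in the same way. The snippet below is a made-up example using `itemscope`, `itemtype` and `itemprop` with a schema.org type; the extractor (`MicrodataSketch`, a hypothetical name) handles a single flat item only and ignores nested items, `itemref` and non-text values.

```python
from html.parser import HTMLParser

# Hypothetical markup; the person and job title are invented.
PAGE = """
<div itemscope itemtype="https://schema.org/Person">
  <span itemprop="name">Alice Example</span>
  <span itemprop="jobTitle">Editor</span>
</div>
"""

class MicrodataSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        self.item = None          # the single item we collect
        self.current_prop = None  # itemprop whose text we are inside, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs and self.item is None:
            # itemscope opens an item; itemtype names its (single) vocabulary type.
            self.item = {"type": attrs.get("itemtype"), "properties": {}}
        elif self.item is not None and "itemprop" in attrs:
            self.current_prop = attrs["itemprop"]
            self.item["properties"][self.current_prop] = ""

    def handle_data(self, data):
        if self.current_prop:
            self.item["properties"][self.current_prop] += data

    def handle_endtag(self, tag):
        self.current_prop = None

def extract_item(html):
    parser = MicrodataSketch()
    parser.feed(html)
    parser.close()
    return parser.item

item = extract_item(PAGE)
```

The property names `name` and `jobTitle` are plain strings interpreted relative to the one `itemtype`, which fits the single-vocabulary case described above.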

RDFa approach
- Adds new (X)HTML/XML attributes
- Has namespaces and URIs at its core
  - So mixing vocabularies is easy, as in RDF
- Complete flexibility for using literals or URI resources
- Is a complete serialization of RDF
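The sketch below illustrates how prefixes let one element tree mix two vocabularies. The markup, the subject URI `http://example.org/alice#me` and the class name `RdfaSketch` are assumptions made for this example. Only the `prefix`, `resource`, `typeof` and `property` attributes with literal text values are handled, so this is far from a conforming RDFa processor.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical markup mixing schema.org and FOAF terms.
PAGE = """
<div prefix="schema: http://schema.org/ foaf: http://xmlns.com/foaf/0.1/" resource="http://example.org/alice#me" typeof="schema:Person">
  <span property="schema:name">Alice Example</span>
  <span property="foaf:nick">alice</span>
</div>
"""

class RdfaSketch(HTMLParser):
    def __init__(self):
        super().__init__()
        self.prefixes = {}
        self.subject = None
        self.pending = None  # predicate IRI waiting for its text value
        self.triples = []

    def expand(self, curie):
        # Turn "schema:name" into a full IRI using the declared prefixes.
        prefix, _, local = curie.partition(":")
        return self.prefixes[prefix] + local

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "prefix" in attrs:
            tokens = attrs["prefix"].split()
            for name, iri in zip(tokens[0::2], tokens[1::2]):
                self.prefixes[name.rstrip(":")] = iri
        if "typeof" in attrs and "resource" in attrs:
            self.subject = attrs["resource"]
            self.triples.append((self.subject, RDF_TYPE, self.expand(attrs["typeof"])))
        elif "property" in attrs and self.subject:
            self.pending = self.expand(attrs["property"])

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None

def extract_triples(html):
    parser = RdfaSketch()
    parser.feed(html)
    parser.close()
    return parser.triples

triples = extract_triples(PAGE)
```

Because every property expands to a full IRI, terms from different vocabularies cannot clash, which is what makes mixing them straightforward.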

Yielding this RDF
  <http://www.ivan-herman.net/foaf#me>
      schema:alumniOf <http://www.elte.hu> ;
      foaf:schoolHomePage <http://www.elte.hu> ;
      schema:worksFor <http://www.w3.org/W3C#data> ;
      …
  <http://www.elte.hu>
      dc:title "Eötvös Loránd University of Budapest" .
  <http://www.w3.org/W3C#data>
      dc:title "World Wide Web Consortium (W3C)" .

Yielding this RDF
  [ rdf:type schema:Review ;
    schema:name "Oscars 2012: The Artist, review" ;
    schema:description "The Artist, an utterly beguiling…" ;
    schema:ratingValue "5" ;
    …
  ]

Rich Snippets
- Search engines add text under results to preview what’s on the page and why it’s relevant
- The text is often extracted from structured data embedded on the page
- See http://bit.ly/RichSN for more information

RDFa and Microdata: similarities
- RDFa and Microdata are modern options
- Both have similar approaches
  - Structured data encoded in HTML attributes only, with no new elements
  - Define some special attributes, e.g., itemscope for Microdata, resource for RDFa
  - Reuse some HTML core attributes (e.g., href)
  - Use textual content of the HTML source, if needed
- RDF data can be extracted from both

RDFa and Microdata: differences
- Microdata is optimized for simpler use cases:
  - One vocabulary at a time
  - Tree-shaped data
  - Few datatypes
- RDFa provides a full serialization of RDF in XML or HTML
  - The price is extra complexity over Microdata
- RDFa 1.1 Lite is a simplified authoring profile of RDFa, very similar to Microdata

Amount of structured data on Web?
- The Web Data Commons project uses Common Crawl data to estimate the amount of structured data on the Web
- It looks for Microdata, RDFa and other formats (e.g., hCalendar, hCard) in URLs parsable as HTML
  - hCalendar is a microformat for iCalendar data
  - hCard is a microformat for vCard data
- The Oct. 2016 crawl analyzed 3.2B HTML pages:
  - 1.2B pages (about 38%) had structured data
  - These came from 5.6M of 34M domains (17%)
  - 44B triples about 9B typed entities
- The data can be downloaded

Conclusions
- The amount of structured data on the web is growing steadily
- Microdata shows the strongest growth
- RDFa is also common
- Microformat data is probably not growing as much