Embedding Knowledge in HTML
Some content from a presentation by Ivan Herman of the W3C.

HTML is Everywhere
We usually think of HTML as the language of Web pages
But it’s also widely used on mobile devices and tablets
  – It readily adapts to different screen sizes and orientations
And it is the basis of many ebook formats
  – E.g., Kindle
How can we add knowledge to HTML pages?

Adding RDF-like data to HTML
We’d like to add semi-structured knowledge to a conventional HTML document
  – Humans can see and understand the regular HTML content (text, images, videos, audio)
  – Machines can see and understand the data markup in XML, RDF or some other format
Possibilities include
  – Add a link to a separate document with the knowledge
  – Embed the knowledge as comments, JavaScript, etc.
  – Distribute the knowledge markup throughout the HTML as attributes of existing HTML tags

One page, not two
Content providers prefer not to generate multiple pages, one for humans (HTML) and another for machines (RDF)
  – RDF serializations are complex
  – A second page requires separate storage and generation mechanisms
  – It introduces redundancy, which can lead to errors if we change one page but not the other
A single page simplifies the job of search engines as well

General approach
Provide or reuse tag attributes to encode the metadata
  – Browsers and other web systems ignore attributes they don’t understand
Three approaches have been developed
  – Microformats (~ 2005)
  – RDFa (~ 2007)
  – Microdata (~ 2012)
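The point that attributes carrying metadata do not disturb what a human reads can be illustrated with a minimal Python sketch. The two HTML fragments and the helper name visible_text are assumptions made up for this illustration; only the standard library html.parser module is used.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collects only the text a reader would see; attributes are never consulted."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical fragments: the same sentence without and with metadata attributes.
PLAIN = "<p><span>Ivan Herman</span> works for <span>W3C</span>.</p>"
ANNOTATED = (
    '<p vocab="http://schema.org/" typeof="Person">'
    '<span property="name">Ivan Herman</span> works for '
    '<span property="worksFor">W3C</span>.</p>'
)

# The human-visible text is identical; only a metadata-aware tool sees a difference.
print(visible_text(PLAIN) == visible_text(ANNOTATED))
```

A consumer that does understand the attributes would read the same markup and additionally recover the metadata.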

Microformats approach
Reuses existing HTML attributes (e.g., class, rel)
Separate vocabularies (address, CV, …)
Difficult to mix microformats (no concept of namespaces)
Does not, inherently, define an RDF representation
  – Possible to transform via, e.g., XSLT + GRDDL, but all transformations are vocabulary dependent
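As a rough illustration of why microformat processing is vocabulary dependent, the following Python sketch extracts a few hCard-style properties by looking for agreed class names. The property subset, the sample markup and the function name parse_hcard are assumptions for this example; it is not a conforming microformats parser.

```python
from html.parser import HTMLParser

# Assumed subset of hCard class names; a different microformat needs a different set.
HCARD_PROPERTIES = {"fn", "org", "url"}


class HCardParser(HTMLParser):
    """Maps elements whose class is a known hCard property to their text content."""

    def __init__(self):
        super().__init__()
        self.card = {}
        self._current = None  # property whose text is expected next

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in HCARD_PROPERTIES:
                self._current = name

    def handle_data(self, data):
        if self._current and data.strip():
            self.card[self._current] = data.strip()
            self._current = None


def parse_hcard(html):
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return parser.card


# Hypothetical markup using the class attribute only.
SAMPLE = (
    '<div class="vcard"><span class="fn">Ivan Herman</span>, '
    '<span class="org">W3C</span></div>'
)

print(parse_hcard(SAMPLE))
```

Because the property names are hard-coded, the extractor knows nothing about any other vocabulary, which is the mixing problem the slide describes.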

Microdata approach
Defined and supported by Google and Bing
Adds new attributes to HTML5 to express metadata
Works well for simpler “single-vocabulary” cases, but is not well suited for mixing vocabularies or for complex vocabularies
No notion of datatypes or namespaces
Defines a generic mapping to RDF
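A minimal Python sketch of reading microdata attributes follows. It handles only one flat item with text-valued or content-valued itemprop attributes; the sample markup, the class name and parse_microdata are assumptions for illustration, and nested items are deliberately out of scope.

```python
from html.parser import HTMLParser


class MicrodataParser(HTMLParser):
    """Reads a single flat item: itemscope, itemtype and simple itemprop values."""

    def __init__(self):
        super().__init__()
        self.item = {"type": None, "properties": {}}
        self._prop = None  # itemprop whose text content is expected next

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemscope" in attrs:
            self.item["type"] = attrs.get("itemtype")
        if "itemprop" in attrs:
            if "content" in attrs:
                # e.g. <meta itemprop="..." content="...">: value is in the attribute
                self.item["properties"][attrs["itemprop"]] = attrs["content"]
            else:
                self._prop = attrs["itemprop"]

    def handle_data(self, data):
        if self._prop and data.strip():
            self.item["properties"][self._prop] = data.strip()
            self._prop = None


def parse_microdata(html):
    parser = MicrodataParser()
    parser.feed(html)
    parser.close()
    return parser.item


# Hypothetical page fragment describing a review with schema.org terms.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Review">
  <h1 itemprop="name">Oscars 2012: The Artist, review</h1>
  <p itemprop="description">The Artist, an utterly beguiling…</p>
  <meta itemprop="ratingValue" content="5">
</div>
"""

print(parse_microdata(SAMPLE))
```

All values come back as plain strings, which mirrors the slide's remark that microdata has no notion of datatypes.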

RDFa approach
Adds new (X)HTML/XML attributes
Has namespaces and URIs at its core
  – So mixing vocabularies is easy, as in RDF
Complete flexibility for using literals or URI resources
Is a complete serialization of RDF
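How RDFa attributes turn into triples can be sketched in Python for a very small subset: vocab, resource, typeof, property and href. The sample markup, the "#me" identifier and the helper names are assumptions for illustration; a real RDFa processor also handles nesting, prefixes, datatypes and much more.

```python
from html.parser import HTMLParser

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class RDFaSubsetParser(HTMLParser):
    """Produces (subject, predicate, object) tuples from a tiny RDFa subset."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._vocab = ""
        self._subject = None
        self._pending = None  # predicate waiting for a text literal

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "vocab" in attrs:
            self._vocab = attrs["vocab"]
        if "resource" in attrs and "property" not in attrs:
            self._subject = attrs["resource"]
        if "typeof" in attrs:
            self.triples.append(
                (self._subject, RDF_TYPE, self._vocab + attrs["typeof"])
            )
        if "property" in attrs:
            predicate = self._vocab + attrs["property"]
            if "href" in attrs:
                # The object is a URI resource taken from the link target.
                self.triples.append((self._subject, predicate, attrs["href"]))
            else:
                # The object is a literal taken from the element's text.
                self._pending = predicate

    def handle_data(self, data):
        if self._pending and data.strip():
            self.triples.append((self._subject, self._pending, data.strip()))
            self._pending = None


def parse_rdfa(html):
    parser = RDFaSubsetParser()
    parser.feed(html)
    parser.close()
    return parser.triples


# Hypothetical fragment; "#me" is a made-up identifier for the person.
SAMPLE = """
<div vocab="http://schema.org/" resource="#me" typeof="Person">
  <span property="name">Ivan Herman</span>
  <a property="worksFor" href="http://www.w3.org/">W3C</a>
</div>
"""

for triple in parse_rdfa(SAMPLE):
    print(triple)
```

The sample shows both kinds of object the slide mentions: a literal (the name) and a URI resource (the employer link).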

Yielding this RDF
schema:alumniOf ; foaf:schoolHomePage ; schema:worksFor ; …
dc:title "Eötvös Loránd University of Budapest" . …
dc:title "World Wide Web Consortium (W3C)" …

Yielding this RDF
[ rdf:type schema:Review ;
  schema:name "Oscars 2012: The Artist, review" ;
  schema:description "The Artist, an utterly beguiling…" ;
  schema:ratingValue "5" ;
  … ]
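The step from extracted properties to a blank node like the one above can be sketched in Python. The function name to_turtle_bnode and the input dictionary are assumptions for illustration; the sketch emits only plain string literals, assumes single-line values, and omits the prefix declarations a complete Turtle document would need.

```python
def to_turtle_bnode(item_type, properties, prefix="schema"):
    """Render one typed item as a Turtle blank node with plain string literals."""
    lines = [f"[ rdf:type {prefix}:{item_type} ;"]
    for name, value in properties.items():
        # Escape backslashes and double quotes so the literal stays well-formed.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'  {prefix}:{name} "{escaped}" ;')
    # Swap the trailing " ;" of the last line for the closing bracket.
    lines[-1] = lines[-1][:-2] + " ]"
    return "\n".join(lines)


# Hypothetical extracted item, reusing the values shown on the slide.
review = {
    "name": "Oscars 2012: The Artist, review",
    "description": "The Artist, an utterly beguiling…",
    "ratingValue": "5",
}

review_turtle = to_turtle_bnode("Review", review)
print(review_turtle)
```

A blank node is used because the item in the page has no identifier of its own; it is described only by its type and properties.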

Rich Snippets
Search engines add a few lines of text under results, giving users an idea of what’s on the page and why it’s relevant to their query
These are often extracted from structured data embedded on the page
See http://bit.ly/RichSN for more information

RDFa and microdata: similarities
RDFa and Microdata are modern options
  – Microformats is another
Both have similar approaches
  – Structured data encoded in HTML attributes only, with no new elements
  – Define some special attributes, e.g., itemscope for microdata, resource for RDFa
  – Reuse some HTML core attributes (e.g., href)
  – Use textual content of HTML source, if needed
RDF data can be extracted from both

RDFa and microdata: differences
Microdata is optimized for simpler use cases:
  – One vocabulary at a time
  – Tree-shaped data
  – No datatypes
RDFa provides full serialization of RDF in XML or HTML
  – Price is extra complexity over Microdata
RDFa 1.1 Lite is a simplified authoring profile of RDFa, very similar to microdata
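Part of the extra complexity RDFa accepts is prefix handling, which is what lets terms from several vocabularies sit side by side. A minimal Python sketch of expanding prefixed names follows; the function name expand_term and the prefix table are assumptions for illustration, and the namespace URIs shown are the commonly used ones for schema, foaf and dc.

```python
# Assumed prefix table for illustration; a page may declare its own mappings.
PREFIXES = {
    "schema": "http://schema.org/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/terms/",
}


def expand_term(term, prefixes, vocab=None):
    """Expand 'prefix:local' or a bare term into a full URI string."""
    if ":" in term:
        prefix, local = term.split(":", 1)
        if prefix in prefixes:
            return prefixes[prefix] + local
        # Unknown prefix: treat the term as an already absolute URI.
        return term
    if vocab is None:
        raise ValueError(f"bare term {term!r} needs a default vocabulary")
    return vocab + term


# Terms from three vocabularies describing the same resource.
for term in ("schema:alumniOf", "foaf:schoolHomePage", "dc:title"):
    print(term, "->", expand_term(term, PREFIXES))

# With a single default vocabulary, bare terms work much as in microdata.
print(expand_term("worksFor", PREFIXES, vocab="http://schema.org/"))
```

The last call corresponds to the one-vocabulary-at-a-time style, while the loop shows the mixing that prefixes make possible.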

Structured data in HTML is increasing
“… 25% of webpages containing RDFa data […] over 7% of web pages containing microdata.”
  – Mail from Peter Mika, Yahoo! Based on a crawl evaluation by P. Mika and T. Potter, LDOW2012 Workshop, April 2012, Lyon, France
“… web pages that contain structured data has increased from 6% in 2010 to 12% in …”
  – Hannes Mühleisen and Christian Bizer, Web Data Commons - Extracting Structured Data from Two Large Web Corpora, LDOW2012 Workshop, April 2012, Lyon, France