Paths towards the Sustainable Consumption of Semantic Data on the Web Aidan Hogan & Claudio Gutierrez.

Slides:



Advertisements
Similar presentations
REST Vs. SOAP.
Advertisements

Building a Simple Web Proxy
OCLC Research TAI CHI Webinar 5/27/2010 A Gentle Introduction to Linked Data Ralph LeVan Sr. Research Scientist OCLC Research.
Semantic Web Introduction
© Copyright IBM Corporation 2014 Getting started with Rational Engineering Lifecycle Manager queries Andy Lapping – Technical sales and solutions Joanne.
GridRPC Sources / Credits: IRISA/IFSIC IRISA/INRIA Thierry Priol et. al papers.
Skills: none Concepts: Web client (browser), Web server, network connection, URL, mobile client, protocol This work is licensed under a Creative Commons.
Actual Trends Semantic Web Lecture WS 2010/2011. What‘s next? W3C view: Look at Semantic Web activity:
IT skills: IT concepts: Web client (browser), Web server, network connection, URL, mobile client, peer-to- peer application This work is licensed under.
1 CS1001 Lecture Overview Java Programming Java Programming Midterm Review Midterm Review.
OCT1 Principles From Chapter One of “Distributed Systems Concepts and Design”
1Bloom Filters Lookup questions: Does item “ x ” exist in a set or multiset? Data set may be very big or expensive to access. Filter lookup questions with.
Session Management A290/A590, Fall /25/2014.
Computer communication B Introduction to the Semantic Web.
Proxy Servers Dr. Ronald Bergmann, CIO, ISO. Proxy servers A proxy server is a machine which acts as an intermediary between the computers of a local.
 Proxy Servers are software that act as intermediaries between client and servers on the Internet.  They help users on private networks get information.
Distributed File Systems Sarah Diesburg Operating Systems CS 3430.
PerfSONAR Client Construction February 11 th 2010, APAN 29 – perfSONAR Workshop Jeff Boote, Assistant Director R&D.
Ricerca Distribuita Semantica Protocolli opensource per la condivisione di risorse online.
Rajashree Deka Tetherless World Constellation Rensselaer Polytechnic Institute.
Michalis Vafopoulos NTUA, GFOSS & The transformers GREEN CITY HACKATHON.
Maryam Elahi University of Calgary – CPSC 441.  HTTP stands for Hypertext Transfer Protocol.  Used to deliver virtually all files and other data (collectively.
Java Omar Rana University of South Asia. Course Overview JAVA  C/C++ and JAVA Comparison  OOP in JAVA  Exception Handling  Streams  Graphics User.
The need for a stronger IVOA review process An SDM+SSA correctness assessment Bruno Rino, Nov
The Semantic Web Web Science Systems Development Spring 2015.
ITIS 1210 Introduction to Web-Based Information Systems Chapter 23 How Web Host Servers Work.
Shared innovation Linking Distributed Data across the Web Dr Tom Heath Researcher, Platform Division Talis Information Ltd t
The Anatomy of a Large-Scale Hypertextual Web Search Engine Presented By: Sibin G. Peter Instructor: Dr. R.M.Verma.
The INTERNET how it works. the internet: defined So, what is it?
Integrating Live Plant Images with Other Types of Biodiversity Records Steve Baskauf Vanderbilt Dept. of Biological Sciences
KIT – University of the State of Baden-Württemberg and National Large-scale Research Center of the Helmholtz Association Institute of Applied Informatics.
Michalis Vafopoulos NTUA & Linked Data in a nutshell summer school NCSR, IRSS-2013.
Boris Villazón-Terrazas, Ghislain Atemezing FI, UPM, EURECOM, Introduction to Linked Data.
Architecture for Caching Responses with Multiple Dynamic Dependencies in Multi-Tier Data- Centers over InfiniBand S. Narravula, P. Balaji, K. Vaidyanathan,
LOD for the Rest of Us Tim Finin, Anupam Joshi, Varish Mulwad and Lushan Han University of Maryland, Baltimore County 15 March 2012
Introduction to the Semantic Web and Linked Data
Shridhar Bhalerao CMSC 601 Finding Implicit Relations in the Semantic Web.
The Semantic Web Matt Klubertanz. What is it? “The Semantic Web is an extension of the current web in which information is given well- defined meaning,
Module: Software Engineering of Web Applications Chapter 2: Technologies 1.
CS470 Computer Networking Protocols
Toward a framework for statistical data integration Ba-Lam Do, Peb Ruswono Aryan, Tuan-Dat Trinh, Peter Wetz, Elmar Kiesling, A Min Tjoa Linked Data Lab,
RDF and Relational Databases
Cloud Computing Computer Science Innovations, LLC.
Web Technologies Lecture 10 Web services. From W3C – A software system designed to support interoperable machine-to-machine interaction over a network.
Steven Perry Dave Vieglais. W a s a b i Web Applications for the Semantic Architecture of Biodiversity Informatics Overview WASABI is a framework for.
Internet addresses By Toni Grey & Rashida Swan HTTP Stands for HyperText Transfer Protocol Is the underlying stateless protocol used by the World Wide.
San Diego Las Vegas November 18, DIRT User Community & Support Virtual Private DIRT DIRT 5.6 Release (Demo) DIRT 6.0 Status (Demo) DIRT French Translation.
GoRelations: an Intuitive Query System for DBPedia Lushan Han and Tim Finin 15 November 2011
John S. Otto Mario A. Sánchez John P. Rula Fabián E. Bustamante Northwestern, EECS.
Linked Data Publishing on the Semantic Web Dr Nicholas Gibbins
Linked Data Publishing on the Semantic Web Dr Nicholas Gibbins
Shared innovation Linking Distributed Data across the Web Dr Tom Heath Researcher, Platform Division Talis Information Ltd t
What’s Really Happening
Shared innovation Linking Distributed Data across the Web Dr Tom Heath Researcher, Platform Division Talis Information Ltd t
Tiny http client and server
Developing Linked Data Applications
Distributed File Systems
Triple Stores.
Data.gov: Web, Data Web, Social Data Web 7/22/2010 #health2stat.
HTTP Protocol Specification
Some bits on how it works
Chapter 12: Automated data collection methods
Zachary Cleaver Semantic Web.
Triple Stores.
Tiers vs. Layers.
CC La Web de Datos Primavera 2018 Lecture 8: SPARQL [1.1]
Triple Stores.
Low-bandwidth Semantic Web
Semantic-Web, Triple-Strores, and SPARQL
AUCTORITAS: A Semantic Web-based tool for Authority Control
Presentation transcript:

Paths towards the Sustainable Consumption of Semantic Data on the Web Aidan Hogan & Claudio Gutierrez

Growth of the Linked Data Cloud Oct. 2007

Growth of the Linked Data Cloud Oct Nov Oct Nov. 2007

Growth of the Linked Data Cloud Oct Nov Feb Oct Nov Feb. 2008

Growth of the Linked Data Cloud Oct Nov Feb Sep Oct Nov Feb Sep. 2008

Growth of the Linked Data Cloud Oct Nov Feb Sep Mar Oct Nov Feb Sep Mar. 2009

Growth of the Linked Data Cloud Oct Nov Feb Sep Mar July 2009 Oct Nov Feb Sep Mar July 2009

Growth of the Linked Data Cloud Oct 2007 Oct Nov Feb Sep Mar July 2009 Sept Oct Nov Feb Sep Mar July 2009 Sept. 2010

Growth of the Linked Data Cloud Oct Nov Feb Sep Mar July 2009 Sept Sept Oct Nov Feb Sep Mar July 2009 Sept Sept. 2011

Growth of the Linked Data Cloud Oct Nov Feb Sep Mar July 2009 Sept Sept Sept Oct Nov Feb Sep Mar July 2009 Sept Sept Sept. 2012

Growth of the Linked Data Cloud Oct Nov Feb Sep Mar July 2009 Sept Sept Sept Sept Oct Nov Feb Sep Mar July 2009 Sept Sept Sept Sept. 2013

Growth of the Linked Data Cloud Oct Nov Feb Sep Mar July 2009 Sept Sept Sept Sept June 2014 Oct Nov Feb Sep Mar July 2009 Sept Sept Sept Sept June 2014

Growth of the Linked Data Cloud One new dataset in the past year

In loving memory of Linked Data 2007–2012 Survived by its research community

Access Methods Client has a request/query Q Server has a dataset D Client issues Q to server Server computes and returns response Q(D) D HTTP Q Q(D)

Access Methods Q Q D D D Q Multiple clients / multiple servers (blurred) Remote, decentralised, uncoordinated Web scale Anything else like this? Q D

Access Methods (over the web) “The Web is not a database” -- Alberto Mendelzon, slides from a presentation at the Workshop on Internet Data Management, Washington, November 1998.

Linked Data Access Methods 1.Dereferencing: – Look up a URI, get an RDF document 2.Dumps: – Get all data in an archive 3.SPARQL Queries: – Send a query, get the answers

DEREFERENCING

Dereferencing (what is it?) Q = “ Q(D) =

Dereferencing (what’s wrong with it?) Responses vary from server to server – local triples where URI is subject (83%) vs. – local triples where URI is subject or object (55%) [Hogan et al. 2012] W ELL - DEFINED : For a given Q and D, clients and servers agree on what Q(D) should be.

Dereferencing (what’s wrong with it?) Very coarse: – Give me all capitals of South American countries. Dereference documents for all country URIs See which ones are in South America, throw away rest Throw away triples other than capitals G RANULAR : The language for Q allows the client to be specific so as to avoid wasting bandwidth

Dereferencing (what’s wrong with it?) No pagination: – Give me some information about Italy. Load document with 100,000 triples Throw away 99,900 triples the user won’t read P AGINATION : A large response Q(D) can be split into chunks

DUMPS

Dumps (what are they?)

Dumps (what’s wrong with them?) 15× compression for RDF achievable But same weaknesses as for deref. still apply [Fernández et al. 2012]

Dumps (what’s wrong with them?) 15× compression for RDF achievable But same weaknesses as for deref. still apply Also, no standard access methods: – Various compressions and formats – Linked through generic HTML [Fernández et al. 2012] A CCESSIBLE : The protocol and formats are defined for automatic access by software agents

SPARQL

SPARQL (what is it?) “What’s the name of the query language again … what’s that? Oh yes, SPARQL.” -- C. Mohan, June 2014, Cartagena, AMW Summer School

SPARQL (what is it?) Q = Q(D) = PREFIX dbo:... SELECT ?capital WHERE { ?s a dbo:Country ; dbp:capital ?c ; dcterms:subject category:Countries_in_South_America. ?c rdfs:label ?capital. FILTER (lang(?capital)="en") } PREFIX dbo:... SELECT ?capital WHERE { ?s a dbo:Country ; dbp:capital ?c ; dcterms:subject category:Countries_in_South_America. ?c rdfs:label ?capital. FILTER (lang(?capital)="en") }

SPARQL (to the rescue?)

SPARQL (what’s wrong with it?) [Buil Aranda et al. 2012]

SPARQL (what’s wrong with it?) [Buil Aranda et al. 2012]

SPARQL (what’s wrong with it?) Simple Protocol and RDF Query Language SPARQL evaluation: PS PACE -complete SPARQL 1.1 even more complex [Perez et al. 2009] [Arenas et al. 2013]

SPARQL (what’s wrong with it?) C OSTABLE : The cost of processing a query can be anticipated before actual processing. C ACHEABLE : Common requests can be cached and re-used. Queries can be anticipated.

SPARQL (what’s wrong with it?)

Simple Protocol And RDF Query Language Protocol always expects a perfect answer – No support for partial results, timeouts, exception handling, pagination … R OBUST : Access method can gracefully handle error cases and provide Quality of Service

SPARQL (what’s wrong with it?) D is a black-box for the user

SPARQL (what’s wrong with it?) Q = Q(D) = PREFIX dbo: SELECT (COUNT(?c) as ?count) WHERE { ?c a dbo:Country. } PREFIX dbo: SELECT (COUNT(?c) as ?count) WHERE { ?c a dbo:Country. } T RANSPARENT : The client can determine if a dataset D is relevant and the service sufficient.

WRAP-UP

Problem Categories 1.Standardised 2.Bandwidth-efficient 3.Server-processing-efficient 4.Usable by client

Conclusion Data access methods crucial – Access = Protocol + Query Language But current ones don’t work → need something else (or multiple things?) – Open question: