1 RDF Aggregate Queries and Views Edward Hung, Yu Deng, V.S. Subrahmanian University of Maryland, College Park ICDE 2005, April 7, Tokyo, Japan.

Presentation transcript:

1 RDF Aggregate Queries and Views Edward Hung, Yu Deng, V.S. Subrahmanian University of Maryland, College Park ICDE 2005, April 7, Tokyo, Japan

2 Maintenance of RDF Aggregate Views
• Introduction to RDF and RDQL
• RDQL Extension for Aggregate Views
• Aggregate View Maintenance Algorithms AMX
• Implementation and Experiments
• Related Work

3 Introduction
Resource Description Framework (RDF)
• W3C Recommendation
• Represents metadata about resources identifiable on the web by a Uniform Resource Identifier (URI)
• Triple: (Resource, Property, Value)
  (Artist, rdf:type, rdfs:Class)
  (Painter, rdf:type, rdfs:Class)
  (Painter, rdfs:subClassOf, Artist)

[Figure: RDF schema and RDF instance in RDF/XML (rdf:RDF element with namespaces rdf and ns1), and the equivalent graph. Labels: Artist, Painter, String, fname, subClassOf, &r1, "Guy".]

7 RDQL: RDF Query Language SELECT ?highprice WHERE (?artist,, "Rose"), (?artist,, "Guy"), (?artist,, ?artifact), (?artifact,, ?price), (?price,, ?highprice), (?artifact,, ?date) AND <= ?date <= USING ns1 FOR graph pattern
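
To illustrate how a graph pattern binds variables, here is a minimal sketch of naive backtracking matching over a list of triples. It is not RDQL or Jena code. The property names (`lname`, `fname`, `paints`, `highprice`) and the sample data are assumptions, because the property URIs did not survive in the transcript, and the date constraint and the intermediate ?price node are left out for brevity.

```python
def is_var(term):
    """Variables are strings starting with '?', as in RDQL."""
    return isinstance(term, str) and term.startswith("?")


def match_triple(pattern, triple, binding):
    """Extend binding so that pattern equals triple; return None on a clash."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def match_all(patterns, graph, binding=None):
    """Yield every binding that satisfies all triple patterns."""
    binding = {} if binding is None else binding
    if not patterns:
        yield binding
        return
    for triple in graph:
        extended = match_triple(patterns[0], triple, binding)
        if extended is not None:
            yield from match_all(patterns[1:], graph, extended)


# Hypothetical data and property names, for illustration only.
graph = [
    ("a1", "fname", "Guy"), ("a1", "lname", "Rose"),
    ("a1", "paints", "p1"), ("p1", "highprice", 800000),
    ("a2", "fname", "Guy"), ("a2", "paints", "p2"),
    ("p2", "highprice", 500),
]
patterns = [
    ("?artist", "lname", "Rose"),
    ("?artist", "fname", "Guy"),
    ("?artist", "paints", "?artifact"),
    ("?artifact", "highprice", "?highprice"),
]
result = [b["?highprice"] for b in match_all(patterns, graph)]
```

Only artist `a1` satisfies both name patterns, so `result` holds the single value reached through that artist's artifact.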

8 RDQL Extension for Aggregates and Views CREATEVIEW AS SELECT max(?highprice) WHERE (?artist,, "Rose"), (?artist,, "Guy"), (?artist,, ?artifact), (?artifact,, ?price), (?price,, ?highprice), (?artifact,, ?date) AND <= ?date <= USING ns1 FOR

9 Aggregate Query
• Aggregate operators, e.g. min, max, sum, count, average
• GROUP BY clause
• Output a table of tuples
  • Output can be (i) an RDF instance or (ii) a table
  • Advantage of (i): it allows us to query the result further
  • However, (ii) allows any form of table, including an RDF instance when the table consists of a set of RDF tuples

We are expanding the syntax of RDQL so that it allows constants in SELECT clauses, which is equivalent to creating new resources from those constants. For example, the previous query can be modified as follows: CREATEVIEW AS SELECT,, max(?highprice) WHERE (?artist,, "Rose"), (?artist,, "Guy"), (?artist,, ?artifact), (?artifact,, ?price), (?price,, ?highprice), (?artifact,, ?date) AND <= ?date <= USING ns1 FOR The result is a valid RDF statement: (,, "800000"^^ns1:USD)

11 Aggregate View Maintenance
Relational Approach
• Store all triples in a single relational table with schema (Resource, Property, Value), OR
• Store the resources and values of each property in a separate relational table with schema (Resource, Value)
• #self-joins = (#triples in WHERE clause) − 1
• Large number of delta rules during relational view maintenance → expensive

12 Aggregate View Maintenance
Our Approach
• Localized search in RDF graphs
• Modified version of breadth-first search starting at the inserted/deleted edge
• Auxiliary data are needed for certain aggregate views (min, max, avg)

13 Distributive Aggregate Function
An aggregate function f is distributive w.r.t. a source update operation if and only if the updated value can be computed from its old value and the update alone, without reference to the source.
• Examples: count, sum and average are distributive w.r.t. insertion, deletion and update
  • For average, we need an additional attribute size that stores the size of the intermediate result S in order to compute the correct updated value (or we can maintain sum and count and calculate it from them)
• max and min are distributive w.r.t. insertion, but not w.r.t. deletion and update
  • Auxiliary data computed from S avoid the need to refer to the source
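
The distributive case can be sketched in a few lines: sum, count and average are updated from the old state and the delta alone. This is an illustrative sketch under the assumption that the view keeps a running sum and size; the class name `SumCountAvg` is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SumCountAvg:
    """State kept with the view: running sum and size of the result S."""
    total: float = 0.0
    size: int = 0

    def insert(self, delta):
        """Apply inserted values without looking at the source data."""
        self.total += sum(delta)
        self.size += len(delta)

    def delete(self, delta):
        """Apply deleted values without looking at the source data."""
        self.total -= sum(delta)
        self.size -= len(delta)

    @property
    def average(self):
        # The average is derived from sum and size; undefined for an empty S.
        return self.total / self.size if self.size else None


state = SumCountAvg()
state.insert([10, 20, 30])
state.delete([20])
```

The same trick does not work for max or min under deletion: once the current maximum is deleted, the old value and the delta do not say what the next largest value is.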

[Figures: the graph pattern matched against the RDF graph; the BAG of matched values (800000); SELECT max(?highprice) evaluated over the BAG.]

18 Compute Aggregates Algorithm CAA
Algorithm CAA(I, Q)
/* Input: RDF graph I, query Q */
/* Output: table T(Q, I) */
1) GP ← BuildGP(Q); X ← aggregate variables of Q;
2) Y ← GROUP BY variables of Q;
3) S ← [VRetrieve(θ, GP, X ∪ Y) | θ ∈ MSearchAll(GP, Q, I)];
4) Return T(Q, I) ← TCompute(S, Q);
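
Steps 3 and 4 of CAA can be sketched as follows, assuming the pattern matches θ have already been found (the job of MSearchAll). The Python signatures are simplifications made for illustration and do not reproduce the authors' implementation: `vretrieve` projects a match onto the GROUP BY and aggregate variables, and `tcompute` groups and aggregates.

```python
from collections import defaultdict

AGGREGATES = {
    "max": max,
    "min": min,
    "sum": sum,
    "count": len,
    "average": lambda xs: sum(xs) / len(xs),
}


def vretrieve(theta, variables):
    """Project a binding theta onto the requested variables."""
    return tuple(theta[v] for v in variables)


def tcompute(S, agg_name, n_group_vars):
    """Group rows by their leading group columns and aggregate the last column."""
    groups = defaultdict(list)
    for row in S:
        groups[row[:n_group_vars]].append(row[-1])
    return {key: AGGREGATES[agg_name](vals) for key, vals in groups.items()}


def caa(matches, agg_var, group_vars, agg_name):
    """Build the bag S from the matches, then compute the result table."""
    S = [vretrieve(theta, list(group_vars) + [agg_var]) for theta in matches]
    return tcompute(S, agg_name, len(group_vars))


# Hypothetical matches, standing in for the output of MSearchAll.
matches = [
    {"?artist": "a1", "?highprice": 800000},
    {"?artist": "a1", "?highprice": 500},
    {"?artist": "a2", "?highprice": 700},
]
per_artist = caa(matches, "?highprice", ["?artist"], "max")
overall = caa(matches, "?highprice", [], "max")
```

Without GROUP BY variables every row falls into the single group keyed by the empty tuple, so the table has exactly one row.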

19 Aggregate View Maintenance Algorithms AMX
• AMI – Insertion
• AMD – Deletion
• AMT – Triple Modification
• AMR – Resource Modification

[Figures: Update: Insertion of a "paints" edge; the BAG after the insertion; SELECT max(?highprice) evaluated over the updated BAG.]

23 AMI for Insertion
Algorithm AMI(I, Q, A(Q, I), T(Q, I), t)
/* Input: RDF graph I, query Q, auxiliary data A(Q, I), query result T(Q, I), inserted triple t */
/* Output: table T(Q, I ∪ {t}), auxiliary data A(Q, I ∪ {t}) */
1) GP ← BuildGP(Q);
2) X ← aggregate variables of Q;
3) Y ← GROUP BY variables of Q;
4) If TMatch(GP, t) == TRUE, then
   a) ΔS ← [VRetrieve(θ, GP, X ∪ Y) | θ ∈ MSearch(GP, Q, t, I ∪ {t})];
   b) return (T(Q, I ∪ {t}), A(Q, I ∪ {t})) ← TMaintain_I(T(Q, I), ΔS, A(Q, I), Q);
5) else, return (T(Q, I ∪ {t}), A(Q, I ∪ {t})) ← (T(Q, I), A(Q, I));

24 Algorithm MSearch(GP, Q, t, I)
/* Input: graph pattern GP, query Q, triple t, RDF graph I */
/* Output: Θ = {θ | θ is a pattern matching} */
1) Θ ← ∅;
2) for each t' ∈ GP s.t. ∃θ', tθ' = t'θ',
   a) for each θ ∈ bSearch(t, t', GP, I),
      i. if θ satisfies the constraints in Q, then Θ ← Θ ∪ {θ};
3) return Θ;
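
The idea behind MSearch is to find only the matches that use the updated triple t. A minimal sketch follows, assuming triples are tuples and variables start with "?". Two simplifications are made and should not be mistaken for the presented algorithm: plain backtracking stands in for the modified breadth-first bSearch, and the constraint check of step 2(a)i is omitted. The property names and data are hypothetical.

```python
def is_var(term):
    """Variables are strings starting with '?', as in RDQL."""
    return isinstance(term, str) and term.startswith("?")


def unify(pattern, triple, binding):
    """Extend binding so that pattern matches triple; return None on a clash."""
    theta = dict(binding)
    for p, v in zip(pattern, triple):
        if is_var(p):
            if theta.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return theta


def extend(patterns, graph, theta):
    """Yield all completions of theta over the remaining patterns."""
    if not patterns:
        yield theta
        return
    for triple in graph:
        nxt = unify(patterns[0], triple, theta)
        if nxt is not None:
            yield from extend(patterns[1:], graph, nxt)


def msearch(gp, t, graph):
    """Return the matches of gp in graph that map some pattern triple to t."""
    found = []
    for i, t_prime in enumerate(gp):
        seed = unify(t_prime, t, {})      # does t' unify with t?
        if seed is None:
            continue
        rest = gp[:i] + gp[i + 1:]        # remaining pattern triples
        for theta in extend(rest, graph, seed):
            if theta not in found:        # Θ is a set: skip duplicates
                found.append(theta)
    return found


# Graph after inserting t (that is, I ∪ {t}); all names are hypothetical.
t = ("a1", "paints", "p2")
graph = [
    ("a1", "fname", "Guy"), ("a1", "paints", "p1"), ("p1", "highprice", 800000),
    t, ("p2", "highprice", 900000),
    ("a2", "fname", "Bob"), ("a2", "paints", "p3"), ("p3", "highprice", 5),
]
gp = [
    ("?artist", "fname", "Guy"),
    ("?artist", "paints", "?artifact"),
    ("?artifact", "highprice", "?highprice"),
]
delta_matches = msearch(gp, t, graph)
```

Seeding the search with the bindings forced by t is what keeps it local: the existing match through `p1` is never revisited.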

25 Handling GROUP BY
• Under a GROUP BY clause, each tuple in ΔS affects one particular group.
• TMaintain_I maintains only the affected groups (and their corresponding auxiliary data), using the tuples that affect them.
• Empty groups are deleted and new groups are inserted.
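
Per-group maintenance can be sketched for a count view as follows. This is an illustration only: the table layout (a dict from group key to count) and the function name `maintain_group_counts` are assumptions, and the deletion case is included to show where empty groups disappear.

```python
def maintain_group_counts(table, delta_groups, sign):
    """Update a {group_key: count} table from the groups of the delta tuples.

    sign is +1 for insertion and -1 for deletion. Only groups named in
    delta_groups are touched; new groups are created and empty ones removed.
    """
    updated = dict(table)
    for key in delta_groups:
        updated[key] = updated.get(key, 0) + sign
        if updated[key] <= 0:
            del updated[key]
    return updated


after_insert = maintain_group_counts({"a1": 2}, ["a2", "a1"], +1)
after_delete = maintain_group_counts(after_insert, ["a2"], -1)
```

Groups that no delta tuple falls into are copied over untouched, which is the point of the localized update.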

26 TMaintain_I: Handling sum, count, min, max
• No auxiliary data required
• Suppose f(x) is an aggregate function on attribute x, F the original result, F' the new result:
  F' = F + sum(π_x(ΔS))       if f = sum
  F' = F + |ΔS|               if f = count
  F' = min([F] ∪ π_x(ΔS))     if f = min
  F' = max([F] ∪ π_x(ΔS))     if f = max
• π_x(ΔS) projects a bag of values of x from ΔS
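
The four insertion rules translate directly into code. A minimal sketch, assuming the old result F is available and `delta_x` is the bag π_x(ΔS) as a list; the function name `tmaintain_insert` is hypothetical.

```python
def tmaintain_insert(f, F, delta_x):
    """Return the new aggregate F' after inserting the values in delta_x."""
    if not delta_x:
        return F                          # nothing matched: result unchanged
    if f == "sum":
        return F + sum(delta_x)
    if f == "count":
        return F + len(delta_x)
    if f == "min":
        return min([F] + list(delta_x))
    if f == "max":
        return max([F] + list(delta_x))
    raise ValueError(f"unsupported aggregate: {f}")


new_max = tmaintain_insert("max", 800000, [900000, 100])
new_count = tmaintain_insert("count", 3, [900000, 100])
```

None of the branches reads the RDF graph or the old bag S, which is why no auxiliary data is needed for insertion.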

27 TMaintain_I: Handling average
• We need the size of S:
  size' = size + |ΔS|
  F' = (F · size + sum(π_x(ΔS))) / size'
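
For average, the stored size lets the old sum be recovered as F · size. A minimal sketch, assuming F and size are kept with the view; the function name `maintain_average` is hypothetical.

```python
def maintain_average(F, size, delta_x):
    """Return (F', size') after inserting the values in delta_x."""
    new_size = size + len(delta_x)
    if new_size == 0:
        return None, 0                    # average of an empty bag is undefined
    # F * size recovers the old sum; add the inserted values and renormalize.
    new_F = (F * size + sum(delta_x)) / new_size
    return new_F, new_size


updated = maintain_average(20.0, 2, [50])
```

Storing the running sum instead of the average would avoid the multiplication, as noted for the sum-and-count alternative.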

[Figures: Update: Deletion of a "paints" edge; the BAG before and after the deletion; SELECT max(?highprice) evaluated over the updated BAG.]

31 AMD for Deletion
Algorithm AMD(I, Q, A(Q, I), T(Q, I), t)
/* Input: RDF graph I, query Q, auxiliary data A(Q, I), query result T(Q, I), deleted triple t */
/* Output: table T(Q, I − {t}), auxiliary data A(Q, I − {t}) */
1) GP ← BuildGP(Q);
2) X ← aggregate variables of Q;
3) Y ← GROUP BY variables of Q;
4) If TMatch(GP, t) == TRUE, then
   a) ΔS ← [VRetrieve(θ, GP, X ∪ Y) | θ ∈ MSearch(GP, Q, t, I)];
   b) return (T(Q, I − {t}), A(Q, I − {t})) ← TMaintain_D(T(Q, I), ΔS, A(Q, I), Q);
5) else, return (T(Q, I − {t}), A(Q, I − {t})) ← (T(Q, I), A(Q, I));

32 TMaintain_D: Handling min, max
• Min and max are not distributive w.r.t. deletion
• We need to store π_x(S), which projects a bag of values of x from S
• The new aggregate value F' is obtained by:
  F' = min(π_x(S − ΔS))     if f = min
  F' = max(π_x(S − ΔS))     if f = max
• We need to update π_x(S) to become π_x(S) − π_x(ΔS)
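
The stored bag π_x(S) can be modelled with a multiset. A minimal sketch, assuming Python's `collections.Counter` as the bag; the function name `tmaintain_delete` is hypothetical.

```python
from collections import Counter


def tmaintain_delete(f, bag, delta_x):
    """Return (F', new_bag) after deleting the values in delta_x from the bag."""
    # Counter subtraction is bag difference: counts drop, and values whose
    # count reaches zero are removed.
    new_bag = bag - Counter(delta_x)
    if not new_bag:
        return None, new_bag              # the group became empty
    agg = min if f == "min" else max
    # Iterating a Counter yields its distinct values, enough for min/max.
    return agg(new_bag), new_bag


bag = Counter([800000, 500000, 800000])
first_max, bag1 = tmaintain_delete("max", bag, [800000])
second_max, bag2 = tmaintain_delete("max", bag1, [800000])
```

Because 800000 occurs twice, deleting one copy leaves the maximum unchanged, and only the second deletion exposes 500000. A set would get this wrong, which is why a bag is stored.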

33 Implementation and Experiment
• Implemented in Java
• Jena – RDQL engine of HP
• Comparison with the relational approach (standard view maintenance algorithm on relational tables)
  • Counting algorithm in Gupta et al., "Maintaining Views Incrementally", SIGMOD 1993
• Dataset: Chef Moz Project RDF dump
• Data stored in memory


35 Other Related Work
Volz, Oberle, Studer [DBFUSION'02]
• The first to introduce a view mechanism for RDF data
• Their views require that
  1. the results contain class instances (i.e., a subject or object variable), or
  2. the result itself has the pattern of an RDF statement (i.e., a triple containing subject, predicate and object).
Magkanaraki et al. [ISWC'03]
• Proposed RVL, a view definition language that can also create virtual RDF schemas and restructure class and property hierarchies so that new resources, property values, classes and property types can be created.
None of these works specifically addresses (i) aggregates in RDF or (ii) the problem of maintaining aggregate RDF views.

36 Summary
• Aggregate views are important for RDF applications
• RDQL Extension for Views and Aggregates
• Aggregate View Maintenance Algorithms AMX
  • Localized search in RDF graphs

37 Thank you very much! Questions and Answers