Graph Algebra with Pattern Matching and Aggregation Support

Presentation transcript:

Graph Algebra with Pattern Matching and Aggregation Support

Graphs Nowadays
Variety of Sources
  ◦ Scientific Studies
  ◦ Business Activities
  ◦ Social Needs
  ◦ Internet
Data are often
  ◦ Large Scale
  ◦ Highly Linked
  ◦ Schema-less

Managing Graph Data
Primary Role of a Database
  ◦ Persistent storage
  ◦ Efficient querying
RDBMS
  ◦ Storage model: vertices and edges as tuples
  ◦ Query: links are followed by joins
Graph Database
  ◦ Storage model: graphs
  ◦ Query: path traversal
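The contrast between the two storage models can be sketched in a few lines of Python. This is only an illustration under assumed toy data: the edge table, the function names, and the two-hop query are hypothetical and do not come from the slides. The first function answers the query the relational way, by self-joining the edge table; the second follows adjacency lists from a start vertex.

```python
from collections import defaultdict

# Hypothetical toy data: each edge stored as a relational tuple (src, dst).
edge_table = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "e")]


def two_hop_by_join(edges):
    """Relational style: self-join the edge table on e1.dst == e2.src."""
    return sorted(
        {(s1, d2) for (s1, d1) in edges for (s2, d2) in edges if d1 == s2}
    )


def build_adjacency(edges):
    """Graph style storage: map each vertex to its outgoing neighbours."""
    adjacency = defaultdict(list)
    for src, dst in edges:
        adjacency[src].append(dst)
    return adjacency


def two_hop_by_traversal(adjacency, start):
    """Graph style query: walk two steps out from a single start vertex."""
    return sorted(
        {far for mid in adjacency.get(start, []) for far in adjacency.get(mid, [])}
    )


adjacency = build_adjacency(edge_table)
join_pairs = two_hop_by_join(edge_table)
from_a = two_hop_by_traversal(adjacency, "a")
```

The join compares every pair of edge tuples, while the traversal only touches the neighbourhood of the start vertex; each additional hop costs another join in the relational form.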

Why not RDBMS?
Schema Issues
  ◦ Every inserted data item may have a different schema (Web graph)
  ◦ Hard to represent semi-structured information
Scalability Issues
  ◦ ACID properties vs. CAP theorem
Query Performance
  ◦ Difficult to optimize join-intensive queries

Graph Databases and Query Languages
No Universal Language!!!

No Universal Language Like SQL?
No commonly agreed algebra
Relational Algebra?
  ◦ Expressive, has stood the test of time
  ◦ NOT suitable for GRAPH
Graph Algebra?
  ◦ Still at a preliminary stage

Issues with Relational Algebra (RA)
Defined on tuples or sets of tuples
  ◦ Mismatch with the nature of graphs
  ◦ Operators lose their semantics
    ▪ What are Union, Intersection, Join on a GRAPH?
  ◦ I/O type?
    ▪ Tables, not GRAPHs
Domain-centric, not data-centric
  ◦ Does not anticipate out-of-order data
  ◦ Treats tuples as independent
    ▪ Is not aware of the links among tuples
    ▪ Queries written in RA are verbose and complex

Advantage of Graph Algebra
An algebra is itself a query language
  ◦ Easy to derive a language with strong theoretical support
Evaluate the expressiveness of given languages
  ◦ Justify when to use which: Gremlin, Cypher, etc.
Query Optimization
  ◦ Operator order EQUALS execution plan
  ◦ Algebraic equivalence IMPLIES query optimization
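The point that operator order fixes the execution plan, and that equivalent orderings give the optimizer a choice, can be illustrated with a small Python sketch. It uses selection pushdown, a classic relational rewrite, as a stand-in; it is not one of the equivalence rules proposed in the slides, and the data, the `is_adult` predicate, and the function names are all hypothetical.

```python
# Hypothetical toy graph: vertex attributes and directed edges.
vertices = {1: {"age": 30}, 2: {"age": 17}, 3: {"age": 45}, 4: {"age": 12}}
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 1)]


def is_adult(attrs):
    """Assumed selection predicate on the source vertex."""
    return attrs["age"] >= 18


def plan_join_then_select(vertices, edges, pred):
    """Plan A: attach every edge to its source vertex, then filter."""
    joined = [(s, d) for (s, d) in edges if s in vertices]
    result = [(s, d) for (s, d) in joined if pred(vertices[s])]
    return sorted(result), len(joined)  # (answer, intermediate size)


def plan_select_then_join(vertices, edges, pred):
    """Plan B: filter the vertices first, then attach only their edges."""
    selected = {v for v, attrs in vertices.items() if pred(attrs)}
    joined = [(s, d) for (s, d) in edges if s in selected]
    return sorted(joined), len(joined)  # (answer, intermediate size)


answer_a, cost_a = plan_join_then_select(vertices, edges, is_adult)
answer_b, cost_b = plan_select_then_join(vertices, edges, is_adult)
```

Both plans return the same edges, but Plan B materializes fewer intermediate tuples. An optimizer can only make that swap safely if the algebra guarantees the two operator orders are equivalent.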

Advantage of Graph Algebra
Separation of Query and System
  ◦ One can write queries on any system as long as a common algebra is supported
  ◦ Knowing RA, one can write SQL, PL/SQL, MS/SQL on MySQL, Oracle, SQL Server
Integrate new operators into the database
  ◦ Current graph database systems do not support newly developed queries
    ▪ Graph OLAP, Graph Cube, Graph Aggregation, etc.
  ◦ A proper algebra can incorporate these operators

Existing Works on Graph Algebra
Graph QL [1]
  ◦ A graph-based algebra; operators are defined on graphs
  ◦ Selection
  ◦ Join (not properly defined)
  ◦ Template
VAQL [2]
  ◦ Focused on visualization
  ◦ Selection
  ◦ Aggregation (restricted)
  ◦ Visualization
Selection is restricted to isomorphism
Aggregation is not defined over edges
No algebraic equivalence
[1] He, Huahai, and Ambuj K. Singh. "Graphs-at-a-time: query language and access methods for graph databases." Proceedings of the 2008 ACM SIGMOD international conference on Management of data. ACM,
[2] Shaverdian, Anna A., et al. "A graph algebra for scalable visual analytics." Computer Graphics and Applications, IEEE 32.4 (2012):

What do we want from a Graph Algebra?
Universal
  ◦ Independent of graph types
    ▪ Directed vs. undirected, simple vs. hyper, homogeneous vs. heterogeneous
Expressive
  ◦ Able to answer typical graph queries
    ▪ Pattern matching, reachability, path finding, etc.
  ◦ Covers Relational Algebra (RA)
    ▪ This ensures that a graph database can handle relational data as well
Scalable
  ◦ Able to manage data at scale
    ▪ Supports queries that summarize and aggregate data
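As a rough idea of what a query that summarizes and aggregates a graph could look like, the Python sketch below collapses vertices that share an attribute value into one super-vertex and counts the edges between groups. This is a generic graph summarization written for illustration; the operator definitions in the slides are not recoverable from the transcript, and the data, the grouping attribute, and the name `aggregate_graph` are assumptions.

```python
from collections import Counter

# Hypothetical toy data: each person vertex carries a "city" attribute.
vertex_city = {
    "ann": "Oslo",
    "bob": "Oslo",
    "cat": "Rome",
    "dan": "Rome",
    "eve": "Rome",
}
friend_edges = [
    ("ann", "bob"),
    ("ann", "cat"),
    ("bob", "dan"),
    ("cat", "dan"),
    ("dan", "eve"),
]


def aggregate_graph(vertex_group, edges):
    """Group vertices by attribute value and count edges between groups."""
    # One super-vertex per group, weighted by the number of members.
    super_vertices = Counter(vertex_group.values())
    # One super-edge per (source group, target group), weighted by edge count.
    super_edges = Counter()
    for src, dst in edges:
        super_edges[(vertex_group[src], vertex_group[dst])] += 1
    return dict(super_vertices), dict(super_edges)


summary_vertices, summary_edges = aggregate_graph(vertex_city, friend_edges)
```

Note that the edges are aggregated as well as the vertices, so the output is again a graph and can be fed into further operators.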

Extended Algebra – Graph Model

Extended Algebra – Operators

P(v1,v1) and P(v4,v5) are true

[1] Fan, Wenfei, et al. "Adding regular expressions to graph reachability and pattern queries." Data Engineering (ICDE), 2011 IEEE 27th International Conference on. IEEE,

Expressiveness
This set of operators is more expressive than Relational Algebra and Graph QL
It can represent many graph queries
  ◦ Reachability
  ◦ Graph Cube computation
  ◦ I-OLAP and T-OLAP
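Reachability, the first query type listed, is hard to state in plain relational algebra because the path length is unbounded. The Python sketch below shows the query itself as a breadth-first search, with an optional restriction on edge labels as a very simple stand-in for pattern-constrained paths. The edge data, the labels, and the function name are hypothetical; this is not the operator defined in the slides.

```python
from collections import deque

# Hypothetical labelled edges: (source, label, target).
labeled_edges = [
    ("v1", "friend", "v2"),
    ("v2", "friend", "v3"),
    ("v3", "comment", "v4"),
    ("v4", "friend", "v5"),
]


def reachable(edges, source, target, allowed_labels=None):
    """Return True if target can be reached from source.

    If allowed_labels is given, only edges with those labels are followed.
    A vertex is considered reachable from itself via the empty path.
    """
    adjacency = {}
    for src, label, dst in edges:
        if allowed_labels is None or label in allowed_labels:
            adjacency.setdefault(src, []).append(dst)

    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


any_path = reachable(labeled_edges, "v1", "v5")
friend_only = reachable(labeled_edges, "v1", "v5", allowed_labels={"friend"})
```

With all labels allowed, `v5` is reachable from `v1`; restricted to `friend` edges, the search stops at `v3` because the only way forward is a `comment` edge.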

Algebra Equivalence
When operators are chained, they form a query execution plan
Find the network induced by the person whose friends comment on each other’s posts with birthday greater than
Output those names as a graph
[Figure: query plan over a base graph with friend and comment edges; stages: Base Graph, Matched Result, Restriction, v.name, V-Unification]

Algebra Equivalence
To generate multiple execution plans for the same query, we need theoretical support:

Conclusion
Graph Algebra plays an important role in graph database development
We take one step forward by proposing a Graph Algebra which:
  ◦ extends existing algebraic work with
    ▪ Regular pattern matching
    ▪ Aggregation
  ◦ is expressive and well-defined
  ◦ contains equivalence rules for further query optimization
