SQL Server 2017 Graph Database Inside-Out Raymond Sondak @raymondsondak Sep 30th 2017
Hello there…
@raymondsondak | nl.linkedin.com/in/raymondsondak
BI Architect & Specialist
W: www.analyticsaholic.com
E: raymond.sondak@analyticsaholic.com
The Biml Book: Business Intelligence and Data Warehouse Automation. Pre-order: Apress or Amazon
Agenda
- Graph theory
- Graph database
- SQL Server 2017: SQL Graph
- Demo
- Limitations
Graph Theory
Definition: A graph G = (V, E) consists of a set of vertices V and a set of edges E, where an edge is an unordered pair of vertices.
Examples:
- Computer networks: V = {computers}, E = {{A, B} | computers A and B are networked}
- Social networks: V = {Alice, Bob, Chris, Daniel, ...}, E = {{A, B} | A and B know each other}
Example graph: V = {v1, v2, v3, v4, v5} and E = {e1, e2, e3, e4, e5, e6}.
Source: https://math.dartmouth.edu/archive/m38s04/public_html/intro.pdf
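The definition above can be made concrete with a small Python sketch. The vertex names come from the social-network example; the specific edges and the helper names `neighbors` and `is_valid_graph` are assumptions made up for illustration. Each edge is a `frozenset`, so {A, B} and {B, A} are the same edge, as the definition requires.

```python
# Vertices from the social-network example on the slide.
V = {"Alice", "Bob", "Chris", "Daniel"}

# Undirected edges as unordered pairs (hypothetical "know each other" pairs).
E = {
    frozenset({"Alice", "Bob"}),
    frozenset({"Bob", "Chris"}),
    frozenset({"Chris", "Daniel"}),
}


def is_valid_graph(vertices, edges):
    """Check that every edge is a pair of two distinct vertices from V."""
    return all(len(edge) == 2 and edge <= vertices for edge in edges)


def neighbors(vertex, edges):
    """Return every vertex that shares an edge with `vertex`."""
    return {other for edge in edges if vertex in edge for other in edge if other != vertex}


if __name__ == "__main__":
    print(is_valid_graph(V, E))
    print(sorted(neighbors("Bob", E)))
```

Because the pairs are unordered, looking up `frozenset({"Bob", "Alice"})` in `E` finds the same edge as `frozenset({"Alice", "Bob"})`.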
Graph Database
- A database engine that models nodes (or vertices) and edges
- Applies graph theory to a data model that describes data relationships
- Stores relationships as first-class citizens
- Answers specific questions about data relationships that are 'not easy' to answer with a relational database
Graph Database examples: Tweet, Organization, Book Sales
SQL Graph Architecture
- Available in SQL Server 2017 and Azure SQL Database
- A database can contain a graph
- A graph is a collection of node and edge tables
- A node has properties and represents an entity
- An edge may or may not have properties and represents a relationship between two nodes
- Supports SQL Server technologies
Source: https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-architecture
SQL Graph objects: Node, Edge
Source: https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-architecture
Create Node

    -- Create NODE Author
    CREATE TABLE Author (
        AuthorId INTEGER PRIMARY KEY,
        Author   VARCHAR(100)
    ) AS NODE;

$node_id
- Unique identifier of a node in the database
- JSON string of object_id and bigint value
- Implicit creation of non-clustered index
Hex string as internal identifier
- Internal hidden column
Create Edge

    -- Create EDGE AuthoredBy
    CREATE TABLE [graph].[AuthoredBy] AS EDGE;

$edge_id
- Unique identifier of an edge in the database
- JSON string of object_id and bigint value
- Implicit creation of non-clustered index
$from_id and $to_id
- $node_id of the nodes the directed relation runs from and to
Hex string as internal identifier
- Internal hidden column
MATCH clause

    SELECT BookTitle
    FROM graph.Book, graph.AuthoredBy, graph.Author
    WHERE MATCH(Book-(AuthoredBy)->Author)
      AND Author LIKE '%Coelho%'

- Uses a search pattern to traverse the graph from one node to another via an edge
- ASCII-art syntax (\m/) for pattern matching
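To show what the pattern Book-(AuthoredBy)->Author asks for, here is a minimal Python sketch that walks the same path over in-memory data. It illustrates the traversal idea only, not how SQL Server evaluates MATCH. The ids, the sample rows and the function name `books_by_author` are assumptions for illustration. Python's `in` test is case-sensitive, whereas the case sensitivity of LIKE depends on the collation.

```python
# Node "tables": id -> property value (hypothetical sample rows).
books = {1: "The Alchemist", 2: "Brida", 3: "Dune"}
authors = {10: "Paulo Coelho", 11: "Frank Herbert"}

# Edge "table": (from Book id, to Author id), mirroring Book-(AuthoredBy)->Author.
authored_by = [(1, 10), (2, 10), (3, 11)]


def books_by_author(name_fragment):
    """Follow each Book -> Author edge and keep books whose author matches."""
    return [
        books[book_id]
        for book_id, author_id in authored_by
        if name_fragment in authors[author_id]  # stands in for LIKE '%fragment%'
    ]


if __name__ == "__main__":
    print(books_by_author("Coelho"))
```

The edge list plays the role of the edge table: each row carries the two node identifiers, so the traversal is a lookup on both ends of every edge.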
Relational database to Graph database (diagram: tables mapped to nodes and edges)
Book database as example
- Book database from Kaggle
- Enriched with self-generated data
- Relational and graph model demo
- Create node and edge tables
- Relational database query vs graph query
Source: https://www.kaggle.com/zygmunt/goodbooks-10k
Demo setup
1. Create book data table
2. Create book transaction table
3. Load source data for book data and transactions
4. Create relational tables
5. Load relational tables
6. Create graph tables
7. Load graph tables
Demo case results: Case 1, Case 2, Case 3, Case 4
Graph object query (JSON string)

    SELECT $node_id
          ,OBJECT_ID_FROM_NODE_ID($node_id) AS ObjectId
          ,JSON_VALUE($node_id, '$.type') + '.'
           + JSON_VALUE($node_id, '$.schema') + '.'
           + JSON_VALUE($node_id, '$.table') + '.'
           + JSON_VALUE($node_id, '$.id') AS JSONValue
          ,[AuthorId]
          ,[Author]
    FROM [BookDB].[graph].[Author]
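The JSONValue column in the query above joins four fields of the $node_id JSON string with dots. A minimal Python sketch of the same composition follows. The sample value is an assumed example of what a $node_id could look like, based on the four keys the query reads, and the function name `describe_node_id` is hypothetical.

```python
import json


def describe_node_id(node_id_json):
    """Join the type, schema, table and id fields of a node id with dots."""
    parts = json.loads(node_id_json)
    return ".".join(str(parts[key]) for key in ("type", "schema", "table", "id"))


# Assumed sample value; real values come from the $node_id column.
sample_node_id = '{"type":"node","schema":"graph","table":"Author","id":0}'

if __name__ == "__main__":
    print(describe_node_id(sample_node_id))
```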
Some limitations
- Cross-database queries are not supported
- UPDATE of an edge's $from_id and $to_id is not supported (use DELETE and INSERT instead)
- MERGE is not supported
- No validation of node type for edge objects (use a trigger)
- No validation of referential integrity (use a trigger)
- Polymorphism is not supported
- Transitive closure is not supported
- Graph algorithms such as shortest path are not available
Trigger - Validation

    CREATE TRIGGER [graph].[ValidateAuthoredBy] ON [graph].[AuthoredBy]
    FOR INSERT
    AS
    -- Reject any new edge whose target node is not a graph.Author node
    IF EXISTS (
        SELECT 1
        FROM inserted
        WHERE JSON_VALUE($to_id, '$.schema') <> 'graph'
           OR JSON_VALUE($to_id, '$.table') <> 'Author'
    )
    BEGIN
        -- Severity 16 so the client sees an error rather than an informational message
        RAISERROR('Only an author can author a book', 16, 1)
        ROLLBACK TRANSACTION
    END
    GO

    -- Rejected by the trigger: the target node is a Customer, not an Author
    INSERT INTO graph.AuthoredBy
    VALUES (
        (SELECT $node_id FROM graph.Book WHERE BookId = 1),
        (SELECT $node_id FROM graph.Customer WHERE CustomerId = 1)
    );
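The check inside the trigger reduces to one rule: the target node id must name schema graph and table Author. A minimal Python sketch of that rule, for reasoning about it outside the database, is below. The sample ids and the function name `is_valid_authored_by_target` are assumptions for illustration.

```python
import json


def is_valid_authored_by_target(to_id_json):
    """Return True only when the edge target is a graph.Author node."""
    target = json.loads(to_id_json)
    return target.get("schema") == "graph" and target.get("table") == "Author"


# Assumed sample node ids in the JSON shape the trigger reads.
author_id = '{"type":"node","schema":"graph","table":"Author","id":0}'
customer_id = '{"type":"node","schema":"graph","table":"Customer","id":0}'

if __name__ == "__main__":
    print(is_valid_authored_by_target(author_id))
    print(is_valid_authored_by_target(customer_id))
```

The Customer id fails the check, which corresponds to the INSERT on the slide that the trigger rolls back.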
Trigger – Referential Integrity

    CREATE TRIGGER [graph].[ValidateDeleteAuthor] ON [graph].[Author]
    FOR DELETE
    AS
    -- Block the delete while any edge still points to or from the deleted node
    IF EXISTS (
        SELECT 1
        FROM deleted
        WHERE JSON_VALUE($node_id, '$.id') IN (
            SELECT JSON_VALUE($to_id, '$.id') AS nodeid
            FROM [graph].[AuthoredBy]
            UNION
            SELECT JSON_VALUE($from_id, '$.id') AS nodeid
            FROM [graph].[AuthoredOf]
        )
    )
    BEGIN
        -- Severity 16 so the client sees an error rather than an informational message
        RAISERROR('Unable to delete, edge record exists', 16, 1)
        ROLLBACK TRANSACTION
    END
    GO

    DELETE FROM [BookDB].[graph].[Author] WHERE AuthorId = 1
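The rule this trigger enforces can be stated as: a node may be deleted only when no edge references it at either end. The Python sketch below expresses that rule over in-memory data. The tuple layout, sample edges and the function name `can_delete` are assumptions for illustration. Unlike the trigger, which compares only the id field, this sketch compares table name and id together, so nodes from different tables that share an id are not confused.

```python
# Each node is identified by (table, id); each edge is (from_node, to_node).
# Hypothetical sample edge: Book 1 was authored by Author 1.
edges = [
    (("Book", 1), ("Author", 1)),
]


def can_delete(node, edge_rows):
    """A node may be deleted only if no edge starts or ends at it."""
    return all(node != from_node and node != to_node for from_node, to_node in edge_rows)


if __name__ == "__main__":
    print(can_delete(("Author", 1), edges))
    print(can_delete(("Author", 2), edges))
```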
Graph Reporting in Power BI
Summary
- First-release product
- RDBMS and graph in one product
- Supports SQL Server technologies (security, columnstore, partitioning, HA, R Services, …)
- Graph objects are just tables and work out of the box with Power BI, SSRS, SSIS, SSAS
- Complements the RDBMS
References
- SQL Graph overview: https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-overview
- SQL Graph architecture: https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-architecture
- SQL Graph sample: https://docs.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-sample
- SQL Server Blog: https://blogs.technet.microsoft.com/dataplatforminsider/2017/04/20/graph-data-processing-with-sql-server-2017/
- Force-Directed Graph Power BI Visual: https://appsource.microsoft.com/nl-nl/product/power-bi-visuals/WA104380764?src=office&tab=Overview