HOW TO OPTIMIZE A HIERARCHY IN SQL SERVER Louis Davidson (drsql.org)

Presentation transcript:

HOW TO OPTIMIZE A HIERARCHY IN SQL SERVER Louis Davidson (drsql.org)

Who am I?
- Been in IT for over 18 years
- Microsoft MVP for 9 years
- Corporate Data Architect
- Written five books on database design
  - OK, so they were all versions of the same book. They at least had slightly different titles each time.

Hierarchies

Hierarchy Types
- Trees: single-parent hierarchies
- Graphs: multi-parent hierarchies
- Note: Graphs can be complex to deal with as a whole, but often you can deal with them as a set of trees
[Diagram node labels: Screw, Piece of Wood, Wood with Tape, Screw and Tape, Tape]

Hierarchy Uses
- Trees
  - Species
  - Jurisdictions
  - “Simple” organizational charts (or at least the base manager-employee part of the organization)
  - Directory folders
- Graphs
  - Bill of materials
  - Complex organization chart (all those dotted lines!)
  - Genealogies
    - Biological (typically with the number of parents limited to 2)
    - Family tree (sky is the limit)

Implementation of a Hierarchy
- “There is more than one way to shave a dog”
  - None of which are pleasant for the dog or the shaver
  - And the doctor who orders it only asks for a bald dog
- Hierarchies are not at all natural to manipulate/query using relational code
  - And the natural, recursive processing of a node at a time is horribly difficult and slow in relational code
  - So, multiple methods of processing them have arisen through the years
- The topic (much like the topic of how cruel it is to shave a dog) inspires religious-like arguments
- I find all of the implementation possibilities fascinating, so I set out to do an overview of them all…

Working with Trees - Background
- Node Recursion
- Relational Recursion

Cycles in Hierarchies
- “I’m my own grandpa” syndrome
- Must be understood, or they can cause an infinite loop in processing
- Generally disallowed in trees
- Generally handled in graphs
[Diagram node labels: Grandparent, Parent, Child]
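One common way to keep a cycle from causing an infinite loop is to carry the list of already-visited nodes along with the traversal and refuse to step onto a node twice. The following is a minimal T-SQL sketch of that idea, not code from the presentation; the #Edge table, its columns, and the sample rows are hypothetical.

```sql
-- Hypothetical edge table that contains a cycle: A -> B -> C -> A
CREATE TABLE #Edge
(
    ParentNode varchar(10) NOT NULL,
    ChildNode  varchar(10) NOT NULL,
    PRIMARY KEY (ParentNode, ChildNode)
);

INSERT INTO #Edge (ParentNode, ChildNode)
VALUES ('A', 'B'), ('B', 'C'), ('C', 'A');

-- Walk outward from 'A', remembering every node seen so far in VisitedPath.
WITH Walk AS
(
    SELECT e.ChildNode,
           CAST('/' + e.ParentNode + '/' + e.ChildNode + '/' AS varchar(900)) AS VisitedPath,
           1 AS Distance
    FROM   #Edge AS e
    WHERE  e.ParentNode = 'A'

    UNION ALL

    SELECT e.ChildNode,
           CAST(w.VisitedPath + e.ChildNode + '/' AS varchar(900)),
           w.Distance + 1
    FROM   Walk AS w
           JOIN #Edge AS e
               ON e.ParentNode = w.ChildNode
    -- Cycle guard: skip any edge that leads back to a node already on the path.
    WHERE  w.VisitedPath NOT LIKE '%/' + e.ChildNode + '/%'
)
SELECT ChildNode, Distance, VisitedPath
FROM   Walk
ORDER BY Distance;

DROP TABLE #Edge;
```

Without the guard, the recursion would keep circling until it hit the MAXRECURSION limit. The LIKE test assumes node names contain no wildcard characters or slashes.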

Tree Processing Algorithms
- There are several methods for processing trees in SQL
- We will cover:
  - Fixed Levels
  - Adjacency List
  - HierarchyId
  - Path Technique
  - Nested Sets
  - Kimball Helper Table
- Without giving away too much, pretty much all of the methods have some use…

Preconceived Notions
- Which method/algorithm do you expect to be fastest?
  - Fixed Levels
  - Adjacency List
  - HierarchyId
  - Path Technique
  - Nested Sets
  - Kimball Helper Table

Coding for trees
- Manipulation:
  - Creating a new node
  - Moving/reparenting a node
  - Deleting a node (with/without children)
- Usage:
  - Getting the children of a node
  - Getting the parent of a node
  - Aggregating along the tree
- Note: No tree algorithm allows for “simple” SQL solutions to all of these problems
- We will have demos of all of these operations…

Reparenting Example
- Starting with:
- Perhaps ending with:
- The moved node drags all of its child nodes along with it

Implementing a tree – Fixed Levels
    CREATE TABLE CompanyHierarchy
    (
        -- SQL Server does not allow nullable columns in a PRIMARY KEY
        Company      varchar(100) NOT NULL,
        Headquarters varchar(100) NOT NULL,
        Branch       varchar(100) NOT NULL,
        PRIMARY KEY (Company, Headquarters, Branch)
    )
- Very limited, but very fast and easy to work with
- I will not demo this structure today because its use is both extremely obvious and limited

Implementing a tree – Adjacency List
- Every row includes the key value of its parent
- Rows without a parent (the root) have a NULL parent value
- Code to get information out is the most complex to write (though not as inefficient as it might seem)
    CREATE TABLE CompanyHierarchy
    (
        Organization       varchar(100) NOT NULL PRIMARY KEY,
        ParentOrganization varchar(100) NULL
            REFERENCES CompanyHierarchy (Organization),
        Name               varchar(100) NOT NULL
    )
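As an illustration of the "most complex to write" retrieval code, here is a hedged sketch of fetching a subtree with a recursive common table expression, followed by a reparenting update. It uses a temporary copy of the table so it can be run on its own; the sample rows are hypothetical and this is not the presentation's demo code.

```sql
-- Temporary copy of the slide's table with hypothetical sample rows.
-- The self-referencing FOREIGN KEY is left out because SQL Server does not
-- enforce foreign keys on temporary tables.
CREATE TABLE #CompanyHierarchy
(
    Organization       varchar(100) NOT NULL PRIMARY KEY,
    ParentOrganization varchar(100) NULL,   -- NULL marks the root row
    Name               varchar(100) NOT NULL
);

INSERT INTO #CompanyHierarchy (Organization, ParentOrganization, Name)
VALUES ('Company',   NULL,      'Company'),
       ('TN HQ',     'Company', 'Tennessee Headquarters'),
       ('ME HQ',     'Company', 'Maine Headquarters'),
       ('Nashville', 'TN HQ',   'Nashville Branch'),
       ('Knoxville', 'TN HQ',   'Knoxville Branch');

-- All descendants of one node, found one level per recursion step.
WITH Descendants AS
(
    SELECT c.Organization, c.Name, 0 AS Distance
    FROM   #CompanyHierarchy AS c
    WHERE  c.Organization = 'TN HQ'

    UNION ALL

    SELECT c.Organization, c.Name, d.Distance + 1
    FROM   Descendants AS d
           JOIN #CompanyHierarchy AS c
               ON c.ParentOrganization = d.Organization
)
SELECT Organization, Name, Distance
FROM   Descendants
ORDER BY Distance, Organization;

-- Reparenting is a single-row update. The children follow automatically
-- because they point at 'TN HQ' itself, not at its position in the tree.
UPDATE #CompanyHierarchy
SET    ParentOrganization = 'ME HQ'
WHERE  Organization = 'TN HQ';

DROP TABLE #CompanyHierarchy;
```

The trade-off shows up clearly here: modification is trivial, while every read of more than one level needs recursion.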

Implementing a tree – Path Method
- Every row includes a representation of the path to its parent
- Processing makes use of LIKE and string processing (I have seen a case that used fixed-length binary values)
- Limitation on path size for string manipulation/indexing
    CREATE TABLE CompanyHierarchy
    (
        OrganizationId int          NOT NULL PRIMARY KEY,
        Name           varchar(100) NOT NULL,
        Path           varchar(900)
    )
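A sketch of how LIKE and string processing might be used with this layout. The path format is an assumption: each row stores the slash-delimited key values from the root down to and including itself, with a trailing slash so that '/1/2/' cannot match '/1/20/'. The sample rows and variable names are hypothetical.

```sql
CREATE TABLE #CompanyHierarchy
(
    OrganizationId int          NOT NULL PRIMARY KEY,
    Name           varchar(100) NOT NULL,
    Path           varchar(900) NOT NULL
);

INSERT INTO #CompanyHierarchy (OrganizationId, Name, Path)
VALUES (1, 'Company',                '/1/'),
       (2, 'Tennessee Headquarters', '/1/2/'),
       (3, 'Maine Headquarters',     '/1/3/'),
       (4, 'Nashville Branch',       '/1/2/4/'),
       (5, 'Knoxville Branch',       '/1/2/5/');

-- Descendants of node 2: every row whose path starts with node 2's path.
SELECT child.OrganizationId, child.Name, child.Path
FROM   #CompanyHierarchy AS parent
       JOIN #CompanyHierarchy AS child
           ON child.Path LIKE parent.Path + '%'
WHERE  parent.OrganizationId = 2
  AND  child.OrganizationId <> parent.OrganizationId;

-- Ancestors of node 4: every row whose path is a prefix of node 4's path.
SELECT ancestor.OrganizationId, ancestor.Name
FROM   #CompanyHierarchy AS node
       JOIN #CompanyHierarchy AS ancestor
           ON node.Path LIKE ancestor.Path + '%'
WHERE  node.OrganizationId = 4
  AND  ancestor.OrganizationId <> node.OrganizationId
ORDER BY LEN(ancestor.Path);

-- Reparent node 2 under node 3: swap the old parent prefix for the new one
-- on the moved node and on everything beneath it.
DECLARE @OldParentPath varchar(900) = '/1/',
        @NewParentPath varchar(900) = '/1/3/',
        @MovingPath    varchar(900) = '/1/2/';

UPDATE #CompanyHierarchy
SET    Path = STUFF(Path, 1, LEN(@OldParentPath), @NewParentPath)
WHERE  Path LIKE @MovingPath + '%';

DROP TABLE #CompanyHierarchy;
```

Reads need no recursion, but a move has to rewrite the path of every row in the moved subtree.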

Implementing a tree – HierarchyId
- Somewhat unnatural method for the typical SQL programmer
- Similar to the Path Method, and has some of the same limitations when moving nodes around
- Node path does not use data natural to the table, but rather positional location
    CREATE TABLE CompanyHierarchy
    (
        OrganizationId int          NOT NULL PRIMARY KEY,
        Name           varchar(100) NOT NULL,
        OrgNode        hierarchyid  NOT NULL
    )
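The following sketch shows the built-in hierarchyid methods that typically stand in for the string handling of the Path Method: IsDescendantOf for subtree reads, GetAncestor and GetDescendant for finding a free position, and GetReparentedValue for moves. The sample rows are hypothetical, and this is one possible way to write a move, not necessarily how the demo code does it.

```sql
CREATE TABLE #CompanyHierarchy
(
    OrganizationId int          NOT NULL PRIMARY KEY,
    Name           varchar(100) NOT NULL,
    OrgNode        hierarchyid  NOT NULL
);

-- String literals convert implicitly to hierarchyid; '/' is the root.
INSERT INTO #CompanyHierarchy (OrganizationId, Name, OrgNode)
VALUES (1, 'Company',                '/'),
       (2, 'Tennessee Headquarters', '/1/'),
       (3, 'Maine Headquarters',     '/2/'),
       (4, 'Nashville Branch',       '/1/1/'),
       (5, 'Knoxville Branch',       '/1/2/');

-- Subtree of node 2 (IsDescendantOf also returns 1 for the node itself).
DECLARE @Parent hierarchyid =
    (SELECT OrgNode FROM #CompanyHierarchy WHERE OrganizationId = 2);

SELECT OrganizationId, Name,
       OrgNode.ToString() AS NodePath,
       OrgNode.GetLevel() AS NodeLevel
FROM   #CompanyHierarchy
WHERE  OrgNode.IsDescendantOf(@Parent) = 1
ORDER BY OrgNode;

-- Reparent node 2 (and its subtree) under node 3.
DECLARE @Moving    hierarchyid = @Parent;
DECLARE @NewParent hierarchyid =
    (SELECT OrgNode FROM #CompanyHierarchy WHERE OrganizationId = 3);

-- Find the last existing child of the new parent (NULL if it has none),
-- then ask for a free position after it.
DECLARE @LastChild hierarchyid =
    (SELECT MAX(OrgNode) FROM #CompanyHierarchy
     WHERE  OrgNode.GetAncestor(1) = @NewParent);
DECLARE @NewNode hierarchyid = @NewParent.GetDescendant(@LastChild, NULL);

UPDATE #CompanyHierarchy
SET    OrgNode = OrgNode.GetReparentedValue(@Moving, @NewNode)
WHERE  OrgNode.IsDescendantOf(@Moving) = 1;

DROP TABLE #CompanyHierarchy;
```

As with the Path Method, the move touches every row in the subtree, which is the shared limitation the slide mentions.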

Implementing a tree – Nested Sets
- Query processing is done using range queries
- Quite slow to maintain because the structure is fragile (one change can renumber many rows)
- Can produce excellent performance for queries
    CREATE TABLE CompanyHierarchy
    (
        Organization varchar(100) NOT NULL PRIMARY KEY,
        Name         varchar(100) NOT NULL,
        -- LEFT and RIGHT are reserved words in T-SQL, so they are bracketed
        [Left]       int          NOT NULL,
        [Right]      int          NOT NULL
    )
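A short sketch of the range queries and of why maintenance is costly. It assumes the usual nested-sets numbering, in which each node's [Left] and [Right] values enclose those of all its descendants; the sample rows are hypothetical.

```sql
CREATE TABLE #CompanyHierarchy
(
    Organization varchar(100) NOT NULL PRIMARY KEY,
    Name         varchar(100) NOT NULL,
    [Left]       int          NOT NULL,
    [Right]      int          NOT NULL
);

INSERT INTO #CompanyHierarchy (Organization, Name, [Left], [Right])
VALUES ('Company',   'Company',                1, 10),
       ('TN HQ',     'Tennessee Headquarters', 2, 7),
       ('Nashville', 'Nashville Branch',       3, 4),
       ('Knoxville', 'Knoxville Branch',       5, 6),
       ('ME HQ',     'Maine Headquarters',     8, 9);

-- Descendants of 'TN HQ': rows whose [Left] falls inside its range.
SELECT child.Organization, child.Name
FROM   #CompanyHierarchy AS parent
       JOIN #CompanyHierarchy AS child
           ON  child.[Left] > parent.[Left]
           AND child.[Left] < parent.[Right]
WHERE  parent.Organization = 'TN HQ'
ORDER BY child.[Left];

-- Number of descendants per node, with no join at all.
SELECT Organization, ([Right] - [Left] - 1) / 2 AS DescendantCount
FROM   #CompanyHierarchy
ORDER BY [Left];

-- Adding one leaf under 'ME HQ' means opening a gap of 2 in the numbering,
-- which touches every row to the right of the insertion point.
DECLARE @ParentRight int =
    (SELECT [Right] FROM #CompanyHierarchy WHERE Organization = 'ME HQ');

UPDATE #CompanyHierarchy
SET    [Left]  = CASE WHEN [Left]  >  @ParentRight THEN [Left]  + 2 ELSE [Left]  END,
       [Right] = CASE WHEN [Right] >= @ParentRight THEN [Right] + 2 ELSE [Right] END
WHERE  [Right] >= @ParentRight;

INSERT INTO #CompanyHierarchy (Organization, Name, [Left], [Right])
VALUES ('Portland', 'Portland Branch', @ParentRight, @ParentRight + 1);

DROP TABLE #CompanyHierarchy;
```

In this tiny tree the insert renumbers only two rows, but inserting near the left edge of a large tree would renumber nearly all of them.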

Implementing a tree – Kimball Helper
- Developed initially for data warehousing, since data is modified all at once with a fixed cost
- Basically explodes the hierarchy into a table that turns all hierarchy manipulations into a relational query
- Maintenance can be slightly costly, but using the data is extremely fast

Implementing a tree – Kimball Helper
- For the rows in yellow, expands to the table shown:
[Helper table columns: ParentId, ChildId, Distance, ParentRootNode, ChildLeafNode]
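To make the "explode the hierarchy" idea concrete, here is a hedged sketch that builds a helper table from an adjacency list and then aggregates with a plain join. The helper column names come from the slide; their exact meaning is assumed (ParentRootNode flags rows whose parent is the root, ChildLeafNode flags rows whose child has no children), as is the inclusion of a distance-0 row for each node. The #Organization table, its SalesAmount column, and the data are hypothetical.

```sql
CREATE TABLE #Organization
(
    OrganizationId       int          NOT NULL PRIMARY KEY,
    ParentOrganizationId int          NULL,
    Name                 varchar(100) NOT NULL,
    SalesAmount          int          NOT NULL
);

INSERT INTO #Organization (OrganizationId, ParentOrganizationId, Name, SalesAmount)
VALUES (1, NULL, 'Company',                0),
       (2, 1,    'Tennessee Headquarters', 0),
       (3, 1,    'Maine Headquarters',     30),
       (4, 2,    'Nashville Branch',       100),
       (5, 2,    'Knoxville Branch',       50);

CREATE TABLE #OrganizationHelper
(
    ParentId       int NOT NULL,
    ChildId        int NOT NULL,
    Distance       int NOT NULL,
    ParentRootNode bit NOT NULL,
    ChildLeafNode  bit NOT NULL,
    PRIMARY KEY (ParentId, ChildId)
);

-- One row per (ancestor, descendant) pair, including each node with itself.
WITH Pairs AS
(
    SELECT OrganizationId AS ParentId, OrganizationId AS ChildId, 0 AS Distance
    FROM   #Organization

    UNION ALL

    SELECT p.ParentId, o.OrganizationId, p.Distance + 1
    FROM   Pairs AS p
           JOIN #Organization AS o
               ON o.ParentOrganizationId = p.ChildId
)
INSERT INTO #OrganizationHelper
       (ParentId, ChildId, Distance, ParentRootNode, ChildLeafNode)
SELECT p.ParentId, p.ChildId, p.Distance,
       CASE WHEN parent.ParentOrganizationId IS NULL THEN 1 ELSE 0 END,
       CASE WHEN EXISTS (SELECT *
                         FROM   #Organization AS c
                         WHERE  c.ParentOrganizationId = p.ChildId)
            THEN 0 ELSE 1 END
FROM   Pairs AS p
       JOIN #Organization AS parent
           ON parent.OrganizationId = p.ParentId;

-- Aggregating along the tree is now an ordinary join and GROUP BY.
SELECT parent.Name, SUM(child.SalesAmount) AS TotalSales
FROM   #OrganizationHelper AS h
       JOIN #Organization AS parent ON parent.OrganizationId = h.ParentId
       JOIN #Organization AS child  ON child.OrganizationId  = h.ChildId
GROUP BY parent.Name
ORDER BY TotalSales DESC;

DROP TABLE #OrganizationHelper;
DROP TABLE #Organization;
```

The recursion runs once, at load time, which fits the data warehousing setting where the hierarchy is rebuilt in bulk rather than edited row by row.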

Demo Setup
- For each style of hierarchy, we will see how to:
  - Implement a physical model that models the corporate hierarchy of the previous graphics
  - Create stored procedures for inserting, deleting, and reparenting a node
  - Write queries to access and aggregate the data in the hierarchy
- We will do this for two sets of data: the data in the presentation, and then a randomly generated set

Demo Code
- Example code available in download

Did I change any of your minds?

Graphs
- Generally implemented in the same manner as an adjacency list
- Can be processed in the same manner as an adjacency list
- Primary difference is that a child can have more than one parent node
- Cycles are generally acceptable
- Graph structure will always be external to the data structure
- Graphs are even more natural data structures than trees
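A minimal sketch of why the graph structure ends up external to the data: once a child can have more than one parent, the parent reference moves out of the node's own row and into a separate edge table. The #PartAssembly table and its columns are hypothetical, and the rows are only a guess at a bill of materials built from the part names used earlier in the deck.

```sql
-- Edge table: one row per (parent, child) relationship.
CREATE TABLE #PartAssembly
(
    AssemblyName varchar(50) NOT NULL,  -- parent node
    PartName     varchar(50) NOT NULL,  -- child node
    PRIMARY KEY (AssemblyName, PartName)
);

INSERT INTO #PartAssembly (AssemblyName, PartName)
VALUES ('Wood with Tape', 'Piece of Wood'),
       ('Wood with Tape', 'Tape'),
       ('Screw and Tape', 'Screw'),
       ('Screw and Tape', 'Tape');

-- All parents of one node.
SELECT AssemblyName
FROM   #PartAssembly
WHERE  PartName = 'Tape'
ORDER BY AssemblyName;

-- Nodes with more than one parent, which is what makes this a graph
-- rather than a tree.
SELECT PartName, COUNT(*) AS ParentCount
FROM   #PartAssembly
GROUP BY PartName
HAVING COUNT(*) > 1;

DROP TABLE #PartAssembly;
```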

Graphs are Everywhere
- Almost any many-to-many relationship can be a graph
[Diagram table labels: Movie, Actor, ActingCast, Director, MovieDirectory]

Graph Demo
[Diagram table labels: Person, Interest, PersonInterest, PersonConnection]

Contact info
- Louis Davidson -
- Website – http://drsql.org (get slides here)
- Twitter –
- SQL Blog
- Simple Talk Blog – What Counts for a DBA