Managing Table Partitions at the Extreme

Presentation transcript:

Managing Table Partitions at the Extreme
Ron Talmage, 2019-04-27


Agenda
- Overview of Table Partitioning
- The Partition Function
- The Partition Scheme
- Upper Extreme: Adding new partitions
- Lower Extreme: NULL and archiving

Table Partitioning Overview
Some uses for table partitioning:
- Eliminate partitions from queries
- Manage indexes by partition
- Help with archiving data: by TRUNCATE of individual partitions (SQL Server 2016+) or by SWITCH
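The two archiving options can be sketched roughly as follows. This is only an illustration: dbo.Orders and dbo.Orders_Archive are hypothetical names, and the sketch assumes dbo.Orders_Archive is an empty, non-partitioned table with the same columns and indexes, located on the same filegroup as the partition being switched out.

```sql
-- Hypothetical objects: dbo.Orders (partitioned) and dbo.Orders_Archive (empty,
-- same structure, on the same filegroup as partition 4 of dbo.Orders).

-- Option 1 (SQL Server 2016 and later): remove all rows from selected partitions.
-- All indexes on the table must be aligned with the table's partitioning.
TRUNCATE TABLE dbo.Orders WITH (PARTITIONS (2, 3));

-- Option 2: move partition 4 out of the table as a metadata-only operation,
-- keeping its rows available in the archive table.
ALTER TABLE dbo.Orders SWITCH PARTITION 4 TO dbo.Orders_Archive;
```

TRUNCATE discards the rows, while SWITCH keeps them in a separate table that can then be backed up, moved, or dropped.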

Basics
Partitioned Table -> Partition Scheme -> Partition Function
Partition Scheme -> Filegroups -> Files

Table Partitioning: There
Partitioned Table -> Ptn Scheme -> Ptn Function
CREATE TABLE <tbl> AS … XXDate DateTime ON PtnScheme(xxDate);
- Partitioned Table: Ptn key, data type
- Ptn Scheme: refers to the Ptn Function; maps partitions to FGs
- Ptn Function: specifies the data type; enumerates boundary points
- Filegroups: file(s) per filegroup
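The chain from table to scheme to function might look like this in full. It is a minimal sketch, not the demo script: pfDemo, psDemo, dbo.DemoOrders, and the OrderId column are hypothetical names, and every partition is mapped to the PRIMARY filegroup so the script runs without extra filegroups.

```sql
-- Partition function: specifies the data type and enumerates the boundary values.
CREATE PARTITION FUNCTION pfDemo (datetime)
AS RANGE RIGHT FOR VALUES ('20190101', '20190401', '20190701', '20191001');

-- Partition scheme: refers to the function and maps its partitions to filegroups.
-- ALL TO ([PRIMARY]) is a simplification made for this sketch.
CREATE PARTITION SCHEME psDemo
AS PARTITION pfDemo ALL TO ([PRIMARY]);

-- Partitioned table: created ON the scheme, naming the partition key column.
CREATE TABLE dbo.DemoOrders
(
    OrderId int      NOT NULL,
    XXDate  datetime NOT NULL
) ON psDemo (XXDate);
```

Four boundary values produce five partitions, and the partition key column must have the same data type as the partition function's parameter.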

And back again
Ptn Function -> Ptn Scheme -> Partitioned Table
CREATE TABLE <tbl> AS … XXDate DateTime ON PtnScheme(xxDate);
- Ptn Function: specifies the data type; enumerates boundary values; RANGE RIGHT/LEFT; sorted boundary values; NULL values in partition 1; "Past" partition; "Future" partition; $PARTITION(); SPLIT/MERGE
- Ptn Scheme: refers to the Ptn Function; maps partitions (P1, P2, P3, P4) to FGs (FG1, FG2); Ptns per FG; archive unit
- Partitioned Table: Ptn key, data type; ALTER TABLE SWITCH; TRUNCATE TABLE Ptn
- Filegroups: file(s) per filegroup

Zeroing in on the Extremes
- Upper Extreme: Adding new partitions
- Lower Extreme: NULL and Past data, archiving
Ptn 1 | Ptn 2 | Ptn 3 | Ptn 4 | Ptn 5

Partition Function

The Partition Function: Boundary Values
Lowest value ... Highest value
- A range is a series of discrete possible values
- Range values are based on the data type, sorted from lowest value to highest value
- To form partitions, choose boundary values
- Boundary values are taken from the available range values
- Decide RANGE RIGHT or RANGE LEFT

Why choose RANGE RIGHT or LEFT?
- Consider a single boundary value (two partitions) between the lowest and highest values
- We must choose which partition the boundary value itself goes into: the partition on its right (RANGE RIGHT) or the partition on its left (RANGE LEFT)

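The Partition Function: RANGE RIGHT/LEFT
Lowest value ... Highest value
- RANGE RIGHT: boundary_value_on_right = 1. P1 | P2 | P3 | P4, with each boundary value belonging to the partition on its right
- RANGE LEFT: boundary_value_on_right = 0. P1 | P2 | P3 | P4, with each boundary value belonging to the partition on its left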

Why RANGE RIGHT is preferred
Boundary values: 2019-01-01, 2019-04-01, 2019-07-01, 2019-10-01, 2020-01-01

RANGE RIGHT: boundary_value_on_right = 1
Ptn 1: < 2019-01-01 OR NULL
Ptn 2: >= 2019-01-01 AND < 2019-04-01
Ptn 3: >= 2019-04-01 AND < 2019-07-01
Ptn 4: >= 2019-07-01 AND < 2019-10-01
Ptn 5: >= 2019-10-01 AND < 2020-01-01
Ptn 6: >= 2020-01-01

RANGE LEFT: boundary_value_on_right = 0
Ptn 1: <= 2019-01-01 OR NULL
Ptn 2: > 2019-01-01 AND <= 2019-04-01
Ptn 3: > 2019-04-01 AND <= 2019-07-01
Ptn 4: > 2019-07-01 AND <= 2019-10-01
Ptn 5: > 2019-10-01 AND <= 2020-01-01
Ptn 6: > 2020-01-01

- Suppose the DATE data type, partitioning by month, and partition boundaries of 2019-01-01, 2019-02-01, 2019-03-01, etc.
- RANGE RIGHT produces partitions of 2019-01-01 to 2019-01-31, 2019-02-01 to 2019-02-28, etc.
- RANGE LEFT results in 2019-01-02 to 2019-02-01, 2019-02-02 to 2019-03-01, etc., so each partition ends up with data from two months
- RANGE LEFT would need boundaries of 2019-01-31, 2019-02-28, etc. to have monthly partitions
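The difference can be checked directly with $PARTITION. The sketch below assumes two hypothetical partition functions, pfQtrRight and pfQtrLeft, built on the quarterly boundaries listed above with the DATE data type.

```sql
-- Same boundary values, opposite RANGE direction (function names are hypothetical).
CREATE PARTITION FUNCTION pfQtrRight (date)
AS RANGE RIGHT FOR VALUES ('2019-01-01', '2019-04-01', '2019-07-01', '2019-10-01', '2020-01-01');

CREATE PARTITION FUNCTION pfQtrLeft (date)
AS RANGE LEFT FOR VALUES ('2019-01-01', '2019-04-01', '2019-07-01', '2019-10-01', '2020-01-01');

-- Ask each function which partition a boundary value falls into.
SELECT $PARTITION.pfQtrRight(CAST('2019-04-01' AS date)) AS RangeRightPtn,  -- 3
       $PARTITION.pfQtrLeft(CAST('2019-04-01' AS date))  AS RangeLeftPtn;   -- 2

-- The direction and the boundary values are visible in the catalog views.
SELECT pf.name, pf.boundary_value_on_right, prv.boundary_id, prv.value
FROM sys.partition_functions AS pf
JOIN sys.partition_range_values AS prv
    ON prv.function_id = pf.function_id
WHERE pf.name IN (N'pfQtrRight', N'pfQtrLeft')
ORDER BY pf.name, prv.boundary_id;
```

With RANGE RIGHT, the first day of the quarter lands in the partition that holds the rest of that quarter. With RANGE LEFT it lands in the partition holding the previous quarter.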

Creating a NULL partition
- NULL can be a partition boundary value
- Only for partition 1
- Some data types in the partition function definition must use explicit conversions: Date, DateTime2()

Demo 1: Partition functions PtnExtremeDemo1- PartitionFunctions.sql

Partition Scheme

Partition Schemes
- Partition schemes map the resulting partitions to filegroups
- To illustrate, let's assume:
  - Datetime data type
  - Partition by quarter
  - 1 year per filegroup
  - Year 2019 in one filegroup

Adding Filegroups
Lowest value | Ptn 1 | Ptn 2 | Ptn 3 | Ptn 4 | Highest value
Boundary values: 2019-01-01, 2019-04-01, 2019-07-01, 2019-10-01, 2020-01-01
"Past" FG:
  Ptn 1: < 2019-01-01 OR NULL
FG2019:
  Ptn 2: >= 2019-01-01 AND < 2019-04-01
  Ptn 3: >= 2019-04-01 AND < 2019-07-01
  Ptn 4: >= 2019-07-01 AND < 2019-10-01
  Ptn 5: >= 2019-10-01 AND < 2020-01-01
"Future" FG:
  Ptn 6: >= 2020-01-01
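A partition scheme for this layout could be sketched as below. The names pfQtr, psQtr, FGPast, FG2019, and FGFuture are assumptions for illustration, and the filegroups are assumed to already exist in the database with at least one file each.

```sql
-- Five boundary values give six partitions (names are hypothetical).
CREATE PARTITION FUNCTION pfQtr (datetime)
AS RANGE RIGHT FOR VALUES ('20190101', '20190401', '20190701', '20191001', '20200101');

-- One filegroup name per partition, in partition order:
-- Ptn 1 -> FGPast, Ptn 2 to Ptn 5 -> FG2019, Ptn 6 -> FGFuture.
CREATE PARTITION SCHEME psQtr
AS PARTITION pfQtr
TO (FGPast, FG2019, FG2019, FG2019, FG2019, FGFuture);

-- Verify which filegroup each partition number maps to.
SELECT dds.destination_id AS partition_number,
       fg.name            AS filegroup_name
FROM sys.partition_schemes AS ps
JOIN sys.destination_data_spaces AS dds
    ON dds.partition_scheme_id = ps.data_space_id
JOIN sys.filegroups AS fg
    ON fg.data_space_id = dds.data_space_id
WHERE ps.name = N'psQtr'
ORDER BY dds.destination_id;
```

Repeating a filegroup name in the TO list is what places several partitions, here one year of quarters, in the same filegroup.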

How many partitions in a filegroup?
Possible criteria: simplicity and archiving
Assume a date-oriented partition by quarter: 10 years partitioned by quarter = 40 Ptns

Option | Archive unit | Ptns/FG | Total FGs | Files/FG | Total files/DB | Balanced I/O (Y/N) | Mtc/complexity level
1      | Qtrly        | 1       | 40        | 4        | 160            | Y                  | High
2      | Yearly       | 4       | 10        |          |                |                    | Medium
3      | 2 yrs        | 8       | 5         | 4        | 20             |                    |
4      | 5 yrs        | 20      | 2         |          |                |                    | Low

Demo 2 PtnExtremeDemo2 – PartitionSchemes.sql

Upper Extreme

Upper Extreme: Adding new partitions
Current: Ptn 1 to Ptn 6
- FGPast: Ptn 1 (< 2019-01-01 OR NULL)
- FG2019: Ptn 2 to Ptn 5
- FG2020: Ptn 6 (>= 2020-01-01)
Desired: Ptn 1 to Ptn 10
- FGPast: Ptn 1 (< 2019-01-01 OR NULL)
- FG2019: Ptn 2 to Ptn 5
- FG2020: Ptn 6 to Ptn 9
- FG2021: Ptn 10 (>= 2021-01-01)

Extending rowstore partitions: Unstacking Boxes
[Figure: a stack of four boxes, numbered 1 to 4, spread out into a row]
- Start with the farthest out: moves 3 boxes
- Start with the nearest: moves 3+2+1 = 6 boxes

Rowstore first try: using SPLIT in ascending order
Step 1: Ptn 1 to Ptn 7
Step 2: Ptn 1 to Ptn 8
Step 3: Ptn 1 to Ptn 9
Step 4: Ptn 1 to Ptn 10
Filegroups: FGPast (< 2019-01-01 OR NULL), FG2019, FG2020, FG2021 (>= 2021-01-01)

Rowstore second try: SPLIT in descending order
Step 1: Ptn 1 to Ptn 7
Step 2: Ptn 1 to Ptn 8
Step 3: Ptn 1 to Ptn 9
Step 4: Ptn 1 to Ptn 10
Filegroups: FGPast (< 2019-01-01 OR NULL), FG2019, FG2020, FG2021 (>= 2021-01-01)
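A descending-order expansion might look like the following sketch. It assumes a hypothetical RANGE RIGHT function pfQtr and scheme psQtr whose highest boundary is currently 2020-01-01, with the last partition on FG2020, and it assumes filegroups FG2020 and FG2021 exist.

```sql
-- Farthest boundary first: only rows at or above 2021-01-01 (if any) have to move.
-- With RANGE RIGHT, the new partition (the boundary value and everything above it)
-- is created on the filegroup marked NEXT USED.
ALTER PARTITION SCHEME psQtr NEXT USED FG2021;
ALTER PARTITION FUNCTION pfQtr() SPLIT RANGE ('20210101');

-- Then work back toward the existing boundary, one quarter at a time.
-- Each split moves at most one quarter of rows.
ALTER PARTITION SCHEME psQtr NEXT USED FG2020;
ALTER PARTITION FUNCTION pfQtr() SPLIT RANGE ('20201001');

ALTER PARTITION SCHEME psQtr NEXT USED FG2020;
ALTER PARTITION FUNCTION pfQtr() SPLIT RANGE ('20200701');

ALTER PARTITION SCHEME psQtr NEXT USED FG2020;
ALTER PARTITION FUNCTION pfQtr() SPLIT RANGE ('20200401');
```

Running the same four splits in ascending order would move the later quarters again on every split, which is the 3+2+1 pattern from the box analogy. NEXT USED has to be set again before each SPLIT.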

Undoing the expansion: Re-Stacking boxes
Hmm… Do the closest first
[Figure: four boxes, numbered 1 to 4, stacked back into a single pile]
- One order moves 3 boxes
- The other order moves 3+2+1 = 6 boxes

Reversing: MERGE in descending order
Start: Ptn 1 to Ptn 10
Step 1: Ptn 1 to Ptn 9
Step 2: Ptn 1 to Ptn 8
Step 3: Ptn 1 to Ptn 7
Filegroups: FGPast (< 2019-01-01 OR NULL), FG2019, FG2020, FG2021 (>= 2021-01-01)
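Reversing the expansion with MERGE could be sketched as below, again assuming the hypothetical RANGE RIGHT function pfQtr, this time with quarterly boundaries from 2020-04-01 through 2021-01-01 to remove.

```sql
-- Each MERGE removes one boundary value. With RANGE RIGHT, the partition that
-- held the boundary value is dropped and its rows join the partition to its left.
ALTER PARTITION FUNCTION pfQtr() MERGE RANGE ('20210101');
ALTER PARTITION FUNCTION pfQtr() MERGE RANGE ('20201001');
ALTER PARTITION FUNCTION pfQtr() MERGE RANGE ('20200701');
ALTER PARTITION FUNCTION pfQtr() MERGE RANGE ('20200401');

-- Confirm the remaining boundary values.
SELECT prv.boundary_id, prv.value
FROM sys.partition_functions AS pf
JOIN sys.partition_range_values AS prv
    ON prv.function_id = pf.function_id
WHERE pf.name = N'pfQtr'
ORDER BY prv.boundary_id;
```

How much data each step moves depends on which partitions hold rows, so it is worth testing the order on a copy of the table before running it in production.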

CCI Expanding Partitions
Problem: Only empty partitions can be split or merged on clustered columnstore index (CCI) tables
- Strategy 1: SWITCH out the last partition, add the new partitions, and load them from the switched table
- Strategy 2: Convert the table to rowstore using a clustered index and expand
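Strategy 1 might be sketched as follows. All object names are hypothetical: dbo.FactSales is assumed to be a CCI table on scheme psQtr and RANGE RIGHT function pfQtr, with partition 6 as its last partition, and dbo.FactSales_Stage is assumed to be an empty table with identical columns and a clustered columnstore index on the same filegroup as partition 6.

```sql
-- 1. Switch the last partition out so that it is empty in the partitioned table.
ALTER TABLE dbo.FactSales SWITCH PARTITION 6 TO dbo.FactSales_Stage;

-- 2. Split the now-empty partition, farthest boundary first.
ALTER PARTITION SCHEME psQtr NEXT USED FG2021;
ALTER PARTITION FUNCTION pfQtr() SPLIT RANGE ('20210101');

ALTER PARTITION SCHEME psQtr NEXT USED FG2020;
ALTER PARTITION FUNCTION pfQtr() SPLIT RANGE ('20201001');
-- Repeat NEXT USED and SPLIT RANGE for each remaining new boundary.

-- 3. Load the rows back; each row is routed to its new partition by the key.
--    SELECT * assumes identical column order and no identity column.
INSERT INTO dbo.FactSales
SELECT * FROM dbo.FactSales_Stage;
```

The reload in step 3 is a full insert rather than a metadata operation, so its cost grows with the size of the partition that was switched out.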

Demo 3 PtnExtremeDemo3 - Expanding Partitions.sql

Lower Extreme

Lower Extreme: NULL and Archiving
- SQL Server allows NULL as a boundary value
- Works for datetime, integers, char/varchar
- Does not work for date or datetime2
- The NULL partition will always be Ptn 1
- With no NULL partition, partition key NULL values are stored in Ptn 1 with Past data

Adding a NULL Partition
Boundary values: NULL, 2019-01-01, 2019-04-01, 2019-07-01, 2019-10-01, 2020-01-01
NULL FG:
  Ptn 1: NULL
"Past" FG:
  Ptn 2: >= Lowest AND < 2019-01-01
FG2019:
  Ptn 3: >= 2019-01-01 AND < 2019-04-01
  Ptn 4: >= 2019-04-01 AND < 2019-07-01
  Ptn 5: >= 2019-07-01 AND < 2019-10-01
  Ptn 6: >= 2019-10-01 AND < 2020-01-01
"Future" FG:
  Ptn 7: >= 2020-01-01

Archiving: No NULL partition
Current: Ptn 1 to Ptn 10
- FGPast (< 2019-01-01 OR NULL), FG2019, FG2020, FG2021 (>= 2021-01-01)
Desired: Ptn 1 to Ptn 6
- FGPast (< 2020-01-01 OR NULL), FG2020, FG2021 (>= 2021-01-01)
- Other filegroups shown: FGPre2019, FG2019
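One quarter of the archiving step might be sketched like this. The names are hypothetical: dbo.Orders is assumed to be partitioned on scheme psQtr and RANGE RIGHT function pfQtr, and dbo.Orders_2019Q1 is assumed to be an empty table with the same structure on FG2019, the filegroup of partition 2.

```sql
-- 1. Move the oldest 2019 quarter (partition 2) out of the table.
--    This is a metadata-only operation.
ALTER TABLE dbo.Orders SWITCH PARTITION 2 TO dbo.Orders_2019Q1;

-- 2. Remove the boundary that started that quarter. With RANGE RIGHT the
--    now-empty partition 2 is dropped, so no rows move and partition 1
--    (on FGPast) extends up to the next boundary, 2019-04-01.
ALTER PARTITION FUNCTION pfQtr() MERGE RANGE ('20190101');

-- Repeating both steps for the boundaries 2019-04-01, 2019-07-01 and
-- 2019-10-01 leaves partition 1 covering everything below 2020-01-01.
```

Emptying a partition before merging its boundary is what keeps the MERGE cheap. Merging a boundary whose partition still holds rows forces those rows to be moved.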

Archiving: With NULL partition
Current: Ptn 1 to Ptn 11
- FGNULL, FGPast (< 2019-01-01), FG2019, FG2020, FG2021 (>= 2021-01-01)
Desired: Ptn 1 to Ptn 7
- FGNULL, FGPast (< 2020-01-01), FG2020, FG2021 (>= 2021-01-01)
- Other filegroups shown: FGPre2019, FG2019

Thanks! And references
- Dan Guzman, Table Partitioning Best Practices: https://www.dbdelta.com/table-partitioning-best-practices/
- Cathrine Wilhelmsen, Table Partitioning in SQL Server – The Basics: https://www.cathrinewilhelmsen.net/2015/04/12/table-partitioning-in-sql-server/
- Niko Neugebauer, Columnstore Indexes – part 116 ("Partitioning Specifics"): http://www.nikoport.com/2017/12/28/columnstore-indexes-part-116-partitioning-specifics/