Managing Table Partitions at the Extreme Ron Talmage 2019-04-27
Agenda
- Overview of Table Partitioning
- The Partition Function
- The Partition Scheme
- Upper Extreme: Adding new partitions
- Lower Extreme: NULL and archiving
Table Partitioning Overview
Some uses for table partitioning:
- Eliminate partitions from queries
- Manage indexes by partition
- Help with archiving data:
  - SQL Server 2016+ can TRUNCATE by partition
  - By SWITCH
Basics
Partitioned Table, Partition Function, Partition Scheme, Filegroups, Files
Table Partitioning: There
- Partitioned Table: CREATE TABLE <tbl> ( … XXDate datetime … ) ON PtnScheme(XXDate); defines the Ptn Key and its data type
- Ptn Scheme: refers to the Ptn Function; maps partitions to FGs (filegroups)
- Ptn Function: specifies the data type; enumerates boundary points
- Filegroups: File(s) per filegroup
And back again
- Partitioned Table: CREATE TABLE <tbl> ( … XXDate datetime … ) ON PtnScheme(XXDate); Ptn Key; data type; ALTER TABLE SWITCH; TRUNCATE TABLE by Ptn
- Ptn Scheme: refers to the Ptn Function; maps partitions (P1, P2, P3, P4) to FGs (FG1, FG2); Ptns per FG; archive unit
- Ptn Function: specifies the data type; enumerates boundary values; RANGE RIGHT/LEFT; sorted boundary values; NULL values in partition 1; “Past” partition; “Future” partition; $PARTITION(); SPLIT/MERGE
- Filegroups: File(s) per filegroup
Zeroing in on the Extremes
- Upper Extreme: Adding new partitions
- Lower Extreme: NULL and Past data, archiving
(Diagram: Ptn 1, Ptn 2, Ptn 3, Ptn 4, Ptn 5, with the extremes at Ptn 1 and Ptn 5.)
Partition Function
The Partition Function: Boundary Values
- The range is a series of discrete possible values, from Lowest Value to Highest Value
- Range values are based on the data type, sorted from lowest value to highest value
- To form partitions, choose boundary values
- Boundary values are taken from the available range values
- Decide RANGE RIGHT or RANGE LEFT
Why choose RANGE RIGHT or LEFT?
- Consider a single boundary value (two partitions) between the Lowest value and the Highest value
- We must choose which partition the boundary value itself will go into: RANGE RIGHT or RANGE LEFT
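The Partition Function: RANGE RIGHT/LEFT
(Diagram: the range from Lowest value to Highest value, cut by three boundary values into P1, P2, P3, P4.)
- RANGE RIGHT: boundary_value_on_right = 1. Each boundary value belongs to the partition on its right.
- RANGE LEFT: boundary_value_on_right = 0. Each boundary value belongs to the partition on its left.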
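The RANGE RIGHT/LEFT choice can be illustrated with a small model. This is a Python sketch, not SQL Server code: `partition_number` is a hypothetical helper, the three quarterly boundary values are made up for illustration, and NULL is modelled as `None` landing in partition 1 (the case with no NULL boundary).

```python
from bisect import bisect_left, bisect_right
from datetime import date

def partition_number(boundaries, value, range_right=True):
    """Model of which 1-based partition a key value falls into."""
    if value is None:
        return 1  # with no NULL boundary, NULL keys are stored in partition 1
    if range_right:
        # RANGE RIGHT: a boundary value belongs to the partition on its right
        return bisect_right(boundaries, value) + 1
    # RANGE LEFT: a boundary value belongs to the partition on its left
    return bisect_left(boundaries, value) + 1

# Three sorted boundary values produce four partitions (P1..P4).
boundaries = [date(2019, 1, 1), date(2019, 4, 1), date(2019, 7, 1)]

print(partition_number(boundaries, date(2019, 4, 1), range_right=True))   # 3
print(partition_number(boundaries, date(2019, 4, 1), range_right=False))  # 2
```

The only difference between the two modes is where a value exactly equal to a boundary lands; every other value gets the same partition number either way.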
Why RANGE RIGHT is preferred
Boundary values (from Lowest value to Highest value): 2019-01-01, 2019-04-01, 2019-07-01, 2019-10-01, 2020-01-01

Partition | RANGE RIGHT (boundary_value_on_right = 1) | RANGE LEFT (boundary_value_on_right = 0)
Ptn 1 | < 2019-01-01 OR NULL | <= 2019-01-01 OR NULL
Ptn 2 | >= 2019-01-01 AND < 2019-04-01 | > 2019-01-01 AND <= 2019-04-01
Ptn 3 | >= 2019-04-01 AND < 2019-07-01 | > 2019-04-01 AND <= 2019-07-01
Ptn 4 | >= 2019-07-01 AND < 2019-10-01 | > 2019-07-01 AND <= 2019-10-01
Ptn 5 | >= 2019-10-01 AND < 2020-01-01 | > 2019-10-01 AND <= 2020-01-01
Ptn 6 | >= 2020-01-01 | > 2020-01-01

- Suppose the DATE data type, partitioning by month, and partition boundaries of 2019-01-01, 2019-02-01, 2019-03-01, etc.
- RANGE RIGHT produces partitions of 2019-01-01 to 2019-01-31, 2019-02-01 to 2019-02-28, etc.
- RANGE LEFT results in 2019-01-02 to 2019-02-01, 2019-02-02 to 2019-03-01, etc.
- With RANGE LEFT, each partition ends up with data from two months. It would need boundaries of 2019-01-31, 2019-02-28, etc. to have monthly partitions.
Creating a NULL partition
- NULL can be a partition boundary value
- Only for partition 1
- Some data types in the partition function definition must use explicit conversions: Date, DateTime2()
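To make the monthly argument concrete, here is a small Python model (not SQL Server code) that walks every day of January and February 2019 and records which calendar months end up in each partition. The first-of-month boundaries come from the slide; the helper name `months_per_partition` is hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from datetime import date, timedelta

# First-of-month boundary values, as on the slide.
boundaries = [date(2019, 1, 1), date(2019, 2, 1), date(2019, 3, 1)]

def months_per_partition(range_right):
    """Map partition number -> calendar months stored in it (model only)."""
    find = bisect_right if range_right else bisect_left
    seen = defaultdict(set)
    day = date(2019, 1, 1)
    while day <= date(2019, 2, 28):
        seen[find(boundaries, day) + 1].add(day.strftime("%Y-%m"))
        day += timedelta(days=1)
    return {ptn: sorted(months) for ptn, months in sorted(seen.items())}

right = months_per_partition(range_right=True)
left = months_per_partition(range_right=False)
print(right)  # {2: ['2019-01'], 3: ['2019-02']}
print(left)   # {1: ['2019-01'], 2: ['2019-01', '2019-02'], 3: ['2019-02']}
```

Under RANGE LEFT the first day of each month falls into the previous partition, which is why partition 2 holds days from both January and February.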
Demo 1: Partition functions PtnExtremeDemo1- PartitionFunctions.sql
Partition Scheme
Partition Schemes
Partition schemes map resulting partitions to filegroups. To illustrate, let’s assume:
- Datetime data type
- Partition by quarter
- 1 year per filegroup
- Year 2019 in one filegroup
Adding Filegroups
Generic range (Lowest value to Highest value): Ptn 1, Ptn 2, Ptn 3, Ptn 4
Boundary values for the example: 2019-01-01, 2019-04-01, 2019-07-01, 2019-10-01, 2020-01-01

Partition | Range | Filegroup
Ptn 1 | < 2019-01-01 OR NULL | “Past” FG
Ptn 2 | >= 2019-01-01 AND < 2019-04-01 | FG2019
Ptn 3 | >= 2019-04-01 AND < 2019-07-01 | FG2019
Ptn 4 | >= 2019-07-01 AND < 2019-10-01 | FG2019
Ptn 5 | >= 2019-10-01 AND < 2020-01-01 | FG2019
Ptn 6 | >= 2020-01-01 | “Future” FG
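The scheme's job can be pictured as a list of filegroup names with one entry per partition. The Python sketch below is a model only: it assumes RANGE RIGHT with quarterly 2019 boundaries, and `FGPast`, `FGFuture` and `filegroup_for` are hypothetical names standing in for the “Past” and “Future” filegroups on the slide.

```python
from bisect import bisect_right
from datetime import date

# Partition function (model): sorted RANGE RIGHT boundary values.
boundaries = [
    date(2019, 1, 1), date(2019, 4, 1), date(2019, 7, 1),
    date(2019, 10, 1), date(2020, 1, 1),
]

# Partition scheme (model): one filegroup name per partition, in partition order.
# N boundary values always need N + 1 entries.
filegroups = ["FGPast", "FG2019", "FG2019", "FG2019", "FG2019", "FGFuture"]
assert len(filegroups) == len(boundaries) + 1

def filegroup_for(value):
    """Return (partition number, filegroup) for a partition key value."""
    ptn = 1 if value is None else bisect_right(boundaries, value) + 1
    return ptn, filegroups[ptn - 1]

print(filegroup_for(date(2019, 8, 15)))  # (4, 'FG2019')
print(filegroup_for(None))               # (1, 'FGPast')
```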
How many partitions in a filegroup?
Possible criteria: simplicity and archiving
Assume a date-oriented partition by quarter: 10 years = 40 Ptns

Option | Archive Unit | Ptns/FG | Total FGs | Files/FG | Total Files/DB | Balanced I/O Y/N | Mtc/Complexity Level
1 | Qtrly | 1 | 40 | 4 | 160 | Y | High
2 | Yearly | 4 | 10 | | | | Medium
3 | 2 yrs | 8 | 5 | | 20 | |
4 | 5 yrs | 20 | 2 | | | | Low
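The filegroup and file counts follow from simple arithmetic. The Python sketch below reproduces it under two assumptions that are not stated outright on the slide: every option uses 4 files per filegroup (the only Files/FG value shown), and each filegroup holds exactly one archive unit. The helper name `layout` is hypothetical.

```python
TOTAL_PARTITIONS = 40  # 10 years partitioned by quarter
FILES_PER_FG = 4       # assumption: same file count per filegroup for every option

# Archive unit -> quarterly partitions stored in one filegroup.
archive_units = {"Qtrly": 1, "Yearly": 4, "2 yrs": 8, "5 yrs": 20}

def layout(ptns_per_fg):
    """Return (total filegroups, total files in the database)."""
    total_fgs = TOTAL_PARTITIONS // ptns_per_fg
    return total_fgs, total_fgs * FILES_PER_FG

for unit, ptns_per_fg in archive_units.items():
    total_fgs, total_files = layout(ptns_per_fg)
    print(f"{unit:7} ptns/FG={ptns_per_fg:2}  FGs={total_fgs:2}  files={total_files:3}")
```

Larger archive units cut the number of filegroups and files to manage, at the cost of a coarser unit when archiving.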
Demo 2 PtnExtremeDemo2 – PartitionSchemes.sql
Upper Extreme
Upper Extreme: Adding new partitions
Current (6 partitions):
- Ptn 1: FGPast, < 2019-01-01 OR NULL
- Ptn 2 to Ptn 5: FG2019
- Ptn 6: FG2020, >= 2020-01-01
Desired (10 partitions):
- Ptn 1: FGPast, < 2019-01-01 OR NULL
- Ptn 2 to Ptn 5: FG2019
- Ptn 6 to Ptn 9: FG2020
- Ptn 10: FG2021, >= 2021-01-01
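Going from the current layout to the desired one means adding boundary values. A quick way to see which ones, assuming quarterly RANGE RIGHT boundaries as in the earlier slides, is to compare the two boundary lists. This is a Python sketch with a hypothetical `quarter_starts` helper, not the demo script.

```python
from datetime import date

def quarter_starts(first_year, last_year):
    """First day of every quarter from first_year through last_year."""
    return [date(y, m, 1)
            for y in range(first_year, last_year + 1)
            for m in (1, 4, 7, 10)]

# Current: 2019 by quarter, everything from 2020-01-01 on in the last partition.
current = quarter_starts(2019, 2019) + [date(2020, 1, 1)]
# Desired: 2019 and 2020 by quarter, everything from 2021-01-01 on in the last partition.
desired = quarter_starts(2019, 2020) + [date(2021, 1, 1)]

to_add = sorted(set(desired) - set(current))

print(len(current) + 1, "->", len(desired) + 1, "partitions")  # 6 -> 10 partitions
print([d.isoformat() for d in to_add])
# ['2020-04-01', '2020-07-01', '2020-10-01', '2021-01-01']
```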
Extending rowstore partitions – Unstacking Boxes
(Diagram: a stack of boxes 1 to 4 is unstacked into four separate positions.)
- Start with the farthest out: moves 3 boxes
- Start with the nearest: moves 3+2+1 = 6 boxes
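The box counts can be checked with a short simulation. This Python sketch models the slide's analogy only, not SQL Server internals: it assumes that each split moves the boxes on the high side of the new boundary into a new partition, and `boxes_moved` is a hypothetical helper.

```python
def boxes_moved(n_boxes, split_order):
    """Count box moves when one stack of boxes 1..n is split into partitions.

    Splitting at k puts box k and every higher box from the same partition
    into a new partition; those boxes are the ones counted as moved.
    """
    partitions = [list(range(1, n_boxes + 1))]
    moved = 0
    for k in split_order:
        for i, part in enumerate(partitions):
            if k in part and part[0] != k:
                keep = [b for b in part if b < k]
                new = [b for b in part if b >= k]
                partitions[i:i + 1] = [keep, new]
                moved += len(new)
                break
    return moved

print(boxes_moved(4, [2, 3, 4]))  # nearest first (ascending): 3 + 2 + 1 = 6
print(boxes_moved(4, [4, 3, 2]))  # farthest first (descending): 1 + 1 + 1 = 3
```

In ascending order the highest boxes are moved again on every split, which is where the extra work comes from.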
Rowstore first try: using SPLIT in ascending order
- Step 1: Ptn 1 to Ptn 7
- Step 2: Ptn 1 to Ptn 8
- Step 3: Ptn 1 to Ptn 9
- Step 4: Ptn 1 to Ptn 10
- Filegroups: FGPast (< 2019-01-01 OR NULL), FG2019, FG2020, FG2021 (>= 2021-01-01)
Rowstore second try: SPLIT in descending order
- Step 1: Ptn 1 to Ptn 7
- Step 2: Ptn 1 to Ptn 8
- Step 3: Ptn 1 to Ptn 9
- Step 4: Ptn 1 to Ptn 10
- Filegroups: FGPast (< 2019-01-01 OR NULL), FG2019, FG2020, FG2021 (>= 2021-01-01)
Undoing the expansion: Re-Stacking boxes
(Diagram: boxes 1 to 4 in four separate positions are re-stacked into one stack.)
- Hmm… Do the closest first
- Moves 3 boxes versus moves 3+2+1 = 6 boxes
Reversing: MERGE in descending order
- Step 1: Ptn 1 to Ptn 10
- Step 2: Ptn 1 to Ptn 9
- Step 3: Ptn 1 to Ptn 8
- Step 4: Ptn 1 to Ptn 7
- Filegroups: FGPast (< 2019-01-01 OR NULL), FG2019, FG2020, FG2021 (>= 2021-01-01)
CCI Expanding Partitions
- Problem: only empty partitions can be split or merged on clustered columnstore index (CCI) tables
- Strategy 1: SWITCH out the last partition, add the new partitions, and load them from the switched table
- Strategy 2: Convert the table to rowstore using a clustered index, then expand
Demo 3 PtnExtremeDemo3 - Expanding Partitions.sql
Lower Extreme
Lower Extreme: NULL and Archiving
- SQL Server allows NULL as a boundary value
  - Works for datetime, integers, char/varchar
  - Does not work for date or datetime2
- The NULL partition will always be Ptn 1
- With no NULL partition, partition key NULL values are stored in Ptn 1 with Past data
Adding a NULL Partition
Boundary values (from Lowest value to Highest value): NULL, 2019-01-01, 2019-04-01, 2019-07-01, 2019-10-01, 2020-01-01

Partition | Range | Filegroup
Ptn 1 | NULL | NULL FG
Ptn 2 | >= Lowest AND < 2019-01-01 | “Past” FG
Ptn 3 | >= 2019-01-01 AND < 2019-04-01 | FG2019
Ptn 4 | >= 2019-04-01 AND < 2019-07-01 | FG2019
Ptn 5 | >= 2019-07-01 AND < 2019-10-01 | FG2019
Ptn 6 | >= 2019-10-01 AND < 2020-01-01 | FG2019
Ptn 7 | >= 2020-01-01 | “Future” FG
Archiving: No NULL partition
Current (Ptn 1 to Ptn 10):
- FGPast: < 2019-01-01 OR NULL
- FG2019
- FG2020
- FG2021: >= 2021-01-01
Desired (Ptn 1 to Ptn 6):
- FGPast: < 2020-01-01 OR NULL
- FG2020
- FG2021: >= 2021-01-01
- Also shown: FGPre2019, FG2019
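The move from 10 partitions to 6 corresponds to removing the 2019 boundary values so that 2019 falls into the Past range. The Python sketch below lists those boundaries, assuming quarterly RANGE RIGHT boundaries; `quarter_starts` and `archive_before` are hypothetical names used only for this illustration.

```python
from datetime import date

def quarter_starts(first_year, last_year):
    """First day of every quarter from first_year through last_year."""
    return [date(y, m, 1)
            for y in range(first_year, last_year + 1)
            for m in (1, 4, 7, 10)]

# Current boundaries: 2019 and 2020 by quarter, plus 2021-01-01.
current = quarter_starts(2019, 2020) + [date(2021, 1, 1)]

# Everything before this date should end up in the Past partition.
archive_before = date(2020, 1, 1)

to_merge = [b for b in current if b < archive_before]
remaining = [b for b in current if b >= archive_before]

print(len(current) + 1, "->", len(remaining) + 1, "partitions")  # 10 -> 6 partitions
print([d.isoformat() for d in to_merge])
# ['2019-01-01', '2019-04-01', '2019-07-01', '2019-10-01']
```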
Archiving: With NULL partition
Current (Ptn 1 to Ptn 11):
- FGNULL
- FGPast: < 2019-01-01
- FG2019
- FG2020
- FG2021: >= 2021-01-01
Desired (Ptn 1 to Ptn 7):
- FGNULL
- FGPast: < 2020-01-01
- FG2020
- FG2021: >= 2021-01-01
- Also shown: FGPre2019, FG2019
Thanks! And references
- Dan Guzman, Table Partitioning Best Practices: https://www.dbdelta.com/table-partitioning-best-practices/
- Cathrine Wilhelmsen, Table Partitioning in SQL Server – The Basics: https://www.cathrinewilhelmsen.net/2015/04/12/table-partitioning-in-sql-server/
- Niko Neugebauer, Columnstore Indexes – part 116 (“Partitioning Specifics”): http://www.nikoport.com/2017/12/28/columnstore-indexes-part-116-partitioning-specifics/