The Tally Table and Pseudo Cursors

1 The Tally Table and Pseudo Cursors
What they are and how they replace certain While Loops by Jeff Moden #315 Pittsburgh, Pennsylvania

3 Your Speaker - Jeff Moden
- Nearly 2 decades of experience working with SQL Server
- Mostly self-taught
- One of the leading posters on SQLServerCentral.com
  - More than 30,000 posts (heh… some are even useful)
  - More than 30 articles on the “Black Arts” of T-SQL
  - Member since 2003
- SQL Server MVP since 2008
- Winner of the “Exceptional DBA” award for 2011
- Lead Application DBA, Lead SQL Developer, and SQL Mentor for Proctor Financial, Inc.
- SQL Server is both my profession and my hobby (Yeah, I know… I need to get a life ;-)
The Tally Table and Pseudo Cursors 04 October 2014 © Copyright by Jeff Moden - All Rights Reserved

5 Agenda
- Introduction: The Trouble with Loops
- Glossary
- Introduction to “Pseudo Cursors”: The Hidden Power of SQL Server
- Introduction to the Tally Table: Another Type of Pseudo Cursor
- Hidden RBAR: The Slothfulness of Recursion
- A “Table-Less” Tally “Table”: First Appeared in Itzik Ben-Gan’s Books
- Some Examples: High Performance Convenience
- Quick Review
- Q’n’A

6 Introduction The Trouble with Loops

7 Getting Started in a Programming Class
What’s the first thing they teach you how to do in most programming classes? That’s right…
    PRINT 'Hello World'
It sounds like a funny thing to do, but it means that you've finally got your programming environment set up and you're ready to begin learning how to program.

8 The Next Programming Milestone
After learning about some syntax conventions, variables, data types, and a couple of other things, what is the next major milestone in learning how to program, the one that’s absolutely essential to advanced programming techniques?
How To Loop
Looping is the very essence of advanced programming skills. Modern programming would be useless without being able to repeat the execution of code in loops.

9 Counting with a While Loop
Definition: Count from 1 to 100 and display the count.
The Human Thinks:
1. Declare a counter.
2. Preset the counter to 1.
3. Display the count.
4. Add 1 to the counter.
5. Is the counter <= 100? If YES, branch back to display the new count. If NO, quit.
This is "Procedural" code. It's easy to remember because you tell the program how to proceed every step of the way.

10 Typical “Row By Row” Solution
    --===== Count from 1 to 100 and display the count.
    DECLARE @Counter INT;                --Declare a counter
        SET @Counter = 1;                --Preset the counter to 1
      WHILE @Counter <= 100
      BEGIN
            SELECT @Counter;             --Display the count
            SET @Counter = @Counter + 1; --Add 1 to counter
        END;                             --Is the counter <= 100?
                                         --If Yes, branch back to display the count.
                                         --If no, Quit.
What we get for our troubles is a real mess...

11 The Mess a Loop Creates
- 100 Individual Result Sets
  - Virtually useless to a GUI as a return.
  - Will cause errors in SSMS if too many result sets are returned.
- 100 Individual Messages
  - This is the reason why SET NOCOUNT ON is so important. It IS a performance issue when loops are involved.

12 Cleaning Up the Mess
In order to return a single result set for this classic loop problem, we have to…
1. Create a Temp Table or a Table Variable. (Added)
2. Declare a counter.
3. Preset the counter to 1.
4. Insert the count as a new row in the Temp Table (instead of just displaying it). (Added)
5. Add 1 to the counter.
6. Is the counter <= 100? If YES, branch back to INSERT the count. If NO, continue (instead of quit).
7. SELECT from the "Table" in the proper order. (Added)
8. Quit. (Moved here from the decision)

13 Code Becomes More Complex…
    --===== Suppress the auto-display of row counts for performance
        SET NOCOUNT ON;                  --New code added
    --===== Create a place to store the results
     CREATE TABLE #MyHead (N INT);       --New code added
    --===== Count from 1 to 100 and display the count.
    DECLARE @Counter INT;                --Declare a counter
        SET @Counter = 1;                --Preset the counter to 1
      WHILE @Counter <= 100
      BEGIN
            INSERT INTO #MyHead (N)      --New code added
            SELECT @Counter;             --Display the count (same as before)
            SET @Counter = @Counter + 1; --Add 1 to counter
        END;                             --Is the counter <= 100?
                                         --If Yes, branch back to display the count.
                                         --If no, continue.
    --===== Display the count
     SELECT N FROM #MyHead ORDER BY N;   --New code added

14 Performance Gets Worse
- Declaration of a new object.
- 101 individual calculations.
- 100 checks to make sure we didn't go over the limit.
- 100 individual INSERTs.
  - Each INSERT requires a separate execution plan even if the SQL Server Optimizer decides it can reuse the same plan.
  - Each INSERT requires a separate lock.
  - Each INSERT requires a separate transaction (now there's a hint).
- Requires a final SELECT.
- Takes ~14 seconds (~8 seconds in an explicit transaction) for a million rows on this laptop, NOT INCLUDING THE FINAL SELECT (demo).
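The "hint" about transactions can be sketched as follows. This is only an illustration, not the demo code from the session: the table name #LoopTest and the million-row limit are assumptions picked for the example. Wrapping the whole loop in one explicit transaction means one commit instead of one per INSERT. Timings will differ from machine to machine.

```sql
--===== Hypothetical sketch: the same RBAR loop inside ONE explicit transaction.
     -- #LoopTest and @Counter are illustrative names, not from the original demo.
    SET NOCOUNT ON;
 CREATE TABLE #LoopTest (N INT);

DECLARE @Counter INT;
    SET @Counter = 1;

  BEGIN TRANSACTION;               --One transaction for the whole loop
  WHILE @Counter <= 1000000
  BEGIN
        INSERT INTO #LoopTest (N)
        SELECT @Counter;
        SET @Counter = @Counter + 1;
    END;
 COMMIT;                           --One commit instead of one per INSERT

--===== Confirm the row count and clean up
 SELECT RowsInserted = COUNT(*) FROM #LoopTest;
   DROP TABLE #LoopTest;
```

It's still a loop, so it's still RBAR. The transaction only removes the per-row commit overhead.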

15 A MUCH Simpler Way to Count
Wouldn't it be neat if most looping problems were as easy as... THIS????
    --===== Count from 1 to 100
         -- using a Tally table
     SELECT N
       FROM dbo.Tally
      WHERE N BETWEEN 1 AND 100
      ORDER BY N
    ;

16 Glossary

17 Things that Loop
- Cursor
  - Generally means anything with a Cursor in it.
  - Many folks also call While Loops and Recursive CTEs a "Cursor".
  - Most folks use these to process things "Row By Row".
- RBAR
  - Pronounced "ree-bar", like the steel rods permanently stuck in cement (appropriate, don't you think?).
  - A "Modenism" for any process that runs "Row By Agonizing Row".
- Hidden RBAR
  - Things that look "set based" but are not.
  - Contain a hidden "cursor" of one type or another.

18 Types of Programming
- Procedural Programming
  - Essentially, RBAR programming.
  - Human tells computer what to do AND how to do it… Row by Agonizing Row.
  - Works fine in GUIs.
  - Kills almost all hope of performance in SQL Server because this type of programming overrides the very nature of the Optimizer in SQL Server.
- Declarative Programming
  - Human tells computer what to do. Computer figures out HOW to do it.
  - Usually, Set Based programming.
  - Works WITH the Optimizer instead of overriding it.

19 Programming in SQL Server
Set Based Programming (Declarative Programming in SQL Server)
- Does NOT mean "all in one query", especially since a single query can contain Hidden RBAR.
- Does NOT mean something that doesn't have a loop, especially since a single query can contain Hidden RBAR.
- CAN mean something that has a loop, because certain queries require multiple SETS of information to be processed.
- Does mean "touching" each row only once, if possible, and as few times as possible if not.
- Requires a simple paradigm shift in thinking.

20 Other “Loops” and “Cursors”
- Recursion
  - The act of a bit of SQL Code calling itself. It "iterates" over itself, making a Hidden RBAR loop.
  - An example of this is a "Recursive CTE", which does nothing more than call itself.
  - The act of "making the call" can crush performance and can eat about 3 times (or more) the resources of a simple While Loop.
- Pseudo Cursor
  - The hidden but very high speed looping effect that set based code experiences behind the scenes.
  - A simple SELECT "iterates" through rows behind the scenes, but in a manner that SQL knows best.

21 Introduction to “Pseudo Cursors”
The Hidden Power of SQL Server

22 The Unexpected “Magic” of SELECT
What does the following code snippet give you?
    --===== Top 100 rows of data from the table
     SELECT TOP (100) * FROM sys.all_columns;
How does it work? Behind the scenes, it…
- Does some preparation.
- Reads one row.
- Displays one row.
- Makes a decision as to whether it’s done or not and loops back if it’s not done.
Sound familiar? It should because… It’s a LOOP!
[Flowchart labels: Start, Counter = 0, Read a Row, Display The Row, Add 1 to Counter, <=100, Return]

23 Say it with me – “Pseudo Cursor”
- The term "Pseudo Cursor" was coined by R. Barry Young over on SQLServerCentral.com.
- It's a super important concept that I used to call a "Set Based Loop".
- It's a whole lot more complicated behind the scenes, but it helps to think of a Pseudo Cursor as...
  - A SELECT finds a row, reads the row, processes the row, and LOOPS back to read the next row… at an incredible speed.
  - Behind the scenes, a SELECT is a machine language level Cursor (loop).
  - Since these loops or cursors don't appear in T-SQL code, Barry called them "Pseudo Cursors".
- You DON’T necessarily have to use what's in the row of a Pseudo Cursor to use the rows. Say what?

24 Simple Pseudo Cursors at Work
What do the following snippets do?
    --===== Returns all COLUMNS and ROWS
     SELECT *
       FROM sys.all_columns
    ;
    --===== Returns a COLUMN of "1’s"
         -- Note that no data was used from the table
     SELECT 1
       FROM sys.all_columns
    ;
    --===== Returns a COLUMN of sequential numbers
     SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
       FROM sys.all_columns
    ;

25 What can you use a “Pseudo Cursor” for?
In particular, what can you do with a Pseudo Cursor that DOESN’T use anything from the “source” tables? One of the problems with most databases is that they don’t have enough data to do any performance testing with. You can use the “rows” of a Pseudo Cursor as a “loop” to create millions of rows of test data in a couple of heartbeats… … and you don’t need a very big table to do that if you understand how to use a friend of the Pseudo Cursor, the CROSS JOIN…

26 Building a Monster “Row Source”
Let’s start off simple. We’ll just build a table with a Million numbered rows (and then add to it). To do such a thing, we need something with a very large number of rows… like a Million. Especially on new systems, no such table exists to use as a “row source”. In fact, the largest table on the whole server turns out to be sys.all_columns, and it has only about 4,000 rows in it (on a new 2005 system, more in others). Hmmm… what’s 4,000 times 4,000? A CROSS JOIN of sys.all_columns with itself will easily produce up to a 16 million row Pseudo Cursor.
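If you want to check the arithmetic on your own server before relying on it, something like the following will do. It's only a quick sketch. The column aliases are made up for the example, and the counts depend on the SQL Server version and what's installed.

```sql
--===== Sketch: how many rows can the CROSS JOIN supply on THIS server?
     -- RowsInAllColumns and RowsFromCrossJoin are illustrative aliases.
 SELECT RowsInAllColumns  = COUNT_BIG(*),
        RowsFromCrossJoin = COUNT_BIG(*) * COUNT_BIG(*)
   FROM sys.all_columns
;
```

If RowsFromCrossJoin comes out comfortably above a million, the self CROSS JOIN is a big enough "row source" for the test tables that follow.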

27 Simple Million Row Test Table
What does the following code do?
    --===== Create and populate a test table on-the-fly
         -- with a COLUMN of sequential numbers from
         -- 1 to 1,000,000. This takes 745 ms (demo).
     SELECT TOP 1000000
            SomeID = IDENTITY(INT,1,1)
       INTO #MyHead
       FROM sys.all_columns ac1
      CROSS JOIN sys.all_columns ac2
    ;
I know what you’re thinking. “Test table? Is that all you’ve got?”

28 The Million Row Test Table
    --===== Create and populate a 1,000,000 row test table.
         -- "SomeID" has a range of 1 to 1,000,000 unique numbers
         -- "SomeInt" has a range of 1 to 50,000 numbers
         -- "SomeLetters2" has a range of "AA" to "ZZ"
         -- "SomeMoney" has a range of … to … numbers
         -- "SomeDate" has a range of >=01/01/2010 & <01/01/2020 whole dates
         -- "SomeDateTime" has a range of >=01/01/2010 & <01/01/2020 Date/Times
         -- "SomeRand" contains the value of RAND just to show it can be done
         -- without a loop
     SELECT TOP 1000000
            SomeID       = IDENTITY(INT,1,1),
            SomeInt      = ABS(CHECKSUM(NEWID())) % 50000 + 1,
            SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID())) % 26 + 65)
                         + CHAR(ABS(CHECKSUM(NEWID())) % 26 + 65),
            SomeMoney    = CAST(RAND(CHECKSUM(NEWID())) * … AS DECIMAL(9,2)),
            SomeDate     = DATEADD(dd,ABS(CHECKSUM(NEWID())) % DATEDIFF(dd,'2010','2020'),'2010'),
            SomeDateTime = DATEADD(dd,DATEDIFF(dd,0,'2010'),
                           RAND(CHECKSUM(NEWID())) * DATEDIFF(dd,'2010','2020')),
            SomeRand     = RAND(CHECKSUM(NEWID()))
       INTO dbo.JBMTest
       FROM sys.all_columns ac1       --Cross Join forms up to a 16 million row
      CROSS JOIN sys.all_columns ac2  --Pseudo Cursor
    ;

29 What else can you use a “Pseudo Cursor” for?
In particular, what can you do with a Pseudo Cursor that DOES use something from the “source” table? THAT’s what a Tally Table is all about.

30 Introduction to the Tally Table
Another Type of “Pseudo Cursor”

31 What’s IN a Tally Table?
- A single column of sequential numbers.
  - Starts at 1 or 0 (can have some problems with 0).
  - Ends at some "sufficiently" large number. My Tally table usually ends with 11,000.
    - I need more than 8,000 to split VARCHAR(8000).
    - I need to be able to easily create 30 years' worth of DAYS, which is almost 11,000 days.
- Is "Keyed" for speed.
  - Clustered PK on "N" (ABSOLUTELY ESSENTIAL).
  - FILLFACTOR = 100 (ABSOLUTELY ESSENTIAL).
- INT, because most functions will use INTs against it. Be REAL careful about implicit conversions here.

32 How to Build a Tally Table
    --=====================================================================
    --      Create a Tally table from 1 to 11000
    --=====================================================================
    --===== Create and populate the Tally table on the fly.
     SELECT TOP 11000
            IDENTITY(INT,1,1) AS N --Makes a NOT NULL column
       INTO dbo.Tally
       FROM sys.all_columns ac1
      CROSS JOIN sys.all_columns ac2 --Cross Join for up to 16 Million Rows
    ;
    --===== Add a CLUSTERED Primary Key to maximize performance
      ALTER TABLE dbo.Tally
        ADD CONSTRAINT PK_Tally_N
            PRIMARY KEY CLUSTERED (N) WITH FILLFACTOR = 100
    ;
    --===== Allow the general public to use it
      GRANT SELECT ON dbo.Tally TO PUBLIC
    ;

33 Splitting at the Character Level
    --===== Simulate a passed parameter
    DECLARE @Parameter VARCHAR(8000);
        SET @Parameter = 'Element01,Element02,Element03';
    --===== Declare a character counter (RBAR SOLUTION)
    DECLARE @N INT;
        SET @N = 1;
    --===== While the character counter is less than or equal to the length of the string
      WHILE @N <= LEN(@Parameter)
      BEGIN
            --==== Display the character counter and the character at that
                -- position.
            SELECT @N, SUBSTRING(@Parameter,@N,1);
            --==== Increment the character counter
            SET @N = @N + 1;
        END;
    --===== Do the same thing as the loop did... "Step" through the variable
         -- and return the character position and the character...
     SELECT N, SUBSTRING(@Parameter,N,1)
       FROM dbo.Tally
      WHERE N <= LEN(@Parameter)
      ORDER BY N;
Results:
    N
    --  -
     1  E
     2  l
     3  e
     4  m
     5  e
     6  n
     7  t
     8  0
     9  1
    10  ,
    11  E
    12  l
    13  e
    14  m
    15  e
    16  n
    17  t
    18  0
    19  2
    20  ,
    21  E
    ...

34 How It Works
- Just like counting from 1 to 100, both the loop and the Tally table count from 1 to the length of the parameter.
- Look at the following graphic. Both the loop and the Tally table do exactly the same thing, except the Tally table uses only 1 SELECT and returns a single result set.
- The rows of the Tally Table act as the counter, except it's set based.
- The Tally Table is a direct replacement for the loop.

35 Finding the Position of Delimiters
The next logical step would be to find the Delimiters. Here's how it's done with a loop...
    --===== Loop Method ==============================================================
    --===== Simulate a passed parameter
    DECLARE @Parameter VARCHAR(8000);
        SET @Parameter = 'Element01,Element02,Element03';
    --===== Declare a variable to remember the position of the current comma
    DECLARE @N INT;
    --===== Find the first delimiter, if one exists
        SET @N = CHARINDEX(',',@Parameter);
    --===== Loop through and find each delimiter starting with the
         -- location of the previous delimiter.
      WHILE @N > 0
      BEGIN
            --==== Display the position of the current comma
            SELECT @N;
            --==== Find the next comma, starting 1 character after the current one.
                -- Return a 0 when no more commas are found.
            SET @N = CHARINDEX(',',@Parameter,@N+1);
        END;
Results:
    N
    ----
    10
    20

36 Finding the Position of Delimiters
…and here’s how it’s done with a Tally Table.
    --===== Tally Table Method ==========================
    --===== Simulate a passed parameter
    DECLARE @Parameter NVARCHAR(4000);
        SET @Parameter = 'Element01,Element02,Element03';
    --===== Now, find all the Delimiters
     SELECT N
       FROM dbo.Tally t
      WHERE t.N <= LEN(@Parameter)
        AND SUBSTRING(@Parameter, t.N, 1) = ','
    ;
Results:
    N
    ----
    10
    20

37 How It Works
Again, the reason why the Tally Table method works is that it's joined to the variable at the character level and seeks out the delimiters using a Pseudo Cursor.

38 Doing the Final Split
- Up to this point, we've been able to find each delimiter using both a loop and a Tally Table.
- What we need to do now is find the NEXT delimiter to isolate the characters between the CURRENT delimiter and the NEXT delimiter.
- Once we've done that, we need to either store or display the characters that we've isolated as a group. This effectively splits the elements out from between the delimiters.
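Here's a small sketch of that idea against the same sample string, showing where each element starts, where the NEXT comma is, and what ends up between them. It's an illustration under assumptions, not the splitter from the session: the CTE names and column aliases are made up, and sys.all_columns stands in as the row source so the snippet runs without a Tally table.

```sql
--===== Sketch only: @Parameter, the CTE names and the column aliases are illustrative.
DECLARE @Parameter VARCHAR(8000);
    SET @Parameter = 'Element01,Element02,Element03';

   WITH cteTally(N) AS
        (--==== Just enough sequential numbers to cover the string
         SELECT TOP (LEN(@Parameter))
                ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
           FROM sys.all_columns
        ),
        cteStart(N1) AS
        (--==== Position 1, plus the position just after each comma
         SELECT 1 UNION ALL
         SELECT t.N + 1
           FROM cteTally t
          WHERE SUBSTRING(@Parameter, t.N, 1) = ','
        )
 SELECT ElementStart  = s.N1,
        NextDelimiter = CHARINDEX(',', @Parameter, s.N1), --0 when there is no next comma
        Element       = SUBSTRING(@Parameter, s.N1,
                        ISNULL(NULLIF(CHARINDEX(',', @Parameter, s.N1), 0) - s.N1, 8000))
   FROM cteStart s
  ORDER BY s.N1
;
```

For this string the elements start at positions 1, 11, and 21, and the next commas are at 10, 20, and 0 (none). The last element has no closing delimiter, so it needs the "take everything that's left" treatment.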

39 “Inch Worm” Splitter
Notice that the first and last elements are handled differently.

40 Loop Code for 8K Splitter
     CREATE FUNCTION dbo.Split8KLoop
            (
            @pString    VARCHAR(8000),
            @pDelimiter CHAR(1)
            )
    RETURNS @Result TABLE (ItemNumber INT, Item VARCHAR(8000))
         AS
      BEGIN
    --===== Declare some obviously named variables
    DECLARE @StartPointer INT,
            @EndPointer   INT,
            @Counter      INT;
    --===== Find the first delimiter if it exists
     SELECT @StartPointer = 1,
            @EndPointer   = CHARINDEX(@pDelimiter, @pString),
            @Counter      = 1;
    --===== If we found at least one delimiter, loop until we don't find any more
      WHILE @EndPointer > 0
      BEGIN
            --===== Inserts the split item
            INSERT INTO @Result (ItemNumber, Item)
            SELECT ItemNumber = @Counter,
                   Item       = SUBSTRING(@pString, @StartPointer, @EndPointer - @StartPointer);
            --===== Finds the next split item, if it exists
            SELECT @StartPointer = @EndPointer + 1,
                   @EndPointer   = CHARINDEX(@pDelimiter, @pString, @EndPointer + 1),
                   @Counter      = @Counter + 1;
        END;
    --===== Inserts the last or only split item
     INSERT INTO @Result (ItemNumber, Item)
     SELECT ItemNumber = @Counter,
            Item       = SUBSTRING(@pString, @StartPointer, 8000);
     RETURN;
        END;

41 A Tally Table 8k Splitter
     CREATE FUNCTION dbo.Split8KTally
    --===== Define I/O parameters
            (@pString VARCHAR(8000), @pDelimiter CHAR(1))
    RETURNS TABLE WITH SCHEMABINDING AS
     RETURN
       WITH cteStart(N1) AS
            (--==== This returns N+1 (starting position of each "element" just once
                 -- for each delimiter) (NOTE: THIS IS NOT A RECURSIVE CTE)
             SELECT 1 UNION ALL
             SELECT t.N+1
               FROM dbo.Tally t
              WHERE t.N BETWEEN 1 AND DATALENGTH(@pString)
                AND SUBSTRING(@pString,t.N,1) = @pDelimiter
            ),
            cteLen(N1,L1) AS
            (--==== Return start and length (for use in substring)
             SELECT s.N1,
                    ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
               FROM cteStart s
            )
    --===== Do the actual split. The ISNULL/NULLIF combo handles the length
         -- for the final element when no delimiter is found.
     SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
            Item       = SUBSTRING(@pString, l.N1, l.L1)
       FROM cteLen l
    ;

42 Hidden RBAR
The Slothfulness of Recursion

43 Recursive CTEs
- They’re easy to write.
- They have a small physical footprint in code.
- They’re “slick” because they look “Set-Based”.
- They have no explicit loop.
- They’re S-L-O-W.
- They’re resource intensive.
- They’re “Hidden RBAR”.

44 An rCTE to Count
Here’s a simple rCTE (recursive CTE) that counts from 0 to 11 to create a year’s worth of months for 2011.
       WITH cteCounter AS
            (--==== Counter rCTE counts from 0 to 11
             SELECT 0 AS N          --This provides the starting point (anchor) of zero
              UNION ALL
             SELECT N + 1           --This is the recursive part
               FROM cteCounter
              WHERE N < 11
            )--==== Add the counter value to a start date and you get multiple dates
     SELECT StartOfMonth = DATEADD(mm,N,'2011')
       FROM cteCounter
    ;

45 How It Works
The basic, non-technical, non-scientific explanation for the operation of the code would be this...
1. The "anchor" value is set to "0" (zero). This now means the rCTE has a row with a zero in it.
2. The recursive part takes over. It looks at itself and says "What's the last value that I put into myself?", does a SELECT to add 1 to that value, and then checks the predicate in the WHERE clause.
3. If the value that was just made is within the limits defined by the WHERE clause, the rCTE saves that value in itself and then loops back to Step 2.
4. The process continues to iterate through the loop formed by Steps 2 and 3 until the value being built (N+1) exceeds the limits of the WHERE clause.
5. Once that happens, the rCTE exits to the SELECT (or other) statement that immediately follows the rCTE.
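For comparison, here's one way to get the same twelve month starts with no recursion at all. It's a sketch that assumes SQL Server 2008 or above for the VALUES row constructor. The little derived table is a stand-in for a Tally Table that starts at 0.

```sql
--===== Sketch: the same 12 month starts for 2011, without the rCTE.
     -- The derived table t (N) is an illustrative stand-in for a 0-based Tally Table.
 SELECT StartOfMonth = DATEADD(mm, t.N, '2011')
   FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11)) t (N)
  ORDER BY t.N
;
```

There's no anchor, no recursive member, and no stepping back through "itself". The twelve rows are simply there and the SELECT runs over them once.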

46 Behind the Scenes
The rCTE actually makes a “Work” Table in TempDB. Another name for this table is a “System Temp Table”.
    (12 row(s) affected)
    Table 'Worktable'. Scan count 2, logical reads 73, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Notice the number of reads to render just 12 rows? Yes, they’re “logical reads” (memory), but that’s still I/O.
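To see the Worktable for yourself, turn on STATISTICS IO around the counting rCTE and look at the Messages tab in SSMS. This is a sketch of how output like the above can be produced. The exact read counts may differ by version.

```sql
--===== Sketch: show the I/O that the counting rCTE does behind the scenes.
    SET STATISTICS IO ON;

   WITH cteCounter AS
        (
         SELECT 0 AS N              --Anchor
          UNION ALL
         SELECT N + 1               --Recursive part
           FROM cteCounter
          WHERE N < 11
        )
 SELECT StartOfMonth = DATEADD(mm, N, '2011')
   FROM cteCounter
;
    SET STATISTICS IO OFF;
```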

47 Performance of a Counting rCTE
On the chart on the next page, the painfully obvious loser is the Red line which is the rCTE. There are 3 other methods of counting on this chart. Compare and believe…

48 Performance of a Counting rCTE

49 Even Low Counts are Painful

50 Resources? You’ve GOT To See This!

51 A “Table-less” Tally “Table”
First Appeared In Itzik Ben-Gan’s Books

52 Modified Ben-Gan Cascading CTE’s
This produces virtually no reads and can be almost as fast as a physical Tally Table, especially when used in an “iTVF” (inline Table Valued Function).
       WITH E1(N) AS ( --=== Create Ten 1's
                      SELECT 1 UNION ALL SELECT 1 UNION ALL
                      SELECT 1 UNION ALL SELECT 1 UNION ALL
                      SELECT 1 UNION ALL SELECT 1 UNION ALL
                      SELECT 1 UNION ALL SELECT 1 UNION ALL
                      SELECT 1 UNION ALL SELECT 1 --10
                     ),
            E2(N)  AS (SELECT 1 FROM E1 a, E1 b), --100
            E4(N)  AS (SELECT 1 FROM E2 a, E2 b), --10,000
            E8(N)  AS (SELECT 1 FROM E4 a, E4 b), --100,000,000
            E16(N) AS (SELECT 1 FROM E8 a, E8 b), --10,000,000,000,000,000
       cteTally(N) AS (SELECT TOP (…) ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E16)
     SELECT t.N --Some query that uses the sequential numbering
       FROM cteTally t
    ;

53 How It Works
- Each CTE is named for a “power of 10”, as in 10 to the power n. For example, E2 stands for 10^2 = 100.
- The first CTE, E1, returns up to 10 rows and is simply ten 1’s UNION ALL’d together.
- The second CTE, E2, is nothing more than a CROSS JOIN of E1 with itself. It returns up to 10x10 or 100 rows.
- Each following En CTE is a CROSS JOIN that squares the number of rows of the previous CTE.
- cteTally does two things…
  - The TOP very effectively limits the number of rows that are created.
  - ROW_NUMBER() converts the rows into a numbered sequence, just like a Tally Table.
- It could be created as a separate iTVF and still be as fast as a physical Tally Table.
- Typically, for any functions on VARCHAR(8000), only CTEs E1 through E4 (10,000 rows) are included to simplify the code a bit.
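As a sketch of the "separate iTVF" idea, the cascading CTEs can be wrapped in a function that takes the row limit as a parameter. The function name dbo.TallySketch and the parameter @MaxN are made up for this example, and only E1 through E4 are included, so it tops out at 10,000 rows. The VALUES row constructor assumes SQL Server 2008 or above.

```sql
--===== Hypothetical sketch: dbo.TallySketch and @MaxN are illustrative names.
 CREATE FUNCTION dbo.TallySketch (@MaxN INT)
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
   WITH E1(N) AS (SELECT 1
                    FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) v (N)), --10 rows
        E2(N) AS (SELECT 1 FROM E1 a, E1 b),                                     --100 rows
        E4(N) AS (SELECT 1 FROM E2 a, E2 b)                                      --10,000 rows max
 SELECT TOP (ISNULL(@MaxN, 0))       --TOP limits the rows right up front
        N = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
   FROM E4
;
GO
--===== Usage: count from 1 to 100 with no table and no loop
 SELECT t.N
   FROM dbo.TallySketch(100) t
  ORDER BY t.N
;
```

Because it's a single-statement inline function, the optimizer can expand it into the calling query, which is the property the slide is counting on.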

54 Extending the Concept
“Goldilocks” Pseudo Cursors

55 It’s Not Just a Table
- More important than the Tally Table itself, there’s a concept that we’ve learned about how you can avoid the RBAR of explicit loops.
- You don’t necessarily need a Tally Table.
- You don’t necessarily need a big honkin’ cascading CTE.
- Sometimes, all you need is a “Goldilocks” table. Something “just right”.
  - It needs to be easy to use (UDF).
  - It needs to be fast (iTVF).

56 Calculate Dates in Current Week
- Just enough of an “inline” Tally Table to do the job.
- Inline Table Valued Function or iTVF.
- Very Fast.
    --===== Works in SQL Server 2000 and above.
         -- Note that you can't use CROSS APPLY in 2000.
         -- Note that a week starts on Sunday in this code.
     CREATE FUNCTION dbo.DatesInWeek (…)
    RETURNS TABLE WITH SCHEMABINDING AS
     RETURN
     SELECT DateInWeek = …
       FROM (SELECT 0 UNION ALL SELECT 1 UNION ALL
             SELECT 2 UNION ALL SELECT 3 UNION ALL
             SELECT 4 UNION ALL SELECT 5 UNION ALL
             SELECT 6) t (N)
    ;
    GO
    --===== Works in SQL Server 2008 and above
     CREATE FUNCTION dbo.DatesInWeek (…)
    RETURNS TABLE WITH SCHEMABINDING AS
     RETURN
     SELECT DateInWeek = …
       FROM (VALUES (0),(1),(2),(3),(4),(5),(6)) t (N)
    ;
    GO
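The date expression from the slide didn't survive in this transcript, so here's one possible way to fill it in. Treat it as an assumption rather than the original code. The function name dbo.DatesInWeekSketch, the parameter @pDate, and the date math are all illustrative. It relies on day -1 (31 December 1899) being a Sunday, so whole weeks counted from there always start on a Sunday.

```sql
--===== Hypothetical sketch: NOT the original function from the slide.
     -- dbo.DatesInWeekSketch and @pDate are illustrative names.
 CREATE FUNCTION dbo.DatesInWeekSketch (@pDate DATETIME)
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
     --==== Days since Sunday 1899-12-31, rounded down to whole weeks by
         -- integer division, then 0 through 6 added to step through the week.
 SELECT DateInWeek = DATEADD(dd, DATEDIFF(dd, -1, @pDate) / 7 * 7 + t.N, -1)
   FROM (VALUES (0),(1),(2),(3),(4),(5),(6)) t (N)
;
GO
--===== Usage: the week that contains 04 October 2014
 SELECT DateInWeek
   FROM dbo.DatesInWeekSketch('20141004')
  ORDER BY DateInWeek
;
```

For 04 October 2014, a Saturday, this should return the seven dates from Sunday 28 September through Saturday 04 October.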

57 Super Brief Intro to CROSS APPLY
- Basically nothing more than a “Correlated Subquery” that can return more than 1 row.
- Usually, VERY fast.
- Great for incorporating multi-statement Table Valued Functions (mTVFs).
- Even better when incorporating “inline” Table Valued Functions (iTVFs).
- See Paul White’s excellent articles on APPLY.
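A tiny, self-contained sketch of the "correlated subquery that can return more than 1 row" idea follows. The OrderID and Qty values are made-up data and the aliases are illustrative. For each outer row, the APPLY returns as many numbered rows as that row's Qty.

```sql
--===== Sketch: CROSS APPLY as a correlated subquery that returns several rows.
     -- The OrderID/Qty values and all aliases are made up for illustration.
 SELECT o.OrderID, d.N
   FROM (VALUES (1, 3), (2, 1)) o (OrderID, Qty)
  CROSS APPLY (--==== Correlated to the outer row: returns o.Qty rows per order
               SELECT t.N
                 FROM (VALUES (1),(2),(3),(4),(5)) t (N)
                WHERE t.N <= o.Qty
              ) d
  ORDER BY o.OrderID, d.N
;
```

Order 1 comes back three times (N = 1, 2, 3) and order 2 once, for four rows in total. Swap the derived table for an iTVF call and you have the pattern used on the next slide.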

58 Use the iTVF to Get the Current Week
    --===== Create a table and insert 2 dates with
         -- an identifier so we can tell what is what.
     CREATE TABLE #SomeTable
            (
            SomeTableID  INT IDENTITY(1,1),
            SomeDateTime DATETIME
            )
    ;
     INSERT INTO #SomeTable
            (SomeDateTime)
     SELECT '…' UNION ALL
     SELECT GETDATE()
    ;
     SELECT t.SomeTableID,
            fn.DateInWeek
       FROM #SomeTable t
      CROSS APPLY dbo.DatesInWeek(t.SomeDateTime) fn
    ;
Results: 14 rows with the columns SomeTableID and DateInWeek, seven for each of the two SomeTableID values, every DateInWeek at 00:00:00.000.

59 Sometimes Goldilocks is a Big Girl
- There are times where a Tally Table won’t help but the act of generating a sequence like a Tally Table will.
- This Pseudo Cursor spans the whole table.
- This numbers each group of dupes and deletes all but the first one from each group.
       WITH cteEnumerateDupes AS
            (
             SELECT DupeNumber = ROW_NUMBER() OVER (PARTITION BY SomeInt ORDER BY SomeDate)
               FROM dbo.JBMTest
            )
     DELETE FROM cteEnumerateDupes
      WHERE DupeNumber > 1
    ;

60 Some More Examples
High Performance Convenience

61 Without a Doubt…
- Without a doubt, most folks agree that SQL Server doesn’t handle String functionality very well.
- Without a doubt, most folks agree that String functionality should be left up to the GUI or Reporting Tool.
- Without a doubt, most folks agree that if you can’t handle Strings in either of those, then you should use a CLR.
- Without a doubt, if you can’t do any of those things, you’d better know how to do it all in T-SQL.

62 dbo.DelimitedSplit8K using cteTally
     CREATE FUNCTION dbo.DelimitedSplit8K
    --===== Define I/O parameters
            (@pString VARCHAR(8000), @pDelimiter CHAR(1))
    RETURNS TABLE WITH SCHEMABINDING AS
     RETURN
    --===== "Inline" CTE Driven "Tally Table" produces values from 0 up to 10,000...
         -- enough to cover VARCHAR(8000)
       WITH E1(N) AS (
                      SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                      SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
                      SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
                     ),                          --10E+1 or 10 rows
            E2(N) AS (SELECT 1 FROM E1 a, E1 b), --10E+2 or 100 rows
            E4(N) AS (SELECT 1 FROM E2 a, E2 b), --10E+4 or 10,000 rows max
     cteTally(N) AS (--==== This provides the "base" CTE and limits the number of rows right up front
                         -- for both a performance gain and prevention of accidental "overruns"
                     SELECT TOP (ISNULL(DATALENGTH(@pString),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
                    ),
    cteStart(N1) AS (--==== This returns N+1 (starting position of each "element" just once for each delimiter)
                     SELECT 1 UNION ALL
                     SELECT t.N+1 FROM cteTally t WHERE SUBSTRING(@pString,t.N,1) = @pDelimiter
                    ),
    cteLen(N1,L1) AS(--==== Return start and length (for use in substring)
                     SELECT s.N1,
                            ISNULL(NULLIF(CHARINDEX(@pDelimiter,@pString,s.N1),0)-s.N1,8000)
                       FROM cteStart s
                    )
    --===== Do the actual split. The ISNULL/NULLIF combo handles the length for the final element when no delimiter is found.
     SELECT ItemNumber = ROW_NUMBER() OVER(ORDER BY l.N1),
            Item       = SUBSTRING(@pString, l.N1, l.L1)
       FROM cteLen l
    ;

63 Testing/Using the Splitter
    --=======================================================================================================
    -- TEST 1:
    -- This tests for various possible conditions in a string using a comma as the delimiter.
    -- The expected results are laid out in the comments
    --=======================================================================================================
    --===== Conditionally drop the test tables to make reruns easier for testing.
         -- (this is NOT a part of the solution)
         IF OBJECT_ID('tempdb..#JBMTest') IS NOT NULL DROP TABLE #JBMTest
    ;
    --===== Create and populate a test table on the fly (this is NOT a part of the solution).
         -- In the following comments, "b" is a blank and "E" is an element in the left to right order.
         -- Double Quotes are used to encapsulate the output of "Item" so that you can see that all blanks
         -- are preserved no matter where they may appear.
     SELECT *
       INTO #JBMTest
       FROM (                                               --# & type of Return Row(s)
             SELECT  0, NULL                      UNION ALL --1 NULL
             SELECT  1, SPACE(0)                  UNION ALL --1 b (Empty String)
             SELECT  2, SPACE(1)                  UNION ALL --1 b (1 space)
             SELECT  3, SPACE(5)                  UNION ALL --1 b (5 spaces)
             SELECT  4, ','                       UNION ALL --2 b b (both are empty strings)
             SELECT  5, '55555'                   UNION ALL --1 E
             SELECT  6, ',55555'                  UNION ALL --2 b E
             SELECT  7, ',55555,'                 UNION ALL --3 b E b
             SELECT  8, '55555,'                  UNION ALL --2 E b
             SELECT  9, '55555,1'                 UNION ALL --2 E E
             SELECT 10, '1,55555'                 UNION ALL --2 E E
             SELECT 11, '55555,4444,333,22,1'     UNION ALL --5 E E E E E
             SELECT 12, '55555,4444,,333,22,1'    UNION ALL --6 E E b E E E
             SELECT 13, ',55555,4444,,333,22,1,'  UNION ALL --8 b E E b E E E b
             SELECT 14, ',55555,4444,,,333,22,1,' UNION ALL --9 b E E b b E E E b
             SELECT 15, ' 4444,55555 '            UNION ALL --2 E (w/Leading Space) E (w/Trailing Space)
             SELECT 16, 'This,is,a,test.'                   --4 E E E E
            ) d (SomeID, SomeValue)
    ;
    --===== Split the CSV column for the whole table using CROSS APPLY (this is the solution)
     SELECT test.SomeID, test.SomeValue, split.ItemNumber, split.Item, QuotedItem = QUOTENAME(split.Item,'"')
       FROM #JBMTest test
      CROSS APPLY dbo.DelimitedSplit8K(test.SomeValue,',') split
    ;

64 How fast is it? (1,000 Row Test)

65 Building Dates for Whole Years
DECLARE @StartYear DATETIME,
        @EndYear   DATETIME,
        @Cutoff    DATETIME
;
 SELECT @StartYear = '1950',
        @EndYear   = '2050',
        @Cutoff    = …
 SELECT DATEADD(yy,ROW_NUMBER() OVER (ORDER BY (SELECT …
   FROM dbo.Tally t1, --Being used as a row-source here
        dbo.Tally t2
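
One way the whole-years idea could look when run end to end is sketched below. It is not the slide's exact code: the `DECLARE` initializers, the `YearStart` column name, and the use of `sys.all_columns` as a row source in place of `dbo.Tally` are all assumptions, made so that the batch runs on its own (SQL Server 2008 or later).

```sql
-- Sketch only. Variable values, column names, and the sys.all_columns row
-- source are assumptions; the slide itself reads from dbo.Tally.
DECLARE @StartYear DATETIME = '19500101',
        @EndYear   DATETIME = '20500101';

WITH cteTally(N) AS
(   -- The cross join is only a row source; TOP stops it after the rows we need.
    SELECT TOP (DATEDIFF(yy, @StartYear, @EndYear) + 1)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
      FROM sys.all_columns ac1
     CROSS JOIN sys.all_columns ac2
)
SELECT YearStart = DATEADD(yy, t.N - 1, @StartYear)  -- N = 1 gives the start year itself
  FROM cteTally t
 ORDER BY t.N;
```

Run as written, this should return 101 rows, one for 1 January of each year from 1950 through 2050.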

66 Building a Date Range
DECLARE @StartDate DATETIME,
        @EndDate   DATETIME,
        @Cutoff    DATETIME
;
 SELECT @StartDate = ' ',
        @EndDate   = ' ',
        @Cutoff    = …
 SELECT …
   FROM dbo.Tally t --11,000 > 30 years of days
  ORDER BY t.N

67 Building a Time Interval Range
DECLARE @StartDate DATETIME,
        @EndDate   DATETIME,
        @Cutoff    DATETIME,
        @Interval  INT
;
 SELECT @StartDate = ' ',
        @EndDate   = ' ',
        @Cutoff    = …
        @Interval  = 10
 SELECT DATEADD(mi, (ROW_NUMBER() OVER (ORDER BY (SELECT NULL))-1) … @StartDate)
   FROM dbo.Tally t1, --Used as a row-source
        dbo.Tally t2
Change “mi” to “hh” for hours
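
The interval idea can be sketched as a standalone batch like the one below. The date values, the `IntervalStart` column name, the row-count formula in `TOP`, and the `sys.all_columns` row source are assumptions for illustration; the slide's own expressions did not survive in this transcript.

```sql
-- Sketch only. Dates, names, and the TOP formula are assumptions.
DECLARE @StartDate DATETIME = '2014-10-04T00:00:00',
        @EndDate   DATETIME = '2014-10-04T01:00:00',
        @Interval  INT      = 10;   -- minutes between rows

WITH cteTally(N) AS
(   -- One row per interval boundary, including both ends of the range.
    SELECT TOP (DATEDIFF(mi, @StartDate, @EndDate) / @Interval + 1)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
      FROM sys.all_columns ac1
     CROSS JOIN sys.all_columns ac2
)
SELECT IntervalStart = DATEADD(mi, (t.N - 1) * @Interval, @StartDate)
  FROM cteTally t
 ORDER BY t.N;
```

With these values the batch should return 7 rows, from 00:00 through 01:00 in 10-minute steps. Switching both `mi` dateparts to `hh` would step by hours instead.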

68 Building Time Interval “Bins”
DECLARE @StartDate DATETIME,
        @EndDate   DATETIME,
        @Cutoff    DATETIME,
        @Interval  INT
;
 SELECT @StartDate = ' ',
        @EndDate   = ' ',
        @Cutoff    = …
        @Interval  = 1
;
   WITH cteStartTimes AS
        (
         SELECT StartTime = DATEADD(hh,(ROW_NUMBER() OVER (ORDER BY (SELECT …
           FROM dbo.Tally t1, --Used as a row-source
                dbo.Tally t2
        )
 SELECT StartTime,
        Cutoff = DATEADD(hh,1,StartTime)
   FROM cteStartTimes
  ORDER BY StartTime;
Change “hh” to “dd” for Days
Change “hh” to “mi” for Minutes
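
To show what such bins might be used for, here is a minimal sketch that counts rows per hourly bin. The `cteEvents` sample data, the `EventCount` column, and the `sys.all_columns` row source are hypothetical and exist only to make the batch self-contained; they are not part of the slide.

```sql
-- Sketch only. cteEvents is hypothetical sample data, not from the slide.
DECLARE @StartDate DATETIME = '2014-10-04T00:00:00',
        @EndDate   DATETIME = '2014-10-04T06:00:00';

WITH cteTally(N) AS
(   -- One row per hourly bin in the range.
    SELECT TOP (DATEDIFF(hh, @StartDate, @EndDate))
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
      FROM sys.all_columns ac1
     CROSS JOIN sys.all_columns ac2
),
cteBins AS
(   -- Each bin is the half-open interval [StartTime, Cutoff).
    SELECT StartTime = DATEADD(hh, t.N - 1, @StartDate),
           Cutoff    = DATEADD(hh, t.N,     @StartDate)
      FROM cteTally t
),
cteEvents(EventTime) AS
(
    SELECT CAST(v.EventTime AS DATETIME)
      FROM (VALUES ('2014-10-04T00:15:00'),
                   ('2014-10-04T00:45:00'),
                   ('2014-10-04T03:30:00')) v(EventTime)
)
SELECT b.StartTime,
       b.Cutoff,
       EventCount = COUNT(e.EventTime)   -- empty bins still appear, with 0
  FROM cteBins b
  LEFT JOIN cteEvents e
    ON e.EventTime >= b.StartTime
   AND e.EventTime <  b.Cutoff
 GROUP BY b.StartTime, b.Cutoff
 ORDER BY b.StartTime;
```

The `>=` and `<` pair keeps an event that falls exactly on a boundary from being counted in two bins.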

69 Great for Preventing SQL INJECTION
Cleaning Strings
 CREATE FUNCTION dbo.CleanString8K
        (
        @pString  VARCHAR(8000),
        @pPattern VARCHAR(8000)
        )
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
   WITH E1(N) AS ( --=== Create Ten 1's
                  SELECT 1 UNION ALL SELECT 1 UNION ALL
                  SELECT 1 UNION ALL SELECT 1 …
                 ),
        E2(N) AS (SELECT 1 FROM E1 a, E1 b),
        E4(N) AS (SELECT 1 FROM E2 a, E2 b), …,000
  cteTally(N) AS (SELECT TOP … ROW_NUMBER() OVER (ORDER BY (SELECT N)) FROM E4)
 SELECT CleanedString = …
        SELECT '' + … t.N, 1)
          FROM cteTally t
         WHERE … t.N, 1) COLLATE Latin1_General_BIN … COLLATE Latin1_General_BIN
           FOR XML PATH('')
        );
@pPattern = [A-Z] = Upper Case Alpha Only
@pPattern = [a-z] = Lower Case Alpha Only
@pPattern = [0-9] = Numeric Digits Only
@pPattern = [A-Za-z0-9] = Alpha-Numeric Only
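
The character-filtering technique can be tried as a standalone batch along the following lines. This is a sketch under assumptions rather than the function from the slide: the sample string, the `SUBSTRING`/`LIKE` filter, and the `sys.all_columns` row source are illustrative choices.

```sql
-- Sketch only. Sample input and the exact filter expression are assumptions.
DECLARE @pString  VARCHAR(8000) = 'Jeff-Moden #315; DROP',
        @pPattern VARCHAR(8000) = '[A-Za-z0-9]';   -- keep alphanumerics only

WITH cteTally(N) AS
(   -- One row per character position in the string.
    SELECT TOP (DATALENGTH(ISNULL(@pString, '')))
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
      FROM sys.all_columns ac1
     CROSS JOIN sys.all_columns ac2
)
SELECT CleanedString =
       (   -- Keep each character that matches the pattern, then glue them back together.
        SELECT '' + SUBSTRING(@pString, t.N, 1)
          FROM cteTally t
         WHERE SUBSTRING(@pString, t.N, 1) COLLATE Latin1_General_BIN
               LIKE @pPattern COLLATE Latin1_General_BIN
         ORDER BY t.N
           FOR XML PATH('')
       );
```

With the sample values above, the result should be `JeffModen315DROP`. The binary collation makes ranges such as `[A-Z]` match by code point, so upper and lower case are treated separately.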

70 Initial Caps
 CREATE FUNCTION dbo.InitialCap
--Original code by Usman Butt, modified for extra functionality by Jeff Moden
--===== Declare the IO of the function
        … VARCHAR(8000))
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
--===== Force the first character of the string to upper case.
     -- Obviously, non-letter values will not be changed by UPPER.
 SELECT InitialCapString = … --First character always
        +
        ( --=== If the current character in the given string isn't a letter then
             -- concatenate the next character as an UPPER case character.
             -- Otherwise, make it lower case character.
             -- The COLLATE clause speeds up non-default collations.
         SELECT CASE
                WHEN … t.N , 1) COLLATE Latin1_General_BIN LIKE '[^A-Za-z'']' COLLATE Latin1_General_BIN
                  OR … t.N , 4) COLLATE Latin1_General_BIN LIKE '[^A-Za-z][A-Za-z][A-Za-z][A-Za-z]' COLLATE Latin1_General_BIN
                THEN … t.N+1, 1))
                ELSE … t.N+1, 1))
                 END
           FROM dbo.Tally t
          WHERE t.N < …
          ORDER BY t.N
            FOR XML PATH(''), TYPE
        ).value('text()[1]', 'varchar(8000)')
;

71 Quick Review

72 Quick Review
In the Introduction, we found out why loops can be so slow.
We learned the definition of some “new” terms in the Glossary, including “RBAR” and “Hidden RBAR”.
We learned of the hidden power in SQL Server through the use of “Pseudo Cursors”. Need Test Data? Build it!
We learned what a Tally Table is and how it works as a high performance “Pseudo Cursor”.
We learned that Recursive Counting CTEs are a form of “Hidden RBAR”, are nearly as bad as While Loops for performance, and are resource hogs.
We learned how to create and use a “table-less” Tally “Table” that lives only in memory and causes virtually no reads.
We learned how to use the Tally Table and cteTally through the use of several examples.
Thanks for listening, folks!

73 Recommended Reading

74 Recommended Reading
The "Numbers" or "Tally" Table: What it is and how it replaces a loop.
Tally OH! An Improved SQL 8K “CSV Splitter” Function
Generating Test Data: Parts 1 and 2
How to Make Scalar UDFs Run Faster (SQL Spackle)
Hidden RBAR: Counting with Recursive CTE's
Creating a comma-separated list (SQL Spackle) – Wayne Sheffield
Understanding and Using APPLY: Parts 1 and 2 – Paul White

75 Q’n’A The Tally Table and Pseudo Cursors What they are and how they replace certain While Loops by Jeff Moden #315 Pittsburgh, Pennsylvania

