Optimizing SQL Server and Databases for large Fact Tables =tg= Thomas Grohser, NTT Data SQL Server MVP SQL Server Performance Engineering SQL Saturday.

1 Optimizing SQL Server and Databases for large Fact Tables
=tg= Thomas Grohser, NTT Data
SQL Server MVP
SQL Server Performance Engineering
SQL Saturday #506 BI Edition, April 30th 2016, Baltimore, Maryland

2 select * from =tg= where topic =
- =tg= Thomas Grohser, NTT DATA
- Senior Director Technical Solutions Architecture
- email: Thomas.grohser@nttdata.com / tg@grohser.com
- Focus on SQL Server Security, Performance Engineering, Infrastructure and Architecture
- New Papers coming 2016
- Close Relationship with
  - SQLCAT (SQL Server Customer Advisory Team)
  - SCAN (SQL Server Customer Advisory Network)
  - TAP (Technology Adoption Program)
  - Product Teams in Redmond
- Active PASS member and PASS Summit Speaker
- 21 Years with SQL Server

3 And one more thing … All I know about BI is how to not install it

4 NTT DATA Overview
- 20,000 professionals – Optimizing balanced global delivery
- $1.6B – Annual revenues with history of above-market growth
- Long-term relationships – >1,000 clients; mid-market to large enterprise
- Delivery excellence – Enabled by process maturity, tools and accelerators
- Flexible engagement – Spans consulting, staffing, managed services, outsourcing, and cloud
- Industry expertise – Driving depth in select industry verticals
Why NTT DATA for MS Services:
- NTT DATA is a Microsoft Gold Certified Partner.
- We cover the entire MS Stack, from applications to infrastructure to the cloud
- Proven track record with 500+ MS solutions delivered in the past 20 years

5 Drawing at the end of the session
- Drop your business card or fill out the provided blank card and drop it in the cup
- Must be present at the time of drawing at the end of the session to win:

6 Agenda
- Defining the issue/problem
- Looking at the tools
- Using the right tools
- Q&A
ATTENTION: Important Information may be displayed on any slide at any time! Without Warning!

7 Definition of a large fact table
- Moving target over time
- 2001: for me, big was
  - > 1 billion rows
  - > 90 GB
- 2011: for me, big was
  - > 1.3 trillion rows
  - > 250 TB
- 2016
  - ???
  - 1 PB ??? 10 PB ???

8 Size matters not!
- Having the right tools in place and knowing how to use them to handle the data is the solution.

9 The Problem
- Trying to run 50 reports on a big fact table, each of which needs to scan the whole table…
- The data is ready at 5 am…
- Reports need to be ready by 9 am…
- The baseline
  - Each report takes about 2 hours to finish…
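A quick back-of-the-envelope check shows why this baseline cannot work. The sketch below uses only the numbers from the slide and assumes, optimistically, that reports running in parallel do not slow each other down:

```python
import math

REPORTS = 50             # reports to produce (from the slide)
HOURS_PER_REPORT = 2.0   # baseline runtime per report (from the slide)
WINDOW_HOURS = 9 - 5     # data ready at 5 am, reports due at 9 am

# Running the reports one after another.
serial_hours = REPORTS * HOURS_PER_REPORT

# How many back-to-back "waves" of 2-hour reports fit into the window,
# and how many reports each wave must run concurrently.
waves = math.floor(WINDOW_HOURS / HOURS_PER_REPORT)
min_parallel = math.ceil(REPORTS / waves)

print(f"serial: {serial_hours:.0f} h, waves: {waves}, reports per wave: {min_parallel}")
```

Run serially the reports need 100 hours for a 4-hour window, so at least 25 full-table scans would have to run at the same time without getting any slower.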

10 Tools
- Hardware (Server, Storage)
- SQL Server (Standard, BI, Enterprise)
- Clever Configuration
- Clever Query Scheduling

11 Hardware
“The grade of steel”

12 CPU is not the limit
- On a modern CPU, each core can process about 500 MB/s
- How many cores do we have in a commodity server?
  - 4-18 cores per socket
  - 1-8 sockets
  - That's 4 to 144 cores
  - or 2 to 72 GB per second
  - or 7 to 260 TB per hour
- CPU capacity is rarely a bottleneck
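The range quoted on the slide can be reproduced with a few lines of arithmetic. This is a minimal sketch that takes the 500 MB/s per core rule of thumb at face value and uses decimal units (1 GB = 1000 MB):

```python
MB_PER_SEC_PER_CORE = 500  # rule of thumb from the slide

def scan_capacity(cores_per_socket, sockets):
    """Return (cores, GB per second, TB per hour) for a server configuration."""
    cores = cores_per_socket * sockets
    gb_per_sec = cores * MB_PER_SEC_PER_CORE / 1000
    tb_per_hour = gb_per_sec * 3600 / 1000
    return cores, gb_per_sec, tb_per_hour

smallest = scan_capacity(cores_per_socket=4, sockets=1)
largest = scan_capacity(cores_per_socket=18, sockets=8)
print(smallest)
print(largest)
```

The smallest configuration works out to about 7 TB per hour and the largest to about 260 TB per hour, matching the slide.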

13 Understanding how SQL scans data
- SQL Server reads the data page by page
- SQL Server may perform read-ahead
  - Dynamically adjusts the read-ahead size by table
  - Standard Edition: up to 128 pages
  - Enterprise Edition: up to 512 pages
  - That's up to 1 MB (Std) or 4 MB (Ent)
- Read ahead as much as possible…
  - Why?
  - Reading 4 MB takes about as long as reading 8 KB
- So let's help SQL Server do it.
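The page counts translate into request sizes as follows. The sketch assumes the 8 KB SQL Server page size and, as an illustration only, counts the read requests needed to scan the 90 GB table mentioned earlier (taken here as binary gigabytes):

```python
PAGE_KB = 8  # size of a SQL Server data page

def read_ahead_mb(pages):
    """Size of one read-ahead request in MB."""
    return pages * PAGE_KB / 1024

def requests_to_scan(table_gb, pages_per_request):
    """Number of read requests needed to scan a table of table_gb (binary) GB."""
    table_pages = table_gb * 1024 * 1024 // PAGE_KB
    return -(-table_pages // pages_per_request)  # ceiling division

print(read_ahead_mb(128), read_ahead_mb(512))
print(requests_to_scan(90, 1), requests_to_scan(90, 512))
```

If a 4 MB read really costs about as much as an 8 KB read, going from single-page reads to 512-page reads cuts the number of requests for the same scan by a factor of 512.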

14 Read Ahead happens if …
- The next data needed is in contiguous pages on the disk.
- Problem with 2 or more tables that grow at the same time.

15 Multiple Data Files
[Diagram: page number sequences split across data files: 2-4-6-8-… / 1-3-5-7-9-… and 3-6-9-… / 1-2-4-5-7-8-…]

16 Multiple File Groups
- FG1
- FG2
[Diagram: contiguous page sequence 1-2-3-4-5-6-7-8-9-…]
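Why tables that grow at the same time hurt read-ahead, and why separate file groups help, can be illustrated with a toy allocation model. This is not SQL Server's actual allocator; the table names and the strictly alternating growth pattern are assumptions made for the illustration:

```python
def allocate(growth_order, shared_file):
    """Hand out extent numbers to tables in the order they ask for space.

    shared_file=True  : all tables take extents from one file
    shared_file=False : each table has its own file (its own file group)
    """
    next_free = {}  # next free extent number per file
    layout = {}     # table -> list of extent numbers it received
    for table in growth_order:
        file_id = "shared" if shared_file else table
        extent = next_free.get(file_id, 0)
        next_free[file_id] = extent + 1
        layout.setdefault(table, []).append(extent)
    return layout

def longest_run(extents):
    """Length of the longest run of physically adjacent extents."""
    best = run = 1
    for prev, cur in zip(extents, extents[1:]):
        run = run + 1 if cur == prev + 1 else 1
        best = max(best, run)
    return best

growth = ["A", "B"] * 8  # two tables growing in lockstep
one_file = allocate(growth, shared_file=True)
two_groups = allocate(growth, shared_file=False)
print(longest_run(one_file["A"]), longest_run(two_groups["A"]))
```

In the shared file, table A never gets two adjacent extents, so there is nothing contiguous to read ahead. With one file group per table, all eight extents of A are adjacent.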

17 SQL Server Startup Options
- -E can be your friend if you have large tables
- -E allocates 64 extents at a time
  That is 4 MB at a time for each table instead of 64 KB
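The two sizes follow directly from the page and extent sizes. A minimal worked calculation, assuming 8 KB pages and 8 pages per extent:

```python
PAGE_KB = 8            # size of a data page
PAGES_PER_EXTENT = 8   # pages in one extent
EXTENT_KB = PAGE_KB * PAGES_PER_EXTENT

default_alloc_kb = 1 * EXTENT_KB   # one extent at a time
with_e_alloc_kb = 64 * EXTENT_KB   # 64 extents at a time with -E
with_e_pages = 64 * PAGES_PER_EXTENT

print(default_alloc_kb, "KB by default")
print(with_e_alloc_kb // 1024, "MB with -E,", with_e_pages, "pages")
```

A single allocation with -E covers 512 contiguous pages, which is the same size as the largest Enterprise Edition read-ahead request.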

18 Multiple Data Files Again

19 IO and Storage Path

20 Read speed factor – Direct Attached
[Diagram labels, in extraction order: 1X, RAID 1, 1-2X, 2X, RAID 5, 0.5-2X, RAID 5, 0.25-4X]

21 Read speed factor - SAN
[Diagram labels: 1X, 2X]
- Ensure there are enough paths to the array
- Disable read cache if possible

22 Understand the path to the drives
[Diagram labels: Cache, Fiber Channel Ports, Controllers/Processors, Switch, HBA, Switch, RAID Cntr., SSD, SAN, SSD, DAS]

23 IO Bottlenecks
- Disks (10-160 MB/sec)
- Disk Interface (3-12 Gb/sec)
- RAID Controller (1-2 GB/sec)
- Ethernet (1 or 10 Gb/sec)
- Fiber Channel (2-16 Gb/sec)
- Host Bus Adapter (2-32 Gb/sec)
- PCI Express Bus (0.25-32 GB/sec)
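The slowest component on the path sets the scan speed. The sketch below walks one hypothetical path; the component mix (24 disks, a single 8 Gb/s Fibre Channel link, and so on) is an assumption picked from within the ranges above, and the Gb/s to MB/s conversion ignores protocol overhead:

```python
def gbit_to_mb(gbit_per_sec):
    """Raw conversion from Gb/s to MB/s (8 bits per byte, no protocol overhead)."""
    return gbit_per_sec * 1000 / 8

# Hypothetical IO path; every figure is an illustrative assumption.
path = {
    "disks (24 x 150 MB/s)": 24 * 150,
    "RAID controller (2 GB/s)": 2000,
    "Fibre Channel link (8 Gb/s)": gbit_to_mb(8),
    "host bus adapter (16 Gb/s)": gbit_to_mb(16),
}

bottleneck = min(path, key=path.get)
table_tb = 10
scan_hours = table_tb * 1000 * 1000 / path[bottleneck] / 3600

print(bottleneck, path[bottleneck], "MB/s")
print(f"scanning {table_tb} TB takes about {scan_hours:.1f} h")
```

In this configuration the disks could deliver 3600 MB/s, but the single 8 Gb/s link caps the scan at 1000 MB/s, so adding more disks would not make the reports any faster.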

24 Schema and Indexes

25 Choose the clustered index key wisely
- If you have a lot of queries that range scan
  - WHERE value BETWEEN x AND y
- Multiple dates in a table (e.g. Order, Ship, Delivery date, …)
  - Which to choose?
  - None
  - Put the index on a unique ID and have a helper table: Date | DateType | MinID | MaxID
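The helper-table idea can be sketched with Python's built-in sqlite3 module standing in for SQL Server. Only the helper columns (Date, DateType, MinID, MaxID) come from the slide; the table names, the fact columns and the sample rows are hypothetical. The date lookup is turned into an ID range, and the original date predicate is kept because the ID range can contain rows with other dates:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE Fact (ID INTEGER PRIMARY KEY, OrderDate TEXT, ShipDate TEXT, Amount REAL);
CREATE TABLE DateHelper (Date TEXT, DateType TEXT, MinID INTEGER, MaxID INTEGER,
                         PRIMARY KEY (DateType, Date));
""")
con.executemany("INSERT INTO Fact VALUES (?, ?, ?, ?)", [
    (1, "2016-04-01", "2016-04-02", 10.0),
    (2, "2016-04-01", "2016-04-03", 20.0),
    (3, "2016-04-02", "2016-04-02", 30.0),
    (4, "2016-04-02", "2016-04-04", 40.0),
    (5, "2016-04-03", "2016-04-04", 50.0),
])

# One helper row per date and date type, holding the ID range for that date.
# Column names come from this fixed tuple, never from user input.
for date_type, column in (("Order", "OrderDate"), ("Ship", "ShipDate")):
    con.execute(
        f"INSERT INTO DateHelper SELECT {column}, ?, MIN(ID), MAX(ID) "
        f"FROM Fact GROUP BY {column}", (date_type,))

def total_amount(date_type, column, first, last):
    """Sum Amount for a date range by range-scanning the fact table on ID."""
    lo, hi = con.execute(
        "SELECT MIN(MinID), MAX(MaxID) FROM DateHelper "
        "WHERE DateType = ? AND Date BETWEEN ? AND ?",
        (date_type, first, last)).fetchone()
    if lo is None:
        return 0.0
    return con.execute(
        f"SELECT COALESCE(SUM(Amount), 0) FROM Fact "
        f"WHERE ID BETWEEN ? AND ? AND {column} BETWEEN ? AND ?",
        (lo, hi, first, last)).fetchone()[0]

print(total_amount("Ship", "ShipDate", "2016-04-02", "2016-04-03"))
print(total_amount("Order", "OrderDate", "2016-04-02", "2016-04-02"))
```

Every date column gets a range scan on the same ID-ordered table, so no single date has to be chosen as the clustered key.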

26 Table Partitioning
- Great tool to make maintaining the database easier, but it does not give us much in performance. It could actually slow us down.
- Might be needed to spread data across multiple File Groups

27 Row and Page Compression
- ROW compression
  - Almost no overhead
  - Can save several unused bytes in each row
  - Remember: 1 byte less on 1 billion rows is 1 GB
- PAGE compression
  - Some overhead
  - Can save a lot on repeating patterns (same values within a page)
  - New data is not compressed!
- Never compress lookup data
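The "1 byte on 1 billion rows" rule scales linearly, which is worth seeing at the table sizes mentioned earlier. A small worked calculation in decimal units; the 4 bytes per row in the last line is a hypothetical saving:

```python
GB = 10**9
TB = 10**12

def bytes_saved(rows, bytes_per_row):
    """Total bytes saved when every row shrinks by bytes_per_row."""
    return rows * bytes_per_row

print(bytes_saved(10**9, 1) / GB, "GB")                    # 1 billion rows, 1 byte each
print(bytes_saved(1_300_000_000_000, 1) / TB, "TB")        # 1.3 trillion rows, 1 byte each
print(bytes_saved(1_300_000_000_000, 4) / TB, "TB")        # hypothetical 4 bytes per row
```

Every byte saved is also a byte that no longer has to travel through the IO path on each full scan.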

28 Merry-Go-Round (Piggyback) Scan
[Diagram labels: Query 1, Query 2]
- Enterprise Edition only
- Automatically invoked
- With planning, much better results
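Why planning matters for a shared scan can be shown with a toy model. It is not SQL Server's implementation: it simply assumes that a query joining a running scan reads along with it and wraps around until it has seen every page once, and it measures time in page reads. The table size and join points are made up:

```python
def shared_scan_reads(table_pages, join_points):
    """Physical page reads when overlapping queries share one circular scan.

    join_points: when each query starts, measured in pages read so far.
    """
    reads = 0
    covered_until = None  # the scan is known to keep running until this point
    for start in sorted(join_points):
        end = start + table_pages
        if covered_until is None or start >= covered_until:
            reads += table_pages           # no scan running: start a fresh one
        else:
            reads += end - covered_until   # piggyback, then read the missed part
        covered_until = end
    return reads

TABLE_PAGES = 1000
independent = 3 * TABLE_PAGES                             # three private scans
staggered = shared_scan_reads(TABLE_PAGES, [0, 500, 900])
together = shared_scan_reads(TABLE_PAGES, [0, 0, 0])
print(independent, staggered, together)
```

Three staggered queries already need fewer reads than three private scans, and starting them together reads the table only once. That is the scheduling payoff the slide hints at.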

29 Column Store Index
- With SQL2016 finally usable (updateable without workarounds)
- Awesome Compression Ratios
- Even better results if a lot of queries only require a few columns of the fact table
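The benefit for queries that touch only a few columns can be estimated before any compression is considered. The column names and byte widths below are hypothetical, and the model ignores compression and storage overhead:

```python
# Hypothetical fact table layout: column name -> bytes per value.
COLUMN_BYTES = {
    "OrderID": 8, "CustomerID": 4, "OrderDate": 3,
    "ShipDate": 3, "Amount": 8, "Comment": 74,
}

def bytes_scanned(rows, needed_columns, columnar):
    """Bytes read for a full scan: every column (row store) or only the needed ones."""
    if columnar:
        width = sum(COLUMN_BYTES[c] for c in needed_columns)
    else:
        width = sum(COLUMN_BYTES.values())
    return rows * width

rows = 10**9
needed = ["OrderDate", "Amount"]
row_store = bytes_scanned(rows, needed, columnar=False)
column_store = bytes_scanned(rows, needed, columnar=True)
print(row_store / 10**9, "GB vs", column_store / 10**9, "GB")
```

With these assumed widths the column store reads 11 percent of what the row store reads, and the compression ratios come on top of that.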

30 THANK YOU! and may the force be with you… Questions? thomas.grohser@nttdata.com tg@grohser.com

