Getting the most from your SAN – File and Filegroup design patterns


1 Getting the most from your SAN – File and Filegroup design patterns
Stephen Archbold

2 About me
Microsoft Certified Master SQL Server 2008
Working with SQL Server for 6+ years
Former SQL Server Consultant & Production DBA
Currently Technical Lead at Arrowstreet Capital
Specialising in Performance Tuning and Solution Design
Blog at and Get me on LinkedIn

3 Agenda
Data Filegroup/File Fundamentals
Storage Design Patterns:
OLTP
Data Warehousing – Fast Track style
Data Warehousing on a SAN
What other go-faster buttons have we got?
Case Study – The unruly fact table
How do we make the changes
Speaker note: File and filegroup recommendations were historically about maintenance, with the mantra "Don't partition for performance" – not so much anymore.

4 Data Filegroup/File Fundamentals
General filegroup recommended practices – separate filegroups for:
Nothing but system tables in PRIMARY
Different I/O patterns
Different volatility
Data age
If using multiple files in a filegroup:
Files must be equally sized
Files must be equally full
SQL Server does not redistribute data when adding more files
Proportional fill / round robin
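The multi-file guidance above can be sketched in DDL. Everything here – database name, paths, sizes – is a hypothetical illustration, not part of the deck:

```sql
-- Hypothetical sketch: keep user data out of PRIMARY, and make the two
-- files in the user filegroup equal in size and growth so proportional
-- fill keeps them equally full.
CREATE DATABASE MyDB
ON PRIMARY
    (NAME = MyDB_sys,   FILENAME = 'D:\Data\MyDB_sys.mdf',   SIZE = 256MB),
FILEGROUP UserData
    (NAME = MyDB_data1, FILENAME = 'D:\Data\MyDB_data1.ndf', SIZE = 10GB, FILEGROWTH = 1GB),
    (NAME = MyDB_data2, FILENAME = 'E:\Data\MyDB_data2.ndf', SIZE = 10GB, FILEGROWTH = 1GB)
LOG ON
    (NAME = MyDB_log,   FILENAME = 'L:\Log\MyDB_log.ldf',    SIZE = 4GB);

-- Default new objects into UserData so PRIMARY holds nothing but system tables
ALTER DATABASE MyDB MODIFY FILEGROUP UserData DEFAULT;
```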

5 Pattern 1 – OLTP
Transactional processing is all about speed
You want to get the transaction recorded and the user out as quickly as possible
The throughput metric becomes less about MB/sec and more about transactions and I/Os per second
Amazon as an example of a user waiting

6 Challenges of OLTP
Solid state disk is becoming more commonplace
These thrive on random I/O
As the databases can be small, file/filegroup layout can suffer
Faster disk brings different challenges
We're not covering log writes becoming the bottleneck in this session

7 [Diagram: example OLTP filegroup layout – PRIMARY (MyDB.MDF), Transactions (File1.NDF, File2.NDF – PAGELATCH!), Reference (Ref.NDF), Volatile (Volatile.NDF)]
Speaker note: Discuss why you separate here, even though maintainability (piecemeal restore) would not be used – neatness, and you only have to move the hot I/O section to faster disk.
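Separating reference and volatile data as described above comes down to the ON clause at table-creation time. A minimal sketch with hypothetical table, column and filegroup names:

```sql
-- Slow-changing lookup data on its own filegroup
CREATE TABLE dbo.Currency
(
    CurrencyCode CHAR(3)     NOT NULL PRIMARY KEY,
    Description  VARCHAR(50) NOT NULL
) ON [Reference];

-- Hot, high-churn transactional data on a separate filegroup,
-- which can later be moved to faster disk on its own
CREATE TABLE dbo.OrderTransactions
(
    TransactionId BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    OrderDate     DATETIME2            NOT NULL,
    Amount        DECIMAL(18,2)        NOT NULL
) ON [Transactions];
```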

8 Behind the scenes
Single file – wait type / %:
SOS_SCHEDULER_YIELD 55
PAGEIOLATCH_EX 17
PAGELATCH_SH 15
ASYNC_IO_COMPLETION 5
PAGELATCH_UP
Two files – wait type / %:
SOS_SCHEDULER_YIELD 66
PAGEIOLATCH_EX 12
PAGELATCH_SH 10
ASYNC_IO_COMPLETION 7
SLEEP_BPOOL_FLUSH 2
Speaker note: Page latch contention with a lower number of files – PFS

9 Facts and Figures
Test conditions:
Data files on solid state disk
GUID clustered index to generate random I/O
16 threads inserting at once
Full test harness in useful links slide
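A minimal sketch of the kind of test table the conditions above imply (names are hypothetical; the presenter's full harness is in the useful links slide). A GUID clustered key scatters each insert to a random point in the index:

```sql
CREATE TABLE dbo.InsertTest
(
    Id      UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID(),  -- random insert point
    Payload CHAR(200)        NOT NULL DEFAULT 'x',      -- pad the row
    CONSTRAINT PK_InsertTest PRIMARY KEY CLUSTERED (Id)
);

-- Each of the 16 concurrent threads would loop on something like:
INSERT dbo.InsertTest DEFAULT VALUES;
```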

10 What can we take away from this?
Resolving in-memory contention lies with the file layout
This is actually nothing new – TempDB has been tuned this way for years!
Keep in mind, files are written to in a "round robin" fashion

11 Data Warehousing

12 Pattern 2 – Fast Track
Scenario:
Large volume
Star schema
Need to optimize for sequential throughput
Scanning entire tables
Not shared storage

13 Large Partitioned Fact Table
[Diagram: each partition of the fact table maps to its own file and LUN (MyFact_part1.NDF … MyFact_part12.NDF), with files spread across controllers, HBAs and enclosures so that every CPU core drives a dedicated filegroup / file / LUN path]
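The partition-per-file/LUN layout in the diagram might be declared like this – a reduced, hypothetical four-partition sketch (the diagram uses twelve), with invented boundary dates and filegroup names:

```sql
-- Three boundaries -> four partitions, one per quarter
CREATE PARTITION FUNCTION pf_FactDate (date)
AS RANGE RIGHT FOR VALUES ('2013-04-01', '2013-07-01', '2013-10-01');

-- Map each partition to its own filegroup (each backed by one file on one LUN)
CREATE PARTITION SCHEME ps_FactDate
AS PARTITION pf_FactDate
TO (FG_Fact1, FG_Fact2, FG_Fact3, FG_Fact4);

-- The fact table is created on the scheme, partitioned by its date key
CREATE TABLE dbo.MyFact
(
    OrderDate date  NOT NULL,
    Amount    money NOT NULL
) ON ps_FactDate (OrderDate);
```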

14 Fast Track – Pros and Cons
Pros:
Easy to figure out your needs
Simple, scalable and fast
In-depth guidance available from Microsoft
Cons:
Not recommended for pinpoint queries
Only really for processing entire data sets
Need a VERY understanding infrastructure team 

15 Pattern 3 – Data Warehouse on a SAN
Large volume
Star schema
Cannot optimize for sequential throughput
Shared storage
More mixed workload

16 Goal – Large Request Size
We need read ahead
Enterprise Edition can issue a single read-ahead request of up to 512 KB (on Standard you're stuck at 64 KB)
It can issue several of these (outstanding I/O) at a time, up to 4 MB
But you may not even be close to 512 KB…
Larger I/Os are issued when data is in logical order
Read ahead will issue regardless of where data is, but if you get sequential pages, you get more bang for your buck!

17 How close are you to the 512 KB nirvana?
Run something like: And watch this guy:
Speaker note: Mention BCP not being suitable, as writing to the destination can slow requests / generate ASYNC_NETWORK_IO
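The query and counter referred to above were on-slide images, so what follows is an assumption rather than the presenter's exact script: one common way to check average read size is the virtual file stats DMV.

```sql
-- Average bytes per read, per data file, since instance startup.
-- During large scans, values approaching 512 KB mean read ahead is
-- issuing full-sized requests.
SELECT  DB_NAME(vfs.database_id)                                   AS database_name,
        mf.name                                                    AS file_name,
        vfs.num_of_reads,
        vfs.num_of_bytes_read / NULLIF(vfs.num_of_reads, 0) / 1024 AS avg_read_kb
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id     = vfs.file_id
WHERE   mf.type_desc = 'ROWS';
```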

18 Fragmentation – Party Foul Champion
The #1 killer of read ahead
Read-ahead size will be reduced if the pages being requested aren't in logical order
Being a diligent type, you rebuild your indexes
Because SQL Server is awesome, it does this using parallelism!
So what's the catch…? Parallel rebuilds can leave pages interleaved across threads, so
If read ahead is your goal, use MAXDOP 1 to rebuild your indexes!
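The serial rebuild advised above is a one-line option (table name hypothetical):

```sql
-- MAXDOP 1 keeps a single thread writing the new index, so pages land
-- in logical order and read ahead can issue full-sized requests.
ALTER INDEX ALL ON dbo.MyFact
REBUILD WITH (MAXDOP = 1, SORT_IN_TEMPDB = ON);
```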

19 [Diagram: starting point – everything in PRIMARY (MyDB.MDF) on Enclosure 1 – filegroup / file / LUN]

20 [Diagram: SAN data warehouse layout on Enclosure 1 – Primary (MyDB.MDF), Dimensions (Dimensions.NDF), Volatile (Staging.NDF), Facts (Facts.NDF), and the large fact table split across Partition1.NDF … Partition8.NDF, one filegroup / file / LUN per partition]
Speaker note: If you're a BI guy…

21 Getting data out of your Data Warehouse for Analysis Services
How does Analysis Services pull in data?
While it's a single operation to process a cube, Analysis Services spins up several threads
Now you've moved from a table scan, which is single threaded, to multiple range scans
Look familiar?

22 Do we have any go-faster buttons?
On read-heavy workloads with Enterprise Edition: compression
If storing multiple tables in a filegroup:
"-E" – for data warehouses. Allocates 64 extents (4 MB) per object, per file, rather than the standard 1 (64 KB)
If using multiple files in a filegroup:
"-T1117" – for all. Ensures that if auto-growth occurs on one file, it occurs on all the others, so "round robin" remains in place
In general, on dedicated SQL Servers:
Evaluate "-T834" – requires Lock Pages in Memory enabled
Enables large page allocations for the buffer pool (2 MB – 16 MB)
Buffer pool allocations are expensive; this reduces the number of them
Can cause problems if the server is not dedicated or doesn't have max memory set, as it needs to allocate the larger amount and will fail if it can't get it or if memory is fragmented by other apps
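-E, -T1117 and -T834 above are startup parameters added to the SQL Server service, not session settings; after a restart, the trace-flag pair can be confirmed from any connection:

```sql
-- After adding -E -T1117 -T834 to the service startup parameters
-- and restarting, check that the trace flags are active globally:
DBCC TRACESTATUS (1117, 834);
```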

23 Case Study – The Unruly Fact Table
3 TB data warehouse
Table scans were topping out at 300 MB/sec
Storage was capable of 1.7 GB/sec
Table partitioning was in place
All tables were in a single filegroup
Had to get creative on enhancing throughput

24 Test Conditions
16-core server, hyper-threaded to 32 logical cores
128 GB of memory
SQLIO fully sequential, storage gives 2.2 GB/sec
32 range scans started up to simulate workload
Page compression was enabled; the T834 trace flag was enabled
MAXDOP of 1 on the server to ensure the number of threads was controlled

25 Facts and Figures
Not close to peak, as it didn't drive outstanding I/O high enough
All files were on a single disk partition

26 Other metrics

Scenario                            Time (secs)   Avg I/O (KB)   Avg MB/sec   Max MB/sec
Baseline                            70            181            575          622
Post index rebuild with MAXDOP 1    68            615            668
Single FG for large table(s)        56            511            942          1,196
Multiple FG, one per partition      42            505            1,150        1,281

Speaker notes: Make sure to flag that doubling the throughput didn't halve the time – partition size skew. Hash partitioning can resolve this. The number of partitions should ideally equal the number of cores, as only one thread can run per scheduler (the rest go waiting). A table scan is single threaded, and a partition scan is single threaded, but multiple partition scans become parallel scans.

27 How do we make the changes?
Thankfully easy – index rebuilds!
For non-partitioned tables, drop and re-create the index on the new filegroup
For partitioned tables, alter the partition scheme to point to the new filegroup
For heaps, create a clustered index on the table on the new filegroup, then drop it!
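The non-partitioned and heap cases above might look like this (all object and filegroup names are hypothetical):

```sql
-- Non-partitioned table: rebuild the clustered index onto the new filegroup;
-- moving the clustered index moves the data with it
CREATE UNIQUE CLUSTERED INDEX PK_MyFact
ON dbo.MyFact (FactKey)
WITH (DROP_EXISTING = ON)
ON [FG_NewHome];

-- Heap: build a clustered index on the target filegroup to move the data,
-- then drop it to return the table to a heap
CREATE CLUSTERED INDEX ix_Move ON dbo.MyHeap (SomeColumn) ON [FG_NewHome];
DROP INDEX ix_Move ON dbo.MyHeap;
```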

28 Summary
File and filegroup considerations can yield huge gains
Know your workload and optimise for it
If you have a hybrid workload, then have a hybrid architecture!
Don't neglect your SQL Server settings
Code changes and indexes aren't the only way to save the day!
These gains apply across the board rather than to pinpoint problems
They also help resolve in-memory bottlenecks

29 Useful links
Paul Randal – Multi file/filegroup testing on Fusion-io
Fast Track Configuration Guide
Resolving latch contention
Maximizing table scan speed on SQL Server 2008 R2
Specifying storage requirements (find that sweet spot!)
Fragmentation in data warehouses
Partial database availability and piecemeal restores

30 Thank you!

