1  Bottlenecks: SQL Server on Flash Storage
Matt Henderson, Consulting Engineer, Violin Memory

2  Modern Datacenters
 Blade servers
 Rack servers
 Multi-CPU (2-4 sockets)
 Multi-core (4-10 cores)
 2.0GHz+ per core
 Moore's Law shifted from faster CPUs to adding more cores
 Cheap and plentiful processing power

3  Inside the Server
 Many threads
 Many cores
 Many time slices
 10-40% CPU utilization
 Why is the CPU not at 100%? Because the workload is resource constrained

4  Databases
 Persistent data storage
 Data access engine
 Application logic processor

5  SQL Server Process Management
 One SQL scheduler per logical core
 Each SQL scheduler time-slices between users (see the query sketch below)
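
The one-scheduler-per-core layout is easy to confirm from the DMVs. A minimal sketch (my own, not from the deck):

-- One row per scheduler: which CPU it maps to and how deep its queues are
SELECT scheduler_id,
       cpu_id,
       status,
       current_tasks_count,       -- tasks currently assigned to this scheduler
       runnable_tasks_count       -- tasks ready to run, waiting for CPU time
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';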

6  Queues
 Running ‒ currently executing process
 Waiting ‒ waiting for a resource (I/O, network, locks, latches, etc.)
 Runnable ‒ resource ready, waiting to get on the CPU
(A query sketch for viewing these states follows below.)
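
These states are visible per request in sys.dm_exec_requests, where the waiting state is reported as "suspended". A minimal sketch (mine, not from the deck) that counts requests in each state:

-- Snapshot of user requests by state: running, runnable, or suspended (waiting)
SELECT r.status,
       COUNT(*)         AS request_count,
       SUM(r.wait_time) AS total_wait_ms
FROM sys.dm_exec_requests AS r
WHERE r.session_id > 50            -- skip most internal system sessions
GROUP BY r.status;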

7  Bottleneck Basics
 Waits: what is keeping the SQL engine from continuing?
‒ Application (SQL)
‒ Hardware
‒ Architecture
‒ Tuning
 Production wait-statistics sample (a query sketch that produces this shape of output follows below):

WaitType | Wait_S | Resource_S | Signal_S | WaitCount | Percentage | AvgWait_S | AvgRes_S | AvgSig_S
BROKER_RECEIVE_WAITFOR | 661.36 |  | 0 | 4 | 44.61 | 165.3388 |  | 0
LCK_M_IS | 139.46 | 139.35 | 0.11 | 489 | 9.4 | 0.2852 | 0.285 | 0.0002
LCK_M_X | 96.86 | 96.54 | 0.32 | 373 | 6.53 | 0.2597 | 0.2588 | 0.0009
LCK_M_U | 83.93 | 83.91 | 0.02 | 32 | 5.66 | 2.6227 | 2.6221 | 0.0006
PAGEIOLATCH_SH | 83.92 | 83.84 | 0.08 | 9835 | 5.66 | 0.0085 |  | 0
LCK_M_S | 82.44 | 82.1 | 0.33 | 419 | 5.56 | 0.1967 | 0.1959 | 0.0008
ASYNC_NETWORK_IO | 54.4 | 53.61 | 0.79 | 33146 | 3.67 | 0.0016 |  | 0
ASYNC_IO_COMPLETION | 43.1 |  | 0 | 37 | 2.91 | 1.1649 |  | 0
BACKUPIO | 42.22 | 42.19 | 0.03 | 12607 | 2.85 | 0.0033 |  | 0
BACKUPBUFFER | 36.64 | 36.48 | 0.15 | 2175 | 2.47 | 0.0168 |  | 0.0001
LCK_M_IX | 30.88 | 30.85 | 0.03 | 130 | 2.08 | 0.2376 | 0.2373 | 0.0003
IO_COMPLETION | 28.12 | 28.11 | 0.01 | 2611 | 1.9 | 0.0108 |  | 0
CXPACKET | 23.27 | 21.6 | 1.67 | 3542 | 1.57 | 0.0066 | 0.0061 | 0.0005
PREEMPTIVE_OS_CREATEFILE | 18.84 |  | 0 | 247 | 1.27 | 0.0763 |  | 0
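
The table above has the shape produced by the widely used wait-statistics scripts built on sys.dm_os_wait_stats. A simplified sketch (mine, not the exact script behind the slide):

-- Cumulative waits since the last restart or stats clear, ranked by total wait time
SELECT wait_type                                          AS WaitType,
       wait_time_ms / 1000.0                              AS Wait_S,
       (wait_time_ms - signal_wait_time_ms) / 1000.0      AS Resource_S,
       signal_wait_time_ms / 1000.0                       AS Signal_S,
       waiting_tasks_count                                AS WaitCount,
       CAST(100.0 * wait_time_ms / SUM(wait_time_ms) OVER ()
            AS DECIMAL(5, 2))                             AS Percentage,
       wait_time_ms / (1000.0 * waiting_tasks_count)      AS AvgWait_S
FROM sys.dm_os_wait_stats
WHERE waiting_tasks_count > 0
ORDER BY wait_time_ms DESC;

In practice the benign background waits (SLEEP_TASK, LAZYWRITER_SLEEP, BROKER_TASK_STOP, and similar) are filtered out before interpreting the percentages.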

8  Visualizing Latency
 I/O-bound applications: Oracle, DB2, SQL Server, etc.
 HDD storage: 8ms latency per block requested and returned (total latency = seek time + rotational latency)
 Flash storage: 0.15ms (150 microsecond) latency
 In the time an HDD returns one block, flash completes over 50 I/Os
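
To see where a given instance actually sits on that latency scale, the cumulative stall counters in sys.dm_io_virtual_file_stats can be turned into per-file average latencies. A minimal sketch (not part of the original deck):

-- Average read/write latency per database file since instance startup
SELECT DB_NAME(vfs.database_id)                                   AS database_name,
       mf.physical_name,
       1.0 * vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
       1.0 * vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id     = vfs.file_id
ORDER BY avg_read_latency_ms DESC;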

9  Optimizing Utilization
 CPUs are faster than hard drives
 Adding cores does not correct the issue: 8 cores @ 10% = 1 core @ 80%
 40% utilization = overpaying by 60%

10  Storage Enables Applications

11  Accelerate Workloads & Save Money
 Accelerate: same CPUs, same work, less human time
 Consolidate: fewer cores = fewer licenses across the stack (database/application, virtualization, operating system)

12  Compounding Issues
 Adding more cores increases licensing costs
 Faster cores still have to wait for blocking items
 Probably faster, definitely more expensive
 Better to find the bottleneck and solve it than to buy more or faster CPUs

13  Buffer / RAM
 SQL buffer pool
‒ MRU-LRU chain: pages move down the chain until they fall off the end
 PLE: Page Life Expectancy
‒ How long a page lives in the chain before being cycled out
‒ Microsoft's traditional guidance is to keep it over 300 seconds (query sketch below)
 Working set
‒ How much data do you want, or NEED, in RAM?
‒ Database size isn't relevant
‒ What's being worked on now, and what will be in the near future?
 Workload profile
‒ What data are the users hitting?
‒ Is there any way to predict future data page hits?
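
PLE is exposed through the buffer manager performance counters; a minimal sketch (mine) for reading the current value:

-- Current Page Life Expectancy, in seconds
SELECT object_name,
       cntr_value AS page_life_expectancy_s
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Page life expectancy'
  AND object_name LIKE '%Buffer Manager%';

On NUMA hardware there are also per-node values under the Buffer Node counter object, which give a more granular picture.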

14  Legacy Storage Architecture
 Aggregation & segregation model
 Many autonomous parts
 Separate RAID groups
 Data locality and hot spots
 Transient data / usage
 Tiering software / admin overhead
 Cost inefficient
 Each workload bottlenecked by a different subset of components
 Flash != Flash
‒ SSDs are a modernization: a disk substitution
‒ Distributed-block all-flash is a revolution

15  Database Trends
 More, more, more
‒ Data never dies: archived but not deleted
‒ More users, more applications, hybrid systems
‒ Interdependent usage
 Usage and complexity explosion
‒ More users; SQL is ad hoc
‒ Machines consume data (SaaS)
‒ Mixed workloads (OLTP & DW)
 Multi-core, multi-CPU servers
‒ Processing capacity of host servers is exploding
‒ Compute is cheap
 24x7 is REQUIRED
‒ Little to no maintenance windows
 Workload is random
‒ Many LUN stripes on each disk
‒ Many users, applications, databases, servers

16  Flash for Databases - General
 Simplicity
‒ Less to design and plan; easy to scale
‒ No performance variance
 Speed and scale: faster, quicker, more
 Reduced admin time
‒ No LUN/stripe mapping
‒ Less time chasing down and resolving performance issues
‒ Chores and tasks run faster: ETLs, backups/restores (reduced RTO)
‒ No index rebuilding for logical fragmentation (see the query sketch below)
 Reduced admin costs / issues
‒ No costly tiering / locality-management software
‒ No quarterly new-spindle purchases
‒ Turn asynchronous mirroring into synchronous mirroring
‒ Less hardware to do the same work (fewer context switches)
‒ Drop from Enterprise to Standard edition SQL Server
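
The "no index rebuilding for logical fragmentation" point rests on flash having no seek penalty for out-of-order pages; fragmentation is still easy to measure if you want to verify the claim against your own workload. A minimal sketch (mine, not from the deck):

-- Logical fragmentation per index (LIMITED mode keeps the scan cheap)
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name                     AS index_name,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id
 AND i.index_id  = ips.index_id
WHERE ips.page_count > 1000        -- ignore tiny indexes
ORDER BY ips.avg_fragmentation_in_percent DESC;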

17  Right Sizing & Eliminating Bottlenecks
 SQL Server: number of database files (see the query sketch below)
 Windows: number of "disks" (LUNs)
 Transport
‒ FC 8Gb port: 100K IOPS, 800MB/s
‒ iSCSI (10GigE): 60K IOPS, 550MB/s
 Virtualization: one virtual disk
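
Counting the database files and the LUNs they sit on is the starting point for this sizing exercise. A minimal sketch (mine) that lists every file and its physical path across the instance:

-- Every data and log file on the instance, with the path (i.e. LUN) it lives on
SELECT DB_NAME(database_id) AS database_name,
       name                 AS logical_name,
       type_desc,
       physical_name,
       size * 8 / 1024      AS size_mb      -- size is stored in 8 KB pages
FROM sys.master_files
ORDER BY database_name, file_id;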

18  Architecting for I/O
 Rule of Many: at every layer, use many of each component to reduce bottlenecks
‒ Database files
‒ Virtual disks
‒ MPIO paths
‒ Physical ports
‒ LUNs
 Parallelization: use many objects and many processes to increase parallel work (see the sketch after this list)
‒ Spread transactions over pages
‒ Use a switch (path multiplier)
‒ Use several LUNs
‒ Increase MAXDOP (and test)
 I/O latency is sacred: don't add anything to the I/O path that doesn't need to be there
‒ LVM (Logical Volume Manager)
‒ Virtualization
‒ Compression
‒ De-dup
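
Two of the knobs above are plain T-SQL. A minimal sketch under assumed names (SalesDB and the E:\Data path are hypothetical, and the values are placeholders to be tested, not recommendations):

-- Add another data file so parallel workers are not funneled through one file
ALTER DATABASE SalesDB
ADD FILE (NAME = SalesDB_data2,
          FILENAME = 'E:\Data\SalesDB_data2.ndf',
          SIZE = 50GB);
GO

-- Raise the server-wide parallelism ceiling, then re-test the workload
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 8;
RECONFIGURE;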

19  SMB Direct: Optimize CPU Utilization
 Traditional path: Windows -> transport (Fibre Channel / iSCSI)
 SMB Direct path: Windows -> RDMA* (Remote Direct Memory Access) via NDKPI
 Reduces CPU time spent handling I/O -> more CPU for applications
 Reduces the time (latency) to return an I/O -> accelerates applications
* 30-80% reduction in CPU utilization

20  Demo

21  Australian Department of Defense Results
 Challenges
‒ Latency-sensitive applications
‒ Massive IOPS requirement
‒ Data center space limitations
‒ Unable to take advantage of HA and virtualization
 Applications
‒ Network monitoring / dashboard reporting
‒ MSSQL
‒ Hyper-V
 Solution
‒ Windows Flash Array
‒ 4-CPU host servers
 Results
‒ Sustained 500-800K IOPS at <1ms latency
‒ Server consolidation: 45 servers to 2
‒ Core / licensing consolidation: 90 CPUs to 8
‒ 90% reduction in data center space and power requirements
‒ Simplified management and support processes on the Microsoft stack

22  Logistics Company - WFA Testing Results
 Technical details
‒ SQL Server with 1.3 billion rows
‒ Microsoft's SMB Direct networking protocol with RDMA-capable 10Gb Ethernet cards
‒ Host server: HP DL580 (4 CPUs, 96 logical cores, 4 rack U)
‒ Storage: Violin Memory WFA (3 rack U; Windows Server embedded inside the array for native SMB Direct)
 Results from Violin WFA storage:

Configuration | Test environment | Production environment
Existing platform, disk + flash, SQL Server 2012 | 0.455 million rows/sec | 0.621 million rows/sec
WFA + SMB Direct, SQL Server 2012 | 13.2 million rows/sec (improved 30x) | 13.2 million rows/sec (improved 21x)
WFA + SMB Direct, SQL Server 2014 (updatable in-memory columnstore index) | 211 million rows/sec (improved 464x) | 211 million rows/sec (improved 340x)
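
The SQL Server 2014 jump comes from the updatable clustered columnstore index. A minimal sketch on a hypothetical fact table (the names are illustrative, not from the benchmark):

-- Hypothetical fact table converted to an updatable clustered columnstore (SQL Server 2014+)
CREATE TABLE dbo.FactLoads
(
    LoadId     BIGINT        NOT NULL,
    LoadDate   DATETIME2     NOT NULL,
    CustomerId INT           NOT NULL,
    Amount     DECIMAL(18,2) NOT NULL
);
GO

CREATE CLUSTERED COLUMNSTORE INDEX cci_FactLoads ON dbo.FactLoads;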

23  Cluster SharePoint 2013 with Hyper-V and WFA
 Reduce risk
‒ New tools & features
‒ New usage patterns
‒ Random and dynamic usage over many users
 Reduce costs (fewer servers)
 Increase performance: optimize CPU for application work
‒ Crawls / index searches
‒ Content access (read or write documents)
‒ Permissions authentication (page loads)
 Live Migration
‒ Move SQL, app, and web servers between hosts
‒ Move data between arrays
‒ Balance reads/writes, increase performance, isolate operations work

24  What is Flash?
 Non-volatile storage
 Used in cell phones, USB "thumb" drives, SSDs, cameras, etc.
 SLC vs. MLC vs. eMLC vs. TLC
 Pros
‒ Random access
‒ Storage density
‒ Reads and writes are very fast (~30us)
‒ Low power & heat
 Cons
‒ Memory wear
‒ No deletes or updates at the byte or bit level
‒ Erases only at the block level
   ~30x slower than reads/writes
   Blocking operation while still-active values are moved to another block
‒ Block erases cause the infamous write cliff (latency spikes of up to 30x)

25  Flash Technology
 1 package = 8 dies
 1000s of blocks per die
 256 pages per block
 Writes happen at the page level
 Erases happen at the block level

26  Flash Deployments
 PCIe card (1st generation)
‒ Fastest-latency configuration
‒ Not sharable; crashes the host on failure
‒ Read-mostly workloads
‒ Requires host-based RAID
‒ Has a write cliff
 SSD (2nd generation)
‒ Controller adds latency
‒ Sharable with a chassis; won't crash the host
‒ Requires data segregation / RAID
‒ Has a write cliff
 All-flash array (3rd generation)
‒ One block of storage, pre-RAIDed
‒ Sharable or direct attach (fastest-latency configuration)
‒ Enterprise redundancy, massive parallelism
‒ No write cliff (Violin only)

27  NTFS Tuning – Need Several LUNs

28  NTFS Tuning – Content Indexing

29  LUN Tuning – 4K LUN at 4K
 Format NTFS with a 4K allocation unit size to match the 4K LUN (the traditional guidance for disk was 64K)
 Increases utilization of RAM, the transport, and the SQL buffer pool

30  SQL Tuning – IDENTITY Columns & Clustered Indexes
 IDENTITY column as the primary key = OK
 IDENTITY column as the clustered index key = NOT OK
‒ Funnels all new rows onto one data page and bottlenecks on inserts
‒ Clustering on a non-sequential key spreads the inserts and removes locks/latches as the source of the bottleneck (see the sketch below)
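
A minimal sketch of the pattern (table and column names are hypothetical): keep the IDENTITY column as the primary key, but make it nonclustered, and cluster on a key that spreads concurrent inserts across many pages.

-- IDENTITY stays as the PK, but nonclustered, so inserts do not pile onto one last page
CREATE TABLE dbo.Orders
(
    OrderId    INT IDENTITY(1,1) NOT NULL,
    CustomerId INT               NOT NULL,
    OrderDate  DATETIME2         NOT NULL,
    CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED (OrderId)
);
GO

-- Cluster on a non-sequential key so concurrent inserts land on many different pages
CREATE CLUSTERED INDEX CIX_Orders_CustomerId
    ON dbo.Orders (CustomerId, OrderId);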

31  SQL Tuning – Ins and Outs
 Out
‒ Many files for speed / data placement
‒ Many partitions for speed / data placement
‒ Index defragging to re-order data
‒ MAXDOP <= 4
‒ Admin in serial mode (one at a time): table loads (ETL), index maintenance, backups, archiving
‒ Admin only during off hours
‒ Segregated workloads
 In
‒ Any number of files and partitions
‒ MAXDOP = number of cores in the server
‒ Admin in parallel (see the sketch below)
‒ Admin during business processing
‒ Mixed workloads (many servers and databases on the same storage device)
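
"Admin in parallel" during business hours mostly means using the online, parallel options that already exist. A minimal sketch against the hypothetical dbo.Orders table above (online rebuilds are an Enterprise edition feature):

-- Rebuild all indexes online and in parallel; MAXDOP = 0 lets the server use all cores
ALTER INDEX ALL ON dbo.Orders
REBUILD WITH (ONLINE = ON, MAXDOP = 0);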

32  Q&A

33  Thank you

