1
Size Matters Not
=tg= Thomas Grohser, NTT Data, SQL Server MVP
SQL Server Performance Engineering
SQL Saturday #575, December 10th 2016, Providence, RI
2
select * from =tg= where topic =
SQL 4.21 – First SQL Server ever used (1994)
SQL 6.0 – First log shipping with failover
SQL 6.5 – First SQL Server cluster (NT 4.0 + Wolfpack)
SQL 7.0 – 2+ billion rows/month in a single table
SQL 2000 – 938 days with 100% availability
SQL 2000 IA64 – First SQL Server on Itanium IA64
SQL 2005 IA64 – First OLTP long-distance database mirroring
SQL 2008 IA64 – First replication into mirrored databases
SQL 2008R2 IA64/x64 – First 256 CPUs and > … statements/sec; first scale-out > … statements/sec; first time 1.2+ trillion rows in a table
SQL 2012 – > … transactions per second; > 1.3 trillion rows in a table
SQL 2014 – > … transactions per second; fully automated deploy and management; AlwaysOn automatic HA and DR
SQL 2016 – Can't wait to raise the bar again
=tg= Thomas Grohser, NTT DATA
Senior Director Technical Solutions Architecture
Focus on SQL Server security, performance engineering, infrastructure, and architecture
New papers coming 2016
Close relationship with SQLCAT (SQL Server Customer Advisory Team), SCAN (SQL Server Customer Advisory Network), TAP (Technology Adoption Program), and the product teams in Redmond
Active PASS member and PASS Summit speaker
21 years with SQL Server
3
NTT DATA Overview
20,000 professionals – Optimizing balanced global delivery
$1.6B – Annual revenues with a history of above-market growth
Long-term relationships – >1,000 clients; mid-market to large enterprise
Delivery excellence – Enabled by process maturity, tools, and accelerators
Flexible engagement – Spans consulting, staffing, managed services, outsourcing, and cloud
Industry expertise – Driving depth in select industry verticals
Why NTT DATA for MS Services:
NTT DATA is a Microsoft Gold Certified Partner
We cover the entire MS stack, from applications to infrastructure to the cloud
Proven track record with MS solutions delivered over the past 20 years
4
Agenda
What is a VLDB
Lots of random transactions
Inserting / updating from many sources
Tables with lots and lots of rows
Managing many SQL Server instances
Q&A
ATTENTION: Important information may be displayed on any slide at any time! ! Without warning !
5
Definition of a VLDB
Acronym: Very Large Data Base
It's not just size; speed, availability, and reliability are most of the time even more challenging.
The best definition I have come across so far:
VLDB := a database that needs lots of special attention and planning to operate smoothly
6
VLDB
7
Scaling Up
Scaling Out is the better solution
Scaling Out is much harder to do
Scaling Out and Up can be combined
8
Scale Up Goals
Allow a bigger workload
More transactions/sec
More queries/sec
Scale as linearly as possible
2x the hardware = 1.9x the workload
Scale out is more like 2x = 1.5x to 1.75x
9
High-end OLTP database for a true 24/7 website
Online sports betting site, 24x7 worldwide customer base
… concurrent users and growing
17+ TB database, about 20 GB growth per day
Schema: 1000+ tables (11 very large with billions of rows, 90% of DB volume), 4000+ stored procedures
Active (hot) data up to 8 years "old"
Up to … T-SQL statements per second
10
Problems / Pains
You will encounter problems never seen before
You are fighting physics all the time
The speed of light is a nasty limitation in this universe: latency, throughput
Service window (or rather, the lack of one)
Even with 4 GB/s (that's a DVD per second) it can take days to make a full backup or restore, so to achieve a recovery time of 8 hours you need to be prepared and use some tricks
Concurrency
Not many people out there have actual experience with VLDB workloads
None of them works at M$
One works at NTT Data
11
Solutions
Solutions that don't work:
Believing vendors and 3rd-party tool creators that their product is faster than the speed of light (remember: lower cost sells, not making it faster)
Breaking physics on your own
Solutions that do:
Understand that there is no one-size-fits-all solution for VLDB
Understand the data and use that knowledge to build the applications in a way that they support the infrastructure and not just depend on it
Cooperate with the SQLCAT team to get the information in time
You might need to be willing to become an early adopter
12
The Basic Trick: Scale all Hardware Dimensions
CPU: clock speed, core count, cache size
Memory: memory speed, capacity
Storage: latency, throughput
Network
13
Scale Up – Single NUMA node
4 x dual-core Itanium 2 CPUs, 24 MB cache each
64 GB memory
4 x dual-port 1 Gb/s network card
2 x dual-port HBA (4 Gb/s)
2 x P800 RAID controller
50 x 72 GB 15k RPM SAS disks
SAN storage as needed, n x 512 GB (on 64 spindles each)
14
Multiple Nodes Scale Up
[Table lost in transcription: cores, GB memory, disks, NICs, and HBAs for single-, dual-, quad-, and octal-node configurations]
Almost linear scaling
15
Understand the path to the drives
[Diagram: the I/O path from HBA through switch, SAN controllers/processors, cache, and Fibre Channel ports, contrasted with DAS via RAID controller, plus SSDs]
16
Thinking outside the box is great
Placing resources outside the box, not so much
In most cases latency is your enemy, not throughput
The longer the path, the higher the latency
A well-designed server will never benefit from a storage system read cache
17
Today you can scale to about:
High end: 16 sockets with up to 22 cores each + HT (= 704 LP), 24 TB RAM, 24 x PCIe slots (SSD), 16 x network cards with 2 x 10GbE each or 16 Gb/s FC
Commodity: 4 sockets with 18 cores each + HT (= 144 LP), 6 TB RAM, 10 x PCIe slots, 2 x 10GbE network
18
My favorite SQL 2008R2/2012/2014/2016 feature
19
CREATE INDEX WITH A SMILE
20
Scale Up Backup
Use eight parallel 1 GB/s network interface cards (one physical network, eight subnets)
Use 32 parallel backup files, each on a separate set of spindles with aligned partitions
Transfer four files per network interface card
21
Scale Up
[Diagram: SQL Server with multiple network cards, each with its own IP address and network mask, wired subnet-by-subnet to matching network cards on the file server]
22
Scale Up

BACKUP DATABASE MyVLDB
TO DISK='\\…\backup\MyVLDB_1.bak',
   DISK='\\…\backup\MyVLDB_2.bak'
WITH BLOCKSIZE = 8192

Use jumbo frames if you can (+100%), with about 9016 bytes frame size
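A minimal sketch of the full 32-file stripe described above. The file server names (filer1…filer8) are hypothetical stand-ins for the UNC names elided on the slide, and the MAXTRANSFERSIZE/BUFFERCOUNT values are illustrative assumptions, not the speaker's settings:

BACKUP DATABASE MyVLDB
TO   DISK = '\\filer1\backup\MyVLDB_01.bak',   -- four files per NIC/subnet
     DISK = '\\filer1\backup\MyVLDB_02.bak',
     -- ... MyVLDB_03 through MyVLDB_31 across the remaining subnets ...
     DISK = '\\filer8\backup\MyVLDB_32.bak'
WITH BLOCKSIZE = 8192,
     MAXTRANSFERSIZE = 4194304,   -- 4 MB transfers (assumed tuning value)
     BUFFERCOUNT = 64;            -- assumed tuning value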
23
Table Size Matters NOT for OLTP
B-tree index depth is nearly independent of row count
Example: BIGINT key, 476 entries per index page in the tree

Logical reads   Rows
1               476
2               226,576
3               107,850,176 (107 million)
4               51,336,683,776 (51 billion)
5               24,436,261,477,376 (24 trillion)
6               11,631,660,463,231,000 (11 quadrillion)
7               5,536,670,380,497,940,000 (5 quintillion)
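The table is simply powers of 476. A minimal T-SQL sketch that reproduces the numbers, assuming ~476 BIGINT keys per 8 KB index page as stated on the slide:

-- Approximate rows addressable per number of logical reads
-- in a B-tree with ~476 entries per index page
SELECT d.depth AS logical_reads,
       POWER(CAST(476 AS float), d.depth) AS approx_rows
FROM (VALUES (1), (2), (3), (4), (5), (6), (7)) AS d(depth);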
24
One table made it harder
Inserts are usually limited by the latency of the log drive
This shifts if you have a lot of concurrent users: CPU-cache-to-CPU-cache transfer speed becomes the issue if all inserts go to the end of the table
Do not use a plain ascending INT key; use a UNIQUEIDENTIFIER or partition the INT by its least significant byte (see the sketch below)
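A hedged sketch of the least-significant-byte trick on a hypothetical table (the deck shows no DDL): leading the clustered key with the low byte of the value fans inserts out over 256 insertion points instead of one hot page at the end of the index.

-- Hypothetical example table, not the production schema
CREATE TABLE dbo.Orders
(
    OrderID bigint IDENTITY(1,1) NOT NULL,
    Payload varchar(200) NOT NULL,
    -- least significant byte of the key; ISNULL keeps the computed
    -- column non-nullable so it can lead the primary key
    LowByte AS ISNULL(CAST(OrderID % 256 AS tinyint), 0) PERSISTED,
    CONSTRAINT PK_Orders PRIMARY KEY CLUSTERED (LowByte, OrderID)
);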
25
ASP.NET Session State database
300+ web servers in a single farm
Hardest case I had in my life (L2/L3 CPU cache transfer speed was the bottleneck)
1 database, 1 table, clustered and 1 secondary index
200 MB max data size
300+ connections (connection pooling)
No backups and no active HA required
9 KB payload per row
… selects / inserts / updates / deletes per second
26
The Issue: could not write to the log file fast enough (updates)
What did not work:
Simple recovery (still logs)
Log file on SAN (SSD in the SAN)
Log file on SSD PCIe card
What worked:
Schema in tempdb (less logging)
Log file on a RAM disk
RAM disk driver and SQL Server log writer on the same CPU and core, using only memory attached to that CPU
27
Solution today: In-Memory OLTP with a non-durable table
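A minimal sketch of such a non-durable table, assuming SQL Server 2016+ and a database that already has a MEMORY_OPTIMIZED_DATA filegroup; names, sizes, and bucket count are illustrative, not the production schema:

-- SCHEMA_ONLY: the schema survives a restart, the data does not,
-- so nothing is ever written to the transaction log
CREATE TABLE dbo.SessionState
(
    SessionId nvarchar(88) NOT NULL,
    Payload   varbinary(max) NOT NULL,   -- ~9 KB per row per the slide
    Expires   datetime2 NOT NULL,
    CONSTRAINT PK_SessionState PRIMARY KEY NONCLUSTERED
        HASH (SessionId) WITH (BUCKET_COUNT = 1000000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);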
28
DWH used for fraud detection
Online poker site
Schema: 1 fact table (with clustered index on a sequence), 2 secondary indexes, 82 bytes per row
Hourly ETL
1.3 trillion rows (every hand ever played)
250 TB database
29
DWH: intelligent disk and file design
Sequential scanning is much faster than random reads
DAS is much faster than SAN and gives better control over the layout
Many disks are faster than few disks
Use compression for old data (see the sketch below)
Only maintain active data
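A hedged sketch of compressing only the old data, assuming a partitioned fact table (dbo.FactHands is a hypothetical name; the deck shows no DDL): rebuild just the closed partitions with page compression and leave the active partition alone.

-- Compress partition 1 (oldest, no longer written to)
ALTER TABLE dbo.FactHands
REBUILD PARTITION = 1
WITH (DATA_COMPRESSION = PAGE);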
30
CPU is not the limit
4-way Intel Nehalem EX system with 8 cores each = 32 logical processors
Each LP can process about 250 to 500 MB/s = 8 to 16 GB/s total
Memory pre-Nehalem would have been the limit
Make sure the queries scan and do not randomly seek
Control MAXDOP on a per-query level (see the sketch below)
Utilize the merry-go-round (piggyback) scan
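A minimal sketch of per-query MAXDOP control (table and column names are hypothetical): the hint caps parallelism for this one scan without touching the server-wide setting.

SELECT COUNT_BIG(*)
FROM dbo.FactHands
WHERE HandDate >= '20160101'
OPTION (MAXDOP 8);   -- this query may use at most 8 of the 32 LPs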
31
How to lay out data on disk for scans
Step 1: Figure out what SQL Server likes best (might change with new versions)
Rows are stored in pages (8 KB)
Pages are grouped into extents (8 pages = 64 KB)
The -E startup parameter allocates 4 extents at a time (32 pages = 256 KB); use it if you must use SAN
The read-ahead limit of Enterprise edition is 512 pages; 512/32 = 16 files gives you perfect queuing (sketched below)
Each file needs to be on different physical spindles
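A hedged sketch of that 16-file layout, with a hypothetical database name and drive letters: one file per set of spindles, sized so 16 files make up one 2 TB partition.

ALTER DATABASE PokerDWH ADD FILEGROUP FG2011;

ALTER DATABASE PokerDWH
ADD FILE
    (NAME = 'FG2011_F01', FILENAME = 'E:\sql\FG2011_F01.ndf', SIZE = 128GB),
    (NAME = 'FG2011_F02', FILENAME = 'F:\sql\FG2011_F02.ndf', SIZE = 128GB)
    -- ... F03 through F16, one file per volume/spindle set ...
TO FILEGROUP FG2011;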
32
How to lay out data on disk for scans
Step 2: Figure out what the hardware can do
Single rotating hard disk (SATA 7200 RPM): 32 MB/s
Chassis holds 11 disks plus one hot spare = 350 MB/s
Controller has 4 physical ports with 4 SAS lanes each at 6 Gb/s = 750 MB/s; can daisy-chain 4 chassis per port = 1400 MB/s
Total = 3 GB/s (PCIe 2.0 x8 = 4 GB/s can handle this)
4 controllers = 12 GB/s possible scan performance
Usable capacity at 1 TB/disk with RAID 6 and 1 hot spare is around 500 TB and can be staged at 250 and 375 TB
33
How to lay out data on disk for scans
Step 3: Putting 1+1 together
2 TB partition = file group = 16 files (F11, F12, …)
2 TB partition = file group = 16 files (F21, F22, …)
2 TB partition = file group = 16 files (F31, F32, …)
[Diagram: files F11/F21/F31, F12/F22/F32, … spread across volumes 11–62 on controllers 1 and 2, so every file group touches every volume]
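A hedged sketch of wiring those 2 TB partitions to the file groups, assuming the fact table is partitioned on its clustered sequence column (boundary values and names are illustrative):

-- Each range maps to one 16-file filegroup from the layout above
CREATE PARTITION FUNCTION pfHands (bigint)
AS RANGE RIGHT FOR VALUES (25000000000, 50000000000);

CREATE PARTITION SCHEME psHands
AS PARTITION pfHands TO (FG2010, FG2011, FG2012);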
34
File groups
Scan storage from the previous slide:
Data partitions (compressed, 2 TB each)
One file per channel; fill disk after disk
Extra storage on SAS 15k RPM drives on 2 more controllers:
Staging partitions (8 TB each)
BCP areas (8 TB)
Compressing table (8 TB)
Aggregates (8 TB)
SSD for transaction logs
35
Today: put it all on SSDs
36
THANK YOU! and may the force be with you…
Questions?
38
How to manage hundreds of servers
Standardization and automation
If you need to do it twice or more, automate it
Self-service portals
39
Related topics: NUMA / Soft-NUMA, Affinity I/O, Partitioning