Published by Simon Hawkins. Modified over 9 years ago.
1
SQL Server Scaling on Big Iron (NUMA) Systems Joe Chang jchang6@yahoo.com www.qdpma.com TPC-H
2
About Joe Chang: SQL Server Execution Plan Cost Model; True cost structure by system architecture; Decoding statblob (distribution statistics); SQL Clone – statistics-only database. Tools: ExecStats – cross-reference index use by SQL execution plan; Performance Monitoring; Profiler/Trace aggregation
3
TPC-H
4
TPC-H DSS – 22 queries, geometric mean. 60X range in plan cost, comparable actual range. Power – single stream; tests ability to scale parallel execution plans. Throughput – multiple streams. Scale Factor 1 – Line item data is 1GB; 875MB with DATE instead of DATETIME. Only single-column indexes allowed; ad hoc.
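Because the Power metric rests on a geometric mean of query times, a short query counts as much as a long one. The sketch below illustrates that with made-up timings (the `baseline` values are hypothetical, not measured). The function `power_sketch` is a simplification: the official metric also folds in two refresh-function timings, which are omitted here.

```python
import math

def geometric_mean(times):
    """Geometric mean of a list of positive query times (seconds)."""
    return math.exp(sum(math.log(t) for t in times) / len(times))

def power_sketch(scale_factor, query_times_sec):
    """Simplified Power-style score: 3600 * SF / geometric mean of times.

    Assumption: refresh-function timings are left out of this sketch.
    """
    return 3600.0 * scale_factor / geometric_mean(query_times_sec)

# Hypothetical timings: one small query (1 s) and one big query (100 s).
baseline = [1.0, 100.0]
faster_small = [0.5, 100.0]   # small query twice as fast
faster_big = [1.0, 50.0]      # big query twice as fast

for name, times in [("baseline", baseline),
                    ("faster small", faster_small),
                    ("faster big", faster_big)]:
    print(f"{name:13s} geomean {geometric_mean(times):6.2f} s  "
          f"score {power_sketch(1, times):7.1f}")
```

Halving the 1-second query moves the score exactly as much as halving the 100-second query, which is why anomalies on small queries can dominate a Power result.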
5
Observed Scaling Behaviors: Good scaling, leveling off at high DOP; Perfect scaling ???; Super scaling; Negative scaling, especially at high DOP; Execution plan change – completely different behavior
7
TPC-H Published Results
8
TPC-H SF 100GB: 2-way Xeon 5355, 5570, 5680, Opt 6176. Among the 2-way Xeon 5570 results, all are close: HDD has the best Throughput, SATA SSD has the best composite, and Fusion-IO has the best Power. Westmere and Magny-Cours, both with 192GB memory, are very close.
9
TPC-H SF 300GB: 8x QC/6C & 4x12C Opt. 6C Istanbul improved over 4C Shanghai by 45% Power, 73% Throughput, 59% overall. 4x12C 2.3GHz improved 17% over 8x6C 2.8GHz.
10
TPC-H SF 1000: Oracle RAC, 64 nodes, 128 Xeon 5450 quad-core 3.0GHz processors. Power 782,608, 5.6X higher than Superdome 2 with 64 cores.
11
TPC-H SF 3TB: X7460 & X7560. Nehalem-EX 64 cores better than 96 Core 2.
12
TPC-H SF 100GB, 300GB & 3TB. SF100 2-way: Westmere and Magny-Cours are very close. Among the 2-way Xeon 5570 results, all are close: HDD has the best Throughput, SATA SSD has the best composite, and Fusion-IO has the best Power. SF300 8x QC/6C & 4x12C: 6C Istanbul improved over 4C Shanghai by 45% Power, 73% Throughput, 59% overall. 4x12C 2.3GHz improved 17% over 8x6C 2.8GHz. SF 3TB X7460 & X7560: Nehalem-EX 64 cores better than 96 Core 2.
13
TPC-H Published Results: SQL Server excels in Power (limited by geometric mean, anomalies) and trails in Throughput. Other DBMS get better Throughput than Power; SQL Server Throughput is below Power by a wide margin. Speculation – SQL Server does not throttle back parallelism with load?
14
TPC-H SF100
Power | Throughput | QphH | Processors | Total Cores | SQL | GHz | Mem GB | SF
23,378.0 | 13,381.0 | 17,686.7 | 2 Xeon 5355 | 8 | 5sp2 | 2.66 | 64 | 100
67,712.9 | 38,019.1 | 50,738.4 | 2x5570 HDD | 8 | 8sp1 | 2.93 | 144 | 100
99,426.3 | 55,038.2 | 73,974.6 | 2 Xeon 5680 | 12 | 8r2 | 3.33 | 192 | 100
94,761.5 | 53,855.6 | 71,438.3 | 2 Opt 6176 | 24 | 8r2 | 2.3 | 192 | 100
70,048.5 | 37,749.1 | 51,422.4 | 2x5570 SSD | 8 | 8sp1 | 2.93 | 144 | 100
72,110.5 | 36,190.8 | 51,085.6 | 5570 Fusion | 8 | 8sp1 | 2.93 | 144 | 100
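The QphH composite is the geometric mean of Power and Throughput, so the published figures can be cross-checked with a few lines. The sketch below uses three rows from the SF100 results; the function name `composite_qphh` is an illustrative choice, not part of any TPC tool.

```python
import math

def composite_qphh(power, throughput):
    """Composite score as the geometric mean of Power and Throughput."""
    return math.sqrt(power * throughput)

# (label, Power, Throughput, published QphH) from the SF100 results
rows = [
    ("2 Xeon 5355", 23378.0, 13381.0, 17686.7),
    ("2 Xeon 5680", 99426.3, 55038.2, 73974.6),
    ("2 Opt 6176", 94761.5, 53855.6, 71438.3),
]

for label, power, throughput, published in rows:
    computed = composite_qphh(power, throughput)
    print(f"{label:12s} computed {computed:10.1f}  published {published:10.1f}")
```

Because the composite is a geometric mean, a system with strong Power but weak Throughput is pulled down more than an arithmetic average would suggest.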
15
TPC-H SF300
Power | Throughput | QphH | Processors | Total Cores | SQL | GHz | Mem GB | SF
25,206.4 | 13,283.8 | 18,298.5 | 4 Opt 8220 | 8 | 5rtm | 2.8 | 128 | 300
67,287.4 | 41,526.4 | 52,860.2 | 8 Opt 8360 | 32 | 8rtm | 2.5 | 256 | 300
75,161.2 | 44,271.9 | 57,684.7 | 8 Opt 8384 | 32 | 8rtm | 2.7 | 256 | 300
109,067.1 | 76,869.0 | 91,558.2 | 8 Opt 8439 | 48 | 8sp1 | 2.8 | 256 | 300
129,198.3 | 89,547.7 | 107,561.2 | 4 Opt 6176 | 48 | 8r2 | 2.3 | 512 | 300
152,453.1 | 96,585.4 | 121,345.6 | 4 Xeon 7560 | 32 | 8r2 | 2.26 | 640 | 300
All of the above are HP results? Sun result: Opt 8384, sp1, Pwr 67,095.6, Thr 45,343.5, QphH 55,157.5
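The generation-over-generation gains quoted for SF300 (Istanbul over Shanghai, and 4x12C over 8x6C) can be approximately reproduced from the published figures. This is a minimal sketch; it assumes the "8 Opt 8384" row is the Shanghai system and the "8 Opt 8439" row is the Istanbul system.

```python
def pct_gain(new, old):
    """Percentage improvement of `new` over `old`."""
    return (new / old - 1.0) * 100.0

# Published SF300 figures: (Power, Throughput, QphH)
shanghai_8384 = (75161.2, 44271.9, 57684.7)     # 8 Opt 8384, 32 cores
istanbul_8439 = (109067.1, 76869.0, 91558.2)    # 8 Opt 8439, 48 cores
magny_cours_6176 = (129198.3, 89547.7, 107561.2)  # 4 Opt 6176, 48 cores

power_gain = pct_gain(istanbul_8439[0], shanghai_8384[0])
throughput_gain = pct_gain(istanbul_8439[1], shanghai_8384[1])
overall_gain = pct_gain(istanbul_8439[2], shanghai_8384[2])
mc_gain = pct_gain(magny_cours_6176[2], istanbul_8439[2])

print(f"Istanbul vs Shanghai: Power +{power_gain:.1f}%, "
      f"Throughput +{throughput_gain:.1f}%, QphH +{overall_gain:.1f}%")
print(f"4x12C vs 8x6C: QphH +{mc_gain:.1f}%")
```

The results land within about a percentage point of the quoted 45%, 73%, 59% and 17%, so the slide figures appear to be rounded or truncated from these ratios.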
16
TPC-H 1TB
Power | Throughput | QphH | Processors | Total Cores | SQL | GHz | Mem GB | SF
95,789.1 | 69,367.6 | 81,367.6 | 8 Opt 8439 | 48 | 8R2? | 2.8 | 512 | 1000
108,436.8 | 96,652.7 | 102,375.3 | 8 Opt 8439 | 48 | ASE | 2.8 | 384 | 1000
111,557.0 | 128,259.1 | 123,323.1 | Itanium 9140 | 64 | O11g | 1.6 | 384 | 1000
139,181.0 | 141,188.1 | 140,181.1 | Itanium 9350 | 64 | O11R2 | 1.73 | 512 | 1000
782,608.7 | 1,740,122 | 1,166,977 | Xeon 5450 | 512 | O RAC | 3.0 | 2048 | 1000
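The "5.6X higher Power" figure for Oracle RAC can be checked against the SF1000 results, and normalizing by core count puts it in context. This sketch assumes the 64-core Itanium 9350 row is the Superdome 2 result; the per-core figure is a rough derived number, not a published metric.

```python
# Published SF1000 Power figures and total core counts
rac = {"power": 782608.7, "cores": 512}        # Oracle RAC, Xeon 5450
superdome2 = {"power": 139181.0, "cores": 64}  # assumed: Itanium 9350 row

power_ratio = rac["power"] / superdome2["power"]
core_ratio = rac["cores"] / superdome2["cores"]

# Power per core for each system, then RAC relative to Superdome 2
rac_per_core = rac["power"] / rac["cores"]
sd2_per_core = superdome2["power"] / superdome2["cores"]
per_core_ratio = rac_per_core / sd2_per_core

print(f"Power ratio:     {power_ratio:.2f}X with {core_ratio:.0f}X the cores")
print(f"Power per core:  RAC {rac_per_core:.0f}, Superdome 2 {sd2_per_core:.0f}")
print(f"Per-core ratio:  {per_core_ratio:.2f}")
```

With 8X the cores delivering about 5.6X the Power, the RAC cluster reaches roughly 70% of the per-core Power of the single 64-core system.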
17
TPC-H 3TB
Power | Throughput | QphH | Processors | Total Cores | SQL | GHz | Mem GB | SF
120,254.8 | 87,841.4 | 102,254.8 | 16 Xeon 7460 | 96 | 8r2 | 2.66 | 1024 | 3000
185,297.7 | 142,685.6 | 162,601.7 | 8 Xeon 7560 | 64 | 8r2 | 2.26 | 512 | 3000
142,790.7 | 171,607.4 | 156,537.3 | POWER6 | 64 | Sybase | 5.0 | 512 | 3000
182,350.7 | 216,967.7 | 198,907.5 | SPARC | 128 | O11R2 | 2.88 | 512 | 3000
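The earlier observation that SQL Server trails in Throughput while other DBMS score higher on Throughput than Power shows up directly in the 3TB results. The sketch below computes the Throughput-to-Power ratio for each row; the dictionary layout is an illustrative choice.

```python
# Published SF3000 figures: label -> (DBMS, Power, Throughput)
results = {
    "16 Xeon 7460": ("SQL Server", 120254.8, 87841.4),
    "8 Xeon 7560": ("SQL Server", 185297.7, 142685.6),
    "POWER6": ("Sybase", 142790.7, 171607.4),
    "SPARC": ("Oracle", 182350.7, 216967.7),
}

# Ratio below 1.0 means Throughput is lower than Power
ratios = {
    label: throughput / power
    for label, (_, power, throughput) in results.items()
}

for label, (dbms, _, _) in results.items():
    print(f"{label:13s} {dbms:11s} Throughput/Power = {ratios[label]:.2f}")
```

Both SQL Server results come in well below 1.0, while the Sybase and Oracle results are above it, which is consistent with the speculation about parallelism not being throttled back under multi-stream load.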
18
TPC-H Published Results
Power | Throughput | QphH | Processors | Total Cores | SQL | GHz | Mem GB | SF
23,378 | 13,381 | 17,686.7 | 2 Xeon 5355 | 8 | 5sp2 | 2.66 | 64 | 100
72,110.5 | 36,190.8 | 51,085.6 | 2 Xeon 5570 | 8 | 8sp1 | 2.93 | 144 | 100
99,426.3 | 55,038.2 | 73,974.6 | 2 Xeon 5680 | 12 | 8r2 | 3.33 | 192 | 100
94,761.5 | 53,855.6 | 71,438.3 | 2 Opt 6176 | 24 | 8r2 | 2.3 | 192 | 100
25,206.4 | 13,283.8 | 18,298.5 | 4 Opt 8220 | 8 | 5rtm | 2.8 | 128 | 300
67,287.4 | 41,526.4 | 52,860.2 | 8 Opt 8360 | 32 | 8rtm | 2.5 | 256 | 300
75,161.2 | 44,271.9 | 57,684.7 | 8 Opt 8384 | 32 | 8rtm | 2.7 | 256 | 300
109,067.1 | 76,869.0 | 91,558.2 | 8 Opt 8439 | 48 | 8sp1 | 2.8 | 256 | 300
129,198.3 | 89,547.7 | 107,561.2 | 4 Opt 6176 | 48 | 8r2 | 2.3 | 512 | 300
185,297.7 | 142,685.6 | 162,601.7 | 8 Xeon 7560 | 64 | 8r2 | 2.26 | 512 | 3000
19
SF100 Big Queries (query time in sec). Xeon 5570 with SATA SSD poor on Q9, reason unknown. Both Xeon 5680 and Opteron 6176 show a big improvement over Xeon 5570.
20
SF100 Middle Queries (query time in sec). Xeon 5570-HDD and 5680-SSD poor on Q12, reason unknown. Opteron 6176 poor on Q11.
21
SF100 Small Queries (query time in sec). Xeon 5680 and Opteron poor on Q20. Note limited scaling on Q2 & Q17.
23
SF300 Big Queries (query time in sec). Opteron 6176 poor relative to 8439 on Q9 & Q13, with the same number of total cores.
24
SF300 Middle Queries (query time in sec). Opteron 6176 much better than 8439 on Q11 & Q19, worse on Q12.
25
SF300 Small Queries (query time in sec). Opteron 6176 much better on Q2, even with 8439 on others.
27
SF1000 Sybase vs. SQL Server. Query time, Sybase relative to SQL Server, both on DL785 48-core
28
SF1000 Large Queries
29
SF1000 Middle Queries
30
SF1000 Small Queries
31
SF1000 Itanium - Superdome Query time, Superdome 2 versus Superdome, 16-way quad-core and 32-way dual-core
32
512-core C2 RAC vs. 64-core It2 Query time, Superdome 2 versus RAC, 16-way quad-core (64 cores) and 64-node 2-way quad-core (512 cores) Oracle RAC 5.6X higher Power
34
SF 3TB – 8×7560 versus 16×7460 Broadly 50% faster overall, 5X+ on one, slower on 2, comparable on 3 5.6X
35
64 cores, POWER6 vs. Xeon 7560. Query time, POWER6 relative to X7560. Overall, Xeon 7560 is 30% faster on Power, but with wide variations on individual queries, some with POWER6 faster.
36
SF3000 Big Queries
37
SF3000 Middle and Small Q
39
TPC-H Summary: Scaling is impressive on some SQL. Limited ability (value) in scaling small queries. Anomalies, negative scaling.
40
TPC-H Queries
41
Q1 Pricing Summary Report
42
Query 2 Minimum Cost Supplier. Wordy, but only touches the small tables; second-lowest plan cost (Q15)
43
Q3
44
Q6 Forecasting Revenue Change
45
Q7 Volume Shipping
46
Q8 National Market Share
47
Q9 Product Type Profit Measure
48
Q11 Important Stock Identification Non-Parallel Parallel
49
Q12 Random IO?
50
Q13 Why does Q13 have perfect scaling?
51
Q17 Small Quantity Order Revenue
52
Q18 Large Volume Customer Non-Parallel Parallel
53
Q19
54
Q20? This query may get a poor execution plan. Date functions are usually written as because Line Item date columns are “date” type. CAST helps the DOP 1 plan, but gets a bad plan for parallel.
55
Q21 Suppliers Who Kept Orders Waiting Note 3 references to Line Item
56
Q22