Fast Track, Microsoft SQL Server 2008 Parallel Data Warehouse and Traditional Data Warehouse Design BI Best Practices and Tuning for Scaling SQL Server.

Presentation transcript:

Fast Track, Microsoft SQL Server 2008 Parallel Data Warehouse and Traditional Data Warehouse Design BI Best Practices and Tuning for Scaling SQL Server 2008

Data Warehouse

Fast Track

PDW

Traditional MD design SSAS PDW SSAS

Characteristic | Typical BI (DWs & DMs) | OLTP (operational database)
Data activity | Large reads (disjoint sequential scans); large writes (new data appends); large-scale hashing | Indexed reads and writes; small transactions; constant small index reads, writes, and updates
Database sweet-spot size | Hundreds of gigabytes to terabytes (needs medium to large storage farms) | Gigabytes (requires small to medium storage farms)
Time period | Historical (contributes to large data volumes) | Current
Queries | Largely unpredictable | Predictable
I/O throughput requirement | Up to 20 GB/sec sustained throughput | IOPS is more important than sustained throughput

Microsoft/HP Fast Track reference configurations OR SQL Server Parallel Data Warehouse (PDW)
SQL Server/HP traditional DW design reference configurations
Different logical and physical DB design philosophies
Mmm, what will my logical and physical DB design look like?
Lower hardware costs

It is not uncommon to have hundreds of disk drives to support the I/O throughput requirements in a traditional DW environment.
RAID 5

How do Fast Track and PDW get their speed? An X-ray view at the physical disk level. First, let's look at a traditional DW...

Data is stored wherever it happens to land.
Sequential data
Fact table: initial load, 2nd-day load, 3rd-day load, 5th-day load, 6th-day load

Diagram labels: Column, Index / Column, Pre-calculated data, Duplicate data

Disk throughput is slower with indexes, aggregates, and summary tables. Index-lite is faster because there is less disk head movement. Eliminating indexes and storing data sequentially provides the fastest disk throughput rates.
Diagram labels: Index; Summary table; Traditional DW design with indexes & summary tables; Fast Track & PDW index-lite; Fast Track & PDW fastest sequential scan rates

Example: Average disk seek time is typically about 4 ms; a full stroke is about 7.5 ms. A 15K RPM drive makes 250 revolutions per second, so a full revolution takes 4 ms and the average rotational latency (half a revolution) is about 2 ms. Fast Track and PDW are designed to stream large blocks of data sequentially, which is even faster than "average latency" because the disk heads are already positioned over the streaming data.
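The rotational figures on this slide follow from simple arithmetic, sketched here in Python. The function names are illustrative only and are not part of the original deck.

```python
# Worked arithmetic for the rotational figures quoted on the slide.

def rotation_time_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds."""
    revolutions_per_second = rpm / 60.0
    return 1000.0 / revolutions_per_second


def average_rotational_latency_ms(rpm: float) -> float:
    """On average the target sector is half a revolution away."""
    return rotation_time_ms(rpm) / 2.0


rpm = 15_000
print(rpm / 60)                             # 250.0 revolutions per second
print(rotation_time_ms(rpm))                # 4.0 ms per full revolution
print(average_rotational_latency_ms(rpm))   # 2.0 ms average latency
```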

Why do PDW and Fast Track want data to be stored sequentially? Seek time is typically 2–4x longer than average latency. By eliminating seek time, you can use approximately 2–4x fewer disk drives to maintain a given throughput level. Fast Track and PDW are designed to stream large blocks of data sequentially!
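To make the drive-count claim concrete, here is a rough sizing sketch. It takes the slide's 2–4x ratio at face value and uses the 20 GB/sec figure from the BI-versus-OLTP comparison as the target. The per-drive sequential rate of 100 MB/sec is a hypothetical value chosen for illustration, not a number from the deck, and the model simply assumes that seek-bound per-drive throughput is the sequential rate divided by the slowdown factor.

```python
import math


def drives_needed(target_mb_per_s: float, per_drive_mb_per_s: float) -> int:
    """Smallest number of drives whose combined rate meets the target."""
    return math.ceil(target_mb_per_s / per_drive_mb_per_s)


TARGET_MB_PER_S = 20_000       # 20 GB/sec, the upper figure quoted in the deck
SEQUENTIAL_MB_PER_S = 100.0    # ASSUMPTION: hypothetical per-drive sequential rate

sequential_drives = drives_needed(TARGET_MB_PER_S, SEQUENTIAL_MB_PER_S)

for slowdown in (2, 4):        # the slide's "2-4x" range
    # ASSUMPTION: seek-bound throughput = sequential rate / slowdown
    seek_bound_drives = drives_needed(TARGET_MB_PER_S, SEQUENTIAL_MB_PER_S / slowdown)
    print(f"{slowdown}x slowdown: {seek_bound_drives} seek-bound drives "
          f"vs {sequential_drives} sequential drives")
```

Real sizing depends on measured per-drive and per-controller rates, so treat the output as an order-of-magnitude illustration only.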

Fast Track and PDW get their speed from FAST scan rates! In addition, HP and SQL Server PDW use Massively Parallel Processing (MPP) to extend Fast Track concepts into a BI "appliance".

Traditional DB design Fast Track or PDW

Basic: 6–12 TB, DL38x w/ MSA2000
Mainstream: 12–24 TB, DL585 G6 w/ MSA2000
Mainstream: 16–32 TB, DL580 G5 w/ MSA2000 G2
Premium: 24–48 TB, DL785 G6 w/ MSA2000 G2

HP SQL Server 2008 Parallel Data Warehouse (PDW) Control Rack Data Rack

Free Your IT Pressures... Get More Value
Without HP Factory Express / With HP Factory Express
Faster time to solution
Free up valuable IT resources
Maximize your IT investment

ProLiant Servers

Miscellaneous Techniques to Improve SQL Server BI Performance

SQL Server Analysis Services 2008

SQL Server Analysis Services 2008: Techniques to Improve Performance
SSAS has two major components:
Formula Engine (does most of the analysis work and tries to keep cells in memory): fast clock speeds are best.
Storage Engine (if cells are not in memory, the Storage Engine gets the data from disk): the goal is to minimize Storage Engine use and keep data in memory for the Formula Engine to use.
Use faster storage (SSD) or more disk drives for quicker Storage Engine responses.
Manage the partitions in your AS database according to the query performance required, because large cubes (> 100 GB) may not fit in memory. Design the partitions so they get into memory as quickly as possible.
Best practice: fewer than 4 million cells per partition.
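As a quick illustration of the partition guideline, the sketch below estimates how many partitions keep each one strictly under the 4-million-cell figure quoted on the slide. The function name and the 100-million-cell example are hypothetical, and the calculation assumes cells can be spread evenly across partitions.

```python
MAX_CELLS_PER_PARTITION = 4_000_000   # figure quoted on the slide


def partitions_needed(total_cells: int,
                      max_cells: int = MAX_CELLS_PER_PARTITION) -> int:
    """Fewest partitions such that each holds strictly fewer than max_cells."""
    if total_cells <= 0:
        return 0
    capacity = max_cells - 1                           # "fewer than" means at most max_cells - 1
    return (total_cells + capacity - 1) // capacity    # integer ceiling division


# Hypothetical measure group with 100 million cells.
print(partitions_needed(100_000_000))
```

In practice partitions usually follow a natural boundary such as a date range, so the count above is a lower bound rather than a design.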

Tune memory

Buffers are allocated via execution trees. Each of these numbered steps represents a new execution tree. Spawning multiple copies of the package, each with a horizontal partition of the data, creates more process space and execution trees.
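One way to picture the horizontal partitioning described above is to split a key range into contiguous slices, one per package copy. The sketch below only computes the slices. It is not SSIS code, the function name is hypothetical, and how each range is passed to a package copy (for example as a filter on the source query) is left out.

```python
def horizontal_ranges(min_key: int, max_key: int, copies: int) -> list[tuple[int, int]]:
    """Split the inclusive range [min_key, max_key] into up to `copies`
    contiguous, non-overlapping inclusive ranges of near-equal size."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    if max_key < min_key:
        raise ValueError("max_key must not be smaller than min_key")

    total = max_key - min_key + 1
    base, extra = divmod(total, copies)

    ranges = []
    start = min_key
    for i in range(copies):
        size = base + (1 if i < extra else 0)   # spread the remainder over the first slices
        if size == 0:
            break                               # more copies than keys: stop early
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


# Four package copies over keys 1..10.
print(horizontal_ranges(1, 10, 4))   # [(1, 3), (4, 6), (7, 8), (9, 10)]
```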
