
Presentation transcript:

12/4/2018 12:40 AM © 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

A First Look at Large-Scale Data Warehousing in Microsoft SQL Server Code Name "Madison"
Matt Hollingsworth, Microsoft
Session DAT301

Agenda
Concepts and principles
Madison functional overview
Early adoption

Symmetric Multiprocessing (SMP)
Single DB instance
“Shared Everything” architecture
Servers/CPUs share memory and disks
Can lead to resource contention as you scale

Massively Parallel Processing (MPP)
Servers/CPUs have their own dedicated resources
“Shared Nothing” architecture
“Secret sauce” is parallelizing operations
Lightning-fast queries, data loads and updates
Linear scalability
The problem needs to be partitionable

SMP vs MPP
SMP: HW advancements increasing ability to scale up. Scaling is limited. High-end SMP is very expensive. Extremely high concurrency for some workloads. For less than 1-2 TB of data, SMP will almost always be better. Full SQL Server functionality. HA must be architected in.
MPP: HW advancements increasing ability to scale up and scale out. Scaling to 1 PB+. Scale out is relatively low cost. Relatively high concurrency for complex workloads. For > 2 TB up to 1 PB. Limited SQL Server functionality. HA is built in.

Sequential I/O vs Random I/O
Sequential I/O: ideal for data warehousing. Scalable, predictable performance. Large reads and writes. Requires 1/3 or fewer drives for the same performance.
Random I/O: ideal for OLTP. Not as predictable and scalable for data warehousing. Small reads and writes. Requires a large number of drives.
Best practices focus on preserving the sequential order of data.

About DATAllegro…
Technology partners
Stack: proprietary appliance management and MPP database; open source database and OS; industry standard servers; industry standard networking; industry standard storage

Integration Plans
Provide scale out through MPP on SQL Server and Windows
Offer an “appliance-like” user experience to Data Warehouse customers
Lower TCO for high-end Data Warehousing
Offer an integrated BI platform to small and very large enterprises
[Diagram labels: Microsoft BI; Open Source Database & OS; Industry Standard Servers; Reference Hardware Platforms; Industry Standard Networking; Industry Standard Storage]

A Holistic Approach: Balanced Across All Components
SQL Server 2008: potential performance bottlenecks
[Diagram components: server (CPU cores, SQL Server, Windows, cache); FC HBA (A, B); FC switch (A, B); storage controller (A, B, cache); LUN; disk]
[Diagram rates: CPU feed rate; SQL Server read-ahead rate; HBA port rate; switch port rate; SP port rate; LUN read rate; disk feed rate]
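
The balance idea on this slide can be sketched in a few lines: when data flows through the stages in series, the sustained scan rate is capped by the slowest stage. The following is only an illustration, not a sizing tool. The function name and every MB/s figure are hypothetical and do not come from the presentation.

```python
# Minimal sketch: the slowest stage of a serial I/O path caps the whole path.
# All rates below are made-up MB/s values used purely for illustration.

def sustained_scan_rate(stage_rates):
    """Return (rate, stage_name) for the slowest stage in a serial I/O path."""
    if not stage_rates:
        raise ValueError("at least one stage is required")
    name = min(stage_rates, key=stage_rates.get)
    return stage_rates[name], name


# Stage names follow the rate labels on the slide; the numbers are assumptions.
example_path = {
    "disk feed rate": 2400,
    "LUN read rate": 2000,
    "SP port rate": 1600,
    "switch port rate": 1600,
    "HBA port rate": 1500,
    "SQL Server read-ahead rate": 1800,
    "CPU feed rate": 1600,
}

rate, bottleneck = sustained_scan_rate(example_path)
print(f"sustained rate: {rate} MB/s, limited by {bottleneck}")
```

In this made-up configuration, adding disks would not help, because the HBA ports saturate first. That is the sense in which the slide asks for balance across all components.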

Sequential I/O
Physical table structures, file layouts and SQL Server settings to maximize sequential I/O
Enough disks to feed available CPU cores
Carefully designed storage infrastructure to maximize and sustain sequential I/O
No bottlenecks
Where possible, separate I/O paths and disks for data, TempDB and logs

Fast Track DW
Accelerate scalable Data Warehouse deployments at lower TCO
Pre-configured, pre-tested HW reference architectures (4-32 TB)
SI Solution Templates
Appliance-like time to value
Flexibility through choice of HW platforms
Low TCO through commodity hardware and value pricing
Reduced risk through pre-tested and pre-tuned configurations
Provides a clear upgrade path to “Madison” via Hub/Spoke

MPP Additional Considerations
Principles and approach of SMP carry forward
Deeper level of complexity: high availability, parallelization, inter-node data movement

Modular building blocks
Balanced CPU and storage
Both SMP and MPP are based on building blocks that scale by the CPU core
Each core adds network, storage processing and disk bandwidth
Based on maximizing and sustaining true sequential I/O while minimizing disks
Generally changes the balance of systems so that more can be spent on CPU and SW than on storage, giving better overall performance for a given budget
Building blocks can be adjusted for multiple MPP configurations: high performance, archive and extreme performance

The Future of SQL Server Data Warehousing: Project "Madison"
Build on proven scale for SQL Server Data Warehousing
Predictable scale out through MPP
Customers with over 400 TB data warehouses
Accelerate plan to support largest Data Warehouses
Provide massive scale with low TCO
Integrated with Microsoft BI

SQL Server MPP: 10,000-foot view
Appliance-like model
Hardware and software in unison and in balance: no bottlenecks
Achieve max performance per component
For each HW component and each SW module: define max performance; identify optimum workload type; adjust surrounding HW/SW to achieve optimum
Packages engineering talent: lots of knowledge, many hours of tuning, trying, testing

Commodity Hardware
Lower cost
Frequent performance improvements
Easier upgrade and maintenance
Higher customer comfort
Better compatibility

Madison MPP Data Warehouse Architecture
[Architecture diagram. Labels: Corporate Network; SQL Client Drivers; ETL Load Interface; Corporate Backup Solution; Configuration & Monitoring; Control Node (Active/Passive); Private Network; Compute Nodes (SQL); Distributed DB; Industry Standard SAN Storage; Spare Node; Landing Zone; Backup; Microsoft Cluster Server]

Ultra Shared Nothing
An extension of traditional shared nothing design
Push shared nothing architecture into the SMP node
I/O and CPU affinity within SMP nodes
Eliminate contention per user query
Use full resources for each user query
Multiple physical instances of tables
Distribute large tables
Replicate small tables
Distribute AND replicate medium tables
Re-distribute rows “on-the-fly” when necessary
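
The table placement strategies listed on this slide can be illustrated with a small, self-contained sketch. It is not Madison's implementation. The CRC32-modulo hash, the function names and the sample tables are all assumptions chosen for illustration. It shows a large table distributed by hash, a small table replicated to every node so that joins stay local, and rows re-distributed on a different key.

```python
import zlib


def node_for(key, node_count):
    """Map a distribution key to a node with a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(str(key).encode("utf-8")) % node_count


def distribute(rows, key, node_count):
    """Split a large table into one slice per node, by hash of the key column."""
    slices = [[] for _ in range(node_count)]
    for row in rows:
        slices[node_for(row[key], node_count)].append(row)
    return slices


def replicate(rows, node_count):
    """Give every node its own full copy of a small table."""
    return [list(rows) for _ in range(node_count)]


def redistribute(slices, key, node_count):
    """Re-hash already distributed rows on a different key (the on-the-fly case)."""
    return distribute([row for s in slices for row in s], key, node_count)


def local_join(fact_slice, dim_copy, key):
    """Join one node's fact slice with its local dimension copy (no data movement)."""
    lookup = {d[key]: d for d in dim_copy}
    return [{**f, **lookup[f[key]]} for f in fact_slice if f[key] in lookup]


# Hypothetical sample data: a "large" fact table and a "small" dimension table.
NODES = 4
sales = [{"sale_id": i, "store_id": i % 3, "amount": 10 * i} for i in range(12)]
stores = [{"store_id": s, "name": f"store-{s}"} for s in range(3)]

sales_slices = distribute(sales, "sale_id", NODES)
store_copies = replicate(stores, NODES)

# Each node joins only its own data; the per-node results are then concatenated.
joined = [
    row
    for n in range(NODES)
    for row in local_join(sales_slices[n], store_copies[n], "store_id")
]

# Re-distribute on store_id so that all rows of one store sit on a single node.
by_store = redistribute(sales_slices, "store_id", NODES)
```

Because each node holds a full copy of the small table, the join needs no inter-node traffic. Re-distribution is the costlier fallback when two large tables are not hashed on the join key.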

Control Node & Client Drivers
Control node:
Client connections always go through the control node
Clustered to a passive node
Processes SQL requests
Prepares execution plan
Orchestrates distributed execution
Local SQL Server to do final query plan processing / result aggregation
Client drivers:
Will use the same set of drivers used by DATAllegro
Provided by DataDirect
ODBC, OLE DB, JDBC and ADO.NET client drivers
Wire protocol (SequeLink)
Drivers available for 32 and 64 bits
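
The control node's role of orchestrating distributed execution and doing the final result aggregation follows the general scatter-gather pattern, which can be sketched as below. This is a generic illustration under stated assumptions, not the Madison query processor. The function names and the sample rows are hypothetical.

```python
from collections import Counter


def node_partial_sum(rows, group_key, value_key):
    """Work done independently on one compute node: a local GROUP BY with SUM."""
    partial = Counter()
    for row in rows:
        partial[row[group_key]] += row[value_key]
    return partial


def control_node_merge(partials):
    """Final step on the control node: combine the per-node partial results."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds values for matching keys
    return dict(total)


# Hypothetical per-node slices of one distributed table.
node_slices = [
    [{"region": "north", "amount": 5}, {"region": "south", "amount": 7}],
    [{"region": "north", "amount": 3}],
    [{"region": "south", "amount": 1}, {"region": "east", "amount": 4}],
]

result = control_node_merge(
    node_partial_sum(rows, "region", "amount") for rows in node_slices
)
print(result)
```

Only the small partial results travel to the control node, which is why additive aggregates such as SUM and COUNT parallelize well in a shared-nothing design.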

Compute Nodes
Each is a SQL Server 2008 instance
DB engine nodes are autonomous on local data
SQL as primary interface
Each MPP node is a highly tuned SMP node with standard interfaces

Landing Zone
Provides high capacity storage for data files from ETL processes
Integration Services available on the landing zone
Connected to internal network
Available as a sandbox for other applications and scripts that run on the internal network
[Diagram: Source -> Landing Zone files -> Data Loader -> Compute Nodes]

Backup Node
Builds on SQL Server native backup/restore facility
Uses the VDI interface to plug into the backup pipeline
Database-level backup
Coordinated backup across the nodes
Quiesce write activity to synchronize
Can only restore to another appliance with exactly the same number of distributions

Configuration and Monitoring
Challenge: is it an appliance or a collection of nodes?
Madison services instrumented: logs and performance counters
Capture and forward SNMP alerts from devices within the appliance
Small subset of DMVs to union underlying node DMVs
Leverage HPC for monitoring

High Availability
Multiple levels of redundancy
Leveraging MSCS for node availability
Cluster-aware services: SQL Server, Madison, DMS
Leveraging MSCS for SQL services, DMS
1 spare node for every 6* compute nodes (6x1)
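
As a worked example of the 6x1 ratio above: the sketch below assumes that the ratio rounds up for partial groups, which the slide does not state (the footnote behind the asterisk is not part of the transcript). The function name and parameters are hypothetical.

```python
import math


def spare_nodes_needed(compute_nodes, compute_per_spare=6):
    """Spare nodes for a given compute node count, assuming partial groups round up."""
    if compute_nodes < 0 or compute_per_spare <= 0:
        raise ValueError("node counts must be non-negative and the ratio positive")
    return math.ceil(compute_nodes / compute_per_spare)


for n in (6, 7, 24):
    print(n, "compute nodes ->", spare_nodes_needed(n), "spare node(s)")
```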

Security and Encryption
Retain DA v3 design
Authentication and authorization done by the Madison server
Users and roles as first class principals
Nested role capabilities
Connection to SQL back-ends through a high privilege account
SQL nodes reside on the private network
No support for integrated auth
Leverages TDE to expose DB-level encryption
Supports key rotation

The Logical Data Model
Multiple databases per appliance
Each user database maps to one SQL Server db per node
Tables: replicated, distributed, or replicated + distributed
Leverage SQL Server compression
Supports partitioning
Supports secondary indexes
Views

SQL Server Data Types
Types listed in the DAv3 / Madison comparison table: bigint; binary; bit; char / nchar; date, time; datetime (was date in DA); datetime2; datetimeoffset; decimal; float; geometry / geography; hierarchyid; int (was integer in DA); money; real; smalldatetime; smallint; smallmoney; sql_variant; text / ntext / image; timestamp; tinyint; varchar / nvarchar / varbinary; v*(max); uniqueidentifier; xml
Most scalar data types supported by SQL Server 2008 are supported by Madison
Main exceptions:
Character and binary strings limited to 8K (i.e. no BLOB support)
XML
sql_variant
System and CLR UDTs
Collation: Latin1_General with binary comparison only

Supported SQL Syntax
Aligned with ANSI SQL 92
Basic INSERT, UPDATE, DELETE, SELECT
CREATE TABLE AS SELECT
Limited analytical function support
Teradata extensions: Quantile, Sample, …

Manageability
Web-based main administrative user interface
Based on the DATAllegro manageability UI
Monitoring system health and activity
Leveraging HPC Pack 2008: systems management, monitoring, cluster health

Query Tools
GUI tool: Nexus (CoffingDW)
Table and view object explorer
Interactive query execution
Command line tool:
Replacement for DA-SQL
Flavor of SqlCmd

Demo: tools walkthrough

MS BI Integration
Integration Services: Madison enabled as a source (data movement, lookup operations, etc.). Will add a new SSIS destination. Ensure integrated high performance loads.
Reporting Services: fully supported, including parameterized queries. Will customize experience for Report Builder and Report Designer.
Analysis Services: will get connectivity through OLE DB provider. Will enable both MOLAP and ROLAP storage.

Madison - Hub & Spoke
[Diagram: Madison hub connected to spokes for Finance, Sales, HR and Manufacturing; spoke types shown: SQL Server AS, Madison, SQL Server DM]
Each business unit has its own Data Marts
More responsive to business needs
Fits budget realities
Hub provides centralized data governance platform
Node-to-node data movement
Parallel over InfiniBand or 10 Gig networks
~500 GB per min with minimal overhead

Benefits of Hub-and-Spoke
All systems connect via a dedicated high speed network
Parallel database copy: speeds of up to 500 GB per min
Simplification of data mart ETL / ELT processes with publishing model
Separation of management and user workloads
Integration of SMP SQL Server 2008 and MPP systems
Ability to independently expand any system
Ability to add additional spokes without impacting other users
Deployment of development and test environments that leverage parallel connectivity
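
To put the quoted copy rate in perspective, a short calculation gives the best-case time to publish a data mart from the hub. It treats "up to 500 GB per min" as a constant rate, which is an optimistic assumption. The function name and the 10 TB data mart size are hypothetical.

```python
def best_case_copy_minutes(size_gb, rate_gb_per_min=500.0):
    """Lower bound on copy time, assuming the peak rate is sustained throughout."""
    if size_gb < 0 or rate_gb_per_min <= 0:
        raise ValueError("size must be non-negative and rate positive")
    return size_gb / rate_gb_per_min


# Hypothetical 10 TB data mart, using 1 TB = 1000 GB.
minutes = best_case_copy_minutes(10_000)
print(f"10 TB at 500 GB/min: at least {minutes:.0f} minutes")
```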

Early Adoption
MTP (Madison Technology Preview):
Our flavor of CTP
Assess product and field/partner readiness
Provide roadmap for competitive situations
Location: MTCs, partners, other MS facilities, …
Working with partners to secure hardware
2-3 week engagements
TAP (Technology Adoption Program):
Closer to traditional TAP
Assess production readiness
Longer engagement
Go-live requirements
Customer secures hardware

High Level Release Definitions
Will start running MTPs in the summer
“Madison” (aka v1): focus on time to market; compatibility with DATAllegro v3; MS BI integration; H1 2010
V2+: closer functional alignment with SQL Server; better integration with SQL and MS ecosystem, tools and technologies

Recap
Data Warehousing Reference Architectures available today: SQL Server Fast Track
SQL Server “Madison”:
Built for advanced, large scale data warehouses
Shared-nothing MPP architecture
Early evaluation programs starting soon
All feedback welcome: richtk@microsoft.com
Thank you!

question & answer

Get your copy autographed by Lynn or Stephen Monday, 3rd 17:00 to 18:00 Intersoft Book Shop
