Microsoft Analytics Platform System

Presentation transcript:

Microsoft Analytics Platform System
02 – Hardware/Software Architecture
Brian Walker | Microsoft Architect – Data Insights COE
Jesse Fountain | Microsoft WW TSP Lead
November 23, 2018

Agenda
SMP and MPP differences
What is APS?
APS hardware components
High availability
PDW region overview
PDW tools of the trade

Differences between SMP and MPP

SQL upgrade prospect scenario: SQL Server 2014 or APS
SQL Server customers with scale and performance needs have a choice to make.

Path 1: Customer application on SQL 2008 R2, upgraded to SQL 2014
Start here. Step 1: Upgrade. Affects all systems. Total upgrade across 2 releases: infrastructure re-architecture, replace/upgrade SAN, regression testing.
Step 2: Optimize. Re-optimization is required for the new version, with unknown maintenance and cost requirements and unknown gain in performance. Heavy testing and thought required.
New requirements (more business, more growth): new hardware and back to Step 1.

Path 2: Customer application migrated to APS
Step 1: Migrate. Migrate once and quickly deliver value: pre-optimized infrastructure, best practice configuration, built-in automatic data distribution and placement.
New requirements: add capacity with a push-button upgrade. Automatic data distribution; infrastructure optimized, data center optimized, rack optimized. Migrate once and upgrades are managed.

Data warehousing comparison: SMP vs. MPP
The spiderweb depicts important attributes to consider when evaluating data warehousing options. Big Data support is the newest dimension.
APS: multi-dimensional scalability
SMP: tunable in one dimension at the cost of other dimensions

Chart axes and scale labels:
Data Volume: 10 TB, 50 TB, 100 TB, 500 TB, 5 PB
Query Concurrency: 100, 1,000, 10,000
Mixed Workload: Strategic; Strategic, Tactical; Loads; Loads, SLA
Data Freshness: Weekly Load, Daily Load, Data Feeds, Near Real Time
Query Complexity: 3-5 Way Joins; 5-10 Way Joins; Joins + OLAP operations + Aggregation + Complex "Where" constraints + Views; Parallelism
Schema Sophistication: Simple Star; Multiple, Integrated Stars; Multiple, Integrated Stars and Normalized; Normalized
Query Freedom: Batch Reporting, Repetitive Queries; Ad Hoc Queries; Data Analysis/Mining
Query Data Volume: MB's, GB's, TB's

Management simplicity and lower operational costs
Database administration tasks compared for APS (MPP SQL Server) and SQL Server 2014 (SMP SQL Server):
Logical Data Modeling, Physical Data Modeling, Data Partitioning Definition, Data Placement Definition, Free Space Management, Data Balancing Control, Data Reorganization, Index Reorganization, Workspace Management, Query Tuning, Change Management, Rearchitect Environment
Ratings used in the table: High, Low, Auto, None, Moderate, Never, Often

Database management in APS
Built-in high availability and failover
Linear scalability by adding nodes
Minimal tuning efforts and troubleshooting
Simple database and table definition
Unified administration console
Minimum ongoing maintenance
No need to manage disk or database subsystems
No detailed space management needed
No memory/cache management needed
No optimizer hints needed
No need to manage parallelism
No need to manage physical computing nodes
No index reorgs needed
No index rebuilds needed
Use DBAs for higher value activity, not low-level system management

What is APS?

Microsoft Analytics Platform System

Low-Cost, Rapid Value
Appliance: System is pre-configured at factory
Industry Standard: Co-engineered with HP/Dell/Quanta
Automatic Compression: 5X-15X with Columnar Technology

Insight on All Data
PolyBase SQL: easy access to both relational and Hadoop data
Near Real-Time: Access multiple data sources quickly
Shared-Nothing: allows linear scalability to store all historical data

High-Performance Analytics (MPP)
Powerful: Massively Parallel Processing (MPP) Engine
Mature: Parallel Cost-Based Optimizer from SQL Server
Dedicated: Direct-attached high-speed servers and storage

The foundation for data warehousing and advanced analytics
Combines hardware and software to provide a turn-key, balanced platform specific to data warehouse and analytical workloads
Built for easy scale-out as data warehouse capacity requirements grow
Deep, native integration with Hadoop
High-performance, concurrent data warehouse workloads for simultaneous data loading and querying
Built-in development engineering best practices
Integrated systems monitoring and management

New features come first to APS
Updateable Column Store
Agnostic Hadoop Integration via PolyBase
Cardinality Estimation
Cost-Based Distributed SQL Query Engine
Hub and Spoke Architecture Support
Analytical Functions (e.g. Lag and Lead)
Incremental functional releases each year

APS hardware components

Rack and network
Contains: Rack, Ethernet Switches, InfiniBand Switches
Also added: Power Units (PDU)

PDW Base Scale Unit
Contains: Orchestration Host, Passive Host, Optional Passive Host, Data Scale Unit

Hadoop Base Scale Unit
Contains: Rack & Network, PDW Base Scale Unit, Orchestration Host, Passive Host, Data Scale Unit

Data Scale Unit
Servers "active" in WFC
Unit of growth
Used by both regions
Varies in size: by vendor, by appliance size
Uses existing switches

HP configuration
2 – 56 compute nodes
1 – 7 racks
1, 2, or 3 TB drives
15.1 – 1268.4 TB raw
53 – 6342 TB user data
Up to 7 spare nodes available across the entire appliance

Rack components:
Base Unit (6U): redundant InfiniBand, redundant Ethernet, management and control node (active), rack failover node (passive)
Extension Base Unit (5U): redundant InfiniBand, redundant Ethernet, rack failover node (passive)
Scale Unit (7U): 2 HP 1U servers (16 cores each, 32 total), one 5U JBOD with 1 TB drives, user data capacity 75 TB
Reserved customer space (8U – 9U): ETL servers, backup servers, passive unit (additional spares)

Rack layout shown (three racks, four scale units per rack):
Rack 1: JBOD 1 – 4, Compute Nodes 1 – 8
Rack 2: JBOD 5 – 8, Compute Nodes 9 – 16
Rack 3: JBOD 9 – 12, Compute Nodes 17 – 24

Raw capacity by appliance size (1 TB drives):
¼ rack: 15 TB
½ rack: 30 TB
Full rack: 60 TB
1¼ rack: 75.5 TB
1½ rack: 90.6 TB
2 racks: 120.8 TB
3 racks: 181.2 TB

© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Dell and Quanta configuration
2 – 54 compute nodes
1 – 6 racks
1, 2, or 3 TB drives
22.65 – 1223.1 TB raw
79 – 6116 TB user data
Up to 6 spare nodes available across the entire appliance

Rack components:
PDW Backplane (6U): redundant InfiniBand, redundant Ethernet, management and control node (active), rack failover node (passive)
Base Unit (10U): 3 servers in a 2U enclosure (16 cores each, 48 total), 2 JBODs at 4U each with 1 TB drives, user data capacity 79 TB
Reserved (6U)

Rack layout shown (three base units per rack):
Base unit 1: JBOD 1 – 2, Compute Nodes 1 – 3
Base unit 2: JBOD 3 – 4, Compute Nodes 4 – 6
Base unit 3: JBOD 5 – 6, Compute Nodes 7 – 9

Raw capacity by appliance size (1 TB drives):
1/3 rack: 22.6 TB
2/3 rack: 45.3 TB
Full rack: 67.9 TB
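The raw-capacity ranges quoted on the two configuration slides (15.1 – 1268.4 TB for HP, 22.65 – 1223.1 TB for Dell and Quanta) can be reproduced with simple arithmetic. The sketch below is only an illustration: the function name and dictionaries are hypothetical, and it assumes raw capacity scales linearly with the number of growth units and with drive size, which matches the end points on the slides but is not stated there.

```python
# Raw TB per growth unit with 1 TB drives, as read from the slides.
RAW_TB_PER_UNIT = {
    "HP": 15.1,            # scale unit: 2 compute nodes + 1 JBOD
    "Dell/Quanta": 22.65,  # base unit: 3 compute nodes + 2 JBODs
}
NODES_PER_UNIT = {"HP": 2, "Dell/Quanta": 3}


def raw_capacity_tb(vendor: str, compute_nodes: int, drive_tb: int = 1) -> float:
    """Estimate raw capacity in TB (assumes linear scaling, see lead-in)."""
    nodes_per_unit = NODES_PER_UNIT[vendor]
    if compute_nodes <= 0 or compute_nodes % nodes_per_unit:
        raise ValueError(
            f"{vendor} appliances grow in steps of {nodes_per_unit} compute nodes"
        )
    units = compute_nodes // nodes_per_unit
    return round(units * RAW_TB_PER_UNIT[vendor] * drive_tb, 2)


if __name__ == "__main__":
    # Smallest and largest configurations named on each slide.
    print(raw_capacity_tb("HP", 2))                        # 1 scale unit, 1 TB drives
    print(raw_capacity_tb("HP", 56, drive_tb=3))           # 28 scale units, 3 TB drives
    print(raw_capacity_tb("Dell/Quanta", 3))               # 1 base unit, 1 TB drives
    print(raw_capacity_tb("Dell/Quanta", 54, drive_tb=3))  # 18 base units, 3 TB drives
```

The intermediate sizes on the HP slide follow the same pattern, for example 8 scale units at 15.1 TB each for the 2-rack figure of 120.8 TB.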

High availability

Failover in Action: Control host failure
HST01 node marked as failed (AD02 persists on HST02)
Cluster fails over to HST02
HST02 is already "warm", so failover is very fast
Nodes shown (cluster WFOHST01): HST01 (CTL01, AD01, VMM); HST02 (CTL01, AD02, VMM); HSA01 (CMP01, ISCSI01); HSA02 (CMP02, ISCSI02); HSA03 (CMP03, ISCSI03); HSA04 (CMP04, ISCSI04); HSA05 (CMP05, ISCSI05); HSA06 (CMP06, ISCSI06); storage DAS01, DAS02, DAS03

Failover in Action: Compute node failure
Compute node marked as failed
PDW cluster restarts the compute node on a passive server
ISCSI VM does not fail over
Nodes shown (cluster WFOHST01): HST01 (CTL01, AD01, VMM); HST02 (CMP01); HSA01 (CMP01, ISCSI01); HSA02 (CMP02, ISCSI02); HSA03 (CMP03, ISCSI03); HSA04 (CMP04, ISCSI04); HSA05 (CMP05, ISCSI05); HSA06 (CMP06, ISCSI06); storage DAS01, DAS02, DAS03
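The compute node failover described on this slide can be pictured with a toy model. This is a hedged sketch, not how the cluster software is implemented: the Host class and fail_compute_host function are hypothetical names, and the only behaviour modelled is what the slide states (the compute VM restarts on a passive server, the ISCSI VM stays where it is).

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    vms: list = field(default_factory=list)
    failed: bool = False


def fail_compute_host(hosts: dict, failed_name: str, passive_name: str) -> list:
    """Mark a host as failed and restart its compute VMs on the passive host.

    ISCSI VMs are deliberately left in place, mirroring the slide's
    statement that the ISCSI VM does not fail over.
    """
    failed, passive = hosts[failed_name], hosts[passive_name]
    failed.failed = True
    moved = [vm for vm in failed.vms if vm.startswith("CMP")]
    failed.vms = [vm for vm in failed.vms if not vm.startswith("CMP")]
    passive.vms.extend(moved)
    return moved


if __name__ == "__main__":
    # Host and VM names are taken from the slide diagram.
    hosts = {
        "HST01": Host("HST01", ["CTL01", "AD01", "VMM"]),
        "HST02": Host("HST02", ["AD02"]),  # passive host
        "HSA01": Host("HSA01", ["CMP01", "ISCSI01"]),
    }
    moved = fail_compute_host(hosts, "HSA01", "HST02")
    print("restarted on HST02:", moved)
    print("still on HSA01:", hosts["HSA01"].vms)
```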

APS disk layout: LUNs and filegroups/files
Each LUN is composed of 2 drives in a RAID 1 mirroring configuration
Distributions are now split into 2 files
TempDB and Log are spread across all 16 LUNs
No fixed TempDB or log size allocation
VHDXs are on JBODs to ensure high availability
Disk I/O further parallelized relative to V1: bandwidth to increase by ~70% in V2 RTM

Design details (one JBOD, disks 1 – 70):
Disks 1 – 2: Node 1, Distribution A, file 1
Disks 3 – 4: Node 1, Distribution A, file 2
Disks 5 – 6: Node 1, Distribution B, file 1
Disks 7 – 8: Node 1, Distribution B, file 2
...
Disks 29 – 30: Node 1, Distribution H, file 1
Disks 31 – 32: Node 1, Distribution H, file 2
Disks 33 – 34: Node 2, Distribution A, file 1
...
Disks 65 – 70: fabric storage (VHDXs for node) and hot spares
Each data LUN also holds a share of TempDB and Log.
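The disk numbering on this slide follows a regular pattern: 8 distributions (A to H) per node, 2 files per distribution, one mirrored 2-disk LUN per file. The sketch below maps a node, distribution and file to its disk pair. The function name is hypothetical, and the ordering inside the elided middle of the diagram is assumed to continue the pattern of the rows that are visible.

```python
DISTRIBUTIONS = "ABCDEFGH"   # 8 distributions per node, per the slide
FILES_PER_DISTRIBUTION = 2   # "Distributions are now split into 2 files"
DISKS_PER_LUN = 2            # RAID 1 mirror
LUNS_PER_NODE = len(DISTRIBUTIONS) * FILES_PER_DISTRIBUTION  # 16


def disks_for(node: int, distribution: str, file_no: int) -> tuple:
    """Return the (first, second) disk numbers backing one distribution file.

    node is 1 or 2 (two compute nodes share one JBOD); file_no is 1 or 2.
    """
    if node not in (1, 2) or file_no not in (1, 2):
        raise ValueError("node and file_no must each be 1 or 2")
    dist_index = DISTRIBUTIONS.index(distribution)  # ValueError if unknown
    lun = (node - 1) * LUNS_PER_NODE + dist_index * FILES_PER_DISTRIBUTION + (file_no - 1)
    first_disk = lun * DISKS_PER_LUN + 1
    return first_disk, first_disk + 1


if __name__ == "__main__":
    print(disks_for(1, "A", 1))  # first LUN of node 1
    print(disks_for(1, "H", 2))  # last LUN of node 1
    print(disks_for(2, "A", 1))  # first LUN of node 2
```

Under this mapping the two nodes together use disks 1 to 64 for data, which leaves disks 65 to 70 for the fabric storage and hot spares shown at the end of the diagram.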

Hadoop region HA
Head/Controlling Nodes behave exactly the same as for PDW
Data Nodes are different:
APS relies on Hadoop data replication for data availability
Disks are not mirrored
Data Nodes do not fail over
Replication factor is configurable
Table: # Scale Units against Replication Factor (PolyBase), with rows "=1" and ">1" and values 2 and 3

PDW region overview

Appliance: PDW Region and Hadoop Region
Nodes shown (cluster WFOHST01): HST01, HST02, HSA01, HSA02
Virtual machines: CTL01, AD01, AD02, VMM, CMP01 – CMP06, ISCSI01 – ISCSI06
Storage: DAS01, DAS02, DAS03

PDW region nodes
PDW Nodes: Control, Compute (>1)
Infrastructure Nodes: Virtual Machine Manager, Fabric Active Directory
Hosts shown (cluster WFOHST01): HST01, HST02, HSA01 – HSA06
Virtual machines: CTL01, AD01, AD02, VMM, CMP01 – CMP06, ISCSI01 – ISCSI06
Storage: DAS01, DAS02, DAS03

Control and Compute workload nodes
Diagram: one Control Node and six Compute Nodes, each Compute Node holding its own data.

Services inside the Control node
PDW Services: PDW Engine, DMS Core, PDW Agent, SQL Server, Admin Console (IIS)
Responsibilities: Parse SQL / Syntax Check, Validate & Authorize, Generate D-SQL Plan, Orchestrate D-SQL Plan, Collate Diagnostic Info, Admin Console Web App

Services inside the Compute node
PDW Services: DMS Core, PDW Agent, SQL Server
Responsibilities: Hold User Data, Process Queries, Move Data, Load Data
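Since the compute nodes hold the user data, it can help to picture how rows might be spread across them. The sketch below is an assumption-heavy illustration, not the PDW algorithm: the slides do not say how rows are placed, the use of a hash on a key column and the SHA-256 function are stand-ins chosen for the example, and place_row is a hypothetical name. The one figure taken from the deck is 8 distributions (A to H) per compute node.

```python
import hashlib

DISTRIBUTIONS = "ABCDEFGH"  # 8 distributions per compute node (disk layout slide)


def place_row(key, compute_nodes: int) -> tuple:
    """Map a key value to a (compute node number, distribution letter).

    Illustrative only: a stable hash of the key picks one of
    compute_nodes * 8 buckets, so equal keys always land together.
    """
    if compute_nodes < 1:
        raise ValueError("need at least one compute node")
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % (compute_nodes * len(DISTRIBUTIONS))
    node = bucket // len(DISTRIBUTIONS) + 1
    return node, DISTRIBUTIONS[bucket % len(DISTRIBUTIONS)]


if __name__ == "__main__":
    # Six compute nodes, as in the CMP01 - CMP06 diagrams.
    for customer_id in (1001, 1002, 1003):
        print(customer_id, place_row(customer_id, compute_nodes=6))
```

In a shared-nothing design each compute node can then work on its own buckets without coordinating with the others, which is the property the deck credits for linear scalability.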

Virtual Machine Manager (VMM)
Deployment of Virtual Machines
Configuration of Virtual Machines
Hosts Windows Update Services (WSUS)
Sits in the Fabric domain

PDW tools of the trade

Tools of the trade (PDW)
Development: SSDT (Visual Studio), SQLCMD, SSIS / SSAS / SSRS, Power BI, SSIS adapters, dwloader.exe
Management: Management Console, dwconfig.exe, pav.exe, PowerShell, System Center

Tool distribution with PDW
Location: Download http://www.microsoft.com/en-us/download/details.aspx?id=45294
Tools distributed: SSIS Destinations, Clienttools.msi, dwloader, Adventureworks, Help File

Connecting to PDW
Management Console: https://Control_Node_IP_Address/
Development tools: TCP Port 17001
Security
Connecting from SQL 2012? Use SNAC 11
Connecting from SQL 2008 R2? Use SNAC 10

SQL Server Data Tools (SSDT)
Used for writing queries against PDW
SSMS is not a supported tool
SSDT for PDW is part of the standard SSDT deployment model

SQL Server Integration Services
Destination adapters available for SSIS 2008 R2, SSIS 2012 and SSIS 2014

BI solution development
Power Query, Power Pivot, Power Map, Power View
SQL Server Analysis Services: ROLAP Facts, MOLAP Dimensions
SQL Server Reporting Services

SQLCMD support
Location: ..\Microsoft SQL Server\110\Tools\Binn\
SNAC 10 for 2008 R2, SNAC 11 for 2012+
QUOTED IDENTIFIER ON is mandatory and must be set at SQLCMD invocation:
SQLCMD.exe -I
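Putting the connection details together, a SQLCMD call against the PDW region needs the control node address, TCP port 17001 and the -I switch. The sketch below only builds the argument list and does not run anything. The helper name, the IP address, the login and the query are all placeholders invented for the example; -S, -U, -P, -I and -Q are standard sqlcmd switches.

```python
PDW_PORT = 17001  # development tools port from the "Connecting to PDW" slide


def build_sqlcmd_args(control_node: str, user: str, password: str, query: str) -> list:
    """Build a sqlcmd argument list for a PDW control node.

    -I turns QUOTED_IDENTIFIER on, which the slide marks as mandatory.
    """
    return [
        "sqlcmd",
        "-S", f"{control_node},{PDW_PORT}",  # server,port
        "-U", user,
        "-P", password,
        "-I",                                # must be set at invocation
        "-Q", query,                         # run the query, then exit
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    args = build_sqlcmd_args("10.0.0.10", "example_login", "example_password", "SELECT 1;")
    print(" ".join(args))
    # To execute for real, pass the list to subprocess.run(args, check=True).
```

Passing the arguments as a list rather than one shell string avoids quoting problems when the query text contains spaces or quotes.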

Third-party tools
Attunity Replicate: PDW Region supported target
http://www.attunity.com/solutions/data-warehousing/microsoft-pdw
MicroDesigner (MicroERD): data modelling tool
http://microerd.com/

PDW region configuration

Region must be restarted once reset

Management & monitoring

Management console
Read-only view of management information
Can cancel user sessions, queries, loads
Can easily visualize DSQL structures
Requires View Server State
All data accessible via DMVs

Demo | Admin Console

Microsoft Analytics Platform System
© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.