Massively Parallel Processing in Azure
Comparing Hadoop and SQL-based MPP architectures in the cloud
Josh Sivey
SQL Saturday #597 | Phoenix

Agenda
- What “kind” of MPP are we talking about?
- Benefits of using Azure for MPP solutions
- Comparing Hadoop MPP vs. SQL MPP
  - Hadoop (Azure HDInsight)
  - SQL (Azure SQL Data Warehouse)
- Discuss PaaS vs. IaaS
- Demos!
- Wrap-up
SQL Saturday #597 | PHOENIX 2017

What “kind” of MPP are we talking about?
“Massively parallel” refers to the use of a large number of processors (or separate computers) to perform a set of coordinated computations in parallel (simultaneously).
- Shared-Nothing Infrastructure
- Easily Scales Out

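The shared-nothing idea can be sketched in a few lines of Python. This is a toy model, not how HDInsight or SQL Data Warehouse is implemented: threads stand in for separate nodes, and the names `partition`, `local_sum`, and `parallel_sum` are made up for illustration. Each row is routed to exactly one node by hashing its key, every node aggregates only the rows it owns, and a coordinator merges the partial results.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from zlib import crc32


def partition(rows, node_count):
    """Route each (key, value) row to exactly one node by hashing its key."""
    shards = [[] for _ in range(node_count)]
    for key, value in rows:
        shards[crc32(key.encode("utf-8")) % node_count].append((key, value))
    return shards


def local_sum(shard):
    """Work done by one node, using only the rows it owns."""
    totals = defaultdict(int)
    for key, value in shard:
        totals[key] += value
    return dict(totals)


def parallel_sum(rows, node_count=4):
    """Coordinator: fan the work out, then merge the partial results."""
    shards = partition(rows, node_count)
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(local_sum, shards))
    merged = {}
    for partial in partials:
        # A key lives on exactly one shard, so partial results never collide.
        merged.update(partial)
    return merged


# Hypothetical sample data: (state, sale amount)
sales = [("AZ", 10), ("CA", 5), ("AZ", 7), ("NV", 3), ("CA", 1)]
result = parallel_sum(sales)
print(sorted(result.items()))
```

Because no node needs another node's rows, scaling out in this model only means raising `node_count`; the per-node work shrinks while the merge step stays trivial.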
Benefits of using Azure for MPP solutions
- Ease / Speed of Deployment
- No Infrastructure Selection / Procurement
- Reduced Maintenance Cost
- Pay only for what you use
- Scale Out and Up

PaaS vs. IaaS
Infrastructure-as-a-Service (IaaS)
- Equipment
  - Servers, Storage, Networking
Platform-as-a-Service (PaaS)
- Complete Solution Ecosystem
- Equipment and Software

PaaS vs. IaaS
Platform-as-a-Service (PaaS)
- Decreased Maintenance
- Abstracted Complexity of Architecture
- New Versions / Features Automatically Rolled Out
Infrastructure-as-a-Service (IaaS)
- Fine-Grained Control of Environment
- Choice of Software Versions
- Customizable

Hadoop MPP vs. SQL MPP
Hadoop MPP
- Hadoop Ecosystem
  - HDFS, Hive, Tez, Impala, …
- Structured, Semi-Structured, Unstructured Data
SQL MPP
- SQL Server on MPP Architecture
- T-SQL for Queries
- SSMS

Demos – What are we going to show?

Demo #1 – Hadoop MPP Demo
HDInsight via Azure Marketplace
Potential Use Cases
- Dev / POC Cases without impacting Production
- Testing Version Upgrades
- Peak Processing Offload
- Backups

Demo #1 – Hadoop MPP Demo
Resulting Architecture

Demo #1 – Hadoop MPP Demo Review
- Used Cloudera Distribution from Azure Marketplace to Provision a Hadoop Cluster
- Connected to Commonly Used Hadoop Tools in the Cloud
- Updated HDFS Configuration to Allow Connections to Azure Blob Storage
- Copied Data into HDFS
- Connected Client Tool to Cloud Cluster

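The slides do not show the actual configuration change, so the following is only a sketch of what the Blob Storage step commonly looks like. It assumes the cluster uses the hadoop-azure (WASB) connector, which reads the storage account key from a `fs.azure.account.key.<account>.blob.core.windows.net` property in `core-site.xml` and addresses data through `wasb://` URIs. The account, container, path, and key values are placeholders, and the two helper functions are hypothetical.

```python
from xml.etree import ElementTree as ET


def blob_account_property(account, access_key):
    """Build the core-site.xml <property> that holds the storage account key."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = (
        f"fs.azure.account.key.{account}.blob.core.windows.net"
    )
    ET.SubElement(prop, "value").text = access_key
    return ET.tostring(prop, encoding="unicode")


def wasb_uri(account, container, path):
    """Build the URI Hadoop tools use to address a blob path."""
    clean_path = path.lstrip("/")
    return f"wasb://{container}@{account}.blob.core.windows.net/{clean_path}"


# Placeholder values, not taken from the demo.
snippet = blob_account_property("mystorageacct", "<ACCESS-KEY>")
source = wasb_uri("mystorageacct", "demo-data", "/input/sales.csv")
print(snippet)
print(source)
```

Once the property is in place and the services have picked it up, a URI like `source` can be handed to the usual Hadoop file commands (for example `hadoop fs -ls` or `hadoop distcp`) to list the blob data or copy it into HDFS.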
Demo #2 – SQL MPP Demo
Azure SQL Data Warehouse via Azure Marketplace
Potential Use Cases
- Building a New Cloud-Based Data Warehouse
- Hybrid Data Source Scenarios
- High-Performance Computing
- Agility and Elastic Scale

Demo #2 – SQL MPP Demo - Architecture
By combining MPP architecture and Azure storage capabilities, SQL Data Warehouse can:
- Grow or shrink storage independent of compute.
- Grow or shrink compute without moving data.
- Pause compute capacity while keeping data intact.
- Resume compute capacity at a moment's notice.

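Why compute can change without moving data is easier to see in a toy model. Azure SQL Data Warehouse documents that a table's data is spread over 60 distributions held in Azure storage; that figure comes from the product documentation, not from these slides. The round-robin mapping and the `assign` / `per_node` helpers below are simplifications invented for illustration. Scaling only rebuilds the distribution-to-node map, and pausing is modeled as a map with no nodes at all.

```python
DISTRIBUTIONS = 60  # fixed count of data distributions (from product docs)


def assign(node_count):
    """Map each distribution to a compute node; zero nodes models 'paused'."""
    if node_count == 0:
        return {}
    # Round-robin is an assumption made for this sketch.
    return {d: d % node_count for d in range(DISTRIBUTIONS)}


def per_node(mapping):
    """Count how many distributions each compute node is responsible for."""
    counts = {}
    for node in mapping.values():
        counts[node] = counts.get(node, 0) + 1
    return counts


# The data itself stays in storage and is never touched by scaling.
storage = {d: f"blob-for-distribution-{d}" for d in range(DISTRIBUTIONS)}
before = dict(storage)

small = assign(2)    # two compute nodes, 30 distributions each
large = assign(6)    # six compute nodes, 10 distributions each
paused = assign(0)   # no compute attached, data still in storage

print(per_node(small))
print(per_node(large))
print(len(paused), storage == before)
```

In this model, growing from two nodes to six cuts each node's share from 30 distributions to 10 while `storage` stays unchanged.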
Demo #2 – SQL MPP Demo Review
Azure SQL Data Warehouse - PaaS
- Azure SQL DW is a cloud-based, scale-out database capable of processing massive volumes of data.
- Increase, decrease, pause, or resume compute in seconds.
- Fully fault tolerant with automatic backups.
- Develop with familiar SQL Server T-SQL and tools.

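Increasing or decreasing compute can be done from T-SQL by changing the database's service objective. The sketch below only builds that statement as a string; it does not connect to anything. The `scale_statement` helper, the database name, and the validation patterns are assumptions made for illustration, and the statement is normally run while connected to the logical server's `master` database. Pause and resume are not covered here, since they are typically driven from the portal, PowerShell, or the REST API rather than T-SQL.

```python
import re


def scale_statement(database, service_objective):
    """Return the T-SQL that asks SQL Data Warehouse to change its DWU level."""
    # Conservative checks so the values are safe to place in the statement.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", database):
        raise ValueError(f"unexpected database name: {database!r}")
    if not re.fullmatch(r"DW\d+c?", service_objective):
        raise ValueError(f"unexpected service objective: {service_objective!r}")
    return (
        f"ALTER DATABASE [{database}] "
        f"MODIFY (SERVICE_OBJECTIVE = '{service_objective}');"
    )


# Hypothetical database name and target level.
statement = scale_statement("DemoDW", "DW400")
print(statement)
```

The resulting string can be pasted into SSMS or sent through any SQL Server client library.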
Thank you Sponsors!
Thank You