Redmond Protocols Plugfest 2016 Casey Karst PolyBase in SQL Server 2016.

Big Picture

PolyBase provides a scalable T-SQL language extension for combining data from both universes: relational data in SQL Server and non-relational data in Hadoop.
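As a rough illustration of what "combining data from both universes" can look like, the sketch below joins a local table with a PolyBase external table in one T-SQL query. The table and column names (dbo.Insured_Customers, dbo.CarSensor_Data, CustomerKey, Speed) are hypothetical and do not come from the slides; the external table is assumed to exist already and to point at files in HDFS.

```sql
-- Minimal sketch: dbo.Insured_Customers is assumed to be an ordinary
-- SQL Server table, dbo.CarSensor_Data an existing PolyBase external
-- table over HDFS files. All object names here are hypothetical.
SELECT c.CustomerKey,
       c.LastName,
       AVG(s.Speed) AS AvgSpeed
FROM dbo.Insured_Customers AS c
JOIN dbo.CarSensor_Data AS s
    ON s.CustomerKey = c.CustomerKey
GROUP BY c.CustomerKey, c.LastName
ORDER BY AvgSpeed DESC;
```

The external table is referenced like any other table; the query text needs no Hadoop-specific syntax.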

PolyBase Use Cases

PolyBase Across the Enterprise

SQL Product                      Load Data        Query Data       Age Out Data
                                 Hadoop  WASB     Hadoop  WASB     Hadoop  WASB
SQL Server 2016                  Y       Y        Y       Y        Y       Y
Analytic Platform System (APS)   Y       Y        Y       Y        Y       Y
Azure SQL DW                     N Y N N Y (only five of the six values survive in the source)

The Hadoop Ecosystem

Hadoop Evolution

Initially: MapReduce for insights from HDFS-resident data.
Recently: SQL-like data warehouse technologies on HDFS, e.g. Hive, Impala, HAWQ, Spark/Shark.

All the interest in Big Data

- Increased number and variety of data sources that generate large quantities of data.
- Realization that data is "too valuable" to delete.
- Dramatic decline in the cost of hardware, especially storage.

PolyBase View

Step 1: Set up a Hadoop Cluster

- Hortonworks or Cloudera distributions
- Hadoop 2.0 or above
- Linux or Windows
- On premises or in Azure

Or Azure Storage Account

- Azure Storage Blob (ASB) exposes an HDFS layer.
- PolyBase reads and writes from ASB using Hadoop APIs.
- No compute push-down support for ASB.
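For the Azure Storage route, a data source definition might look roughly like the sketch below. The credential name, data source name, and the placeholders in angle brackets are illustrative assumptions, not values from the presentation.

```sql
-- Minimal sketch, assuming a storage account and container already exist.
-- A database master key is needed before a credential can be created.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong_password>';

-- For Azure blob storage the IDENTITY string is not used for
-- authentication; the SECRET holds the storage account key.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH IDENTITY = 'user',
     SECRET = '<azure_storage_account_key>';

-- TYPE = HADOOP reflects that PolyBase talks to ASB through Hadoop APIs.
CREATE EXTERNAL DATA SOURCE AzureStorage
WITH (
    TYPE = HADOOP,
    LOCATION = 'wasbs://<container>@<storage_account>.blob.core.windows.net',
    CREDENTIAL = AzureStorageCredential
);
```

No RESOURCE_MANAGER_LOCATION is given here, which fits the point that ASB has no compute push-down.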

Step 2: Install SQL Server

Select the PolyBase feature. It adds new PolyBase services:
- PolyBase Engine
- PolyBase Data Movement Service (DMS)

Prerequisite: download and install the JRE.
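One way to confirm after setup that the feature was actually selected is to query the server property, as in this small sketch (the check itself is not part of the slides):

```sql
-- Returns 1 when the PolyBase feature is installed on this instance,
-- 0 when it is not.
SELECT SERVERPROPERTY('IsPolyBaseInstalled') AS IsPolyBaseInstalled;
```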

Step 3: Scale-out

1. Install multiple SQL Server instances with PolyBase.
2. Choose one as Head Node.
3. Configure remaining as Compute Nodes:
   a. Run sp_polybase_join_group
   b. Restart PolyBase DMS

(Diagram: Head Node with PolyBase Engine and PolyBase DMS.)
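The join step could be sketched as follows. The head node machine name and instance name are hypothetical placeholders; 16450 is the default DMS control channel port and may differ in a given installation.

```sql
-- Run on each instance that should become a compute node.
-- 'HEADNODE01' and 'MSSQLSERVER' are placeholder values for the head
-- node's machine name and SQL Server instance name.
EXEC sp_polybase_join_group
     @head_node_address = N'HEADNODE01',
     @dms_control_channel_port = 16450,
     @head_node_sql_server_instance_name = N'MSSQLSERVER';

-- Afterwards, restart the PolyBase Data Movement Service on this
-- compute node (done from the service manager, not from T-SQL).
```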

After Step 3

PolyBase Scale-out Group
- The head node is the SQL Server instance to which queries are submitted.
- Compute nodes are used for scale-out query processing for data in HDFS or Azure.
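To see which instances belong to the scale-out group, one option is the compute-nodes DMV, queried on the head node. This is a sketch of a verification step that the slides do not show:

```sql
-- Lists the nodes of the PolyBase scale-out group; the type column
-- distinguishes the head node from compute nodes.
SELECT compute_node_id, type, name, address
FROM sys.dm_exec_compute_nodes;
```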

Step 4: Choose Hadoop flavor

Latest Hadoop distributions supported in SQL16 RTM:
- Cloudera CDH 5.5 on Linux
- Hortonworks 2.3 on Linux and Windows Server

What happens under the covers? PolyBase loads the right client jars to connect to the chosen Hadoop distribution. Different numbers map to various Hadoop flavors, for example:
- value 4: HDP 2.0 on Windows, or ASB
- value 5: HDP 2.0 on Linux
- value 6: CDH 5.1/5.5 on Linux
- value 7: HDP 2.1/2.2/2.3 on Linux/Windows, or ASB
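Assuming the numbers above are values of the 'hadoop connectivity' server configuration option (the transcript does not preserve the statement itself), choosing value 7 might look like this:

```sql
-- Minimal sketch: select the Hadoop flavor for this instance.
-- 7 is used here only as an example value from the list above.
EXEC sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
RECONFIGURE;

-- The new value takes effect only after SQL Server and the PolyBase
-- services have been restarted.
```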

After Step 4

PolyBase Design

Under-the-hood

HDFS bridge in DMS

Uses Hadoop RecordReaders/RecordWriters to read and write standard HDFS file types.

Under-the-hood

Data moves between clusters in parallel.

(Diagram: SQL16 scale-out group and Hadoop cluster with Namenode and HDFS file system.)

Under-the-hood

Creating External Tables

- External data source: once per Hadoop cluster
- External file format: once per file format
- External table: points at an HDFS file path
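The three objects could be created roughly as in the sketch below. Every name, address, column, and path in it is a hypothetical placeholder; the original slide code is not preserved in the transcript.

```sql
-- Once per Hadoop cluster: where the Namenode (and, optionally, the
-- resource manager used for push-down) can be reached.
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://<namenode_host>:8020',
    RESOURCE_MANAGER_LOCATION = '<resource_manager_host>:8050'
);

-- Once per file format: here, pipe-delimited text files.
CREATE EXTERNAL FILE FORMAT PipeDelimitedText
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = '|', USE_TYPE_DEFAULT = TRUE)
);

-- Per table: the schema plus the HDFS file path holding the data.
CREATE EXTERNAL TABLE dbo.CarSensor_Data (
    SensorKey   INT   NOT NULL,
    CustomerKey INT   NOT NULL,
    Speed       FLOAT NOT NULL
)
WITH (
    LOCATION = '/demo/car_sensor_data/',
    DATA_SOURCE = MyHadoopCluster,
    FILE_FORMAT = PipeDelimitedText
);
```

The data source and file format can be reused by any number of external tables on the same cluster.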

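Creating External Tables (secure Hadoop)

- Database scoped credential: once per Hadoop user
- External data source: once per Hadoop cluster per user
- External file format: once per file format
- External table: points at an HDFS file path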
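For a secured cluster, the same sequence gains a credential that the data source refers to. The sketch below uses placeholder names and secrets throughout and assumes a database master key has not been created yet.

```sql
-- A master key protects the credential secret.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong_password>';

-- Once per Hadoop user.
CREATE DATABASE SCOPED CREDENTIAL HadoopUser1
WITH IDENTITY = '<hadoop_user_name>',
     SECRET = '<hadoop_password>';

-- Once per Hadoop cluster per user: the data source binds the cluster
-- location to that user's credential.
CREATE EXTERNAL DATA SOURCE SecureHadoopCluster_User1
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://<namenode_host>:8020',
    CREDENTIAL = HadoopUser1
);

-- Once per file format.
CREATE EXTERNAL FILE FORMAT SecureDemoTextFormat
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = '|')
);

-- Per table: schema plus HDFS file path.
CREATE EXTERNAL TABLE dbo.SecureSensor_Data (
    SensorKey INT   NOT NULL,
    Speed     FLOAT NOT NULL
)
WITH (
    LOCATION = '/secure/sensor_data/',
    DATA_SOURCE = SecureHadoopCluster_User1,
    FILE_FORMAT = SecureDemoTextFormat
);
```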

Under-the-hood
