Henk van der Valk Oct.15, 2016 Level: Beginner


SQL Server 2016 PolyBase
Henk van der Valk, Oct. 15, 2016, Level: Beginner
http://www.sqlsaturday.com/551/Sessions/Schedule.aspx
SQL PolyBase has been a high-end feature of SQL APS and is now also available in SQL2016, SQL DB and SQL DW! It lets you use regular T-SQL statements for ad hoc access to data stored in Hadoop and/or Azure Blob Storage from within SQL Server. This session will show you how it works and how to get started!
www.Henkvandervalk.com

Starting SQL2016 on a server with 24 TB RAM
Just 4 fun!
Microsoft Worldwide Partner Conference 2016
© 2016 Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Thanks to our platinum sponsors: PASS SQL Saturday Holland - 2016

Thanks to our gold and silver sponsors: APS Onsite! PASS SQL Saturday Holland - 2016

Speaker Introduction
@HenkvanderValk
- 10+ years active in SQLPass community!
- 10 years of Unisys-EMEA Performance Center
- 2002: Largest SQL DWH in the world (SQL2000)
- Project Real (SQL 2005)
- ETL WR: loading 1TB within 30 mins (SQL 2008)
- Contributor to SQL performance whitepapers
- Perf tips & tricks: www.henkvandervalk.com
- Schuberg Philis: 100% uptime for mission critical apps
- Since April 1st, 2011: Microsoft Data Platform!
All info represents my own personal opinion (based upon my own experience) and not that of Microsoft.

Agenda
- Intro: What is PolyBase & Why?
- Getting started
  - SQL Server product versions supported
  - Installation & Setup
- Creating External Tables, Running hybrid queries
- Monitoring
  - Tips to improve Hadoop performance
- Scale out Groups

SQL Server 2016 as fraud detection scoring engine
HTAP (Hybrid Transactional Analytical Processing)
8 socket, 192 cores, 16 TB RAM
https://blogs.technet.microsoft.com/machinelearning/2016/09/22/predictions-at-the-speed-of-data/

The Big Data lake Challenge
- Different types of data: webpages, logs, and clicks; hardware and software sensors; semi-structured/unstructured data
- Large scale: hundreds of servers
- Advanced data analysis: integration between structured and unstructured data; power of both
How to orchestrate?

PolyBase builds the Bridge
[Diagram labels: Azure Blob Storage, RDBMS, Hadoop, PolyBase]
Just-in-Time data integration
- Across relational and non-relational data
- Fast, simple data loading
Best of both worlds
- T-SQL compatible
- Uses computational power at source
- Opportunity for new types of analysis
Access any data
© 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

PolyBase View in SQL Server 2016
- Execute T-SQL queries against relational data in SQL Server and ‘semi-structured’ data in HDFS and/or Azure
- Leverage existing T-SQL skills and BI tools to gain insights from different data stores
- Expand the reach of SQL Server to Hadoop (HDFS & WASB)
[Diagram labels: Query, Results, SQL Server, Hadoop, Azure Blob Storage]
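A minimal sketch of such a hybrid query, assuming dbo.Customer is an external table over files in HDFS or Azure Blob Storage (the name and columns are borrowed from the query examples later in this deck) and dbo.Nation is a hypothetical regular SQL Server table that is not part of the original slides:

```sql
-- Assumption: dbo.Customer is an EXTERNAL table (data lives in HDFS / Azure Blob Storage).
-- Assumption: dbo.Nation (n_nationkey, n_name) is an ordinary local table; hypothetical.
SELECT n.n_name,
       COUNT(*)         AS customers_in_debt,
       AVG(c.c_acctbal) AS avg_balance
FROM dbo.Customer AS c          -- external, non-relational side
JOIN dbo.Nation   AS n          -- local, relational side
  ON n.n_nationkey = c.c_nationkey
WHERE c.c_acctbal < 0
GROUP BY n.n_name
ORDER BY avg_balance;
```

From the point of view of the client or BI tool this is ordinary T-SQL; nothing in the statement reveals which of the two tables is external.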

Remove the complexity of big data: T-SQL over Hadoop
- PolyBase (NEW): manage structured & unstructured data across SQL Server and Hadoop
- Simple T-SQL to query Hadoop data (HDFS) (NEW)
- JSON support (NEW)
[Figure: T-SQL query; a sample quote ($658.39) and a Name / DOB / State table]

PolyBase use cases
- Load data: Use Hadoop as an ETL tool to cleanse data before loading to data warehouse with PolyBase
- Interactively query: Analyze relational data with semi-structured data using split-based query processing
- Age-out data: Age-out data to HDFS and use it as ‘cold’ but queryable storage

Polybase - Turning raw data tweets into information Query & Store Hadoop data Bi-directional seamless & fast

Azure Blob Storage

SQL Server 2016 & SQL DW Polybase! Setup & Query BCP out vs RTC

Prerequisites
- An instance of SQL Server (64-bit), Enterprise Edition or Developer Edition.
- Microsoft .NET Framework 4.5.
- Oracle Java SE Runtime Environment (JRE) version 7.51 or higher (64-bit). Either JRE or Server JRE will work; see the Java SE downloads page.
- Minimum memory: 4 GB
- Minimum hard disk space: 2 GB
- TCP/IP connectivity must be enabled.
Note: The installer will fail if JRE is not present.

Step 2: Install SQL Server
- Install one or more SQL Server instances with PolyBase
- PolyBase DLLs (Engine and DMS) are installed and registered as Windows Services
- Prerequisite: User must download and install JRE (Oracle)
[Diagram labels: SQL16, PolyBase DLLs]
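After setup, a quick way to confirm that the PolyBase feature was actually installed on an instance is to query the corresponding server property. This is a small check that is not on the original slides:

```sql
-- Returns 1 when the PolyBase feature is installed on this instance, 0 when it is not.
SELECT SERVERPROPERTY('IsPolyBaseInstalled') AS IsPolyBaseInstalled;
```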

Components introduced in SQL Server 2016
- PolyBase Engine Service
- PolyBase Data Movement Service (with HDFS Bridge)
- External table constructs
- MR pushdown computation support

How to use PolyBase in SQL Server 2016
- Set up a Hadoop Cluster or Azure Storage blob
- Install SQL Server
- Configure a PolyBase group
- Choose Hadoop flavor
- Attach Hadoop Cluster or Azure Storage
[Diagram labels: Head nodes (PolyBase T-SQL queries are submitted here), Compute nodes, Hadoop Cluster. PolyBase queries can only refer to tables on the head nodes and/or external tables in the Hadoop Cluster.]

Step 1: Set up a Hadoop Cluster…
- Hortonworks or Cloudera distributions
- Hadoop 2.0 or above
- Linux or Windows
- On-premises or in Azure

Step 1: …Or set up an Azure Storage blob
- Azure Storage blob (ASB) exposes an HDFS layer
- PolyBase reads and writes from ASB using Hadoop RecordReader/RecordWriter
- No compute pushdown support for ASB
[Diagram label: Azure Storage Volume]

Step 2: Configure a PolyBase group
- Head node is the SQL Server instance to which queries are submitted
- Compute nodes are used for scale-out query processing for data in HDFS or Azure
[Diagram labels: SQL16, PolyBase Engine, PolyBase DMS, Head node, Compute nodes, PolyBase scale-out group]
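As a rough sketch of how a compute node is attached to a head node, the system procedure sp_polybase_join_group can be run on the instance that should join the group. The machine name HEADNODE01 and the instance name MSSQLSERVER below are placeholders, not values from the talk:

```sql
-- Run on each instance that should become a COMPUTE node.
-- N'HEADNODE01' and N'MSSQLSERVER' are placeholder names (assumptions);
-- 16450 is the default DMS control channel port.
EXEC sp_polybase_join_group N'HEADNODE01', 16450, N'MSSQLSERVER';

-- Restart the PolyBase Engine and PolyBase Data Movement services on that machine,
-- then run this on the head node to list the members of the group:
SELECT * FROM sys.dm_exec_compute_nodes;

-- To turn a compute node back into a standalone head node:
-- EXEC sp_polybase_leave_group;
```

Because joining and leaving are plain procedure calls plus a service restart, the DBA can grow or shrink the group without touching the external tables themselves.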

Step 3: Choose/Select Hadoop flavor
Supported Hadoop distributions:
- Cloudera CDH 5.x on Linux
- Hortonworks 2.x on Linux and Windows Server
What happens under the covers? Loading the right client jars to connect to the Hadoop distribution
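In SQL Server 2016 the flavor is selected through the 'hadoop connectivity' server option. The sketch below assumes a Hortonworks cluster or Azure Blob Storage; the exact value-to-distribution mapping should be checked against the documentation for the build in use:

```sql
-- Show the current setting.
EXEC sp_configure @configname = 'hadoop connectivity';

-- 7 is the value documented for Hortonworks (HDP) and Azure Blob Storage,
-- 6 the value documented for Cloudera CDH 5.x on Linux.
EXEC sp_configure @configname = 'hadoop connectivity', @configvalue = 7;
RECONFIGURE;

-- The change only takes effect after restarting SQL Server and the two
-- PolyBase services (PolyBase Engine and PolyBase Data Movement).
```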

Step 4: Attach Hadoop Cluster or Azure Storage
[Diagram labels: SQL16, PolyBase Engine, PolyBase DMS, Head node, Azure Storage Volume, Hadoop Cluster]
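The configuration slides in this deck attach Azure Blob Storage. For a Hadoop cluster the attachment is also an external data source, sketched here with a made-up name node address and data source name:

```sql
-- Hypothetical address and name; replace with your own name node and resource manager.
CREATE EXTERNAL DATA SOURCE MyHadoopCluster
WITH (
    TYPE = HADOOP,
    LOCATION = 'hdfs://10.10.10.10:8020',            -- HDFS name node (8020 is a common port)
    -- Optional. Without it PolyBase cannot push computation down to the cluster.
    -- 8050 is typical for Hortonworks, 8032 for Cloudera.
    RESOURCE_MANAGER_LOCATION = '10.10.10.10:8050'
);
```

Leaving out RESOURCE_MANAGER_LOCATION still allows queries, but every row is then moved into SQL Server before filtering or aggregation.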

After Setup
- Compute nodes are used for scale-out query processing on external tables in HDFS
- Tables on compute nodes cannot be referenced by queries submitted to head node
- Number of compute nodes can be dynamically adjusted by DBA
- Hadoop clusters can be shared between multiple SQL16 PolyBase groups
- Improved PolyBase query performance with scale-out computation on external data (PolyBase scale-out groups)
- Improved PolyBase query performance with faster data movement from HDFS to SQL Server and between PolyBase Engine and SQL Server
[Diagram labels: Head nodes (PolyBase T-SQL queries are submitted here), Compute nodes, Hadoop Cluster. PolyBase queries can only refer to tables on the head nodes and/or external tables in the Hadoop Cluster.]

Polybase configuration
--1: Create a master key on the database.
-- Required to encrypt the credential secret.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'SQLSat#551';
-- select * from sys.symmetric_keys
-- Create a database scoped credential for Azure blob storage.
-- IDENTITY: any string (this is not used for authentication to Azure storage).
-- SECRET: your Azure storage account key.
CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential
WITH IDENTITY = 'wasbuser',
     SECRET = '1abcdEFGb3Mcn0F9UdJS/10taXmr5L17xrEO17rlMRL8SNYg==';

Create external Data Source
--2: Create an external data source.
-- LOCATION: Azure storage account name and blob container name.
-- CREDENTIAL: The database scoped credential created above.
CREATE EXTERNAL DATA SOURCE AzureStorage
WITH (
    TYPE = HADOOP,
    LOCATION = 'wasbs://staging@vault2016.blob.core.windows.net',
    CREDENTIAL = AzureStorageCredential
);
-- View the list of external data sources:
SELECT * FROM sys.external_data_sources;

Create External file format
-- select * from sys.external_file_formats
--3: Create an external file format.
-- FORMAT_TYPE: Type of format in Hadoop
-- (DELIMITEDTEXT, RCFILE, ORC, PARQUET).
-- With GZIP:
CREATE EXTERNAL FILE FORMAT TextDelimited_GZIP
WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (FIELD_TERMINATOR = '|', USE_TYPE_DEFAULT = TRUE),
    DATA_COMPRESSION = 'org.apache.hadoop.io.compress.GzipCodec'
);

Create External Table
--4: Create an external table.
-- The external table points to data stored in Azure storage.
-- LOCATION: path to a file or directory that contains the data (relative to the blob container).
-- To point to all files under the blob container, use LOCATION='/'
-- Note: the file format TextFileFormat is not defined on these slides; it must exist before this statement runs.
CREATE EXTERNAL TABLE [dbo].[lineitem4] (
    [ROWID1] [bigint] NULL,
    [L_SHIPDATE] [smalldatetime] NOT NULL,
    [L_ORDERKEY] [bigint] NOT NULL,
    [L_DISCOUNT] [smallmoney] NOT NULL,
    ..
    [L_COMMENT] [varchar](44) NOT NULL
)
WITH (
    LOCATION = '/',
    DATA_SOURCE = AzureStorage,
    FILE_FORMAT = TextFileFormat,
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 0
);

Import
-- IMPORT data from WASB into a NEW table:
SELECT *
INTO [dbo].[LINEITEM_MO_final_temp]
FROM (
    SELECT * FROM [dbo].[lineitem1]
) AS Import;

Export data (Gzipped)
-- Enable export / INSERT into external table
EXEC sp_configure 'allow polybase export', 1;
RECONFIGURE;
CREATE EXTERNAL TABLE [dbo].[lineitem_export] (
    [ROWID1] [bigint] NULL,
    ..
    [L_SHIPINSTRUCT] [varchar](25) NOT NULL,
    [L_COMMENT] [varchar](44) NOT NULL
)
WITH (
    LOCATION = '/gzipped',
    DATA_SOURCE = AzureStorage,
    FILE_FORMAT = TextDelimited_GZIP,
    REJECT_TYPE = VALUE,
    REJECT_VALUE = 0
);
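The slide stops at the table definition; the export itself is an INSERT into the external table. A minimal sketch, assuming the local table filled on the Import slide has the same column list as the external table (the slides elide most columns, so this is an assumption):

```sql
-- Assumption: [dbo].[LINEITEM_MO_final_temp] matches the column list of
-- [dbo].[lineitem_export], and 'allow polybase export' is already set to 1.
INSERT INTO [dbo].[lineitem_export]
SELECT *
FROM [dbo].[LINEITEM_MO_final_temp];

-- The exported rows can be read back through the same external table:
SELECT COUNT_BIG(*) AS exported_rows
FROM [dbo].[lineitem_export];
```

The same pattern serves the age-out use case: rows written this way land as gzipped delimited files under /gzipped and remain queryable with T-SQL.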

Manage External resources
SSMS / VSTS, new:
- External Tables
- External Resources
  - Ext. Data Sources
  - Ext. File formats

PolyBase query example #1
-- select on external table (data in HDFS)
SELECT * FROM Customer WHERE c_nationkey = 3 and c_acctbal < 0;
A possible execution plan:
1. CREATE temp table T: Execute on compute nodes
2. IMPORT FROM HDFS: HDFS Customer file read into T
3. EXECUTE QUERY: Select * from T where T.c_nationkey = 3 and T.c_acctbal < 0
Additionally, there is:
- Support for exporting data to external data source via INSERT INTO EXTERNAL TABLE SELECT FROM TABLE
- Support for push-down computation to Hadoop for string operations (compare, LIKE)
- Support for ALTER EXTERNAL DATA SOURCE statement

PolyBase query example #2
-- select and aggregate on external table (data in HDFS)
SELECT AVG(c_acctbal) FROM Customer WHERE c_acctbal < 0 GROUP BY c_nationkey;
What happens here?
Step 1: QO compiles predicate into Java and generates a MapReduce (MR) job
Step 2: Engine submits MR job to Hadoop cluster. Output left in hdfsTemp.
Execution plan:
1. Run MR Job on Hadoop: Apply filter and compute aggregate on Customer.
hdfsTemp: <US, $-975.21> <FRA, $-119.13> <UK, $-63.52>

PolyBase query example #2
-- select and aggregate on external table (data in HDFS)
SELECT AVG(c_acctbal) FROM Customer WHERE c_acctbal < 0 GROUP BY c_nationkey;
- Predicate and aggregate pushed into Hadoop cluster as a MapReduce job
- Query optimizer makes a cost-based decision on what operators to push
Execution plan:
1. Run MR Job on Hadoop: Apply filter and compute aggregate on Customer. Output left in hdfsTemp: <US, $-975.21> <FRA, $-119.13> <UK, $-63.52>
2. CREATE temp table T: On DW compute nodes
3. IMPORT hdfsTEMP: Read hdfsTemp into T
4. RETURN OPERATION: Select * from T
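The optimizer's cost-based choice can be overridden per query with a hint, which is handy for comparing both plans on the same data. A sketch, assuming Customer is an external table on a Hadoop data source that was created with a resource manager location (pushdown is not available for Azure Blob Storage):

```sql
-- Ask for the filter and aggregate to run as a MapReduce job on the Hadoop cluster:
SELECT c_nationkey, AVG(c_acctbal) AS avg_acctbal
FROM Customer
WHERE c_acctbal < 0
GROUP BY c_nationkey
OPTION (FORCE EXTERNALPUSHDOWN);

-- Stream the rows into SQL Server and do all the work there instead:
SELECT c_nationkey, AVG(c_acctbal) AS avg_acctbal
FROM Customer
WHERE c_acctbal < 0
GROUP BY c_nationkey
OPTION (DISABLE EXTERNALPUSHDOWN);
```

Which variant is faster depends on data volume and cluster load: a MapReduce job has noticeable startup cost, so pushdown tends to pay off only when it removes a large share of the rows before they are moved.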

Summary: PolyBase
Query relational and non-relational data with T-SQL
Capability: T-SQL for querying relational and non-relational data across SQL Server and Hadoop
Benefits:
- New business insights across your data lake
- Leverage existing skill sets and BI tools
- Faster time to insights and simplified ETL process
Query relational and non-relational data, on-premises and in Azure
When it comes to key BI investments, we are making it much easier to manage relational and non-relational data. PolyBase technology allows you to query Hadoop data and SQL Server relational data through a single T-SQL query. One of the challenges we see with Hadoop is that there are not enough people knowledgeable in Hadoop and MapReduce, and this technology simplifies the skill set needed to manage Hadoop data. This can also work across your on-premises environment or SQL Server running in Azure.
[Diagram labels: T-SQL query, SQL Server, Hadoop, Apps]

Monitoring Polybase Queries

Lots of new DMV’s
-- Monitoring Polybase / All DMV's:
SELECT * FROM sys.external_tables;
SELECT * FROM sys.external_data_sources;
SELECT * FROM sys.external_file_formats;
SELECT * FROM sys.dm_exec_compute_node_errors;
SELECT * FROM sys.dm_exec_compute_node_status;
SELECT * FROM sys.dm_exec_compute_nodes;
SELECT * FROM sys.dm_exec_distributed_request_steps;
SELECT * FROM sys.dm_exec_dms_services;
SELECT * FROM sys.dm_exec_distributed_requests;
SELECT * FROM sys.dm_exec_distributed_sql_requests;
SELECT * FROM sys.dm_exec_dms_workers;
SELECT * FROM sys.dm_exec_external_operations;
SELECT * FROM sys.dm_exec_external_work;

Find the longest running query
SELECT execution_id, st.text, dr.total_elapsed_time
FROM sys.dm_exec_distributed_requests dr
CROSS APPLY sys.dm_exec_sql_text(sql_handle) st
ORDER BY total_elapsed_time DESC;

Find the longest running step of the distributed query plan
SELECT execution_id, step_index, operation_type, distribution_type,
       location_type, status, total_elapsed_time, command
FROM sys.dm_exec_distributed_request_steps
WHERE execution_id = 'QID1120'
ORDER BY total_elapsed_time DESC;

Details on a Step_index
SELECT execution_id, step_index, dms_step_index, compute_node_id,
       type, input_name, length, total_elapsed_time, status
FROM sys.dm_exec_external_work
WHERE execution_id = 'QID1120' and step_index = 7
ORDER BY total_elapsed_time DESC;
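The first two monitoring queries can be combined so that no execution id such as 'QID1120' has to be copied by hand. This is a sketch using the same DMVs as the slides; it is not part of the original deck:

```sql
-- Steps of the slowest distributed request, slowest step first.
SELECT dr.execution_id,
       rs.step_index,
       rs.operation_type,
       rs.location_type,
       rs.status,
       rs.row_count,
       rs.total_elapsed_time
FROM sys.dm_exec_distributed_request_steps AS rs
JOIN (
    -- Pick the single longest running distributed request.
    SELECT TOP (1) execution_id
    FROM sys.dm_exec_distributed_requests
    ORDER BY total_elapsed_time DESC
) AS dr
  ON dr.execution_id = rs.execution_id
ORDER BY rs.total_elapsed_time DESC;
```

The step_index of the top row is the value to plug into the sys.dm_exec_external_work query for per-node detail.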

Optimizations

Polybase - data compression to minimize data movement http://henkvandervalk.com/aps-polybase-for-hadoop-and-windows-azure-blob-storage-wasb-integration

Enable Pushdown configuration (Hadoop)
Improves query performance.
1. Find the file yarn-site.xml in the installation path of SQL Server: C:\Program Files\Microsoft SQL Server\MSSQL13.SQL2016RTM\MSSQL\Binn\Polybase\Hadoop\conf\yarn-site.xml
2. On the Hadoop machine, find the same file in the Hadoop configuration directory. Copy the value of the configuration key yarn.application.classpath.
3. On the SQL Server machine, in the yarn-site.xml file, find the yarn.application.classpath property. Paste the value from the Hadoop machine into the value element.

Time to Insights
APS Cybercrime video & demo!
Various sources, single query

Further Reading
Get started with Polybase: https://msdn.microsoft.com/en-us/library/mt163689.aspx
Data compression tests: http://henkvandervalk.com/aps-polybase-for-hadoop-and-windows-azure-blob-storage-wasb-integration

Q&A
Henk.vanderValk@microsoft.com
www.henkvandervalk.com

Please fill in the evaluation forms