Azure SQL DWH: Tips and Tricks for developers

Presentation transcript:

Azure SQL DWH: Tips and Tricks for developers
Sergiy Lunyakin

Sponsors!

About me
- I’m Ukrainian
- DWH/BI Consultant at ITMagination
- Data Platform MVP, MCSE BI, MCSA Cloud Platform
- Leader of
- Speaker at SQL Conferences
- Organizer of SQLSaturday Lwow
- Contacts: sergey.lunyakin@gmail.com, @slunyakin

Agenda
- What is Azure SQL DW
- Architecture of Azure SQL DW
- Limitations
- Check compatibility
- Handling cross-database queries
- Handling Identity
- Handling ANSI joins in Update/Delete/Merge/SCD
- Handling computed columns
- Handling cursors

What is Azure SQL DW
- A Platform as a Service (PaaS) offering in Microsoft Azure
- A Massively Parallel Processing (MPP) system
- Distributed compute and distributed storage
- Scales up and down in several minutes
- Compute resources can be paused
- Supports a subset of T-SQL
- Can join with external data in Azure Blob Storage/Data Lake

Architecture of Azure SQL DW
(Diagram: distribution databases Dist_DB_1 through Dist_DB_60.)

Logical Overview
(Diagram: Control, Compute, Storage.)
Microsoft Build 2016. © 2016 Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Distributions
- Distribution: a SQL database that stores one or more distributed tables
- The data of a table is split into 60 buckets spread across the compute nodes
- Hash distributed table *
- Round-robin distributed table *
- Replicated table (new type of table) *
* Selecting the right distribution method is key for good performance
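
The three table types above could be declared roughly as follows. This is a minimal sketch: the table names, columns and the choice of CustomerKey as the hash column are assumptions made for illustration, not taken from the slides.

```sql
-- Hypothetical tables; names and columns are illustrative only.

-- Large fact table: each row is assigned to one of the 60 distributions
-- by hashing the chosen column.
CREATE TABLE dbo.FactSales_Example
(
    SaleKey      BIGINT         NOT NULL,
    CustomerKey  INT            NOT NULL,
    Amount       DECIMAL(18, 2) NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(CustomerKey),
    CLUSTERED COLUMNSTORE INDEX
);

-- Staging table: rows are spread evenly, with no distribution column.
CREATE TABLE dbo.StageSales_Example
(
    SaleKey      BIGINT         NOT NULL,
    CustomerKey  INT            NOT NULL,
    Amount       DECIMAL(18, 2) NOT NULL
)
WITH
(
    DISTRIBUTION = ROUND_ROBIN,
    HEAP
);

-- Small dimension: a full copy is made available to every compute node.
CREATE TABLE dbo.DimCountry_Example
(
    CountryKey   INT           NOT NULL,
    CountryName  NVARCHAR(100) NOT NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,
    CLUSTERED INDEX (CountryKey)
);
```

A hash column with few distinct values or heavy skew tends to leave some distributions much larger than others, which is one reason the choice of distribution method matters for performance.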

Distributed queries
(Diagram: Query, Result, Control, Compute, Storage.)
Scale-out distributed query engine

Distributed Query
Control node:
    SELECT COUNT_BIG(*) FROM dbo.[FactInternetSales] ;
    The control node then sums the partial counts returned by the compute nodes. The slide shows this step schematically as SELECT SUM(*) FROM dbo.[FactInternetSales] ; which is not valid T-SQL on its own.
Compute nodes (each runs the query against its own distributions):
    SELECT COUNT_BIG(*) FROM dbo.[FactInternetSales] ;

Limitations
- Primary/Foreign Keys
- Identity
- Computed Columns
- Triggers
- Cross-database joins
- Sequences
- Cursors
- MERGE
- ANSI joins on updates/deletes
More limitations:
https://azure.microsoft.com/en-us/documentation/articles/sql-data-warehouse-migrate-code/
https://feedback.azure.com/forums/307516-sql-data-warehouse

Check compatibility
Data warehouse migration utility
- Free tool
- Helps to identify unsupported features
- Helps to identify the HASH distribution column
- Migrates schema
- Migrates data (BCP tool)

Cross-database query
- Azure SQL DW doesn’t support cross-database queries.
- Use an ELT approach.
- Use separate schemas.
- Use external tables as staging tables.
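
As a sketch of the "separate schemas" approach: objects that lived in different databases on SQL Server are placed in different schemas of the single SQL DW database, so a former cross-database join becomes a cross-schema join. The schema names stg and dw and both tables are hypothetical and assumed to have been created and loaded already.

```sql
-- Hypothetical schemas standing in for the former StagingDB and DWH databases.
-- GO is the batch separator of client tools such as SSMS or sqlcmd;
-- CREATE SCHEMA has to be the only statement in its batch.
CREATE SCHEMA stg;
GO
CREATE SCHEMA dw;
GO

-- On SQL Server this might have been:
--   SELECT ... FROM StagingDB.dbo.Customer s JOIN DWH.dbo.DimCustomer d ON ...
-- Inside one SQL DW database the same join uses two-part names only.
SELECT  s.CustomerId,
        d.CustomerKey
FROM    stg.Customer   AS s
JOIN    dw.DimCustomer AS d
        ON d.CustomerId = s.CustomerId;
```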

CTAS
- CTAS (CREATE TABLE AS SELECT) is a super-charged version of SELECT...INTO
- Parallelized
- Better for:
  - Data import
  - Data copy
  - Workarounds

    CREATE TABLE [dbo].[FactInternetSales_new]
    WITH
    (
        DISTRIBUTION = ROUND_ROBIN
    ,   CLUSTERED COLUMNSTORE INDEX
    )
    AS
    SELECT *
    FROM [dbo].[FactInternetSales];
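
A common follow-up to a CTAS copy is to swap the new table in place of the original. The sketch below assumes the CTAS statement from the slide has already run and that nothing else depends on the old table; the _old suffix is an arbitrary naming choice.

```sql
-- Move the original table aside, then give the CTAS result its name.
RENAME OBJECT dbo.FactInternetSales     TO FactInternetSales_old;
RENAME OBJECT dbo.FactInternetSales_new TO FactInternetSales;

-- Once the new table has been checked, the old copy can be dropped.
DROP TABLE dbo.FactInternetSales_old;
```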

Identity
- Handle it on the source side
- IDENTITY property
  - Explicit import
  - Not supported in CTAS
- Custom identity with ROW_NUMBER
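
A custom identity with ROW_NUMBER could look roughly like this: new surrogate keys continue from the highest key already in the target table. The tables dbo.DimCustomer and stg.Customer, their columns, and the BIGINT key type are assumptions for illustration.

```sql
-- Hypothetical tables: dbo.DimCustomer (target, CustomerKey BIGINT)
-- and stg.Customer (staged source rows).
INSERT INTO dbo.DimCustomer (CustomerKey, CustomerId, CustomerName)
SELECT  m.MaxKey + ROW_NUMBER() OVER (ORDER BY s.CustomerId) AS CustomerKey,
        s.CustomerId,
        s.CustomerName
FROM    stg.Customer AS s
CROSS JOIN
        (   -- Highest key currently in use; 0 when the target is empty.
            SELECT ISNULL(MAX(CustomerKey), 0) AS MaxKey
            FROM   dbo.DimCustomer
        ) AS m
WHERE   NOT EXISTS
        (   -- Only rows that are not in the dimension yet get a new key.
            SELECT 1
            FROM   dbo.DimCustomer AS d
            WHERE  d.CustomerId = s.CustomerId
        );
```

This sketch assumes a single loader at a time; two concurrent loads reading the same maximum key would generate overlapping values.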

ANSI joins in Update/Delete/Merge
- UPDATE and DELETE don’t support ANSI joins in the FROM clause
- Use CTAS to prepare an interim table with the joins
- Use CTAS as a MERGE workaround
- Split the MERGE into separate operation steps and combine them with UNION ALL
- Use an interim table when there is a large number of steps
- Use partitioning for big tables; don’t reload the whole table
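
The two workarounds above can be sketched as follows. All table and column names (dbo.CategorySales, dbo.DimProduct, stg.DimProduct and so on) are hypothetical; the pattern, not the schema, is the point.

```sql
-- 1) UPDATE that needs a join: do the join in CTAS first.
CREATE TABLE dbo.CategorySales_interim
WITH (DISTRIBUTION = ROUND_ROBIN)
AS
SELECT  p.CategoryName,
        SUM(f.SalesAmount) AS TotalSalesAmount
FROM    dbo.FactInternetSales AS f
JOIN    dbo.DimProduct        AS p
        ON p.ProductKey = f.ProductKey
GROUP BY p.CategoryName;

-- The UPDATE then uses an implicit join (FROM + WHERE) instead of ANSI JOIN.
UPDATE  dbo.CategorySales
SET     TotalSalesAmount = dbo.CategorySales_interim.TotalSalesAmount
FROM    dbo.CategorySales_interim
WHERE   dbo.CategorySales_interim.CategoryName = dbo.CategorySales.CategoryName;

DROP TABLE dbo.CategorySales_interim;

-- 2) MERGE (upsert) replaced by CTAS with UNION ALL.
CREATE TABLE dbo.DimProduct_upsert
WITH (DISTRIBUTION = HASH(ProductKey), CLUSTERED INDEX (ProductKey))
AS
-- New rows and new versions of existing rows come from staging.
SELECT  s.ProductKey, s.ProductName, s.Color
FROM    stg.DimProduct AS s
UNION ALL
-- Existing rows that staging does not touch are kept as they are.
SELECT  p.ProductKey, p.ProductName, p.Color
FROM    dbo.DimProduct AS p
WHERE   NOT EXISTS
        (SELECT 1 FROM stg.DimProduct AS s WHERE s.ProductKey = p.ProductKey);

-- Swap the rebuilt table in place of the original.
RENAME OBJECT dbo.DimProduct        TO DimProduct_old;
RENAME OBJECT dbo.DimProduct_upsert TO DimProduct;
```

For a big partitioned table the same CTAS can be limited to the affected partition and switched in, rather than rebuilding the whole table.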

Computed columns
- Handle them in the source system
- Use CTAS during import
- Create a view
- Use an explicit data type and nullability check in your calculation expressions; otherwise you risk:
  - Wrong data during migration
  - Schema errors during partition switch
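
A minimal sketch of materializing a computed column with CTAS while pinning its data type and nullability. The column names and the DECIMAL(19, 4) target type are assumptions; the type should match whatever the column is in the table you later switch partitions with.

```sql
-- Hypothetical computed column LineTotal = OrderQuantity * UnitPrice.
CREATE TABLE dbo.FactInternetSales_calc
WITH
(
    DISTRIBUTION = HASH(ProductKey),
    CLUSTERED COLUMNSTORE INDEX
)
AS
SELECT  ProductKey,
        OrderQuantity,
        UnitPrice,
        -- CAST fixes the data type instead of letting CTAS infer it;
        -- ISNULL as the outermost function makes the column NOT NULL.
        ISNULL(CAST(OrderQuantity * UnitPrice AS DECIMAL(19, 4)), 0) AS LineTotal
FROM    dbo.FactInternetSales;
```

Without the CAST, the inferred precision and scale can differ from the original column, which is how rounding differences and partition-switch schema mismatches typically arise.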

Cursor
- Use WHILE for looping
- Prepare the list of elements as a table
- Loop through this list using a WHILE loop and a variable
- Perform the action for each element
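
The steps above can be sketched like this. The action performed per element (counting the rows of each user table) is an arbitrary placeholder, and the temporary table name #work_list is hypothetical.

```sql
-- Step 1: prepare the list of elements as a table with a sequence number.
CREATE TABLE #work_list
WITH (DISTRIBUTION = ROUND_ROBIN)
AS
SELECT  ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS Sequence,
        'SELECT COUNT_BIG(*) FROM '
            + QUOTENAME(s.name) + '.' + QUOTENAME(t.name) AS sql_code
FROM    sys.tables  AS t
JOIN    sys.schemas AS s
        ON s.schema_id = t.schema_id;

-- Step 2: loop through the list with WHILE and a counter variable.
DECLARE @nbr_statements INT = (SELECT COUNT(*) FROM #work_list),
        @i              INT = 1;

WHILE @i <= @nbr_statements
BEGIN
    -- Step 3: fetch the current element and do some action with it.
    DECLARE @sql_code NVARCHAR(4000) =
        (SELECT sql_code FROM #work_list WHERE Sequence = @i);

    EXEC sp_executesql @sql_code;

    SET @i += 1;
END

DROP TABLE #work_list;
```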

Summary
- MPP PaaS service in the Azure cloud
- Stores and processes huge amounts of structured data
- Limitations: Identity, ANSI joins, MERGE
- CTAS is a super-charged version of SELECT...INTO
- CTAS is a good way to implement workarounds
- Reloading data with CTAS is better than row-by-row operations

The end