Welcome to SQL Saturday Denmark

Presentation on theme: "Welcome to SQL Saturday Denmark"— Presentation transcript:

1 Welcome to SQL Saturday Denmark
Azure Data Lake

2 Thanks to our PLATINUM sponsors

3 Thanks to our GOLD and SILVER sponsors

4 About me
Kenneth M. Nielsen
Worked with SQL Server since 1999
Data Solution Architect at Microsoft
@doktorkermit
Linkedin.com/in/KennethMNielsen

5 Agenda
Azure Data Lake Store
Azure Data Lake Analytics
Azure Data Lake Analytics – Using Visual Studio
Azure Data Lake Analytics – Using PowerShell
Q & A

6 Data Lake Store

7 Azure Data Lake Store
A hyperscale repository for big data analytics workloads
No limits to SCALE
Store ANY DATA in its native format
HADOOP FILE SYSTEM (HDFS) for the cloud
ENTERPRISE READY access control, encryption at rest
Optimized for analytic workload PERFORMANCE
Data Lake Store is your new friend for storing data, in almost unlimited amounts, and storing data on Azure costs next to nothing. Any file format is supported and data is stored in its native format, meaning that you can store images, JSON, tables, CSV, TSV, blobs and so on. It is built on HDFS; in effect it is HDFS for the cloud.

8 Azure Data Lake Store
Any Data
Unstructured
Semi-structured
Structured

9 Azure Data Lake Store

10 Azure Data Lake Store
HDFS for the cloud
New file system built from the ground up, based on the Hadoop file system
Integrates with HDInsight, Hortonworks and Cloudera
Supports file and folder objects and operations
Support for renaming, creating and deleting files and folders
Microsoft Azure Data Lake Store is a Hadoop file system that’s compatible with Hadoop Distributed File System (HDFS) and works with the Hadoop ecosystem. Data Lake Store is integrated with Azure Data Lake Analytics and Azure HDInsight and will be integrated with Microsoft offerings like Revolution-R Enterprise; industry-standard distributions like Hortonworks, Cloudera, and MapR; and individual Hadoop projects like Spark, Storm, Flume, Sqoop, and Kafka.

11 Azure Data Lake Store
Unlimited storage
File sizes can range from gigabytes to petabytes
No limits to scale
Data Lake Store has no fixed limits on account size or file size. While other cloud storage offerings might restrict individual file sizes to a few terabytes, Data Lake Store can store very large files that are hundreds of times larger. At the same time, it provides very low latency read/write access and high throughput for scenarios like high-resolution video, scientific, medical, large backup data, event streams, web logs, and Internet of Things (IoT). Collect and store everything in Data Lake Store without restriction or prior understanding of business requirements.

12 Azure Data Lake Store
Security
Integrates with Azure Active Directory
Audit logs for all operations*
Server-side encryption*
ACL on files and folders*
Enterprise-ready security when in GA
The access control list currently applies only at root level: a user who is granted access to a root folder has access to everything in that root. This will change when the service goes into GA.

13 Data Lake Analytics

14 Azure Data Lake Analytics
An elastic analytics service built on Apache YARN that processes all data, at any size
No limits to SCALE
Includes U-SQL, a language that unifies the benefits of SQL with the expressive power of C#
Optimized to work with ADL STORE
FEDERATED QUERY across Azure data sources
ENTERPRISE READY role-based access control & auditing
Pay PER JOB & scale PER JOB

15 U-SQL
A new language for Big Data
Familiar syntax to millions of SQL & .NET developers
Unifies declarative nature of SQL with the imperative power of C#
Unifies structured, semi-structured and unstructured data
Distributed query support over all data

16 Language Overview
U-SQL Fundamentals
All the familiar SQL clauses: SELECT | FROM | WHERE | GROUP BY | JOIN | OVER
Operate on unstructured and structured data
Relational metadata objects
.NET integration and extensibility
U-SQL expressions are full C# expressions
Reuse .NET code in your own assemblies
Use C# to define your own: Types | Functions | Joins | Aggregators | I/O (Extractors, Outputters)

17 U-SQL Capabilities
Batch: AVAILABLE NOW
Interactive: IN PROGRESS
Streaming: FUTURE
Machine Learning: FUTURE

18 U-SQL Distributed Query
Azure Data Lake Store: READ, WRITE
Azure Storage Blobs: READ, WRITE
Azure SQL Database: READ, WRITE
Azure SQL Data Warehouse: READ, WRITE
Azure SQL DB in Azure VM: READ, WRITE

19 Read the input, write it directly to output (just a simple copy)
// Rowset: apply schema on read, from a file in a Data Lake
@orders =
    EXTRACT OrderId int,
            Customer string,
            Date DateTime,
            Amount float
    FROM "/input/orders.txt"
    USING Extractors.Tsv();   // easy delimited text handling

// Write out
OUTPUT @orders
TO "/output/orders_copy.txt"
USING Outputters.Tsv();
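
The schema-on-read idea in this slide can also be illustrated outside U-SQL. The following is a minimal Python sketch, not U-SQL and not part of the original deck: it assumes a tab-separated input with the four columns named in the slide, applies the types while reading, and writes the rows back out as TSV. The function names `extract_tsv` and `output_tsv`, the type mapping, and the sample rows are all hypothetical.

```python
import csv
import io
from datetime import datetime

# Column names and types taken from the slide's EXTRACT clause.
# The Python converters are an assumption about how each type maps.
SCHEMA = [
    ("OrderId", int),
    ("Customer", str),
    ("Date", datetime.fromisoformat),
    ("Amount", float),
]


def extract_tsv(text):
    """Parse tab-separated text, applying SCHEMA to each row as it is read."""
    rows = []
    for raw in csv.reader(io.StringIO(text), delimiter="\t"):
        if not raw:
            continue  # skip blank lines
        rows.append(
            {name: convert(value) for (name, convert), value in zip(SCHEMA, raw)}
        )
    return rows


def output_tsv(rows):
    """Write typed rows back out as tab-separated text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow(
            [row["OrderId"], row["Customer"], row["Date"].isoformat(), row["Amount"]]
        )
    return buffer.getvalue()


# Hypothetical sample data standing in for /input/orders.txt.
sample = (
    "1\tContoso\t2016-01-01T10:00:00\t19.5\n"
    "2\tFabrikam\t2016-01-01T11:30:00\t7.25\n"
)
orders = extract_tsv(sample)
copy = output_tsv(orders)
```

The file itself carries no schema; the types exist only in the reading code, which is the point the slide makes with EXTRACT.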

20 Azure Data Lake Pattern
[Diagram. Labels: Azure Services; Data Engineer; Data Scientist; Data Analyst; Data Science VM; Visual Studio; ADL Storage; ADL Analytics; AML Experiment; Power BI Desktop; Tweets; Upload Dataset; Get Data From CSV. Note on Azure Storage: where CAQS files are stored, but would load into ADLS directly if ingesting from scratch.]

21 Execution with Requested Parallelism
Requested Parallelism = 1 (reserve enough to do 1 vertex at a time)
Requested Parallelism = 4 (reserve enough to do 4 vertices at a time)
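
As a rough illustration of what requested parallelism means for a job, here is a minimal Python sketch. It assumes a deliberately simplified model that is not from the deck: all vertices are independent and take the same time, so the job runs in rounds of at most `requested_parallelism` vertices. Real jobs have stages with dependencies, so this is only a back-of-the-envelope estimate. The function name `waves_needed` and the vertex count of 8 are hypothetical.

```python
import math


def waves_needed(vertex_count, requested_parallelism):
    """Number of sequential rounds needed when at most
    `requested_parallelism` vertices can run at the same time."""
    if requested_parallelism < 1:
        raise ValueError("requested_parallelism must be at least 1")
    return math.ceil(vertex_count / requested_parallelism)


# Hypothetical stage with 8 independent vertices.
serial_rounds = waves_needed(8, 1)    # one vertex at a time
parallel_rounds = waves_needed(8, 4)  # four vertices at a time
```

Under this model, going from parallelism 1 to 4 cuts the number of rounds from 8 to 2; requesting more parallelism than there are vertices gives no further gain.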

22 AVG Vertex execution time
Stage Details
252 pieces of work
AVG vertex execution time
4.3 billion rows
Data read & written

23 ADLAUs
Azure Data Lake Analytics Unit
Parallelism N = N ADLAUs
1 ADLAU ~= a VM with 2 cores and 6 GB of memory
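
The arithmetic on this slide can be written out as a short Python sketch. The two constants come straight from the slide; the function name `reserved_resources` is hypothetical, and the result is only the approximate capacity implied by the slide's figures.

```python
CORES_PER_ADLAU = 2      # from the slide: 1 ADLAU ~= 2 cores
MEMORY_GB_PER_ADLAU = 6  # from the slide: 1 ADLAU ~= 6 GB of memory


def reserved_resources(parallelism):
    """Approximate cores and memory reserved for a job at the given parallelism."""
    adlaus = parallelism  # from the slide: parallelism N = N ADLAUs
    return {
        "adlaus": adlaus,
        "cores": adlaus * CORES_PER_ADLAU,
        "memory_gb": adlaus * MEMORY_GB_PER_ADLAU,
    }


resources = reserved_resources(4)
```

For a requested parallelism of 4 this gives 4 ADLAUs, or roughly 8 cores and 24 GB of memory.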

24 Visual Studio Data Lake Analytics

25 Azure Data Lake – Visual Studio
Available project types
U-SQL project, where you write your statements
U-SQL sample project, an extensive project that you can run on your own account to get up to speed on the topic quickly
U-SQL unit testing project

26 Azure Data Lake – Visual Studio
Fully integrates with Solution Explorer

27 Azure Data Lake – Visual Studio
Monitor and manage jobs
Browse and manage storage
Browse U-SQL catalog
Integrates seamlessly with Server Explorer

28 Creating U-SQL

29 Creating U-SQL
IntelliSense supported

30 Creating U-SQL
Code-behind enhances your code

31 Demonstration: Using Visual Studio

32 Installing Azure PowerShell
PowerShell Gallery
Recommended approach
PowerShell 5.0 supports PowerShell Gallery
Windows 10 ships with PowerShell 5.0
Web Platform Installation (WebPI)
SMSG Readiness 9/18/2018
© 2015 Microsoft Corporation. All rights reserved.

33 Installing from the PowerShell Gallery
Launch Windows PowerShell ISE as Administrator
Install-Module AzureRM
Install-AzureRM

34 Finding the ADL cmdlets
Option 1
Get-Command -Module AzureRM.DataLakeStore
Get-Command -Module AzureRM.DataLakeAnalytics
Option 2
Get-Command *DataLake*

35 Logging in to Azure
Launch Windows PowerShell ISE
$subname = "BDHadoopTeamPMTestDemo"
Login-AzureRmAccount -SubscriptionName $subname

36 ADLS: Listing files in a store
$adls = "sqlkonferenz"
Get-AzureRmDataLakeStoreChildItem -Account $adls -Path /

37 ADLS: Upload and download
$adls = "sqlkonferenz"
Import-AzureRmDataLakeStoreItem -Account $adls -Path d:\somefile.txt -Destination /somefile.txt
Export-AzureRmDataLakeStoreItem -Account $adls -Path /somefile.txt -Destination d:\somefile_copy.txt

38 ADLA: List and submit jobs
$adla = "sqlkonferenz"
Get-AzureRmDataLakeAnalyticsJob -Account $adla
Submit-AzureRmDataLakeAnalyticsJob -Account $adla -Name myjob -Script "…"  # U-SQL text
Submit-AzureRmDataLakeAnalyticsJob -Account $adla -ScriptPath D:\test.script -Name myjob

39 ADL Store (ADLS) feature set
Account Management: create new account, list accounts, update account properties, delete account
Transferring Data: upload into store from local disk, download from store to local disk
Files and Folders: list contents of folder, create, move, delete, does file exist
Security: get ACLs, update ACLs, get owner, set owner
File Content: set file content, append file content, get file content, merge files

40 ADL Analytics (ADLA) feature set
Account Management: create new account, list accounts, update account properties, delete account
Data Sources: add a data source, list data sources, update data source, delete data source
Compute: list jobs, submit job, cancel job
Catalog Items: list items in U-SQL catalog, update item
Catalog Secrets: create catalog secret, list catalog secrets, delete catalog secrets

41 Demonstration: Using ADL PowerShell

42 Questions

43 Please review the event and sessions

