1
Welcome to SQL Saturday Denmark
Azure Data Lake
2
Thank you to our PLATINUM sponsors
3
Thank you to our GOLD and SILVER sponsors
4
About me
Kenneth M. Nielsen
Worked with SQL Server since 1999
Data Solution Architect at Microsoft
@doktorkermit
Linkedin.com/in/KennethMNielsen
5
Agenda
Azure Data Lake Store
Azure Data Lake Analytics
Azure Data Lake Analytics – Using Visual Studio
Azure Data Lake Analytics – Using PowerShell
Q & A
6
Data Lake Store
7
Azure Data Lake Store
A hyper-scale repository for big data analytics workloads
No limits to SCALE
Store ANY DATA in its native format
HADOOP FILE SYSTEM (HDFS) for the cloud
ENTERPRISE READY access control, encryption at rest
Optimized for analytic workload PERFORMANCE

Data Lake Store is your new friend for storing data, in almost unlimited amounts, and storing data on Azure costs next to nothing. Any file format is supported and data is stored in its native format, so you can store images, JSON tables, CSV, TSV, blobs and so on. It is built on HDFS; it is HDFS for the cloud.
8
Azure Data Lake Store
Any Data
Unstructured
Semi-structured
Structured
9
Azure Data Lake Store
10
Azure Data Lake Store
HDFS for the cloud
New file system built from the ground up, based on the HADOOP file system
Integrates with HDInsight, Hortonworks and Cloudera
Supports file and folder objects and operations
Supports rename, create and delete on files and folders

Microsoft Azure Data Lake Store is a Hadoop file system that’s compatible with Hadoop Distributed File System (HDFS) and works with the Hadoop ecosystem. Data Lake Store is integrated with Azure Data Lake Analytics and Azure HDInsight and will be integrated with Microsoft offerings like Revolution-R Enterprise; industry-standard distributions like Hortonworks, Cloudera, and MapR; and individual Hadoop projects like Spark, Storm, Flume, Sqoop, and Kafka.
11
Azure Data Lake Store
Unlimited storage
File sizes can range from gigabytes to petabytes
No limits to scale

Data Lake Store has no fixed limits on account size or file size. While other cloud storage offerings might restrict individual file sizes to a few terabytes, Data Lake Store can store very large files that are hundreds of times larger. At the same time, it provides very low latency read/write access and high throughput for scenarios like high-resolution video, scientific, medical, large backup data, event streams, web logs, and Internet of Things (IoT). Collect and store everything in Data Lake Store without restriction or prior understanding of business requirements.
12
Azure Data Lake Store
Security
Integrates with Azure Active Directory
Audit logs for all operations*
Server-side encryption*
ACL on files and folders*
Enterprise-ready security when in GA

At the moment, access control lists apply only at root level: a user who is granted access to a root folder has access to everything in that root. This will change when the service goes into GA.
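The root-level behaviour described above can be pictured with a small local model. This is only a sketch in Python, not the service's API: the `ROOT_ACL` table, the user names and the `has_access` helper are all hypothetical, and the only rule modelled is "access to a root folder means access to everything beneath it".

```python
from pathlib import PurePosixPath
from typing import Dict, Set

# Hypothetical ACL table: root-level folder name -> users granted access.
# In this simplified model, nothing below the root folder carries its own ACL.
ROOT_ACL: Dict[str, Set[str]] = {
    "sales": {"alice"},
    "logs": {"alice", "bob"},
}


def has_access(user: str, path: str, root_acl: Dict[str, Set[str]] = ROOT_ACL) -> bool:
    """Return True if `user` may reach `path` under root-level-only ACLs."""
    parts = PurePosixPath(path).parts  # e.g. ('/', 'sales', '2016', 'orders.tsv')
    if len(parts) < 2 or parts[0] != "/":
        return False  # the store root itself or a relative path: not modelled
    root_folder = parts[1]
    # Only the first folder is checked; depth below it does not matter.
    return user in root_acl.get(root_folder, set())
```

In this model `has_access("bob", "/logs/2016/09/web.log")` is true however deep the file sits, while `has_access("bob", "/sales/orders.tsv")` is false.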
13
Data Lake Analytics
14
Azure Data Lake Analytics
An elastic analytics service built on Apache YARN that processes all data, at any size
No limits to SCALE
Includes U-SQL, a language that unifies the benefits of SQL with the expressive power of C#
Optimized to work with ADL STORE
FEDERATED QUERY across Azure data sources
ENTERPRISE READY role-based access control & auditing
Pay PER JOB & scale PER JOB
15
U-SQL
A new language for Big Data
Familiar syntax to millions of SQL & .NET developers
Unifies the declarative nature of SQL with the imperative power of C#
Unifies structured, semi-structured and unstructured data
Distributed query support over all data
16
Language Overview
U-SQL Fundamentals
All the familiar SQL clauses: SELECT | FROM | WHERE | GROUP BY | JOIN | OVER
Operate on unstructured and structured data
Relational metadata objects
.NET integration and extensibility
U-SQL expressions are full C# expressions
Reuse .NET code in your own assemblies
Use C# to define your own: Types | Functions | Joins | Aggregators | I/O (Extractors, Outputters)
17
U-SQL Capabilities
Batch: AVAILABLE NOW
Interactive: IN PROGRESS
Streaming: FUTURE
Machine Learning: FUTURE
18
U-SQL Distributed Query
Azure Data Lake Store: READ, WRITE
Azure Storage Blobs: READ, WRITE
Azure SQL Database: READ, WRITE
Azure SQL Data Warehouse: READ, WRITE
Azure SQL DB in Azure VM: READ, WRITE
19
Read the input, write it directly to output (just a simple copy)
    @orders =
        EXTRACT OrderId int,
                Customer string,
                Date DateTime,
                Amount float
        FROM "/input/orders.txt"
        USING Extractors.Tsv();

    OUTPUT @orders
        TO "/output/orders_copy.txt"
        USING Outputters.Tsv();

Callouts on the slide:
Rowset (@orders)
Apply schema on read (the EXTRACT column list)
From a file in a Data Lake (FROM)
Easy delimited text handling (Extractors.Tsv, Outputters.Tsv)
Write out (OUTPUT ... TO)
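For readers without an ADLA account, the schema-on-read idea in this copy script can be tried locally. The following is a rough Python analogy, not U-SQL and not how the service is implemented: the column names come from the slide, while the helper names `extract_tsv` and `output_tsv`, the ISO date format, and the sample row are assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Column name -> converter, mirroring the EXTRACT column list on the slide.
# Assumption: dates are written in ISO form such as 2016-09-17.
SCHEMA = [
    ("OrderId", int),
    ("Customer", str),
    ("Date", datetime.fromisoformat),
    ("Amount", float),
]


def extract_tsv(text, schema=SCHEMA):
    """Parse tab-separated text, applying the schema while reading."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter="\t"):
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(schema):
            raise ValueError(f"expected {len(schema)} columns, got {len(fields)}")
        rows.append({name: conv(value) for (name, conv), value in zip(schema, fields)})
    return rows


def output_tsv(rows, schema=SCHEMA):
    """Write typed rows back out as tab-separated text."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for row in rows:
        values = [row[name] for name, _ in schema]
        # Dates are serialised back to ISO text; other values use str().
        writer.writerow([v.isoformat() if isinstance(v, datetime) else v for v in values])
    return buf.getvalue()
```

The file on disk stays plain text; types exist only in the rowset produced while reading, which is the point the slide makes with "apply schema on read".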
20
Azure Data Lake Pattern
[Diagram labels: Azure Services; Data Engineer; Data Scientist; Data Analyst; Data Science VM; Visual Studio; ADL Storage; ADL Analytics; AML Experiment; Power BI Desktop; Tweets; Upload Dataset; Get Data From CSV; Azure Storage (where the CAQS files are stored; they would be loaded into ADLS directly if ingesting from scratch).]
21
Execution with Requested Parallelism
Requested Parallelism = 1 (reserve enough to do 1 vertex at a time)
Requested Parallelism = 4 (reserve enough to do 4 vertices at a time)
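As a back-of-the-envelope illustration of what requested parallelism buys, the sketch below counts how many "waves" a stage needs when only a fixed number of vertices can run at once. It is a simplification in Python, not the ADLA scheduler: it assumes every vertex in the stage is independent and takes the same time, and the function names are made up for this example.

```python
def waves(vertices: int, parallelism: int) -> int:
    """Number of rounds needed to run `vertices` with `parallelism` slots."""
    if vertices < 0:
        raise ValueError("vertices must be non-negative")
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    # Integer ceiling division: a partially filled last wave still counts.
    return -(-vertices // parallelism)


def estimated_stage_seconds(vertices: int, parallelism: int, avg_vertex_seconds: float) -> float:
    """Rough stage duration under the equal-duration assumption."""
    return waves(vertices, parallelism) * avg_vertex_seconds
```

With 10 equal vertices, parallelism 1 needs 10 waves and parallelism 4 needs 3. Real jobs have stage dependencies and skewed vertices, so treat this as an upper-level intuition only.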
22
AVG Vertex execution time
Stage Details
252 pieces of work
AVG vertex execution time
4.3 billion rows
Data read & written
23
ADLAUs
Azure Data Lake Analytics Unit
Parallelism N = N ADLAUs
1 ADLAU ~= a VM with 2 cores and 6 GB of memory
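The sizing rule on this slide is simple arithmetic, sketched here in Python. The two constants are the approximate figures from the slide (2 cores and 6 GB per ADLAU); the `Reservation` type and `reserve` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass

# Approximate per-unit figures quoted on the slide.
CORES_PER_ADLAU = 2
MEMORY_GB_PER_ADLAU = 6


@dataclass(frozen=True)
class Reservation:
    adlaus: int
    cores: int
    memory_gb: int


def reserve(parallelism: int) -> Reservation:
    """Translate a requested parallelism of N into N ADLAUs of capacity."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    return Reservation(
        adlaus=parallelism,
        cores=parallelism * CORES_PER_ADLAU,
        memory_gb=parallelism * MEMORY_GB_PER_ADLAU,
    )
```

For example, `reserve(4)` corresponds to roughly 8 cores and 24 GB of memory in total.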
24
Visual Studio Data Lake Analytics
25
Azure Data Lake – Visual Studio
Available project types
U-SQL project, where you write your statements
U-SQL sample project, an extensive project that you can run on your own account; it gives you a head start on getting up to speed on the topic
U-SQL unit testing project
26
Azure Data Lake – Visual Studio
Fully integrates with Solution Explorer
27
Azure Data Lake – Visual Studio
Monitor and manage jobs
Browse and manage storage
Browse U-SQL catalog
Integrates seamlessly with Server Explorer
28
Creating U-SQL
29
Creating U-SQL IntelliSense Supported
30
Creating U-SQL
Code-behind: enhance your code
31
Demonstration: Using Visual Studio
32
Installing Azure PowerShell
SMSG Readiness 9/18/2018
PowerShell Gallery (recommended approach)
PowerShell 5.0 supports PowerShell Gallery
Windows 10 ships with PowerShell 5.0
Web Platform Installation (WebPI)
© 2015 Microsoft Corporation. All rights reserved.
33
Installing from the PowerShell Gallery
Launch Windows PowerShell ISE as Administrator
    Install-Module AzureRM
    Install-AzureRM
34
Finding the ADL cmdlets
Option 1
    Get-Command -Module AzureRM.DataLakeStore
    Get-Command -Module AzureRM.DataLakeAnalytics
Option 2
    Get-Command *DataLake*
35
Logging in to Azure
Launch Windows PowerShell ISE
    $subname = "BDHadoopTeamPMTestDemo"
    Login-AzureRmAccount -SubscriptionName $subname
36
ADLS: Listing files in a store
    $adls = "sqlkonferenz"
    Get-AzureRmDataLakeStoreChildItem -Account $adls -Path /
37
ADLS: Upload and download
    $adls = "sqlkonferenz"
    # Upload a local file into the store
    Import-AzureRmDataLakeStoreItem -Account $adls -Path d:\somefile.txt -Destination /somefile.txt
    # Download a file from the store to local disk
    Export-AzureRmDataLakeStoreItem -Account $adls -Path /somefile.txt -Destination d:\somefile_copy.txt
38
ADLA: List and submit jobs
    $adla = "sqlkonferenz"
    # List jobs in the account
    Get-AzureRmDataLakeAnalyticsJob -Account $adla
    # Submit a job from inline U-SQL text
    Submit-AzureRmDataLakeAnalyticsJob -Account $adla -Script "…" -Name myjob
    # Submit a job from a script file
    Submit-AzureRmDataLakeAnalyticsJob -Account $adla -ScriptPath D:\test.script -Name myjob
39
ADL Store (ADLS) feature set
Account Management: Create new account; List accounts; Update account properties; Delete account
Transferring Data: Upload into store from local disk; Download from store to local disk
Files and Folders: List contents of folder; Create; Move; Delete; Does file exist
Security: Get ACLs; Update ACLs; Get Owner; Set Owner
File Content: Set file content; Append file content; Get file content; Merge files
40
ADL Analytics (ADLA) feature set
Account Management: Create new account; List accounts; Update account properties; Delete account
Data Sources: Add a data source; List data sources; Update data source; Delete data source
Compute: List jobs; Submit job; Cancel job
Catalog Items: List items in U-SQL catalog; Update item
Catalog Secrets: Create catalog secret; List catalog secrets; Delete catalog secrets
41
Demonstration: Using ADL PowerShell
42
Questions
43
Please review the event and sessions