1
Making your Data Lake smarter with Cognitive Services
Helge Rege Gårdsvoll, Data Manager, Hafslund Strøm (@dataHelge)
2
Our awesome sponsors! Please visit the sponsor area in the break and interact with them. They are the reason we can hold this conference free of charge!
3
Azure Data Lake has three components
4
Data Lake Store
Hafslund’s Data Lake Store is divided by subsidiary, and then organized with:
an Input folder for input formats
a Staging folder for processed data
a Reference folder for reference data
a Sandbox folder for sandboxing and experimentation

The Data Lake Store is high-capacity storage for all types of data:
We ingest data into the Data Lake Store without changing the format
Processed data is written back into the Data Lake Store for storage and analytics
Parts of the Data Lake Store serve as a sandbox
Access is limited by Access Control Lists (ACLs) in Active Directory
Only analysts and super users access data in the Data Lake Store directly
Auditing is performed with built-in functions
Data is encrypted in transit and at rest
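As a minimal illustration of this folder convention, the U-SQL sketch below reads a raw file from a subsidiary's Input folder and writes the processed result to its Staging folder. The subsidiary name, file names and columns are hypothetical, not Hafslund's actual layout.

// Hypothetical paths: "strom" stands in for a subsidiary folder.
@readings =
    EXTRACT MeterId string,
            ReadingTime DateTime,
            Value double
    FROM "/strom/Input/meter_readings.csv"
    USING Extractors.Csv();

// Keep only valid readings before landing the data in Staging.
@valid =
    SELECT *
    FROM @readings
    WHERE Value >= 0;

OUTPUT @valid
TO "/strom/Staging/meter_readings.csv"
USING Outputters.Csv();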
5
Data Lake Analytics
Data Lake Analytics is the primary data transformation method for Hafslund Strøm, and business logic should be implemented in Data Lake Analytics.
Data Lake Analytics is a highly scalable analytics service for transforming data. Data is transformed with U-SQL scripts, which unify SQL and C#.
The job service provides flexibility for cost/value trade-offs and scalable performance.
Data is typically read from input or staging folders/tables of the Data Lake Store and written into staging files and tables.
Scheduling of Data Lake Analytics is handled by Data Factory.
Jobs are run as batches with a given number of Analytics Units; cost and time are considered when setting the number of units.
6
The typical U-SQL example
@rows =
    EXTRACT OrderId int,
            Customer string,
            Date DateTime,
            Amount float
    FROM "mylake/orders.csv"
    USING Extractors.Csv();

@bigOrders =
    SELECT *
    FROM @rows
    WHERE Amount > 1000;

OUTPUT @bigOrders
TO "mylake/orders_copy.txt"
USING Outputters.Csv();
7
Cognitive Services: APIs to see, hear, understand and interpret your data
[Slide graphic: Cognitive Services API categories, each marked with Data Lake support]
8
Getting started
Go to Sample Scripts for your Data Lake Analytics account
Select «Install U-SQL Extensions»
This will add new assemblies to your account: Cognitive, R, Python
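Once the extensions are installed, the cognitive assemblies can be referenced at the top of a U-SQL script. The assembly names below are those registered by Microsoft's sample-script installation at the time of writing; check your account's catalog if they differ.

// Vision assemblies
REFERENCE ASSEMBLY ImageCommon;
REFERENCE ASSEMBLY FaceSdk;
REFERENCE ASSEMBLY ImageEmotion;
REFERENCE ASSEMBLY ImageTagging;
REFERENCE ASSEMBLY ImageOcr;

// Text assemblies
REFERENCE ASSEMBLY [TextCommon];
REFERENCE ASSEMBLY [TextSentiment];
REFERENCE ASSEMBLY [TextKeyPhrase];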
9
Demo: Images
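The demo script itself is not included in the deck, so the sketch below follows the image-tagging sample that ships with the U-SQL cognitive extensions. The input path, output path and exact output schema are assumptions based on Microsoft's published samples and may differ between extension versions.

REFERENCE ASSEMBLY ImageCommon;
REFERENCE ASSEMBLY ImageTagging;

// Read images as byte arrays; {FileName} becomes a virtual column.
@images =
    EXTRACT FileName string,
            ImgData byte[]
    FROM "/images/{FileName}.jpg"
    USING new Cognition.Vision.ImageExtractor();

// Tag the objects recognized in each image.
@tags =
    PROCESS @images
    PRODUCE FileName,
            NumObjects int,
            Tags string
    READONLY FileName
    USING new Cognition.Vision.ImageTagger();

OUTPUT @tags
TO "/staging/image_tags.csv"
USING Outputters.Csv();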
10
Demo: Text
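Likewise for text, a sketch in the style of Microsoft's text-cognition samples. The input file and columns are hypothetical, and the output columns (Sentiment, Conf) follow the published sample rather than the actual demo.

REFERENCE ASSEMBLY [TextCommon];
REFERENCE ASSEMBLY [TextSentiment];

// Hypothetical input: one free-text comment per row.
@comments =
    EXTRACT Id int,
            Text string
    FROM "/staging/comments.csv"
    USING Extractors.Csv();

// Score the sentiment of each comment.
@sentiment =
    PROCESS @comments
    PRODUCE Id,
            Text,
            Sentiment string,
            Conf double
    USING new Cognition.Text.SentimentAnalyzer(true);

OUTPUT @sentiment
TO "/staging/comment_sentiment.csv"
USING Outputters.Csv();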
11
Want to learn more? Usql.io
U-SQL tutorial:
U-SQL Cognitive tutorial:
12
Thank you!
Helge Rege Gårdsvoll, @dataHelge
helge.gardsvoll@hafslundstrom.no