Making your Data Lake smarter with Cognitive Services

Presentation transcript:

Making your Data Lake smarter with Cognitive Services Helge Rege Gårdsvoll Data Manager, Hafslund Strøm @dataHelge

Our awesome sponsors! Please visit the sponsor area in the break and interact with them. They are the reason we can hold this conference free of charge!

Azure Data Lake has three components

Data Lake Store

Data Lake Store is a high-capacity storage for all types of data.
We ingest data into the Data Lake Store without changing the format.
Processed data is written into the Data Lake Store for storage and analytics.
Parts of the Data Lake Store are a sandbox.
Access is limited by Access Control Lists (ACLs) in Active Directory.
Only analysts and super users access data in the Data Lake Store directly.
Auditing is performed with built-in functions.
Data is encrypted in transit and in storage.

Hafslund’s Data Lake Store is divided by subsidiary, and then organized with:
Input folder for input formats
Staging folder for processed data
Reference for reference data
Sandbox for sandboxing and experimentation
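The folder convention above can be sketched as a small path helper. This is only an illustration: the slides do not show the actual path layout, so the order of the path segments, the lowercase zone names and the subsidiary name "strom" are all assumptions.

```python
from pathlib import PurePosixPath

# Zone names taken from the slide; the lowercase spelling is an assumption.
ZONES = ("input", "staging", "reference", "sandbox")


def lake_path(subsidiary: str, zone: str, *parts: str) -> PurePosixPath:
    """Build a path of the assumed form /<subsidiary>/<zone>/<parts...>."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone {zone!r}, expected one of {ZONES}")
    return PurePosixPath("/", subsidiary, zone, *parts)


# "strom" is a hypothetical subsidiary folder name.
print(lake_path("strom", "input", "orders.csv"))
print(lake_path("strom", "staging", "orders", "2017"))
```

Rejecting unknown zone names in one place keeps ad hoc top-level folders from creeping in next to the four agreed ones.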

Data Lake Analytics

Data Lake Analytics is a highly scalable analytics service for transforming data.
Data is transformed with U-SQL scripts, which unify SQL and C#.
The job service provides flexibility for cost/value considerations, and scalable performance.
Data is typically read from input or staging folders/tables of Data Lake Store and written into staging files and tables.
Scheduling of Data Lake Analytics is handled by Data Factory.
Jobs are run as batch, with a given number of Analytics Units. Cost and time are considered when setting the number of units.

Data Lake Analytics is the primary data transformation method for Hafslund Strøm, and business logic should be implemented in Data Lake Analytics.
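The cost/time consideration can be made concrete with a little arithmetic. The sketch below assumes a job is billed on Analytics Unit hours (units multiplied by run time). The price constant, the candidate unit counts and the durations are hypothetical placeholders, not measurements from the talk.

```python
from dataclasses import dataclass

# Assumption: a placeholder price per Analytics Unit hour, not an official figure.
ASSUMED_PRICE_PER_AU_HOUR = 2.0


@dataclass(frozen=True)
class JobEstimate:
    """A hypothetical estimate for one batch job at a given parallelism."""
    analytics_units: int
    duration_hours: float

    @property
    def au_hours(self) -> float:
        # Billed quantity under the stated assumption: units x run time.
        return self.analytics_units * self.duration_hours

    def cost(self, price_per_au_hour: float = ASSUMED_PRICE_PER_AU_HOUR) -> float:
        return self.au_hours * price_per_au_hour


def cheapest_within(estimates, max_hours):
    """Return the lowest-cost estimate that finishes within max_hours, or None."""
    feasible = [e for e in estimates if e.duration_hours <= max_hours]
    return min(feasible, key=lambda e: e.cost(), default=None)


# Made-up numbers showing diminishing returns from adding units.
candidates = [JobEstimate(5, 2.0), JobEstimate(20, 0.6), JobEstimate(50, 0.4)]
print(cheapest_within(candidates, max_hours=1.0))
```

In this made-up data, adding units shortens the run but raises the total AU-hours, so the deadline decides how many units are worth paying for.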

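The typical U-SQL example

    // Read the order file into a rowset with a typed schema.
    @rows =
        EXTRACT OrderId int,
                Customer string,
                Date DateTime,
                Amount float
        FROM "mylake/orders.csv"
        USING Extractors.Csv();

    // Keep only the orders with an amount above 1000.
    @rows =
        SELECT *
        FROM @rows
        WHERE Amount > 1000;

    // Write the filtered rowset back to the lake as CSV.
    OUTPUT @rows
    TO "mylake/orders_copy.txt"
    USING Outputters.Csv();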
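To check locally what the script above should produce, the same extract, filter and output steps can be mirrored in a few lines of Python. This is a rough equivalent for small sample data, not how Data Lake Analytics executes the job. The function name and the sample rows are made up for illustration.

```python
import csv
import io


def filter_orders(csv_text: str, min_amount: float = 1000.0) -> str:
    """Keep rows (OrderId, Customer, Date, Amount) whose Amount exceeds min_amount."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        order_id, customer, date, amount = row
        if float(amount) > min_amount:  # strict comparison, like WHERE Amount > 1000
            writer.writerow([order_id, customer, date, amount])
    return out.getvalue()


# Hypothetical sample data without a header row, matching the schema in the script.
sample = (
    "1,Alice,2017-09-01,1500.0\n"
    "2,Bob,2017-09-02,250.5\n"
    "3,Carol,2017-09-03,1000\n"
)
print(filter_orders(sample), end="")
```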

Cognitive Services: APIs to see, hear, understand and interpret your data
Data Lake Support

Getting started
Go to Sample Scripts for your Data Lake Analytics account.
Select «Install U-SQL Extensions».
This will add new assemblies to your account: Cognitive, R and Python.

Demo: Images

Demo: Text

Want to learn more?
Usql.io
U-SQL tutorial: https://saveenr.gitbooks.io/usql-tutorial/content/
U-SQL Cognitive tutorial: https://docs.microsoft.com/en-us/azure/data-lake-analytics/data-lake-analytics-u-sql-cognitive

Thank you!
Helge Rege Gårdsvoll, helge.gardsvoll@hafslundstrom.no @dataHelge