AZURE DISTRIBUTED DATA Storage, HDInsight Hadoop, Azure Data Lake.

Presentation transcript:

AZURE DISTRIBUTED DATA Storage, HDInsight Hadoop, Azure Data Lake

GOALS AND QUESTIONS

STORE IT: Azure Blob Storage; Azure Data Lake Store.

COMPUTE IT: Hadoop on Azure (HDInsight on Linux); Azure Data Lake Analytics with YARN.

WHAT IS BIG DATA? It is: scale-out; enables elasticity; encourages exploration; faster data ingestion; lower TCO; empowers self-service BI and analytics; rapid time to insight. It is NOT: a well-defined thing; about volume, size; a replacement for everything; the answer to every problem.

Part 3: Single Slide. A leading game development studio that creates, develops, produces, and publishes a number of popular video games needed to analyze large amounts of unstructured in-game data. They chose Azure HDInsight, Data Factory, SQL Server on-premises, Power View, and Power Query to run in-game analytics, understand what gamers do during gameplay, and decide which campaigns to run to influence in-game purchases. Finally, Twitter sentiment is collected to correlate with sales.

Game Development Company (Gaming): a predominantly mobile-based game development company. Although they are a mid-sized organization, they have partnered with media giants on various gaming projects. Part 1: What They Did | In-game Analytics. Challenge: as a game development studio, they wanted in-game analytics to better understand their players and what they do in the games. Solution: Azure HDInsight (MapReduce and Storm), Service Bus, and SQL Server for reporting. The solution collects telemetry and logging data to answer in-game analytics questions: how many players are using the game; how many players invited their friends; how far players got into the tutorial; how many attempts they made on one level/stage. In-game Analytics. Media tonic.
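The four metrics listed above can be sketched as a small map-and-reduce pass over telemetry events. This is only an illustration, not the studio's actual job: it assumes each event is a JSON object on its own line, and the field names (`player`, `action`, `level`, `step`) and action values are hypothetical.

```python
import json
from collections import Counter, defaultdict

# Hypothetical telemetry events, one JSON object per line (field names are assumptions).
SAMPLE_EVENTS = """\
{"player": "p1", "action": "session_start"}
{"player": "p1", "action": "invite_friend"}
{"player": "p1", "action": "level_attempt", "level": 1}
{"player": "p1", "action": "level_attempt", "level": 1}
{"player": "p2", "action": "session_start"}
{"player": "p2", "action": "tutorial_step", "step": 3}
{"player": "p2", "action": "level_attempt", "level": 1}
"""


def map_event(event):
    """Map phase: emit (key, value) pairs for a single event."""
    player = event["player"]
    yield ("players", player)
    if event["action"] == "invite_friend":
        yield ("inviters", player)
    elif event["action"] == "tutorial_step":
        yield ("tutorial_step", (player, event["step"]))
    elif event["action"] == "level_attempt":
        yield (("attempts", event["level"]), player)


def summarize(lines):
    """Group mapped pairs by key, then reduce each group to a metric."""
    grouped = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue
        for key, value in map_event(json.loads(line)):
            grouped[key].append(value)

    # Furthest tutorial step reached per player.
    furthest = {}
    for player, step in grouped["tutorial_step"]:
        furthest[player] = max(step, furthest.get(player, 0))

    # Attempts per level, broken down by player (tuple keys are the attempt groups).
    attempts = {
        key[1]: dict(Counter(values))
        for key, values in grouped.items()
        if isinstance(key, tuple)
    }

    return {
        "players": len(set(grouped["players"])),
        "inviters": len(set(grouped["inviters"])),
        "furthest_tutorial_step": furthest,
        "attempts_per_level": attempts,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_EVENTS.splitlines()))
```

On a cluster, the map and reduce steps would run in parallel across many files. The single-process version here only shows the shape of the computation.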

Game Development Company. Part 2: How They Did It | In-game Analytics. Collect data from games in Azure Blobs: the game sends telemetry/logging data as JSON files containing every user action in the game; data is pushed to Azure Service Bus in real time; tens of gigabytes of data are captured daily. HDInsight picks up the real-time data and processes it: from Service Bus, HDInsight processes the data using Apache Storm and MapReduce; experiments run constantly to determine insight (A/B testing, in-game metrics and analytics); a 32-node cluster is spun up nightly for four hours. Output is sent to SQL Server for BI. Components shown: In-game Analytics; Service Bus; SQL Server (on-premises).
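To put the nightly 32-node, four-hour cluster in perspective, the arithmetic below compares its node-hours with the same cluster left running around the clock. It is a rough sketch: the slides give no prices, so it counts node-hours only, and the function name is illustrative.

```python
def node_hours(nodes: int, hours_per_day: float, days: int = 1) -> float:
    """Total node-hours consumed by a cluster of `nodes` machines."""
    return nodes * hours_per_day * days


# Figures from the slide: 32 nodes, four hours each night.
nightly = node_hours(32, 4)      # 128 node-hours per day
always_on = node_hours(32, 24)   # 768 node-hours per day

if __name__ == "__main__":
    print(f"on-demand: {nightly:.0f} node-hours/day")
    print(f"24x7:      {always_on:.0f} node-hours/day")
    print(f"ratio:     {nightly / always_on:.3f}")
```

Under these assumptions the on-demand cluster uses one sixth of the node-hours of an always-on cluster of the same size.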

A game development studio wanted in-game analytics to better understand their players and what they do in their games. They chose Azure HDInsight, including Storm on HDInsight, for near real-time in-game analytics of their users. Now they can see how many players are playing, how many are referring the game, how difficult a game level is, and so on.

Typical Big Data Use Cases: smart meter monitoring; equipment monitoring; advertising analysis; life sciences research; fraud detection; healthcare outcomes; weather forecasting; natural resource exploration; social network analysis; churn analysis; traffic flow optimization; legal discovery; telemetry; IT infrastructure optimization.

HADOOP SHINES WHEN… Data exploration, analytics and reporting, new data-driven actionable insights; rapid iterating; unknown unknowns; flexible scaling; data-driven actions for early competitive advantage or first to market; low number of direct, concurrent users; low-cost data archival.

HADOOP ANTI-PATTERNS… Replacing a system whose pain points don’t align with Hadoop’s strengths; OLTP needs adequately met by an existing system; known data with a static schema; many end users; interactive response time requirements (becoming less true); your first Hadoop project + mission-critical system.

APPENDIX

CLOUD STORAGE. Blobs + WASB: open source access from Hadoop to Azure Storage Blobs, flexible use. Azure Data Lake Store: HDFS, virtually unlimited scale, intelligent data storage, enterprise-grade security, flexible use. Optimized proprietary formats like SQL Server, HBase: rich feature set around specific scenarios. Data Factory: ingest, transform, move, process, analyze data – ELT, ETL, EHL.
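From Hadoop, both stores are addressed by URI. The helpers below sketch the usual forms: `wasb://` or `wasbs://` for Azure Storage Blobs and `adl://` for Azure Data Lake Store. The function names and the account, container, and path values are hypothetical placeholders.

```python
def wasb_uri(container: str, account: str, path: str = "", secure: bool = True) -> str:
    """Build a WASB URI for an Azure Storage blob container.

    `wasbs` is the TLS-secured variant of the `wasb` scheme.
    """
    scheme = "wasbs" if secure else "wasb"
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


def adl_uri(account: str, path: str = "") -> str:
    """Build an adl:// URI for an Azure Data Lake Store account."""
    return f"adl://{account}.azuredatalakestore.net/{path.lstrip('/')}"


if __name__ == "__main__":
    # Hypothetical account, container, and path names for illustration only.
    print(wasb_uri("telemetry", "examplestorage", "/games/2016/events.json"))
    print(adl_uri("examplelake", "/games/2016/events.json"))
```

A Hive external table or a Spark read can then point at either URI, which is what lets storage and compute scale separately.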

COMPUTE. Non-Relational: flexible format and code. Hadoop on Linux or Windows: HDInsight (100% Apache: Hive, Pig, Storm, HBase, Spark…); any Hadoop distro on IaaS VMs; scale-out technologies like MongoDB, Cassandra, Qubole on IaaS; Polybase or Hadoop Region on APS. Relational: rich feature set, optimized for specific scenarios. Azure Data Lake Analytics: ad hoc analytics with virtually unlimited scale; YARN; U-SQL (.NET unified with SQL). Machine Learning: SparkML, R, Azure Machine Learning. Data Factory: ingest, transform, move, process, analyze data – ELT, ETL, EHL.

USE CASES: exploration; fail-fast iteration; scale out; unknown unknowns; fast time to insight.

IT ADDS UP TO MORE OPTIONS: your choice in analytics; real-time, more history, fast ingestion; ODBC makes Hive and Spark “just another data source”; experimentation via “fail fast” iteration; enables the business user … and new expectations around latency.

AZURE HAS SO MUCH MORE: go straight to the business code; scale storage and compute separately; open source; Linux; managed and unmanaged services; hybrid; on-demand and 24x7 options; SQL.