Published by Anthony Jenkins. Modified over 9 years ago.
1
Hadoop on Azure 101: What is the Big Deal?
Dennis Mulder, Solution Architect, Microsoft Corporation
2
Windows Azure Center of Excellence
Spotlight Pilots, Assessment, Architecture and Design Guidance, Modern Apps, Global Scale, Design Sessions
Global Services Team: 10 Senior Cloud Architects (US, EMEA, APAC), 8 Pilots, Cloud Apps Champs, Services
Engage: Design, Assess, Contact, Pilots
Contact: Dennis Mulder, Solution Architect, dmulder@microsoft.com
4
Four megatrends will dominate the next decade: Social, Mobility, Big data, Cloud
85 billion mobile apps will be downloaded in 2012
91% of organizations expect to spend on mobile devices in 2012
1/2 of companies expect to use internal social network apps in 2012
2.7 zettabytes in 2012
>80% of new apps in 2012 will be distributed/deployed on clouds
32% of businesses are likely to invest in BI and analytics in 2012
The strategic focus in the cloud will shift in 2012 from infrastructure to application platforms
In 2012, mobile devices will outship PCs by more than 2:1 and generate more revenue than PCs for the first time
Social networking will follow not just people but also appliances, devices and products
34% of CIOs say technology as a service (cloud) will have the most profound effect on the CIO role in the future
2/3 of mobile apps developed in 2012 will integrate with analytics offerings
49% of CIOs rank BI as the top project priority for 2012
5
Microsoft is embracing these megatrends: Social, Mobility, Big data, Cloud
6
How will technology megatrends enable you to save money, drive innovation, grow your business, and attract and retain customers?
Rethinking and evolving business strategies: Social, Big data, Mobility, Cloud
7
Why Big Data?
11
Volume, Velocity, Variety, Variability
Data volume scale: Gigabytes (10^9), Terabytes (10^12), Petabytes (10^15), Exabytes (10^18)
ERP / CRM: Sales Pipeline, Payables, Payroll, Inventory, Contacts, Deal Tracking
Web 2.0: Mobile, Advertising, Collaboration, eCommerce, Digital Marketing, Search Marketing, Web Logs, Recommendations
Internet of things: Audio / Video, Log Files, Text/Image, Social Sentiment, Data Market Feeds, eGov Feeds, Weather, Wikis / Blogs, Click Stream, Sensors / RFID / Devices, Spatial & GPS Coordinates
Storage cost per GB: 1980 $190,000; 1990 $9,000; 2000 $15; 2010 $0.07
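To make the Storage/GB figures concrete, the following minimal sketch scales them to one terabyte and computes how far the price fell between two years. Only the four per-GB prices come from the slide. The helper names `cost_per_tb` and `decline_factor` are hypothetical, and a decimal terabyte of 1000 GB is an assumption.

```python
from decimal import Decimal

# Storage cost per GB in US dollars, as listed on the slide.
COST_PER_GB = {
    1980: Decimal("190000"),
    1990: Decimal("9000"),
    2000: Decimal("15"),
    2010: Decimal("0.07"),
}

# Assumption: decimal terabyte (10^12 bytes = 1000 GB).
GB_PER_TB = 1000


def cost_per_tb(year):
    """Cost in dollars of storing one terabyte at that year's price."""
    return COST_PER_GB[year] * GB_PER_TB


def decline_factor(start_year, end_year):
    """How many times cheaper a GB is in end_year than in start_year."""
    return COST_PER_GB[start_year] / COST_PER_GB[end_year]


for year in sorted(COST_PER_GB):
    print(f"{year}: ${cost_per_tb(year):,.2f} per TB")
```

Exact decimal arithmetic is used so that the 2010 price of $0.07 does not pick up binary floating-point rounding error when multiplied.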
12
Example Scenarios
14
Logs -> ETL -> Some Data -> Data Warehouse
Excess Data
15
Logs -> Raw Data -> “Store it All” Cluster -> Data Warehouse
16
Understanding the Basics: Move the Compute to the Data
17
Hadoop Distributed Architecture
18
Server Files Server
19
RUNTIME Code
20
MapReduce – Workflow
21
Map tasks
Blocks of the Sales file in HDFS are stored on DataNode1, DataNode2, and DataNode3. Each record is (custId, zipCode, amount):
Block: (4, 54235, $75) (7, 10025, $60) (2, 53705, $30) (1, 02115, $15)
Block: (5, 53705, $65) (0, 54235, $22) (5, 53705, $15) (6, 44313, $10)
Block: (3, 10025, $95) (8, 44313, $55) (9, 02115, $15) (6, 44313, $25)
Each Mapper reads a block, emits (zipCode, amount) pairs, and groups them by zip code.
One output bucket per reduce task
22
Reduce tasks
Shuffle: each Reducer receives its bucket of (zipCode, amount) pairs from the Mappers.
Sort: each Reducer sorts its input by zip code:
53705: $65, $30, $15
44313: $10, $25, $55
10025: $95, $60
54235: $75, $22
02115: $15, $15
SUM: the amounts are added up per zip code:
10025 $155
44313 $90
53705 $110
54235 $97
02115 $30
Done!
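The map, shuffle, sort, and sum steps on the Sales records can be sketched in plain Python. This is an in-memory illustration of the flow, not Hadoop code. The records are the ones shown on the slides. The number of reducers, the `partition` function, and all function names are assumptions made for the example.

```python
from collections import defaultdict

# Sales records from the slides as (custId, zipCode, amount in dollars),
# split into three blocks the way the file is split across DataNodes.
BLOCKS = [
    [(4, "54235", 75), (7, "10025", 60), (2, "53705", 30), (1, "02115", 15)],
    [(5, "53705", 65), (0, "54235", 22), (5, "53705", 15), (6, "44313", 10)],
    [(3, "10025", 95), (8, "44313", 55), (9, "02115", 15), (6, "44313", 25)],
]
NUM_REDUCERS = 2  # assumption: the slides do not state a reducer count


def partition(zip_code, num_reducers):
    # Hypothetical partitioner: deterministic, so a given zip code always
    # lands in the same bucket no matter which mapper emitted it.
    return int(zip_code) % num_reducers


def map_task(block, num_reducers):
    """Emit (zipCode, amount) pairs into one bucket per reduce task."""
    buckets = [[] for _ in range(num_reducers)]
    for _cust_id, zip_code, amount in block:
        buckets[partition(zip_code, num_reducers)].append((zip_code, amount))
    return buckets


def shuffle(all_mapper_buckets, reducer_index):
    """Collect the bucket meant for one reducer from every mapper."""
    pairs = []
    for buckets in all_mapper_buckets:
        pairs.extend(buckets[reducer_index])
    return pairs


def reduce_task(pairs):
    """Sort by zip code, then sum the amounts for each zip code."""
    totals = defaultdict(int)
    for zip_code, amount in sorted(pairs):
        totals[zip_code] += amount
    return dict(totals)


def run_job(blocks, num_reducers=NUM_REDUCERS):
    """Run every map task, then shuffle and reduce once per reducer."""
    mapped = [map_task(block, num_reducers) for block in blocks]
    result = {}
    for reducer_index in range(num_reducers):
        result.update(reduce_task(shuffle(mapped, reducer_index)))
    return result


totals = run_job(BLOCKS)
```

Because the partitioner sends every pair for a given zip code to the same reducer, each reducer can sum its zip codes independently, and `totals` matches the SUM figures on the slide for any reducer count.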
23
MapReduce – Workflow
24
HDInsight
25
HDFS API
DFS (1 Data Node per Worker Role) and Compute Cluster: Name Node, Data Node
Azure Storage (ASV)
Azure Blob Storage: Front end, Partition Layer, Stream Layer
27
HDInsight / Hadoop Eco-System
Distributed Storage (HDFS)
Distributed Processing (MapReduce)
Query (Hive)
Legend: Red = Core Hadoop; Blue = Data processing; Purple = Microsoft integration points and value adds; Orange = Data Movement; Green = Packages
28
Hive, Pig, Mahout, Cascading, Scalding, Scoobi, Pegasus…
C#, F# Map/Reduce, LINQ to Hive, .NET management clients
JavaScript Map/Reduce, Browser hosted console, Node.js management clients
PowerShell, Cross Platform CLI tools
30
Traditional RDBMS vs. MapReduce
Compared on: Data Size, Access, Updates, Structure, Integrity, Scaling, DBA Ratio
32
Demo: Deploying and Interacting With HDInsight Service
36
http://www.windowsazure.com/
http://hadoop.apache.org/
Nuget: http://nuget.org/packages?q=hadoop
Hadoop SDK: http://hadoopsdk.codeplex.com
38
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.