1 © Cloudera, Inc. All rights reserved. Alexander Bibighaus| Director of Engineering, Cloudera, Inc. The Future of Data Management with Hadoop and the.

Presentation transcript:

1 © Cloudera, Inc. All rights reserved. The Future of Data Management with Hadoop and the Enterprise Data Hub
Alexander Bibighaus | Director of Engineering, Cloudera, Inc.

3 ©2014 Cloudera, Inc. All rights reserved. Cloudera Snapshot
Founded: 2008, by former employees of
Employees Today: 900+
World Class Support: 24x7 Global Staff; Pro-active & Predictive Support Programs
Mission Critical: Thousands of Enterprise Users; Over ~600 Paying Subscription Customers
The Largest Ecosystem: Over Partners
Cloudera University: Over 100,000+ Trained
Open Source Leaders: Cloudera Employees are Leading Developers & Contributors
Total Capital Raised: $1B+ (from Intel, Google, Dell, T. Rowe Price, Accel, Greylock)
Mission: Help Organizations Leverage the Power of All Their Data to Ask Bigger Questions.

4 A Big Data Revolution is happening as we speak
Industrial Revolution, Data Revolution

5 Data Drives Industries
Financial Services, Public Sector, Healthcare, Telecommunications, Retail
Optimize network performance; Money laundering detection; Cyber security detection; Product recommendations; Personalized medicine

6 Data Drives Business
Sales, Operations, Product, Marketing, Customer Satisfaction
Increase conversions by 2%; Convert 5% more leads; Reduce fraud by 3%; Reduce churn by 1%; Increase user adoption by 10%

7 Why is Big Data Happening Now?
Instrumentation: Everything that can be measured will be measured.
Personalization: Employees and customers expect more personal interactions, but not at the cost of their privacy. The age of “segment of 1”.
Advanced Analytics: The most innovative companies embrace experimentation, predictive analytics and agility.

8 Data is fueling this opportunity
Web/Mobile Clickstream; Social Media; Sensor Networks; Audio, Image & Video

9 Access to diverse analysis techniques
SQL; Video & Voice Processing; Text Sentiment Analysis; Social Graph Analysis

10 People require analytics
“80% of CEOs cite data mining and analytics as strategically important.” (PWC CEO Survey)

11 Big Data is Getting Bigger & More Multi-structured
trillion gigabytes of data was created in 2011*
More than 90% is unstructured data
Data volume doubles every year
Chart: GB of Data (in billions), axis 0 to 10,000; series: Structured Data, Unstructured Data
* Source: IDC

12 Hadoop Changes the Game: Storage & Compute Together
The Old Way: $30,000+ per TB. Expensive & Unattainable. Hard to scale; Network is a bottleneck; Only handles relational data; Difficult to add new fields & data types. Expensive, Special purpose, “Reliable” Servers; Expensive Licensed Software. Components: Network, Data Storage (SAN, NAS), Compute (RDBMS, EDW).
The Hadoop Way: $300-$1,000 per TB. Affordable & Attainable. Scales out forever; No bottlenecks; Easy to ingest any data; Agile data access. Commodity “Unreliable” Servers; Hybrid Open Source Software. Components: Compute (CPU), Memory, Storage (Disk).

13 The Legacy Way: Bringing Data to Applications
Can’t Get a 360 View: Many special-purpose systems; Moving data around; No complete views
Can’t Retain Valuable Data: Leaving data behind; Risk and compliance; High cost of storage
Can’t Meet ETL SLAs: Up-front modeling; Transforms slow; Transforms lose data
Can’t Ask New Questions: Existing systems strained; No agility; “BI backlog”
Systems: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE
Sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES

14 The Agile Way: Bringing Applications to Data
Consolidated Architecture: Bring applications to data; Combine different workloads on common data (e.g. SQL + Search); True analytic agility
1 Active Archive: Full fidelity original data; Indefinite time, any source; Lowest cost storage
2 Scalable Transformations: One source of data for all analytics; Persist state of transformed data; Significantly faster & cheaper
3 Agile Exploration: Simple search + BI tools; “Schema on read” agility; Reduce BI user backlog requests
Systems: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE
Sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES

15 Hadoop is more than just Apache Hadoop
Timeline of the stack, each stage adding to the one before, up to the present:
Core Hadoop (HDFS, MR)
+ HBase, ZooKeeper
+ Hive, Pig, Mahout
+ Sqoop, Whirr, Avro
+ Flume, Bigtop, Oozie, MRUnit, HCatalog
+ Spark, Impala, Solr, Kafka
+ Parquet, Sentry, with Core Hadoop + YARN

16 Cloudera Enterprise powered by Apache Hadoop
A new kind of data platform: One place for unlimited any-type data; Unified, multi-framework data access
Key Advantages: High performance; Enterprise system and data management; Secure by default; Open source, Open standards
Platform: Security and Administration; Unlimited Storage; Process, Discover, Model, Serve
Deployment Flexibility: On-Premises, Appliances, Engineered Systems, Public Cloud, Private Cloud, Hybrid Cloud

17 Data Drives Travel/Leisure
Customer Segmentation; Marketing Campaign Testing; Regulatory Compliance

18 Data Drives Social

19 Data Drives Manufacturing
Predictive maintenance; Goods classification

20 Data Drives Healthcare
Population Health; Patient Monitoring; Chronic Disease Management

21 Big Data takes on a lot of questions
MEDIA / ENTERTAINMENT: Viewers / advertising effectiveness
ON-LINE SERVICES / SOCIAL MEDIA: People & career matching; Website optimization
HEALTH CARE: Patient sensors, monitoring, EHRs; Quality of care
FINANCIAL SERVICES: Risk & portfolio analysis; New products
CONSUMER PACKAGED GOODS: Sentiment analysis of what’s hot, customer service
TRAVEL & TRANSPORTATION: Sensor analysis for optimal traffic flows; Customer sentiment
RETAIL: Consumer sentiment; Optimized marketing
EDUCATION & RESEARCH: Experiment sensor analysis
LIFE SCIENCES: Clinical trials; Genomics
AUTOMOTIVE: Auto sensors reporting location, problems
COMMUNICATIONS: Location-based advertising
HIGH TECHNOLOGY / INDUSTRIAL MFG.: Mfg quality; Warranty analysis
UTILITIES: Smart Meter analysis for network capacity
OIL & GAS: Drilling exploration sensor analysis
LAW ENFORCEMENT & DEFENSE: Threat analysis, Social media monitoring, Photo analysis

22 Ask Bigger Questions: How do we feed the world?
A Fortune 500 company specializing in agriculture and genomics can automate data-driven R&D decisions to reduce time to market from years to months.
©2013 Cloudera, Inc. All rights reserved.

23 Thank you!
Alexander Bibighaus | Director of Engineering