The Future of Data Management with Hadoop and the Enterprise Data Hub. Alexander Bibighaus | Director of Engineering. © Cloudera, Inc. All rights reserved.

Presentation transcript:

1 © Cloudera, Inc. All rights reserved. The Future of Data Management with Hadoop and the Enterprise Data Hub. Alexander Bibighaus | Director of Engineering

2 Big Data is revolutionizing how businesses think. Industrial Revolution, Data Revolution

3 Improve Products & Services Efficiency. Helped 4+ million homes save over $320 million in energy bills for subscribers. Combined diverse data sets, including streaming utility and sensor data, in Cloudera Enterprise. Improved usage insights help engage customers, resulting in changes in energy usage.

4 Big Data is pervasive
MEDIA / ENTERTAINMENT: Viewers / advertising effectiveness
ON-LINE SERVICES / SOCIAL MEDIA: People & career matching; Website optimization
HEALTH CARE: Patient sensors, monitoring, EHRs; Quality of care
FINANCIAL SERVICES: Risk & portfolio analysis; New products
CONSUMER PACKAGED GOODS: Sentiment analysis of what's hot, customer service
TRAVEL & TRANSPORTATION: Sensor analysis for optimal traffic flows; Customer sentiment
RETAIL: Consumer sentiment; Optimized marketing
EDUCATION & RESEARCH: Experiment sensor analysis
LIFE SCIENCES: Clinical trials; Genomics
AUTOMOTIVE: Auto sensors reporting location, problems
COMMUNICATIONS: Location-based advertising
HIGH TECHNOLOGY / INDUSTRIAL MFG.: Mfg quality; Warranty analysis
UTILITIES: Smart Meter analysis for network capacity
OIL & GAS: Drilling exploration sensor analysis
LAW ENFORCEMENT & DEFENSE: Threat analysis, Social media monitoring, Photo analysis

5 Why is data suddenly big? Web/Mobile Clickstream; Social Media; Sensor Networks; Audio, Image & Video; Video & Voice Processing; Text Sentiment Analysis; Social Graph Analysis

6 Big Data is Getting Bigger & More Multi-structured. ... trillion gigabytes of data was created in 2011.* More than 90% is unstructured data. Data volume doubles every year. [Chart: GB of Data (in billions), axis from 0 to 10,000; series: Structured Data, Unstructured Data] * Source: IDC

7 The Established Way: Bringing Data to Applications
Can't Get a 360 View: Many special-purpose systems; Moving data around; No complete views
Can't Retain Valuable Data: Leaving data behind; Risk and compliance; High cost of storage
Can't Meet ETL SLAs: Up-front modeling; Transforms slow; Transforms lose data
Can't Ask New Questions: Existing systems strained; No agility; "BI backlog"
Systems: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE
Data sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES

8 A modern data architecture is needed to drive success from data

9 The Hadoop Way: Bringing Applications to Data
Systems: SERVERS, MARTS, EDWS, DOCUMENTS, STORAGE, SEARCH, ARCHIVE
Data sources: ERP, CRM, RDBMS, MACHINES; FILES, IMAGES, VIDEOS, LOGS, CLICKSTREAMS; EXTERNAL DATA SOURCES
Consolidated Architecture: Bring applications to data; Combine different workloads on common data (e.g. SQL + Search); True analytic agility
1 Active Archive: Full fidelity original data; Indefinite time, any source; Lowest cost storage
2 Scalable Transformations: One source of data for all analytics; Persist state of transformed data; Significantly faster & cheaper
3 Agile Exploration: Simple search + BI tools; "Schema on read" agility; Reduce BI user backlog requests
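
The "schema on read" idea on slide 9 can be illustrated with a minimal sketch. It assumes newline-delimited JSON events and uses hypothetical names (`RAW_EVENTS`, `read_with_schema`) that do not come from the presentation. The raw records are stored untouched, and each question applies its own schema only at read time.

```python
import json

# Hypothetical raw events, kept exactly as they arrived (full fidelity).
RAW_EVENTS = [
    '{"user": "a", "page": "/home", "ms": 120}',
    '{"user": "b", "page": "/cart", "ms": 340, "referrer": "ad"}',
    '{"user": "a", "page": "/cart"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time: select and cast only the requested fields.

    `schema` maps a field name to a cast function. Fields missing from a
    record come back as None instead of failing the whole read.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            name: (cast(record[name]) if name in record else None)
            for name, cast in schema.items()
        }


# Two different questions, two schemas, one copy of the stored data.
latency_view = list(read_with_schema(RAW_EVENTS, {"page": str, "ms": int}))
referrer_view = list(read_with_schema(RAW_EVENTS, {"user": str, "referrer": str}))
```

Because no schema is fixed when the data is written, a new question only needs a new schema dictionary, not a reload or up-front remodeling of the stored events.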

10 Hadoop Ecosystem: An Open Platform (* = CDH supported)
Projects, in the order the slide builds them up:
Core Hadoop* (HDFS, MapReduce)
Solr*, Pig*
HBase*, ZooKeeper*
Hive*, Mahout*
Sqoop*, Avro*
Flume*, Bigtop*, Oozie*, HCatalog*, Hue*, YARN*
Spark*, Tez, Impala*, Kafka*, Drill
Parquet*, Sentry*
Knox, Flink
Kudu*, RecordService*, Ibis*, Falcon

11 By 2017, 60% of big data projects will fail to go beyond the pilot phase. Through 2018, 90% of deployed data lakes will be useless as they are overwhelmed with information assets captured for uncertain use cases. Source: Gartner, "Predicts 2015: Big Data Challenges Move From Technology to the Organization", November 2014

12 Big Data and the Technology Adoption Cycle. According to FirstMark VC, Big Data is entering the Early Majority phase.

13 Where does the road go? Maturation Focus on AI Applications Specialized Use Case Support

14 In healthcare, IoT can cut the costs of chronic disease treatment by as much as 50 percent. Sources: McKinsey & Co, Customer Journey Analytics & Big Data, 2013; McKinsey Analysis, The Internet of Things: Mapping the value beyond the hype, June 2015

15 Improve Products & Services Efficiency. End-to-end view of data is helping save lives by detecting sepsis early enough for successful treatment. Has already saved 100s of lives and reduced hospital readmissions. Centralized data from many systems is available in a secure environment. 2PB+ in a multi-tenant environment supporting 100s of clients.

16 Thank you!

17 Data is Transforming Business: Drive Customer Insights; Improve Products & Services Efficiency; Lower Business Risks