Big Data at The Speed of Business

Presentation transcript:

Big Data at The Speed of Business Eric Mizell – emizell@hortonworks.com Geoff Lunsford – geoff@terracottatech.com

Agenda: Atlanta Big Data Users Group. In-Memory Data Management; Hadoop; Right Tool for the Job.

Big Data = Transactions + Interactions + Observations
Graphic: increasing data variety and complexity, from megabytes to petabytes.
Megabytes (ERP): purchase detail, purchase record, payment record.
Gigabytes (CRM): segmentation, offer details, customer touches, support contacts.
Terabytes (WEB): web logs, offer history, A/B testing, dynamic pricing, affiliate networks, search marketing, behavioural targeting, dynamic funnels.
Petabytes: user-generated content, mobile web, SMS/MMS, sentiment, external demographics, HD video, audio, images, speech to text, product/service logs, social interactions and feeds, business data feeds, user click stream, sensors / RFID / devices, spatial and GPS coordinates.
Source: Contents of above graphic created in partnership with Teradata, Inc.
Life used to be simple and very transactional in nature. In the early ’90s, ERP transactions counted your sales by customer and by location. The late ’90s were the age of segmentation and targeted offers, merging customer operations with marketing. Now, life is more complex, connected, and interactional in nature! Digital marketing enables measurement of interactions across channels. Social networks, mobile commerce, and user-generated content increase the TYPES and VOLUMES of data, which is generated by system-to-system communication and by data exhaust from customer behavior such as click streams.
While most definitions of Big Data focus on the new forms of unstructured data flowing through businesses with new levels of “volume, velocity, variety, and complexity”, I tend to answer the question using a simple equation: Big Data = Transactions + Interactions + Observations.
ERP, SCM, CRM, and transactional Web applications are classic examples of systems processing Transactions. Highly structured data in these systems is typically stored in SQL databases.
Interactions are about how people and things interact with each other or with your business. Web logs, user click streams, social interactions and feeds, and user-generated content are classic places to find Interaction data.
Observational data tends to come from the “Internet of Things”. Sensors for heat, motion and pressure, and RFID and GPS chips within such things as mobile devices, ATMs, and even aircraft engines, are just some examples of “things” that output Observation data.

DATA is Expanding Exponentially
A confluence of interconnected forces is generating and utilizing VAST AMOUNTS OF DATA.
Exploding data creates a business problem that prevailing technology platforms cannot address. This prevents enterprises from quickly extracting business value from the data (for example: faster processing, more transactions, faster analysis).
To address this, data management infrastructure must simultaneously address:
Volume: scale to massive amounts of data
Velocity: handle high-speed reads, writes and updates
Variety: structured and unstructured data
Value: an easy way to extract value

Big Data | Vast & Growing
Every minute of every day…
204 million emails are sent.
Companies on Facebook receive 34,722 “Likes”.
Over 100,000 Tweets are sent out.
Consumers spend $272,070.00 shopping online.
571 new websites are created.
Google receives 2 million search requests.
47,000 apps are downloaded from Apple.
And we’re not even talking about Flickr, YouTube, Instagram, Foursquare and all the different blog posts that are created. Now consider that the digital universe is projected to be 50 times bigger by 2020.
Source: IDC report

What is a Data Driven Business?
DEFINITION: Better use of available data in the decision making process.
RULE: Key metrics derived from data should be tied to goals.
PROVEN RESULTS: Firms that adopt Data-Driven Decision Making have output and productivity that is higher than what would be expected given their investments and usage of information technology.*
* “Strength in Numbers: How Does Data-Driven Decisionmaking Affect Firm Performance?” Brynjolfsson, Hitt and Kim (April 22, 2011)

Big Data: Optimize Outcomes at Scale
Media: optimize content
Intelligence: optimize detection
Finance: optimize algorithms
Advertising: optimize performance
Fraud: optimize prevention
Retail / Wholesale: optimize inventory turns
Manufacturing: optimize supply chains
Healthcare: optimize patient outcomes
Education: optimize learning outcomes
Government: optimize citizen services
Source: Geoffrey Moore. Hadoop Summit 2012 keynote presentation.

The Road Ahead…
1. Big Data is pervasive.
2. But if you can’t extract value, Big Data is just a problem to be managed.
3. Value relies on Big Data being fast and agile.
Since most would agree that data is not yet fast enough, we’re going to have to change the way we think about and manage data.

Big Data Holds Big Value for Enterprises
BIG DATA Opportunities: Making Better Informed Decisions; Discovering Hidden Insights; Automating Business Processes; Generate Revenue.
Examples: Strategies, Recommendations; Forensics, Patterns, Trends; Complex Events, Translation; Know What Your Customers Want and When.
Business Analytics Value Proposition: ACCESS, MANAGE, ANALYZE, ACT

In-Memory Data Management

BIG DATA FOR FAST BUSINESS
Terracotta is… The 1st Choice Platform For High-value, High-velocity Data, For Businesses with High-value Big Data Challenges.

What is BigMemory?
Our platforms are powered by BigMemory, the in-memory solution of choice for high-value data among progressive enterprises. No one can touch our predictable performance and high availability at Big Data scale.

Why is this a Game-Changing Innovation?
WITHOUT BigMemory (data in database): slow, expensive, complex.
WITH BigMemory (data in memory): fast, cost-efficient, simplified.
Diagram tiers: data in memory, data in BigMemory, data in database.
Let’s talk a little more about why BigMemory is a big deal and potentially significant for you. BigMemory allows you to store data where it’s used: in memory, where the application runs. BigMemory brings to in-memory data management all the capabilities needed to perform at scale. There is no simpler way to get predictably fast access to big volumes of in-memory data.
Without BigMemory, applications can’t effectively use all the memory available in today’s servers. With BigMemory, you can max out the biggest servers on the market, moving terabytes of your high-value data into memory. Capacity is unlimited. You can scale up AND scale out, all while maintaining high-performance data access. We make it possible to store and access hundreds of gigabytes, even terabytes, of your enterprise data without sacrificing the enterprise-class capabilities you are accustomed to.
It’s very fast: it’s in memory. Microsecond data access, or 100x faster than disk-based, network-accessed stores (like a database).
It’s cost-efficient: memory is cheap and abundant. Servers with 96 GB of RAM cost less than $5K.
It’s simple: we provide a standard open interface that is used by millions of developers, and allow you to add scale and capabilities through simple configuration.
Why BigMemory? Exploding data volumes: enterprises need more access, and simpler access, at high performance, on very large volumes of data. Traditional technologies (databases, data warehouses) cannot handle these requirements. Customers want to harness the business value of massive volumes of enterprise data, in real time or near real time.
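
To make the "data in memory versus data in database" contrast on the slide above concrete, here is a minimal sketch of a read-through in-memory cache in plain Python. It is not BigMemory's API: the ReadThroughCache class and the slow_database_lookup function are hypothetical names invented for illustration, and the 10 ms sleep merely stands in for a disk-based, network-accessed store.

```python
import time
from typing import Callable, Dict


class ReadThroughCache:
    """Keeps values in process memory; falls back to a slower store on a miss."""

    def __init__(self, load_from_store: Callable[[str], str]) -> None:
        self._load = load_from_store
        self._data: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._data:
            # Served from memory: no round trip to the backing store.
            self.hits += 1
            return self._data[key]
        # Miss: fetch once from the slow store, then keep the value in memory.
        self.misses += 1
        value = self._load(key)
        self._data[key] = value
        return value


def slow_database_lookup(key: str) -> str:
    # Hypothetical stand-in for a disk-based, network-accessed database.
    time.sleep(0.01)
    return f"record-for-{key}"


cache = ReadThroughCache(slow_database_lookup)
first = cache.get("customer-42")   # miss: goes to the slow store
second = cache.get("customer-42")  # hit: served from memory
```

The second lookup never touches the backing store, which is the effect the slide attributes to keeping high-value data in memory. A production system would also need eviction, persistence and replication, none of which this sketch attempts.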

Terracotta Commercial Products
Universal Messaging: low-latency universal messaging
Quartz Scheduler: enterprise Java job scheduler
Web Sessions: plug-in for enterprise-grade web session management
CEP: complex event processing
In-memory data management platform for enterprise Big Data

BigMemory Performance at Any Scale
GO BIG. GO FAST.
Diagram labels: SCALE UP, SCALE OUT, BigMemory, Application, Commodity Server.
Real-time access to massive amounts of business data.
MORE: data, users, customers, transactions.
QUICKER: processing, analysis, services, decisions.
BigMemory is our flagship product. The name says it all. BigMemory stores “big” amounts of data in machine memory for ultra-fast access. BigMemory snaps into enterprise applications to deliver high-speed performance at any scale. BigMemory completely changes what our customers can do, whether that’s doubling order throughput with real-time processing, cutting risk analysis from 45 minutes to 45 seconds, or launching a new service streaming TV to iPads... What can your business do with high-volume, real-time data access?
In the old architecture, everything was stored in a database. The amount of data is growing so fast, and the useful life of that data is coming down very fast. You don't have the time to look for (the data we need) in a database.

Powerful Enterprise-Class Features
4.0: Bigger and Faster. Powerful Enterprise-Class Features. Hadoop-Ready. Continuous Uptime.

Step 1: Pull all of your key data in-memory for ultra-fast data access and management.

Step 2: Connect all of your data sources and devices with low latency universal messaging.

Step 3: Add real-time analytics and action for instant alerts, insights and responses.

Hadoop

What is Hadoop?
Open source data management with scale-out storage & distributed processing.
Storage: HDFS. Distributed across “nodes”; natively redundant; name node tracks locations. Self-healing, high-bandwidth clustered storage.
Processing: MapReduce. Splits a task across processors “near” the data & assembles results.
Key Characteristics
Scalable: efficiently store and process petabytes of data; linear scale driven by additional processing and storage.
Reliable: redundant storage; failover across nodes and racks.
Flexible: store all types of data in any format; apply schema on analysis and sharing of the data.
Economical: use commodity hardware; open source software guards against vendor lock-in.
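
The MapReduce idea on the slide above (split a task across blocks of data, then assemble the results) can be sketched in a few lines of single-process Python. This is only an illustration of the programming model, not Hadoop code: the function names map_phase, shuffle, reduce_phase and word_count are made up here, and a real cluster would run the map step in parallel on the nodes that hold each block.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(block: str) -> List[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one block of input."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key, as the framework does between phases."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Combine the values for each key into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks: Iterable[str]) -> Dict[str, int]:
    mapped: List[Tuple[str, int]] = []
    for block in blocks:  # on a cluster, each block is mapped on its own node
        mapped.extend(map_phase(block))
    return reduce_phase(shuffle(mapped))


# Two "blocks" standing in for pieces of a file spread across nodes.
blocks = ["big data is big", "data in memory is fast"]
counts = word_count(blocks)
```

Because each block is mapped independently, adding nodes adds map capacity, which is the "linear scale" property listed under Key Characteristics.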

What is a Hadoop “Distribution”?
A complementary set of open source technologies that make up a complete data platform.
Tested and pre-packaged to ease installation and usage.
Collects the right versions of the components, which all have different release cycles, and ensures they work together.
Components: WebHCat, WebHDFS, Sqoop, Flume, HCatalog, HBase, Pig, Hive, MapReduce, HDFS, Ambari, Oozie, HA, ZooKeeper
© Hortonworks Inc. 2012

Hadoop in Enterprise Data Architectures
Existing Business Infrastructure: IDE & Dev Tools, ODS & Datamarts, Applications & Spreadsheets, Visualization & Intelligence, Web Applications, Operations, Discovery Tools, EDW, Low Latency/NoSQL
New Tech: Datameer, Tableau, Terracotta, Karmasphere, Platfora, Splunk
Other diagram labels: Web, Custom, Existing
Hadoop platform: Templeton, WebHDFS, Sqoop, Flume, HCatalog, HBase, Pig, Hive, MapReduce, HDFS, Ambari, Oozie, HA, ZooKeeper
Big Data Sources (transactions, observations, interactions): CRM, ERP, financials, social media, exhaust data, logs, files

Key Capability in Hadoop: Late binding
With traditional ETL, structure must be agreed upon far in advance, is difficult to change, and leaves data on the floor.
Diagram: sources (web logs, click streams, machine-generated data, OLTP), ETL server (ETL, store transformed data), data mart / EDW, client apps.
With Hadoop, capture all data and structure data as business needs evolve.
Diagram: sources (web logs, click streams, machine-generated data, OLTP), Hadoop (Hortonworks Data Platform: Hadoop core, data services, operational services; extract & load, dynamically apply transformations), data mart / EDW, client apps.
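
A small Python sketch may help show what late binding (schema on read) means in practice: raw records are stored untouched, and each consumer applies its own structure only when it reads them. The sample events, the field names and the read_with_schema helper are all hypothetical, chosen only for illustration; in a Hadoop deployment the raw lines would sit in HDFS and the schema would typically be declared in a tool such as Hive or Pig.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, Optional

# Raw events are kept exactly as they arrived; nothing is dropped at load time.
RAW_EVENTS = [
    '{"ts": "2013-05-01T10:00:00", "user": "u1", "url": "/home", "ms": 120}',
    '{"ts": "2013-05-01T10:00:05", "user": "u2", "url": "/cart", "ms": 340, "coupon": "SPRING"}',
    "not valid json",
]


def read_with_schema(
    raw_lines: Iterable[str], schema: Dict[str, Callable]
) -> Iterator[Dict[str, Optional[object]]]:
    """Apply a schema at read time: pick and cast only the requested fields."""
    for line in raw_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # unparseable lines stay in storage but are skipped here
        if not isinstance(record, dict):
            continue
        yield {
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        }


# Two different "bindings" over the same stored data.
latency_view = list(read_with_schema(RAW_EVENTS, {"url": str, "ms": int}))
coupon_view = list(read_with_schema(RAW_EVENTS, {"user": str, "coupon": str}))
```

The coupon field was never planned for in the first view, yet it is available to the second one because the raw data was kept. With up-front ETL into a fixed structure, that field would have been "left on the floor".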

And Many New Data Access Methods
Machine learning: APACHE MAHOUT
ETL, data modeling, exploration: APACHE PIG
SQL: APACHE HIVE
Graph processing: APACHE GIRAPH
Distributed, high-volume reads/writes of small number of records: APACHE HBASE
Batch processing: MAP REDUCE
Distributed storage & processing: HDFS, MAP REDUCE
Package and test in OS. Plus: R for statistical analysis; Mahout for recommendation engine; Storm, Apache S4 for stream processing; many commercial offerings …

Hadoop Now, Next, and Beyond
The Apache community, including Hortonworks, is investing to improve Hadoop:
Make Hadoop an open, extensible, and enterprise-viable platform
Enable more applications to run on Apache Hadoop
Hadoop 2.0
Tez: optimized processing framework
Falcon: data management
Knox: secure access
Stinger: interactive query

Your Fastest On-ramp to Enterprise Hadoop™!
http://hortonworks.com/products/hortonworks-sandbox/
The Sandbox lets you experience Apache Hadoop from the convenience of your own laptop: no data center, no cloud and no internet connection needed!
The Hortonworks Sandbox is:
A free download
A complete, self-contained virtual machine with Apache Hadoop pre-configured
A personal, portable and standalone Hadoop environment
A set of hands-on, step-by-step tutorials that allow you to learn and explore Hadoop
For anyone looking to get their hands on Hadoop, we have recently introduced the Hadoop Sandbox program, which enables users to download a full instance of HDP together with guided tutorials covering both development and administration topics. Go download your Sandbox today!

The Right Tool for The Right Job

Put it all together: To create real value for the business.

Use Case: Delivery Logistics

Use Case: Delivery Logistics
Sensor, weather and traffic data streamed into CEP engine: GPS, fuel level, vehicle temp, outside temp, etc.
Real-time events, alerts, dashboards: geo-fencing, vehicle breakdown, visibility.
All data sent to Hadoop: machine learning, predictive maintenance, prescriptive analytics, long-term storage.
Reporting: roll up results and load into Hive; view with traditional BI tools.
Data shipped back to CEP engine: continuous improvement (alerting, decision making).
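
The real-time part of this use case (readings stream in, rules fire alerts) can be sketched as a simple rule evaluation over a stream of vehicle readings. This is plain Python rather than any CEP product's API. The Reading and GeoFence classes, the rectangular fence, and the fuel and temperature thresholds are all assumptions made for illustration; the slide does not specify them.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class Reading:
    vehicle_id: str
    lat: float
    lon: float
    fuel_pct: float
    engine_temp_c: float


@dataclass(frozen=True)
class GeoFence:
    """A rectangular geofence; real systems often use polygons instead."""
    min_lat: float
    max_lat: float
    min_lon: float
    max_lon: float

    def contains(self, lat: float, lon: float) -> bool:
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def detect_alerts(
    readings: Iterable[Reading],
    fence: GeoFence,
    min_fuel_pct: float = 10.0,        # hypothetical threshold
    max_engine_temp_c: float = 110.0,  # hypothetical threshold
) -> Iterator[str]:
    """Evaluate each reading as it arrives and yield alerts immediately."""
    for r in readings:
        if not fence.contains(r.lat, r.lon):
            yield f"{r.vehicle_id}: outside geofence"
        if r.fuel_pct < min_fuel_pct:
            yield f"{r.vehicle_id}: low fuel"
        if r.engine_temp_c > max_engine_temp_c:
            yield f"{r.vehicle_id}: engine overheating"


fence = GeoFence(min_lat=33.6, max_lat=34.0, min_lon=-84.6, max_lon=-84.2)
stream: List[Reading] = [
    Reading("truck-1", 33.75, -84.39, fuel_pct=55.0, engine_temp_c=90.0),
    Reading("truck-2", 34.20, -84.39, fuel_pct=8.0, engine_temp_c=95.0),
    Reading("truck-3", 33.80, -84.30, fuel_pct=40.0, engine_temp_c=118.0),
]
alerts = list(detect_alerts(stream, fence))
```

In the flow described on the slide, the thresholds would not stay fixed: results of the batch analysis in Hadoop would be shipped back to tune them, which here would mean calling detect_alerts with updated parameter values.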

Use Case: Real-Time Fraud Prevention

Use Case: Real-Time Fraud Prevention
All transactions are checked in real time, in memory: BigMemory stores X rolling days of transaction history and fraud rules.
All transactions sent to Hadoop: detection of fraud patterns, analysis of spending habits, long-term storage, selling customer spend patterns to business.
Reporting: roll up results and load into Hive; view with traditional BI tools.
Data shipped back to BigMemory: continuous improvement (new patterns, individual spend habits).
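
As a rough illustration of checking transactions against a rolling in-memory history, the sketch below keeps the last N days of transactions per card in a deque and flags a transaction when spend inside the window exceeds a limit. It is not BigMemory code. The RollingFraudCheck class, the single spend-limit rule, and the values of 3 days and 1000.0 are hypothetical; the slide leaves the window length ("X rolling days") and the fraud rules unspecified.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta
from typing import Deque, Dict, Tuple


class RollingFraudCheck:
    """Holds a rolling window of transactions per card, entirely in memory."""

    def __init__(self, window_days: int, max_spend: float) -> None:
        self._window = timedelta(days=window_days)
        self._max_spend = max_spend
        self._history: Dict[str, Deque[Tuple[datetime, float]]] = defaultdict(deque)

    def check(self, card: str, when: datetime, amount: float) -> bool:
        """Record the transaction; return True if it looks suspicious."""
        history = self._history[card]
        cutoff = when - self._window
        # Evict transactions that have rolled out of the window.
        while history and history[0][0] < cutoff:
            history.popleft()
        history.append((when, amount))
        total = sum(value for _, value in history)
        return total > self._max_spend


checker = RollingFraudCheck(window_days=3, max_spend=1000.0)
t0 = datetime(2013, 5, 1, 12, 0)
results = [
    checker.check("card-1", t0, 400.0),                       # window total 400
    checker.check("card-1", t0 + timedelta(days=1), 500.0),   # window total 900
    checker.check("card-1", t0 + timedelta(days=2), 200.0),   # window total 1100
    checker.check("card-1", t0 + timedelta(days=10), 200.0),  # old entries evicted
]
```

Each check touches only in-memory structures, so it can run inline with the transaction. The longer-term pattern detection described on the slide would run in Hadoop and feed new rules or limits back into this in-memory check.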