1
Big Data at The Speed of Business
Eric Mizell – Geoff Lunsford
2
Agenda – Atlanta Big Data Users Group
In-Memory Data Management
Hadoop
Right Tool for the Job
3
Big Data = Transactions + Interactions + Observations
[Slide graphic: data types arrayed by increasing variety and complexity, from ERP transaction records (megabytes/gigabytes: purchase detail, purchase records, payment records) through CRM and web interaction data (terabytes: segmentation, offer details, customer touches, web logs, A/B testing, behavioural targeting) to user-generated content, social feeds, sensor/RFID/GPS, and HD media data (petabytes). Source: graphic created in partnership with Teradata, Inc.]

Life used to be simple and very transactional in nature. In the early 90s, ERP transactions let you count your sales by customer and by location. The late 90s brought the age of segmentation and targeted offers, merging customer operations with marketing. Now, life is more complex, connected, and interactional: digital marketing enables measurement of interactions across channels, while social networks, mobile commerce, and user-generated content increase both the TYPES and VOLUMES of data, generated by system-to-system communication and by data exhaust from customer behavior such as click streams.

While most definitions of Big Data focus on the new forms of unstructured data flowing through businesses with new levels of "volume, velocity, variety, and complexity", I tend to answer the question using a simple equation: Big Data = Transactions + Interactions + Observations.

ERP, SCM, CRM, and transactional web applications are classic examples of systems processing Transactions; the highly structured data in these systems is typically stored in SQL databases. Interactions are about how people and things interact with each other or with your business: web logs, user click streams, social interactions and feeds, and user-generated content are classic places to find Interaction data. Observational data tends to come from the "Internet of Things": sensors for heat, motion, and pressure, plus RFID and GPS chips within such things as mobile devices, ATMs, and even aircraft engines, are just some examples of "things" that output Observation data.
4
DATA is Expanding Exponentially
A confluence of interconnected forces is generating and utilizing VAST AMOUNTS OF DATA. This exploding data creates business problems that prevailing technology platforms cannot address, preventing enterprises from quickly extracting business value from it (for example: faster processing, more transactions, faster analysis). To address this, data management infrastructure must simultaneously handle:
Volume – scale to massive amounts of data
Velocity – handle high-speed reads, writes, and updates
Variety – support structured and unstructured data
Value – provide an easy way to extract value
5
Big Data | Vast & Growing
Every minute of every day: 204 million emails are sent; companies on Facebook receive 34,722 "Likes"; over 100,000 tweets are sent out; consumers spend roughly $272,000 shopping online; 571 new websites are created; Google receives 2 million search requests; and 47,000 apps are downloaded from Apple. And we're not even talking about Flickr, YouTube, Instagram, Foursquare, and all the different blog posts that are created. Now consider that the digital universe is projected to be 50 times bigger by 2020. Source: IDC report.
6
What is a Data Driven Business?
DEFINITION: Better use of available data in the decision-making process.
RULE: Key metrics derived from data should be tied to goals.
PROVEN RESULTS: Firms that adopt data-driven decision making have output and productivity 5-6% higher than would be expected given their other investments and usage of information technology.*
* "Strength in Numbers: How Does Data-Driven Decisionmaking Affect Firm Performance?" Brynjolfsson, Hitt, and Kim (April 22, 2011)
7
Big Data: Optimize Outcomes at Scale
Media optimizes content
Intelligence optimizes detection
Finance optimizes algorithms
Advertising optimizes performance
Fraud optimizes prevention
Retail / Wholesale optimizes inventory turns
Manufacturing optimizes supply chains
Healthcare optimizes patient outcomes
Education optimizes learning outcomes
Government optimizes citizen services
Source: Geoffrey Moore, Hadoop Summit 2012 keynote presentation.
8
The Road Ahead…
1. Big Data is pervasive
2. But without the ability to extract value from it, Big Data is just a problem to be managed
3. To be valuable, Big Data must be fast and agile
Since most would agree that data is not yet fast enough, we're going to have to change the way we think about and manage data.
9
Big Data Holds Big Value for Enterprises
BIG DATA Opportunities:
Making Better Informed Decisions – strategies, recommendations
Discovering Hidden Insights – forensics, patterns, trends
Automating Business Processes – complex events, translation
Generating Revenue – know what your customers want, and when
Business Analytics Value Proposition: ACCESS, MANAGE, ANALYZE, ACT
10
In-Memory Data Management
11
BIG DATA FOR FAST BUSINESS
Terracotta is… the first-choice platform for high-value, high-velocity data, for businesses with high-value Big Data challenges.
12
What is BigMemory?
Our platforms are powered by BigMemory, the in-memory solution of choice for high-value data among progressive enterprises. No one can touch our predictable performance and high availability at Big Data scale.
13
Why is this a Game-Changing Innovation?
Data in a database, WITHOUT BigMemory: slow, expensive, complex.
Data in memory, WITH BigMemory: fast, cost-efficient, simplified.

Let's talk a little more about why BigMemory is a game-changer and potentially significant for you. BigMemory allows you to store data where it's used: in memory, where the application runs. It brings to in-memory data management all the capabilities needed to perform at scale; there is no simpler way to get predictably fast access to big volumes of in-memory data.

Without BigMemory, applications can't effectively use all the memory available in today's servers. With BigMemory, you can max out the biggest servers on the market, moving terabytes of your high-value data into memory. Capacity is unlimited: you can scale up AND scale out, all while maintaining high-performance data access. We make it possible to store and access hundreds of gigabytes, even terabytes, of your enterprise data without sacrificing the enterprise-class capabilities you are accustomed to.

It's very fast – it's in memory: microsecond data access, or 100x faster than disk-based, network-accessed stores (like a database).
It's cost-efficient – memory is cheap and abundant: servers with 96 GB of RAM cost less than $5K.
It's simple – we provide a standard, open interface used by millions of developers, and you can add scale and capabilities through simple configuration.

Why BigMemory? Exploding data volumes: enterprises need simple, high-performance access to very large volumes of data. Traditional technologies (databases, data warehouses) cannot handle these requirements, and customers want to harness the business value of massive volumes of enterprise data in real time or near real time.
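The in-memory-versus-database tradeoff above can be sketched with a toy read-through cache in Python. This illustrates the pattern only, not BigMemory's actual API; `InMemoryStore` and `slow_db_lookup` are invented names for the example:

```python
import time

class InMemoryStore:
    """Minimal in-memory key-value store with a read-through loader.

    Illustrates the cache-in-front-of-database pattern: reads are served
    from process memory, and only a miss pays the backing store's latency.
    """

    def __init__(self, loader):
        self._data = {}        # in-memory copy of hot data
        self._loader = loader  # called on a cache miss, e.g. a DB query

    def get(self, key):
        if key not in self._data:
            # Miss: fall back to the slow backing store, then keep a copy.
            self._data[key] = self._loader(key)
        return self._data[key]

def slow_db_lookup(key):
    time.sleep(0.01)  # stand-in for a network round trip to a database
    return key.upper()

store = InMemoryStore(slow_db_lookup)
first = store.get("order-42")   # miss: pays the database latency once
second = store.get("order-42")  # hit: served from process memory
```

After the first read, every subsequent `get` for that key avoids the simulated round trip entirely, which is the effect the slide describes at terabyte scale.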
14
Terracotta Commercial Products
Universal Messaging – low-latency universal messaging
Quartz Scheduler – enterprise Java job scheduler
Web Sessions – plug-in for enterprise-grade web session management
CEP – complex event processing
BigMemory – in-memory data management platform for enterprise Big Data
15
BigMemory Performance at Any Scale
SCALE UP. SCALE OUT. GO BIG. GO FAST.
BigMemory on commodity servers gives real-time access to massive amounts of business data: MORE data, users, customers, and transactions; QUICKER processing, analysis, services, and decisions.

BigMemory is our flagship product, and the name says it all: BigMemory stores "big" amounts of data in machine memory for ultra-fast access. It snaps into enterprise applications to deliver high-speed performance at any scale. BigMemory completely changes what our customers can do, whether that's doubling order throughput with real-time processing, cutting risk analysis from 45 minutes to 45 seconds, or launching a new service streaming TV to iPads. What can your business do with high-volume, real-time data access?

In the old architecture, everything was stored in a database. But the amount of data is growing so fast, and the useful life of that data is shrinking so fast, that you no longer have the time to go looking for the data you need in a database.
16
Powerful Enterprise-Class Features
4.0: Bigger and faster, Hadoop-ready, continuous uptime.
17
Step 1: Pull all of your key data in-memory for ultra-fast data access and management.
18
Step 2: Connect all of your data sources and devices with low latency universal messaging.
19
Step 3: Add real-time analytics and action for instant alerts, insights and responses.
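Steps 1-3 can be tied together in a minimal sketch: an in-memory view of the latest state (step 1), a message bus connecting sources (step 2), and an instant alert rule (step 3). All names, sensor readings, and the threshold below are invented for illustration:

```python
from queue import Queue

bus = Queue()    # step 2: sources publish events onto a shared bus
latest = {}      # step 1: in-memory view of the most recent reading per sensor
alerts = []      # step 3: sensors that tripped the alert rule

def publish(event):
    """A data source pushes an event onto the bus."""
    bus.put(event)

def drain():
    """Consume queued events: update the in-memory view, raise alerts."""
    while not bus.empty():
        event = bus.get()
        latest[event["sensor"]] = event["value"]
        if event["value"] > 100:  # step 3: an instant, illustrative alert rule
            alerts.append(event["sensor"])

publish({"sensor": "temp-1", "value": 72})
publish({"sensor": "temp-1", "value": 115})
drain()
```

A real deployment would use a low-latency messaging product rather than a local queue, but the shape of the pipeline (publish, keep hot state in memory, react instantly) is the same.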
20
Hadoop
22
What is Hadoop?
Open source data management with scale-out storage and distributed processing.

Storage (HDFS):
- Distributed across "nodes"
- Natively redundant
- Name node tracks locations
- Self-healing, high-bandwidth clustered storage

Processing (MapReduce):
- Splits a task across processors "near" the data and assembles results

Key characteristics:
- Scalable: efficiently store and process petabytes of data; linear scale driven by additional processing and storage
- Reliable: redundant storage; failover across nodes and racks
- Flexible: store all types of data in any format; apply schema on analysis and sharing of the data
- Economical: use commodity hardware; open source software guards against vendor lock-in
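The MapReduce model described above can be sketched in plain Python: a map step emits key-value pairs near the data, a shuffle groups pairs by key, and a reduce step assembles the results. This is a toy word count, not Hadoop's actual Java API; the function names are invented for illustration:

```python
from itertools import groupby
from operator import itemgetter

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in a line of input.
    for word in line.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle: group all emitted pairs by key, as Hadoop does between phases.
    for key, group in groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0)):
        yield key, [count for _, count in group]

def reduce_phase(key, counts):
    # Reduce: combine all counts for one key into a single result.
    return key, sum(counts)

lines = ["Big Data at the speed of business", "big data is pervasive"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped))
```

On a real cluster the map tasks run in parallel on the nodes holding each HDFS block, and the framework handles the shuffle and failover; the logic per phase is this simple.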
23
What is a Hadoop "Distribution"?
A complementary set of open source technologies that make up a complete data platform, tested and pre-packaged to ease installation and usage. A distribution collects the right versions of the components, which all have different release cycles, and ensures they work together.
Components: HDFS, MapReduce, Pig, Hive, HBase, HCatalog, WebHCat, WebHDFS, Sqoop, Flume, Oozie, ZooKeeper, Ambari, HA.
Source: Hortonworks Inc., 2012.
24
Hadoop in Enterprise Data Architectures
[Slide diagram: Hadoop bridging big data sources and the existing business infrastructure.]
Big Data sources (transactions, observations, interactions): CRM, ERP, financials, social media, exhaust data, logs, files
Existing business infrastructure: web applications, IDE & dev tools, ODS & datamarts, applications & spreadsheets, visualization & intelligence (EDW), operations, low-latency/NoSQL stores, discovery tools
New tech: Datameer, Tableau, Terracotta, Karmasphere, Platfora, Splunk
Hadoop platform components: HDFS, MapReduce, Pig, Hive, HBase, HCatalog, Templeton, WebHDFS, Sqoop, Flume, Oozie, ZooKeeper, Ambari, HA
25
Key Capability in Hadoop: Late binding
With traditional ETL, structure must be agreed upon far in advance, is difficult to change, and leaves data on the floor: web logs, click streams, machine-generated data, and OLTP records pass through an ETL server, and only the transformed data is stored in the data mart / EDW for client apps. With Hadoop, you capture all the data (extract and load web logs, click streams, machine-generated data, and OLTP records into the Hadoop core) and structure it as business needs evolve, dynamically applying transformations before feeding the data mart / EDW and client apps.
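The late-binding (schema-on-read) idea can be sketched in a few lines of Python: raw events are captured verbatim, and structure is applied only when a query runs, so a field nobody planned for is still queryable later. The log records and `query` helper are invented for illustration:

```python
import json

# Raw events captured as-is, with no upfront schema agreement.
raw_log = [
    '{"ts": "2013-05-01T10:00:00", "user": "a", "page": "/home"}',
    '{"ts": "2013-05-01T10:00:05", "user": "b", "page": "/cart", "referrer": "/home"}',
]

def query(records, fields):
    """Apply structure at read time: project the requested fields,
    treating fields absent from an event as None."""
    for line in records:
        event = json.loads(line)
        yield {field: event.get(field) for field in fields}

# "referrer" was never declared in any schema, yet nothing was lost:
rows = list(query(raw_log, ["user", "referrer"]))
```

Under traditional ETL, the transform that dropped `referrer` would have discarded it permanently; here the raw capture keeps every field until a business need for it appears.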
26
And Many New Data Access Methods
APACHE HIVE – SQL
APACHE PIG – ETL, data modeling, exploration
APACHE MAHOUT – machine learning (e.g., recommendation engines)
APACHE GIRAPH – graph processing
APACHE HBASE – distributed, high-volume reads/writes of small numbers of records
HDFS + MapReduce – distributed storage and batch processing
Plus R for statistical analysis, Storm and Apache S4 for stream processing, many commercial offerings, and more – all packaged and tested in the open source distribution.
27
Hadoop Now, Next, and Beyond
The Apache community, including Hortonworks, is investing to improve Hadoop: to make Hadoop an open, extensible, and enterprise-viable platform, and to enable more applications to run on Apache Hadoop. Hadoop 2.0 initiatives include Tez (optimized processing framework), Falcon (data management), Knox (secure access), and Stinger (interactive query).
28
Your Fastest On-ramp to Enterprise Hadoop™!
The Sandbox lets you experience Apache Hadoop from the convenience of your own laptop: no data center, no cloud, and no internet connection needed! The Hortonworks Sandbox is a free download: a complete, self-contained virtual machine with Apache Hadoop pre-configured; a personal, portable, and standalone Hadoop environment; and a set of hands-on, step-by-step tutorials that allow you to learn and explore Hadoop. For anyone looking to get their hands on Hadoop, we have recently introduced the Sandbox program, which enables users to download a full instance of HDP together with guided tutorials covering both development and administration topics. Go download your Sandbox today!
29
The Right Tool for The Right Job
30
Put it all together: To create real value for the business.
31
Use Case: Delivery Logistics
32
Use Case: Delivery Logistics
1. Sensor, weather, and traffic data (GPS, fuel level, vehicle temperature, outside temperature, etc.) is streamed into the CEP engine.
2. Real-time events, alerts, and dashboards: geofencing, vehicle breakdown, visibility.
3. All data is sent to Hadoop: machine learning, predictive maintenance, prescriptive analytics, long-term storage.
4. Reporting: roll up results, load into Hive, and view with traditional BI tools.
5. Data is shipped back to the CEP engine for continuous improvement (alerting, decision making).
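The geofencing check in the flow above can be sketched as follows: each streamed GPS reading is compared against a radius around a reference point, and out-of-fence vehicles raise alerts. The coordinates, vehicle names, and radius are invented for illustration, and a real system would run this inside the CEP engine:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in kilometers."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def geofence_alerts(events, center, radius_km):
    """Yield an alert for every position reading outside the fence."""
    for event in events:
        distance = haversine_km(event["lat"], event["lon"], *center)
        if distance > radius_km:
            yield {"vehicle": event["vehicle"], "distance_km": round(distance, 1)}

depot = (33.749, -84.388)  # hypothetical route point (Atlanta)
stream = [
    {"vehicle": "truck-1", "lat": 33.750, "lon": -84.390},  # inside the fence
    {"vehicle": "truck-2", "lat": 34.500, "lon": -84.388},  # well outside it
]
alerts = list(geofence_alerts(stream, depot, radius_km=5.0))
```

The same per-event rule shape (compute a value from the reading, compare against a threshold, emit an alert) also covers the vehicle-breakdown and temperature checks the slide mentions.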
33
Use Case: Real-Time Fraud Prevention
34
Use Case: Real-Time Fraud Prevention
1. All transactions are checked in real time, in memory: BigMemory stores X rolling days of transaction history and the fraud rules.
2. All transactions are sent to Hadoop: detect fraud patterns, analyze spending habits, provide long-term storage, and sell customer spend patterns to the business.
3. Reporting: roll up results, load into Hive, and view with traditional BI tools.
4. Data is shipped back to BigMemory for continuous improvement (new patterns, individual spend habits).
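The rolling-window check in step 1 can be sketched as follows. The window length, spend limit, and card IDs are invented for illustration, and a real deployment would hold this state in an in-memory store such as BigMemory rather than a local Python dict:

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(days=2)  # illustrative rolling window ("X days")
LIMIT = 1000.0              # illustrative fraud rule: total spend in window

history = defaultdict(deque)  # card id -> recent (timestamp, amount) pairs

def check(card, ts, amount):
    """Check one transaction against the rolling-window spend rule."""
    txns = history[card]
    # Evict transactions that have aged out of the rolling window.
    while txns and ts - txns[0][0] > WINDOW:
        txns.popleft()
    txns.append((ts, amount))
    total = sum(amount for _, amount in txns)
    return "FLAG" if total > LIMIT else "OK"

t0 = datetime(2013, 5, 1)
r1 = check("card-9", t0, 600.0)                       # within limit
r2 = check("card-9", t0 + timedelta(hours=1), 500.0)  # window total now 1100
r3 = check("card-9", t0 + timedelta(days=3), 200.0)   # old txns evicted
```

Because every check is a few in-memory operations, this is the kind of rule that can run on the hot path of every transaction, with Hadoop mining the full history offline for new patterns to feed back in.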