Big Data: Making Sense of it All! Speaker: Jamie Engesser, Hortonworks. © Hortonworks Inc. 2013

Presentation transcript:

Speaker: Jamie Engesser, Hortonworks
Big Data: Making Sense of it All!
Big Data is everywhere. We see it on commercials. We hear it in conversations over coffee. It is an expanding topic in the boardroom. The hype is palpable, but what is real and, better yet, how does it affect the status quo? At the center of the big data discussion is Apache Hadoop, a next-generation enterprise data platform that allows you to capture, process and share the enormous amounts of new, multi-structured data that doesn't fit into traditional systems.
In this session we will discuss:
- The evolution of Apache Hadoop and Hadoop's role within enterprise data architectures
- The relationship between Hadoop and existing data infrastructures such as the enterprise data warehouse
- Use cases and best practices on how to incorporate Apache Hadoop into your big data strategy
- Common patterns of Hadoop use (Refine, Explore and Enrich) and how they impact the Financial Services industry

Big Data: Making Sense of it All!
Jamie Engesser, VP, Solutions Engineering

Data Driven Business? Facts not Intuition!
“Data-driven decisions are better decisions – it's as simple as that. Using big data enables managers to decide on the basis of evidence rather than intuition. For that reason it has the potential to revolutionize management.”
Harvard Business Review, October 2012

Web giants proved the ROI in data products by applying data science to large amounts of data
- Amazon: 35% of product sales come from product recommendations
- Netflix: 75% of streaming video results from recommendations
- Prediction of click-through rates

Enterprise Data Scale
Organizations are redefining data strategies due to the requirements of the evolving Enterprise Data Warehouse (EDW). Hadoop is at the center of those strategies.

The Need for Hadoop
- Allows semi-structured, unstructured and structured data to be processed in a way that creates new insights of significant business value.
- Instead of looking at samples or small sections of data, organizations can look at large volumes of data to get a new perspective and make business decisions with a higher degree of accuracy.
- Reducing latency in business is critical for success. The massive scalability of big data systems allows organizations to process massive amounts of data in a fraction of the time required by traditional systems.
- Self-healing, extremely scalable, highly available environment built on cost-effective commodity hardware.
[Diagram: SCALE (storage & processing) axis showing Traditional Database, EDW, MPP Analytics, NoSQL and Hadoop Platform]

[Chart: axes Cost per Terabyte and Adoption. Size of bubble equal to cost effectiveness of solution. Source: Think Big Analytics]
Hadoop has the capability to process extremely large volumes of data much faster than traditional data systems, and at a fraction of their cost.

A little history… it’s 2005

A Brief History of Apache Hadoop
- Focus on INNOVATION. 2005: Yahoo! creates a team under E14 to work on Hadoop.
- Focus on OPERATIONS. 2007: The Yahoo! team extends its focus to operations to support multiple projects and growing clusters.
- Focus on STABILITY. 2011: Hortonworks is created to focus on “Enterprise Hadoop”. It starts with 24 key Hadoop engineers from Yahoo!.
[Timeline labels: Apache Project Established; Yahoo! begins to operate at scale; Enterprise Hadoop; Hortonworks Data Platform]

Apache Hadoop: Center of Big Data Strategy
Open source data management with scale-out storage & distributed processing

Storage: HDFS
- Distributed across “nodes”
- Natively redundant
- Name node tracks locations
- Self-healing, high-bandwidth clustered storage

Processing: MapReduce
- Splits a task across processors “near” the data & assembles results

Key Characteristics
Scalable
- Efficiently store and process petabytes of data
- Linear scale driven by additional processing and storage
Reliable
- Redundant storage
- Failover across nodes and racks
Flexible
- Store all types of data in any format
- Apply schema on analysis and sharing of the data
Economical
- Use commodity hardware
- Open source software guards against vendor lock-in
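The MapReduce idea on this slide (split a task across the data, then assemble the results) can be illustrated with a small word count. This is a single-process sketch in plain Python, not Hadoop's API: the function names are invented for illustration, and each string in `blocks` merely stands in for a data block stored on a different node.

```python
from collections import defaultdict
from itertools import chain


def map_phase(block):
    """Emit (word, 1) pairs for one block of text, as a mapper would."""
    return [(word.lower(), 1) for word in block.split()]


def shuffle(pairs):
    """Group emitted values by key, mimicking the shuffle/sort step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key to assemble the final result."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks):
    """Run map over every block, then shuffle and reduce the combined output."""
    mapped = chain.from_iterable(map_phase(block) for block in blocks)
    return reduce_phase(shuffle(mapped))


# Each string stands in for a block of a file held on a different node.
blocks = ["big data is everywhere", "hadoop stores big data"]
counts = word_count(blocks)
```

In a real cluster the map calls would run in parallel on the nodes that hold each block, and only the grouped intermediate pairs would cross the network.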

Enterprise Hadoop Distribution
Hortonworks Data Platform (HDP): Enterprise Hadoop
- The 100% open source and complete distribution
- Enterprise grade, proven and tested at scale
- Ecosystem endorsed to ensure interoperability

[Diagram: HORTONWORKS DATA PLATFORM (HDP)]
- HADOOP CORE: Distributed Storage & Processing
- PLATFORM SERVICES: Enterprise Readiness (HA, DR, Snapshots, Security, …)
- DATA SERVICES: Store, Process and Access Data
- OPERATIONAL SERVICES: Manage & Operate at Scale
- Components shown: HDFS, YARN (in 2.0), WebHDFS, MapReduce, HCatalog, Hive, Pig, HBase, Sqoop, Flume, Oozie, Ambari
- Deployment options: OS, Cloud, VM, Appliance

Where does it fit in the enterprise?

Existing Data Architecture
[Diagram, top to bottom]
- APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications
- DATA SYSTEMS: TRADITIONAL REPOS (RDBMS, EDW, MPP)
- DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); OLTP, POS SYSTEMS
- Alongside: DEV & DATA TOOLS (BUILD & TEST); OPERATIONAL TOOLS (MANAGE & MONITOR)

Existing Data Architecture
[Diagram, top to bottom]
- APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications
- DATA SYSTEMS: TRADITIONAL REPOS (RDBMS, EDW, MPP)
- DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); OLTP, POS SYSTEMS
- Alongside: DEV & DATA TOOLS (BUILD & TEST); OPERATIONAL TOOLS (MANAGE & MONITOR)
- New Sources (web logs, email, sensor data, social media): ?

An Emerging Data Architecture
[Diagram, top to bottom]
- APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications
- DATA SYSTEMS: TRADITIONAL REPOS (RDBMS, EDW, MPP)
- DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media); MOBILE DATA; OLTP, POS SYSTEMS
- Alongside: DEV & DATA TOOLS (BUILD & TEST); OPERATIONAL TOOLS (MANAGE & MONITOR)

Interoperating With Your Tools
[Diagram, top to bottom]
- APPLICATIONS: Microsoft Applications
- DATA SYSTEMS: TRADITIONAL REPOS
- DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media); MOBILE DATA; OLTP, POS SYSTEMS
- Alongside: DEV & DATA TOOLS; OPERATIONAL TOOLS (Viewpoint)

Patterns of Use

Operational Data Refinery
Pattern: Refine (of Refine, Explore, Enrich)
Collect data and apply a known algorithm to it in a trusted operational process.
1. Capture: Capture all data, new and traditional
2. Process: Parse, cleanse, apply structure & transform
3. Exchange: Push to existing data warehouse for use with existing analytic tools
[Diagram: DATA SOURCES (Traditional Sources: RDBMS, OLTP, OLAP; New Sources: web logs, email, sensor data, social media), DATA SYSTEMS (TRADITIONAL REPOS: RDBMS, EDW, MPP), APPLICATIONS (Business Analytics, Custom Applications, Enterprise Applications)]
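The three refinery steps above (capture, process, exchange) can be sketched as a tiny pipeline. This is a minimal illustration in plain Python rather than Hadoop tooling; the record fields `user` and `amount`, the JSON input and the CSV export format are all assumptions made for the example.

```python
import csv
import io
import json


def capture(raw_lines):
    """Capture: keep every incoming record untouched (raw landing zone)."""
    return list(raw_lines)


def process(raw_records):
    """Process: parse, cleanse and apply structure to the raw records."""
    rows = []
    for line in raw_records:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # cleanse: drop lines that cannot be parsed
        if "user" not in event or "amount" not in event:
            continue  # cleanse: drop records missing required fields
        rows.append({
            "user": str(event["user"]).strip().lower(),
            "amount": float(event["amount"]),
        })
    return rows


def exchange(rows):
    """Exchange: serialize as CSV, ready to load into a warehouse table."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["user", "amount"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical raw input: one good record, one unparseable line, one incomplete record.
raw = ['{"user": " Alice ", "amount": "12.50"}', "not json", '{"user": "bob"}']
export = exchange(process(capture(raw)))
```

The raw input is kept as captured, so a different `process` step could be run over the same data later without re-collecting it.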

Big Data Exploration & Visualization
Pattern: Explore (of Refine, Explore, Enrich)
Collect data and perform iterative investigation for value.
1. Capture: Capture all data
2. Process: Parse, cleanse, apply structure & transform
3. Exchange: Explore and visualize with analytics tools supporting Hadoop
[Diagram: DATA SOURCES (Traditional Sources: RDBMS, OLTP, OLAP; New Sources: web logs, email, sensor data, social media), DATA SYSTEMS (TRADITIONAL REPOS: RDBMS, EDW, MPP), APPLICATIONS (Business Analytics, Custom Applications, Enterprise Applications)]

Application Enrichment
Pattern: Enrich (of Refine, Explore, Enrich)
Collect data, analyze and present salient results for online apps.
1. Capture: Capture all data
2. Process: Parse, cleanse, apply structure & transform
3. Exchange: Incorporate data directly into applications
[Diagram: DATA SOURCES (Traditional Sources: RDBMS, OLTP, OLAP; New Sources: web logs, email, sensor data, social media), DATA SYSTEMS (TRADITIONAL REPOS: RDBMS, EDW, MPP; NOSQL), APPLICATIONS (Custom Applications, Enterprise Applications)]

Use Cases

Hadoop Handles Five New Data Types Not Suited for Relational Databases

Data Type: Sentiment
Definition: Data on opinions, emotions, and attitudes contained in social media, blogs, news, product reviews, and customer support interactions.
Examples: Twitter, Facebook, LinkedIn, website comments

Data Type: Clickstream
Definition: A virtual trail that a user leaves behind while visiting a website. A clickstream is a record of a user's activity on the Internet. For users not logged in to the site, this data may be captured using cookies.
Examples: pages visited, length of visit per page, flows between pages, interaction with web forms, bounce rates

Data Type: Sensor / Machine-generated
Definition: Data that is automatically created from a computer process, application, or other machine without the intervention of a human.
Examples: smart electric meters, network event logs, manufacturing QA

Data Type: Geo tracking
Definition: Location data from connected devices whose position is determined using GPS or by triangulation from cell towers.
Examples: oil and gas exploration, first responders, defense

Data Type: Free-form text
Definition: Information that does not have a pre-defined or predictable data model or does not fit well into relational tables. Unstructured information is typically text-heavy, but may contain data such as dates, numbers, and facts as well.
Examples: CRM records, books, documents, metadata, health records, video, e-mail messages, or Web page content

Key use-cases in Finance/Insurance
Trading analysis:
- How do I predict trading trends based on market sentiment?
- How do I test algorithms against years vs. days of data?
Fraud detection:
- Detect illegal credit card activity and alert bank/consumer
- Detect illegal insurance claims
Customer risk profiling:
- How likely is this customer to pay back his mortgage?
- How likely is this customer to get sick?
Internal fraud detection (compliance):
- Is this employee accessing financial information they are not allowed to access?

Twenty Hadoop Enterprise Use Cases

Vertical | Use Case | Data Type
Financial Services | New Account Risk Screens | Text
Financial Services | Fraud Prevention | System Logs
Financial Services | Horizontal View of Trading Risk | System Logs
Financial Services | 360° View of the Customer | Clickstream, Machine, Text
Financial Services | Data Combination for Insurance Claims | Geographic, Text
Telecom | Ingestion and Storage of Call Detail Records (CDRs) | Machine
Telecom | Throttling Network Bandwidth Intelligently | System Logs
Telecom | Next Product to Buy (NPTB) | Clickstream, Geographic
Telecom | Network and Product Adoption Patterns | System Logs
Telecom | Preventative Maintenance and Repairs | Machine
Retail | 360° View of the Customer | Clickstream, Sensor
Retail | Analyze Brand Sentiment | Sentiment
Retail | Supply Chain and Logistics | Geographic
Retail | Website and eCommerce Optimization | Clickstream
Retail | Optimize Store Layout Using Sensor Data | Sensor
Manufacturing | Supply Chain and Logistics | Sensor
Manufacturing | Sensor Data for Assembly Line Quality Assurance | Sensor
Manufacturing | Ingest and Storage of Call Detail Records (CDRs) | Machine
Manufacturing | Preventative Maintenance and Repairs | Machine
Manufacturing | Identify Product Defects in the Social Media Stream | Sentiment

Financial Services Solutions

Enterprise Financial Services Vision
[Diagram: Data Reservoir]
- Sources: Internal Data, LOB Specific (Private), External Data (Public), Third Party Data (Private)
- Classic Data Integration & ETL (steps 1 to 4 in the diagram): Data Movement (ETL); Security (Access / Encryption); De-Identification (Masking / Tokenization); Aggregation (Materialized Views)
- Encrypt, Firewall
- Distillery: Data Movement; Security (Access / Encryption); De-Identification; Aggregation; Data Inventory / Catalog
- Explore, Harvest: Visualization, MPP / SQL, MapReduce, Analytics
- Output: Data, Models, Algorithms, Dashboard, Reports, Apps
- Catalog, Inventory (step 5 in the diagram)

“Data Reservoir” Architecture
[Diagram]
- Financial Services sources: Check Systems, Trading, EWS, Banker Notes, Acct Tables
- LOAD: Sqoop, Flume, WebHDFS
- Table metadata repo: HCatalog
- REFINE: Data transformation with Pig & Hive
- VISUALIZE: EDW, ML, BI/Visual

Data Refinery
- Primary use case is to provide Capital Markets a common landing zone for structured and unstructured data
- Reduce ETL operational complexity by providing data sources and applications a single point of ingest and export for batch-driven information
- Offload EDW processing as desired: process and analyze data on HDP and transfer only what is required to the EDW
- Schema-on-read (HCatalog) instead of schema-on-write (RDBMS) allows for retention of raw data and export in multiple, convenient schemas
- Storage and processing cluster performs double duty as a centralized repository for audit and other regulatory requirements (OSC, SEC-CAT, etc.)
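The schema-on-read point above can be made concrete with a short sketch: the raw records are stored once, and each consumer applies its own schema when reading. This is plain Python for illustration only and does not use HCatalog's API; the trade fields and the two consumer schemas are hypothetical.

```python
import json

# Raw records are retained exactly as they arrived (no schema enforced on write).
RAW_TRADES = [
    '{"id": 1, "symbol": "ABC", "qty": 100, "price": 10.5, "trader": "t1"}',
    '{"id": 2, "symbol": "XYZ", "qty": 50, "price": 20.0, "trader": "t2"}',
]


def read_with_schema(raw_lines, schema):
    """Project raw records onto a schema chosen at read time.

    `schema` maps an output column name to (source field, type converter).
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            column: convert(record[field])
            for column, (field, convert) in schema.items()
        }


# Two hypothetical consumers read the same raw data through different schemas.
risk_schema = {
    "trade_id": ("id", int),
    "quantity": ("qty", int),
    "price": ("price", float),
}
audit_schema = {
    "trade_id": ("id", int),
    "trader": ("trader", str),
}

risk_view = list(read_with_schema(RAW_TRADES, risk_schema))
audit_view = list(read_with_schema(RAW_TRADES, audit_schema))
```

With schema-on-write, by contrast, the table layout would be fixed before loading, and any field that was not part of it would be lost to later consumers.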

Trade Analytics
- Holding trade data in an EDW is not a feasible long-term strategy
- Trades have thin margins, so long-term analysis of trade performance can yield value
- Trade algorithms generate a great deal of exhaust data (order trails, logs) that can be analyzed to choose and improve algorithms
- Models used for real-time analytics or actions can be derived and maintained by ongoing large data-set analytics made possible by HDP
- Analyze Trade Exception Management in one location

Risk Assessment
- Risk assessment is focused more on events and activities than on assets; effective risk analysis needs to be based on activity modeling from as much data as possible
- Risk assessment algorithms can be run on full event data rather than aggregate totals
- Computationally expensive “what if” predictive modeling scenarios can be executed frequently over data sets from multiple business lines
- Store all client and trade information in a single location; linking related data becomes a data processing task rather than a data standardization task
- Execute more risk assessment algorithms more often

Analytical Use Cases Build on Simple Store Use Case

Trade Analytics
- Improve trade algorithm performance
- Assess trade performance for individuals and suggest actions
- Retain data for analysis and audit in a single location
Data: Trades, Settlements, Exceptions, Client Information, Client Txn History, other market APIs
Key Architecture: HDP Cluster, CEP/HBase, Flume Ingestion Framework
Analytics: Mahout/R, External BI via Hive/ODBC

Risk Management
- Calculate risk based on “what-if” scenarios by portfolio
- Create models for real-time usage on entire trade data sets
- Execute computationally intensive optimization scenarios to mitigate risk and provide cut-off parameters to trade algorithms
Data: Trades, Settlements, Exceptions, Client Information, Client Txn History, other market APIs
Key Architecture: HDP Cluster, CEP/HBase, Flume Ingestion Framework
Analytics: Mahout/R, External BI via Hive/ODBC

[Base layer: Data Refinery on HDP]

Increasingly complex use cases can be added over others…
[Diagram layers: Data Refinery on HDP; Risk Management; Trade Analytics; Predictive Portfolio Models; Trade Algorithm Optimization]

Fraud Prevention
Financial Services / System Logs

Business Problem
Financial institutions are always at risk of fraud. Today’s bank robbers use data more than masks and guns. Many are sophisticated criminals who test the company’s risk controls, looking for vulnerabilities to exploit. This testing leaves a trail of activity that is often too subtle for any individual to notice, and “fraudsters” know to keep their illegal activity below a level where the institution, its algorithms, or law enforcement would take notice. Hadoop can detect those subtle patterns.

Solution
Financial services companies can use Hortonworks Data Platform (HDP) to reduce the cost of detecting fraudulent behavior and keep more fraudsters from stealing their money. Fraud-detection algorithms on top of Hadoop can examine more data over a longer period of time. Risk teams can pull from the HDP data lake to examine years of detailed financial transactions, IP addresses, and account records. This makes their fraud applications smarter than the many thieves who elude them today.
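One way to picture "activity kept below the level where anyone would take notice" is an account whose individual transactions never trip a per-transaction alert but whose total over a long history is large. The sketch below is a deliberately simple illustration of that idea, not a description of any real fraud model; both thresholds and the sample data are hypothetical.

```python
from collections import defaultdict

ALERT_THRESHOLD = 1000.0  # hypothetical per-transaction alert level
WINDOW_LIMIT = 2500.0     # hypothetical cumulative limit over the full history


def flag_sub_threshold_accounts(transactions):
    """Return accounts whose individual transactions all stay under the
    per-transaction alert level but whose cumulative total exceeds the limit."""
    totals = defaultdict(float)
    tripped_alert = set()
    for account, amount in transactions:
        totals[account] += amount
        if amount >= ALERT_THRESHOLD:
            tripped_alert.add(account)  # already visible to existing controls
    return sorted(
        account
        for account, total in totals.items()
        if total > WINDOW_LIMIT and account not in tripped_alert
    )


# Hypothetical history of (account, amount) pairs kept over a long period.
history = [
    ("acct-1", 900.0), ("acct-1", 950.0), ("acct-1", 980.0),
    ("acct-2", 1200.0), ("acct-2", 1500.0),
    ("acct-3", 300.0),
]
suspicious = flag_sub_threshold_accounts(history)
```

The pattern only becomes visible when the full transaction history is retained, which is the argument the slide makes for examining more data over a longer period.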

Rogue Trading
Financial Services / Data Refinery

Financial traders make large authorized transactions on behalf of their companies. In a few high-publicity events, certain traders learned those systems so well that they were able to make unauthorized trades that went undetected until the institution faced catastrophic losses. In 1995, Nick Leeson lost more than £800 million and drove Barings Bank out of business. More recently, rogue traders at Société Générale and JPMorgan Chase created billions of dollars of losses.

Hortonworks Data Platform (HDP) can help identify trading patterns like Leeson’s: those too subtle or complex for any human minder to easily identify. Irregularities between data managed by different systems can raise red flags if found early on. Often, retaining more data for longer can expose previously hidden patterns. HDP satisfies the need to analyze data from application logs, e-mails and other exhaust data to conduct an investigation and prevent massive losses.

Combine Structured & Unstructured Data for Insurance Claims
Financial Services / Data Refinery

Much of the valuable information stored in an insurance company’s claim systems is text-based and unstructured. The ability to extract that data and combine it with structured data gives the insurance provider new tools to measure risk more accurately. They can use that insight to protect themselves from moral hazard, offer rate discounts to low-risk customers and even offer new insurance products that they could not underwrite previously, when data was scarcer. For example, sensor data now helps insurance companies more accurately assess the risk incurred by a driver based on actual driving patterns, rather than the average miles driven by someone in that demographic.

But all of these new capabilities require the insurance company to store and process more data, from more sources, for longer. Hortonworks Data Platform (HDP) is ideally suited to deliver the greater insight and confidence that come from managing more data.

Set Interest Rates to Maximize Spread on Deposits
Financial Services / Data Refinery

Retail banks make revenue by attracting deposits and then loaning those deposits at an interest rate higher than the rate they pay to depositors. This is the “spread.” Banks sometimes find that the profitability of particular deposit accounts is declining. Deposit pricing (the interest rate that the bank offers to customers) may be out of line with the market, leading to the risk of customer attrition or declining margins.

A pricing strategy known as “retire and replace” manages this risk. The bank launches a new deposit product and ceases to offer the retired product. Some customers may switch from the retired account to the replacement product, but this occurs at a manageable rate. Hortonworks Data Platform (HDP) helps bank product managers set the profit-maximizing interest rate for the replacement product and respond more intelligently to their competitors’ price response.
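The spread described above is simple arithmetic: the rate earned on loans minus the rate paid on deposits. The sketch below works in basis points to keep the numbers exact; the balances and rates are made-up figures for illustration, not data from the presentation.

```python
def spread_bps(loan_rate_bps, deposit_rate_bps):
    """Spread in basis points: rate earned on loans minus rate paid on deposits."""
    return loan_rate_bps - deposit_rate_bps


def annual_net_interest(deposits, loan_rate_bps, deposit_rate_bps):
    """Annual net interest earned on a deposit balance that is fully lent out."""
    return deposits * spread_bps(loan_rate_bps, deposit_rate_bps) / 10_000


# Hypothetical retired product: pays depositors 1.50%, funds loans at 6.00%.
retired = annual_net_interest(1_000_000, loan_rate_bps=600, deposit_rate_bps=150)

# Hypothetical replacement product: pays depositors 1.00% on the same balance.
replacement = annual_net_interest(1_000_000, loan_rate_bps=600, deposit_rate_bps=100)
```

In this toy case the replacement product widens the spread from 450 to 500 basis points. The hard part in practice is estimating how many depositors leave at each candidate rate, which this sketch does not model.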

Accelerate Processing of Loan Documentation
Financial Services / Data Refinery

Banks originate, process and settle loan transactions with large volumes of unstructured data in paper documents, faxes and e-mails. Multiple teams may handle a particular loan file, contributing to errors. Lenders have responded to the changing regulatory environment by instituting more audits and internal controls. Unfortunately, auditors are often forced to manually search paper and electronic files to locate documents. This causes additional delay, which reduces both servicing margins on a loan portfolio and customer satisfaction.

Hortonworks Data Platform (HDP) brings much of that unstructured data into view. By facilitating the entire process of finding and analyzing data, HDP reduces errors and improves operational efficiency within the loan processing team.

Protect the Brand with Sentiment Analysis
Financial Services / New Analytic Applications

Sentiment analysis tells companies what people are feeling about their brands. The availability of social media sources creates a relatively new way to gauge public sentiment. Solid sentiment analysis can be considered straightforward, as the data resides outside the firm and is therefore not bound by organizational boundaries.

Hortonworks Data Platform (HDP) can store and process the social media feeds that make sentiment analysis possible. The resulting insight can help the company understand its customers better and make sure that its brand and service levels match what those customers expect.
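As a rough illustration of what scoring social media posts for sentiment can look like, here is a minimal lexicon-based sketch. It is a toy stand-in, not the method used by HDP or any particular product: the word lists and sample posts are invented, and real systems typically rely on much richer models.

```python
import re

# Tiny hypothetical lexicons; a real system would use far larger resources.
POSITIVE = {"great", "love", "helpful", "fast"}
NEGATIVE = {"terrible", "hate", "slow", "fees"}


def score(post):
    """Count positive words minus negative words in one post."""
    words = re.findall(r"[a-z']+", post.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def brand_sentiment(posts):
    """Average post score; 0.0 when there are no posts."""
    scores = [score(p) for p in posts]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical posts mentioning a bank's brand.
posts = [
    "Love the new app, support was fast and helpful!",
    "Terrible service and hidden fees.",
    "Opened an account today.",
]
average = brand_sentiment(posts)
```

Tracking this average over time, rather than reading any single value, is closer to how a brand team would use such a signal.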

Becoming Data Driven

Path to Becoming Big Data Driven
“Simply put, because of big data, managers can measure, and hence know, radically more about their businesses, and directly translate that knowledge into improved decision making & performance.”
- Erik Brynjolfsson and Andrew McAfee

Key Considerations for a Data Driven Business
1. Large web properties were born this way; you will have to adapt a strategy
2. Start with a project tied to a key objective or KPI. Don’t OVER engineer
3. Make sure your Big Data strategy “fits” your organization and grow it over time
4. Don’t do big data just to do big data. You can get lost in all that data

Thank You! Questions & Answers