© Hortonworks Inc. 2012 Go beyond debug Wire Tap your App for knowledge with Hadoop Tom McCuch Solution Hortonworks Twitter: tmccuch Oleg.

Presentation transcript:

Go beyond debug: Wire Tap your App for knowledge with Hadoop
Tom McCuch, Solution, Hortonworks, Twitter: tmccuch
Oleg Zhurakousky, Principal, Hortonworks, Twitter: z_oleg

The Application Development Dilemma
Today, application developers devote roughly 80% of their code to persisting roughly 20% of the total data flowing through their applications.
– 80% of the data flowing through our applications is at best lost in rolling log files and at worst never collected, without ever being analyzed or accounted for.
– For the remaining 20% that we do collect, application-level database programming, licensing, storage, administration, and ETL processing have maxed out IT operations budgets and kept app development teams from keeping pace with the rate of change in the business.

Example: Data Available During Ingest
– Record count
– Highest/lowest record length
– Average record length
– Compression ratio
But with a little more work...
Field parsing
– Unique values
– Unique values per field
– Access to values of each field independently from the record
– Relatively fast field-based searches, without indexing
– Value encoding
– Etc.
These are cross-cutting concerns!
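The record-level figures listed on this slide can all be gathered in one pass while records stream in. The following is a minimal sketch, not the presenters' code: the names `IngestStats` and `collect_ingest_stats` are hypothetical, records are assumed to arrive as byte strings, and `zlib` is an assumed stand-in for whatever codec the real ingest path uses.

```python
import zlib
from dataclasses import dataclass


@dataclass
class IngestStats:
    record_count: int
    min_length: int
    max_length: int
    avg_length: float
    compression_ratio: float  # raw bytes divided by compressed bytes


def collect_ingest_stats(records):
    """Compute record-level statistics in a single pass over byte records."""
    compressor = zlib.compressobj()  # assumption: zlib stands in for the real codec
    count = total = compressed = 0
    min_len = max_len = None
    for record in records:
        length = len(record)
        count += 1
        total += length
        min_len = length if min_len is None else min(min_len, length)
        max_len = length if max_len is None else max(max_len, length)
        compressed += len(compressor.compress(record))
    compressed += len(compressor.flush())
    if count == 0:
        return IngestStats(0, 0, 0, 0.0, 0.0)
    ratio = total / compressed if compressed else 0.0
    return IngestStats(count, min_len, max_len, total / count, ratio)
```

Because the statistics are accumulated as a side effect of reading each record, nothing about the primary ingest logic has to change, which is what makes them a cross-cutting concern.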

How do we address cross-cutting concerns without disturbing the existing process flow?

Wire Tap Defined

Wire Tap is an Enterprise Integration Pattern
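In the Wire Tap pattern, each message travelling over a channel is also published to a secondary consumer, while the primary consumer receives it unchanged. The sketch below illustrates the idea with a hypothetical in-process `Channel` class. It is an assumption made for illustration, not the API of any integration framework and not the code shown in the talk.

```python
class Channel:
    """A minimal in-process message channel with optional wire taps."""

    def __init__(self, handler):
        self._handler = handler  # primary consumer of the channel
        self._taps = []          # secondary consumers that only observe

    def add_wire_tap(self, tap):
        """Register a callable that receives a copy of every message."""
        self._taps.append(tap)

    def send(self, message):
        for tap in self._taps:
            try:
                tap(dict(message))  # shallow copy, so a tap cannot alter the original
            except Exception:
                pass  # a failing tap must not disturb the main flow
        return self._handler(message)
```

The primary handler is unaware of the taps, so they can be added or removed without touching the existing process flow.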

Other Enterprise Integration Patterns
– Transformer: Convert payload or modify headers
– Filter: Discard messages based on boolean evaluation
– Router: Determine next channel based on content
– Splitter: Generate multiple messages from one
– Aggregator: Assemble a single message from multiple
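Each of these patterns can be pictured as a small function over messages. The sketch below assumes a message is a plain dict with `headers` and `payload` keys. That shape, the function names, and the specific rules (upper-casing, the `ERROR` check, splitting on lines) are hypothetical choices made only to keep the example concrete.

```python
def transformer(message):
    """Convert the payload (here: upper-case it) and keep the headers."""
    return {"headers": message["headers"], "payload": message["payload"].upper()}


def message_filter(message):
    """Boolean test: keep a message only if its payload is not blank."""
    return bool(message["payload"].strip())


def router(message):
    """Choose the name of the next channel based on the content."""
    return "errors" if "ERROR" in message["payload"] else "default"


def splitter(message):
    """Generate one message per line of the original payload."""
    return [
        {"headers": dict(message["headers"]), "payload": line}
        for line in message["payload"].splitlines()
    ]


def aggregator(messages):
    """Assemble a single message from several."""
    return {
        "headers": {"count": len(messages)},
        "payload": "\n".join(m["payload"] for m in messages),
    }
```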

The Business Case

Key Hadoop Data Types
1. Sentiment: Understand how your customers feel about your brand and products, right now
2. Clickstream: Capture and analyze website visitors' data trails and optimize your website
3. Sensor/Machine: Discover patterns in data streaming automatically from remote sensors and machines
4. Geographic: Analyze location-based data to manage operations where they occur
5. Server Logs: Research logs to diagnose process failures and prevent security breaches
6. Text: Understand patterns in text across millions of web pages, emails, and documents

Apache Hadoop Enterprise Use Cases
Each entry below reads Use Case: Data Type, grouped by Vertical.
Financial Services
– New Account Risk Screens: Text, Server Logs
– Fraud Prevention: Server Logs
– Trading Risk: Server Logs
– Maximize Deposit Spread: Text, Server Logs
– Insurance Underwriting: Geographic, Sensor, Text
– Accelerate Loan Processing: Text
Telecom
– Call Detail Records (CDRs): Machine, Geographic
– Infrastructure Investment: Machine, Server Logs
– Next Product to Buy (NPTB): Clickstream
– Real-time Bandwidth Allocation: Server Logs, Text, Sentiment
– New Product Development: Machine, Geographic
Retail
– 360° View of the Customer: Clickstream, Text
– Analyze Brand Sentiment: Sentiment
– Localized, Personalized Promotions: Geographic
– Website Optimization: Clickstream
– Optimal Store Layout: Sensor
Manufacturing
– Supply Chain and Logistics: Sensor
– Assembly Line Quality Assurance: Sensor
– Proactive Maintenance: Machine
– Crowdsourced Quality Assurance: Sentiment

Fraud Prevention
Vertical: Financial Services. Data: Server Logs
Business Problem
– Financial institutions are always at risk of fraud
– Fraudsters test bank systems for vulnerabilities
– This testing leaves subtle patterns that often go undetected by bank employees or law enforcement
– Fraud losses cost banks millions
Solution
– HDP reduces the cost to detect fraudulent activity
– HDP stores more types of data for longer
– Analysis of data in the "data lake" exposes fraudulent patterns that would have gone undetected

Credit Request Process Flow - Before
Credit Request Processing
– Credit Request arrives on a Gateway
– Credit Request is sent over a Channel
– Credit Request Processor receives the request, processes the request, and issues a response

Cross-Cutting Concerns
– Credit Scoring
– Fraud Detection
– Gathering Data Available during Credit Request Process Flow

Demo

Credit Request Processing Flow - After HDP
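The slide itself is a diagram, so the following is only a guess at the shape of the "after" flow: the credit request processor stays as it was, and a wire tap copies every request to a sink bound for Hadoop. Everything here is hypothetical, including the approval rule, the `with_wire_tap` helper, and the `io.StringIO` buffer that stands in for a file or stream landing in HDP.

```python
import io
import json


def process_credit_request(request):
    """Hypothetical processor: approve requests up to a fixed limit."""
    approved = request["amount"] <= 10_000
    return {"request_id": request["request_id"], "approved": approved}


def with_wire_tap(processor, sink):
    """Wrap a processor so each request is also written to `sink` as a JSON line."""
    def tapped(request):
        try:
            sink.write(json.dumps(request, sort_keys=True) + "\n")
        except Exception:
            pass  # capture is best effort; the credit flow must still run
        return processor(request)
    return tapped


sink = io.StringIO()  # stand-in for a file or stream that lands in Hadoop
handle = with_wire_tap(process_credit_request, sink)
response = handle({"request_id": "r-1", "amount": 2500})
```

The processor is passed in unmodified, so the "before" flow can be restored simply by calling `process_credit_request` directly.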

Example: HTTP Header Collection
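The transcript does not preserve how the demo collected HTTP headers. One way to picture it, offered purely as an assumption, is a WSGI middleware that records the request headers and then delegates to the wrapped application untouched. `HeaderTap` and its `collected` sink are hypothetical names. The only standard behaviour relied on is that WSGI exposes most request headers as `HTTP_`-prefixed keys in `environ`.

```python
class HeaderTap:
    """WSGI middleware that records request headers, then calls the wrapped app."""

    def __init__(self, app, collected):
        self.app = app
        self.collected = collected  # list-like sink; stand-in for an ingest path

    def __call__(self, environ, start_response):
        # Turn keys such as HTTP_USER_AGENT into names such as User-Agent.
        headers = {
            key[5:].replace("_", "-").title(): value
            for key, value in environ.items()
            if key.startswith("HTTP_")
        }
        self.collected.append(headers)
        return self.app(environ, start_response)
```

Wrapping an application is a one-line change, for example `app = HeaderTap(app, collected)`, and the application code itself is not modified.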

Example: Data Available During Ingest
– Record count
– Highest/lowest record length
– Average record length
– Compression ratio
But with a little more work...
Field parsing: unstructured data is not all that unstructured…
– Unique values
– Unique values per field
– Access to values of each field independently from the record
– Relatively fast field-based searches, without indexing
– Value encoding
– Etc.
These are cross-cutting concerns!
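Field parsing can be sketched for the simplest case of delimited text records. The functions below are hypothetical illustrations that assume comma-separated fields. They show unique values per field and an index-free field search by plain linear scan, not whatever technique the presenters used.

```python
from collections import defaultdict


def unique_values_per_field(records, delimiter=","):
    """Map each field position to the set of distinct values seen there."""
    values = defaultdict(set)
    for record in records:
        for position, field in enumerate(record.split(delimiter)):
            values[position].add(field.strip())
    return dict(values)


def search_field(records, position, wanted, delimiter=","):
    """Linear scan without an index: records whose field at `position` equals `wanted`."""
    matches = []
    for record in records:
        fields = record.split(delimiter)
        if position < len(fields) and fields[position].strip() == wanted:
            matches.append(record)
    return matches
```

For example, with access-log style records such as `"GET,/home,200"`, `unique_values_per_field` reports the distinct methods, paths, and status codes, and `search_field(records, 1, "/home")` returns every record for that path.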

Demo

Thank You! Questions & @hortonworks