Go beyond debug: Wire Tap your App for knowledge with Hadoop

1 © Hortonworks Inc. 2012
Go beyond debug: Wire Tap your App for knowledge with Hadoop
Tom McCuch, Solution Engineering @ Hortonworks (Twitter: tmccuch)
Oleg Zhurakousky, Principal Architect @ Hortonworks (Twitter: z_oleg)

2 The Application Development Dilemma
Today, application developers devote roughly 80% of their code to persisting roughly 20% of the total data flowing through their applications.
–The other 80% of the data flowing through our applications is at best lost in rolling log files and at worst never collected, so it is never analyzed or accounted for
–For the 20% we do currently collect, application-level database programming, licensing, storage, administration, and ETL processing have maxed out IT operations budgets and kept app development teams from keeping pace with the rate of change in the business

3 Example: Data Available During Ingest
Record count
Highest/Lowest record length
Average record length
Compression ratio
But with a little more work...
Field parsing
–Unique values
–Unique values per field
–Access to values of each field independently from the record
–Relatively fast field-based searches, without indexing
–Value encoding
–Etc…
These are cross-cutting concerns!
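The first four metrics on this slide can be gathered in a single pass over the records as they arrive. The sketch below is an illustration only, not code from the talk: the `IngestStats` and `ingest_stats` names are hypothetical, records are assumed to be text strings, and `zlib` stands in for whatever codec the real ingest path would use.

```python
import zlib
from dataclasses import dataclass


@dataclass
class IngestStats:
    record_count: int
    min_length: int
    max_length: int
    avg_length: float
    compression_ratio: float  # raw bytes divided by compressed bytes


def ingest_stats(records):
    """Compute basic ingest statistics over an iterable of text records.

    Hypothetical helper: record lengths are measured in UTF-8 bytes and
    zlib is only a stand-in for the real compression codec.
    """
    count = 0
    total = 0
    min_len = None
    max_len = 0
    compressor = zlib.compressobj()
    compressed = 0
    for record in records:
        data = record.encode("utf-8")
        size = len(data)
        count += 1
        total += size
        min_len = size if min_len is None else min(min_len, size)
        max_len = max(max_len, size)
        # Feed the stream incrementally so no record has to be kept around.
        compressed += len(compressor.compress(data))
    compressed += len(compressor.flush())
    if count == 0:
        return IngestStats(0, 0, 0, 0.0, 0.0)
    return IngestStats(count, min_len, max_len, total / count, total / compressed)
```

Because every metric is updated per record, the function never holds more than one record in memory, which is what makes it cheap enough to run alongside the main flow.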

4 How do we address cross-cutting concerns without disturbing the existing process flow?

5 Wire Tap Defined

6 Wire Tap is an Enterprise Integration Pattern
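As a rough illustration of the pattern, a wire tap sends a copy of each message to a secondary consumer while the primary consumer receives the message unchanged. The sketch below is a minimal in-process version under stated assumptions: the `Channel` class and its `add_wire_tap` method are hypothetical names, messages are plain dictionaries, and a real integration framework would supply its own channel and interceptor types.

```python
class Channel:
    """A minimal point-to-point channel with optional wire taps.

    Hypothetical illustration; not the API of any integration framework.
    """

    def __init__(self, handler):
        self.handler = handler  # the primary consumer of the channel
        self.taps = []          # secondary consumers that only observe

    def add_wire_tap(self, tap):
        self.taps.append(tap)

    def send(self, message):
        for tap in self.taps:
            try:
                # Each tap gets its own shallow copy of the message.
                tap(dict(message))
            except Exception:
                # A failing tap must never disturb the primary flow.
                pass
        return self.handler(message)
```

The handler is written without any knowledge of the taps, so collection can be added or removed without touching the existing process flow.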

7 Other Enterprise Integration Patterns
Transformer: Convert payload or modify headers
Filter: Discard messages based on boolean evaluation
Router: Determine next channel based on content
Splitter: Generate multiple messages from one
Aggregator: Assemble a single message from multiple
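Each of these patterns can be pictured as a small function over messages. The sketch below is only meant to make the one-line definitions concrete: the function names, the dictionary message shape with `headers` and `payload` keys, and the routing rule are all assumptions made for illustration.

```python
def transformer(message):
    """Convert the payload; headers are carried over unchanged."""
    return {**message, "payload": message["payload"].strip().upper()}


def message_filter(message):
    """Boolean evaluation: messages with an empty payload are discarded."""
    return bool(message["payload"])


def router(message):
    """Pick the next channel name from the message content (hypothetical rule)."""
    if message["headers"].get("priority") == "high":
        return "priority"
    return "standard"


def splitter(message):
    """Generate one message per comma-separated part of the payload."""
    return [{**message, "payload": part} for part in message["payload"].split(",")]


def aggregator(messages):
    """Assemble a single message from a non-empty list of messages."""
    return {
        "headers": dict(messages[0]["headers"]),
        "payload": ",".join(m["payload"] for m in messages),
    }
```

Splitting a message and aggregating the parts returns the original payload, which is a quick way to check that the two functions are inverses for this simple case.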

8 The Business Case

9 © Hortonworks Inc. 2013
6 Key Hadoop DATA TYPES
1. Sentiment: Understand how your customers feel about your brand and products, right now
2. Clickstream: Capture and analyze website visitors’ data trails and optimize your website
3. Sensor/Machine: Discover patterns in data streaming automatically from remote sensors and machines
4. Geographic: Analyze location-based data to manage operations where they occur
5. Server Logs: Research logs to diagnose process failures and prevent security breaches
6. Text: Understand patterns in text across millions of web pages, emails, and documents

10 20 Apache Hadoop Enterprise Use Cases
Vertical | Use Case | Data Type
Financial Services | New Account Risk Screens | Text, Server Logs
Financial Services | Fraud Prevention | Server Logs
Financial Services | Trading Risk | Server Logs
Financial Services | Maximize Deposit Spread | Text, Server Logs
Financial Services | Insurance Underwriting | Geographic, Sensor, Text
Financial Services | Accelerate Loan Processing | Text
Telecom | Call Detail Records (CDRs) | Machine, Geographic
Telecom | Infrastructure Investment | Machine, Server Logs
Telecom | Next Product to Buy (NPTB) | Clickstream
Telecom | Real-time Bandwidth Allocation | Server Logs, Text, Sentiment
Telecom | New Product Development | Machine, Geographic
Retail | 360° View of the Customer | Clickstream, Text
Retail | Analyze Brand Sentiment | Sentiment
Retail | Localized, Personalized Promotions | Geographic
Retail | Website Optimization | Clickstream
Retail | Optimal Store Layout | Sensor
Manufacturing | Supply Chain and Logistics | Sensor
Manufacturing | Assembly Line Quality Assurance | Sensor
Manufacturing | Proactive Maintenance | Machine
Manufacturing | Crowdsourced Quality Assurance | Sentiment

11 Fraud Prevention
Financial Services
Data: Server Logs
Business Problem
–Financial institutions are always at risk of fraud
–Fraudsters test bank systems for vulnerabilities
–This testing leaves subtle patterns often undetected by bank employees or law enforcement
–Fraud losses cost banks millions
Solution
–HDP reduces the cost to detect fraudulent activity
–HDP stores more types of data for longer
–Analysis of data in the “data lake” exposes fraudulent patterns that would have gone undetected

12 Credit Request Process Flow - Before
Credit Request Processing
–Credit Request arrives on a Gateway
–Credit Request is sent over a Channel
–Credit Request Processor: Receives Request, Processes the Request, Issues a Response
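The flow on this slide, a gateway handing a request to a channel that delivers it to a processor, might look roughly like the sketch below. Everything in it is assumed for illustration: the class names, the request and response fields, and especially the approval rule, which is a placeholder and not the logic shown in the demo.

```python
from dataclasses import dataclass


@dataclass
class CreditRequest:
    applicant: str
    amount: float


@dataclass
class CreditResponse:
    applicant: str
    approved: bool


class CreditRequestProcessor:
    """Receives a request, processes it, and issues a response."""

    def __init__(self, limit):
        self.limit = limit  # hypothetical approval threshold

    def process(self, request):
        # Placeholder rule: approve anything at or below the limit.
        return CreditResponse(request.applicant, request.amount <= self.limit)


class DirectChannel:
    """Delivers each message synchronously to a single subscriber."""

    def __init__(self, subscriber):
        self.subscriber = subscriber

    def send(self, message):
        return self.subscriber(message)


class Gateway:
    """Entry point that turns a caller's arguments into a message on the channel."""

    def __init__(self, channel):
        self.channel = channel

    def submit(self, applicant, amount):
        return self.channel.send(CreditRequest(applicant, amount))
```

The gateway knows only the channel and the processor knows only the request, so anything observing the channel can be added later without changing either end.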

13 Gathering Data Available during Credit Request Process Flow
Cross-Cutting Concerns
–Credit Scoring
–Fraud Detection

14 Demo

15 Credit Request Processing Flow - After HDP

16 Example: HTTP Header Collection
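One way to collect HTTP headers without touching the application is a thin wrapper around the request handler. The sketch below uses a WSGI middleware as a stand-in; it is not the code from the demo, and the `HeaderTap` class and its `sink` callable are hypothetical. In WSGI, client-supplied request headers appear in `environ` under keys prefixed with `HTTP_`, so `Content-Type` and `Content-Length`, which use unprefixed keys, are not captured here.

```python
class HeaderTap:
    """WSGI middleware that records request headers and then delegates.

    Hypothetical illustration: `sink` is any callable that accepts a dict,
    for example a writer that appends to a log destined for Hadoop.
    """

    def __init__(self, app, sink):
        self.app = app
        self.sink = sink

    def __call__(self, environ, start_response):
        # HTTP_USER_AGENT becomes User-Agent, HTTP_HOST becomes Host, etc.
        headers = {
            key[5:].replace("_", "-").title(): value
            for key, value in environ.items()
            if key.startswith("HTTP_")
        }
        try:
            self.sink(headers)
        except Exception:
            # Collection is best effort; the request must still be served.
            pass
        return self.app(environ, start_response)
```

Wrapping an application is a one-line change, `app = HeaderTap(app, sink)`, and the wrapped application receives exactly the same `environ` and `start_response` as before.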

17 Example: Data Available During Ingest
Record count
Highest/Lowest record length
Average record length
Compression ratio
But with a little more work...
Field parsing - unstructured data is not all that unstructured…
–Unique values
–Unique values per field
–Access to values of each field independently from the record
–Relatively fast field-based searches, without indexing
–Value encoding
–Etc…
These are cross-cutting concerns!
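The field-parsing items can be sketched by splitting each record once and storing the values by field position. This is an assumed illustration, not the demo code: the function names are hypothetical, records are taken to be delimiter-separated text with the same number of fields each, and dictionary encoding is used as one possible form of value encoding.

```python
from collections import defaultdict


def parse_fields(records, delimiter=","):
    """Split each record and group the values by field position."""
    columns = defaultdict(list)
    for record in records:
        for position, value in enumerate(record.split(delimiter)):
            columns[position].append(value)
    return dict(columns)


def unique_values_per_field(columns):
    """Return the set of distinct values seen in each field."""
    return {position: set(values) for position, values in columns.items()}


def dictionary_encode(values):
    """Replace each value with a small integer code, assigned in order of first use."""
    codes = {}
    encoded = []
    for value in values:
        encoded.append(codes.setdefault(value, len(codes)))
    return codes, encoded


def search_field(columns, position, wanted):
    """Return the record numbers whose field equals `wanted` (linear scan, no index)."""
    return [i for i, value in enumerate(columns.get(position, [])) if value == wanted]
```

A search touches only the values of one field rather than whole records, which is the sense in which it can be relatively fast without an index.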

18 Demo

19 Thank You! Questions & Answers
Follow: @tmccuch, @z_oleg, @hortonworks

