Making Hadoop Ready for the Enterprise Hadoop Summit, June 27, 2013

Presentation transcript:

Making Hadoop Ready for the Enterprise Hadoop Summit, June 27, 2013 Anjul Bhambhri Vice-President, IBM Big Data Development © 2013 IBM Corporation

Big Data is the next Natural Resource “We have for the first time an economy based on a key resource (Information) that is not only renewable, but self-generating. Running out of it is not a problem, but drowning in it is.” — John Naisbitt 40 ZB Harvesting any resource requires Mining, Refining and Delivering

Imagine the Possibilities… What if you could detect neonatal infections sooner? Solution: 120 children monitored, 120K messages per second, billions of messages per day, infections detected 24 hours earlier. University of Ontario Institute of Technology (UOIT). http://www.youtube.com/watch?v=YosyLqbCrD4 ftp://public.dhe.ibm.com/common/ssi/ecm/en/odc03157usen/ODC03157USEN.PDF [UOIT case study]

Speaker notes: Fifteen million babies are born prematurely every year. Of those, over 1 million die, often in the first month of life. Many of these babies are in ICUs, connected to numerous monitors that measure key statistics such as heart rate and temperature. Until recently, these measurements were only sampled and aggregated into 2-3 readings to indicate the health of the baby. IBM collaborated with UOIT to develop a solution that processes 1,000 pieces of information per second, identifies patterns, correlates them with doctors' notes and family history, and applies predictive analytics. This has made it possible to spot the onset of an infection 24 hours in advance. Same data, but saved lives.

To better detect subtle warning signs of complications, clinicians need greater insight into the moment-by-moment condition of neonatal infants in an ICU. Fifteen million babies, one in 10 births, are born prematurely every year, a global project led by the WHO suggests. Of those, over 1 million die, often in the first 30 days of life. Yet many of these babies are in NICUs, connected to all sorts of monitors that measure key statistics such as heart rate, skin temperature, and respiration. These measurements add up to 90M per patient per day, yet most of this data is just sampled periodically and written into the patient record, not used for its predictive value.

IBM and UOIT developed a first-of-its-kind analytics solution that uses stream computing to capture and analyze real-time data from medical monitors, alerting hospital staff to potential health problems before patients manifest clinical signs of infection or other issues. Early warning gives caregivers the ability to deal proactively with potential complications, such as detecting infections in premature infants up to 24 hours before they exhibit symptoms. The solution monitors 120 children, analyzing 120K messages per second, billions of messages per day. Trials are expanding beyond Canada to include hospitals in the US, China and Australia. Big Data enabled doctors from the University of Ontario to apply neonatal infant monitoring to predict infection in the ICU 24 hours in advance.
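The volumes quoted for the UOIT deployment can be cross-checked with a little arithmetic. The sketch below assumes the 120K messages per second are spread evenly across the 120 monitored children, which the slide does not state explicitly; the variable names are illustrative only.

```python
# Cross-check of the message volumes quoted for the UOIT deployment.
# Assumption: the 120K messages/sec are spread evenly over the 120 children.
SECONDS_PER_DAY = 24 * 60 * 60

children = 120
messages_per_second = 120_000

per_child_per_second = messages_per_second // children
per_child_per_day = per_child_per_second * SECONDS_PER_DAY
total_per_day = messages_per_second * SECONDS_PER_DAY

print(f"per child per second: {per_child_per_second:,}")
print(f"per child per day:    {per_child_per_day:,}")
print(f"all children per day: {total_per_day:,}")
```

Under that assumption the per-child rate matches the "1,000 pieces of information per second" in the notes, the per-child daily total (86.4 million) is in line with the rounded "90M per patient per day", and the overall daily total is roughly ten billion messages.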

Constant Contact Transforming Marketing Campaign Effectiveness with IBM Big Data Analyze 35 billion annual emails to guide customers on best dates & times to send emails for maximum response Benefits 40 times improvement in analysis performance 15-25% performance increase in customer email campaigns Analysis time reduced from hours to seconds

Automobile and Manufacturing Quality Control and Customer Satisfaction. Inflexibility and scalability limitations of existing IT solutions have been an inhibitor to competitive advantage. A new solution is needed to improve quality and operational efficiency. Inventory control of parts. Manufacturing equipment and assembly line data. Warranty and services data from dealers. Telemetry data from vehicles. Next generation of Enterprise Data Warehouse.

New Opportunities with Big Data & Analytics. Big Data and Technology Platform: Transactional & Application Data, Machine Data, Enterprise Content, Social Data.

New Opportunities with Big Data & Analytics. Data Scientist, Business Analyst, User. Roles and Analytics. Big Data and Technology Platform.

New Opportunities with Big Data & Analytics. New Outcomes: enrich info base, improve customer interaction, reduce risk, gain efficiency and scale, optimize and monetize. Roles and Analytics. Big Data and Technology Platform.

Emerging Pattern of Big Data Implementation Ingestion and Real-time Analytic Zone Analytics and Reporting Zone Ingest Filter, Transform Correlate, Classify Warehousing Zone Enterprise Warehouse Data Marts Query Engines Cubes Data Sinks Extract, Annotate Descriptive, Predictive Models Connectors Landing and Analytics Sandbox Zone Hive/HBase Col Stores Widgets Discovery, Visualizer Search Analytics MapReduce Indexes, facets Documents In Variety of Formats Models Metadata and Governance Zone Repository, Workbench Ingest

The 5 Key Use Cases
Big Data Exploration: Find, visualize, understand all big data to improve decision making
Enhanced 360° View of the Customer: Extend existing customer views (MDM, CRM, etc.) by incorporating additional internal and external information sources
Security/Intelligence Extension: Lower risk, detect fraud and monitor cyber security in real time
Operations Analysis: Analyze a variety of machine data for improved business results
Data Warehouse Augmentation: Integrate big data and data warehouse capabilities to increase operational efficiency

Big Data Platform and Application Framework Analytic Applications Speed time to value with analytic and application accelerators BI / Reporting Exploration / Visualization Functional App Industry App Predictive Analytics Content Analytics Gather, extract and explore data using best of breed visualization IBM Big Data Platform Analyze streaming data and large data bursts for real-time insights Visualization & Discovery Applications & Development Systems Management Cost-effectively analyze petabytes of structured and unstructured information Accelerators Information Integration & Governance Hadoop System Stream Computing Data Warehouse Contextual Discovery Index and federated discovery for contextual collaborative insights Deliver deep insight with advanced in-database analytics and operational analytics Govern data quality and manage information lifecycle Cloud | Mobile | Security

Enterprise Capabilities on Hadoop
Key Platform Requirements: built-in analytics; enterprise-grade capabilities; integrated with enterprise software; ease of installation and management; reference hardware configurations; world-class support; full open source compatibility.
Business benefits: quicker time-to-value; reduced operational risk; enhanced business knowledge with flexible analytical platform; leverages and complements existing software investments.
Diagram labels: Visualization & Exploration, Development Tools, Advanced Engines, Connectors, Workload Optimization, Administration & Security, open source components, IBM-certified Apache Hadoop.

Big Data needs SQL
Most existing applications in the enterprise use SQL. SQL bridges the chasm between existing apps and Big Data.
SQL access to all data stored in Hadoop, via JDBC/ODBC, using rich standard SQL.
Intelligently leverages Map/Reduce parallelism or direct access to achieve low latency.
Diagram labels: Application, SQL Language, JDBC / ODBC Driver, JDBC / ODBC Server, Big SQL Engine, Data Sources (Hive tables, HBase tables, CSV files), Hadoop.
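The point that existing SQL applications can carry over unchanged can be illustrated with a small sketch. It is not Big SQL code: Python's built-in sqlite3 module stands in for the JDBC/ODBC connection, and the sales table and top_products function are hypothetical. The idea shown is only that the application issues standard SQL through a driver-level connection and does not care what storage sits behind it.

```python
import sqlite3

def top_products(conn):
    """Application code: plain SQL through a DB-API connection.

    Nothing here depends on where the rows are stored. With a different
    driver, the same query could be sent to another SQL engine.
    """
    cur = conn.cursor()
    cur.execute(
        "SELECT product, SUM(amount) AS total "
        "FROM sales GROUP BY product ORDER BY total DESC"
    )
    return cur.fetchall()

# Stand-in data source (assumption: an in-memory SQLite database
# replaces the Hadoop-backed tables for the sake of a runnable demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("widget", 30), ("gadget", 50), ("widget", 40)],
)

rows = top_products(conn)
print(rows)
```

Only the connection setup would change when pointing the application at a different SQL engine; the query text and the result handling stay the same, which is the bridging role the slide assigns to SQL.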

Text Analytics: Getting measurable insights
Most of the world's data is in unstructured or semi-structured text. Social media is rife with discussions about products and services. Company internal information is locked in blobs and description fields, and is sometimes even discarded. How do you get a metrics-based understanding of facts from unstructured text?
Healthcare Analytics: e-medical records, hospital reports
Public Sector: case files, police records, emergency calls…
Automotive Quality Insight: tech notes, call logs, online media
Insurance Fraud: insurance claims
Social Media for Marketing: Twitter, Facebook, blogs, forums
Over 80% of stored information is unstructured*
Structural analysis. Mining and visualization.

How Text Analytics Works
Input text: Football World Cup 2010: one team distinguished themselves well, losing to the eventual champions 1-0 in the Final. Early in the second half, Netherlands' striker, Arjen Robben, had a breakaway, but the keeper for Spain, Iker Casillas, made the save. Winger Andres Iniesta scored for Spain for the win.
Extracted table: World Cup 2010 Highlights
Name | Position | Country
Arjen Robben | Striker | Netherlands
Iker Casillas | Keeper | Spain
Andres Iniesta | Winger | Spain
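The step from free text to a table of players can be sketched with a few hand-written rules. This is a toy illustration in Python using regular expressions and a small role dictionary, not the IBM text analytics engine; the patterns, the role list, and the extract_players function are all assumptions made for this example and only cover the three sentence shapes in the sample text.

```python
import re

# Tiny "dictionary" of roles plus simple shapes for names and countries.
ROLE = r"(?i:striker|keeper|winger)"
NAME = r"[A-Z][a-z]+ [A-Z][a-z]+"
COUNTRY = r"[A-Z][a-z]+"

# One rule per sentence shape found in the sample text.
PATTERNS = [
    # "Netherlands' striker, Arjen Robben"
    re.compile(rf"(?P<country>{COUNTRY})['\u2019] (?P<role>{ROLE}), (?P<name>{NAME})"),
    # "the keeper for Spain, Iker Casillas"
    re.compile(rf"(?P<role>{ROLE}) for (?P<country>{COUNTRY}), (?P<name>{NAME})"),
    # "Winger Andres Iniesta scored for Spain"
    re.compile(rf"(?P<role>{ROLE}) (?P<name>{NAME}) scored for (?P<country>{COUNTRY})"),
]

def extract_players(text):
    """Return (name, position, country) tuples in order of appearance."""
    hits = []
    for pattern in PATTERNS:
        for m in pattern.finditer(text):
            record = (m["name"], m["role"].title(), m["country"])
            hits.append((m.start(), record))
    return [record for _, record in sorted(hits)]

TEXT = (
    "Early in the second half, Netherlands' striker, Arjen Robben, had a "
    "breakaway, but the keeper for Spain, Iker Casillas made the save. "
    "Winger Andres Iniesta scored for Spain for the win."
)

players = extract_players(TEXT)
for row in players:
    print(" | ".join(row))
```

Hand-written patterns like these break as soon as the phrasing changes, which is the motivation for a declarative rule language with dictionaries and an optimizer rather than ad hoc regular expressions.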

Text Analytics Language and Runtime
Offline (Development Environment): AQL extractor, declarative SQL-like language, discovery tools for AQL development, cost-based optimization.
Runtime (Text Analytics Runtime): input documents in, extracted objects out; high throughput, small memory footprint.
Dominant cost is CPU: general-purpose linguistic parsers, dictionaries.
Example AQL:
    create view Employment as
    select R.jobType as jobType, C.name as companyName
    from Company C, Role R
    where Follows(R.jobType, C.name, 0, 20)
      and ContainsDict('EmpAssociation.dict', RightContext(R.jobType,10));
[Diagram: the extractor shown as alternative operator graphs built from Dict, Select, and Join over Role and Company.]
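To make the predicates in the Employment view concrete, here is a rough Python model of what they test. It rests on assumptions the slide does not spell out: that Follows(a, b, min, max) holds when b begins between min and max characters after a ends, that RightContext(span, n) is the n characters to the right of the span, and that ContainsDict checks whether a span contains a dictionary entry. The dictionary entries, the sample sentence, and the substring matching (the real runtime matches on tokens) are simplifications invented for this sketch.

```python
from typing import NamedTuple

class Span(NamedTuple):
    begin: int  # inclusive character offset
    end: int    # exclusive character offset

def follows(first, second, min_chars, max_chars):
    """True if `second` starts min..max characters after `first` ends."""
    gap = second.begin - first.end
    return min_chars <= gap <= max_chars

def right_context(text, span, n):
    """Span covering up to n characters immediately right of `span`."""
    return Span(span.end, min(len(text), span.end + n))

def contains_dict(entries, text, span):
    """Simplified: substring match instead of token-based matching."""
    window = text[span.begin:span.end].lower()
    return any(entry.lower() in window for entry in entries)

def find(text, phrase):
    start = text.index(phrase)
    return Span(start, start + len(phrase))

# Hypothetical stand-ins for the Role and Company views and the dictionary.
TEXT = "Jane Doe, CEO of Acme Corp, spoke today."
roles = [find(TEXT, "CEO")]
companies = [find(TEXT, "Acme Corp")]
emp_association = ["of", "at"]

employment = [
    (TEXT[r.begin:r.end], TEXT[c.begin:c.end])
    for r in roles
    for c in companies
    if follows(r, c, 0, 20)
    and contains_dict(emp_association, TEXT, right_context(TEXT, r, 10))
]
print(employment)
```

The nested loop here is the naive join; the cost-based optimization mentioned on the slide is about choosing a cheaper evaluation order for the same declarative rule.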

Enterprise Data Tools: Business User, Data Scientist, Business Analyst, Developer, Administrator

Security and compliance in Big Data environments
Who is running specific big data requests? What map-reduce jobs are they running? Are these jobs part of an authorized program list accessing the data? Is there an exceptional number of file permission exceptions?
Taps for Hadoop: collect and stream audit data to the Collector; provide visibility for HDFS, MapReduce, RPC, Oozie, HBase, etc.
Collector: securely stores audit data collected by the taps; provides analytics, reporting & compliance workflow automation.
Diagram labels: Big Data Platform (Structured, Unstructured, Streaming), Hadoop Cluster, Clients.
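The audit questions on this slide map naturally onto simple checks over collected audit events. The sketch below is purely illustrative: the event fields, the authorized job list, and the threshold are hypothetical and do not reflect the format of any actual audit product.

```python
from collections import Counter

# Hypothetical policy settings.
AUTHORIZED_JOBS = {"daily_etl", "sales_rollup"}
PERMISSION_DENIED_LIMIT = 2

# Hypothetical audit events, as a collector might store them.
events = [
    {"user": "alice", "job": "daily_etl", "status": "ok"},
    {"user": "bob", "job": "adhoc_export", "status": "ok"},
    {"user": "bob", "job": "adhoc_export", "status": "permission_denied"},
    {"user": "bob", "job": "adhoc_export", "status": "permission_denied"},
    {"user": "bob", "job": "sales_rollup", "status": "permission_denied"},
]

def unauthorized_jobs(events):
    """Who ran jobs that are not on the authorized program list?"""
    return sorted(
        {(e["user"], e["job"]) for e in events if e["job"] not in AUTHORIZED_JOBS}
    )

def users_over_limit(events, limit):
    """Which users hit an exceptional number of permission errors?"""
    denied = Counter(
        e["user"] for e in events if e["status"] == "permission_denied"
    )
    return sorted(user for user, count in denied.items() if count > limit)

flagged_jobs = unauthorized_jobs(events)
flagged_users = users_over_limit(events, PERMISSION_DENIED_LIMIT)
print(flagged_jobs)
print(flagged_users)
```

In practice such rules would run continuously against the audit stream and feed the reporting and compliance workflow rather than a one-off script.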

Data Archiving and Masking on Hadoop
Data Masking: mask confidential data to avoid data breach & meet privacy compliance. Protect confidential data while preserving analytics. Support compliance with privacy regulations. Example: JASON MICHAELS (before masking) becomes ROBERT SMITH (after masking). Mask in-database, or extract and mask in Hadoop.
Data Archiving: cost-effective query-able archiving. Manage and apply retention policies for compliance. Enable business users to query hot, warm and cold data. Archive & purge, compress, load. Archives are query-able, auditable, restorable.
Diagram labels: Database, Hadoop, Data (complete business objects, data integrity), Schema, Metadata, Retention Policies, Archive files.
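One way to mask a name such as JASON MICHAELS while preserving analytics is deterministic substitution: the same input always maps to the same realistic-looking replacement, so joins and counts still line up. The sketch below is a minimal illustration of that idea using a keyed hash; the name pools, the key, and the mask_name function are assumptions for this example, not the masking algorithm of any IBM product.

```python
import hashlib
import hmac

# Hypothetical replacement pools; a real deployment would use far larger ones.
FIRST_NAMES = ["ROBERT", "MARIA", "DAVID", "LINDA"]
LAST_NAMES = ["SMITH", "GARCIA", "NGUYEN", "BROWN"]

def mask_name(full_name: str, key: bytes) -> str:
    """Deterministically replace a name with one drawn from the pools.

    The keyed hash (HMAC-SHA256) makes the mapping repeatable for anyone
    holding the key and hard to reverse for anyone without it.
    """
    normalized = full_name.strip().upper().encode("utf-8")
    digest = hmac.new(key, normalized, hashlib.sha256).digest()
    first = FIRST_NAMES[digest[0] % len(FIRST_NAMES)]
    last = LAST_NAMES[digest[1] % len(LAST_NAMES)]
    return f"{first} {last}"

KEY = b"demo-key-not-for-production"
masked = mask_name("JASON MICHAELS", KEY)
print(masked)
```

With pools this small, different people can receive the same masked name, so a production scheme would need larger pools or a collision-free mapping if masked values are used as identifiers.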

Introducing PureData for Hadoop: BigInsights Appliance
Simplified Experience: designed for easy and quick deployment; built-in tools designed for users to derive value quickly; easy connectivity to common data warehouse systems.
Built-in Expertise: enables "what-if analysis" and advanced analytics; supports structured, semi-structured, and unstructured data; built-in text processing engine and library of annotators to analyze large volumes of text-based information; data can be used in its native format, eliminating the need to pre-define and map structures.
Integration by Design: InfoSphere BigInsights software, cluster management, and IBM System x® servers; automatic parallelization and resource optimization to scale economically; enterprise-class security and platform management.

From Getting Started to Enterprise Deployment: InfoSphere BigInsights Brings Hadoop to the Enterprise
Chart axes: Breadth of capabilities, Enterprise class.
PureData for Hadoop: appliance simplicity for the enterprise (* pre-announced).
Enterprise Edition: sold by # of terabytes managed. Quick Start features PLUS: accelerators, enterprise integration, production support, production-ready features.
Quick Start Edition: free download, non-production. Big Sheets, Text Analytics, Big SQL, workload optimization / query support, dev tools, connectors, mgmt tools, IBM Hadoop Core.
Basic Edition: free download. Web-based mgmt console, Jaql, integrated install, Apache Hadoop.

Streams - Real Time Analytics

InfoSphere Data Explorer: delivering insights at the point of impact
Providing unified, real-time access and fusion of big data unlocks greater insight and ROI: discovery & navigation; clustering & categorization; contextual intelligence; easy-to-deploy applications. All at the scale required for today's big data challenges.
Data access & integration: index structured & unstructured data in place; support existing security; federate to external sources; leverage MDM, governance, and taxonomies.
Create a unified view of ALL information for real-time monitoring. Improve customer service & reduce call times. Increase productivity & leverage past work, increasing speed to market. Analyze customer data to unlock true customer value. Identify areas of information risk & ensure data compliance.

Organizations are Building Big Data Applications on Data Explorer
Diagram labels: Warehouse (structured enterprise data), Streams (data in motion), BigInsights (data at rest), Data Explorer (semi- & unstructured enterprise data), Data Explorer App Builder.

Get Started on Your Big Data Journey Today
Get Educated: IBM Big Data (ibm.com/bigdata), IBMBigDataHub.com, BigDataUniversity.com
Get Your Hands on Big Data: download Quick Start at ibm.co/QuickStart

THINK