Steve Sarsfield, Vertica - Hewlett Packard Enterprise

Presentation transcript:

Preparing your organization to derive insight from the internet of things
Steve Sarsfield (@SteveSarsfield), Vertica - Hewlett Packard Enterprise, March 2017

11/28/2017 3:17 AM
Driving customer demand for a smarter and more personalized product experience
Predictive maintenance
Fraud detection
Electronic health records
Customer support
Product recommendations

Challenges
Handling more data
Time to deliver analytics
License costs
Tuning costs
Skills to leverage new tools

The future belongs to those who analyze without limits
With analytics free from closed infrastructure and narrow deployment options:
Traditional data warehouse lock-in
Cloud analytics deployment lock-in
Hadoop and open source

The HPE Vertica Portfolio
All built on the same trusted and proven HPE Vertica Core SQL Engine
HPE Vertica Enterprise: columnar storage and advanced compression; maximum performance and scalability
HPE Vertica in the Cloud: get up and running quickly in the cloud; flexible, enterprise-class cloud deployment options
HPE Vertica for SQL on Hadoop: native support for ORC and Parquet; support for industry-leading distributions; no helper node or single point of failure
Core HPE Vertica SQL Engine: advanced analytics; open ANSI SQL standards, plus R, Python, Java, Spark, Scala; in-database machine learning
Regardless of how our customers want to consume and deploy Vertica, we have them covered. Most importantly, the entire Vertica portfolio is based on the same trusted, field-proven Vertica SQL engine and rich analytical functionality. So, whether customers need to access Big Data analytics via the cloud (either as SaaS or running on select Amazon hardware), on-premises, or on co-located Hadoop, no one provides the same breadth of functionality and consumption models as HPE Vertica.

The appeal of Vertica
Requirement: Proof
Extreme optimization: columnar design for high-performance analytics; aggressive compression; scalable to petabyte scale
Total cost of ownership: simple and predictable pricing; no penalty for additional hardware or connected users
Ready for your enterprise: SQL support for 100% of the TPC-DS benchmark queries; secure and ACID compliant; no single point of failure
Open and compatible: open platform with standards-compliant SQL, Python, Java; working with the open source community on Spark, Hadoop, Kafka, etc.

Bridging the gap between high-cost legacy EDWs and Hadoop data lakes
Legacy enterprise data warehouse: declining performance at scale; built on aging technology; expensive, with proprietary hardware; limited deployment options
Data lakes: low-cost storage of Big Data; some analytics capabilities; holding area for certain data

Complexity - Example: Analytics Ready for the Internet of Things
Goal: Deliver analysis of critical data at the source of the data and provide faster time to insight
R, Python and custom analytics: Access rich custom and predictive analytics in your favorite languages and tools, including R, Python, and custom functions
Live aggregate projections: Speed up queries that rely on resource-intensive aggregate functions like SUM, MIN/MAX, COUNT and Top-K
Pattern matching: Find matching subsequences of events and compare the frequency of event patterns
Event windows: Break a sequence into subsequences based on certain events or changes
Event series joins: Correlate events across streams when the times do not line up
SQL-99: Full ANSI SQL compliance
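
The idea behind an event series join (correlating events across streams when the times do not line up) can be sketched in plain Python. This is only an illustration of the concept, not Vertica's implementation: the function name `event_series_join` and the two sensor streams are hypothetical, and the sketch assumes each stream is already sorted by timestamp. Each event in one stream is paired with the most recent event at or before it in the other stream.

```python
from bisect import bisect_right


def event_series_join(left, right):
    """Pair each left event with the most recent right event at or before it.

    Both inputs are lists of (timestamp, value) tuples sorted by timestamp.
    Returns a list of (timestamp, left_value, right_value_or_None).
    """
    right_times = [t for t, _ in right]
    joined = []
    for t, left_value in left:
        # Number of right-hand events whose timestamp is <= t.
        i = bisect_right(right_times, t)
        right_value = right[i - 1][1] if i > 0 else None
        joined.append((t, left_value, right_value))
    return joined


# Hypothetical sensor streams sampled at different, non-aligned times (seconds).
temperature = [(0, 20.5), (10, 21.0), (20, 21.4)]
vibration = [(3, 0.12), (12, 0.31), (19, 0.29)]

for row in event_series_join(vibration, temperature):
    print(row)
```

A left event that occurs before any right event is paired with `None`, which mirrors the way an outer join leaves unmatched columns empty.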

How to fill analysis gaps
Customer segmentation; channel and location analysis; net profit; revenue
Gaps: geospatial; data types; time series gap analysis; event window functions; sessionization; statistical functions; in-place JOINs
With some solutions, you may be required to fill the gaps with: spinning together two or more open source projects; moving and copying big data; using generic data types; data munging; custom code
Geospatial: None of the SQL-on-Hadoop solutions has native geospatial functions. However, you can bring in a solution like SpatialHadoop, a MapReduce extension to Apache Hadoop designed specially to work with spatial data. This is very time-consuming.
Data types: Vertica supports date, time and many more data types than SQL-on-Hadoop solutions.
In-place JOINs: Vertica allows you to JOIN data that is sitting in your Vertica data warehouse with data that is sitting in an ORC or Parquet file in Hadoop. In other solutions, you must move the data.
Time series gap: Since both time and the state of data within a time series are continuous, it can be challenging to evaluate SQL queries over time. Input records often occur at non-uniform intervals, which can create gaps. To solve this problem Vertica provides: 1) gap-filling functionality, which fills in missing data points; 2) an interpolation scheme, which constructs new data points within the range of a discrete set of known data points. This is not available on Hadoop solutions.
Event window: Event-based windows let you break time series data into windows that border on significant events within the data. This is especially relevant in financial data, where analysis often focuses on specific events as triggers to other activity.
Sessionization: Sessionization, a special case of event-based windows, is a feature often used to analyze click streams, such as identifying web browsing sessions from recorded web clicks.
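
To make gap filling, interpolation and sessionization concrete, here is a minimal Python sketch of the ideas. It is not how Vertica implements them (there they are expressed in SQL). The function names `gap_fill_linear` and `sessionize`, the sample readings and the 1800-second timeout are all assumptions chosen for illustration. The sketch also assumes integer timestamps sorted in ascending order.

```python
def gap_fill_linear(points, step):
    """Resample sorted (time, value) points onto a regular grid.

    Grid times run from the first to the last known timestamp in
    increments of `step`; values between known points are linearly
    interpolated.
    """
    if not points:
        return []
    filled = []
    end = points[-1][0]
    j = 0  # index of the known point at or before the current grid time
    t = points[0][0]
    while t <= end:
        while j + 1 < len(points) and points[j + 1][0] <= t:
            j += 1
        t0, v0 = points[j]
        if t0 == t or j + 1 == len(points):
            filled.append((t, v0))
        else:
            t1, v1 = points[j + 1]
            filled.append((t, v0 + (v1 - v0) * (t - t0) / (t1 - t0)))
        t += step
    return filled


def sessionize(click_times, timeout):
    """Assign a session id to each sorted click time.

    A new session starts whenever the gap since the previous click
    exceeds `timeout`.
    """
    session_ids = []
    current = 0
    previous = None
    for t in click_times:
        if previous is not None and t - previous > timeout:
            current += 1
        session_ids.append(current)
        previous = t
    return session_ids


# Hypothetical readings that arrive at non-uniform intervals (seconds).
readings = [(0, 10.0), (2, 14.0), (5, 20.0)]
print(gap_fill_linear(readings, step=1))

# Hypothetical click timestamps (seconds) with a 30-minute session timeout.
clicks = [0, 40, 100, 2000, 2030, 5000]
print(sessionize(clicks, timeout=1800))
```

Sessionization here is the special case the slide describes: the "event" that opens a new window is simply a pause longer than the timeout.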

Perhaps the ultimate architecture is all-inclusive: Apache Spark, Hadoop and Kafka
HPE Vertica. Optimal use case: deep analysis; massive scale; many concurrent users
Spark. Optimal use case: small, fast-running queries; ETL and complex event processing; operational analytics. Features: Vertica performs optimized data load from Spark; Spark runs queries on Vertica data
Hadoop. Optimal use case: data lake; warm, cold storage; data discovery; ETL. Features: analyze in place without data movement via native ORC and Parquet readers; any Hadoop; run on the Hadoop cluster or on the Vertica cluster
Kafka. Features: share data between applications that support Kafka; data streaming into Vertica
Challenges: After transformation is done in Spark, need faster load of data into Vertica for SQL analytics; supply data to Spark machine learning analytics
Solution: Vertica open-source connector to Apache Spark
Benefits: Fast, scalable data transfer, exploiting Vertica’s parallelism, HDFS connectivity, and fluency with open source data formats; optimized queries where processing is pushed down to Vertica; Spark users can benefit from Vertica’s very advanced SQL analytics

“The choice was simple: the change to Vertica was much more cost effective than scaling their current Oracle system, while offering a much improved performance to execute very complex analytics use cases”
ROI: 351%
Payback: 4 months
Average annual benefit: $3,014,583

Suunto - Internet of Things (IoT)
Suunto enriches extreme-sport experience with IoT wearable analytics on Vertica
Challenges:
User data needed to be instantly gathered and compared not only to the user’s own historical data, but also to data from specific sub-segments of Suunto users who have similar performance and demographic characteristics
Management and analysis of 20 to 30 million individual training sessions per month in near real time
Results:
Ability to organize over 1 billion data measurements and cluster sub-segments of this data in a manner that allows for meaningful performance evaluation and improvement based upon real user data
Combine a variety of data measurements from devices and users to instantly provide summary performance data, peer comparisons, and training regimens
Vertica has enabled Suunto to provide a new level of value that feeds the competitive nature of Suunto’s customers.
“Vertica helps provide analytics so that athletes can train better and achieve more.”
HPE Confidential, 28 November 2017
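
To get a feel for the scale, the figure of 20 to 30 million individual training sessions per month can be converted to an average arrival rate. The short calculation below is a back-of-the-envelope sketch that assumes a 30-day month and a perfectly even load. Both are simplifying assumptions; real traffic would peak well above the average.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sessions_per_second(sessions_per_month, days_in_month=30):
    """Average arrival rate, assuming load is spread evenly over the month."""
    return sessions_per_month / (days_in_month * SECONDS_PER_DAY)


low = sessions_per_second(20_000_000)
high = sessions_per_second(30_000_000)
print(f"{low:.1f} to {high:.1f} sessions per second on average")
```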

Checklist for preparing for IoT
Open up your systems (not just open source)
Reconsider expensive legacy systems
Solutions that scale
Consider differing analytical workloads
Skills to leverage new tools

Think outside the box - New York Genome
2-3 years old
Collaboration between 17 or so hospitals, so that they would not all have to start their own genome center: Cornell, NYU, Stony Brook medical, New York Stem Cell Foundation
Develop algorithms to find molecular cause of diseases
Deal with errors in DNA sequencing
Share results with community of scientists
Compare tumor DNA to patient’s blood to find variant
Precision medicine: suggest drugs to interfere with mutation; specific cancer drugs
Arthritis, Alzheimer's, Parkinson's, asthma, diabetes, autism, cancer
G C A T
Gene sequencing data: 3 billion letters; 150 GB per person stored raw data; 450 GB with analytics; cancer genome just under a TB; compare to reference gene

Thank you
Steve.Sarsfield@hpe.com
Community Edition: my.vertica.com