Published by Hugh Reynolds. Modified over 9 years ago.
1
IBM SPSS Statistics
Prepared by: Dennis Buttera, Curriculum Advisor, IBM Academic Partnerships
© 2015 IBM Corporation
2
Our point of view on Hadoop
Every organization sees Hadoop as an open-source, rapidly evolving platform that can collect and economically store a very large corpus of highly variable types of data and make it available.
Yet most organizations are not fully realizing the value of Hadoop, either because they lack skilled data scientists and developers to extract the valuable insight or because of the complexity of scaling the Hadoop environment.
To drive Hadoop adoption, we see that organizations require advances:
To have the most powerful analytics in their hands
To distill experience and build skills to drive time to value faster
To easily incorporate Hadoop into a broader data architecture
3
What we are announcing at Strata on Feb 17
IBM Open Platform with Apache Hadoop
IBM BigInsights for Apache Hadoop
Sponsoring new global training program for data scientists
Open Data Platform initiative
Avnet Enabled Hadoop
4
IBM BigInsights for Apache Hadoop
Three new modules to get the most out of Hadoop

IBM BigInsights Analyst will include IBM's SQL engine and IBM's intuitive spreadsheet and visualizations to find data quickly and easily. On average, millions of SQL queries are run each year. With BigInsights Analyst, the efficiency of these queries has been shown in some cases to improve by approximately 2x to 4x on Apache Hadoop, depending on the shuffle size. Because the SQL is ANSI compliant, queries can run unchanged against Hive, HBase and relational databases.

IBM BigInsights Data Scientist will deliver a new machine-learning engine that automatically tunes its performance over large-scale data to find interesting patterns, plus over a dozen industry-specific algorithms such as Decision Trees, PageRank and Clustering to help tackle complex problems out of the box. It will also provide native support for open source R statistical computing, helping clients leverage their existing R algorithms or gain from the more than 4,500 freely available statistics packages from the R community.

IBM BigInsights Enterprise Management will introduce new management tools for clients to realize faster time to results. Designed to help allocate resources and optimize workflows, these tools will allow deployments that can scale to large numbers of users and clusters, and will help satisfy high workload demand. These tools will provide multi-tenancy and multi-instance support in a cluster.

STAC Report™ at http://www.stacresearch.com/node/15370
5
IBM BigInsights Data Scientist
Accelerates Data to Value with Less Code
[Figure panels: RHIPE implementation, RHadoop implementation, Big R implementation]
Coding in R like it was meant to be coded. Not embedding foreign code like Java.
6
ANSI Compatible SQL with 4X the Query Speed on Apache Hadoop
IBM BigInsights Analyst
Familiar worksheets for large-scale datasets
Web-based and simple-to-use UI for Big Data Analytics
7
IBM BigInsights Enterprise Management
Multi-tenancy and optimized workflows in a Hadoop Cluster
“In jobs derived from production Hadoop traces at Facebook, IBM® Platform™ Symphony accelerated Hadoop by an average of 7.3x.”
8
A Large Online Retailer gained a 541% improvement in revenue
Proof of Concept: IBM BigInsights on the Cloud to discover insights around specific business concerns.

Objective
Improved repeat shopping conversion rates, greater customer engagement, higher total revenue period-on-period.

Intended Benefits
Transition to a customer-interest-based marketing approach.
Combine multiple data sets to build a holistic view of the customer.
Model sales performance against environmental and multi-channel attribution.

Results
541% improvement in revenue
Success in building a customer-interest-based integrated data set
Disproved some long-held beliefs about weather driving sales in the online channel, and established a dichotomy between retail and online receptivity to marketing
9
IBM BigInsights for Apache Hadoop
Three new user-centric modules founded on an Open Data Platform
[Diagram. Modules: IBM BigInsights Analyst, IBM BigInsights Data Scientist, IBM BigInsights Enterprise Management, all on IBM Open Platform with Apache Hadoop. Component labels: Big SQL, BigSheets, Text Analytics, System ML on Big R, Distributed R, POSIX Distributed Filesystem, Multi-workload and Multi-tenant scheduling. User roles: Business Analyst, Data Scientist, Developer, Administrator.]
10
IBM is a Founding Member of the Open Data Platform Initiative
The Open Data Platform Initiative (ODP) is a shared industry effort focused on promoting and advancing the state of Apache Hadoop and Big Data technologies for the enterprise.
ODP aims to accelerate the delivery of Big Data solutions by providing a well-defined core platform to target.
Test, certify, and standardize the core components of a new “Open Data Platform” of select Apache Software Foundation (ASF) projects to provide a foundation on which Big Data solutions providers can build.
Initially Apache Hadoop (HDFS, YARN, MapReduce) and Apache Ambari (Provisioning, Management, and Monitoring)
Support for community development and outreach activities.
11
IBM Sponsors New Data Science Curriculum at Big Data University
A skills shortage is the major obstacle to adoption of big data & analytics technologies.
IBM is proud to sponsor Big Data University and its new curriculum for Programming for Analytics and Data Science.
Big Data University delivers free online courses to a community of over 230,000 registered participants around the world.
IBM sponsors and engages in the community to raise skills in the market
IBM fosters and supports a rapidly growing number of enrolled participants
Big Data University currently offers a number of free online courses, including Hadoop Fundamentals, SQL Access on Hadoop, BigSheets, HBase for Real-Time Access, Hive, Streams, IBM BLU Acceleration, Data Mining with R, Application Development, Pig, and many more, in multiple languages.
12
How is IBM different?

Hadoop and the Analytics Ecosystem
Very few companies can truly say they were early contributors to Hadoop and are still innovating today. IBM is one of only a handful that have continued to advance Hadoop from its early days as a single Apache project to the more than a dozen projects that exist today. IBM's core strength is its deep knowledge of the inner workings of Hadoop and of its strategic value in the enterprise. IBM continues its involvement with open source by joining the Open Data Consortium to ensure the stability of Hadoop as a foundation for Big Data & Analytics.

In-Hadoop Analytics
IBM invented SQL over 40 years ago, and it is still in use today as the lingua franca of data querying because of its simple syntax and ubiquity across organizations.

Client Success with Hadoop
Organizations rely on IBM to solve their most difficult analytics problems based on our depth of expertise and domain leadership in software, hardware, analytics and research. IBM brings 100,000 trained analytics professionals, over 200 customers, and a 24 billion investment into the platform.
13
What sets our offering apart?

An Open Hadoop for an Expanding Ecosystem
We have decoupled our Hadoop distribution from the core value components.

Performance Improvements at Every Level
We have applied our deep knowledge of distributed computing, query optimization, and workflow controllers to increase performance 4X to 11X compared to our next nearest competitor.

IBM Hadoop is Production Ready
IBM provides an Analytics Platform that incorporates Hadoop as a first-class citizen in the broader analytic architecture, removing the barrier of querying across, or moving artifacts to and from, other environments.