Big Data Analytics Module 4 – Data Mining and Predictive Analytics Including Mahout
Saptak Sen, Microsoft; Bill Ramos, Advaiya

Presentation transcript:

Agenda
Overview of predictive analytics and data mining
How Microsoft supports predictive analytics
How Mahout fits into the picture
Demos

Data Mining

Recommendation engines
Advertising analysis
Weather forecasting for business planning
Social network analysis
IT infrastructure and web app optimization
Legal discovery and document archiving
Pricing analysis
Fraud detection
Churn analysis
Equipment monitoring
Location-based tracking and services
Personalized insurance

Predictive analytics should address the likelihood of something happening in the future, even if it is just an instant later.*

Rich data mining algorithms for clustering, classification, forecasting through time series analysis, and more
Rich developer experience

Ease of use through Excel
Rich data mining algorithms for clustering, prediction, forecasting, market basket analysis, and more
Scalable through integration with SSAS

Menu                           Data Mining
Analyze Key Influencers        Naïve Bayes
Detect Categories              Clustering
Fill From Example              Logistic Regression
Forecast                       Time Series
Highlight Exceptions           Clustering
Scenario Analysis – Goal Seek  Logistic Regression
Scenario Analysis – What If    Logistic Regression
Prediction Calculator          Logistic Regression
Shopping Basket Analysis       Association Rules
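
As a rough illustration of the association rules idea behind Shopping Basket Analysis, the sketch below counts how often pairs of items appear together and derives support and confidence for each rule. It is a minimal pure-Python sketch, not the add-in's actual implementation; the function name `pair_rules`, the `min_support` threshold, and the sample baskets are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    support    = share of baskets containing both items
    confidence = share of baskets with the antecedent that also
                 contain the consequent
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # ignore duplicates within a basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields one rule in each direction.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return sorted(rules)


# Hypothetical sample data.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
rules = pair_rules(baskets, min_support=0.5)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket with butter also contains bread, so the rule "butter -> bread" has confidence 1.0, while the reverse rule is weaker because bread also appears without butter.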

[Architecture diagram. Labels: Flat files (.txt, .dat, .xlsx, etc.); Batch Layer, Speed Layer, Serving Layer; Windows Azure HDInsight; Microsoft Excel with the Excel Data Mining Add-in]

Mahout

Scalable machine learning algorithms on the Hadoop platform
Algorithms for clustering, classification, and batch-based collaborative filtering using the map/reduce paradigm
Supports a wide range of use cases, from spam filtering to fraud detection to recommendations for books or movies

Algorithm areas: Clustering, Recommenders, Vector Similarity, Pattern Mining, Classification, Regression, Genetic, Dimension Reduction, Matrices, Collocations
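
To make the map/reduce style of batch collaborative filtering more concrete, here is a small single-process sketch: a map step emits item pairs seen together for each user, a reduce step sums them into co-occurrence counts, and recommendations are ranked by those counts. This is an illustration of the paradigm only, not Mahout's code or API; the function names and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def map_phase(user_items):
    """Emit ((item_a, item_b), 1) for every ordered pair of items per user."""
    for items in user_items.values():
        for a, b in permutations(sorted(set(items)), 2):
            yield (a, b), 1


def reduce_phase(pairs):
    """Sum the emitted counts per key, as a reducer would."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def recommend(user, user_items, cooccurrence, top_n=2):
    """Rank unseen items by how often they co-occur with the user's items."""
    seen = set(user_items[user])
    scores = defaultdict(int)
    for (a, b), count in cooccurrence.items():
        if a in seen and b not in seen:
            scores[b] += count
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical sample data: which books each user liked.
user_items = {
    "alice": ["book_a", "book_b"],
    "bob": ["book_a", "book_b", "book_c"],
    "carol": ["book_b", "book_c"],
    "dave": ["book_a", "book_d"],
}
cooccurrence = reduce_phase(map_phase(user_items))
print(recommend("alice", user_items, cooccurrence))
```

On a real cluster the map and reduce steps would run as distributed tasks over partitions of the data; here they are ordinary Python functions so the data flow is easy to follow.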

[Workflow diagram. Labels: Flat files (.txt, .dat, .xlsx, etc.); Convert to Mahout input; Running Mahout job on Hadoop Command Window to get output file; Output file; Batch Layer, Speed Layer, Serving Layer; Windows Azure HDInsight; HDInsight Consoles]
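
The "Convert to Mahout input" step is not spelled out in the slides. As one possible sketch, assuming the job is a recommender that reads comma-separated `userID,itemID,preference` lines with numeric IDs, a tab-delimited flat file of names and ratings could be converted as shown below. The function name, the tab-delimited input layout, and the sample rows are assumptions for illustration; other Mahout jobs expect different input formats.

```python
import csv
import io


def to_preference_csv(flat_text, delimiter="\t"):
    """Convert 'user<TAB>item<TAB>rating' lines to 'userID,itemID,preference'.

    String names are mapped to integer IDs in order of first appearance.
    Returns the CSV text plus both lookup tables so that IDs in the job
    output can be translated back to names.
    """
    user_ids, item_ids = {}, {}
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(flat_text), delimiter=delimiter):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        user, item, rating = (field.strip() for field in row)
        uid = user_ids.setdefault(user, len(user_ids) + 1)
        iid = item_ids.setdefault(item, len(item_ids) + 1)
        writer.writerow([uid, iid, float(rating)])
    return out.getvalue(), user_ids, item_ids


# Hypothetical flat-file content.
flat = "alice\tbook_a\t5\nbob\tbook_a\t3\nalice\tbook_b\t4\n"
csv_text, users, items = to_preference_csv(flat)
print(csv_text)
```

Keeping the `users` and `items` lookup tables matters because the output file of the job would contain the numeric IDs, which need to be mapped back to readable names before the results are used in the serving layer.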

Questions?