Published by Joan May. Modified over 9 years ago.
1
Big Data Analytics
Module 4 – Data Mining and Predictive Analytics Including Mahout
Saptak Sen, Microsoft
Bill Ramos, Advaiya
2
Agenda
Overview of predictive analytics & data mining
How Microsoft supports predictive analytics
How Mahout fits into the picture
Demos
3
Data Mining
4
Recommendation engines
Advertising analysis
Weather forecasting for business planning
Social network analysis
IT infrastructure and web app optimization
Legal discovery and document archiving
Pricing analysis
Fraud detection
Churn analysis
Equipment monitoring
Location-based tracking and services
Personalized insurance
Predictive analytics should address the likelihood of something happening in the future, even if it is just an instant later*
5
Rich data mining algorithms for clustering, classification, forecasting through time series analysis, and more
Rich developer experience
7
Ease of use through Excel
Rich data mining algorithms for clustering, prediction, forecasting, market basket analysis, and more
Scalable through integration with SSAS
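To make "market basket analysis" concrete, here is a minimal sketch of the support and confidence measures behind association rules. It is plain Python, not the Excel add-in or SSAS implementation; the function name `pair_rules`, the `min_support` threshold, and the sample baskets are all made up for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return {(a, b): (support, confidence)} for rules a -> b over item pairs."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate items within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        # confidence(a -> b) = baskets with both / baskets with a
        rules[(a, b)] = (support, count / item_counts[a])
        rules[(b, a)] = (support, count / item_counts[b])
    return rules


# Hypothetical sample data.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk"],
]
rules = pair_rules(baskets)
for (a, b), (support, confidence) in sorted(rules.items()):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket that contains butter also contains bread, so the rule butter -> bread gets confidence 1.0, while the pair butter/milk falls below the support threshold and is dropped.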
8
Menu                                  Data Mining
Analyze Key Influencers               Naïve Bayes
Detect Categories                     Clustering
Fill From Example                     Logistic Regression
Forecast                              Time Series
Highlight Exceptions                  Clustering
Scenario Analysis – Goal Seek         Logistic Regression
Scenario Analysis – What If           Logistic Regression
Prediction Calculator                 Logistic Regression
Shopping Basket Analysis              Association Rules
9
Excel Data Mining Add-in
[Diagram labels: Flat files (.txt, .dat, .xlsx, etc.); Windows Azure HDInsight; Batch Layer; Speed Layer; Serving Layer; Microsoft Excel (Mining Add-in); Microsoft Excel]
10
Mahout
11
Scalable machine learning algorithms on the Hadoop platform
Algorithms for clustering, classification, and batch-based collaborative filtering using the map/reduce paradigm
Supports a wide range of use cases, from email spam filtering, to fraud detection, to recommendations for books or movies
[Diagram labels: Clustering; Recommenders; Vector Similarity; Pattern Mining; Classification; Regression; Genetic; Dimension Reduction; Matrices; Collocations]
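As a rough illustration of batch collaborative filtering in the map/reduce style, the sketch below counts item co-occurrences in a map step, sums them in a reduce step, and scores unseen items for one user. It is plain single-process Python, not Mahout's API; the function names and the sample data are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def map_cooccurrences(user_items):
    """Map step: emit ((item_a, item_b), 1) for each ordered pair a user touched."""
    for items in user_items.values():
        for a, b in permutations(sorted(set(items)), 2):
            yield (a, b), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return totals


def recommend(user, user_items, cooccurrence, top_n=3):
    """Score each unseen item by how often it co-occurs with the user's items."""
    seen = set(user_items[user])
    scores = defaultdict(int)
    for (a, b), count in cooccurrence.items():
        if a in seen and b not in seen:
            scores[b] += count
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical sample data: which books each user has rated or bought.
user_items = {
    "alice": ["book1", "book2"],
    "bob": ["book1", "book2", "book3"],
    "carol": ["book2", "book3"],
    "dave": ["book1", "book4"],
}
cooccurrence = reduce_counts(map_cooccurrences(user_items))
recommendations = recommend("alice", user_items, cooccurrence)
print(recommendations)
```

On a cluster the map and reduce steps would run as distributed jobs over partitions of the data; here they are ordinary functions so the data flow is easy to follow.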
12
[Diagram labels: Flat files (.txt, .dat, .xlsx, etc.); Convert to Mahout input; Running Mahout job on Hadoop Command Window; Hadoop Command Window to get output file; Output file; Windows Azure HDInsight; HDInsight Consoles; Batch Layer; Speed Layer; Serving Layer]
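The "Convert to Mahout input" step depends on which Mahout job is being run, and the slide does not say. As one possible sketch, assuming a recommender job that reads comma-separated `userID,itemID,preference` rows with numeric IDs, a tab-delimited flat file with string names could be converted like this. The input layout, the function name, and the sample rows are assumptions for illustration only.

```python
import csv
import io


def to_preference_csv(lines, delimiter="\t"):
    """Convert 'user<TAB>item<TAB>rating' lines into numeric-ID CSV text.

    Returns the CSV text plus the name-to-ID mappings, so that the job
    output can be translated back to the original names afterwards.
    """
    user_ids, item_ids = {}, {}
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the flat file
        user, item, rating = line.split(delimiter)
        # Assign sequential numeric IDs the first time a name is seen.
        uid = user_ids.setdefault(user, len(user_ids) + 1)
        iid = item_ids.setdefault(item, len(item_ids) + 1)
        writer.writerow([uid, iid, float(rating)])
    return out.getvalue(), user_ids, item_ids


# Hypothetical flat-file content.
raw = ["alice\tbook1\t5", "bob\tbook1\t3", "", "alice\tbook2\t4"]
csv_text, user_ids, item_ids = to_preference_csv(raw)
print(csv_text, end="")
```

The resulting text would then be uploaded to the cluster's storage before the job is started from the Hadoop command window.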
14
Questions?