Published by Ethel Brown. Modified over 9 years ago.
1
Big data analytics. Rafal Lukawiecki, Strategic Consultant, Project Botticelli Ltd. rafal@projectbotticelli.com, @rafaldotnet
2
Objectives The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary based on several factors. Microsoft makes no warranties, express, implied or statutory, as to the information in this presentation. Portions © 2014 Project Botticelli Ltd & entire material © 2014 Microsoft Corp unless noted otherwise. Some slides contain quotations from copyrighted materials by other authors, as individually attributed or as already covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of the date of this presentation. Because Project Botticelli & Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Project Botticelli cannot guarantee the accuracy of any information provided after the date of this presentation. Project Botticelli makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.
3
Register on projectbotticelli.com
4
Big data, or just complex data? Velocity, variety, complexity, volume. Data: interpreting, preparing.
5
Today’s big data, tomorrow’s little data. Complexity vs. current capabilities. Image: FAA International Flight Service Station, Honolulu, Hawaii, 1964 (Public Domain Image).
6
Domain: Common big data scenarios
Financial services: Modeling true risk; Threat analysis and fraud detection; Trade surveillance; Credit scoring and analysis
Media & Entertainment: Recommendation engines; Ad targeting; Search quality; Abuse and click fraud detection
Retail: Point of sales transaction analysis; Customer churn analysis; Sentiment analysis
Telecommunications: Customer churn prevention; Network performance optimization; Call Detail Record (CDR) analysis; Network failure prediction
Government: Cyber security (botnets, fraud); Traffic congestion and re-routing; Environmental monitoring; Antisocial monitoring via social media
Healthcare: Genomics research; Cancer research; Health pandemics early detection; Air quality monitoring
7
Which big data?
9
Low latency. Sub-zero processing of large event streams. Continuous insight through historical data mining. PDW: near real-time insights. Real-time with complex event processing.
10
Advanced analytics. Descriptive & predictive: clustering, neural nets, decision trees, time series, naïve Bayes, sequence clustering, linear and logistic regression. Semantic search: conceptual similarities. Geospatial: geometry and geography. Big data: Hadoop, Mahout.
12
Microsoft HDInsight: Apache Hadoop distribution. Developed by Hortonworks & Microsoft. Integrated with Microsoft BI.
13
Big, fast, or complex data. [Diagram: Microsoft HDInsight, PDW + Polybase, SQL, Tabular, OLAP; interaction, exploration, reporting, visualisation]
14
Hadoop principles: a practical method for massive parallelisation of analytical data processing
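As a rough illustration of the parallelisation principle on this slide (not Hadoop code, and not taken from the presentation), the sketch below splits a data set into blocks, computes a partial result on each block independently, and then combines the partials. The function names, the use of threads in place of cluster nodes, and the sample data are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(records, n_parts):
    """Split records into n_parts slices (hypothetical stand-ins for data blocks)."""
    return [records[i::n_parts] for i in range(n_parts)]


def partial_stats(block):
    """Work done independently on each block: a local sum and a local count."""
    return sum(block), len(block)


def parallel_mean(records, n_parts=4):
    """Compute a mean by combining per-block partial results."""
    blocks = partition(records, n_parts)
    # Threads stand in for worker nodes; a real cluster would ship the work to the data.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(partial_stats, blocks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


readings = list(range(1, 101))  # illustrative data: 1..100
print(parallel_mean(readings))  # 50.5
```

The point of the design is that each block needs only a small summary (sum and count) to be sent back, so the combining step stays cheap however large the blocks are.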
16
Hadoop data
17
Hadoop MapReduce
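The slide gives only the title, so the following is a minimal in-memory sketch of the general MapReduce pattern (map, shuffle, reduce) using the customary word-count example. It does not use the Hadoop API; the function names and sample lines are assumptions for illustration only.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


sample = ["big data or complex data", "big clusters"]  # illustrative input
counts = word_count(sample)
print(counts["big"], counts["data"])  # 2 2
```

In a real cluster the map calls run on many nodes at once and the shuffle moves data across the network; here all three phases run in one process purely to show the data flow.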
18
Hadoop cluster Yahoo! Hadoop cluster, about 2007. Source: http://developer.yahoo.com. Picture used with permission.
19
Hadoop cluster Buster Cluster, an early research project by Miles Osborne, University of Edinburgh, School of Informatics. Picture used with permission. http://homepages.inf.ed.ac.uk/miles/
20
Cloud rent-a-Hadoop-cluster, or: “Supercomputer for cents”. Windows Azure HDInsight
21
Processing logic in HDInsight 1.6, 2.1, 3.0
22
Processing logic in HDInsight 3.0 (Hadoop 2.2)
25
Hadoop data science. Mahout 0.9 (not HDInsight 3.0 yet): collaborative filtering, recommenders, clustering, singular value decomposition, parallel frequent pattern mining, naive Bayes, decision tree.
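To make the first item in that list concrete, here is a small sketch of user-based collaborative filtering with cosine similarity. It is plain Python, not Mahout, and does not reflect how Mahout implements its recommenders; the ratings data, names, and scoring rule (a similarity-weighted average) are assumptions chosen for brevity.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating}
ratings = {
    "ann": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 1},
    "cat": {"B": 2, "C": 5},
}


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors (dicts)."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings, top_n=3):
    """Predict ratings for items the user has not rated, weighted by user similarity."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only recommend unseen items
            scores[item] = scores.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [(item, round(score, 2)) for score, item in ranked[:top_n]]


print(recommend("bob", ratings))  # a single suggestion for item "C"
```

At scale the pairwise similarity computation is the expensive part, which is why libraries of this kind distribute it across a cluster.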
27
Summary. projectbotticelli.com: BI video tutorials, PPTs, and articles. 15% Off: 2014SWISS15 (valid in March 2014 only). Follow: @rafaldotnet. Email: rafal@projectbotticelli.com. Discover: rafal.net