Big data analytics Rafal Lukawiecki Strategic Consultant Project Botticelli

Presentation on theme: "Big data analytics Rafal Lukawiecki Strategic Consultant Project Botticelli"— Presentation transcript:

1 Big data analytics Rafal Lukawiecki Strategic Consultant Project Botticelli Ltd rafal@projectbotticelli.com @rafaldotnet

2 Objectives The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary based on several factors. Microsoft makes no warranties, express, implied or statutory, as to the information in this presentation. Portions © 2014 Project Botticelli Ltd & entire material © 2014 Microsoft Corp unless noted otherwise. Some slides contain quotations from copyrighted materials by other authors, as individually attributed or as already covered by Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of the date of this presentation. Because Project Botticelli & Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Project Botticelli cannot guarantee the accuracy of any information provided after the date of this presentation. Project Botticelli makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.

3 Register on projectbotticelli.com

4 Big data, or just complex data? Velocity, variety, complexity, volume. Data: interpreting, preparing.

5 Today’s big data, tomorrow’s little data Complexity vs. current capabilities FAA International Flight Service Station, Honolulu, Hawaii, 1964 (Public Domain Image)

6 Common big data scenarios, by domain
Financial services: Modeling true risk; Threat analysis and fraud detection; Trade surveillance; Credit scoring and analysis
Media & Entertainment: Recommendation engines; Ad targeting; Search quality; Abuse and click fraud detection
Retail: Point of sales transaction analysis; Customer churn analysis; Sentiment analysis
Telecommunications: Customer churn prevention; Network performance optimization; Call Detail Record (CDR) analysis; Network failure prediction
Government: Cyber security (botnets, fraud); Traffic congestion and re-routing; Environmental monitoring; Antisocial monitoring via social media
Healthcare: Genomics research; Cancer research; Health pandemics early detection; Air quality monitoring

7 Which big data?

8

9 Low latency. Sub-zero processing of large event streams. Continuous insight through historical data mining. PDW: near real-time insights. Real-time with complex event processing.
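To make the idea of continuous insight over an event stream concrete, here is a minimal sketch of a sliding time window that keeps a running average as events arrive. It is written in plain Python, not against any Microsoft complex event processing API; the class name `SlidingWindowAverage` and its parameters are hypothetical and chosen only for illustration.

```python
from collections import deque


class SlidingWindowAverage:
    """Running average over the events seen in the last `window_seconds`.

    Hypothetical helper for illustration; not part of any CEP product.
    """

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp, value):
        """Record one event and return the average over the current window."""
        self.events.append((timestamp, value))
        self.total += value
        # Evict events that have fallen out of the window.
        while self.events and self.events[0][0] <= timestamp - self.window:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total / len(self.events)


if __name__ == "__main__":
    window = SlidingWindowAverage(window_seconds=10)
    for ts, reading in [(0, 10), (5, 20), (12, 30)]:
        print(ts, window.add(ts, reading))
```

Each event costs amortised constant time because every reading is appended once and evicted at most once, which is what keeps latency low as the stream grows.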

10 Advanced analytics. Descriptive & predictive: clustering, neural nets, decision trees, time series, naïve Bayes, sequence clustering, linear and logistic regression. Semantic search: conceptual similarities. Geospatial: geometry and geography. Big data: Hadoop, Mahout.

11

12 Apache Hadoop distribution Developed by Hortonworks & Microsoft Integrated with Microsoft BI Microsoft HDInsight

13 Big, fast, or complex data. [Diagram labels: Microsoft HDInsight; PDW + Polybase; Tabular; OLAP; SQL; Interaction, exploration, reporting, visualisation]

14 Hadoop principles Practical method for massive parallelisation of analytical data processing

15

16 Hadoop data

17 Hadoop MapReduce
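The MapReduce model can be sketched in a few lines of single-process Python: a mapper emits key/value pairs, a shuffle step groups them by key, and a reducer collapses each group. This is only an illustration of the programming model using the classic word-count example. It does not use the Hadoop API, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as a framework would do
    between the map and reduce stages."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reducer: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big clusters", "data moves"]
    print(word_count(sample))
```

Because each call to `map_phase` looks at one line only and each call to `reduce_phase` looks at one key only, both stages can be spread across many machines, which is where the parallelism comes from on a real cluster.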

18 Hadoop cluster Yahoo! Hadoop cluster, about 2007. Source: http://developer.yahoo.com. Picture used with permission.

19 Hadoop cluster Buster Cluster, an early research project by Miles Osborne, University of Edinburgh, School of Informatics. Picture used with permission. http://homepages.inf.ed.ac.uk/miles/

20 Cloud rent-a-Hadoop-cluster, or: “Supercomputer for cents” Windows Azure HDInsight

21 Processing logic in HDInsight 1.6, 2.1, 3.0

22 Processing logic in HDInsight 3.0 (Hadoop 2.2)

23

24

25 Hadoop data science. Mahout 0.9 (not HDInsight 3.0 yet): collaborative filtering, recommenders, clustering, singular value decomposition, parallel frequent pattern mining, naive Bayes, decision tree.
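As a rough illustration of what collaborative filtering does, the sketch below scores items a user has not rated by how similar they are to items that user has rated, using cosine similarity between item rating vectors. It is a small in-memory example in plain Python, not Mahout's API or its distributed implementation. The function names and the toy ratings are assumptions made for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors (dicts of user -> rating)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[u] * b[u] for u in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, top_n=3):
    """Item-based recommendations for `user`.

    `ratings` maps user -> {item: rating}. Each unseen item gets a
    similarity-weighted average of the user's own ratings.
    """
    # Invert user -> item -> rating into item -> user -> rating.
    by_item = {}
    for u, items in ratings.items():
        for item, rating in items.items():
            by_item.setdefault(item, {})[u] = rating

    seen = ratings[user]
    scores = {}
    for candidate, vector in by_item.items():
        if candidate in seen:
            continue
        weighted = total_sim = 0.0
        for item, rating in seen.items():
            sim = cosine(vector, by_item[item])
            weighted += sim * rating
            total_sim += sim
        if total_sim > 0:
            scores[candidate] = weighted / total_sim
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


if __name__ == "__main__":
    toy_ratings = {  # hypothetical data
        "ann": {"a": 5},
        "bob": {"a": 4, "c": 3},
    }
    print(recommend(toy_ratings, "ann"))
```

The item-to-item similarity computation is the expensive part, and it is the kind of pairwise work that benefits from being distributed over a cluster.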

26

27 Summary projectbotticelli.com: BI video tutorials, PPTs, and articles. 15% Off: 2014SWISS15, valid in March 2014 only. Follow: @rafaldotnet Email: rafal@projectbotticelli.com Discover: rafal.net


