Microsoft Machine Learning & Data Science Summit September 26 – 27 | Atlanta, GA BR004 © 2015 Microsoft Corporation. All rights reserved. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Putting science into the business of data science Rafal Lukawiecki Data Scientist Project Botticelli Ltd @rafaldotnet rafal@projectbotticelli.com
Objectives: Practical, applied data science; What does business expect from data science(tists)? The information herein is for informational purposes only and represents the opinions and views of Project Botticelli and/or Rafal Lukawiecki. The material presented is not certain and may vary based on several factors. Project Botticelli Ltd and Microsoft make no warranties, express, implied or statutory, as to the information in this presentation. Copyright © 2016 Project Botticelli Ltd unless noted otherwise. Some slides contain quotations from copyrighted materials by other authors, as individually attributed or as already covered by Project Botticelli Ltd or Microsoft Copyright ownerships. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Project Botticelli Ltd as of the date of this presentation. Because Project Botticelli & Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft and Project Botticelli cannot guarantee the accuracy of any information provided after the date of this presentation. Project Botticelli makes no warranties, express, implied or statutory, as to the information in this presentation. E&OE.
Study with projectbotticelli.com; Online training: Data Science, Data Mining, Introduction to BI & Big Data, Power BI, DAX, Cube Design, MDX, Excel BI; PPTs: projectbotticelli.com/ppt ; Offer: $50 off annual memberships (exp 31 Oct 2016): 50IGNITE2016
Learn with me! Hands-on 5-day Practical Data Science (SQL+R+Cortana) class: London 24 Oct technitrain.com/coursedetail.php?c=68 ; Chicago 7 Nov sqlskills.com/rafal ; Zurich 14 Nov ch.atosconsulting.com/rafal ; Oslo 21 Nov glasspaper.no/rafal ; Stockholm 28 Nov cornerstone.se/rafal ; More info: projectbotticelli.com/courses
Why do my customers need data science?
Why? https://projectbotticelli.com/knowledge/what-is-advanced-analytics-and-data-science-and-machine-learning--and-what-is-their-value Discover reason behind success, failure; Understand customers, products, patterns; Accurately plan future; Experiment before making decisions; Experiment with autonomous decision making (AI). © 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Advanced analytics, or data science, or artificial intelligence?
[Diagram labels: Prepare; Explore, visualise; Data science; Acquire; Advanced analytics]
Data science: more than data engineering. Scientific method of reasoning applied to data and analytics (hypothesis, experiments, facts, logical reasoning) + data engineering
Data science: Data wrangling (munging), retrieval + storage; Data mining & machine learning; Statistics; Big data
Machine learning: Academic discipline; Part of IT and computing; Algorithms that detect meaningful patterns in data
Machine learning = data mining? In analytics today: yes. Data mining uses machine learning on flat data.
AI in practice: Machine learning models; Business change; Autonomous, “intelligent” decisions. Artificial Intelligence: System that perceives its environment and takes actions to maximize its chance of success. Russell & Norvig, 2003
Real-world team for advanced analytics: Data expert; Data scientist; Domain expert
Example: hospital readmissions. Does enough data show examples of readmission? Are there predictable patterns of readmission? Can we reduce readmissions? What is readmission?
Goal to analytics to goal: Start with a (business) goal; Express it as a testable hypothesis (H0 + HA, etc); Progressively refine the hypothesis through experiments and further models
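As a hedged illustration of "express it as a testable hypothesis", the readmissions goal from the earlier example could be written as a null and an alternative hypothesis. The symbols are assumptions introduced here, not taken from the slides: p_current stands for the readmission rate under current practice and p_new for the rate after some proposed change.

```latex
\[
H_0 : p_{\text{new}} = p_{\text{current}}
\qquad \text{versus} \qquad
H_A : p_{\text{new}} < p_{\text{current}}
\]
```

Later iterations would replace this broad statement with narrower ones, for example by fixing the patient group or the size of the expected reduction.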
How? Define & initialise a model; Train model (process cases); Validate model; Use it: Explore or Deploy; Update and revalidate
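The steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the `ThresholdModel` class, the `accuracy` helper, the single feature (number of prior admissions) and all the case data are invented for this sketch and do not come from the presentation or from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class ThresholdModel:
    """Toy model: predicts readmission (1) when one feature reaches a threshold."""
    threshold: float = 0.0  # define & initialise

    def train(self, cases):
        # Train: try each observed feature value as the threshold and keep
        # the first one with the best accuracy on the training cases.
        candidates = sorted({x for x, _ in cases})
        self.threshold = max(
            candidates, key=lambda t: accuracy(ThresholdModel(t), cases)
        )

    def predict(self, x):
        return 1 if x >= self.threshold else 0


def accuracy(model, cases):
    """Fraction of (feature, outcome) cases that the model predicts correctly."""
    return sum(model.predict(x) == y for x, y in cases) / len(cases)


# Invented cases: (number of prior admissions, readmitted? 1 = yes, 0 = no)
training = [(0, 0), (1, 0), (1, 0), (2, 0), (3, 1), (4, 1), (5, 1), (2, 1)]
holdout = [(0, 0), (3, 1), (1, 0), (4, 1)]

model = ThresholdModel()                     # define & initialise
model.train(training)                        # train (process cases)
holdout_accuracy = accuracy(model, holdout)  # validate on cases not used in training
prediction = model.predict(5)                # use it: explore or deploy

new_cases = [(2, 1), (1, 0)]                 # later, new cases arrive
model.train(training + new_cases)            # update
updated_accuracy = accuracy(model, holdout)  # revalidate

print(model.threshold, holdout_accuracy, updated_accuracy)
```

Validation uses cases held back from training, so the accuracy figure says something about unseen patients rather than about how well the model memorised its inputs.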
Stating the goal: State it as a series of hypotheses; Iterate from high-level to specifics: from “Data can reduce readmissions” to “system X that tracks specific set of patient data including […] reduces readmissions by N%”; Experiments may be needed to get data; This requires a budget
Demo Hypothesis: hospital length of stay for elderly patients with dementia differs significantly from that of patients who do not suffer from it
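One way such a hypothesis could be checked is a two-sided permutation test on the difference in mean length of stay. The sketch below is not the method used in the demo, which the slides do not describe; the function name `permutation_test` and the length-of-stay figures are invented for illustration.

```python
import random


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference of means.

    Returns (observed difference, approximate p-value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Under H0 the group labels are exchangeable, so reshuffle them.
        rng.shuffle(pooled)
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Invented lengths of stay in days, for illustration only.
stay_dementia = [12, 15, 9, 20, 14, 18, 11, 16]
stay_no_dementia = [5, 7, 6, 9, 4, 8, 6, 7]

observed, p_value = permutation_test(stay_dementia, stay_no_dementia)
print(f"difference in means: {observed:.3f} days, p-value: {p_value:.4f}")
```

A permutation test is chosen here because length-of-stay data is often skewed, and this test does not assume a normal distribution. A small p-value would lead to rejecting H0 (no difference in mean length of stay) in favour of HA.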
Demo Hypothesis: car servicing satisfaction ratings influence customer churn and future sales
Real-world experiments. Design: target and control group(s). Run: introduce the change (promo, pricing, product…). Evaluate: statistically compare control and experiment results.
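The "Evaluate" step can be illustrated with a two-proportion z-test that compares the outcome rate in the control group with the rate in the target group. This is a minimal sketch under assumptions: the helper name `two_proportion_z_test` and the counts are invented, and the normal approximation is only reasonable for fairly large groups.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Returns (z statistic, p-value) using the pooled standard error.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Invented counts: 100 of 1000 control customers bought,
# 130 of 1000 customers who saw the promotion bought.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

The control group is what makes the comparison meaningful: without it, a rise in sales during the promotion could not be separated from whatever else changed over the same period.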
But, does it all work? Data Driven Decision-making; Run on analytics. The more data-driven a firm is, the more productive it is: 5-6% increase (1σ) even when controlling for confounding factors. Brynjolfsson, Hitt, Kim, MIT & Penn’s Wharton School, 2011 http://papers.ssrn.com/sol3/Papers.cfm?abstract_id=1819486
Putting science into the business of data science: Human intuition drives innovation; Advanced analytics validates facts; Data science enables Data Driven Decisions. By 2020, 10% of organisations will have a specialised unit for productizing and commercializing data. [Gartner, 2016]
Summary: Scientific method of reasoning underpins data science; Hypothesis iteration as a method for progressively solving a business problem. Chicago: 7 Nov sqlskills.com/rafal ; projectbotticelli.com: BI video tutorials, PPTs, and articles; $50 off annual: 50IGNITE2016; projectbotticelli.com/courses ; Follow: @rafaldotnet Email: rafal@projectbotticelli.com Discover: rafal.net