More value from data using Data Mining Allan Mitchell SQL Server MVP.

Presentation transcript:

More value from data using Data Mining Allan Mitchell SQL Server MVP

Who am I SQL Server MVP SQL Server Consultant Joint author on Wrox Professional SSIS book Worked with SQL Server since version and Partner of SQL Know How

SQL Know How: Dedicated to Microsoft SQL Server. We are familiar, trusted faces. We provide: consultancy, large and small; training, public and private; mentoring; business brainstorming.

Today’s Schedule: what is data mining (overview); data mining terminology; myths around data mining; Excel add-in for Office 2007 (demos: Setup, Key Influencers, Categories, Make a Prediction, “Other stuff” if time); questions and answers.

What is Data Mining: “The process of using statistical techniques to discover subtle relationships between data items, and the construction of predictive models based on them. The process is not the same as just using an OLAP tool to find exceptional items. Generally, data mining is a very different and more specialist application than OLAP, and uses different tools from different vendors. Normally the users are different, too. OLAP vendors have had little success with their data mining efforts.” (Source: OLAP Report)

What does Data Mining Do? Explores your data. Finds patterns. Performs predictions. [Diagram labels: Query, Reporting, Analysis; Data Mining; What; Why; How]

Comparative Benefits Predictive Projects versus Nonpredictive Projects Source: IDC, 2003

Data Mining terminology: mining structure; mining model; mining algorithm; training dataset; testing dataset.
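The last two terms on this slide can be made concrete with a small sketch. The model is built from the training dataset and checked against the testing dataset, which it has not seen. The function name `split_rows`, the 30% test fraction, and the sample rows are assumptions made for illustration; they do not come from the presentation or from SQL Server.

```python
import random


def split_rows(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of rows and split it into (training, testing) lists."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical customer rows, invented for this example.
customers = [{"CustID": i, "Income": 20000 + 1000 * i} for i in range(10)]
training, testing = split_rows(customers)
print(len(training), len(testing))
```

Every row lands in exactly one of the two lists, so accuracy measured on the testing rows is not flattered by rows the model was trained on.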

SQL Server 2005 Algorithms: Decision Trees; Clustering; Time Series; Sequence Clustering; Association; Naïve Bayes; Neural Net. Plus: Linear and Logistic Regression.

Sequence Clustering. Applied to: click stream analysis; customer segmentation with sequence data; sequence prediction. A mix of clustering and sequence technologies. Groups individuals based on their profiles, including sequence data.
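As a rough illustration of the sequence side only, the sketch below counts how often one page follows another in a set of click paths and uses those counts to guess the next page. It is a plain first-order transition table, not the Microsoft Sequence Clustering algorithm, and it does no clustering. The page names and function names are invented for the example.

```python
from collections import Counter, defaultdict


def transition_probabilities(sequences):
    """Estimate P(next state | current state) from observed sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Pair each item with the item that immediately follows it.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(following.values()) for nxt, n in following.items()}
        for state, following in counts.items()
    }


def predict_next(probs, state):
    """Return the most likely next state, or None if the state was never left."""
    following = probs.get(state)
    if not following:
        return None
    return max(following, key=following.get)


# Hypothetical click streams, one list per visitor session.
clicks = [
    ["home", "products", "basket", "checkout"],
    ["home", "products", "products", "basket"],
    ["home", "search", "products", "basket"],
]
probs = transition_probabilities(clicks)
print(predict_next(probs, "home"))
```

Grouping visitors whose transition tables look alike would be the clustering half of the technique, which this sketch leaves out.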

Time Series. Applied to: forecasting sales; web hits prediction; stock value estimation. A patented technique from Microsoft Research. Uses regression tree technology to describe and predict series values.
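The idea of predicting a series value from earlier values can be sketched with a much simpler stand-in. The example below fits a single straight line from each value to the next one (a first-order autoregression) by least squares and rolls it forward. It deliberately omits the regression tree part described on the slide and is not the Microsoft technique. The sales figures and function names are made up.

```python
def fit_ar1(series):
    """Least-squares fit of value[t] = intercept + slope * value[t-1]."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(series, steps):
    """Roll the fitted relationship forward from the last observed value."""
    intercept, slope = fit_ar1(series)
    value = series[-1]
    predictions = []
    for _ in range(steps):
        value = intercept + slope * value
        predictions.append(value)
    return predictions


# Hypothetical monthly sales that grow by 10% each month.
monthly_sales = [100.0, 110.0, 121.0, 133.1, 146.41]
print(forecast(monthly_sales, 2))
```

A tree-based version would first split the history into regimes and fit a separate line like this one in each leaf.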

Clustering. Applied to: segmentation (customer grouping, mailing campaigns); also supports classification and regression. Expectation Maximization: probabilistic clustering. K-Means: distance based. Clusters both discrete and continuous values; discrete values are “binarized”. Anomaly detection. Check variable independence: “Predict Only” attributes are not used for clustering.
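Two points from this slide, binarizing discrete values and distance-based K-Means, can be sketched in a few lines. This is a textbook K-Means written for illustration under simple assumptions (Euclidean distance, unscaled attributes, randomly chosen starting rows); it is not the SQL Server implementation. The rows and function names are invented.

```python
import random


def binarize(rows, discrete_keys, continuous_keys):
    """Turn each row into a numeric vector; discrete values become 0/1 flags."""
    categories = {k: sorted({r[k] for r in rows}) for k in discrete_keys}
    vectors = []
    for r in rows:
        vector = [float(r[k]) for k in continuous_keys]
        for k in discrete_keys:
            # One flag per distinct value of the discrete attribute.
            vector.extend(1.0 if r[k] == c else 0.0 for c in categories[k])
        vectors.append(vector)
    return vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, iterations=20, seed=0):
    """Assign each vector to its nearest centroid, then move the centroids."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignment = []
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical people: three children and three adults.
people = [
    {"Age": 5, "Gender": "Male"},
    {"Age": 7, "Gender": "Female"},
    {"Age": 8, "Gender": "Male"},
    {"Age": 40, "Gender": "Female"},
    {"Age": 42, "Gender": "Male"},
    {"Age": 45, "Gender": "Female"},
]
vectors = binarize(people, discrete_keys=["Gender"], continuous_keys=["Age"])
assignment, centroids = k_means(vectors, k=2)
print(assignment)
```

Because Age is left unscaled, it dominates the 0/1 gender flags here; a real project would normally rescale continuous attributes before clustering.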

Clustering Discrete [Figure labels: Male, Female, Son, Daughter, Parent, Age]

Clustering Anomaly Detection [Figure labels: Male, Female, Son, Daughter, Parent, Age]
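One common way to use clusters for anomaly detection is to flag rows that sit far from every cluster centre. The sketch below assumes the centres are already known. The two centroids, the encoding (age in decades plus an is-parent flag), and the threshold are all hypothetical values chosen for the example, not output from SQL Server.

```python
import math


def nearest_centroid_distance(vector, centroids):
    """Euclidean distance from a vector to its closest cluster centre."""
    return min(math.dist(vector, c) for c in centroids)


def find_anomalies(vectors, centroids, threshold):
    """Return the indexes of vectors that are far from every centroid."""
    return [
        i
        for i, v in enumerate(vectors)
        if nearest_centroid_distance(v, centroids) > threshold
    ]


# Hypothetical cluster centres as [age in decades, is_parent flag].
centroids = [[0.8, 0.0], [4.2, 1.0]]  # "children" and "parents"

rows = [
    [0.7, 0.0],  # 7-year-old child
    [4.0, 1.0],  # 40-year-old parent
    [4.5, 1.0],  # 45-year-old parent
    [0.6, 1.0],  # 6-year-old recorded as a parent: fits neither cluster
]
print(find_anomalies(rows, centroids, threshold=0.8))
```

The flagged row is not necessarily wrong, but it is the kind of combination (a parent aged six) worth checking at the source.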

DM data flow [Diagram labels: Cube; Historical Dataset; New Dataset; Data Transform (SSIS); Reporting; Mining Models; Model Browsing; Prediction; LOB Application; Cube]

The steps to a successful model [Figure; source: MS BOL]

DMX
-- 1. Define the model: its columns, the column to predict, and the algorithm.
CREATE MINING MODEL CreditRisk
  (CustID LONG KEY,
   Gender TEXT DISCRETE,
   Income LONG CONTINUOUS,
   Profession TEXT DISCRETE,
   Risk TEXT DISCRETE PREDICT)
USING Microsoft_Decision_Trees
-- 2. Train the model from existing customers.
--    Simplified: in practice the source rows come from an OPENQUERY or OPENROWSET clause.
INSERT INTO CreditRisk (CustID, Gender, Income, Profession, Risk)
SELECT CustomerID, Gender, Income, Profession, Risk
FROM Customers
-- 3. Predict the risk, and its probability, for new customers.
SELECT NewCustomers.CustomerID, CreditRisk.Risk, PredictProbability(CreditRisk.Risk)
FROM CreditRisk PREDICTION JOIN NewCustomers
  ON CreditRisk.Gender = NewCustomers.Gender
  AND CreditRisk.Income = NewCustomers.Income
  AND CreditRisk.Profession = NewCustomers.Profession

Myths around data mining: You have to be a propeller head. It’s a new concept. It only works with SSAS cubes.

Excel 2007 DM add-in: DM visualisation; table analysis; create session models or permanent models; connect to SSAS for full-blown models; intuitive interface.

Demos: Setup; Key Influencers; Categories; Make a Prediction; other sexy stuff.

Resources: Loads, to be honest (DMX and the API, to name two). A big subject, but very sexy.

Contact Details