
Data Mining in Industry: Putting Theory into Practice Bhavani Raskutti.


1 Data Mining in Industry: Putting Theory into Practice Bhavani Raskutti

2 Agenda
– What do analysts in industry actually do?
– Who are our customers & colleagues?
– What resources do we use?
– Who uses analytics in Australian Industry?
– Case studies
– Take-home Points

3 What do analysts in industry actually do?
Business understanding of complex trends, to make strategic & operational decisions
– Business Problem
– Problem Definition (PD)
– Data Acquisition & Preparation (DAP)
– Data Matrix
– Mathematical Modelling (MM): algorithms
– Presentation (P)
– Deployment (D)
– Initial development: iterative, 90% DAP
– Decision-making by users
– Insights via GUI
– Automation, Training, Documentation, IT Support

4 Who are our customers & colleagues?
Customers of Analytics: Marketing, Design, Sales, Supply Chain, Senior Management
Analytics:
– Data Mining: statistical analysis, machine learning; Maths/Stats/Science graduates
– Market Research: behavioural analysis; psych/mktg/SocSc graduates
– Business Intelligence: historical reporting; CS/IT graduates
Business/Corporate Information Technology

5 What resources do we use?
Data Extraction
– SQL: from databases such as Oracle, DB2, MySQL, …
Exploratory/Visualisation
– Tableau: Multi-dimensional visual analysis with ability to publish and connectivity to most databases
– Qlikview: Very similar to Tableau, later entrant into Australia
– Excel: Great for exploration, although businesses use it as the only analysis tool
Statistical Modelling
Expensive commercial tools used in the financial & telecommunications industries:
– SAS: Industry leader with broad statistical service offering, but license is expensive
– KXEN: Recent entrant, but innovative with particular focus on large datasets & automation
– Salford Systems: Well established leader with focus on regression trees and explainable models
– SPSS, Statistica, Matlab: Niche players appealing to certain communities
Open source or low priced data mining tools:
– Weka is open source software issued under the GNU General Public License
– RapidMiner is available under a dual license: GNU license or a proprietary license
– R is a free software environment for statistical computing and graphics. Needs compilation.
Presentation
– Cognos, Business Objects, Tableau, …


7 Who uses analytics in Australian industry?
Columns:
– Clustering / Segmentation: Customer/market segmentation, Survey analysis, Sentiment analysis, …
– Classification / Scoring: Upsell/Cross-sell, Fraud detection, Credit scoring, Location services, Churn modelling, …
– Other: Marketing effectiveness, Market share understanding, Next best offer, Asset management, …
Industry (rows):
– Telecom
– Finance
– Wholesale
– Retail
– Bio-informatics
– Government, Utilities, Pharmaceuticals, Manufacturing, Web service providers, …
– Consulting firms, Data mining vendors, Market research firms, …

8 Case Study: Wholesale Industry
Stages: DAP, PD, D, P, MM
- Increase wholesale sales into major retailers
- Sales ≠ demand; similar products @ similar outlets have a similar demand-to-sales relationship; anomaly may be due to lack of stock
- Quantify demand; define normalised sell-rate; define a long-term in-stock measure; define products & outlets that are similar
- Weekly SOH (stock on hand) & sales for each store & SKU; SKU master; store master
- Simple univariate regression in SQL
- Perform comparisons & find anomalies with stock issues
- Self-serve report in Cognos for each sales rep; presents list of products with opportunities; opportunities click through to detailed graphs showing demand, sales & stock position of the two products compared

9 Case Study: Wholesale Industry (Cont’d)
[Figure: plot with axis labels Demand, In-stock % and Sell Rate; points R1 and R2 marked]

10 Case Study: Wholesale Industry (Cont’d)
Stages: DAP, PD, D, P, MM
- Increase wholesale sales into major retailers
- Sales ≠ demand; similar products @ similar outlets have a similar demand-to-sales relationship; anomaly may be due to lack of stock
- Quantify demand; define normalised sell-rate; define a long-term in-stock measure; define products & outlets that are similar
- Weekly SOH (stock on hand) & sales for each store & SKU; SKU master; store master
- Simple univariate regression in SQL
- Perform comparisons & find anomalies with stock issues
- Self-serve report in Cognos for each sales rep; presents list of products with opportunities; opportunities click through to detailed graphs showing demand, sales & stock position of the two products compared
- Implementation in SQL & Cognos; DataMarts for reports updated weekly; documentation on intranet wiki; training by corporate training team; support from IT helpdesk
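
As a rough illustration of the modelling step in the wholesale case study, the sketch below fits a univariate least-squares line and uses it to estimate how much demand goes unmet when stock runs short. The slides say only "simple univariate regression in SQL", so several things here are assumptions: the choice of in-stock % as predictor and sell rate as response, reading "demand" as the fitted sell rate at 100% in-stock, the function names, and the sample figures. It is written in Python for readability and is not the deployed SQL.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor has no variation; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def demand_at_full_stock(in_stock_pct, sell_rate):
    """Assumed demand proxy: the fitted sell rate extrapolated to 100% in-stock."""
    intercept, slope = fit_line(in_stock_pct, sell_rate)
    return intercept + slope * 100.0


def lost_sales_share(in_stock_pct, sell_rate):
    """Share of estimated demand not realised as sales (hypothetical measure)."""
    demand = demand_at_full_stock(in_stock_pct, sell_rate)
    return (demand - mean(sell_rate)) / demand


# Illustrative weekly figures for one product at one outlet (made up).
weekly_in_stock = [60, 70, 80, 90, 100]   # in-stock %, per week
weekly_sell_rate = [6, 7, 8, 9, 10]       # normalised sell rate, per week
share = lost_sales_share(weekly_in_stock, weekly_sell_rate)
```

Under these assumptions, a product whose `share` is much larger than that of a similar product at a similar outlet would be a candidate for the "anomalies with stock issues" list.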

11 Agenda
– What do analysts in industry actually do?
– Who are our customers & colleagues?
– What resources do we use?
– Who uses analytics in Australian Industry?
– Case studies
– Take-home Points

12 Case Study: Telecommunications Industry
Stages: DAP, PD, D, P, MM
- Increase revenue from business customers
- Win-back? Stop churn? Upsell?
- Winning back customers is hard; churn is hard to identify and harder to prevent; upsell to existing customers increases retention & revenue
- Create models to predict customers likely to take up a product soon
- Imbalanced data (too few examples of take-up for most products): data aggregation & interleaving
- Comparable predictors from revenue: raw, change from previous, projected; use values as is & normalised; binarise using 10 equi-size bins
- Satisfaction survey; service assurance; demographics; quarterly revenue from different products for each customer
- SVMs to score with likelihood of take-up; weighting by value of take-up to find high value take-up
- Excel spreadsheet with potential customer list: take-up likelihood for all modelled products; last quarter revenue for all products
- Implementation in Matlab & C; different predictive models for over 50 products in 4 segments; automatic updates every quarter; used by sales consultants to renegotiate contracts
[Diagram: interleaved quarter windows, labelled Predictors, Labels, TRAIN and Prediction. Quarter sequences shown: i-5 i-4 i-3 i-2; i-4 i-3 i-2 i-1; i-3 i-2 i-1 i; i-1 i i+1 i+2]
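
The interleaving and binarisation steps named on this slide can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the deployed Matlab/C implementation: the window length of four quarters, the one-quarter label offset, the reading of "equi-size" as equal-frequency bins, and all function names are hypothetical choices made here for the example.

```python
from bisect import bisect_right


def interleaved_examples(revenue, takeup, n_pred=4, gap=1):
    """Slide a window over one customer's quarterly history.

    Each example pairs `n_pred` consecutive quarters of revenue (predictors)
    with the take-up flag `gap` quarters after the window ends (label).
    Reusing overlapping windows yields more positive examples per customer.
    """
    examples = []
    for start in range(len(revenue) - n_pred - gap + 1):
        end = start + n_pred
        examples.append((revenue[start:end], takeup[end + gap - 1]))
    return examples


def equal_frequency_edges(values, n_bins=10):
    """Cut points chosen so each bin holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]


def binarise(value, edges):
    """One-hot indicator vector with a 1 in the bin that `value` falls into."""
    idx = bisect_right(edges, value)
    return [1 if i == idx else 0 for i in range(len(edges) + 1)]


# Illustrative data: six quarters of revenue and take-up flags for one customer.
examples = interleaved_examples([1, 2, 3, 4, 5, 6], [0, 0, 0, 0, 1, 0])
edges = equal_frequency_edges(list(range(100)))
indicators = binarise(42, edges)
```

The resulting indicator vectors (one per predictor) would then be concatenated and passed to a classifier such as an SVM; that step is omitted here to keep the sketch free of third-party dependencies.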

13 Case Study: Telecommunications Industry (Cont’d)
Evaluation: Piloted predictive modelling in 2 different regions
– Region 1: 9 new opportunities from just 5 products, with an increase in revenue of ~A$400K
– Region 2: Opportunities identified were already being processed by sales consultants
Conclusion: Predictive modelling is better than the previous manual process
– Identifies more opportunities
– Spreads techniques of good sales teams across the whole organisation
Deployed in 2004 & still operational
For more details, refer to “Predicting Product Purchase Patterns for Corporate Customers” by Bhavani Raskutti & Alan Herschtal in Proceedings of KDD’05, Chicago, Illinois, USA

14 Take-home points
Data acquisition & processing phase forms 80-90% of any analytics project
Business users are tool agnostic
– R, SAS, Matlab, SPSS, … for statistical analysis
– Tableau, Cognos, Excel, VB, … for presentation
Business adoption of analytics is driven by
– Utility of application
– Ease of decision-making from insights
– Ability to explain insights

15 Questions?

