IT Management Case # 8 - A Case on Decision Tree: Customer Churning Forecasting and Strategic Implication in Online Auto Insurance using Decision Tree.

Presentation transcript:

IT Management Case # 8 - A Case on Decision Tree: Customer Churning Forecasting and Strategic Implication in Online Auto Insurance using Decision Tree Algorithms

3. Customer Churn Prediction (Data mining)
Prediction of churn customers in online auto insurance: who is leaving our company?
Data mining (decision trees)
○ ID3 (Iterative Dichotomiser 3)
○ C4.5, C5.0
○ CART (Classification and Regression Tree)
○ CHAID (Chi-square Automatic Interaction Detection)
Statistics
※ Logistic regression model (LRM)
※ Multivariate discriminant analysis (MDA)
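To make the split-selection idea behind the tree algorithms listed above concrete, here is a minimal sketch of information gain, the criterion ID3 uses to choose an attribute (C4.5 uses a normalized variant, gain ratio). The records, field names, and values are hypothetical and are not taken from the case data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting `rows` on a categorical `attribute`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical toy records (assumption): 1 = switch (churn), 0 = no switch.
customers = [
    {"gender": "M", "car_type": "sedan", "switch": 1},
    {"gender": "M", "car_type": "suv", "switch": 1},
    {"gender": "F", "car_type": "sedan", "switch": 0},
    {"gender": "F", "car_type": "suv", "switch": 0},
]

gain_gender = information_gain(customers, "gender", "switch")
gain_car = information_gain(customers, "car_type", "switch")
```

In this toy data `gender` separates the two classes completely while `car_type` carries no information, so a tree builder using this criterion would split on `gender` first.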

3. Customer Churn Prediction (Data mining)
Type: C5.0, CHAID, CART
○ Target class (output) values: Categorical; All
○ Number of branches in each node: Multi; Binary

4. Advantages and Disadvantages of Decision Tree Algorithms
Advantages
○ Generate understandable rules
○ Able to perform in rule-oriented domains
○ Ease of calculation at classification time
○ Able to handle continuous and categorical variables
○ Able to indicate the best fields clearly
Disadvantages
○ Error-prone with too many classes
○ Computationally expensive to train

5. Data Collection
○ Sample data: 13,200 - year 2003 to year 2004; auto insurance contracts, insurance agents
Variables
○ 25 candidate variables - 15 variables selected by t-test and chi-square test

6. Deriving Significant Variables
Inducing 15 variables, including numeric and categorical variables
Tests: t-test (numeric), chi-square (categorical)
Variables shown:
○ Driver's age
○ Price of car
○ Medical expenses
○ Last premium
○ Date
○ Year of car
○ Comprehensive BI
○ Zip code
○ Type of car
○ # of air bags
○ Deductible
○ Gender
Marked as: Selected / Not selected

7. Summary of Significant Variable Data
Variables
○ Dependent variable: Switch = 1, No Switch = 0
○ Independent variables: numeric (t-test) or categorical (chi-square) variables
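The screening step described in the slides (t-test for numeric variables, chi-square for categorical ones, both against the Switch / No Switch outcome) could be sketched roughly as follows. The slides do not say which t-test variant was used; Welch's form is assumed here, and all sample values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples (assumed variant)."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def chi_square(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical numeric variable: driver's age for switchers vs. non-switchers.
age_switch = [25, 30, 35]
age_stay = [45, 50, 55]
t_age = welch_t(age_switch, age_stay)

# Hypothetical categorical variable: rows = gender, columns = (switch, no switch).
gender_by_switch = [[30, 20], [20, 30]]
chi2_gender = chi_square(gender_by_switch)
```

For the 2x2 table the statistic has one degree of freedom, so a value of 4.0 exceeds the 5% critical value of about 3.84; under that threshold the hypothetical `gender` variable would be kept.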

8. Sample Test
Sample Data Classification
○ Sample data: 13,200 - Switch: 6,600 / No Switch: 6,600
Test
○ Prediction methods: C5.0, LRM, MDA
○ Training data: 10,560 (80%); Holdout data: 2,640 (20%)
Prediction Accuracy
Data sets        LRM      MDA      C5.0
Training data    60.0%    59.7%    67.39%
Holdout data     65.3%    59.4%    68.71%
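The 80% / 20% partition and the accuracy measure can be illustrated with a small sketch. The slides do not state whether the split preserved the 6,600 / 6,600 class balance; stratification is an assumption here, and the record layout, function names, and seed are hypothetical.

```python
import random


def stratified_split(records, label_key, train_fraction=0.8, seed=0):
    """Split records into training and holdout sets, keeping class proportions (assumed)."""
    rng = random.Random(seed)
    by_class = {}
    for r in records:
        by_class.setdefault(r[label_key], []).append(r)
    train, holdout = [], []
    for group in by_class.values():
        group = group[:]  # copy so the caller's list is not shuffled in place
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        holdout.extend(group[cut:])
    return train, holdout


def accuracy(actual, predicted):
    """Share of cases where the predicted class equals the actual class."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


# Placeholder records with the same size and class balance as the case sample.
records = [{"id": i, "switch": i % 2} for i in range(13200)]
train, holdout = stratified_split(records, "switch")
```

With 13,200 balanced records this yields 10,560 training and 2,640 holdout cases, matching the counts on the slide; `accuracy` would then be evaluated separately on each set for every prediction method.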

9. Comparison of Prediction Methods
C5.0
○ Program: Clementine 8.1
○ Accuracy from test - Training data: 67.39%, Holdout data: 68.71%
○ Induced rules - Switch: 58 rules, No Switch: 65 rules
LRM
○ Program: SPSS 11.1
○ Accuracy from test - Training data: 60.0%, Holdout data: 65.3%
MDA
○ Program: SPSS 11.1
○ Accuracy from test - Training data: 59.7%, Holdout data: 59.4%
Comparison results
○ C5.0 is superior in predicting churn customers (68.71% holdout accuracy, versus 65.3% for LRM and 59.4% for MDA) - it can be used to analyze online auto insurance and predict churn customers

10. Summary and Examples of Applicable Rules
Conclusion
○ Rule-based analysis of the auto insurance market and induction of marketing strategy - reduce the churn rate (keep customers)
Churn Rules
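The actual churn rules are not preserved in the transcript. As an illustration only, induced if-then rules of this kind might be represented and applied as in the sketch below; every field name, threshold, and rule is hypothetical and does not come from the case.

```python
# Each rule is (list of conditions, predicted class); 1 = switch, 0 = no switch.
# All conditions below are hypothetical placeholders, not rules from the case.
RULES = [
    ([lambda c: c["driver_age"] < 30, lambda c: c["last_premium"] > 1000], 1),
    ([lambda c: c["car_year"] >= 2002], 0),
]
DEFAULT_CLASS = 0  # assumed fallback when no rule fires


def predict(customer):
    """Return the class of the first rule whose conditions all hold."""
    for conditions, outcome in RULES:
        if all(cond(customer) for cond in conditions):
            return outcome
    return DEFAULT_CLASS


young_high_premium = {"driver_age": 25, "last_premium": 1200, "car_year": 1999}
newer_car = {"driver_age": 45, "last_premium": 800, "car_year": 2003}
no_match = {"driver_age": 45, "last_premium": 800, "car_year": 1998}
```

Customers flagged as likely switchers by such rules would be the natural targets for the retention measures mentioned in the conclusion.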