1 Make every interaction count™
Decision Trees: Profiling and Segmentation
Sachin Chincholi, Professional Services
USA: 1 866 793 4279; Austria: 0800 28 1673; Belgium: 0800 505 60; Canada: 1 866 270 8076; India: 000800 100 6558; Republic of Ireland: 1800 944 607; Netherlands: 0800 0233 593; Norway: 800 164 90; Spain: 900 801 508; Sweden: 0200 125 679; UK: 0808 109 1441; International: +44 20 8609 1476
Access code: 131716 #

2 Portrait Software Copyright 2007 CUSTOMER CONFIDENTIAL
How to ask a Question

3 Decision Trees: Profiling and Segmentation
– Presenter: Sachin Chincholi, Professional Services
– Audience: Existing Quadstone Users

4 Decision Trees for insight
+ Transparent
   – Easily understandable by non-statisticians
   – Sanity check your modelling framework
      – Is your objective defined correctly?
      – Are the initial splits plausible?
+ Fast to build
   – Quick alert to possible contamination

5 Decision Trees for Modeling
+ Transparent
   – Easier to get buy-in from the business
   – Easy to code
+ Non-parametric
   – No assumptions about underlying distributions of Analysis Candidates
+ Non-linear
   – Allow easy discovery of non-linear patterns (age vs. income)
– ‘Unstable’
   – Different populations give very different trees

6 Interpreting a decision tree
[Figure: a decision tree whose root is split on Age at 40. Color is used to show match rates.]
– #1 Objective: Response, match = 26.2% of 100000 (the match rate for the objective over the entire population)
– The split at Age = 40 (< 40 versus ≥ 40) is the most predictive
– #2 and #3: 50.2% of 20302 and 20.1% of 79698
– Other labels in the figure: Age, Income
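The node figures on this slide can be cross-checked against the root match rate. The sketch below assumes the two child segments read as 50.2% of 20302 and 20.1% of 79698 (the transcript runs these figures together); the variable names are illustrative and not part of the Quadstone System.

```python
# Child segments as read from the slide: (match rate, segment size).
# The reading of the fused transcript figures is an assumption.
children = [(0.502, 20302), (0.201, 79698)]

# The two segments should partition the whole population.
total = sum(size for _, size in children)

# Expected matches per segment, summed, give the root match rate.
matches = sum(rate * size for rate, size in children)
overall_rate = matches / total

print(f"population = {total}, overall match rate = {overall_rate:.1%}")
```

Under that reading, the segment sizes add up to the root population of 100000 and the size-weighted match rate rounds to the 26.2% shown at node #1.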

7 Decision tree build process
– Given an objective, Decision Tree Builder will find the most predictive split among all possible splits, with all analysis candidates, given the current binnings
– The population is then split into two segments based on this
– The same method splits each of the two segments into two further segments
– This process continues until the tree is finished, as determined by the tree constraints
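The build process above can be sketched as a greedy recursive procedure. This is a minimal illustration, not the Decision Tree Builder implementation: the record layout, the `binnings` dictionary of cut points, the information-based quality value and the two constraints (`max_depth`, `min_size`) are all assumptions chosen for the example.

```python
import math

def information(rows):
    """Sum over classes of p(c) * log(p(c)); zero for a pure segment."""
    n = len(rows)
    matches = sum(r["match"] for r in rows)
    total = 0.0
    for count in (matches, n - matches):
        if count:
            p = count / n
            total += p * math.log(p)
    return total

def gain(rows, left, right):
    """Size-weighted information of the two segments minus that of the parent."""
    n = len(rows)
    weighted = (len(left) * information(left) + len(right) * information(right)) / n
    return weighted - information(rows)

def best_split(rows, binnings, min_size):
    """Try every cut point of every field and keep the highest-quality split."""
    best = None
    for field, cut_points in binnings.items():
        for cut in cut_points:
            left = [r for r in rows if r[field] < cut]
            right = [r for r in rows if r[field] >= cut]
            if len(left) < min_size or len(right) < min_size:
                continue  # constraint: no segment smaller than min_size
            quality = gain(rows, left, right)
            if best is None or quality > best[0]:
                best = (quality, field, cut, left, right)
    return best

def build(rows, binnings, max_depth=2, min_size=2):
    """Split into two segments, then apply the same method to each segment."""
    node = {"size": len(rows),
            "match_rate": sum(r["match"] for r in rows) / len(rows)}
    split = best_split(rows, binnings, min_size) if max_depth > 0 else None
    if split and split[0] > 0:
        _, field, cut, left, right = split
        node["split"] = (field, cut)
        node["below"] = build(left, binnings, max_depth - 1, min_size)
        node["at_or_above"] = build(right, binnings, max_depth - 1, min_size)
    return node

# Toy data (invented for illustration): matches are all aged 40 or over.
data = [(25, 10000, 0), (30, 40000, 0), (35, 20000, 0), (38, 30000, 0),
        (42, 15000, 1), (50, 45000, 1), (55, 25000, 1), (60, 35000, 1)]
rows = [{"age": a, "income": i, "match": m} for a, i, m in data]
binnings = {"age": [30, 40, 50], "income": [20000, 30000, 40000]}

tree = build(rows, binnings)
print(tree["split"])
```

On this toy data the root splits on age at 40, and both resulting segments are pure, so no further split improves the quality value and the recursion stops.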

8 Choice of a decision tree split
– Each possible split is assigned a quality value
– The splits are ranked by quality value
– The quality value depends on the tree type:
   – Binary outcome tree and classification tree: Information gain
   – Regression tree: R²
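For the regression case, one common way to score a split with R² is the share of the objective's variance explained by assigning each record its segment mean. The slide does not spell out the exact formula used by the product, so the function below is a sketch under that assumption, with a hypothetical name.

```python
def r_squared_of_split(left, right):
    """R² of predicting each value by its segment mean.

    Assumes both segments are non-empty and the values are not all equal.
    """
    values = left + right
    mean = sum(values) / len(values)
    ss_total = sum((v - mean) ** 2 for v in values)

    def ss_within(segment):
        segment_mean = sum(segment) / len(segment)
        return sum((v - segment_mean) ** 2 for v in segment)

    return 1 - (ss_within(left) + ss_within(right)) / ss_total

# A split that separates low values from high values explains all the variance.
print(r_squared_of_split([1.0, 1.0], [3.0, 3.0]))
# A split whose segments look identical explains none of it.
print(r_squared_of_split([1.0, 3.0], [1.0, 3.0]))
```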

9 Choice of a decision tree split (2)
[Figure: candidate splits at Level 1 for Objective: Response, with a value shown for each candidate split. Analysis candidates and bin boundaries: Age (18, 20, 30, 40, 50, 60, 65); Income (0, 10000, 20000, 30000, 40000, 50000, 100000); LoanAmount (0, 2000, 10000, 20000, 30000, 50000, 100000); MaritalStatus (Single, Married, Widow); Misc.]

10 Splitting criterion
– Information = $\sum_{c=1}^{n} p(c) \log p(c)$
– Sum of (proportion of $c$ × log(proportion of $c$)) over all classes $c$
– Equivalent to likelihood-ratio test for comparing two populations
– Seeks to separate out classes, while minimising small nodes
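As a rough illustration of the criterion, the sketch below evaluates the information formula for a binary objective and turns it into a gain for one split. The natural logarithm, the size-weighted averaging of the two segments and the function names are assumptions; the example figures are the Age = 40 split from the earlier "Interpreting a decision tree" slide, read as 50.2% of 20302 and 20.1% of 79698.

```python
import math

def information(p_match):
    """Sum of p(c) * log(p(c)) over the two classes (match, non-match)."""
    total = 0.0
    for p in (p_match, 1 - p_match):
        if p > 0:  # a class with zero proportion contributes nothing
            total += p * math.log(p)
    return total

def information_gain(parent_rate, children):
    """children is a list of (match rate, segment size) pairs."""
    n = sum(size for _, size in children)
    weighted = sum(size * information(rate) for rate, size in children) / n
    return weighted - information(parent_rate)

gain = information_gain(0.262, [(0.502, 20302), (0.201, 79698)])
print(f"information gain of the Age = 40 split: {gain:.4f}")
```

With this sign convention the information of a segment is never positive, and the gain is zero when both segments have the same match rate as the parent and largest when the split separates the classes completely.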

11 Is the decision tree any good (binary case)?
[Figure: Gini “curve”. Records are sorted by predicted propensity; the axes are the proportion of actual matches and the proportion of actual non-matches, each running from 0 to 1.]

12 Calculating the Gini value
– Gini = A/B × 100%
[Figure: Gini “curve” with the areas A and B marked.]
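The areas A and B are only defined by the figure, which is not in the transcript. The sketch below assumes the usual reading: A is the area between the model's curve and the diagonal, and B is the area between a perfect model's curve and the diagonal (0.5 when both axes run from 0 to 1). The function name and sample scores are illustrative.

```python
def gini_percent(scores, outcomes):
    """Gini = A / B x 100% for a binary objective.

    scores: predicted propensities; outcomes: 1 for a match, 0 for a non-match.
    Assumes at least one match and one non-match are present.
    """
    pairs = sorted(zip(scores, outcomes), key=lambda p: -p[0])
    n_match = sum(outcomes)
    n_non = len(outcomes) - n_match

    x = y = 0.0   # cumulative proportion of non-matches / matches
    area = 0.0    # area under the curve, by the trapezium rule
    i = 0
    while i < len(pairs):
        # Records with the same score are taken together as one step.
        j = i
        matches = non_matches = 0
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1]:
                matches += 1
            else:
                non_matches += 1
            j += 1
        new_x = x + non_matches / n_non
        new_y = y + matches / n_match
        area += (new_x - x) * (y + new_y) / 2
        x, y = new_x, new_y
        i = j

    a = area - 0.5  # area between the curve and the diagonal
    b = 0.5         # same area for a perfect model
    return 100 * a / b

print(gini_percent([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))   # perfect ranking
print(gini_percent([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))   # no discrimination
```

A model that ranks every match above every non-match scores 100%, and one that gives every record the same propensity scores 0%, which matches the two extreme curves described for a perfect and a totally unpredictive model.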

13 Gini “curves”
[Figure: two example curves, one for a perfect model and one for a totally unpredictive model.]

14 Overfitting
[Figure: predictive power plotted against complexity (relative to dataset size), with curves for apparent and actual predictive power and a region marked as overfitting.]

15 Best Practice
– Derive a Training-Test field
– Group “too small” categories
– Reduce number of categories
– Watch number of responses per node
– (Watch confidence intervals of prediction)
– Auto-pruning
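The first two practices can be illustrated outside the product with a few lines of Python. This is a generic sketch, not Quadstone syntax: the function names, the 30% test fraction, the fixed seed and the "Other" label are all assumptions made for the example.

```python
import random
from collections import Counter

def derive_training_test(n_rows, test_fraction=0.3, seed=0):
    """Assign each record at random to 'Training' or 'Test' (repeatable via seed)."""
    rng = random.Random(seed)
    return ["Test" if rng.random() < test_fraction else "Training"
            for _ in range(n_rows)]

def group_small_categories(values, min_count, other_label="Other"):
    """Replace categories seen fewer than min_count times with one grouped label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]

flags = derive_training_test(1000)
print(Counter(flags))

marital_status = ["Single"] * 5 + ["Married"] * 4 + ["Widow"]
print(Counter(group_small_categories(marital_status, min_count=2)))
```

The tree is then built on the Training records only and its match rates are checked on the Test records; grouping rare categories keeps very small segments from being offered as candidate splits.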

18 Confidence interval for 100 responses…
[Figure: mean, upper and lower confidence bounds for 100 responses out of 1000, 10,000 and 100,000.]
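The slide does not state how its interval is computed. As a hedged illustration, the sketch below uses a 95% normal-approximation interval for a response rate (z = 1.96), which is one common choice and may differ from the method behind the original chart; the function name is hypothetical.

```python
import math

def response_rate_interval(responses, population, z=1.96):
    """Return (lower, mean, upper) for the response rate, normal approximation."""
    mean = responses / population
    half_width = z * math.sqrt(mean * (1 - mean) / population)
    return mean - half_width, mean, mean + half_width

for population in (1000, 10000, 100000):
    lower, mean, upper = response_rate_interval(100, population)
    relative = (upper - mean) / mean
    print(f"{population:>7}: {lower:.4%} to {upper:.4%} "
          f"(mean {mean:.4%}, about ±{relative:.0%} of the mean)")
```

Under this approximation the interval stays at roughly ±19% to ±20% of the mean for all three population sizes, so the relative precision is governed by the number of responses rather than by the size of the population, which is why the number of responses per node is worth watching.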

19 Confidence intervals

20 What makes a good segment?
– If this is the average…
– Is this worth knowing?
– Is this?

22 Possible splits scale exponentially
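The slide gives no formula, but a standard counting argument shows the effect for a categorical field: k unordered categories can be divided into two non-empty groups in 2^(k-1) − 1 ways. The sketch below states that count and checks it by brute-force enumeration; the function names are illustrative, and it is an assumption that this is the scaling the slide has in mind.

```python
from itertools import combinations

def count_binary_splits(k):
    """Number of ways to divide k unordered categories into two non-empty groups."""
    return 2 ** (k - 1) - 1

def enumerate_binary_splits(categories):
    """List every two-way grouping; the first category always stays on the left
    so that mirror-image groupings are not counted twice."""
    cats = list(categories)
    first, rest = cats[0], cats[1:]
    splits = []
    for size in range(len(rest) + 1):
        for extra in combinations(rest, size):
            left = {first, *extra}
            right = set(cats) - left
            if right:  # skip the case where everything is on the left
                splits.append((left, right))
    return splits

for left, right in enumerate_binary_splits(["Single", "Married", "Widow"]):
    print(sorted(left), "|", sorted(right))

for k in (3, 5, 10, 20):
    print(k, "categories:", count_binary_splits(k), "possible splits")
```

Three categories give only 3 possible splits, but 10 give 511 and 20 give 524287, which is the motivation for reducing the number of categories before building a tree.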

26 Reporting on your model
– Audit the model you build
– Monitor future ‘through the door’ populations

27 Where to find out more
– Quadstone System Support website: http://support.quadstone.com/info/releases/#qs5.3
– Documentation
   – What’s new in the Quadstone System 5.3 release notes
   – Updated Quadstone System help (F1)
   – Updated Quadstone System data-build command and TML reference
   – Updated Data Build Manager reference
   – Updated Quadstone System administration reference
   – Customer-specific release notes
– Quadstone System Support
   – Web Site: http://support.quadstone.com/
   – Email: support@portraitsoftware.com
   – Tel: US 1-800-335-3860; All +44 131 240 3140

28 Questions?
Monday, February 22, 2016
Portrait Software Copyright 2008 www.portraitsoftware.com
– EMEA (Headquarters): The Smith Centre, The Fairmile, Henley-on-Thames, Oxfordshire, RG9 6AB, United Kingdom. T: +44 (0)1491 416600 F: +44 (0)1491 416601
– The Americas: 125 Summer Street, 16th Floor, Boston MA 02110, USA. T: +1 617 457-5200 F: +1 617 457-5299
– Asia Pacific: Level 7, 15-17 Young Street, Sydney NSW 2000, Australia. F: +61 2 8004 9600

