Building And Interpreting Decision Trees in Enterprise Miner
Getting Up to Speed: Open the HMEQ project you worked on last class. Drop three nodes in EM: Input, Insight, and Data Partition (to separate the data into random training and validation sets). K:/(common)/tsupra/MARK2042/
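The Data Partition node does the random splitting for you, but the idea is easy to sketch outside EM. A minimal Python illustration (the 67/33 split, seed, and function name are mine for illustration, not EM's defaults):

```python
import random

def partition(rows, train_frac=0.67, seed=12345):
    """Randomly split rows into training and validation sets,
    analogous to what EM's Data Partition node does."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_frac)
    return rows[:cut], rows[cut:]

train, valid = partition(range(100))
# 67 training rows and 33 validation rows, with no overlap
```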
Building Decision Trees: Add a Tree node and connect it to the Data Partition node.
Check the Status, Model Role, and Measurement settings.
Splitting criteria: for binary target variables the default is the chi-square test; ordinal target variables must use Entropy or Gini. Here we can use any of the three. These are standard statistical tests; see the readings I handed out last class (on WebCT).
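To see what Gini and entropy actually measure, here is a small Python sketch of the two impurity formulas applied to the class counts in a leaf (the counts are made up for illustration):

```python
import math

def gini(counts):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def entropy(counts):
    """Entropy impurity: -sum of p * log2(p) over the classes."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)

gini([50, 50])     # 0.5  -- the worst case for a binary target
entropy([50, 50])  # 1.0
gini([100, 0])     # 0.0  -- a pure leaf
```

Lower impurity means a purer leaf; a split is chosen to reduce impurity as much as possible.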
Close the Tree node and run it. View the results: a tree with 18 leaves was grown on the training data, then pruned back to 8 leaves based on the validation data. The 8-leaf model has an accuracy of 89.02% on the validation set.
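The prune-back step amounts to scoring each candidate subtree on the validation data and keeping the smallest one with the best accuracy. A sketch in Python: the (leaf count, validation accuracy) pairs below are invented for illustration, except the 8-leaf figure of 89.02% from the results above.

```python
# Hypothetical candidate subtrees: (number of leaves, validation accuracy).
# Only the 8-leaf accuracy (0.8902) comes from the lecture; the rest are made up.
candidates = [(2, 0.852), (4, 0.871), (8, 0.8902), (12, 0.889), (18, 0.884)]

def best_subtree(candidates):
    """Pick the highest validation accuracy; prefer fewer leaves on ties."""
    return min(candidates, key=lambda lv: (-lv[1], lv[0]))

best_subtree(candidates)  # (8, 0.8902)
```

Note that the 18-leaf tree fits the training data best but does slightly worse on validation, which is exactly why it gets pruned.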
Choose View > Tree. Ten leaves are visible here. This view is new in EM Version 8.
Tree Options… Follow the tasks below
Colours show the proportion of the target value. What did 0 represent again? Leaves containing all zeros will be green; individuals who will default on their loan will be red. Inspect for high percentages of bad loans (red) and good loans (green).
Change the Statistics
Find the missing values: the branch that contains the values greater than the split point also contains the missing values.
Select this tab next
View a path to the node: right-click an area of the tree.
Using Tree Options – Default Tree: Add a new Tree node (with default settings). Make two changes on the Basic tab, giving it a maximum and a minimum set of values. The RULE: the minimum observations required for a split must be at least twice the minimum leaf size (2 * 25 = 50). Add an Assessment node and connect it.
Close and save the changes. If you didn't follow the RULE, you won't be able to save. View the results…
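The save check enforces the 2x rule. A hedged sketch of the constraint (the function name and error message are mine, not EM's):

```python
def check_tree_options(min_leaf_size, min_split_size):
    """Refuse settings where the observations required for a split are
    fewer than twice the minimum leaf size -- mirroring why EM won't save
    a leaf size of 25 with a split size under 2 * 25 = 50."""
    if min_split_size < 2 * min_leaf_size:
        raise ValueError(f"split size must be at least {2 * min_leaf_size}")
    return True

check_tree_options(25, 50)  # OK: 50 = 2 * 25 satisfies the rule
```

The rationale: a split must be able to produce two children that each meet the minimum leaf size, so it needs at least twice that many observations.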
Run it.
View the tree again.
The default tree diagram. Is yours different?
Running the Assessment Node: Run the Assessment node and select both trees.
Interpretation: View a lift chart of the results!
Various Charts – what are they saying?
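A lift chart answers: if I target the cases the model scores as most likely to default, how much better do I do than picking at random? A Python sketch of the computation, on made-up (score, actual-default) pairs:

```python
def lift_by_decile(scored, n_bins=10):
    """Sort cases by predicted probability of default (descending),
    cut them into equal bins, and report each bin's bad rate divided
    by the overall bad rate (lift = 1.0 means no better than random)."""
    ranked = sorted(scored, key=lambda s: -s[0])
    overall = sum(label for _, label in ranked) / len(ranked)
    size = len(ranked) // n_bins
    return [sum(label for _, label in ranked[i * size:(i + 1) * size])
            / size / overall
            for i in range(n_bins)]

# Made-up data: the two actual defaulters get the two highest scores,
# so the top decile captures all of the bad loans.
data = [(0.99, 1), (0.95, 1)] + [(0.5 - i * 0.01, 0) for i in range(18)]
top_lift = lift_by_decile(data)[0]  # roughly 10: ten times the base rate
```

A model with no predictive power would show lift near 1.0 in every decile; a steeply falling curve is what you want to see in the Assessment results.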
Further Study: See WebCT for more resources and more information on decision trees. Assignment 4 is also up on WebCT. The group assignment will be handed out next class.