Decision Trees
Mario Regin
What is a decision tree?
A general-purpose prediction and classification mechanism.
Decision trees emerged at the same time as the nascent fields of artificial intelligence and statistical computation.
Nomenclature
[Diagram: an example tree labeled with its parts: the root node, branches, tree levels, and the input-level (leaf) nodes. The example shown is the Titanic tree: the root node (ID 1, Count: 1,309, 1: 38.2%, 0: 61.8%) splits on Gender into a Female node (ID 23, Count: 466, 1: 72.7%) and a Male node (ID 24, Count: 843, 1: 19.1%).]
Main Characteristic
Recursive subsetting of a target field of data according to the values of associated input fields (predictors). This creates partitions, and associated descendent data subsets, whose target values become progressively more similar within each node and progressively more dissimilar between nodes at any given level of the tree. In other words, decision trees reduce the disorder of the data as we move from higher to lower levels of the tree.
Creating a Decision Tree
1. Create the root node.
2. Search the data set for the best partitioning input for the current node.
3. Use the best input to partition the current node into branches.
4. Repeat steps 2 and 3 until one or more stopping conditions are met.
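A minimal, self-contained sketch of this procedure in Python for categorical inputs. This is an assumption-laden illustration, not the deck's prescribed implementation: the dict-based node representation, the entropy split criterion, and the min_count threshold are choices made here; split criteria and stopping rules are discussed on the following slides.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def partition(rows, labels, attr):
    """Group the (row, label) pairs by the value each row takes on attr."""
    groups = {}
    for row, label in zip(rows, labels):
        sub_rows, sub_labels = groups.setdefault(row[attr], ([], []))
        sub_rows.append(row)
        sub_labels.append(label)
    return groups

def best_input(rows, labels):
    """Step 2: the input whose partition has the lowest weighted entropy."""
    n = len(labels)
    score, attr = min(
        (sum(len(l) / n * entropy(l)
             for _, l in partition(rows, labels, a).values()), a)
        for a in rows[0]
    )
    return attr if score < entropy(labels) else None  # None: no input helps

def build_tree(rows, labels, min_count=2):
    node = {"counts": dict(Counter(labels))}        # step 1: create the node
    if len(rows) < min_count or len(set(labels)) == 1:
        return node                                 # a stop condition is met
    attr = best_input(rows, labels)                 # step 2
    if attr is None:
        return node
    node["split"] = attr                            # step 3: branch on it
    node["children"] = {v: build_tree(r, l, min_count)        # step 4
                        for v, (r, l) in partition(rows, labels, attr).items()}
    return node
```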
Best Input
The selection of the 'best' input field is an open subject of active research; decision trees allow for a variety of computational approaches to input selection. Some possible approaches to select the best input:
- The high-performance predictive model approach.
- The relationship analysis approach.
High Performance Predictive Model Approach
Choose the input that produces the greatest separation in variability among the descendent nodes. The emphasis is on selecting high-quality partitions that can collectively produce the best overall predictive model.
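For an interval (numeric) target, "separation in the variability" is commonly measured as variance reduction; a small illustrative sketch (the criterion and the numbers are assumptions for demonstration, not taken from the deck):

```python
from statistics import pvariance

def variance_reduction(parent, children):
    """Drop in variance when the parent's target values are partitioned."""
    n = len(parent)
    weighted = sum(len(c) / n * pvariance(c) for c in children)
    return pvariance(parent) - weighted

target = [1.0, 1.2, 5.0, 5.3]
print(variance_reduction(target, [[1.0, 1.2], [5.0, 5.3]]))  # good split: ~4.10
print(variance_reduction(target, [[1.0, 5.0], [1.2, 5.3]]))  # poor split: ~0.02
```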
Relationship Analysis Approach
Guide the branching in order to analyze, discover, support, or confirm the conditional relations that are assumed to exist among the various inputs and the component nodes they produce. The emphasis is on analyzing these interactions during the formation of the tree.
Stopping Rule
Generally, stopping rules consist of thresholds on diminishing returns (in terms of test statistics) or on a diminishing supply of training cases (a minimum acceptable number of observations in a node).
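In code, such rules typically reduce to threshold checks made before each split; a sketch (the function name and default thresholds are illustrative):

```python
def should_stop(improvement, n_cases, min_improvement=0.01, min_cases=20):
    """Stop splitting on diminishing returns or a diminishing supply of cases."""
    return (improvement < min_improvement  # diminishing returns (test statistic)
            or n_cases < min_cases)        # too few observations in the node
```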
Tree Construction
Tree Construction
[Worked example: classify 8 individuals as vampires (1) or not (0) from three inputs: Cast Shadow (?, Yes, No), Likes Garlic (Yes, No), and Complexion (Pale, Healthy, Average). The root node (ID 1) holds all 8 cases: 1: 37.5%, 0: 62.5%. Tallying (vampires, non-vampires) per input value gives Cast Shadow: ? (2, 2), No (0, 3), Yes (1, 0); Likes Garlic: Yes (0, 3), No (3, 2); Complexion: Pale (0, 2), Healthy (2, 1), Average (1, 2), with weighted entropies of .5, .6, and .68 respectively.]
Tree Construction
[Cast Shadow, with the lowest weighted entropy, partitions the root: Cast Shadow = ? gives Node 2 (Count: 4, 1: 50.0%, 0: 50.0%); Cast Shadow = No gives Node 3 (Count: 3, 1: 0.0%, 0: 100%); Cast Shadow = Yes gives Node 4 (Count: 1, 1: 100%, 0: 0.0%). Nodes 3 and 4 are already pure.]
Tree Construction
[Only Node 2 (the four Cast Shadow = ? cases: 2 vampires, 2 non-vampires) needs a further split. Re-evaluating the remaining inputs within it: Likes Garlic gives Yes (0, 2) and No (2, 0), weighted entropy 0; Complexion gives Pale (0, 1), Healthy (1, 0), and Average (1, 1), weighted entropy .5. Likes Garlic is the best input.]
Tree Construction
[Node 2 is partitioned on Likes Garlic: Garlic = Yes gives Node 5 (Count: 2, 1: 0.0%, 0: 100%) and Garlic = No gives Node 6 (Count: 2, 1: 100%, 0: 0.0%). Every leaf is now pure, so construction stops.]
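A quick Python check that reproduces the weighted entropies quoted above from the per-value (vampires, non-vampires) counts (the deck truncates 0.607 and 0.689 to .6 and .68):

```python
from math import log2

def entropy(pos, neg):
    """Entropy of a node holding pos cases of one class and neg of the other."""
    total = pos + neg
    return -sum(c / total * log2(c / total) for c in (pos, neg) if c)

def weighted_entropy(groups):
    """Child entropies averaged by child size, for one candidate split."""
    n = sum(pos + neg for pos, neg in groups)
    return sum((pos + neg) / n * entropy(pos, neg) for pos, neg in groups)

# (vampires, non-vampires) per attribute value, read off the table:
print(weighted_entropy([(2, 2), (0, 3), (1, 0)]))  # Cast Shadow:  0.5
print(weighted_entropy([(0, 3), (3, 2)]))          # Likes Garlic: ~0.61
print(weighted_entropy([(0, 2), (2, 1), (1, 2)]))  # Complexion:   ~0.69
# Within Node 2, Likes Garlic gives (0, 2) and (2, 0): entropy 0.
print(weighted_entropy([(0, 2), (2, 0)]))          # 0.0
```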
Evaluation
Survival in the Titanic Sinking
[Example tree: the root node (ID 1, Count: 1,309, 1: 38.2%, 0: 61.8%) splits on Gender. The Female node (ID 23, Count: 466, 1: 72.7%) splits on Age: <30.75 or missing gives Node 30 (Count: 314, 1: 67.8%), and ≥30.75 gives Node 31 (Count: 152, 1: 82.9%). The Male node (ID 24, Count: 843, 1: 19.1%) splits on Age at a different cut-point: <9.5 gives Node 33 (Count: 43, 1: 58.1%), and ≥9.5 or missing gives Node 34 (Count: 800, 1: 17.0%); males with missing Age fall in this last branch.]
Relationship
[The same tree read for relationships: the Gender split reflects "women first" (72.7% of females survived vs. 19.1% of males), and the Age split on the male branch reflects "children first" (58.1% of boys under 9.5 survived vs. 17.0% of older males). Note the unequal partition in Age: the cut-point is 30.75 on the female branch but 9.5 on the male branch.]
Characteristics of Decision Trees
- Successive partitioning produces a tree-like visual display with a top node and descendent branches.
- Branch partitions may be two-way or multiway.
- Partitioning fields may be of nominal, ordinal, or interval measurement levels.
- The final result can be a class or a number.
Characteristics of Decision Trees
- Missing values: can be grouped with other values or given their own partition.
- Symmetry: descendent nodes can be balanced and symmetrical, employing a matching set of predictors at each level of the subtree.
- Asymmetry: descendent nodes can be unbalanced, with subnode partitions based on the most powerful predictor for each individual node.
Advantages of Decision Trees
- Decision trees can detect and visually present contextual effects.
- They are easy to understand: the resulting model is a white box.
- Flexibility: cut-points for the same input can differ between nodes.
- Missing values are allowed.
- Numerical and nominal data can be used as input.
- Output can be nominal or numerical.
Disadvantages of Decision Trees
- Deep trees tend to overfit the training data, resulting in poor generalization.
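A short scikit-learn sketch of the effect on noisy synthetic data (the dataset and the depth cap are arbitrary choices): an unconstrained tree typically fits the training set perfectly yet scores worse on held-out data than a depth-limited one.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy synthetic classification data.
X, y = make_classification(n_samples=500, n_features=20, flip_y=0.1,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for depth in (None, 3):  # None lets the tree grow until its leaves are pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_tr, y_tr)
    print(f"max_depth={depth}: train={tree.score(X_tr, y_tr):.2f} "
          f"test={tree.score(X_te, y_te):.2f}")
```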
Multi Trees
Construct a multitude of decision trees at training time and output the class that is the mode of the classes (classification) or the mean prediction (regression) of the individual trees.
Creation Method
The idea is to create multiple uncorrelated trees:
1. Select a random subset of the training set.
2. Create a tree Ti with this random subset.
3. When creating the partitions of this tree, search only a random subset of the inputs for the best input.
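A sketch of both randomization steps (the function name and parameters are illustrative; scikit-learn's max_features handles the per-partition random input subset):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def grow_ensemble(X, y, n_trees=25, seed=0):
    """Fit each tree Ti on a random subset of the training set, searching
    only a random subset of the inputs at every partition."""
    rng = np.random.default_rng(seed)
    trees = []
    for i in range(n_trees):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap: sample rows
        tree = DecisionTreeClassifier(max_features="sqrt",  # random input subset
                                      random_state=i)
        trees.append(tree.fit(X[idx], y[idx]))
    return trees
```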
Evaluation
If the output is a class (classification):
- Evaluate the sample with all of the trees.
- Each tree votes for one class.
- The selected class is the one with the most votes.
If the output is a number (regression):
- The final result is the average of the individual trees' results.
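Both evaluation rules in a few lines, assuming a list of fitted trees such as the output of the grow_ensemble sketch above and integer-encoded class labels:

```python
import numpy as np

def predict_vote(trees, X):
    """Classification: every tree votes; return the most-voted class per row."""
    votes = np.stack([t.predict(X) for t in trees]).astype(int)  # trees x rows
    return np.array([np.bincount(votes[:, j]).argmax()
                     for j in range(votes.shape[1])])

def predict_mean(trees, X):
    """Regression: the final result is the average of the trees' results."""
    return np.mean([t.predict(X) for t in trees], axis=0)
```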
Extra. Computation of Entropy
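The body of this slide is not present in the transcript; a standard definition consistent with the weighted-entropy values (.5, .6, .68) used in the construction example: for a node S with class proportions p_c, the entropy is

$$H(S) = -\sum_{c} p_c \log_2 p_c,$$

and the score of a candidate split is the size-weighted average entropy of its children,

$$H_{\text{split}} = \sum_{v} \frac{|S_v|}{|S|}\, H(S_v),$$

where S_v is the subset of cases taking value v on the candidate input. The split with the lowest weighted entropy (equivalently, the highest information gain H(S) - H_split) is chosen.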