1
Artificial Intelligence 7. Decision trees
Japan Advanced Institute of Science and Technology (JAIST) Yoshimasa Tsuruoka
2
Outline
What is a decision tree?
How to build a decision tree
Entropy
Information Gain
Overfitting
Generalization performance
Pruning
Lecture slides
3
Decision trees Chapter 3 of Mitchell, T., Machine Learning (1997)
Disjunction of conjunctions
Successfully applied to a broad range of tasks
  Diagnosing medical cases
  Assessing credit risk of loan applications
Nice characteristics
  Understandable to humans
  Robust to noise
4
A decision tree
Concept: PlayTennis
Outlook
  Sunny: Humidity
    High: No
    Normal: Yes
  Overcast: Yes
  Rain: Wind
    Strong: No
    Weak: Yes
5
Classification by a decision tree
Instance: <Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong>
The instance follows the branch Outlook = Sunny and then Humidity = High, so the tree classifies it as No. Temperature and Wind are not tested on this path.
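Classification is a walk from the root to a leaf. A minimal sketch, assuming the PlayTennis tree is stored as nested dictionaries; the names `TREE` and `classify` and the dictionary layout are illustrative choices, not part of the lecture.

```python
# The PlayTennis tree from the slides as nested dicts (illustrative layout).
# An internal node is {"attribute": name, "branches": {value: subtree}};
# a leaf is simply the class label string.
TREE = {
    "attribute": "Outlook",
    "branches": {
        "Sunny": {"attribute": "Humidity",
                  "branches": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain": {"attribute": "Wind",
                 "branches": {"Strong": "No", "Weak": "Yes"}},
    },
}


def classify(tree, instance):
    """Follow the branch matching the instance's value until a leaf is reached."""
    node = tree
    while isinstance(node, dict):
        value = instance[node["attribute"]]
        node = node["branches"][value]
    return node


instance = {"Outlook": "Sunny", "Temperature": "Hot",
            "Humidity": "High", "Wind": "Strong"}
print(classify(TREE, instance))  # No
```

Attributes that do not appear on the path (here Temperature and Wind) are never looked up.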
6
Disjunction of conjunctions
(Outlook = Sunny ∧ Humidity = Normal) ∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Weak)
Each conjunction corresponds to one path from the root to a Yes leaf; the tree as a whole is the disjunction of these paths.
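The equivalence between the tree and the Boolean expression can be checked exhaustively. A small sketch, assuming the three attributes take only the values shown in the tree; the function names are hypothetical.

```python
from itertools import product


def play_tennis_dnf(outlook, humidity, wind):
    """The disjunction of conjunctions, written out directly."""
    return ((outlook == "Sunny" and humidity == "Normal")
            or outlook == "Overcast"
            or (outlook == "Rain" and wind == "Weak"))


def play_tennis_tree(outlook, humidity, wind):
    """The same concept, written as the tree's nested tests."""
    if outlook == "Sunny":
        return humidity == "Normal"
    if outlook == "Overcast":
        return True
    return wind == "Weak"  # remaining case: outlook == "Rain"


# Enumerate every combination of the tested attribute values.
combos = list(product(["Sunny", "Overcast", "Rain"],
                      ["High", "Normal"],
                      ["Strong", "Weak"]))
agree = all(play_tennis_dnf(*c) == play_tennis_tree(*c) for c in combos)
print(agree)  # True
```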
7
Problems suited to decision trees
Instances are represented by attribute-value pairs
The target function has discrete output values
Disjunctive descriptions may be required
The training data may contain errors
The training data may contain missing attribute values
8
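Training data
Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No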
9
Which attribute should be tested at each node?
We want to build a small decision tree
Information gain
  How well a given attribute separates the training examples according to their target classification
  Reduction in entropy
Entropy
  (im)purity of an arbitrary collection of examples
10
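Entropy
If there are only two classes: $Entropy(S) = -p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-}$, where $p_{+}$ and $p_{-}$ are the proportions of positive and negative examples in $S$.
In general, with $c$ classes: $Entropy(S) = \sum_{i=1}^{c} -p_i \log_2 p_i$, where $p_i$ is the proportion of $S$ belonging to class $i$ and $0 \log_2 0$ is taken to be 0.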
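Entropy can be computed directly from class counts. A minimal sketch, assuming the usual base-2 definition; the function name `entropy` and the count-list input are illustrative.

```python
import math


def entropy(counts):
    """Entropy (in bits) of a collection described by its per-class counts."""
    total = sum(counts)
    result = 0.0
    for n in counts:
        if n > 0:  # a class with no examples contributes 0 (0 * log2(0) := 0)
            p = n / total
            result -= p * math.log2(p)
    return result


print(round(entropy([9, 5]), 3))   # 0.94  (9 positive, 5 negative examples)
print(entropy([7, 7]))             # 1.0   (evenly mixed: maximally impure)
print(entropy([4, 0]))             # 0.0   (pure collection)
```

The entropy is 0 when all examples belong to one class and 1 bit when two classes are equally represented.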
11
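Information Gain
The expected reduction in entropy achieved by splitting the training examples on an attribute $A$:
$Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} Entropy(S_v)$
where $S_v$ is the subset of $S$ for which attribute $A$ has value $v$.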
12
Example
13
Computing Information Gain
[Figure: the training examples split by Humidity (High, Normal) and by Wind (Weak, Strong)]
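The gain for each split can be worked out from class counts alone. A sketch under stated assumptions: the per-branch counts below are not legible in this transcript and are taken from the standard PlayTennis data in Mitchell, Chapter 3 (S = [9+,5-]; Humidity: High [3+,4-], Normal [6+,1-]; Wind: Weak [6+,2-], Strong [3+,3-]). The function names are illustrative.

```python
import math


def entropy(counts):
    """Entropy in bits from per-class counts; empty classes contribute 0."""
    total = sum(counts)
    return -sum(n / total * math.log2(n / total) for n in counts if n > 0)


def gain_from_counts(parent, children):
    """Information gain: parent entropy minus the size-weighted child entropies."""
    total = sum(parent)
    remainder = sum(sum(child) / total * entropy(child) for child in children)
    return entropy(parent) - remainder


# Counts are [positive, negative]; assumed from Mitchell's PlayTennis data.
S = [9, 5]
gain_humidity = gain_from_counts(S, [[3, 4], [6, 1]])  # High, Normal
gain_wind = gain_from_counts(S, [[6, 2], [3, 3]])      # Weak, Strong

print(f"Gain(S, Humidity) = {gain_humidity:.3f}")
print(f"Gain(S, Wind)     = {gain_wind:.3f}")
```

With these counts Humidity gives the larger gain of the two (roughly 0.15 against 0.05), so it separates the examples better than Wind.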
14
Which attribute is the best classifier?
Information gain
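The attribute with the highest information gain is chosen. A sketch that ranks all four attributes, assuming the 14 PlayTennis examples from Mitchell, Chapter 3 (the rows are written out here because the transcript shows the table only partially); `RAW`, `DATA`, and the function names are illustrative.

```python
import math
from collections import Counter

COLUMNS = ["Outlook", "Temperature", "Humidity", "Wind", "PlayTennis"]
# D1..D14, one example per line (assumed from Mitchell's PlayTennis table).
RAW = """
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No
"""
DATA = [dict(zip(COLUMNS, line.split())) for line in RAW.strip().splitlines()]


def entropy(rows):
    """Entropy in bits of the PlayTennis labels in rows."""
    counts = Counter(r["PlayTennis"] for r in rows)
    return -sum(n / len(rows) * math.log2(n / len(rows)) for n in counts.values())


def information_gain(rows, attribute):
    """Entropy of rows minus the weighted entropy after splitting on attribute."""
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


gains = {a: information_gain(DATA, a) for a in COLUMNS[:-1]}
for attribute, g in sorted(gains.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{attribute:12s} {g:.3f}")
best = max(gains, key=gains.get)
```

On this data Outlook has the largest gain, followed by Humidity, Wind, and Temperature, so Outlook is tested at the root.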
15
Splitting training data with Outlook
{D1,D2,…,D14} [9+,5-]
Outlook
  Sunny: {D1,D2,D8,D9,D11} [2+,3-] ?
  Overcast: {D3,D7,D12,D13} [4+,0-] Yes
  Rain: {D4,D5,D6,D10,D14} [3+,2-] ?
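The "?" branches are resolved by repeating the same procedure on each subset. A compact sketch of this recursion in the style of ID3, not necessarily the exact procedure presented in the lecture. It assumes the 14 PlayTennis examples from Mitchell, Chapter 3; the nested-dict tree layout and all names are illustrative.

```python
import math
from collections import Counter

ATTRS = ["Outlook", "Temperature", "Humidity", "Wind"]
# D1..D14, one example per line (assumed from Mitchell's PlayTennis table).
RAW = """
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No
"""
DATA = [dict(zip(ATTRS + ["PlayTennis"], line.split()))
        for line in RAW.strip().splitlines()]


def entropy(rows):
    counts = Counter(r["PlayTennis"] for r in rows)
    return -sum(n / len(rows) * math.log2(n / len(rows)) for n in counts.values())


def gain(rows, attr):
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder


def id3(rows, attrs):
    """Return a class label (leaf) or {attribute: {value: subtree}}."""
    labels = [r["PlayTennis"] for r in rows]
    if len(set(labels)) == 1:       # pure subset: stop with a leaf
        return labels[0]
    if not attrs:                   # nothing left to test: majority label
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: gain(rows, a))
    rest = [a for a in attrs if a != best]
    return {best: {v: id3([r for r in rows if r[best] == v], rest)
                   for v in sorted({r[best] for r in rows})}}


tree = id3(DATA, ATTRS)
print(tree)
```

On this data the recursion selects Humidity under Sunny and Wind under Rain, which reproduces the PlayTennis tree shown earlier.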
16
Overfitting
Growing each branch of the tree deeply enough to perfectly classify the training examples is not a good strategy.
The resulting tree may overfit the training data.
Overfitting: the tree explains the training data very well but performs poorly on new data.
17
Alleviating the overfitting problem
Several approaches
  Stop growing the tree earlier
  Post-prune the tree
How can we evaluate the classification performance of the tree for new data?
  The available data are separated into two sets of examples: a training set and a validation (development) set
18
Validation (development) set
Use a portion of the original training data to estimate the generalization performance.
[Figure: the original training set is divided into a training set and a validation set; the test set is kept separate.]
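Holding out a validation set can be sketched in a few lines. This assumes a random split with a fixed fraction; the 30% fraction, the seed, and the function names are illustrative choices, not values from the lecture.

```python
import random


def split_train_validation(examples, validation_fraction=0.3, seed=0):
    """Shuffle a copy of the examples and hold out a fraction for validation."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    n_validation = int(round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


def accuracy(predict, examples):
    """Fraction of (instance, label) pairs that predict classifies correctly."""
    correct = sum(1 for instance, label in examples if predict(instance) == label)
    return correct / len(examples)


# Toy usage: ten placeholder examples standing in for the original training data.
original = [(i, "Yes" if i % 2 == 0 else "No") for i in range(10)]
train, validation = split_train_validation(original)
print(len(train), len(validation))  # 7 3
```

The tree is grown on `train` only; its accuracy on `validation` estimates performance on new data and can guide when to stop growing or which branches to prune. The test set is not touched during either step.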