1 Decision Trees

Example  Sky     AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny   Warm     Normal    Strong  Warm   Same      Yes
2        Sunny   Warm     High      Strong  Warm   Same      Yes
3        Rainy   Cold     High      Strong  Warm   Change    No
4        Sunny   Warm     High      Strong  Cool   Change    Yes
5        Cloudy  Warm     High      Weak    Cool   Same      Yes
6        Cloudy  Cold     High      Weak    Cool   Same      No
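As an illustration not present in the slides, the table can be written as Python data so the later entropy and information-gain calculations can be reproduced; the names ATTRIBUTES, ROWS, and EXAMPLES are my own.

```python
# EnjoySport training examples from the table above (attribute order as in the header row).
ATTRIBUTES = ["Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast", "EnjoySport"]
ROWS = [
    ("Sunny",  "Warm", "Normal", "Strong", "Warm", "Same",   "Yes"),
    ("Sunny",  "Warm", "High",   "Strong", "Warm", "Same",   "Yes"),
    ("Rainy",  "Cold", "High",   "Strong", "Warm", "Change", "No"),
    ("Sunny",  "Warm", "High",   "Strong", "Cool", "Change", "Yes"),
    ("Cloudy", "Warm", "High",   "Weak",   "Cool", "Same",   "Yes"),
    ("Cloudy", "Cold", "High",   "Weak",   "Cool", "Same",   "No"),
]
# Each example as a dict mapping attribute name -> value; "EnjoySport" is the class label.
EXAMPLES = [dict(zip(ATTRIBUTES, row)) for row in ROWS]
```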
2 Decision Trees

Sky
  Sunny  → Yes
  Rainy  → No
  Cloudy → AirTemp
             Warm → Yes
             Cold → No

Equivalent expression: (Sky = Sunny) ∨ (Sky = Cloudy ∧ AirTemp = Warm)
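As a small sketch (not part of the slides), this tree can be written as a Python function over the example dicts from Slide 1; the function name enjoy_sport is my own.

```python
def enjoy_sport(example):
    """Classify an example with the tree above: test Sky first, then AirTemp under Cloudy."""
    if example["Sky"] == "Sunny":
        return "Yes"
    if example["Sky"] == "Rainy":
        return "No"
    # Sky == "Cloudy": the Cloudy branch tests AirTemp
    return "Yes" if example["AirTemp"] == "Warm" else "No"
```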
3 Decision Trees

Sky
  Sunny  → Yes
  Rainy  → No
  Cloudy → AirTemp
             Warm → Yes
             Cold → No

New examples to classify with this tree:

Example  Sky     AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
7        Rainy   Warm     Normal    Weak    Cool   Same      ?
8        Cloudy  Warm     High      Strong  Cool   Change    ?
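Running the enjoy_sport sketch above on these rows shows how the tree answers the two queries: example 7 follows the Rainy branch, example 8 the Cloudy branch and then the AirTemp test.

```python
example_7 = {"Sky": "Rainy",  "AirTemp": "Warm", "Humidity": "Normal",
             "Wind": "Weak",   "Water": "Cool", "Forecast": "Same"}
example_8 = {"Sky": "Cloudy", "AirTemp": "Warm", "Humidity": "High",
             "Wind": "Strong", "Water": "Cool", "Forecast": "Change"}
print(enjoy_sport(example_7))  # "No"  -- the Rainy branch is a No leaf
print(enjoy_sport(example_8))  # "Yes" -- Cloudy branch, then AirTemp = Warm
```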
4 Decision Trees

A second, larger tree that is also consistent with the training examples:

Humidity
  Normal → Yes
  High   → Sky
             Sunny  → Yes
             Rainy  → No
             Cloudy → AirTemp
                        Warm → Yes
                        Cold → No
5 Decision Trees

[Figure: the tree's tests, e.g. A_1 = v_1 and then A_2 = v_2, partition the instance space into axis-parallel regions; '+' marks positive examples.]
6 Homogeneity of Examples

Entropy(S) = −p₊ log₂ p₊ − p₋ log₂ p₋

where p₊ and p₋ are the proportions of positive and negative examples in S.
[Plot: entropy as a function of p₊; it is maximal (1.0) at p₊ = 0.5 and 0 when S is pure.]
7 Homogeneity of Examples

Entropy(S) = −Σ_{i=1..c} p_i log₂ p_i

where p_i is the proportion of S belonging to class i; entropy is used as an impurity measure.
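A minimal Python sketch of this measure, assuming the example dicts from the Slide 1 block; the helper name entropy and the "EnjoySport" default label are my own choices.

```python
from collections import Counter
from math import log2

def entropy(examples, label="EnjoySport"):
    """Entropy(S) = -sum_i p_i * log2(p_i), where p_i is the proportion of class i in `examples`."""
    counts = Counter(e[label] for e in examples)
    total = sum(counts.values())
    return -sum((n / total) * log2(n / total) for n in counts.values())
```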
8 Information Gain

Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|S_v|/|S|) · Entropy(S_v)

where splitting on attribute A partitions S into subsets S_v1, S_v2, …, one for each value v of A.
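Continuing the same sketch, information gain follows directly from this definition; the helper name information_gain is mine.

```python
def information_gain(examples, attribute, label="EnjoySport"):
    """Gain(S, A) = Entropy(S) - sum over v in Values(A) of (|S_v|/|S|) * Entropy(S_v)."""
    gain = entropy(examples, label)
    for v in {e[attribute] for e in examples}:                  # Values(A) observed in S
        subset = [e for e in examples if e[attribute] == v]     # S_v
        gain -= len(subset) / len(examples) * entropy(subset, label)
    return gain
```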
9 Example

Entropy(S) = −p₊ log₂ p₊ − p₋ log₂ p₋
           = −(4/6) log₂(4/6) − (2/6) log₂(2/6)
           = 0.389 + 0.528
           = 0.917

Gain(S, Sky) = Entropy(S) − Σ_{v ∈ {Sunny, Rainy, Cloudy}} (|S_v|/|S|) · Entropy(S_v)
             = Entropy(S) − [(3/6)·Entropy(S_Sunny) + (1/6)·Entropy(S_Rainy) + (2/6)·Entropy(S_Cloudy)]
             = Entropy(S) − (2/6)·Entropy(S_Cloudy)          (S_Sunny and S_Rainy are pure, so their entropy is 0)
             = Entropy(S) − (2/6)·[−(1/2) log₂(1/2) − (1/2) log₂(1/2)]
             = 0.917 − 0.333
             = 0.584
10 Example

Entropy(S) = −p₊ log₂ p₊ − p₋ log₂ p₋
           = −(4/6) log₂(4/6) − (2/6) log₂(2/6)
           = 0.389 + 0.528
           = 0.917

Gain(S, Water) = Entropy(S) − Σ_{v ∈ {Warm, Cool}} (|S_v|/|S|) · Entropy(S_v)
               = Entropy(S) − [(3/6)·Entropy(S_Warm) + (3/6)·Entropy(S_Cool)]
               = Entropy(S) − (3/6)·2·[−(2/3) log₂(2/3) − (1/3) log₂(1/3)]
               = Entropy(S) − 0.389 − 0.528
               = 0
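With the EXAMPLES, entropy, and information_gain sketches above, the figures on slides 9 and 10 can be checked; the small differences (0.918 vs 0.917, 0.585 vs 0.584) come only from the slides rounding intermediate terms.

```python
print(round(entropy(EXAMPLES), 3))                    # 0.918
print(round(information_gain(EXAMPLES, "Sky"), 3))    # 0.585
print(round(information_gain(EXAMPLES, "Water"), 3))  # 0.0
```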
11 Example

Partial tree after choosing Sky at the root:

Sky
  Sunny  → Yes
  Rainy  → No
  Cloudy → ?

Gain(S_Cloudy, AirTemp)  = Entropy(S_Cloudy) − Σ_{v ∈ {Warm, Cold}} (|S_v|/|S_Cloudy|) · Entropy(S_v) = 1
Gain(S_Cloudy, Humidity) = Entropy(S_Cloudy) − Σ_{v ∈ {Normal, High}} (|S_v|/|S_Cloudy|) · Entropy(S_v) = 0
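The same helpers reproduce this recursive step on the Cloudy branch:

```python
s_cloudy = [e for e in EXAMPLES if e["Sky"] == "Cloudy"]   # S_Cloudy = examples 5 and 6
print(information_gain(s_cloudy, "AirTemp"))   # 1.0 -> AirTemp is chosen to expand the Cloudy branch
print(information_gain(s_cloudy, "Humidity"))  # 0.0 -> both examples have Humidity = High
```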
12 Inductive Bias

Hypothesis space: complete! (Every finite discrete-valued target function can be represented by some decision tree.)
13 Inductive Bias

Hypothesis space: complete!
Shorter trees are preferred over larger trees.
Prefer the simplest hypothesis that fits the data (Occam's razor).
14 Inductive Bias

The decision tree algorithm searches incompletely through a complete hypothesis space: a preference bias.
Candidate-Elimination searches completely through an incomplete hypothesis space: a restriction bias.
15 Overfitting

A hypothesis h ∈ H is said to overfit the training data if there exists h′ ∈ H such that h has smaller error than h′ over the training examples, but h′ has a smaller error than h over the entire distribution of instances.
16 Overfitting

A hypothesis h ∈ H is said to overfit the training data if there exists h′ ∈ H such that h has smaller error than h′ over the training examples, but h′ has a smaller error than h over the entire distribution of instances. Typical causes:
– There is noise in the data.
– The number of training examples is too small to produce a representative sample of the target concept.
17 Homework

Exercises 3.1 – 3.4 (Chapter 3, ML textbook)