Slide 1
Mehdi Ghayoumi
MSB rm 132
mghayoum@kent.edu
Ofc hr: Thur, 11-12a
Machine Learning
Slide 5
Each branch corresponds to an attribute value.
Each internal node has a splitting predicate.
Each leaf node assigns a classification.
Slide 6
The entropy (disorder, impurity) of a set of examples S, relative to a binary classification, is:
Entropy(S) = -p1·log2(p1) - p0·log2(p0)
where p1 is the fraction of positive examples in S and p0 is the fraction of negative examples.
Slide 8
If all examples are in one category, entropy is zero (we define 0·log2(0) = 0).
If examples are equally mixed (p1 = p0 = 0.5), entropy is at its maximum of 1.
Entropy can be viewed as the number of bits required on average to encode the class of an example in S when data compression (e.g. Huffman coding) is used to give shorter codes to more likely cases.
For multi-class problems with c categories, entropy generalizes to:
Entropy(S) = -Σ_{i=1..c} pi·log2(pi)
where pi is the fraction of examples in category i.
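As a quick illustration, here is a minimal Python sketch of this definition (the helper `entropy` and its name are mine, not from the slides):

```python
import math

def entropy(class_counts):
    """Entropy of a set, given per-class example counts.
    Uses the slide's convention that 0*log2(0) = 0."""
    total = sum(class_counts)
    ent = 0.0
    for count in class_counts:
        if count > 0:  # empty classes contribute 0 by convention
            p = count / total
            ent -= p * math.log2(p)
    return ent

print(entropy([5, 5]))   # equally mixed -> 1.0
print(entropy([10, 0]))  # all one category -> 0.0
```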
Slide 10
Gain(S, A) is the expected reduction in entropy caused by knowing the value of attribute A, i.e. by sorting S on A:
Gain(S, A) = Entropy(S) - Σ_{v ∈ Values(A)} (|Sv|/|S|)·Entropy(Sv)
where Values(A) is the set of all possible values for attribute A, and Sv is the subset of S for which attribute A has value v.
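A matching sketch of the gain computation, assuming examples are stored as dicts and reusing the `entropy` helper above (all names here are hypothetical, not from the slides):

```python
from collections import Counter

def information_gain(examples, attribute, label="class"):
    """Gain(S, A): entropy of S minus the size-weighted entropy
    of the subsets Sv obtained by partitioning S on attribute A."""
    def ent(subset):
        counts = Counter(ex[label] for ex in subset)
        return entropy(list(counts.values()))  # entropy() from the earlier sketch

    gain = ent(examples)
    for v in set(ex[attribute] for ex in examples):
        sv = [ex for ex in examples if ex[attribute] == v]
        gain -= len(sv) / len(examples) * ent(sv)
    return gain
```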
Slide 11
Splitting on Humidity:
High: 3+, 4- (E = 0.985)
Normal: 6+, 1- (E = 0.592)
Slide 12
Splitting on Humidity:
High: 3+, 4- (E = 0.985)
Normal: 6+, 1- (E = 0.592)
Splitting on Wind:
Weak: 6+, 2- (E = 0.811)
Strong: 3+, 3- (E = 1.0)
Slide 13
For the splits above:
Gain(S, Humidity) = 0.94 - (7/14)·0.985 - (7/14)·0.592 = 0.151
Gain(S, Wind) = 0.94 - (8/14)·0.811 - (6/14)·1.0 = 0.048
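These two numbers can be reproduced from the branch counts with the `entropy` helper sketched earlier:

```python
e_s = entropy([9, 5])  # whole training set: 9 positive, 5 negative, ~0.940
gain_humidity = e_s - 7/14 * entropy([3, 4]) - 7/14 * entropy([6, 1])
gain_wind = e_s - 8/14 * entropy([6, 2]) - 6/14 * entropy([3, 3])
print(round(gain_humidity, 3))  # ~0.152 (the slide's 0.151 rounds intermediate values)
print(round(gain_wind, 3))      # 0.048
```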
Slide 14
Splitting on Outlook:
Sunny (examples 1, 2, 8, 9, 11): 2+, 3- (E = 0.970)
Overcast (examples 3, 7, 12, 13): 4+, 0- (E = 0.0)
Rain (examples 4, 5, 6, 10, 14): 3+, 2- (E = 0.970)
Gain(S, Outlook) = 0.246
Slide 15
Gain(S, Wind) = 0.048
Gain(S, Humidity) = 0.151
Gain(S, Temperature) = 0.029
Gain(S, Outlook) = 0.246
Outlook has the largest gain, so it is chosen as the root; see the sketch below.
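Picking the root is then just an argmax over the gains; a one-line sketch using the slide's values:

```python
# Root choice: the attribute with the largest information gain wins
gains = {"Wind": 0.048, "Humidity": 0.151, "Temperature": 0.029, "Outlook": 0.246}
root = max(gains, key=gains.get)
print(root)  # Outlook
```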
Slide 16
Partial tree with Outlook at the root:
Sunny (1, 2, 8, 9, 11): 2+, 3- → ?
Overcast (3, 7, 12, 13): 4+, 0- → Yes
Rain (4, 5, 6, 10, 14): 3+, 2- → ?
Slide 17
Choosing the test for the Sunny branch (2+, 3-, E = 0.97):
Gain(S_sunny, Humidity) = 0.97 - (3/5)·0 - (2/5)·0 = 0.97
Gain(S_sunny, Temperature) = 0.97 - (2/5)·0 - (2/5)·1 - (1/5)·0 = 0.57
Gain(S_sunny, Wind) = 0.97 - (2/5)·1 - (3/5)·0.92 = 0.02
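The same arithmetic, checked numerically with the earlier `entropy` helper (the branch counts are the standard ones for the five Sunny examples):

```python
e_sunny = entropy([2, 3])  # ~0.97
print(round(e_sunny - 3/5 * 0 - 2/5 * 0, 2))                              # Humidity: 0.97
print(round(e_sunny - 2/5 * 0 - 2/5 * entropy([1, 1]) - 1/5 * 0, 2))      # Temperature: 0.57
print(round(e_sunny - 2/5 * entropy([1, 1]) - 3/5 * entropy([1, 2]), 2))  # Wind: 0.02
```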
Slide 18
Humidity wins for the Sunny branch:
Sunny → Humidity (High → No, Normal → Yes)
Overcast → Yes
Rain → ?
Slide 19
The finished tree:
Sunny → Humidity (High → No, Normal → Yes)
Overcast → Yes
Rain → Wind (Strong → No, Weak → Yes)
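The finished tree is small enough to write out directly as nested conditionals; a sketch (function name mine):

```python
def play_tennis(outlook, humidity, wind):
    """The learned decision tree as nested conditionals."""
    if outlook == "Overcast":
        return "Yes"
    if outlook == "Sunny":
        return "No" if humidity == "High" else "Yes"
    # outlook == "Rain"
    return "No" if wind == "Strong" else "Yes"

print(play_tennis("Sunny", "Normal", "Weak"))  # Yes
```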
Slide 30
Person    Hair Length  Weight  Age  Class
Homer     0"           250     36   M
Vardhan   10"          150     34   F
Kumar     2"           90      10   M
Lisa      6"           78      8    F
Maggie    4"           20      1    F
Abe       1"           170     70   M
Selma     8"           160     41   F
Sai       10"          180     38   M
Krusty    6"           200     45   M
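For the splits on the next slides, the table can be held as plain tuples; a sketch (the field order is my own choice):

```python
# The table as (name, hair_length_inches, weight, age, cls) tuples
people = [
    ("Homer",   0, 250, 36, "M"), ("Vardhan", 10, 150, 34, "F"),
    ("Kumar",   2,  90, 10, "M"), ("Lisa",     6,  78,  8, "F"),
    ("Maggie",  4,  20,  1, "F"), ("Abe",      1, 170, 70, "M"),
    ("Selma",   8, 160, 41, "F"), ("Sai",     10, 180, 38, "M"),
    ("Krusty",  6, 200, 45, "M"),
]
```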
Slide 31
Let us try splitting on Hair Length:
Hair Length <= 5?
Entropy(4F, 5M) = -(4/9)·log2(4/9) - (5/9)·log2(5/9) = 0.9911
yes: Entropy(1F, 3M) = -(1/4)·log2(1/4) - (3/4)·log2(3/4) = 0.8113
no: Entropy(3F, 2M) = -(3/5)·log2(3/5) - (2/5)·log2(2/5) = 0.9710
Gain(Hair Length <= 5) = 0.9911 - (4/9 · 0.8113 + 5/9 · 0.9710) = 0.0911
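The same computation works for any numeric threshold split; a generic sketch reusing the `entropy` helper and the `people` list above (all names are mine):

```python
from collections import Counter

def threshold_gain(rows, feature_index, threshold, label_index=4):
    """Gain of the binary split `feature <= threshold` on a list of tuples."""
    def ent(subset):
        return entropy(list(Counter(r[label_index] for r in subset).values()))

    yes = [r for r in rows if r[feature_index] <= threshold]
    no = [r for r in rows if r[feature_index] > threshold]
    return (ent(rows) - len(yes) / len(rows) * ent(yes)
                      - len(no) / len(rows) * ent(no))

print(round(threshold_gain(people, 1, 5), 4))  # Hair Length <= 5 -> 0.0911
```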
Slide 32
Let us try splitting on Weight:
Weight <= 160?
Entropy(4F, 5M) = 0.9911 (as before)
yes: Entropy(4F, 1M) = -(4/5)·log2(4/5) - (1/5)·log2(1/5) = 0.7219
no: Entropy(0F, 4M) = -(0/4)·log2(0/4) - (4/4)·log2(4/4) = 0
Gain(Weight <= 160) = 0.9911 - (5/9 · 0.7219 + 4/9 · 0) = 0.5900
Slide 33
Let us try splitting on Age:
Age <= 40?
Entropy(4F, 5M) = 0.9911 (as before)
yes: Entropy(3F, 3M) = -(3/6)·log2(3/6) - (3/6)·log2(3/6) = 1
no: Entropy(1F, 2M) = -(1/3)·log2(1/3) - (2/3)·log2(2/3) = 0.9183
Gain(Age <= 40) = 0.9911 - (6/9 · 1 + 3/9 · 0.9183) = 0.0183
Of the three candidate splits, Weight <= 160 gives by far the largest gain, so it becomes the root; a comparison sketch follows.
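Running the `threshold_gain` helper over all three candidate splits reproduces the comparison:

```python
# Compare the three candidate splits from the slides
for name, idx, thr in [("Hair Length <= 5", 1, 5),
                       ("Weight <= 160", 2, 160),
                       ("Age <= 40", 3, 40)]:
    print(name, round(threshold_gain(people, idx, thr), 4))
# Hair Length <= 5  0.0911
# Weight <= 160     0.59
# Age <= 40         0.0183
```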
Slide 34
The resulting tree:
Weight <= 160?
no → Male
yes → Hair Length <= 2?
  yes → Male
  no → Female
Slide 35
Rules to classify Males/Females:
If Weight greater than 160, classify as Male.
Else if Hair Length less than or equal to 2, classify as Male.
Else classify as Female.
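The rule list translates directly to code; a sketch that also checks it against the nine training examples in the `people` list above:

```python
def classify(weight, hair_length):
    """The slide's rules, written out as conditionals."""
    if weight > 160:
        return "Male"
    elif hair_length <= 2:
        return "Male"
    else:
        return "Female"

# Sanity check against the table: all nine people are classified correctly
for name, hair, weight, age, cls in people:
    assert classify(weight, hair) == ("Male" if cls == "M" else "Female")
```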
Slide 36
Thank you!