Published by Liliana Green. Modified over 9 years ago.
1
Iterative Dichotomiser 3 By Christopher Archibald
2
Decision Trees A decision tree is a tree in which each branching node offers a choice between two or more alternatives. Decision node: a node at which a choice is made. Leaf node: the result reached at that point of the tree.
3
Decision Trees Will it rain? If it is sunny, it will not rain. If it is cloudy, it will rain. If it is partially cloudy, it depends on whether it is humid or not.
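The "Will it rain?" tree can be sketched in Python as decision nodes and leaf nodes. This is only an illustration: the class names `Decision` and `Leaf`, the `classify` helper, and the attribute names are hypothetical, and the outcomes on the humid branch are assumed, since the slide does not say which way humidity decides.

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    """Leaf node: the result reached at this point of the tree."""
    label: str


@dataclass
class Decision:
    """Decision node: tests one attribute and branches on its value."""
    attribute: str
    branches: Dict[str, Union["Decision", Leaf]] = field(default_factory=dict)


def classify(node, example):
    """Walk from the root to a leaf, following the example's attribute values."""
    while isinstance(node, Decision):
        node = node.branches[example[node.attribute]]
    return node.label


rain_tree = Decision("sky", {
    "sunny": Leaf("no rain"),
    "cloudy": Leaf("rain"),
    "partially cloudy": Decision("humid", {
        "yes": Leaf("rain"),     # assumption: humid means rain
        "no": Leaf("no rain"),   # assumption: not humid means no rain
    }),
})

print(classify(rain_tree, {"sky": "sunny"}))
print(classify(rain_tree, {"sky": "partially cloudy", "humid": "yes"}))
```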
4
ID3 Invented by J. Ross Quinlan. Employs a top-down, greedy search through the space of possible decision trees. At each node, it selects the attribute that is most useful for classifying the examples (the attribute with the highest Information Gain).
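The top-down greedy search can be sketched as a short recursive function. This is a minimal sketch, not Quinlan's original code: it assumes examples are dictionaries, represents a decision node as an `(attribute, branches)` tuple, and falls back to a majority vote when no attributes remain. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def gain(rows, attribute, target):
    """Expected reduction in entropy from splitting rows on attribute."""
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([row[target] for row in rows]) - remainder


def id3(rows, attributes, target):
    """Return a leaf label, or an (attribute, {value: subtree}) decision node."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1 or not attributes:
        # Pure node, or nothing left to split on: use the majority label.
        return Counter(labels).most_common(1)[0][0]
    # Greedy step: pick the attribute with the highest information gain.
    best = max(attributes, key=lambda a: gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [row for row in rows if row[best] == value]
        branches[value] = id3(subset, remaining, target)
    return (best, branches)


# Hypothetical toy data: "sky" alone determines the outcome.
toy = [
    {"sky": "sunny", "humid": "no", "rain": "no"},
    {"sky": "sunny", "humid": "yes", "rain": "no"},
    {"sky": "cloudy", "humid": "no", "rain": "yes"},
    {"sky": "cloudy", "humid": "yes", "rain": "yes"},
]
tree = id3(toy, ["sky", "humid"], "rain")
print(tree)
```

Because the search is greedy, the sketch never revisits an earlier choice of attribute once a split has been made.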
5
Entropy Entropy measures how mixed (impure) a collection of examples S is with respect to the target classification: it is 0 when all examples belong to one class and 1 when the two classes are evenly split. Entropy(S) = -P_pos log2(P_pos) - P_neg log2(P_neg), where P_pos = proportion of positive examples in S and P_neg = proportion of negative examples in S.
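As a rough illustration, the two-class entropy formula can be written as a small Python function. The name `entropy` and its count-based signature are choices made for this sketch; a class with zero examples is skipped, following the usual convention that 0 * log2(0) counts as 0.

```python
import math


def entropy(pos, neg):
    """Entropy of a collection with `pos` positive and `neg` negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:  # skip empty classes: 0 * log2(0) is treated as 0
            p = count / total
            result -= p * math.log2(p)
    return result


print(entropy(5, 5))  # evenly mixed collection
print(entropy(6, 0))  # pure collection
```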
6
Entropy Example If S is a collection of 15 examples with 10 YES and 5 NO, then: Entropy(S) = -(10/15) log2(10/15) - (5/15) log2(5/15) = 0.918. On a calculator whose log key is base 10, you would enter -((10/15)log(10/15))/log(2) - ((5/15)log(5/15))/log(2), because log2(x) = log(x)/log(2).
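The change of base can be checked in a few lines of Python. The sketch below computes the same entropy twice, once with base-10 logs divided by log10(2), as on a calculator, and once with base-2 logs directly; the variable names are illustrative only.

```python
import math

p_yes, p_no = 10 / 15, 5 / 15

# Base-10 logs divided by log10(2), as entered on a calculator.
via_log10 = (-(p_yes * math.log10(p_yes)) / math.log10(2)
             - (p_no * math.log10(p_no)) / math.log10(2))

# Direct base-2 logs.
via_log2 = -p_yes * math.log2(p_yes) - p_no * math.log2(p_no)

print(round(via_log10, 3), round(via_log2, 3))  # 0.918 0.918
```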
7
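Information Gain Measures the expected reduction in entropy from splitting S on an attribute A. The higher the Information Gain, the greater the expected reduction in entropy. The equation for Information Gain, summing over each value v that A can take, is: Gain(S, A) = Entropy(S) - Σ_v (|S_v| / |S|) * Entropy(S_v)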
8
Information Gain A is an attribute of collection S. S_v = subset of S for which attribute A has value v. |S_v| = number of elements in S_v. |S| = number of elements in S.
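Using these definitions, Information Gain can be sketched in Python as the entropy of S minus the size-weighted entropy of each subset S_v. The sketch assumes each example is a dictionary; `information_gain`, its parameters, and the toy rows are hypothetical names introduced for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Gain(S, A): entropy of S minus the weighted entropy of each S_v."""
    subsets = defaultdict(list)  # value v -> target labels of the rows in S_v
    for row in rows:
        subsets[row[attribute]].append(row[target])
    remainder = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return entropy([row[target] for row in rows]) - remainder


# Hypothetical toy data in which attribute "a" separates the classes perfectly.
rows = [
    {"a": "x", "y": "yes"},
    {"a": "x", "y": "yes"},
    {"a": "z", "y": "no"},
    {"a": "z", "y": "no"},
]
print(information_gain(rows, "a", "y"))
```

An attribute that splits S into pure subsets removes all of the entropy, so its gain equals Entropy(S).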
9
Example
Video           Contains Car  Contains Violence  Rally Cars  Races
GTA 4           Yes           Yes                No          No
Doom            No            Yes                No          No
GTA3            Yes           Yes                No          No
Halo 3          Yes           Yes                No          No
Need for Speed  Yes           No                 No          Yes
Rally Sport     Yes           No                 Yes         No
10
Example (cont) Entropy(S) = -P_pos log2(P_pos) - P_neg log2(P_neg). Taking Contains Violence as the target class gives 4 Yes and 2 No, so Entropy(4Y,2N) = -(4/6) log2(4/6) - (2/6) log2(2/6) = 0.91829. Now that we know the entropy, we are going to use it to find the Information Gain of each attribute.
12
Example (cont) For attribute Contains Cars: S = [4Y,2N]. S_Yes = [3Y,2N], E(S_Yes) = 0.97095. S_No = [1Y,0N], E(S_No) = 0. Gain(S, Contains Cars) = 0.91829 - [(5/6)*0.97095 + (1/6)*0] = 0.10916
14
Example (cont) For attribute Contains Rally Cars: S = [4Y,2N]. S_Yes = [0Y,1N], E(S_Yes) = 0. S_No = [4Y,1N], E(S_No) = 0.7219. Gain(S, Contains Rally Cars) = 0.91829 - [(1/6)*0 + (5/6)*0.7219] = 0.3167
16
Example (cont) For attribute Races: S = [4Y,2N]. S_Yes = [0Y,1N], E(S_Yes) = 0. S_No = [4Y,1N], E(S_No) = 0.7219. Gain(S, Races) = 0.91829 - [(1/6)*0 + (5/6)*0.7219] = 0.3167
17
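Example (cont) Gain(S, Contains Cars) = 0.10916. Gain(S, Contains Rally Cars) = 0.3167. Gain(S, Races) = 0.3167. Contains Rally Cars and Races tie for the highest gain, so ID3 would split on either of them first.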
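The three gains can be reproduced with a short Python script. This is a sketch under two assumptions: the cell values are read off the example table as reconstructed from the slides, and Contains Violence is taken as the target class, which is the only column with the 4 Yes / 2 No split used in the entropy calculation. The column keys and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

COLUMNS = ("video", "car", "violence", "rally_cars", "races")
DATA = [  # assumed table values
    ("GTA 4",          "Yes", "Yes", "No",  "No"),
    ("Doom",           "No",  "Yes", "No",  "No"),
    ("GTA3",           "Yes", "Yes", "No",  "No"),
    ("Halo 3",         "Yes", "Yes", "No",  "No"),
    ("Need for Speed", "Yes", "No",  "No",  "Yes"),
    ("Rally Sport",    "Yes", "No",  "Yes", "No"),
]
ROWS = [dict(zip(COLUMNS, record)) for record in DATA]


def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the rows minus the weighted entropy after splitting."""
    subsets = defaultdict(list)
    for row in rows:
        subsets[row[attribute]].append(row[target])
    remainder = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return entropy([row[target] for row in rows]) - remainder


# Assumption: "violence" is the target class.
gains = {a: information_gain(ROWS, a, "violence")
         for a in ("car", "rally_cars", "races")}
for attribute, value in gains.items():
    print(f"{attribute}: {value:.4f}")
```

The script works at full floating-point precision, so the last digit can differ slightly from the hand calculation, which rounds intermediate values.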
18
Source Dr. Lee's Slides, San Jose State University, Spring 2008; http://www.cise.ufl.edu/~ddd/cap6635/Fall-97/Short-papers/2.htm; http://decisiontrees.net/node/27