Iterative Dichotomiser 3 (ID3) By Christopher Archibald
Decision Trees A decision tree is a tree in which each branching node represents a choice between two or more alternatives. Decision Node: a node at which a choice is made. Leaf Node: the outcome reached at that point of the tree.
Decision Trees Will it rain? If it is sunny, it will not rain. If it is cloudy, it will rain. If it is partially cloudy, it depends on whether it is humid or not.
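The rain example above can be sketched as nested conditionals; the function name and the string values for the outlook are illustrative assumptions, not part of the original slides.

```python
# A minimal sketch of the "Will it rain?" decision tree as nested
# conditionals. Each `if` is a decision node; each `return` is a leaf.
def will_it_rain(outlook: str, humid: bool) -> bool:
    if outlook == "sunny":            # leaf: it will not rain
        return False
    if outlook == "cloudy":           # leaf: it will rain
        return True
    # "partially cloudy" is another decision node: branch on humidity
    return humid
```

ID3's job, described next, is to learn a tree like this automatically from labeled examples.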
ID3 Invented by J. Ross Quinlan. Employs a top-down greedy search through the space of possible decision trees. At each step it selects the attribute that is most useful for classifying the examples, i.e. the attribute with the highest Information Gain.
Entropy Entropy tells us how well an attribute separates the given examples according to the target classification. Entropy(S) = -P_pos log2(P_pos) - P_neg log2(P_neg) P_pos = proportion of positive examples P_neg = proportion of negative examples
Entropy Example Example: If S is a collection of 15 examples with 10 YES and 5 NO, then: Entropy(S) = -(10/15) log2(10/15) - (5/15) log2(5/15) = 0.9183 On a calculator whose log key is base 10, use the change-of-base rule: -((10/15)log(10/15))/log(2) - ((5/15)log(5/15))/log(2), because log is base 10 and you need base 2.
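The entropy formula can be sketched in a few lines of Python; `math.log(x, 2)` performs the base-2 conversion directly, so no change-of-base trick is needed.

```python
import math

# Sketch of the entropy formula above for a two-class collection.
def entropy(pos: int, neg: int) -> float:
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:                     # 0 * log(0) is treated as 0
            p = count / total
            result -= p * math.log(p, 2)
    return result

print(round(entropy(10, 5), 4))       # the 10 YES / 5 NO example: 0.9183
```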
Information Gain Measures the expected reduction in entropy: the higher the Information Gain, the greater the expected reduction in entropy. The equation for Information Gain is: Gain(S, A) = Entropy(S) - Σ_(v ∈ Values(A)) (|S_v|/|S|) Entropy(S_v)
Information Gain A is an attribute of collection S. S_v = subset of S for which attribute A has value v. |S_v| = number of elements in S_v. |S| = number of elements in S.
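The gain equation above can be sketched directly in Python. The list-of-dicts data layout and the function names are my own assumptions for illustration, not from the slides.

```python
import math
from collections import defaultdict

def entropy(labels):
    """Entropy of a list of class labels, e.g. ["Yes", "No", ...]."""
    total = len(labels)
    result = 0.0
    for value in set(labels):
        p = labels.count(value) / total
        result -= p * math.log(p, 2)
    return result

def information_gain(examples, attribute, target):
    """Gain(S, A) = Entropy(S) - sum over v of |S_v|/|S| * Entropy(S_v).

    `examples` is a list of dicts; `attribute` and `target` are keys
    (an illustrative data layout, not from the original slides).
    """
    subsets = defaultdict(list)
    for example in examples:
        subsets[example[attribute]].append(example[target])
    remainder = sum(len(sv) / len(examples) * entropy(sv)
                    for sv in subsets.values())
    return entropy([e[target] for e in examples]) - remainder
```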
Example

Video           Contains Cars   Contains Violence   Rally Cars   Races
GTA 4           Yes             Yes                 No           No
Doom            No              Yes                 No           No
GTA 3           Yes             Yes                 No           No
Halo 3          Yes             Yes                 No           No
Need for Speed  Yes             No                  No           Yes
Rally Sport     Yes             No                  Yes          No

(Target attribute: Contains Violence)
Example (cont.) Entropy(S) = -P_pos log2(P_pos) - P_neg log2(P_neg) Entropy(4Y, 2N) = -(4/6) log2(4/6) - (2/6) log2(2/6) = 0.9183 Now that we know the entropy, we're going to use that answer to find the Information Gain.
Example (cont.) For attribute (Contains Cars): S = [4Y, 2N] S_Yes = [3Y, 2N], E(S_Yes) = 0.9710 S_No = [1Y, 0N], E(S_No) = 0 Gain(S, Contains Cars) = 0.9183 - [(5/6)*0.9710 + (1/6)*0] = 0.1092
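The arithmetic on this slide can be checked step by step in a few lines (a sketch; the variable names are mine):

```python
import math

log2 = lambda x: math.log(x, 2)

# Checking the Contains Cars numbers from the slide above.
e_s   = -(4/6) * log2(4/6) - (2/6) * log2(2/6)   # Entropy(S)  ~ 0.9183
e_yes = -(3/5) * log2(3/5) - (2/5) * log2(2/5)   # E(S_Yes)    ~ 0.9710
e_no  = 0.0                                       # all one class => 0
gain  = e_s - ((5/6) * e_yes + (1/6) * e_no)      # Gain        ~ 0.1092
print(round(gain, 4))
```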
Example (cont.) For attribute (Contains Rally Cars): S = [4Y, 2N] S_Yes = [0Y, 1N], E(S_Yes) = 0 S_No = [4Y, 1N], E(S_No) = 0.7219 Gain(S, Contains Rally Cars) = 0.9183 - [(1/6)*0 + (5/6)*0.7219] = 0.3167
Example (cont.) For attribute (Races): S = [4Y, 2N] S_Yes = [0Y, 1N], E(S_Yes) = 0 S_No = [4Y, 1N], E(S_No) = 0.7219 Gain(S, Races) = 0.9183 - [(1/6)*0 + (5/6)*0.7219] = 0.3167
Example (cont.) Gain(S, Contains Cars) = 0.1092 Gain(S, Contains Rally Cars) = 0.3167 Gain(S, Races) = 0.3167 Contains Rally Cars and Races tie for the highest Information Gain, so ID3 picks one of them as the root attribute.
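The whole selection step can be recomputed from the subset counts in the example; this sketch feeds the counts straight into the entropy formula and picks the attribute with the highest gain (ties resolved by whichever comes first).

```python
import math

def entropy(pos, neg):
    """Entropy of a subset with `pos` positive and `neg` negative examples."""
    result = 0.0
    for count in (pos, neg):
        if count:                     # 0 * log(0) is treated as 0
            p = count / (pos + neg)
            result -= p * math.log(p, 2)
    return result

e_s = entropy(4, 2)                   # Entropy(S) for [4Y, 2N]

# Gains computed from the per-attribute subset counts in the slides.
gains = {
    "Contains Cars":       e_s - (5/6)*entropy(3, 2) - (1/6)*entropy(1, 0),
    "Contains Rally Cars": e_s - (1/6)*entropy(0, 1) - (5/6)*entropy(4, 1),
    "Races":               e_s - (1/6)*entropy(0, 1) - (5/6)*entropy(4, 1),
}
best = max(gains, key=gains.get)      # first attribute with the highest gain
print(best, round(gains[best], 4))
```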
Source Dr. Lee's slides, San Jose State University, Spring /Short-papers/2.htm