Training Examples. Entropy and Information Gain (presentation transcript)

1 Training Examples

The PlayTennis training set (Mitchell, Machine Learning, Table 3.2), used throughout the following slides:

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No

2 Entropy and Information Gain

Information answers questions: the more clueless I am about the answer initially, the more information is contained in the final answer.
Scale:
– 1 bit = completely clueless: the answer to a Boolean question with prior (0.5, 0.5)
– 0 bits = complete knowledge: the answer to a Boolean question with prior (0, 1) or (1, 0)
– ? = the answer to a Boolean question with prior (p, 1 − p)
– This question leads to the concept of Entropy

3 Entropy

S is a sample of training examples
p₊ is the proportion of positive examples
p₋ is the proportion of negative examples
Entropy measures the impurity of S:
Entropy(S) = −p₊ log₂ p₊ − p₋ log₂ p₋
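
A minimal sketch of this formula in Python (the function name is mine, not from the slides), checked against the E = 0.940 quoted on later slides for S = [9+, 5−]:

```python
import math

def entropy(pos, neg):
    """Impurity of a sample with pos positive and neg negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:                        # treat 0 * log2(0) as 0
            p = count / total
            result -= p * math.log2(p)
    return result

print(f"{entropy(9, 5):.3f}")            # 0.940 for S = [9+, 5-]
```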

4 Information Gain

Gain(S, A): the expected reduction in entropy due to sorting S on attribute A
Gain(S, A) = Entropy(S) − Σ_{v ∈ values(A)} (|S_v| / |S|) · Entropy(S_v)
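
A sketch of the same formula, assuming each example is a dict whose label lives under a "class" key (this representation and all names are mine, not the slides'):

```python
from collections import defaultdict

def information_gain(examples, attribute):
    """Expected entropy reduction from sorting `examples` on `attribute`."""
    def entropy_of(subset):
        pos = sum(1 for ex in subset if ex["class"])
        return entropy(pos, len(subset) - pos)   # entropy() from the Slide 3 sketch

    partitions = defaultdict(list)               # S_v for each value v of A
    for ex in examples:
        partitions[ex[attribute]].append(ex)

    remainder = sum(len(s_v) / len(examples) * entropy_of(s_v)
                    for s_v in partitions.values())
    return entropy_of(examples) - remainder
```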

6 Training Examples (the same table as Slide 1)

7 Selecting the First Attribute

S = [9+, 5−], E = 0.940

Humidity:
– High → [3+, 4−], E = 0.985
– Normal → [6+, 1−], E = 0.592
Gain(S, Humidity) = 0.940 − (7/14)·0.985 − (7/14)·0.592 = 0.151

Wind:
– Weak → [6+, 2−], E = 0.811
– Strong → [3+, 3−], E = 1.0
Gain(S, Wind) = 0.940 − (8/14)·0.811 − (6/14)·1.0 = 0.048

Humidity provides greater information gain than Wind with respect to the target classification.
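
The per-branch entropies quoted above can be double-checked with the entropy sketch from Slide 3:

```python
print(f"{entropy(3, 4):.3f}")   # Humidity = High   -> 0.985
print(f"{entropy(6, 1):.3f}")   # Humidity = Normal -> 0.592
print(f"{entropy(6, 2):.3f}")   # Wind = Weak       -> 0.811
print(f"{entropy(3, 3):.3f}")   # Wind = Strong     -> 1.000
```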

8 Selecting the First Attribute

S = [9+, 5−], E = 0.940

Outlook:
– Sunny → [2+, 3−], E = 0.971
– Overcast → [4+, 0−], E = 0.0
– Rain → [3+, 2−], E = 0.971
Gain(S, Outlook) = 0.940 − (5/14)·0.971 − (4/14)·0.0 − (5/14)·0.971 = 0.247

9 Selecting the First Attribute

The information gain values for the 4 attributes are:
Gain(S, Outlook) = 0.247
Gain(S, Humidity) = 0.151
Gain(S, Wind) = 0.048
Gain(S, Temperature) = 0.029
where S denotes the collection of training examples.
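
As a cross-check, the four gains can be reproduced with the information_gain sketch from Slide 4, encoding the Slide 1 table as dicts (the encoding is mine). Exact arithmetic gives 0.152 for Humidity; the slide's 0.151 comes from rounding the branch entropies to three digits first:

```python
# PlayTennis rows from Slide 1: (Outlook, Temperature, Humidity, Wind, class)
rows = [
    ("Sunny",    "Hot",  "High",   "Weak",   False),  # D1
    ("Sunny",    "Hot",  "High",   "Strong", False),  # D2
    ("Overcast", "Hot",  "High",   "Weak",   True),   # D3
    ("Rain",     "Mild", "High",   "Weak",   True),   # D4
    ("Rain",     "Cool", "Normal", "Weak",   True),   # D5
    ("Rain",     "Cool", "Normal", "Strong", False),  # D6
    ("Overcast", "Cool", "Normal", "Strong", True),   # D7
    ("Sunny",    "Mild", "High",   "Weak",   False),  # D8
    ("Sunny",    "Cool", "Normal", "Weak",   True),   # D9
    ("Rain",     "Mild", "Normal", "Weak",   True),   # D10
    ("Sunny",    "Mild", "Normal", "Strong", True),   # D11
    ("Overcast", "Mild", "High",   "Strong", True),   # D12
    ("Overcast", "Hot",  "Normal", "Weak",   True),   # D13
    ("Rain",     "Mild", "High",   "Strong", False),  # D14
]
examples = [dict(zip(("Outlook", "Temperature", "Humidity", "Wind", "class"), r))
            for r in rows]

for attr in ("Outlook", "Humidity", "Wind", "Temperature"):
    print(attr, round(information_gain(examples, attr), 3))
# Outlook 0.247, Humidity 0.152, Wind 0.048, Temperature 0.029
```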

10 Selecting the Next Attribute

Outlook splits S = [D1, D2, …, D14] = [9+, 5−]:
– Sunny → S_sunny = [D1, D2, D8, D9, D11] = [2+, 3−] → ?
– Overcast → [D3, D7, D12, D13] = [4+, 0−] → Yes
– Rain → [D4, D5, D6, D10, D14] = [3+, 2−] → ?

Within S_sunny (E = 0.970):
Gain(S_sunny, Humidity) = 0.970 − (3/5)·0.0 − (2/5)·0.0 = 0.970
Gain(S_sunny, Temp.) = 0.970 − (2/5)·0.0 − (2/5)·1.0 − (1/5)·0.0 = 0.570
Gain(S_sunny, Wind) = 0.970 − (2/5)·1.0 − (3/5)·0.918 = 0.019
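
Continuing the sketches above, the same ranking falls out of information_gain applied to the Sunny subset (exact arithmetic gives 0.971 / 0.571 / 0.020; the slide rounds E(S_sunny) down to 0.970 first):

```python
s_sunny = [ex for ex in examples if ex["Outlook"] == "Sunny"]  # D1,D2,D8,D9,D11
for attr in ("Humidity", "Temperature", "Wind"):
    print(attr, round(information_gain(s_sunny, attr), 3))
# Humidity 0.971, Temperature 0.571, Wind 0.02
```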

11 ID3 Algorithm

The resulting tree:
Outlook
– Sunny → Humidity
    – High → No [D1, D2, D8]
    – Normal → Yes [D9, D11]
– Overcast → Yes [D3, D7, D12, D13]
– Rain → Wind
    – Strong → No [D6, D14]
    – Weak → Yes [D4, D5, D10]
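
A compact sketch of the ID3 recursion that grows such a tree, reusing the information_gain sketch above; the dict-based tree representation and majority-vote tie-breaking are my assumptions, not the slides':

```python
def id3(examples, attributes):
    """Grow a decision tree: dict nodes keyed by attribute value, bool leaves."""
    labels = [ex["class"] for ex in examples]
    if all(labels):                # pure positive node
        return True
    if not any(labels):            # pure negative node
        return False
    if not attributes:             # no tests left: majority vote
        return labels.count(True) >= labels.count(False)

    # Choose the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(examples, a))
    rest = [a for a in attributes if a != best]
    tree = {"attribute": best, "branches": {}}
    for value in {ex[best] for ex in examples}:
        subset = [ex for ex in examples if ex[best] == value]
        tree["branches"][value] = id3(subset, rest)
    return tree

tree = id3(examples, ["Outlook", "Temperature", "Humidity", "Wind"])
```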

12 Which attribute should we start with?

ID#  Texture  Temp  Size    Classification
1    Smooth   Cold  Large   Yes
2    Smooth   Cold  Small   No
3    Smooth   Cool  Large   Yes
4    Smooth   Cool  Small   Yes
5    Smooth   Hot   Small   Yes
6    Wavy     Cold  Medium  No
7    Wavy     Hot   Large   Yes
8    Rough    Cold  Large   No
9    Rough    Cool  Large   Yes
10   Rough    Hot   Small   No
11   Rough    Warm  Medium  Yes

13 Which node is the best?

Texture (smooth, wavy, rough):
5/11 · (−4/5 log₂ 4/5 − 1/5 log₂ 1/5)
+ 2/11 · (−1/2 log₂ 1/2 − 1/2 log₂ 1/2)
+ 4/11 · (−2/4 log₂ 2/4 − 2/4 log₂ 2/4)
= 5/11 · (0.722) + 2/11 · 1 + 4/11 · 1 = 0.874

14 Which node is the best?

Temperature (cold, cool, hot, warm):
4/11 · (−1/4 log₂ 1/4 − 3/4 log₂ 3/4)
+ 3/11 · (−3/3 log₂ 3/3 − 0/3 log₂ 0/3)
+ 3/11 · (−2/3 log₂ 2/3 − 1/3 log₂ 1/3)
+ 1/11 · (−1/1 log₂ 1/1 − 0/1 log₂ 0/1)
= 4/11 · (0.811) + 0 + 3/11 · (0.918) + 0 = 0.545
(taking 0 · log₂ 0 = 0)

15 Which node is the best?

Size (large, medium, small):
5/11 · (−4/5 log₂ 4/5 − 1/5 log₂ 1/5)
+ 2/11 · (−1/2 log₂ 1/2 − 1/2 log₂ 1/2)
+ 4/11 · (−2/4 log₂ 2/4 − 2/4 log₂ 2/4)
= 5/11 · (0.722) + 2/11 · 1 + 4/11 · 1 = 0.874

16 Temperature is the best attribute to start with: its weighted entropy (0.545) is the lowest of the three (Texture and Size both give 0.874), so splitting on Temperature yields the highest information gain.
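
The three hand computations on Slides 13 through 15 can be checked with a short script, reusing the entropy sketch from Slide 3 (the remainder name and the tuple encoding are mine):

```python
# Rows from Slide 12: (Texture, Temp, Size, class) for IDs 1..11
cases = [dict(zip(("Texture", "Temp", "Size", "class"), r)) for r in [
    ("Smooth", "Cold", "Large",  True),  ("Smooth", "Cold", "Small",  False),
    ("Smooth", "Cool", "Large",  True),  ("Smooth", "Cool", "Small",  True),
    ("Smooth", "Hot",  "Small",  True),  ("Wavy",   "Cold", "Medium", False),
    ("Wavy",   "Hot",  "Large",  True),  ("Rough",  "Cold", "Large",  False),
    ("Rough",  "Cool", "Large",  True),  ("Rough",  "Hot",  "Small",  False),
    ("Rough",  "Warm", "Medium", True),
]]

def remainder(examples, attribute):
    """Weighted average entropy after splitting on `attribute` (Slides 13-15)."""
    total = 0.0
    for v in {ex[attribute] for ex in examples}:
        subset = [ex for ex in examples if ex[attribute] == v]
        pos = sum(1 for ex in subset if ex["class"])
        total += len(subset) / len(examples) * entropy(pos, len(subset) - pos)
    return total

for attr in ("Texture", "Temp", "Size"):
    print(attr, round(remainder(cases, attr), 3))
# Texture 0.874, Temp 0.545, Size 0.874 -> Temp gives the lowest remainder
```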

17 Learning over time

How do you evolve knowledge over time when you learn a little bit at a time?
– Abstract version: the "Frinkle"

18 The Question
– How can we build this kind of representation over time?
The Answer
– Rely on the concepts of false positives and false negatives.

19 The idea

False Positive:
– An example that is predicted to be positive but whose known outcome is negative.
– The problem: our hypothesis is too general.
– The solution: add another condition to our hypothesis.

False Negative:
– An example that is predicted to be negative but whose known outcome is positive.
– The problem: our hypothesis is too restrictive.
– The solution: remove a condition from our hypothesis (or add a disjunction).
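
One way to make the two repair rules concrete is a current-best-hypothesis style update, sketched here under strong simplifying assumptions: a hypothesis is a conjunction of attribute = value conditions held in a dict, and new conditions are drawn from one remembered positive example (seed_positive). None of these names come from the slides:

```python
def matches(hypothesis, example):
    """Conjunctive hypothesis: a dict mapping attribute -> required value."""
    return all(example.get(attr) == val for attr, val in hypothesis.items())

def refine(hypothesis, example, label, seed_positive):
    """One repair step after seeing a new case (example, label).
    seed_positive is a remembered positive example used as the source of
    new conditions -- a simplifying assumption made for this sketch."""
    predicted = matches(hypothesis, example)
    if predicted == label:
        return hypothesis                  # prediction was right: no change
    if predicted and not label:
        # False positive: hypothesis too general -> add a condition that the
        # negative example violates but the seed positive satisfies.
        for attr, val in seed_positive.items():
            if attr != "class" and attr not in hypothesis and example.get(attr) != val:
                return {**hypothesis, attr: val}
        return hypothesis                  # no specializing condition found
    # False negative: hypothesis too restrictive -> drop the conditions this
    # positive example violates (the minimal conjunctive generalization).
    return {attr: val for attr, val in hypothesis.items()
            if example.get(attr) == val}
```

Feeding the cases from the next slide through refine one at a time grows and trims a single conjunctive rule instead of a tree.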

20 Creating a model one "case" at a time

ID#  Texture  Temp  Size    Classification
1    Smooth   Cold  Large   Yes
2    Smooth   Cold  Small   No
3    Smooth   Cool  Large   Yes
4    Smooth   Cool  Small   Yes
5    Smooth   Hot   Small   Yes
6    Wavy     Cold  Medium  No
7    Wavy     Hot   Large   Yes
8    Rough    Cold  Large   No
9    Rough    Cool  Large   Yes
10   Rough    Hot   Small   No
11   Rough    Warm  Medium  Yes

