Supplemental slides for CSE 327, Prof. Jeff Heflin. Ch. 18 – Learning
Decision Tree Learning

function Dec-Tree-Learn(examples, attribs, parent_examples) returns a decision tree
  if examples is empty then return Plurality-Value(parent_examples)
  else if all examples have the same classification then return the classification
  else if attribs is empty then return Plurality-Value(examples)
  else
    A ← argmax_{a ∈ attribs} Importance(a, examples)
    tree ← a new decision tree with root test A
    for each value v_k of A do
      exs ← {e : e ∈ examples and e.A = v_k}
      subtree ← Dec-Tree-Learn(exs, attribs − A, examples)
      add a branch to tree with label (A = v_k) and subtree subtree
    return tree

From Figure 18.5, p. 702
Decision Tree Data Set

Example  Color   Size   Shape     Goal Predicate
X1       blue    small  square    no
X2       green   large  triangle  no
X3       red            circle    yes
X4                                no
X5       yellow                   no
X6                                yes
X7                                no
X8                                no
Decision Tree Result

Root: +: X3,X6   -: X1,X2,X4,X5,X7,X8

Shape?
  circle (+: X3,X6  -: X5) → Color?
    red (+: X3,X6) → Yes
    yellow (-: X5) → No
    green (no examples; plurality of parent) → Yes
    blue (no examples; plurality of parent) → Yes
  square (-: X1,X4,X8) → No
  triangle (-: X2,X7) → No
Alternate Decision Tree

What if Size was the first attribute?

Root: +: X3,X6   -: X1,X2,X4,X5,X7,X8

Size?
  small (+: X3  -: X2,X7) → Shape?
    circle (+: X3) → Yes
    square (no examples) → No
    triangle (-: X2,X7) → No
  large (+: X6  -: X1,X4,X5,X8) → Color?
    red (+: X6  -: X8) → Shape?
      circle (+: X6) → Yes
      square (-: X8) → No
      triangle (no examples) → No
    yellow (-: X5) → No
    blue (-: X1) → No
    green (-: X4) → No
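Why does the learner prefer Shape over Size at the root? We can check with information gain, using the per-branch positive/negative counts that can be read off the two trees (2 positive and 6 negative examples overall). The counts below are taken from those trees; the gain formula is the standard B(q)-based one.

```python
# Compare information gain of splitting on Shape vs. Size at the root.
# Branch counts are read off the two decision trees in these slides.
import math

def b(q):
    """Entropy of a Boolean variable that is true with probability q."""
    if q in (0.0, 1.0):
        return 0.0
    return -(q * math.log2(q) + (1 - q) * math.log2(1 - q))

def gain(splits, total_pos=2, total_neg=6):
    """Information gain of a split, given (pos, neg) counts per branch."""
    total = total_pos + total_neg
    remainder = sum((p + n) / total * b(p / (p + n)) for p, n in splits)
    return b(total_pos / total) - remainder

# Shape: circle 2+/1-, square 0+/3-, triangle 0+/2-
gain_shape = gain([(2, 1), (0, 3), (0, 2)])
# Size: small 1+/2-, large 1+/4-
gain_size = gain([(1, 2), (1, 4)])
```

Here gain_shape ≈ 0.47 bits while gain_size ≈ 0.02 bits, so Importance picks Shape first, and the Size-first tree ends up deeper.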
A Neuron
Perceptron Learning

function Perceptron-Learning(examples, network) returns a perceptron hypothesis
  inputs: examples, a set of examples, each with input x = x_1,…,x_n and output y
          network, a perceptron with weights W_j (j = 0..n) and activation function g
  repeat
    for each example (x, y) in examples do
      in ← Σ_j W_j x_j
      Err ← y − g(in)
      for each j in 0..n do
        W_j ← W_j + α × Err × g′(in) × x_j
  until some stopping criterion is satisfied
  return Neural-Net-Hypothesis(network)
Perceptron Training Example

Initial weights: W0 = 0.2, W1 = −0.2, W2 = 0.3; bias input X0 = −1; learning rate α = 0.1.

in  = W1·X1 + W2·X2 − W0
out = g(in)   (threshold activation)
Err = Y − out
Wi ← Wi + α · Err · Xi

(Worked epoch-by-epoch weight-update table over the training set omitted.)
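A runnable sketch of this update rule, using the slide's initial weights and learning rate. One assumption is labeled in the code: the slide's own training set is not legible here, so logical OR (a linearly separable function) stands in for it.

```python
# Perceptron training with the slide's initial weights and alpha = 0.1.
# ASSUMPTION: the training set below is logical OR, a stand-in for the
# slide's (unreadable) training set.
ALPHA = 0.1
weights = [0.2, -0.2, 0.3]       # [W0, W1, W2]; X0 = -1 is the bias input

def predict(weights, x1, x2):
    """Threshold activation on in = W1*X1 + W2*X2 - W0."""
    in_ = weights[1] * x1 + weights[2] * x2 - weights[0]
    return 1 if in_ > 0 else 0

def train(weights, examples, epochs=20):
    """Repeat the per-example update Wj <- Wj + alpha*Err*Xj, with X0 = -1."""
    for _ in range(epochs):
        for (x1, x2), y in examples:
            err = y - predict(weights, x1, x2)
            for j, xj in enumerate((-1, x1, x2)):
                weights[j] += ALPHA * err * xj
    return weights

examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
train(weights, examples)
```

Because OR is linearly separable, the perceptron convergence theorem guarantees the loop settles on weights that classify all four examples correctly.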
NETTalk

Input: a sliding 7-character text window (7 × 29 input units), e.g. "O _ A R E _ Y"
One layer of 80 hidden units
26 output units encoding the phoneme for the center character, e.g. /r/
ALVINN

30 output units: steering direction, from sharp left through straight ahead to sharp right
5 hidden units
Input is 30×32 pixels = 960 values; each input pixel feeds the hidden layer
Pictures from Tom Mitchell's Machine Learning book slides
SVM Kernels

A non-linear separator in 2 dimensions becomes a linear one when the data is mapped to 3 dimensions.
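A small sketch of that mapping. It uses the standard quadratic-kernel feature map F(x1, x2) = (x1², x2², √2·x1·x2); the circular-boundary example is an illustration chosen here, not taken from the slide.

```python
# Kernel-trick sketch: points inside/outside the unit circle are not
# linearly separable in 2D, but under the quadratic-kernel feature map
# F(x1, x2) = (x1^2, x2^2, sqrt(2)*x1*x2) the circular boundary
# x1^2 + x2^2 = 1 becomes the plane z1 + z2 = 1 in 3D.
import math

def feature_map(x1, x2):
    """Map a 2-D point into 3-D feature space."""
    return (x1 ** 2, x2 ** 2, math.sqrt(2) * x1 * x2)

def outside_by_plane(point3d):
    """Linear test in feature space for 'outside the unit circle'."""
    z1, z2, _ = point3d
    return z1 + z2 > 1

inside = [(0.1, 0.2), (-0.3, 0.4), (0.0, -0.5)]    # inside the unit circle
outside = [(1.0, 1.0), (-1.2, 0.3), (0.2, -1.5)]   # outside the unit circle
```

A nice property of this particular map: the dot product in feature space equals the squared dot product in input space, F(x)·F(y) = (x·y)², so an SVM never has to compute the 3-D coordinates explicitly.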