1
Learning Functions and Neural Networks II. 24-787, Lecture 9. Luoting Fu, Spring 2012.
2
Previous lecture: applications, physiological basis, demos, and the perceptron. Output: Y = u(W_0 X_0 + W_1 X_1 + W_b). Learning rule: ΔW_i = η (Y_0 - Y) X_i, where Y_0 is the target output. [Figure: a two-input perceptron with inputs X_0, X_1 and output Y.]
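For concreteness, here is a minimal Python sketch of that perceptron and its delta-rule update; the function and variable names are mine, not from the slides, and the learning-rate value is arbitrary.

```python
import numpy as np

def step(a):
    # unit step activation u(a)
    return np.where(a >= 0.0, 1.0, 0.0)

def perceptron_output(w, w_b, x):
    # Y = u(W . X + W_b)
    return step(np.dot(w, x) + w_b)

def perceptron_update(w, w_b, x, y_target, eta=0.1):
    # delta rule: dW_i = eta * (Y_0 - Y) * X_i, with Y_0 the target output
    y = perceptron_output(w, w_b, x)
    w = w + eta * (y_target - y) * x
    w_b = w_b + eta * (y_target - y)
    return w, w_b

# one update on a single training example
w, w_b = np.zeros(2), 0.0
w, w_b = perceptron_update(w, w_b, np.array([1.0, 0.0]), y_target=1.0)
```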
3
In this lecture: Multilayer perceptron (MLP) – representation, feed-forward, back-propagation. Break. Case studies. Milestones & forefront.
4
Perceptron. [Figure: a 400-26 perceptron. © Springer]
5
XOR (exclusive OR)
6
Root cause: consider a 2-1 perceptron. Its output thresholds a weighted sum of the two inputs, so its decision boundary is a single straight line in the input plane, and no straight line separates the XOR classes.
7
A single perceptron is limited to learning linearly separable cases. (Minsky, M. L. and Papert, S. A. 1969. Perceptrons. Cambridge, MA: MIT Press.)
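To make the limitation and its fix concrete, the sketch below hand-wires a 2-2-1 MLP that computes XOR, something no single perceptron can represent; the weights are chosen by hand for illustration and are not from the lecture.

```python
import numpy as np

def step(a):
    return np.where(a >= 0.0, 1.0, 0.0)

# hidden layer: unit 1 computes OR, unit 2 computes NAND
W1 = np.array([[ 1.0,  1.0],
               [-1.0, -1.0]])
b1 = np.array([-0.5, 1.5])
# output layer: AND of the two hidden units, i.e. XOR of the inputs
W2 = np.array([1.0, 1.0])
b2 = -1.5

for x in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]:
    h = step(W1 @ np.array(x) + b1)
    y = step(W2 @ h + b2)
    print(x, "->", int(y))   # prints 0, 1, 1, 0
```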
8
9
An MLP can learn any continuous function; a single perceptron is limited to linearly separable cases (a linear decision boundary). Cybenko, G. (1989). "Approximation by superpositions of a sigmoidal function", Mathematics of Control, Signals, and Systems, 2(4), 303-314.
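Stated in symbols (a standard formulation of Cybenko's theorem, not copied from the slide): for any continuous f on the unit cube and any tolerance ε > 0, there exist finitely many sigmoid units whose weighted sum approximates f uniformly,

```latex
G(x) \;=\; \sum_{i=1}^{N} \alpha_i \,\sigma\!\left(w_i^{\top} x + \theta_i\right),
\qquad
\sup_{x \in [0,1]^{n}} \bigl|\, G(x) - f(x) \,\bigr| \;<\; \varepsilon .
```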
10
How's that relevant? Intelligence as function approximation. Examples: recognition (waveform → words) and regression (the road ahead, speed, bearing → wheel turn, pedal depression).
[Slides 11 to 18: figure-only; the labels shown on slides 12 to 17 read 0, 1, 2, 3, 3, ∞.]
19
Matrix representation
20
Knowledge learned by an MLP is encoded in its layers of weights.
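A small sketch of the matrix view from the previous slide: the forward pass is a chain of weight-matrix multiplications, each followed by a squashing nonlinearity. The layer sizes and random weights below are placeholders, not the lecture's network.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
sizes = [4, 5, 3]                          # input, hidden, and output widths
Ws = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [rng.normal(size=m) for m in sizes[1:]]

def forward(x):
    h = x
    for W, b in zip(Ws, bs):
        h = sigmoid(W @ h + b)             # one layer: affine map, then squashing
    return h

print(forward(np.array([1.0, 0.0, -1.0, 0.5])))
```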
21
What does it learn? Decision boundary perspective
22
What does it learn? Highly non-linear decision boundaries
23
What does it learn? Real-world decision boundaries
24
An MLP can learn any continuous function. Think Fourier: just as weighted sums of sinusoids can approximate a signal, weighted sums of sigmoids can approximate a function. Cybenko, G. (1989). "Approximation by superpositions of a sigmoidal function", Mathematics of Control, Signals, and Systems, 2(4), 303-314.
25
What does it learn? Weight perspective. [Figure: weights of a 64-M-3 MLP.]
26
How does it learn? From labeled examples, by back-propagation. [Figure: example labels such as the digits 0 to 9, or "polar bear" vs. "not a polar bear".]
27
Back-propagation
28
Gradient descent. One full pass through the training set is an "epoch".
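In symbols (a standard formulation, not taken verbatim from the slide), each step of gradient descent moves every weight a small distance down the error surface:

```latex
w_{ij} \;\leftarrow\; w_{ij} \;-\; \eta \,\frac{\partial E}{\partial w_{ij}},
\qquad
E \;=\; \tfrac{1}{2} \sum_{k} \bigl(t_k - y_k\bigr)^{2},
```

where η is the learning rate, t_k the target outputs, and y_k the network outputs.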
29
30
Back-propagation
31
Back-propagation steps. Think about this: what happens when you train a 10-layer MLP?
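The steps can be made concrete with a minimal back-propagation loop for a 2-H-1 sigmoid network trained on XOR with squared error; the hidden-layer size, learning rate, and epoch count below are arbitrary choices, not the lecture's.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(1)
H, eta = 4, 0.5                                  # hidden units, learning rate
W1, b1 = rng.normal(size=(H, 2)), np.zeros(H)
W2, b2 = rng.normal(size=H), 0.0

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0.0, 1.0, 1.0, 0.0])               # XOR targets

for epoch in range(5000):                        # one pass over the data = one epoch
    for x, t in zip(X, T):
        # forward pass
        h = sigmoid(W1 @ x + b1)
        y = sigmoid(W2 @ h + b2)
        # backward pass: send the error derivative back layer by layer
        delta_out = (y - t) * y * (1.0 - y)
        delta_hid = delta_out * W2 * h * (1.0 - h)
        # gradient-descent updates
        W2 -= eta * delta_out * h
        b2 -= eta * delta_out
        W1 -= eta * np.outer(delta_hid, x)
        b1 -= eta * delta_hid

print(np.round([float(sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2)) for x in X], 2))
```

After training, the printed outputs should be close to the XOR targets 0, 1, 1, 0 (convergence is not guaranteed for every random seed). The slide's question about a 10-layer MLP points at a real issue: each backward step multiplies the deltas by factors like h(1 - h) ≤ 0.25, so with many stacked sigmoid layers the gradients reaching the early layers become vanishingly small.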
32
Overfitting and cross-validation. [Figure: learning curve, error over training.]
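One common way to detect the overfitting shown on the learning curve is to hold out a validation set and watch its error; a minimal sketch (with random placeholder data, not the lecture's) follows.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))            # placeholder inputs
Y = rng.integers(0, 2, size=100)         # placeholder labels

idx = rng.permutation(len(X))
n_val = len(X) // 5                      # hold out 20% for validation
val_idx, train_idx = idx[:n_val], idx[n_val:]
X_train, Y_train = X[train_idx], Y[train_idx]
X_val, Y_val = X[val_idx], Y[val_idx]

# Train only on (X_train, Y_train); after every epoch measure the error on
# (X_val, Y_val). When the validation error starts to rise while the training
# error keeps falling, the network is overfitting: stop early and keep the
# weights from the best validation epoch.
```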
33
Break
34
Design considerations: the learning task; X (input); Y (output); D, M, K (layer sizes); number of layers; training epochs; training data – how much, and from what source.
35
Case study 1: digit recognition. 28×28-pixel digit images; a 784-1000-10 MLP.
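As a rough modern equivalent of the case study's architecture, the sketch below builds a one-hidden-layer, 784-1000-10 classifier with scikit-learn's MLPClassifier; the random stand-in data, activation choice, and iteration count are my assumptions, not the lecture's setup.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 784))               # stand-in for 28x28 = 784-pixel images
y = rng.integers(0, 10, size=200)        # stand-in for digit labels 0..9

clf = MLPClassifier(hidden_layer_sizes=(1000,),  # one hidden layer of 1000 units
                    activation="logistic",       # sigmoid units
                    max_iter=50)
clf.fit(X, y)
print(clf.predict(X[:5]))
```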
36
Case study 1: digit recognition
37
Milestones: a race to 100% accuracy on MNIST
38
Milestones: a race to 100% accuracy on MNIST. Classifier – error rate (%) – reported by:
– Perceptron – 12.0 – LeCun et al. 1998
– 2-layer NN, 1000 hidden units – 4.5 – LeCun et al. 1998
– 5-layer convolutional net – 0.95 – LeCun et al. 1998
– 5-layer convolutional net – 0.4 – Simard et al. 2003
– 6-layer NN 784-2500-2000-1500-1000-500-10 (on GPU) – 0.35 – Ciresan et al. 2010
See the full list at http://yann.lecun.com/exdb/mnist/
39
Milestones: a race to 100% accuracy on MNIST
40
Milestones: a race to 100% accuracy on MNIST
41
Case study 2: sketch recognition
42
Case study 2: sketch recognition with a convolutional neural network (LeCun, 1998), which alternates convolution and sub-sampling layers. [Figure: symbol classes to be recognized include Product, Matrices, Element of a vector, Or, Scope, Transf. Fun., Gain, Sum, Sine wave, …]
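To illustrate the two operations named on the slide, here is a small sketch of a 2-D convolution (written in the cross-correlation form usual in neural networks) followed by 2×2 average-pooling sub-sampling; the image and kernel are toy values, not the case study's learned filters.

```python
import numpy as np

def conv2d(image, kernel):
    # "valid" 2-D convolution, cross-correlation form
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def subsample(fmap, s=2):
    # s x s average pooling
    h, w = fmap.shape[0] // s, fmap.shape[1] // s
    return fmap[:h * s, :w * s].reshape(h, s, w, s).mean(axis=(1, 3))

image = np.random.default_rng(0).random((8, 8))   # toy 8x8 image patch
kernel = np.array([[1.0, 0.0, -1.0]] * 3)         # simple vertical-edge detector
fmap = conv2d(image, kernel)                      # 6x6 feature map
pooled = subsample(fmap)                          # 3x3 after 2x2 sub-sampling
print(pooled.shape)
```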
43
Case study 2: sketch recognition
44
Case study 2: sketch recognition
45
Case study 3: autonomous driving (Pomerleau, 1995)
46
Case study 4: sketch beautification (Orbay and Kara, 2011)
47
Case study 4: sketch beautification
48
Case study 4: sketch beautification
49
Research forefront: deep belief networks – they can not only critique (classify) but also create (synthesize). Demo at: http://www.cs.toronto.edu/~hinton/adi/index.htm
50
In summary: 1. Powerful machinery. 2. Feed-forward. 3. Back-propagation. 4. Design considerations.