1
Learning Functions and Neural Networks II. 24-787, Lecture 9. Luoting Fu, Spring 2012.
2
Previous lecture: applications, physiological basis, demos, and the perceptron. Output: Y = u(W_0 X_0 + W_1 X_1 + W_b). Learning rule: ΔW_i = η (Y_0 - Y) X_i, where Y_0 is the target output. [Figure: a two-input perceptron with inputs X_0, X_1 and output Y.]
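For concreteness, here is a minimal Python sketch of that perceptron and its delta-rule update; the function and variable names are mine, not from the slides, and the learning-rate value is arbitrary.

```python
import numpy as np

def step(a):
    # unit step activation u(a)
    return np.where(a >= 0.0, 1.0, 0.0)

def perceptron_output(w, w_b, x):
    # Y = u(W . X + W_b)
    return step(np.dot(w, x) + w_b)

def perceptron_update(w, w_b, x, y_target, eta=0.1):
    # delta rule: dW_i = eta * (Y_0 - Y) * X_i, with Y_0 the target output
    y = perceptron_output(w, w_b, x)
    w = w + eta * (y_target - y) * x
    w_b = w_b + eta * (y_target - y)
    return w, w_b

# one update on a single training example
w, w_b = np.zeros(2), 0.0
w, w_b = perceptron_update(w, w_b, np.array([1.0, 0.0]), y_target=1.0)
```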
3
In this lecture: Multilayer perceptron (MLP) – representation, feed-forward, back-propagation. Break. Case studies. Milestones & forefront.
4
Perceptron. [Figure: a 400-26 perceptron. © Springer]
5
XOR (exclusive OR)
6
Root cause: consider a 2-1 perceptron. Its output thresholds a weighted sum of the two inputs, so its decision boundary is a single straight line in the input plane, and no straight line separates the XOR classes.
7
A single perceptron is limited to learning linearly separable cases. (Minsky, M. L. and Papert, S. A. 1969. Perceptrons. Cambridge, MA: MIT Press.)
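To make the limitation and its fix concrete, the sketch below hand-wires a 2-2-1 MLP that computes XOR, something no single perceptron can represent; the weights are chosen by hand for illustration and are not from the lecture.

```python
import numpy as np

def step(a):
    return np.where(a >= 0.0, 1.0, 0.0)

# hidden layer: unit 1 computes OR, unit 2 computes NAND
W1 = np.array([[ 1.0,  1.0],
               [-1.0, -1.0]])
b1 = np.array([-0.5, 1.5])
# output layer: AND of the two hidden units, i.e. XOR of the inputs
W2 = np.array([1.0, 1.0])
b2 = -1.5

for x in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]:
    h = step(W1 @ np.array(x) + b1)
    y = step(W2 @ h + b2)
    print(x, "->", int(y))   # prints 0, 1, 1, 0
```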
8
9
An MLP can learn any continuous function; a single perceptron is limited to linearly separable cases (a linear decision boundary). Cybenko, G. (1989). "Approximation by superpositions of a sigmoidal function", Mathematics of Control, Signals, and Systems, 2(4), 303-314.
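Stated in symbols (a standard formulation of Cybenko's theorem, not copied from the slide): for any continuous f on the unit cube and any tolerance ε > 0, there exist finitely many sigmoid units whose weighted sum approximates f uniformly,

```latex
G(x) \;=\; \sum_{i=1}^{N} \alpha_i \,\sigma\!\left(w_i^{\top} x + \theta_i\right),
\qquad
\sup_{x \in [0,1]^{n}} \bigl|\, G(x) - f(x) \,\bigr| \;<\; \varepsilon .
```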
10
How's that relevant? Intelligence as function approximation. Examples: recognition (waveform → words) and regression (the road ahead, speed, bearing → wheel turn, pedal depression).
[Slides 11 to 18: figure-only; the labels shown on slides 12 to 17 read 0, 1, 2, 3, 3, ∞.]
19
Matrix representation
20
Knowledge learned by an MLP is encoded in its layers of weights.
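A small sketch of the matrix view from the previous slide: the forward pass is a chain of weight-matrix multiplications, each followed by a squashing nonlinearity. The layer sizes and random weights below are placeholders, not the lecture's network.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
sizes = [4, 5, 3]                          # input, hidden, and output widths
Ws = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [rng.normal(size=m) for m in sizes[1:]]

def forward(x):
    h = x
    for W, b in zip(Ws, bs):
        h = sigmoid(W @ h + b)             # one layer: affine map, then squashing
    return h

print(forward(np.array([1.0, 0.0, -1.0, 0.5])))
```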
21
What does it learn? Decision boundary perspective
22
What does it learn? Highly non-linear decision boundaries
23
What does it learn? Real-world decision boundaries
24
An MLP can learn any continuous function. Think Fourier: just as weighted sums of sinusoids can approximate a signal, weighted sums of sigmoids can approximate a function. Cybenko, G. (1989). "Approximation by superpositions of a sigmoidal function", Mathematics of Control, Signals, and Systems, 2(4), 303-314.
25
What does it learn? Weight perspective. [Figure: weights of a 64-M-3 MLP.]
26
How does it learn? From labeled examples, by back-propagation. [Figure: example labels such as the digits 0 to 9, or "polar bear" vs. "not a polar bear".]
27
Back-propagation
28
Gradient descent. One full pass through the training set is an "epoch".
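In symbols (a standard formulation, not taken verbatim from the slide), each step of gradient descent moves every weight a small distance down the error surface:

```latex
w_{ij} \;\leftarrow\; w_{ij} \;-\; \eta \,\frac{\partial E}{\partial w_{ij}},
\qquad
E \;=\; \tfrac{1}{2} \sum_{k} \bigl(t_k - y_k\bigr)^{2},
```

where η is the learning rate, t_k the target outputs, and y_k the network outputs.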
29
30
Back-propagation
31
Back-propagation steps. Think about this: what happens when you train a 10-layer MLP?
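The steps can be made concrete with a minimal back-propagation loop for a 2-H-1 sigmoid network trained on XOR with squared error; the hidden-layer size, learning rate, and epoch count below are arbitrary choices, not the lecture's.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(1)
H, eta = 4, 0.5                                  # hidden units, learning rate
W1, b1 = rng.normal(size=(H, 2)), np.zeros(H)
W2, b2 = rng.normal(size=H), 0.0

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0.0, 1.0, 1.0, 0.0])               # XOR targets

for epoch in range(5000):                        # one pass over the data = one epoch
    for x, t in zip(X, T):
        # forward pass
        h = sigmoid(W1 @ x + b1)
        y = sigmoid(W2 @ h + b2)
        # backward pass: send the error derivative back layer by layer
        delta_out = (y - t) * y * (1.0 - y)
        delta_hid = delta_out * W2 * h * (1.0 - h)
        # gradient-descent updates
        W2 -= eta * delta_out * h
        b2 -= eta * delta_out
        W1 -= eta * np.outer(delta_hid, x)
        b1 -= eta * delta_hid

print(np.round([float(sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2)) for x in X], 2))
```

After training, the printed outputs should be close to the XOR targets 0, 1, 1, 0 (convergence is not guaranteed for every random seed). The slide's question about a 10-layer MLP points at a real issue: each backward step multiplies the deltas by factors like h(1 - h) ≤ 0.25, so with many stacked sigmoid layers the gradients reaching the early layers become vanishingly small.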
32
Overfitting and cross-validation. [Figure: learning curve, error over training.]
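One common way to detect the overfitting shown on the learning curve is to hold out a validation set and watch its error; a minimal sketch (with random placeholder data, not the lecture's) follows.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))            # placeholder inputs
Y = rng.integers(0, 2, size=100)         # placeholder labels

idx = rng.permutation(len(X))
n_val = len(X) // 5                      # hold out 20% for validation
val_idx, train_idx = idx[:n_val], idx[n_val:]
X_train, Y_train = X[train_idx], Y[train_idx]
X_val, Y_val = X[val_idx], Y[val_idx]

# Train only on (X_train, Y_train); after every epoch measure the error on
# (X_val, Y_val). When the validation error starts to rise while the training
# error keeps falling, the network is overfitting: stop early and keep the
# weights from the best validation epoch.
```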
33
Break
34
Design considerations: the learning task; X (input); Y (output); D, M, K (layer sizes); number of layers; training epochs; training data – how much, and from what source.
35
Case study 1: digit recognition. 28×28-pixel digit images; a 784-1000-10 MLP.
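As a rough modern equivalent of the case study's architecture, the sketch below builds a one-hidden-layer, 784-1000-10 classifier with scikit-learn's MLPClassifier; the random stand-in data, activation choice, and iteration count are my assumptions, not the lecture's setup.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 784))               # stand-in for 28x28 = 784-pixel images
y = rng.integers(0, 10, size=200)        # stand-in for digit labels 0..9

clf = MLPClassifier(hidden_layer_sizes=(1000,),  # one hidden layer of 1000 units
                    activation="logistic",       # sigmoid units
                    max_iter=50)
clf.fit(X, y)
print(clf.predict(X[:5]))
```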
36
Case study 1: digit recognition
37
Milestones: a race to 100% accuracy on MNIST
38
Milestones: a race to 100% accuracy on MNIST. Classifier – error rate (%) – reported by:
– Perceptron – 12.0 – LeCun et al. 1998
– 2-layer NN, 1000 hidden units – 4.5 – LeCun et al. 1998
– 5-layer convolutional net – 0.95 – LeCun et al. 1998
– 5-layer convolutional net – 0.4 – Simard et al. 2003
– 6-layer NN 784-2500-2000-1500-1000-500-10 (on GPU) – 0.35 – Ciresan et al. 2010
See the full list at http://yann.lecun.com/exdb/mnist/
39
Milestones: a race to 100% accuracy on MNIST
40
Milestones: a race to 100% accuracy on MNIST
41
Case study 2: sketch recognition
42
Case study 2: sketch recognition with a convolutional neural network (LeCun, 1998), which alternates convolution and sub-sampling layers. [Figure: symbol classes to be recognized include Product, Matrices, Element of a vector, Or, Scope, Transf. Fun., Gain, Sum, Sine wave, …]
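To illustrate the two operations named on the slide, here is a small sketch of a 2-D convolution (written in the cross-correlation form usual in neural networks) followed by 2×2 average-pooling sub-sampling; the image and kernel are toy values, not the case study's learned filters.

```python
import numpy as np

def conv2d(image, kernel):
    # "valid" 2-D convolution, cross-correlation form
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def subsample(fmap, s=2):
    # s x s average pooling
    h, w = fmap.shape[0] // s, fmap.shape[1] // s
    return fmap[:h * s, :w * s].reshape(h, s, w, s).mean(axis=(1, 3))

image = np.random.default_rng(0).random((8, 8))   # toy 8x8 image patch
kernel = np.array([[1.0, 0.0, -1.0]] * 3)         # simple vertical-edge detector
fmap = conv2d(image, kernel)                      # 6x6 feature map
pooled = subsample(fmap)                          # 3x3 after 2x2 sub-sampling
print(pooled.shape)
```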
43
Case study 2: sketch recognition
44
Case study 2: sketch recognition
45
Case study 3: autonomous driving (Pomerleau, 1995)
46
Case study 4: sketch beautification (Orbay and Kara, 2011)
47
Case study 4: sketch beautification
48
Case study 4: sketch beautification
49
Research forefront: deep belief networks – they can not only critique (classify) but also create (synthesize). Demo at: http://www.cs.toronto.edu/~hinton/adi/index.htm
50
In summary: 1. Powerful machinery. 2. Feed-forward. 3. Back-propagation. 4. Design considerations.