
1 Deep Neural Networks
J.-S. Roger Jang (張智星), MIR Lab, CSIE Dept., National Taiwan University

2 Concept of Modeling
Modeling: given desired i/o pairs (the training set) of the form (x1, ..., xn; y), construct a model that matches the i/o pairs.
Two steps in modeling (see the sketch below):
Structure identification: input selection, model complexity
Parameter identification: optimal parameters
[Diagram: inputs x1, ..., xn feed both the unknown target system, with output y, and the model, with output y*]
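A minimal sketch of the two steps in Python/NumPy (my own illustration, not from the slides): structure identification here is the assumed choice of a linear model y* = w1*x + w0; parameter identification then finds the optimal weights by least squares.

import numpy as np

# Training set of i/o pairs (x; y) sampled from an unknown target system
x = np.linspace(0, 1, 20)
y = 2.0 * x + 1.0 + 0.1 * np.random.randn(20)    # noisy observations

# Structure identification (assumed): a linear model y* = w1*x + w0
A = np.column_stack([x, np.ones_like(x)])

# Parameter identification: optimal parameters by least squares
w, *_ = np.linalg.lstsq(A, y, rcond=None)
y_star = A @ w                                   # model output y*
print("w1, w0 =", w, "MSE =", np.mean((y - y_star) ** 2))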

3 Neural Networks
Supervised learning: multilayer perceptrons, radial basis function networks, modular neural networks, LVQ (learning vector quantization)
Unsupervised learning: competitive learning networks, Kohonen self-organizing networks, ART (adaptive resonance theory)
Others: Hopfield networks

4 Single-layer Perceptrons
Proposed by Widrow & Hoff in 1960; AKA ADALINE (Adaptive Linear Neuron) or single-layer perceptron. A sketch of the training rule follows below.
[Diagram: a single neuron with weights w0, w1, w2 mapping inputs x1 and x2 to output y; training data plotted with x1 = hair length, x2 = voice freq.]
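A minimal NumPy sketch (my own illustration; the data and learning rate are arbitrary) of the Widrow-Hoff (LMS) rule, which moves the weights against the gradient of the squared error of the linear output:

import numpy as np

# Toy training data: rows are (x1, x2), labels are +1 / -1
X = np.array([[0.2, 0.9], [0.1, 0.8], [0.9, 0.2], [0.8, 0.1]])
t = np.array([1.0, 1.0, -1.0, -1.0])

w = np.zeros(3)                               # w0 (bias), w1, w2
eta = 0.1                                     # learning rate
Xb = np.column_stack([np.ones(len(X)), X])    # prepend bias input

for _ in range(100):
    for xi, ti in zip(Xb, t):
        y = xi @ w                    # linear output (before the signum)
        w += eta * (ti - y) * xi      # Widrow-Hoff (LMS) update
print("weights:", w, "predictions:", np.sign(Xb @ w))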

5 Multilayer Perceptrons (MLPs)
Extending the SLP to an MLP yields complex decision boundaries.
How to train MLPs?
Use the logistic function to replace the signum function
Use gradient descent to update the parameters
[Diagram: a two-input, two-output MLP mapping x1, x2 to y1, y2]

6 Continuous Activation Functions
To use gradient descent, we need to replace the signum function with a continuous version, such as the three below (sketched in code afterwards).
Logistic: y = 1/(1+exp(-x))
Hyperbolic tangent: y = tanh(x/2)
Identity: y = x
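The three functions as plain NumPy code (my own sketch; the derivative notes follow from direct differentiation):

import numpy as np

def logistic(x):        # y = 1/(1+exp(-x)); derivative y*(1-y)
    return 1.0 / (1.0 + np.exp(-x))

def hyper_tangent(x):   # y = tanh(x/2); derivative (1 - y**2)/2
    return np.tanh(x / 2.0)

def identity(x):        # y = x; derivative 1
    return x

x = np.linspace(-4, 4, 9)
print(logistic(x), hyper_tangent(x), identity(x), sep="\n")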

7 Classical MLPs
Typical 2-layer MLPs. Learning rules:
Gradient descent (backpropagation)
Conjugate gradient method
All optimization methods using the first derivative
Derivative-free optimization
[Diagram: a two-input, two-output 2-layer MLP]

8 MLP Examples
XOR problem: training data and network architecture; a runnable sketch follows below.
Training data (x1, x2; y): (0, 0; 0), (0, 1; 1), (1, 0; 1), (1, 1; 0)
[Diagram: candidate network architectures computing y from inputs x1 and x2]
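A minimal Keras sketch (my own illustration; the hidden-layer size and epoch count are arbitrary choices) of an MLP that learns XOR:

import numpy as np
from tensorflow import keras

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)

model = keras.Sequential([
    keras.Input(shape=(2,)),
    keras.layers.Dense(4, activation="tanh"),      # hidden layer
    keras.layers.Dense(1, activation="sigmoid"),   # output in (0, 1)
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=2000, verbose=0)
print(model.predict(X, verbose=0).round(2))        # close to [0, 1, 1, 0]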

9 MLP Decision Boundaries
Single-layer: half planes.
[Figures: decision regions a single-layer network can form for the exclusive-OR problem, meshed regions, and the most general regions]

10 MLP Decision Boundaries
Two-layer: convex regions.
[Figures: decision regions a two-layer network can form for the exclusive-OR problem, meshed regions, and the most general regions]

11 MLP Decision Boundaries
Three-layer: arbitrary regions.
[Figures: decision regions a three-layer network can form for the exclusive-OR problem, meshed regions, and the most general regions]

12 Summary: MLP Decision Boundaries
1-layer: half planes; 2-layer: convex regions; 3-layer: arbitrary regions.
[Figures: decision regions of each structure on the XOR, intertwined, and general cases]

13 MLP Configurations

14 Deep Neural Networks

15 Training an MLP
Methods for training an MLP:
Gradient descent
Gauss-Newton method
Levenberg-Marquardt method
Backpropagation: a systematic way to compute gradients, starting from the NN's output.

16 Simple Derivatives
Review of the chain rule starts from a single function: for y = f(x), the derivative is simply dy/dx = f'(x).
[Network representation: a node f mapping input x to output y]

17 Chain Rule for Composite Functions
Review of the chain rule for a composite function: with y = f(x) and z = g(y), the derivative is dz/dx = (dz/dy)(dy/dx) = g'(y) f'(x).
[Network representation: x feeds node f to produce y, which feeds node g to produce z]
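A concrete instance (my own example, not from the slides): with y = f(x) = x^2 and z = g(y) = sin(y),
dz/dx = (dz/dy)(dy/dx) = cos(y) * 2x = 2x cos(x^2),
which matches differentiating z = sin(x^2) directly.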

18 Chain Rule for Network Representation
Review of the chain rule when a node has several incoming paths: with y = f(x), z = g(x), and u = h(y, z), the derivative is du/dx = (∂u/∂y)(dy/dx) + (∂u/∂z)(dz/dx).
[Network representation: x feeds nodes f and g to produce y and z, which both feed node h to produce u]
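A concrete instance (my own example, not from the slides): with y = f(x) = x^2, z = g(x) = exp(x), and u = h(y, z) = y*z,
du/dx = (∂u/∂y)(dy/dx) + (∂u/∂z)(dz/dx) = z * 2x + y * exp(x) = (2x + x^2) exp(x),
which matches differentiating u = x^2 exp(x) directly.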

19 Backpropagation
Backpropagation:
A systematic way of computing the gradient
Computes the gradient from the output toward the input
Example: see the sketch below.
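A minimal NumPy sketch (my own illustration, not the slides' example) of backpropagation on a one-hidden-layer MLP with logistic activations and squared error; gradients are computed from the output layer back toward the input:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Tiny network: 2 inputs -> 2 hidden units -> 1 output, squared-error loss
rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(2, 2)), rng.normal(size=(2, 1))
x, t = np.array([1.0, 0.5]), np.array([1.0])   # one training pair

# Forward pass
h = sigmoid(x @ W1)           # hidden activations
y = sigmoid(h @ W2).ravel()   # network output
E = 0.5 * np.sum((y - t) ** 2)

# Backward pass: start from dE/dy at the output, move toward the input
dy = (y - t) * y * (1 - y)    # dE/d(net input of the output unit)
dW2 = np.outer(h, dy)         # gradient for W2
dh = (W2 @ dy) * h * (1 - h)  # propagate the error to the hidden layer
dW1 = np.outer(x, dh)         # gradient for W1

eta = 0.5                     # gradient descent step on both layers
W1 -= eta * dW1
W2 -= eta * dW2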

20 Use of Mini-batch in Gradient Descent
Goal: to speed up training with a large dataset.
Approach: update by mini-batch instead of by epoch (an epoch is one pass through all the data). Updating once per epoch is slow; updating once per mini-batch is faster. For example, if the dataset size is 1000:
Batch size = 10 → 100 updates per epoch
Batch size = 100 → 10 updates per epoch
A sketch of the mini-batch loop follows below.
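A minimal sketch (my own illustration) of the mini-batch update loop; grad_loss is a hypothetical stand-in for any gradient routine, such as backpropagation:

import numpy as np

def minibatch_gd(w, X, y, grad_loss, eta=0.01, batch_size=10, epochs=5):
    """One weight update per mini-batch instead of one per epoch."""
    n = len(X)
    for _ in range(epochs):
        order = np.random.permutation(n)    # shuffle each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            w -= eta * grad_loss(w, X[idx], y[idx])   # update per mini-batch
    return w

# With n = 1000 and batch_size = 10, each epoch performs 100 updates.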

21 Use of Momentum Term in Gradient Descent
Purpose of using a momentum term:
Avoid oscillations in gradient descent (e.g., on the banana function!)
Escape from local minima
Formula (original vs. updated):
Original: Δw = -η ∇E
Updated: Δw(t) = -η ∇E + α Δw(t-1), where α Δw(t-1) is the momentum term
[Figure: gradient descent paths over the contours of the banana function]
A sketch of the updated rule follows below.
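A minimal sketch (my own illustration; eta, alpha, and the iteration count are arbitrary choices) of the updated rule on Rosenbrock's banana function:

import numpy as np

def banana_grad(w):
    """Gradient of the banana function f(w) = (1-x)^2 + 100*(y-x^2)^2."""
    x, y = w
    return np.array([-2 * (1 - x) - 400 * x * (y - x ** 2),
                     200 * (y - x ** 2)])

w = np.array([-1.5, 2.0])
dw = np.zeros(2)            # previous update, Δw(t-1)
eta, alpha = 1e-4, 0.9      # learning rate and momentum coefficient

for _ in range(50000):
    dw = -eta * banana_grad(w) + alpha * dw   # Δw(t) = -η∇E + αΔw(t-1)
    w += dw
print(w)   # moves toward the minimum at (1, 1)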

22 Optimizers in Keras
Choices of optimization methods in Keras:
SGD: stochastic gradient descent
Adagrad: adaptive learning rate
RMSprop: similar to Adagrad
Adam: similar to RMSprop + momentum
Nadam: Adam + Nesterov momentum
An optimizer is selected when compiling a model, as below.
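A minimal usage sketch (my own illustration; the tiny model is just a placeholder):

from tensorflow import keras

model = keras.Sequential([keras.Input(shape=(2,)), keras.layers.Dense(1)])

# Pass an optimizer by name ...
model.compile(optimizer="adam", loss="mse")

# ... or as an object, to set hyperparameters such as momentum
opt = keras.optimizers.SGD(learning_rate=0.01, momentum=0.9)
model.compile(optimizer=opt, loss="mse")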

23 Loss Functions for Regression

24 Loss Functions for Classification

25 Activation Functions

26 Learning Rate Selection

27 Exercises
Express the derivative of y = f(x) in terms of y:
Derive the derivative of tanh(x/2) in terms of sigmoid(x).
Express tanh(x/2) in terms of sigmoid(x).
Given y = sigmoid(x) and y' = y(1-y), find the derivative of tanh(x/2).

