Artificial Neural Networks. Yalong Li. Some slides are from _24_2011_ann.pdf
Structure Motivation Artificial neural networks Learning: Backpropagation Algorithm Overfitting Expressive Capabilities of ANNs Summary
Some facts about our brain Performance tends to degrade gracefully under partial damage. Learns (reorganizes itself) from experience. Recovery from damage is possible. Performs massively parallel computations extremely efficiently: for example, complex visual perception occurs in less than 100 ms, that is, about 10 processing steps (synapses operate at roughly 100 Hz). Supports our intelligence and self-awareness.
Neural Networks in the Brain Cortex, midbrain, brainstem and cerebellum. Visual system: 10 or 11 processing stages have been identified. Feedforward connections run from earlier processing stages (near the sensory input) to later ones (near the motor output); feedback connections run in the opposite direction.
Neurons and Synapses The basic computational unit in the nervous system is the nerve cell, or neuron.
Synaptic Learning One way the brain learns is by altering the strengths of connections between neurons, and by adding or deleting connections between neurons. LTP (long-term potentiation): an enduring (>1 hour) increase in synaptic efficacy that results from high-frequency stimulation of an afferent (input) pathway. The efficacy of a synapse can change as a result of experience, providing both memory and learning through long-term potentiation; one way this happens is through the release of more neurotransmitter. Hebb's postulate: "When an axon of cell A ... excite[s] cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells so that A's efficiency, as one of the cells firing B, is increased." Points to note about LTP: synapses become more or less important over time (plasticity); LTP is based on experience; LTP is based only on local information (Hebb's postulate).
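As a toy illustration of Hebb's postulate, a minimal Python sketch of a Hebbian weight update (delta w_i = eta * y * x_i); the learning rate, inputs, and firing threshold here are illustrative assumptions, not from the slides:

import numpy as np

def hebbian_update(w, x, y, eta=0.01):
    """Strengthen each weight in proportion to pre- and post-synaptic co-activation."""
    return w + eta * y * x

# Toy example: one post-synaptic unit driven by three pre-synaptic inputs.
w = np.array([0.3, 0.1, 0.4])   # current synaptic strengths
x = np.array([1.0, 0.0, 1.0])   # pre-synaptic activity
y = float(w @ x > 0.5)          # post-synaptic firing (thresholded)
w = hebbian_update(w, x, y)     # only the co-active synapses are strengthened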
Structure Motivation Artificial neural networks Backpropagation Algorithm Overfitting Expressive Capabilities of ANNs Summary
Multilayer Networks of Sigmoid Units
Connectionist Models Consider humans: neuron switching time ~0.001 second; number of neurons ~10^10; connections per neuron ~10^4 to 10^5; scene recognition time ~0.1 second. 100 inference steps doesn't seem like enough → much parallel computation. Properties of artificial neural nets (ANNs): many neuron-like threshold switching units; many weighted interconnections among units; highly parallel, distributed processing.
Structure Motivation Artificial neural networks Learning: Backpropagation Algorithm Overfitting Expressive Capabilities of ANNs Summary
Backpropagation Algorithm Looks for the minimum of the error function in weight space using the method of gradient descent. The combination of weights that minimizes the error function is considered to be a solution of the learning problem.
Sigmoid unit
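In the standard formulation, a sigmoid unit first computes a weighted sum of its inputs, $net = \sum_i w_i x_i$, and then outputs $o = \sigma(net) = \frac{1}{1 + e^{-net}}$. A property used repeatedly in the derivations below is $\frac{d\sigma(net)}{d\,net} = \sigma(net)\,(1 - \sigma(net))$.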
Error Gradient for a Sigmoid Unit
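For the squared error $E(\vec{w}) = \frac{1}{2}\sum_{d \in D}(t_d - o_d)^2$ over training examples $d$, the gradient for a single sigmoid unit works out to $\frac{\partial E}{\partial w_i} = -\sum_{d \in D}(t_d - o_d)\, o_d (1 - o_d)\, x_{i,d}$, using the derivative of the sigmoid given above.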
Gradient Descent
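The (batch) gradient descent rule updates each weight against the gradient of the error over the whole training set: $\Delta w_i = -\eta \frac{\partial E}{\partial w_i}$, i.e. $\vec{w} \leftarrow \vec{w} - \eta \nabla E(\vec{w})$, where $\eta$ is the learning rate.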
Incremental (Stochastic) Gradient Descent
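A minimal Python sketch contrasting one batch gradient-descent step with one incremental (stochastic) pass, for a linear unit with squared error; the toy data, learning rate, and function names are illustrative assumptions:

import numpy as np

def batch_step(w, X, t, eta=0.1):
    """One batch step: sum the error gradient over all examples, then update once."""
    o = X @ w
    grad = -(X.T @ (t - o))            # dE/dw for E = 0.5 * sum_d (t_d - o_d)^2
    return w - eta * grad

def incremental_pass(w, X, t, eta=0.1):
    """One incremental (stochastic) pass: update the weights after every example."""
    for x_d, t_d in zip(X, t):
        o_d = x_d @ w
        w = w + eta * (t_d - o_d) * x_d
    return w

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
t = np.array([1.0, 1.0, 2.0])
w = incremental_pass(batch_step(np.zeros(2), X, t), X, t)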
Backpropagation Algorithm (MLE)
Derivation of the BP rule: Goal: Error: Notation:
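In the notation standard for this derivation: $x_{ji}$ is the $i$-th input to unit $j$, $w_{ji}$ the weight on that input, $net_j = \sum_i w_{ji} x_{ji}$ the weighted sum, $o_j = \sigma(net_j)$ the output of unit $j$, and $t_j$ the target output; the goal is to minimize the error on each training example, $E = \frac{1}{2}\sum_{k \in outputs}(t_k - o_k)^2$.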
Backpropagation Algorithm (MLE) For output unit j:
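For an output unit $j$ the standard result is $\delta_j \equiv -\frac{\partial E}{\partial net_j} = o_j (1 - o_j)(t_j - o_j)$, which gives the weight update $\Delta w_{ji} = \eta\, \delta_j\, x_{ji}$.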
Backpropagation Algorithm (MLE) For hidden unit j:
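For a hidden unit $j$ the error signal is propagated back from the units $k$ that $j$ feeds into: $\delta_j = o_j (1 - o_j) \sum_{k \in downstream(j)} \delta_k w_{kj}$, with the same weight update $\Delta w_{ji} = \eta\, \delta_j\, x_{ji}$.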
More on Backpropagation
Structure Motivation Artificial neural networks Learning: Backpropagation Algorithm Overfitting Expressive Capabilities of ANNs Summary
Overfitting in ANNs
Dealing with Overfitting
K-Fold Cross Validation
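A minimal Python sketch of k-fold cross validation; train_model and error_rate are hypothetical placeholders for whatever training and evaluation routines are in use:

import numpy as np

def k_fold_cv(X, y, k, train_model, error_rate):
    """Split the data into k folds; train on k-1 folds and validate on the held-out fold."""
    indices = np.arange(len(X))
    np.random.shuffle(indices)
    folds = np.array_split(indices, k)
    errors = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = train_model(X[train_idx], y[train_idx])
        errors.append(error_rate(model, X[val_idx], y[val_idx]))
    return np.mean(errors)   # average validation error over the k folds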
Leave-One-Out Cross Validation (k-fold cross validation with k equal to the number of training examples, so each example is held out once)
Structure Motivation Artificial neural networks Backpropagation Algorithm Overfitting Expressive Capabilities of ANNs Summary
Expressive Capabilities of ANNs Single Layer: Perceptron; the XOR problem
Single Layer: Perceptron
Representational Power of Perceptrons A perceptron represents a hyperplane decision surface in the n-dimensional space of instances: w · x = 0. It can represent linearly separable sets, e.g. the logical functions AND and OR. How do we learn w? (see the sketch below)
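One standard answer is the perceptron training rule, $w_i \leftarrow w_i + \eta\,(t - o)\,x_i$; a minimal Python sketch, trained here on the linearly separable AND function (the learning rate, epoch count, and 0/1 output convention are illustrative choices):

import numpy as np

def perceptron_train(X, t, eta=0.1, epochs=20):
    """Perceptron rule: w_i <- w_i + eta * (t - o) * x_i, with o = threshold(w . x)."""
    X = np.hstack([np.ones((len(X), 1)), X])   # prepend a bias input x0 = 1
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_d, t_d in zip(X, t):
            o = 1.0 if x_d @ w > 0 else 0.0
            w += eta * (t_d - o) * x_d
    return w

# Logical AND is linearly separable, so the rule converges.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0, 0, 0, 1], dtype=float)
w = perceptron_train(X, t)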
Single Layer: Perceptron What about sets of examples that are not linearly separable?
Multi-layer perceptron, XOR
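XOR is not linearly separable, but a single hidden layer suffices; a minimal Python sketch with hand-picked (not learned) weights, where one hidden threshold unit computes OR and the other AND:

import numpy as np

def step(x):
    return (x > 0).astype(float)

def xor_mlp(x1, x2):
    """Two threshold hidden units (OR and AND) plus one output unit compute XOR."""
    x = np.array([x1, x2], dtype=float)
    h_or  = step(x @ np.array([1.0, 1.0]) - 0.5)   # fires if either input is 1
    h_and = step(x @ np.array([1.0, 1.0]) - 1.5)   # fires only if both inputs are 1
    return step(1.0 * h_or - 1.0 * h_and - 0.5)    # OR and not AND == XOR

outputs = [xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]  # 0, 1, 1, 0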
Multi-layer perceptron
Expressive Capabilities of ANNs
Learning Hidden Layer Representations: the problem
Learning Hidden Layer Representations: the problem
Learning Hidden Layer Representations: the problem. Auto-encoder?
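A classic illustration is the 8-3-8 auto-encoder network: it is trained to reproduce eight one-hot input patterns at its eight output units, so the three hidden units are forced to invent a compact, roughly binary code for the eight inputs; the hidden-layer representation is learned rather than designed by hand.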
Training
Neural Nets for Face Recognition
Learning Hidden Layer Representations
Structure Motivation Artificial neural networks Backpropagation Algorithm Overfitting Expressive Capabilities of ANNs Summary
Thank you!