CSE 573 Introduction to Artificial Intelligence
Neural Networks
Henry Kautz
Autumn 2005
Perceptron (sigmoid unit)
[Diagram: the inputs feed a weighted sum, including a constant bias term, which is passed through a "soft" threshold to give the output.]
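A minimal sketch of the unit just described, in Python (the function names are illustrative, not from the lecture): the output is a "soft" threshold (sigmoid) applied to the weighted sum of the inputs plus a constant bias term.

import math

def sigmoid(z):
    # "Soft" threshold: smoothly squashes any real value into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_unit(weights, bias, inputs):
    # Weighted sum of the inputs plus the constant (bias) term, then the soft threshold.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Example: a unit with two inputs.
print(sigmoid_unit([0.5, -1.0], 0.1, [1.0, 0.0]))  # about 0.646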
Training a Neuron
- Idea: adjust weights to reduce the sum of squared errors over the training set
  - Error = difference between actual and intended output
- Algorithm: gradient descent (a code sketch follows this slide)
  - Calculate the derivative (slope) of the error function
  - Take a small step in the "downward" direction
  - Step size is the "training rate"
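A minimal gradient-descent training loop for a single sigmoid unit, sketching the algorithm outlined above; the example data, default rate, and function names are assumptions for illustration.

import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_unit(examples, n_inputs, rate=0.5, epochs=1000):
    # Gradient descent on the sum of squared errors for one sigmoid unit.
    # `examples` is a list of (inputs, target) pairs.
    w = [random.uniform(-0.1, 0.1) for _ in range(n_inputs)]
    b = 0.0
    for _ in range(epochs):
        for x, target in examples:
            out = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = target - out                    # intended minus actual output
            step = err * out * (1.0 - out)        # error scaled by the slope of the sigmoid
            w = [wi + rate * step * xi for wi, xi in zip(w, x)]  # small step "downhill"
            b += rate * step
    return w, b

# Example: learn the OR function.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, bias = train_unit(data, n_inputs=2)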
Gradient of the Error Function
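A standard reconstruction of this gradient for one sigmoid unit trained on squared error (the lecture's own notation may differ):

\[
E = \tfrac{1}{2}\bigl(y - g(\mathrm{in})\bigr)^2,
\qquad \mathrm{in} = \sum_j W_j\, x_j
\]
\[
\frac{\partial E}{\partial W_j} = -\bigl(y - g(\mathrm{in})\bigr)\, g'(\mathrm{in})\, x_j,
\qquad g'(\mathrm{in}) = g(\mathrm{in})\bigl(1 - g(\mathrm{in})\bigr) \ \text{for the sigmoid}
\]

Gradient descent moves each weight a small step against this slope.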
Single Unit Training Rule
- In short: adjust weights on inputs that were "on" in proportion to the error and the size of the output (written as an update rule below)
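Written as an update rule, with α as the training rate (a standard form consistent with the gradient above, not necessarily the slide's exact notation):

\[
W_j \;\leftarrow\; W_j + \alpha\,\bigl(y - g(\mathrm{in})\bigr)\,g'(\mathrm{in})\,x_j
\]

Inputs with \(x_j = 0\) ("off") contribute nothing, so only weights on inputs that were "on" get adjusted.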
Beyond Perceptrons
- Single units can learn any linear function
- A single layer of units can learn any set of linear inequalities
- Adding additional layers of "hidden" units between the input and output allows any function to be learned!
- Hidden units are trained by propagating errors back through the network (minimal sketch below)
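A minimal sketch of backpropagation through one hidden layer, in plain Python; the XOR task, the layer sizes, and the function names are illustrative assumptions, not from the lecture.

import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_xor(rate=0.5, epochs=20000, n_hidden=2, seed=0):
    random.seed(seed)
    # Hidden-layer weights: one row per hidden unit, over 2 inputs + a constant bias input.
    W_hid = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
    # Output-unit weights over the hidden activations + a bias input.
    W_out = [random.uniform(-1, 1) for _ in range(n_hidden + 1)]
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]  # XOR
    for _ in range(epochs):
        for x, y in data:
            xb = x + [1]                              # inputs plus bias input
            hid = [sigmoid(sum(w * xi for w, xi in zip(ws, xb))) for ws in W_hid]
            hb = hid + [1]
            out = sigmoid(sum(w * h for w, h in zip(W_out, hb)))
            # Error term at the output, then propagated back to each hidden unit.
            d_out = (y - out) * out * (1 - out)
            d_hid = [d_out * W_out[j] * hid[j] * (1 - hid[j]) for j in range(n_hidden)]
            W_out = [w + rate * d_out * h for w, h in zip(W_out, hb)]
            for j in range(n_hidden):
                W_hid[j] = [w + rate * d_hid[j] * xi for w, xi in zip(W_hid[j], xb)]
    return W_hid, W_out

W_hid, W_out = train_xor()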
Character Recognition Demo
Beyond Backprop…
- Backpropagation is the most common algorithm for supervised learning with feed-forward neural networks
- Many other learning rules, for these and other cases, have been studied
Hebbian Learning
- Alternative to backprop for unsupervised learning
- Increase weights on connected neurons whenever both fire simultaneously (sketch below)
- Neurologically plausible (Hebb 1949)
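A minimal sketch of the Hebbian update just described (the rate parameter and names are mine): the weight between two connected units grows when both activities are high at the same time.

def hebbian_update(w, pre, post, rate=0.01):
    # One Hebbian step: strengthen the connection in proportion to the
    # product of the presynaptic and postsynaptic activities.
    return w + rate * pre * post

# Example: both units "fire" (activity 1), so the weight increases.
w = 0.2
w = hebbian_update(w, pre=1.0, post=1.0)   # -> 0.21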
Self-organizing maps (SOMs)
- Unsupervised learning for clustering inputs
- "Winner take all" network: one cell per cluster
- Learning rule: update weights near the "winning" neuron to move them closer to the input (sketch below)
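A minimal winner-take-all sketch of the learning rule just stated, without the neighborhood function a full SOM would add; the names and the example data are illustrative.

def wta_update(weights, x, rate=0.1):
    # weights: one weight vector per cluster cell. Find the cell whose weight
    # vector is closest to the input x (the "winner") and move it toward x.
    def dist2(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    winner = min(range(len(weights)), key=lambda i: dist2(weights[i]))
    weights[winner] = [wi + rate * (xi - wi) for wi, xi in zip(weights[winner], x)]
    return winner

# Example: two cluster cells in 2-D; the input is nearest to cell 0.
cells = [[0.0, 0.0], [1.0, 1.0]]
print(wta_update(cells, [0.2, 0.1]))   # winner index 0; cells[0] moves toward the input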
Recurrent Neural Networks
- Include time-delay feedback loops
- Can handle temporal-data tasks, such as sequence prediction (sketch below)
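A minimal sketch of a single recurrent sigmoid unit with a time-delay feedback loop (the weights and names are illustrative): the previous output is fed back as an extra weighted input at the next time step.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def run_recurrent_unit(sequence, w_in=1.0, w_feedback=0.5, bias=-0.5):
    # Process a sequence one element at a time; the unit's previous output
    # is fed back through a time-delay loop as an additional weighted input.
    outputs, prev = [], 0.0
    for x in sequence:
        prev = sigmoid(w_in * x + w_feedback * prev + bias)
        outputs.append(prev)
    return outputs

print(run_recurrent_unit([1, 0, 1, 1]))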