1
Capabilities of Threshold Neurons
What do we do if we need a more complex function? We can combine multiple artificial neurons to form networks with increased capabilities. For example, we can build a two-layer network with any number of neurons in the first layer giving input to a single neuron in the second layer. The neuron in the second layer could, for example, implement an AND function. November 13, 2018 Introduction to Artificial Intelligence Lecture 18: Neural Network Paradigms III
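A threshold neuron and the AND neuron mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the names `threshold_neuron` and `and_neuron` are hypothetical, and the choice of all weights equal to 1 with a threshold equal to the number of inputs is just one assumed way to obtain AND.

```python
def threshold_neuron(weights, theta, inputs):
    """Output 1 if the weighted input sum reaches the threshold theta, else 0."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= theta else 0


def and_neuron(binary_inputs):
    # Assumed parameter choice: every weight is 1 and the threshold is n,
    # so the sum reaches the threshold only if all n inputs are 1.
    n = len(binary_inputs)
    return threshold_neuron([1] * n, n, binary_inputs)


print(and_neuron([1, 1, 1]))  # 1
print(and_neuron([1, 0, 1]))  # 0
```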
2
Capabilities of Threshold Neurons
[Diagram: two-layer network with inputs x1, x2, ..., xi]
What kind of function can such a network realize?
3
Capabilities of Threshold Neurons
Assume that the dotted lines in the diagram represent the input-dividing lines implemented by the neurons in the first layer.
[Diagram: input space with axes "1st comp." and "2nd comp.", divided by dotted lines]
Then, for example, the second-layer neuron could output 1 if the input is within a polygon, and 0 otherwise.
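As a sketch of this idea, the following Python snippet lets each first-layer neuron implement one dividing line and uses an AND neuron in the second layer. The triangle with corners (0, 0), (4, 0), and (0, 4), the weights in `FIRST_LAYER`, and the function names are assumptions made for illustration only.

```python
def threshold_neuron(weights, theta, inputs):
    """Output 1 if the weighted input sum reaches the threshold theta, else 0."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= theta else 0


# Hypothetical first layer: one (weights, threshold) pair per dividing line.
FIRST_LAYER = [
    ((1, 0), 0),     # fires when x >= 0
    ((0, 1), 0),     # fires when y >= 0
    ((-1, -1), -4),  # fires when x + y <= 4
]


def inside_polygon(point):
    first_outputs = [threshold_neuron(w, t, point) for w, t in FIRST_LAYER]
    # Second-layer AND neuron: fires only if every first-layer neuron fires.
    n = len(first_outputs)
    return threshold_neuron([1] * n, n, first_outputs)


print(inside_polygon((1, 1)))   # 1
print(inside_polygon((3, 3)))   # 0
```

Because the second-layer neuron is an AND, the region it accepts is the intersection of the half-planes, which is why this construction yields a convex polygon.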
4
Capabilities of Threshold Neurons
However, we still may want to implement functions that are more complex than that. An obvious idea is to extend our network even further. Let us build a network that has three layers, with arbitrary numbers of neurons in the first and second layers and one neuron in the third layer. The first and second layers are completely connected, that is, each neuron in the first layer sends its output to every neuron in the second layer.
5
Capabilities of Threshold Neurons
[Diagram: three-layer network; labels x1, x2, ..., oi]
What type of function can a three-layer network realize?
6
Capabilities of Threshold Neurons
Assume that the polygons in the diagram indicate the input regions for which each of the second-layer neurons yields output 1.
[Diagram: input space with axes "1st comp." and "2nd comp.", containing several polygons]
Then, for example, the third-layer neuron could output 1 if the input is within any of the polygons, and 0 otherwise.
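The three-layer case can be sketched the same way: second-layer neurons act as AND over the lines bounding their polygon, and the third-layer neuron acts as OR. The two unit squares, all weights and thresholds, and the names below are assumed for illustration. Zero weights stand for connections a second-layer neuron ignores in the completely connected layers.

```python
def threshold_neuron(weights, theta, inputs):
    """Output 1 if the weighted input sum reaches the threshold theta, else 0."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= theta else 0


# Hypothetical first layer: six dividing lines in the (x, y) plane.
FIRST_LAYER = [
    ((1, 0), 0),    # x >= 0
    ((-1, 0), -1),  # x <= 1
    ((0, 1), 0),    # y >= 0
    ((0, -1), -1),  # y <= 1
    ((1, 0), 2),    # x >= 2
    ((-1, 0), -3),  # x <= 3
]

# Second layer: each neuron ANDs the four lines bounding one square.
SECOND_LAYER = [
    ((1, 1, 1, 1, 0, 0), 4),  # square A: 0 <= x <= 1, 0 <= y <= 1
    ((0, 0, 1, 1, 1, 1), 4),  # square B: 2 <= x <= 3, 0 <= y <= 1
]

# Third layer: OR over the polygons (threshold 1 with unit weights).
THIRD_NEURON = ((1, 1), 1)


def network(point):
    first = [threshold_neuron(w, t, point) for w, t in FIRST_LAYER]
    second = [threshold_neuron(w, t, first) for w, t in SECOND_LAYER]
    weights, theta = THIRD_NEURON
    return threshold_neuron(weights, theta, second)


print(network((0.5, 0.5)))  # 1
print(network((2.5, 0.5)))  # 1
print(network((1.5, 0.5)))  # 0
```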
8
Capabilities of Threshold Neurons
The more neurons there are in the first layer, the more vertices the polygons can have. With a sufficient number of first-layer neurons, the polygons can approximate any given shape. The more neurons there are in the second layer, the more of these polygons can be combined to form the output function of the network. With a sufficient number of neurons and appropriate weight vectors wi, a three-layer network of threshold neurons can realize any (!) function R^n → {0, 1}.
9
Terminology
Usually, we draw neural networks in such a way that the input enters at the bottom and the output is generated at the top. Arrows indicate the direction of data flow. The first layer, termed input layer, just contains the input vector and does not perform any computations. The second layer, termed hidden layer, receives input from the input layer and sends its output to the output layer. After applying their activation function, the neurons in the output layer contain the output vector.
10
Terminology
Example: Network function f: R^3 → {0, 1}^2
[Diagram, from top to bottom: output vector, output layer, hidden layer, input layer, input vector]
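A network of this shape can be sketched as a forward pass from the input vector through the hidden layer to the output layer. The layer sizes (3 inputs, 2 hidden neurons, 2 output neurons), the weights, and the thresholds below are hypothetical values chosen only to make the example concrete.

```python
def threshold_neuron(weights, theta, inputs):
    """Output 1 if the weighted input sum reaches the threshold theta, else 0."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= theta else 0


# Hypothetical parameters: (weights, threshold) per neuron.
HIDDEN_LAYER = [((1, 1, 1), 1.5), ((1, -1, 0), 0)]
OUTPUT_LAYER = [((1, 1), 2), ((1, 1), 1)]


def network_function(input_vector):
    """Map an input vector in R^3 to an output vector in {0, 1}^2."""
    # The input layer only holds the input vector; it computes nothing.
    hidden = [threshold_neuron(w, t, input_vector) for w, t in HIDDEN_LAYER]
    return [threshold_neuron(w, t, hidden) for w, t in OUTPUT_LAYER]


print(network_function((1.0, 0.5, 0.5)))  # [1, 1]
print(network_function((0.0, 1.0, 1.0)))  # [0, 1]
print(network_function((0.0, 1.0, 0.0)))  # [0, 0]
```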
11
General Network Structure
12
Feedback-Based Weight Adaptation
Feedback from the environment (possibly a teacher) is used to improve the system's performance. Synaptic weights are modified to reduce the system's error in computing a desired function. For example, if increasing a specific weight increases the error, then the weight is decreased. Small adaptation steps are needed to find an optimal set of weights. The learning rate can vary during the learning process. This approach is typical for supervised learning.
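The rule "if increasing a weight increases the error, decrease it" can be sketched by probing each weight with a tiny increase and then taking a small step in whichever direction lowers the error. The error surface, the step size, and the probe size below are all assumptions; this is a crude illustration of the feedback principle, not the training procedure used in practice.

```python
def error(weights):
    # Hypothetical error surface with its minimum at weights (1.0, -2.0).
    return (weights[0] - 1.0) ** 2 + (weights[1] + 2.0) ** 2


def adapt(weights, step=0.05, probe=1e-4):
    """Nudge every weight by a small step in the error-reducing direction."""
    new_weights = list(weights)
    for i in range(len(weights)):
        probed = list(weights)
        probed[i] += probe
        if error(probed) > error(weights):
            new_weights[i] -= step  # increasing this weight raised the error
        else:
            new_weights[i] += step  # increasing this weight did not raise it
    return new_weights


weights = [0.0, 0.0]
start_error = error(weights)
for _ in range(30):
    weights = adapt(weights)

print(start_error, error(weights))
```

With a fixed step the weights end up hovering around the minimum, which is one reason a learning rate that shrinks over time can help.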
13
Network Training
Basic idea: Define an error function to measure the deviation of the network output from the desired output across all training exemplars. As the weights of the network completely determine the function computed by it, this error is a function of all weights. We need to find those weights that minimize the error. An efficient way of doing this is based on the technique of gradient descent.
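One common choice of error function is the mean squared error over all training exemplars. The sketch below assumes, purely for brevity, a single linear neuron as the "network"; the names `neuron_output`, `mean_squared_error`, and `EXEMPLARS` are hypothetical.

```python
def neuron_output(weights, inputs):
    # Assumption: a single linear neuron (weighted sum, no threshold).
    return sum(w * x for w, x in zip(weights, inputs))


def mean_squared_error(weights, exemplars):
    """Average squared deviation of actual from desired output."""
    total = 0.0
    for inputs, desired in exemplars:
        total += (desired - neuron_output(weights, inputs)) ** 2
    return total / len(exemplars)


# Hypothetical training exemplars: (input vector, desired output).
EXEMPLARS = [((1.0, 0.0), 1.0), ((0.0, 1.0), 0.0)]

# With the exemplars fixed, the error depends only on the weights.
print(mean_squared_error((1.0, 0.0), EXEMPLARS))  # 0.0
print(mean_squared_error((0.0, 0.0), EXEMPLARS))  # 0.5
```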
14
Gradient Descent
Gradient descent is a very common technique for finding a minimum of a function. In general it converges to a local minimum, which is not guaranteed to be the absolute minimum. It is especially useful for high-dimensional functions. We will use it to iteratively minimize the network's (or neuron's) error by finding the gradient of the error surface in weight-space and adjusting the weights in the opposite direction.
15
Gradient Descent
Gradient-descent example: Finding a (local) minimum of a one-dimensional error function f(x).
[Diagram: plot of f(x) against x, showing the slope f'(x0) at the starting point x0]
x1 = x0 - η f'(x0), where η is the step size (learning rate).
Repeat this iteratively until, for some xi, f'(xi) is sufficiently close to 0.
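The one-dimensional procedure can be sketched as a short loop. The example function f(x) = (x - 2)^2, the step size `eta`, the tolerance `tol`, and the step limit are assumed values for illustration.

```python
def gradient_descent_1d(df, x0, eta=0.1, tol=1e-6, max_steps=1000):
    """Step against the slope until it is sufficiently close to 0."""
    x = x0
    for _ in range(max_steps):
        slope = df(x)
        if abs(slope) < tol:
            break
        x = x - eta * slope  # move opposite to the slope
    return x


def f_prime(x):
    # Derivative of the assumed example function f(x) = (x - 2)^2.
    return 2.0 * (x - 2.0)


x_min = gradient_descent_1d(f_prime, x0=0.0)
print(x_min)  # close to 2.0
```

If `eta` is too large the iterates can overshoot and diverge, and if it is too small convergence is slow, which is why small adaptation steps and a varying learning rate matter.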
16
Gradient Descent
Gradients of two-dimensional functions:
The two-dimensional function in the left diagram is represented by contour lines in the right diagram, where arrows indicate the gradient of the function at different locations. Obviously, the gradient is always pointing in the direction of the steepest increase of the function. In order to find the function's minimum, we should always move against the gradient.
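Moving against the gradient in two dimensions can be sketched as follows. The function f(x, y) = x^2 + 2y^2, its hand-derived gradient, the starting point, and the step size are assumptions chosen for illustration.

```python
def f(x, y):
    # Assumed example function with its minimum at (0, 0).
    return x ** 2 + 2.0 * y ** 2


def gradient(x, y):
    # Partial derivatives of f with respect to x and y.
    return (2.0 * x, 4.0 * y)


def descend(x, y, eta=0.1, steps=100):
    """Repeatedly step against the gradient and record the path taken."""
    path = [(x, y)]
    for _ in range(steps):
        gx, gy = gradient(x, y)
        x, y = x - eta * gx, y - eta * gy
        path.append((x, y))
    return path


path = descend(3.0, -2.0)
print(path[0], path[-1])
```

Each recorded point lies lower on the surface than the one before it, ending very close to the minimum at (0, 0).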
17
Multilayer Networks
The backpropagation algorithm was popularized by Rumelhart, Hinton, and Williams (1986). This algorithm solved the "credit assignment" problem, i.e., crediting or blaming individual neurons across layers for particular outputs. The error at the output layer is propagated backwards to units at lower layers, so that the weights of all neurons can be adapted appropriately.
18
Backpropagation Learning
Algorithm Backpropagation;
  Start with randomly chosen weights;
  while MSE is above desired threshold and computational bounds are not exceeded, do
    for each input pattern xp, 1 ≤ p ≤ P,
      Compute hidden node inputs;
      Compute hidden node outputs;
      Compute inputs to the output nodes;
      Compute the network outputs;
      Compute the error between output and desired output;
      Modify the weights between hidden and output nodes;
      Modify the weights between input and hidden nodes;
    end-for
  end-while.
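The loop above can be sketched in plain Python. Several choices here are assumptions rather than part of the algorithm as stated: a sigmoid activation (gradient-based learning needs a differentiable activation, unlike the threshold neurons discussed earlier), a single output node, bias inputs fixed at 1, the XOR training set, and the values of `eta`, `target_mse`, `max_epochs`, and `seed`.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Return (hidden outputs, network output) for one input pattern."""
    xb = list(x) + [1.0]  # input vector plus bias input
    hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hidden]
    hb = hidden + [1.0]   # hidden outputs plus bias unit
    return hidden, sigmoid(sum(w * v for w, v in zip(w_out, hb)))


def mse(patterns, w_hidden, w_out):
    errors = [(d - forward(x, w_hidden, w_out)[1]) ** 2 for x, d in patterns]
    return sum(errors) / len(errors)


def train(patterns, n_hidden=3, eta=0.5, target_mse=0.01,
          max_epochs=5000, seed=0):
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    # Start with randomly chosen weights.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    history = [mse(patterns, w_hidden, w_out)]
    epochs = 0
    while history[-1] > target_mse and epochs < max_epochs:
        for x, d in patterns:
            hidden, o = forward(x, w_hidden, w_out)
            # Error term at the output node (sigmoid derivative is o * (1 - o)).
            delta_out = (d - o) * o * (1.0 - o)
            # Propagate the error back, using the output weights as they
            # were before this pattern's update.
            delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                            for j in range(n_hidden)]
            # Modify the weights between hidden and output nodes.
            hb = hidden + [1.0]
            for j in range(n_hidden + 1):
                w_out[j] += eta * delta_out * hb[j]
            # Modify the weights between input and hidden nodes.
            xb = list(x) + [1.0]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hidden[j][i] += eta * delta_hidden[j] * xb[i]
        history.append(mse(patterns, w_hidden, w_out))
        epochs += 1
    return w_hidden, w_out, history


# Hypothetical training set: the XOR function.
XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
w_hidden, w_out, history = train(XOR)
print("MSE before:", history[0], "after:", history[-1])
```

The returned `history` holds the MSE before training and after every epoch, so it can be inspected to see whether the run stopped at the error threshold or at the epoch limit. Gradient descent may settle in a local minimum, so a different `seed` or `n_hidden` can change how low the final error gets.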