NEURAL NETWORKS Introduction
For many decades, a goal of science and engineering has been to develop intelligent machines built from a large number of simple elements. The interest in neural networks comes from their ability to mimic the human brain as well as their ability to learn and respond. This course introduces these models and their learning mechanisms. This introductory part sets out the scope of the course.
PROF. DR. YUSUF OYSAL
INTRODUCTION TO NEURAL NETWORKS
NEURAL NETWORKS – Introduction
Introduction
Single-Layer Perceptron Networks
Learning Rules for Single-Layer Perceptron Networks
Perceptron Learning Rule
Adaline Learning Rule
δ-Learning Rule
Multilayer Perceptron
Back-Propagation Learning Algorithm
In the introduction part, models of human neurons are introduced in terms of their structure and how they transmit information. Then the perceptron, the first model in the earliest neural network literature, is explained, together with the details of the perceptron learning rules. The limitations of the single-layer perceptron for data classification are overcome by the multilayer structure; the reasons for these limitations and their structural consequences are explained. Finally, one of the most popular gradient-based learning algorithms, the back-propagation learning algorithm, is explained.
The course covers the basics of neural network theory and practice for supervised and unsupervised learning. Moreover, the student will learn how to implement and apply basic learning algorithms of neural networks. Course information, program, slides and links to on-line material are available at our department website.
Historical Background
1943: McCulloch and Pitts proposed the first computational model of the neuron.
1949: Hebb proposed the first learning rule.
1958: Rosenblatt’s work on perceptrons.
1969: Minsky and Papert exposed the limitations of the theory.
1970s: Decade of dormancy for neural networks.
1980s-90s: Return of neural networks (self-organization, back-propagation algorithms, etc.).
1990-Today: Advanced applications based on fast learning algorithms.
In the 1940s, researchers developed simple hardware and, later, software models of biological neurons and their interaction systems. McCulloch and Pitts published the first systematic study of the artificial neural network. Four years later, they explored network paradigms for pattern recognition using a single-layer perceptron. Other researchers worked on this model until the 1960s. A group of researchers combined these biological models and psychological insights to produce the first artificial neural network (ANN). The list above summarizes this historical background. Neural networks are now used in a large number of applications and have proven effective in performing complex functions in a variety of fields such as pattern recognition, classification, vision, control systems and prediction.
Nervous Systems
UNITs: nerve cells called neurons; there are many different types and they are extremely complex. The human brain contains ~10^11 neurons.
INTERACTIONs: signals are conveyed by action potentials; interactions at the synapse can be chemical (release or reception of neurotransmitters) or electrical. Each neuron is connected to ~10^4 others.
Some scientists have compared the brain with a “complex, nonlinear, parallel computer”.
The largest modern neural networks achieve a complexity comparable to the nervous system of a fly.
The basic building blocks of the brain are nerve cells called neurons. In the early stage of human brain development (the first two years after birth), about 1 million synapses (hard-wired connections) are formed per second; synapses are then modified through the learning process. Neurons operate on a millisecond time scale, which is about six orders of magnitude slower than silicon gates operating in the nanosecond range. The brain makes up for this slow rate of operation with two factors: (1) a huge number of neurons and interconnections between them, and (2) the function of a biological neuron, which seems to be much more complex than that of a logic gate. The brain is also very energy efficient: it consumes only about 10^-16 joules per operation per second, compared with about 10^-6 joules per operation per second for a digital computer.
The brain is a highly complex, nonlinear, parallel information-processing system. It performs tasks such as pattern recognition, perception and motor control many times faster than the fastest digital computers.
Neurons
Each neuron in the brain is composed of a cell body, one axon and a multitude of dendrites. The dendrites receive signals from other neurons. The cell body (soma) of a neuron sums the incoming signals from its dendrites as well as the signals from the numerous synapses on its surface. A neuron sends an impulse along its axon if it receives enough input to be stimulated to its threshold level. The axon can be considered as a long tube which divides into branches terminating in little end bulbs. The small gap between an end bulb and a dendrite is called a synapse. The axon of a single neuron forms synaptic connections with many other neurons.
The synapses are elementary signal processing devices. A synapse is a biochemical device which converts a pre-synaptic electrical signal into a chemical signal and then back into a post-synaptic electrical signal. The input pulse train has its amplitude modified by parameters stored in the synapse. The nature of the modification depends on the type of the synapse, which can be either inhibitory or excitatory. The post-synaptic signals are aggregated and transferred along the dendrites to the nerve cell body. The cell body generates the output signal of the neuron, a spike, which is transferred along the axon to the synaptic terminals of other neurons. The firing frequency of a neuron is proportional to the total synaptic activity and is controlled by the synaptic parameters called weights.
A Model of Artificial Neuron
[Figure: artificial neuron i with inputs x1, x2, ..., xm, weights wi1, wi2, ..., wim, a bias, functions f(.) and a(.), and output yi]
The main purpose of neurons is to receive, analyze and transmit information in the form of signals (electric pulses). When a neuron sends information, we say that it “fires”. Acting through specialized projections known as dendrites and axons, neurons carry information throughout the neural network. A neural network (NN) is an interconnected assembly of simple processing elements, units or nodes, whose functionality is loosely based on the animal neuron. The processing ability of the network is stored in the inter-unit connection strengths, or weights, obtained by a process of adapting to, or learning from, a set of training patterns. In conclusion, an artificial neuron is defined by the following components: a set of inputs x1, ..., xm; a set of synaptic weights wi1, ..., wim; a bias that represents the threshold level; an activation function that represents the activation level of the message sent through the axon; and a neuron output yi.
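The components listed above can be put together in a few lines of Python. This is only a minimal sketch: the function names, the example numbers and the convention of adding the bias to the weighted sum are assumptions made for illustration, not definitions taken from the slides.

```python
import math


def step(net):
    """Hard threshold: the neuron 'fires' (1.0) when net is non-negative."""
    return 1.0 if net >= 0.0 else 0.0


def sigmoid(net):
    """Smooth activation that maps any net input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))


def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation.

    Assumption: the bias is added to the weighted sum, so a negative bias
    plays the role of a threshold that the weighted sum must exceed.
    """
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(net)


# Hypothetical example values, chosen only for illustration.
x = [1.0, 0.0, 1.0]
w = [0.5, -0.3, 0.2]
b = -0.6

y_step = neuron_output(x, w, b, step)
y_sigmoid = neuron_output(x, w, b, sigmoid)
print(y_step, y_sigmoid)
```

With these values the weighted sum is 0.7 and the net input is about 0.1, so the step neuron fires while the sigmoid neuron gives an output slightly above 0.5.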
A Model of Artificial Neuron
[Figure: network graph with inputs x1, x2, ..., xm and outputs y1, y2, ..., yn]
Graph representation: nodes are neurons; arrows are signal flow directions.
The figure is a graph representation of an artificial neural network. The network contains many neurons, shown as nodes, and arrows represent the signal flow directions. A neural network that does not contain cycles (feedback loops), as in this figure, is called a feed-forward network (or perceptron). An artificial neural network (ANN) is a massively parallel distributed computing system (algorithm, device, or other) that has a natural propensity for storing experiential knowledge and making it available for use. According to Haykin, it resembles the brain in two respects: (1) knowledge is acquired by the network through a learning process; (2) inter-neuron connection strengths, known as synaptic weights, are used to store the knowledge.
Layered Structure
[Figure: layered network with inputs x1, x2, ..., xm in the input layer, one or more hidden layers, and outputs y1, y2, ..., yn in the output layer]
An artificial neural network has a layered structure with three main types of layer. Signals coming from other neurons are combined as input signals in the input layer. The activation potential is then formed as a linear combination of the input signals and the synaptic strength parameters in one or more hidden layers. Finally, the output layer produces output signals, which may in turn be inputs to other neural networks in the brain. Properties of the layered structure can be summarized as follows:
No connections within a layer
No direct connections between input and output layers
Fully connected between layers
Often more than 3 layers
Number of output units need not equal number of input units
Number of hidden units per layer can be more or less than input or output units
Knowledge and Memory
[Figure: network with inputs x1, x2, ..., xm and outputs y1, y2, ..., yn]
The output behavior of a network is determined by its weights. The weights are the memory of an artificial neural network (ANN), and learning means adjusting these weights to their proper values. After the learning process, knowledge is distributed across the network. A large number of nodes increases the storage “capacity” and ensures that the knowledge is robust, with acceptable fault tolerance. Learning continues as new information is stored by changing these weights.
Neuron Models
[Figure: single neuron model with input signals x1, x2, ..., xm and a fixed input x0 = +1, synaptic weights w0, w1, w2, ..., wm, a summing function producing the local field v, and an activation function producing the output y]
A single neuron model can be viewed as a network which has a set of parameters associated with it. This network transforms input data into an output through an activation function. Neuron models are named after the activation functions used in their processing units.
Neuron Models
If the activation function is a step function or a sign function, the network is called a single perceptron. If the activation function is a ramp function or a linear function, it is called an Adaline (Adaptive Linear Element). If the activation function is a sigmoid function and the neuron is used in a multilayer structure, the neural network is called a multilayer perceptron. If the network uses a Gaussian activation function applied to a norm of the inputs, it is known as a radial basis function network. Our course will cover these concepts and their learning algorithms.
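The activation functions named above can be written out as follows. This is an illustrative sketch only: the slides do not give formulas, so the exact forms used here (value at zero for the step and sign functions, clipping range of the ramp, and the `center` and `width` parameters of the Gaussian) are assumptions based on common textbook definitions.

```python
import math


def step(v):
    """Step function (perceptron): 0 below the threshold, 1 at or above it."""
    return 1.0 if v >= 0.0 else 0.0


def sign(v):
    """Sign function (perceptron, bipolar form): -1 or +1."""
    return 1.0 if v >= 0.0 else -1.0


def linear(v):
    """Linear function (Adaline): the output equals the local field."""
    return v


def ramp(v):
    """Ramp function: linear between 0 and 1, saturated outside (assumed range)."""
    return min(1.0, max(0.0, v))


def sigmoid(v):
    """Logistic sigmoid (multilayer perceptron): smooth and differentiable."""
    return 1.0 / (1.0 + math.exp(-v))


def gaussian_rbf(x, center, width=1.0):
    """Gaussian of the Euclidean distance between input x and a center.

    `center` and `width` are hypothetical parameter names for this sketch.
    """
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * width ** 2))


for name, f in [("step", step), ("sign", sign), ("linear", linear),
                ("ramp", ramp), ("sigmoid", sigmoid)]:
    print(name, [round(f(v), 3) for v in (-2.0, 0.0, 0.5, 2.0)])
print("rbf", round(gaussian_rbf([1.0, 2.0], [0.0, 0.0]), 3))
```

Note that the first five functions act on the local field v (a weighted sum), whereas the Gaussian unit acts on a distance between the input vector and a center, which is why it takes the raw input rather than v.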
Network architectures
Three different classes of network architectures:
single-layer feed-forward and multi-layer feed-forward (neurons are organized in acyclic layers)
recurrent
According to their structure, neural networks are classified into three architectures: single-layer feed-forward, multi-layer feed-forward and recurrent neural networks. The architecture of a neural network is linked with the learning algorithm used to train it.
Single Layer Feed-forward
[Figure: input layer of source nodes projecting onto an output layer of neurons]
In the simplest form of a layered network, we have an input layer of source nodes that projects onto an output layer of neurons, but not vice versa. In other words, this network is strictly of a feed-forward or acyclic type. "Single layer" refers to the single output layer of computation nodes (neurons).
Multi-layer feed-forward
[Figure: 3-4-2 network with an input layer, hidden layer(s) and an output layer]
The source nodes in the input layer of the network supply the respective elements of the activation pattern, which constitute the input signals applied to the neurons. The activation potential is then formed as a linear combination of the input signals and the synaptic strength parameters in a hidden layer. There may be more than one hidden layer; the outputs of each layer are the inputs of the next layer, and so on for the rest of the network.
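The layer-by-layer flow of signals described above can be sketched for a 3-4-2 network, read here as 3 source nodes, 4 hidden neurons and 2 output neurons. The sigmoid activation, the random weights and all names in the code are assumptions chosen for illustration; the slides do not specify them.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def make_layer(n_in, n_out, rng):
    """Create one layer: a weight row per neuron and one bias per neuron."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(n_out)]
    return weights, biases


def layer_forward(inputs, weights, biases):
    """Each neuron forms a linear combination of its inputs, then activates."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(x, layers):
    """The outputs of each layer become the inputs of the next layer."""
    signal = x
    for weights, biases in layers:
        signal = layer_forward(signal, weights, biases)
    return signal


rng = random.Random(0)  # fixed seed so the sketch is repeatable
layers = [make_layer(3, 4, rng),   # input layer (3 nodes) -> hidden layer (4)
          make_layer(4, 2, rng)]   # hidden layer (4) -> output layer (2)

y = forward([0.2, -0.4, 0.9], layers)
print(y)
```

The input layer performs no computation in this sketch: it only supplies the input vector, so the network has two layers of weights even though three layers are drawn.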
Recurrent network
[Figure: recurrent network with input, hidden and output units and unit-delay elements z^-1]
A recurrent neural network distinguishes itself from a feed-forward neural network in that it has at least one feedback loop. For example, in the figure there are no self-feedback loops in the network and no hidden neurons. The feedback loops involve the use of particular branches composed of unit-delay elements (z^-1), which result in nonlinear dynamical behavior, assuming that the network contains nonlinear units.
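The effect of unit-delay feedback can be illustrated with a small simulation in which each neuron receives the current input plus the previous outputs of the other neurons (no self-feedback). The tanh activation, the two-neuron size and the weight values are assumptions made for this sketch only.

```python
import math


def run_recurrent(inputs_over_time, w_in, w_fb):
    """Simulate a recurrent layer with unit-delay feedback and no self-loops.

    w_in[i] holds the input weights of neuron i; w_fb[i][j] is the weight
    on the delayed output of neuron j fed back into neuron i.
    """
    n = len(w_in)
    y_prev = [0.0] * n  # contents of the unit-delay elements at t = 0
    history = []
    for x in inputs_over_time:
        y = []
        for i in range(n):
            net = sum(w * xi for w, xi in zip(w_in[i], x))
            # Feedback from the other neurons only (j != i: no self-feedback).
            net += sum(w_fb[i][j] * y_prev[j] for j in range(n) if j != i)
            y.append(math.tanh(net))  # nonlinear unit
        history.append(y)
        y_prev = y  # the delay elements now hold the new outputs
    return history


# Hypothetical weights: one external input, two neurons.
w_in = [[0.5], [-0.3]]
w_fb = [[0.0, 0.8],
        [0.4, 0.0]]

# A single input pulse followed by zero input.
outputs = run_recurrent([[1.0], [0.0], [0.0]], w_in, w_fb)
for t, y in enumerate(outputs):
    print(t, y)
```

After the first time step the external input is zero, yet the outputs stay non-zero because the delayed outputs keep circulating through the feedback branches. A feed-forward network would return to a constant output as soon as the input stopped changing.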
ANN Configuration
Uni-directional communication links are represented by directed arcs, so the ANN structure can be described by a directed graph. The neural network is said to be fully connected when every node in each layer of the network is connected to every node in the adjacent forward layer. If, however, some of the communication links (synaptic connections) are missing from the network, we say that the network is partially connected.
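The distinction between fully and partially connected networks can be checked directly on a directed-graph description. In this sketch the node names, the representation of links as (source, target) pairs and the function name are all hypothetical choices made for illustration.

```python
def is_fully_connected(layers, links):
    """Return True if every node is linked to every node of the next layer.

    layers: list of layers, each a list of node names, in forward order.
    links:  set of (source, target) pairs, one per directed arc.
    """
    for current, following in zip(layers, layers[1:]):
        for src in current:
            for dst in following:
                if (src, dst) not in links:
                    return False
    return True


layers = [["x1", "x2"], ["h1", "h2"], ["y1"]]

# All arcs between adjacent layers: a fully connected network.
full_links = {(src, dst)
              for current, following in zip(layers, layers[1:])
              for src in current
              for dst in following}

# Removing one synaptic connection makes the network partially connected.
partial_links = full_links - {("x2", "h1")}

print(is_fully_connected(layers, full_links))
print(is_fully_connected(layers, partial_links))
```

Because every arc points from one layer to the next, the graph built here has no cycles, which matches the feed-forward case; a recurrent network would add arcs pointing backwards.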