Slide 1 EE3J2 Data Mining EE3J2 Data Mining Lecture 15: Introduction to Artificial Neural Networks Martin Russell.

EE3J2 Data Mining Objectives  Unsupervised and supervised learning  Modelling and discrimination  Introduction to Artificial Neural Networks (ANNs)

EE3J2 Data Mining Unsupervised learning  So far we have looked at techniques which try to discover structure in ‘raw’ data – data with no information about classes –Gaussian Mixture Modelling –Clustering  We treat the whole data set as a single entity, and try to discover underlying structure  The analysis is unsupervised, and automatic learning of the structure of the data is unsupervised learning

EE3J2 Data Mining Supervised learning  In some cases additional information is available  For example, for speech data we might know who was speaking, or what he or she said  This is information about the class of each piece of data  When the analysis is driven by class labels, it is called supervised learning

EE3J2 Data Mining Modelling and Discrimination  In supervised learning we can: –Analyse the data for each class separately –Try to discover how to distinguish between classes  Could apply GMM or clustering separately to model each class  Alternatively, we could try to find a method to discriminate between the classes

EE3J2 Data Mining Modelling and Discrimination [Figure labels: Class models; Decision boundary]

EE3J2 Data Mining Discrimination  In the simplest cases we can discriminate between two classes using a class boundary  Allocation of a point to a class depends on which side of the boundary it lies [Figure labels: Linear decision boundary; Non-linear decision boundary]
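
As a small illustration of allocating a point according to the side of a boundary it lies on, the sketch below uses a hypothetical linear boundary a*x + b*y + c = 0. The coefficients and the class names 'A' and 'B' are made up for illustration and do not come from the lecture.

```python
def classify(point, a=1.0, b=1.0, c=-1.0):
    """Return 'A' if the point lies on the non-negative side of the line
    a*x + b*y + c = 0, otherwise 'B'. Coefficients are illustrative only."""
    x, y = point
    return "A" if a * x + b * y + c >= 0 else "B"

# Three hypothetical points: one on each side, and one exactly on the line.
labels = [classify(p) for p in [(2.0, 2.0), (0.0, 0.0), (0.5, 0.5)]]
print(labels)
```

Points exactly on the boundary are assigned to class 'A' here; that tie-breaking choice is arbitrary.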

EE3J2 Data Mining Artificial Neural Networks  There are many approaches to discrimination  A common class of approaches is based on the idea of Artificial Neural Networks (ANNs)  Inspiration for the basic elements of an ANN (artificial neuron) comes from biology…  …but the analogy really stops there  ANNs are just a computational device for processing patterns – not “artificial brains”

EE3J2 Data Mining A model of a neuron

EE3J2 Data Mining An Artificial Neuron  Simple artificial neuron  Basic idea: –If the input to unit $u_4$ is big enough, then the neuron ‘fires’ –Otherwise nothing happens  How do we calculate the input to $u_4$? [Diagram: input units with inputs $i_1$, $i_2$, $i_3$ connected to unit $u_4$ by weights $w_{1,4}$, $w_{2,4}$, $w_{3,4}$]

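EE3J2 Data Mining Artificial Neuron (2)  Suppose that the inputs to units 1, 2 and 3 are $i_1$, $i_2$ and $i_3$  Then the input to $u_4$ is: $i_1 w_{1,4} + i_2 w_{2,4} + i_3 w_{3,4}$  In general, for an artificial neuron with $N$ input units the input to unit $k$ is: $\sum_{n=1}^{N} i_n w_{n,k}$ [Diagram: inputs $i_1$, $i_2$, $i_3$ connected to $u_4$ by weights $w_{1,4}$, $w_{2,4}$, $w_{3,4}$]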
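
A minimal sketch of the input calculation, assuming the input to a unit is the weighted sum of the values arriving from its input units (this matches the arithmetic in the worked example later in the lecture). The function name `net_input` and the numerical values are hypothetical.

```python
def net_input(inputs, weights):
    """Weighted sum of the inputs: sum over n of inputs[n] * weights[n]."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(i * w for i, w in zip(inputs, weights))

# Hypothetical values for i1..i3 and w(1,4)..w(3,4), chosen for illustration.
example = net_input([1.0, 0.5, -2.0], [0.2, 0.4, 0.1])
print(example)
```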

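EE3J2 Data Mining The ‘threshold’ activation function  The activation function decides whether the neuron should “fire”  A suitable activation function is the threshold function $g$: $g(x) = 1$ if $x \ge 0$, and $g(x) = 0$ otherwise  The output of $u_4$ is then: $g(i_1 w_{1,4} + i_2 w_{2,4} + i_3 w_{3,4})$ [Diagram: inputs $i_1$, $i_2$, $i_3$ connected to $u_4$ by weights $w_{1,4}$, $w_{2,4}$, $w_{3,4}$]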

EE3J2 Data Mining Other activation functions  Linear: $g(x) = x$  Sigmoid: $g(x) = \frac{1}{1 + e^{-x}}$ [Figure: sigmoid activation function]
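
The three activation functions can be sketched as follows. The definitions are assumptions: a threshold that fires at inputs of 0 or more, the identity for the linear case, and the standard logistic function for the sigmoid (the logistic form reproduces the values g(6) = 0.998 and g(-10) = 4.54 × 10^-5 quoted in the worked example).

```python
import math

def threshold(x):
    """Fires (returns 1) when the input is at least 0."""
    return 1.0 if x >= 0 else 0.0

def linear(x):
    """Passes the input through unchanged."""
    return x

def sigmoid(x):
    """Logistic function: a smooth version of the threshold, between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

for x in (-2.0, 0.0, 2.0):
    print(x, threshold(x), linear(x), round(sigmoid(x), 3))
```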

EE3J2 Data Mining The ‘bias’  As described, the neuron will ‘fire’ only if its input is greater than or equal to 0  We can change the point at which it fires by introducing a bias  This is an additional input unit whose input is fixed at 1 [Diagram: inputs $i_1$, $i_2$, $i_3$ with weights $w_{1,4}$, $w_{2,4}$, $w_{3,4}$, plus a bias input fixed at 1 with weight $w_{b,4}$, all feeding $u_4$]

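EE3J2 Data Mining How the bias works…  The artificial neuron ‘fires’ if the input to $u_4$ is greater than or equal to 0  I.e.: $i_1 w_{1,4} + i_2 w_{2,4} + i_3 w_{3,4} + w_{b,4} \ge 0$  But this happens only if $i_1 w_{1,4} + i_2 w_{2,4} + i_3 w_{3,4} \ge -w_{b,4}$  Or, equivalently, $\sum_{n=1}^{3} i_n w_{n,4} \ge -w_{b,4}$, so the bias weight moves the firing point from 0 to $-w_{b,4}$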
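
The effect of the bias can be checked numerically. The sketch below assumes a threshold unit that fires when its total input is at least 0, and compares two ways of writing the same decision: adding the bias weight to the weighted sum, or comparing the weighted sum against minus the bias weight. The function names are hypothetical; the weights 3, 1 and -2 are those of the 2D example that follows in the lecture.

```python
def weighted_sum(inputs, weights):
    return sum(i * w for i, w in zip(inputs, weights))

def fires_with_bias(inputs, weights, bias_weight):
    """Threshold unit with an extra input fixed at 1 and weight bias_weight."""
    return weighted_sum(inputs, weights) + 1.0 * bias_weight >= 0

def fires_with_shifted_threshold(inputs, weights, bias_weight):
    """Same decision, written as a comparison against the threshold -bias_weight."""
    return weighted_sum(inputs, weights) >= -bias_weight

weights, bias_weight = [3.0, 1.0], -2.0
for point in [(2.0, 2.0), (0.0, 0.0), (0.5, 0.5)]:
    print(point,
          fires_with_bias(point, weights, bias_weight),
          fires_with_shifted_threshold(point, weights, bias_weight))
```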

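EE3J2 Data Mining Example (2D)  Suppose $u$ has a threshold or sigmoid activation function  $u$ will ‘fire’ if: $3x + y - 2 \ge 0$ [Diagram: unit $u$ with inputs $x$ (weight 3) and $y$ (weight 1), and a bias input fixed at 1 (weight -2)]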

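EE3J2 Data Mining Example (continued) [Diagram: units $u_1$ (input $x$, weight 3), $u_2$ (input $y$, weight 1) and bias unit $u_3$ (input 1, weight -2) feeding $u_4$; plot of the decision boundary, which crosses the x-axis at 2/3 and the y-axis at 2]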

EE3J2 Data Mining Example (continued)  Assume –Linear activation functions for units $u_1$, $u_2$ and $u_3$ –Sigmoid activation function for $u_4$  If input to $u_1$ is 2 and input to $u_2$ is 2, then: –Input to $u_4$ is 2 × 3 + 2 × 1 + 1 × (-2) = 6 –Hence output from $u_4$ is g(6) = 0.998  If input to $u_1$ is -2 and input to $u_2$ is -2, then: –Input to $u_4$ is (-2) × 3 + (-2) × 1 + 1 × (-2) = -10 –Hence output from $u_4$ is g(-10) = 4.54 × 10^-5
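
The arithmetic in this example can be reproduced with a few lines of code. This is a sketch under the assumption that g is the standard logistic sigmoid; the function name `u4_output` is hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Weights into u4 from u1, u2 and the bias unit u3 (whose output is fixed at 1).
WEIGHTS = (3.0, 1.0, -2.0)

def u4_output(x, y):
    """Return (input to u4, output of u4) for inputs x and y."""
    net = x * WEIGHTS[0] + y * WEIGHTS[1] + 1.0 * WEIGHTS[2]
    return net, sigmoid(net)

print(u4_output(2, 2))
print(u4_output(-2, -2))
```

The two calls give inputs of 6 and -10 to the unit, with outputs that agree with the slide's values to the precision quoted there.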

EE3J2 Data Mining Example 2 [Diagram: a second neuron with inputs $x$ and $y$ and its decision boundary; labels on the slide: 2, 1, 1/2]

EE3J2 Data Mining Combining 2 Artificial Neurons [Diagram: the two example neurons, each with inputs $x$ and $y$, shown with their decision boundaries; axis marks at 1/2, 2/3 and 2]

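EE3J2 Data Mining Combining neurons – artificial neural networks [Diagram: inputs $x$ and $y$ enter through $u_1$ and $u_2$; units $u_4$ and $u_5$ feed output unit $u_6$ with weights 20 and -20; $u_6$ has a bias input fixed at 1 with weight -2]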

EE3J2 Data Mining Combining neurons  Input to $u_4$ is 3 × x + 1 × y - 2  Input to $u_5$ is 2 × x + (-1) × y - 1  When x = 3, y = 0: –Input to $u_4$ is 7, input to $u_5$ is 5 –Output from $u_4$ is 1.00, output from $u_5$ is 0.99 –Input to $u_6$ is 1.00 × 20 + 0.99 × (-20) - 2 = -1.88 (using the unrounded outputs of $u_4$ and $u_5$) –Output from $u_6$ is 0.13

EE3J2 Data Mining Outputs
$i_1$    $i_2$    $o_6$
3        0        0.13
0.5      2        1.00
0.5      -2       0.00
0        0        0.06
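
A forward pass through the three-unit network can be sketched as below, assuming sigmoid activations for $u_4$, $u_5$ and $u_6$ and the weights stated on the preceding slides (3, 1, -2 into $u_4$; 2, -1, -1 into $u_5$; 20, -20, -2 into $u_6$). The function name `network_output` is hypothetical. The loop evaluates the inputs (3, 0), (0.5, 2) and (0.5, -2), for which the slide lists outputs of 0.13, 1.00 and 0.00.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def network_output(x, y):
    """Output of u6 for inputs x and y, using the weights from the slides."""
    o4 = sigmoid(3.0 * x + 1.0 * y - 2.0)   # first hidden unit
    o5 = sigmoid(2.0 * x - 1.0 * y - 1.0)   # second hidden unit
    return sigmoid(20.0 * o4 - 20.0 * o5 - 2.0)

for x, y in [(3, 0), (0.5, 2), (0.5, -2)]:
    print(x, y, round(network_output(x, y), 2))
```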

EE3J2 Data Mining Combining neurones [Figure: the ‘firing region’ of the combined network, bounded by the two decision boundaries; axis marks at 2 and 2/3]

EE3J2 Data Mining Single layer Multi-Layer Perceptron (MLP) [Diagram: input layer, hidden layer, output layer]

EE3J2 Data Mining Single Layer MLP  Can characterize arbitrary convex regions  Defines the region using linear decision boundaries
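
One way to see how linear boundaries define a convex region is to let each hidden unit test one half-plane and let the output unit fire only when all of them fire. The sketch below assumes threshold activations and uses a hypothetical triangle (x ≥ 0, y ≥ 0, x + y ≤ 1); none of these weights come from the lecture.

```python
def step(x):
    """Threshold activation: 1 if the input is at least 0, else 0."""
    return 1 if x >= 0 else 0

# Each hidden unit: (weight for x, weight for y, bias weight).
HIDDEN = [
    (1.0, 0.0, 0.0),    # fires when x >= 0
    (0.0, 1.0, 0.0),    # fires when y >= 0
    (-1.0, -1.0, 1.0),  # fires when x + y <= 1
]

def in_region(x, y):
    hidden_out = [step(wx * x + wy * y + b) for wx, wy, b in HIDDEN]
    # Output unit: weight 1 from each hidden unit and bias -(number of hidden
    # units), so it fires only when every hidden unit fires (a logical AND).
    return step(sum(hidden_out) - len(HIDDEN)) == 1

for point in [(0.25, 0.25), (1.0, 1.0), (-0.5, 0.2)]:
    print(point, in_region(*point))
```

Because the region is an intersection of half-planes, it is necessarily convex; adding hidden units adds more sides.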

EE3J2 Data Mining Two-layer MLP [Diagram: MLP with two hidden layers]

EE3J2 Data Mining Two-Layer MLP  An MLP with two hidden layers can characterize arbitrary shapes  First hidden layer characterizes convex regions  Second hidden layer combines these convex regions  In terms of the shapes that can be characterized, there is no advantage in having more than two hidden layers

EE3J2 Data Mining MLP training  To define an MLP we must decide: –Number of layers –Number of input units –Number of hidden units –Number of output units  Once these are defined, the properties of the MLP are completely determined by the values of the weights  How do we choose the weight values?

EE3J2 Data Mining MLP training (continued)  MLP weights are learnt automatically from training data  We have already seen computational techniques for estimating: –Parameters of GMMs –Centroid positions in clustering  Similarly, there is an iterative computational technique for estimating MLP weights: “Error-Back-Propagation”

EE3J2 Data Mining Error-back propagation (EBP)  EBP is a ‘gradient descent’ method, like others we have seen  First stage is to choose initial values for the weights  The EBP algorithm then changes the weights incrementally to identify the class boundaries  Only guaranteed to find a local optimum
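
The idea of choosing initial weights and then changing them incrementally can be sketched as below. This is a minimal illustration, not the lecture's algorithm in detail: it assumes a network with one hidden layer of two sigmoid units, a squared-error criterion, batch gradient descent, and a toy XOR training set. The network size, learning rate, number of epochs and random seed are all arbitrary choices.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy training set (XOR), chosen for illustration; not from the lecture.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
N_HIDDEN = 2

def forward(w_hid, w_out, x):
    # Each hidden unit has weights for x[0], x[1] and a bias input fixed at 1.
    hidden = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hid]
    out = sigmoid(sum(w_out[j] * hidden[j] for j in range(N_HIDDEN)) + w_out[-1])
    return hidden, out

def mean_squared_error(w_hid, w_out):
    return sum((forward(w_hid, w_out, x)[1] - t) ** 2 for x, t in DATA) / len(DATA)

def train(epochs=2000, rate=0.5, seed=0):
    rng = random.Random(seed)
    # Stage 1: choose initial values for the weights.
    w_hid = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(N_HIDDEN)]
    w_out = [rng.uniform(-1, 1) for _ in range(N_HIDDEN + 1)]
    initial = mean_squared_error(w_hid, w_out)
    for _ in range(epochs):
        grad_hid = [[0.0] * 3 for _ in range(N_HIDDEN)]
        grad_out = [0.0] * (N_HIDDEN + 1)
        for x, t in DATA:
            hidden, out = forward(w_hid, w_out, x)
            # Error term at the output unit.
            delta_out = (out - t) * out * (1.0 - out)
            for j in range(N_HIDDEN):
                # Error term propagated back to hidden unit j.
                delta_hid = delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                grad_out[j] += delta_out * hidden[j]
                grad_hid[j][0] += delta_hid * x[0]
                grad_hid[j][1] += delta_hid * x[1]
                grad_hid[j][2] += delta_hid
            grad_out[-1] += delta_out
        # Stage 2: move every weight a small step down the error gradient.
        for j in range(N_HIDDEN):
            for k in range(3):
                w_hid[j][k] -= rate * grad_hid[j][k]
        for j in range(N_HIDDEN + 1):
            w_out[j] -= rate * grad_out[j]
    return initial, mean_squared_error(w_hid, w_out)

initial_error, final_error = train()
print(initial_error, final_error)
```

The final error depends on the initial weights: with a different seed the procedure may settle in a different local optimum, which is the limitation noted in the last bullet.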

EE3J2 Data Mining Other types of ANN  Multi-Layer Perceptrons (MLPs) are not the only type of ANN  There are lots of others: –Radial Basis Function (RBF) networks –Support Vector Machines (SVMs) –…  There are also ANN interpretations of other methods

EE3J2 Data Mining Summary  Discrimination versus Modelling  Brief introduction to neural networks  Definition of an ‘artificial neurone’  Activation functions – linear and sigmoid  Linear boundary defined by a single neurone  Convex region defined by a one-level MLP  Two-level MLPs
