Slide 1 EE3J2 Data Mining
Lecture 15: Introduction to Artificial Neural Networks
Martin Russell
Slide 2 Objectives
Unsupervised and supervised learning
Modelling and discrimination
Introduction to Artificial Neural Networks (ANNs)
Slide 3 Unsupervised learning
So far we have looked at techniques which try to discover structure in ‘raw’ data (data with no information about classes):
– Gaussian Mixture Modelling
– Clustering
We treat the whole data set as a single entity, and try to discover underlying structure.
The analysis is unsupervised, and automatic learning of the structure of the data is called unsupervised learning.
Slide 4 Supervised learning
In some cases additional information is available.
For example, for speech data we might know who was speaking, or what he or she said.
This is information about the class of each piece of data.
When the analysis is driven by class labels, it is called supervised learning.
Slide 5 Modelling and Discrimination
In supervised learning we can:
– Analyse the data for each class separately
– Try to discover how to distinguish between classes
We could apply GMM or clustering separately to model each class.
Alternatively, we could try to find a method to discriminate between the classes.
Slide 6 Modelling and Discrimination
[Figure labels: class models; decision boundary]
Slide 7 Discrimination
In the simplest cases we can discriminate between two classes using a class boundary.
Allocation of a point to a class depends on which side of the boundary it lies.
[Figure: a linear decision boundary and a non-linear decision boundary]
Slide 8 Artificial Neural Networks
There are many approaches to discrimination.
A common class of approaches is based on the idea of Artificial Neural Networks (ANNs).
Inspiration for the basic element of an ANN (the artificial neuron) comes from biology, but the analogy really stops there.
ANNs are just a computational device for processing patterns, not “artificial brains”.
Slide 9 A model of a neuron
Slide 10 An Artificial Neuron
Simple artificial neuron. Basic idea:
– If the input to unit $u_4$ is big enough, then the neuron ‘fires’
– Otherwise nothing happens
How do we calculate the input to $u_4$?
[Figure: inputs $i_1$, $i_2$, $i_3$ connected to unit $u_4$ through weights $w_{1,4}$, $w_{2,4}$, $w_{3,4}$]
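Slide 11 Artificial Neuron (2)
Suppose that the inputs to units 1, 2 and 3 are $i_1$, $i_2$ and $i_3$.
Then the input to $u_4$ is:
$w_{1,4} i_1 + w_{2,4} i_2 + w_{3,4} i_3$
In general, for an artificial neuron with $N$ input units, the input to unit $k$ is:
$\sum_{n=1}^{N} w_{n,k} i_n$
[Figure: inputs $i_1$, $i_2$, $i_3$ connected to unit $u_4$ through weights $w_{1,4}$, $w_{2,4}$, $w_{3,4}$]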
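The weighted-sum input can be sketched in a few lines of Python. This assumes the input to a unit is the sum of each input value multiplied by its connection weight, which agrees with the arithmetic in the later worked example. The function name `unit_input` and the numeric values are hypothetical and do not come from the slides.

```python
def unit_input(inputs, weights):
    """Weighted sum of the inputs feeding one unit: sum over n of w[n] * i[n]."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(w * i for w, i in zip(weights, inputs))


# Hypothetical values standing in for i_1..i_3 and w_{1,4}..w_{3,4}
i = [1.0, 0.5, -2.0]
w = [0.4, 2.0, 0.25]
print(unit_input(i, w))  # 0.4*1.0 + 2.0*0.5 + 0.25*(-2.0)
```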
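Slide 12 The ‘threshold’ activation function
The activation function decides whether the neuron should “fire”.
A suitable activation function is the threshold function $g$:
$g(x) = 1$ if $x \ge 0$, and $g(x) = 0$ otherwise
The output of $u_4$ is then:
$g(w_{1,4} i_1 + w_{2,4} i_2 + w_{3,4} i_3)$
[Figure: inputs $i_1$, $i_2$, $i_3$ connected to unit $u_4$ through weights $w_{1,4}$, $w_{2,4}$, $w_{3,4}$]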
Slide 13 Other activation functions
Linear: $g(x) = x$
Sigmoid: $g(x) = \frac{1}{1 + e^{-x}}$
[Figure: sigmoid activation function]
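For comparison, the three activation functions can be written as short Python functions. This is a sketch under two assumptions: the sigmoid is the standard logistic function $1/(1 + e^{-x})$, which reproduces the values quoted in the later worked example, and the threshold fires when its input is greater than or equal to 0. The function names are illustrative.

```python
import math


def threshold(x):
    """Step activation (assumed convention: fires when x >= 0)."""
    return 1 if x >= 0 else 0


def linear(x):
    """Linear activation: the output equals the input."""
    return x


def sigmoid(x):
    """Logistic sigmoid: a smooth step rising from 0 to 1."""
    return 1.0 / (1.0 + math.exp(-x))


# Compare the three functions at a few sample inputs
for x in (-4, -1, 0, 1, 4):
    print(x, threshold(x), linear(x), round(sigmoid(x), 3))
```

The sigmoid behaves like a softened threshold: it is close to 0 for large negative inputs, close to 1 for large positive inputs, and exactly 0.5 at an input of 0.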
Slide 14 The ‘bias’
As described, the neuron will ‘fire’ only if its input is greater than or equal to 0.
We can change the point at which it fires by introducing a bias.
This is an additional input unit whose input is fixed at 1.
[Figure: inputs $i_1$, $i_2$, $i_3$ and a bias input fixed at 1, connected to unit $u_4$ through weights $w_{1,4}$, $w_{2,4}$, $w_{3,4}$ and $w_{b,4}$]
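Slide 15 How the bias works…
The artificial neuron ‘fires’ if the input to $u_4$ is greater than or equal to 0, i.e.:
$w_{1,4} i_1 + w_{2,4} i_2 + w_{3,4} i_3 + w_{b,4} \times 1 \ge 0$
But this happens only if
$\sum_{n=1}^{3} w_{n,4} i_n \ge -w_{b,4}$
Or, equivalently,
$w_{1,4} i_1 + w_{2,4} i_2 + w_{3,4} i_3 \ge -w_{b,4}$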
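The effect of the bias can be checked with a small Python sketch. It assumes a threshold neuron that fires when its total input is greater than or equal to 0, and shows that adding a bias weight is the same as comparing the weighted sum of the ordinary inputs with the threshold $-w_b$. The function names and the example weights are hypothetical.

```python
def fires_with_bias(inputs, weights, bias_weight):
    """Threshold neuron with an extra bias unit whose input is fixed at 1."""
    total = sum(w * i for w, i in zip(weights, inputs)) + bias_weight * 1
    return total >= 0


def fires_with_shifted_threshold(inputs, weights, bias_weight):
    """Equivalent view: the weighted sum must reach the threshold -bias_weight."""
    return sum(w * i for w, i in zip(weights, inputs)) >= -bias_weight


# Hypothetical neuron: all weights 1, bias weight -2, so it fires
# once the inputs add up to at least 2.
weights = [1, 1, 1]
for inputs in ([0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]):
    print(inputs, fires_with_bias(inputs, weights, -2))
```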
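Slide 16 Example (2D)
Suppose $u$ has a threshold or sigmoid activation function.
$u$ will ‘fire’ if:
$3x + y - 2 \ge 0$
[Figure: inputs $x$ and $y$ connected to unit $u$ with weights 3 and 1, plus a bias input fixed at 1 with weight −2]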
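Slide 17 Example (continued)
[Figure: input units $u_1$ ($x$), $u_2$ ($y$) and bias unit $u_3$ (input 1) connected to $u_4$ with weights 3, 1 and −2; the corresponding linear boundary crosses the axes at 2/3 and 2]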
Slide 18 Example (continued)
Assume:
– Linear activation functions for units $u_1$, $u_2$ and $u_3$
– Sigmoid activation function for $u_4$
If the input to $u_1$ is 2 and the input to $u_2$ is 2, then:
– Input to $u_4$ is 2 × 3 + 2 × 1 + 1 × (−2) = 6
– Hence output from $u_4$ is $g(6) = 0.998$
If the input to $u_1$ is −2 and the input to $u_2$ is −2, then:
– Input to $u_4$ is (−2) × 3 + (−2) × 1 + 1 × (−2) = −10
– Hence output from $u_4$ is $g(-10) = 4.54 \times 10^{-5}$
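The two calculations above can be reproduced with a short Python sketch. It assumes the logistic sigmoid $g(x) = 1/(1 + e^{-x})$ and the weights 3, 1 and −2 used in the arithmetic; the function names are illustrative.

```python
import math


def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))


def u4_output(x, y):
    # u1, u2 and u3 are linear, so they pass their inputs on unchanged;
    # u3 is the bias unit, whose input is fixed at 1.
    total = 3 * x + 1 * y + (-2) * 1
    return sigmoid(total)


print(round(u4_output(2, 2), 3))     # 0.998
print(f"{u4_output(-2, -2):.2e}")    # 4.54e-05
```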
Slide 19 Example 2
[Figure: a second single-neuron example with inputs $x$ and $y$; figure labels 2, 1 and 1/2]
Slide 20 Combining 2 Artificial Neurons
[Figure: the two example neurons, each with inputs $x$ and $y$, shown side by side with their linear boundaries; axis labels 1/2, 2/3 and 2]
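Slide 21 Combining neurons – artificial neural networks
[Figure: inputs $x$ ($u_1$) and $y$ ($u_2$), with bias inputs fixed at 1, feed units $u_4$ and $u_5$; the outputs of $u_4$ and $u_5$ feed $u_6$ with weights 20 and −20, plus a bias weight of −2]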
Slide 22 Combining neurons
Input to $u_4$ is 3 × x + 1 × y − 2
Input to $u_5$ is 2 × x + (−1) × y − 1
When x = 3, y = 0:
– Input to $u_4$ is 7, input to $u_5$ is 5
– Output from $u_4$ is 0.9991 (≈ 1), output from $u_5$ is 0.9933 (≈ 0.99)
– Input to $u_6$ is 0.9991 × 20 + 0.9933 × (−20) − 2 ≈ −1.88
– Output from $u_6$ is 0.13
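A forward pass through this small network can be sketched in Python. The weights are those read off the arithmetic on this slide, and every unit is assumed to use the logistic sigmoid; the function names are illustrative.

```python
import math


def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))


def network_output(x, y):
    """Output of u6 for inputs x and y, using the weights from the slide."""
    out4 = sigmoid(3 * x + 1 * y - 2)          # first hidden unit
    out5 = sigmoid(2 * x + (-1) * y - 1)       # second hidden unit
    return sigmoid(20 * out4 + (-20) * out5 - 2)


print(round(network_output(3, 0), 2))  # 0.13
```

Working with the unrounded hidden outputs matters here: the two hidden outputs are nearly equal and are multiplied by weights of opposite sign, so rounding them to two decimal places changes the input to $u_6$ noticeably.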
Slide 23 Outputs
i_1    i_2    o_6
3      0      0.13
0.5    2      1.00
0.5    -2     0.00
0      0      0.06
Slide 24 Combining neurons
[Figure: the linear boundaries of the combined neurons and the resulting ‘firing region’; axis labels 2/3 and 2]
Slide 25 Single layer Multi-Layer Perceptron (MLP)
[Figure: network with an input layer, one hidden layer and an output layer]
Slide 26 Single Layer MLP
An MLP with a single hidden layer:
– Can characterise arbitrary convex regions
– Defines the region using linear decision boundaries
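One way to see why a single hidden layer yields convex regions is the Python sketch below. It assumes threshold units that fire when their input is greater than or equal to 0: each hidden unit tests one side of a straight line, and the output unit fires only when every hidden unit fires. The triangle, the weights and the function names are hypothetical and chosen only for illustration.

```python
def threshold(x):
    """Step activation (assumed convention: fires when x >= 0)."""
    return 1 if x >= 0 else 0


# Each hidden unit is (weight for x, weight for y, bias weight).
# Together they describe the triangle x >= 0, y >= 0, x + y <= 1.
HIDDEN_UNITS = [
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (-1.0, -1.0, 1.0),
]


def in_region(x, y):
    """Return 1 if (x, y) lies inside the triangle, 0 otherwise."""
    hidden_out = [threshold(wx * x + wy * y + b) for wx, wy, b in HIDDEN_UNITS]
    # Output unit: weight 1 from each hidden unit and bias -3,
    # so it fires only when all three hidden units fire.
    return threshold(sum(hidden_out) - len(HIDDEN_UNITS))


print(in_region(0.2, 0.2), in_region(1.0, 1.0), in_region(-0.1, 0.5))
```

Because the region is an intersection of half-planes, it is always convex, whatever lines the hidden units define.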
Slide 27 Two-layer MLP
[Figure: network with two hidden layers]
Slide 28 Two-Layer MLP
An MLP with two hidden layers can characterise arbitrary shapes:
– The first hidden layer characterises convex regions
– The second hidden layer combines these convex regions
In terms of the shapes that can be represented, there is no advantage in having more than two hidden layers.
Slide 29 MLP training
To define an MLP we must decide:
– Number of layers
– Number of input units
– Number of hidden units
– Number of output units
Once these are defined, the properties of the MLP are completely determined by the values of the weights.
How do we choose the weight values?
Slide 30 MLP training (continued)
MLP weights are learnt automatically from training data.
We have already seen computational techniques for estimating:
– Parameters of GMMs
– Centroid positions in clustering
Similarly, there is an iterative computational technique for estimating MLP weights: “Error Back-Propagation”.
Slide 31 Error back-propagation (EBP)
EBP is a ‘gradient descent’ method, like others we have seen.
The first stage is to choose initial values for the weights.
The EBP algorithm then changes the weights incrementally to identify the class boundaries.
It is only guaranteed to find a local optimum.
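The gradient descent idea can be illustrated on a single sigmoid neuron. The Python sketch below is not the full EBP algorithm: it has no hidden layer, so no error is propagated backwards, and it shows only the incremental weight update that EBP applies to every weight. The training set (logical AND), the squared-error measure, the learning rate and the number of iterations are all assumptions made for this example.

```python
import math


def sigmoid(x):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical training set: two inputs and a target class (logical AND)
DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def error(weights, bias):
    """Sum over the training set of half the squared output error."""
    total = 0.0
    for (x, y), target in DATA:
        out = sigmoid(weights[0] * x + weights[1] * y + bias)
        total += 0.5 * (out - target) ** 2
    return total


def train(epochs=5000, rate=0.5):
    weights, bias = [0.0, 0.0], 0.0  # stage 1: choose initial weight values
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for (x, y), target in DATA:
            out = sigmoid(weights[0] * x + weights[1] * y + bias)
            # Derivative of the error with respect to the unit's input
            delta = (out - target) * out * (1.0 - out)
            grad_w[0] += delta * x
            grad_w[1] += delta * y
            grad_b += delta
        # Stage 2: move each weight a small step down the error gradient
        weights = [w - rate * g for w, g in zip(weights, grad_w)]
        bias -= rate * grad_b
    return weights, bias


weights, bias = train()
print("error before:", error([0.0, 0.0], 0.0))
print("error after: ", error(weights, bias))
```

In an MLP the same update is used for every weight, with the `delta` terms for hidden units obtained by passing the output-layer deltas backwards through the connection weights.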
Slide 32 Other types of ANN
Multi-Layer Perceptrons (MLPs) are not the only type of ANN.
There are lots of others:
– Radial Basis Function (RBF) networks
– Support Vector Machines (SVMs)
– …
There are also ANN interpretations of other methods.
Slide 33 Summary
Discrimination versus modelling
Brief introduction to neural networks
Definition of an ‘artificial neuron’
Activation functions: linear and sigmoid
Linear boundary defined by a single neuron
Convex region defined by a one-level MLP
Two-level MLPs