Now let us talk about…
Neural Network Application Design
NN Application Design

Now that we have gained some insight into the theory of artificial neural networks, how can we design networks for particular applications? Designing NNs is basically an engineering task. As we discussed before, for example, there is no formula that would allow you to determine the optimal number of hidden units in a BPN for a given task.
We need to address the following issues for a successful application design:
- Choosing an appropriate data representation
- Performing an exemplar analysis
- Training the network and evaluating its performance
We are now going to look into each of these topics.
Data Representation

Most networks process information in the form of input pattern vectors. These networks produce output pattern vectors that are interpreted by the embedding application. All networks process one of two types of signal components: analog (continuously variable) signals or discrete (quantized) signals. In both cases, signals have a finite amplitude; their amplitude has a minimum and a maximum value.
The main question is: How can we appropriately capture these signals and represent them as pattern vectors that we can feed into the network? We should aim for a data representation scheme that maximizes the ability of the network to detect (and respond to) relevant features in the input pattern. Relevant features are those that enable the network to generate the desired output pattern.
Similarly, we also need to define a set of desired outputs that the network can actually produce. Often, a “natural” representation of the output data turns out to be impossible for the network to produce. We are going to consider internal representation and external interpretation issues as well as specific methods for creating appropriate representations.
Internal Representation Issues

As we said before, in all network types, the amplitude of input signals and internal signals is limited:
- analog networks: values usually between 0 and 1
- binary networks: only values 0 and 1 allowed
- bipolar networks: only values –1 and 1 allowed
Without this limitation, patterns with large amplitudes would dominate the network’s behavior. A disproportionately large input signal can activate a neuron even if the relevant connection weight is very small.
Creating Data Representations

The patterns that can be represented by an ANN most easily are binary patterns. Even analog networks “like” to receive and produce binary patterns – we can simply round values < 0.5 to 0 and values ≥ 0.5 to 1. To create a binary input vector, we can simply list all features that are relevant to the current task. Each component of our binary vector indicates whether one particular feature is present (1) or absent (0).
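As a small illustrative sketch (not part of the lecture), the rounding rule and the feature-presence encoding could look like this in Python; the feature names are purely hypothetical:

```python
# Minimal sketch; the feature names below are hypothetical examples.

def to_binary(analog_values, threshold=0.5):
    """Round analog activations in [0, 1] to binary: < 0.5 -> 0, >= 0.5 -> 1."""
    return [1 if v >= threshold else 0 for v in analog_values]

# Binary input vector: one position per relevant feature (1 = present, 0 = absent).
FEATURES = ["has_wings", "has_feathers", "can_fly", "lays_eggs"]

def encode(present_features):
    return [1 if f in present_features else 0 for f in FEATURES]

print(to_binary([0.2, 0.7, 0.5]))          # [0, 1, 1]
print(encode({"has_wings", "can_fly"}))    # [1, 0, 1, 0]
```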
With regard to output patterns, most binary-data applications perform classification of their inputs. The output of such a network indicates to which class of patterns the current input belongs. Usually, each output neuron is associated with one class of patterns. For any input, only one output neuron should be active (1) and the others inactive (0), indicating the class of the current input.
In other cases, classes are not mutually exclusive, and more than one output neuron can be active at the same time. Another variant would be the use of binary input patterns and analog output patterns for “classification”. In that case, again, each output neuron corresponds to one particular class, and its activation indicates the probability (between 0 and 1) that the current input belongs to that class.
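A sketch of how the embedding application might interpret such output patterns, assuming analog outputs in [0, 1]; the class names are made up for illustration:

```python
CLASSES = ["circle", "square", "triangle"]   # hypothetical class labels

def exclusive_class(output):
    """Mutually exclusive classes: the most active output neuron wins."""
    return CLASSES[output.index(max(output))]

def active_classes(output, threshold=0.5):
    """Non-exclusive classes: every sufficiently active neuron indicates membership."""
    return [c for c, v in zip(CLASSES, output) if v >= threshold]

print(exclusive_class([0.1, 0.8, 0.3]))   # 'square'
print(active_classes([0.7, 0.6, 0.2]))    # ['circle', 'square']
```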
For non-binary (e.g., ternary) features:
- Use multiple binary inputs to represent non-binary states (e.g., 001 for “red”, 010 for “green”, 100 for “blue” for representing three possible colors).
- Treat each feature in the pattern as an individual subpattern.
- Represent each subpattern with as many positions (units) in the pattern vector as there are possible states for the feature.
- Then concatenate all subpatterns into one long pattern vector, as in the sketch below.
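A sketch of this subpattern scheme, assuming two hypothetical features (a color with three states and a size with two states):

```python
# Hypothetical feature definitions; each feature becomes a one-of-n subpattern.
FEATURE_STATES = {
    "color": ["red", "green", "blue"],
    "size":  ["small", "large"],
}

def encode_sample(sample):
    """Concatenate one subpattern per feature into one long binary pattern vector."""
    vector = []
    for feature, states in FEATURE_STATES.items():
        vector += [1 if sample[feature] == s else 0 for s in states]
    return vector

print(encode_sample({"color": "green", "size": "large"}))  # [0, 1, 0, 0, 1]
```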
Another way of representing n-ary data in a neural network is using one neuron per feature, but scaling the (analog) value to indicate the degree to which a feature is present.
Good examples:
- the brightness of a pixel in an input image
- the output of an edge filter
Poor examples:
- the letter (1 – 26) of a word
- the type (1 – 6) of a chess piece
This can be explained as follows: The way NNs work (both biological and artificial ones) is that each neuron represents the presence/absence of a particular feature. Activations 0 and 1 indicate absence or presence of that feature, respectively, and in analog networks, intermediate values indicate the extent to which a feature is present. Consequently, a small change in one input value leads to only a small change in the network’s activation pattern.
Therefore, it is appropriate to represent a non-binary feature by a single analog input value only if this value is scaled, i.e., it represents the degree to which a feature is present. This is the case for the brightness of a pixel or the output of an edge detector. It is not the case for letters or chess pieces. For example, assigning values to individual letters (a = 0, b = 0.04, c = 0.08, …, z = 1) implies that a and b are in some way more similar to each other than are a and z. Obviously, in most contexts, this is not a reasonable assumption.
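A tiny sketch (an illustration, not part of the lecture) makes the point concrete: with the scalar letter encoding above, “a” and “b” end up much closer to each other than “a” and “z”, whereas a one-of-26 encoding keeps all pairs of distinct letters equally dissimilar:

```python
import string

# Scalar encoding: a = 0.0, b = 0.04, ..., z = 1.0
scalar = {c: i / 25 for i, c in enumerate(string.ascii_lowercase)}

# One-of-26 encoding: each letter gets its own input unit.
def one_hot(c):
    v = [0] * 26
    v[string.ascii_lowercase.index(c)] = 1
    return v

def hamming(u, v):
    """Number of positions in which two binary vectors differ."""
    return sum(a != b for a, b in zip(u, v))

print(abs(scalar["a"] - scalar["b"]), abs(scalar["a"] - scalar["z"]))            # 0.04 vs. 1.0
print(hamming(one_hot("a"), one_hot("b")), hamming(one_hot("a"), one_hot("z")))  # 2 vs. 2
```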
It is also important to notice that, in artificial (not natural!), completely connected networks, the order of features that you specify for your input vectors does not influence the outcome. For the network's performance, it is not necessary to represent, for example, similar features in neighboring input units. All units are treated equally; the neighborhood of two neurons does not imply to the network that they represent similar features. Of course, once you have specified a particular order, you cannot change it anymore during training or testing.
Exemplar Analysis

When building a neural network application, we must make sure that we choose an appropriate set of exemplars (training data):
- The entire problem space must be covered.
- There must be no inconsistencies (contradictions) in the data.
- We must be able to correct such problems without compromising the effectiveness of the network.
Ensuring Coverage

For many applications, we do not just want our network to classify any kind of possible input. Instead, we want our network to recognize whether an input belongs to any of the given classes or whether it is “garbage” that cannot be classified. To achieve this, we train our network with both “classifiable” and “garbage” data (null patterns). For the null patterns, the network is supposed to produce a zero output, or a designated “null neuron” is activated.
In many cases, we use a 1:1 ratio for this training, that is, we use as many null patterns as there are actual data samples. We have to make sure that all of these exemplars taken together cover the entire input space. If it is certain that the network will never be presented with “garbage” data, then we do not need to use null patterns for training.
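One possible way to organize such a training set, assuming binary pattern vectors, a zero output for null patterns, and the 1:1 ratio mentioned above; the exemplars are made up for illustration:

```python
import random

NULL_OUTPUT = [0, 0, 0]  # zero output vector signals "garbage" (alternatively, use a dedicated null neuron)

# Hypothetical classifiable exemplars: (input pattern, target class vector)
real_exemplars = [
    ([1, 0, 1, 1], [1, 0, 0]),
    ([0, 1, 1, 0], [0, 1, 0]),
    ([1, 1, 0, 0], [0, 0, 1]),
]

# Generate as many "garbage" inputs as there are real samples (1:1 ratio).
# In practice, one should ensure the null patterns do not coincide with classifiable inputs.
null_exemplars = [([random.randint(0, 1) for _ in range(4)], NULL_OUTPUT)
                  for _ in real_exemplars]

training_set = real_exemplars + null_exemplars
```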
Ensuring Consistency

Sometimes there may be conflicting exemplars in our training set. A conflict occurs when two or more identical input patterns are associated with different outputs. Why is this problematic?
Assume a BPN with a training set including the exemplars (a, b) and (a, c). Whenever the exemplar (a, b) is chosen, the network adjusts its weights to produce an output for a that is closer to b. Whenever (a, c) is chosen, the network changes its weights for an output closer to c, thereby “unlearning” the adaptation for (a, b). In the end, the network will associate input a with an output that is “between” b and c, but is neither exactly b nor c, so the network error caused by these exemplars will not decrease. For many applications, this is undesirable.
To identify such conflicts, we can apply a (binary) search algorithm to our set of exemplars. How can we resolve an identified conflict? Of course, the easiest way is to eliminate the conflicting exemplars from the training set. However, this reduces the amount of training data that is given to the network. Eliminating exemplars is the best way to go if it is found that these exemplars represent invalid data, for example, inaccurate measurements. In general, however, other methods of conflict resolution are preferable.
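As a sketch of the same idea, instead of an explicit (binary) search one can simply group the exemplars by input pattern and flag every input that is mapped to more than one distinct output; this is an illustration, not the lecture's prescribed algorithm:

```python
from collections import defaultdict

def find_conflicts(exemplars):
    """Return every input pattern that is associated with more than one distinct output."""
    outputs_by_input = defaultdict(set)
    for x, y in exemplars:
        outputs_by_input[tuple(x)].add(tuple(y))
    return {x: ys for x, ys in outputs_by_input.items() if len(ys) > 1}

exemplars = [((0, 0, 1, 1), (0, 1, 0, 1)),
             ((0, 0, 1, 1), (0, 0, 1, 0)),
             ((1, 0, 0, 0), (1, 0, 0, 0))]

# Reports the input (0, 0, 1, 1) together with its two conflicting outputs.
print(find_conflicts(exemplars))
```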
Another method combines the conflicting patterns. For example, if we have the exemplars (0011, 0101) and (0011, 0010), we can replace them with the following single exemplar: (0011, 0111). The way we compute the output vector of the new exemplar based on the two original output vectors depends on the current task. It should be the value that is most “similar” (in terms of the external interpretation) to the original two values.
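In the example above, the combined output 0111 happens to be the component-wise OR of 0101 and 0010; a sketch of that particular combination rule (as stated above, the appropriate rule depends on the task and its external interpretation):

```python
def merge_by_or(outputs):
    """Combine conflicting binary output vectors component-wise with OR."""
    return [int(any(bits)) for bits in zip(*outputs)]

print(merge_by_or([[0, 1, 0, 1], [0, 0, 1, 0]]))  # [0, 1, 1, 1]
```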
Alternatively, we can alter the representation scheme. Let us assume that the conflicting measurements were taken at different times or places. In that case, we can just expand all the input vectors, and the additional values specify the time or place of measurement. For example, the exemplars (0011, 0101), (0011, 0010) could be replaced by the following ones: (100011, 0101), (010011, 0010).
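A sketch of this expansion, assuming a one-of-n code for the measurement context (two hypothetical contexts here) that is prepended to each input vector:

```python
CONTEXTS = ["site_A", "site_B"]   # hypothetical measurement times or places

def expand(context, input_vector):
    """Prepend a one-of-n context code so that formerly identical inputs become distinct."""
    prefix = [1 if context == c else 0 for c in CONTEXTS]
    return prefix + input_vector

print(expand("site_A", [0, 0, 1, 1]))  # [1, 0, 0, 0, 1, 1]
print(expand("site_B", [0, 0, 1, 1]))  # [0, 1, 0, 0, 1, 1]
```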
One advantage of altering the representation scheme is that this method cannot create any new conflicts. Expanding the input vectors cannot make two or more of them identical if they were not identical before.
Training and Performance Evaluation

A more insightful way of performance evaluation is partial-set training. The idea is to split the available data into two sets – the training set and the test set. The network’s performance on the second set indicates how well the network has actually learned the desired mapping. We should expect the network to interpolate, but not extrapolate. Therefore, this test also evaluates our choice of training samples.
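A minimal sketch of such a split; the 80/20 ratio is a common choice for illustration, not a value prescribed by the lecture:

```python
import random

def split_exemplars(exemplars, test_fraction=0.2, seed=0):
    """Randomly split the available exemplars into a training set and a test set."""
    rng = random.Random(seed)
    shuffled = exemplars[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]   # (training set, test set)
```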
If the test set only contains one exemplar, this type of training is called “hold-one-out” training. It is to be performed sequentially for every individual exemplar. This, of course, is a very time-consuming process. A less extreme version of hold-one-out training is cross validation, in which we split the dataset into n subsets. Each subset serves as the test set once, with the other (n – 1) subsets forming the training set. This means that n training processes are performed; this is referred to as n-fold cross validation.
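A sketch of n-fold cross validation as described above; hold-one-out training corresponds to setting n equal to the number of exemplars. The functions train and evaluate are placeholders for whatever network training procedure and error measure are used:

```python
def cross_validate(exemplars, n, train, evaluate):
    """n-fold cross validation: each of the n subsets serves as the test set exactly once."""
    folds = [exemplars[i::n] for i in range(n)]   # n roughly equal subsets
    scores = []
    for i in range(n):
        test_set = folds[i]
        training_set = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        network = train(training_set)             # one of n training processes
        scores.append(evaluate(network, test_set))
    return sum(scores) / n                        # average performance over all folds

# Hold-one-out training: cross_validate(exemplars, n=len(exemplars), train=..., evaluate=...)
```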