
1 Module 1: Machine Learning

2 What is Learning?
“Learning denotes changes in a system that ... enable a system to do the same task more efficiently the next time.” – Herbert Simon
“Learning is constructing or modifying representations of what is being experienced.” – Ryszard Michalski
“Learning is making useful changes in our minds.” – Marvin Minsky

3 Why do Machine Learning?
- Understand and improve the efficiency of human learning. For example, improve methods for teaching and tutoring people, as done in CAI (Computer-Aided Instruction).
- Discover new things or structure that is unknown to humans. Example: data mining.
- Fill in skeletal or incomplete specifications about a domain. Large, complex AI systems cannot be completely derived by hand and require dynamic updating to incorporate new information.

4 Evaluating Performance
Several possible criteria for evaluating a learning algorithm:
- Predictive accuracy of the classifier
- Speed of the learner
- Speed of the classifier
- Space requirements
The most common criterion is predictive accuracy, illustrated by the sketch below.
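A minimal sketch of computing predictive accuracy (the labels here are a made-up illustration, not data from the slides):

```python
# Predictive accuracy: the fraction of examples for which the trained
# classifier's prediction matches the true label.
def accuracy(y_true, y_pred):
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # 3 of 4 correct -> 0.75
```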

5 Major Paradigms of Machine Learning
Rote learning: a one-to-one mapping from inputs to stored representations; “learning by memorization”.
Learning from instruction: acquiring knowledge from a teacher.

6 Major Paradigms of Machine Learning
Learning by analogy: acquiring new facts or skills by transforming existing knowledge.
Learning from examples: inductive learning, in which a general concept is induced from a given set of examples.
Learning from observation and discovery: unsupervised learning.

7 Learning Methods
- Neural computing
- Symbolic algorithms
- Genetic algorithms / genetic mutation

8 Learning through Neural computing
A NN is a structured, distributed information-processing system consisting of processing elements, or nodes, interconnected in a way that resembles the human brain. Learning denotes changes in the NN that are adaptive in the sense that the NN can do the same tasks, drawn from the same population, more efficiently the next time. A learning rule specifies how the weights of the connections in the network are adjusted during the learning process, or training. Many learning algorithms have been introduced, each aiming to let the network produce the correct output after a period of training. Learning algorithms fall into two groups: supervised and unsupervised.

9 Supervised learning
Supervised learning requires the training data to consist of pairs of input patterns and target patterns representing the desired output. These training patterns are called vector-pairs. The weights are adjusted during training until the outputs produced for the input patterns approach the target patterns; the target thus acts as a teacher. For example, the figure illustrates how the weight vectors w, represented by linear hyperplanes, are gradually adjusted to separate one class of patterns from another; a small code sketch follows the figure.

10 [Figure: weight hyperplane gradually adjusted to separate two classes of patterns]
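A minimal perceptron sketch of the idea on slide 9 (an assumed illustration; the data, learning rate and epoch count are made up). The weight vector w defines a hyperplane that is nudged after every misclassified vector-pair until the two classes are separated:

```python
import numpy as np

def train_perceptron(X, targets, lr=0.1, epochs=100):
    w = np.zeros(X.shape[1] + 1)            # weights plus a bias term
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    for _ in range(epochs):
        errors = 0
        for x, t in zip(Xb, targets):       # targets are +1 or -1
            y = 1 if w @ x >= 0 else -1     # current prediction
            if y != t:                      # adjust only on a mistake
                w += lr * t * x             # move the hyperplane toward x
                errors += 1
        if errors == 0:                     # every vector-pair satisfied
            break
    return w

# Linearly separable toy data: class +1 upper right, class -1 lower left.
X = np.array([[2.0, 2.0], [1.5, 2.5], [-1.0, -2.0], [-2.0, -1.5]])
t = np.array([1, 1, -1, -1])
print(train_perceptron(X, t))
```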

11 Unsupervised learning rule
The learning set consists of input training patterns only. Therefore, the NN learns without the benefit of any target values; it adapts based on the experience collected through the previous training patterns. Typical examples:
- The Hebbian learning rule
- The competitive learning rule

12 A simple version of Hebbian learning states that when nodes (learning elements) i and j fire simultaneously, the strength of the connection between them increases in some proportion. In competitive learning, a node learns by shifting connection weights from its inactive to its active input nodes; the node that wins the competition is called the winner node. Examples are shown in the figure below, followed by a small code sketch of both rules.

13 [Figure: connection weights before and after learning]
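A rough sketch of the two unsupervised rules (an assumed illustration; the array shapes and learning rates are made up):

```python
import numpy as np

# Hebbian rule: when input activity x_j and output activity y_i are
# high at the same time, connection w_ij is strengthened in proportion
# to their product.
def hebbian_update(W, x, y, lr=0.01):
    return W + lr * np.outer(y, x)

# Competitive rule: only the winner node (the node whose weight vector
# is closest to the input) shifts its weights toward the input.
def competitive_update(W, x, lr=0.1):
    winner = np.argmin(np.linalg.norm(W - x, axis=1))
    W[winner] += lr * (x - W[winner])
    return W

W = np.random.default_rng(0).random((3, 4))   # 3 nodes, 4 inputs
x = np.array([1.0, 0.0, 1.0, 0.0])
W = competitive_update(W, x)                  # the winner moves toward x
```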

14 THE ROLE OF NEURAL NETWORKS
NNs are a very popular field of research in cognitive science, neurobiology, computer engineering/science, signal processing, optics and physics. NNs are an alternative to symbolic processing paradigms: learning occurs through adaptation or self-organisation rather than explicit programming.

15 NN modelling can be divided into two categories:
- Biological-type NN
- Application-driven NN

16 In a biological-type NN, the model mimics biological neural systems, such as audio/vision functions like motion fields, binocular stereo and edge detection.

17 The application-driven NN is not closely tied to biological realities; its architecture is largely influenced by the application, for reasons such as:
- Speed of learning
- Type of learning: supervised/unsupervised
- Robustness
- Space
- Approximation

18 The application domains of NNs can be roughly divided into the following categories:
(1) association
(2) classification
(3) pattern completion
(4) generalisation
(5) optimisation

19 Association (memorisation)
There are two types: auto-association and hetero-association. Auto-association stores a set of patterns by presenting them repeatedly; the problem is to retrieve the complete pattern from a partial or distorted part of it. Hetero-association involves pairing an arbitrary set of input patterns with another arbitrary set of output patterns; the problem is to retrieve the corresponding output pattern for a given input pattern. A minimal auto-association sketch follows.
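The sketch below (an assumed illustration in the spirit of the Hopfield network mentioned later; the patterns are made up) stores two ±1 patterns with a Hebbian outer-product rule and then recovers the complete pattern from a distorted probe:

```python
import numpy as np

def store(patterns):
    # Hebbian outer-product storage with no self-connections.
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0)
    return W

def recall(W, probe, steps=10):
    # Iterate the network until the distorted probe settles.
    s = probe.astype(float).copy()
    for _ in range(steps):
        s = np.sign(W @ s)
        s[s == 0] = 1
    return s

patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]])
W = store(patterns)
noisy = np.array([1, -1, 1, -1, 1, 1])   # first pattern, last bit flipped
print(recall(W, noisy))                  # recovers the first pattern
```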

20 There are two types of classification.
- The categories and a set of input patterns are presented repeatedly; when a new pattern is presented, the network identifies which category the pattern belongs to.
- No prior knowledge of the categories is given; in this case the network performs adaptive feature extraction or clustering during learning.
Pattern/information completion: the original pattern is recovered from given partial information. The completion process runs for many iterations, reaching a stable state when the state no longer changes.

21 Approximation involves the following task.
Suppose that a non-linear input-output mapping is described by the function y = f(x), where x is an input vector and y is the scalar output. The function f is assumed unknown. The requirement is to design a NN that approximates the unknown function f after the input-output pairs (x1, y1), (x2, y2), ..., (xn, yn) have been repeatedly presented, as in the sketch below.
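A small sketch of this task under assumed settings (the target function, network size, learning rate and iteration count are all made up): a one-hidden-layer network is fitted to the presented pairs by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(1)

X = np.linspace(-2, 2, 50).reshape(-1, 1)   # inputs x1..xn
y = np.sin(2 * X)                           # stand-in for the unknown f

H = 16                                      # hidden nodes
W1 = rng.normal(0.0, 1.0, (1, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 1.0, (H, 1)); b2 = np.zeros(1)
lr = 0.05

for _ in range(2000):
    h = np.tanh(X @ W1 + b1)                # hidden activations
    pred = h @ W2 + b2                      # network's approximation of f
    err = pred - y
    # Backpropagate the error to each layer's weights.
    gW2 = h.T @ err / len(X); gb2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1 - h ** 2)
    gW1 = X.T @ dh / len(X); gb1 = dh.mean(axis=0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

print(float(np.mean((pred - y) ** 2)))      # training error shrinks as f is fitted
```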

22 A network is considered successful if it can closely approximate the actual values on the trained data set and provide smooth interpolations for the untrained data set. The objective of generalisation is to yield a correct output response to an input pattern on which the network has not been trained.

23 Optimisation applications – looking for a nearly optimal solution. Examples: scheduling processes, timetabling, network configuration, etc.

24 Where are NNs being used
Signal processing: the first commercial application was to suppress noise on telephone lines, using ADALINE (adaptive linear) networks.
Control: the “truck backer-upper”, which provides steering directions to a trailer truck.
Pattern recognition: automatic recognition of handwritten characters.

25 Where are NNs being used
Medicine: training an associative-memory NN to store a large number of medical records, each of which includes information on symptoms, diagnosis and treatment for a particular case. After training, the NN can recall the full stored pattern that represents the best diagnosis and treatment.
Speech production: learning to read English text aloud – NETtalk.
Speech recognition: speaker-independent recognition of speech.
Business: mortgage assessment work by Nestor, Inc.

26 TAXONOMY OF THE NEURAL NETWORK MODELS
Interest in NNs has greatly increased since the beginning of the 1980s, and a large range of models have been developed for different purposes. Models are characterised by:
(1) network architecture,
(2) node characteristics, and
(3) training or learning rules.

27 Architecture of Neural Networks
NNs are formed by organising nodes into layers and linking them with weighted interconnections. NN architectures are characterised by:
(1) the number of layers in the network, such as single layer, two layers or multilayer;
(2) the type of connections: 'feedforward', 'feedback' and 'lateral';
(3) whether the connections are fully or locally connected;
(4) whether the connections are excitatory (positive weights) or inhibitory (negative weights).
Based on the above distinctions, six different architectures are shown in figure 1.

28 Multilayer feedforward networks
Multilayer feedforward networks (figure 1a) propagate data from the previous layer to the next layer. They range from the simple two-layer perceptron to networks with multiple hidden layers, and are capable of classification, generalisation/approximation and pattern recognition.
(a) Multilayer feedforward network

29 Single layer networks
Fully connected or laterally connected single layer networks, or Hopfield-type networks (figure 1b). Suitable for pattern auto-association, pattern completion and optimisation; they are good at generating clean versions of patterns given a noisy or incomplete pattern. Examples: the Hopfield network and the Brain-State-in-a-Box.
(b) Single layer network

30 Topological networks
Two layer networks in which the nodes of the second layer are laterally connected (figure 1c). This layer acts as a competitive layer: it fires a selected output node (the winner node) when an input minimises/maximises the corresponding function. It is used to cluster different classes of input patterns. Examples: Learning Vector Quantisation and the Kohonen Self-Organising network; a rough LVQ sketch is given below.
(c) Topological network
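A rough sketch of a single Learning Vector Quantisation (LVQ1) update step (an assumed illustration; the prototypes and learning rate are made up):

```python
import numpy as np

# LVQ1: each prototype carries a class label. The winner prototype
# moves toward the input if their labels agree, away from it otherwise.
def lvq1_step(protos, proto_labels, x, x_label, lr=0.05):
    winner = np.argmin(np.linalg.norm(protos - x, axis=1))
    sign = 1.0 if proto_labels[winner] == x_label else -1.0
    protos[winner] += sign * lr * (x - protos[winner])
    return protos

protos = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = ["A", "B"]
print(lvq1_step(protos, labels, np.array([0.9, 0.8]), "B"))
```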

31 Two layer feedforward/feedback networks
Two layer feedforward and feedback networks with symmetrical connections, as shown in figure 1d. Patterns sweep from one node layer to the next and then back again, slowly relaxing into a stable state that represents the network's association of the two patterns (hetero-association). Examples: Adaptive Resonance Theory (ART) and the Bidirectional Associative Memory (BAM).
(d) Two layer feedforward/feedback network

32 Multilayer competitive networks
A three layer network whose second layer has lateral connections for competitive learning purposes (figure 1e). The output of the network is determined through the winner node of the competitive layer. Example: the Counterpropagation network.
(e) Multilayer competitive network

33 Cascading the networks
Different structures can be cascaded (figure 1f); this is known as a 'hybrid network' or 'sequential network'. The basic variables are not individual nodes but entire subnetworks.
(f) Cascading the networks

34 Dynamic Neural Network
A network with local feedback, also called a 'temporal model'. The network structures discussed earlier are known as static networks. Static networks are categorised as memoryless: their output is a function only of the current input, not of past or future inputs or outputs.

35 [Figure 1: the six network architectures, panels (a)-(f), as captioned above]

36 Node characteristics
All NNs have a set of processing nodes which represent the neurons. These nodes operate in parallel, either synchronously or asynchronously. Each node:
- receives input from one or more of its neighbours,
- computes an output value (its activation state), and
- sends it to one or more of its neighbours.

37 Node characteristics
The input from the environment may be analogue or digital, and may have to be preprocessed, e.g. represented in binary format. The output is the activation level of the node, which may be discrete or continuous, bounded or unbounded (fig. 2). The function that computes it is called the transfer function or activation function. Fig. 3 shows the structure of the i-th node; a small code sketch follows the figures. The combination of all node outputs at the output layer may represent the result (a classification, etc.).

38 Fig. 2: Threshold function

39 Fig. 3: A simple i-th node
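A small sketch of the node computation in figs. 2 and 3 (the inputs and weights are assumed for illustration): the node combines its neighbours' inputs as a weighted sum and passes the result through a transfer function, e.g. a threshold function for a discrete output or a sigmoid for a continuous, bounded one.

```python
import numpy as np

def threshold(net, theta=0.0):
    # Discrete output, as in fig. 2: fire iff the net input reaches theta.
    return 1 if net >= theta else 0

def sigmoid(net):
    # Continuous, bounded output in (0, 1).
    return 1.0 / (1.0 + np.exp(-net))

def node_output(inputs, weights, activation=threshold):
    net = np.dot(inputs, weights)       # weighted sum of neighbour inputs
    return activation(net)

print(node_output([1.0, 0.5], [0.8, -0.4]))           # discrete: 1
print(node_output([1.0, 0.5], [0.8, -0.4], sigmoid))  # continuous: ~0.65
```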

40 Learning rules
Fig. 4 categorises models based on their learning rules.
Fig. 4: Some models of Neural Networks

