1
LINEAR CLASSIFICATION
2
Biological inspirations
Some numbers:
- The human brain contains about 10 billion nerve cells (neurons)
- Each neuron is connected to other neurons through about 10,000 synapses
Properties of the brain:
- It can learn and reorganize itself from experience
- It adapts to the environment
- It is robust and fault tolerant
3
Biological neuron (simplified model)
A neuron has:
- a branching input (the dendrites)
- a branching output (the axon)
Information circulates from the dendrites to the axon via the cell body. The cell body sums up the inputs in some way and fires, i.e. generates a signal through the axon, if the result is greater than some threshold.
4
An Artificial Neuron
Activation function: usually not pictured (we'll see why), but you can imagine a threshold parameter here.
5
Same Idea using the Notation in the Book
6
The Output of a Neuron
As described so far, the neuron outputs 1 when its weighted input sum reaches the threshold and 0 otherwise. This simplest form of a neuron is also called a perceptron.
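For reference, this output can be written as a threshold (step) function of the weighted input sum — a standard formulation; the exact equation on the original slide is not preserved in this transcript:

```latex
o(x_1,\dots,x_n) =
\begin{cases}
1 & \text{if } w_1 x_1 + w_2 x_2 + \dots + w_n x_n \ge t \\
0 & \text{otherwise}
\end{cases}
```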
7
The Output of a Neuron
Other possibilities exist, such as the sigmoid function, which gives a continuous output.
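The sigmoid itself is not shown in this transcript; the standard logistic form is:

```latex
g(\mathit{in}) = \frac{1}{1 + e^{-\mathit{in}}}
```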
8
Linear Regression using a Perceptron
Linear regression: fit the weights of a linear function of the inputs to the observed outputs.
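In the standard formulation (the exact equation on the slide is not preserved in this transcript), a perceptron with a linear activation computes exactly this hypothesis:

```latex
h_{\mathbf{w}}(\mathbf{x}) = w_0 + w_1 x_1 + w_2 x_2 + \dots + w_n x_n
```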
9
Linear Regression as an Optimization Problem
Finding the optimal weights can be done with:
- Gradient descent (a sketch follows below)
- Simulated annealing
- Genetic algorithms
- … and now neural networks
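As an illustration of the gradient-descent option — a minimal sketch, not code from the presentation; the learning rate, epoch count, and toy data are made up:

```python
import numpy as np

def linear_regression_gd(X, y, alpha=0.01, epochs=1000):
    """Fit y ~ X @ w + b by batch gradient descent on the squared error."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        pred = X @ w + b                          # current predictions
        err = pred - y                            # prediction error
        w -= alpha * (X.T @ err) / n_samples      # gradient step for the weights
        b -= alpha * err.mean()                   # gradient step for the bias
    return w, b

# Tiny example: data generated from y = 2*x + 1 plus a little noise
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(50, 1))
y = 2 * X[:, 0] + 1 + 0.01 * rng.standard_normal(50)
w, b = linear_regression_gd(X, y, alpha=0.5, epochs=2000)
print(w, b)   # approximately [2.0] and 1.0
```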
10
Linear Regression using a Perceptron
11
The Bias Term
So far we have defined the output of a perceptron as controlled by a threshold:
$x_1 w_1 + x_2 w_2 + x_3 w_3 + \dots + x_n w_n \ge t$
But just like the weights, this threshold is a parameter that needs to be adjusted.
Solution: make it another weight, attached to a constant input of 1:
$x_1 w_1 + x_2 w_2 + x_3 w_3 + \dots + x_n w_n + (1)(-t) \ge 0$
The bias term.
12
A Neuron with a Bias Term
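A minimal sketch of such a neuron — not code from the slides; the function name and step activation are illustrative assumptions:

```python
def perceptron_output(inputs, weights, bias):
    """Step-activation neuron: fires (returns 1) when the weighted sum plus bias is >= 0."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total >= 0 else 0
```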
13
Another Example
Assign weights to perform the logical OR operation (one possible solution is sketched below).
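One standard choice — my illustration, not necessarily the weights given on the slide — is a weight of 1 on each input and a bias of -0.5 (i.e. a threshold of 0.5):

```python
def or_gate(x1, x2):
    # weights w1 = w2 = 1, bias = -0.5 (i.e. threshold t = 0.5)
    total = 1 * x1 + 1 * x2 - 0.5
    return 1 if total >= 0 else 0

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", or_gate(a, b))   # outputs 0 only for (0, 0)
```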
14
Artificial Neural Network (ANN)
A mathematical model used to solve engineering problems: a group of highly connected neurons that realizes compositions of nonlinear functions.
Tasks:
- Classification
- Discrimination
- Estimation
15
Feed-Forward Neural Networks
Information is propagated from the inputs to the outputs. There are no cycles between outputs and inputs, so the state of the system is not preserved from one iteration to another.
[Figure: inputs x1, x2, …, xn feeding a 1st hidden layer, a 2nd hidden layer, and the output layer]
16
ANN Structure
- Finite number of inputs
- Zero or more hidden layers
- One or more outputs
All nodes at the hidden and output layers contain a bias term. A sketch of a forward pass through such a network follows below.
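A minimal forward-pass sketch for one hidden layer — my illustration; the layer sizes, sigmoid activation, and variable names are assumptions, not taken from the slides:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    """One hidden layer followed by an output layer; every node has a bias term."""
    h = sigmoid(W1 @ x + b1)      # hidden-layer activations
    y = sigmoid(W2 @ h + b2)      # output-layer activations
    return y

rng = np.random.default_rng(0)
x = np.array([0.5, -1.0, 2.0])                     # 3 inputs
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)      # 4 hidden nodes
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)      # 2 outputs
print(forward(x, W1, b1, W2, b2))
```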
17
Examples
- Handwriting character recognition
- Control of a virtual agent
18
ALVINN, a neural-network-controlled AGV (1994)
19
http://blog.davidsingleton.org/nnrccar
20
Learning
The procedure of estimating the weight parameters so that the whole network can perform a specific task.
The learning process (supervised):
- Present the network with a number of inputs and their corresponding outputs
- See how closely the actual outputs match the desired ones
- Modify the parameters to better approximate the desired outputs
21
Perceptron Learning Rule
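The rule itself is not preserved in this transcript; the standard form (following the textbook's notation, with learning rate $\alpha$ and perceptron output $h_{\mathbf{w}}(\mathbf{x})$) is:

```latex
w_i \leftarrow w_i + \alpha \,\bigl(y - h_{\mathbf{w}}(\mathbf{x})\bigr)\, x_i
```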
22
Linear Separability
Perceptrons can correctly classify any data set that is linearly separable. For more complex problems we need a more complex model.
23
Different Non-Linearly Separable Problems
Structure vs. types of decision regions:
- Single-layer: half plane bounded by a hyperplane
- Two-layer: convex open or closed regions
- Three-layer: arbitrary regions (complexity limited by the number of nodes)
[The original slide also illustrates, for each structure, the exclusive-OR problem, classes with meshed regions, and the most general region shapes]
24
Calculating the Weights
The perceptron learning rule is essentially gradient descent.
25
Learning the Weights in a Neural Network
The perceptron learning rule (gradient descent) worked before, but it required us to know the correct output of the node. How do we know the correct output of a given hidden node?
26
Backpropagation Algorithm
- Gradient descent over the entire network weight vector
- Easily generalized to arbitrary directed graphs
- Will find a local, not necessarily global, error minimum; in practice it often works well (can be invoked multiple times with different initial weights)
27
Backpropagation Algorithm
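The algorithm itself appears as a figure in the original slides; below is a minimal sketch of one backpropagation update for a single hidden layer with sigmoid units and squared error — my illustration, with assumed variable names and shapes:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(x, y, W1, b1, W2, b2, alpha=0.1):
    """One gradient-descent update of all weights for a single training example."""
    # Forward pass
    h = sigmoid(W1 @ x + b1)          # hidden activations
    o = sigmoid(W2 @ h + b2)          # output activations

    # Backward pass: deltas use the sigmoid derivative g'(in) = g(in) * (1 - g(in))
    delta_o = (y - o) * o * (1 - o)           # output-node "responsibility"
    delta_h = (W2.T @ delta_o) * h * (1 - h)  # hidden nodes share the blame via their weights

    # Weight updates (descent on the squared error), modifying the arrays in place
    W2 += alpha * np.outer(delta_o, h)
    b2 += alpha * delta_o
    W1 += alpha * np.outer(delta_h, x)
    b1 += alpha * delta_h
    return W1, b1, W2, b2
```

Repeating this step over the training examples for many epochs drives the squared error toward a (local) minimum.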
28
Intuition
General idea: hidden nodes are "responsible" for some of the error at the output nodes they connect to. The change in the hidden weights is proportional to the strength (magnitude) of the connection between the hidden node and the output node. This is the same as the perceptron learning rule, but for a sigmoid decision function instead of a step decision function (full derivation on p. 726).
30
Intuition
When expanded, the update to the output nodes is almost the same as the perceptron rule. The slight difference is that the algorithm uses a sigmoid function instead of a step function (full derivation on p. 726).
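Written out in the standard form the slide alludes to ($\Delta_i$ is the output node's error term and $a_j$ the activation feeding weight $w_{j,i}$):

```latex
w_{j,i} \leftarrow w_{j,i} + \alpha \, a_j \, \Delta_i,
\qquad \Delta_i = \bigl(y_i - a_i\bigr)\, g'(\mathit{in}_i)
```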
31
Questions