1
Meet the Neurons: Models of the Brain Hardware
2
A Real Neuron [figure: a biological neuron]
3
Neuron Models
4
McCulloch-Pitts (1940s): y = f(net) = f(Σᵢ wᵢxᵢ + b)
The wᵢ are weights, the xᵢ are inputs, b is a bias term, and f() is the activation function: f(net) = 1 if net ≥ 0, else −1.
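As a rough illustration, here is a minimal Python sketch of this unit (the function name and the AND-style example weights are ours, not from the slides):

def mcculloch_pitts(x, w, b):
    """McCulloch-Pitts unit: f(net) = 1 if net >= 0 else -1,
    where net = sum_i(w_i * x_i) + b."""
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if net >= 0 else -1

# With these (assumed) weights the unit fires only when both inputs are 1
print(mcculloch_pitts([1, 1], [0.5, 0.5], -1.0))  # -> 1
print(mcculloch_pitts([1, 0], [0.5, 0.5], -1.0))  # -> -1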
5
Activation Functions (modified McCulloch-Pitts)
The function shown before is a threshold function. Smooth alternatives are the tanh function and the logistic function.
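A small sketch of the three activations named here; the smooth ones (tanh, logistic) matter later because back propagation needs a differentiable function:

import math

def threshold(net):
    """Original McCulloch-Pitts threshold: outputs +1 or -1."""
    return 1 if net >= 0 else -1

def logistic(net):
    """Logistic (sigmoid) function: output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-net))

def tanh(net):
    """Hyperbolic tangent: output in (-1, 1)."""
    return math.tanh(net)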
6
Human Brain: Neuron speed is about 10⁻³ seconds per operation.
The brain weighs about 3 pounds and at rest consumes 20% of the body's oxygen. Estimates place the neuron count at 10¹² to 10¹⁴, and connectivity can be 10,000 connections per neuron.
7
What is the capacity of the brain?
Estimate the MIPS of a brain. Estimate the MIPS needed by a computer to simulate the brain.
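As a very rough back-of-envelope estimate using the figures from the previous slide (one set of assumptions among many): 10¹² neurons × 10,000 connections each × 10³ operations per second ≈ 10¹⁹ connection-updates per second, or roughly 10¹³ MIPS. The answer depends heavily on which estimates you pick.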
8
Structure: The cortex is estimated to have 6 layers.
The brain does recognition-type computations in milliseconds. The brain clearly uses some specialized structures.
9
Two Approaches: Artificial Neural Nets
Biologically inspired Neural Nets
10
Survey of Artificial Neural Networks
11
Alan Turing's Idea [diagram: a simple unit with inputs X1 and X2]
12
Turing (cont.): B-type link [diagram]
13
B-type link [diagram]
14
Biowall
15
The Perceptron (Rosenblatt, 1950s)
Linear classifiers built from McCulloch-Pitts neurons [diagram: inputs x1 … xn feeding m neurons with outputs y1 … ym]
16
Perceptron Operation: Unit step function activation, y = step(Σᵢ wᵢxᵢ)
Learning rule: wᵢ(t+1) = wᵢ(t) + a·xᵢ·e, where a is a constant (the learning rate) and e is the error
17
An Example: A single neuron trained for logical OR
Example taken from "Artificial Intelligence Illuminated" by Ben Coppin. Step one: random assignment of the weights on each input, w1 = −0.2 and w2 = +0.4. Set a to 0.2 (a guess!)
18
Ex (cont): Training set for logical OR: (0,0) → 0, (0,1) → 1, (1,0) → 1, (1,1) → 1
19
Ex (cont): x1 = 0, x2 = 0, we expect 0. y = step(Σ wᵢxᵢ)
= step((0 × −0.2) + (0 × 0.4)) = 0. e = 0, no error → no change to the weights. Try x1 = 0 and x2 = 1, expect 1.
20
Ex (cont): Answer: y = 1 → no error. But for x1 = 1 and x2 = 0,
y = 0 and the error is 1 (e = expected − actual). So w1 = −0.2 + (0.2 × 1 × 1) = 0 and w2 = 0.4 is unchanged. Now do the last case: correct, no adjustment required!
21
Ex (cont): We call this an epoch, one pass of the training data.
If we keep going, it takes three epochs to reach a trained network: w1 = 0.2 and w2 = 0.4.
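Putting the rule and the worked example together, here is a small Python sketch (our own function and variable names, not Coppin's) that reproduces this result:

def step(net):
    """Unit step: 1 if the weighted sum is positive, else 0."""
    return 1 if net > 0 else 0

def train_perceptron(data, w, a, max_epochs=10):
    """Perceptron learning rule: w_i <- w_i + a * x_i * e, with e = expected - actual."""
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, expected in data:
            y = step(sum(wi * xi for wi, xi in zip(w, x)))
            e = expected - y
            if e != 0:
                errors += 1
                w = [wi + a * xi * e for wi, xi in zip(w, x)]
        if errors == 0:          # a full error-free pass: the network is trained
            return w, epoch
    return w, max_epochs

# Logical OR with the slide's starting weights and learning rate
or_data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train_perceptron(or_data, [-0.2, 0.4], a=0.2))  # -> ([0.2, 0.4], 3)

Running it returns weights (0.2, 0.4) after the third epoch, matching the slide.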
22
Perceptron Capabilities
Rosenblatt modeled systems with a visual input. Perceptrons are linear classifiers: in a two-dimensional system this means separating sets of points with a line dividing the plane.
23
Multi-layer Feed Forward
[diagram: inputs x1 … xn feeding a layer of m neurons, which feeds a second layer of m neurons]
24
Back Propagation: Present an input, then measure desired vs. actual outputs. Correct the weights by back-propagating the error through the net. Hidden layers are corrected in proportion to the weight they provide to the output stage. A constant is needed, used to prevent overly rapid training.
25
Back Propagation Training
Use the sigmoid activation function σ(x) = 1 / (1 + e⁻ˣ). This has the derivative dσ(x)/dx = σ(x)(1 − σ(x)). Given the equations for each neuron: xⱼ = Σᵢ xᵢwᵢⱼ − θⱼ and yⱼ = σ(xⱼ).
26
(cont.): The error at the output layer is eₖ = dₖ − yₖ.
The error gradient is δₖ = (dyₖ/dxₖ)·eₖ (note: this should be a partial derivative), which for the sigmoid gives δₖ = yₖ(1 − yₖ)·eₖ.
27
(cont.): Weight update: wᵢⱼ(t+1) = wᵢⱼ(t) + a·xᵢ·δⱼ
Hidden-layer nodes (back-propagate the error): δⱼ = yⱼ(1 − yⱼ)·Σₖ wⱼₖδₖ
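A compact NumPy sketch of these update equations for one hidden layer, following the formulas above but with the threshold terms θ omitted for brevity; all names are ours, so treat it as an illustration rather than the canonical algorithm:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(x, d, W_hid, W_out, a=0.5):
    """One forward/backward pass for input vector x and desired output d."""
    # Forward pass: y_j = sigmoid(sum_i x_i w_ij), y_k = sigmoid(sum_j y_j w_jk)
    y_hid = sigmoid(x @ W_hid)
    y_out = sigmoid(y_hid @ W_out)

    # Output layer: delta_k = y_k (1 - y_k) (d_k - y_k)
    delta_out = y_out * (1.0 - y_out) * (d - y_out)
    # Hidden layer: delta_j = y_j (1 - y_j) sum_k w_jk delta_k
    delta_hid = y_hid * (1.0 - y_hid) * (W_out @ delta_out)

    # Weight updates: w(t+1) = w(t) + a * x * delta
    W_out = W_out + a * np.outer(y_hid, delta_out)
    W_hid = W_hid + a * np.outer(x, delta_hid)
    return W_hid, W_out

# Example shapes: 2 inputs, 3 hidden neurons, 1 output
W_hid, W_out = np.random.rand(2, 3), np.random.rand(3, 1)
W_hid, W_out = backprop_step(np.array([0.0, 1.0]), np.array([1.0]), W_hid, W_out)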
28
Improve Back Propagation
Add momentum: ΔWᵢⱼ(t) = a·xᵢ·δⱼ + b·ΔWᵢⱼ(t−1), where b is the momentum constant.
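A sketch of the momentum term in isolation (the default values of a and b are assumptions, not from the slides):

import numpy as np

def momentum_update(W, dW_prev, x, delta, a=0.5, b=0.9):
    """Delta-W_ij(t) = a * x_i * delta_j + b * Delta-W_ij(t-1)."""
    dW = a * np.outer(x, delta) + b * dW_prev
    return W + dW, dW   # return dW so it can be fed back in as dW_prev next step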
29
Recurrent Networks [diagram: outputs fed back through delay (D) elements]
30
Recurrent with Hidden Neurons
31
Properties: Memory, but can have stability issues.
32
Training vs. Learning: Learning is 'self-directed'.
Training is externally controlled: a set of pairs of inputs and desired outputs.
33
Learning vs. Training
Hebbian learning: strengthen connections that fire at the same time. Training: back propagation, Hopfield networks, Boltzmann machines.
34
Hopfield Networks: Single layer; each neuron receives an input.
Each neuron has connections to all the others. Training is done by "clamping" and adjusting weights.
35
Hopfield Details: The activation function is sign(x).
sign(x) = +1 if x > 0, −1 otherwise. Patterns are compared by Hamming distance.
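A minimal recall sketch using this sign activation (the synchronous-update style and all names here are our own simplification, with W standing for the network's weight matrix):

import numpy as np

def sign(x):
    """Slide's Sign(x): +1 if x > 0, -1 otherwise."""
    return np.where(x > 0, 1, -1)

def hopfield_recall(W, pattern, steps=10):
    """Repeatedly apply Sign(W @ state) to settle toward a stored pattern."""
    state = np.array(pattern)
    for _ in range(steps):
        state = sign(W @ state)
    return state

def hamming_distance(p, q):
    """Number of positions in which two +/-1 patterns differ."""
    return int(np.sum(np.array(p) != np.array(q)))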
36
Boltzmann Machines: Change f() in McCulloch-Pitts to be probabilistic.
The energy gap between the 0 and 1 outputs is related to a "temperature": pₖ = 1 / (1 + eᵗ), where t = −ΔEₖ / T. Learning: hold the inputs and outputs according to the training data, and anneal the temperature while adjusting the weights.
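A sketch of the probabilistic unit itself (the annealing loop, in which T is gradually lowered, is only hinted at in the comment):

import math
import random

def boltzmann_fire(delta_E_k, T):
    """Fire with probability p_k = 1 / (1 + exp(-delta_E_k / T))."""
    p_k = 1.0 / (1.0 + math.exp(-delta_E_k / T))
    return 1 if random.random() < p_k else 0

# At high T the unit behaves almost randomly; as T is annealed toward 0
# it behaves more and more like a deterministic threshold unit.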
37
Kohonen Maps: Self-organizing feature maps.
Winner-take-all learning algorithm. Clusters data. Two layers: input and cluster.
38
Kohonen Operation: Feed an input vector in; the neuron that matches best is the winner. Use the Euclidean distance between the input and each neuron's weight vector: Dᵢ = sqrt(Σⱼ (wᵢⱼ − xⱼ)²). The smallest distance wins and is updated: wᵢⱼ = wᵢⱼ + a(xⱼ − wᵢⱼ).
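A winner-take-all update in NumPy, following the two formulas above (the names and the default learning rate are ours):

import numpy as np

def kohonen_step(W, x, a=0.1):
    """W holds one weight vector per cluster neuron (one row each).
    Find the neuron whose weights are closest to x and pull it toward x."""
    d = np.sqrt(((W - x) ** 2).sum(axis=1))   # D_i = sqrt(sum_j (w_ij - x_j)^2)
    winner = int(np.argmin(d))                 # smallest distance wins
    W[winner] += a * (x - W[winner])           # w_ij <- w_ij + a (x_j - w_ij)
    return winner, W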
39
Kohonen Operation(cont)
Sometimes a neighborhood around the winner is also updated.
40
Kohonen Example: Two inputs and a 3×3 cluster layer [diagram: inputs x1, x2 connected to a 3×3 grid of cluster neurons]
41
Ex (cont): The training will cluster the data, and the distance between clusters can also be measured. The example is very simple! Usually you would use a bigger map!
42
PDP, a turning point: Parallel Distributed Processing (1985)
Properties: learning similar to observed human behavior; knowledge is distributed; robust.
43
Connectionism: Superposition principle.
Distributed "knowledge representation". Separation of process and "knowledge representation". The knowledge representation is not symbolic in the same sense as in symbolic AI!
44
Example 1: Face recognition (Gary Cottrell). Input: a 64×64 grid.
Hidden layer: 80 neurons. Output layer: 8 neurons (face yes/no, 2 bits for gender, and 5 bits for the name). Results: recognized the input set 100%; face / no-face on new faces 100%; male / female determination on new faces 81%.
45
Example 2: SRI worked on a net to spot tanks.
It used pictures of tanks and non-tanks; the pictures included both exposed and hidden vehicles! When it appeared to work, it was exposed to new pictures. It failed! Why?
46
Example 3: The nature of the sub-symbolic
Categorize words by lexical type based on word order (Elman 1991)
47
Elman's Network [diagram: input and context units feed a hidden layer, which feeds the output; the context units hold a copy of the hidden layer's previous state]
48
Training: Set built from sample sentences.
29 words; 10,000 two- and three-word sentences. Each training sample is an input word and the following word as a pair. There is no unique answer: the output is a set of words. Analysis of the trained network: no symbol in the hidden layer corresponds to words or word pairs!
49
Analysis of Networks: How?
Problem: neural systems seem very unprincipled. Approaches: principal component analysis; decompile nets to rules.
50
Guido Bologna: Symbolic Rule Extraction
How is this possible? Given a single neuron with N inputs, the neuron's inputs form an N-dimensional space, and the output divides the space with a hyperplane.
51
(cont.) The previous statement means we could write a function or rule for the neuron. Now, using the weights between the output layer and the hidden layer, select the hidden-layer neurons that drive the desired output. The inputs to these hidden-layer neurons form the rule antecedents.
52
(cont.) Finally, use Boolean algebra to form a logical rule.
53
Other techniques: There are some very math-heavy methods.
Eigen Tensor analysis
54
Connectionism's challenge
Fodor's Language of Thought. Folk psychology: mind states and our tags for them. How does the brain get to these? Marr's type-1 theory: competence without explanation. See Associative Engines by Andy Clark for more!
55
How about some reality?
56
Pulsed Systems: Real neurons pulse!
Pulsed neurons have more computing power than level-based (non-pulsed) neurons.
57
Spike Response Model: A variable uᵢ describes the internal state of neuron i.
The firing times are Fᵢ = {tᵢ⁽ᶠ⁾; 1 ≤ f ≤ n} = {t | uᵢ(t) = threshold}. After a spike, the state variable's value is lowered. Inputs: summing over all input neurons j and their firing times f, uᵢ(t) = Σ wᵢⱼ·εᵢⱼ(t − tⱼ⁽ᶠ⁾).
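A sketch of this input sum, with an assumed exponential decay standing in for the kernel ε (the slide does not fix the kernel's exact shape, and the names and example values here are ours):

import math

def eps(s, tau=10.0):
    """Assumed postsynaptic kernel: exponential decay after a presynaptic spike."""
    return math.exp(-s / tau) if s >= 0 else 0.0

def u(t, firing_times, w):
    """u_i(t) = sum over inputs j and firing times f of w_ij * eps(t - t_j^(f))."""
    return sum(w_j * eps(t - t_f)
               for w_j, times in zip(w, firing_times)
               for t_f in times)

# Two inputs: the first fired at t = 1 and t = 4, the second at t = 2
print(u(5.0, [[1.0, 4.0], [2.0]], [0.5, 0.8]))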
59
Models
Full models simulate continuous functions and can integrate other factors: currents other than the dendrite's, and chemical states. Spike models simplify and treat the output as an impulse.
60
Computational Power of Spiking Neurons
All of these neuron models are Turing computable. Spiking neurons can do some things more cheaply in neuron count.
61
Encoding Problem: How does the human brain use the spike trains?
Rate coding: spike density; rate over a population. Pulse coding: time to first spike; phase; correlation.
62
Where Next? Build a brain model! Analyze the operation of real brains.
A symbolic-neural bridge.