1
Brain, Computation, and AI
Part I: Brain and Computers
Brain-Mind-Behavior Seminar, April 13, 2015
Byoung-Tak Zhang
Biointelligence Laboratory
Computer Science and Engineering & Brain Science, Cognitive Science, and Bioinformatics Programs & Brain-Mind-Behavior Concentration Program
Seoul National University
2
Lecture Overview
Part I: Brain and Computers
How does the brain encode and process information? How can we build brain-like computers?
Part II: Artificial Intelligence and Brain
How can we build intelligent machines inspired by brain computing?
3
Human Brain Project (EU, 2013-2022)
4
Human Brain: Functional Architecture
Brodmann’s areas & functions
5
Cortex: Perception, Action, and Cognition
Fig 3-18: Primary sensory and motor cortex & association cortex
6
Mind, Brain, Cell, Molecule
Figure: memory at the levels of mind, brain (10^11 cells), cell, and molecule (10^10 molecules)
7
Computational Neuroscience
8
Computational Neuroscience
9
From Molecules to the Whole Brain
10
Cortical Parameters
11
The Structure of Neurons
12
Information Transmission between Neurons
Overview of signaling between neurons:
Synaptic inputs generate postsynaptic currents.
Passive depolarizing currents spread along the membrane.
An action potential depolarizes the membrane and can trigger another action potential: the inward current is conducted down the axon, which leads to depolarization of adjacent regions of membrane.
13
Voltage-gated channel in the neuronal membrane.
Mechanisms of neurotransmitter receptor molecules.
15
Hodgkin-Huxley Model
Membrane equation: C dV/dt = -I_Na - I_K - I_L + I(t)
C: membrane capacitance; I(t): external current; three ionic currents: sodium, potassium, and leak (Fig. 2.7)
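As a rough illustration of these equations, here is a minimal forward-Euler simulation of the Hodgkin-Huxley model. The parameter values and rate functions are the standard textbook ones, not values taken from Fig. 2.7, and the spike-counting criterion is a simplification.

```python
import numpy as np

# Standard Hodgkin-Huxley parameters (membrane potential V in mV, time in ms)
C, g_Na, g_K, g_L = 1.0, 120.0, 36.0, 0.3            # uF/cm^2, mS/cm^2
E_Na, E_K, E_L = 50.0, -77.0, -54.4                   # reversal potentials (mV)

# Voltage-dependent rate functions for the gating variables m, h, n
a_m = lambda V: 0.1 * (V + 40.0) / (1.0 - np.exp(-(V + 40.0) / 10.0))
b_m = lambda V: 4.0 * np.exp(-(V + 65.0) / 18.0)
a_h = lambda V: 0.07 * np.exp(-(V + 65.0) / 20.0)
b_h = lambda V: 1.0 / (1.0 + np.exp(-(V + 35.0) / 10.0))
a_n = lambda V: 0.01 * (V + 55.0) / (1.0 - np.exp(-(V + 55.0) / 10.0))
b_n = lambda V: 0.125 * np.exp(-(V + 65.0) / 80.0)

dt, T = 0.01, 50.0                                    # time step and duration (ms)
V, m, h, n = -65.0, 0.05, 0.6, 0.32                   # initial conditions near rest
spikes = 0
for t in np.arange(0.0, T, dt):
    I_ext = 10.0 if t > 5.0 else 0.0                  # external current I(t) (uA/cm^2)
    # Three ionic currents: sodium, potassium, leak
    I_Na = g_Na * m**3 * h * (V - E_Na)
    I_K = g_K * n**4 * (V - E_K)
    I_L = g_L * (V - E_L)
    V_new = V + dt * (I_ext - I_Na - I_K - I_L) / C   # C dV/dt = -I_Na - I_K - I_L + I(t)
    m += dt * (a_m(V) * (1 - m) - b_m(V) * m)
    h += dt * (a_h(V) * (1 - h) - b_h(V) * h)
    n += dt * (a_n(V) * (1 - n) - b_n(V) * n)
    if V < 0.0 <= V_new:                              # count upward zero crossings as spikes
        spikes += 1
    V = V_new
print(f"{spikes} action potentials in {T} ms")
```

With a constant 10 uA/cm^2 step current the model fires repetitively; the three ionic currents appear as the I_Na, I_K, and I_L terms.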
16
Molecular Basis of Learning and Memory in the Brain
17
Neuronal Connectivity
18
Population Coding
The average population activity A(t) of a pool of neurons
A pool (or local population) is a set of neurons with similar response characteristics. The pool average is defined as the average firing rate over the neurons in the pool within a relatively small time window.
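A small sketch of the pool average: count the spikes of all N neurons in the pool within a window [t, t+dt] and divide by N*dt. The Poisson spike trains below are made-up stand-ins for recorded data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pool: N neurons with similar response characteristics,
# each represented by its spike times (in seconds) from a Poisson process.
N, T, rate = 100, 1.0, 20.0                      # neurons, duration (s), mean rate (Hz)
spike_trains = [np.sort(rng.uniform(0.0, T, rng.poisson(rate * T))) for _ in range(N)]

def population_activity(spike_trains, t, dt):
    """A(t) = (number of spikes of all neurons in [t, t+dt]) / (N * dt)."""
    n_spikes = sum(np.count_nonzero((s >= t) & (s < t + dt)) for s in spike_trains)
    return n_spikes / (len(spike_trains) * dt)

dt = 0.01                                        # a relatively small time window (10 ms)
A = [population_activity(spike_trains, t, dt) for t in np.arange(0.0, T, dt)]
print(f"mean population activity: {np.mean(A):.1f} Hz (true rate {rate} Hz)")
```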
19
Associative Networks
Associative node and network architecture. (A) A simplified neuron that receives a large number of inputs r_i^in. The synaptic efficiency is denoted by w_i. The output of the neuron, r^out, depends on the particular input stimulus. (B) A network of associative nodes. Each component of the input vector, r_i^in, is distributed to each neuron in the network. However, the effect of the input can be different for each neuron, as each individual synapse can have a different efficiency value w_ij, where j labels the neuron in the network.
Auto-associative node and network architecture. (A) Schematic illustration of an auto-associative node, distinguished from the associative node of Fig. 7.1A in that it also has a recurrent feedback connection. (B) An auto-associative network that consists of associative nodes that not only receive external input from other neural layers but also have many recurrent collateral connections between the nodes in the layer.
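A minimal sketch of the associative (hetero-associative) idea in these figures: an output r^out computed from weighted inputs, with the weights w_ij set by a Hebbian-style rule. The patterns, coding, and learning rule below are illustrative assumptions rather than the textbook's exact formulation.

```python
import numpy as np

def step(x):
    return (x > 0).astype(float)                   # simple threshold activation

rng = np.random.default_rng(1)
n_in, n_out, n_patterns = 20, 10, 3

# Hypothetical binary input patterns and the outputs to be associated with them
R_in = (rng.random((n_patterns, n_in)) < 0.5).astype(float)
R_out = (rng.random((n_patterns, n_out)) < 0.5).astype(float)

# Hebbian learning: w_ij grows when pre- and postsynaptic activity coincide
W = np.zeros((n_out, n_in))
for r_in, r_out in zip(R_in, R_out):
    W += np.outer(2 * r_out - 1, 2 * r_in - 1)     # +/-1 coding for the update

# Recall: present a noisy version of a stored input and read out the association
probe = R_in[0].copy()
flip = rng.choice(n_in, size=2, replace=False)
probe[flip] = 1 - probe[flip]                      # corrupt two input components
recalled = step(W @ (2 * probe - 1))               # recall with +/-1 coding, threshold at zero
print("recall errors:", int(np.sum(recalled != R_out[0])))
```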
20
Von Neumann’s The Computer and the Brain (1958)
John von Neumann (1903-1957)
21
Some Facts about the Brain
Volume and mass: 1.35 liters & 1.35 kg
Processors: 10^11 neurons
Communication: 10^14 synapses
Speed: 10^-3 sec (computer: 1 GHz = 10^-9 sec)
Memory: 2.8 x 10^21 bits = 14 bits/sec x 10^11 neurons x (2 x 10^9) sec, where 2 x 10^9 sec is about 60 years of lifetime (computer disk: terabits = 10^12 bits)
Reliability: ~10^4 neurons dying every day
Plasticity: biochemical learning
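The memory figure on this slide is just a multiplication of the stated assumptions (14 bits/sec per neuron, 10^11 neurons, roughly 60 years of lifetime); a one-line check:

```python
bits = 14 * 10**11 * 2 * 10**9   # 14 bits/s per neuron x 10^11 neurons x ~60 years in seconds
print(f"{bits:.1e} bits")        # 2.8e+21 bits, vs. ~10^12 bits for a terabit disk
```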
22
Principles of Information Processing in the Brain
The Principle of Uncertainty: precision vs. prediction
The Principle of Nonseparability ("UN-IBM"): processor vs. memory
The Principle of Infinity: limited matter vs. unbounded memory
The Principle of "Big Numbers Count": hyperinteraction of 10^11 neurons (or > 10^17 molecules)
The Principle of "Matter Matters": material basis of "consciousness"
[Zhang, 2005]
23
Neural Networks
24
What Is a Neural Network?
A new form of computing, inspired by biological (brain) models.
A mathematical model composed of a large number of simple, highly interconnected processing elements.
A computational model for studying learning and intelligence.
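A minimal sketch of one such processing element: a weighted sum of its inputs passed through a nonlinear activation function (the weights and inputs below are arbitrary illustrative values).

```python
import numpy as np

def neuron(x, w, b):
    """One artificial neuron: nonlinear function of a weighted input sum."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))   # logistic (sigmoid) activation

x = np.array([0.5, -1.0, 2.0])    # inputs from other units
w = np.array([0.8, 0.2, -0.5])    # synaptic weights (illustrative values)
print(neuron(x, w, b=0.1))        # output in (0, 1)
```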
25
From Biological Neuron to Artificial Neuron
Figure: dendrite, cell body, axon
26
From Biology to Artificial Neural Networks
27
Properties of Artificial Neural Networks
A network of artificial neurons
Characteristics:
Nonlinear I/O mapping
Adaptivity
Generalization ability
Fault tolerance (graceful degradation)
Biological analogy
<Multilayer Perceptron Network>
28
Integrate-and-Fire Neuron
Model quantities: membrane potential, membrane time constant, input current, synaptic efficiency, firing time of the presynaptic neuron at synapse j, firing time of the postsynaptic neuron, firing threshold, reset membrane potential, and absolute refractory time (during which the membrane potential is held at the reset value).
Fig. 3.1: Schematic illustration of a leaky integrate-and-fire neuron. This neuron model integrates (sums) the external input, with each channel weighted by a corresponding synaptic weighting factor w_i, and produces an output spike if the membrane potential reaches the firing threshold.
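A minimal sketch of the leaky integrate-and-fire dynamics just described: the membrane potential integrates the input with a leak, a spike is emitted when it reaches the firing threshold, and the potential is reset and held there for the absolute refractory period. Parameter values are illustrative, not the textbook's.

```python
import numpy as np

tau_m, R_m = 10.0, 10.0                          # membrane time constant (ms), resistance (MOhm)
V_rest, V_reset, theta = -65.0, -70.0, -50.0     # resting, reset, and threshold potentials (mV)
t_ref = 2.0                                      # absolute refractory period (ms)

dt, T = 0.1, 200.0
V, last_spike, spike_times = V_rest, -np.inf, []
for t in np.arange(0.0, T, dt):
    I = 2.0                                      # constant input current (nA)
    if t - last_spike < t_ref:
        V = V_reset                              # hold at reset during the refractory period
        continue
    # Leaky integration of the input current
    V += dt * (-(V - V_rest) + R_m * I) / tau_m
    if V >= theta:                               # firing threshold reached
        spike_times.append(t)
        last_spike, V = t, V_reset               # emit a spike and reset the membrane potential
print(f"{len(spike_times)} spikes, mean rate {1000 * len(spike_times) / T:.1f} Hz")
```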
29
Activation Functions
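The slide's figure is not included in the text; as a quick sketch, a few commonly used activation functions (this particular selection is an assumption, not taken from the slide):

```python
import numpy as np

def threshold(x):            # step unit (McCulloch-Pitts style)
    return np.where(x >= 0, 1.0, 0.0)

def sigmoid(x):              # logistic function, output in (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):                 # hyperbolic tangent, output in (-1, 1)
    return np.tanh(x)

def relu(x):                 # rectified linear unit
    return np.maximum(0.0, x)

x = np.linspace(-3, 3, 7)
for f in (threshold, sigmoid, tanh, relu):
    print(f.__name__, np.round(f(x), 2))
```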
30
Associative Networks
Associative node and network architecture. (A) A simplified neuron that receives a large number of inputs r_i^in. The synaptic efficiency is denoted by w_i. The output of the neuron, r^out, depends on the particular input stimulus. (B) A network of associative nodes. Each component of the input vector, r_i^in, is distributed to each neuron in the network. However, the effect of the input can be different for each neuron, as each individual synapse can have a different efficiency value w_ij, where j labels the neuron in the network.
Auto-associative node and network architecture. (A) Schematic illustration of an auto-associative node, distinguished from the associative node of Fig. 7.1A in that it also has a recurrent feedback connection. (B) An auto-associative network that consists of associative nodes that not only receive external input from other neural layers but also have many recurrent collateral connections between the nodes in the layer.
31
Multilayer Feedforward Networks
Diagram: inputs x1, x2, x3 pass through an input layer, hidden layer, and output layer (weights, scaling and activation functions); information propagates forward, the output is compared with the target, and the error is backpropagated.
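A compact sketch of the loop illustrated above: information propagates from the input layer through the hidden layer to the output layer, the output is compared with the target, and the error is backpropagated to update the weights. The XOR task, network size, and learning rate below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs
T = np.array([[0], [1], [1], [0]], dtype=float)               # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer with 3 units; each weight matrix includes a bias column
W1 = rng.normal(0, 1.0, (3, 3))     # hidden weights (2 inputs + bias)
W2 = rng.normal(0, 1.0, (1, 4))     # output weights (3 hidden + bias)
eta = 0.5

def forward(x):
    h = sigmoid(W1 @ np.append(x, 1.0))          # information propagation: input -> hidden
    y = sigmoid(W2 @ np.append(h, 1.0))          # hidden -> output
    return h, y

for epoch in range(20000):
    for x, t in zip(X, T):
        h, y = forward(x)
        # Error backpropagation: output error first, then hidden error
        delta_out = (y - t) * y * (1 - y)
        delta_hid = (W2[:, :3].T @ delta_out) * h * (1 - h)
        W2 -= eta * np.outer(delta_out, np.append(h, 1.0))
        W1 -= eta * np.outer(delta_hid, np.append(x, 1.0))

print(np.round([forward(x)[1][0] for x in X], 2))   # should approach [0, 1, 1, 0]
```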
32
Application Example: Autonomous Land Vehicle (ALV)
A neural network learns to steer an autonomous vehicle (ALVINN system): 960 input units, 4 hidden units, 30 output units; driving at speeds up to 70 miles per hour.
Figures: image from a forward-mounted camera; weight values for one of the hidden units.
33
Multilayer Networks and Their Decision Boundaries
Decision regions of a multilayer feedforward network. The network was trained to recognize 1 of 10 vowel sounds occurring in the context "h_d". The network input consists of two parameters, F1 and F2, obtained from a spectral analysis of the sound. The 10 network outputs correspond to the 10 possible vowel sounds.
34
Neural Nets for Face Recognition
A 960 x 3 x 4 network is trained on gray-level images of faces to predict whether a person is looking to their left, right, ahead, or up.
35
Properties of Neural Networks
36
Hidden Layer Representation for Identity Function
37
Hidden Layer Representation for Identity Function
The evolving sum of squared errors for each of the eight output units as the number of training iterations (epochs) increases
38
Hidden Layer Representation for Identity Function
The evolving hidden layer representation for the input string “ ”
39
Hidden Layer Representation for Identity Function
The evolving weights for one of the three hidden units
40
Generalization and Overfitting
Continuing training until the training error falls below some predetermined threshold is a poor strategy, since backpropagation is susceptible to overfitting. We need to measure the generalization accuracy over a validation set (distinct from the training set).
Two different types of overfitting:
Generalization error first decreases, then increases, even though the training error continues to decrease.
Generalization error decreases, then increases, then decreases again, while the training error continues to decrease.
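A minimal sketch of the validation-set strategy: train on the training set, monitor error on a held-out validation set, and keep the weights from the epoch with the lowest validation error. The polynomial-regression setup below is a toy stand-in, not one of the networks in these slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy 1-D data; an over-parameterized polynomial model can overfit it
x = rng.uniform(-1, 1, 60)
y = np.sin(np.pi * x) + rng.normal(0, 0.3, x.size)
Phi = np.vander(x, 12)                               # degree-11 polynomial features
train, val = np.arange(0, 40), np.arange(40, 60)     # training vs. validation split

w = np.zeros(Phi.shape[1])
eta, best_val, best_w, best_epoch = 0.05, np.inf, w.copy(), 0
for epoch in range(5000):
    # Gradient descent step on the training set only
    err = Phi[train] @ w - y[train]
    w -= eta * Phi[train].T @ err / train.size
    # Measure generalization on the validation set (never used for the gradient)
    val_mse = np.mean((Phi[val] @ w - y[val]) ** 2)
    if val_mse < best_val:
        best_val, best_w, best_epoch = val_mse, w.copy(), epoch
# Keep the weights from the epoch with the lowest validation error
print(f"best validation MSE {best_val:.3f} at epoch {best_epoch}")
```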
41
Two Kinds of Overfitting Phenomena
42
Deep Neural Networks
43
Learning to extract the orientation of a face patch (Salakhutdinov & Hinton, NIPS 2007)
44
The training and test sets for predicting face orientation
Training set: 100, 500, or 1000 labeled cases plus 11,000 unlabeled cases; test set: face patches from new people.
45
The root mean squared error in the orientation when combining GP’s with deep belief nets
Table: RMSE for a GP on the pixels, a GP on top-level features, and a GP on top-level features with fine-tuning, at 100, 500, and 1000 labels.
Conclusion: the deep features are much better than the pixels, and fine-tuning helps a lot.
46
Deep Autoencoders (Hinton & Salakhutdinov, 2006)
Deep autoencoders always looked like a really nice way to do non-linear dimensionality reduction, but it is very difficult to optimize them using backpropagation. We now have a much better way to optimize them: first train a stack of 4 RBMs, then "unroll" them, then fine-tune with backprop.
Architecture: 28x28 inputs -> 1000 neurons -> 500 neurons -> 250 neurons -> 30 linear units -> 250 neurons -> 500 neurons -> 1000 neurons -> 28x28 outputs.
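A rough sketch of the first stage of this recipe: training a single RBM with one-step contrastive divergence (CD-1). The layer sizes and data below are toy stand-ins (random binary vectors instead of 28x28 images, 100 hidden units instead of 1000); stacking several RBMs, unrolling, and fine-tuning with backprop would follow.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))   # clipped for numerical stability

# Toy stand-in for the 28x28 image data: random binary vectors
n_visible, n_hidden, n_samples = 784, 100, 100
data = (rng.random((n_samples, n_visible)) < 0.2).astype(float)

W = rng.normal(0, 0.01, (n_visible, n_hidden))
b_v, b_h = np.zeros(n_visible), np.zeros(n_hidden)
eta = 0.05

for epoch in range(5):
    for v0 in data:
        # Positive phase: sample hidden units given the data
        p_h0 = sigmoid(v0 @ W + b_h)
        h0 = (rng.random(n_hidden) < p_h0).astype(float)
        # Negative phase (CD-1): one step of reconstruction
        p_v1 = sigmoid(h0 @ W.T + b_v)
        p_h1 = sigmoid(p_v1 @ W + b_h)
        # Contrastive divergence updates
        W += eta * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
        b_v += eta * (v0 - p_v1)
        b_h += eta * (p_h0 - p_h1)

recon = sigmoid(sigmoid(data @ W + b_h) @ W.T + b_v)
print("mean reconstruction error:", np.round(np.mean((data - recon) ** 2), 4))
```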
47
A comparison of methods for compressing digit images to 30 real numbers.
Rows: real data; 30-D deep autoencoder; 30-D logistic PCA; 30-D PCA.
48
Retrieving documents that are similar to a query document
We can use an autoencoder to find low-dimensional codes for documents that allow fast and accurate retrieval of similar documents from a large set. We start by converting each document into a "bag of words": a 2000-dimensional vector that contains the counts for each of the 2000 commonest words.
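A small sketch of the bag-of-words conversion, with a toy corpus and a 10-word vocabulary standing in for the 2000 commonest words:

```python
from collections import Counter

# Toy corpus standing in for the business documents
docs = [
    "profits rose as the company expanded its market share",
    "the market fell and the company reported lower profits",
    "new research in neural networks and learning",
]

# Vocabulary: the most common words across the corpus (2000 in the slide, 10 here)
counts = Counter(w for d in docs for w in d.split())
vocab = [w for w, _ in counts.most_common(10)]

def bag_of_words(doc):
    """Count vector over the fixed vocabulary."""
    c = Counter(doc.split())
    return [c[w] for w in vocab]

for d in docs:
    print(bag_of_words(d))
```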
49
How to compress the count vector
We train the neural network to reproduce its input vector as its output. This forces it to compress as much information as possible into the 10 numbers in the central bottleneck. These 10 numbers are then a good way to compare documents.
Architecture: input vector of 2000 word counts -> 500 neurons -> 250 neurons -> 10 -> 250 neurons -> 500 neurons -> output vector of 2000 reconstructed counts.
50
Performance of the autoencoder at document retrieval
Train on bags of 2000 words for 400,000 training cases of business documents: first train a stack of RBMs, then fine-tune with backprop.
Test on a separate 400,000 documents: pick one test document as a query and rank-order all the other test documents using the cosine of the angle between codes. Repeat this using each of the 400,000 test documents as the query (requires 0.16 trillion comparisons).
Plot the number of retrieved documents against the proportion that are in the same hand-labeled class as the query document.
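The ranking step is a cosine comparison between code vectors; a sketch with random 10-D codes standing in for the autoencoder codes:

```python
import numpy as np

rng = np.random.default_rng(0)
codes = rng.normal(size=(1000, 10))          # stand-in for 10-D document codes

def retrieve(query_idx, codes, k=5):
    """Rank all other documents by the cosine of the angle between codes."""
    q = codes[query_idx]
    norms = np.linalg.norm(codes, axis=1) * np.linalg.norm(q)
    cos = codes @ q / norms
    cos[query_idx] = -np.inf                 # exclude the query itself
    return np.argsort(-cos)[:k]              # indices of the k most similar documents

print(retrieve(0, codes))
```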
51
Plot: proportion of retrieved documents in the same class as the query vs. number of documents retrieved.
52
First compress all documents to 2 numbers using a type of PCA, then use different colors for different document categories.
53
First compress all documents to 2 numbers
Then use different colors for different document categories.