CSC321 Winter 2007, Lecture 21: Some Demonstrations of Restricted Boltzmann Machines (Geoffrey Hinton)

A simple learning module: a Restricted Boltzmann Machine
We restrict the connectivity to make learning easier:
- Only one layer of hidden units. We will worry about multiple layers later.
- No connections between hidden units.
In an RBM, the hidden units are conditionally independent given the visible states. So we can quickly get an unbiased sample from the posterior distribution over hidden “causes” when given a data-vector.
[Diagram: a layer of hidden units (j) connected to a layer of visible units (i), with no connections within either layer.]
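As a quick illustration (not part of the original slide), here is a minimal NumPy sketch of that inference step: because the hidden units are conditionally independent given the visibles, one matrix product plus an element-wise sigmoid gives every p(h_j = 1 | v) at once, and independent coin flips give an unbiased sample. All names are illustrative.

```python
import numpy as np

def sample_hidden(v, W, b_hid, rng=None):
    """Sample the hidden layer of an RBM given a visible vector.

    v: visible vector, shape (n_vis,); W: weights, shape (n_vis, n_hid);
    b_hid: hidden biases, shape (n_hid,).
    """
    rng = rng or np.random.default_rng()
    p_h = 1.0 / (1.0 + np.exp(-(v @ W + b_hid)))     # p(h_j = 1 | v) for all j at once
    h = (rng.random(p_h.shape) < p_h).astype(float)  # independent Bernoulli samples
    return h, p_h
```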

Weights → Energies → Probabilities
- Each possible joint configuration of the visible and hidden units has a Hopfield “energy”. The energy is determined by the weights and biases.
- The energy of a joint configuration of the visible and hidden units determines the probability that the network will choose that configuration.
- By manipulating the energies of joint configurations, we can manipulate the probabilities that the model assigns to visible vectors. This gives a very simple and very effective learning algorithm.
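For reference, the standard RBM energy and the probabilities it defines can be written as follows (notation assumed rather than taken from the slide: visible states v_i with biases a_i, hidden states h_j with biases b_j, weights w_ij):

```latex
E(\mathbf{v},\mathbf{h}) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i h_j w_{ij},
\qquad
p(\mathbf{v},\mathbf{h}) = \frac{e^{-E(\mathbf{v},\mathbf{h})}}{\sum_{\mathbf{v}',\mathbf{h}'} e^{-E(\mathbf{v}',\mathbf{h}')}},
\qquad
p(\mathbf{v}) = \sum_{\mathbf{h}} p(\mathbf{v},\mathbf{h}).
```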

How to learn a set of features that are good for reconstructing images of the digit 2
- Data phase (reality): drive 50 binary feature neurons from a 16 x 16 pixel image and increment the weights between an active pixel and an active feature.
- Reconstruction phase (lower energy than reality): reconstruct the image from the binary features and decrement the weights between an active pixel and an active feature.
[Diagram: two copies of the same two-layer network, a 16 x 16 pixel image below 50 binary feature neurons; one copy driven by the data, the other by the reconstruction.]
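The increment/decrement recipe on this slide is the one-step contrastive divergence (CD-1) update. Below is a hedged NumPy sketch of a single update for one image; the helper names, the learning rate, and the use of probabilities rather than binary samples in the negative phase are assumptions for illustration, not taken from the slide.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a_vis, b_hid, lr=0.1, rng=None):
    """One CD-1 step: push energy down on the data, up on the reconstruction."""
    rng = rng or np.random.default_rng()
    # Positive phase: infer features from a real image (increment term).
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: reconstruct the image, then re-infer the features (decrement term).
    p_v1 = sigmoid(h0 @ W.T + a_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)
    # Increment weights between active pixels and active features on the data,
    # decrement them on the reconstruction.
    W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
    a_vis += lr * (v0 - p_v1)
    b_hid += lr * (p_h0 - p_h1)
    return W, a_vis, b_hid
```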

The weights of the 50 feature detectors
We start with small random weights to break symmetry.

The final 50 x 256 weights
Each neuron grabs a different feature.

[Figure panels: feature, data, reconstruction.]

How well can we reconstruct the digit images from the binary feature activations?
[Figure: two panels, each showing data and the reconstruction from the activated binary features. One panel shows new test images from the digit class that the model was trained on; the other shows images from an unfamiliar digit class (the network tries to see every image as a 2).]

Some features learned in the first hidden layer for all digits

And now for something a bit more realistic
- Handwritten digits are convenient for research into shape recognition, but natural images of outdoor scenes are much more complicated.
- If we train a network on patches from natural images, does it produce sets of features that look like the ones found in real brains?

A network with local connectivity
The local connectivity between the two hidden layers induces a topography on the hidden units.
[Diagram: an image connected to the first hidden layer by global connectivity; the two hidden layers connected by local connectivity.]
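For illustration only, one way to impose this kind of local connectivity is to multiply the weight matrix between the two hidden layers by a fixed binary mask after every update. The sketch below lays the units of each layer out on a 1-D line for simplicity (the actual model uses a 2-D image topography), and every name in it is hypothetical.

```python
import numpy as np

def local_mask(n_lower, n_upper, radius=0.05):
    """Binary mask: each upper-layer unit connects only to lower-layer units within `radius` of its position."""
    lower_pos = np.linspace(0.0, 1.0, n_lower)   # positions of lower-layer units on [0, 1]
    upper_pos = np.linspace(0.0, 1.0, n_upper)   # positions of upper-layer units on [0, 1]
    return (np.abs(lower_pos[:, None] - upper_pos[None, :]) <= radius).astype(float)

# Usage: W_local = W * local_mask(W.shape[0], W.shape[1]), reapplied after each weight update.
```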

Features learned by a net that sees 100,000 patches of natural images. The feature neurons are locally connected to each other.
Osindero, Welling and Hinton (2006), Neural Computation.

Training a deep network
- First train a layer of features that receive input directly from the pixels.
- Then treat the activations of the trained features as if they were pixels and learn features of features in a second hidden layer (a sketch of this greedy procedure is given below).
- It can be proved that each time we add another layer of features we get a better model of the set of training images, i.e. we assign lower energy to the real data and higher energy to all other possible images.
- The proof is complicated. It uses variational free energy, a method that physicists use for analyzing complicated non-equilibrium systems. But it is based on a neat equivalence.
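A hedged sketch of this greedy layer-by-layer procedure, assuming a single-RBM trainer `train_rbm` and an RBM object with a `hidden_probabilities` method (both hypothetical, e.g. wrappers around the CD-1 update sketched earlier):

```python
def train_deep_net(data, layer_sizes, train_rbm):
    """Greedy layer-wise training: each RBM's hidden activities become the next RBM's 'pixels'."""
    rbms, layer_input = [], data
    for n_hidden in layer_sizes:
        rbm = train_rbm(layer_input, n_hidden)               # fit one RBM on the current representation
        rbms.append(rbm)
        layer_input = rbm.hidden_probabilities(layer_input)  # treat the trained features as data for the next layer
    return rbms
```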

The generative model after learning 3 layers
To generate data:
- Get an equilibrium sample from the top-level RBM by performing alternating Gibbs sampling.
- Perform a top-down pass to get states for all the other layers.
So the lower-level bottom-up connections are not part of the generative model.
[Diagram: a stack of layers h3, h2, h1, data; the top two layers form the top-level RBM, with top-down generative connections below.]
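A hedged sketch of this generative procedure, assuming trained RBM objects with `random_hidden_state`, `sample_hidden`, and `sample_visible` methods (hypothetical names):

```python
def generate(rbms, n_gibbs=200):
    """Generate a fantasy data vector from a stack of trained RBMs."""
    top = rbms[-1]
    h = top.random_hidden_state()          # arbitrary starting state for the Markov chain
    for _ in range(n_gibbs):               # alternating Gibbs sampling in the top-level RBM
        v = top.sample_visible(h)
        h = top.sample_hidden(v)
    state = v                              # approximate equilibrium sample for the layer below the top
    for rbm in reversed(rbms[:-1]):        # single top-down pass using the generative weights
        state = rbm.sample_visible(state)
    return state                           # a sample at the data layer
```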