CSC2515 Lecture 10, Part 2: Making time-series models with RBMs

Time series models
Inference is difficult in directed models of time series if we use non-linear distributed representations in the hidden units.
– It is hard to fit Dynamic Bayes Nets to high-dimensional sequences (e.g. motion capture data).
So people tend to avoid distributed representations and use much weaker methods (e.g. HMMs).

Time series models
If we really need distributed representations (which we nearly always do), we can make inference much simpler by using three tricks:
– Use an RBM for the interactions between the hidden and visible variables. This ensures that the main source of information wants the posterior to be factorial.
– Model short-range temporal information by allowing several previous frames to provide input to the hidden units and to the visible units. This leads to a temporal module that can be stacked.
– So we can use greedy learning to learn deep models of temporal structure.

The conditional RBM model (Sutskever & Hinton 2007)
Given the data and the previous hidden state, the hidden units at time t are conditionally independent.
– So online inference is very easy.
Learning can be done by using contrastive divergence.
– Reconstruct the data at time t from the inferred states of the hidden units and the earlier states of the visibles.
– The temporal connections can be learned as if they were additional biases.
[Figure: a conditional RBM, with the visible frames at times t-2 and t-1 feeding into the hidden and visible units at time t.]
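
The following is a minimal NumPy sketch of one CD-1 update for a binary conditional RBM of this kind; the array shapes, learning rate and names (crbm_cd1_step, A for the visible-to-visible weights, B for the visible-to-hidden temporal weights) are my assumptions, not code from the paper. The earlier frames only enter through dynamic biases, and the temporal weights are updated from the same statistics as ordinary biases.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def crbm_cd1_step(v_t, v_hist, W, A, B, b_v, b_h, lr=1e-3):
    """One CD-1 update for a binary conditional RBM.

    v_t    : (n_vis,)        current visible frame
    v_hist : (n_hist,)       concatenated earlier visible frames
    W      : (n_vis, n_hid)  visible-hidden weights
    A      : (n_hist, n_vis) autoregressive (visible-to-visible) weights
    B      : (n_hist, n_hid) temporal visible-to-hidden weights
    """
    # The earlier frames create dynamic biases for the current units.
    b_v_dyn = b_v + v_hist @ A
    b_h_dyn = b_h + v_hist @ B

    # Inference: given v_t and the history, the hiddens are conditionally
    # independent, so one sigmoid per unit is exact.
    h_prob = sigmoid(v_t @ W + b_h_dyn)
    h_sample = (rng.random(h_prob.shape) < h_prob).astype(float)

    # Reconstruct the frame at time t; the history stays clamped.
    v_recon = sigmoid(h_sample @ W.T + b_v_dyn)
    h_recon = sigmoid(v_recon @ W + b_h_dyn)

    # Contrastive-divergence updates; A and B are learned like extra biases.
    W   += lr * (np.outer(v_t, h_prob) - np.outer(v_recon, h_recon))
    A   += lr * np.outer(v_hist, v_t - v_recon)
    B   += lr * np.outer(v_hist, h_prob - h_recon)
    b_v += lr * (v_t - v_recon)
    b_h += lr * (h_prob - h_recon)
    return W, A, B, b_v, b_h
```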

Why the autoregressive connections do not cause problems
The autoregressive connections do not mess up contrastive divergence learning because:
– We know the initial state of the visible units, so we know the initial effect of the autoregressive connections.
– It is not necessary for the reconstructions to be at equilibrium with the hidden units.
– The important thing for contrastive divergence is to ensure that the hiddens are in equilibrium with the visibles whenever statistics are measured.

Generating from a learned model
The inputs from the earlier states of the visible units create dynamic biases for the hidden and current visible units.
Perform alternating Gibbs sampling for a few iterations between the hidden units and the current visible units.
– This picks new hidden and visible states that are compatible with each other and with the recent history.
[Figure: generation in the conditional RBM; the frames at t-2 and t-1 are fixed while the units at time t are sampled.]
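
Continuing the sketch above (and reusing its sigmoid and rng helpers), generating one new frame might look like this; the number of Gibbs iterations and the mean-field visible update are choices of mine rather than anything fixed by the slides.

```python
def crbm_generate_frame(v_hist, W, A, B, b_v, b_h, n_gibbs=30):
    """Sample the frame at time t given the recent history by alternating
    Gibbs sampling between the hiddens and the current visibles. The history
    only enters through the dynamic biases, which stay fixed throughout."""
    b_v_dyn = b_v + v_hist @ A
    b_h_dyn = b_h + v_hist @ B

    v = sigmoid(b_v_dyn)                       # start from the dynamic bias alone
    for _ in range(n_gibbs):
        h_prob = sigmoid(v @ W + b_h_dyn)
        h = (rng.random(h_prob.shape) < h_prob).astype(float)
        v = sigmoid(h @ W.T + b_v_dyn)         # mean-field update of the visibles
    return v
```

Appending the returned frame to the history and calling the function again rolls the model forward in time.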

Stacking temporal RBMs
Treat the hidden activities of the first-level TRBM as the data for the second-level TRBM.
– So when we learn the second level, we get connections across time in the first hidden layer.
After greedy learning, we can generate from the composite model:
– First, generate from the top-level model by using alternating Gibbs sampling between the current hiddens and visibles of the top-level model, using the dynamic biases created by the previous top-level visibles.
– Then do a single top-down pass through the lower layers, but using the autoregressive inputs coming from earlier states of each layer.
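
A rough sketch of the greedy stacking step, reusing the helpers above; the frame layout (a (T, n_vis) array) and the history length `order` are assumptions. The key point is that the inferred first-level hidden activities simply become the training sequence for a second conditional RBM.

```python
def level1_hidden_sequence(frames, order, W, B, b_h):
    """Infer first-level hidden activities for every frame with enough history.
    The returned sequence is treated as 'data' for a second-level temporal RBM,
    which is what gives the first hidden layer connections across time."""
    hiddens = []
    for t in range(order, len(frames)):
        v_hist = np.concatenate(frames[t - order:t])   # the `order` earlier frames
        hiddens.append(sigmoid(frames[t] @ W + b_h + v_hist @ B))
    return np.asarray(hiddens)

# Greedy learning (sketch): train the first CRBM on the raw frames with
# crbm_cd1_step, freeze its weights, then train a second CRBM on
# level1_hidden_sequence(frames, order, W, B, b_h) in exactly the same way.
```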

An application to modeling motion capture data (Taylor, Roweis & Hinton, 2007)
Human motion can be captured by placing reflective markers on the joints and then using lots of infrared cameras to track the 3-D positions of the markers.
Given a skeletal model, the 3-D positions of the markers can be converted into the joint angles plus 6 parameters that describe the 3-D position and the roll, pitch and yaw of the pelvis.
– We only represent changes in yaw, because physics doesn’t care about its value and we want to avoid circular variables.
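
Representing only the change in yaw could be done with a small preprocessing step like the one below; the wrapping convention (increments in [-pi, pi)) is my assumption.

```python
import numpy as np

def yaw_to_increments(yaw):
    """Replace absolute pelvis yaw (radians) with per-frame changes, wrapped to
    [-pi, pi), so the model never has to handle a circular variable directly."""
    delta = np.diff(yaw, prepend=yaw[0])          # first increment is zero
    return (delta + np.pi) % (2.0 * np.pi) - np.pi
```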

Modeling multiple types of motion
We can easily learn to model many styles of walking in a single model.
– This means we can share a lot of knowledge.
– It should also make it much easier to learn nice transitions between styles.
Because we can do online inference (slightly incorrectly), we can fill in missing markers in real time.
[Figure: the conditional RBM with a style label provided as an extra input, alongside the frames at t-2 and t-1, to the units at time t.]
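
One minimal way to feed in the style label shown in the figure, sketched under my own assumptions (the published model may condition on style differently): append a one-hot label to the history vector, so the A and B matrices of the earlier sketch also learn style-dependent dynamic biases.

```python
import numpy as np

n_hist, n_styles, style_id = 30, 5, 2      # hypothetical sizes and chosen style
v_hist = np.zeros(n_hist)                  # recent frames, as in the CRBM sketch
style = np.eye(n_styles)[style_id]         # one-hot style label
conditioning = np.concatenate([v_hist, style])
# A and B would then need n_hist + n_styles rows, so the dynamic biases depend
# on the style label as well as on the recent history.
```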

Show Graham Taylor’s movies available at