Learning Multiplicative Interactions (many slides from Hinton)

Two different meanings of “multiplicative” If we take two density models and multiply together their probability distributions at each point in data-space, we get a “product of experts”. –The product of two Gaussian experts is a Gaussian. If we take two variables and multiply them together to provide input to a third variable, we get a “multiplicative interaction”. –The distribution of the product of two Gaussian-distributed variables is NOT Gaussian; it is a heavy-tailed distribution. One Gaussian determines the standard deviation of the other Gaussian. –Heavy-tailed distributions are the signatures of multiplicative interactions between latent variables.
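
A quick numeric illustration of the contrast (a minimal numpy sketch, not from the slides; all names are illustrative): multiplying two Gaussian densities gives another Gaussian, while multiplying two Gaussian-distributed variables gives a heavy-tailed variable with large excess kurtosis.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# 1) Multiplying two Gaussian *densities* (a product of experts) gives a Gaussian:
#    N(0, s1^2) * N(0, s2^2) is proportional to N(0, s^2) with 1/s^2 = 1/s1^2 + 1/s2^2.
s1, s2 = 1.0, 2.0
s_poe = 1.0 / np.sqrt(1.0 / s1**2 + 1.0 / s2**2)
print(f"product-of-experts std: {s_poe:.3f}")             # still a Gaussian, just sharper

# 2) Multiplying two Gaussian *variables* gives a heavy-tailed variable:
z = rng.standard_normal(n) * rng.standard_normal(n)
excess_kurtosis = np.mean(z**4) / np.mean(z**2)**2 - 3.0
print(f"excess kurtosis of x*y: {excess_kurtosis:.2f}")   # about 6; a Gaussian would give 0
```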

Learning multiplicative interactions It is fairly easy to learn multiplicative interactions if all of the variables are observed. –This is possible if we control the variables used to create a training set (e.g. pose, lighting, identity …). It is also easy to learn energy-based models in which all but one of the terms in each multiplicative interaction are observed. –Inference is still easy. If more than one of the terms in each multiplicative interaction is unobserved, the interactions between hidden variables make inference difficult. –Alternating Gibbs sampling can be used if the latent variables form a bipartite graph.

Higher-order Boltzmann machines (Sejnowski, ~1986) The usual energy function is quadratic in the states: $-E = \sum_{i<j} w_{ij}\, s_i s_j$ (plus bias terms). But we could use higher-order interactions: $-E = \sum_{i,j,h} w_{ijh}\, s_i s_j s_h$. Hidden unit h acts as a switch: when h is on, it switches in the pairwise interaction between unit i and unit j. –Units i and j can also be viewed as switches that control the pairwise interactions between j and h or between i and h.
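
A minimal sketch of the two energy functions above for binary states, with bias terms omitted (the function and variable names are illustrative, not from the lecture):

```python
import numpy as np

def quadratic_energy(s, W):
    """Standard Boltzmann machine: E = -sum_{i<j} W[i, j] * s[i] * s[j]."""
    return -0.5 * s @ W @ s          # assumes W symmetric with zero diagonal

def third_order_energy(s, W3):
    """Higher-order machine: E = -sum_{i,j,h} W3[i, j, h] * s[i] * s[j] * s[h].
    Each unit acts as a switch that gates the pairwise interaction of the other two."""
    return -np.einsum('i,j,h,ijh->', s, s, s, W3)

rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=8).astype(float)          # binary unit states
W = np.triu(rng.standard_normal((8, 8)), 1)
W = W + W.T                                           # symmetric, zero diagonal
W3 = rng.standard_normal((8, 8, 8))
print(quadratic_energy(s, W), third_order_energy(s, W3))
```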

Using higher-order Boltzmann machines to model image transformations (Memisevic and Hinton, 2007) A global transformation specifies which pixel goes to which other pixel. Conversely, each pair of similar-intensity pixels, one in each image, votes for a particular global transformation. (Diagram: hidden transformation units connecting image(t) and image(t+1).)

Using higher-order Boltzmann machines to model image transformations

Making the reconstruction easier Condition on the first image so that only one visible group needs to be reconstructed. –Given the hidden states and the previous image, the pixels in the second image are conditionally independent. (Diagram: transformation units connecting image(t) and image(t+1).)
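
A sketch of what conditioning on image(t) buys us, written with an unfactored three-way weight tensor W[i, j, k] (pixel i of image(t), pixel j of image(t+1), hidden unit k); the array names and shapes are assumptions for illustration, not code from Memisevic and Hinton:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def hidden_probs(x, y, W):
    """p(h_k = 1 | image(t) = x, image(t+1) = y): each hidden (transformation)
    unit pools the products x_i * y_j of pixel pairs through its slice of W."""
    return sigmoid(np.einsum('i,j,ijk->k', x, y, W))

def next_image_probs(x, h, W):
    """p(y_j = 1 | image(t) = x, h): given x and the hidden states, the pixels of
    image(t+1) are conditionally independent, so one sigmoid per pixel suffices."""
    return sigmoid(np.einsum('i,k,ijk->j', x, h, W))
```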

The main problem with 3-way interactions

Factoring three-way interactions We use factors that correspond to 3-way outer products. Unfactored: $-E = \sum_{i,j,h} s_i s_j s_h\, w_{ijh}$. Factored: $-E = \sum_f \sum_{i,j,h} s_i s_j s_h\, w_{if} w_{jf} w_{hf}$.
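
The factored energy can be computed without ever forming the cubic tensor: project each group of units onto the factors and multiply the three projections factor-wise. A minimal sketch under assumed names Wx, Wy, Wh:

```python
import numpy as np

def factored_three_way_energy(x, y, h, Wx, Wy, Wh):
    """E = -sum_f (x @ Wx)[f] * (y @ Wy)[f] * (h @ Wh)[f].
    Equivalent to using the full tensor w_ijh = sum_f Wx[i,f] * Wy[j,f] * Wh[h,f],
    but the cubic tensor is never formed."""
    return -np.sum((x @ Wx) * (y @ Wy) * (h @ Wh))
```

With F factors the parameter count drops from the product of the three group sizes to roughly the sum of the group sizes times F, which is the point of the factorization.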

Factored 3-Way Restricted Boltzmann Machines For Modeling Natural Images (Ranzato, Krizhevsky and Hinton, 2010) A joint 3-way model of the covariance structure of natural images: the visible units are two identical copies of the same image.
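
Because both visible copies carry the same image, each factor multiplies a filter output with itself, so the hidden units see squared filter responses and therefore model covariance rather than pixel means. A hedged sketch of that conditional; C, P and b are illustrative names, and the parameterization in the paper differs in details (e.g. normalization and sign constraints on P):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def covariance_hidden_probs(v, C, P, b):
    """With the same image v in both visible copies, each factor's output is a
    squared filter response (v @ C)**2; the hidden units gate these squared
    responses through factor-to-hidden weights P, plus hidden biases b."""
    factor_out = (v @ C) ** 2
    return sigmoid(factor_out @ P + b)
```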

A powerful module for deep learning

Producing reconstructions using hybrid Monte Carlo
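
For reference, a generic hybrid (Hamiltonian) Monte Carlo update of the kind such reconstructions rely on: sample a momentum, simulate the dynamics with leapfrog steps, then accept or reject. This is a standard textbook sketch, not the exact sampler from the lecture; energy and grad_energy are assumed callables.

```python
import numpy as np

def hmc_update(x0, energy, grad_energy, n_steps=20, step_size=0.01, rng=None):
    """One hybrid Monte Carlo update for a continuous energy-based model."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    p = rng.standard_normal(x.shape)             # sample a Gaussian momentum
    h0 = energy(x) + 0.5 * np.sum(p ** 2)        # initial Hamiltonian
    # leapfrog integration of the dynamics
    p -= 0.5 * step_size * grad_energy(x)
    for _ in range(n_steps - 1):
        x += step_size * p
        p -= step_size * grad_energy(x)
    x += step_size * p
    p -= 0.5 * step_size * grad_energy(x)
    h1 = energy(x) + 0.5 * np.sum(p ** 2)
    # Metropolis accept/reject to correct for integration error
    return x if rng.random() < np.exp(h0 - h1) else x0
```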

Modeling the joint density of two images under a variety of transformations (Hinton et al., 2011) describe a generative model of the relationship between two images. The model is defined as a factored three-way Boltzmann machine, in which hidden variables collaborate to define the joint correlation matrix for image pairs.

Model

Three-way contrastive divergence
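
A rough sketch of a CD-1 update for the factored, conditional three-way model (predicting image(t+1) from image(t)); biases are omitted and the names Wx, Wy, Wh are assumptions, so this shows the shape of the three-way statistics rather than the exact update from the slides.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def cd1_step(x, y, Wx, Wy, Wh, lr=1e-3, rng=None):
    """One contrastive-divergence step, conditioned on image(t) = x."""
    rng = np.random.default_rng() if rng is None else rng
    fx = x @ Wx                                    # factor inputs from image(t)

    # positive phase: hidden probabilities given the real image pair
    h0 = sigmoid(((y @ Wy) * fx) @ Wh.T)
    # negative phase: reconstruct image(t+1), then recompute the hidden units
    h_samp = (rng.random(h0.shape) < h0).astype(float)
    y1 = sigmoid(((h_samp @ Wh) * fx) @ Wy.T)
    h1 = sigmoid(((y1 @ Wy) * fx) @ Wh.T)

    # per-factor statistics: <data> - <reconstruction>
    dWy = np.outer(y, (h0 @ Wh) * fx) - np.outer(y1, (h1 @ Wh) * fx)
    dWh = np.outer(h0, (y @ Wy) * fx) - np.outer(h1, (y1 @ Wy) * fx)
    dWx = np.outer(x, (y @ Wy) * (h0 @ Wh)) - np.outer(x, (y1 @ Wy) * (h1 @ Wh))
    Wx += lr * dWx; Wy += lr * dWy; Wh += lr * dWh
    return Wx, Wy, Wh
```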

Thank you