PixelGAN Autoencoders

Alireza Makhzani, Brendan Frey
University of Toronto

Liu ze
Dec 30th, 2017
University of Science and Technology of China

Outline
1. Background
   PixelCNNs
   Variational Autoencoders
   Adversarial Autoencoders
2. PixelGAN Autoencoders
   Limitations of VAE/AAE
   Structure and Training
   Benefits of PixelGAN Autoencoders
3. Experiments
4. Conclusion

PixelCNNs

Conditional PixelCNNs
✦ Learn the image statistics directly at the pixel level.
✦ Good at modelling low-level pixel statistics.
✦ Conditional PixelCNNs can learn conditional densities given a conditioning vector h.
✦ Samples lack global structure.
✦ No latent representation.
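
PixelCNNs model pixels autoregressively, so each pixel is predicted only from the pixels above it and to its left; this is usually enforced with masked convolutions. The block below is a minimal NumPy sketch of such a mask, assuming a single-channel image. The helper name `causal_mask` is hypothetical and this is not the authors' implementation.

```python
import numpy as np

def causal_mask(kernel_size, mask_type="A"):
    """Build a PixelCNN-style mask for a square, odd-sized kernel.

    Positions above the centre row, and left of the centre within the
    centre row, are visible (1). Type "A" also hides the centre pixel;
    type "B" keeps it. Single-channel case only (an assumption here).
    """
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    mask = np.zeros((kernel_size, kernel_size), dtype=np.float32)
    c = kernel_size // 2
    mask[:c, :] = 1.0      # every row above the centre row
    mask[c, :c] = 1.0      # centre row, strictly left of the centre
    if mask_type == "B":
        mask[c, c] = 1.0   # later layers may see the centre position
    return mask
```

Multiplying a convolution kernel elementwise by this mask before applying it keeps the output at each position independent of pixels that come later in raster order.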

Variational Autoencoders
Good at capturing the global structure, but samples are blurry.

Adversarial Autoencoders
Code space of MNIST (figure panels): Gaussian prior; mixture of Gaussians prior.

Limitations of VAE/AAE
✦ All the image statistics are captured by the single latent vector.
VAE:
  p(z), latent variable: label, style, global and local statistics
  p(x|z), deterministic decoder (factorized Gaussians): none

Structure and Training
Cost function of PixelGAN = Reconstruction + Adversarial Cost
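
The two terms of the cost can be sketched in NumPy as follows. This is a sketch under stated assumptions, not the authors' code: pixels are taken to be binary with a Bernoulli likelihood, the adversarial term uses the standard GAN cross-entropy with a non-saturating encoder loss, and all function names are hypothetical.

```python
import numpy as np

def log_sigmoid(t):
    # Numerically stable log(sigmoid(t)).
    return -np.logaddexp(0.0, -t)

def reconstruction_cost(x, pixel_logits):
    """Mean negative log-likelihood of binary pixels x under the
    decoder's Bernoulli logits (assumed likelihood)."""
    ll = x * log_sigmoid(pixel_logits) + (1.0 - x) * log_sigmoid(-pixel_logits)
    return -ll.sum(axis=1).mean()

def discriminator_cost(d_prior_logits, d_code_logits):
    """Discriminator: score prior samples as real, encoder codes as fake."""
    return -(log_sigmoid(d_prior_logits).mean()
             + log_sigmoid(-d_code_logits).mean())

def encoder_adversarial_cost(d_code_logits):
    """Encoder: make its codes look like prior samples
    (non-saturating form, an assumption)."""
    return -log_sigmoid(d_code_logits).mean()
```

In training, the reconstruction cost would update the encoder and the PixelCNN decoder, while the adversarial costs alternate between the discriminator and the encoder.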

Benefits of PixelGAN Autoencoders
✦ The image statistics are captured jointly by the latent vector and the autoregressive decoder.
PixelGAN autoencoder:
  p(z): latent variable
  p(x|z): PixelCNN

Benefits of PixelGAN Autoencoders
✦ The image statistics are captured jointly by the latent vector and the autoregressive decoder.
PixelGAN (Gaussian):
  p(z), latent variable: global (low-frequency) statistics
  p(x|z), PixelCNN: local (high-frequency) statistics
PixelGAN (Categorical):
  p(z), latent variable: discrete (label) information
  p(x|z), PixelCNN: continuous (style) information
  Application: semi-supervised learning
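
The two priors differ only in what the discriminator sees as "real" codes. A minimal sketch of sampling from each follows, assuming a standard normal for the Gaussian case and a uniform distribution over one-hot vectors for the categorical case; the function names and the uniform choice are assumptions for illustration.

```python
import numpy as np

def sample_gaussian_prior(rng, batch_size, code_dim):
    """Continuous codes: one standard-normal vector per example."""
    return rng.standard_normal((batch_size, code_dim))

def sample_categorical_prior(rng, batch_size, num_categories):
    """Discrete codes: one one-hot vector per example, with the
    category drawn uniformly (assumed)."""
    idx = rng.integers(0, num_categories, size=batch_size)
    return np.eye(num_categories)[idx]

rng = np.random.default_rng(0)
z_gauss = sample_gaussian_prior(rng, batch_size=4, code_dim=2)
z_cat = sample_categorical_prior(rng, batch_size=4, num_categories=10)
```

With the categorical prior, the encoder is pushed towards one-hot codes, which is what lets the latent code act as a cluster or class indicator.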

Global vs. Local Decomposition

Code space of MNIST (figure).

PixelGAN Autoencoders with Categorical Priors

Discrete vs. Continuous Decomposition (Clustering)

Unsupervised Clustering
5% Error rate
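
The slide does not say how the clustering error rate is computed. One common convention, assumed here, is to map each cluster to the majority true label among its members and count the remaining members as errors. The function name `clustering_error` is hypothetical.

```python
import numpy as np

def clustering_error(cluster_ids, labels, num_clusters):
    """Error rate after mapping each cluster to its majority label.

    cluster_ids: predicted cluster index per example.
    labels: non-negative integer class label per example.
    """
    cluster_ids = np.asarray(cluster_ids)
    labels = np.asarray(labels)
    wrong = 0
    for k in range(num_clusters):
        members = labels[cluster_ids == k]
        if members.size == 0:
            continue  # empty clusters contribute no errors
        majority_count = np.bincount(members).max()
        wrong += members.size - majority_count
    return wrong / labels.size
```

For example, `clustering_error([0, 0, 0, 1, 1], [3, 3, 5, 7, 7], num_clusters=2)` counts one mismatched example out of five.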

Semi-supervised Learning

Semi-supervised Classification

Unsupervised Clustering

Conclusion
A. Proposed the PixelGAN autoencoder, a generative autoencoder that combines a generative PixelCNN with a GAN inference network that can impose arbitrary priors on the latent code.
B. Showed that imposing different distributions as the prior lets us learn a latent representation that captures the type of statistics we care about, while the remaining structure of the image is captured by the PixelCNN decoder.
C. Demonstrated the application of PixelGAN autoencoders in downstream tasks such as semi-supervised learning, and discussed other potential uses of these architectures, such as learning cross-domain relations between two different domains.

Thank you!
