CS 2770: Computer Vision Generative Adversarial Networks

Slides:



Advertisements
Similar presentations
Unsupervised Learning Clustering K-Means. Recall: Key Components of Intelligent Agents Representation Language: Graph, Bayes Nets, Linear functions Inference.
Advertisements

Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?
Advanced topics.
CSCI 347 / CS 4206: Data Mining Module 07: Implementations Topic 03: Linear Models.
Discriminative and generative methods for bags of features
Prénom Nom Document Analysis: Data Analysis and Clustering Prof. Rolf Ingold, University of Fribourg Master course, spring semester 2008.
Pattern Recognition. Introduction. Definitions.. Recognition process. Recognition process relates input signal to the stored concepts about the object.
Classifiers Given a feature representation for images, how do we learn a model for distinguishing features from different classes? Zebra Non-zebra Decision.
School of Engineering and Computer Science Victoria University of Wellington Copyright: Peter Andreae, VUW Image Recognition COMP # 18.
CS 1699: Intro to Computer Vision Support Vector Machines Prof. Adriana Kovashka University of Pittsburgh October 29, 2015.
Convolutional Restricted Boltzmann Machines for Feature Learning Mohammad Norouzi Advisor: Dr. Greg Mori Simon Fraser University 27 Nov
SUPERVISED AND UNSUPERVISED LEARNING Presentation by Ege Saygıner CENG 784.
Deep Learning Overview Sources: workshop-tutorial-final.pdf
Conditional Generative Adversarial Networks
Neural networks and support vector machines
Big data classification using neural network
Hybrid Deep Learning for Reflectance Confocal Microscopy Skin Images
Visualizing High-Dimensional Data
Generative Adversarial Nets
Generative Adversarial Network (GAN)
Semi-Supervised Clustering
Learning Deep Generative Models by Ruslan Salakhutdinov
Deep Feedforward Networks
Environment Generation with GANs
Deep Neural Net Scenery Generation
Dhruv Batra Georgia Tech
Automatic Lung Cancer Diagnosis from CT Scans (Week 2)
Convolutional Neural Fabrics by Shreyas Saxena, Jakob Verbeek
Automatic Lung Cancer Diagnosis from CT Scans (Week 4)
Matt Gormley Lecture 16 October 24, 2016
Generative Adversarial Networks
CSCI 5922 Neural Networks and Deep Learning Generative Adversarial Networks Mike Mozer Department of Computer Science and Institute of Cognitive Science.
Neural networks (3) Regularization Autoencoder
Machine Learning Basics
Deep learning and applications to Natural language processing
CS 188: Artificial Intelligence
CSCI 5922 Neural Networks and Deep Learning: NIPS Highlights
Ying Kin Yu, Kin Hong Wong and Siu Hang Or
Generative adversarial networks (GANs) for edge detection
Unsupervised Conditional Generation
Authors: Jun-Yan Zhu*, Taesun Park*, Phillip Isola, Alexei A. Efros
Object detection as supervised classification
Presenter: Hajar Emami
Adversarially Tuned Scene Generation
"Playing Atari with deep reinforcement learning."
Low Dose CT Image Denoising Using WGAN and Perceptual Loss
CSCI 5822 Probabilistic Models of Human and Machine Learning
Goodfellow: Chap 6 Deep Feedforward Networks
CS 2770: Computer Vision Intro to Visual Recognition
CS 2750: Machine Learning Support Vector Machines
Deep learning Introduction Classes of Deep Learning Networks
Goodfellow: Chapter 14 Autoencoders
Image recognition: Defense adversarial attacks
David Healey BYU Capstone Course 15 Nov 2018
Image to Image Translation using GANs
Semantic segmentation
Neural Networks Geoff Hulten.
GAN Applications.
Lip movement Synthesis from Text
CS4670: Intro to Computer Vision
Autoencoders hi shea autoencoders Sys-AI.
Neural networks (3) Regularization Autoencoder
Autoencoders Supervised learning uses explicit labels/correct output in order to train a network. E.g., classification of images. Unsupervised learning.
Ch 14. Generative adversarial networks (GANs) for edge detection
Introduction to Neural Networks
Semantic Segmentation
Angel A. Cantu, Nami Akazawa Department of Computer Science
Text-to-speech (TTS) Traditional approaches (before 2016) Neural TTS
CSC 578 Neural Networks and Deep Learning
Goodfellow: Chapter 14 Autoencoders
Presentation transcript:

CS 2770: Computer Vision Generative Adversarial Networks Prof. Adriana Kovashka University of Pittsburgh April 16, 2019

Plan for this lecture Generative models: What are they? Technique: Generative Adversarial Networks Applications Conditional GANs Cycle-consistency loss Dealing with sparse data, progressive training

Supervised vs Unsupervised Learning Data: x Just data, no labels! Goal: Learn some underlying hidden structure of the data Examples: Clustering, dimensionality reduction, feature learning, density estimation, etc. K-means clustering Lecture 13 - 10 Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Supervised vs Unsupervised Learning Data: x Just data, no labels! Goal: Learn some underlying hidden structure of the data Examples: Clustering, dimensionality reduction, feature learning, density estimation, etc. Autoencoders (Feature learning) Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Generative Models Lecture 13 - Training data ~ pdata(x) Generated samples ~ pmodel(x) Want to learn pmodel(x) similar to pdata(x) Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Generative Models Lecture 13 - Training data ~ pdata(x) Generated samples ~ pmodel(x) Want to learn pmodel(x) similar to pdata(x) Addresses density estimation, a core problem in unsupervised learning Several flavors: Explicit density estimation: explicitly define and solve for pmodel(x) Implicit density estimation: learn model that can sample from pmodel(x) w/o explicitly defining it Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Why Generative Models? Lecture 13 - - Realistic samples for artwork, super-resolution, colorization, etc. Generative models can be used to enhance training datasets with diverse synthetic data Generative models of time-series data can be used for simulation Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Adapted from Serena Young

Taxonomy of Generative Models Direct GAN Generative models Explicit density Implicit density Markov Chain Tractable density Approximate density GSN Fully Visible Belief Nets NADE MADE PixelRNN/CNN Variational Markov Chain Variational Autoencoder Boltzmann Machine Figure copyright and adapted from Ian Goodfellow, Tutorial on Generative Adversarial Networks, 2017. Change of variables models (nonlinear ICA) Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Generative Adversarial Networks Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Problem: Want to sample from complex, high-dimensional training distribution. No direct way to do this! Solution: Sample from a simple distribution, e.g. random noise. Learn transformation to training distribution. Q: What can we use to represent this complex transformation? Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Generative Adversarial Networks Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Problem: Want to sample from complex, high-dimensional training distribution. No direct way to do this! Solution: Sample from a simple distribution, e.g. random noise. Learn transformation to training distribution. Output: Sample from training distribution Q: What can we use to represent this complex transformation? Generator Network A: A neural network! Input: Random noise z Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Training GANs: Two-player game Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Training GANs: Two-player game Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images Real or Fake Discriminator Network Fake Images (from generator) Real Images (from training set) Generator Network Random noise z Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Adversarial Networks Framework D tries to output 1 Differentiable function D x sampled from data D tries to output 0 Differentiable function D x sampled from model Differentiable function G Input noise Z Ian Goodfellow

Adversarial Networks Framework Generator 𝑥 ~ 𝐺(𝑧) Discriminator Real vs. Fake [Goodfellow et al. 2014] Jun-Yan Zhu

Training GANs: Two-player game Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images Train jointly in minimax game Minimax objective function: Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Training GANs: Two-player game Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images Train jointly in minimax game Discriminator outputs likelihood in (0,1) of real image Minimax objective function: Discriminator output for real data x Discriminator output for generated fake data G(z) Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Training GANs: Two-player game Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images Train jointly in minimax game Discriminator outputs likelihood in (0,1) of real image Minimax objective function: Discriminator output for real data x Discriminator output for generated fake data G(z) Discriminator (θd) wants to maximize objective such that D(x) is close to 1 (real) and D(G(z)) is close to 0 (fake) Generator (θg) wants to minimize objective such that D(G(z)) is close to 1 (discriminator is fooled into thinking generated G(z) is real) Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Training GANs: Two-player game Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Minimax objective function: Alternate between: 1. Gradient ascent on discriminator 2. Gradient descent on generator Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Training GANs: Two-player game Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Minimax objective function: Alternate between: 1. Gradient ascent on discriminator Gradient signal dominated by region where sample is already good 2. Gradient descent on generator When sample is likely fake, want to learn from it to improve generator. But gradient in this region is relatively flat! In practice, optimizing this generator objective does not work well! 11 Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Training GANs: Two-player game Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Minimax objective function: Alternate between: 1. Gradient ascent on discriminator 2. Instead: Gradient ascent on generator, different objective Instead of minimizing likelihood of discriminator being correct, now maximize likelihood of discriminator being wrong. Same objective of fooling discriminator, but now higher gradient signal for bad samples => works much better! Standard in practice. High gradient signal Low gradient signal Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Training GANs: Two-player game Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Putting it together: GAN training algorithm Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Training GANs: Two-player game Ian Goodfellow et al., “Generative Adversarial Nets”, NIPS 2014 Generator network: try to fool the discriminator by generating real-looking images Discriminator network: try to distinguish between real and fake images Real or Fake Discriminator Network Fake Images (from generator) Random noise Real Images (from training set) Generator Network After training, use generator network to generate new images z Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

GAN training is challenging Vanishing gradient – when discriminator is very good Mode collapse – too little diversity in the samples generated Lack of convergence because hard to reach Nash equilibrium Loss metric doesn’t always correspond to image quality; Frechet Inception Distance (FID) is a decent choice

Alternative loss functions https://github.com/hwalsuklee/tensorflow-generative-model-collections

WGAN vs GAN https://medium.com/@jonathan_hui/gan-wasserstein-gan-wgan-gp-6a1a2aa1b490

Tips and tricks Use batchnorm, ReLU Regularize norm of gradients Use one of the new loss functions Add noise to inputs or labels Append image similarity to avoid mode collapse Use labels when available (CGAN) … https://github.com/soumith/talks/blob/master/2017-ICCV_Venice/How_To_Train_a_GAN.pdf https://towardsdatascience.com/gan-ways-to-improve-gan-performance-acf37f9f59b

Generative Adversarial Nets: Interpretable Vector Math Radford et al, ICLR 2016 Smiling woman Neutral woman Neutral man Samples from the model Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Generative Adversarial Nets: Interpretable Vector Math Radford et al, ICLR 2016 Smiling woman Neutral woman Neutral man Samples from the model Average Z vectors, do arithmetic Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Generative Adversarial Nets: Interpretable Vector Math Radford et al, ICLR 2016 Smiling woman Neutral woman Neutral man Smiling Man Samples from the model Average Z vectors, do arithmetic Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Generative Adversarial Nets: Interpretable Vector Math Glasses man No glasses man No glasses woman Radford et al, ICLR 2016 Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

Generative Adversarial Nets: Interpretable Vector Math Glasses man No glasses man No glasses woman Woman with glasses Radford et al, ICLR 2016 Lecture 13 - Fei-Fei Li & Justin Johnson & Serena Yeung May 18, 2017 Serena Young

What is in this image? (Yeh et al., 2016) Ian Goodfellow

Generative modeling reveals a face (Yeh et al., 2016) Ian Goodfellow

Artificial Fashion: vue.ai ✓ Ian Goodfellow

Celebrities Who Never Existed (Goodfell ow 2017) Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Creative Adversarial Networks (Elgammal et al., 2017) Ian Goodfellow

GANs for Privacy (Action Detection) Ren et al., “Learning to Anonymize Faces for Privacy Preserving Action Detection”, ECCV 2018

Plan for this lecture Generative models: What are they? Technique: Generative Adversarial Networks Applications Conditional GANs Cycle-consistency loss Dealing with sparse data, progressive training

Conditional GANs https://medium.com/@jonathan_hui/gan-cgan-infogan-using-labels-to-improve-gan-8ba4de5f9c3d

GANs D G 𝐺: generate fake samples that can fool 𝐷 real or fake? Discriminator x G(x) D Generator G 𝐺: generate fake samples that can fool 𝐷 𝐷: classify fake samples vs. real images [Goodfellow et al. 2014] Jun-Yan Zhu

Conditional GANs G D x G(x) real or fake pair ? Adapted from Jun-Yan Zhu

Edges → Images Input Output Input Output Input Output Edges from [Xie & Tu, 2015] Pix2pix / CycleGAN

Sketches → Images Trained on Edges → Images Input Output Input Output Data from [Eitz, Hays, Alexa, 2012] Pix2pix / CycleGAN

https://affinelayer.com/pixsrv/ #edges2cats [Christopher Hesse] @gods_tail @matthematician Vitaly Vidmirov @vvid Ivy Tasi @ivymyt https://affinelayer.com/pixsrv/ Pix2pix / CycleGAN

Paired Unpaired … … … Jun-Yan Zhu

… … … Cycle Consistency Discriminator D Y : 𝐿 𝐺𝐴𝑁 𝐺 𝑥 ,𝑦 Real zebras vs. generated zebras Zhu et al., “Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks”, ICCV 2017

… … Cycle Consistency Discriminator D Y : 𝐿 𝐺𝐴𝑁 𝐺 𝑥 ,𝑦 Real zebras vs. generated zebras Discriminator D X : 𝐿 𝐺𝐴𝑁 𝐹 𝑦 ,𝑥 Real horses vs. generated horses Zhu et al., “Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks”, ICCV 2017

Cycle Consistency Forward cycle loss: F G x −x 1 x G(x) F(G x ) Zhu et al., “Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks”, ICCV 2017

Cycle Consistency Forward cycle loss: F G x −x 1 G(x) F(G x ) x Small cycle loss Large cycle loss Helps cope with mode collapse Zhu et al., “Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks”, ICCV 2017

Training Details: Objective Zhu et al., “Unpaired Image-To-Image Translation Using Cycle-Consistent Adversarial Networks”, ICCV 2017

Input Monet Van Gogh Cezanne Ukiyo-e Pix2pix / CycleGAN

Pix2pix / CycleGAN

Pix2pix / CycleGAN

Pix2pix / CycleGAN

Pix2pix / CycleGAN

Pix2pix / CycleGAN

StarGAN Choi et al., “StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation”, CVPR 2018

Plan for this lecture Generative models: What are they? Technique: Generative Adversarial Networks Applications Conditional GANs Cycle-consistency loss Dealing with sparse data, progressive training

Generating with little data for ads Faces are persuasive and carry meaning/sentiment We learn to generate faces appropriate for each ad category Because our data is so diverse yet limited in count, standard approaches that directly model pixel distributions don’t work well Beauty Clothing Safety Soda Cars Human Rights Self Esteem Chocolate Domestic Violence Thomas and Kovashka, BMVC 2018

Generating with little data for ads Instead we model the distribution over attributes for each category (e.g. domestic violence ads contain “black eye”, beauty contains “red lips”) Generate an image with the attributes of an ad class Model attributes w/ help from external large dataset 128x128x3 Input 32x32x16 16x16x32 8x8x64 4x4x128 2x2x256 512 64x64x8 N 150 1024 Output Encoder Sampling Decoder 𝟏𝟎𝟎 (𝝈) 𝟏𝟎𝟎 (𝝁) Embedding Facial Expressions (10-D) Facial Attributes (40-D) Latent (100-D) Latent captures non-semantic appearance properties (colors, etc.) Externally Enforced Semantics Facial attributes: <Attractive, Baggy eyes, Big lips, Bushy eyebrows, Eyeglasses, Gray hair, Makeup, Male, Pale skin, Rosy cheeks, etc.> Facial expressions: <Anger, Contempt, Disgust, Fear, Happy, Neutral, Sad, Surprise> + Valence and Arousal scores Thomas and Kovashka, BMVC 2018

Generating with little data for ads Original Face Transform Reconstruction Beauty Clothing D.V. Safety Soda Alcohol StarGAN (C) StarGAN (T) Ours Conditional Latent Thomas and Kovashka, BMVC 2018

Stagewise generation Singh et al., “FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery”, CVPR 2019

Stagewise generation Johnson et al., “Image Generation from Scene Graphs”, CVPR 2018

Progressive generation 64×64 4×4 G Latent D Real or fake Generated image . Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation D 64×64 4×4 64×64 4×4 Latent Real or fake Generated image . Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation D Latent 4×4 1024×1024 1024×1024 4×4 Real or fake Generated image . Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation 4×4 1024×1024 G D Latent Real or fake There’s waves everywhere! But where’s the shore? . Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation D Latent 4×4 64×64 1024×1024 1024×1024 64×64 4×4 Real or fake There it is! . Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation D Latent 4×4 4×4 Real or fake . Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation D Latent 4×4 4×4 Real or fake . Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation D Latent 4×4 8×8 8×8 4×4 Real or fake . Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation D Latent 4×4 1024×1024 1024×1024 4×4 Real or fake . Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation 4×4 2x 8×8 Replicated block Nearest-neighbor upsampling 2x 16×16 3×3 convolution . 2x 32×32 Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation 4×4 toRGB 4×4 2x 8×8 8×8 2x 16×16 32×32 1×1 convolution . Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation 4×4 4×4 toRGB 2x 8×8 8×8 toRGB 2x 16×16 16×16 . 2x 32×32 32×32 Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation 4×4 toRGB 4×4 2x 8×8 + 8×8 toRGB 2x 16×16 16×16 Linear crossfade . 2x 32×32 32×32 Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

Progressive generation 32×32 0.5x 16×16 8×8 4×4 fromRGB D 4×4 4×4 2x 8×8 8×8 toRGB 2x 16×16 16×16 . 2x 32×32 32×32 Karras et al., “Progressive Growing of GANs for Improved Quality, Stability, and Variation”, ICLR 2018

What’s next algorithmically? And what are some social implications?

“Deepfakes” https://www.technologyreview.com/s/611726/the-defense-department-has-produced-the-first-tools-for-catching-deepfakes/ https://www.niemanlab.org/2018/11/how-the-wall-street-journal-is-preparing-its-journalists-to-detect-deepfakes/

You can be anyone you want… Karras et al., “A Style-Based Generator Architecture for Generative Adversarial Networks”, https://arxiv.org/pdf/1812.04948.pdf