Unsupervised Conditional Generation

Slides:



Advertisements
Similar presentations
Xiaoxiao Shi, Qi Liu, Wei Fan, Philip S. Yu, and Ruixin Zhu
Advertisements

Jointly Optimized Regressors for Image Super-resolution Dengxin Dai, Radu Timofte, and Luc Van Gool Computer Vision Lab, ETH Zurich 1.
Deep Visual Analogy-Making
PANDA: Pose Aligned Networks for Deep Attribute Modeling Ning Zhang 1,2 Manohar Paluri 1 Marć Aurelio Ranzato 1 Trevor Darrell 2 Lumbomir Boudev 1 1 Facebook.
Parsing Natural Scenes and Natural Language with Recursive Neural Networks INTERNATIONAL CONFERENCE ON MACHINE LEARNING (ICML 2011) RICHARD SOCHER CLIFF.
Deep Learning(II) Dong Wang CSLT ML Summer Seminar (5)
Today’s Lecture Neural networks Training
Conditional Generative Adversarial Networks
Generative Adversarial Nets
Unsupervised Learning of Video Representations using LSTMs
Generative Adversarial Network (GAN)
Improving Generative Adversarial Network
Deep learning David Kauchak CS158 – Fall 2016.
SeaRNN: training RNNs with global-local losses
ECE 5424: Introduction to Machine Learning
Deep Predictive Model for Autonomous Driving
Data Driven Attributes for Action Detection
Classification: Logistic Regression
Adversarial Learning for Neural Dialogue Generation
Many slides and slide ideas thanks to Marc'Aurelio Ranzato and Michael Nielson.
Perceptual Loss Deep Feature Interpolation for Image Content Changes
Generative Adversarial Networks
Neural Networks for Machine Learning Lecture 1e Three types of learning Geoffrey Hinton with Nitish Srivastava Kevin Swersky.
Multi-Perspective Panoramas
Cold-Start Heterogeneous-Device Wireless Localization
Chaoyun Zhang, Xi Ouyang, and Paul Patras
Intelligent Information System Lab
Ying Kin Yu, Kin Hong Wong and Siu Hang Or
Generative adversarial networks (GANs) for edge detection
Juncen Li1, Robin Jia2, He He2, and Percy Liang2
Unsupervised Learning and Autoencoders
Authors: Jun-Yan Zhu*, Taesun Park*, Phillip Isola, Alexei A. Efros
Presenter: Hajar Emami
Adversarially Tuned Scene Generation
State-of-the-art face recognition systems
Deep Learning based Machine Translation
Improving Sequence Generation by GAN
PixelGAN Autoencoders
Towards Understanding the Invertibility of Convolutional Neural Networks Anna C. Gilbert1, Yi Zhang1, Kibok Lee1, Yuting Zhang1, Honglak Lee1,2 1University.
Object Detection + Deep Learning
Introduction of MATRIX CAPSULES WITH EM ROUTING
with Daniel L. Silver, Ph.D. Christian Frey, BBA April 11-12, 2017
Image recognition: Defense adversarial attacks
Non-Stationary Texture Synthesis by Adversarial Expansion
David Healey BYU Capstone Course 15 Nov 2018
Image to Image Translation using GANs
GAN Applications.
Deep Robust Unsupervised Multi-Modal Network
Lip movement Synthesis from Text
RCNN, Fast-RCNN, Faster-RCNN
Machine learning overview
Recent Advances in Generative Adversarial Networks (GAN)
CS 2770: Computer Vision Generative Adversarial Networks
Advances in Deep Audio and Audio-Visual Processing
TPGAN overview.
Abnormally Detection
Attention for translation
LOGAN: Unpaired Shape Transform in Latent Overcomplete Space
Ch 14. Generative adversarial networks (GANs) for edge detection
3. Adversarial Teacher-Student Learning (AT/S)
A Survey to Deep Facial Attribute Manipulation Methods
Learning Compositional Visual Concepts with Mutual Consistency
Word representations David Kauchak CS158 – Fall 2016.
Image Processing and Multi-domain Translation
one-input multi-output architecture
Hsiao-Yu Chiang Xiaoya Li Ganyu “Bruce” Xu Pinshuo Ye Zhanyuan Zhang
Angel A. Cantu, Nami Akazawa Department of Computer Science
CVPR 2019 Tutorial on Map Synchronization
Text-to-speech (TTS) Traditional approaches (before 2016) Neural TTS
Improving Cross-lingual Entity Alignment via Optimal Transport
Presentation transcript:

Unsupervised Conditional Generation 沒講 http://paintstransfer.com https://www.bilibili.com/video/av17537429/

Unsupervised Conditional Generation Domain X Domain Y Transform an object from one domain to another without paired data (e.g. style transfer) Domain X Domain Y male female landscape photo Van Gogh Transform an object from one domain to another without paired data It is good. It’s a good day. I love you. It is bad. It’s a bad day. I don’t love you.

Unsupervised Conditional Generation Approach 1: Direct Transformation Approach 2: Projection to Common Space ? 𝐺 𝑋→𝑌 For texture or color change Domain X Domain Y This is not real example. 𝐸𝑁 𝑋 𝐷𝐸 𝑌 Domain X Encoder of domain X Face Attribute Decoder of domain Y Domain Y Larger change, only keep the semantics

Direct Transformation Domain X Direct Transformation Domain Y Become similar to domain Y Domain X ? 𝐺 𝑋→𝑌 𝐷 𝑌 scalar https://arxiv.org/abs/1703.10593 https://junyanz.github.io/CycleGAN/ Berkeley AI Research (BAIR) laboratory, UC Berkeley Input image belongs to domain Y or not Domain Y

Direct Transformation Domain X Direct Transformation Domain Y Become similar to domain Y Domain X 𝐺 𝑋→𝑌 Not what we want! ignore input 𝐷 𝑌 scalar https://arxiv.org/abs/1703.10593 https://junyanz.github.io/CycleGAN/ Berkeley AI Research (BAIR) laboratory, UC Berkeley Input image belongs to domain Y or not Domain Y

Direct Transformation Domain X Direct Transformation Domain Y Become similar to domain Y Domain X 𝐺 𝑋→𝑌 Not what we want! ignore input 𝐷 𝑌 scalar https://arxiv.org/abs/1703.10593 https://junyanz.github.io/CycleGAN/ Berkeley AI Research (BAIR) laboratory, UC Berkeley The issue can be avoided by network design. Input image belongs to domain Y or not Simpler generator makes the input and output more closely related. [Tomer Galanti, et al. ICLR, 2018]

Direct Transformation Domain X Direct Transformation Domain Y Become similar to domain Y Domain X 𝐺 𝑋→𝑌 𝐷 𝑌 scalar Encoder Network Encoder Network https://arxiv.org/abs/1703.10593 https://junyanz.github.io/CycleGAN/ Berkeley AI Research (BAIR) laboratory, UC Berkeley pre-trained Input image belongs to domain Y or not as close as possible Baseline of DTN [Yaniv Taigman, et al., ICLR, 2017]

Direct Transformation [Jun-Yan Zhu, et al., ICCV, 2017] Direct Transformation as close as possible Cycle consistency 𝐺 𝑋→𝑌 𝐺 Y→X Lack of information for reconstruction 𝐷 𝑌 scalar Input image belongs to domain Y or not Domain Y

Direct Transformation as close as possible 𝐺 𝑋→𝑌 𝐺 Y→X 𝐷 𝑌 scalar: belongs to domain X or not scalar: belongs to domain Y or not 𝐷 𝑋 http://sunreader1309.pixnet.net/blog/post/394665458-%E6%94%9D%E5%BD%B1%E8%88%87%E7%95%AB%E4%BD%9C%EF%BC%8C%E6%A2%B5%E8%B0%B7%E8%87%AA%E7%95%AB%E5%83%8F%E7%9A%84%E7%9C%9F%E9%9D%A2%E7%9B%AE%3F-------------- 𝐺 Y→X 𝐺 𝑋→𝑌 as close as possible

Cycle GAN – Silver Hair https://github.com/Aixile/c hainer-cyclegan

Cycle GAN – Silver Hair https://github.com/Aixile/c hainer-cyclegan

Issue of Cycle Consistency CycleGAN: a Master of Steganography (隱寫術) [Casey Chu, et al., NIPS workshop, 2017] 𝐺 Y→X 𝐺 𝑋→𝑌 Steg no gra phy The information is hidden.

Disco GAN Dual GAN Cycle GAN [Taeksoo Kim, et al., ICML, 2017] [Zili Yi, et al., ICCV, 2017] Cycle GAN [Jun-Yan Zhu, et al., ICCV, 2017]

StarGAN For multiple domains, considering starGAN [Yunjey Choi, arXiv, 2017] StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

StarGAN

StarGAN

StarGAN

Unsupervised Conditional Generation Approach 1: Direct Transformation Approach 2: Projection to Common Space ? 𝐺 𝑋→𝑌 For texture or color change Domain X Domain Y This is not real example. 𝐸𝑁 𝑋 𝐷𝐸 𝑌 Domain X Encoder of domain X Face Attribute Decoder of domain Y Domain Y Larger change, only keep the semantics

Projection to Common Space Target image image 𝐸𝑁 𝑋 𝐷𝐸 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image Face Attribute Domain X Domain Y

Minimizing reconstruction error Projection to Common Space Training Minimizing reconstruction error image 𝐷𝐸 𝑋 image 𝐸𝑁 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image Domain X Domain Y

Projection to Common Space Training Minimizing reconstruction error Discriminator of X domain image 𝐷𝐸 𝑋 image 𝐸𝑁 𝑋 𝐷 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image 𝐷 𝑌 Discriminator of Y domain Minimizing reconstruction error Because we train two auto-encoders separately … The images with the same attribute may not project to the same position in the latent space.

Projection to Common Space Training 𝐸𝑁 𝑋 𝐷𝐸 𝑋 𝐸𝑁 𝑌 𝐷𝐸 𝑌 Sharing the parameters of encoders and decoders Couple GAN[Ming-Yu Liu, et al., NIPS, 2016] UNIT[Ming-Yu Liu, et al., NIPS, 2017]

Minimizing reconstruction error Projection to Common Space Training Minimizing reconstruction error Discriminator of X domain image image 𝐸𝑁 𝑋 𝐷𝐸 𝑋 𝐷 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image 𝐷 𝑌 Discriminator of Y domain Domain Discriminator 𝐸𝑁 𝑋 and 𝐸𝑁 𝑌 fool the domain discriminator From 𝐸𝑁 𝑋 or 𝐸𝑁 𝑌 The domain discriminator forces the output of 𝐸𝑁 𝑋 and 𝐸𝑁 𝑌 have the same distribution. [Guillaume Lample, et al., NIPS, 2017]

Minimizing reconstruction error Projection to Common Space Training Minimizing reconstruction error Discriminator of X domain image 𝐸𝑁 𝑋 𝐷𝐸 𝑋 image 𝐷 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image 𝐷 𝑌 Discriminator of Y domain Cycle Consistency: Used in ComboGAN [Asha Anoosheh, et al., arXiv, 017]

Projection to Common Space Training To the same latent space Discriminator of X domain image 𝐸𝑁 𝑋 𝐷𝐸 𝑋 image 𝐷 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image 𝐷 𝑌 Discriminator of Y domain Semantic Consistency: Used in DTN [Yaniv Taigman, et al., ICLR, 2017] and XGAN [Amélie Royer, et al., arXiv, 2017]

世界二次元化 Using the code: https://github.com/Hi- king/kawaii_creator It is not cycle GAN, Disco GAN input output domain 很有用 線上課程

Voice Conversion

Speakers A and B are talking about completely different things. In the past Speaker A Speaker B How are you? How are you? Good morning Good morning Today Speaker A Speaker B 天氣真好 How are you? 再見囉 Good morning Speakers A and B are talking about completely different things.

Speaker A Speaker B 我 感謝周儒杰同學提供實驗結果

Reference Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros, Unpaired Image-to- Image Translation using Cycle-Consistent Adversarial Networks, ICCV, 2017 Zili Yi, Hao Zhang, Ping Tan, Minglun Gong, DualGAN: Unsupervised Dual Learning for Image-to-Image Translation, ICCV, 2017 Tomer Galanti, Lior Wolf, Sagie Benaim, The Role of Minimal Complexity Functions in Unsupervised Learning of Semantic Mappings, ICLR, 2018 Yaniv Taigman, Adam Polyak, Lior Wolf, Unsupervised Cross-Domain Image Generation, ICLR, 2017 Asha Anoosheh, Eirikur Agustsson, Radu Timofte, Luc Van Gool, ComboGAN: Unrestrained Scalability for Image Domain Translation, arXiv, 2017 Amélie Royer, Konstantinos Bousmalis, Stephan Gouws, Fred Bertsch, Inbar Mosseri, Forrester Cole, Kevin Murphy, XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings, arXiv, 2017

Reference Guillaume Lample, Neil Zeghidour, Nicolas Usunier, Antoine Bordes, Ludovic Denoyer, Marc'Aurelio Ranzato, Fader Networks: Manipulating Images by Sliding Attributes, NIPS, 2017 Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jung Kwon Lee, Jiwon Kim, Learning to Discover Cross-Domain Relations with Generative Adversarial Networks, ICML, 2017 Ming-Yu Liu, Oncel Tuzel, “Coupled Generative Adversarial Networks”, NIPS, 2016 Ming-Yu Liu, Thomas Breuel, Jan Kautz, Unsupervised Image-to-Image Translation Networks, NIPS, 2017 Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, Jaegul Choo, StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation, arXiv, 2017