Unsupervised Conditional Generation

Slides:

Advertisements

Similar presentations

Xiaoxiao Shi, Qi Liu, Wei Fan, Philip S. Yu, and Ruixin Zhu

Advertisements

Jointly Optimized Regressors for Image Super-resolution Dengxin Dai, Radu Timofte, and Luc Van Gool Computer Vision Lab, ETH Zurich 1.

Deep Visual Analogy-Making

PANDA: Pose Aligned Networks for Deep Attribute Modeling Ning Zhang 1,2 Manohar Paluri 1 Marć Aurelio Ranzato 1 Trevor Darrell 2 Lumbomir Boudev 1 1 Facebook.

Parsing Natural Scenes and Natural Language with Recursive Neural Networks INTERNATIONAL CONFERENCE ON MACHINE LEARNING (ICML 2011) RICHARD SOCHER CLIFF.

Deep Learning(II) Dong Wang CSLT ML Summer Seminar (5)

Today’s Lecture Neural networks Training

Conditional Generative Adversarial Networks

Generative Adversarial Nets

Unsupervised Learning of Video Representations using LSTMs

Generative Adversarial Network (GAN)

Improving Generative Adversarial Network

Deep learning David Kauchak CS158 – Fall 2016.

SeaRNN: training RNNs with global-local losses

ECE 5424: Introduction to Machine Learning

Deep Predictive Model for Autonomous Driving

Data Driven Attributes for Action Detection

Classification: Logistic Regression

Adversarial Learning for Neural Dialogue Generation

Many slides and slide ideas thanks to Marc'Aurelio Ranzato and Michael Nielson.

Perceptual Loss Deep Feature Interpolation for Image Content Changes

Generative Adversarial Networks

Neural Networks for Machine Learning Lecture 1e Three types of learning Geoffrey Hinton with Nitish Srivastava Kevin Swersky.

Multi-Perspective Panoramas

Cold-Start Heterogeneous-Device Wireless Localization

Chaoyun Zhang, Xi Ouyang, and Paul Patras

Intelligent Information System Lab

Ying Kin Yu, Kin Hong Wong and Siu Hang Or

Generative adversarial networks (GANs) for edge detection

Juncen Li1, Robin Jia2, He He2, and Percy Liang2

Unsupervised Learning and Autoencoders

Authors: Jun-Yan Zhu*, Taesun Park*, Phillip Isola, Alexei A. Efros

Presenter: Hajar Emami

Adversarially Tuned Scene Generation

State-of-the-art face recognition systems

Deep Learning based Machine Translation

Improving Sequence Generation by GAN

PixelGAN Autoencoders

Towards Understanding the Invertibility of Convolutional Neural Networks Anna C. Gilbert1, Yi Zhang1, Kibok Lee1, Yuting Zhang1, Honglak Lee1,2 1University.

Object Detection + Deep Learning

Introduction of MATRIX CAPSULES WITH EM ROUTING

with Daniel L. Silver, Ph.D. Christian Frey, BBA April 11-12, 2017

Image recognition: Defense adversarial attacks

Non-Stationary Texture Synthesis by Adversarial Expansion

David Healey BYU Capstone Course 15 Nov 2018

Image to Image Translation using GANs

GAN Applications.

Deep Robust Unsupervised Multi-Modal Network

Lip movement Synthesis from Text

RCNN, Fast-RCNN, Faster-RCNN

Machine learning overview

Recent Advances in Generative Adversarial Networks (GAN)

CS 2770: Computer Vision Generative Adversarial Networks

Advances in Deep Audio and Audio-Visual Processing

TPGAN overview.

Abnormally Detection

Attention for translation

LOGAN: Unpaired Shape Transform in Latent Overcomplete Space

Ch 14. Generative adversarial networks (GANs) for edge detection

3. Adversarial Teacher-Student Learning (AT/S)

A Survey to Deep Facial Attribute Manipulation Methods

Learning Compositional Visual Concepts with Mutual Consistency

Word representations David Kauchak CS158 – Fall 2016.

Image Processing and Multi-domain Translation

one-input multi-output architecture

Hsiao-Yu Chiang Xiaoya Li Ganyu “Bruce” Xu Pinshuo Ye Zhanyuan Zhang

Angel A. Cantu, Nami Akazawa Department of Computer Science

CVPR 2019 Tutorial on Map Synchronization

Text-to-speech (TTS) Traditional approaches (before 2016) Neural TTS

Improving Cross-lingual Entity Alignment via Optimal Transport

Presentation transcript:

Unsupervised Conditional Generation 沒講 http://paintstransfer.com https://www.bilibili.com/video/av17537429/

Unsupervised Conditional Generation Domain X Domain Y Transform an object from one domain to another without paired data (e.g. style transfer) Domain X Domain Y male female landscape photo Van Gogh Transform an object from one domain to another without paired data It is good. It’s a good day. I love you. It is bad. It’s a bad day. I don’t love you.

Unsupervised Conditional Generation Approach 1: Direct Transformation Approach 2: Projection to Common Space ? 𝐺 𝑋→𝑌 For texture or color change Domain X Domain Y This is not real example. 𝐸𝑁 𝑋 𝐷𝐸 𝑌 Domain X Encoder of domain X Face Attribute Decoder of domain Y Domain Y Larger change, only keep the semantics

Direct Transformation Domain X Direct Transformation Domain Y Become similar to domain Y Domain X ? 𝐺 𝑋→𝑌 𝐷 𝑌 scalar https://arxiv.org/abs/1703.10593 https://junyanz.github.io/CycleGAN/ Berkeley AI Research (BAIR) laboratory, UC Berkeley Input image belongs to domain Y or not Domain Y

Direct Transformation Domain X Direct Transformation Domain Y Become similar to domain Y Domain X 𝐺 𝑋→𝑌 Not what we want! ignore input 𝐷 𝑌 scalar https://arxiv.org/abs/1703.10593 https://junyanz.github.io/CycleGAN/ Berkeley AI Research (BAIR) laboratory, UC Berkeley Input image belongs to domain Y or not Domain Y

Direct Transformation Domain X Direct Transformation Domain Y Become similar to domain Y Domain X 𝐺 𝑋→𝑌 Not what we want! ignore input 𝐷 𝑌 scalar https://arxiv.org/abs/1703.10593 https://junyanz.github.io/CycleGAN/ Berkeley AI Research (BAIR) laboratory, UC Berkeley The issue can be avoided by network design. Input image belongs to domain Y or not Simpler generator makes the input and output more closely related. [Tomer Galanti, et al. ICLR, 2018]

Direct Transformation Domain X Direct Transformation Domain Y Become similar to domain Y Domain X 𝐺 𝑋→𝑌 𝐷 𝑌 scalar Encoder Network Encoder Network https://arxiv.org/abs/1703.10593 https://junyanz.github.io/CycleGAN/ Berkeley AI Research (BAIR) laboratory, UC Berkeley pre-trained Input image belongs to domain Y or not as close as possible Baseline of DTN [Yaniv Taigman, et al., ICLR, 2017]

Direct Transformation [Jun-Yan Zhu, et al., ICCV, 2017] Direct Transformation as close as possible Cycle consistency 𝐺 𝑋→𝑌 𝐺 Y→X Lack of information for reconstruction 𝐷 𝑌 scalar Input image belongs to domain Y or not Domain Y

Direct Transformation as close as possible 𝐺 𝑋→𝑌 𝐺 Y→X 𝐷 𝑌 scalar: belongs to domain X or not scalar: belongs to domain Y or not 𝐷 𝑋 http://sunreader1309.pixnet.net/blog/post/394665458-%E6%94%9D%E5%BD%B1%E8%88%87%E7%95%AB%E4%BD%9C%EF%BC%8C%E6%A2%B5%E8%B0%B7%E8%87%AA%E7%95%AB%E5%83%8F%E7%9A%84%E7%9C%9F%E9%9D%A2%E7%9B%AE%3F-------------- 𝐺 Y→X 𝐺 𝑋→𝑌 as close as possible

Cycle GAN – Silver Hair https://github.com/Aixile/c hainer-cyclegan

Cycle GAN – Silver Hair https://github.com/Aixile/c hainer-cyclegan

Issue of Cycle Consistency CycleGAN: a Master of Steganography (隱寫術) [Casey Chu, et al., NIPS workshop, 2017] 𝐺 Y→X 𝐺 𝑋→𝑌 Steg no gra phy The information is hidden.

Disco GAN Dual GAN Cycle GAN [Taeksoo Kim, et al., ICML, 2017] [Zili Yi, et al., ICCV, 2017] Cycle GAN [Jun-Yan Zhu, et al., ICCV, 2017]

StarGAN For multiple domains, considering starGAN [Yunjey Choi, arXiv, 2017] StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

StarGAN

StarGAN

StarGAN

Unsupervised Conditional Generation Approach 1: Direct Transformation Approach 2: Projection to Common Space ? 𝐺 𝑋→𝑌 For texture or color change Domain X Domain Y This is not real example. 𝐸𝑁 𝑋 𝐷𝐸 𝑌 Domain X Encoder of domain X Face Attribute Decoder of domain Y Domain Y Larger change, only keep the semantics

Projection to Common Space Target image image 𝐸𝑁 𝑋 𝐷𝐸 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image Face Attribute Domain X Domain Y

Minimizing reconstruction error Projection to Common Space Training Minimizing reconstruction error image 𝐷𝐸 𝑋 image 𝐸𝑁 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image Domain X Domain Y

Projection to Common Space Training Minimizing reconstruction error Discriminator of X domain image 𝐷𝐸 𝑋 image 𝐸𝑁 𝑋 𝐷 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image 𝐷 𝑌 Discriminator of Y domain Minimizing reconstruction error Because we train two auto-encoders separately … The images with the same attribute may not project to the same position in the latent space.

Projection to Common Space Training 𝐸𝑁 𝑋 𝐷𝐸 𝑋 𝐸𝑁 𝑌 𝐷𝐸 𝑌 Sharing the parameters of encoders and decoders Couple GAN[Ming-Yu Liu, et al., NIPS, 2016] UNIT[Ming-Yu Liu, et al., NIPS, 2017]

Minimizing reconstruction error Projection to Common Space Training Minimizing reconstruction error Discriminator of X domain image image 𝐸𝑁 𝑋 𝐷𝐸 𝑋 𝐷 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image 𝐷 𝑌 Discriminator of Y domain Domain Discriminator 𝐸𝑁 𝑋 and 𝐸𝑁 𝑌 fool the domain discriminator From 𝐸𝑁 𝑋 or 𝐸𝑁 𝑌 The domain discriminator forces the output of 𝐸𝑁 𝑋 and 𝐸𝑁 𝑌 have the same distribution. [Guillaume Lample, et al., NIPS, 2017]

Minimizing reconstruction error Projection to Common Space Training Minimizing reconstruction error Discriminator of X domain image 𝐸𝑁 𝑋 𝐷𝐸 𝑋 image 𝐷 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image 𝐷 𝑌 Discriminator of Y domain Cycle Consistency: Used in ComboGAN [Asha Anoosheh, et al., arXiv, 017]

Projection to Common Space Training To the same latent space Discriminator of X domain image 𝐸𝑁 𝑋 𝐷𝐸 𝑋 image 𝐷 𝑋 image 𝐸𝑁 𝑌 𝐷𝐸 𝑌 image 𝐷 𝑌 Discriminator of Y domain Semantic Consistency: Used in DTN [Yaniv Taigman, et al., ICLR, 2017] and XGAN [Amélie Royer, et al., arXiv, 2017]

世界二次元化 Using the code: https://github.com/Hi- king/kawaii_creator It is not cycle GAN, Disco GAN input output domain 很有用線上課程

Voice Conversion

Speakers A and B are talking about completely different things. In the past Speaker A Speaker B How are you? How are you? Good morning Good morning Today Speaker A Speaker B 天氣真好 How are you? 再見囉 Good morning Speakers A and B are talking about completely different things.

Speaker A Speaker B 我感謝周儒杰同學提供實驗結果

Reference Jun-Yan Zhu, Taesung Park, Phillip Isola, Alexei A. Efros, Unpaired Image-to- Image Translation using Cycle-Consistent Adversarial Networks, ICCV, 2017 Zili Yi, Hao Zhang, Ping Tan, Minglun Gong, DualGAN: Unsupervised Dual Learning for Image-to-Image Translation, ICCV, 2017 Tomer Galanti, Lior Wolf, Sagie Benaim, The Role of Minimal Complexity Functions in Unsupervised Learning of Semantic Mappings, ICLR, 2018 Yaniv Taigman, Adam Polyak, Lior Wolf, Unsupervised Cross-Domain Image Generation, ICLR, 2017 Asha Anoosheh, Eirikur Agustsson, Radu Timofte, Luc Van Gool, ComboGAN: Unrestrained Scalability for Image Domain Translation, arXiv, 2017 Amélie Royer, Konstantinos Bousmalis, Stephan Gouws, Fred Bertsch, Inbar Mosseri, Forrester Cole, Kevin Murphy, XGAN: Unsupervised Image-to-Image Translation for Many-to-Many Mappings, arXiv, 2017

Reference Guillaume Lample, Neil Zeghidour, Nicolas Usunier, Antoine Bordes, Ludovic Denoyer, Marc'Aurelio Ranzato, Fader Networks: Manipulating Images by Sliding Attributes, NIPS, 2017 Taeksoo Kim, Moonsu Cha, Hyunsoo Kim, Jung Kwon Lee, Jiwon Kim, Learning to Discover Cross-Domain Relations with Generative Adversarial Networks, ICML, 2017 Ming-Yu Liu, Oncel Tuzel, “Coupled Generative Adversarial Networks”, NIPS, 2016 Ming-Yu Liu, Thomas Breuel, Jan Kautz, Unsupervised Image-to-Image Translation Networks, NIPS, 2017 Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, Jaegul Choo, StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation, arXiv, 2017