Adversarially Tuned Scene Generation

Presentation transcript:

Adversarially Tuned Scene Generation Presenter: Kaiyue Zhou

Pipeline Background: existing methods for domain shift; existing scene generation techniques. Motivation. Structure. Approach. Experiments.

Background Domain shift: manually insert specific attributes such as illumination or pose to achieve invariance; learn a scene prior for generation in the specific target domain. Scene generation techniques: the optimal spatial arrangement of 3D models is well studied, via simulated annealing, reversible-jump MCMC (Markov chain Monte Carlo), and factor potentials.

Motivation Recent works: augmenting simulated data with a few labelled real samples can ameliorate domain shift, but this is costly. A GAN uses unlabeled samples to obtain better estimates of the parameters of generative models. Generator G: generates semantic label maps; Discriminator D: classifies real and virtual data.
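For reference, the standard GAN minimax objective that this adversarial tuning builds on (a generic sketch in my own notation, not specific to this paper's graphics-based generator):

    \min_G \max_D \; \mathbb{E}_{x \sim p_{\text{real}}}\big[\log D(x)\big] \;+\; \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]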

Structure

Discriminator Discriminator D: AlexNet. Standard stochastic gradient descent with backpropagation; data augmentation techniques are used. Tuning G: the classifier outputs the conditional probability p(v = real | Θ).
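One way to read the tuning step: choose the scene-generation parameters Θ whose rendered images the discriminator most often judges to be real. A hedged sketch of that objective (notation assumed, not taken from the slides):

    \Theta^{*} \;=\; \arg\max_{\Theta} \; \mathbb{E}_{I \sim G(\Theta)}\big[\, p(v = \text{real} \mid I) \,\big]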

Generator Generator G: DeepLab. Uses atrous convolutions for multi-scale objects and a fully connected conditional random field (CRF) to obtain pixel-to-pixel and instance-to-instance relations. Able to generate 7 classes in this case: vehicle, pedestrian, building, vegetation, road, ground, and sky. Real-world target datasets: CityScapes, CamVid.
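As a quick illustration of atrous (dilated) convolution, a minimal PyTorch sketch; the channel counts are placeholders, not the actual DeepLab configuration:

    import torch
    import torch.nn as nn

    # A 3x3 kernel with dilation=2 covers a 5x5 receptive field while keeping
    # the same number of weights; padding=2 preserves the spatial size.
    atrous = nn.Conv2d(in_channels=256, out_channels=256,
                       kernel_size=3, padding=2, dilation=2)

    x = torch.randn(1, 256, 64, 64)   # dummy feature map
    y = atrous(x)
    print(y.shape)                    # torch.Size([1, 256, 64, 64])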

Approach Probabilistic scene generation. Geometry: two kinds of probabilistic distributions, for points and for attributes. Photometry: environmental parameters such as light, weather and camera; Blender is used to render. Dynamics (not in this work).
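To make the sampling step concrete, a hedged sketch of drawing photometric parameters before rendering; the parameter names and ranges are illustrative, not taken from the paper:

    import random

    def sample_photometry():
        """Draw environmental parameters for one rendered scene (illustrative ranges)."""
        return {
            "sun_elevation_deg": random.uniform(10, 80),    # light direction
            "sun_intensity":     random.uniform(0.5, 1.5),  # light strength
            "fog_density":       random.uniform(0.0, 0.3),  # weather
            "camera_height_m":   random.uniform(1.2, 1.8),  # camera pose
            "camera_pitch_deg":  random.uniform(-5, 5),
        }

    params = sample_photometry()
    # These values would then be applied to the Blender scene before rendering.
    print(params)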

Geometry Spatial non-overlap, co-occurrence and coherence among instances are incorporated with Gibbs potentials. Density of object layouts: (equation shown on the slide).
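The slide's density is not reproduced in the transcript; a generic Gibbs form over an object layout L = {l_1, ..., l_n} with pairwise potentials for non-overlap, co-occurrence and coherence would look like the following (my notation, not the paper's exact formula):

    p(L) \;\propto\; \exp\!\Big(-\sum_{i<j} \big[\phi_{\mathrm{overlap}}(l_i, l_j) + \phi_{\mathrm{cooc}}(l_i, l_j) + \phi_{\mathrm{coh}}(l_i, l_j)\big]\Big)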

Photometry

Experiments

High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs Presenter: Kaiyue Zhou

Pipeline Background and Motivation: use of GANs and purpose; existing image-to-image techniques and purpose; existing visual manipulation and purpose. Structure. Approach. Experiments.

Background and Motivation GANs in the unconditional setting: image generation, image manipulation, representation learning, object detection, video applications. Aim: a coarse-to-fine generator and a multi-scale discriminator for conditional image generation.

Background and Motivation Image-to-image translation. The conventional L1 loss gives blurry results; conditional GANs use an adversarial loss to avoid blurry images. Pix2pix framework (hard to scale to high resolution); perceptual loss (high resolution but lacking fine details). Aim: high resolution with fine details.

Background and Motivation Deep visual manipulation works well for style transfer, inpainting, colorization and restoration, but lacks an interface to operate on the result, or offers only low-level manipulation such as color and sketch. Aim: provide a user interface for object-level semantic editing.

Structure – generator Baseline: pix2pix framework. Generator G: U-Net; Discriminator D: a patch-based fully convolutional network; resolution: 256×256. Improvements for high resolution: a global generator network G1 and a local enhancer network G2 (sketched below).
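A hedged PyTorch sketch of the coarse-to-fine idea: a global network G1 runs at half resolution, and a local enhancer G2 fuses G1's features with its own full-resolution features by element-wise sum. Layer sizes here are placeholders, not the actual pix2pixHD architecture:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CoarseToFineG(nn.Module):
        """Illustrative coarse-to-fine generator (toy layer sizes)."""
        def __init__(self, in_ch=3, feat=64):
            super().__init__()
            self.g1 = nn.Sequential(                       # global generator (low resolution)
                nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
            )
            self.g2_down = nn.Sequential(                  # local enhancer: downsampling path
                nn.Conv2d(in_ch, feat, 3, stride=2, padding=1), nn.ReLU(),
            )
            self.g2_up = nn.Sequential(                    # local enhancer: output path
                nn.Upsample(scale_factor=2, mode='nearest'),
                nn.Conv2d(feat, 3, 3, padding=1), nn.Tanh(),
            )

        def forward(self, x):
            f1 = self.g1(F.avg_pool2d(x, 2))               # G1 on the half-resolution input
            f2 = self.g2_down(x)                           # G2 features from the full-resolution input
            return self.g2_up(f1 + f2)                     # element-wise sum, then back to full resolution

    out = CoarseToFineG()(torch.randn(1, 3, 256, 256))     # -> (1, 3, 256, 256)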

Structure – generator

Structure – discriminator Multi-scale discriminators: 3 discriminators at different scales operate on images at different resolutions. In addition, a feature matching loss is incorporated by matching intermediate representations of the real and synthesized images taken from multiple layers of the discriminator.
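A hedged sketch of such a feature matching loss, assuming the discriminator exposes its intermediate feature maps (an L1 distance averaged over layers; weighting and layer choice are illustrative):

    import torch

    def feature_matching_loss(feats_real, feats_fake):
        """L1 distance between discriminator features of real and synthesized images.
        feats_real / feats_fake: lists of tensors, one per discriminator layer."""
        loss = 0.0
        for fr, ff in zip(feats_real, feats_fake):
            loss = loss + torch.mean(torch.abs(fr.detach() - ff))
        return loss / len(feats_real)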

Instance map The semantic label map does not capture the boundaries between instances. A boundary map derived from the instance map is concatenated with the one-hot representation of the semantic label map and fed to the generator, and likewise to the discriminator.
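A hedged sketch of deriving a boundary map from an instance ID map: a pixel is marked 1 if any 4-connected neighbour carries a different instance ID (one common construction, assumed here rather than taken from the slides):

    import numpy as np

    def instance_boundary_map(inst):
        """inst: (H, W) integer array of instance IDs. Returns a {0,1} boundary map."""
        b = np.zeros_like(inst, dtype=np.uint8)
        b[:, 1:]  |= (inst[:, 1:]  != inst[:, :-1]).astype(np.uint8)  # left neighbour differs
        b[:, :-1] |= (inst[:, :-1] != inst[:, 1:]).astype(np.uint8)   # right neighbour differs
        b[1:, :]  |= (inst[1:, :]  != inst[:-1, :]).astype(np.uint8)  # top neighbour differs
        b[:-1, :] |= (inst[:-1, :] != inst[1:, :]).astype(np.uint8)   # bottom neighbour differs
        return b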

Learning with instance-level features Uses a feature encoder E, a standard encoder-decoder network. G(s) is changed to G(s, E(x)) in the loss function.
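In the pix2pixHD paper the encoder output is averaged within each instance, so every pixel of an instance carries the same feature vector; a hedged PyTorch sketch of that instance-wise pooling step:

    import torch

    def instance_avg_pool(feat, inst):
        """feat: (C, H, W) encoder features; inst: (H, W) instance IDs.
        Replaces each pixel's feature with the mean feature of its instance."""
        out = torch.zeros_like(feat)
        for inst_id in inst.unique():
            mask = (inst == inst_id)              # (H, W) boolean mask for one instance
            mean = feat[:, mask].mean(dim=1)      # (C,) average feature over the instance
            out[:, mask] = mean.unsqueeze(1)      # broadcast back to every pixel of the instance
        return out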

Experiments (semantic segmentation) PSPNet is run on the synthesized images and its predicted labels are compared against the ground truth.

Experiments Image generation

Experiments (editing)

Experiments (editing)