Adversarially Tuned Scene Generation

Presentation transcript:

Adversarially Tuned Scene Generation Presenter: Kaiyue Zhou

Pipeline Background: existing methods for domain shift; existing scene generation techniques. Motivation. Structure. Approach. Experiments.

Background Domain shift: manually insert specific attributes such as illumination or pose to achieve invariance; learn a scene prior for generation in the specific target domain. Scene generation techniques: the optimal spatial arrangement of 3D models is well studied, via simulated annealing, reversible-jump MCMC (Markov chain Monte Carlo), and factor potentials.

Motivation Recent works: augmenting simulated data with a few labelled real samples can ameliorate domain shift, but this is costly. A GAN uses unlabeled samples to obtain better estimates of the parameters of generative models. Generator G: generates semantic label maps; Discriminator D: classifies real and virtual data.
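For reference, the standard GAN minimax objective that this adversarial tuning builds on (a generic sketch in my own notation, not specific to this paper's graphics-based generator):

    \min_G \max_D \; \mathbb{E}_{x \sim p_{\text{real}}}\big[\log D(x)\big] \;+\; \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]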

Structure

Discriminator Discriminator D: AlexNet. Standard stochastic gradient descent with backpropagation; data augmentation techniques are used. Tuning G: the classifier outputs the conditional probability p(v = real | Θ).
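One way to read the tuning step: choose the scene-generation parameters Θ whose rendered images the discriminator most often judges to be real. A hedged sketch of that objective (notation assumed, not taken from the slides):

    \Theta^{*} \;=\; \arg\max_{\Theta} \; \mathbb{E}_{I \sim G(\Theta)}\big[\, p(v = \text{real} \mid I) \,\big]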

Generator Generator G: DeepLab. Uses atrous convolutions for multi-scale objects and a fully connected conditional random field (CRF) to obtain pixel-to-pixel and instance-to-instance relations. Able to generate 7 classes in this case: vehicle, pedestrian, building, vegetation, road, ground, and sky. Real-world target datasets: CityScapes, CamVid.
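As a quick illustration of atrous (dilated) convolution, a minimal PyTorch sketch; the channel counts are placeholders, not the actual DeepLab configuration:

    import torch
    import torch.nn as nn

    # A 3x3 kernel with dilation=2 covers a 5x5 receptive field while keeping
    # the same number of weights; padding=2 preserves the spatial size.
    atrous = nn.Conv2d(in_channels=256, out_channels=256,
                       kernel_size=3, padding=2, dilation=2)

    x = torch.randn(1, 256, 64, 64)   # dummy feature map
    y = atrous(x)
    print(y.shape)                    # torch.Size([1, 256, 64, 64])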

Approach Probabilistic scene generation. Geometry: two kinds of probabilistic distributions, for points and for attributes. Photometry: environmental parameters such as light, weather and camera; Blender is used to render. Dynamics (not in this work).
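To make the sampling step concrete, a hedged sketch of drawing photometric parameters before rendering; the parameter names and ranges are illustrative, not taken from the paper:

    import random

    def sample_photometry():
        """Draw environmental parameters for one rendered scene (illustrative ranges)."""
        return {
            "sun_elevation_deg": random.uniform(10, 80),    # light direction
            "sun_intensity":     random.uniform(0.5, 1.5),  # light strength
            "fog_density":       random.uniform(0.0, 0.3),  # weather
            "camera_height_m":   random.uniform(1.2, 1.8),  # camera pose
            "camera_pitch_deg":  random.uniform(-5, 5),
        }

    params = sample_photometry()
    # These values would then be applied to the Blender scene before rendering.
    print(params)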

Geometry Spatial non-overlap, co-occurrence and coherence among instances are incorporated with Gibbs potentials. Density of object layouts: (equation shown on the slide).
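The slide's density is not reproduced in the transcript; a generic Gibbs form over an object layout L = {l_1, ..., l_n} with pairwise potentials for non-overlap, co-occurrence and coherence would look like the following (my notation, not the paper's exact formula):

    p(L) \;\propto\; \exp\!\Big(-\sum_{i<j} \big[\phi_{\mathrm{overlap}}(l_i, l_j) + \phi_{\mathrm{cooc}}(l_i, l_j) + \phi_{\mathrm{coh}}(l_i, l_j)\big]\Big)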

Photometry

Experiments

High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs Presenter: Kaiyue Zhou

Pipeline Background and Motivation: use of GANs and purpose; existing image-to-image techniques and purpose; existing visual manipulation and purpose. Structure. Approach. Experiments.

Background and Motivation GANs in the unconditional setting: image generation, image manipulation, representation learning, object detection, video applications. Aim: a coarse-to-fine generator and a multi-scale discriminator for conditional image generation.

Background and Motivation Image-to-image translation. The conventional L1 loss gives blurry results; conditional GANs use an adversarial loss to avoid blurry images. Pix2pix framework (hard to scale to high resolution); perceptual loss (high resolution but lacking fine details). Aim: high resolution with fine details.

Background and Motivation Deep visual manipulation works well for style transfer, inpainting, colorization and restoration, but lacks an interface to operate on the result, or offers only low-level manipulation such as color and sketch. Aim: provide a user interface for object-level semantic editing.

Structure – generator Baseline: pix2pix framework. Generator G: U-Net; Discriminator D: a patch-based fully convolutional network; resolution: 256×256. Improvements for high resolution: a global generator network G1 and a local enhancer network G2 (sketched below).
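A hedged PyTorch sketch of the coarse-to-fine idea: a global network G1 runs at half resolution, and a local enhancer G2 fuses G1's features with its own full-resolution features by element-wise sum. Layer sizes here are placeholders, not the actual pix2pixHD architecture:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class CoarseToFineG(nn.Module):
        """Illustrative coarse-to-fine generator (toy layer sizes)."""
        def __init__(self, in_ch=3, feat=64):
            super().__init__()
            self.g1 = nn.Sequential(                       # global generator (low resolution)
                nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(),
                nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
            )
            self.g2_down = nn.Sequential(                  # local enhancer: downsampling path
                nn.Conv2d(in_ch, feat, 3, stride=2, padding=1), nn.ReLU(),
            )
            self.g2_up = nn.Sequential(                    # local enhancer: output path
                nn.Upsample(scale_factor=2, mode='nearest'),
                nn.Conv2d(feat, 3, 3, padding=1), nn.Tanh(),
            )

        def forward(self, x):
            f1 = self.g1(F.avg_pool2d(x, 2))               # G1 on the half-resolution input
            f2 = self.g2_down(x)                           # G2 features from the full-resolution input
            return self.g2_up(f1 + f2)                     # element-wise sum, then back to full resolution

    out = CoarseToFineG()(torch.randn(1, 3, 256, 256))     # -> (1, 3, 256, 256)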

Structure – generator

Structure – discriminator Multi-scale discriminators: 3 discriminators at different scales operate on images at different resolutions. In addition, a feature matching loss is incorporated by matching intermediate representations of the real and synthesized images taken from multiple layers of the discriminator.
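A hedged sketch of such a feature matching loss, assuming the discriminator exposes its intermediate feature maps (an L1 distance averaged over layers; weighting and layer choice are illustrative):

    import torch

    def feature_matching_loss(feats_real, feats_fake):
        """L1 distance between discriminator features of real and synthesized images.
        feats_real / feats_fake: lists of tensors, one per discriminator layer."""
        loss = 0.0
        for fr, ff in zip(feats_real, feats_fake):
            loss = loss + torch.mean(torch.abs(fr.detach() - ff))
        return loss / len(feats_real)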

Instance map The semantic label map does not capture the boundaries between instances. A boundary map derived from the instance map is concatenated with the one-hot representation of the semantic label map and fed to the generator, and likewise to the discriminator.
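A hedged sketch of deriving a boundary map from an instance ID map: a pixel is marked 1 if any 4-connected neighbour carries a different instance ID (one common construction, assumed here rather than taken from the slides):

    import numpy as np

    def instance_boundary_map(inst):
        """inst: (H, W) integer array of instance IDs. Returns a {0,1} boundary map."""
        b = np.zeros_like(inst, dtype=np.uint8)
        b[:, 1:]  |= (inst[:, 1:]  != inst[:, :-1]).astype(np.uint8)  # left neighbour differs
        b[:, :-1] |= (inst[:, :-1] != inst[:, 1:]).astype(np.uint8)   # right neighbour differs
        b[1:, :]  |= (inst[1:, :]  != inst[:-1, :]).astype(np.uint8)  # top neighbour differs
        b[:-1, :] |= (inst[:-1, :] != inst[1:, :]).astype(np.uint8)   # bottom neighbour differs
        return b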

Learning with instance-level features Uses a feature encoder E, a standard encoder-decoder network. G(s) is changed to G(s, E(x)) in the loss function.
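In the pix2pixHD paper the encoder output is averaged within each instance, so every pixel of an instance carries the same feature vector; a hedged PyTorch sketch of that instance-wise pooling step:

    import torch

    def instance_avg_pool(feat, inst):
        """feat: (C, H, W) encoder features; inst: (H, W) instance IDs.
        Replaces each pixel's feature with the mean feature of its instance."""
        out = torch.zeros_like(feat)
        for inst_id in inst.unique():
            mask = (inst == inst_id)              # (H, W) boolean mask for one instance
            mean = feat[:, mask].mean(dim=1)      # (C,) average feature over the instance
            out[:, mask] = mean.unsqueeze(1)      # broadcast back to every pixel of the instance
        return out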

Experiments (semantic segmentation) PSPNet is run on the synthesized images and its predicted labels are compared against the ground truth.

Experiments Image generation

Experiments (editing)

Experiments (editing)