Deep Neural Net Scenery Generation

Deep Neural Net Scenery Generation Adam Schunk

Motivation Many systems have been developed to turn text into images. The goal here is to turn sketches into nicely rendered images instead. This has been done for items such as handbags, where the application is mainly aesthetic; however, if it were applied to something such as profile sketching, it could be quite useful.

Examples This has been done extensively for indoor images (the bedroom dataset shown in class). Adding detail to the input sketch has an effect on the outcome.

Image Generation Current deep-learning image generation networks have two main components: convolution (CNN) and adversarial training (GAN). The two types of networks can be used separately; their combination, the DCGAN, was developed recently.

Convolutional Networks (Brief recap) These networks operate by alternating between convolutional layers and pooling layers. They work well on any problem that involves grouping (letters into words, pixels into sub-images).

Convolution and Pooling layers Convolutional layers extract features from the images by applying filters, and the filters themselves are learned by the network. Pooling layers provide translational invariance and reduce the number of nodes in the network, improving training time. The size and number of filters are manually defined, but the internal structure of each filter is determined by the network. Sampling can be used to make an input either larger or smaller, though making it smaller is more common. This design exploits the structure of images and is location independent (it does not care about shifts): it assumes that nearby pixels are related to each other, so fewer weights are needed overall. Because this is done at every layer, these networks are generally easier to train. A minimal sketch of the alternating structure is shown below.
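
A minimal sketch of this alternating structure in PyTorch (the layer sizes are illustrative assumptions, not the presenter's actual model):

```python
import torch
import torch.nn as nn

# Alternate convolution (learned filters) with pooling (downsampling).
cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=5, padding=2),  # filters are learned by the network
    nn.ReLU(),
    nn.MaxPool2d(2),                             # translation invariance, fewer nodes
    nn.Conv2d(16, 32, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.MaxPool2d(2),
)

x = torch.randn(1, 3, 64, 64)  # one 64x64 RGB image
print(cnn(x).shape)            # torch.Size([1, 32, 16, 16]) -- each pool halves H and W
```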

Adversarial Networks (GANs) GANs are implemented by creating two neural networks that compete against each other in a zero-sum game: one network generates data while the other attempts to discriminate between the generated data and ground truth. Zero-sum means that for one side to win, the other must lose an equal amount, so there is no cooperation. Both networks are ultimately trying to learn the same thing, but you want the generator to win in the end; that said, you still want the discriminator to put up a fight. One training step might look like the sketch below.
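
A minimal sketch of one zero-sum training step, assuming a generator `G`, a discriminator `D` (with sigmoid output), and their optimizers already exist:

```python
import torch
import torch.nn.functional as F

def train_step(G, D, opt_G, opt_D, real_batch, z_dim=100):
    b = real_batch.size(0)

    # Discriminator turn: push real images toward 1, generated images toward 0.
    fake = G(torch.randn(b, z_dim)).detach()  # detach: don't update G on this turn
    loss_D = F.binary_cross_entropy(D(real_batch), torch.ones(b, 1)) \
           + F.binary_cross_entropy(D(fake), torch.zeros(b, 1))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Generator turn: try to make D label its samples as real (the zero-sum part).
    loss_G = F.binary_cross_entropy(D(G(torch.randn(b, z_dim))), torch.ones(b, 1))
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```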

Deep Convolutional Generative Adversarial Networks (DCGANs) A DCGAN is the combination of a GAN and a CNN (brief review; we already covered this in class). The generator tries to fool the discriminator with realistic fake pictures; this is the adversarial part of the GAN. The discriminator is trained to separate real and fake images, the generator is trained to create realistic fakes, and so the two essentially train each other. For this to work, one has to assume that the image set has an underlying PDF, and that pictures are generated based on that PDF.

K-dimensional PDF Break the PDF down into one dimension per pixel. The probability along each dimension can then be maximized based on the values of the other dimensions: given any x (pixel 1) value, you can find the maximum-likelihood y (pixel 2) value. A plain Gaussian distribution is a particularly uninteresting scenario: its single global maximum leads to one "most likely image". The toy example below makes the two-pixel case concrete.
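
A toy two-pixel version of this, with assumed numbers: for a bivariate Gaussian, the most likely pixel-2 value given pixel 1 is just the conditional mean.

```python
import numpy as np

mu = np.array([0.5, 0.5])     # mean intensity of pixel 1 and pixel 2
sigma = np.array([0.1, 0.2])  # standard deviations
rho = 0.8                     # correlation between the two pixels

def most_likely_y(x):
    # argmax_y p(y | x) for a bivariate Gaussian is the conditional mean.
    return mu[1] + rho * (sigma[1] / sigma[0]) * (x - mu[0])

print(most_likely_y(0.6))  # 0.66 -- a single answer: the "most likely image"
```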

Fake Image Generation A slightly more complicated PDF yields more possible outcomes: by starting at any point on the graph, you can attempt to converge to a local maximum. Computing the full PDF is hard, so we just sample it. This is what allows us to start with a partially completed image: if we start with a sketch, it should fall somewhere in the distribution and lead us to a local maximum. This distribution is what the generator is trying to learn in order to produce realistic fake images, and what the discriminator is trying to learn in order to tell the fake images from real ones. One way to act on this idea is sketched below.
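
A hedged sketch in the spirit of the deep-completion reference below, assuming a trained generator `G` and a `mask` that is 1 wherever the sketch has content: hold `G` fixed and optimize the latent vector so the generated image matches the known pixels.

```python
import torch

def complete(G, sketch, mask, z_dim=100, steps=500, lr=0.01):
    z = torch.randn(1, z_dim, requires_grad=True)  # a random starting point
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        # Only penalize mismatch where the sketch actually has content.
        loss = ((G(z) - sketch) * mask).pow(2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return G(z).detach()  # a local maximum of the learned distribution near the sketch
```

The full method in the reference also adds an adversarial prior term so the completion stays on the image manifold; this sketch keeps only the matching loss.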

DCGAN Weaknesses If the discriminator has a specific weakness, then the generator can learn to exploit only that weakness: certain features might trick the discriminator and thus break the generator if it is not trained well. The PDFs of image datasets can be extremely complex, and because the generator is not trying to find ALL solutions that fool the discriminator, it can end up exploring only one small section of the PDF. Finally, if the discriminator is too powerful, it can be very difficult for the generator to ever fool it, so the generator never gets a useful learning signal.

Current Method: Model A DCGAN is trained on sample images. ReLU is used as the activation function for every layer in the generator, while the discriminator uses leaky ReLU. The number and size of the layers have been changing throughout development. If this proves successful, the next step is to move on to landscapes. The activation pattern is sketched below.
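
A sketch of this activation pattern in a small DCGAN (the channel counts and the 16x16 output size are assumptions for illustration, not the presenter's exact layers):

```python
import torch.nn as nn

generator = nn.Sequential(
    # ReLU in every generator layer (Tanh only maps the output into [-1, 1]).
    nn.ConvTranspose2d(100, 128, 4, 1, 0), nn.BatchNorm2d(128), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.BatchNorm2d(64), nn.ReLU(),
    nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),  # 16x16 RGB image
)

discriminator = nn.Sequential(
    # Leaky ReLU in every discriminator layer.
    nn.Conv2d(3, 64, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 128, 4, 2, 1), nn.BatchNorm2d(128), nn.LeakyReLU(0.2),
    nn.Conv2d(128, 1, 4, 1, 0), nn.Sigmoid(),       # probability the input is real
)
```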

Current Method: Data The US Adult Faces Database contains 10,000 images, with faces from multiple angles and against multiple backgrounds. Initially I wanted to start off with landscapes; however, there are far more databases of faces, so I will switch datasets once I have a working system for faces.

Current Model: Results Take in a random sketch from a Google search.

Current Model: Results The output is lots of random noise. Potential issues: the shape of the DCGAN network, and the fact that I attempted to implement it from scratch. There is a lot of documentation on this type of system, and I should follow it for my next attempt.

Related Work Recurrent adversarial networks (GRAN), and the Style and Structure GAN (S2-GAN) for 3D modeling. Since I do not have any interesting results yet, I thought I would show some from people who do.

Recurrent Adversarial Networks This work proposes adding a recurrent aspect to the generator in a traditional GAN: the generator takes a sequence of noise samples as input. The authors also developed a new way to evaluate the performance of different GANs; in the past, GANs were evaluated by looking at nearest neighbors in the training data, or by hand. In the new scheme, each generator tries to fool the other model's discriminator; by analogy, the training phase is fought with wooden swords and the test phase with metal ones. A rough sketch of this evaluation follows.
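
A rough sketch of that "battle" evaluation (after Im et al.; the helper function and the exact ratios are my reading of the paper, so treat the details as assumptions):

```python
import torch

def error_rate(D, x, label):
    # Fraction of samples the discriminator misclassifies.
    pred = (D(x) > 0.5).float()
    return (pred != label).float().mean().item()

def battle(G1, D1, G2, D2, x_test, z_dim=100):
    z = torch.randn(x_test.size(0), z_dim)
    # Each discriminator is judged on the *other* generator's fakes...
    r_sample = error_rate(D1, G2(z), 0.0) / error_rate(D2, G1(z), 0.0)
    # ...and on held-out real images, to check the two are comparably strong.
    r_test = error_rate(D1, x_test, 1.0) / error_rate(D2, x_test, 1.0)
    return r_sample, r_test  # r_test near 1 with r_sample < 1 favors model 1
```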

S2-GAN This work attempts to incorporate GANs into computer vision. It contains two separate GANs: the Structure GAN generates surface normals, and the Style GAN maps images onto those surfaces. The generated surface normals are generally fuzzy, and the style does not always line up with them.

S2-GAN Unfortunately, the Style GAN was not able to fully follow the contours generated by the Structure GAN. To fix this, each generated image is used to reconstruct the surface normal map; if the generated image is realistic enough, the reconstructed normals should match the input. This constraint can be expressed as an extra loss term, sketched below.
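
A hedged sketch of that constraint as a generator loss, assuming a pretrained image-to-normals network `normal_net` (the actual S2-GAN formulation may differ in its loss choices):

```python
import torch
import torch.nn.functional as F

def style_generator_loss(generated_img, input_normals, normal_net, D, w=1.0):
    # Reconstruct the normals from the generated image and compare to the input.
    normal_loss = F.mse_loss(normal_net(generated_img), input_normals)
    # Usual adversarial term: the discriminator should believe the image is real.
    adv_loss = F.binary_cross_entropy(D(generated_img),
                                      torch.ones(generated_img.size(0), 1))
    return adv_loss + w * normal_loss
```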

[Figure: images generated with and without the new normal-reconstruction constraint]

[Figure: left, images generated by the network; right, the nearest neighbor in the dataset]

Questions?

References
Im, Daniel Jiwoong, et al. "Generating images with recurrent adversarial networks." arXiv preprint arXiv:1602.05110 (2016).
Wang, Xiaolong, and Abhinav Gupta. "Generative image modeling using style and structure adversarial networks." European Conference on Computer Vision, Springer International Publishing, 2016.
http://bamos.github.io/2016/08/09/deep-completion/
https://filebox.ece.vt.edu/~jbhuang/teaching/ece6554/sp17/lectures/GAN-topic.pdf