Deep screen image crop and enhance


Deep screen image crop and enhance Week 7 (Aaron Ott, Amir Mazaheri)

Problem
We have taken a photo of an image, and we want to recover the original image. The network for this can be broken into 2 parts:
- Image Detector/Cropper
- Image Enhancer

Cropper
- Uses a frozen VGG-19 model to get a feature map
- Applies convolutions, normalizations, and activations
- Final dense layer creates a 6-number theta value for the affine transformation
- STN (spatial transformer network) takes the input image and applies the affine transformation
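The last two Cropper steps can be illustrated with a small NumPy sketch. This is not the network's actual STN layer. It assumes theta is a row-major 2x3 matrix that maps normalized output coordinates in [-1, 1] to input coordinates (the usual spatial-transformer convention), and the helper names affine_grid, bilinear_sample, and stn_crop are hypothetical.

```python
import numpy as np

def affine_grid(theta, out_h, out_w):
    """For every output pixel, compute the source location in [-1, 1] coords."""
    theta = np.asarray(theta, dtype=np.float64).reshape(2, 3)
    ys, xs = np.meshgrid(np.linspace(-1, 1, out_h),
                         np.linspace(-1, 1, out_w), indexing="ij")
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(out_h * out_w)])  # (3, N)
    src = theta @ coords                                                 # (2, N)
    return src[0].reshape(out_h, out_w), src[1].reshape(out_h, out_w)

def bilinear_sample(image, src_x, src_y):
    """Sample an (H, W, C) image at normalized coords; edges are replicated."""
    h, w = image.shape[:2]
    x = (src_x + 1) * (w - 1) / 2          # normalized -> pixel coordinates
    y = (src_y + 1) * (h - 1) / 2
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    wx = np.clip(x - x0, 0, 1)[..., None]  # interpolation weights
    wy = np.clip(y - y0, 0, 1)[..., None]
    top = image[y0, x0] * (1 - wx) + image[y0, x0 + 1] * wx
    bottom = image[y0 + 1, x0] * (1 - wx) + image[y0 + 1, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy

def stn_crop(image, theta, out_h, out_w):
    """Apply the 6-number affine theta to an image, as an STN would."""
    src_x, src_y = affine_grid(theta, out_h, out_w)
    return bilinear_sample(image, src_x, src_y)

# Usage: theta = [0.5, 0, 0, 0, 0.5, 0] zooms into the central half of the image.
image = np.arange(25, dtype=np.float64).reshape(5, 5, 1)
center = stn_crop(image, [0.5, 0, 0, 0, 0.5, 0], 3, 3)
```

In this sketch the identity theta [1, 0, 0, 0, 1, 0] returns the input unchanged, so a cropper that predicts theta only has to learn the deviation from "no crop".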

Enhancer
- Pretrained EDSR (trained on DIV2K)
- Modified form of ResNet
- Uses a modified residual block, which excludes batch normalization and the final ReLU layer
- 16 residual blocks
- Subpixel Conv2D layers for upscaling the image
- Scales the image 4x
https://github.com/krasserm/super-resolution
Lim, Son, Kim, Nah, Lee. “Enhanced Deep Residual Networks for Single Image Super-Resolution”. 10 July 2017
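A minimal NumPy sketch of the two building blocks named on this slide: a residual block without batch normalization or a trailing ReLU, and the sub-pixel (depth-to-space) rearrangement used for upscaling. It is illustrative only. The conv1 and conv2 arguments are placeholder callables standing in for trained convolution layers, the channel ordering is assumed to follow the TensorFlow depth_to_space convention, and reaching 4x with two 2x steps is one common arrangement rather than a confirmed detail of the model used here.

```python
import numpy as np

def residual_block(x, conv1, conv2, scale=1.0):
    """EDSR-style block: conv -> ReLU -> conv, then add the input back.

    There is no batch normalization and no ReLU after the addition.
    conv1 and conv2 are hypothetical callables (stand-ins for trained layers).
    """
    y = conv1(x)
    y = np.maximum(y, 0.0)   # the only activation in the block
    y = conv2(y)
    return x + scale * y

def depth_to_space(x, r):
    """Rearrange (H, W, C*r*r) features into an (H*r, W*r, C) image."""
    h, w, c = x.shape
    if c % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    out_c = c // (r * r)
    x = x.reshape(h, w, r, r, out_c)   # split channels into (row, col, C)
    x = x.transpose(0, 2, 1, 3, 4)     # interleave: (H, r, W, r, C)
    return x.reshape(h * r, w * r, out_c)

def upscale_4x(features):
    """Two successive 2x sub-pixel steps (the convs between them are omitted)."""
    return depth_to_space(depth_to_space(features, 2), 2)

# Usage: 16 channels at 8x8 become a single-channel 32x32 map.
features = np.random.default_rng(0).normal(size=(8, 8, 16))
upscaled = upscale_4x(features)
```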

Combined Cropper and Enhancer
Trained with 2 outputs and 2 loss functions:
- Trained Cropper on VGG + Cosine Proximity (Inception Loss)
- Trained Enhancer on VGG + MSE
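The two loss terms named here can be sketched in NumPy as follows. This assumes the inputs are already feature arrays of shape (batch, ...) produced by some feature extractor such as a VGG layer; the extraction itself is not shown, and the sign convention (negative cosine similarity, so that lower is better) follows the older Keras cosine_proximity loss. The function names are hypothetical.

```python
import numpy as np

def cosine_proximity(y_true, y_pred, eps=1e-12):
    """Negative mean cosine similarity between flattened feature vectors."""
    t = np.asarray(y_true, dtype=np.float64)
    p = np.asarray(y_pred, dtype=np.float64)
    t = t.reshape(len(t), -1)
    p = p.reshape(len(p), -1)
    t = t / np.maximum(np.linalg.norm(t, axis=1, keepdims=True), eps)
    p = p / np.maximum(np.linalg.norm(p, axis=1, keepdims=True), eps)
    return float(-np.mean(np.sum(t * p, axis=1)))

def mean_squared_error(y_true, y_pred):
    """Mean of squared element-wise differences."""
    t = np.asarray(y_true, dtype=np.float64)
    p = np.asarray(y_pred, dtype=np.float64)
    return float(np.mean((t - p) ** 2))

# Usage with toy "features": identical inputs give the minimum of each loss.
feats = np.array([[1.0, 0.0], [0.0, 2.0]])
same_direction = cosine_proximity(feats, feats)      # close to -1
opposite = cosine_proximity(feats, -feats)           # close to +1
```

Cosine proximity ignores the magnitude of the features and compares only their direction, whereas MSE penalizes any per-element difference.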

Results
Metric\Model   Cropper   Cropper & Enhancer
PSNR           11.1903   16.2060
SSIM           0.4254    0.4909
MSE            0.0796    0.0281
MOS            2.6143    2.8857

Results (image panels): Cropper & Enhancer, Input, Cropper, Actual
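For reference, MSE and PSNR are linked by the standard definition PSNR = 10 * log10(MAX^2 / MSE). The sketch below assumes images are float arrays scaled to [0, 1] (so MAX = 1), which is an assumption about how the reported numbers were computed; the function names are hypothetical.

```python
import numpy as np

def mse(reference, estimate):
    """Mean squared error between two images of the same shape."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    return float(np.mean((ref - est) ** 2))

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(reference, estimate)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / err))

# Usage: a uniform error of 0.1 gives an MSE of about 0.01, i.e. about 20 dB.
truth = np.full((4, 4, 3), 0.5)
noisy = truth + 0.1
score = psnr(truth, noisy)
```

Note that a PSNR averaged per image over a dataset generally differs from the PSNR computed from the dataset-averaged MSE, so the two rows of a results table need not convert into each other exactly.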

Synthetic Dataset
Problem: There is no existing dataset to use when solving this problem, and taking pictures takes too much time
Solution: Automatically generate images with various transformations over various backgrounds
- Current problems: sometimes image edges get cut off, it is difficult to get the full variety of possible images, the generator doesn’t yet account for discoloration or image noise, and the dataset only includes birds
http://www.vision.caltech.edu/visipedia/CUB-200.html, http://places2.csail.mit.edu/download.html
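One way such a generator could work is sketched below: draw a random rotation, scale, and shift, paste the picture onto a background through that transform, and keep the transform as the training target. This is a guess at the general approach, not the authors' generator. All names and parameter ranges are hypothetical, sampling is nearest-neighbor for brevity, and the ranges are chosen so that the picture stays inside the frame.

```python
import numpy as np

def random_affine(rng, max_angle=np.pi / 6, scale_range=(0.3, 0.5), max_shift=0.2):
    """Random 2x3 matrix mapping picture coords in [-1, 1] to background coords."""
    angle = rng.uniform(-max_angle, max_angle)
    scale = rng.uniform(*scale_range)
    tx, ty = rng.uniform(-max_shift, max_shift, size=2)
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[scale * c, -scale * s, tx],
                     [scale * s,  scale * c, ty]])

def composite(picture, background, forward):
    """Paste picture onto background through the affine map `forward`."""
    bh, bw, channels = background.shape
    ph, pw = picture.shape[:2]
    lin, shift = forward[:, :2], forward[:, 2]
    ys, xs = np.meshgrid(np.linspace(-1, 1, bh), np.linspace(-1, 1, bw),
                         indexing="ij")
    bg_coords = np.stack([xs.ravel(), ys.ravel()])              # (2, N)
    # Invert the map: where in the picture does each background pixel land?
    pic_coords = np.linalg.inv(lin) @ (bg_coords - shift[:, None])
    px = np.rint((pic_coords[0] + 1) * (pw - 1) / 2).astype(int)
    py = np.rint((pic_coords[1] + 1) * (ph - 1) / 2).astype(int)
    inside = (px >= 0) & (px < pw) & (py >= 0) & (py < ph)
    out = background.reshape(-1, channels).copy()
    out[inside] = picture[py[inside], px[inside]]
    return out.reshape(background.shape), inside.reshape(bh, bw)

# Usage: one synthetic "photo" plus its ground-truth transform.
rng = np.random.default_rng(0)
picture = np.ones((8, 8, 3))
background = np.zeros((32, 32, 3))
theta = random_affine(rng)
photo, mask = composite(picture, background, theta)
```

Under the spatial-transformer convention (output coordinates mapped to input coordinates), the same 2x3 matrix used to paste the picture is also the crop that recovers it, so it can serve directly as a supervised target for the cropper.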

Synthetic Dataset Results
Models and panels: Original Cropper + Enhancer, Projective Cropper + Enhancer, Cropper w/ SD, Cropper w/ 2 SDs, Original Cropper, Projective Cropper, Input, Truth
* Note: Used a separate validation dataset that none of the networks had been trained on.
PSNR   12.5088   12.3735   12.6044   12.8737   12.8537   13.0741
SSIM   0.3366    0.3335    0.3450    0.3560    0.3437    0.3489
MSE    0.0609    0.0586    0.0554    0.0578    0.0553

New Direction
We want to apply our network to solve more diverse problems
New problem: Identification and cropping of logos in images
3456 × 2304

Challenges:
- Dataset, though more diverse, is still difficult to collect
- Logos come in different orientations, shapes, and formats
- Have to design a GAN that can train on only adversarial loss
- Sizing: cropper can only accept one image size
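Training on only adversarial loss means the cropper (acting as the generator) receives no per-pixel target and is scored solely by a discriminator. A minimal sketch of the standard non-saturating GAN losses, written in NumPy over discriminator output probabilities, might look like this. It is a generic formulation, not the design planned here, and the function names are hypothetical.

```python
import numpy as np

def binary_cross_entropy(probs, target, eps=1e-7):
    """Mean BCE between predicted probabilities and a constant 0/1 target."""
    p = np.clip(np.asarray(probs, dtype=np.float64), eps, 1 - eps)
    return float(-np.mean(target * np.log(p) + (1 - target) * np.log(1 - p)))

def discriminator_loss(d_real, d_fake):
    """Discriminator wants real crops scored 1 and generated crops scored 0."""
    return binary_cross_entropy(d_real, 1.0) + binary_cross_entropy(d_fake, 0.0)

def generator_loss(d_fake):
    """Generator (here, the cropper) wants its crops scored as real."""
    return binary_cross_entropy(d_fake, 1.0)

# Usage: an undecided discriminator (all 0.5) gives log(2) per term.
undecided = np.array([0.5, 0.5])
g = generator_loss(undecided)
d = discriminator_loss(undecided, undecided)
```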

Next Week
- Create GAN to train cropper on logo dataset
- Continue working on cropper design (testing projective transformation, different numbers of layers)
- Try new enhancer networks
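For context on the projective option: an affine map has 6 free parameters, while a projective map (homography) has 8 and adds a per-point division that can model perspective foreshortening. The NumPy sketch below shows the difference on a few points; the function names are hypothetical and it says nothing about how the cropper itself would parameterize the transform.

```python
import numpy as np

def affine_to_homography(theta):
    """Embed a 6-number affine theta as a 3x3 matrix with last row [0, 0, 1]."""
    return np.vstack([np.asarray(theta, dtype=np.float64).reshape(2, 3),
                      [0.0, 0.0, 1.0]])

def apply_homography(H, points):
    """Map (N, 2) points through a 3x3 homography, dividing by the third coord."""
    H = np.asarray(H, dtype=np.float64).reshape(3, 3)
    pts = np.asarray(points, dtype=np.float64)
    homog = np.hstack([pts, np.ones((len(pts), 1))])   # (N, 3)
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Usage: the same two points under an affine and a projective transform.
points = np.array([[1.0, 0.0], [-1.0, 0.0]])
affine = affine_to_homography([0.5, 0.0, 0.1, 0.0, 0.5, -0.2])
perspective = np.array([[1.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0],
                        [0.5, 0.0, 1.0]])   # non-trivial last row
affine_out = apply_homography(affine, points)
perspective_out = apply_homography(perspective, points)
```

With the last row fixed to [0, 0, 1] the division is by 1 and the result is the plain affine map; a non-trivial last row moves the two points by different factors, which an affine transform cannot do.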