Deep screen image crop and enhance

Week 7 (Aaron Ott, Amir Mazaheri)

Problem
We have taken a photo of an image, and we want to recover the original image. The network for this can be broken into 2 parts:
- Image Detector/Cropper
- Image Enhancer

Cropper
- Uses a frozen VGG-19 model to get a feature map
- Applies convolutions, normalizations, and activations
- Final dense layer creates a 6-number theta value for the affine transformation
- STN (spatial transformer network) takes the input image and applies the affine transformation
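The way a 6-number theta drives the crop can be sketched in NumPy. This is only an illustration of the standard spatial-transformer sampling step, not the code used in the project: the function names, the grayscale 2-D input, and the normalized [-1, 1] coordinate convention are assumptions.

```python
import numpy as np

def affine_grid(theta, height, width):
    """For each output pixel, compute the input location to sample from.

    theta holds 6 numbers, read as a 2x3 matrix acting on normalized
    coordinates (x, y, 1) in [-1, 1]. This convention is an assumption.
    """
    theta = np.asarray(theta, dtype=np.float64).reshape(2, 3)
    ys, xs = np.meshgrid(np.linspace(-1.0, 1.0, height),
                         np.linspace(-1.0, 1.0, width), indexing="ij")
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(height * width)])
    src = theta @ coords  # shape (2, height * width)
    return src[0].reshape(height, width), src[1].reshape(height, width)

def bilinear_sample(image, src_x, src_y):
    """Sample a 2-D image at normalized coordinates with bilinear interpolation."""
    h, w = image.shape
    x = (src_x + 1.0) * (w - 1) / 2.0  # normalized -> pixel coordinates
    y = (src_y + 1.0) * (h - 1) / 2.0
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    wx = np.clip(x - x0, 0.0, 1.0)  # clipping clamps out-of-range samples to the border
    wy = np.clip(y - y0, 0.0, 1.0)
    top = image[y0, x0] * (1 - wx) + image[y0, x0 + 1] * wx
    bottom = image[y0 + 1, x0] * (1 - wx) + image[y0 + 1, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy

def apply_affine(image, theta, out_shape=None):
    """Warp a 2-D image with theta, as the sampling stage of an STN would."""
    h, w = out_shape if out_shape is not None else image.shape
    src_x, src_y = affine_grid(theta, h, w)
    return bilinear_sample(image, src_x, src_y)
```

With theta = [1, 0, 0, 0, 1, 0] the image is returned unchanged, while theta = [0.5, 0, 0, 0, 0.5, 0] samples only the central half of the input in each direction, which is the kind of crop the dense layer is meant to predict.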

Enhancer
- Pretrained EDSR (trained on DIV2K)
- Modified form of ResNet
- Uses a modified residual block, which excludes batch normalization and the final ReLU layer
- 16 residual blocks
- Subpixel Conv2D layers for upscaling the image
- Scales the image 4x
Lim, Son, Kim, Nah, Lee. “Enhanced Deep Residual Networks for Single Image Super-Resolution”. 10 July 2017
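The rearrangement behind a subpixel Conv2D layer can be sketched as follows. This shows only the pixel-shuffle (depth-to-space) step that follows the convolution; the channels-last layout and the function name are assumptions, and reaching 4x through two 2x stages is one common arrangement rather than a detail confirmed by the slides.

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange an (H, W, C*r*r) array into (H*r, W*r, C).

    Each group of r*r channels becomes an r-by-r block of output pixels,
    so the preceding convolution learns the upscaled detail as extra channels.
    """
    h, w, c = x.shape
    if c % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = c // (r * r)
    x = x.reshape(h, w, r, r, c_out)   # split channels into (row offset, col offset, channel)
    x = x.transpose(0, 2, 1, 3, 4)     # interleave offsets with the spatial axes
    return x.reshape(h * r, w * r, c_out)

# Two 2x stages give an overall 4x upscale (hypothetical feature map).
features = np.zeros((8, 8, 16))
upscaled = pixel_shuffle(pixel_shuffle(features, 2), 2)  # shape (32, 32, 1)
```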

Combined Cropper and Enhancer
Trained with 2 outputs and 2 loss functions:
- Trained Cropper on VGG + Cosine Proximity (Inception Loss)
- Trained Enhancer on VGG + MSE
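The two loss terms can be sketched in NumPy as below. The inputs stand in for feature maps that a frozen VGG network would produce for the target and predicted images; the feature extraction itself is omitted, and the function names and the sign convention (negative cosine similarity, so lower is better) are assumptions rather than the project's actual code.

```python
import numpy as np

def cosine_proximity_loss(feat_true, feat_pred, eps=1e-8):
    """Negative mean cosine similarity between per-sample flattened features."""
    t = feat_true.reshape(feat_true.shape[0], -1)
    p = feat_pred.reshape(feat_pred.shape[0], -1)
    dot = np.sum(t * p, axis=1)
    norms = np.linalg.norm(t, axis=1) * np.linalg.norm(p, axis=1)
    return -float(np.mean(dot / (norms + eps)))

def mse_loss(feat_true, feat_pred):
    """Mean squared error between two feature arrays of the same shape."""
    return float(np.mean((feat_true - feat_pred) ** 2))
```

Cosine proximity depends only on the direction of the feature vectors, so it ignores overall feature magnitude, whereas MSE penalizes magnitude differences as well.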

Results
Metric   Cropper   Cropper & Enhancer
PSNR     11.1903   16.2060
SSIM     0.4254    0.4909
MSE      0.0796    0.0281
MOS      2.6143    2.8857

Results (figure panels): Input, Cropper, Cropper & Enhancer, Actual
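For reference, PSNR is derived from MSE. The sketch below assumes pixel values scaled to [0, 1] (so the peak value is 1.0) and that the reported PSNR is an average of per-image values; neither detail is stated on the slides, and the function names are illustrative.

```python
import numpy as np

def mse(a, b):
    """Mean squared error over all pixels."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def psnr(a, b, max_value=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    err = mse(a, b)
    if err == 0.0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / err))

def mean_psnr(batch_a, batch_b, max_value=1.0):
    """Average of per-image PSNR values over a batch."""
    return float(np.mean([psnr(a, b, max_value) for a, b in zip(batch_a, batch_b)]))
```

Under these assumptions, plugging the reported mean MSE values into the formula gives about 10.99 dB and 15.51 dB, slightly below the reported PSNR figures. That gap is what per-image averaging would produce, since the mean of per-image PSNR is never lower than the PSNR of the mean MSE.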

Synthetic Dataset
Problem: There is no existing dataset for this problem, and taking pictures takes too much time.
Solution: Automatically generate images with various transformations over various backgrounds.
Current problems:
- Image edges sometimes get cut off
- It is difficult to cover the full variety of possible images
- Does not yet account for discoloration or image noise
- Dataset only includes birds
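One way such a generator could work is sketched below: draw a random affine transform, then paste the source picture onto a background through it. All names, parameter ranges, and the nearest-neighbor sampling are assumptions for illustration; the actual generator may differ (for example, it may use projective transforms).

```python
import numpy as np

def random_affine(rng, max_rotation=0.3, scale_range=(0.4, 0.7), max_shift=0.2):
    """Random 2x3 matrix (rotation in radians, scale, shift) in normalized [-1, 1] coordinates."""
    angle = rng.uniform(-max_rotation, max_rotation)
    scale = rng.uniform(*scale_range)
    tx, ty = rng.uniform(-max_shift, max_shift, size=2)
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[scale * c, -scale * s, tx],
                     [scale * s,  scale * c, ty]])

def corners_inside(matrix):
    """True if all four picture corners land inside the background frame."""
    corners = np.array([[-1, -1, 1], [1, -1, 1], [-1, 1, 1], [1, 1, 1]], dtype=float).T
    return bool(np.all(np.abs(np.asarray(matrix, dtype=float) @ corners) <= 1.0))

def composite(picture, background, matrix):
    """Paste picture onto background; matrix maps picture coords to background coords."""
    bh, bw = background.shape[:2]
    ph, pw = picture.shape[:2]
    ys, xs = np.meshgrid(np.linspace(-1, 1, bh), np.linspace(-1, 1, bw), indexing="ij")
    # Invert the transform so each background pixel looks up its source pixel.
    inv = np.linalg.inv(np.vstack([np.asarray(matrix, dtype=float), [0.0, 0.0, 1.0]]))
    src = inv @ np.stack([xs.ravel(), ys.ravel(), np.ones(bh * bw)])
    sx, sy = src[0].reshape(bh, bw), src[1].reshape(bh, bw)
    inside = (np.abs(sx) <= 1) & (np.abs(sy) <= 1)
    px = np.clip(np.rint((sx + 1) * (pw - 1) / 2).astype(int), 0, pw - 1)
    py = np.clip(np.rint((sy + 1) * (ph - 1) / 2).astype(int), 0, ph - 1)
    out = background.copy()
    out[inside] = picture[py[inside], px[inside]]  # nearest-neighbor copy
    return out, inside
```

A generator built this way could redraw the transform whenever `corners_inside` returns False, which is one possible way to avoid samples whose edges are cut off.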

Synthetic Dataset Results
Figure panels: Input, Original Cropper, Projective Cropper, Cropper w/ SD, Cropper w/ 2 SDs, Original Cropper + Enhancer, Projective Cropper + Enhancer, Truth
* Note: Used a separate validation data set that none of the networks had been trained on.
PSNR:
SSIM: 0.3366 0.3335 0.3450 0.3560 0.3437 0.3489
MSE: 0.0609 0.0586 0.0554 0.0578 0.0553

New Direction
We want to apply our network to solve more diverse problems.
New problem: Identification and cropping of logos in images
3456 × 2304

Challenges:
- Dataset, though more diverse, is still difficult to collect
- Logos come in different orientations, shapes, and formats
- Have to design a GAN that can train on only adversarial loss
- Sizing: cropper can only accept one image size

Next Week
- Create GAN to train cropper on logo dataset
- Continue working on cropper design (testing projective transformation, different numbers of layers)
- Try new enhancer networks
