Unrolling the shutter: CNN to correct motion distortions

Presentation transcript:

Unrolling the shutter: CNN to correct motion distortions
Vijay Rengarajan, Yogesh Balaji, A.N. Rajagopalan
Image Processing and Computer Vision Lab, Department of Electrical Engineering, Indian Institute of Technology Madras

Camera Motion Causes Rolling Shutter Distortions
Motion blur, lens distortions
Mobile phones, drone cameras, Streetview capture

Sequential Exposure of Rolling Shutter
Global shutter (CCD image sensor): all pixels expose at the same time. Every row, from the top row to the bottom row, opens and closes its exposure together, for exposure time te.
Rolling shutter (CMOS image sensor): each row starts exposing sequentially. Every row still exposes for te, and Td is the total line delay from the top row to the bottom row.
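
The timing on this slide can be sketched in a few lines. This is a minimal illustration, assuming row starts are evenly spaced so that the bottom row opens Td after the top row. The function name and the numeric values are hypothetical and do not come from the slides.

```python
import numpy as np

def row_exposure_windows(num_rows, exposure_time, total_line_delay):
    """Return per-row (open, close) exposure times in seconds.

    Assumption: row starts are evenly spaced, so the bottom row opens
    total_line_delay after the top row. total_line_delay = 0 models a
    global shutter.
    """
    opens = np.linspace(0.0, total_line_delay, num_rows)
    closes = opens + exposure_time
    return opens, closes

# Hypothetical numbers, chosen only for illustration.
gs_open, gs_close = row_exposure_windows(256, exposure_time=0.002, total_line_delay=0.0)
rs_open, rs_close = row_exposure_windows(256, exposure_time=0.002, total_line_delay=0.030)
```

With a global shutter every row shares one window. With a rolling shutter the windows have the same length but start later for each successive row.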

Rolling Shutter Distortions are Geometric
Different rows see the scene at different poses of the moving camera.
Even a short exposure causes distortions.
Figure: the scene and the captured image over time for two camera motions, rotation rz (about the z axis) and translation tx (along the x axis).
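
A toy example of the row-wise geometry, restricted to horizontal translation. It assumes integer pixel shifts that wrap around at the image border, which is a simplification for illustration only. The helper `apply_row_shifts` and the motion values are hypothetical.

```python
import numpy as np

def apply_row_shifts(image, shifts):
    """Shift each row horizontally by its own integer offset (wraps at the border)."""
    return np.stack([np.roll(row, int(s)) for row, s in zip(image, shifts)])

height, width = 8, 16
scene = np.zeros((height, width), dtype=np.uint8)
scene[:, 4] = 1                      # a straight vertical line

# Hypothetical tx per row: the camera keeps moving while rows are read out.
tx = np.arange(height)               # row y is shifted by y pixels
captured = apply_row_shifts(scene, tx)

# With known row-wise motion, undoing each row's shift restores the scene.
restored = apply_row_shifts(captured, -tx)
line_columns = captured.argmax(axis=1)   # column of the line in each row
```

The vertical line lands in a different column in every row of `captured`, so it appears slanted, and the scene is recovered once the per-row motion is known.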

Correct Rolling Shutter Distortions from a Single Image
Rolling shutter distortion disturbs visual appeal and affects scene inference.
Single-image ambiguity: is it a curved building or a rolling shutter effect?

Prior Works on Rolling Shutter Correction
Rengarajan et al., CVPR (2016): for urban scenes. Figure: rolling shutter image, curvatures, corrected image.
Heflin et al., Conf. Biometrics (2010): for faces. Corrected using facial features.
Video rolling shutter correction: Ringaby and Forssen, CVPR (2010), IJCV (2012); Grundmann et al., ICCP (2012). These use frame-to-frame point correspondences.
Need:
A single method that can be used for different classes of images.
Different levels of features to extract motion and to discard feature outliers.

Let machines extract desired features
Existing approach: rolling shutter distorted image -> feature extraction -> motion estimation -> distortion correction -> corrected image.
CNN-based approach: rolling shutter distorted image -> CNN -> motion fitting -> distortion correction -> corrected image.
1. Convolutional neural network
   Input: rolling shutter image, 256x256x3.
   Output: translation and rotation (tx, rz), as 15 tx and 15 rz motion samples at equally spaced rows.
   Train for different classes of images.
2. Motion fitting: polynomial trajectory to get tx and rz for each row.
3. Distortion correction: inverse warping based on row-wise motion.
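
The motion fitting step can be sketched with an ordinary least-squares polynomial fit. This is a sketch under stated assumptions, not the authors' implementation: the polynomial degree, the layout of the 30-value output (15 tx values followed by 15 rz values), and the function name `fit_row_motion` are all hypothetical. The CNN output is replaced by synthetic values.

```python
import numpy as np

NUM_ROWS = 256      # image height from the slide
NUM_SAMPLES = 15    # motion samples per parameter from the slide

def fit_row_motion(samples, degree=3, num_rows=NUM_ROWS):
    """Fit a polynomial to equally spaced motion samples; evaluate at every row.

    Assumptions: samples sit at rows linspace(0, num_rows - 1), and the
    degree is a free choice (the slide does not state one).
    """
    samples = np.asarray(samples, dtype=float)
    sample_pos = np.linspace(0.0, 1.0, len(samples))   # normalised row position
    coeffs = np.polyfit(sample_pos, samples, degree)
    all_pos = np.linspace(0.0, 1.0, num_rows)
    return np.polyval(coeffs, all_pos)

# Stand-in for the CNN output: 15 tx values followed by 15 rz values.
t = np.linspace(0.0, 1.0, NUM_SAMPLES)
cnn_output = np.concatenate([4.0 * t**2 - 1.0 * t, 0.02 * t**3])
tx_rows = fit_row_motion(cnn_output[:NUM_SAMPLES])
rz_rows = fit_row_motion(cnn_output[NUM_SAMPLES:])
```

Row positions are normalised to [0, 1] before fitting to keep the least-squares problem well conditioned. The resulting `tx_rows` and `rz_rows` hold one motion value per image row, which is what a row-wise inverse warp needs.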

VanillaCNN with square filters
Vanilla Convolutional Neural Network (VanillaCNN)
Output: motion, tx and rz at 15 rows. Loss: mean squared error.
Examples: translations only; translations and rotations. Each shows a distorted image and the image corrected by VanillaCNN.

Ideas for a new architecture
Initial feature extraction, then feature combination. Any better ideas?
Along rows: motion constancy.
Along columns: temporal motion.
Rotation can be better estimated if information from image rows is extracted earlier.

Use long filters for RowColCNN
Filter size: h x w x c.
Filters that are long along rows capture information in rows early.
Filters that are long along columns capture information along the time dimension early.
Output: motion. Loss: mean squared error.
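
To make the filter shapes concrete, the sketch below compares a square filter with long row and column filters on a single-channel image. The sizes (5x5, 1x25, 25x1) are hypothetical, since the slide gives only the generic shape h x w x c, and `correlate_valid` is an illustrative helper rather than part of any network.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def correlate_valid(image, kernel):
    """Single-channel 'valid' cross-correlation, as used in CNN layers."""
    windows = sliding_window_view(image, kernel.shape)   # (H-h+1, W-w+1, h, w)
    return np.einsum("ijkl,kl->ij", windows, kernel)

rng = np.random.default_rng(0)
image = rng.random((64, 64))

# Hypothetical sizes; the slide gives only the generic shape h x w x c.
square_filter = np.full((5, 5), 1 / 25)
row_filter = np.full((1, 25), 1 / 25)    # long along a row: one exposure instant
col_filter = np.full((25, 1), 1 / 25)    # long along a column: spans time

square_out = correlate_valid(image, square_filter)
row_out = correlate_valid(image, row_filter)
col_out = correlate_valid(image, col_filter)
```

All three filters have 25 weights, but the row filter aggregates pixels exposed at the same instant, while the column filter aggregates pixels exposed at 25 successive instants.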

Training Data Generation
Generate a random polynomial camera trajectory.
Apply it on an undistorted image.
Datasets
Chessboard: 7k
Urban scenes: 300k (Sun, Oxford, Zurich)
Faces: 250k (LFW)
The datasets do not have the rolling shutter effect, so we generate it synthetically. Even so, the method works for real images that we captured using mobile phones.
Correction
Get camera motion values from the CNN.
Fit a polynomial trajectory and get motion at all rows.
Correct the distorted image using target-to-source mapping.
Comparison: corrected by VanillaCNN versus corrected by RowColCNN.
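
The training data generation step might look like the following sketch. It assumes a simplified motion model (in-plane rotation about the image centre plus horizontal translation) with nearest-neighbour sampling. The polynomial degree, the motion ranges, and the function names are hypothetical, and random pixels stand in for an undistorted dataset image.

```python
import numpy as np

def random_trajectory(rng, num_rows, degree=3, max_tx=10.0, max_rz=0.05):
    """Random polynomial tx (pixels) and rz (radians) for every row.

    degree, max_tx and max_rz are hypothetical; the slides give no values.
    """
    p = np.linspace(0.0, 1.0, num_rows)
    tx = np.polyval(rng.uniform(-1, 1, degree + 1), p)
    rz = np.polyval(rng.uniform(-1, 1, degree + 1), p)
    # Scale so the largest magnitude equals the chosen maximum.
    tx *= max_tx / max(np.abs(tx).max(), 1e-12)
    rz *= max_rz / max(np.abs(rz).max(), 1e-12)
    return tx, rz

def apply_rolling_shutter(image, tx, rz):
    """Warp an undistorted image so that row y is seen at pose (tx[y], rz[y]).

    Assumed model: in-plane rotation about the image centre plus horizontal
    translation, sampled with nearest-neighbour lookup.
    """
    h, w = image.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    # Target-to-source mapping: undo the pose of the target row.
    dx = xs - cx - tx[:, None]
    dy = ys - cy
    cos, sin = np.cos(rz)[:, None], np.sin(rz)[:, None]
    src_x = np.rint(cos * dx + sin * dy + cx).astype(int)
    src_y = np.rint(-sin * dx + cos * dy + cy).astype(int)
    inside = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(image)
    out[inside] = image[src_y[inside], src_x[inside]]
    return out

rng = np.random.default_rng(0)
clean = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)
tx, rz = random_trajectory(rng, num_rows=256)
distorted = apply_rolling_shutter(clean, tx, rz)

# Training label: motion at 15 equally spaced rows, as on the pipeline slide.
sample_rows = np.linspace(0, 255, 15).round().astype(int)
label = np.concatenate([tx[sample_rows], rz[sample_rows]])
```

Because the pose depends only on the row of the distorted image, the distortion can be produced directly with a target-to-source lookup. Each (`distorted`, `label`) pair is then one training example.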

Correction Results of RowColCNN

Learning excels in challenging conditions
Figure columns: distorted input, geometry-based correction, learning-based correction.
Rengarajan et al. (2016) fail due to tree branches, which are naturally curved. Compared results: Rengarajan et al. 2016 versus RowColCNN.
Heflin et al. (2010) fail due to wrong estimation of facial features in varied illumination conditions. Compared results: Heflin et al. 2010 versus RowColCNN.

New CNN filter shapes inspired by application
New learning-based method for single image rolling shutter correction.
CNN learns image-to-motion mapping.
Long filters in CNN architecture for rolling shutter exposure.
Poster 21 AM
apvijay.github.io/rs_rect_cnn