Rotational Rectification Network for Robust Pedestrian Detection

Rotational Rectification Network for Robust Pedestrian Detection. Xinshuo Weng, Shangxuan Wu, Wentao Han. RI MSCV, May 3, 2017. This work has been submitted to BMVC 2017.

Motivations: Benchmark datasets (Caltech, ETH and INRIA) make an "upright" assumption: pedestrians stand upright on the ground, and the camera image plane is roughly orthogonal to the ground plane. This assumption is easily invalidated in the real world, for example with mobile phone cameras, UAVs, and construction vehicles on rugged terrain. This leads to poor performance from general pedestrian detectors. (Presenter note: cover the main task first, then the motivation.)

Visualization: Detection results from a general detector without rotational robustness, on the rotated Caltech dataset and on frames from YouTube videos.

Overall Architecture

Approach: Global Polar Pooling (GP-Pooling); Rotation Estimation Module; Rotational Rectification Network (R2N).

Global Polar Pooling

Visualization of Global Polar Pooling: rotational changes in the input image appear as translational shifts in the responses.
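The slide does not give the exact definition of GP-Pooling, so the following is only a minimal sketch of the general idea, assuming a pooling that samples a square feature map along rays from its center and takes the maximum along each ray. The names `bilinear`, `polar_pool` and `rot90_ccw`, and the choice of 8 angular bins and 4 radii, are illustrative assumptions and not the authors' implementation. It shows that rotating the input by 90 degrees circularly shifts the pooled angular profile by a quarter of the bins.

```python
import math


def bilinear(feature, y, x):
    """Bilinearly interpolate a square 2D list at (row y, col x), clamped to the border."""
    n = len(feature)
    y = min(max(y, 0.0), n - 1.0)
    x = min(max(x, 0.0), n - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, n - 1), min(x0 + 1, n - 1)
    wy, wx = y - y0, x - x0
    top = (1 - wx) * feature[y0][x0] + wx * feature[y0][x1]
    bottom = (1 - wx) * feature[y1][x0] + wx * feature[y1][x1]
    return (1 - wy) * top + wy * bottom


def polar_pool(feature, n_angles=8, n_radii=4):
    """Return one value per angular bin: the max of samples along that ray.

    Hypothetical stand-in for a polar pooling operator, not the paper's GP-Pooling.
    """
    n = len(feature)
    c = (n - 1) / 2.0  # geometric center of the grid
    pooled = []
    for k in range(n_angles):
        theta = 2.0 * math.pi * k / n_angles
        ray = []
        for i in range(1, n_radii + 1):
            r = c * i / n_radii  # radii stay inside the map
            # Rows grow downward, so "up" is a decreasing row index.
            ray.append(bilinear(feature, c - r * math.sin(theta), c + r * math.cos(theta)))
        pooled.append(max(ray))
    return pooled


def rot90_ccw(feature):
    """Rotate a square 2D list by 90 degrees counterclockwise."""
    n = len(feature)
    return [[feature[j][n - 1 - i] for j in range(n)] for i in range(n)]


feature = [[float((3 * i + 7 * j) % 11) for j in range(5)] for i in range(5)]
before = polar_pool(feature)
after = polar_pool(rot90_ccw(feature))
shift = 8 // 4  # a 90-degree rotation moves the profile by a quarter of the bins
expected = before[-shift:] + before[:-shift]
print(all(abs(a - e) < 1e-9 for a, e in zip(after, expected)))
```

Because the rotation only shifts the pooled profile, a later stage can read the rotation angle off the position of the response rather than having to learn every rotated appearance separately.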

Rotation Estimation Module. Layers: red = convolution, yellow = max pooling, gray = normalization, green = concatenation, magenta = flatten, cyan = fully connected. Module input: image features. Module output: the rotation angle present in the image features.

Rotational Rectification Network. Plug-in property: R2N can be flexibly inserted between any intermediate layers prior to detection.
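The slide does not spell out how the rectification step is carried out, so the following is only a sketch under the assumption that rectifying means warping the feature map by the inverse of the estimated angle before handing it to the detector. The functions `rotate_map` and `rectify` are hypothetical names, and the angle is passed in directly rather than predicted by a rotation estimation module.

```python
import math


def rotate_map(feature, angle_deg, eps=1e-9):
    """Rotate a square 2D list counterclockwise by angle_deg about its center.

    Uses inverse mapping with bilinear sampling. Output cells whose source
    point falls outside the map are left at 0.0.
    """
    n = len(feature)
    c = (n - 1) / 2.0
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            u, v = j - c, c - i  # centered coordinates with y pointing up
            # Source point is the output point rotated by -angle.
            su = cos_a * u + sin_a * v
            sv = -sin_a * u + cos_a * v
            x, y = c + su, c - sv
            if x < -eps or y < -eps or x > n - 1 + eps or y > n - 1 + eps:
                continue
            x = min(max(x, 0.0), n - 1.0)
            y = min(max(y, 0.0), n - 1.0)
            x0, y0 = int(math.floor(x)), int(math.floor(y))
            x1, y1 = min(x0 + 1, n - 1), min(y0 + 1, n - 1)
            wx, wy = x - x0, y - y0
            top = (1 - wx) * feature[y0][x0] + wx * feature[y0][x1]
            bottom = (1 - wx) * feature[y1][x0] + wx * feature[y1][x1]
            out[i][j] = (1 - wy) * top + wy * bottom
    return out


def rectify(feature, estimated_angle_deg):
    """Undo an estimated rotation (assumed rectification step, not the authors' code)."""
    return rotate_map(feature, -estimated_angle_deg)


upright = [[float(i * 5 + j) for j in range(5)] for i in range(5)]
rotated = rotate_map(upright, 90)   # simulate a rotated input
restored = rectify(rotated, 90)     # pretend the estimator returned 90 degrees
print(all(abs(restored[i][j] - upright[i][j]) < 1e-9
          for i in range(5) for j in range(5)))
```

Since the warp takes a feature map in and returns one of the same shape, a step like this can in principle sit between any two layers, which is consistent with the plug-in property described on the slide.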

Evaluation Results. The GP-Pooling operator: on the rotated MNIST dataset and on the rotated Caltech dataset. Rotation-invariant pedestrian detection: original Caltech dataset, rotated Caltech dataset, and a naturally rotated camera dataset (from YouTube videos). So let's talk about the evaluation results. First, we evaluated our proposed GP-Pooling operator on both the rotated MNIST digit dataset and the rotated Caltech dataset. On both datasets, we achieve state-of-the-art results on the rotation angle estimation task. Second, we tested on our main task, pedestrian detection. As you can see from the lower figures, we obtain a much lower miss rate on rotated images than the state-of-the-art pedestrian detector, while performance on upright images drops by only 1%.

Qualitative Results: Rotation-Invariant Pedestrian Detection. Here are some qualitative results from our rotated pedestrian detection experiment. The yellow bounding box is the output of the state-of-the-art pedestrian detector, and the red bounding box is the output of our rotation-invariant pedestrian detector. As you can see, traditional pedestrian detectors fail when the rotation angle is larger than 45 degrees, while our detector still detects people. We can even detect heavily rotated pedestrians at almost 90 degrees of rotation. What's more, we can detect small people in the scene where general detectors fail.

Main Contributions: Proposed a rotation-invariant GP-Pooling operator, which can be used to encode the radial distribution of features. Proposed a Rotational Rectification Network (R2N) that can be inserted into a wide range of CNN-based detectors to achieve rotational invariance. In conclusion, our main contribution in this project is twofold. First, we proposed a rotati………. Secondly, …… Evaluation results show that our rotational rectification network is capable of detecting heavily rotated pedestrians while state-of-the-art detectors fail.