Rotational Rectification Network for Robust Pedestrian Detection

Presentation transcript:

Rotational Rectification Network for Robust Pedestrian Detection. Xinshuo Weng, Shangxuan Wu, Wentao Han. RI MSCV, May 3, 2017. This work has been submitted to BMVC 2017.

Motivations: Benchmark datasets (Caltech, ETH and INRIA) make an “upright” assumption: pedestrians stand upright on the ground, and the camera image plane is roughly orthogonal to the ground plane. This assumption is easily invalidated in the real world, for example with mobile phone cameras, UAVs, and construction vehicles on rugged terrain. This leads to poor performance of general pedestrian detectors. (Speaker note: present the main task first, then the motivation.)

Visualization: Detection results from a general detector without rotational robustness, shown on the rotated Caltech dataset and on frames from YouTube videos. (Speaker note: present the main task first, then the motivation.)

Overall Architecture

Approach: Global Polar Pooling (GP-Pooling); Rotation Estimation Module; Rotational Rectification Network (R2N).

Global Polar Pooling

Visualization: Global Polar Pooling. Rotational changes on the input image correspond to translational shifts on the responses.
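The rotation-to-shift relationship on this slide can be illustrated with a small NumPy sketch. This is a generic polar resampling, not the authors' GP-Pooling operator, whose exact definition is not given in this transcript. The function names `bilinear_sample` and `polar_resample`, the grid sizes, and the choice of bilinear interpolation are all assumptions made for illustration.

```python
import numpy as np

def bilinear_sample(img, ys, xs):
    """Bilinearly interpolate a 2D array at fractional (row, col) positions."""
    h, w = img.shape
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 2)
    dy, dx = ys - y0, xs - x0
    top = (1 - dx) * img[y0, x0] + dx * img[y0, x0 + 1]
    bottom = (1 - dx) * img[y0 + 1, x0] + dx * img[y0 + 1, x0 + 1]
    return (1 - dy) * top + dy * bottom

def polar_resample(img, n_radii=8, n_angles=16):
    """Sample a 2D map on a polar grid centred on the map (hypothetical helper).

    Returns an array of shape (n_radii, n_angles): rows index radius,
    columns index angle.
    """
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radii = np.linspace(0.0, min(cy, cx), n_radii)
    angles = np.arange(n_angles) * (2.0 * np.pi / n_angles)
    rr, aa = np.meshgrid(radii, angles, indexing="ij")
    return bilinear_sample(img, cy + rr * np.sin(aa), cx + rr * np.cos(aa))

rng = np.random.default_rng(0)
image = rng.random((32, 32))
rotated = np.rot90(image)  # rotate the input by 90 degrees

polar_original = polar_resample(image)
polar_rotated = polar_resample(rotated)

# A 90 degree rotation is a quarter of the angle bins, so the polar
# response of the rotated input is a circular shift along the angle axis.
shift = polar_original.shape[1] // 4
shift_matches = np.allclose(polar_rotated, np.roll(polar_original, -shift, axis=1))
```

Here `shift_matches` checks that rotating the input only shifts the polar response circularly along the angle axis, which is the property the slide visualizes.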

Rotational Estimation Module. Layers (color legend): red, convolution; yellow, max pooling; gray, normalization; green, concatenation; magenta, flatten; cyan, fully connected. Module input: image features. Module output: the rotation angle present in the image features.

Rotational Rectification Network. Plug-in property: R2N can be flexibly inserted between any intermediate layers prior to detection.
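One way to picture the plug-in property is a step that sits between two stages, estimates an angle from the features, and rotates the features back before detection. The sketch below assumes this is what rectification does; the transcript does not spell out the mechanism. The names `rotate_nearest` and `rectify` are hypothetical, and the angle estimator is a stub standing in for the learned Rotation Estimation Module.

```python
import numpy as np

def rotate_nearest(feat, angle_deg):
    """Rotate a 2D map about its centre by angle_deg (counter-clockwise,
    matching np.rot90 for 90 degrees), using nearest-neighbour sampling
    and zero fill outside the map."""
    h, w = feat.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    t = np.deg2rad(angle_deg)
    ys, xs = np.meshgrid(np.arange(h) - cy, np.arange(w) - cx, indexing="ij")
    # Inverse mapping: for each output cell, find the source cell to read.
    src_y = np.rint(cy + ys * np.cos(t) + xs * np.sin(t)).astype(int)
    src_x = np.rint(cx - ys * np.sin(t) + xs * np.cos(t)).astype(int)
    inside = (src_y >= 0) & (src_y < h) & (src_x >= 0) & (src_x < w)
    out = np.zeros_like(feat)
    out[inside] = feat[src_y[inside], src_x[inside]]
    return out

def rectify(feat, estimate_angle):
    """Hypothetical plug-in step: estimate the rotation, then undo it."""
    angle = estimate_angle(feat)
    return rotate_nearest(feat, -angle), angle

# Toy usage: the "features" are an upright map rotated by 90 degrees, and
# the estimator is a stub that returns the known angle.
upright = np.arange(36, dtype=float).reshape(6, 6)
rotated_features = np.rot90(upright)
rectified, estimated = rectify(rotated_features, lambda f: 90.0)
```

In a real detector the stub would be replaced by the trained estimation module, and the rectified features would be passed on to the remaining detection layers.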

Evaluation Results. Evaluation of the GP-Pooling operator: on the rotated MNIST dataset and on the rotated Caltech dataset. Rotation-invariant pedestrian detection: on the original Caltech dataset, the rotated Caltech dataset, and a naturally camera-rotated dataset (from YouTube videos). So let’s talk about the evaluation results. First, we evaluated our proposed GP-Pooling operator on both the rotated MNIST digit dataset and the rotated Caltech dataset. On both datasets, we achieve state-of-the-art results on the rotation angle estimation task. Second, we tested on our main task, pedestrian detection. As you can see from the lower figures, we obtain a much lower miss rate on rotated images than the state-of-the-art pedestrian detector, while performance on upright images drops by only 1%.

Qualitative Results: Rotation-Invariant Pedestrian Detection. Here are some qualitative results from our rotated pedestrian detection experiment. The yellow bounding boxes are the output of a state-of-the-art pedestrian detector, and the red bounding boxes are the output of our rotation-invariant pedestrian detector. As you can see, traditional pedestrian detectors fail when the rotation angle is larger than 45 degrees, while our detector still detects people. We can even detect heavily rotated pedestrians at almost 90 degrees of rotation. What’s more, we can detect small people in the scene where general detectors fail.

Main Contributions: (1) Proposed a rotation-invariant GP-Pooling operator, which can be used to encode the radial distribution of features. (2) Proposed a Rotational Rectification Network (R2N) that can be inserted into a wide range of CNN-based detectors to achieve rotational invariance. In conclusion, our main contribution in this project is twofold. First, we proposed a rotati………. Secondly, …… Evaluation results show that our rotational rectification network is capable of detecting heavily rotated pedestrians where state-of-the-art detectors fail.