Iterative Crowd Counting

Viresh Ranjan, Hieu Le, Minh Hoai
Department of Computer Science, Stony Brook University

Introduction

We present a method for crowd counting via density estimation.

Example figure: a crowd image with a count of 512 people.

- We propose iterative counting CNN (ic-CNN), a two-branch architecture for coarse-to-fine estimation of crowd density maps.
- ic-CNN estimates a high-resolution crowd density map in two stages:
  1. A low-resolution density map at ¼ the size of the original image is predicted first.
  2. The low-resolution density map is refined and transformed into the final high-resolution crowd density map.
- Highlights of the ic-CNN architecture:
  - Achieves state-of-the-art performance
  - Can be trained end-to-end
  - Has significantly fewer parameters than previous approaches
  - Faster training
- We also present a multi-stage extension of ic-CNN, which refines its prediction across multiple stages.

Iterative Counting CNN

Low Resolution CNN (LR-CNN)
- Fully convolutional branch with 11 conv layers.
- Max-pooling layers for down-sampling feature maps.
- The density map is ¼ the size of the original image.

High Resolution CNN (HR-CNN)
- Fully convolutional branch with 9 conv layers.
- Max-pooling layers for down-sampling; bilinear interpolation for up-sampling.
- To handle variations in crowd density, the high-resolution branch incorporates features from the low-resolution branch.
- The low-resolution prediction is passed as a feature map to HR-CNN.

Loss: separate weighted mean squared loss terms for the two branches.

Multi-stage extension
- Multiple ic-CNN blocks.
- Each block uses the predictions from all previous blocks.

Results

Datasets: ShanghaiTech [2], UCF CC [5], World Expo [11]
Evaluation metrics: Mean Absolute Error (MAE), Root Mean Squared Error (RMSE)

Results on ShanghaiTech Part A & Part B

| Approach | Part A MAE | Part A RMSE | Part B MAE | Part B RMSE |
|---|---|---|---|---|
| Crowd CNN [9] | 181.8 | 277.7 | 32.0 | 49.8 |
| MCNN [2] | 110.2 | 173.2 | 26.4 | 41.3 |
| Switch CNN [3] | 90.4 | 135.0 | 21.6 | 33.4 |
| CP-CNN [4] | 73.6 | 106.4 | 20.1 | 30.1 |
| Semi-supervised [6] | | 112.0 | 13.7 | 21.4 |
| DecideNet [7] | -- | -- | 20.7 | 29.4 |
| ic-CNN (1 stage) | 69.8 | 117.3 | 10.4 | 16.7 |
| ic-CNN (2 stages) | 68.5 | 116.2 | 10.7 | 16.0 |

Ablation study on ShanghaiTech Part A

| Approach | MAE | RMSE |
|---|---|---|
| LR-CNN alone | 78.5 | 133.2 |
| HR-CNN alone | 136.2 | 204.0 |
| HR-CNN + low res prediction | 75.1 | 129.0 |
| HR-CNN + low res features | 77.4 | 130.4 |
| ic-CNN | 69.8 | 117.3 |

Comparing model complexity and training time

| Approach | Training time | # Parameters | MAE |
|---|---|---|---|
| MCNN [2] | unknown | 0.12 million | 110.2 |
| Switch CNN [3] | 22 hrs | 12 million | 90.4 |
| CP-CNN [4] | | 63 million | 73.6 |
| ic-CNN | 10 hrs | 7.9 million | 69.8 |

Results on UCF CC dataset

| Approach | MAE | RMSE |
|---|---|---|
| Lempitsky & Zisserman [1] | 493.4 | 487.1 |
| Idrees et al. [8] | 419.5 | 541.6 |
| Crowd CNN [9] | 467.0 | 498.5 |
| MCNN [2] | 377.6 | 509.1 |
| Hydra-2s [10] | 333.7 | 425.2 |
| Switching CNN [3] | 318.1 | 439.2 |
| CP-CNN [4] | 295.8 | 320.9 |
| Semi-supervised [6] | 279.6 | 388.9 |
| ic-CNN | 260.9 | 365.5 |

Results on World Expo dataset

| Approach | S1 | S2 | S3 | S4 | S5 | Avg |
|---|---|---|---|---|---|---|
| MCNN [2] | 3.4 | 20.6 | 12.9 | 13.0 | 8.1 | 11.6 |
| Switch CNN [3] | 4.2 | 14.9 | 14.2 | 18.7 | 4.3 | 11.2 |
| CP-CNN [4] | 2.9 | 14.7 | 10.5 | 10.4 | 5.8 | 8.8 |
| ic-CNN | 17.0 | 12.3 | 9.2 | | 4.7 | 10.3 |

Qualitative results figure (columns: Image, GT, Low Res, High Res).

References

[1] Lempitsky, V., & Zisserman, A., Learning to count objects in images, NIPS10
[2] Zhang, Y., Zhou, D., Chen, S., Gao, S., & Ma, Y., Single-image crowd counting via multi-column convolutional neural network, CVPR16
[3] Sam, D. B., Surya, S., & Babu, R. V., Switching convolutional neural network for crowd counting, CVPR17
[4] Sindagi, V. A., & Patel, V. M., Generating High-Quality Crowd Density Maps using Contextual Pyramid CNNs, ICCV17
[5] Ali, S., & Shah, M., A Lagrangian particle dynamics approach for crowd flow segmentation and stability analysis, CVPR07
[6] Liu, X., van de Weijer, J., & Bagdanov, A. D., Leveraging Unlabeled Data for Crowd Counting by Learning to Rank, CVPR18
[7] Liu, J., Gao, C., Meng, D., & Hauptmann, A. G., DecideNet: Counting varying density crowds through attention guided detection and density estimation, CVPR18
[8] Idrees, H., Saleemi, I., Seibert, C., & Shah, M., Multi-source multi-scale counting in extremely dense crowd images, CVPR13
[9] Zhang, C., Li, H., Wang, X., & Yang, X., Cross-scene crowd counting via deep convolutional neural networks, CVPR15
[10] Onoro-Rubio, D., & Lopez-Sastre, R. J., Towards perspective-free object counting with deep learning, ECCV16
[11] Zhang, C., Li, H., Wang, X., & Yang, X., Cross-scene crowd counting via deep convolutional neural networks, CVPR15

Acknowledgements

This work was supported by SUNY2020 Infrastructure Transportation Security Center. The authors would like to thank Boyu Wang for participating in the discussions and experiments related to an earlier version of the proposed technique.

Contemporary crowd counting papers at ECCV 18

[a] Idrees, H., Tayyab, M., Athrey, K., Zhang, D., Al-Maadeed, S., Rajpoot, N., & Shah, M., Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds
[b] Cao, X., Wang, Z., Zhao, Y., & Su, F., Scale Aggregation Network for Accurate and Efficient Crowd Counting
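
The poster evaluates counting by density estimation with MAE and RMSE. The following is a minimal sketch of how those two metrics are typically computed from predicted density maps, assuming the predicted count for an image is the sum of its density map. The function names (`count_from_density`, `mae_rmse`) and the nested-list representation of a density map are illustrative assumptions, not taken from the poster.

```python
import math

def count_from_density(density_map):
    """Estimated crowd count = sum of all density values (assumed convention)."""
    return sum(sum(row) for row in density_map)

def mae_rmse(pred_counts, true_counts):
    """Mean Absolute Error and Root Mean Squared Error over per-image counts."""
    if len(pred_counts) != len(true_counts) or not pred_counts:
        raise ValueError("need two non-empty lists of equal length")
    errors = [p - t for p, t in zip(pred_counts, true_counts)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse

# Hypothetical toy data: two tiny "density maps" and their ground-truth counts.
predicted_maps = [
    [[0.5, 0.5], [0.5, 0.5]],   # sums to 2.0
    [[1.0, 2.0], [3.0, 0.0]],   # sums to 6.0
]
true_counts = [4.0, 2.0]
pred_counts = [count_from_density(m) for m in predicted_maps]
mae, rmse = mae_rmse(pred_counts, true_counts)
```

Because RMSE squares each error before averaging, it penalizes a few large miscounts more heavily than MAE does.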
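
The poster states that the two branches are trained with separate weighted mean squared loss terms, but the loss equation itself did not survive in the transcript. The sketch below shows one plausible form under stated assumptions: the low-resolution target is obtained by summing the high-resolution ground truth over non-overlapping blocks (which preserves the total count), and the total loss is a weighted sum of the two per-branch MSE terms. The weights `w_lr` and `w_hr`, the `factor` parameter, and all function names are hypothetical.

```python
def sum_pool(density, factor):
    """Downsample by summing non-overlapping factor x factor blocks.

    Summing (rather than averaging) keeps the total count unchanged.
    """
    h, w = len(density), len(density[0])
    if h % factor or w % factor:
        raise ValueError("map size must be divisible by factor")
    return [
        [
            sum(density[r + dr][c + dc]
                for dr in range(factor) for dc in range(factor))
            for c in range(0, w, factor)
        ]
        for r in range(0, h, factor)
    ]

def mse(pred, target):
    """Mean squared error between two equally sized 2-D maps."""
    n = len(pred) * len(pred[0])
    return sum((p - t) ** 2
               for pr, tr in zip(pred, target)
               for p, t in zip(pr, tr)) / n

def two_branch_loss(lr_pred, hr_pred, hr_gt, factor=2, w_lr=1.0, w_hr=1.0):
    """Weighted sum of the low-res and high-res MSE terms (assumed form)."""
    lr_gt = sum_pool(hr_gt, factor)
    return w_lr * mse(lr_pred, lr_gt) + w_hr * mse(hr_pred, hr_gt)

# Hypothetical toy example: 4x4 ground truth, 2x2 low-res prediction.
hr_gt = [[0.25] * 4 for _ in range(4)]      # total count 4.0
lr_pred = [[1.0, 1.0], [1.0, 3.0]]          # one cell over-counts by 2
loss = two_branch_loss(lr_pred, hr_gt, hr_gt, factor=2, w_lr=0.5, w_hr=1.0)
```

In this toy case the high-resolution term is zero and the low-resolution term is 1.0, so the weighted total is 0.5.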