Visual Attention Accelerated Vehicle Detection in Low-Altitude Airborne Video of Urban Environment. Xianbin Cao, Senior Member, IEEE, Renjun Lin, Pingkun Yan, Senior Member, IEEE, and Xuelong Li, Senior Member, IEEE. IEEE Transactions on Circuits and Systems for Video Technology, March 2012.
Goal
Outline: Introduction; Salient region extraction; Obtain regions from saliency map; Classify vehicles; Experiments
Introduction Motivation: improving road safety and reducing the urban traffic congestion caused by the increasing number of vehicles. Most airborne vehicle detection systems (AVDSs) adopt expensive devices such as infrared cameras, GPS, and high-resolution satellite cameras to sense more information. Using a single optical camera is more efficient.
Introduction
Salient region extraction
For color features: seven features r, g, b, R, G, B, Y, where R = r - (g + b)/2, G = g - (r + b)/2, B = b - (r + g)/2, Y = (r + g + b)/3.
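The color formulas above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `color_features` is hypothetical, and the inputs are assumed to be either plain floats or NumPy arrays of equal shape (only element-wise arithmetic is used, so both work).

```python
def color_features(r, g, b):
    """Return the seven color features listed on the slide.

    r, g, b may be floats or equally shaped NumPy arrays (assumption).
    """
    R = r - (g + b) / 2
    G = g - (r + b) / 2
    B = b - (r + g) / 2
    Y = (r + g + b) / 3  # formula exactly as given on the slide
    return {"r": r, "g": g, "b": b, "R": R, "G": G, "B": B, "Y": Y}


# A pure red pixel: R is strongly positive, G and B are negative.
red = color_features(1.0, 0.0, 0.0)
```

Applied to whole image channels, each returned entry is one feature map.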
Salient region extraction For orientation features: Gabor filters G(σ, θ, f) generate local orientation feature maps from the intensity image I, with σ = 2, f = 1, and θ ∈ {0°, 45°, 90°, 135°}, giving four features.
Salient region extraction For motion features: the temporal differences between the current frame and the three previous frames are computed at intervals of {1, 2, 3}, giving three features.
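The temporal differencing described above might look like the following sketch. It assumes grayscale frames stored as 2-D NumPy arrays with the current frame last, and it uses the absolute difference; the slide does not say whether the difference is signed, so that choice, like the function name `motion_features`, is an assumption.

```python
import numpy as np


def motion_features(frames, intervals=(1, 2, 3)):
    """Difference the current frame against frames k steps earlier.

    frames: sequence of 2-D grayscale arrays, current frame last.
    Returns one map per interval (absolute difference is an assumption).
    """
    current = np.asarray(frames[-1], dtype=np.float64)
    maps = []
    for k in intervals:
        previous = np.asarray(frames[-1 - k], dtype=np.float64)
        maps.append(np.abs(current - previous))
    return maps


# Four constant toy frames with intensities 0, 1, 3, 6 (6 is the current one).
toy_frames = [np.full((2, 2), v) for v in (0, 1, 3, 6)]
toy_maps = motion_features(toy_frames)
```

With the default intervals the function needs at least four frames and returns three motion maps.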
Salient region extraction In total, 14 feature maps (7 color, 4 orientation, 3 motion) are computed for salient region extraction.
Salient region extraction i ∈ {0, 1, 2} represents the feature type; j ∈ {0, 1, 2, …} represents the serial number of the feature map within that type; the operator N(*) normalizes a feature map.
Salient region extraction
Difference without N
Obtain regions from saliency map
To obtain the salient regions from the final saliency map effectively, we designed an iterative strategy using an inhibition map (IM) and an enhancement map (EM). IM: avoids picking the same area again. EM: enhances regions around the detected vehicle.
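One way to picture the iterative strategy is a peak-picking loop in which each pick updates both maps. The sketch below is only an illustration under several assumptions that are not taken from the paper: square windows, a multiplicative combination of IM and EM, the gain and radius values, and a hypothetical `is_vehicle` callback standing in for the classifier decision.

```python
import numpy as np


def pick_salient_points(saliency, n_picks=3, inhibit_radius=1,
                        enhance_radius=2, enhance_gain=1.2,
                        is_vehicle=lambda y, x: True):
    """Iteratively pick saliency peaks using inhibition and enhancement maps.

    All parameters and the update rule are illustrative assumptions.
    """
    s = np.asarray(saliency, dtype=np.float64).copy()
    picks = []
    for _ in range(n_picks):
        y, x = (int(v) for v in np.unravel_index(np.argmax(s), s.shape))
        if s[y, x] <= 0:
            break  # nothing salient is left
        picks.append((y, x))
        # EM: boost the neighborhood of a pick that was judged a vehicle.
        em = np.ones_like(s)
        if is_vehicle(y, x):
            er = enhance_radius
            em[max(0, y - er):y + er + 1, max(0, x - er):x + er + 1] = enhance_gain
        # IM: suppress the picked area so it is not selected again.
        im = np.ones_like(s)
        ir = inhibit_radius
        im[max(0, y - ir):y + ir + 1, max(0, x - ir):x + ir + 1] = 0.0
        s = s * em * im
    return picks


toy = np.zeros((5, 5))
toy[1, 1], toy[1, 2], toy[4, 4] = 1.0, 0.9, 0.5
toy_picks = pick_salient_points(toy, n_picks=2)
```

In the toy map the second-highest value sits next to the first peak, so the inhibition map removes it and the next pick lands on a different area.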
Obtain regions from saliency map
Filter by size
Classify vehicles Use a cascaded classifier. 4000 vehicle (positive) samples: 2000 for training and 2000 for testing. Non-vehicle (negative) samples. All samples are scaled to 32×16.
Experiments Xeon x GHz computer, 4 GB DDR. h of video in both the urban and highway environments. The testing traffic videos were captured from a height of around 90 m. The size of the video frames is 511×286.
Experiments
The ratio of recall rate (RR) to salient region percentage (SRP), which represents the efficiency of salient region extraction, is used as the evaluation criterion. A high RR/SRP indicates that more vehicles are covered by fewer extracted salient regions.
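The criterion can be computed as below. The slide does not define RR and SRP precisely, so this sketch assumes RR is the fraction of ground-truth vehicles covered by salient regions and SRP is the fraction of the frame area occupied by salient regions; the function name and the example numbers are hypothetical.

```python
def rr_over_srp(vehicles_covered, vehicles_total, salient_pixels, frame_pixels):
    """Return RR/SRP under the assumed definitions of RR and SRP."""
    rr = vehicles_covered / vehicles_total   # recall rate (assumed definition)
    srp = salient_pixels / frame_pixels      # salient region percentage (assumed)
    return rr / srp


# Illustrative numbers only: 9 of 10 vehicles covered by regions that
# occupy a quarter of the frame.
score = rr_over_srp(9, 10, 25, 100)
```

Covering the same vehicles with half the salient area would double the score, which is the sense in which a higher ratio means more efficient extraction.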
Experiments