Published by Kara Paradine. Modified over 10 years ago.
1
Human Detection
Phanindra Varma
2
Detection -- Overview
Human detection in static images is based on the HOG (Histogram of Oriented Gradients) encoding of images.
The training set consists of positive windows (containing humans) and negative images.
For each window in the training set, the HOG feature vector is computed, and a linear SVM is used to learn the classifier.
For any test image, the feature vector is computed on densely spaced windows at all scales, and each window is classified using the learned SVM.
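The dense multi-scale scan described above can be sketched as follows. This is a minimal illustration, not the presenter's code: the function name `sliding_windows`, the 8-pixel stride, and the scale step of 1.2 are assumptions; only the 64x128 window size comes from the slides.

```python
def sliding_windows(img_w, img_h, win_w=64, win_h=128, stride=8, scale_step=1.2):
    """Enumerate (x, y, scale) for every detection window that fits in the image.

    stride and scale_step are illustrative assumptions; the slides only say
    "densely spaced windows at all scales".
    """
    windows = []
    scale = 1.0
    while True:
        # Window size measured in original-image pixels at this scale.
        w = win_w * scale
        h = win_h * scale
        if w > img_w or h > img_h:
            break  # the window no longer fits, so stop growing the scale
        step = stride * scale  # keep the stride proportional to the window
        y = 0.0
        while y + h <= img_h:
            x = 0.0
            while x + w <= img_w:
                windows.append((x, y, scale))
                x += step
            y += step
        scale *= scale_step
    return windows
```

Each returned window would then be resampled to 64x128, encoded with HOG, and scored by the linear SVM.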
3
HOG encoding
Preprocessing:
Gamma-normalize each channel in the given window using a square-root transformation.
For each channel, compute gradients using [-1 0 1] and [-1 0 1]^T, and for each pixel find the channel with the largest gradient magnitude.
Compute the gradient orientation (0 to 180 degrees) for each pixel in this dominant channel.
Descriptor computation:
Divide the window (64x128) into a dense grid of points with horizontal and vertical spacing of 8 pixels.
Divide the 16x16 region (block) centered on each grid point into cells of size 8x8 (i.e., 4 cells per grid point).
For each pixel in the current block, use trilinear interpolation, weighted by gradient strength, to vote into a 2x2x9 histogram.
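The preprocessing steps can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions: the function name `dominant_gradient` is hypothetical, channels are given as nested lists of non-negative values, and border pixels are handled by clamping the neighbour index (the slides do not say how borders are treated).

```python
import math

def dominant_gradient(channels):
    """Return per-pixel gradient magnitude and unsigned orientation (degrees).

    channels: list of 2-D lists (one per colour channel) of non-negative values.
    """
    h, w = len(channels[0]), len(channels[0][0])
    # Square-root gamma compression, applied to each channel independently.
    comp = [[[math.sqrt(v) for v in row] for row in ch] for ch in channels]
    mag = [[0.0] * w for _ in range(h)]
    ori = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            best = (-1.0, 0.0, 0.0)  # (magnitude, gx, gy) of the dominant channel
            for ch in comp:
                # [-1 0 1] and its transpose; border clamping is an assumption.
                gx = ch[y][min(x + 1, w - 1)] - ch[y][max(x - 1, 0)]
                gy = ch[min(y + 1, h - 1)][x] - ch[max(y - 1, 0)][x]
                m = math.hypot(gx, gy)
                if m > best[0]:
                    best = (m, gx, gy)
            m, gx, gy = best
            mag[y][x] = m
            # Fold the signed angle into the unsigned 0-180 degree range.
            ori[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ori
```

A real implementation would normally use array operations for speed; nested lists are used here only to keep the sketch dependency-free.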
4
HOG encoding (Contd.)
Different voting schemes were used for each of the colored regions of the block (see figure).
Block normalization for illumination invariance is done on each block independently, using the norm of the 2x2x9 vector.
The final feature vector is the collection of all the 2x2x9 feature vectors from all the grid points.
[Figure: a block of 16x16 pixels, with the cell centers and the grid point marked]
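One way to read the voting and normalization steps is sketched below. It is a hedged illustration, not the presenter's code: the names `vote`, `axis_weights`, and `l2_normalize` are hypothetical, the cell centers are assumed to sit at block coordinates 4 and 12, the orientation bin centers at 10, 30, ..., 170 degrees, and the norm is assumed to be L2 with a small epsilon. Under these assumptions, pixels between the cell centers vote into all four cells, pixels beyond a center along one axis vote into two, and corner pixels vote into one, which is one plausible interpretation of the different voting schemes per region.

```python
import math

CELL = 8                  # cell size in pixels (from the slides)
NBINS = 9                 # orientation bins over 0-180 degrees
BIN_W = 180.0 / NBINS     # 20 degrees per bin

def new_block_histogram():
    return [[[0.0] * NBINS for _ in range(2)] for _ in range(2)]

def axis_weights(p):
    """Linear weights of block coordinate p (0..16) for the two cells on one axis."""
    t = (p - CELL / 2) / CELL       # 0 at the first cell center, 1 at the second
    t = min(max(t, 0.0), 1.0)       # beyond a center: all weight to the nearest cell
    return (1.0 - t, t)

def vote(hist, x, y, magnitude, orientation):
    """Trilinear vote of one pixel into a 2x2x9 block histogram."""
    wx = axis_weights(x)
    wy = axis_weights(y)
    b = orientation / BIN_W - 0.5   # position relative to the bin centers
    b0 = math.floor(b)
    frac = b - b0
    # The two neighbouring orientation bins, wrapping 180 degrees back to 0.
    bins = ((b0 % NBINS, 1.0 - frac), ((b0 + 1) % NBINS, frac))
    for cy in range(2):
        for cx in range(2):
            for k, wo in bins:
                hist[cy][cx][k] += magnitude * wy[cy] * wx[cx] * wo

def l2_normalize(hist, eps=1e-6):
    """Flatten the block to 36 values and divide by its (assumed L2) norm."""
    flat = [v for row in hist for cell in row for v in cell]
    norm = math.sqrt(sum(v * v for v in flat) + eps * eps)
    return [v / norm for v in flat]
```

If only grid points whose block lies fully inside the 64x128 window are kept (an assumption), there are 7 x 15 = 105 blocks, and concatenating their 36-value vectors gives a 3780-dimensional descriptor.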
5
Training
The training set was obtained from http://pascal.inriaples.fr/data/human/INRIAPerson.tar
The training set consists of 2416 positive 64x128 windows containing humans, plus negative images.
12000 negative windows are sampled from the negative images at random locations.
Initial phase learning: learn the SVM classifier on the original training set.
Generate hard examples: run the learned SVM on the negative images at all scales and window locations, and save all the false positives (approx. 6000).
6
Training (Contd.)
Second phase learning: using the newly generated negative examples, learn a new linear SVM (in total, approx. 2400 positive windows and 17000 negative windows).
Following this procedure, 375 of the 19400 windows (about 1.9%) were misclassified (using SVMLight).
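The two-phase procedure (train, collect false positives on negative images, retrain) can be sketched as below. This is a toy illustration under clear assumptions: the slides use SVMLight, whereas `train_linear_svm` here is a hypothetical stand-in that minimizes a regularized hinge loss by plain subgradient descent, and the function names, learning rate, and regularization constant are all illustrative.

```python
def score(model, x):
    """Linear decision value w.x + b."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def train_linear_svm(positives, negatives, epochs=200, lr=0.1, lam=0.01):
    """Stand-in trainer (NOT SVMLight): subgradient descent on the hinge loss."""
    samples = [(x, 1.0) for x in positives] + [(x, -1.0) for x in negatives]
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            violated = y * score((w, b), x) < 1.0   # inside the margin or wrong
            for j in range(len(w)):
                grad = lam * w[j] - (y * x[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * y
    return w, b

def hard_negatives(model, candidates, threshold=0.0):
    """Windows from negative images that the model wrongly scores as positive."""
    return [x for x in candidates if score(model, x) > threshold]

def two_phase_training(positives, negatives, negative_pool):
    # Phase 1: train on the original positives and randomly sampled negatives.
    model = train_linear_svm(positives, negatives)
    # Hard examples: every false positive found in the negative images.
    hard = hard_negatives(model, negative_pool)
    # Phase 2: retrain with the hard examples added to the negative set.
    model = train_linear_svm(positives, negatives + hard)
    return model, hard
```

In the setting of the slides, `negative_pool` would hold the HOG vectors of all windows, at all scales and locations, taken from the negative images.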
7
Testing
Given an image: the HOG feature vector is computed across all scales and window locations (window size 64x128), and the locations and scales of all positive windows are saved.
This procedure gives multiple detections (at many scales and locations).
To fuse overlapping detections, the mean shift mode detection algorithm is used.
Each detection is represented as a point in 3D space ([x y log(s)]), and the mean shift vector is computed iteratively at each point.
The resulting modes give the final detections, and each bounding box is drawn using the scale of its mode.
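The fusion step can be sketched as follows. This is a simplified illustration, not the presenter's implementation: the function name `mean_shift_modes`, the Gaussian kernel, the fixed per-axis bandwidths, and the rule for merging converged points are all assumptions; only the [x y log(s)] representation comes from the slides.

```python
import math

def mean_shift_modes(detections, bandwidth=(8.0, 16.0, 0.3), iters=50, tol=1e-3):
    """Fuse detections given as (x, y, scale) into modes in [x, y, log(s)] space.

    bandwidth holds assumed kernel widths for x, y, and log(scale).
    """
    pts = [(x, y, math.log(s)) for x, y, s in detections]

    def shifted(p):
        # Gaussian-weighted mean of all detections around p.
        num = [0.0, 0.0, 0.0]
        den = 0.0
        for q in pts:
            d2 = sum(((a - c) / h) ** 2 for a, c, h in zip(p, q, bandwidth))
            wgt = math.exp(-0.5 * d2)
            den += wgt
            for i in range(3):
                num[i] += wgt * q[i]
        return tuple(n / den for n in num)

    modes = []
    for p in pts:
        for _ in range(iters):
            new = shifted(p)
            moved = max(abs(a - c) for a, c in zip(new, p))
            p = new
            if moved < tol:
                break  # the mean shift vector is (nearly) zero: p is a mode
        # Merge with an existing mode if it lies within one bandwidth on every axis.
        if not any(all(abs(a - c) < h for a, c, h in zip(p, m, bandwidth))
                   for m in modes):
            modes.append(p)
    # Convert log(scale) back to scale for drawing the bounding boxes.
    return [(x, y, math.exp(ls)) for x, y, ls in modes]
```

Each returned mode (x, y, s) would correspond to one final bounding box of size 64s x 128s.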
8
Results - Detection
[Figures: an example image; detections when the threshold is zero]
9
Results - Detection (Contd.)
[Figures: the previous image; detections when the threshold is equal to one]
10
Results - Detection
[Figures: an example image; detections when the threshold is zero]
11
Results - Detection (Contd.)
[Figure: result of mean shift mode detection]
12
Comparison Detection Video