1 Human Detection Phanindra Varma

2 Detection -- Overview
- Human detection in static images is based on the HOG (Histogram of Oriented Gradients) encoding of images.
- The training set consists of positive windows (containing humans) and negative images.
- For each window in the training set, the HOG feature vector is computed and a linear SVM is used to learn the classifier.
- For any test image, the feature vector is computed on densely spaced windows at all scales and classified using the learned SVM.
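The dense multi-scale scan described above can be sketched as follows. This is only an illustration: the 8-pixel stride, the scale step of 1.2, and the function name `sliding_windows` are assumptions, not values taken from the slides.

```python
def sliding_windows(img_w, img_h, win_w=64, win_h=128, stride=8, scale_step=1.2):
    """Yield (x, y, scale) for every 64x128 window position at every scale.

    The image is conceptually shrunk by `scale` and scanned with a fixed-size
    window; x and y are mapped back to original image coordinates.
    stride and scale_step are hypothetical values chosen for illustration.
    """
    scale = 1.0
    while img_w / scale >= win_w and img_h / scale >= win_h:
        scaled_w, scaled_h = int(img_w / scale), int(img_h / scale)
        for y in range(0, scaled_h - win_h + 1, stride):
            for x in range(0, scaled_w - win_w + 1, stride):
                yield (x * scale, y * scale, scale)
        scale *= scale_step


# Example: count the windows that would be classified in a 320x240 image.
num_windows = sum(1 for _ in sliding_windows(320, 240))
```

Each yielded window would then be HOG-encoded and passed to the learned SVM.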

3 HOG encoding
- Preprocessing:
  - Gamma-normalize each channel of the given window using a square-root transformation.
  - For each channel, compute gradients using [-1 0 1] and [-1 0 1]^T, and for each pixel find the channel with the largest gradient magnitude.
  - Compute the gradient orientation (0 to 180 degrees) for each pixel in this dominant channel.
- Descriptor computation:
  - Divide the window (64x128) into a dense grid of points with horizontal and vertical spacing of 8 pixels.
  - Divide the 16x16 region (block) centered on each grid point into cells of size 8x8 (i.e. 4 cells per grid point).
  - For each pixel in the current block, use trilinear interpolation, weighted by gradient strength, to vote into a 2x2x9 histogram.
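The preprocessing steps can be sketched for a single interior pixel as below. The representation of an image as a list of 2D lists (one per channel) and the name `pixel_gradient` are assumptions made for this example; the trilinear voting step is not shown.

```python
import math


def pixel_gradient(channels, x, y):
    """Return (magnitude, orientation_deg) at interior pixel (x, y).

    channels: list of 2D lists of non-negative intensities, one per channel.
    Square-root gamma normalization is applied before differencing, the
    [-1 0 1] mask and its transpose give gx and gy, and the channel with
    the largest gradient magnitude is kept. Orientation is unsigned (0-180).
    """
    best_mag, best_ori = -1.0, 0.0
    for ch in channels:
        gx = math.sqrt(ch[y][x + 1]) - math.sqrt(ch[y][x - 1])
        gy = math.sqrt(ch[y + 1][x]) - math.sqrt(ch[y - 1][x])
        mag = math.hypot(gx, gy)
        if mag > best_mag:
            best_mag = mag
            best_ori = math.degrees(math.atan2(gy, gx)) % 180.0
    return best_mag, best_ori


# A horizontal ramp in one channel and a flat second channel.
ramp = [[0, 1, 4], [0, 1, 4], [0, 1, 4]]
flat = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
mag, ori = pixel_gradient([flat, ramp], 1, 1)
```

Here the ramp channel dominates, so the returned gradient is horizontal.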

4 HOG encoding (Contd..)
- Different voting schemes were used for each of the colored regions.
- Block normalization for illumination invariance is done on each block independently, using the norm of the 2x2x9 vector.
- The final feature vector is the collection of all the 2x2x9 feature vectors from all the grid points.
[Figure: a block of 16x16 pixels, with the cell centers and the grid point marked]
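A minimal sketch of the block normalization and of the resulting descriptor size follows. The slides only say "the norm", so the L2 norm and the small `eps` constant are assumptions; the length calculation also assumes that only blocks lying fully inside the 64x128 window are used.

```python
import math


def l2_normalize(block_hist, eps=1e-5):
    """Divide a flattened 2x2x9 block histogram by its L2 norm.

    eps (an assumed constant) avoids division by zero for empty blocks.
    """
    norm = math.sqrt(sum(v * v for v in block_hist) + eps * eps)
    return [v / norm for v in block_hist]


def descriptor_length(win_w=64, win_h=128, block=16, spacing=8, cells=4, bins=9):
    """Number of values in the final feature vector.

    Assumes one block per grid point and only blocks fully inside the window.
    """
    blocks_x = (win_w - block) // spacing + 1
    blocks_y = (win_h - block) // spacing + 1
    return blocks_x * blocks_y * cells * bins


# Scaling a block's histogram (e.g. brighter lighting) barely changes the result.
hist = [3.0, 4.0] + [0.0] * 34
dim = descriptor_length()
```

Under these assumptions the window holds 7 x 15 blocks of 36 values each.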

5 Training
- The training set has been obtained from http://pascal.inriaples.fr/data/human/INRIAPerson.tar
- The training set consists of positive 64x128 windows (2416) containing humans, and negative images.
- Negative windows (12000) are sampled from the negative images at random locations.
- Initial phase learning: learn the SVM classifier on the original training set.
- Generate hard examples: run the learned SVM on the negative images at all scales and window locations, and save all the false positives (approx. 6000).
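The hard-example step can be sketched as below, assuming the linear SVM is available as a weight vector `w` and bias `b` (hypothetical names) and that feature vectors for the negative windows have already been computed. Training the SVM itself is not shown.

```python
def svm_score(w, b, x):
    """Decision value of a linear SVM: w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def mine_hard_negatives(w, b, negative_features, threshold=0.0):
    """Return the negative windows that the current classifier accepts.

    Every window here comes from a person-free image, so any score above
    the threshold is a false positive, i.e. a hard example.
    """
    return [x for x in negative_features if svm_score(w, b, x) > threshold]


# Toy 2-D features stand in for real HOG vectors.
w, b = [1.0, -1.0], 0.0
negatives = [[2.0, 1.0], [0.0, 1.0], [1.0, 1.0]]
hard = mine_hard_negatives(w, b, negatives)
```

The returned windows are added to the negative set before the classifier is retrained.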

6 Training (Contd..)
- Second phase learning: using the newly generated negative examples, learn a new linear SVM (approx. 2400 positive windows and 17000 negative windows in total).
- Following this procedure, 375 of the 19400 training windows (roughly 1.9%) were misclassified (using SVMLight).

7 Testing
- Given an image, the HOG feature vector is computed across all scales and window locations (window size 64x128), and the locations and scales of all positive windows are saved.
- This procedure gives multiple detections (at many scales and locations).
- To fuse overlapping detections, the Mean Shift mode detection algorithm is used.
- Represent each detection as a point in a 3D space ([x y log(s)]) and iteratively compute the mean shift vector at each point.
- The resulting modes give the final detections, and the bounding boxes are drawn using the scale of each mode.
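The fusion step can be sketched as a plain mean shift in [x, y, log(s)] space. The Gaussian kernel, the bandwidth values, the equal weighting of detections, and the rule for merging nearby modes are all assumptions made for this example; the slides do not specify them.

```python
import math


def mean_shift_modes(detections, bandwidth=(8.0, 16.0, 0.3), iters=50, tol=1e-3):
    """Fuse (x, y, scale) detections into modes in [x, y, log(scale)] space.

    bandwidth gives an assumed per-axis kernel width (pixels, pixels, log-scale).
    """
    pts = [(x, y, math.log(s)) for x, y, s in detections]
    modes = []
    for start in pts:
        m = start
        for _ in range(iters):
            wsum, acc = 0.0, [0.0, 0.0, 0.0]
            for q in pts:
                d2 = sum(((m[i] - q[i]) / bandwidth[i]) ** 2 for i in range(3))
                w = math.exp(-0.5 * d2)  # Gaussian kernel weight
                wsum += w
                for i in range(3):
                    acc[i] += w * q[i]
            new = tuple(a / wsum for a in acc)  # m + mean shift vector
            shift = max(abs(new[i] - m[i]) for i in range(3))
            m = new
            if shift < tol:
                break
        # Keep the mode only if no already-found mode lies within one bandwidth.
        if not any(all(abs(m[i] - k[i]) < bandwidth[i] for i in range(3)) for k in modes):
            modes.append(m)
    return [(x, y, math.exp(ls)) for x, y, ls in modes]


# Three overlapping detections of one person and two of another.
raw = [(100, 100, 1.0), (102, 101, 1.0), (98, 99, 1.0), (300, 200, 2.0), (303, 202, 2.0)]
fused = mean_shift_modes(raw)
```

Each fused tuple gives the location and scale at which one final bounding box would be drawn.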

8 Results - Detection An example image Detections when threshold is zero

9 Results – Detection (Contd..) Previous image Detections when threshold is equal to one

10 Results - Detection An example image Detections when threshold is zero

11 Results – Detection (Contd..) Result of Mean Shift mode detection

12 Comparison Detection Video

