1
Improved Object Detection
Boosted Histograms for Improved Object Detection
2. Interest points
- Motivation
- Detection of invariant interest points
-- Harris, Harris-Laplace, Harris-affine (Schmid, Mikolajczyk)
-- SIFT (Lowe)
-- Region-based (Matas / Tuytelaars, Van Gool)
-- Histogram-based (Kadir, Brady)
- Evaluation of stability
- Applications
-- Wide base-line matching
-- Object recognition
-- Video indexing
Ivan Laptev
IRISA/INRIA, Rennes, France
December 08, 2006
2
Histograms for object recognition
Remarkable success of recognition methods using histograms of local image measurements:
[Swain & Ballard 1991] - Color histograms
[Schiele & Crowley 1996] - Receptive field histograms
[Lowe 1999] - Localized orientation histograms (SIFT)
[Schneiderman & Kanade 2000] - Localized histograms of wavelet coefficients
[Leung & Malik 2001] - Texton histograms
[Belongie et al. 2002] - Shape context
[Dalal & Triggs 2005] - Dense orientation histograms
Likely explanation: Histograms are robust to image variations such as limited geometric transformations and object class variability.
3
Histograms: What vs. Where
What to measure?
- Color [SB91]
- Gaussian derivatives [SC96]
- Wavelet coefficients [SK00]
- Textons [LM01]
- Gradient orientation [L99, DT05]
Where to measure?
- Whole image [SB91, SC96]
- Pre-defined grid [SK00, BMP02, DT05]
- Key points [L99]
No guarantee for optimal recognition:
- Different regions may have different discriminative power
- Use of interest regions (IRs) might be suboptimal: IR detectors are designed for good localization/matching
- Given rough localization, IRs are not necessarily the best regions for recognition
4
Idea
AdaBoost: boosting of weak classifiers over selected features
- Efficient discriminative classifier [Freund & Schapire ’97]
- Good performance for face detection [Viola & Jones ’01]
Haar features, weak classifier: optimal threshold
Histogram features, weak classifier: Fisher discriminant, 1-bin classifier, SVM, ANN
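The boosting idea on this slide can be illustrated with a minimal sketch of discrete AdaBoost. It uses the simplest weak classifier named above, a single feature compared against an optimal threshold, and it is not the Fisher-discriminant learner used with histogram features. The function names (`train_stump`, `adaboost`, `predict`) and the toy data are assumptions made for illustration only.

```python
import math

def train_stump(X, y, w):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({x[j] for x in X}):
            for pol in (1, -1):
                pred = [pol if x[j] >= thr else -pol for x in X]
                err = sum(wi for wi, p, yi in zip(w, pred, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, rounds):
    """Discrete AdaBoost with threshold stumps; labels y are +1 / -1."""
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, j, thr, pol = train_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, pol))
        # Increase the weight of misclassified samples, decrease the others.
        w = [wi * math.exp(-alpha * yi * (pol if x[j] >= thr else -pol))
             for wi, x, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, x):
    """Sign of the alpha-weighted vote of all selected stumps."""
    score = sum(a * (pol if x[j] >= thr else -pol) for a, j, thr, pol in model)
    return 1 if score >= 0 else -1

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.8, 4.0], [0.9, 2.0]]
y = [-1, -1, 1, 1]
model = adaboost(X, y, rounds=3)
```

Each round selects one feature, so the list of selected feature indices in `model` is the "selected features" output of boosting.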
5
Histogram features
Histograms over 4 gradient orientations, 4 subdivisions for each rectangle
~10^5 rectangle features
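A rough sketch of one such rectangle feature is given below. It assumes central-difference gradients, unsigned orientations quantized into 4 bins, magnitude-weighted votes, and a 2x2 split of the rectangle as the "4 subdivisions"; the slide does not specify any of these details, and the names `orientation_histogram` and `rectangle_feature` are hypothetical.

```python
import math

def orientation_histogram(img, x0, y0, x1, y1, bins=4):
    """Normalized, magnitude-weighted histogram of unsigned gradient
    orientations inside the rectangle [x0, x1) x [y0, y1)."""
    hist = [0.0] * bins
    h, w = len(img), len(img[0])
    # Skip the image border so that central differences stay in bounds.
    for y in range(max(y0, 1), min(y1, h - 1)):
        for x in range(max(x0, 1), min(x1, w - 1)):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi      # orientation in [0, pi)
            b = min(int(angle / math.pi * bins), bins - 1)
            hist[b] += mag
    total = sum(hist)
    return [v / total for v in hist] if total else hist

def rectangle_feature(img, x0, y0, x1, y1):
    """Concatenate the histograms of a 2x2 split of the rectangle
    (assumed layout of the 4 subdivisions)."""
    xm, ym = (x0 + x1) // 2, (y0 + y1) // 2
    cells = [(x0, y0, xm, ym), (xm, y0, x1, ym),
             (x0, ym, xm, y1), (xm, ym, x1, y1)]
    feat = []
    for cell in cells:
        feat.extend(orientation_histogram(img, *cell))
    return feat

# Horizontal intensity ramp: every gradient points along x.
ramp_x = [[x for x in range(8)] for _ in range(8)]
feature = rectangle_feature(ramp_x, 0, 0, 8, 8)   # 4 cells x 4 bins = 16 values
```

Enumerating rectangles of many positions, sizes and aspect ratios inside the detection window is what produces a feature pool on the order of 10^5.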
6
Training data
Crop and resize
Perturb annotation
Increase training set x10
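One way to read "perturb annotation" is to jitter each annotated bounding box slightly in position and scale, then crop and resize every jittered copy as a separate training sample. The sketch below follows that reading; the jitter ranges (5% shift, 5% scale), the function name `perturb_box` and the `(x, y, w, h)` box convention are assumptions, not values taken from the slides.

```python
import random

def perturb_box(box, n=10, shift=0.05, scale=0.05, seed=0):
    """Return n jittered copies of box = (x, y, w, h).

    Each copy is rescaled about its center by a factor in [1-scale, 1+scale]
    and shifted by up to `shift` times the box width/height.
    """
    rng = random.Random(seed)          # seeded for reproducible augmentation
    x, y, w, h = box
    out = []
    for _ in range(n):
        s = 1.0 + rng.uniform(-scale, scale)
        dx = rng.uniform(-shift, shift) * w
        dy = rng.uniform(-shift, shift) * h
        nw, nh = w * s, h * s          # same factor on both axes keeps the aspect ratio
        cx, cy = x + w / 2 + dx, y + h / 2 + dy
        out.append((cx - nw / 2, cy - nh / 2, nw, nh))
    return out

# One annotation becomes ten slightly different training crops.
jittered = perturb_box((50, 40, 100, 200))
```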
7
Training: Selected Features
0.999 correct classification
10^-5 false positives
376 of ~10^5 features selected
8
Object detection
Scan and classify image windows at different positions and scales
Cluster detections in the space-scale space
Use the cluster size as the detection confidence (example shown: Conf.=5)
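The scan-then-cluster procedure can be sketched as follows. The window generator enumerates positions and scales; the grouping step is a simple greedy merge by box overlap, used here as a stand-in for clustering in space-scale space, which the slide does not describe in detail. The scale step, stride, overlap threshold and all function names are illustrative assumptions.

```python
def sliding_windows(img_w, img_h, win_w, win_h, scale_step=1.25, stride_frac=0.25):
    """Yield (x, y, w, h) windows over all positions and scales that fit the image."""
    scale = 1.0
    while win_w * scale <= img_w and win_h * scale <= img_h:
        w, h = int(win_w * scale), int(win_h * scale)
        step = max(1, int(min(w, h) * stride_frac))
        for y in range(0, img_h - h + 1, step):
            for x in range(0, img_w - w + 1, step):
                yield (x, y, w, h)
        scale *= scale_step

def overlap(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    iw = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    ih = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)

def mean_box(cluster):
    """Component-wise mean of the boxes in a cluster."""
    n = len(cluster)
    return tuple(sum(b[i] for b in cluster) / n for i in range(4))

def cluster_detections(boxes, min_overlap=0.5):
    """Greedily group positive windows; confidence = number of windows in the group."""
    clusters = []
    for box in boxes:
        for c in clusters:
            if overlap(box, mean_box(c)) >= min_overlap:
                c.append(box)
                break
        else:
            clusters.append([box])
    return [(mean_box(c), len(c)) for c in clusters]

# Windows the classifier accepted (hypothetical): three around one object, one stray.
positives = [(10, 10, 40, 40), (12, 11, 40, 40), (11, 9, 42, 42), (200, 50, 40, 40)]
detections = cluster_detections(positives)
```

A true object tends to fire on many neighbouring windows, so a larger cluster is treated as a more confident detection.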
9
PASCAL Visual Object Classes Challenge 2005 (VOC’05)
motorbikes #217 / #220
bicycles #123 / #123
people #152 / #149
cars #320 / #341
10
Evaluation criteria
Ground truth annotation
Detection results:
- >50 % overlap of bounding box with GT
- one bounding box for each object
- confidence value for each detection
Precision-Recall (PR) curve
Average Precision (AP) value
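The criteria above can be made concrete with a small sketch for a single image. It assumes that overlap is measured as intersection over union, that detections are ranked by confidence, that a second detection of an already matched object counts as a false positive, and that AP is the 11-point interpolated average used in the early VOC challenges. Function names and the toy boxes are illustrative.

```python
def iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union

def precision_recall(detections, ground_truth, min_overlap=0.5):
    """detections: list of (confidence, box). Returns [(recall, precision), ...]
    after each detection, in order of decreasing confidence."""
    matched = set()
    tp = fp = 0
    curve = []
    for conf, box in sorted(detections, key=lambda d: -d[0]):
        best, best_i = 0.0, None
        for i, gt in enumerate(ground_truth):
            o = iou(box, gt)
            if o > best:
                best, best_i = o, i
        if best > min_overlap and best_i not in matched:
            matched.add(best_i)        # each object may be claimed only once
            tp += 1
        else:
            fp += 1
        curve.append((tp / len(ground_truth), tp / (tp + fp)))
    return curve

def average_precision(curve):
    """Mean of the best precision at recall >= 0.0, 0.1, ..., 1.0."""
    total = 0.0
    for k in range(11):
        t = k / 10
        total += max((p for r, p in curve if r >= t), default=0.0)
    return total / 11

gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 31))]
curve = precision_recall(dets, gt)
ap = average_precision(curve)
```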
11
Evaluation of detection
PR-curves for the “Motorbike” validation dataset:
FLD learner
[Levi and Weiss, CVPR 2004] “Learning object detection from a small number of examples: The importance of good features” + 1-bin classifier
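The two weak learners compared here differ in how they reduce a histogram to one number: a 1-bin classifier, as the name suggests, thresholds a single bin, while an FLD (Fisher linear discriminant) learner projects the whole histogram onto one direction. The sketch below computes a sample-weighted Fisher direction for small histograms. The ridge term, the plain Gaussian elimination and all names are assumptions made to keep the example self-contained; they are not details from the talk.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def weighted_mean(H, w):
    total = sum(w)
    return [sum(wi * h[j] for wi, h in zip(w, H)) / total for j in range(len(H[0]))]

def fld_direction(H, y, w, ridge=1e-3):
    """Weighted Fisher direction Sw^-1 (m_pos - m_neg).

    H: histograms, y: labels in {+1, -1}, w: boosting weights.
    The ridge keeps Sw invertible (normalized histograms are rank deficient).
    """
    d = len(H[0])
    Sw = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    means = {}
    for label in (1, -1):
        idx = [i for i, yi in enumerate(y) if yi == label]
        Hc, wc = [H[i] for i in idx], [w[i] for i in idx]
        m = means[label] = weighted_mean(Hc, wc)
        for h, wi in zip(Hc, wc):
            diff = [hj - mj for hj, mj in zip(h, m)]
            for a in range(d):
                for b in range(d):
                    Sw[a][b] += wi * diff[a] * diff[b]   # within-class scatter
    delta = [p - n for p, n in zip(means[1], means[-1])]
    return solve(Sw, delta)

def project(h, direction):
    """Scalar response of the weak learner before thresholding."""
    return sum(hj * dj for hj, dj in zip(h, direction))

H = [[0.7, 0.1, 0.1, 0.1], [0.6, 0.2, 0.1, 0.1],   # positives: bin 0 dominant
     [0.1, 0.1, 0.7, 0.1], [0.1, 0.2, 0.6, 0.1]]   # negatives: bin 2 dominant
y = [1, 1, -1, -1]
direction = fld_direction(H, y, [0.25] * 4)
responses = [project(h, direction) for h in H]
```

Thresholding `responses` then gives a weak classifier that can be plugged into a boosting round in place of a single-bin threshold.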
12
Results for VOC’05 Challenge
Result plots (one per test set): Bicycles test1, People test1, Motorbikes test1, Cars test1
13
Results for VOC’05 Challenge
Average Precision values:
16
PASCAL Visual Object Classes Challenge 2006 (VOC’06)
17
Results for VOC’06 Challenge
Competition "comp3" (train on VOC data)
Class "bicycle" examples
18
Results for VOC’06 Challenge
Competition "comp3" (train on VOC data)
Class "cow" examples
19
Results for VOC’06 Challenge
Competition "comp3" (train on VOC data)
Class "horse" examples
20
Results for VOC’06 Challenge
Competition "comp3" (train on VOC data)
Class "motorbike"
21
Results for VOC’06 Challenge
Competition "comp3" (train on VOC data)
Class "person"
22
Results for VOC’06 Challenge
Average Precision values:
Classes: bicycle, bus, car, cat, cow, dog, horse, motorbike, person, sheep
Cambridge: 0.249 0.138 0.254 0.151 0.149 0.118 0.091 0.178 0.030 0.131
ENSMP: - 0.398 0.159
INRIA_Douze: 0.414 0.117 0.444 0.212 0.390 0.164 0.251
INRIA_Laptev: 0.440 0.224 0.140 0.318 0.114
TUD: 0.153 0.074
TKK: 0.303 0.169 0.222 0.160 0.252 0.113 0.137 0.265 0.039 0.227
23
How “interesting” are boosted regions?
Affine Harris regions
Boosted regions
Harris value is not a significant measure for boosted regions
“Interest regions” are not necessarily good for recognition?
24
Final Notes
- All results are obtained with a single set of parameters
- Small number of training samples is sufficient
- Efficient detection: 10fps on 320x280 images
- Extension to texton/color histogram features is straightforward
Open questions:
- Other free-shape regions better? How to find them?
- Better weak learner that takes advantage of histogram properties
- View transformations