1
Learning Spatial Context: Using stuff to find things
Geremy Heitz, Daphne Koller
Stanford University
October 13, 2008, ECCV 2008
2
Things vs. Stuff
Stuff (n): Material defined by a homogeneous or repetitive pattern of fine-scale properties, but has no specific or distinctive spatial extent or shape.
Thing (n): An object with a specific size and shape.
From: Forsyth et al. Finding pictures of objects in large collections of images. Object Representation in Computer Vision, 1996.
3
Finding Things Context is key!
4
Outline What is Context? The Things and Stuff (TAS) model Results
5
Satellite Detection Example D(W) = 0.8
6
Error Analysis
Typically…
  False Positives are OUT OF CONTEXT
  True Positives are IN CONTEXT
We need to look outside the bounding box!
7
Types of Context
Scene-Thing: gist makes a car “likely” and a keyboard “unlikely” [ Torralba et al., LNCS 2005 ]
Stuff-Stuff: [ Gould et al., IJCV 2008 ]
Thing-Thing: [ Rabinovich et al., ICCV 2007 ]
8
Types of Context
Stuff-Thing: Based on spatial relationships
Intuition:
  Trees = no cars
  Houses = cars nearby
  Road = cars here
“Cars drive on roads”
“Cows graze on grass”
“Boats sail on water”
Goal: Unsupervised
9
Outline What is Context? The Things and Stuff (TAS) model Results
10
Things
Detection “candidates”
Low detector threshold -> “over-detect”
Each candidate has a detector score
11
Things
Candidate detections: Image Window W_i + Score
Boolean R.V. T_i
  T_i = 1: Candidate is a positive detection
Thing model: [Diagram: Image Window W_i -> T_i]
12
Stuff
Coherent image regions
  Coarse “superpixels”
  Feature vector F_j in R^n
  Cluster label S_j in {1…C}
Stuff model: Naïve Bayes [Diagram: S_j -> F_j]
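The Naïve Bayes stuff model can be illustrated with a short sketch that turns a region's feature vector into a posterior over cluster labels. The slide does not say what form P(F_j | S_j) takes, so the per-dimension Gaussian used here, along with the function and parameter names, is an assumption made only for illustration.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a 1-D Gaussian (assumed form of each feature likelihood)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def cluster_posterior(features, priors, means, variances):
    """Return P(S_j = c | F_j) for every cluster c under a Naive Bayes model.

    features:  list of n feature values for one region (F_j)
    priors:    list of C prior probabilities P(S_j = c)
    means:     C lists of n per-dimension means (hypothetical parameters)
    variances: C lists of n per-dimension variances (hypothetical parameters)
    """
    log_weights = []
    for c, prior in enumerate(priors):
        lw = math.log(prior)
        # Naive Bayes: feature dimensions are independent given the cluster.
        for x, m, v in zip(features, means[c], variances[c]):
            lw += gaussian_logpdf(x, m, v)
        log_weights.append(lw)
    # Normalise in log space to avoid underflow.
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]
```

In the full TAS model the cluster label also depends on the relationships to nearby candidates, so this posterior is only the feature-driven part of the picture.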
13
Relationships
Descriptive Relations: “Near”, “Above”, “In front of”, etc.
Choose set R = {r_1 … r_K}
R_ijk = 1: Detection i and region j have relation k
Example: S_72 = Trees, S_4 = Houses, S_10 = Road; for candidate T_1, R_1,10,in = 1
Relationship model: [Diagram: T_i -> R_ijk <- S_j]
14
The TAS Model
[Plate diagram: Image Window W_i -> T_i -> R_ijk <- S_j -> F_j; plates N, J, K]
W_i: Window
T_i: Object Presence
S_j: Region Label
F_j: Region Features
R_ijk: Relationship
Legend: Supervised in Training Set / Always Observed / Always Hidden
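Reading the factorization off the plate diagram gives the joint distribution below. This is a reconstruction from the listed variables and arrows, not an equation that survives in the transcript; it assumes plate N ranges over candidate windows i, J over regions j, and K over relationships k.

```latex
P(T, S, F, R \mid W) \;=\;
  \prod_{i=1}^{N} P(T_i \mid W_i)\;
  \prod_{j=1}^{J} P(S_j)\, P(F_j \mid S_j)\;
  \prod_{i=1}^{N} \prod_{j=1}^{J} \prod_{k=1}^{K} P(R_{ijk} \mid T_i, S_j)
```

The first product is the thing model (detector), the second is the Naïve Bayes stuff model, and the third ties each candidate to each region through the relationship variables.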
15
Unrolled Model
Candidate Windows: T_1, T_2, T_3
Image Regions: S_1, S_2, S_3, S_4, S_5
Example relationships: R_1,1,left = 1; R_2,1,above = 0; R_3,1,left = 1; R_1,3,near = 0; R_3,3,in = 1
16
Learning the Parameters
Assume we know R
S_j is hidden
Everything else observed
Expectation-Maximization
  “Contextual clustering”
Parameters are readily interpretable
[Plate diagram: the TAS model, with the same legend: Supervised in Training Set / Always Observed / Always Hidden]
17
Learned Satellite Clusters
18
Which Relationships to Use?
R_ijk = spatial relationship between candidate i and region j
  R_ij1 = candidate in region
  R_ij2 = candidate closer than 2 bounding boxes (BBs) to region
  R_ij3 = candidate closer than 4 BBs to region
  R_ij4 = candidate farther than 8 BBs from region
  R_ij5 = candidate 2 BBs left of region
  R_ij6 = candidate 2 BBs right of region
  R_ij7 = candidate 2 BBs below region
  R_ij8 = candidate more than 2 and less than 4 BBs from region
  …
  R_ijK = candidate near region boundary
How do we avoid overfitting?
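A few of the distance-based relationships listed above can be sketched as binary indicators. The slide does not define how distance in "bounding boxes" is measured, so this sketch assumes it is the distance from the candidate's centre to the nearest region pixel, divided by the longer side of the candidate box. The function name and the dictionary keys are hypothetical.

```python
import math


def relation_indicators(box, region_pixels):
    """Binary relationship indicators between one candidate and one region.

    box:           candidate window as (x1, y1, x2, y2)
    region_pixels: set of (x, y) pixel coordinates belonging to the region
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    bb = max(x2 - x1, y2 - y1)  # one "bounding box" unit (assumed definition)
    # Distance from the candidate centre to the closest region pixel, in BB units.
    dist = min(math.hypot(px - cx, py - cy) for px, py in region_pixels) / bb
    return {
        "in": (int(cx), int(cy)) in region_pixels,  # centre pixel lies in region
        "closer_than_2": dist < 2,
        "closer_than_4": dist < 4,
        "farther_than_8": dist > 8,
        "between_2_and_4": 2 < dist < 4,
    }
```

The directional relationships (left, right, below) are left out because the transcript gives no hint of how they were measured.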
19
Learning the Relationships
Intuition: “Detached” R_ijk = inactive relationship
Structural EM iterates:
  Learn parameters
  Decide which edge to toggle
Evaluate with ℓ(T | F, W, R)
  Requires inference
  Better results than using standard E[ℓ(T, S, F, W, R)]
[Diagram: T_i and S_j connected to R_ij1, R_ij2, …, R_ijK; S_j -> F_j]
20
Inference
Goal: P(T | F, W, R)
Block Gibbs Sampling
  Easy to sample T_i’s given S_j’s and vice versa
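The alternation between sampling all T_i given the S_j and all S_j given the T_i can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes binary relationship variables with a table `theta[k][t][s]` = P(R_ijk = 1 | T_i = t, S_j = s), a detector probability per candidate, and a precomputed per-region weight P(S_j = s) P(F_j | S_j = s). All names are hypothetical.

```python
import random


def bernoulli(p, r):
    """Likelihood of binary observation r under success probability p."""
    return p if r else 1.0 - p


def sample_things(S, R, prior_T, theta, rng):
    """Sample every T_i given all S_j; the T_i are independent given S."""
    T = []
    for i, p1 in enumerate(prior_T):
        w = [1.0 - p1, p1]  # P(T_i = 0 | W_i), P(T_i = 1 | W_i)
        for t in (0, 1):
            for j, s in enumerate(S):
                for k, r in enumerate(R[i][j]):
                    w[t] *= bernoulli(theta[k][t][s], r)
        T.append(1 if rng.random() * (w[0] + w[1]) < w[1] else 0)
    return T


def sample_stuff(T, R, region_weight, theta, rng):
    """Sample every S_j given all T_i; the S_j are independent given T."""
    S = []
    for j, base in enumerate(region_weight):
        w = list(base)  # P(S_j = s) * P(F_j | S_j = s), unnormalised
        for s in range(len(w)):
            for i, t in enumerate(T):
                for k, r in enumerate(R[i][j]):
                    w[s] *= bernoulli(theta[k][t][s], r)
        S.append(rng.choices(range(len(w)), weights=w)[0])
    return S


def block_gibbs(R, prior_T, region_weight, theta, sweeps=200, burn_in=50, seed=0):
    """Estimate P(T_i = 1 | F, W, R) by averaging post-burn-in samples."""
    rng = random.Random(seed)
    # Start each region at its most likely cluster under the features alone.
    S = [max(range(len(b)), key=b.__getitem__) for b in region_weight]
    counts = [0] * len(prior_T)
    for sweep in range(sweeps):
        T = sample_things(S, R, prior_T, theta, rng)
        S = sample_stuff(T, R, region_weight, theta, rng)
        if sweep >= burn_in:
            for i, t in enumerate(T):
                counts[i] += t
    return [c / (sweeps - burn_in) for c in counts]
```

With many regions and relationships the products above would underflow, so a practical version would accumulate log-probabilities instead.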
21
Outline What is Context? The Things and Stuff (TAS) model Results
22
Base Detector - HOG [ Dalal & Triggs, CVPR 2005 ]
HOG Detector: Feature Vector X -> SVM Classifier
23
Results - Satellite
[Figure panels: Prior: Detector Only | Posterior: Detections | Posterior: Region Labels]
24
Results - Satellite
[Plot: Recall Rate (0 to 1) vs. False Positives Per Image (40 to 160); curves: Base Detector, TAS Model]
~10% improvement in recall at 40 fppi
25
PASCAL VOC Challenge
2005 Challenge
  2232 images split into {train, val, test}
  Cars, Bikes, People, and Motorbikes
2006 Challenge
  5304 images split into {train, test}
  12 classes, we use Cows and Sheep
26
Base Detector Error Analysis Cows
27
Discovered Context - Bicycles
[Figure labels: Bicycles, Cluster #3]
28
TAS Results – Bicycles
Examples
  Discover “true positives”
  Remove “false positives”
[Figure labels: BIKE, ?, ?, ?]
29
Results – VOC 2005
30
Results – VOC 2006
31
Conclusions
Detectors can benefit from context
The TAS model captures an important type of context
We can improve any sliding window detector using TAS
The TAS model can be interpreted and matches our intuitions
We can learn which relationships to use
32
Thank you!
33
Object Detection
Task: Find the things
Example: Find all the cars in this image
  Return a “bounding box” for each
Evaluation:
  Maximize true positives
  Minimize false positives
34
Sliding Window Detection
Consider every bounding box
  All shifts
  All scales
  Possibly all rotations
Each such window gets a score: D(W)
Detections: Local peaks in D(W)
Pros:
  Covers the entire image
  Flexible to allow variety of D(W)’s
Cons:
  Brute force – can be slow
  Only considers features in box
[Figure labels: D = 1.5, D = -0.3]
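The procedure on this slide can be sketched in a few lines: enumerate windows over shifts and scales, score each with D(W), and keep local peaks. This is only an illustration under stated assumptions. The scoring function (mean intensity) stands in for a real detector, rotations are omitted, and "local peaks" is approximated by greedy suppression of overlapping lower-scoring windows. All function names are hypothetical.

```python
def sliding_windows(height, width, sizes, stride):
    """Yield square windows (x1, y1, x2, y2) over all shifts and scales."""
    for size in sizes:
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                yield (x, y, x + size, y + size)


def mean_intensity(image, window):
    """Placeholder score D(W): average pixel value inside the window."""
    x1, y1, x2, y2 = window
    total = sum(image[y][x] for y in range(y1, y2) for x in range(x1, x2))
    return total / ((x2 - x1) * (y2 - y1))


def overlaps(a, b):
    """True if two windows share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def detect(image, score_fn, sizes, stride, threshold):
    """Score every window, threshold, and keep local peaks of D(W)."""
    h, w = len(image), len(image[0])
    scored = [(score_fn(image, win), win)
              for win in sliding_windows(h, w, sizes, stride)]
    scored = [sw for sw in scored if sw[0] > threshold]
    scored.sort(key=lambda sw: -sw[0])
    peaks = []
    for score, win in scored:
        # Greedy suppression: drop a window that overlaps a stronger one.
        if not any(overlaps(win, kept) for _, kept in peaks):
            peaks.append((score, win))
    return peaks
```

Note that `score_fn` only ever sees pixels inside the window, which is exactly the limitation listed under "Cons".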
35
Sliding Window Results
PASCAL Visual Object Classes Challenge: Cows 2006
Detections: windows with D(W) > T
Overlap of boxes A and B: score(A,B) = |A∩B| / |A∪B|
  score(A,B) > 0.5: TRUE POSITIVE
  score(A,B) ≤ 0.5: FALSE POSITIVE
Recall(T) = TP / (TP + FN)
Precision(T) = TP / (TP + FP)
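The overlap score and the precision and recall formulas on this slide translate directly into code. The sketch below assumes boxes are (x1, y1, x2, y2) tuples and that each ground-truth box can be matched by at most one detection, with higher-scoring detections matched first. That matching rule is an assumption, since the slide gives only the overlap criterion.

```python
def iou(a, b):
    """Overlap score |A ∩ B| / |A ∪ B| for boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detections, ground_truth, threshold, min_overlap=0.5):
    """Precision and recall at detector threshold T.

    detections:   list of (box, score) pairs
    ground_truth: list of boxes
    """
    kept = sorted((d for d in detections if d[1] > threshold),
                  key=lambda d: -d[1])
    matched = set()
    tp = fp = 0
    for box, _ in kept:
        best_j, best = None, min_overlap
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue  # assumed rule: one detection per ground-truth box
            s = iou(box, gt)
            if s > best:
                best_j, best = j, s
        if best_j is None:
            fp += 1
        else:
            matched.add(best_j)
            tp += 1
    fn = len(ground_truth) - tp
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall
```

Sweeping `threshold` from high to low traces out the curve: recall rises as more detections are kept, and false positives rise with it.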
36
Quantitative Evaluation
[Plot: Recall Rate (0 to 1) vs. False Positives Per Image (0 to 160)]
38
Detections in Context
Task: Identify all cars in the satellite image
Idea: The surrounding context adds info to the local window detector
[Figure: Prior: Detector Only + Region Labels (Houses, Road) = Posterior: TAS Model]
39
Equations
40
Features: Haar wavelets
Haar filters and integral image [ Viola and Jones, ICCV 2001 ]
The average intensity in a block is computed from four integral-image lookups, independent of the block size.
BOOSTING!
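The constant-time block sum behind Haar features can be shown with a small sketch. The image is assumed to be a list of rows of numbers, the integral image carries one extra row and column of zeros, and the two-rectangle feature is just one example of a Haar filter. The function names are hypothetical.

```python
def integral_image(image):
    """ii[y][x] = sum of image[0:y][0:x]; has one extra zero row and column."""
    h, w = len(image), len(image[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def block_sum(ii, x1, y1, x2, y2):
    """Sum over rows y1..y2-1 and columns x1..x2-1 using four lookups."""
    return ii[y2][x2] - ii[y1][x2] - ii[y2][x1] + ii[y1][x1]


def block_mean(ii, x1, y1, x2, y2):
    """Average intensity of the block, independent of its size."""
    return block_sum(ii, x1, y1, x2, y2) / ((x2 - x1) * (y2 - y1))


def haar_two_rect(ii, x, y, w, h):
    """Example Haar filter: right half minus left half of a w-by-h block."""
    half = w // 2
    left = block_sum(ii, x, y, x + half, y + h)
    right = block_sum(ii, x + half, y, x + 2 * half, y + h)
    return right - left
```

Building the integral image costs one pass over the pixels; after that every block sum takes the same time regardless of block size, which is what makes evaluating many Haar filters per window affordable.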
41
Features: Edge fragments
Weak detector = Match of edge chain(s) from training image to edgemap of test image [ Opelt, Pinz, Zisserman, ECCV 2006 ]
BOOSTING!
42
Histograms of oriented gradients
HOG [ Dalal & Triggs, CVPR 2005 ]
SIFT [ D. Lowe, ICCV 1999 ]
SVM!