Download presentation
Presentation is loading. Please wait.
Published byDora Rice Modified over 6 years ago
1
LOCUS: Learning Object Classes with Unsupervised Segmentation
16-721: Advanced Perception Nik Melchior April 10, 2006
2
Learning Flexible Sprites in Video Layers (Jojic & Frey 2001)
Outline Learning Flexible Sprites in Video Layers (Jojic & Frey 2001) LOCUS 4/10/06 Nik Melchior - LOCUS
3
Motivation Challenges Object category recognition and segmentation in
cluttered scenes Challenges Spatial variability Texture variability Occlusion Pose Lighting 4/10/06 Nik Melchior - LOCUS slide: Kumar, Torr, Zisserman
4
Motivation Segmentation should (ideally) be
Given an image and object category, to segment the object Object Category Model Segmentation Cow Image so far, we've seen attempts to do this using: - template-like outline matching - DT, Chamfer distance - HoG - fragments - ISM Segmented Cow Segmentation should (ideally) be shaped like the object e.g. cow-like obtained efficiently in an unsupervised manner able to handle self-occlusion 4/10/06 Nik Melchior - LOCUS slide: Fei-Fei
5
Flexible Sprites Provides a stable segmentation of whole moving objects in video Each flexible sprite is a separate layer containing the transformation of a rigid sprite 4/10/06 Nik Melchior - LOCUS
6
Big Picture Generative model
Seeks the best explanation for the pixels of all images, constrained by number of layers Operates directly on pixels rather than low-level features Rigorous formulation similar to DDMCMC 4/10/06 Nik Melchior - LOCUS
7
Flexible Sprites User specifies number of layers Each layer contains:
Sprite appearance 's' Real-valued mask 'm' Discretized simple transformation 'T' βpermutation matrix that rearranges the pixels in sβ π₯=ππβππ ξπ π ξ βπξππππ π 4/10/06 Nik Melchior - LOCUS
8
Appearance, mask, and transform
Zero-mean Gaussian noise Recursive formula for image appearance Formulation πξπ₯β£ π π , π π , π π ξ=π ξπ₯; πΏ ξ ξ πΏ π π π π ξ ξ β π π π π β π π π π ξ ,ξΈξ Assuming independence of each sprite class, appearance, mask, and transformation in each layer Transformation prior (uniform) Class prior (to be learned) Mask Appearance priors: - gaussian for s, m - uniform for T need to learn: - prior over c (pi) - means and variances of sprite appearances and masks - noise beta - uniform prior over transformation due in part to discretization used πξπ₯, π π , π π , π π ξ=π ξπ₯; πΏ ξ ξ πΏ π π π π ξ ξ β π π π π β π π π π ξ ,ξΈξ β πΏ ξπ ξ π π ; ξ π π , ξ π π ξ π ξ π π ; ξ½ π π , ξ π π ξ ξ π π ξ π π ξ 4/10/06 Nik Melchior - LOCUS
9
Formulation Use EM to maximize the posterior
π ξ π π , π π , π π , π π β£πξ = π ξπ₯, π π , π π , π π , π π ξ π ξπ₯, π π , π π , π π , π π ξ β πΏ πξ π π , π π ξπξ π π ξπξ π π ξ exact posterior intractable - (CJ)^L - C = # classes - J = # transformations - L = # layers - factorize to make it linear CJL compute KL divergence of 2 distributions π·= ξ πΏ πξ π π , π π ξπξ π π ξπξ π π ξ ξ =ln π ξ π π , π π , π π , π π β£π₯ξ πΏ π ξ π π , π π ξπξ π π ξπξ π π ξ 4/10/06 Nik Melchior - LOCUS
10
Formulation Maximize F
πΉ=π·ξlnπξπ₯ξ = ξ πΏ πξ π π , π π ξπξ π π ξπξ π π ξ ξ =ln π ξπ₯, π π , π π , π π , π π ξ πΏ π ξ π π , π π ξπξ π π ξπξ π π ξ Maximize F log of numerator is a sum of Mahalanobis distances q ln 1/q is the entropy of q 4/10/06 Nik Melchior - LOCUS
11
E step For each single image:
the mean should stay close to the overall mean, but large where image does not match background variance should be high where the transformed image is similar to the background similar update for xi_T, the transform, which ensures that image matches transformed sprite 4/10/06 Nik Melchior - LOCUS
12
M step 4/10/06 Nik Melchior - LOCUS
13
Results 1 fps at 320x240 4/10/06 Nik Melchior - LOCUS
14
LOCUS Learning Object Classes with Unsupervised Segmentation 4/10/06
Nik Melchior - LOCUS
15
Comparison to Flexible Sprites
Classes instead of single sprites more complex appearance model addition of edge model 2 layers: object and model weak prior of palettes (appearance) one for object, one for background 4/10/06 Nik Melchior - LOCUS
16
Background appearance Ξ»0
LOCUS model Shared between images Class shape Ο Class edge sprite ΞΌo,Οo Deformation field D Position & size T Different for each image Emphasise class model (edge + mask) (shared) β all other variables per-image. Emphasise LEARN EVERYTHING SIMULTANEOUSLY. Mask m Edge image e Background appearance Ξ»0 Object appearance Ξ»1 4/10/06 Nik Melchior - LOCUS Image Winn and Jojic, 2005
17
LOCUS model Deformation field D Markov Random Field Transform T
scaling and translation rotation or full affine transform possible Mask m Edge model e Appearance model Gaussian mixture model (histogram of colors or textures) D: MRF penalises squared distance b/w neighboring vectors m: favors short segmentation boundaries e: not applied to background due to expected noise 4/10/06 Nik Melchior - LOCUS
18
EM formulation Posterior similar to flexible sprites
πξξ, ξ π , ξ π , ξ π , ξ π β£πΌξβ π π ξ ξ π ξπξ ξ π π , ξ π π ξπξ ξ π π , ξ π π ξ Posterior similar to flexible sprites Iteratively optimize over 9 latent variables - mixture model parameters - binary mask - position and size transformation - deformation field - real-valued mask - object edge model - background edge model ξ,π,π,π·,ξ, ξ π , ξ π , ξ π , ξ π 4/10/06 Nik Melchior - LOCUS
19
Results 4/10/06 Nik Melchior - LOCUS
20
Results Comparison to Borenstein fragments
Percentages indicate correctly labeled pixels across all images Last row removes class shape and edge information 4/10/06 Nik Melchior - LOCUS
21
Results 4/10/06 Nik Melchior - LOCUS
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.