1
CVPR/ICCV 09 Paper Reading Dan Wang Nov. 6, 2009
2
Papers
CVPR:
– Learning Color and Locality Cues for Moving Object Detection and Segmentation
ICCV:
– Texel-based Texture Segmentation
3
Learning Color and Locality Cues for Moving Object Detection and Segmentation Feng Liu and Michael Gleicher
4
Authors
What does the paper do?
Problems of previous methods
How is it done?
6
Author 1: Feng Liu
Graduate student, Computer Sciences Department, University of Wisconsin, Madison
Publications
– Feng Liu, Michael Gleicher, Hailin Jin and Aseem Agarwala. Content-Preserving Warps for 3D Video Stabilization. ACM SIGGRAPH 2009
– Feng Liu, Yu-hen Hu and Michael Gleicher. Discovering Panoramas in Web Videos. ACM Multimedia 2008
– Feng Liu and Michael Gleicher. Texture-Consistent Shadow Removal. ECCV 2008
– Feng Liu and Michael Gleicher. Video Retargeting: Automating Pan and Scan. ACM Multimedia 2006
7
Author 2: Michael Gleicher
Professor, Computer Sciences Department, University of Wisconsin, Madison
Positions
– 2009 – present: Professor
– 2004 – 2009: Associate Professor
– 1998 – 2004: Assistant Professor
http://pages.cs.wisc.edu/~gleicher/CV.pdf
8
Authors
What does the paper do?
Problems of previous methods
How is it done?
9
Problem of Previous Methods
– Most previous automatic methods rely on object or camera motion to detect the moving object.
– Small motion of the object or camera does not provide sufficient information for these methods.
10
Abstract
The paper presents an algorithm for automatically detecting and segmenting a moving object from a monocular video.
Existing methods rely on motion to detect the moving object; when motion is sparse and insufficient, they fail.
What does this paper do?
– An unsupervised algorithm to learn object color and locality cues from the sparse motion information.
How?
– Detect key frames and sub-objects
– Learn color and locality cues from the sub-objects
– Combine the cues in an MRF framework
11
Learning Object Cues: Motion Cues → Key Frame Extraction → Segment Moving Sub-objects from Key Frames → Learning Color and Locality Cues
Moving Object Segmentation: Propagate locality cues from key frames → Segment the moving objects
12
Moving Object
What is a "moving object"?
– Some compact regions with apparent motion different from the background.
How to detect a moving object
– Estimate the global motion in the video
– Calculate, at each pixel, the discrepancy between the object motion and the global motion
13
Detect Moving Object
Model
– Use a homography to model the global motion between two consecutive frames.
Feature
– Use a SIFT feature-based method to estimate the homography (a projective transformation); details in [19]:
– Extract SIFT features from each frame
– Establish feature correspondences between neighboring frames
– Estimate the homography
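The homography step can be sketched without the feature machinery: given point correspondences between two frames (here synthetic, standing in for matched SIFT keypoints), the homography can be estimated with the standard DLT algorithm. This is a minimal sketch, not the specific method of [19]:

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate the 3x3 homography H with dst ~ H @ src via the DLT
    algorithm: stack two linear constraints per correspondence and take
    the null vector of the resulting system."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector for the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

# Synthetic correspondences standing in for matched SIFT features:
H_true = np.array([[1.02, 0.01, 3.0],
                   [-0.02, 0.98, -2.0],
                   [1e-4, -1e-4, 1.0]])
src = np.random.RandomState(0).uniform(0, 100, size=(12, 2))
p = np.hstack([src, np.ones((len(src), 1))]) @ H_true.T
dst = p[:, :2] / p[:, 2:3]
H_est = estimate_homography(src, dst)   # recovers H_true up to scale
```

A real pipeline would wrap this in RANSAC to reject mismatched features before trusting the estimated global motion.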
14
Detect Moving Object
With the homography, we calculate the motion cue mc at pixel (x, y) as:
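The slide's formula for mc was an image and did not survive extraction. One plausible instantiation (an assumption, not necessarily the paper's exact definition) is the per-pixel intensity residual after compensating the global motion with the homography:

```python
import numpy as np

def motion_cue(frame_a, frame_b, H):
    """Motion cue per pixel: |I_a(x, y) - I_b(H(x, y))|, i.e. the residual
    after compensating the global (camera) motion with homography H.
    ASSUMPTION: the slide's formula was lost; this absolute intensity
    residual is only one plausible instantiation. frame_b is sampled with
    nearest-neighbour rounding for brevity."""
    h, w = frame_a.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    warped = H @ pts
    wx = np.clip(np.round(warped[0] / warped[2]).astype(int), 0, w - 1)
    wy = np.clip(np.round(warped[1] / warped[2]).astype(int), 0, h - 1)
    return np.abs(frame_a - frame_b[wy, wx].reshape(h, w))

# A bright pixel that moves one step (with a static camera, H = identity)
# produces a nonzero cue at both its old and new positions.
a = np.zeros((8, 8)); a[2, 2] = 1.0
b = np.zeros((8, 8)); b[2, 3] = 1.0
mc = motion_cue(a, b, np.eye(3))
```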
15
Learning Object Cues: Motion Cues → Key Frame Extraction → Segment Moving Sub-objects from Key Frames → Learning Color and Locality Cues
Moving Object Segmentation: Propagate locality cues from key frames → Segment the moving objects
16
Key Frame Extraction
Definition
– A frame where a moving object or its part can be reliably inferred from motion cues.
– Motion cues are likely reliable when they are strong and compact.
17
Learning Object Cues: Motion Cues → Key Frame Extraction → Segment Moving Sub-objects from Key Frames → Learning Color and Locality Cues
Moving Object Segmentation: Propagate locality cues from key frames → Segment the moving objects
18
Segment Moving Sub-objects from Key Frames
Problem
– Not all pixels of the moving object have significant motion cues. Motion cues are sparse!
19
Segment Moving Sub-objects from Key Frames
Solution
– Neighboring pixels are likely to have the same label.
– Neighboring pixels with similar colors are even more likely to have the same label.
Solved with the graph cut algorithm.
20
Segment Moving Sub-objects from Key Frames
MRF priors on the labels model the interaction between neighboring pixels. (Notation in the slide's equation: the pixel label, the number of pixels, and the neighborhood of pixel i.)
21
Segment Moving Sub-objects from Key Frames The likelihood of image I given a labeling can be modeled as follows:
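The MRF formulation on these slides (whose equations were images) can be sketched in the usual form: a per-pixel data term plus a contrast-sensitive Potts smoothness term. The exact terms below are assumed, not taken from the paper, and the sketch only evaluates the energy that graph cuts would minimize:

```python
import numpy as np

def mrf_energy(labels, data_cost, image, lam=1.0, beta=1.0):
    """Binary-labeling energy: per-pixel data costs plus a contrast-sensitive
    Potts term charging lam * exp(-beta * dI**2) whenever 4-neighbours take
    different labels (so cuts along strong image edges are cheaper).
    ASSUMPTION: the paper minimises an energy of this general shape with
    graph cuts; the specific terms here are illustrative."""
    h, w = labels.shape
    e = data_cost[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
    for axis in (0, 1):
        diff = np.diff(labels, axis=axis) != 0   # neighbour label disagreement
        dI = np.diff(image, axis=axis)           # neighbour intensity contrast
        e += lam * (np.exp(-beta * dI ** 2) * diff).sum()
    return e

# A cut aligned with a real image edge costs less than one in a flat region.
img = np.zeros((4, 4)); img[:, 2:] = 1.0
dc = np.zeros((4, 4, 2))                         # uninformative data term
edge_aligned = (img > 0).astype(int)             # boundary on the image edge
misaligned = np.zeros((4, 4), int); misaligned[:, 1:] = 1   # boundary in flat area
```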
22
Learning Object Cues: Motion Cues → Key Frame Extraction → Segment Moving Sub-objects from Key Frames → Learning Color and Locality Cues
Moving Object Segmentation: Propagate locality cues from key frames → Segment the moving objects
23
Learning Color and Locality Cues
Assumption
– The moving sub-objects from all the key frames form a complete sampling of the moving objects.
Procedure
– Convert to the Lab color space
– Build a GMM
24
Learning Color and Locality Cues
The spatial affinity of pixel i to the moving object defines the location likelihood. (In the slide's equation, F denotes the sub-object and the remaining parameters its position.)
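A minimal sketch of the colour side of these cues, under two loud simplifications: a single Gaussian per class stands in for the paper's GMM, and the RGB-to-Lab conversion is omitted:

```python
import numpy as np

def fit_gaussian(pixels):
    """Fit one full-covariance Gaussian to (N, 3) colour samples.
    ASSUMPTION: the paper fits a GMM in Lab space; a single Gaussian per
    class is a minimal stand-in, and the Lab conversion is skipped."""
    mu = pixels.mean(axis=0)
    cov = np.cov(pixels.T) + 1e-6 * np.eye(3)   # regularise for stability
    return mu, cov

def log_likelihood(pixels, mu, cov):
    """Log of the Gaussian density at each (N, 3) colour sample."""
    d = pixels - mu
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (np.einsum('ni,ij,nj->n', d, inv, d)
                   + logdet + 3 * np.log(2 * np.pi))

# Colour models learned from key-frame sub-objects (synthetic samples here);
# a new pixel is then scored by which model explains it better.
rng = np.random.RandomState(1)
fg = rng.normal([200, 50, 50], 10, size=(500, 3))   # reddish moving object
bg = rng.normal([50, 50, 200], 10, size=(500, 3))   # bluish background
fg_mu, fg_cov = fit_gaussian(fg)
bg_mu, bg_cov = fit_gaussian(bg)
query = np.array([[195.0, 55.0, 45.0]])             # clearly reddish pixel
is_fg = log_likelihood(query, fg_mu, fg_cov) > log_likelihood(query, bg_mu, bg_cov)
```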
25
Learning Object Cues: Motion Cues → Key Frame Extraction → Segment Moving Sub-objects from Key Frames → Learning Color and Locality Cues
Moving Object Segmentation: Propagate locality cues from key frames → Segment the moving objects
27
Experiments Insignificant camera and object motion
28
Significant camera motion and uneven object motion
29
Significant camera and object motion
30
Discussion
Extracting a moving object from a video with little object and camera motion is no easier than with significant object and camera motion.
Contribution
– Unsupervised?
Currently
– Off-line
– Motion estimation is time-consuming
Future
– Parameters
– Background modeling
31
Texel-based Texture Segmentation Sinisa Todorovic, Narendra Ahuja Reporter: Wang Dan
32
Authors
Sinisa Todorovic
– Assistant Professor
– School of EECS, Oregon State University
– Publications: CVPR 2009, ICCV 2009, TPAMI 2008...
Narendra Ahuja
– Donald Biggar Willett Professor
– Beckman Institute, UIUC
– Publications: ICCV 2009, IJCV 2008, TPAMI…
33
Problem
Given an arbitrary image, segment all texture subimages.
– Texture = spatial repetition of texture elements, i.e., texels
– Texels are not identical, but only statistically similar to one another.
– Texel placement along the texture surface is not periodic, but only statistically uniform.
– Texels in the image are not homogeneous, but regions that may contain subregions.
34
Rationale
Texels occupy image regions.
– If the image contains texture, many regions will have similar properties: color, shape, layout of subregions, orientation, relative displacements.
– The pdf of region properties will therefore have modes.
Texture detection and segmentation = detection of the modes of the pdf of region properties.
35
Method Overview
36
Contributions
– No assumptions about the pdf of texel properties
– Both the appearance and the placement of the texels are allowed to be stochastic and correlated
– A new hierarchical, adaptive-bandwidth kernel to capture texel structural properties
37
Method Description
– Define a feature space of region properties; the descriptor of each region is a data point in this space.
– Partition the feature space into bins by a Voronoi tessellation.
– Run the mean shift with the new, hierarchical kernel.
– Regions under a pdf mode comprise the texture subimage.
38
Key Steps
– The feature space of region properties
– Voronoi-based binned mean shift
– The hierarchical kernel
39
The Feature Space of Region Properties
The image's hierarchical structure is obtained with a multiscale segmentation algorithm [1], [4].
40
The Feature Space of Region Properties
A descriptor vector of properties x_i of image region i:
1. Average contrast across i's boundary
2. Area
3. Standard deviation of i's children's areas
4. Displacement vector between the centroids of i and its parent region
5. Perimeter
6. Aspect ratio of the intercepts of i's principal axes with i's boundary
7. Orientation: the angle between the principal axes and the x-axis
8. Centroid of i
PCA keeping 95% of the variance; the descriptor is not scale- or rotation-invariant.
41
Key Steps
– The feature space of region properties
– Voronoi-based binned mean shift
– The hierarchical kernel
42
Voronoi Diagram (figure: sites A, B, C and their cells)
43
Definition
– Let P = {p_1, p_2, ..., p_n} be a set of points in the plane (or in any dimensional space), which we call sites.
– Define V(p_i), the Voronoi cell for p_i, to be the set of points q that are closer to p_i than to any other site. That is: V(p_i) = {q | dist(p_i, q) < dist(p_j, q), for j != i}.
In any case, it partitions the feature space…
http://www.dma.fi.upm.es/mabellanas/tfcs/fvd/voronoi.html
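Assigning a descriptor to its Voronoi bin reduces to a nearest-site query, matching the definition above; a minimal sketch:

```python
import numpy as np

def voronoi_cell(q, sites):
    """Index of the Voronoi cell containing q: the nearest site wins,
    exactly the definition V(p_i) = {q | dist(p_i, q) < dist(p_j, q)}."""
    sites = np.asarray(sites, dtype=float)
    return int(np.argmin(np.linalg.norm(sites - q, axis=1)))

sites = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
cell = voronoi_cell(np.array([1.0, 2.0]), sites)   # falls in site 0's cell
```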
44
Voronoi-based Binned Mean Shift
New variable-bandwidth matrix
45
Key Steps
– The feature space of region properties
– Voronoi-based binned mean shift
– The hierarchical kernel
46
Hierarchical Kernel (vs. the standard Gaussian kernel)
Motivation
– Texels, in general, are not homogeneous-intensity regions, but may contain hierarchically embedded subregions.
– Since region descriptors represent image regions, we can define hierarchical relationships between the descriptors based on the embedding of corresponding smaller regions within larger regions in the image.
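For contrast, the standard Gaussian-kernel mean shift that the hierarchical kernel replaces can be sketched as follows (only the kernel differs; the update rule is unchanged):

```python
import numpy as np

def mean_shift_mode(x, data, bandwidth=1.0, iters=50):
    """Iterate the Gaussian-kernel mean-shift update from start point x
    until it converges to a mode of the kernel density estimate of `data`."""
    for _ in range(iters):
        w = np.exp(-np.sum((data - x) ** 2, axis=1) / (2 * bandwidth ** 2))
        x_new = (w[:, None] * data).sum(axis=0) / w.sum()
        if np.linalg.norm(x_new - x) < 1e-8:
            break
        x = x_new
    return x

# Two well-separated clusters of region descriptors -> two pdf modes.
rng = np.random.RandomState(2)
data = np.vstack([rng.normal(0.0, 0.3, (100, 2)),
                  rng.normal(5.0, 0.3, (100, 2))])
mode_a = mean_shift_mode(np.array([0.5, 0.5]), data)   # converges near (0, 0)
mode_b = mean_shift_mode(np.array([4.5, 4.5]), data)   # converges near (5, 5)
```

In the paper, all descriptors under one pdf mode are grouped into one texture subimage.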
47
Voronoi partitioning of the feature space
– Suppose x belongs to B_i, and x' belongs to B_j (x, x' are arbitrary points; x_i, x_j are the region descriptors defining the bins).
48
x belongs to B_i. Computed by finding the maximum subtree isomorphism between the two trees rooted at x_i and x_j as:
49
Experimental Evaluation
Qualitative evaluation
Quantitative evaluation
– G: the area of the true texture
– D: the area of a subimage that our approach segments
– Segmentation error per texture:
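The segmentation-error formula on the slide was an image and did not survive extraction; the sketch below assumes the common area-overlap form 1 − |G∩D| / |G∪D|, which may differ from the paper's exact definition:

```python
import numpy as np

def segmentation_error(G, D):
    """Per-texture segmentation error from ground-truth mask G and
    detected mask D. ASSUMPTION: the slide's formula was lost; this uses
    the common area-overlap form 1 - |G & D| / |G | D|."""
    G = G.astype(bool)
    D = D.astype(bool)
    inter = np.logical_and(G, D).sum()
    union = np.logical_or(G, D).sum()
    return 1.0 - inter / union if union else 0.0

G = np.zeros((10, 10), bool); G[2:8, 2:8] = True    # 36-pixel ground truth
D = np.zeros((10, 10), bool); D[2:8, 2:10] = True   # over-segmented: 48 pixels
err = segmentation_error(G, D)                      # 1 - 36/48 = 0.25
```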
50
(1) 100 collages of randomly mosaicked, 111 distinct Brodatz textures, where each texture occupies at least 1/6 of the collage
51
(2) 180 collages of randomly mosaicked, 140 distinct Prague textures from 10 thematic classes (e.g., flowers, plants, rocks, textile, wood, etc.), where each texture occupies at least 1/6 of the collage.
52
(3) 100 Aerial-Produce images, where 50 aerial images show housing developments, agricultural fields, and landscapes, and 50 images show produce aisles in supermarkets.
53
(4) Berkeley segmentation dataset
54
Quantitative evaluation (Brodatz)
– 93.3% ± 3.7: this paper's kernel, capturing texel structural properties
– 77.9% ± 4.1: simple Gaussian kernel with this paper's variable bandwidth (ignoring structural properties)
– 62.3% ± 7.8: simple Gaussian kernel with the variable bandwidth of [7]
55
Quantitative evaluation (Prague)
– G: the area of the true texture; D: the area of a subimage that our approach segments
– CS: correct segmentation
– OS: over-segmentation – a G that is split into smaller regions D
– US: under-segmentation – a D that covers more than one G
– ME counts every G that does not belong to CS
– NE counts every D that does not belong to US
"The state-of-the-art in unsupervised texture segmentation"
57
Conclusion
– We have presented a texel-based approach to segmenting image parts occupied by distinct textures, by capturing the intrinsic and placement properties of distinct groups of texels.
– Experimental evaluation on texture mosaics and real-world images suggests that capturing the structural properties of texels is very important for texture segmentation.
– To account for texel substructure, we have derived and used a hierarchical, variable-bandwidth kernel in the mean shift.
Slides are partly from Sinisa Todorovic.