1
Tensor Voting and Applications
Based on a CVPR Tutorial by Gérard Medioni and Chi-Keung Tang
2
The Problem
Develop a flexible model for extraction of salient geometric features: junctions, lines, surfaces, …, subjective contours
Traditional bottom-up approaches (Marr '78, '82):
Low-level vision (edge detection, etc.): noisy results – "we will correct them later" using higher-level knowledge
Mid-level vision?
High-level vision (recognition, interpretation, tracking): applications that work well with perfect data
3
Motivation Computational framework to address a wide range of computer vision problems Computer Vision attempts to infer scene descriptions from one or more images Primitives and constraints might vary from problem to problem Many problems can be formulated as perceptual organization problems in an appropriate space
4
Need for Constraints
Since the problem has an infinite number of solutions, constraints need to be imposed
Constraints may be inconsistent and difficult to implement
5
Perceptual Organization
Gestalt principles: proximity, similarity, good continuation, closure, common fate, simplicity
6
The Smoothness Constraint
Matter is cohesive, hence smoothness
Difficult to implement – it is true "almost everywhere"
7
Overview Related Work Tensor Voting in 2-D Tensor Voting in 3-D
Tensor Voting in N-D Application to Vision Problems Stereo Visual Motion Binary-Space-Partitioned Images 3-D Surface Extraction from Medical Data Epipolar Geometry Estimation for Non-static Scenes Image Repairing Range and 3-D Data Repairing Video Repairing Luminance Correction Conclusions
8
Related Work
9
Regularization
Computer vision problems are inverse, ill-posed problems
Constraints needed to derive a solution
Can be formulated as optimization
Selection of the objective function is not trivial
Iterative
10
Relaxation Labeling Problems posed as the assignment of labels to tokens Remove labels that violate constraints and iteratively restrict solution space Continuous, discrete, deterministic and stochastic implementations Iterative
11
Robust Methods Model fitting based on robust statistics of data
Classification of data as inliers and outliers M-estimators, LMedS, RANSAC etc. Very robust to noise Can operate only with limited and known prior models
12
Clustering Group or partition the data according to affinity measures
Affinities encoded as edges of graph Partition the data by cutting the graph in a way that results in minimum disassociation between clusters (global decision) Generalized eigenvalue problem
13
Structural Saliency
Structural saliency is a property of the structure as a whole
Parts of the structure are not salient in isolation
Shashua and Ullman defined a saliency measure based on proximity and curvature variation
Large number of methods from computer science and neuroscience
Based on local interactions between tokens
Saliency defined as a scalar
14
Tensor Representation
In the TV formalism, each token is represented as a tensor, which in the 3-D case can be visualized as an ellipsoid. The tensor is defined by 3 vectors (e1, e2, e3) that give its orientation, and by 3 scalars (λ1, λ2, λ3) that give its size and shape. In 3-D there are 3 types of features – surfaces, curves, and points – which correspond to 3 elementary tensors. The knowledge that a token lies on a surface is intuitively encoded as a stick tensor, where only one dimension dominates, along the surface normal. The length of the stick encodes the saliency, which is the confidence in this knowledge. A token that lies on a curve is encoded as a plate tensor, where 2 dimensions dominate, in the plane of curve normals. Finally, a point is encoded as a ball tensor, where no dimension dominates, showing no particular preference for any orientation.
[Figure: generic tensor; stick tensor – surface element; plate tensor – curve element; ball tensor – point element]
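As an illustration (not the tutorial's own code), the three elementary 3-D tensors can be built from outer products; the sorted eigenvalues λ1 ≥ λ2 ≥ λ3 then reveal the feature type:

```python
import numpy as np

def stick_tensor(normal):
    """Surface element: one dominant eigendirection along the surface normal."""
    n = np.asarray(normal, dtype=float)
    n /= np.linalg.norm(n)
    return np.outer(n, n)                    # eigenvalues (1, 0, 0)

def plate_tensor(n1, n2):
    """Curve element: two dominant directions spanning the plane of curve
    normals (n1 and n2 are assumed orthonormal)."""
    return stick_tensor(n1) + stick_tensor(n2)

def ball_tensor():
    """Point element: no preferred orientation."""
    return np.eye(3)                         # eigenvalues (1, 1, 1)

T = stick_tensor([0.0, 0.0, 1.0])
print(np.sort(np.linalg.eigvalsh(T))[::-1])  # [1. 0. 0.]
```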
15
Voting
[Figure: voting illustrated in 3-D]
16
Voting This example illustrates how the voting process works. We start with a set of points, where some points produce a salient perception of a curve, and the others are just noise. As no information is initially known, each point is encoded as a ball tensor, then each tensor casts a vote in its neighborhood. At each recipient the votes are added, and consequently the tensors which lie on the salient curve will reinforce each other, so the tensors deform along the prevailing orientations, while the outliers will receive no significant support. The remaining question is how the votes are generated. This is illustrated here in the 2-D case, for simplicity. In 2-D there are only 2 features – curves and points, and therefore only ball and stick tensors (here stick corresponds to a curve). Suppose for a moment that we know that a curve is passing through point P. The vote sent by P to Q must encode the smoothest continuation of this curve from P to Q, which is a circular arc tangent to the curve at P. This will give the vote orientation. The vote strength decays exponentially with distance and curvature, which is illustrated here by Q’ and Q’’ that receive weaker votes than Q. In practice, votes are generated according to predefined voting fields, one for each feature type. The stick field is shown here with its color-coded strength. If no orientation is initially known at point P (which is typically the case), the vote is generated by rotating a stick vote and integrating all the contributions. The corresponding isotropic ball field is shown here.
17
3-D Examples
The TV framework is a methodology, developed over recent years in our lab, that is able to extract salient features from sparse and noisy data, as shown in this figure for the 3-D case, where the features can be surfaces, curves and points. The method is non-iterative and the only free parameter is the scale of analysis, which is an inherent property of human vision.
[Figure: input points; extracted surfaces and curves]
18
Overview Related Work Tensor Voting in 2-D Tensor Voting in 3-D
Tensor Voting in N-D Application to Vision Problems Stereo Visual Motion Binary-Space-Partitioned Images 3-D Surface Extraction from Medical Data Epipolar Geometry Estimation for Non-static Scenes Image Repairing Range and 3-D Data Repairing Video Repairing Luminance Correction Conclusions
19
Tensor Voting in 2-D Representation with tensors
Tensor voting and voting fields First order voting Vote analysis and structure inference Examples Illusory contours
20
The Tensor Voting Framework
Data Representation: tensors
Constraint Representation: voting fields enforce smoothness and proximity
Communication: voting – non-iterative, no initialization required
21
Desirable Properties of the Representation
Local
Layered
Object-centered
Discrimination between local model misfits and noise
22
The Approach in a Nutshell
Each input site propagates its information in a neighborhood Each site collects the information cast there Salient features correspond to local extrema
23
Properties of Tensor Voting
Non-Iterative Can extract all features simultaneously One parameter (scale) Non-critical thresholds General – no parametric model imposed
24
Second Order Symmetric Tensors
Second order, symmetric, non-negative definite tensors
In 2-D: a 2x2 matrix, equivalent to an ellipse
Special cases: "ball" and "stick" tensors
Any such tensor decomposes into a stick part plus a ball part (ellipse = stick + ball)
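The stick-plus-ball decomposition can be verified numerically; a minimal sketch:

```python
import numpy as np

def decompose_2d(T):
    """Split a 2x2 symmetric non-negative tensor into its stick and ball
    parts: T = (l1 - l2) * e1 e1^T + l2 * I."""
    vals, vecs = np.linalg.eigh(T)        # eigenvalues in ascending order
    l2, l1 = vals
    e1 = vecs[:, 1]                       # dominant eigenvector
    stick = (l1 - l2) * np.outer(e1, e1)
    ball = l2 * np.eye(2)
    return stick, ball

T = np.array([[2.0, 0.5],
              [0.5, 1.0]])
stick, ball = decompose_2d(T)
print(np.allclose(stick + ball, T))       # True: the parts recompose T
```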
25
Second Order Symmetric Tensors
Properties captured by a second order symmetric tensor:
shape: orientation certainty
size: feature saliency
26
Representation with Second Order Symmetric Tensors
27
Design of the Voting Field
28
Vote Strength
DF(s, κ, σ) = exp( −(s² + c·κ²) / σ² )
where σ = scale of voting, s = arc length, κ = curvature
For a receiver at distance l, at angle θ from the voter's tangent: s = θl / sin θ, κ = (2 sin θ) / l
Votes attenuate with the length of the smoothest path
Straight continuation is favored over curved
29
Scale of Voting The scale of voting is the single critical parameter in the framework Essentially defines the size of the voting neighborhood Gaussian decay has infinite extent, but it is cropped to where votes remain meaningful (e.g. 1% of voter saliency)
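The cropping radius follows directly from the Gaussian decay: for a straight continuation the strength is exp(−(s/σ)²), so the distance at which it falls to a given fraction is σ·sqrt(−ln fraction). A small sketch (the 1% cutoff is the value quoted above):

```python
import numpy as np

def neighborhood_radius(sigma, cutoff=0.01):
    """Distance at which a straight-continuation vote, exp(-(s/sigma)^2),
    decays to `cutoff` of the voter saliency."""
    return sigma * np.sqrt(-np.log(cutoff))

print(neighborhood_radius(10.0))   # ~21.46: votes beyond this are cropped
```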
30
Scale of Voting The scale is a measure of the degree of smoothness
Smaller scales correspond to small voting neighborhoods, fewer votes Preserve details More susceptible to outlier corruption Larger scales correspond to large voting neighborhoods, more votes Bridge gaps Smooth perturbations Robust to noise
31
Scale of Voting Results are not sensitive to reasonable selections of scale Quantitative evaluations – later
32
Fundamental Stick Voting Field
33
Fundamental Stick Voting Field
All other fields in any N-D space are generated from the Fundamental Stick Field: Ball Field in 2-D Stick, Plate and Ball Field in 3-D Stick, …, Ball Field in N-D
34
2-D Ball Field
The ball field B(P) is computed by integrating the contributions of the stick field S(P) over all rotations of the voter
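The integration over a rotating stick can be approximated numerically. The sketch below assumes a particular form of the stick vote (illustrative constants, not the authors' implementation) and checks the isotropy property mentioned on the next slide:

```python
import numpy as np

def stick_vote_tensor(receiver, sigma=10.0):
    """Second-order 2-D stick vote (a 2x2 tensor) from a voter at the
    origin with normal +y; illustrative constants."""
    x, y = receiver
    l = np.hypot(x, y)
    if l == 0.0:
        return np.zeros((2, 2))
    theta = np.arctan2(abs(y), abs(x))
    if theta > np.pi / 4:                     # outside the field's extent
        return np.zeros((2, 2))
    s = l if theta == 0.0 else theta * l / np.sin(theta)
    kappa = 2.0 * np.sin(theta) / l
    c = (-16.0 * np.log(0.1) * (sigma - 1.0)) / np.pi**2
    strength = np.exp(-(s**2 + c * kappa**2) / sigma**2)
    alpha = np.arctan2(y, x)
    n = np.array([-np.sin(2 * alpha), np.cos(2 * alpha)])
    return strength * np.outer(n, n)

def ball_vote_tensor(receiver, sigma=10.0, steps=90):
    """B(P): average stick votes over all rotations of the voter."""
    T = np.zeros((2, 2))
    for phi in np.linspace(0.0, np.pi, steps, endpoint=False):
        R = np.array([[np.cos(phi), -np.sin(phi)],
                      [np.sin(phi),  np.cos(phi)]])
        p = R.T @ np.asarray(receiver, dtype=float)  # receiver in voter frame
        T += R @ stick_vote_tensor(p, sigma) @ R.T   # vote back in world frame
    return T / steps

# Isotropy: the ball field's strength depends on distance only
print(np.trace(ball_vote_tensor((5.0, 0.0))),
      np.trace(ball_vote_tensor((0.0, 5.0))))
```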
35
2-D Voting Fields
Each input site propagates its information in a neighborhood:
the stick component votes with the stick field
the ball component votes with the ball field
a generic tensor votes with both, each weighted by the corresponding saliency
36
Voting
Voting from a ball tensor is isotropic – a function of distance only
The stick voting field is aligned with the orientation of the stick tensor
37
Need for First Order Information
Second order tensors are insensitive to signed orientation (direction)
They cannot discriminate between interior points and terminations of perceptual structures
[Figure: stick tensors B, C, D along a curve and their counterparts at terminations]
38
Polarity Vectors
Representation augmented with polarity vectors (first order tensors)
Sensitive to the direction from which votes are received
Exploit the property of boundaries that all their neighbors lie in the same half-space
39
Polarity
[Figure: input; saliency; polarity]
40
First Order Voting
Votes are cast along the tangent of the smoothest path
Vector votes instead of tensor votes
Accumulated by vector addition
[Figure: second-order vote and first-order vote received at P]
41
First Order Voting Fields
Magnitude is the same as in the second order case
42
Vote Collection
Each site collects the information cast there:
second order votes by tensor addition: T = Σᵢ Tᵢ
first order votes by vector addition: p = Σᵢ vᵢ
43
Each site accumulates second order votes by tensor addition:
[Figure: example tensor sums – stick + stick, stick + ball, …]
Results of accumulation are usually generic tensors
44
Second Order Vote Interpretation
At each site the accumulated tensor is decomposed as T = (λ1 − λ2) e1e1ᵀ + λ2 (e1e1ᵀ + e2e2ᵀ)
λ1 − λ2: curve saliency (SMap), with e1 the estimated curve normal
λ2: ball saliency (BMap)
Salient features correspond to local extrema of the saliency maps
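A minimal 2-D sketch of this interpretation (helper names are my own): two aligned votes reinforce the stick component, while misaligned votes also grow the ball component, producing a generic tensor.

```python
import numpy as np

def interpret_2d(T):
    """Decompose an accumulated 2-D tensor into saliencies:
    curve saliency = l1 - l2 (SMap), ball saliency = l2 (BMap),
    estimated normal = dominant eigenvector e1."""
    vals, vecs = np.linalg.eigh(T)        # ascending eigenvalues
    l2, l1 = vals
    return l1 - l2, l2, vecs[:, 1]

def stick(angle):
    """Unit stick vote whose normal makes `angle` with the x-axis."""
    n = np.array([np.cos(angle), np.sin(angle)])
    return np.outer(n, n)

# Two aligned votes: purely a curve element
sm, bm, _ = interpret_2d(stick(0.0) + stick(0.0))
print(sm, bm)          # 2.0 0.0

# Misaligned votes: both saliencies positive (a generic tensor)
sm, bm, _ = interpret_2d(stick(0.0) + stick(np.pi / 3))
print(sm, bm)          # roughly 1.0 and 0.5
```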
45
First Order Vote Interpretation (curves)
Tokens near terminations accumulate first order votes from a consistent direction
Tokens along smooth structures receive opposite votes that cancel out
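A toy illustration of the cancellation argument, with made-up vote vectors (not the actual first order field):

```python
import numpy as np

# First order votes are vectors pointing along the smoothest path toward
# each voter. An interior token on a horizontal curve has neighbors on
# both sides; an endpoint token has all its neighbors on one side.
interior = [np.array([-1.0, 0.0]), np.array([1.0, 0.0])]
endpoint = [np.array([-1.0, 0.0]), np.array([-0.8, 0.0])]

polarity_interior = np.linalg.norm(np.sum(interior, axis=0))
polarity_endpoint = np.linalg.norm(np.sum(endpoint, axis=0))
print(polarity_interior, polarity_endpoint)   # 0.0 vs ~1.8
```

High polarity magnitude thus flags terminations (and, for regions, boundaries), while it vanishes in the interior.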
46
First Order Vote Interpretation (regions)
Tokens near discontinuities accumulate first order votes from a consistent direction
Tokens in the interior of smooth structures receive contradicting votes that cancel out
47
Structure Inference in 2-D
[Figure: structure inference example; points A and B]
48
Sensitivity to Scale
Input: 166 un-oriented inliers, 300 outliers
Dimensions: 960x720; scale ∈ [50, 5000]; voting neighborhood: [12, 114]
[Figure: input and results for scale = 50, 500, 5000; plot of curve saliency as a function of scale – blue: at A, red: at B]
49
Sensitivity of Orientation to Scale
Circle with radius 100 (unoriented tokens)
As more information is accumulated, the tokens better approximate the circle
50
Sensitivity of Orientation to Scale
Square 200x200 (unoriented tokens)
Junctions are detected and excluded
As scale increases to unreasonable levels (>1000), corners get rounded
51
Examples in 2-D
Input – gray: curve inliers, black: curve endpoints, red: junctions
52
Examples in 2-D
Input; curves and endpoints only; curves, endpoints and regions
53
Examples in 2-D
Input – curve and region inliers
54
Illusory Contours Aligned endpoints interpreted as forming illusory contours Layered scene interpretation
55
Illusory Contours in the Tensor Voting Framework
Endpoint detection
Detected endpoints used as inputs for illusory contour inference
Use the polarity vector (parallel to the real curve's tangent) as the normal to the illusory curve
56
Illusory Contours and Voting Fields
Since the polarity vector (the tangent of the detected curve segment) serves as the curve normal, the resulting fields are orthogonal to the regular ones
Votes are cast only into the half-space away from the original curve segments
57
Illusory Contour Example
[Figure: input; detected endpoints and polarity vectors; inferred illusory contour]