Grouping and Segmentation Previously: model-based grouping (e.g., straight line, figure 8…). Now: general "bottom-up" image organization. (Like detecting specific brightness patterns vs. "interesting" patterns/interest points/corners.)
Grouping Grouping is the process of associating similar image features together
Grouping Grouping is the process of associating similar image features together. The Gestalt School:
–Proximity: nearby image elements tend to be grouped.
–Similarity: similar image elements...
–Common fate: image elements with similar motion...
–Common region: image elements in the same closed region...
–Parallelism: parallel curves or other parallel image elements...
–Closure: image elements forming closed curves...
–Symmetry: symmetrically positioned image elements...
–Good continuation: image elements that join up nicely...
–Familiar Configuration: image elements giving a familiar object...
Good continuation
Common Form: (includes color and texture)
Connectivity
Symmetry
Convexity (stronger than symmetry?)
Good continuation also stronger than symmetry?
Closure: closed curves are much easier to pick out than open ones
Grouping and Segmentation Segmentation is the process of dividing an image into regions of “related content” Courtesy Uni Bonn
Grouping and Segmentation Both are ill-defined problems. "Related" or "similar" are "high level" concepts that are hard to apply directly to image data. How are they used? Are they "early" processes that precede recognizing objects? There is a lot of research and some pretty good algorithms, but no single best approach.
Boundaries
Problem: find the best path between two boundary points. The user can help by clicking on the endpoints.
How do we decide how good a path is? Which of two paths is better?
Discrete Grid A curve is good if it follows the object boundary, so it should have:
–Strong gradients along the curve (on average)
–Curve directions roughly perpendicular to gradients (since this is usually true for boundaries)
–Smoothness, not jaggedness: object boundaries are typically smooth, while curves that follow "noise edges" are typically jagged
So a good curve has:
–Low curvature (on average) along the curve
–Slow change in gradient direction along the curve
Discrete Grid How to find the best curve? Approach:
–Define a cost function on curves (given any curve, a rule for computing its cost)
–Define it so good curves have lower cost (non-wiggly curves passing mostly through edgels have low cost)
–Search for the best curve, the one with the lowest cost (minimize the cost function over all curves)
How to compute cost?
–At every pixel in the image, define costs for the "curve fragments" connecting it to adjoining pixels
–Compute the curve cost by summing up the costs of its fragments
Defining cost: smoothness Path: a series of pixels $p_1, p_2, \dots, p_n$. A good curve has small curvature at each pixel (small direction changes, small on average when summing along the curve) and small changes in gradient direction along the curve. One possible term in the cost definition, for the fragment from $p_i$ to $p_{i+1}$ with unit direction $\hat d_i$:
$$|\hat g_i \cdot \hat d_i| + |\hat g_{i+1} \cdot \hat d_i|,$$
where $\hat g_i$ is the unit gradient at $p_i$. This is small when the gradients at both pixels are nearly perpendicular to the fragment direction. If this is small at both fragments $i$ and $i+1$, then the change in gradient direction at $p_{i+1}$ is small, so a low average value of this quantity along the curve means a good curve.
Path Cost Function (a better path has smaller cost) Path: $p_1, p_2, \dots, p_n$. Total cost: sum the cost for each pixel over the whole path. One possible cost definition (small for high gradient, small direction change):
$$\text{cost}(p_1, \dots, p_n) = \sum_i \left[ \frac{1}{1 + \|\nabla I(p_i)\|} + \lambda\, |\theta_{i+1} - \theta_i| \right]$$
A sketch of this cost appears below.
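A minimal sketch of evaluating such a path cost; the exact formula is an assumption consistent with the annotations above, not the original system's cost:

```matlab
% One possible path cost (illustrative): an edge-strength term plus a
% direction-change (smoothness) term. gx, gy: image gradient components;
% path: n x 2 list of pixel coordinates [row, col]; lambda: trade-off weight.
function c = path_cost(path, gx, gy, lambda)
    n = size(path, 1);
    c = 0;
    for i = 1:n-1
        p = path(i, :);
        gmag = hypot(gx(p(1), p(2)), gy(p(1), p(2)));
        c = c + 1 / (1 + gmag);             % small where the gradient is strong
        if i > 1
            d1 = path(i, :) - path(i-1, :); % previous step direction
            d2 = path(i+1, :) - path(i, :); % next step direction
            dtheta = abs(atan2(d2(1), d2(2)) - atan2(d1(1), d1(2)));
            dtheta = min(dtheta, 2*pi - dtheta);  % wrap the angle difference
            c = c + lambda * dtheta;        % small for small direction changes
        end
    end
end
```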
How do we find the best path? Computer Science… Remember: a curve is a path through the grid, the cost sums over each step of the path, and we want to minimize the cost.
Map problem to Graph Each pixel is a node; graph edges connect neighboring pixels; edge weight = cost of the fragment connecting the two pixels.
Map problem to Graph Example cost: edge weight = cost of the fragment connecting the two pixels. Note: this is just an example of a cost function; the cost function actually used differs. A sketch of the construction follows.
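A minimal sketch of building the weighted grid graph, assuming an example cost that makes steps between strong-edge pixels cheap (the real cost differs, as noted above):

```matlab
% Map the image to a 4-connected grid graph with fragment costs
% (illustrative). W(r,c,k) is the cost of stepping from pixel (r,c)
% to its k-th neighbor.
I = kron([0 255; 255 0], ones(32));      % toy image with strong edges
[gx, gy] = gradient(I);
gmag = hypot(gx, gy);                    % gradient magnitude
[R, C] = size(I);
nbr = [-1 0; 1 0; 0 -1; 0 1];            % 4-connected neighbor offsets
W = inf(R, C, 4);
for r = 1:R
    for c = 1:C
        for k = 1:4
            rr = r + nbr(k,1); cc = c + nbr(k,2);
            if rr >= 1 && rr <= R && cc >= 1 && cc <= C
                % example cost: cheap between strong-edge pixels
                W(r,c,k) = 1 / (1 + (gmag(r,c) + gmag(rr,cc))/2);
            end
        end
    end
end
```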
Algorithm: basic idea Two steps: 1)Compute least costs for best paths from start pixel to all other pixels. 2)Given a specific end pixel, use the cost computation to get best path between start and end pixels.
Dijkstra's shortest path algorithm Algorithm: compute the least costs from the start node to all other nodes. Iterate outward from the start node, adding one node at a time. Cost estimates start high, and nodes gradually "learn" their true cost. Always choose the next node so that its correct cost can be calculated.
Dijkstra's shortest path algorithm Algorithm: compute the least costs from the start node to all other nodes. Iterate outward from the start node, adding one node at a time. Cost estimates start high and gradually improve. Always choose the next node so that its correct cost can be calculated. Example: choose the left node L with cost 1. The min cost to L equals 1, since any other path to L incurs cost > 1 on its first step from the start node.
Dijkstra's shortest path algorithm Computes the minimum costs from the seed to every pixel.
–Running time (N pixels): O(N log N) using an active priority queue (heap); a fraction of a second for a typical (640x480) image.
Then find the best path from the seed to any point in O(N). Real time! A sketch follows.
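A minimal Dijkstra sketch over the grid graph W from the earlier sketch; the linear-scan minimum makes this O(N^2), so a heap would be needed for the O(N log N) running time quoted above:

```matlab
% Dijkstra on a 4-connected pixel grid (illustrative). W(r,c,k): cost of
% the fragment from pixel (r,c) to its k-th neighbor; seed: [row, col].
function dist = dijkstra_grid(W, seed)
    [R, C, ~] = size(W);
    nbr = [-1 0; 1 0; 0 -1; 0 1];        % 4-connected neighbor offsets
    dist = inf(R, C);                    % cost estimates start high
    done = false(R, C);
    dist(seed(1), seed(2)) = 0;
    for iter = 1:R*C
        d = dist; d(done) = inf;         % mask finalized pixels
        [dmin, idx] = min(d(:));         % next node: smallest estimate,
        if isinf(dmin), break; end       % whose cost is now correct
        [r, c] = ind2sub([R C], idx);
        done(r, c) = true;
        for k = 1:4                      % relax edges to neighbors
            rr = r + nbr(k,1); cc = c + nbr(k,2);
            if rr >= 1 && rr <= R && cc >= 1 && cc <= C && ~done(rr, cc)
                dist(rr, cc) = min(dist(rr, cc), dmin + W(r, c, k));
            end
        end
    end
end
```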
Intelligent Scissors
Results
Voting again Recall the Hough Transform
–Idea: the data votes for the best model (e.g., the best line passing through the data points)
–Implementation: choose the type of model in advance (e.g., straight lines); each data point votes on the parameters of the model (e.g., location + orientation of the line)
Another type of voting Popularity contest –Each data point votes for others that it “likes,” that it feels “most related to” –“Most popular” points form a distinct group –(Like identifying social network from cluster of web links)
B "feels related" to A and votes for it.
C also "feels related" to A and votes for it.
So do D and E.
“Popular” points form a group, with “friends” reinforcing each other
The other points feel unconnected to the outlier point and don't vote for it (or vote against it).
Only B feels connected and votes for the outlier, but it's not enough: the outlier is unpopular and left out of the group.
Note: even though the outlier can connect smoothly to the curve, the other points don't feel related to it, since it's too far away.
Identifying groups by "popularity" In vision, the technical name for "popularity" is saliency. Procedure:
–Voting: pixels accumulate votes from their neighbors
–Grouping: find smooth curves made up of the most salient pixels (the pixels with the most votes): start at a salient point, then repeatedly grow the curve toward the most related salient point (see the sketch below)
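A minimal sketch of the voting step and the start of grouping, assuming `Aff` is a symmetric affinity matrix between edgels (here a random toy matrix); saliency is just each edgel's vote total:

```matlab
% Saliency voting (illustrative): each edgel's saliency is the sum of
% the affinity "votes" it receives from all other edgels.
Aff = rand(10); Aff = (Aff + Aff')/2;   % toy symmetric affinity matrix
n = size(Aff, 1);
saliency = Aff * ones(n, 1);            % vote totals, one per edgel
[~, order] = sort(saliency, 'descend');
seed = order(1);                        % most salient edgel: curve start
[~, next] = max(Aff(seed, :));          % grow toward most related edgel
```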
Affinity In vision, the amount of "relatedness" between two data points is called their affinity.
–Related curve fragments: high affinity (since they can be connected by a smooth curve)
–Unrelated curve fragments: low affinity (they can't be connected by a smooth curve)
Computing Affinity Co-circularity: a simple measure of affinity between line fragments.
–Two fragments are smoothly related if both lie on the same circle.
–Let $\phi$ be the angle between the x-axis and the chord joining the fragment centers, $\theta_1$ the angle of the first fragment with the x-axis, and $\theta_2$ the angle of the second fragment with the x-axis.
–Two fragments on the same circle have the property that $\theta_2 = 2\phi - \theta_1$: the second fragment's angle is determined by the angle of the first and the angle of the connecting chord.
Computing Affinity $2\phi - \theta_1$ is the orientation the second fragment should have if it connects smoothly to the first fragment. Measure affinity by the difference from the actual orientation $\theta_2$, treating both edge fragments the same way (sum the orientation discrepancy for both), and give more distant edgel locations less affinity. Example (using Gaussians so affinity falls off smoothly):
$$\text{Aff}(i,j) = \exp\!\left(-\frac{\Delta\theta_{ij}^2}{2\sigma_\theta^2}\right)\exp\!\left(-\frac{d_{ij}^2}{2\sigma_d^2}\right),$$
where $\Delta\theta_{ij}$ collects the orientation discrepancies of both fragments and $d_{ij}$ is the distance between the edgel locations.
Computing Affinity Why give smaller affinity to distant fragments? –Distant fragments likely to come from different objects –Many distant fragments, so some line up just by random chance
Computing Affinity For faster computation, compute fewer affinities: set affinities to 0 beyond some distance T. (Often T is chosen small: e.g., less than 7 pixels, or just enough to include the nearest neighbors.) A sketch of the whole affinity computation follows.
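A minimal sketch of the affinity computation under the assumptions above; `sigma_th`, `sigma_d`, and `T` are illustrative parameter values, and the toy edgels are random:

```matlab
% Co-circularity affinity matrix (illustrative sketch).
n = 10;                                    % toy edgels
x = rand(n,1)*20; y = rand(n,1)*20;        % edgel locations
theta = rand(n,1)*pi;                      % edgel orientations
sigma_th = 0.3; sigma_d = 3; T = 7;        % assumed parameters
Aff = zeros(n);
for i = 1:n
    for j = 1:n
        if i == j, continue; end
        d = hypot(x(j)-x(i), y(j)-y(i));
        if d > T, continue; end            % truncate distant affinities
        phi = atan2(y(j)-y(i), x(j)-x(i)); % chord angle with the x-axis
        delta = mod(theta(i) + theta(j) - 2*phi, pi);  % co-circularity residual
        delta = min(delta, pi - delta);    % orientations have period pi
        Aff(i,j) = exp(-delta^2/(2*sigma_th^2)) * exp(-d^2/(2*sigma_d^2));
    end
end
```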
Eigenvector Grouping
Reminder: Grouping with Affinity Matrix Task: partition image elements into groups $G_1, \dots, G_n$ of similar elements. Affinities should be:
–high within a group: $\text{Aff}(i,j)$ large for $i, j$ in the same group $G_k$
–low between groups: $\text{Aff}(i,j)$ small for $i \in G_k$, $j \in G_l$, $k \neq l$
Eigenvector grouping Eigenvector grouping is just improved popularity grouping! Idea:
–To judge your own popularity, weight the votes from your friends according to their popularity.
–Recompute using the better estimate of your friends' popularity (use weighted voting for them also).
–Even better: recompute again using this improved estimate for your friends.
–Iterating this, "weighted popularity" converges to an eigenvector!
Your popularity = sum of friends' votes: $P_i = \sum_j \text{Aff}(i,j)\, P_j$, i.e., $P \propto \text{Aff}\, P$.
Usually we add normalization so the popularities always add to 1. But we don't care about the normalization; the important thing is that $P$ is proportional to $\text{Aff}\, P$.
Now use the new estimate of your friends' popularity to weight their votes: $P^{(2)} \propto \text{Aff}\, P^{(1)}$. Again use the new estimate: $P^{(3)} \propto \text{Aff}\, P^{(2)}$. Keep going: $P^{(N)} \propto \text{Aff}^N P^{(0)}$, starting from $P^{(0)} = (1, \dots, 1)$.
Keep going, and the result is an eigenvector of Aff! Why? For very large $N$, $P^{(N+1)} \approx P^{(N)}$ (one more time makes little difference), so $\text{Aff}\, P^{(N)} \approx \lambda\, P^{(N)}$: an eigenvector of Aff, by definition.
Another view: multiplying by Aff many times, only the component $E$ with the largest $|\text{Aff}\, E|$ remains from $(1, \dots, 1)$. So this is the eigenvector of Aff with the biggest eigenvalue.
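A minimal sketch of this iteration (power iteration) on a toy affinity matrix; the normalization step only fixes the scale:

```matlab
% Power iteration (illustrative): repeated weighted voting converges
% to the leading eigenvector of the affinity matrix.
Aff = rand(10); Aff = (Aff + Aff')/2;   % toy symmetric affinity matrix
P = ones(size(Aff,1), 1);               % everyone starts equally popular
for iter = 1:100
    P = Aff * P;                        % friends' votes, weighted by popularity
    P = P / norm(P);                    % normalize (scale doesn't matter)
end
% P now approximates the eigenvector of Aff with the largest eigenvalue
```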
Biggest Eigenvector: Math The eigenvectors $E_1, \dots, E_n$ of Aff (or of any symmetric matrix) give a rotated coordinate system, so we can write any starting vector as $V = \sum_i a_i E_i$. Then
$$\text{Aff}^N V = \sum_i a_i \lambda_i^N E_i,$$
where $\lambda_i$ is the eigenvalue for $E_i$. The biggest eigenvalue dominates: for large $N$, $\text{Aff}^N V \approx a_1 \lambda_1^N E_1$.
Eigenvector Grouping: Another View Toy problem, a single foreground group. If the pixels in the foreground group come before the background pixels, the affinity matrix looks like
$$\text{Aff} = \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix},$$
where $B$ is a block of big values. Algorithm to find the group:
–Find the unit vector $V$ that makes $|\text{Aff}\, V|$ largest.
–A large entry in $V$ indicates a group member: the answer is approximately $V = (v, \dots, v, 0, \dots, 0)$, because every nonzero entry of $V$ should contribute as much as possible to $|\text{Aff}\, V|$.
–To compute $V$: find the 'largest' eigenvector of Aff (the one with the largest eigenvalue); this gives the largest value of $|\text{Aff}\, V|$.
Eigenvector Grouping Toy problem, single foreground group. (Toy) algorithm summary:
–Compute the largest eigenvector $E$ of Aff
–Assign element $i$ to the foreground if $E_i$ is large
–Assign it to the background if $E_i \approx 0$
Eigenvector Grouping Why does the largest eigenvector work? The leading eigenvector $E$ of $A$ has these properties:
–$A E = \lambda E$ (by definition), where the eigenvalue $\lambda$ (a number) should be the largest eigenvalue.
–Fact: $\max_{\|V\|=1} |A V|$ occurs at $V = E$ (max value $\lambda$).
Interpretation: $E$ is the "central direction" of all the rows of $A$ (it has the largest dot products with the rows of $A$).
Aside: Math Fact: $\max_{\|V\|=1} |A V|$ occurs at $V = E$ (max value $\lambda$); this is equivalent to maximizing the square $|A V|^2 = V^T A^T A\, V$. Derive from calculus: the max is at a stationary point of $V^T A^T A\, V - \mu\,(V^T V - 1)$, with $\mu$ a Lagrange multiplier for the unit-norm constraint. Setting the gradient to zero gives $A^T A\, V = \mu V$: at the max, $V$ is an eigenvector of $A^T A$, hence $V$ is an eigenvector of $A$. Why? A symmetric $A$ has the same eigenvectors as $A^T A = A^2$, since $A E = \lambda E$ implies $A^2 E = \lambda^2 E$.
Eigenvector Grouping A more realistic problem, single foreground group: Aff has a block of big values for the foreground plus small nonzero values elsewhere. The leading eigenvector then also has the form $E \approx (\text{big}, \dots, \text{big}, \text{small}, \dots, \text{small})$. ($E$ is a unit vector; to get the largest product with Aff, it is more efficient to put big entries where they multiply big entries of Aff.) We can therefore identify foreground items by their big entries in $E$.
Eigenvector Grouping Real algorithm summary (see the sketch below):
–Compute the largest eigenvector $E$ of Aff
–Assign element $i$ to the foreground if $|E_i| > T$
–Assign it to the background if $|E_i| \le T$ ($T$ a threshold chosen by you)
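A minimal sketch of this algorithm; the threshold value is an assumption:

```matlab
% Eigenvector grouping by thresholding the leading eigenvector
% (illustrative sketch).
Aff = rand(10); Aff = (Aff + Aff')/2;   % toy symmetric affinity matrix
[V, D] = eig(Aff);
[~, k] = max(diag(D));                  % index of the largest eigenvalue
E = V(:, k);                            % leading eigenvector
if sum(E) < 0, E = -E; end              % fix sign so big entries are positive
T = 0.2;                                % assumed threshold
foreground = find(abs(E) > T);
background = find(abs(E) <= T);
```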
Eigenvector Grouping An even more realistic problem: several groups. Example: two groups, so Aff is (approximately) block diagonal, with one block for Group 1 and one for Group 2. Usually the leading eigenvector picks out one of the groups (the one with the biggest affinities); in this case we again expect $E$ to have big entries on that group's elements and entries near 0 elsewhere.
Technical Aside Why does the eigenvector pick out just one group? Toy example, a 2 x 2 "affinity matrix"
$$\begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}, \quad a > b > 0,$$
where $a$ and $b$ are the "big" values and the affinities between the "groups" are 0. The leading eigenvector is $(1, 0)$, with eigenvalue $a$. Intuition: $E$ has to be a unit vector; to be most efficient in getting a large dot product, it puts all its large values into the group with the largest affinities.
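A quick numeric check of the toy example, with assumed values $a = 3$, $b = 1$:

```matlab
% The leading eigenvector of diag(3, 1) puts all its weight on the
% first "group", the one with the larger affinity.
[V, D] = eig([3 0; 0 1]);
[~, k] = max(diag(D));
V(:, k)        % returns (1, 0)'
```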
Eigenvector Grouping For several groups, the leading eigenvector (or singular vector) picks out the "leading" (most self-similar) group. (Remember: the eigenvector can be calculated by SVD on Aff.) To identify other groups, remove the points of the leading group and repeat the leading-eigenvector computation for the remaining points. Alternative: just use the non-leading eigenvectors/eigenvalues to pick out the non-leading groups.
(Alternative) Algorithm Summary (see the sketch below)
–Choose affinity measures and create the affinity matrix Aff
–Compute the eigenvalues of Aff; assign all elements to an active list L
–Repeat for each eigenvalue, starting from the largest: compute the corresponding eigenvector; choose a threshold and assign the elements with large entries to a new group; remove the new group's elements from the active list L; stop if L is empty or the new group is too small
–"Clean up" the groups (optional)
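A minimal sketch of this alternative algorithm; the threshold and the "too small" cutoff are assumed values:

```matlab
% Multi-group extraction from successive eigenvectors (illustrative).
Aff = rand(20); Aff = (Aff + Aff')/2;       % toy symmetric affinity matrix
[V, D] = eig(Aff);
[~, order] = sort(diag(D), 'descend');      % eigenvalues, largest first
active = true(size(Aff,1), 1);              % active list L
groups = {};
for k = order'
    if ~any(active), break; end             % stop if L is empty
    E = V(:, k);                            % eigenvector for this eigenvalue
    members = find(active & abs(E) > 0.2);  % assumed threshold
    if numel(members) < 2, break; end       % stop if the new group is too small
    groups{end+1} = members;                % record the new group
    active(members) = false;                % remove its elements from L
end
```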
Eigenvector Grouping Problems The method picks the "strongest" group. When there are several groups of similar strength, it can get confused. The eigenvector algorithm described so far is not what is currently used.
Eigenvectors + Graph Cut Methods
Graph Cut Methods Represent the image as an undirected graph G = (V, E), where each graph edge has a weight w(E). Example:
–Vertex nodes = edgels i
–Graph edges go between edgels i, j
–The graph edge has weight w(i,j) = Aff(i,j)
Task: partition V into $V_1, \dots, V_n$ such that similarity is high within groups and low between groups.
Issues What is a good partition? How can you compute such a partition efficiently?
Graph Cut $G = (V, E)$; sets $A$ and $B$ are a disjoint partition of $V$. A measure of dissimilarity between the two groups is the cut, the sum over the links that get cut:
$$\text{cut}(A, B) = \sum_{u \in A,\ v \in B} w(u, v)$$
The temptation Cut is a measure of disassociation. Minimizing Cut gives the partition with maximum disassociation. Efficient poly-time algorithms exist.
The problem with MinCut It usually outputs segments that are too small! You can get a small cut just by having fewer cut links (e.g., by cutting off a few isolated vertices).
The Normalized Cut (Shi + Malik)
Given a partition (A, B) of the vertex set V, Ncut(A, B) measures the difference between the two groups, normalized by how similar they are within themselves:
$$\text{Ncut}(A, B) = \frac{\text{cut}(A, B)}{\text{assoc}(A, V)} + \frac{\text{cut}(A, B)}{\text{assoc}(B, V)}, \qquad \text{assoc}(A, V) = \sum_{u \in A,\ t \in V} w(u, t)$$
If A is small, assoc(A, V) is small and Ncut is large (a bad partition). Problem: find the partition minimizing Ncut.
Matrix formulation Definitions: $D$ is an $n \times n$ diagonal matrix with entries $d_i = \sum_j w(i, j)$; $W$ is an $n \times n$ symmetric matrix with $W(i, j) = w(i, j)$.
Normalized cuts Transform the problem into one that can be approximated by eigenvector methods. After some algebra, the Ncut problem becomes
$$\min_y \frac{y^T (D - W)\, y}{y^T D\, y}$$
subject to the constraints $y_i \in \{1, -b\}$ for some constant $b$, and $y^T D \mathbf{1} = 0$. With the discreteness constraint, this problem is NP-complete!
Normalized cuts Drop the discreteness constraints to make the problem easier, solvable by eigenvectors:
$$\min_y \frac{y^T (D - W)\, y}{y^T D\, y} \quad \text{subject to } y^T D \mathbf{1} = 0$$
The solution $y$ is the second least eigenvector (the eigenvector for the second smallest eigenvalue; "second" because of the constraint) of the matrix
$$M = D^{-1/2} (D - W)\, D^{-1/2},$$
which is easy to form, since $D$ is diagonal. MATLAB: [U,S,V] = svd(M); y = V(:,end-1);
Normalized Cuts: algorithm (see the sketch below)
–Define affinities w(i, j)
–Compute the matrices $W$ and $D$, and $M = D^{-1/2}(D - W)D^{-1/2}$
–[U,S,V] = svd(M); y = V(:,end-1);
–Threshold y: assign element i to group A if $y_i$ is above the mean of y, otherwise to B (could also use a different threshold besides the mean of y)
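A minimal end-to-end sketch of these steps on toy affinities:

```matlab
% Normalized cuts via the second least singular/eigenvector
% (illustrative sketch; W here is a random toy affinity matrix).
n = 20; W = rand(n); W = (W + W')/2;        % toy symmetric affinities
d = sum(W, 2);
D = diag(d);                                % degree matrix
M = diag(1./sqrt(d)) * (D - W) * diag(1./sqrt(d));
[U, S, V] = svd(M);                         % singular values, descending
y = V(:, end-1);                            % second least eigenvector
groupA = find(y > mean(y));                 % threshold at the mean of y
groupB = find(y <= mean(y));
```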
Normalized Cuts A very powerful algorithm, widely used. Results shown later.