Published by Prudence Ball. Modified over 9 years ago.
1
Image Similarity and the Earth Mover’s Distance
  Empirical Evaluation of Dissimilarity Measures for Color and Texture. Y. Rubner, J. Puzicha, C. Tomasi and J.M. Buhmann
  The Earth Mover’s Distance as a Metric for Image Retrieval. Y. Rubner, C. Tomasi and L.J. Guibas
  The Earth Mover’s Distance is the Mallows Distance: Some Insights from Statistics. E. Levina and P.J. Bickel
Learning-Based Methods in Vision - Spring 2007
Frederik Heger (with graphics from last year’s slides)
1 February 2007
2
How Similar Are They?
  Images from Caltech 256
LBMV Spring 2007 - Frederik Heger (fwh@cs.cmu.edu)
3
Similarity is Important for …
  Image classification
    Is there a penguin in this picture?
    This is a picture of a penguin.
  Image retrieval
    Find pictures with a penguin in them.
  Image as search query
    Find more images like this one.
  Image segmentation
    Something that looked like this was called penguin before.
4
Image Representations: Histograms
  Example image: Space Shuttle Cargo Bay
  Normal histogram, cumulative histogram
  Generalize to arbitrary dimensions
  Represent distribution of features
    Color, texture, depth, …
  Images from Dave Kauchak
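A minimal sketch of the two histogram forms named on this slide, assuming equal-width bins over a known value range. The function names and the sample gray values are illustrative and do not come from the slides.

```python
from itertools import accumulate


def histogram(values, num_bins, lo, hi):
    """Count values in num_bins equal-width bins covering [lo, hi]."""
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for v in values:
        if not lo <= v <= hi:
            raise ValueError(f"value {v} outside [{lo}, {hi}]")
        # Clamp so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


def cumulative(counts):
    """Running sum of bin counts (the cumulative histogram)."""
    return list(accumulate(counts))


# Hypothetical gray values from a tiny image patch.
gray = [12, 40, 45, 130, 200, 210, 255]
normal_hist = histogram(gray, num_bins=4, lo=0, hi=256)
cumulative_hist = cumulative(normal_hist)
print(normal_hist, cumulative_hist)
```

The last entry of the cumulative histogram always equals the number of samples, which is what makes cumulative histograms convenient for the distribution-based measures later in the deck.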
5
Image Representations: Histograms
  Joint histogram
    Requires lots of data
    Loss of resolution to avoid empty bins
  Marginal histogram
    Requires independent features
    More data/bin than joint histogram
  Images from Dave Kauchak
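The data-per-bin trade-off between joint and marginal histograms can be illustrated with a small sketch. The five samples and the choice of 4 bins per feature are made-up values for illustration only.

```python
from collections import Counter

# Hypothetical samples: each is a pair of bin indices (feature A, feature B).
samples = [(0, 0), (0, 1), (1, 1), (3, 2), (3, 3)]
BINS = 4  # assumed number of bins per feature

# Joint histogram: one bin per (A, B) combination, BINS * BINS bins in total.
joint = Counter(samples)

# Marginal histograms: one histogram per feature, BINS bins each.
marginal_a = Counter(a for a, _ in samples)
marginal_b = Counter(b for _, b in samples)

empty_joint = BINS * BINS - len(joint)
empty_marginal = (BINS - len(marginal_a)) + (BINS - len(marginal_b))
print(empty_joint, empty_marginal)
```

With the same five samples, most joint bins stay empty while the marginals are almost fully populated. The price is that the marginals discard how the two features co-occur.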
6
Image Representations: Histograms
  Example image: Space Shuttle Cargo Bay
  Adaptive binning
    Better data/bin distribution, fewer empty bins
    Can adapt available resolution to relative feature importance
  Images from Dave Kauchak
7
Image Representations: Histograms
  Example images: EASE Truss Assembly, Space Shuttle Cargo Bay
  Clusters / Signatures
    “Super-adaptive” binning
    Does not require discretization along any fixed axis
  Images from Dave Kauchak
8
Distance Metrics
  [Figure: three subtraction examples, two of them drawn on x-y axes]
  First difference = Euclidean distance of 5 units
  Second difference = gray-value distance of 50 values
  Third difference = ?
9
Issue: How to Compare Histograms?
  Bin-by-bin comparison
    Sensitive to bin size
    Could use wider bins, but at a loss of resolution
  Cross-bin comparison
    How much cross-bin influence is necessary/sufficient?
10
Overview: Similarity Measures
  Heuristic Histogram Distance: Minkowski-form distance (L_p)
  Special cases:
    L_1: Manhattan distance
    L_2: Euclidean distance
    L_∞: maximum value distance
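The formula on this slide did not survive extraction. As a sketch, the standard Minkowski-form distance between two histograms h and k is (sum_i |h_i - k_i|^p)^(1/p), with the maximum absolute difference as the limiting case for p = ∞. The function name and the example histograms are illustrative.

```python
import math


def minkowski_distance(h, k, p):
    """Bin-by-bin Minkowski-form (L_p) distance between two histograms."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    diffs = [abs(a - b) for a, b in zip(h, k)]
    if p == math.inf:
        # Limiting case: the largest single-bin difference.
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


# Same shape, shifted by one bin.
h = [0.5, 0.5, 0.0, 0.0]
k = [0.0, 0.5, 0.5, 0.0]
l1 = minkowski_distance(h, k, 1)
l2 = minkowski_distance(h, k, 2)
linf = minkowski_distance(h, k, math.inf)
print(l1, l2, linf)
```

Because the comparison is strictly bin-by-bin, shifting the histogram by one bin or by two bins produces the same distance, which is the weakness raised on the earlier "How to Compare Histograms?" slide.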
11
Overview: Similarity Measures
  Heuristic Histogram Distance: Weighted-Mean-Variance (WMV)
  Info:
    Per-feature similarity measure
    Based on Gabor filter image representation
    Shown to outperform several parametric models for texture-based image retrieval
12
Overview: Similarity Measures
  Nonparametric Test Statistic: Kolmogorov-Smirnov distance (KS)
  Info:
    Defined for only one dimension
    Maximum discrepancy between cumulative distributions
    Invariant to arbitrary monotonic feature transformations
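"Maximum discrepancy between cumulative distributions" can be sketched directly for two 1-D histograms over the same bins. The function name and example values are assumptions for illustration.

```python
from itertools import accumulate


def ks_distance(h, k):
    """Kolmogorov-Smirnov distance: largest gap between cumulative histograms."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    return max(abs(a - b) for a, b in zip(accumulate(h), accumulate(k)))


h = [0.5, 0.5, 0.0, 0.0]
k = [0.0, 0.0, 0.5, 0.5]
ks = ks_distance(h, k)
print(ks)
```

Only the ordering of the bins matters here, not their spacing, which is why the measure is unaffected by monotonic transformations of the feature axis.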
13
Overview: Similarity Measures
  Nonparametric Test Statistic: Cramer/von Mises type statistic (CvM)
  Info:
    Squared Euclidean distance between distributions
    Defined for single dimension
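A sketch of a Cramer/von Mises type statistic for two 1-D histograms, taken here as the sum of squared differences between the cumulative histograms. The slide's own formula was lost in extraction, so this form and the function name are assumptions.

```python
from itertools import accumulate


def cvm_distance(h, k):
    """Sum of squared differences between the two cumulative histograms."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    return sum((a - b) ** 2 for a, b in zip(accumulate(h), accumulate(k)))


h = [0.5, 0.5, 0.0, 0.0]
k = [0.0, 0.0, 0.5, 0.5]
cvm = cvm_distance(h, k)
print(cvm)
```

Unlike the Kolmogorov-Smirnov distance, every bin where the cumulative histograms disagree contributes, not only the worst one.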
14
Overview: Similarity Measures
  Nonparametric Test Statistic: χ² statistic
  Info:
    Very commonly used
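The slide's formula is missing. A commonly used symmetric form of the χ² statistic for histogram comparison, and the one assumed in this sketch, is sum_i (h_i - m_i)^2 / m_i with m_i = (h_i + k_i) / 2. The function name is illustrative.

```python
def chi_square_distance(h, k):
    """Chi-square statistic of h against the bin-wise mean of h and k."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(h, k):
        m = (a + b) / 2.0
        if m > 0:  # bins empty in both histograms contribute nothing
            total += (a - m) ** 2 / m
    return total


h = [0.5, 0.5, 0.0, 0.0]
k = [0.0, 0.5, 0.5, 0.0]
chi2 = chi_square_distance(h, k)
print(chi2)
```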
15
Overview: Similarity Measures
  Information-theory Divergence: Kullback-Leibler divergence (KL)
  Info:
    Code one histogram using the other as true distribution
    How inefficient would it be?
    Also widely used
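The coding-inefficiency reading corresponds to the standard definition KL(h || k) = sum_i h_i log(h_i / k_i). A minimal sketch follows, with the usual conventions for empty bins. The function name and example values are illustrative.

```python
import math


def kl_divergence(h, k):
    """Kullback-Leibler divergence KL(h || k) for normalized histograms."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(h, k):
        if a == 0:
            continue  # 0 * log(0 / b) is taken as 0
        if b == 0:
            return math.inf  # h has mass where k has none
        total += a * math.log(a / b)
    return total


h = [0.5, 0.5]
k = [0.25, 0.75]
forward = kl_divergence(h, k)
backward = kl_divergence(k, h)
print(forward, backward)
```

The two directions give different values, and a single empty bin in the second histogram makes the result infinite. Both properties motivate the Jeffrey divergence on the next slide.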
16
Overview: Similarity Measures
  Information-theory Divergence: Jeffrey divergence (JD)
  Info:
    Similar to KL divergence
    But symmetric and numerically stable
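A sketch of the Jeffrey divergence in the form commonly used for histogram comparison, assumed here because the slide's formula is missing: each histogram is compared against the bin-wise mean m_i = (h_i + k_i) / 2. The function name is illustrative.

```python
import math


def jeffrey_divergence(h, k):
    """Symmetric, smoothed variant of KL: both histograms vs. their mean."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(h, k):
        m = (a + b) / 2.0
        if a > 0:  # a > 0 implies m > 0, so the log is defined
            total += a * math.log(a / m)
        if b > 0:
            total += b * math.log(b / m)
    return total


disjoint = jeffrey_divergence([1.0, 0.0], [0.0, 1.0])
print(disjoint)
```

Even for histograms with no overlapping mass the result stays finite, which is the numerical stability the slide refers to.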
17
Overview: Similarity Measures
  Ground Distance Measure: Quadratic Form (QF)
  Info:
    Heuristic approach
    Matrix A incorporates cross-bin information
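A sketch of how a similarity matrix A can bring in cross-bin information, assuming the common form sqrt((h - k)^T A (h - k)) with a_ij = 1 - d_ij / d_max. The 1-D ground distance d_ij = |i - j| and the function name are assumptions made for this example.

```python
import math


def quadratic_form_distance(h, k):
    """Quadratic-form distance with a_ij = 1 - |i - j| / d_max (1-D bins)."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    n = len(h)
    d_max = max(n - 1, 1)  # largest ground distance between two bins
    diff = [a - b for a, b in zip(h, k)]
    total = 0.0
    for i in range(n):
        for j in range(n):
            a_ij = 1.0 - abs(i - j) / d_max
            total += diff[i] * a_ij * diff[j]
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(total, 0.0))


h = [1.0, 0.0, 0.0]
near = quadratic_form_distance(h, [0.0, 1.0, 0.0])
far = quadratic_form_distance(h, [0.0, 0.0, 1.0])
print(near, far)
```

A shift by one bin now scores lower than a shift by two bins, which a purely bin-by-bin measure cannot distinguish.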
18
Overview: Similarity Measures
  Ground Distance Measure: Earth Mover’s Distance (EMD)
  Info:
    Based on solution of linear optimization problem (transportation problem)
    Minimal cost to transform one distribution to the other
    Total cost = sum of costs for individual features
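The general EMD needs a transportation-problem solver, but one special case can be computed directly. For two equal-mass 1-D histograms with ground distance |i - j| between bins, the minimal work equals the summed absolute difference of the cumulative histograms. The sketch below covers only that special case; the function name is illustrative.

```python
from itertools import accumulate


def emd_1d(h, k):
    """EMD for equal-mass 1-D histograms with unit spacing between bins."""
    if len(h) != len(k):
        raise ValueError("histograms must have the same number of bins")
    mass = sum(h)
    if mass <= 0 or abs(mass - sum(k)) > 1e-9:
        raise ValueError("this shortcut needs equal, positive total mass")
    # At each bin boundary, the gap between the cumulative histograms is
    # exactly the amount of earth that has to cross that boundary.
    work = sum(abs(a - b) for a, b in zip(accumulate(h), accumulate(k)))
    return work / mass  # normalize by the total amount moved


h = [1.0, 0.0, 0.0]
one_bin = emd_1d(h, [0.0, 1.0, 0.0])
two_bins = emd_1d(h, [0.0, 0.0, 1.0])
print(one_bin, two_bins)
```

The cost grows with how far the mass has to travel, so the result reflects the ground distance between bins rather than only whether bins match.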
19
Summary: Similarity Measures
20
Earth Mover’s Distance ≠
21
Earth Mover’s Distance ≠
22
Earth Mover’s Distance =
23
Earth Mover’s Distance = (amount moved) * (distance moved)
24
How EMD Works
  Work = sum over all movements of (distance moved) * (amount moved)
  [Figure labels: P (m clusters), Q (n clusters)]
25
How EMD Works
  Constraint: move earth only from P to Q
  [Figure labels: P (m clusters), Q (n clusters), P’, Q’]
26
How EMD Works
  Constraint: P cannot send more earth than there is
  [Figure labels: P (m clusters), Q (n clusters), P’, Q’]
27
How EMD Works
  Constraint: Q cannot receive more earth than it can hold
  [Figure labels: P (m clusters), Q (n clusters), P’, Q’]
28
How EMD Works
  Constraint: as much earth as possible must be moved
  [Figure labels: P (m clusters), Q (n clusters), P’, Q’]
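The work term and the four constraints from the "How EMD Works" slides can be collected into a single transportation problem. The notation below is an assumption, since the slide formulas were lost: P has m clusters with weights w_{p_i}, Q has n clusters with weights w_{q_j}, d_{ij} is the ground distance between cluster i of P and cluster j of Q, and f_{ij} is the amount of earth moved between them.

```latex
\begin{align*}
\text{minimize}\quad
  & \mathrm{WORK}(P, Q, F) = \sum_{i=1}^{m} \sum_{j=1}^{n} d_{ij}\, f_{ij} \\
\text{subject to}\quad
  & f_{ij} \ge 0
    && \text{(move earth only from $P$ to $Q$)} \\
  & \sum_{j=1}^{n} f_{ij} \le w_{p_i}
    && \text{($P$ cannot send more earth than there is)} \\
  & \sum_{i=1}^{m} f_{ij} \le w_{q_j}
    && \text{($Q$ cannot receive more earth than it can hold)} \\
  & \sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}
    = \min\Bigl( \sum_{i=1}^{m} w_{p_i},\ \sum_{j=1}^{n} w_{q_j} \Bigr)
    && \text{(as much earth as possible must be moved)} \\[4pt]
\mathrm{EMD}(P, Q)
  & = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} d_{ij}\, f_{ij}}
           {\sum_{i=1}^{m} \sum_{j=1}^{n} f_{ij}}
\end{align*}
```

Dividing the optimal work by the total flow keeps the measure from favoring signatures with less total weight when the two masses differ.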
29
Color-based Image Retrieval
  Measures shown: Jeffrey divergence, quadratic form distance, Earth Mover’s Distance, χ² statistics, L1 distance
30
Red Car Retrievals (Color-based)
31
Zebra Retrieval (Texture-based)
32
EMD with Position Encoding
  [Figure panels: without position, with position]
33
Issues with EMD
  High computational complexity
    Prohibitive for texture segmentation
  Feature ordering needs to be known
    Open eyes / closed eyes example
  Distance can be set by very few features
    E.g. with a partial match of uneven distribution weight, EMD = 0, no matter how many features follow
34
Help From Statisticians
  For even-mass distributions, EMD is equivalent to Mallows distance
    (for uneven-mass distributions, the two distances behave differently)
  Trick to compute Mallows distance
    1-D marginals give better classification results than joint distributions (experimental results)
    Get marginals from empirical distribution by sorting feature vectors
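The sorting trick can be sketched for one marginal. For two 1-D samples of equal size, the optimal pairing matches the i-th smallest value of one sample with the i-th smallest value of the other, so the Mallows distance reduces to a sort and a sum. The function name, the default p = 1 and the sample values are illustrative assumptions.

```python
def mallows_1d(xs, ys, p=1):
    """Mallows distance between two equal-size 1-D empirical distributions."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    n = len(xs)
    # Sorting both samples yields the optimal coupling in one dimension.
    paired = zip(sorted(xs), sorted(ys))
    return (sum(abs(a - b) ** p for a, b in paired) / n) ** (1.0 / p)


# Hypothetical feature values along a single dimension.
xs = [3.0, 1.0, 2.0]
ys = [2.0, 4.0, 3.0]
dist = mallows_1d(xs, ys)
print(dist)
```

Sorting costs O(n log n) per marginal, which is far cheaper than solving a transportation problem over the joint distribution.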
35
EMD Summary / Conclusions
  Ground distance metric for image similarity
  Uses signatures for best adaptive binning and to lessen impact of prohibitive complexity
  Can deal with partial matches
  Good performance for color/texture classification
  Statistical grounding
36
Last Slide
  Comments? Questions?