Optimal invariant metrics for shape retrieval
Michael Bronstein
Department of Computer Science, Technion – Israel Institute of Technology
Text search: tagged shapes (e.g. 3D Warehouse; query "person" matches tags "man, person, human")
Content-based search: shapes without metadata
Outline
Feature descriptor
Geometric words
Bag of words
Invariance classes: rigid, scale, inelastic, topology
Descriptors: local geodesic distance histogram, Gaussian curvature, heat kernel signature (HKS), scale-invariant HKS (SI-HKS)
Wang, B 2010
Heat kernels
Heat equation ∂u/∂t = −Δu (Δ the Laplace–Beltrami operator) governs heat propagation on a surface
Initial condition: heat distribution u(x, 0) at time t = 0
Solution u(x, t): heat distribution at time t
Heat kernel k_t(x, y) is the fundamental solution of the heat equation with a point heat source at x (heat value at point y after time t)
Heat kernel signature HKS(x, t) = k_t(x, x) can be interpreted as the probability of Brownian motion returning to the same point after time t (represents the "stability" of the point)
Multiscale local shape descriptor, with time t playing the role of scale
Sun, Ovsjanikov & Guibas SGP 2009
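On a discretized surface the HKS can be computed from the eigendecomposition of a Laplacian matrix, k_t(x, x) = Σ_i e^(−λ_i t) φ_i(x)². A minimal NumPy sketch, using the combinatorial Laplacian of a cycle graph as a stand-in for a proper mesh Laplace–Beltrami discretization (e.g. cotangent weights); the time values are illustrative:

```python
import numpy as np

def heat_kernel_signature(L, times, k=None):
    """HKS from a dense Laplacian L: HKS(x, t) = sum_i exp(-lambda_i t) phi_i(x)^2."""
    lam, phi = np.linalg.eigh(L)          # eigenvalues / eigenvectors of L
    if k is not None:                      # optionally truncate the spectrum
        lam, phi = lam[:k], phi[:, :k]
    K = np.exp(-np.outer(np.asarray(times, float), lam))  # (num_times, num_eigs)
    return (phi ** 2) @ K.T                # (num_vertices, num_times)

# Toy example: combinatorial Laplacian of a cycle graph (all vertices equivalent)
n = 16
A = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)
L = np.diag(A.sum(axis=1)) - A
hks = heat_kernel_signature(L, times=[0.1, 1.0, 10.0])
```

Since every vertex of a cycle graph is equivalent, all rows of `hks` coincide; on a real mesh the rows differ and serve as multiscale point descriptors.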
Heat kernel signatures represented in RGB space
Sun, Ovsjanikov & Guibas SGP 2009; Ovsjanikov, BB & Guibas NORDIA 2009
Scale invariance
Original shape: HKS = k_t(x, x)
Shape scaled by β: HKS = β⁻² k_{t/β²}(x, x)
Not scale invariant!
B, Kokkinos CVPR 2010
Scale-invariant heat kernel signature
Scaling = shift and multiplicative constant in the HKS, once time is sampled logarithmically
Log scale-space: log HKS followed by d/dτ undoes the multiplicative constant
Fourier transform magnitude undoes the shift
[Plots: log HKS and its derivative vs. t; Fourier magnitude vs. frequency ω = 2πk/T]
B, Kokkinos CVPR 2010
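The three steps above (log-sample in time, log + derivative, Fourier magnitude) can be sketched as follows; `hks` is assumed to be already sampled on a logarithmic time grid, and `num_freq` is an illustrative choice of how many frequencies to keep:

```python
import numpy as np

def si_hks(hks, num_freq=8):
    """Sketch of the scale-invariant HKS construction:
    log turns the multiplicative constant into an additive one,
    differencing removes it, and |FFT| removes the (log-)time shift.
    hks: array (num_points, num_log_time_samples), strictly positive."""
    h = np.log(hks)                       # multiplicative constant -> additive
    dh = np.diff(h, axis=1)               # additive constant removed
    return np.abs(np.fft.fft(dh, axis=1))[:, :num_freq]  # shift removed
```

A shift of the signal in (log) time changes only the phase of its Fourier coefficients, so the descriptors of a shape and its scaled copy agree up to discretization effects.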
Scale invariance: Heat Kernel Signature vs. scale-invariant HKS
B, Kokkinos CVPR 2010
Modeling vs learning
Wang, B 2010
Learning invariance
Training set T, consisting of positive pairs P and negative pairs N
Similarity learning
Learn a similarity that is, with high probability, small on positive pairs and large on negative pairs
False positive: a negative pair judged similar; false negative: a positive pair judged dissimilar
Similarity-preserving hashing
Map each shape to a binary (±1) code; Hamming distance = # of distinct bits
Collision: positive pairs collide with high probability, negative pairs with low probability
Gionis, Indyk, Motwani 1999; Shakhnarovich 2005
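A minimal illustration of similarity-preserving hashing with random hyperplanes (plain LSH in the spirit of Gionis–Indyk–Motwani, not the learned codes discussed next; the 4-D descriptors and 96-bit code length are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def hash_fn(X, W):
    """±1 code: signs of random projections."""
    return np.sign(X @ W)

def hamming(a, b):
    """# of distinct bits between two ±1 codes."""
    return int(np.sum(a != b))

W = rng.standard_normal((4, 96))            # 96 random hyperplanes in 4-D
x = rng.standard_normal(4)
x_pos = x + 0.01 * rng.standard_normal(4)   # near-duplicate ("positive")
x_neg = rng.standard_normal(4)              # unrelated ("negative")
d_pos = hamming(hash_fn(x, W), hash_fn(x_pos, W))  # few distinct bits
d_neg = hamming(hash_fn(x, W), hash_fn(x_neg, W))  # many distinct bits
```

Each hyperplane separates a pair with probability proportional to the angle between the two descriptors, so close descriptors agree in most bits while unrelated ones disagree in roughly half.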
Boosting
At each iteration, construct a 1D embedding (one bit of the code); the similarity is approximated by the weighted sum of per-bit similarities
Downweight pairs classified correctly; upweight pairs classified incorrectly
BBK 2010; BB, Ovsjanikov & Guibas 2010; Shakhnarovich 2005
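A toy sketch of this boosting loop in the spirit of Shakhnarovich's similarity-sensitive coding (not the authors' exact formulation): each bit is a threshold on one descriptor dimension, chosen so the bit agrees on positive pairs and disagrees on negative pairs, after which the pair weights are updated AdaBoost-style. All parameter values are illustrative:

```python
import numpy as np

def train_ssc(X1, X2, y, num_bits=8, num_candidates=50, seed=0):
    """Per bit, pick a (dimension, threshold) stump so that sign(x[d] - t)
    agrees on positive pairs (y = +1) and disagrees on negative pairs
    (y = -1); then reweight the training pairs AdaBoost-style."""
    rng = np.random.default_rng(seed)
    w = np.ones(len(y)) / len(y)                 # weights over training pairs
    stumps = []
    for _ in range(num_bits):
        best = None
        for _ in range(num_candidates):          # random stump candidates
            d = int(rng.integers(X1.shape[1]))
            t = rng.uniform(X1[:, d].min(), X1[:, d].max())
            agree = np.sign(X1[:, d] - t) * np.sign(X2[:, d] - t)  # ±1 per pair
            err = float(np.sum(w * (agree != y)))  # weighted error
            if best is None or err < best[0]:
                best = (err, d, t, agree)
        err, d, t, agree = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        w = w * np.exp(-alpha * y * agree)       # downweight correct, upweight wrong
        w /= w.sum()
        stumps.append((d, t, alpha))
    return stumps

def encode(X, stumps):
    """Stack the per-stump bits into a ±1 code."""
    return np.stack([np.sign(X[:, d] - t) for d, t, _ in stumps], axis=1)
```

After training, positive pairs should disagree in far fewer bits than negative pairs, which is exactly the property the Hamming-space retrieval relies on.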
SHREC 2010 dataset
Total dataset size: 1K shapes (715 queries); positives: 10K; negatives: 100K
BB et al, 3DOR 2010
ShapeGoogle with HKS descriptor
BB et al, 3DOR 2010
ShapeGoogle with SI-HKS descriptor
BB et al, 3DOR 2010
Similarity-sensitive hashing (96 bits)
BB et al, 3DOR 2010
WaldHash
Construct the embedding by maximizing similarity on positive pairs and minimizing it on negative pairs, with early decision: remove pairs on which the partial code already yields a confident positive or negative decision, and sample new pairs into the training set
Downweight pairs classified correctly; upweight pairs classified incorrectly
B², Ovsjanikov, Guibas 2010
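The early-decision idea can be illustrated on code comparison: when scanning two codes bit by bit, the partial Hamming distance often settles the answer before all bits are read. This is only an illustration of sequential early decision, not the WaldHash training procedure; the threshold is arbitrary:

```python
def early_decision(code1, code2, thresh=3):
    """Classify a pair as positive iff its Hamming distance <= thresh,
    stopping as soon as the partial distance determines the outcome.
    Returns (decision, number_of_bits_examined)."""
    dist = 0
    for i, (a, b) in enumerate(zip(code1, code2)):
        dist += int(a != b)
        remaining = len(code1) - i - 1
        if dist > thresh:                  # distance can only grow
            return "negative", i + 1
        if dist + remaining <= thresh:     # distance cannot exceed thresh
            return "positive", i + 1
    return ("positive" if dist <= thresh else "negative"), len(code1)
```

For very similar or very dissimilar codes the decision is reached after a fraction of the bits, which is the source of the computational savings.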
30%
B², Ovsjanikov, Guibas 2010
Cross-modal similarity
Modality 1 (e.g. triangular meshes) vs. Modality 2 (e.g. point clouds): incommensurable spaces!
Objects belonging to different modalities usually have different dimensionality and structure and are generated by different processes. Comparing such data is like comparing apples to oranges.
BB, Michel, Paragios CVPR 2010
Cross-modality embedding
Key idea: embed the incommensurable data into a common metric space so that, with high probability, positive pairs are mapped to nearby points and negative pairs to far-away points
BB, Michel, Paragios CVPR 2010
Cross-modality hashing
Two modality-specific hash functions map into a common Hamming space
Collision: positive cross-modal pairs collide with high probability, negative pairs with low probability
BB, Michel, Paragios CVPR 2010
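A toy numerical illustration of mapping two incommensurable representations into a common Hamming space. Here the relation between the modalities is a known linear map `M`, so the second hash can be written in closed form; in the actual CVPR 2010 approach both maps are learned from positive and negative pairs. All names and dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
dim, bits = 6, 32

# Toy setup: modality 2 is a known linear re-representation of modality 1
# (M stands in for whatever relates the two descriptor spaces).
M = rng.standard_normal((dim, dim))

W1 = rng.standard_normal((dim, bits))   # hash for modality 1
W2 = np.linalg.inv(M).T @ W1            # hash for modality 2, chosen so that
                                        # codes agree on true cross-modal pairs

def h1(x): return np.sign(x @ W1)
def h2(y): return np.sign(y @ W2)

x = rng.standard_normal(dim)
y_pos = M @ x                           # same object, other modality
y_neg = M @ rng.standard_normal(dim)    # different object
d_pos = int(np.sum(h1(x) != h2(y_pos))) # codes agree (up to rounding)
d_neg = int(np.sum(h1(x) != h2(y_neg))) # roughly half the bits disagree
```

True cross-modal pairs collide in the common Hamming space, while unrelated objects disagree on many bits, which is exactly the collision behavior the slide describes.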
Cross-representation 3D shape retrieval
Database: 1052 shapes
In the first example application of our generic approach, we retrieve three-dimensional shapes with the query and the database represented using different descriptors: an 8×8-dimensional bag of expressions and a 32-dimensional bag of words.
BB, Michel, Paragios CVPR 2010
Retrieval performance (mean average precision vs. number of bits)
Our cross-modality metric outperforms Euclidean distances applied to each modality independently. It is only slightly inferior to the optimal uni-modal metrics.
BB, Michel, Paragios CVPR 2010