Yunhai Wang 1   Minglun Gong 1,2   Tianhua Wang 1,3   Hao (Richard) Zhang 4   Daniel Cohen-Or 5   Baoquan Chen 1,6
1 Shenzhen Institutes of Advanced Technology   2 Memorial University of Newfoundland   3 Jilin University   4 Simon Fraser University   5 Tel-Aviv University   6 Shandong University
2/40 One of the most fundamental tasks in shape analysis. Low-level cues (minima rule, convexity) alone are insufficient
3/40 Learning segmentation [Kalogerakis et al. 2010], Active co-analysis [Wang et al. 2012], Unsupervised co-analysis [Sidi et al. 2011], Joint segmentation [Huang et al. 2011]. Keys to success: amount & quality of labeled or unlabeled 3D data
4/40 Labeled meshes over 19 object categories. How many 3D models of strollers, golf carts, gazebos, …? Not enough 3D models = insufficient knowledge. Labeling 3D shapes is also a non-trivial task
5/40 About 14 million images across almost 22,000 object categories Labeling images is quite a bit easier than labeling 3D shapes
6/40 Real-world 3D models (e.g., those from the Trimble 3D Warehouse) are often imperfect: incomplete, self-intersecting, non-manifold
7/40 Treat a 3D shape as a set of projected binary images, alleviating various data artifacts in 3D, e.g., self-intersections. Label these images by learning from vast amounts of image data, then propagate the image labels to the 3D shape
8/40 Joint image-shape analysis via projective analysis for semantic 3D segmentation. Utilizes the vast amount of available image data and allows us to analyze imperfect 3D shapes
9/40 Bi-class Symmetric Hausdorff distance = BiSH Designed for matching 1D binary images More sensitive to topology changes (holes) Caters to our needs: part-aware label transfer
10/40 Many works on 2D-3D fusion, e.g., image-guided 3D modeling [Xu et al. 2011] and reconstruction [Li et al. 2011]
11/40 Light field descriptor for 3D shape retrieval [Chen et al. 2003]; image-space simplification error [Lindstrom and Turk 2000]. We deal with the higher-level and more delicate task of semantic 3D segmentation
12/40 PSA for 3D shape segmentation Region-based binary shape matching Results and conclusion
13/40 Labeling the 2D images involves GrabCut and some user assistance
14/40 Assume all objects are upright oriented; they mostly are! Project an input 3D shape from multiple pre-set viewpoints
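The paper renders the shape itself; as a rough illustration only, the sketch below produces binary projections from pre-set azimuths around the up axis for a shape given as densely sampled surface points. The point-splatting, view count, and resolution are assumptions, not the paper's renderer.

```python
# Minimal sketch (not the paper's renderer): orthographic binary projections of an
# upright-oriented shape from pre-set azimuth angles around the up (z) axis.
# Assumes the shape is available as a dense set of sampled surface points.
import numpy as np

def binary_projections(points, num_views=8, res=256):
    """Return a list of (res x res) binary images, one per pre-set viewpoint."""
    points = points - points.mean(axis=0)          # center the shape
    points /= np.abs(points).max()                 # isotropic scale to [-1, 1]
    images = []
    for k in range(num_views):
        theta = 2.0 * np.pi * k / num_views        # azimuth of this viewpoint
        c, s = np.cos(theta), np.sin(theta)
        # rotate about the up axis, then drop the depth coordinate (orthographic)
        x = c * points[:, 0] - s * points[:, 1]
        y = points[:, 2]                           # keep the up axis vertical
        img = np.zeros((res, res), dtype=np.uint8)
        u = np.clip(((x + 1) * 0.5 * (res - 1)).astype(int), 0, res - 1)
        v = np.clip(((1 - (y + 1) * 0.5) * (res - 1)).astype(int), 0, res - 1)
        img[v, u] = 1                              # splat sampled points into pixels
        images.append(img)
    return images

# Usage: images = binary_projections(np.random.rand(20000, 3))
```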
15/40 For each projection of the input 3D shape, retrieve top matches from the set of labeled images
16/40 Select top (non-adjacent) projections with the smallest average matching costs for label transfer
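A minimal sketch of this retrieval-and-selection step, assuming a precomputed matrix of projection-to-database matching costs; the defaults mirror the counts reported later (2 retrieved images per projection, 5-10 selected projections), and the non-adjacency constraint between views is omitted.

```python
import numpy as np

def select_projections(cost_matrix, k_retrieve=2, k_select=5):
    """cost_matrix[i, j] = matching cost between projection i and database image j.
    For each projection, keep its k_retrieve best database matches; then select the
    k_select projections with the smallest average matching cost (the slide's
    non-adjacency constraint between selected views is ignored in this sketch)."""
    retrieved = np.argsort(cost_matrix, axis=1)[:, :k_retrieve]   # best matches per view
    avg_cost = np.take_along_axis(cost_matrix, retrieved, axis=1).mean(axis=1)
    selected = np.argsort(avg_cost)[:k_select]
    return selected, retrieved[selected]
```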
17/40 Label transfer is done between corresponding horizontal slabs; pixel correspondence within matched slabs is straightforward (how slabs are built comes later)
18/40 Label transfer is weighted by a confidence value per pixel Three terms based on image-level, slab-level, and pixel-level similarity: more similar = higher confidence
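A sketch of how weighted label transfer could accumulate votes for one projection from one retrieved (and already warped) labeled image; combining the three similarity terms by a product is an assumption, since the slide only states that higher similarity means higher confidence.

```python
import numpy as np

def transfer_labels(proj_labels_acc, retrieved_labels, image_sim, slab_sim, pixel_sim):
    """Accumulate weighted label votes for one projection from one retrieved image.
    proj_labels_acc: (H, W, L) running per-pixel confidence per label.
    retrieved_labels: (H, W) integer label map of the retrieved image, already
    warped into the projection's slab layout.
    image_sim: scalar; slab_sim, pixel_sim: (H, W) arrays in [0, 1]."""
    conf = image_sim * slab_sim * pixel_sim          # assumed combination (product)
    H, W = retrieved_labels.shape
    rows, cols = np.indices((H, W))
    proj_labels_acc[rows, cols, retrieved_labels] += conf
    return proj_labels_acc
```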
19/40 Probabilistic map over input 3D shape: computed by integrating per-pixel confidence values over each shape primitive One primitive projects to multiple pixels in multiple images Per-pixel confidence gathered over multiple retrieved images
20/40 Final labeling of 3D shape: multi-label alpha expansion graph cuts based on the probabilistic map
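A sketch of building the per-primitive probabilistic map by integrating per-pixel confidences over the faces they project from; the final multi-label alpha-expansion graph cut is not reproduced here, and a plain per-face argmax stands in for it.

```python
import numpy as np

def face_label_probabilities(face_ids, pixel_label_conf, num_faces):
    """Integrate per-pixel label confidences into a probabilistic map per face.
    face_ids: (N,) index of the face visible at each labeled pixel, gathered over
    all selected projections and retrieved images.
    pixel_label_conf: (N, L) confidence of each of the L labels at those pixels."""
    num_labels = pixel_label_conf.shape[1]
    acc = np.zeros((num_faces, num_labels))
    np.add.at(acc, face_ids, pixel_label_conf)       # sum confidence per face
    acc += 1e-12                                     # faces never hit stay (uniformly) undecided
    return acc / acc.sum(axis=1, keepdims=True)      # normalize to probabilities

# The paper finalizes labels with multi-label alpha-expansion graph cuts over the
# mesh; as a simplified stand-in, take the per-face argmax of the probabilistic map:
def naive_final_labels(probs):
    return probs.argmax(axis=1)
```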
21/40 PSA for 3D shape segmentation Region-based binary shape matching Results and conclusion
22/40 Projections of the input 3D shape vs. a database of (labeled) images. Characteristics of the data to be matched: possibly complex topology (lots of holes), not just a contour; all upright oriented, which can be exploited. Goal: find shapes most suitable for label transfer, and FAST! This is not a global visual-similarity-based retrieval: we want part-aware label transfer but cannot reliably segment. Classical descriptors, e.g., shape context, interior distance shape context (IDSC), GIST, Zernike moments, Fourier descriptors, etc., do not quite fulfill our needs
23/40 Takes advantage of upright orientation
24/40 Classical choice for distance: symmetric Hausdorff (SH), but it is not sensitive to topology changes and not part-aware. Cluster scan-lines into a smaller number of slabs for efficiency: hierarchical clustering by a distance between adjacent slabs (see the sketch below)
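The sketch referenced above: a greedy agglomerative clustering that repeatedly merges the most similar pair of adjacent slabs. The stopping criterion (a maximum slab count) and the plug-in slab distance are assumptions.

```python
import numpy as np

def cluster_scanlines(image, slab_dist, max_slabs=32):
    """Greedy agglomerative clustering of adjacent scan-lines into slabs.
    image: (H, W) binary array; slab_dist(a, b): distance between two slabs
    (each a block of scan-lines), e.g., the BiSH distance from the next slides.
    Repeatedly merges the most similar pair of *adjacent* slabs until at most
    max_slabs remain. Returns a list of (start_row, end_row) ranges."""
    slabs = [(r, r + 1) for r in range(image.shape[0])]      # start: one row per slab
    while len(slabs) > max_slabs:
        dists = [slab_dist(image[s:e], image[s2:e2])
                 for (s, e), (s2, e2) in zip(slabs[:-1], slabs[1:])]
        i = int(np.argmin(dists))                            # closest adjacent pair
        slabs[i:i + 2] = [(slabs[i][0], slabs[i + 1][1])]    # merge the two slabs
    return slabs
```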
25/40 SH on only one class may not be topology-sensitive; a bi-class SH distance is! Example with binary sets A, B, C: SH(A,B)=2, SH(A^c,B^c)=10; SH(C,B)=2, SH(C^c,B^c)=2
26/40 Same example: SH(A,B)=2 but SH(A^c,B^c)=10, so BiSH(A,B)=10; SH(C,B)=2 and SH(C^c,B^c)=2, so BiSH(C,B)=2
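A sketch of SH and BiSH on binary images (it also works on 1D binary patterns); taking the maximum of the foreground and background (complement) SH values is consistent with the numbers above, though the paper's exact aggregation may differ.

```python
import numpy as np

def sym_hausdorff(A, B):
    """Symmetric Hausdorff distance between two point sets (rows are points)."""
    if len(A) == 0 or len(B) == 0:
        return np.inf                                            # degenerate case
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)   # pairwise distances
    return max(D.min(axis=1).max(), D.min(axis=0).max())

def bish(img_a, img_b):
    """Bi-class symmetric Hausdorff between two binary images: the worse of SH on
    the foreground pixels and SH on the background (complement) pixels, which
    matches the slide's example (BiSH(A,B)=10, BiSH(C,B)=2) but is an assumed
    formula."""
    fg_a, fg_b = np.argwhere(img_a != 0), np.argwhere(img_b != 0)
    bg_a, bg_b = np.argwhere(img_a == 0), np.argwhere(img_b == 0)
    return max(sym_hausdorff(fg_a, fg_b), sym_hausdorff(bg_a, bg_b))
```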
27/40 BiSH vs. SH: BiSH is more part-aware, introducing new slabs near part boundaries
28/40 Slabs are scaled/warped vertically for better alignment, another measure to encourage part-aware label transfer: slabs of the labeled image are warped to better align with slabs in the projected image, and slabs are recolored so that many-to-one slab matching is possible
29/40 Dissimilarity between slabs: BiSH scaled by slab height. Slab matching allows a linear warp, optimized by a dynamic time warping (DTW) algorithm. Dissimilarity between images: sum of slab dissimilarities after warped slab matching
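A sketch of the DTW step over a precomputed slab dissimilarity matrix; it returns only the total matching cost, whereas the actual warp path used for label transfer would be recovered by backtracking (not shown).

```python
import numpy as np

def dtw_slab_matching(cost):
    """Dynamic time warping over a slab-to-slab dissimilarity matrix.
    cost[i, j]: dissimilarity between slab i of the projection and slab j of the
    labeled image (e.g., BiSH scaled by slab height). Allowing a slab on either
    side to stay matched across steps gives the many-to-one slab matching from
    the previous slide."""
    n, m = cost.shape
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = cost[i - 1, j - 1] + min(
                D[i - 1, j - 1],   # one-to-one step
                D[i - 1, j],       # labeled-image slab j matches several projection slabs
                D[i, j - 1])       # projection slab i matches several labeled-image slabs
    return D[n, m]                 # total matching cost between the two images
```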
30/40 PSA for 3D shape segmentation Region-based binary shape matching Results and conclusion
31/40 Same inputs, training data (which we project to obtain labeled images), and experimental settings as [Kalogerakis et al. 2010]. Their models are manifold, complete, and free of self-intersections; PSA allows us to handle any category and imperfect shapes
32/40 11 object categories; about 2600 labeled images All input 3D shapes tested have self-intersections as well as other data artifacts
33/40 Result figures: Pavilion (465 pieces), Bicycle (704 pieces)
34/40
35/40 Matching two images (512 x 512) takes 0.06 seconds Label transfer (2D-to-2D then to 3D): about 1 minute for a 20K-triangle mesh Number of selected projections: 5 – 10 Number of retrieved images per projection: 2
36/40 Projective shape analysis (PSA): semantic 3D segmentation by learning from labeled 2D images. Demonstrated potential in labeling 3D models that are imperfect, have complex topology, and come from any category
37/40 No strong requirements on the quality of the 3D models. Utilize the rich availability and ease of processing of photos for 3D shape analysis
38/40 Inherent limitation of 2D projections: they do not fully capture 3D information. Inherent to data-driven approaches: the knowledge has to be in the data. Assumes upright orientation; not designed for articulated shapes. Relies on spatial rather than feature-space analysis
39/40 Labeling 2D images is still tedious: future work on unsupervised projective analysis. Additional cues from images and projections, e.g., color, depth, etc. Apply PSA to other knowledge-driven analyses
40/40 More results and data can be found on the project page