Learning 3D mesh segmentation and labeling

Presentation transcript:

Learning 3D mesh segmentation and labeling
Evangelos Kalogerakis, Aaron Hertzmann, Karan Singh. Presented by Ayrat Mutygullin, 23 May 2017.
Hello everybody, my name is Ayrat, and I will now present a method for simultaneous segmentation and labeling of 3D meshes. The paper was written in 2010 by Evangelos Kalogerakis, Aaron Hertzmann and Karan Singh from the University of Toronto.

Slide 2: Goal: mesh segmentation and labeling
Slide labels: Input Mesh, Labeled Mesh, Training Meshes; parts: Head, Neck, Torso, Leg, Tail, Ear.
Segmenting and labeling 3D shapes into meaningful parts is important for understanding and manipulating them. The goal of this work is to take a mesh as input, such as the horse shown here, and to output a segmentation of that mesh together with a label for each segment. The labeling model is learned automatically from a collection of training meshes, in this case labeled meshes of other animals.

Slide 3: Related work: mesh segmentation
[Mangan and Whitaker 1999, Shlafman et al. 2002, Katz and Tal 2003, Liu and Zhang 2004, Katz et al. 2005, Simari et al. 2006, Attene et al. 2006, Lin et al. 2007, Kraevoy et al. 2007, Pekelny and Gotsman 2008, Golovinskiy and Funkhouser 2008, Li et al. 2008, Lai et al. 2008, Lavoue and Wolf 2008, Huang et al. 2009, Shapira et al. 2010]
Surveys: [Attene et al. 2006, Shamir 2008, Chen et al. 2009]
There is a lot of existing work in the field of mesh segmentation. Many mathematical formulations have been developed to define segmentation boundaries on a mesh, and they show promising results. The problem is that previous work is mostly based on purely geometric criteria.

Slide 4: Related work: mesh segmentation
Slide labels: Shape Diameter [Shapira et al. 10]; Randomized Cuts [Golovinskiy and Funkhouser 08]; Random Walks [Lai et al. 08]; Normalized Cuts.
In addition, here are example segmentations of a horse. Even the most recent mesh segmentation techniques produce results that are far from what a human would expect.

Slide 5: Is it possible to perform human-level segmentation? [X. Chen et al. SIGGRAPH 09]
This raises a question: is it even possible to perform human-level segmentation without some prior knowledge about the surface being segmented? We think there are many aspects of human segmentation that cannot be captured using only low-level geometric cues. For example, here you can see the average human segmentation boundaries for a horse mesh and a human mesh, taken from a very useful earlier SIGGRAPH paper. We believe it would be very hard to detect parts of a human, such as the lower and upper arms, without some higher-level knowledge.

Slide 6: Is it possible to perform human-level segmentation? [X. Chen et al. SIGGRAPH 09]
One could argue that the parameters of a segmentation model can be selected by hand for each object and each class. But as you can imagine, this is hard or even impossible to do for a large database.

Slide 7: Related work: computer vision for segmentation and labeling
Slide label: TextonBoost [Shotton et al. ECCV 06].
There is already work in computer vision on image segmentation that uses prior knowledge to recognize parts of an image. As in mesh segmentation, people tried for decades to find mathematical rules for image segmentation. Recently, however, the field of computer vision has turned to joint segmentation and recognition of images, learning from databases or otherwise using prior knowledge. This project is mostly inspired by the TextonBoost approach.

Slide 8: Related work: mesh segmentation & labeling
Slide labels: Consistent segmentation of 3D meshes [Golovinskiy and Funkhouser 09]; Multi-objective segmentation and labeling [Simari et al. 09].
Now let us look at related work on joint mesh segmentation and labeling in computer graphics. First, there is the approach of Golovinskiy and Funkhouser, "Consistent segmentation of 3D meshes", which assumes an accurate alignment of the input meshes. The method of Simari et al. lets a multi-objective function be specified for segmentation and labeling, but it requires manual definition and tuning of the objective functions for each type of part, and it is sensitive to local minima.

Slide 9: Learning mesh segmentation and labeling
Slide text: Learn from examples. Significantly better results than state-of-the-art. No manual parameter tuning. Can learn different styles of segmentation. Several applications of part labeling.
Against this background of previous work, I can now explain what is different here: the method learns mesh segmentation and labeling from examples. First of all, it needs no manual parameter tuning; the technique is completely automatic. It gives better results than the state of the art. It can also learn different styles of segmentation, and part labeling has several applications.

Slide 10: Labeling problem statement
Slide labels: Head, Neck, Torso, Leg, Tail, Ear; face labels c1, c2, c3, c4; C = { head, neck, torso, leg, tail, ear }.
Here is the problem statement. The goal is to assign a label to each mesh face, given a predefined set of labels. [click] The label c [click] of each mesh face depends on the surface geometry around that face. It also depends on the labels of the neighboring faces, so we need to optimize all the labels jointly. To do this, with capital C denoting the set of possible labels, we use a model [click] that optimizes the label assignment globally over the mesh.

Slide 11: Conditional Random Field for Labeling
Slide labels: Input Mesh, Labeled Mesh; Head, Neck, Torso, Leg, Tail, Ear; callouts shown in turn: Unary term, Face features, Face area, Pairwise term, Edge features, Edge length.
The CRF energy consists of two terms. [click] The unary energy term, [click] which you can see here, assesses the consistency of each mesh face with a label, given a feature vector [click] extracted for that face. The unary term of each face is scaled by the face area [click] to account for non-uniform tessellations. There is also a pairwise term [click] that assesses the consistency of adjacent faces with pairs of labels. This term essentially corresponds to how likely it is to have a segmentation boundary between adjacent faces, [click] given the features extracted for their shared edge. The pairwise term is scaled by the edge length, [click] again to account for non-uniform tessellations.
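
The two-term energy described on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `crf_energy`, the callables `unary` and `pairwise`, and all the toy numbers are hypothetical and are not taken from the talk.

```python
def crf_energy(labels, face_areas, unary, edges, edge_lengths, pairwise):
    """Total CRF energy of one labeling of a mesh.

    labels[i]             label assigned to face i
    face_areas[i]         area of face i
    unary(i, c)           cost of giving label c to face i (from face features)
    edges                 list of (i, j) index pairs of adjacent faces
    edge_lengths          length of the mesh edge shared by each pair in `edges`
    pairwise(i, j, c, d)  cost of labels c and d on adjacent faces i and j
    """
    energy = 0.0
    # Unary part: each face's cost is scaled by its area.
    for i, c in enumerate(labels):
        energy += face_areas[i] * unary(i, c)
    # Pairwise part: each adjacent pair's cost is scaled by the shared edge length.
    for (i, j), length in zip(edges, edge_lengths):
        energy += length * pairwise(i, j, labels[i], labels[j])
    return energy


# Toy mesh with two adjacent faces (all numbers are made up).
toy_unary = {(0, "head"): 0.1, (0, "neck"): 2.0,
             (1, "head"): 1.5, (1, "neck"): 0.2}
unary = lambda i, c: toy_unary[(i, c)]
pairwise = lambda i, j, c, d: 0.0 if c == d else 1.0

split = crf_energy(["head", "neck"], [2.0, 1.0], unary, [(0, 1)], [0.5], pairwise)
merged = crf_energy(["head", "head"], [2.0, 1.0], unary, [(0, 1)], [0.5], pairwise)
print(split, merged)
```

In this toy case the labeling with a boundary has the lower energy, because the boundary cost on the short shared edge is smaller than the unary cost of mislabeling the second face.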

Slide 18: Feature vector
Slide text: feature vector x: surface curvature, singular values from PCA, shape diameter, distances from medial surface, average geodesic distances, shape contexts, spin images, contextual label features.
Now let us look at the unary term in more detail. As I said, the unary term assesses the consistency of each mesh face with each label. So for the unary term we need a mapping from the features extracted for a face to the label probabilities of that face. For this we need features that are informative about the types of parts. We use several descriptors to build the feature vector x of the unary term, such as surface curvature, singular values from PCA, shape diameter, distances from the medial surface, and so on.

Slide 19: Learning a classifier
Slide text: We use the JointBoost classifier [Torralba et al. 2007]. Plot axes x1 and x2; labels Head, Neck, Torso, Leg, Tail, Ear; a new point marked "?".
Now the goal is to learn this mapping from features to labels. For this we use the JointBoost classifier. For example, look at the training set here. [click] These animals provide pairs of feature vectors, extracted per face, with the corresponding labels. The illustration uses a two-dimensional feature vector purely for visualization. Each training face corresponds to a two-dimensional point determined by its features, and each point is colored according to its label. As I said, this is only a visualization; in practice the feature vector has many more dimensions. [click] What the classifier does is find decision boundaries that split the input space. Then, given a new point from a novel face on a test mesh, the classifier decides the label of that point. Even better, the classifier provides a probability of each label for that face.

Slide 21: Unary term
The JointBoost classifier outputs a probability of assigning each label to each face of the test mesh. Here we visualize these probabilities on a horse, for labels that commonly occur in animals.
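
The talk does not say how the classifier's per-label scores become probabilities. A common choice, assumed here purely for illustration, is a softmax over the scores, with the unary energy taken as the negative log probability. The function names and the scores below are hypothetical.

```python
import math


def label_probabilities(scores):
    """Turn per-label classifier scores into probabilities with a softmax."""
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def unary_energy(probabilities, label, tiny=1e-12):
    """Negative log probability of a label; `tiny` guards against log(0)."""
    return -math.log(probabilities[label] + tiny)


# Made-up scores for one face of a horse mesh.
probs = label_probabilities({"head": 2.0, "neck": 0.5, "torso": -1.0})
print(max(probs, key=probs.get), unary_energy(probs, "head"))
```

With this convention a confident, correct label has an energy near zero, and unlikely labels are penalized more heavily.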

Slide 22: Unary term
Slide labels: Most-likely labels; Classifier entropy.
If we use only the most likely labels returned by the unary classifier, the classification is mostly correct but quite noisy near potential boundaries between parts. You can see this on the right, which visualizes the classifier entropy, a measure of the classifier's uncertainty: the classifier is quite uncertain in areas where there are potential boundaries.
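
As a small illustration of the entropy mentioned here, the following sketch computes the Shannon entropy of one face's label distribution. The two example distributions are made up; only the formula is standard.

```python
import math


def classifier_entropy(probabilities):
    """Shannon entropy (in nats) of a label distribution for one face."""
    return -sum(p * math.log(p) for p in probabilities.values() if p > 0.0)


# A face well inside the torso versus a face near the neck/torso boundary.
interior = {"torso": 0.97, "neck": 0.02, "leg": 0.01}
boundary = {"torso": 0.50, "neck": 0.45, "leg": 0.05}
print(classifier_entropy(interior), classifier_entropy(boundary))
```

The boundary face has the higher entropy, which matches the picture described on the slide: uncertainty concentrates where parts meet.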

Slide 23: Our approach
Slide labels: Input Mesh, Labeled Mesh, Pairwise Term; Head, Neck, Torso, Leg, Tail, Ear.
To solve this problem of noisy boundaries, the method uses a pairwise term.

Slide 24: Pairwise term
Slide label: Geometry-dependent term.
The pairwise term looks at neighboring faces that have different labels, that is, at segmentation boundaries. The probability of a segmentation boundary is expressed by a geometry-dependent function. This function is evaluated with a binary classifier that maps edge features to whether there is a segmentation boundary or not. Here we again use a feature vector, which contains curvatures at multiple scales, angles and so on: features that are potentially informative about whether there is a boundary. So the geometry-dependent term expresses the probability of a segmentation boundary, which is also visualized on this horse; the probability is high in areas where a boundary is plausible. [click] The geometry-dependent term is in turn scaled by a label compatibility term, because having a segmentation boundary may depend not only on these features but also on the types of the labels.

Slide 25: Pairwise term
Slide label: Label compatibility term (a table over Head, Neck, Ear, Torso, Leg, Tail).
This is expressed by a symmetric matrix whose rows and columns are the labels. Assigning the same label to both faces has zero cost by default. [click] Incompatible labels, such as head and leg here, [click] never meet on a surface, so such combinations have a very large cost. It is shown here as infinity; in practice it is a very large value. This is set in a preprocessing step that checks whether two labels are ever adjacent in the training meshes; if they never are, the pair gets a very large cost.
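
The preprocessing step described here can be sketched as follows. The function name, the input format (a list of label pairs seen on adjacent training faces) and the placeholder `default_cost` for compatible pairs are assumptions for illustration; the talk does not say how the finite costs are chosen.

```python
import math


def label_compatibility(labels, training_adjacent_pairs, default_cost=1.0):
    """Build a symmetric label-compatibility table.

    Same label: zero cost. Pairs that are adjacent somewhere in the
    training meshes: `default_cost` (a placeholder value). Pairs never
    seen adjacent: infinite cost (a very large finite value also works).
    """
    seen = {frozenset(pair) for pair in training_adjacent_pairs}
    cost = {}
    for a in labels:
        for b in labels:
            if a == b:
                cost[(a, b)] = 0.0
            elif frozenset((a, b)) in seen:
                cost[(a, b)] = default_cost
            else:
                cost[(a, b)] = math.inf
    return cost


labels = ["head", "neck", "torso", "leg"]
adjacent_in_training = [("head", "neck"), ("neck", "torso"), ("torso", "leg")]
table = label_compatibility(labels, adjacent_in_training)
print(table[("head", "neck")], table[("head", "leg")])
```

Using an unordered pair as the lookup key makes the table symmetric by construction.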

Slide 26: Full CRF result
Slide labels: Unary term classifier; Full CRF result; Head, Neck, Torso, Leg, Tail, Ear.
If we use both the unary term and the pairwise term, we get the full CRF result. It is much better and much cleaner than using the unary term classifier alone.
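
To see why adding the pairwise term cleans up the result, here is a toy sketch. The talk does not say which solver minimizes the CRF energy, so this example swaps in iterated conditional modes (ICM), a simple local optimizer chosen only because it is short. The function name, the chain of five faces and all costs are hypothetical.

```python
def icm(unary_cost, neighbors, labels, pair_cost, sweeps=10):
    """Greedy label smoothing by iterated conditional modes (ICM).

    unary_cost[i][c]  cost of label c on face i (assumed already area-weighted)
    neighbors[i]      list of (j, weight) for the faces adjacent to face i
    pair_cost(c, d)   cost of labels c and d on adjacent faces
    """
    # Start from the most likely label of each face (unary term only).
    current = [min(labels, key=lambda c, i=i: unary_cost[i][c])
               for i in range(len(unary_cost))]
    for _ in range(sweeps):
        changed = False
        for i in range(len(current)):
            def local_cost(c):
                smooth = sum(w * pair_cost(c, current[j]) for j, w in neighbors[i])
                return unary_cost[i][c] + smooth
            best = min(labels, key=local_cost)
            if best != current[i]:
                current[i] = best
                changed = True
        if not changed:
            break
    return current


labels = ["torso", "leg"]
# Five faces in a row; the middle one is weakly misclassified as "leg".
unary_cost = [{"torso": 0.1, "leg": 0.9}, {"torso": 0.1, "leg": 0.9},
              {"torso": 0.6, "leg": 0.4},
              {"torso": 0.1, "leg": 0.9}, {"torso": 0.1, "leg": 0.9}]
neighbors = [[(1, 1.0)], [(0, 1.0), (2, 1.0)], [(1, 1.0), (3, 1.0)],
             [(2, 1.0), (4, 1.0)], [(3, 1.0)]]
potts = lambda c, d: 0.0 if c == d else 1.0

unary_only = [min(labels, key=lambda c: u[c]) for u in unary_cost]
smoothed = icm(unary_cost, neighbors, labels, potts)
print(unary_only, smoothed)
```

The unary-only labeling keeps the isolated "leg" face, while the smoothed labeling removes it, which is the kind of noise reduction the slide shows.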

Slide 27: Dataset used in experiments
Slide text: We label 380 meshes from the Princeton Segmentation Benchmark [Chen et al. 2009]. Each of the 19 categories is treated separately. Example labels: Antenna, Head, Thorax, Leg, Abdomen.
To evaluate the method, we use the Princeton Segmentation Benchmark provided by Chen et al., for which we labeled 380 meshes. We train and test the method separately for each of the 19 object categories, with 20 meshes each.

Slide 28: Quantitative evaluation
Slide text: Labeling: 6% error by surface area; no previous automatic method. Segmentation: our result 9.5% Rand Index error; state of the art 16% [Golovinskiy and Funkhouser 08]; with 6 training meshes 12%; with 3 training meshes 15%.
We test on each mesh by learning a model from the other 19 meshes in its object category. Averaging the classification error over the test meshes and then over the object categories gives a quite low error of 6%. Even with fewer training meshes, the labeling performance stays high, close to that obtained with 19 training meshes. (There is no previous automatic method that we can compare with exactly.) [click] Now consider segmentation alone, compared against the benchmark segmentations using the Rand Index measure from the benchmark paper. Our error is quite low, 9.5%, while the state of the art is at 16%. With fewer training meshes the results are still better than that, even with only 3 training meshes.
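
The two error measures quoted on this slide can be sketched as below. The labeling error follows the "error by surface area" wording of the slide. The Rand Index error assumes the usual pairwise definition (one minus the fraction of face pairs on which two segmentations agree), which may differ in detail from the benchmark's implementation. Function names and toy inputs are hypothetical.

```python
def labeling_error(predicted, ground_truth, face_areas):
    """Fraction of the surface area that carries a wrong label."""
    wrong = sum(area for p, g, area in zip(predicted, ground_truth, face_areas)
                if p != g)
    return wrong / sum(face_areas)


def rand_index_error(seg_a, seg_b):
    """1 - Rand Index between two segmentations given as per-face segment ids."""
    n = len(seg_a)
    agree = 0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_a = seg_a[i] == seg_a[j]
            same_b = seg_b[i] == seg_b[j]
            agree += (same_a == same_b)  # both together or both apart
            pairs += 1
    return 1.0 - agree / pairs


print(labeling_error(["head", "neck", "neck"], ["head", "head", "neck"], [1.0, 1.0, 2.0]))
print(rand_index_error([0, 0, 1, 1], [0, 1, 1, 1]))
```

Note that the Rand Index ignores label names and only compares which faces are grouped together, which is why it can be used to compare against methods that segment without labeling.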

Slide 29: Labeling results
Here we see labeling results of the presented method for each object category; as you can see, it detects the parts.

Slide 30: Segmentation comparisons
Slide labels: Shape Diameter [Shapira et al. 10]; Randomized Cuts [Golovinskiy and Funkhouser 08]; Our approach.
Here we compare visually with some previous segmentation techniques on a chair, which is just a representative example. On the right you can see our approach, where each segment is colored by its label. On the left you can see the results of the Shape Diameter and Randomized Cuts approaches, which do not use learning and do not label parts. You can see that the result of our approach is more reasonable, and this is also the case for other meshes.

Slide 31: Segmentation comparisons
Slide labels: Shape Diameter [Shapira et al. 10]; Randomized Cuts [Golovinskiy and Funkhouser 08]; Our approach.
The next example shows the same comparison on a mesh of a woman.

Slide 32: Learning different segmentation styles
Slide labels: Training Meshes, Test Meshes; first style: Head, Neck, Torso, Leg, Tail, Ear; second style: Head, Front Torso, Middle Torso, Back Torso, Front Leg, Back Leg, Tail.
The method can also learn different segmentation styles. For example, here you can see one particular segmentation style for four-legged animals, in which the torso is a single segment. But if you look through the benchmark segmentations you can find [click] other styles. Here the torso is split into several parts. Trained on such meshes, the method adapts [click] and reproduces the style of its training meshes.

Slide 33: Generalization to different categories
Slide labels: Head, Wing, Body, Tail; Head, Neck, Torso, Leg.
Another interesting point: the method can generalize across categories. For example, it can learn a model from birds and apply it to airplanes. [click] And here is an example of a model learned from four-legged animals applied to humans.

Slide 34: Failure cases
Slide labels: Face, Hair, Torso, Handle, Neck, Leg, Nose, Cup.
As I said, the method gives better results, but it still has failure cases. For example, here you can see a result for the segmentation of a face. This probably happens because of the large variability within this object category: heads, faces and so on. [click] In general, when test meshes differ a lot from the training meshes, the technique may not produce good results; [click] it may not generalize well.

Slide 35: Limitations
Slide text: Adjacent segments with the same label are merged. Labels: Head, Torso, Upper arm, Lower arm, Hand, Upper leg, Lower leg, Foot.
The method has several limitations. For example, neighboring segments with the same label are merged. Take this human with three heads, to which we applied a model trained on human meshes. As you can see, the heads are not separated, because the faces belonging to all three heads carry the same label, head. So we cannot segment the three heads individually, and the same holds for other parts.
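
This limitation follows directly if segments are read off as connected groups of faces with the same label. The sketch below assumes exactly that; the function name and the toy strip of faces are hypothetical.

```python
from collections import deque


def segments_from_labels(face_labels, adjacency):
    """Assign a segment id to each face: one id per connected same-label region.

    face_labels[i]  label of face i
    adjacency[i]    indices of the faces adjacent to face i
    """
    segment_of = [None] * len(face_labels)
    next_id = 0
    for start in range(len(face_labels)):
        if segment_of[start] is not None:
            continue
        # Breadth-first flood fill over neighbors that share the same label.
        segment_of[start] = next_id
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in adjacency[i]:
                if segment_of[j] is None and face_labels[j] == face_labels[i]:
                    segment_of[j] = next_id
                    queue.append(j)
        next_id += 1
    return segment_of


# A strip of four faces: two touching "head" faces, a torso, and a separate head.
face_labels = ["head", "head", "torso", "head"]
adjacency = [[1], [0, 2], [1, 3], [2]]
print(segments_from_labels(face_labels, adjacency))
```

The two touching head faces end up in a single segment, while the head face separated by the torso gets its own segment. Heads that touch each other therefore cannot be told apart from labels alone.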

Slide 36: Limitations
Slide text: Results depend on having sufficient training data. Labels: Handle, Cup, Top, Spout; 19 training meshes versus 3 training meshes.
As with any data-driven method, the results also depend on having sufficient training data. With very few training meshes, the model may not capture the variability within an object category.

Slide 37: Limitations
Slide text: Many features are sensitive to topology. Labels: Head, Torso, Upper arm, Lower arm, Hand, Upper leg, Lower leg, Foot.
The features are also sensitive to topology. Here the topology is significantly different from that of the training meshes: as you can see, in these human meshes the hands are connected to the legs, and in such cases the method may not generalize well. These limitations point to possible future work on improving the results.

Slide 38: Thank you!
Thank you very much for your attention.

Question 1
- What about the density of the meshes? Did you have to remesh?
- No, they used the benchmark meshes as they are, and I think they are very nice and easy to use without remeshing.

Question 2
- If you worked with scanned meshes or something similar, you would have to worry about the sampling density of the meshes, right?
- Yes. For that reason we use multi-scale features: essentially we compute, for example, curvature at scales relative to the size of the object. During the feature selection that JointBoost performs, the appropriate scale is then selected automatically, so the mapping adapts to meshes of different density. But for point clouds and some complex architectural models, I guess we would have to explore more features.

Question 3 (slide 18)
- Some of these descriptors have already been used individually for segmentation, haven't they?
- In our case we put all of these features into one big feature vector. We do this because different features may be relevant for different parts and different segmentation styles.

Question 4
- What are ω and μ in the pairwise term?
- This classifier helps detect boundaries better than using only dihedral angles. The second term penalizes boundaries between faces with a high exterior dihedral angle ω. The μ term penalizes boundary length and is helpful for preventing jaggy boundaries and for removing small, isolated segments. A small constant ε is added to avoid computing log 0.
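
The answer names three ingredients: a boundary classifier, a dihedral-angle penalty with ω, and a constant μ. One way to combine them is sketched here under the assumption that the geometry-dependent term has the form G = -κ log P(boundary) - λ log(1 - min(ω/π, 1) + ε) + μ. That overall form, the weight names `kappa` and `lam`, all default values, and the clamp on the probability are illustrative assumptions, not something stated in the talk.

```python
import math


def geometry_term(p_boundary, omega, kappa=1.0, lam=1.0, mu=0.1, eps=1e-3):
    """Cost of placing a segmentation boundary on one mesh edge.

    p_boundary  boundary probability from the binary edge classifier
    omega       exterior dihedral angle at the edge, in radians
    mu          constant added per boundary edge (penalizes boundary length)
    eps         small constant that keeps the angle term away from log(0)
    """
    # Clamp is a guard added for this sketch so that log(0) cannot occur here.
    boundary_cost = -kappa * math.log(max(p_boundary, 1e-12))
    # High exterior dihedral angle -> argument near eps -> large penalty.
    angle_cost = -lam * math.log(1.0 - min(omega / math.pi, 1.0) + eps)
    return boundary_cost + angle_cost + mu


print(geometry_term(0.9, 0.2), geometry_term(0.1, 0.2), geometry_term(0.9, 3.0))
```

Under these assumptions a boundary is cheap where the classifier is confident and the exterior dihedral angle is small, and expensive otherwise. Because μ is paid on every boundary edge, long or jagged boundaries cost more than short, smooth ones.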

Question 5
- What does the JointBoost classifier do?
- JointBoost is a boosting algorithm that has many appealing properties: it performs automatic feature selection and can handle large numbers of input features for multiclass classification, it has a fast sequential learning algorithm, and it produces output probabilities suitable for combination with other terms in the CRF model. JointBoost is designed to share features among classes, which greatly reduces generalization error for multiclass recognition when classes overlap in feature space.