Learning 3D mesh segmentation and labeling

Learning 3D mesh segmentation and labeling
Presented by Ayrat Mutygullin, 23 May 2017
Evangelos Kalogerakis, Aaron Hertzmann, Karan Singh
Hello everybody, my name is Ayrat. I will now present a method for simultaneous segmentation and labeling of 3D meshes. The paper was written in 2010 by Evangelos Kalogerakis, Aaron Hertzmann and Karan Singh from the University of Toronto.

Goal: mesh segmentation and labeling
Input Mesh, Labeled Mesh, Training Meshes. Labels: Head, Neck, Torso, Leg, Tail, Ear.
Segmenting and labeling 3D shapes into meaningful parts is important for understanding and manipulating shapes. The goal of this work is to take a mesh as input, such as the horse you can see here, and output a segmentation of that mesh together with a label for each segment. Our labeling model is learned automatically from a collection of training meshes, in this case labeled meshes of other animals.

Related work: mesh segmentation
[Mangan and Whitaker 1999, Shlafman et al. 2002, Katz and Tal 2003, Liu and Zhang 2004, Katz et al. 2005, Simari et al. 2006, Attene et al. 2006, Lin et al. 2007, Kraevoy et al. 2007, Pekelny and Gotsman 2008, Golovinskiy and Funkhouser 2008, Li et al. 2008, Lai et al. 2008, Lavoue and Wolf 2008, Huang et al. 2009, Shapira et al. 2010]
Surveys: [Attene et al. 2006, Shamir 2008, Chen et al. 2009]
There is a lot of existing work in the field of mesh segmentation. Many mathematical formulations have been developed to define segmentation boundaries on a mesh, and they show promising results. The problem is that previous work is mostly based on low-level geometric criteria.

Related work: mesh segmentation
Shape Diameter [Shapira et al. 10], Randomized Cuts [Golovinskiy and Funkhouser 08], Random Walks [Lai et al. 08], Normalized Cuts
In addition, here are example results on a horse. Even the most recent mesh segmentation techniques still produce results far from what a human would expect.

Is it possible to perform human level segmentation?
This raises a question: is it even possible to perform human-level segmentation without some prior knowledge about the surface being segmented? We think there are many aspects of human segmentation that cannot be captured using only low-level geometric cues. For example, here you can see the average human segmentation boundaries for a horse and a human mesh, taken from a very useful benchmark presented at a previous SIGGRAPH [X. Chen et al. SIGGRAPH 09]. We believe it would be very hard to detect parts such as the lower and upper arms of a human without some higher-level knowledge.

Is it possible to perform human level segmentation?
Someone could say that you could tune the parameters of the segmentation model for each object and for each class. But you can imagine that this is hard, or even impossible, to do for a large database of shapes. [X. Chen et al. SIGGRAPH 09]

Related work: computer vision for segmentation and labeling
TextonBoost [Shotton et al. ECCV 06]
There is already work in computer vision on image segmentation that uses prior knowledge to recognize parts of an image. As in mesh segmentation, people spent decades trying to find mathematical rules for image segmentation. More recently, the field of computer vision has turned to joint segmentation and recognition of images, learning from a database or otherwise using prior knowledge. This project is mostly inspired by the TextonBoost approach.

Related work: mesh segmentation & labeling
Consistent segmentation of 3D meshes [Golovinskiy and Funkhouser 09]
Multi-objective segmentation and labeling [Simari et al. 09]
Now let's look at some related work on joint mesh segmentation and labeling in computer graphics. First, there is the approach of Golovinskiy and Funkhouser for consistent segmentation of 3D meshes. It assumes an accurate alignment of the input meshes. The method of Simari et al. formulates segmentation and labeling through a multi-objective function, but it requires manual definition and tuning of objective functions for each type of part, and it is sensitive to local minima.

Learning mesh segmentation and labeling
Learn from examples
Significantly better results than state-of-the-art
No manual parameter tuning
Can learn different styles of segmentation
Several applications of part labeling
Against this background I can better explain what is new: our method learns mesh segmentation and labeling from examples. First of all, it needs no manual parameter tuning; the technique is completely automatic. It gives better results than the state of the art. It can also learn different styles of segmentation, and part labeling has several applications.

Labeling problem statement
Labels: Head, Neck, Torso, Leg, Tail, Ear. Faces labeled c1, c2, c3, c4.
C = { head, neck, torso, leg, tail, ear }
Here is our problem statement. Our goal is to assign a label to each mesh face, given a predefined set of labels. The label c of each mesh face depends on the local surface geometry around it. It also depends on the labels of neighboring faces. Therefore, we need to optimize all labels jointly. To do this we use a CRF model, where capital C is the set of possible labels; the model optimizes the label assignment globally over the mesh.

Conditional Random Field for Labeling
Input Mesh, Labeled Mesh. Labels: Head, Neck, Torso, Leg, Tail, Ear.
Annotated parts of the energy: unary term, face features, face area, pairwise term, edge features, edge length.
The CRF energy consists of two terms. The unary energy term, which you can see here, assesses the consistency of each mesh face with a label, given a feature vector that is extracted for that face. The unary term of each face is scaled by the face area in order to account for non-uniform tessellation. We also have a pairwise term that assesses the consistency of adjacent faces with pairs of labels. This term essentially corresponds to how likely it is to have a segmentation boundary between adjacent faces, given the features extracted for the edge they share. The pairwise term is scaled by the edge length, again to account for non-uniform tessellation.
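A minimal sketch of the energy described on this slide, assuming per-face unary costs and per-edge boundary costs have already been computed. The function name, the tuple layout of `edges`, and the simplification that a boundary costs the same for every pair of different labels are assumptions made for illustration, not details from the talk.

```python
def crf_energy(labels, unary_cost, face_area, edges):
    """Area-weighted unary costs plus length-weighted pairwise costs.

    labels:     labels[i] is the label assigned to face i
    unary_cost: unary_cost[i][c] is the cost of giving face i the label c
    face_area:  face_area[i] is the area of face i
    edges:      tuples (i, j, edge_length, boundary_cost) for adjacent faces
    """
    unary = sum(face_area[i] * unary_cost[i][labels[i]]
                for i in range(len(labels)))
    pairwise = 0.0
    for i, j, edge_length, boundary_cost in edges:
        # Adjacent faces with the same label cost nothing; a label change
        # across the shared edge pays the boundary cost, scaled by its length.
        if labels[i] != labels[j]:
            pairwise += edge_length * boundary_cost
    return unary + pairwise


# Tiny made-up example: three faces in a strip, two candidate labels.
example_labels = ["head", "head", "neck"]
example_unary = [
    {"head": 0.1, "neck": 2.0},
    {"head": 0.2, "neck": 1.5},
    {"head": 1.8, "neck": 0.3},
]
example_area = [1.0, 2.0, 1.0]
example_edges = [(0, 1, 0.5, 4.0), (1, 2, 1.0, 0.5)]
example_energy = crf_energy(example_labels, example_unary,
                            example_area, example_edges)
```

Scaling by area and edge length means that splitting a face into smaller faces leaves the total energy roughly unchanged, which is the point of accounting for non-uniform tessellation.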

Unary term
Feature vector x: surface curvature, singular values from PCA, shape diameter, distances from medial surface, average geodesic distances, shape contexts, spin images, contextual label features.
Now let's focus a bit more on the details of the unary term. As I said, the unary term assesses the consistency of each mesh face with each label. So for the unary term we need to find a mapping from the features extracted for a face to label probabilities for that face. To do this, we need features that are informative about the types of parts. We use several descriptors to build the feature vector x of the unary term, such as surface curvature, singular values, shape diameter, distances from the medial surface, and so on.

Learning a classifier
We use the JointBoost classifier [Torralba et al. 2007].
Feature axes: x1, x2. Labels: Head, Neck, Torso, Leg, Tail, Ear.
Now our goal is to learn this mapping from features to labels. Consider the training set here. The training animals provide us with pairs of feature vectors, extracted per face, and the corresponding labels. We illustrate the process with a 2-dimensional feature vector, purely for visualization. Each training face corresponds to a 2-dimensional point based on its features, and each point is colored according to its label. As I said, this is only a visualization; in practice the feature vector has many more dimensions. What the classifier does is find decision boundaries that split the input space, so that given a new point from a novel face on a test mesh, it decides the label for that point. Even better, the classifier provides a probability for each label for that face.
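To make the idea of mapping a feature vector to per-label probabilities concrete, here is a small stand-in. It is not JointBoost: it is a nearest-centroid model with a softmax over negative squared distances, chosen only because it fits in a few lines. The function names and the toy 2-dimensional data are hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(features, labels):
    """Average the training feature vectors of each label."""
    sums = {}
    counts = defaultdict(int)
    for x, c in zip(features, labels):
        if c not in sums:
            sums[c] = [0.0] * len(x)
        sums[c] = [s + v for s, v in zip(sums[c], x)]
        counts[c] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def label_probabilities(x, centroids):
    """Softmax over negative squared distances to each label centroid."""
    scores = {c: -sum((v - m) ** 2 for v, m in zip(x, mu))
              for c, mu in centroids.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    weights = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# Toy 2-D training set standing in for per-face features (x1, x2).
train_x = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
train_c = ["head", "head", "leg", "leg"]
centroids = fit_centroids(train_x, train_c)
probs = label_probabilities([0.5, 1.0], centroids)
```

The property that matters for the rest of the pipeline is the output: one probability per label for each face, rather than a single hard decision.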

Unary term
The JointBoost classifier outputs a probability for assigning each label to each test face. Here we visualize these probabilities for a horse, using labels that commonly occur in animals.

Unary Term
Panels: most-likely labels, classifier entropy.
If we use only the most likely labels returned by the unary classifier, the classification is mostly correct but fairly noisy near potential boundaries between parts. You can see this on the right, where we visualize the classifier entropy, which is a measure of the classifier's uncertainty: the classifier is quite uncertain in areas where there are potential boundaries.
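The entropy shown on this slide can be computed per face from the label probabilities. A minimal sketch, assuming the standard Shannon entropy with natural logarithms; the function name is illustrative.

```python
import math


def label_entropy(probs):
    """Shannon entropy of one face's label distribution, in nats.

    Zero-probability labels contribute nothing to the sum.
    """
    return -sum(p * math.log(p) for p in probs if p > 0.0)


confident = label_entropy([1.0, 0.0, 0.0])   # certain face: entropy 0
uncertain = label_entropy([0.5, 0.5, 0.0])   # torn between two labels
```

A face the classifier is sure about has entropy 0, while a face split evenly between two labels has entropy log 2, which is why the boundary regions light up in the visualization.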

Our approach
Input Mesh, Labeled Mesh, Pairwise Term. Labels: Head, Neck, Torso, Leg, Tail, Ear.
To solve this problem of noisy boundaries, the method uses the pairwise term.

Pairwise Term: geometry-dependent term
What the pairwise term essentially does is analyze neighboring faces that have different labels, that is, faces with a segmentation boundary between them. The probability of having a segmentation boundary is expressed by a geometry-dependent function. This function is evaluated by a binary classifier that maps edge features to the decision of whether there is a segmentation boundary or not. Here we again use a feature vector, which contains curvatures at multiple scales, angles and so on: features that are potentially informative about the presence of a boundary. So the geometry-dependent term expresses the probability of having a segmentation boundary, which is also visualized on this horse. It is fairly high in areas where we could have a segmentation boundary. The geometry-dependent term is then scaled by a label compatibility term, because having a segmentation boundary may depend not only on the features but also on the types of labels.

Pairwise Term: label compatibility term
Matrix rows and columns: Head, Neck, Ear, Torso, Leg, Tail.
This is expressed by a symmetric matrix indexed by pairs of labels. Assigning the same label to both faces has zero cost by default. Incompatible labels, such as the head and the legs here, will never meet. Such combinations of labels have a very large cost; it is shown here as an infinite value, and in practice it is a very large value. This is set in a preprocessing step, which checks whether two labels are ever adjacent in the training meshes; if they never are, the pair gets a very large cost.
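The preprocessing step just described can be sketched as follows. The zero diagonal and the very large cost for label pairs never seen adjacent come from the talk; the `default_cost` of 1.0 for the remaining pairs, the value of `LARGE_COST`, and all names are placeholders assumed for illustration.

```python
from itertools import combinations

LARGE_COST = 1e9  # stand-in for the "infinite" entries on the slide


def build_label_compatibility(labels, adjacent_label_pairs, default_cost=1.0):
    """Build a symmetric cost table over pairs of labels.

    adjacent_label_pairs: label pairs observed on neighboring faces
    in the training meshes (order within a pair does not matter).
    """
    seen = {frozenset(pair) for pair in adjacent_label_pairs}
    cost = {(a, a): 0.0 for a in labels}  # same label: zero cost
    for a, b in combinations(labels, 2):
        c = default_cost if frozenset((a, b)) in seen else LARGE_COST
        cost[(a, b)] = c
        cost[(b, a)] = c  # keep the table symmetric
    return cost


compat = build_label_compatibility(
    ["head", "neck", "leg"],
    [("head", "neck"), ("neck", "head")],
)
```

With this table, a labeling that puts a head face next to a leg face is effectively ruled out, because that pair never touches in the training data.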

Full CRF result
Panels: unary term classifier, full CRF result. Labels: Head, Neck, Torso, Leg, Tail, Ear.
If we use both the unary term and the pairwise term, we get the full CRF result. It is much better and much cleaner than using the unary term classifier alone.

Dataset used in experiments
We label 380 meshes from the Princeton Segmentation Benchmark [Chen et al. 2009]. Each of the 19 categories is treated separately.
Example labels: Antenna, Head, Thorax, Leg, Abdomen.
To evaluate the method, we use the Princeton Segmentation Benchmark provided by Chen et al. We have labeled its 380 meshes, and we train and test our method separately for each of the 19 object categories, with 20 meshes each.

Quantitative Evaluation
Labeling: 6% error by surface area. No previous automatic method.
Segmentation: our result 9.5% Rand Index error; state of the art 16% [Golovinskiy and Funkhouser 08]; with 6 training meshes 12%; with 3 training meshes 15%.
We test on each mesh by learning a model from the other 19 meshes in its object category. Averaging the classification error over the test meshes and then over the object categories gives a fairly low error of 6%. Even if we use fewer training meshes, our labeling performance remains high. There is no previous automatic labeling method that we can compare with. If we consider segmentation alone and compare against the benchmark segmentations using the Rand Index measure from the benchmark paper, our error is fairly low, 9.5%, while the state of the art is at 16%. If we use fewer training meshes, our results are still better, even with 3 training meshes.
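The "error by surface area" on this slide can be read as the fraction of the mesh area that receives a wrong label. A minimal sketch under that reading; the function name and the toy data are assumptions.

```python
def labeling_error(predicted, ground_truth, face_area):
    """Fraction of total surface area whose predicted label is wrong."""
    if not (len(predicted) == len(ground_truth) == len(face_area)):
        raise ValueError("inputs must have one entry per face")
    total = sum(face_area)
    wrong = sum(area
                for p, g, area in zip(predicted, ground_truth, face_area)
                if p != g)
    return wrong / total


# Three faces; the last one (a quarter of the area) is mislabeled.
toy_error = labeling_error(
    ["head", "leg", "leg"],
    ["head", "leg", "tail"],
    [2.0, 1.0, 1.0],
)
```

Weighting by area keeps the score from being dominated by densely tessellated regions, in the same spirit as the area scaling in the unary term.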

Labeling results
Here we see the labeling results of the presented method for each object category. As you can see, it detects the parts in each of them.

Segmentation Comparisons
Shape Diameter [Shapira et al. 10], Randomized Cuts [Golovinskiy and Funkhouser 08], Our approach
Here we compare against some previous segmentation techniques on a chair, which is just a representative example. On the right you can see our approach, where we color each segment according to its label. On the left you can see the results of the Shape Diameter and Randomized Cuts approaches. They do not use learning and they do not label parts. You can see that our result is more reasonable, and this is also the case for other meshes.

Segmentation Comparisons
Shape Diameter [Shapira et al. 10], Randomized Cuts [Golovinskiy and Funkhouser 08], Our approach
The next example shows the same comparison for a mesh of a woman.

Learning different segmentation styles
Training Meshes, Test Meshes.
Style 1 labels: Head, Neck, Torso, Leg, Tail, Ear.
Style 2 labels: Head, Front Torso, Middle Torso, Back Torso, Front Leg, Back Leg, Tail.
Our method can also learn different segmentation styles. For example, here you can see one particular segmentation style for four-legged animals, in which the torso is a single segment. But if you look through the benchmark segmentations, you can find other segmentation styles. Here the torso is segmented into more parts. The method adaptively learns the style from the training meshes and applies it to the test meshes.

Generalization to different categories
Labels: Head, Wing, Body, Tail. Labels: Head, Neck, Torso, Leg.
Here is something more interesting: the method can generalize across different categories. For example, it can learn a model from birds and apply it to airplanes. And here is an example of a model for four-legged animals applied to humans.

Failure cases
Labels shown: Face, Hair, Torso, Handle, Neck, Leg, Nose, Cup.
As I said, our method gives better results, but it still has failure cases. For example, here you can see a result for face segmentation. This probably happens because of the large variability within this object category: heads, faces, and so on. In general, when the test meshes are very different from the training meshes, the technique may not produce good results; it may not generalize well.

Limitations: adjacent segments with the same label are merged
Labels: Head, Torso, Upper arm, Lower arm, Hand, Upper leg, Lower leg, Foot.
The method has several limitations. For example, neighboring segments with the same label are merged. Take a human with 3 heads, to which we applied a model trained on human meshes. As you can see, the heads are not separated by our approach, because the faces belonging to all of them receive the same label, the head label. So we cannot segment the 3 heads, and the same holds for other adjacent parts that share a label.

Limitations: results depend on having sufficient training data
Labels: Handle, Cup, Top, Spout. Panels: 19 training meshes, 3 training meshes.
As with any data-driven method, the results also depend on having sufficient training data. With very few meshes, the model may not capture the variability of the object category.

Limitations: many features are sensitive to topology
Labels: Head, Torso, Upper arm, Lower arm, Hand, Upper leg, Lower leg, Foot.
Our features are also sensitive to topology. The meshes here have a significantly different topology from our training meshes. As you can see, in these human meshes the hands are connected to the legs, and in this case the method may not generalize well. These limitations point to possible future work on improving the results.

Thank you!
Thank you very much for your attention.

Question 1
Density of meshes: did you have to remesh?
- No, they used just the benchmark meshes, and I think they are very nice and easy to use without remeshing.

Question 2
If you worked with scanned meshes or something similar, you would have to worry about the sampling density of those meshes, right?
- For that reason we use multi-scale features. Essentially we compute, for example, curvature at scales relative to the size of the object, so during the feature selection that JointBoost performs, the appropriate scale is also selected automatically for this mapping. The method therefore adapts to meshes of different density. But for point clouds and some complex architectural models, I guess we would have to explore more features.

Question 3
Some of these descriptors have already been used for segmentation individually?
- In our case we put all these features into one big feature vector. We do this because different features might be relevant for different parts and different segmentation styles.

Question 4
What are ω and μ in the pairwise term?
- The geometry-dependent term uses a learned boundary classifier, which helps detect boundaries better than using only dihedral angles. A second term penalizes boundaries between faces with a high exterior dihedral angle ω. The μ term penalizes boundary length and is helpful for preventing jaggy boundaries and for removing small, isolated segments. A small constant ε is added to avoid computing log 0.
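One way to write a pairwise term consistent with this answer is sketched below. The weights κ and λ, the function names L and G, and the edge feature vector y are notation assumed here for illustration; only ω, μ, ε and the roles of the three terms come from the answer above.

```latex
E_2(c, c'; y) = L(c, c') \, G(y),
\qquad
G(y) = -\kappa \log P(c \neq c' \mid y)
       \;-\; \lambda \log\!\left(1 - \min\!\left(\tfrac{\omega}{\pi},\, 1\right) + \epsilon\right)
       \;+\; \mu
```

Here L(c, c') is the label compatibility term, the first part of G comes from the boundary classifier, the second part grows as ω approaches π, and μ adds a constant cost per unit of boundary length once the term is scaled by the edge length.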

Question 5
What does the JointBoost classifier do?
- JointBoost is a boosting algorithm that has many appealing properties: it performs automatic feature selection and can handle large numbers of input features for multiclass classification, it has a fast sequential learning algorithm, and it produces output probabilities suitable for combination with other terms in the CRF model. JointBoost is designed to share features among classes, which greatly reduces generalization error for multiclass recognition when classes overlap in feature space.
