Mean Euclidean Distance Error (mm)


Hand Pose Learning: Combining Deep Learning and Hierarchical Refinement for 3D Hand Pose Estimation
Min-Yu Wu; Ya-Hui Tang; Pai-Wei Ting; Li-Chen Fu
Department of Computer Science and Information Engineering, National Taiwan University

Abstract
We propose a hybrid method that combines training a deep learning model with hierarchical refinement to estimate hand pose in 3D space from depth images. Our approach achieves a mean Euclidean distance error of 8.45 mm and outperforms previous work. The work has two main elements:
Designing a skeleton-difference (SD) layer that allows the training process of a convolutional neural network (CNN) to learn the shape as well as the physical constraints of a hand effectively.
Employing a refinement method that hierarchically regresses a hand pose with an energy function.

Methods
Our system is decomposed into two parts. The first part is composed of a conventional joint-position-loss layer and our proposed skeleton-difference layer. The second part is an additional loss layer that considers the physical constraints of a hand pose and preserves the relations between joints.
To represent the spatial relationship better, we transform the bones into vectors. The loss is then defined as
$\Psi_{SD} = Loss_a + Loss_b$
where $Loss_a$ is the angular loss (Fig. 3) and $Loss_b$ is the bone-length loss (Fig. 4).

Fig. 1 The 16 joints defined in the ICVL dataset [4]
Fig. 2 Each finger represents one part and has three joints
Fig. 3 Angular loss $Loss_a$
Fig. 4 Bone length loss $Loss_b$
Fig. 5 The flowchart of our system
Fig. 6 The proposed training approach

Results
Table 1 (updated): SDNet is ZF-Net [3] with the skeleton-difference loss function.

Discussion
The SD loss function performs best with a weight of 0.5. When the weight is increased to 1, performance starts to degrade. The reason is that the SD loss compensates for the ED loss; it is a supplement and should not overshadow it.

Conclusions
We proposed a novel architecture for hand pose estimation that combines the skeleton-difference network (SDNet) and a hierarchical refinement method.
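The skeleton-difference loss described under Methods can be illustrated with a minimal sketch. The poster states only that bones are turned into vectors and that the loss is the sum of an angular term and a bone-length term; the exact formulas are not given. The choices below are therefore assumptions for illustration: one minus cosine similarity for the angular term, squared length difference for the bone-length term, and a hypothetical four-joint chain named `BONES` standing in for one finger.

```python
import math

# Hypothetical skeleton: (parent, child) joint index pairs for one finger.
# The real 16-joint ICVL layout is not reproduced here.
BONES = [(0, 1), (1, 2), (2, 3)]


def bone_vectors(joints, bones=BONES):
    """Turn 3D joint positions into bone vectors (child minus parent)."""
    return [tuple(c - p for p, c in zip(joints[a], joints[b])) for a, b in bones]


def norm(v):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(x * x for x in v))


def angular_loss(pred, truth, eps=1e-8):
    """Assumed form of Loss_a: mean of (1 - cosine similarity) over bones."""
    pairs = list(zip(bone_vectors(pred), bone_vectors(truth)))
    total = 0.0
    for u, v in pairs:
        cos = sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v) + eps)
        total += 1.0 - cos
    return total / len(pairs)


def bone_length_loss(pred, truth):
    """Assumed form of Loss_b: mean squared difference of bone lengths."""
    pairs = list(zip(bone_vectors(pred), bone_vectors(truth)))
    return sum((norm(u) - norm(v)) ** 2 for u, v in pairs) / len(pairs)


def sd_loss(pred, truth):
    """Psi_SD = Loss_a + Loss_b, as defined on the poster."""
    return angular_loss(pred, truth) + bone_length_loss(pred, truth)


# Ground-truth finger along the y axis (units: mm, values made up).
truth = [(0, 0, 0), (0, 30, 0), (0, 55, 0), (0, 75, 0)]
# Same bone lengths, but the last bone is bent by 90 degrees.
bent = [(0, 0, 0), (0, 30, 0), (0, 55, 0), (20, 55, 0)]
# Same directions, but the last bone is 5 mm too long.
stretched = [(0, 0, 0), (0, 30, 0), (0, 55, 0), (0, 80, 0)]

print(sd_loss(truth, truth))
print(sd_loss(bent, truth))
print(sd_loss(stretched, truth))
```

In this sketch the bent pose is penalised only by the angular term and the stretched pose only by the bone-length term, which is the separation the two-term definition suggests. Note that the two terms carry different units here, so a real implementation would probably need to scale them.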
๐›น ๐‘‡๐‘œ๐‘ก๐‘Ž๐‘™ = ๐œ” ๐ท ๐›น ๐ท + ๐œ” ๐‘†๐ท ๐›น ๐‘†๐ท Network Settings (weight) Mean Euclidean Distance Error (mm) Baseline (ZF-Net) SDNet ( ๐œ” ๐‘†๐ท =0.125) 8.8209 SDNet ( ๐œ” ๐‘†๐ท =0.25) 8.7291 SDNet ( ๐œ” ๐‘†๐ท =0.5) 8.4520 SDNet ( ๐œ” ๐‘†๐ท =1.0) 8.5104 Fig. 7 The fraction of success compared with [1,2] Contact References X. Sun, Y. Wei, S. Liang, X. Tang, and J. Sun, "Cascaded hand pose regression," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp D. Tang, H. Jin Chang, A. Tejani, and T.-K. Kim, "Latent regression forest: Structured estimation of 3d articulated hand posture," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in European conference on computer vision, 2014, pp D. Tang, T.-H. Yu, and T.-K. Kim, "Real-time articulated hand pose estimation using semi-supervised transductive regression forests," in Proceedings of the IEEE international conference on computer vision, 2013, pp Ya-Hui Tang National Taiwan University Website: Phone:

