Mean Euclidean Distance Error (mm)

Slides:



Advertisements
Similar presentations
Good afternoon everyone
Advertisements

Tiled Convolutional Neural Networks TICA Speedup Results on the CIFAR-10 dataset Motivation Pretraining with Topographic ICA References [1] Y. LeCun, L.
Su-A Kim 3 rd June 2014 Danhang Tang, Tsz-Ho Yu, Tae-kyun Kim Imperial College London, UK Real-time Articulated Hand Pose Estimation using Semi-supervised.
3D Human Body Pose Estimation using GP-LVM Moin Nabi Computer Vision Group Institute for Research in Fundamental Sciences (IPM)
AN ANALYSIS OF SINGLE- LAYER NETWORKS IN UNSUPERVISED FEATURE LEARNING [1] Yani Chen 10/14/
Spatial Pyramid Pooling in Deep Convolutional
A Fast and Robust Fingertips Tracking Algorithm for Vision-Based Multi-touch Interaction Qunqun Xie, Guoyuan Liang, Cheng Tang, and Xinyu Wu th.
Mutual Information-based Stereo Matching Combined with SIFT Descriptor in Log-chromaticity Color Space Yong Seok Heo, Kyoung Mu Lee, and Sang Uk Lee.
Intrusion Detection Using Hybrid Neural Networks Vishal Sevani ( )
Hurieh Khalajzadeh Mohammad Mansouri Mohammad Teshnehlab
Video Tracking Using Learned Hierarchical Features
Well Log Data Inversion Using Radial Basis Function Network Kou-Yuan Huang, Li-Sheng Weng Department of Computer Science National Chiao Tung University.
Deformable Part Model Presenter : Liu Changyu Advisor : Prof. Alex Hauptmann Interest : Multimedia Analysis April 11 st, 2013.
A NOVEL METHOD FOR COLOR FACE RECOGNITION USING KNN CLASSIFIER
Face Alignment at 3000fps via Regressing Local Binary Features CVPR14 Shaoqing Ren, Xudong Cao, Yichen Wei, Jian Sun Presented by Sung Sil Kim.
Skeleton Based Action Recognition with Convolutional Neural Network
Cascade Region Regression for Robust Object Detection
Convolutional Neural Network
Technische Universität München Yulia Gembarzhevskaya LARGE-SCALE MALWARE CLASSIFICATON USING RANDOM PROJECTIONS AND NEURAL NETWORKS Technische Universität.
Gaussian Conditional Random Field Network for Semantic Segmentation
Deep Learning Overview Sources: workshop-tutorial-final.pdf
Hierarchical Motion Evolution for Action Recognition Authors: Hongsong Wang, Wei Wang, Liang Wang Center for Research on Intelligent Perception and Computing,
Facial Smile Detection Based on Deep Learning Features Authors: Kaihao Zhang, Yongzhen Huang, Hong Wu and Liang Wang Center for Research on Intelligent.
3D ShapeNets: A Deep Representation for Volumetric Shapes Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, Jianxiong Xiao.
Feature selection using Deep Neural Networks March 18, 2016 CSI 991 Kevin Ham.
Face Recognition based on 2D-PCA and CNN
Tofik AliPartha Pratim Roy Department of Computer Science and Engineering Indian Institute of Technology Roorkee CVIP-WM 2017 Paper ID 172 Word Spotting.
Deeply-Recursive Convolutional Network for Image Super-Resolution
Big data classification using neural network
Intelligent Learning Systems Design for Self-Defense Education
Deep Learning for Dual-Energy X-Ray
A Discriminative Feature Learning Approach for Deep Face Recognition
IEEE BIBM 2016 Xu Min, Wanwen Zeng, Ning Chen, Ting Chen*, Rui Jiang*
Deeply learned face representations are sparse, selective, and robust
Summary of “Efficient Deep Learning for Stereo Matching”
Deep Learning Amin Sobhani.
From Vision to Grasping: Adapting Visual Networks
Jure Zbontar, Yann LeCun
A Pool of Deep Models for Event Recognition
Restricted Boltzmann Machines for Classification
References [1] - Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, Gradient-Based Learning Applied to Document Recognition, Proceedings of the IEEE, 86(11): ,
Combining CNN with RNN for scene labeling (segmentation)
Compositional Human Pose Regression
Computer Vision James Hays
Deep Face Recognition Omkar M. Parkhi Andrea Vedaldi Andrew Zisserman
Introduction to Neural Networks
Outline Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no.
Wei Liu, Chaofeng Chen and Kwan-Yee K. Wong
Convolutional Neural Networks for Visual Tracking
Single Image Rolling Shutter Distortion Correction
MEgo2Vec: Embedding Matched Ego Networks for User Alignment Across Social Networks Jing Zhang+, Bo Chen+, Xianming Wang+, Fengmei Jin+, Hong Chen+, Cuiping.
Overview Proposed Approach Experiments Compositional inference
Semantic segmentation
Age and Gender Classification using Convolutional Neural Networks
YOLO-LITE: A Real-Time Object Detection Web Implementation
Outline Background Motivation Proposed Model Experimental Results
Visualizing and Understanding Convolutional Networks
George Bebis and Wenjing Li Computer Vision Laboratory
SSM 여러 부위 SSM 발 3D 생성 중첩된 외곽선 추출 여러 부위 reigstration 골절 3D 생성.
Deep Learning Authors: Yann LeCun, Yoshua Bengio, Geoffrey Hinton
CIS 519 Recitation 11/15/18.
Learning Incoherent Sparse and Low-Rank Patterns from Multiple Tasks
LOGAN: Unpaired Shape Transform in Latent Overcomplete Space
MULTI-VIEW VISUAL SPEECH RECOGNITION BASED ON MULTI TASK LEARNING HouJeung Han, Sunghun Kang and Chang D. Yoo Dept. of Electrical Engineering, KAIST, Republic.
Image Processing and Multi-domain Translation
VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION
Multi-Modal Multi-Scale Deep Learning for Large-Scale Image Annotation
Introduction Face detection and alignment are essential to many applications such as face recognition, facial expression recognition, age identification,
Adrian E. Gonzalez , David Parra Department of Computer Science
Nguyen Ngoc Hoang, Guee-Sang Lee, Soo-Hyung Kim, Hyung-Jeong Yang
Presentation transcript:

Mean Euclidean Distance Error (mm) Hand Pose Learning: Combining Deep Learning and Hierarchical Refinement for 3D Hand Pose Estimation Min-Yu Wu; Ya-Hui Tang; Pai-Wei Ting; Li-Chen Fu Department of Computer Science and Information Engineering National Taiwan University Abstract We propose a hybrid method of training a deep learning model and hierarchical refinement for hand pose estimation in a 3D space using depth images. Methods Our system is decomposed into two parts. The first part is composed of conventional joint-position-loss layer and our proposed skeleton- difference layer. The second part is an additional loss layer that considers physical constraints of a hand pose, keeping the joint relations. Our approach achieves 8.45 mm, and our result outperforms the previous works. Designing a skeleton-difference (SD) layer that can allow a convolutional neural network (CNN) training process to effectively learn the shape as well as physical constraints of a hand. Employing a refinement method that is capable of hierarchically regressing a hand pose with an energy function. To represent the spatial relationship better, we transform the bones into vectors, and the loss is then defined as : 𝛹 𝑆𝐷L = 𝐿𝑜𝑠𝑠 𝑎 + 𝐿𝑜𝑠𝑠 𝑏 Fig. 2 Each finger represents one part and has three joints Fig. 1 The defined 16 Joints in ICVL dataset [4] Fig. 3 Angular loss 𝐿𝑜𝑠𝑠 𝑎 Fig. 4 Bone length 𝐿𝑜𝑠𝑠 𝑏 Discussion The best performance of SD loss function is 0.5. But when the weight is increased to 1, the performance starts to degrade. The reason is that SD loss is the compensation for ED loss. The former simply can’t steal the latter’s thunder. Fig. 5 The flowchart of our system Fig. 6 The proposed training approach Results Conclusions Table 1(updated) : SDNet: ZF-Net[3] with skeleton-difference loss function We proposed a novel architecture for hand pose estimation that combines the skeleton-difference network (SDNet) and a hierarchical refinement method. 𝛹 𝑇𝑜𝑡𝑎𝑙 = 𝜔 𝐷 𝛹 𝐷 + 𝜔 𝑆𝐷 𝛹 𝑆𝐷 Network Settings (weight) Mean Euclidean Distance Error (mm) Baseline (ZF-Net) 12.6990 SDNet ( 𝜔 𝑆𝐷 =0.125) 8.8209 SDNet ( 𝜔 𝑆𝐷 =0.25) 8.7291 SDNet ( 𝜔 𝑆𝐷 =0.5) 8.4520 SDNet ( 𝜔 𝑆𝐷 =1.0) 8.5104 Fig. 7 The fraction of success compared with [1,2] Contact References X. Sun, Y. Wei, S. Liang, X. Tang, and J. Sun, "Cascaded hand pose regression," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 824-832. D. Tang, H. Jin Chang, A. Tejani, and T.-K. Kim, "Latent regression forest: Structured estimation of 3d articulated hand posture," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014, pp. 3786-3793. M. D. Zeiler and R. Fergus, "Visualizing and understanding convolutional networks," in European conference on computer vision, 2014, pp. 818-833. D. Tang, T.-H. Yu, and T.-K. Kim, "Real-time articulated hand pose estimation using semi-supervised transductive regression forests," in Proceedings of the IEEE international conference on computer vision, 2013, pp. 3224-3231 Ya-Hui Tang National Taiwan University Email: d05922027@csie.ntu.edu.tw Website: http://robotlab.csie.ntu.edu.tw/vision.html Phone: +886937928005