Liyuan Li, Jerry Kah Eng Hoe, Xinguo Yu, Li Dong, and Xinqi Chu

Human Upper Body Pose Recognition Using Adaboost Template for Natural Human Robot Interaction
Institute for Infocomm Research (I2R), Singapore

Outline
Introduction
Related Works
The Method:
  The Problem
  Template Modeling
  Adaboost Template
  Recognition & Segmentation
Experiments and Evaluations
Conclusions

Introduction
Motivations:
  Upper body pose is one of the important cues of human social behavior in natural conversation, especially when multiple persons are involved in the conversation;
  A social robot has to be aware of various cues from the human body for intelligent and natural human-robot interaction;
  It is one of the important cues used in our social robots for attention estimation and engagement management (direction, distance, motion state, upper body pose, face pose, gaze, etc.).
Difficulties for approaches on 2D images:
  Pose ambiguity due to lost depth information and self-occlusion;
  Limited view of human objects when engaged in face-to-face interaction;
  Variations of human shapes, scales, clothes, poses, etc.;
  Complexity of visual features due to lighting conditions, cluttered backgrounds, and crowded scenes.

Related Works
Human body pose recognition in computer vision:
  2D silhouette-based approaches (e.g., Gavrila and Philomin, ICCV’99; Mittal, et al., IEEE AVSS’03; Dimitrijevic, et al., ICCV Workshop’05)
  2D pictorial models (e.g., Ju, et al., FG’96; Felzenszwalb and Huttenlocher, IJCV 2005; Andriluka, et al., CVPR’09; Ferrari, et al., CVPR’09)
  3D structure models (e.g., Taylor, CVIU 2000; Lee and Cohen, ECCV’04)
Template matching:
  Deformable template matching (e.g., Cootes, Edwards, Taylor, Active Appearance Models)
  Object tracking (Yilmaz, et al., ACM CS 2006 (Survey))
  Face detection (Yang, et al., IEEE T-PAMI 2002 (Survey))
  Image registration (Zitova and Flusser, IVC 2003 (Survey))
Adaboost learning in vision:
  Face detection (Viola & Jones, CVPR’01 (Cascade Classifiers))
  Multi-view face detection (Huang, et al., IEEE T-PAMI 2007 (Vector Boosting Algorithm))
  Multiclass object detection with shared features (Torralba, et al., CVPR’04 (Joint Boosting Algorithm))
  Online tracking (e.g., Avidan, “Ensemble Tracking,” IEEE T-PAMI 2007)

Method: The Problem
Problem formulation:
  Classify the upper body poses into seven categories: views of 0°, ±30°, ±60°, and ±90° to the camera.
Challenges:
  The depth measures from disparity images are not accurate;
  Intra-class variations due to differences in human sizes, shapes, poses, and clothes;
  Inter-class variations due to human positions relative to the camera;
  Incomplete disparity measurements on the body due to the lack of texture features.

Method: Template Modeling
Learning the basic templates:
  Learning the mean template for each category;
  Learning the variance template for each category;
  Learning the percentage template for each category.
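
The slides do not give the formulas for the three basic templates, so the following is only a minimal sketch of one plausible reading. It assumes each training sample is a flat list of disparity values with `None` marking a missing measurement, and that the percentage template is the fraction of samples with a valid measurement at each pixel. The function name `learn_templates` is hypothetical.

```python
from statistics import fmean, pvariance

def learn_templates(samples):
    """Per-pixel mean, variance, and percentage templates for one category.

    samples: list of equally sized flat disparity images (assumed layout);
             None marks a pixel without a disparity measurement.
    """
    mean_t, var_t, pct_t = [], [], []
    for x in range(len(samples[0])):
        # Keep only the samples that have a measurement at pixel x.
        valid = [s[x] for s in samples if s[x] is not None]
        # Assumed meaning of "percentage": share of samples with valid data.
        pct_t.append(len(valid) / len(samples))
        mean_t.append(fmean(valid) if valid else 0.0)
        var_t.append(pvariance(valid) if valid else 0.0)
    return mean_t, var_t, pct_t

# Example: three 2-pixel samples of one category.
mean_t, var_t, pct_t = learn_templates([[1.0, None], [3.0, None], [2.0, 4.0]])
```

Running it once per pose category would yield seven triples of templates, one for each view.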

Method: Adaboost Template
Definition of positive and negative regions (R+ and R−)
Design of weak classifiers

Method: Learning
Adaboost learning algorithm:
  Given N_c training samples for category c.
  Initialize the distribution D_1.
  For t = 1, …, T:
    For each pixel x in the template, compute the error with respect to the distribution D_t;
    Choose the weak classifier with the smallest error;
    Tune the template boundary;
    Update the distribution.
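
The formulas of the learning loop are not preserved in the transcript. For orientation only, here is a sketch of textbook discrete AdaBoost with one-pixel threshold stumps, not the authors' variant: it needs ±1 labels, whereas the slides state that the Adaboost template needs no negative samples, and it omits the boundary-tuning step. The names `stump`, `train_adaboost`, and `predict`, and the candidate `thresholds`, are assumptions made for illustration.

```python
import math

def stump(sample, pixel, threshold, polarity):
    """Weak classifier: vote +/-1 from a single pixel value."""
    return polarity if sample[pixel] >= threshold else -polarity

def train_adaboost(samples, labels, thresholds, rounds):
    """Textbook discrete AdaBoost over per-pixel stumps (illustrative only)."""
    n = len(samples)
    weights = [1.0 / n] * n                 # D_1: uniform distribution
    ensemble = []                           # (alpha, pixel, threshold, polarity)
    for _ in range(rounds):
        best = None
        for x in range(len(samples[0])):    # every pixel in the template
            for th in thresholds:
                for pol in (1, -1):
                    # Weighted error with respect to the distribution D_t.
                    err = sum(w for w, s, y in zip(weights, samples, labels)
                              if stump(s, x, th, pol) != y)
                    if best is None or err < best[0]:
                        best = (err, x, th, pol)
        err, x, th, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, x, th, pol))
        # Update the distribution: raise the weight of misclassified samples.
        weights = [w * math.exp(-alpha * y * stump(s, x, th, pol))
                   for w, s, y in zip(weights, samples, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, sample):
    """Sign of the alpha-weighted vote of all selected stumps."""
    score = sum(a * stump(sample, x, th, pol) for a, x, th, pol in ensemble)
    return 1 if score >= 0 else -1
```

The exhaustive search over pixels, thresholds, and polarities is kept for readability; it is far too slow for real template sizes.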

Method: Recognition & Segmentation
Adaptive model-driven segmentation:
  Estimate the quality level of the disparity measurements;
  Adaptively compensate for the missing disparity measurements.
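
How the quality level is defined and how the compensation works is not stated on the slide. The sketch below is one hypothetical interpretation: quality is the share of expected body pixels that carry a disparity value, and missing values are filled from the mean template of the recognized pose. The helper names, the `None` convention for missing pixels, and the `min_pct` cutoff are all assumptions.

```python
def quality_level(sample, pct_template, min_pct=0.5):
    """Fraction of expected body pixels that carry a disparity value."""
    # Pixels where the (assumed) percentage template says the body usually is.
    expected = [x for x, p in enumerate(pct_template) if p >= min_pct]
    if not expected:
        return 0.0
    return sum(sample[x] is not None for x in expected) / len(expected)

def compensate(sample, mean_template, pct_template, min_pct=0.5):
    """Fill missing disparities from the mean template inside the body region."""
    return [mean_template[x] if v is None and pct_template[x] >= min_pct else v
            for x, v in enumerate(sample)]

# Example: pixel 1 is missing inside the body region, pixel 2 outside it.
sample = [2.0, None, None, 1.0]
filled = compensate(sample, [2.1, 2.2, 0.0, 1.1], [1.0, 0.9, 0.1, 0.8])
```

An adaptive scheme could apply `compensate` only when `quality_level` falls below some threshold, but that policy is likewise an assumption.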

Experiments and Evaluations
A New Benchmarking Data Set Camera: Videre Design STOC stereo camera. Data Set: 430 images from 19 individuals. Training samples: Randomly select 93 images of 8 persons from the data set, among them, 28 for 0° view, 13 for +30° view, 10 for -30° view, 11 for +60° view, 10 for -60° view, 11 for +90° view, and 10 for -90° view. Baseline Algorithm: Template matching: 3D surface template matching (Breitenstein, et al, “Real-Time Face Pose Estimation from Single Range Images”, CVPR’08) Distance: Let T(x) be a normalized input sample Recognition
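
The baseline's distance and recognition formulas did not survive in the transcript. As a rough illustration of nearest-template recognition in general, and not of the cited Breitenstein et al. method, the sketch below compares a normalized input sample with one mean template per view and returns the closest view. The mean squared difference, the `None` convention for missing pixels, and both function names are assumptions.

```python
def template_distance(sample, template):
    """Mean squared difference over pixels where the sample has a value."""
    diffs = [(s - t) ** 2 for s, t in zip(sample, template) if s is not None]
    return sum(diffs) / len(diffs) if diffs else float("inf")

def recognize(sample, templates):
    """Return the view angle whose template is closest to the sample.

    templates: dict mapping a view angle in degrees to a flat template.
    """
    return min(templates, key=lambda c: template_distance(sample, templates[c]))

# Example with three toy 3-pixel templates; the middle pixel is missing.
templates = {0: [1.0, 1.0, 1.0], 30: [1.0, 2.0, 3.0], -30: [3.0, 2.0, 1.0]}
view = recognize([1.1, None, 2.9], templates)
```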

Results on recognition: on average, the accuracy rate increased from 67.4% to 90.7%.
Template Matching (views -90°, -60°, -30°, 0°, +30°, +60°, +90°): 100%, 60%, 37.1%, 2.9%, 2.6%, 31.6%, 65.8%, 1.75%, 14.3%, 78.6%, 3.6%, 61.0%, 9.7%, 29.3%, 27.8%, 72.2%
Adaboost Template (views -90°, -60°, -30°, 0°, +30°, +60°, +90°): 100%, 3.4%, 88%, 6.9%, 1.7%, 3.3%, 81.7%, 15%, 9.5%, 87.3%, 3.2%, 5.2%, 77.6%, 17.2%, 98.3%

Results on segmentation: pose recognition, quality estimation, top-down segmentation.

Application: Attention Estimation
Deployed in a robot receptionist for attention estimation

Conclusions
A new approach to human upper body pose recognition for human-robot interaction.
A new template model, the Adaboost template:
  Easy to train (no need for negative samples);
  Achieves a good balance between the generality and the specifics of the training samples;
  Supports both recognition and segmentation.
Deployed and tested on a robot receptionist for attention estimation and the management of engagement in dialogs which may involve multiple participants.

Thank You!
