CVPR Workshop on RTV4HCI 7/2/2004, Washington D.C. Gesture Recognition Using 3D Appearance and Motion Features Guangqi Ye, Jason J. Corso, Gregory D. Hager.

Presentation transcript:

Gesture Recognition Using 3D Appearance and Motion Features
Guangqi Ye, Jason J. Corso, Gregory D. Hager
Computational Interaction and Robotics Lab, The Johns Hopkins University, Baltimore, MD

Analogy Between Gesture and Speech

4DT Platform
- Previous work: J. Corso, D. Burschka, G. Hager, The 4DT: Unencumbered HCI With VICs. CVPRHCI,
- Geometrically and photometrically calibrated
- Known background

Video Preprocessing: Acquisition, Rectification, Background Subtraction, Color Calibration

System Framework: Image Preprocessing, Coarse Stereo Matching, Appearance/Motion Extraction, Feature Clustering, Gesture Recognition

Visual Feature Capturing: 3D Volume
- Consider a limited 3D space around the object
- Block-based coarse stereo matching

Motion Computation
- Motion by differencing of stereo volume
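
The differencing step can be illustrated with a minimal sketch. The slide does not say whether the difference is signed or absolute, nor which frames are compared, so this assumes a signed, cell-by-cell difference between the flattened volumes of two consecutive frames. The function name and the sample values are hypothetical.

```python
def motion_volume(prev_volume, curr_volume):
    """Cell-wise signed difference between two flattened appearance volumes.

    Assumption: both volumes come from consecutive frames and have the
    same m*n*p layout, so matching indices refer to the same cell.
    """
    if len(prev_volume) != len(curr_volume):
        raise ValueError("volumes must have the same number of cells")
    return [curr - prev for prev, curr in zip(prev_volume, curr_volume)]


# Made-up 4-cell volumes: content moves from cell 2 to cell 1.
prev_frame = [0.0, 0.25, 1.0, 0.5]
curr_frame = [0.0, 1.0, 0.25, 0.5]
motion = motion_volume(prev_frame, curr_frame)
print(motion)
```

Cells whose matching score did not change contribute zero, so a static hand would yield an all-zero motion feature under this assumption.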

Unsupervised Learning of Feature Set
- VQ: K-means approach
- Choice of cluster number based on distortion analysis
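
As a rough illustration of vector quantization with K-means, the sketch below clusters a few made-up 2-d points and reports the mean squared distortion for several cluster counts. The slide does not state the exact distortion criterion or the initialisation, so both are assumptions here: the initial centres are simply taken at evenly spaced positions in the data, and distortion is the average squared distance to the assigned centre.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest centre for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans(points, k, iters=20):
    """Plain Lloyd iterations; returns (centers, labels, mean distortion).

    Assumption: deterministic initialisation from evenly spaced points,
    chosen only to keep the sketch reproducible.
    """
    centers = [list(points[i * len(points) // k]) for i in range(k)]
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties out
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    labels = assign(points, centers)
    distortion = sum(sq_dist(p, centers[lab])
                     for p, lab in zip(points, labels)) / len(points)
    return centers, labels, distortion


# Hypothetical feature vectors forming two well-separated groups.
points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
          [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
distortions = {k: kmeans(points, k)[2] for k in range(1, 5)}
for k, d in distortions.items():
    print(k, round(d, 3))
```

On data like this the distortion drops sharply from one cluster to two and only slightly afterwards, which is the kind of curve a distortion analysis would inspect when picking the cluster count.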

Temporal Gesture Modeling
- 6-state discrete forward HMMs
- Multilayer Neural Network
  - Aligning all sequences to have equal length
  - 3 layers, 50 hidden nodes
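
To make the HMM option concrete, here is a minimal sketch of scoring a sequence of cluster indices with the forward algorithm on a 6-state left-to-right (forward) discrete HMM and picking the best-scoring model. It assumes "forward" means each state may only stay or advance by one. All probabilities, the 3-symbol alphabet, and the model names "rising" and "falling" are hypothetical placeholders, not trained values from the talk.

```python
def left_to_right_transitions(n_states, stay=0.5):
    """Transition matrix where each state either stays or moves one step on."""
    A = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states):
        if i + 1 < n_states:
            A[i][i] = stay
            A[i][i + 1] = 1.0 - stay
        else:
            A[i][i] = 1.0  # final state absorbs
    return A


def peaked_emissions(favoured, n_symbols, peak=0.8):
    """One emission row per state, with most mass on that state's symbol."""
    rest = (1.0 - peak) / (n_symbols - 1)
    return [[peak if s == f else rest for s in range(n_symbols)]
            for f in favoured]


def forward_likelihood(obs, pi, A, B):
    """P(obs | model) via the forward recursion (unscaled, short sequences)."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


N_STATES = 6
PI = [1.0] + [0.0] * (N_STATES - 1)  # always start in the first state
A = left_to_right_transitions(N_STATES)
MODELS = {  # hypothetical gesture models over symbols {0, 1, 2}
    "rising": peaked_emissions([0, 0, 1, 1, 2, 2], 3),
    "falling": peaked_emissions([2, 2, 1, 1, 0, 0], 3),
}


def classify(obs):
    """Name of the model with the highest forward likelihood."""
    return max(MODELS, key=lambda name: forward_likelihood(obs, PI, A, MODELS[name]))


print(classify([0, 0, 1, 1, 2, 2]))
```

For long sequences the unscaled recursion underflows, so a practical version would rescale alpha at each step or work in the log domain.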

Experiment: Gesture Vocabulary
- Push
- Toggle

Gesture Vocabulary
- Swipe Left
- Swipe Right

Gesture Vocabulary
- Twist: Clockwise, Anti-clockwise

Different Feature Data Sets
- Appearance volume
  - 5x5x5=125
  - 10x10x10=1000
- Motion volume
- Concatenation of appearance and motion, e.g., (125-d appearance, 1000-d motion)
- Combination of clustering result of appearance and motion
  - Form a 2-d vector of cluster identity, e.g., (3, 2)
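
The difference between the concatenation and combination feature sets can be sketched as follows. The codebooks, vector values, and helper names are hypothetical. Mapping the 2-d cluster identity to a single symbol index is an assumption added here so that the pair could feed a discrete model; the slides do not describe that step.

```python
def concatenate(appearance, motion):
    """Concatenation: one long vector, clustered jointly afterwards."""
    return list(appearance) + list(motion)


def nearest(vec, codebook):
    """Index of the closest codeword (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda j: sum((x - y) ** 2 for x, y in zip(vec, codebook[j])))


def combine(appearance, motion, app_codebook, mot_codebook):
    """Combination: cluster each cue separately, keep the pair of identities."""
    return (nearest(appearance, app_codebook), nearest(motion, mot_codebook))


def joint_symbol(pair, n_motion_clusters):
    """Assumed mapping of an (appearance, motion) pair to one symbol index."""
    return pair[0] * n_motion_clusters + pair[1]


# Made-up vectors with the dimensions quoted on the slide.
appearance = [0.0] * 125
motion = [1.0] * 1000
app_codebook = [[1.0] * 125, [0.0] * 125]                   # 2 hypothetical codewords
mot_codebook = [[0.0] * 1000, [0.5] * 1000, [1.0] * 1000]   # 3 hypothetical codewords

concatenated = concatenate(appearance, motion)
pair = combine(appearance, motion, app_codebook, mot_codebook)
symbol = joint_symbol(pair, len(mot_codebook))
print(len(concatenated), pair, symbol)
```

Under this mapping the number of distinct symbols is the product of the two codebook sizes, while concatenation needs a single codebook over the much longer joint vector.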

Gesture Recognition
- Training: >100 sequences for each gesture
- Test: >70 sequences for each gesture
- Combination achieves best results
Table: Feature Set | Clusters | HMM | NN, with rows Appearance, Motion, Concatenation, Combination (Clusters: 8*15=)

Real-time Implementation Demo

Conclusion
- Novel approach to extract 3D appearance and motion cues without tracking
- VQ clustering to learn gesteme
- Modeling dynamic gestures using HMM, NN
- Real-time implementation on 4DT
- Extensive experiments achieve high recognition accuracy

Thanks

3D Appearance Volume
- Comprehensive color normalization
- Coarse disparity map
  - Consider local images of m x n patches; perform pair-wise image matching between patches
- Disparity search range [0, (p-1) * w]
- Dimensionality of 3D volume: m*n*p
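
One way to read the construction above is sketched below: the local left image is split into m x n patches of width w, and each patch is compared with the right-image patch shifted by 0, w, ..., (p-1)*w pixels, giving m*n*p matching scores. Several details are assumptions, since the slide does not specify them: the patch height h, the use of mean absolute difference as the matching score, the shift direction, and the sentinel value for shifts that fall outside the image.

```python
def patch_cost(left, right, top, left_col, right_col, h, w):
    """Mean absolute difference between an h x w patch in each image."""
    total = 0.0
    for r in range(top, top + h):
        for c in range(w):
            total += abs(left[r][left_col + c] - right[r][right_col + c])
    return total / (h * w)


def appearance_volume(left, right, m, n, p, h, w, out_of_range=255.0):
    """Flattened m*n*p volume of patch matching scores.

    Entry (i, j, d) compares left patch (i, j) with the right patch
    shifted by d*w pixels, so d covers the range [0, (p-1)*w] in steps of w.
    """
    volume = []
    for i in range(m):          # patch rows
        for j in range(n):      # patch columns
            for d in range(p):  # coarse disparity index
                right_col = j * w - d * w
                if right_col < 0:
                    volume.append(out_of_range)  # shift leaves the image
                else:
                    volume.append(
                        patch_cost(left, right, i * h, j * w, right_col, h, w))
    return volume


# Made-up 4x8 grayscale pair: the left image is the right one shifted by 2 px.
right_img = [[(r * 8 + c) * 10 for c in range(8)] for r in range(4)]
left_img = [[right_img[r][c - 2] if c >= 2 else 0 for c in range(8)]
            for r in range(4)]

volume = appearance_volume(left_img, right_img, m=2, n=4, p=3, h=2, w=2)
print(len(volume))
```

In this toy pair the true shift equals one patch width, so every patch with a valid match scores zero at disparity index 1.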

Gesture Recognition
- HMM modeling on collapsed sequences
  - Raw:
  - Collapsed:
- Without considering duration
Table: Feature Set | Training | Test, with rows Appearance, Motion, Concatenation, Combination
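
Collapsing a symbol sequence so that duration is ignored can be sketched as merging each run of identical consecutive cluster indices into one symbol. This reading of "collapsed" is an assumption, and the sequences below are made up for illustration; they are not the examples from the slide.

```python
from itertools import groupby


def collapse(symbols):
    """Replace each run of identical consecutive symbols with one symbol."""
    return [key for key, _run in groupby(symbols)]


raw = [1, 1, 1, 2, 2, 3, 3, 3, 3]
collapsed = collapse(raw)
print(collapsed)
```

Only consecutive repeats are merged, so a symbol that reappears later in the gesture is kept: `collapse([1, 1, 2, 1])` gives `[1, 2, 1]`.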

4DT Platform
- Gestures in visual HCI: popular choice
- Manipulative gesture modeling without tracking
  - Difficulty of reliable tracking of hand
  - Complexity of hand modeling
- 3D data acquisition
  - Limitation of 2D cues for modeling hand
  - Stereo matching
  - Special sensors

Properties of 4DT
- Known background & object properties
