Abstract Developing sign language applications for deaf people is extremely important, since it is difficult to communicate with people that are unfamiliar.

Slides:



Advertisements
Similar presentations
Evidential modeling for pose estimation Fabio Cuzzolin, Ruggero Frezza Computer Science Department UCLA.
Advertisements

Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki
Vogler and Metaxas University of Toronto Computer Science CSC 2528: Handshapes and Movements: Multiple- channel ASL recognition Christian Vogler and Dimitris.
Neural networks Introduction Fitting neural networks
Patch to the Future: Unsupervised Visual Prediction
Hidden Markov Models Reading: Russell and Norvig, Chapter 15, Sections
Database-Based Hand Pose Estimation CSE 6367 – Computer Vision Vassilis Athitsos University of Texas at Arlington.
International Conference on Automatic Face and Gesture Recognition, 2006 A Layered Deformable Model for Gait Analysis Haiping Lu, K.N. Plataniotis and.
Chapter 15 Probabilistic Reasoning over Time. Chapter 15, Sections 1-5 Outline Time and uncertainty Inference: ltering, prediction, smoothing Hidden Markov.
 INTRODUCTION  STEPS OF GESTURE RECOGNITION  TRACKING TECHNOLOGIES  SPEECH WITH GESTURE  APPLICATIONS.
Department of Electrical and Computer Engineering He Zhou Hui Zheng William Mai Xiang Guo Advisor: Professor Patrick Kelly ASLLENGE.
SOMM: Self Organizing Markov Map for Gesture Recognition Pattern Recognition 2010 Spring Seung-Hyun Lee G. Caridakis et al., Pattern Recognition, Vol.
An Introduction to Hidden Markov Models and Gesture Recognition Troy L. McDaniel Research Assistant Center for Cognitive Ubiquitous Computing Arizona State.
Hidden Markov Models Theory By Johan Walters (SR 2003)
Hidden Markov Model based 2D Shape Classification Ninad Thakoor 1 and Jean Gao 2 1 Electrical Engineering, University of Texas at Arlington, TX-76013,
1 CS6825: Recognition 8. Hidden Markov Models 2 Hidden Markov Model (HMM) HMMs allow you to estimate probabilities of unobserved events HMMs allow you.
ICIP 2000, Vancouver, Canada IVML, ECE, NTUA Face Detection: Is it only for Face Recognition?  A few years earlier  Face Detection Face Recognition 
HMM-BASED PATTERN DETECTION. Outline  Markov Process  Hidden Markov Models Elements Basic Problems Evaluation Optimization Training Implementation 2-D.
Lecture 5: Learning models using EM
Human Action Recognition
1 Integration of Background Modeling and Object Tracking Yu-Ting Chen, Chu-Song Chen, Yi-Ping Hung IEEE ICME, 2006.
Learning the space of time warping functions for Activity Recognition Function-Space of an Activity Ashok Veeraraghavan Rama Chellappa Amit K. Roy-Chowdhury.
Smart Traveller with Visual Translator for OCR and Face Recognition LYU0203 FYP.
Learning and Recognizing Activities in Streams of Video Dinesh Govindaraju.
Handwritten Character Recognition using Hidden Markov Models Quantifying the marginal benefit of exploiting correlations between adjacent characters and.
May 20, 2006SRIV2006, Toulouse, France1 Acoustic Modeling of Accented English Speech for Large-Vocabulary Speech Recognition ATR Spoken Language Communication.
Introduction to Automatic Speech Recognition
1 7-Speech Recognition (Cont’d) HMM Calculating Approaches Neural Components Three Basic HMM Problems Viterbi Algorithm State Duration Modeling Training.
Knowledge Systems Lab JN 9/10/2002 Computer Vision: Gesture Recognition from Images Joshua R. New Knowledge Systems Laboratory Jacksonville State University.
7-Speech Recognition Speech Recognition Concepts
Segmental Hidden Markov Models with Random Effects for Waveform Modeling Author: Seyoung Kim & Padhraic Smyth Presentor: Lu Ren.
Comparative study of various Machine Learning methods For Telugu Part of Speech tagging -By Avinesh.PVS, Sudheer, Karthik IIIT - Hyderabad.
Project title : Automated Detection of Sign Language Patterns Faculty: Sudeep Sarkar, Barbara Loeding, Students: Sunita Nayak, Alan Yang Department of.
Object Stereo- Joint Stereo Matching and Object Segmentation Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on Michael Bleyer Vienna.
K. J. O’Hara AMRS: Behavior Recognition and Opponent Modeling Oct Behavior Recognition and Opponent Modeling in Autonomous Multi-Robot Systems.
Combined Central and Subspace Clustering for Computer Vision Applications Le Lu 1 René Vidal 2 1 Computer Science Department, Johns Hopkins University,
CVPR Workshop on RTV4HCI 7/2/2004, Washington D.C. Gesture Recognition Using 3D Appearance and Motion Features Guangqi Ye, Jason J. Corso, Gregory D. Hager.
Online Arabic Handwriting Recognition Fadi Biadsy Jihad El-Sana Nizar Habash Abdul-Rahman Daud Done byPresented by KFUPM Information & Computer Science.
Fingerspelling Alphabet Recognition Using A Two-level Hidden Markov Model Shuang Lu, Joseph Picone and Seong G. Kong Institute for Signal and Information.
出處: Signal Processing and Communications Applications, 2006 IEEE 作者: Asanterabi Malima, Erol Ozgur, and Miijdat Cetin 2015/10/251 指導教授:張財榮 學生:陳建宏 學號: M97G0209.
LML Speech Recognition Speech Recognition Introduction I E.M. Bakker.
ECE 8443 – Pattern Recognition EE 3512 – Signals: Continuous and Discrete Objectives: Spectrograms Revisited Feature Extraction Filter Bank Analysis EEG.
Template matching and object recognition. CS8690 Computer Vision University of Missouri at Columbia Matching by relations Idea: –find bits, then say object.
ECE 8443 – Pattern Recognition ECE 8527 – Introduction to Machine Learning and Pattern Recognition Objectives: Reestimation Equations Continuous Distributions.
Tell Me What You See and I will Show You Where It Is Jia Xu 1 Alexander G. Schwing 2 Raquel Urtasun 2,3 1 University of Wisconsin-Madison 2 University.
CS Statistical Machine learning Lecture 24
Ch 5b: Discriminative Training (temporal model) Ilkka Aho.
Experimental Results Abstract Fingerspelling is widely used for education and communication among signers. We propose a new static fingerspelling recognition.
Unsupervised Mining of Statistical Temporal Structures in Video Liu ze yuan May 15,2011.
 Present by 陳群元.  Introduction  Previous work  Predicting motion patterns  Spatio-temporal transition distribution  Discerning pedestrians  Experimental.
Activity Analysis of Sign Language Video Generals exam Neva Cherniavsky.
Discriminative Training and Machine Learning Approaches Machine Learning Lab, Dept. of CSIE, NCKU Chih-Pin Liao.
CS Statistical Machine learning Lecture 25 Yuan (Alan) Qi Purdue CS Nov
EEL 6586: AUTOMATIC SPEECH PROCESSING Hidden Markov Model Lecture Mark D. Skowronski Computational Neuro-Engineering Lab University of Florida March 31,
Phone-Level Pronunciation Scoring and Assessment for Interactive Language Learning Speech Communication, 2000 Authors: S. M. Witt, S. J. Young Presenter:
1 7-Speech Recognition Speech Recognition Concepts Speech Recognition Approaches Recognition Theories Bayse Rule Simple Language Model P(A|W) Network Types.
A NONPARAMETRIC BAYESIAN APPROACH FOR
Gait Recognition Gökhan ŞENGÜL.
EEL 6586: AUTOMATIC SPEECH PROCESSING Hidden Markov Model Lecture
Lecture 26 Hand Pose Estimation Using a Database of Hand Images
Dynamic Routing Using Inter Capsule Routing Protocol Between Capsules
The connected word recognition problem Problem definition: Given a fluently spoken sequence of words, how can we determine the optimum match in terms.
Globally Optimal Generalized Maximum Multi Clique Problem (GMMCP) using Python code for Pedestrian Object Tracking By Beni Mulyana.
Hidden Markov Models Part 2: Algorithms
EEG Recognition Using The Kaldi Speech Recognition Toolkit
PRAKASH CHOCKALINGAM, NALIN PRADEEP, AND STAN BIRCHFIELD
Hand Gesture Recognition Using Hidden Markov Models
Handwritten Characters Recognition Based on an HMM Model
Connected Word Recognition
Visual Recognition of American Sign Language Using Hidden Markov Models 문현구 문현구.
Presentation transcript:

Abstract Developing sign language applications for deaf people is extremely important, since it is difficult to communicate with people that are unfamiliar with sign language. Ideally, a translation system would improve communication by utilizing common and easy to use signs that can be applied to human computer interaction. Continuous stream sign recognition is a major challenge since both spatial (hand position) and temporal (when gesture starts/ends) segmentation are needed. This poster will discuss: State of the art ASL recognition system structure Constraints for sign language recognition Introduction to algorithms commonly used in ASL recognition Benchmark results and databases Abstract Developing sign language applications for deaf people is extremely important, since it is difficult to communicate with people that are unfamiliar with sign language. Ideally, a translation system would improve communication by utilizing common and easy to use signs that can be applied to human computer interaction. Continuous stream sign recognition is a major challenge since both spatial (hand position) and temporal (when gesture starts/ends) segmentation are needed. This poster will discuss: State of the art ASL recognition system structure Constraints for sign language recognition Introduction to algorithms commonly used in ASL recognition Benchmark results and databases Shuang Lu Advisor: Joseph Picone A Review of American Sign Language (ASL) Recognition College of Engineering Temple University VB-HMM-EM for single sign recognition General HMM structure: Initial state; Hidden variables; Observed variables. VB-HMM-EM for single sign recognition General HMM structure: Initial state; Hidden variables; Observed variables. References [1] J. Alon, V. Athitsos, Q. Yuan, and S. Sclaroff. A unified framework for gesture recognition and spatiotemporal gesture segmentation. PAMI, 31(9):1685–1699, [2] R. Yang, S. Sarkar, and B. Loeding. Handling movement epenthesis and hand segmentation ambiguities in continuous sign language recognition using nested dynamic programming. PAMI, 32,no.3:462–477, [3] A. Thangali, J. P. Nash, and S. Sclaroff. Exploiting Phonological Constraints for Hand Shape Inference in ASL Video. In CVPR, References [1] J. Alon, V. Athitsos, Q. Yuan, and S. Sclaroff. A unified framework for gesture recognition and spatiotemporal gesture segmentation. PAMI, 31(9):1685–1699, [2] R. Yang, S. Sarkar, and B. Loeding. Handling movement epenthesis and hand segmentation ambiguities in continuous sign language recognition using nested dynamic programming. PAMI, 32,no.3:462–477, [3] A. Thangali, J. P. Nash, and S. Sclaroff. Exploiting Phonological Constraints for Hand Shape Inference in ASL Video. In CVPR, Level-Building Dynamic Programming DP: Given an optimal task, we need to solve many sub-problems, and then combine these solutions to reach an overall solution. Level-Building Dynamic Programming DP: Given an optimal task, we need to solve many sub-problems, and then combine these solutions to reach an overall solution. With Bigram grammar constraint If can not be a predecessor of Conclusion As can be seen from the above results, these commonly used algorithms which have shown to produce good results for speech recognition (i.e. HMM, DP) were tested for sign language. However, they do not yield satisfactory results in continuous sign recognition. Lacking spatial information is one major reason because these models typically ignore hand shapes. Moreover, multiple constraints should be considered to either speed up the process or improve accuracy. Future work will focus on how to use shape information without slowing down the overall recognition system. Conclusion As can be seen from the above results, these commonly used algorithms which have shown to produce good results for speech recognition (i.e. HMM, DP) were tested for sign language. However, they do not yield satisfactory results in continuous sign recognition. Lacking spatial information is one major reason because these models typically ignore hand shapes. Moreover, multiple constraints should be considered to either speed up the process or improve accuracy. Future work will focus on how to use shape information without slowing down the overall recognition system. Multiple Constraints Grammar constraint: Whether two gestures can be connected State transition constraint: Probability of one state of a sign change to another state Figure 2. Multiple constrains {start, end} co-occurrence constraint: Probability of two gestures can be the start and end states for a sign Multiple Constraints Grammar constraint: Whether two gestures can be connected State transition constraint: Probability of one state of a sign change to another state Figure 2. Multiple constrains {start, end} co-occurrence constraint: Probability of two gestures can be the start and end states for a sign Figure 3. ASL example: John should buy a car ASL Recognition System This process is typically implemented with Dynamic programming since HMM’s require large sets of training data. ASL Recognition System This process is typically implemented with Dynamic programming since HMM’s require large sets of training data. Gaussian mixture model Face detection based Motion information Hand Candidate Detection Global features (Distance histogram) Local features (Location and motion) Hand shape features (HOG) Feature Extraction Dynamic Programming Hidden Markov Model Bayesian Network Calculate Match Cost One Sign Process within DP recognition Figure 1. ASL recognition system Benchmark Databases: Benchmark Databases: Research InstituteYearShort Sleeves BackgroundNumber of Signer Data Size Data Type Purdue University2002NoSimpleThreeMediumLetter spelling Boston University2001YesMultipleThreeLargeLexicon/continuous RWTH-Boston2004SomeMultipleThreeLargeSentence/Lexicon/ Continuous University of South Florida 2006YesComplexOneSmallSentence VBE step: maximize lower bound w.r.t. VBM step: maximize lower bound w.r.t. Figure 4. Level-building Dynamic programming Figure 5. HMM structure Figure 6. VB-EM hand shape matching Results The best results are obtained from the VB-EM algorithm. For hand shape the rate of correct retrieval at rank 1 and top-5 are only 32.1% and 61.5%. The best recognition result for simple background continuous sign gesture recognition is about 83%, using level-building DP. However, the results become significantly worse when using complex backgrounds or when test sequences are from different signers. Results The best results are obtained from the VB-EM algorithm. For hand shape the rate of correct retrieval at rank 1 and top-5 are only 32.1% and 61.5%. The best recognition result for simple background continuous sign gesture recognition is about 83%, using level-building DP. However, the results become significantly worse when using complex backgrounds or when test sequences are from different signers. Error rate 20 test sequences 5 test sequences 10 test sequences Error rate Figure 8. Error rate for complex background test Figure 9. Error rate for cross signer test Figure 10. Left: Original image; Middle: Hand candidate (red) detection (USF) Right: Hand candidate detection (Boston) Figure 7. VB-HMM