Morphological Segmentation of Natural Gesture
Jacob Eisenstein
MAS 622 Final Project
[Figure: gesture phase labels - Prepare, Stroke, Hold, Retract]
Natural Gesture
- Gesture supplements verbal communication: turn boundaries, reference resolution, visual imagery
- What are the lowest-level gesture units?
- McNeill: "movement phases" - Prepare, Stroke, Hold, Retract
Videos of people explaining things to each other
[Figure: example timeline with Prepare, Stroke, and Hold phases marked over time]
Outline
- Hand Tracking
  - "Guided" clustering
  - Kalman filter
- Gesture Recognition
  - Durational HMMs
  - Recurrent neural networks
Hand Tracking
- Seems easy?
- Occlusion, shadows; hands are not in every frame
- 85% accuracy with color info alone
- How to do better?
Better Hand Tracking
- Other features: position, edges
- But how to use these features?
- Supervised training: P = set of positive examples, N = set of negative examples
[Figure: positive (P) and negative (N) example sets in feature space]
"Guided" Training
- Labeling is very expensive
- Approximate P and N with small labeled sets P' and N'
- Initialize clusters at the centers of P' and N'
- K-means cluster using all points (see the sketch below)
[Figure: clusters seeded from P' and N', refined over all points]
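A minimal sketch of this guided k-means step, assuming NumPy and scikit-learn (the slides do not name an implementation): the small labeled sets P' and N' only supply the initial centers, and k-means then refines the assignment over all points.

    import numpy as np
    from sklearn.cluster import KMeans

    def guided_cluster(features, p_prime, n_prime):
        """Cluster all feature vectors into hand / non-hand groups,
        seeding k-means with the centers of the small labeled sets
        P' (positive) and N' (negative)."""
        init = np.vstack([p_prime.mean(axis=0), n_prime.mean(axis=0)])
        km = KMeans(n_clusters=2, init=init, n_init=1).fit(features)
        return km.labels_ == 0   # True where a point falls in the P'-seeded cluster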
Hand Tracking Results
- Error rate: (FP + FN + 2 * WrongPos) / ALL
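For concreteness, the error measure reads as follows (a trivial sketch; the variable names are mine, not from the slides):

    def error_rate(fp, fn, wrong_pos, total):
        """Slide's error measure: false positives, false negatives,
        and doubly weighted wrong-position detections, over all frames."""
        return (fp + fn + 2 * wrong_pos) / total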
Kalman Filtering
- State: X(t) = X(t-1) + V(t-1);  V(t) = V(t-1) + W(t)
- Observation: Y(t) = X(t) + R(t)
- Initialization: Cov(W) = [[0.1, 0], [0, 0.1]], Cov(R) = [[1, 0], [0, 1]]
- Parameters re-estimated using EM (see the sketch below)
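A minimal constant-velocity Kalman filter matching the state and observation equations above, with the slide's covariance initialization; the EM re-estimation of the parameters is omitted, and NumPy is assumed.

    import numpy as np

    F = np.array([[1, 0, 1, 0],    # X(t) = X(t-1) + V(t-1)
                  [0, 1, 0, 1],
                  [0, 0, 1, 0],    # V(t) = V(t-1) + W(t)
                  [0, 0, 0, 1]], dtype=float)
    H = np.array([[1, 0, 0, 0],    # Y(t) = X(t) + R(t): only position is observed
                  [0, 1, 0, 0]], dtype=float)
    Q = np.diag([0.0, 0.0, 0.1, 0.1])   # Cov(W): process noise enters via velocity
    R = np.eye(2)                        # Cov(R): observation noise

    def kalman_step(x, P, y):
        """One predict/update cycle for the state x = [px, py, vx, vy]."""
        x, P = F @ x, F @ P @ F.T + Q                 # predict
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)                # Kalman gain
        x = x + K @ (y - H @ x)                       # correct with observation y
        P = (np.eye(4) - K @ H) @ P
        return x, P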
Kalman Filter Results
- Reduces position accuracy
- Smooths velocity
- Improves overall performance by ~5%
Movement Phase Recognition
- Two sources of information:
  - Observable features: velocity, position
  - Temporal / sequential structure
- Ideal for an HMM?
HMM Setup
- We have data with states labeled
- Learn state transitions and outputs directly from the data; no need for Baum-Welch estimation
- Find the best path using Viterbi
- Any probabilistic classifier can supply the output probabilities (see the sketch below)
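A compact sketch of this supervised setup, assuming NumPy: transitions are estimated by counting labeled state pairs, and Viterbi decodes in log space from whatever per-frame classifier provides the emission scores. The add-one smoothing is my assumption, not from the slides.

    import numpy as np

    def learn_transitions(state_seqs, n_states):
        """Estimate transition probabilities by counting labeled state pairs
        (add-one smoothing is an assumption, not from the slides)."""
        counts = np.ones((n_states, n_states))
        for seq in state_seqs:
            for s, t in zip(seq[:-1], seq[1:]):
                counts[s, t] += 1
        return counts / counts.sum(axis=1, keepdims=True)

    def viterbi(log_emissions, log_trans, log_prior):
        """log_emissions[t, s] = log P(obs_t | state s) from any probabilistic
        classifier; returns the most likely state path."""
        T, S = log_emissions.shape
        delta = log_prior + log_emissions[0]
        back = np.zeros((T, S), dtype=int)
        for t in range(1, T):
            scores = delta[:, None] + log_trans        # scores[i, j]: from state i to j
            back[t] = scores.argmax(axis=0)
            delta = scores.max(axis=0) + log_emissions[t]
        path = [int(delta.argmax())]
        for t in range(T - 1, 0, -1):
            path.append(int(back[t, path[-1]]))
        return path[::-1]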
Initial Results
- Accuracy = percent classified correctly
- Five-class problem, including "no gesture"
- 1-component mixture: 34.6%
- 3-component mixture: 33.3%
- 7-component mixture: 32.6%
- Not very good!
Durational HMMs
- A standard HMM implies an exponentially decaying state-duration model: with self-transition probability a, P(d = t) = a^(t-1) * (1 - a)
- What about other models of state duration?
- Rabiner explains parameter estimation for durational HMMs, but not Viterbi
Viterbi for Gaussian Durational HMMs
- Leaving a state follows a probability density function: P(d = t) = N(t; u, s)
- Each self-transition follows the complementary cumulative distribution: P(d > t) = 1 - C(t; u, s)
- Normalize for the cost you've already paid:
  - P(d = t | d > t-1) = N(t; u, s) / (1 - C(t-1; u, s))
  - P(d > t | d > t-1) = (1 - C(t; u, s)) / (1 - C(t-1; u, s))
- (see the sketch below)
[Figure: duration distributions Pi(d), Pj(d) for two states]
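A sketch of these duration terms, assuming SciPy's normal distribution for N (density) and C (CDF). A durational Viterbi then also tracks how long the current state has been occupied, scoring self-transitions with the "stay" term and exits with the "leave" term on top of the ordinary transition probabilities.

    import numpy as np
    from scipy.stats import norm

    def duration_log_probs(t, mu, sigma):
        """Log-probabilities of leaving vs. staying after t frames in a state,
        under a Gaussian duration model, conditioned on having already
        survived t - 1 frames (the cost already paid)."""
        paid = 1.0 - norm.cdf(t - 1, mu, sigma)           # P(d > t-1)
        leave = norm.pdf(t, mu, sigma) / paid             # P(d = t | d > t-1)
        stay = (1.0 - norm.cdf(t, mu, sigma)) / paid      # P(d > t | d > t-1)
        return np.log(leave + 1e-12), np.log(stay + 1e-12)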
Results for Durational Viterbi

    Components   Standard   Durational
    1            34.6%      35.5%
    3            33.3%      36.7%
    7            31.6%      38.0%

- Best durational is 3.4% better than the best baseline
Neural Networks
- Feedforward network (13 x 50 x 5): 44.5%
- Ignoring sequence and temporal information!
- Maybe recurrent NNs can do even better? (see the sketch below)
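A minimal stand-in for the 13 x 50 x 5 feedforward network using scikit-learn; the framework, the random placeholder data, and the train/test split are my assumptions, not the original setup.

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    # Placeholder data with the slide's dimensions: 13 features per frame,
    # 5 movement-phase classes (including "no gesture").
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 13))
    y = rng.integers(0, 5, size=1000)

    clf = MLPClassifier(hidden_layer_sizes=(50,), max_iter=500)   # 13 -> 50 -> 5
    clf.fit(X[:800], y[:800])
    print("frame accuracy:", clf.score(X[800:], y[800:]))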
Future Work
- Hand Tracking: cluster to mixtures of Gaussians instead of single Gaussians
- Kalman Filtering: noise is not Gaussian; particle filter?
- Gesture Phase Recognition: recurrent neural networks, other discriminative methods