Morphological Segmentation of Natural Gesture


1 Morphological Segmentation of Natural Gesture
Jacob Eisenstein, MAS 622 Final Project
[Figure: gesture movement phases labeled Stroke, Retract, Prepare, Hold]

2 Natural Gesture
Gesture supplements verbal communication: turn boundaries, reference resolution, visual imagery.
What are the lowest-level gesture units?
McNeill: “Movement phases”: Stroke, Prepare, Hold, Retract

3 Videos of people explaining things to each other
[Figure: timeline of movement phases (Prepare, Stroke, Hold); horizontal axis: Time]

4 Outline
Hand Tracking: “Guided” clustering, Kalman Filter
Gesture Recognition: Durational HMMs, Recurrent Neural Networks

5 Hand Tracking
Seems easy, but: occlusion, shadows; hands are not in every frame.
85% accuracy with color info alone.
How to do better?

6 Better Hand Tracking
Other features: position, edges.
But how to use these features?
Supervised training: P = set of positive examples, N = set of negative examples.
[Figure: sets P and N]

7 “Guided” Training
Labeling is very expensive.
Approximate P and N with P’ and N’.
Initialize clusters at centers of P’ and N’.
K-means cluster using all points.
[Figure: sets P and N with their approximations P’ and N’]
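The guided training steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `guided_kmeans`, the use of squared Euclidean distance, and the fixed iteration cap are all assumptions.

```python
import numpy as np

def guided_kmeans(points, approx_pos, approx_neg, n_iter=20):
    """Two-cluster k-means seeded from approximate positive/negative sets.

    points:      (n, d) array of all feature vectors, labeled or not.
    approx_pos:  (m, d) array standing in for P' (cheap guess at hand pixels).
    approx_neg:  (m, d) array standing in for N' (cheap guess at non-hand pixels).
    Returns (centers, labels); label 0 = cluster seeded by P', 1 = seeded by N'.
    """
    # Seed the two clusters at the centers of P' and N'.
    centers = np.stack([approx_pos.mean(axis=0), approx_neg.mean(axis=0)])
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for k in range(2):
            if np.any(labels == k):  # keep the old center if a cluster empties
                new_centers[k] = points[labels == k].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers.
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dists.argmin(axis=1)
```

Only the seeds come from the approximate sets; the cluster updates use every point, so no exact labels are needed.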

8 Hand Tracking Results
Error Rate = (FP + FN + 2*WrongPos) / ALL
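As a small worked example of the error-rate formula, the sketch below assumes FP, FN, and WrongPos are counts of false positives, false negatives, and wrong-position detections, and that ALL is the total number of frames scored. That reading of the abbreviations, and the function name, are assumptions.

```python
def tracking_error_rate(false_pos, false_neg, wrong_pos, total):
    """Error rate = (FP + FN + 2 * WrongPos) / ALL, as on the slide.

    A wrong-position detection is weighted twice, following the formula.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return (false_pos + false_neg + 2 * wrong_pos) / total

# Hypothetical counts, for illustration only.
example_rate = tracking_error_rate(false_pos=3, false_neg=2, wrong_pos=1, total=100)
```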

9 Kalman Filtering
State:
  X(t) = X(t-1) + V(t-1)
  V(t) = V(t-1) + W(t)
Observation:
  Y(t) = X(t) + R(t)
Initialization:
  Cov(W) = [0.1 0; 0 0.1]
  Cov(R) = [1 0; 0 1]
Parameters re-estimated using EM
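The state and observation equations above describe a constant-velocity model, which can be filtered roughly as in this sketch. It assumes 2-D position and velocity stacked into one state vector [x, y, vx, vy]. The initial state, the initial covariance `init_cov`, and the function name are assumptions, and the EM re-estimation step is not shown.

```python
import numpy as np

I2 = np.eye(2)
Z2 = np.zeros((2, 2))
A = np.block([[I2, I2], [Z2, I2]])        # X(t) = X(t-1) + V(t-1); V(t) = V(t-1)
H = np.hstack([I2, Z2])                   # Y(t) = X(t) + R(t)
Q = np.block([[Z2, Z2], [Z2, 0.1 * I2]])  # Cov(W): noise enters through velocity
R = 1.0 * I2                              # Cov(R): observation noise

def kalman_filter(observations, init_cov=10.0):
    """Filter an (n, 2) array of observed positions; returns (n, 4) state estimates."""
    # Assumed initialization: first observation as position, zero velocity.
    s = np.concatenate([observations[0], np.zeros(2)])
    P = init_cov * np.eye(4)
    filtered = [s.copy()]
    for y in observations[1:]:
        # Predict step.
        s = A @ s
        P = A @ P @ A.T + Q
        # Update step.
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)
        s = s + K @ (y - H @ s)
        P = (np.eye(4) - K @ H) @ P
        filtered.append(s.copy())
    return np.array(filtered)
```

The last two columns of the result are the smoothed velocity estimates, which could then serve as features for phase recognition.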

10 Kalman Filter Results
Reduces position accuracy
Smooths velocity
Improves overall performance by ~5%

11 Movement Phase Recognition
Two sources of information:
Observable features: velocity, position
Temporal / sequential
Ideal for HMM?

12 HMM Setup
We have data with states labeled.
Learn state transitions and outputs directly from data; no need for Baum-Welch estimation.
Find best path using Viterbi.
Can use any probabilistic classifier for the output probabilities.
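Because the states are labeled, the transition matrix can be estimated by counting, and decoding is standard Viterbi. The sketch below is one hedged way to write that down. The add-one smoothing, the function names, and the convention that the classifier supplies per-frame log-likelihoods in `log_emit` are all assumptions.

```python
import numpy as np

def estimate_transitions(state_seqs, n_states, smoothing=1.0):
    """Count labeled state transitions and normalize each row to sum to 1."""
    counts = np.full((n_states, n_states), smoothing)
    for seq in state_seqs:
        for prev, cur in zip(seq[:-1], seq[1:]):
            counts[prev, cur] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def viterbi(log_emit, log_trans, log_prior):
    """Most likely state path.

    log_emit:  (T, n) log output probabilities from any probabilistic classifier.
    log_trans: (n, n) log transition matrix, rows = previous state.
    log_prior: (n,) log initial state probabilities.
    """
    T, n = log_emit.shape
    score = log_prior + log_emit[0]
    back = np.zeros((T, n), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best score of a path ending in state i, then moving i -> j.
        cand = score[:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    # Trace the best path backwards from the best final state.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```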

13 Initial Results
Accuracy = percent classified correctly, including “no gesture” (5-class problem)
1-component mixture: 34.6%
3-component mixture: 33.3%
7-component mixture: 32.6%
Not very good!

14 Durational HMMs
HMMs assume an exponential decay model for state duration.
What about other models of state duration?
Rabiner explains parameter estimation for durational HMMs, but not Viterbi.

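15 Viterbi for Gaussian Durational HMMs
Here d is the state duration; N and C are the Gaussian density and cumulative distribution with mean u and spread parameter s.
Leaving a state obeys a probability density function:
  P(d = t) = N(t, u, s)
Each self-transition obeys a cumulative probability function:
  P(d > t) = 1 - C(t, u, s)
Normalize for the cost you’ve already paid:
  P(d = t | d > t-1) = N(t, u, s) / (1 - C(t-1, u, s))
  P(d > t | d > t-1) = (1 - C(t, u, s)) / (1 - C(t-1, u, s))
[Figure: duration distributions Pi(d) and Pj(d)]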
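The normalized leave and stay probabilities can be computed as in the sketch below. It assumes s is a standard deviation, and the function names and the small floor `_EPS` on the denominator are assumptions added to avoid division by zero far into the tail.

```python
import math

_EPS = 1e-300  # assumed guard against a zero denominator for very long durations

def gauss_pdf(t, mu, sigma):
    """Gaussian density N(t, u, s)."""
    z = (t - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gauss_cdf(t, mu, sigma):
    """Gaussian cumulative distribution C(t, u, s)."""
    return 0.5 * (1.0 + math.erf((t - mu) / (sigma * math.sqrt(2.0))))

def leave_prob(t, mu, sigma):
    """P(d = t | d > t-1) = N(t, u, s) / (1 - C(t-1, u, s))."""
    return gauss_pdf(t, mu, sigma) / max(1.0 - gauss_cdf(t - 1, mu, sigma), _EPS)

def stay_prob(t, mu, sigma):
    """P(d > t | d > t-1) = (1 - C(t, u, s)) / (1 - C(t-1, u, s))."""
    return (1.0 - gauss_cdf(t, mu, sigma)) / max(1.0 - gauss_cdf(t - 1, mu, sigma), _EPS)
```

In a durational Viterbi pass, these values would replace the fixed self-transition and exit probabilities, indexed by how long the path has already spent in the current state. Because the density stands in for a discrete probability, the two values sum to 1 only approximately.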

16

17 Results for Durational Viterbi
Accuracy (%) by number of mixture components:
Components   Standard   Durational
1            34.6       35.5
3            33.3       36.7
7            31.6       38.0
Best durational is 3.4 percentage points better than best baseline.

18 Neural Networks
Feedforward network (13 x 50 x 5): 44.5%
Ignoring sequence and temporal information!
Maybe recurrent NNs can do even better?
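For reference, a 13 x 50 x 5 feedforward network maps 13 input features per frame through 50 hidden units to 5 class probabilities. The forward pass below is only a sketch of that shape. The tanh hidden activation, the softmax output, the random initialization, and the names are assumptions, since the slide gives only the layer sizes.

```python
import numpy as np

def init_network(rng, n_in=13, n_hidden=50, n_out=5):
    """Random small weights for a 13 x 50 x 5 network (untrained)."""
    return {
        "W1": rng.normal(scale=0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(params, frames):
    """Per-frame class probabilities; each frame is classified independently."""
    hidden = np.tanh(frames @ params["W1"] + params["b1"])
    logits = hidden @ params["W2"] + params["b2"]
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)
```

Each row of the output depends only on its own frame, which is the sense in which this model ignores sequence and temporal information.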

19 Future Work
Hand Tracking: cluster to mixtures of Gaussians instead of single Gaussians
Kalman Filtering: noise is not Gaussian; particle filter?
Gesture Phase Recognition: recurrent neural networks, other discriminative methods

