Learning and Recognizing Activities in Streams of Video Dinesh Govindaraju.


1 Learning and Recognizing Activities in Streams of Video Dinesh Govindaraju

2 Motivation: Activity recognition from video enables higher-level functionality, such as identifying who is presenting an agenda item and gauging attendee interest levels.

3 Motivation: Learning should be automatic and not involve hand generation of models. Hand-built models are impractical when there are many activities, and less versatile, since they may constrain you to particular aspects of the problem.

4 Problem Definition: Video data. Observations are movement deltas extracted via face tracking. Hand-label training segments. Learn underlying models from the training segments. Carry out activity recognition.

5 Approach - Learning: Assume the underlying models can be approximated by HMMs. Use Baum-Welch to learn the best model from the training segments. Need to find the observation space and the number of states.

6 Approach - Learning HMMs:
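The figure for this slide did not survive in the transcript. As a reminder of the standard notation from the Rabiner tutorial cited on slide 11 (the state symbols $S_i$ and output symbols $v_k$ are that tutorial's conventions, assumed here and not confirmed by the slides), a discrete HMM is usually written as:

```latex
\lambda = (A, B, \pi), \qquad
a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \qquad
b_j(k) = P(o_t = v_k \mid q_t = S_j), \qquad
\pi_i = P(q_1 = S_i)
```

Here $A$ holds the state transition probabilities, $B$ the observation (emission) probabilities, and $\pi$ the initial state distribution.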

7 Approach - Learning: To find the observation space: run through all training segments and add each observation encountered. For a new observation seen during recognition, augment the learned observation matrices.
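A minimal sketch of how these two steps might look, assuming observations are hashable values such as (dx, dy) movement deltas. The function names, the small probability `eps` given to an unseen observation, and the row renormalization are illustrative assumptions; the slides do not say how the matrices are augmented.

```python
def build_observation_space(segments):
    """Map every distinct observation seen in training to a column index."""
    index = {}
    for segment in segments:
        for obs in segment:
            if obs not in index:
                index[obs] = len(index)
    return index


def augment_emissions(index, emissions, new_obs, eps=1e-6):
    """Add a column for an unseen observation and renormalize each row.

    eps is an assumed small probability; the slides give no value.
    """
    if new_obs in index:
        return index, emissions
    new_index = dict(index)
    new_index[new_obs] = len(index)
    new_emissions = []
    for row in emissions:
        extended = list(row) + [eps]
        total = sum(extended)
        new_emissions.append([p / total for p in extended])
    return new_index, new_emissions
```

Giving the new symbol a small nonzero probability keeps a window's likelihood from collapsing to zero the first time an unseen movement delta appears.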

8 Approach - Learning: To find the number of states, Q (for each activity): set the upper bound to the length of the longest training segment; iterate over values of Q and generate the most likely model for each using Baum-Welch.

9 Approach - Learning: To find the number of states, Q (for each activity): choose the best Q by N-fold cross-validation, using discriminative power as the criterion. With the best Q, run Baum-Welch from a number of sets of randomly initialized parameters to get λ_a.
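The search over Q can be sketched as a plain cross-validation loop. This is only an outline under stated assumptions: `train` and `score` are hypothetical callbacks standing in for Baum-Welch training and for the discriminative-power criterion, neither of which the slides specify in detail, and the round-robin fold assignment is also an assumption.

```python
def choose_num_states(segments, max_states, n_folds, train, score):
    """Return the state count Q with the best mean held-out score.

    train(training_segments, q) -> model and score(model, segment) -> float
    are hypothetical callbacks supplied by the caller.
    """
    best_q, best_mean = None, float("-inf")
    for q in range(1, max_states + 1):
        fold_scores = []
        for fold in range(n_folds):
            # Round-robin split: segment i is held out in fold i % n_folds.
            held_out = [s for i, s in enumerate(segments) if i % n_folds == fold]
            training = [s for i, s in enumerate(segments) if i % n_folds != fold]
            if not held_out or not training:
                continue
            model = train(training, q)
            fold_scores.extend(score(model, s) for s in held_out)
        if fold_scores:
            mean = sum(fold_scores) / len(fold_scores)
            if mean > best_mean:
                best_q, best_mean = q, mean
    return best_q
```

Following slide 8, `max_states` would be set to `max(len(s) for s in segments)`, the length of the longest training segment.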

10 Approach - Recognition: Define a window width, w. From the beginning, sequentially consider windows of observations (where L is the length of the entire sequence).
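The window expression on this slide was lost in the transcript. One plausible reading, assumed here, is that the window advances one observation at a time, which gives L - w + 1 full windows:

```python
def sliding_windows(observations, w):
    """Yield (start_index, window) for every full window of width w."""
    for start in range(len(observations) - w + 1):
        yield start, observations[start:start + w]
```

If the sequence is shorter than w, the generator yields nothing.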

11 Approach - Recognition: Calculate the likelihood of each window segment. Reference: L. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proceedings of the IEEE, 1989.
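The likelihood of a window under a discrete HMM is commonly computed with the forward algorithm described in the Rabiner tutorial. The sketch below is a scaled version in pure Python, assuming observations have already been mapped to integer symbol indices and the window is non-empty; the parameter names `pi`, `A`, and `B` are chosen here for illustration.

```python
import math


def log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: returns log P(obs | pi, A, B).

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    obs     : non-empty sequence of symbol indices
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then emit obs[t].
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this window
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob
```

Rescaling at every step and summing the logs of the scale factors avoids numerical underflow on longer windows.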

12 Approach - Recognition: Label the middle frame in each window with the activity that has the highest likelihood.
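A small sketch of this labeling step, assuming the per-window log-likelihoods have already been computed for each activity. The data layout (one dict per window start) and the choice to leave frames near the sequence edges unlabeled are assumptions, as the slides do not describe how those frames are handled.

```python
def label_frames(window_scores, w, num_frames):
    """Assign each window's best activity to the window's middle frame.

    window_scores[s] maps activity name -> log-likelihood of the window
    starting at frame s. Frames with no centered window stay None.
    """
    labels = [None] * num_frames
    for start, scores in enumerate(window_scores):
        middle = start + w // 2
        labels[middle] = max(scores, key=scores.get)
    return labels
```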

13 Evaluation and Results Activities being observed:

14 Evaluation and Results: Observation stream obtained from an 87-second image sequence of 1296 individual frames. Example frames after face detection:

15 Evaluation and Results: The observation sequence was first hand-labeled. Segments showing the same activity were extracted. 4 training segments were used to learn each activity.

16 Evaluation and Results

17 Once the underlying models were learned, likelihoods were calculated using a sliding window. A value of 21 was used for the window width, w, as this was the average length of the training segments.

18 Evaluation and Results

19 Carry out recognition using the likelihoods by assigning activities to the frames. Compare against hand-assigned labels. Accuracy: approximately 76%.
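The comparison against hand labels amounts to a per-frame accuracy. A minimal sketch follows, assuming unlabeled frames are marked `None` and excluded from the count; the slides do not state how such frames were treated in the reported figure.

```python
def frame_accuracy(predicted, hand_labels):
    """Fraction of labeled frames whose prediction matches the hand label."""
    pairs = [(p, h) for p, h in zip(predicted, hand_labels) if p is not None]
    if not pairs:
        return 0.0
    return sum(p == h for p, h in pairs) / len(pairs)
```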

20 Evaluation and Results: Algorithm-assigned labels (legend: different from hand label; same as hand label).

21 Evaluation and Results: Hand-assigned labels (legend: different from algorithm label; same as algorithm label).

22 Future Work: Learn the underlying model that generates the sequence of activities themselves. Standardize the lengths of training segments using Dynamic Time Warping and use that length as the window width.

23 The End. Questions?

