Gesture Recognition in a Classroom Environment. Michael Wallick, CS766
Virtual Videography: Place cameras in an environment. Automatically edit the video off-line. The output should look like the work of a professional editor.
Our Implementation: Looking at the classroom domain. Recorded one semester of CS559 (Computer Graphics).
Computer Vision in Virtual Videography: Understand what is happening on the chalkboard (writing on the board). Understand what the professor is doing (location, actions).
Chalkboard… Partition the board into regions. Regions are semantically related groups of writing. Regions can be approximated using computer vision. Let’s treat this as a black box … it just happens.
Gesture Recognition: Understand gestures or actions by a performer. Generally used as an input to a computer. Understand what the professor is doing: pointing, writing, reaching.
Writing can be confused with Pointing and Reaching
Template Matching for Gesture Recognition: Generate templates of known gestures. Match an unknown frame against the templates with a template-matching algorithm: Sum of Squared Differences, Cross-Correlation, Image Difference, …
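To make two of the measures named above concrete, here is a minimal sketch of sum of squared differences and normalized cross-correlation on small grayscale images stored as lists of rows. The function names, the use of the normalized variant of cross-correlation, and the dictionary of templates are assumptions for illustration, not the code used in this project.

```python
import math

def ssd(a, b):
    """Sum of squared differences of two equal-size images (lower = more similar)."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def ncc(a, b):
    """Normalized cross-correlation in [-1, 1] (higher = more similar)."""
    fa = [p for row in a for p in row]
    fb = [p for row in b for p in row]
    mean_a = sum(fa) / len(fa)
    mean_b = sum(fb) / len(fb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in fa) *
                    sum((y - mean_b) ** 2 for y in fb))
    # A constant image has zero variance; report "no correlation" in that case.
    return num / den if den else 0.0

def best_match(frame, templates):
    """Return the label of the template with the smallest SSD to the frame."""
    return min(templates, key=lambda label: ssd(frame, templates[label]))

# Hypothetical 2x2 binary images, only to show the calls.
frame = [[0, 1], [1, 1]]
templates = {"pointing": [[0, 1], [1, 0]], "reaching": [[1, 0], [0, 0]]}
print(best_match(frame, templates))
```

SSD and image difference reward pixel-exact overlap, while the normalized correlation is insensitive to overall brightness; which one suits a given set of templates is an empirical question.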
Implementation of Gesture Recognition: The user selects several template images (Pointing, Reaching).
Format the Templates: Separate the lecturer. Crop the image. Resize the images to 256x256.
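The crop and resize steps could look roughly like the sketch below, which starts from a binary silhouette (1 = lecturer pixel) that has already been separated. Cropping to the bounding box of the "on" pixels and resizing by nearest-neighbor sampling are both assumptions; the slides only state that the image is cropped and resized to 256x256.

```python
def crop_to_silhouette(mask):
    """Crop a binary image to the bounding box of its non-zero pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        raise ValueError("silhouette is empty")
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    return [row[left:right + 1] for row in mask[top:bottom + 1]]

def resize_nearest(img, size=256):
    """Resize to size x size by nearest-neighbor sampling (aspect ratio is not kept)."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]

# Tiny hypothetical silhouette, resized to 4x4 so the result is easy to read.
silhouette = [[0, 0, 0, 0],
              [0, 1, 1, 0],
              [0, 0, 1, 0],
              [0, 0, 0, 0]]
cropped = crop_to_silhouette(silhouette)
small = resize_nearest(cropped, size=4)
```

Bringing every template and every test frame to the same size is what allows a pixel-by-pixel comparison later on.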
Build the Recognition Mask: Load each template into the mask. For each “on” pixel, increment the mask at that location.
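A minimal sketch of the mask-building step, assuming each template is a binary image of the same size stored as a list of rows (the name `build_mask` is hypothetical):

```python
def build_mask(templates):
    """Count, per pixel, how many templates have that pixel "on"."""
    height, width = len(templates[0]), len(templates[0][0])
    mask = [[0] * width for _ in range(height)]
    for template in templates:
        for r in range(height):
            for c in range(width):
                if template[r][c]:
                    mask[r][c] += 1
    return mask

# Two hypothetical 2x2 templates of the same gesture.
pointing_templates = [[[1, 0], [1, 1]],
                      [[1, 0], [0, 1]]]
mask = build_mask(pointing_templates)
```

Pixels that are "on" in many templates end up with high counts, so they weigh more when a new frame is scored against the mask.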
Recognizing Gestures: Separate the lecturer from the foreground. Crop and resize. For every “on” pixel, increment the “Score” by the mask value at that location. Compute Confidence as (float) (Score/Mask_Total). Compute Confidence for all gestures.
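The scoring step can be sketched as follows. Here `Mask_Total` is taken to be the sum of all values in a gesture's mask, which is an assumption (the slides do not define it); the function names are hypothetical as well.

```python
def confidence(silhouette, mask):
    """Score / Mask_Total for one gesture mask (Mask_Total assumed = sum of the mask)."""
    mask_total = sum(sum(row) for row in mask)
    if mask_total == 0:
        return 0.0
    # Add up the mask value under every "on" pixel of the silhouette.
    score = sum(m for srow, mrow in zip(silhouette, mask)
                for s, m in zip(srow, mrow) if s)
    return score / mask_total

def confidences(silhouette, masks):
    """Confidence of the silhouette against every gesture's mask."""
    return {gesture: confidence(silhouette, mask) for gesture, mask in masks.items()}

# Hypothetical 2x2 masks and silhouette.
masks = {"pointing": [[2, 0], [1, 2]], "reaching": [[0, 2], [2, 0]]}
silhouette = [[1, 1], [0, 1]]
result = confidences(silhouette, masks)
```

Under this reading, a confidence of 1.0 means the silhouette covers every pixel that any template of that gesture covered.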
A Gesture Matches if Confidence is: under 50% but much larger than for the other gestures; or over 50% and not too close to the other gestures.
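One way to turn this rule into code is sketched below. The slides do not say how large "much larger" or how small "too close" is, so the `ratio` and `margin` parameters and their default values are purely illustrative assumptions.

```python
def pick_gesture(conf, ratio=2.0, margin=0.1):
    """Return the matching gesture label, or None if no gesture matches.

    ratio  -- assumed meaning of "much larger" (best >= ratio * runner-up)
    margin -- assumed meaning of "not too close" (best - runner-up >= margin)
    """
    ranked = sorted(conf.items(), key=lambda item: item[1], reverse=True)
    best, best_c = ranked[0]
    second_c = ranked[1][1] if len(ranked) > 1 else 0.0
    if best_c > 0.5:
        # Over 50%: accept unless the runner-up is too close.
        return best if best_c - second_c >= margin else None
    # At or under 50%: accept only if it clearly dominates the runner-up.
    return best if best_c > 0 and best_c >= ratio * second_c else None

print(pick_gesture({"pointing": 0.8, "reaching": 0.5}))
```

Returning `None` leaves the frame without a gesture label, so ambiguous frames are not forced into one of the known gestures.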
Example: Ground State
Example: Pointing
Example: Reaching
Mistakes: Overall the results are good. Sometimes individual frames are labeled incorrectly.
Solution: For each frame, look at the surrounding frames. Label the frame with the gesture of the majority.
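The majority vote over neighboring frames can be sketched as a sliding window over the per-frame labels. The window `radius` (number of frames looked at on each side) is an assumed parameter; the slides do not give a window size.

```python
from collections import Counter

def smooth_labels(labels, radius=2):
    """Relabel each frame with the most common label in its surrounding window."""
    smoothed = []
    for i in range(len(labels)):
        # The window is clipped at the start and end of the sequence.
        window = labels[max(0, i - radius): i + radius + 1]
        smoothed.append(Counter(window).most_common(1)[0][0])
    return smoothed

# A single mislabeled frame in the middle of a pointing gesture.
raw = ["pointing", "pointing", "reaching", "pointing", "pointing"]
fixed = smooth_labels(raw, radius=1)
```

A larger radius removes longer runs of wrong labels but also blurs the exact frame at which a gesture starts or ends.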
Where to go from here… Use the regions to validate the gestures and to determine what is being pointed at. Incorporate the writing information with the gestures. Write paper and webpage!
Conclusions: We want to use gesture recognition for Virtual Videography. Gestures can be used to drive the camera model. Find gestures by template matching. For each frame, take the “average” over a window of surrounding frames to correct errors.
Thank You! Questions/ Comments?