
Cong Ye¹, Steve Maddock¹ and Frances Babbage²
¹ Department of Computer Science
² School of English Literature, Language and Linguistics
The University of Sheffield

• Video can be used to provide a record of a theatre performance
• Complex environment
• Automatic labelling of this video is difficult
• Manual annotation with semantic labels to support further computer-based study
• Aim: Label the three-dimensional (3D) movement of the actors, both in terms of their pose and stage use

Image: http://commons.wikimedia.org/wiki/File:C%27etait_mieux_avant.jpg

• HumanEva video data (Sigal et al., 2010)
  • Multiview video of single person movements
  • Baseline automatic skeleton fitting algorithm
  • Ground-truth provided by optical motion capture
• Labelling
  • 1 in 10 frames from 393 frames of video data for a jogging motion
• Two experiments
  • One untimed – compare with Sigal et al. (2010)
  • One timed – comparison of the effect of different starting poses for labelling a single frame
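As a rough illustration of the labelling workload, the sketch below picks 1 in 10 frames from a 393-frame sequence. The starting index (frame 0) and the names `TOTAL_FRAMES`, `STEP` and `frames_to_label` are assumptions for illustration; the slides do not say which frames were chosen.

```python
# Sketch only: assumes sampling starts at frame 0 with a fixed stride of 10.
TOTAL_FRAMES = 393  # frames in the jogging sequence (from the slide)
STEP = 10           # label 1 in every 10 frames (from the slide)

frames_to_label = list(range(0, TOTAL_FRAMES, STEP))

print(len(frames_to_label))   # number of frames a user would label
print(frames_to_label[:3])    # first few sampled frame indices
print(frames_to_label[-1])    # last sampled frame index
```

Under these assumptions, 40 frames are labelled, the last being frame 390.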

• Multiview video is mapped as texture walls according to the orientation of the cameras
• Moveable camera and texture walls
• Users manipulate skeleton from any viewpoint
• Mouse input to alter joints

Experiment 1: Results of untimed labelling vs. baseline algorithm

                                          Video num   3D error (mm)   Standard deviation
Single video                              1           83.1            16.1
Multiview video                           3           71.0            10.3
Sigal et al. (2010) baseline algorithm    4           82              -

• Untimed
• Single video vs. multiview video
• Compare labelled skeleton joint centre positions with ground-truth data
• Final error is average of all errors in all frames
• Compare with baseline algorithm (Sigal et al., 2010)
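The error measure described here can be sketched as follows: take the Euclidean distance between each labelled joint centre and its ground-truth position, then average over all joints in all frames. The function names, the data layout (a list of frames, each a list of `(x, y, z)` joint positions in mm) and the choice to compute the standard deviation over per-frame mean errors are assumptions; the slides do not specify them.

```python
import math
from statistics import mean, pstdev

def joint_errors(labelled_joints, truth_joints):
    """Euclidean distance (mm) between each labelled joint and its ground truth."""
    return [math.dist(l, t) for l, t in zip(labelled_joints, truth_joints)]

def sequence_error(labelled_frames, truth_frames):
    """Return (mean 3D error over all joints in all frames, std dev of per-frame means).

    Assumption: the spread is measured across frames; the slides do not say.
    """
    per_frame = [joint_errors(l, t) for l, t in zip(labelled_frames, truth_frames)]
    all_errors = [e for frame in per_frame for e in frame]
    frame_means = [mean(frame) for frame in per_frame]
    return mean(all_errors), pstdev(frame_means)

# Toy data: two frames, two joints each (made-up positions in mm).
labelled = [[(0, 0, 0), (100, 0, 0)], [(0, 0, 0), (0, 0, 0)]]
truth = [[(3, 4, 0), (100, 0, 0)], [(0, 0, 10), (0, 6, 8)]]

mean_err, sd = sequence_error(labelled, truth)
print(mean_err, sd)
```

For the toy data the joint errors are 5, 0, 10 and 10 mm, so the mean error is 6.25 mm.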

Experiment 2: Results of timed labelling

                       Time (min)   3D error (mm)   Standard deviation
A. Initial pose        85           80.8            11.4
B. Incremental pose    46           81.9            16.5

• Multiview video
• Timed
• A: Initial pose – reuse initial default pose to start labelling process for each frame
• B: Incremental pose – start with pose from last frame labelled
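The two starting-pose strategies can be sketched as a single labelling loop with a switch. Everything here is hypothetical: poses are reduced to a dictionary of joint angles, and the `adjust` callback stands in for the user's manual adjustment of the skeleton. The demo only measures how far each starting pose is from the target pose, which is one plausible reason the incremental strategy could be faster for smooth motion; it is not a model of the reported timings.

```python
def label_sequence(frames, default_pose, adjust, incremental):
    """Label each frame, starting from the default pose (A) or the last labelled pose (B)."""
    labelled = []
    start = dict(default_pose)
    for frame in frames:
        pose = adjust(frame, dict(start))  # user edits a copy of the starting pose
        labelled.append(pose)
        if incremental:
            start = pose                   # strategy B: carry the pose forward
    return labelled

def make_recording_adjust(log):
    """Hypothetical stand-in for manual labelling: logs the total angle change needed."""
    def adjust(target_pose, start_pose):
        log.append(sum(abs(target_pose[j] - start_pose[j]) for j in target_pose))
        return dict(target_pose)
    return adjust

# Toy smooth motion: one joint angle (degrees) increasing frame by frame.
targets = [{"knee": 10.0}, {"knee": 20.0}, {"knee": 30.0}]
default = {"knee": 0.0}

log_a, log_b = [], []
label_sequence(targets, default, make_recording_adjust(log_a), incremental=False)
label_sequence(targets, default, make_recording_adjust(log_b), incremental=True)
print(log_a)  # adjustment needed per frame with strategy A
print(log_b)  # adjustment needed per frame with strategy B
```

In this toy case strategy A needs adjustments of 10, 20 and 30 degrees, while strategy B needs 10 degrees per frame.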

• Our 3D labelling approach is comparable in accuracy to Sigal et al.'s (2010) automatic baseline algorithm
• Manual labelling is laborious. Efficiency improvements:
  • Inverse Kinematics
  • Pose prediction
• Alternative interfaces
  • Sketch-based control of skeleton pose
  • Use of a 3D depth camera (Kinect) for pose creation
• Next Step: Data capture of a real performance
