Authoring Directed Gaze for Full-Body Motion Capture


Authoring Directed Gaze for Full-Body Motion Capture Bart Broekman (5657679); Lex de Kogel (5655331)

Overview: Add directed gaze movements to motion capture data. Works automatically but also provides control for manual editing. The foundation is a representation of gaze as a sequence of gaze shifts and fixations. The sequence is then translated into eye movements. [1 minute]
This paper presents an approach that adds directed gaze to characters animated with motion capture data. The approach infers plausible directed gaze automatically, lets an animator edit that gaze by hand, and layers the synthetic gaze movements on top of the original motion. Its foundation is a representation of gaze as a sequence of gaze shifts and fixations. The gaze synthesis component translates that sequence of shifts and fixations into coordinated movements of the eyes.

Problem: Directed gaze is the intentional movement of a character’s line of sight toward targets in space. Gaze for performance-captured motion typically needs to be animated by hand. The goal is to make this process faster and easier. [Clear problem statement]
Directed gaze is when a character’s line of sight moves toward a target in the space around it. Gaze is an important component of good character animation: it can, for instance, signal where a character’s focus is and convey part of its personality. With motion capture, the gaze almost always needs to be edited afterwards. Either the eye movements are not recorded at all, or, when the set-up does allow eye tracking, the virtual scene the character is placed in usually differs somewhat from the space where the motion was captured, so the recorded gaze is off and an animator has to correct it. This system tries to compute a gaze direction automatically from data that is already present, such as the directions the torso and the head are facing. The representation it uses also makes editing the gaze much easier, which will be explained later. The most important part of the approach is the gaze inference, which will also be explained later.

Related work and background information: Gaze Synthesis. Focus on saccades. Focus on coordinated movements towards targets: data-driven; procedural.
Some papers focus on saccades, the rapid eye movements in people’s gaze, while others focus on coordinated movement of the body (i.e. the head, the torso and the eyes) toward a target. Models of coordinated movement come in two kinds: data-driven models, used for instance by Lance et al., and procedural models inspired by neurophysiological observations. This paper extends a model of the latter kind, specifically the one by Pejsa et al., because it is designed to produce biologically plausible movements.

Related work and background information: Gaze Inference. Decide when and where the character should look and generate the corresponding gaze behavior. Influenced by two mechanisms: spontaneous bottom-up attention (based on what’s interesting in the scene) and deliberate top-down attention (based on the character’s goals). Much prior work on bottom-up attention. This paper focuses on an editable approach.
In this context, gaze inference refers to methods that determine when and where a character should look and then generate the corresponding gaze behaviour. Prior research has shown that gaze is influenced by two mechanisms. The first is spontaneous bottom-up attention: people look at whatever is most salient in the scene, such as brightly coloured objects, things that move, or regions of high contrast. The second is deliberate top-down attention: the character looks at what matters most to it, for instance the person it is talking to. Most prior research has addressed bottom-up attention. This paper is closer to the top-down models, since it infers the character’s gaze from the intent expressed in its movements rather than from distracting things in the scene. The paper does not try to generate new types of gaze behaviour; its goal is an editable specification of the gaze.

Related work and background information: Motion Synthesis and Editing. Related to automatic methods that fill in missing movements in motion capture data. First to consider eye gaze.
This paper is mainly about adding and editing gaze, so it is closely related to methods that automatically add missing movements to motion capture data. It is the first such method to consider eye gaze.

Gaze Inference: When is someone looking where? A gaze instance is defined by: the gaze shift start frame, fixation start frame, and fixation end frame (fs, fx and fe); the target T; the alignment parameters of the head and torso. (3.0)
As Bart just said, gaze inference answers the question of when someone is looking where. The authors represent the inferred gaze as a sequence of gaze instances. A gaze instance is a tuple G = (fs, fx, fe, T, αH, αT), where fs, fx, and fe are the gaze shift start frame, fixation start frame, and fixation end frame, respectively, and T is the gaze target. αH and αT are alignment parameters, which specify the head and torso posture relative to the target. Between gaze instances there may be gaps in which the gaze is unconstrained.
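To make the definition concrete, here is a minimal Python sketch of a gaze instance and of the unconstrained gaps between instances. The class name, the field names, the use of a plain string for the target T, and the helper `unconstrained_gaps` are all hypothetical choices for illustration; the paper's own data structures are not shown in the talk.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GazeInstance:
    """Hypothetical container for G = (fs, fx, fe, T, alpha_H, alpha_T)."""
    f_s: int        # gaze shift start frame
    f_x: int        # fixation start frame
    f_e: int        # fixation end frame
    target: str     # gaze target T (a plain name here, as an assumption)
    alpha_h: float  # head alignment parameter
    alpha_t: float  # torso alignment parameter

    def __post_init__(self):
        # The shift must start before the fixation, which must start before it ends.
        if not (self.f_s <= self.f_x <= self.f_e):
            raise ValueError("frames must satisfy f_s <= f_x <= f_e")


def unconstrained_gaps(instances: List[GazeInstance], n_frames: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) frame ranges not covered by any gaze instance."""
    gaps = []
    cursor = 0
    for g in sorted(instances, key=lambda g: g.f_s):
        if g.f_s > cursor:
            gaps.append((cursor, g.f_s - 1))
        cursor = max(cursor, g.f_e + 1)
    if cursor < n_frames:
        gaps.append((cursor, n_frames - 1))
    return gaps


if __name__ == "__main__":
    timeline = [
        GazeInstance(10, 20, 40, "cup", 1.0, 0.0),
        GazeInstance(60, 65, 80, "friend", 0.5, 0.2),
    ]
    print(unconstrained_gaps(timeline, 100))
```

In this sketch, every frame that falls inside one of the returned ranges would simply keep the original motion capture pose.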

Gaze Inference: Gaze Instance Inference (Gaze Event Discovery; Motion Interval Classification; Gaze Instance Generation). Gaze Target Inference. Computing Alignment Parameters. (3.1)
The first problem is to find when gaze instances happen. In a human gaze shift, the eyes, head, and torso accelerate toward the gaze target, reach their peak velocities at the midpoint of the gaze shift, and decelerate as they align with the target, at which point the fixation begins. We detect gaze shifts by searching for this pattern in the body motion using a three-step approach. First, we search for clusters of maxima in the angular acceleration signals of the head and torso joints, which correspond to significant gaze events (gaze shift beginnings and ends). Second, we classify the motion intervals between those events as gaze shifts or fixations. Third, for each adjacent gaze shift-fixation pair, we generate a gaze instance.
Gaze Event Discovery: [Explain in plain words what you see in the picture.] Each pair of adjacent maxima defines a motion interval Ik = (fs,k, fe,k), which could be either a gaze shift or a fixation.
Motion Interval Classification: [Insert an intuitive explanation here; formulas and the like are not easy to read.] Once we have found a motion interval, we have to classify it as either a gaze shift or a fixation. To determine whether a motion interval Ik is a gaze shift, we use the fact that gaze shifts are characterized by peaks in joint velocity, and we construct another probabilistic measure (not pictured here). Basically, we compare the velocities at the start and end frames with the minimum and maximum velocities in the interval. If the start and end velocities are far enough below the maximum and close enough to the minimum, we classify the interval as a gaze shift; otherwise it is a fixation.
Gaze Instance Generation: Finally, we pair up the gaze shifts and fixations. Each gaze shift followed by a fixation is saved as one gaze shift-fixation pair, in which the fixation starts on the same frame the shift ends.
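The three steps can be sketched in a few lines of Python. This is only an illustration under strong assumptions: it works on a single precomputed acceleration-magnitude signal and a single velocity signal rather than on clusters of maxima across several joints, and `is_gaze_shift` is a deterministic threshold rule standing in for the paper's probabilistic measure. All function names and the parameters `threshold`, `ratio` and `min_peak` are hypothetical.

```python
def local_maxima(signal, threshold):
    """Step 1: indices of local maxima in `signal` that exceed `threshold`."""
    return [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] >= signal[i - 1]
        and signal[i] > signal[i + 1]
    ]


def motion_intervals(events):
    """Each pair of adjacent gaze events defines one motion interval (start, end)."""
    return list(zip(events[:-1], events[1:]))


def is_gaze_shift(velocity, interval, ratio=0.5, min_peak=2.0):
    """Step 2 (stand-in rule): a gaze shift has a clear velocity peak inside the
    interval, with start and end velocities close to the interval minimum."""
    start, end = interval
    segment = velocity[start:end + 1]
    v_max, v_min = max(segment), min(segment)
    if v_max < min_peak or v_max == v_min:
        return False  # no velocity peak at all: treat as a fixation
    edge = max(velocity[start], velocity[end])
    return (edge - v_min) <= ratio * (v_max - v_min)


def gaze_instances(velocity, events):
    """Step 3: every gaze shift directly followed by a fixation becomes one
    (f_s, f_x, f_e) triple; the fixation starts on the frame the shift ends."""
    labelled = [(iv, is_gaze_shift(velocity, iv)) for iv in motion_intervals(events)]
    triples = []
    for (shift_iv, a_is_shift), (fix_iv, b_is_shift) in zip(labelled[:-1], labelled[1:]):
        if a_is_shift and not b_is_shift:
            triples.append((shift_iv[0], fix_iv[0], fix_iv[1]))
    return triples


if __name__ == "__main__":
    accel = [0, 0, 5, 0, 0, 0, 6, 0, 0, 0, 0, 4, 0]
    speed = [0, 0, 1, 5, 10, 5, 1, 1, 1, 1, 1, 1, 1]
    events = local_maxima(accel, threshold=1.0)
    print(events, gaze_instances(speed, events))
```

In the toy signals, the velocity peaks between the first two events and stays flat afterwards, so the first interval is labelled a gaze shift and the second a fixation.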

Gaze Inference: Gaze Instance Inference. Gaze Target Inference: directional term (Pd); importance term (Pi); hand contact term (Ph). Computing Alignment Parameters. (3.2)
Now we know when gaze shifts and fixations happen, but we still have to figure out what the character is looking at. For this we need a simplified 3D model of the scene. The gaze target inference method is based on three heuristics. First, the character is more likely to be looking at a point that lies along the movement direction of its head. If the character turns right, it is unlikely that the target lies to the left, up, or down. Second, the character is more likely to look at objects and characters that are important to the story or task. We let the animator manually label scene objects as important. Finally, the character is more likely to gaze at objects just before touching them or picking them up. Of course, the target must be within the character’s view. We use a probabilistic approach for gaze target inference: we take a 2D projection of what the character can see at the fixation start frame and place the gaze target at the global maximum of Pt. As the formula for Pt shows, it combines the three terms Pd, Pi and Ph with their weights. I will briefly explain each term with these images. Image 1 shows the scene from the character’s eyes. Image 2 shows Xs, where the character started the gaze shift, and Xx, where we are now; we follow the line through them all the way to the last point the eyes can physically reach. The probability is high on that line and falls off linearly with distance from it. Image 3 shows which objects the animators labeled as important. Image 4 shows which item is about to be picked up; the strength of this signal depends on how long it will take until the object is picked up. Image 5 combines all of these terms with the given weights.
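As a rough illustration of how the three terms might be combined, the Python sketch below scores a handful of candidate targets instead of a full 2D probability image. The normalized weighted sum in `target_probability` is an assumption, since the exact formula for Pt is not reproduced in this talk; the linear falloff in `directional_term` follows the description of image 2, but the `falloff` distance and all function names are hypothetical.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest distance from 2D point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def directional_term(p, x_s, x_limit, falloff):
    """P_d: 1 on the line from the gaze shift start x_s to the furthest point the
    eyes can reach (x_limit), dropping linearly to 0 at distance `falloff`."""
    return max(0.0, 1.0 - distance_to_segment(p, x_s, x_limit) / falloff)


def target_probability(p_d, p_i, p_h, w_d=1.0, w_i=1.0, w_h=1.0):
    """P_t as a normalized weighted sum of the three terms (assumed form)."""
    return (w_d * p_d + w_i * p_i + w_h * p_h) / (w_d + w_i + w_h)


def infer_target(candidates, weights=(1.0, 1.0, 1.0)):
    """Pick the candidate with the highest P_t.
    `candidates` maps a target name to its (p_d, p_i, p_h) values."""
    return max(candidates, key=lambda name: target_probability(*candidates[name], *weights))


if __name__ == "__main__":
    candidates = {
        "cup": (0.9, 1.0, 0.8),   # near the head's path, important, about to be grabbed
        "door": (1.0, 0.0, 0.0),  # exactly on the path, but unimportant
        "lamp": (0.2, 0.0, 0.0),
    }
    print(infer_target(candidates))
    print(infer_target(candidates, weights=(1.0, 0.0, 0.0)))
```

With equal weights the important, soon-to-be-touched object wins; with only the directional term switched on, the object lying exactly on the head's movement path wins instead.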

Gaze Inference: Gaze Instance Inference. Gaze Target Inference. Computing Alignment Parameters: separate parameters for head and torso; is someone looking with a high or low level of attention?; editable by animators; by default set to the posture in the motion capture.
[Just explain what it is; possibly extend with a technical explanation if needed to fill the time.]

Gaze Inference Evaluation: The purpose was to detect key gaze shifts in the original motion and to detect which target is being looked at. Several scenes were tested for accuracy, with ground truth coming from eye tracking and human annotation.
The effectiveness of the gaze inference was evaluated using motion capture, eye tracking, and human annotation data as ground truth. The table shows the percentage of ground-truth gaze shifts that had a match in the inference output. Some scenes did a bit better than others; the ChatWithFriend scene, for example, did not do so well. Most of the gaze shifts in that scene are conversational gaze aversions involving subtle head movements, which may be difficult to detect with this method. [Maybe add something about the quality of target detection, but the paper does not say much of interest about it.]

Gaze Synthesis: Synthesis of Gaze Kinematics; Gaze Motion Blending. Important parameters: peak velocity (Vmax); head and torso latency (τ). Moving target: supply the gaze controller with the relative target translation.
Given a gaze behavior specified as a sequence of instances, the gaze synthesis component produces a gaze motion and adds it to the character’s body motion. To synthesize plausible gaze shift movements toward each target, the authors created a gaze controller. The paper does not give its exact implementation, so we will not go into much detail, but we will still discuss its main properties and required parameters. The gaze controller generates a rotational motion of the eyes, head, and torso joints toward a gaze target T. The kinematics of eye, head, and torso movements are described by their peak velocities. Peak velocities depend linearly on gaze shift amplitude: the further the joints need to rotate to reach the target, the faster they move. [Explain formula here: a, b and v are constants per joint; D is the rotational amplitude needed to align the joint with the target.] There is also a latency between the movements of the head, torso and eyes. Having computed the required kinematic parameters, the controller synthesizes the gaze shift as rotational movements from q_s* to q_x*. To deal properly with moving targets without getting odd results, it is important to look not only at where the target is now but also at where it is going to be relative to the character’s root. We will not go into how this works exactly, because it was not fully explained in the paper.
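The linear dependence of peak velocity on amplitude can be illustrated with a one-line function. The exact formula is not reproduced in this talk, so the form below, (a * D + b) * v, is only an assumption consistent with the note that a, b and v are per-joint constants and D is the rotational amplitude; the numeric values in the usage example are made up.

```python
def peak_velocity(amplitude_deg, a, b, v):
    """Assumed linear model: peak velocity grows with the rotational amplitude D.
    a, b and v are per-joint constants (hypothetical values in the example)."""
    return (a * amplitude_deg + b) * v


if __name__ == "__main__":
    # Made-up constants for a single joint, for illustration only.
    a, b, v = 0.02, 0.4, 100.0
    for amplitude in (15.0, 30.0, 60.0):
        print(amplitude, peak_velocity(amplitude, a, b, v))
```

Whatever the true constants are, the point of the model is visible here: doubling the amplitude raises the peak velocity, so larger gaze shifts do not take proportionally longer.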

Gaze Synthesis: Synthesis of Gaze Kinematics; Gaze Motion Blending.
At each frame f, the gaze controller outputs a head and torso posture as a set of joint orientations. The final pose is a linear interpolation between the original motion’s orientation and the controller’s preferred orientation. The interpolation weight has to preserve the original motion, including its subtleties, as much as possible, and it has to ensure a smooth transition from constrained to unconstrained gaze. A lower weight keeps the pose close to the original motion; a higher weight moves it closer to the controller’s output. The weight has two components: one lowers the weight when the difference between the two orientations is small, and the other lowers the weight at the edges of gaze instances to make the transitions smooth.
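A minimal Python sketch of such a blend weight is given below. It is an illustration under several assumptions: orientations are reduced to a single yaw angle in degrees, the two components are combined by multiplication, and each component uses a smoothstep ramp. The names and the parameters `ramp_frames` and `full_diff_deg` are hypothetical and do not come from the paper.

```python
def smoothstep(t):
    """Clamp t to [0, 1] and ease it in and out."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def blend_weight(frame, f_s, f_e, angle_diff_deg, ramp_frames=10, full_diff_deg=20.0):
    """Weight in [0, 1]: 0 keeps the original pose, 1 uses the controller's pose."""
    if frame < f_s or frame > f_e:
        return 0.0  # outside the gaze instance the gaze is unconstrained
    # Component 1: low when the controller's pose barely differs from the original.
    w_diff = smoothstep(angle_diff_deg / full_diff_deg)
    # Component 2: low near the edges of the gaze instance, for smooth transitions.
    w_edge = smoothstep(min(frame - f_s, f_e - frame) / ramp_frames)
    return w_diff * w_edge


def blend_yaw(original_deg, controller_deg, weight):
    """Linear interpolation between the original and the controller's yaw."""
    return (1.0 - weight) * original_deg + weight * controller_deg


if __name__ == "__main__":
    for frame in (0, 5, 50, 100):
        w = blend_weight(frame, f_s=0, f_e=100, angle_diff_deg=40.0)
        print(frame, w, blend_yaw(10.0, 50.0, w))
```

A production version would interpolate full joint rotations rather than a single angle; the scalar case is used here only to keep the two weight components easy to follow.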

Gaze Editing: An animator may want to change several things about a performance, such as correcting errors in the actor’s performance, changing where objects (targets) are, and adjusting gaze to express a different personality or intent.
These changes are rather difficult in a workflow based purely on the kinematics of the eyes, head and torso, but they can be done more easily with the methods used in this paper.

Gaze Editing: Editing Operations; Editing Tool. The editing operations are: (1) editing where the character is looking by moving gaze targets; (2) directly editing parameters of specific gaze instances; (3) editing the head and torso alignment; (4) changing the timing; (5) adding new gaze instances; (6) removing gaze instances.
I will go over the operations and briefly talk about how difficult they are to implement. Operations 1, 2 and 3 are fairly basic, since they only change the properties of existing gaze instances; a change here does not directly affect anything else. The others are more complicated, since they change the timeline and can interfere with other gaze instances. The biggest issue is that gaze instances are not allowed to overlap, so extending one gaze instance into another is a problem. Another issue is that a gaze instance cannot be too short, since it takes time to perform a gaze shift. In the authors’ version, an instance has to be at least 0.3 seconds long; otherwise it gets deleted. When a new instance is created, the instances it overlaps are trimmed or removed. If a gaze instance is removed, that stretch of the timeline simply becomes unconstrained gaze.
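The overlap and minimum-length rules for adding and removing instances can be sketched as follows. This is an illustration, not the plugin's code: `Span`, `add_instance` and `remove_instance` are hypothetical names, the 30 fps default is an assumption, and keeping both remnants when a new instance lands in the middle of an old one is a choice made here for simplicity.

```python
from dataclasses import dataclass, replace
from typing import List

MIN_DURATION_S = 0.3  # minimum gaze instance length mentioned in the talk


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for a gaze instance: inclusive frame range plus target."""
    start: int
    end: int
    target: str


def add_instance(timeline: List[Span], new: Span, fps: float = 30.0) -> List[Span]:
    """Insert `new`, trimming overlapped instances and dropping trimmed remnants
    that end up shorter than the minimum duration."""
    min_frames = MIN_DURATION_S * fps
    if new.end - new.start + 1 < min_frames:
        raise ValueError("new gaze instance is shorter than the minimum duration")
    result = []
    for old in timeline:
        if old.end < new.start or old.start > new.end:
            result.append(old)  # no overlap: keep untouched
            continue
        remnants = []
        if old.start < new.start:
            remnants.append(replace(old, end=new.start - 1))
        if old.end > new.end:
            remnants.append(replace(old, start=new.end + 1))
        result.extend(r for r in remnants if r.end - r.start + 1 >= min_frames)
    result.append(new)
    return sorted(result, key=lambda s: s.start)


def remove_instance(timeline: List[Span], span: Span) -> List[Span]:
    """Removing an instance leaves its frames as unconstrained gaze."""
    return [s for s in timeline if s != span]


if __name__ == "__main__":
    timeline = [Span(0, 59, "cup"), Span(60, 100, "friend")]
    edited = add_instance(timeline, Span(55, 95, "door"))
    print(edited)
    print(remove_instance(edited, Span(55, 95, "door")))
```

In the usage example the first instance is trimmed to make room for the new one, while the five-frame remnant of the second instance is too short to hold a gaze shift and is dropped.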

Gaze Editing: Editing Operations; Editing Tool.
The tools were made as a Unity plugin.

Example 1 (WalkCones) demonstrates the effects of adding inferred eye movements to a walking motion. The eye movements make the character more lifelike and anticipate changes in trajectory.

Example 2 (DrinkSoda) shows a story edit where a new character is introduced and the other two characters are made to gaze at her.

In Example 3 (MakeSandwich), the character making a sandwich is made to look at the camera after adding each ingredient to increase the viewer’s engagement. In Example 4 (WaitForBus), we prevented the character from checking his watch by removing the corresponding gaze instance and constraining his arm.

Examples 5 and 6 show how we can use gaze edits to change the character’s personality. Example 5 (ChatWithFriend) contrasts two versions of a first-person conversational scene, one where the character unsettlingly stares at the camera and another where it looks away uncomfortably. In Example 6 (HandShake), we made the character appear standoffish by having him look away during the handshake.

Evaluation: Authoring Effort. Adding gaze to 7 scenes: animators, 25 minutes and 86 keys per minute; system, 1.5 minutes. Editing gaze in 4 scenes (see table).
The goal of the study was to find a way to reduce the effort required to produce high-quality gaze animation by modeling the gaze behaviour as a sequence of gaze instances. The first thing to evaluate is therefore whether the system does indeed reduce the effort required of the animator. This was tested in two ways. First, animators were asked to add eye movements to 7 scenes. This took them 25 minutes and 86 key presses per minute on average. The same scenes were then put through the system, which took 1.5 minutes on average and 0 key presses per minute. Second, the animators were asked to edit the gaze in 4 scenes, where each scene had a specific thing that needed to change. As can be seen in the table, the paper’s approach was significantly faster, except for one scene where it was as fast as the traditional way.

Evaluation: Animation quality; 5 different scenes. Conditions: no gaze; recorded gaze (eye tracking); hand-authored gaze; synthesized gaze.
To evaluate the quality of the produced animation, several versions of each animation were made: no gaze, recorded gaze using eye tracking, hand-authored gaze, and synthesized gaze. The participants were shown two animations at a time, one with synthesized gaze and one from another condition, and were asked which one they preferred. They had to base their decision on three subjective criteria: animator competence, realism, and communicative clarity. The results are shown in the table on the right. Compared to no gaze, the synthesized gaze is significantly better, which was expected. The expectation was that the synthesized gaze would also be significantly preferred over recorded gaze and would not be inferior to the hand-authored gaze. All in all, however, the system seems to incur a small loss in quality, but the effort was a lot smaller than for hand-authored gaze, which makes it a better approach than doing everything by hand.

Future work: Extensions. Other types of gaze movements. Improving accuracy and sensitivity. Transferring to another platform.
The authors of the paper discuss a few possible extensions to their current model. First, the model only covers directed gaze, but other types of gaze movement could be simulated as well, such as saccades or smooth pursuit. Eye blinks are already included: they are generated probabilistically in a post-processing phase. This could be explored further to see how blinks can be generated more robustly. The second proposal is to improve the accuracy and sensitivity of the model. The main way to do this is by improving the probabilistic formulations used for the gaze inference. [We could say a bit more here if needed.] The third proposal is to transfer the system to another platform. It is currently built in Unity, but artists more commonly use other programs such as Maya or MotionBuilder.

Questions Questions from the audience & discussion