Data-Driven Approach to Synthesizing Facial Animation Using Motion Capture Ioannis Fermanis 6511325 Liu Zhaopeng 6516637
Traditional process
Animation production typically goes through four stages:
- Acting: the animator acts out the expressions in front of a mirror.
- Blocking: key body and face poses are created.
- Refining pass: overlapping actions are added, movements are exaggerated, and the timing and spacing of motions are refined.
- Final polish: animation curves are cleaned and small details are added.
The paper proposes a method that produces animation at the “refining” stage.
Purpose
Existing approaches:
- Hand animation: stylized and high quality, but costly in both time and money.
- Motion capture: realistic and fast, but not stylized and with too many keyframes.
Advantages of the paper's method:
- Stylized, high-quality animation on a limited budget and schedule.
- Fewer keyframes than the raw motion-capture data.
Motion Curve
- Each curve corresponds to one facial feature or body part (such as the head or eyes).
- It records that part's x, y, z translation or rotation along the time axis.
Pattern Match Approach
Input: captured motion.
- Motion curves.
- Realistic.
- One character.
Database: hand-animated motion.
- Example curves.
- Highly stylized.
Steps:
1. Segmentation
2. Pattern searching
3. Continuous adjustment (curve warping)
Output: the synthesized animation.
Pattern Match Approach: 1. Segmentation
Example curves:
- Divided into chunks of progressively smaller sizes by top-down recursive partition.
- The chunks serve as patterns to be found in the motion-capture curve.
Each keyframe of a curve is stored as:
struct Keyframe {
    time;
    translation/rotation value;
    type of tangent;
    tangent’s in/out angles;
}
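The segmentation step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slide only says the example curves are split top-down into progressively smaller chunks, so splitting at the midpoint, the `min_size` stopping rule, and the field names of `Keyframe` are all assumptions made here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Keyframe:
    # Fields mirror the Keyframe struct on the slide; names and types are assumed.
    time: float
    value: float        # translation or rotation value
    tangent_type: str   # hypothetical label, e.g. "spline"
    tangent_in: float   # in angle of the tangent
    tangent_out: float  # out angle of the tangent


def partition(curve: List[Keyframe], min_size: int = 2) -> List[List[Keyframe]]:
    """Return the curve plus all chunks obtained by recursively halving it.

    Halving at the midpoint is an assumption; recursion stops once a half
    would contain fewer than min_size keyframes.
    """
    chunks = [curve]
    mid = len(curve) // 2
    if mid >= min_size:
        chunks += partition(curve[:mid], min_size)
        chunks += partition(curve[mid:], min_size)
    return chunks
```

With an 8-keyframe example curve and `min_size=2`, this yields one chunk of 8, two of 4, and four of 2 keyframes, each of which can serve as a pattern in the search step.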
Pattern Match Approach: 2. Pattern searching
- Measure the difference between the example curves and the captured motion curves.
- Find the best match and use a threshold to decide whether to accept it.
  - Large threshold: more stylized
  - Small threshold: more realistic
- Sort the example curves by their length.
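A thresholded search of this kind might look like the sketch below. Several details are assumptions, since the slide does not specify them: curves are reduced to plain lists of values, the difference is a sum of squared errors, longer examples are tried first, and matches are not allowed to overlap. The names `sse` and `find_matches` are hypothetical.

```python
from typing import List, Sequence, Tuple


def sse(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of squared differences between two equally long value sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def find_matches(
    motion: List[float], examples: List[List[float]], threshold: float
) -> List[Tuple[int, int, float]]:
    """Return (start, example_index, distance) for each accepted match.

    For every example (longest first, an assumption) the best-scoring free
    window of the motion curve is kept if its distance is within threshold.
    """
    order = sorted(range(len(examples)), key=lambda i: len(examples[i]), reverse=True)
    taken = [False] * len(motion)  # marks samples already claimed by a match
    matches = []
    for idx in order:
        ex = examples[idx]
        n = len(ex)
        best = None
        for r in range(len(motion) - n + 1):
            if any(taken[r:r + n]):
                continue  # window overlaps an earlier match
            d = sse(motion[r:r + n], ex)
            if d <= threshold and (best is None or d < best[2]):
                best = (r, idx, d)
        if best is not None:
            matches.append(best)
            for k in range(best[0], best[0] + n):
                taken[k] = True
    return sorted(matches)
```

Raising `threshold` accepts looser matches, so more of the captured curve is replaced by stylized example patterns; lowering it leaves more of the realistic capture untouched.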
Pattern Match Approach: 2. Pattern searching
Methods for measuring the difference: distance measurement, SAX, HMM-Viterbi algorithm.
Distance measurement uses SSE (Sum of Squared Errors):
- Summed over all keyframes in the segment.
- Compares the subsequence of the captured motion curve at time r with an example curve segment.
- The simplest and most common approach.
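The SSE formula itself did not survive on the slide. A plausible reconstruction from the labels on it is given below; the notation is an assumption made here, with m the captured motion curve, e an example curve segment of n keyframes, and r the time at which the compared subsequence starts.

```latex
\mathrm{SSE}(r) = \sum_{i=1}^{n} \bigl( m_{r+i-1} - e_i \bigr)^2
```

For instance, comparing the subsequence (1, 3, 2) with the example segment (1, 2, 4) gives 0 + 1 + 4 = 5. A smaller value means a closer match.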
Pattern Match Approach: 2. Pattern searching
SAX (Symbolic Aggregate Approximation):
- Converts the curve into symbols.
- Reduces dimensionality.
- Allows faster comparison.
HMM-Viterbi algorithm:
- Compares two curves based on both the quality of individual feature matches and the smoothness of the overall mapping across all points.
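To make the SAX idea concrete, here is a minimal sketch of the standard SAX pipeline (z-normalize, average over equal-width frames, map each average to a letter using Gaussian breakpoints). It is not taken from the paper: the function name `sax`, the four-letter alphabet, and the handling of leftover samples are assumptions.

```python
from statistics import NormalDist, mean, pstdev
from typing import List


def sax(values: List[float], n_segments: int, alphabet: str = "abcd") -> str:
    """Convert a curve (list of values) into a short symbolic word."""
    # 1. Z-normalize so curves with different offsets and scales are comparable.
    mu, sigma = mean(values), pstdev(values)
    z = [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]

    # 2. Piecewise aggregate approximation: mean of each equal-width frame.
    #    Samples beyond n_segments * size are ignored (a simplification).
    size = len(z) // n_segments
    paa = [mean(z[i * size:(i + 1) * size]) for i in range(n_segments)]

    # 3. Breakpoints split the standard normal into equally probable regions;
    #    each frame mean is replaced by the letter of the region it falls in.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]
    return "".join(alphabet[sum(p > b for b in breakpoints)] for p in paa)
```

A steadily rising curve such as `[1, 1, 2, 2, 3, 3, 4, 4]` with four segments becomes the word `"abcd"`, so two curves can be compared by comparing a handful of letters instead of every keyframe.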
Pattern Match Approach: 3. Curve warping
- Once the best match is found, the motion curve is warped by a warp function with a given parameter θ.
- The warp function must satisfy:
  - The warped motion curve matches the example curve as closely as possible.
  - A continuity constraint.
  - A smoothness constraint.
Pattern Match Approach: 3. Curve warping
Simplified approach:
- Compute the difference between the y-axis values of the two curves at each point.
- This deviation defines how much the motion curve has to be warped.
- Delete all points that are not part of a matching pattern.
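The simplified approach can be sketched in a few lines. This is an illustration under assumptions, not the authors' code: curves are plain lists of y values, a match is a (start index, example index) pair, and the `strength` parameter, which scales how far each point moves toward the example, is hypothetical.

```python
from typing import Dict, List, Tuple


def warp_segment(motion: List[float], example: List[float], strength: float = 1.0) -> List[float]:
    """Shift each motion value toward the example by strength * deviation."""
    return [m + strength * (e - m) for m, e in zip(motion, example)]


def apply_matches(
    motion: List[float],
    matches: List[Tuple[int, int]],
    examples: List[List[float]],
    strength: float = 1.0,
) -> Dict[int, float]:
    """Return {sample index: warped value} for matched points only.

    Points outside every matched pattern are dropped, which is what
    reduces the number of keyframes.
    """
    kept: Dict[int, float] = {}
    for start, idx in matches:
        example = examples[idx]
        segment = motion[start:start + len(example)]
        for offset, value in enumerate(warp_segment(segment, example, strength)):
            kept[start + offset] = value
    return dict(sorted(kept.items()))
```

With `strength=1.0` the matched points land exactly on the example curve; smaller values keep more of the captured motion.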
The warped motion curve may include artifacts.
Evaluation
- The authors hired a 3D animator to create a high-quality animation database (it cost €6000 for 1 minute of animation).
- Motion capture was used to create a dataset of 2 emotions (happy and angry).
- The animation database was searched for matches using the 3 methods described earlier (distance measure, SAX, and HMM).
- The number of keyframes was then drastically reduced, producing the synthesized animations used in the experiments.
Evaluation
Perceptual tests: 2 experiments.
Experiment 1 consists of 3 blocks:
- Block 1: 10 animations without sound (5 animation types x 2 examples, anger)
- Block 2: 10 animations without sound (5 animation types x 2 examples, happiness)
- Block 3: 20 animations with sound (5 animation types x 2 examples x 2 emotions)
The 5 animation types are: artist-created, motion capture, distance measure, SAX, and HMM.
Evaluation
After the first 2 blocks, participants were asked whether the animations were:
- Expressive
- Appealing
- Cartoony
After the 3rd block, participants were asked to rate how well the facial expression conveyed the message (ignoring the lip movement).
Results of experiment 1
- Artist-created animation was still rated best across all tests.
- HMM was the best of the synthesis methods and was occasionally rated equal to artist-created animation.
- Anger was rated more expressive than happiness.
- Synthesized happy animations were rated the lowest.
Evaluation
Experiment 2: the same as experiment 1, but with refined animations.
Refined animations:
- 1 angry and 1 happy animation synthesized using distance measure
- 2 angry and 1 happy animation synthesized using HMM
Results of experiment 2
- Refined animations scored better than unrefined ones on how expressive and cartoony they were.
https://youtu.be/PQP9965jcCk
Evaluation
Overall results:
- No significant difference between the pattern-matching methods.
- Good results in keyframe reduction.
- The unrefined experiment shows good results for angry animations.
- The refined experiment shows good results for happy animations.
Future Work
- The method could be extended to handle curves jointly, thus removing artifacts.
- A more sophisticated curve-segment dictionary could make the matching process faster without increasing memory requirements.
- The work could be extended by using machine learning on more data and a larger range of emotions to test generalizability.
- With more publicly available data (to create a database), the methods and results should improve.
Evaluation of the paper
Positives:
- The methodology was well structured.
- Overall a good idea: combining motion capture and hand animation gives relatively good results.
Negatives:
- The structure and explanation of the evaluation and its results were not as good.
- Not all graphs and information are shown in the paper.
- The paper repeats itself unnecessarily.
- A larger sample size would provide more reliable data (only 18 participants).
Questions?
Discussion Was adding sound in the experiments a good idea?