Easy Generation of Facial Animation Using Motion Graphs. J. Serra, O. Cetinaslan, S. Ravikumar, V. Orvalho and D. Cosker. Presented by Andreas Kelemen, Costas Mavridis.
Introduction: Useful for secondary characters. Most research has focused on body animation; facial animation is seldom studied. This method uses motion graphs to achieve unique, on-the-fly motion synthesis, controlled via a small number of parameters. The user specifies thresholds that control the compression ratio.
Related Work: Procedural animation vs. performance-driven and key-frame techniques. Procedural animation is divided into three classes: constraint- or rule-based, statistical or knowledge-based, and behavioural-based. Motion graphs fit within statistical procedural animation and are widely used for body animation. Inspired by the motion graphs of Kovar et al.
This Paper - Method: Two steps. (1) Create the region graphs by analyzing the DB. (2) Traverse these structures to synthesize the motion, using Dijkstra’s algorithm to minimize the accumulated similarity value (used as an edge cost) between source and target node.
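The traversal step can be illustrated with a minimal Dijkstra sketch. The graph layout (a dict of dicts), the node names, and the edge values below are hypothetical; the only assumption taken from the slides is that each edge carries a non-negative similarity value that acts as a cost.

```python
import heapq


def dijkstra_path(graph, source, target):
    """Return (total_cost, path) for the cheapest path from source to target.

    graph maps node -> {neighbour: edge_cost}. The edge cost is assumed to be
    a non-negative similarity value where lower means "more alike".
    """
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                previous[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    if target not in best:
        return float("inf"), []
    # Walk the predecessor links back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    path.reverse()
    return best[target], path


# Hypothetical region graph for the mouth, with made-up similarity values.
mouth_graph = {
    "neutral": {"a": 0.2, "b": 0.5},
    "a": {"smile_peak": 0.6},
    "b": {"smile_peak": 0.1},
    "smile_peak": {},
}
cost, path = dijkstra_path(mouth_graph, "neutral", "smile_peak")
```

Since each facial region has its own graph, a search like this would be run once per region with that region's source and target nodes.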
This Paper - Method: As a result: compact DB representation; faster generation of animations; authoring of animation with a low number of input parameters. A motion graph approach specifically tuned for facial animation. A novel approach to ease the choice of the thresholds. A new method for introducing coherent noise in animation.
Motion Graph Generation: Information contained in the graph. Each node contains the average displacement of the merged poses, which mitigates errors in alignment and the effects of facial proportions. Each edge has a similarity value. Also stored: the average number of consecutive merged nodes, and the number from neutral-to-peak and/or peak-to-peak expressions with the respective standard deviation.
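One way to picture the stored information is as a small data structure. This is only a sketch: the class names, field names, and the running-average merge are assumptions for illustration, not the paper's actual layout.

```python
from dataclasses import dataclass, field


@dataclass
class RegionNode:
    # Average (dx, dy) displacement per landmark over all merged poses.
    mean_displacement: list
    merged_count: int = 1
    # Average number of consecutive poses merged into this node.
    avg_consecutive_merged: float = 1.0
    is_peak: bool = False

    def merge(self, displacement):
        """Fold one more pose into the running average displacement."""
        n = self.merged_count
        self.mean_displacement = [
            ((mx * n + dx) / (n + 1), (my * n + dy) / (n + 1))
            for (mx, my), (dx, dy) in zip(self.mean_displacement, displacement)
        ]
        self.merged_count = n + 1


@dataclass
class RegionGraph:
    nodes: dict = field(default_factory=dict)  # node id -> RegionNode
    edges: dict = field(default_factory=dict)  # (src id, dst id) -> similarity


node = RegionNode([(0.0, 0.0), (2.0, 2.0)])
node.merge([(2.0, 0.0), (4.0, 2.0)])
graph = RegionGraph(nodes={"n0": node})
```

Averaging the merged poses is what lets a node smooth over small alignment errors from individual samples.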
Structure of Motion Training Data
Structure of Motion Training Data: Dynamic 2D/3D sparse DB, whose samples need to be aligned. Any sample can then be incorporated into a graph as long as it has sparse landmarks, has the same equivalent pose (Peq) as all other samples, and has the transitions between peak poses and a peak pose.
Structure of Motion Training Data: Pose alignment revolves around two steps: (1) all poses are rigidly aligned with their equivalent pose; (2) the first pose is aligned with the average of all samples (Procrustes analysis) to reduce the effect of different proportions.
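The rigid alignment step can be sketched for 2D landmarks with a closed-form least-squares rotation. The function names are hypothetical, and this version covers rotation and translation only; a full Procrustes analysis typically also normalizes scale, which is left out here.

```python
import math


def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def rigid_align_2d(pose, reference):
    """Rotate and translate `pose` onto `reference` (least squares, no scaling).

    Both arguments are lists of (x, y) landmarks in corresponding order.
    """
    cx, cy = centroid(pose)
    rx, ry = centroid(reference)
    dot = cross = 0.0
    for (x, y), (u, v) in zip(pose, reference):
        x, y, u, v = x - cx, y - cy, u - rx, v - ry
        dot += x * u + y * v
        cross += x * v - y * u
    # The angle maximising the overlap between the centred point sets.
    angle = math.atan2(cross, dot)
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * (x - cx) - s * (y - cy) + rx, s * (x - cx) + c * (y - cy) + ry)
        for x, y in pose
    ]


reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
# The same triangle rotated by 90 degrees and shifted by (5, 5).
pose = [(5.0, 5.0), (5.0, 6.0), (4.0, 5.0)]
aligned = rigid_align_2d(pose, reference)
```

For the second step, the reference would be the average shape over all samples rather than a single pose.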
Structure of Motion Training Data: Cohn–Kanade (CK and CK+) database. 68 landmarks reduced to 23. Poses: happiness, sadness, disgust, surprise, anger, fear, contempt and neutral.
Structure of Motion Training Data: Pre-processing to reduce error/jitter: a sigmoid fitting procedure, with least squares used to find the optimal parameters.
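The slides do not say how the least-squares problem is set up, so the following is a stand-in: a sketch that fits a unit-amplitude sigmoid by ordinary least squares on the logit of the data. The parameter names `k` (steepness) and `t0` (midpoint) and the normalisation of displacements to the range (0, 1) are assumptions; a nonlinear least-squares fit in the original space would be the more direct alternative.

```python
import math


def sigmoid(t, k, t0):
    return 1.0 / (1.0 + math.exp(-k * (t - t0)))


def fit_sigmoid(times, values, eps=1e-6):
    """Fit y = 1 / (1 + exp(-k * (t - t0))) via least squares on logit(y).

    Since logit(y) = k * t - k * t0, a straight-line fit recovers k and t0.
    Values are clipped to (eps, 1 - eps) so the logit stays finite.
    """
    zs = []
    for y in values:
        y = min(max(y, eps), 1.0 - eps)
        zs.append(math.log(y / (1.0 - y)))
    n = len(times)
    mean_t = sum(times) / n
    mean_z = sum(zs) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    cov = sum((t - mean_t) * (z - mean_z) for t, z in zip(times, zs))
    k = cov / var_t
    intercept = mean_z - k * mean_t
    t0 = -intercept / k
    return k, t0


# Synthetic neutral-to-peak displacement curve for one landmark coordinate.
times = list(range(10))
values = [sigmoid(t, 0.8, 4.5) for t in times]
k, t0 = fit_sigmoid(times, values)
smoothed = [sigmoid(t, k, t0) for t in times]
```

Replacing the raw track with the fitted curve removes frame-to-frame jitter while keeping the onset and steepness of the expression.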
Graph Creation
Similarity metric: based on the spatial location of the landmarks and their instantaneous velocity.
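A metric of this kind might look like the sketch below. The weighted sum, the `velocity_weight` parameter, and the finite-difference velocity are assumptions for illustration; the slides only state that location and instantaneous velocity are both used.

```python
import math


def finite_difference_velocity(prev_pose, next_pose, dt=1.0):
    """Approximate per-landmark velocity from two consecutive poses."""
    return [
        ((x1 - x0) / dt, (y1 - y0) / dt)
        for (x0, y0), (x1, y1) in zip(prev_pose, next_pose)
    ]


def mean_distance(points_a, points_b):
    """Mean Euclidean distance between corresponding 2D points."""
    total = sum(
        math.hypot(ax - bx, ay - by)
        for (ax, ay), (bx, by) in zip(points_a, points_b)
    )
    return total / len(points_a)


def pose_similarity(pos_a, vel_a, pos_b, vel_b, velocity_weight=0.5):
    """Lower values mean the two poses are more alike (0 for identical)."""
    return (
        (1.0 - velocity_weight) * mean_distance(pos_a, pos_b)
        + velocity_weight * mean_distance(vel_a, vel_b)
    )


pos_a = [(0.0, 0.0), (1.0, 0.0)]
pos_b = [(3.0, 4.0), (1.0, 0.0)]
still = [(0.0, 0.0), (0.0, 0.0)]
score = pose_similarity(pos_a, still, pos_b, still)
```

Including velocity keeps two poses that look alike but move in opposite directions from being merged into the same node.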
Optimizing for compression: The author specifies the desired compressions and tolerances, and thereby controls the trade-off between motion quality and flexibility of the graph. Compression is calculated differently for each stage. The method locally optimizes the thresholds for each stage/sequence.
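The idea of deriving a threshold from a desired compression and tolerance can be sketched with a bisection search. Everything here is an assumption for illustration: poses are reduced to one scalar per frame, merging is a greedy pass over consecutive frames, compression is defined as 1 minus nodes over frames, and compression is assumed to grow with the threshold.

```python
def merge_consecutive(values, threshold):
    """Greedily group consecutive frames that stay within `threshold`
    of the running mean of the current group."""
    groups = [[values[0]]]
    for v in values[1:]:
        mean = sum(groups[-1]) / len(groups[-1])
        if abs(v - mean) < threshold:
            groups[-1].append(v)
        else:
            groups.append([v])
    return groups


def compression(values, threshold):
    """Fraction of frames removed by merging (0 means no compression)."""
    return 1.0 - len(merge_consecutive(values, threshold)) / len(values)


def tune_threshold(values, target, tolerance, lo=0.0, hi=1.0, steps=40):
    """Bisect for a threshold whose compression lies within `tolerance`
    of `target`; returns None if no such threshold is found."""
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        c = compression(values, mid)
        if abs(c - target) <= tolerance:
            return mid
        if c < target:
            lo = mid
        else:
            hi = mid
    return None


sequence = [0.0, 0.05, 0.1, 0.5, 0.55, 1.0]
threshold = tune_threshold(sequence, target=0.5, tolerance=0.05)
```

Running the search per sequence, instead of sharing one global threshold, is what makes the optimization local.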
Motion Synthesis
Motion Synthesis: Choosing the nodes relevant to the desired facial behaviour. The user can directly choose the sources and destinations in all graphs, or can specify the label (emotion). The actual landmark movements still need to be extracted.
Motion Synthesis - Reconstructing motion from path: Because of compression, information is lost. Each node allows an approximation of the information lost when the poses were merged. Using the average number of consecutive merged nodes, we construct the average displacement velocity of the landmarks. The peak nodes' average duration is used as the sequence length; using the peaks' durations allows more realistic synthesis.
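A rough sketch of this reconstruction, under simplifying assumptions: each node on the path is reduced to a single scalar displacement plus its average number of merged frames, segments are filled by linear interpolation, and the result is resampled to a target length standing in for the peak node's average duration. The function names are hypothetical.

```python
def expand_path(nodes):
    """Expand a path into per-frame displacements.

    nodes is a list of (displacement, avg_merged_frames). Each segment between
    two nodes lasts about avg_merged_frames frames of the first node, so the
    per-frame step plays the role of an average velocity.
    """
    frames = []
    for (d0, n0), (d1, _) in zip(nodes, nodes[1:]):
        steps = max(1, round(n0))
        for i in range(steps):
            frames.append(d0 + (d1 - d0) * i / steps)
    frames.append(nodes[-1][0])
    return frames


def resample(frames, length):
    """Linearly resample a sequence to exactly `length` frames."""
    if length == 1:
        return [frames[0]]
    out = []
    for i in range(length):
        pos = i * (len(frames) - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, len(frames) - 1)
        out.append(frames[lo] + (frames[hi] - frames[lo]) * (pos - lo))
    return out


path_nodes = [(0.0, 2), (1.0, 2), (3.0, 1)]
frames = expand_path(path_nodes)
sequence = resample(frames, 9)
```

The piecewise-linear output is only a first approximation, which is why a smoothing pass is still needed afterwards.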
Motion Synthesis - Reconstructing motion from path: Normalize the displacements of each facial region, so that the motion as a whole does not become linear and keeps its core motion properties. Smooth the sequences using the Savitzky–Golay window-based filter and sigmoid fitting. This can approximate linear motions, such as blinking, removes drastic motions, and removes any remaining zigzag.
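To make the smoothing step concrete, here is a small Savitzky–Golay sketch using the standard 5-point quadratic coefficients (-3, 12, 17, 12, -3) / 35. The window size and the choice to leave the two samples at each end untouched are assumptions; the slides do not give the filter settings. In practice a library routine such as `scipy.signal.savgol_filter` would normally be used instead.

```python
SG5_COEFFS = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG5_NORM = 35.0


def savgol5(values):
    """Smooth with a 5-point quadratic Savitzky-Golay window.

    Each interior sample is replaced by the centre value of a least-squares
    quadratic fitted to its 5-sample neighbourhood. The first and last two
    samples are copied unchanged.
    """
    out = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        out[i] = sum(c * v for c, v in zip(SG5_COEFFS, window)) / SG5_NORM
    return out


zigzag = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
smoothed = savgol5(zigzag)

# A quadratic track passes through unchanged, so real motion is preserved.
parabola = [float(t * t) for t in range(8)]
unchanged = savgol5(parabola)
```

Unlike a plain moving average, the polynomial fit keeps the height of genuine peaks better while still flattening single-frame spikes.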
Motion Synthesis - Introducing variation: The method inherently introduces noise, as each facial region has its own motion graph and respective path. This gives more variety of emotions for crowds. Additional variation is applied to the sequence length and the chosen path, and relies on the standard deviation. The path variations build on top of Dijkstra’s path. Noise is controlled by the percentage of the original path that can be changed.
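The two kinds of variation can be sketched as follows. This is a loose illustration: the Gaussian length sampling, the `alternatives` table of interchangeable nodes, and the function names are all assumptions, and the sketch does not check that a swapped node is still connected to its neighbours in the graph.

```python
import random


def vary_length(mean_frames, std_frames, rng):
    """Sample a sequence length around the mean duration (at least 2 frames)."""
    return max(2, round(rng.gauss(mean_frames, std_frames)))


def vary_path(path, alternatives, max_changed_fraction, rng):
    """Swap some interior nodes of a shortest path for alternative nodes.

    At most max_changed_fraction of the path may change; the source and
    target nodes are always kept.
    """
    budget = int(max_changed_fraction * len(path))
    candidates = [i for i in range(1, len(path) - 1) if path[i] in alternatives]
    rng.shuffle(candidates)
    varied = list(path)
    for i in candidates[:budget]:
        varied[i] = rng.choice(alternatives[path[i]])
    return varied


rng = random.Random(7)
base_path = ["neutral", "a", "b", "c", "peak"]
alternatives = {"a": ["a2"], "b": ["b2"], "c": ["c2"]}
varied = vary_path(base_path, alternatives, 0.5, rng)
length = vary_length(30, 4, rng)
```

Seeding the generator per character would give each member of a crowd a slightly different but repeatable version of the same expression.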
Results
Results: Motion graphs were created and tested with 70 sequences from ∼20 subjects of the CK/CK+ database. A smaller DB with 6 × <Pneutral ... Ppeak expression ... Pneutral> and 3 × <Pneutral ... Ppeak expression 1 ... Ppeak expression n ... Pneutral>. Lower data compression gives a better representation of motion, but takes more time.
Results - Bad: Local minima occur when creating/using motion graphs. Motion is stiff and less expressive. Limited by the number of poses in the DB. The DB was not made for facial animation, so poses are subtle and animations are hard to recognize. Relies on well-defined peak expressions. More complex motions will require an extremely large graph. Different expressions are merged at high compression. No gaze or head movements.
Results - Good: Reduces DB size with minimum loss of information. Low error in landmark positions and velocity. Synthesized motion is similar to the original even with high compression. The data compression ratio is typically between the boundaries defined by user input. A structure with one region grows exponentially with the number of nodes of the graph. Smooth motion. Less input, no equipment.
Future Work: Gaze with complex emotions. More complex animations, e.g. talking. Better results with artistic help (better models, better results). Better database (the current one has only neutral-to-peak, but no peak-to-peak).
Questions
Discussion
Discussion: As is, can this method be used in movies and games, and are the results adequate for secondary characters? It offers a good alternative, but the method is not ready yet: no speech movement or gaze change, and too generic.
Discussion: Referring to future work on more complex animation, which technology could be used to improve results? How can this be cost-efficient and usable for better secondary characters? Speaking movement, although it would be difficult to find a decent DB for that. Gaze and head movement are important. A bigger range of emotions.