Coordinate-Invariant Methods for Motion Analysis and Synthesis Jehee Lee Dept. of Electrical Engineering and Computer Science Korea Advanced Institute of Science and Technology
Contents Issues in Motion Analysis and Synthesis Spatial Filtering for Motion Data Multiresolution Motion Analysis Applications In this talk, I will address a couple of general issues in motion signal processing, such as coordinate-invariance and time-invariance. I will explore these issues for two specific problems: motion filtering and multiresolution analysis. Then, I will demonstrate that our methods can be used in a variety of motion editing applications.
Character Animation Realistic motion data Motion capture technology Commercial libraries Producing animation from available motion clips requires specialized tools interactive editing, smoothing, enhancement, blending, stitching, and so on Animating human-like characters is a recurring issue in computer graphics. Recently, motion capture has become one of the most promising technologies in character animation. Realistic motion data can be captured by recording live motions with optical or magnetic motion capture systems. Archives of re-usable motion clips are also commercially available. Although it is relatively easy to obtain high quality motion clips by virtue of motion capture technology, crafting an animation with available motion clips is still difficult and requires specialized tools such as interactive editing, smoothing, enhancement, blending, stitching, and so on.
Motion Signal Processing Difficulties in handling motion data Singularity Inherent non-linearity of orientation space General issues in motion signal processing Coordinate-invariance Time-invariance Although well-established methods exist for signal processing in finite-dimensional vector spaces, the majority of those methods do not easily generalize in a uniform, coordinate-invariant way to motion data that contain orientations as well as positions. A typical approach is to parameterize 3D orientations with three independent parameters such as Euler angles. However, there are well-known drawbacks to the Euler angle parameterization, such as the gimbal lock singularity. To avoid such problems, it is desirable to employ a non-singular orientation representation, such as rotation matrices or unit quaternions. Due to the inherent non-linearity of the orientation space, however, it is challenging to generalize conventional signal processing methods to the orientation space. In addition to the singularity problem, we have to consider a couple of issues for motion signal processing. In particular, coordinate-invariance is a very important issue of motion signal processing that does not arise in conventional signal processing of image and audio data.
Coordinate-Invariance Independent of the choice of coordinate frames A motion editing operation, such as smoothing, enhancement, blending, or stitching, is coordinate-invariant if its result is not influenced by the choice of the coordinate system. The notion of coordinate-invariance is important not only from a theoretical viewpoint but also in practical situations. Suppose, for example, that identical motion clips are placed at different positions in a reference frame and we apply the same operation to modify them. In that situation, a common expectation is to get the same results independently of their positions. Coordinate-invariant operators guarantee this expectation.
Time-Invariance Independent of the position on the signal Yet another issue is time-invariance. Consider a signal in which the same patterns occur at different time instances. Then, a time-invariant operation gives the same result independent of the pattern's position on the time axis.
Overview Generalize conventional methods Designing spatial filters for orientation data Multiresolution analysis for rigid motion Requirements Coordinate-invariance Time-invariance Computationally optimal Our goal is to generalize conventional signal processing techniques to motion data. Specifically, I will address two problems: filtering and multiresolution analysis. There are many alternative ways to achieve this goal, but I could find only one solution that satisfies all the requirements.
Contents Issues in Motion Analysis and Synthesis Spatial Filtering for Motion Data Multiresolution Motion Analysis Applications The first topic is filtering.
Spatial Filtering for Orientation Data Linear time-invariant filter: p'_i = sum_k a_k p_{i+k}, where a = (a_{-m}, ..., a_m) is the filter mask and p = {p_i} is a vector-valued signal Not suitable for unit quaternion data unit-length constraints The basic approach of spatial filtering is to sum the products between the mask coefficients and the sample values under the mask at a specific position on the signal. This type of filtering is very popular for vector signals. However, if such a mask is applied to a unit quaternion signal, the filter response will not be a quaternion of unit length, because the unit quaternion space is not closed under addition and scalar multiplication.
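As a quick numerical illustration of this closure failure (my own example, not from the talk), averaging two unit quaternions component-wise yields a quaternion whose norm is strictly less than one:

```python
import numpy as np

# Two unit quaternions stored as [w, x, y, z]:
# the identity, and a 90-degree rotation about the z-axis.
q0 = np.array([1.0, 0.0, 0.0, 0.0])
q1 = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])

# A component-wise average filter with mask (1/2, 1/2).
avg = 0.5 * q0 + 0.5 * q1

# The response falls strictly inside the unit sphere, so it is no
# longer a valid unit quaternion and would need re-normalization.
norm = np.linalg.norm(avg)
```

Here `norm` comes out below one, which is exactly why a quaternion signal cannot be filtered by a plain weighted sum.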
Previous Work Euler angle parameterization Bodenheimer et al. (’97) Re-normalization Azuma and Bishop (‘94) Exploit a local parameterization Lee and Shin (‘96) Welch and Bishop (‘97) Fang et al. (‘98) Hsieh et al. (‘98) Recently, there have been ever-increasing efforts to generalize various signal processing techniques to motion data. While a great deal of research is available for position data, research on orientation data is only now emerging. A simple approach is to apply a filter to each component of the unit quaternions separately and then normalize the filter response. However, re-normalization has some drawbacks that I will explain later. Thus, recent approaches exploit the logarithmic and exponential maps to avoid re-normalization, but some important filter properties, such as coordinate-invariance, are not yet guaranteed.
Exp and Log Let’s get started with the geometric structure of exponential and logarithmic mapping. Consider the unit quaternion space that is a unit hyper-sphere in a 4-dimensional space. The tangent vector at a unit quaternion point lies in the tangent plane at that point.
Exp and Log Multiplying the tangent vector by the inverse of the quaternion point brings every tangent space to coincide with the one at the identity. In physics terminology, angular velocity is defined in this way: the first derivative of a quaternion-valued function is not itself an angular velocity; instead, we measure the angular velocity in the tangent space at the identity.
Exp and Log The real component of the identity is one and its imaginary components are zero. Thus, any tangent vector at the identity is perpendicular to the real axis and thus it must be a purely imaginary quaternion. Therefore, the tangent space can be considered as a 3-dimensional vector space.
Exp and Log log exp Exponential and logarithm maps give a mapping between the unit quaternion space and its tangent space at the identity. The exponential and logarithmic maps are convenient mathematical tools for representing the change of orientation in a vector space.
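A minimal sketch of these two maps, assuming quaternions stored as `[w, x, y, z]` numpy arrays (the helper names `qlog` and `qexp` are my own, not from the talk):

```python
import numpy as np

def qlog(q):
    """Map a unit quaternion to a 3-vector in the tangent space at the identity."""
    w, v = q[0], q[1:]
    s = np.linalg.norm(v)
    if s < 1e-12:
        return np.zeros(3)               # log of the identity is the zero vector
    return np.arctan2(s, w) * v / s      # rotation half-angle times the unit axis

def qexp(u):
    """Inverse map: a 3-vector in the tangent space back to a unit quaternion."""
    theta = np.linalg.norm(u)
    if theta < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(theta)], np.sin(theta) * u / theta))
```

The two maps are mutual inverses on the upper hemisphere, so `qexp(qlog(q))` recovers `q`.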
Linear and Angular Displacement Let me explain a little bit about linear and angular displacements. Consider two points in R3. The displacement between them is simply the subtraction of one point from the other, and thus p_i+1 is computed by adding the displacement to p_i. Similarly, we want to represent the change of orientation between two quaternion points. To do so, we multiply both points by the inverse of q_i.
Linear and Angular Displacement Then, q_i is brought to the identity.
Linear and Angular Displacement By taking the logarithm of this point, we are able to represent the angular displacement between two points in the tangent space. This equation means that, by adding the angular displacement w_i to q_i, we can obtain the next point q_i+1.
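Putting the pieces together, the angular displacement w_i = log(q_i^-1 q_i+1) can be computed and then used to step from q_i to q_i+1. A sketch with my own helper names (Hamilton product convention assumed):

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of quaternions stored as [w, x, y, z]."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qinv(q):
    """The inverse of a unit quaternion is its conjugate."""
    return np.array([q[0], -q[1], -q[2], -q[3]])

def qlog(q):
    s = np.linalg.norm(q[1:])
    return np.zeros(3) if s < 1e-12 else np.arctan2(s, q[0]) * q[1:] / s

def qexp(u):
    theta = np.linalg.norm(u)
    if theta < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(theta)], np.sin(theta) * u / theta))

def angular_displacement(qi, qnext):
    """w_i = log(q_i^-1 q_{i+1}), measured in the tangent space at the identity."""
    return qlog(qmul(qinv(qi), qnext))
```

By construction, q_i+1 = q_i exp(w_i), mirroring p_i+1 = p_i + w_i in the vector case.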
Transformation Transformation between linear and angular signals From this observation, we consider a transformation between linear and angular signals. Given a sequence of unit quaternions, we define its vector counterpart such that the linear displacement between p_i and p_i+1 is identical to the angular displacement between q_i and q_i+1. Then, the angular velocity estimated from the unit quaternion signal is identical to the linear velocity estimated from its vector counterpart. Basically, our filtering strategy is to first transform the orientation data into the vector space, apply a filter, and then transform the result back to the orientation space. Suppose, for example, that we apply a smoothing filter that minimizes the second derivative at p_i. Then, we can expect that the angular acceleration at q_i is also minimized.
Filter Design Given: spatial filter F Output: spatial filter H for orientation data, defined by q'_i = q_i exp(F(p)_i - p_i) “Unitariness” is guaranteed There are several different ways to realize that strategy. Our approach is as follows. Consider a spatial filter F whose coefficients sum to one. Then, we can define a corresponding filter H for orientation data. The defining equation implies that the linear displacement caused by F equals the angular displacement caused by H. The unitariness of the filter responses is guaranteed since the unit quaternion space is closed under multiplication.
Filter Design Given: spatial filter F Output: spatial filter H for orientation data: q'_i = q_i exp(sum_j c_j w_{i+j}), where w_j = log(q_j^-1 q_j+1) and each c_j is a partial sum of the mask coefficients of F The expansion of the exponent gives a weighted sum of the angular displacements of the neighbors. This equation looks complicated, but its meaning is quite simple.
Examples Let me show a simple example: the binomial mask (1/4, 1/2, 1/4) that is frequently used for smoothing signals, p'_i = (1/4) p_i-1 + (1/2) p_i + (1/4) p_i+1. This equation means that the filter response is computed as a weighted average of neighboring points. I'd like to interpret the equation in a somewhat different way, such that the displacement to the output point is represented as a weighted average of linear displacement vectors.
Examples To do so, we first separate the original point p_i and the displacement caused by the filter: p'_i = p_i + (1/4)(p_i-1 - p_i) + (1/4)(p_i+1 - p_i).
Examples Then, the displacement term can be described as a weighted average of the w_i: p'_i = p_i + (1/4)(w_i - w_i-1), where w_i = p_i+1 - p_i. It is a very simple calculation.
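This displacement-form rewrite of the binomial mask is easy to verify numerically; a small sketch on made-up 2D sample points (the data are my own illustrative choice):

```python
import numpy as np

# A few 2D sample points (arbitrary illustrative data).
p = np.array([[0.0, 0.0], [1.0, 0.5], [2.5, 0.2], [3.0, 1.0]])
i = 1

# Direct form of the binomial mask (1/4, 1/2, 1/4).
direct = 0.25 * p[i - 1] + 0.5 * p[i] + 0.25 * p[i + 1]

# Displacement form: p'_i = p_i + (w_i - w_{i-1}) / 4, with w_i = p_{i+1} - p_i.
w = np.diff(p, axis=0)
displaced = p[i] + (w[i] - w[i - 1]) / 4.0
```

Both forms agree exactly; the displacement form is the one that carries over to quaternions.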
Examples What we have done for orientation filters is clear in these equations. The given filter F computes its response by adding linear displacement to the original point. Similarly, our orientation filter computes its response by adding angular displacement to the original point.
Example Here, the displacement of q_i can be computed as a weighted average of the angular displacement vectors: q'_i = q_i exp((1/4)(w_i - w_i-1)), where w_i = log(q_i^-1 q_i+1).
Examples (plots: Original vs. Filtered; magnitude of angular acceleration) Let me show an experimental result. The left side is the original noisy signal and the middle is the result of filtering. The effect of smoothing is clearly shown in the magnitude plots of angular acceleration.
Properties of Orientation Filters Coordinate-invariance Time-invariance Symmetry The spatial filter H for orientation data inherits important properties from its vector counterpart. First, H is invariant under both local and global coordinate transformations. Therefore, H gives the same results independent of the choice of the coordinate frame in which q_i’s are represented. Second, H is time-invariant, that is, H commutes with the time-shifting operator. Finally, H is symmetric, if given mask coefficients are symmetric.
Computation Compute w_i = log(q_i^-1 q_i+1) for i=1 … N (# of log = N) Compute the filter response q'_i for i=1 … N (# of exp = N) In a computational aspect, w_1 is used for evaluating the filter responses at both q_1 and q_2. To avoid redundant computation, we filter the signal in two steps. The first step is to compute w_i for all i. Then, we evaluate the filter response for each point. In total, we need to evaluate N logarithms and N exponential maps, which is optimal.
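The two-step scheme for the binomial mask can be sketched as below. The helper names and the boundary handling (leaving the endpoint frames unchanged) are my own simplifications, not taken from the talk:

```python
import numpy as np

def qmul(a, b):
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qinv(q):
    return np.array([q[0], -q[1], -q[2], -q[3]])

def qlog(q):
    s = np.linalg.norm(q[1:])
    return np.zeros(3) if s < 1e-12 else np.arctan2(s, q[0]) * q[1:] / s

def qexp(u):
    theta = np.linalg.norm(u)
    if theta < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(theta)], np.sin(theta) * u / theta))

def smooth_orientations(qs):
    # Step 1: N logarithms -- all angular displacements w_i = log(q_i^-1 q_{i+1}).
    w = [qlog(qmul(qinv(qs[i]), qs[i + 1])) for i in range(len(qs) - 1)]
    # Step 2: N exponentials -- binomial response q'_i = q_i exp((w_i - w_{i-1}) / 4).
    out = [qs[0]]
    for i in range(1, len(qs) - 1):
        out.append(qmul(qs[i], qexp((w[i] - w[i - 1]) / 4.0)))
    out.append(qs[-1])
    return out
```

A constant-angular-velocity signal is a fixed point of this filter, since all w_i coincide and the exponent vanishes.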
Our scheme vs. Re-normalization Filtering with the average filter At this point, we need to compare our filtering scheme to others. In practice, we don't have any critical problem with re-normalization if the input signal is sampled densely. However, in the multiresolution framework, we have to deal with coarse signals because the signal is down-sampled successively. With coarse signals, our scheme is much more robust than the other schemes. For example, if the average filter is applied to a signal in which the sample points are evenly distributed on a great arc, their component-wise average is zero and thus re-normalization is impossible. In this case, our filter gives the correct result: the angular displacement between q_i and q_i+1 is cancelled by the angular displacement between q_i and q_i-1, and likewise for the other pairs. Therefore, q_i does not have any displacement. This is correct because the average filter does not make any change at a symmetric configuration.
Local vs. Global Parameterization Global log parameterization Transform to the tangent space via log Apply a filter Deploying the exponential and logarithmic maps, re-normalization can be avoided. There are many alternative ways of using the exponential and logarithmic maps. A simple method is to transform the quaternion points to the tangent space through the logarithmic map, apply a filter to the points in the tangent space, and then transform the results back through exponentiation. Basically, this method parameterizes 3D orientations globally by three independent parameters, and it is well-known that there is no non-singular parameterization with three parameters. In this case, the singularity is observed near the antipode of the identity. The logarithmic map is ill-behaved near the antipode and is not even defined at that point. A signal near the antipode is extremely distorted by the logarithmic map, whereas a signal near the identity is less distorted. Therefore, the filter response is quite dependent on the position of the signal in the reference coordinate frame. This implies that global log parameterization is not coordinate-invariant.
Transform into a Hemi-Sphere Antipodal equivalence One may argue that, since a pair of antipodal points represent the same orientation, we don't have to deal with a signal near the antipode, because we can project every point into a hemi-sphere. This is true if we are dealing with an unorganized point cloud. However, it is not the case for an ordered sequence, because projection into a hemi-sphere may incur a discontinuity of the signal at the boundary of the hemi-sphere.
Cumulative vs. Non-cumulative Cumulative local parameterization Compute w_i = log(q_i^-1 q_i+1) Apply a filter, and then Integrate the filter responses Another alternative is cumulative local parameterization. We first compute w_i, apply a filter to the sequence of angular displacements directly, and then integrate the filter responses to reconstruct an orientation signal. This method does not have a singularity problem. However, the integration incurs a serious problem because a small perturbation of an angular displacement vector may cause a large discrepancy at the end of the signal. Clearly, this method is not time-invariant.
Coordinate- and Time-invariant Alternatives Geometric construction Slerp (spherical linear interpolation) Bezier curve construction of Shoemake (‘85) Algebraic construction on tangent space Local parameterization (coordinate-invariant) Local support (time-invariant) We even have many alternatives that are both coordinate-invariant and time-invariant. One category of those alternatives is based on spherical linear interpolation. It is always possible to represent a general weighted average as a combination of linear interpolations if the sum of the weight coefficients is one. Replacing each linear interpolation by a slerp, a quaternion analogue of a general weighted average can be obtained. A typical example is the Bezier curve construction of Shoemake. Basically, slerp is coordinate-invariant, and a combination of slerps is also coordinate-invariant. The other category is to transform the filtering problem into one in the tangent space. Our scheme is in this category. There are many variants in both categories, but most of them are not computationally optimal, except ours.
Summary (Motion Filtering) Designing spatial filters for orientation data Satisfy desired properties Coordinate-invariance Time-invariance Symmetry Simple, efficient, easy to implement In summary, I presented a general method of constructing spatial filters for orientation data. As explained so far, there are many alternative methods. Surprisingly, our method is the only one I could find that satisfies all the filter properties listed here and is computationally optimal. Our method is very simple, efficient, and easy to implement.
Contents Issues in Motion Analysis and Synthesis Spatial Filtering for Motion Data Multiresolution Motion Analysis Applications The next topic is multiresolution analysis.
Multiresolution Analysis Representing a signal at multiple resolutions facilitates a variety of signal processing tasks gives a hierarchy of successively smoother signals It is well-known that multiresolution analysis provides a unified framework to facilitate a variety of signal processing tasks. The basic idea is to represent a signal as a collection of coefficients that form a coarse-to-fine hierarchy and give a series of successively smoother signals. The coefficients at the coarsest level describe the global pattern of a signal, while those at finer levels give the details at successively finer resolutions.
Previous Work Image and signal processing Gauss-Laplacian pyramid [Burt and Adelson 83] Texture analysis and synthesis, image editing, curve and surface manipulation, data compression, and so on Motion synthesis and editing Hierarchical spacetime control [Liu, Gortler and Cohen 94] Motion signal processing [Bruderlin and Williams 95] The notion of multiresolution analysis was initiated by Burt and Adelson, who introduced a multiresolution image representation called the Gauss-Laplacian pyramid. Multiresolution techniques have been extensively exercised in computer graphics for texture analysis and synthesis, image editing, curve and surface manipulation, data compression, and so on. These techniques have been used in motion editing and synthesis as well.
Decomposition Reduction : smoothing followed by down-sampling Expansion : up-sampling followed by smoothing The construction of the multiresolution representation is based on two basic operations: reduction and expansion. The expansion is achieved by a subdivision operation that can be considered as up-sampling followed by smoothing. The reduction is the reverse operation, that is, smoothing followed by down-sampling. Given a detailed signal, the reduction operator generates its simplified version at a coarser resolution by applying a smoothing filter and then removing every other frame to down-sample the signal. The expansion of the simplified version interpolates the missing information to approximate the original signal. Then, the difference between them gives the motion details at that resolution.
Decomposition and Reconstruction Cascading these operations until there remains a sufficiently small number of frames in the signal, we can construct the multiresolution representation which includes the coarse base signal m_0 and a series of motion details. Then, the original signal m_n can be reconstructed by adding motion details to the base signal successively.
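For the vector-valued part of a motion signal, one level of this cascade can be sketched with ordinary numpy operations. The binomial smoothing mask and the boundary handling below are my own illustrative choices, not the talk's exact filters:

```python
import numpy as np

def reduce_(s):
    """Smoothing (binomial mask) followed by down-sampling by two."""
    padded = np.concatenate(([s[0]], s, [s[-1]]))          # replicate endpoints
    smoothed = 0.25 * padded[:-2] + 0.5 * padded[1:-1] + 0.25 * padded[2:]
    return smoothed[::2]

def expand(c, n):
    """Up-sampling to length n followed by smoothing (linear interpolation)."""
    up = np.zeros(n)
    up[::2] = c[: (n + 1) // 2]
    up[1:-1:2] = 0.5 * (up[0:-2:2] + up[2::2])
    if n % 2 == 0:
        up[-1] = up[-2]
    return up

def decompose(s):
    """One level: coarse signal plus the detail coefficients at this resolution."""
    c = reduce_(s)
    d = s - expand(c, len(s))
    return c, d

def reconstruct(c, d):
    """Add the details back to the expansion of the coarse signal."""
    return expand(c, len(d)) + d
```

Reconstruction is exact by construction, whatever smoothing masks are used, because the detail coefficients store precisely the residual.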
Our Approach Multiresolution Motion Analysis Hierarchical displacement mapping How to represent Motion displacement mapping [Bruderlin and Williams 95] Motion warping [Popovic and Witkin 95] Spatial filtering for motion data How to construct Implement reduction and expansion We employ two ideas to construct a multiresolution representation for motion data. Hierarchical displacement mapping provides a hierarchical structure to manage positions and orientations in a coherent manner. Motion displacement mapping was originally invented for warping a canned motion clip while preserving its fine details. In our context, a displacement map is used for representing motion details at a specific resolution. Our filter design scheme plays an important role in the construction algorithm. With our filters, we are able to implement the reduction and expansion operations to separate motion details level-by-level in a coordinate- and time-invariant way.
Motion Representation Configuration of articulated figures Bundle of motion signals Each signal represents time-varying positions and orientations Rigid transformation At this point, we need to fix our terms and notation. The pose of an articulated figure is specified by its joint configurations in addition to the position and orientation of its root segment. Motion data for an articulated figure can be viewed as a bundle of motion signals. Each signal represents time-varying positions or orientations at the corresponding body segment. For uniformity, we assume that the configuration of each joint is given by a 3-dimensional rigid transformation. Then, we can describe the DOFs at every body segment as the pair of a vector in R3 and a unit quaternion in S3. The pair of p and q yields a rigid transformation that rotates a point in R3 by q and then translates it by p.
Motion Displacement (measured in a global, fixed reference frame) Let me explain motion displacements and motion displacement mapping. Consider a motion frame denoted by m. Here, p is its position and q is its orientation, measured in a global reference frame. The displacement between two motion frames is specified by a rigid transformation that can be locally parameterized by a pair of 3D vectors, u and v: one for linear displacement and the other for angular displacement.
Motion Displacement From a geometric viewpoint, the motion displacement has an intuitive interpretation. We first rotate the motion frame m about the direction of v by the amount of its magnitude and then translate it by u to obtain the other frame m'. Here, both u and v are measured in the body-fixed coordinate frame attached to m.
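A sketch of applying a body-frame displacement (u, v) to a motion frame m = (p, q) under this interpretation. The helper names and the exact composition order, m' = (p + q u q^-1, q exp(v)), are my reading of the slide, not verified against the original implementation:

```python
import numpy as np

def qmul(a, b):
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qexp(u):
    theta = np.linalg.norm(u)
    if theta < 1e-12:
        return np.array([1.0, 0.0, 0.0, 0.0])
    return np.concatenate(([np.cos(theta)], np.sin(theta) * u / theta))

def rotate(q, x):
    """Rotate a 3-vector x by the unit quaternion q (computes q x q^-1)."""
    conj = np.array([q[0], -q[1], -q[2], -q[3]])
    return qmul(qmul(q, np.concatenate(([0.0], x))), conj)[1:]

def apply_displacement(p, q, u, v):
    """Rotate by exp(v) and translate by u, both expressed in m's body frame."""
    return p + rotate(q, u), qmul(q, qexp(v))
```

A zero displacement leaves the frame unchanged, and a body-frame translation is first carried into the global frame by the orientation q.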
Hierarchical Displacement Mapping Our multiresolution representation consists of a base signal m_0 and a series of displacement maps that form a coarse-to-fine hierarchy. Each displacement map consists of a series of motion displacements, while the base signal is a series of motion frames. The original motion encoded in the multiresolution representation can be reconstructed by successively applying the displacement map at each level to the expansion of the signal at the same level.
Hierarchical Displacement Mapping Starting with a base signal m0, for instance,
Hierarchical Displacement Mapping we first expand the signal by inserting new motion frames, and then
Hierarchical Displacement Mapping apply the corresponding displacement map to obtain a signal m1 at the finer resolution.
Hierarchical Displacement Mapping A series of successively refined motions Coordinate-independence measured in a body-fixed coordinate frame Uniformity through a local parameterization By repeating this process to the finest level, we can reproduce the original motion signal. This approach has two advantages. First, since motion displacements are measured in a body-fixed coordinate frame, the representation is independent of the choice of the global reference frame. Second, we don't have to distinguish between positions and orientations in the displacement maps, because linear and angular displacements have an identical form.
Coordinate Frame-Invariance Decomposition Reconstruction Until now, we have shown that both hierarchical displacement mapping and motion filtering are coordinate-invariant. In conclusion, our multiresolution representation is also coordinate-invariant. This can be explained in several different ways. The first explanation is that identical motion clips placed at different positions in a reference coordinate system give exactly the same displacement maps in their multiresolution representations. Coordinate-dependent information remains only in the base signal. The same fact can be explained in a different way: if we decompose a motion signal m into the multiresolution representation, apply a coordinate transformation T to the base signal, and reconstruct the signal at a different position, then that signal is identical to the one obtained by applying the transformation to the original signal.
Contents Issues in Motion Analysis and Synthesis Spatial Filtering for Motion Data Multiresolution Motion Analysis Applications Now, let me show some applications of multiresolution analysis.
Enhancement / Attenuation Level-wise scaling of coefficients A natural application is modifying a motion signal to convey different moods or emotions through the level-wise scaling of detail coefficients with different scaling factors.
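A one-level sketch of this idea on a vector signal (the smoothing mask, boundary handling, and data are my own illustrative choices):

```python
import numpy as np

def reduce_(s):
    """Binomial smoothing followed by down-sampling by two."""
    padded = np.concatenate(([s[0]], s, [s[-1]]))
    smoothed = 0.25 * padded[:-2] + 0.5 * padded[1:-1] + 0.25 * padded[2:]
    return smoothed[::2]

def expand(c, n):
    """Up-sampling to length n followed by linear smoothing."""
    up = np.zeros(n)
    up[::2] = c[: (n + 1) // 2]
    up[1:-1:2] = 0.5 * (up[0:-2:2] + up[2::2])
    if n % 2 == 0:
        up[-1] = up[-2]
    return up

s = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0])
base = expand(reduce_(s), len(s))
detail = s - base

attenuated = base + 0.5 * detail   # scale < 1: calmer, smoother motion
original   = base + 1.0 * detail   # scale = 1: reproduces the input exactly
enhanced   = base + 2.0 * detail   # scale > 1: exaggerated fine details
```

With several levels, a different scaling factor per level controls which frequency band of the motion is exaggerated or attenuated.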
Enhancement / Attenuation Level-wise scaling of coefficients
Extrapolation Combine multiple motions together select a base signal and details from different examples walking running Motion blending is a popular technique to produce a wide span of motions from a small number of example motions. A particularly interesting form of blending is extrapolation. For example, consider three example motions: walking, turning, and limping. From these motions, we would like to create a new motion: turning with a limp. Basically, the global pattern of the target motion is similar to turning, and its fine details are similar to limping. Comparing walking and turning, we obtain information about how straight movement is transformed into turning. Comparing walking and limping, we learn how normal walking is transformed into walking with a limp. limping running with a limp
Extrapolation Walking Turning Limping
Extrapolation Walking Running Strutting
Stitching A simple approach Estimate velocities at boundaries, then Perform C1-interpolation The next application is motion stitching. In animation production, it is often required to combine a series of motion clips into an animation of arbitrary length. A simple approach would be to estimate the linear and angular velocities at the boundaries of each pair of consecutive motion clips,
Stitching A simple approach Estimate velocities at boundaries, then Perform C1-interpolation and then perform C1 interpolation.
Stitching Difficulties of the simple approach Hard to estimate velocity robustly However, it is not easy to compute a robust estimate of the velocity from live-captured signals, since they usually oscillate with fine details that distinguish the realistic motion of a live creature from the unnatural motion of a robot. The simple approach is very sensitive to noise at the boundaries.
Stitching Stitching motion clips seamlessly Merging coefficients level-by-level In the multiresolution framework, it is very easy to stitch motion clips seamlessly without computing velocities. Given two motion clips (for example, running and walking), we first construct their multiresolution representations and then,
Stitching Stitching motion clips seamlessly Merging coefficients level-by-level merge the coefficients level-by-level. At the boundary, we blend the overlapped coefficients. This scheme is very robust to noise, because small noises are encoded at fine levels and thus can hardly affect the global pattern of the motion encoded at coarser levels.
Stitching Stitching motion clips seamlessly Merging coefficients level-by-level stub a toe limp stitching
Frequency-based motion editing Edit the global pattern of example motions without explicit segmentation The next application is editing the global pattern of a given motion. Consider a motion clip in which the character turns left and then turns right. With this motion clip, we'd like to produce motion sequences with complex patterns.
Shuffling and Reconstruction Multiresolution representation of example motion Let me give you a brief overview of our approach. We start with the multiresolution representation of the given motion. To produce a different pattern,
Shuffling and Reconstruction We duplicate and shuffle the frames in the base signal. The base signal of new motion Shuffling
Shuffling and Reconstruction detail coefficients Multiresolution Sampling Then, we reconstruct detail coefficients level-by-level on the new base signal through multiresolution sampling. Shuffling
Shuffling and Reconstruction Multiresolution Sampling Then, we are able to produce a new motion with different global appearance. Shuffling
Multiresolution Sampling Input Shuffling Output
Multiresolution Sampling Feature matching example) the change of linear and angular velocities The basic idea of multiresolution sampling is to pick a detail coefficient by feature matching. At the right turning instance, we’d like to choose detail coefficients from the right turning portion of the original signal. Similarly, at the left turning instance, we’d like to choose detail coefficients from the left turning portion. Matching
Multiresolution Sampling Feature matching example) the change of linear and angular velocities Reconstruct In this way, we are able to produce a motion signal at a finer resolution. Matching
Multiresolution Sampling Matching features at multiple resolutions Reconstruct Matching In the next level, we compare the features at multiple resolutions for robust feature matching. For example, when we reconstruct detail coefficients at level 2, the feature functions at level 0 and level 1 are considered simultaneously. Matching
Summary Multiresolution motion analysis Coherency in positions and orientations Coordinate-invariance and Time-invariance In summary, we presented a new multiresolution approach to motion analysis. We focused on manipulating both positions and orientations in a coherent manner. To do so, we employed two ideas: Hierarchical displacement mapping and motion filtering. With these ideas, we were able to construct multiresolution representations in a coordinate- and time-invariant way.