Anisotropic Double Cross Search Algorithm using Multiresolution-Spatio-Temporal Context for Fast Lossy In-Band Motion Estimation Yu Liu and King Ngi Ngan.

Department of Electronic Engineering, The Chinese University of Hong Kong
PCS2006, April 24-26, Beijing, China

Outline
Introduction
Background
Proposed Algorithm
Experimental Results
Conclusion

Introduction
Motion Estimation in Critically-Sampled Wavelet Domain
Pro: basically free from blocking effects
Con: inefficient in high bands
Motion Estimation in Shift-Invariant Wavelet Domain
Pro: performs ME more precisely and efficiently
Con: computational complexity
e.g. low-band-shift (LBS) and complete-to-overcomplete DWT (CODWT)
ME/MC in the wavelet domain has received much attention because it outperforms conventional ME/MC in the spatial domain. It is basically free from blocking effects, owing to the global nature of the wavelet transform. However, ME/MC in the critically sampled wavelet domain is very inefficient in the high bands because wavelet decomposition is shift-variant. LBS and CODWT were proposed to overcome this shift variance, and they allow ME to be performed more precisely and efficiently. Their major disadvantage is computational complexity, which comes mainly from the full search algorithm.

Background Motion Estimation in Shift-Invariant Wavelet Domain (1)
Two-Level Shift-Invariant Wavelet Decomposition using Low-Band-Shift (LBS) or Complete-to-Overcomplete DWT (CODWT)
First, some background. This is an example of a two-level shift-invariant DWT (SIDWT) obtained with LBS or CODWT. In the LBS method, the additional shifted subbands are obtained by shifting the LL band of each level and then applying the normal wavelet transform. In the CODWT method, they are obtained by linking the critically-sampled subbands directly to the shift-invariant subbands through the complete-to-overcomplete prediction filters.

Generation of Wavelet Blocks
For ME in the shift-invariant wavelet domain, the coefficients of each wavelet tree rooted in the lowest subband are rearranged to form a wavelet block. The purpose of the wavelet block is to provide a direct association between the wavelet coefficients and what they represent spatially in the image. Related coefficients at all scales and orientations are included in each wavelet block.
The v-pixel-shifted, or {dx,dy}-pixel-shifted, coefficient of the pth wavelet block of the reference frame t' can be represented by [equation omitted in transcript], where (dx%2^l, dy%2^l) indicates which shifted subband is used and (i+[dx/2^l], j+[dy/2^l]) indicates where the coefficient is located within that shifted subband. Here l denotes the decomposition level and k denotes the subband type (LL, HL, LH or HH).
The coefficient of the pth wavelet block of the current frame t can be represented by [equation omitted in transcript].
The difference between these two representations is that, for ME/MC, only the shift-invariant wavelet coefficients of the reference frame are used to predict the critically-sampled wavelet coefficients of the current frame.
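The indexing described above can be illustrated with a small sketch. It assumes that the square brackets in [dx/2^l] mean rounding down (floor), which the transcript does not state; the function name `shifted_subband_index` is hypothetical and not taken from the slides.

```python
def shifted_subband_index(dx, dy, level):
    """Split a displacement (dx, dy) into (phase, offset) at one decomposition level.

    phase  = (dx % 2^l, dy % 2^l)   selects which shifted subband is read
    offset = (dx // 2^l, dy // 2^l) is the coefficient offset inside that subband
    Floor division is an assumption about the meaning of [.] in the slides.
    """
    step = 1 << level  # 2^l
    phase = (dx % step, dy % step)
    offset = (dx // step, dy // step)
    return phase, offset


if __name__ == "__main__":
    for v in [(5, -3), (0, 0), (-1, 2)]:
        print(v, shifted_subband_index(v[0], v[1], level=2))
```

With Python's floor semantics, `offset * 2^l + phase` always recovers the original displacement, including for negative dx or dy.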

The wavelet blocks in the search window of the reference frame are compared with the current wavelet block, and the reference wavelet block that gives the best match is selected.
The sum of absolute differences (SAD) of the pth wavelet block for the displacement vector v is computed as follows: [equation omitted in transcript]
The optimum motion vector v∗ of the pth wavelet block, which has the minimum displacement error, is given by: [equation omitted in transcript]
However, full search ME in the shift-invariant wavelet domain is very time consuming because of the extra shifted subbands, which limits its practical application. For this reason, alternative and faster techniques should be developed.
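As a rough illustration of the full search described above, the sketch below evaluates the SAD for every displacement in a square search window and keeps the minimum. The callback `fetch_reference_block` is a hypothetical stand-in for reading the v-shifted wavelet block from the shift-invariant reference frame; the block layout (a flat list of coefficients) is also an assumption.

```python
def sad(current_block, reference_block):
    """Sum of absolute differences between two equally long coefficient lists."""
    return sum(abs(c - r) for c, r in zip(current_block, reference_block))


def full_search(current_block, fetch_reference_block, search_range):
    """Return (best_v, best_sad) over all (dx, dy) with |dx|, |dy| <= search_range.

    fetch_reference_block(dx, dy) is a hypothetical callback that returns the
    coefficients of the (dx, dy)-shifted reference wavelet block.
    """
    best_v, best_cost = None, float("inf")
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            cost = sad(current_block, fetch_reference_block(dx, dy))
            if cost < best_cost:  # strict '<' keeps the first minimum found
                best_v, best_cost = (dx, dy), cost
    return best_v, best_cost


if __name__ == "__main__":
    # Toy reference whose block matches the current block only at (2, -1).
    current = [7, 8, 9]
    toy_reference = lambda dx, dy: [dx + 5, dy + 9, 9]
    print(full_search(current, toy_reference, search_range=3))
```

The double loop visits (2·search_range + 1)² displacements per block, which is the cost that the fast algorithm aims to avoid.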

Background Anisotropic Motion Model in Wavelet Domain
Traditional 2D ME in the spatial domain suffers from the aperture problem
In 2D ME in the wavelet domain, the aperture problem can in fact be exploited as an advantage
Another piece of background is the anisotropic motion model in the wavelet domain. Traditional 2D ME in the spatial domain, especially optical flow estimation, suffers from the aperture problem because the problem is ill-posed and the observation window is small. In the wavelet domain, however, the aperture problem can be exploited as an advantage. Because the wavelet transform splits the image into subbands with different orientations, the subbands contain different edges with different normal flow directions. This suggests that the 2D ME problem can be approximated by a 1D ME along the normal flow direction for the vertical/horizontal subbands. We use this property to develop a fast lossy motion estimation algorithm in the shift-invariant wavelet domain.
Figure: (a) Aperture problem in spatial domain, (b) Anisotropic motion model in wavelet domain

Proposed Algorithm Multiresolution-Spatio-Temporal Context (1)
Traditional MRME algorithms
Multiresolution context
Not enough to reduce the risk of getting trapped in a local minimum
The proposed algorithm
Multiresolution-spatio-temporal context
Consists of one multiresolution context, four spatial contexts, and five temporal contexts
In traditional multiresolution ME (MRME) algorithms, ME is first performed at the coarser resolution to find an initial MV, and the MV at the finer resolution is then refined from the MV obtained at the coarser resolution. However, the coarse-scale MV may not be accurate enough for some blocks, and the resulting errors propagate along the hierarchical structure. Using only the multiresolution context is therefore not enough to reduce the risk of getting trapped in a local minimum. In the proposed approach, we exploit the context information from multiresolution, spatially and temporally adjacent blocks to select a set of initial MV candidates. The candidate predictors for the current block consist of one multiresolution context, four spatial contexts, and five temporal contexts. The set of initial MV candidates can be expressed as {c0, c1, ..., c9}.
Figure: (a) multiresolution context, (b) spatial context, (c) temporal context
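A minimal sketch of assembling the candidate set {c0, ..., c9} and choosing the best predictor is given below. The transcript does not say which neighbouring blocks supply the spatial and temporal contexts, nor how the coarser-level MV is mapped to the finer level; the doubling of the coarse MV and the function names are assumptions made only for illustration.

```python
def initial_candidates(coarse_mv, spatial_mvs, temporal_mvs):
    """Collect initial MV candidates from the three kinds of context.

    coarse_mv    : MV of the co-located block at the next coarser level, or None
    spatial_mvs  : MVs of spatially adjacent blocks (four in the slides)
    temporal_mvs : MVs of temporally adjacent blocks (five in the slides)
    """
    candidates = []
    if coarse_mv is not None:
        # Assumption: dyadic scaling between adjacent resolution levels.
        candidates.append((2 * coarse_mv[0], 2 * coarse_mv[1]))
    candidates.extend(spatial_mvs)
    candidates.extend(temporal_mvs)
    # Drop duplicates while preserving order, so each MV is evaluated once.
    return list(dict.fromkeys(candidates))


def best_predictor(candidates, cost):
    """Return the candidate with the smallest matching cost (e.g. a SAD)."""
    return min(candidates, key=cost)


if __name__ == "__main__":
    cands = initial_candidates((1, -2), [(2, -4), (0, 0), (1, 1), (0, 0)], [(0, 0)] * 5)
    print(cands)
    print(best_predictor(cands, lambda v: abs(v[0] - 1) + abs(v[1] - 1)))
```

Removing duplicate candidates is a design choice of this sketch: neighbouring blocks often share the same MV, so the number of cost evaluations is frequently well below ten.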

Proposed Algorithm Multiresolution-Spatio-Temporal Context (2)
For the LL subband
Initialization: spatio-temporal context, plus the candidate points in the shifted LL subband where the median predictor is located
Refinement: diamond search algorithm
For other levels
Initialization: multiresolution-spatio-temporal context
Refinement: anisotropic double cross search algorithm
For the LL subband, in the initialization stage, all the candidate points in the shifted LL subband where the median predictor is located are checked in addition to the candidate predictors from the spatio-temporal context. The best of these candidate predictors is then refined with the diamond search algorithm to obtain the best motion vector in the LL subband. For the other levels, all the candidate predictors from the multiresolution-spatio-temporal context are checked in the initialization stage. The best of them is then refined with the proposed anisotropic double cross search algorithm to obtain the best motion vector at the corresponding level.

Proposed Algorithm Anisotropic Double Cross Search Algorithm (1)
The anisotropic motion model suggests that the 2D ME problem in the wavelet domain can be approximated by 1D ME along the normal flow direction for the vertical/horizontal subbands.
During the 1D window search, only the coefficients in the corresponding subbands and the LL subband are computed.
Since the anisotropic motion model suggests that the 2D ME problem can be approximated by a 1D ME along the normal flow direction for the vertical/horizontal subbands, we use the anisotropic property of the motion field in the wavelet domain to develop an anisotropic double cross search algorithm. During the 1D window search refinement, only the coefficients in the corresponding subbands and the LL subband are computed. We define the SAD over the corresponding subbands and the LL subband as the anisotropic SAD (ASAD), as follows: [equation omitted in transcript]
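The idea of the ASAD can be sketched as a SAD restricted to the LL subband plus one oriented subband. The exact formula is not available in the transcript, so the data layout below (a dict from subband name to a flat coefficient list) and the function name `asad` are assumptions for illustration only.

```python
def asad(current, reference, band):
    """Anisotropic SAD: absolute differences over the LL subband and one
    oriented subband ("HL" or "LH"); all other subbands are skipped.

    current / reference: hypothetical dicts mapping a subband name
    ("LL", "HL", "LH", "HH") to a flat list of coefficients.
    """
    total = 0
    for name in ("LL", band):
        total += sum(abs(c - r) for c, r in zip(current[name], reference[name]))
    return total


if __name__ == "__main__":
    cur = {"LL": [1, 2], "HL": [3, 4], "LH": [5, 6], "HH": [7, 8]}
    ref = {"LL": [1, 1], "HL": [0, 4], "LH": [5, 10], "HH": [0, 0]}
    print(asad(cur, ref, "HL"), asad(cur, ref, "LH"))
```

Because the HH coefficients and the other oriented subband are never touched, each 1D search point costs only part of a full SAD evaluation.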

Proposed Algorithm Anisotropic Double Cross Search Algorithm (2)
The proposed search algorithm initially considers all possible motion vector predictor candidates from the multiresolution-spatio-temporal context and uses the best candidate {dxp, dyp} as the center of the search. The search then starts from the two different subbands along different routes: one route goes from the HL subbands to the LH subbands, the other from the LH subbands to the HL subbands.
[Figure legend for the circle, ellipse and rectangle markers omitted in transcript]
These two search routes form an anisotropic double cross search pattern. Each cross search route yields its own best matching point, so there may be two of them.
Case I: if the two best matching points from the two cross search routes are not the same, as shown in Fig.4(a), the coefficients in the HH subbands are used to decide which one is the best matching point. The winner is selected as the new center of the search, and the search step size is kept unchanged.
Case II: if the two best matching points are the same but not the center of the search, as shown in Fig.4(b), the best matching point is selected as the new center of the search, and the search step size is halved until it is equal to 1.
Case III: when the two best matching points from the cross search routes are both located at the center of the search, as shown in Fig.4(c), the final step is reached.
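The three cases above can be sketched as a loop. This is a simplified illustration under several assumptions that the transcript does not confirm: each 1D search tests only the center and the two points one step away, the HL cost drives the horizontal search and the LH cost the vertical one, Case III terminates the search, and an iteration cap is added as a safeguard. The cost functions are passed in as hypothetical callbacks.

```python
def line_search(cost, center, step, axis):
    """Best of center and center +/- step along one axis (0 = x, 1 = y).
    The center is listed first, so ties are resolved in its favour."""
    points = []
    for k in (0, -step, step):
        p = list(center)
        p[axis] += k
        points.append(tuple(p))
    return min(points, key=cost)


def adcs(cost_hl, cost_lh, cost_hh, start, step=4, max_iters=64):
    """Sketch of an anisotropic double cross search around `start`.

    cost_hl / cost_lh: ASAD-like costs used for the two 1D searches (assumed
    mapping: HL -> horizontal, LH -> vertical). cost_hh: tie-breaker cost
    computed from the HH coefficients. max_iters is a safeguard of this sketch.
    """
    center = start
    for _ in range(max_iters):
        # Route 1: HL search first, then LH search from its result.
        a = line_search(cost_lh, line_search(cost_hl, center, step, 0), step, 1)
        # Route 2: LH search first, then HL search from its result.
        b = line_search(cost_hl, line_search(cost_lh, center, step, 1), step, 0)
        if a != b:
            # Case I: routes disagree; HH decides, step size unchanged.
            center = min((a, b), key=cost_hh)
        elif a != center:
            # Case II: routes agree on a new point; move there, halve the step.
            center = a
            step = max(1, step // 2)
        else:
            # Case III: both routes stay at the center; stop.
            break
    return center


if __name__ == "__main__":
    hl = lambda v: abs(v[0] - 3)
    lh = lambda v: abs(v[1] + 2)
    hh = lambda v: abs(v[0] - 3) + abs(v[1] + 2)
    print(adcs(hl, lh, hh, start=(0, 0)))
```

In the toy run the costs are separable, so the two routes always agree; with real ASAD costs that share the LL subband, the routes can disagree and Case I comes into play.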

Experimental Results (1)
Simulation results are reported in the following ways:
PSNR: quality measure between the original and the motion-compensated reconstructed frames
MAD: distortion measure between the original and the motion-compensated wavelet frames
Operation number: operations per block used to compute the partial distortion
Speed-up ratio: from the execution time per frame for motion estimation, including the required overheads
For performance comparison, the following algorithms were tested:
Full Search Algorithm (FSA)
FMRME [6]
FIBME [7]
Proposed MR-STC-ADCS
Here are the experimental results of the test video sequences with the full search algorithm.
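For reference, the two quality measures can be computed as in the sketch below. It assumes 8-bit samples (peak value 255) for PSNR and reads MAD as the mean absolute difference; the transcript does not spell out either definition, and frames are represented here as flat lists for simplicity.

```python
import math


def mad(original, predicted):
    """Mean absolute difference between two equally long sample lists
    (assumed meaning of MAD in the slides)."""
    return sum(abs(o - p) for o, p in zip(original, predicted)) / len(original)


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit samples."""
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    print(mad([10, 20, 30], [12, 20, 27]))
    print(psnr([0, 0, 0, 0], [1, 1, 1, 1]))
```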

Comparison of the Tested Algorithms for QCIF Video Sequences
This table lists the comparison results of the tested algorithms for QCIF video sequences.

Comparison of the Tested Algorithms for CIF Video Sequences
And this table lists the comparison results of the tested algorithms for CIF video sequences.

Comparison of the Tested Algorithms for 4CIF Video Sequences
On average, for all sequences examined in the experimental tests:
MR-STC-ADCS is roughly 11.5 and 2.6 times faster than FMRME and FIBME, respectively, while its PSNR is approximately 1.46 dB and 0.6 dB higher; its MAD is also lower than that of FMRME and FIBME [values omitted in transcript].
MR-STC-ADCS is about 271 times faster than FSA for QCIF, 667 times for CIF, and 1313 times for 4CIF, with an average PSNR loss of only 0.04 dB or an average MAD increase of only [value omitted in transcript] compared to FSA.
And here are the results for the 4CIF video sequences. The averages above are taken over the experimental results of all the tested video sequences. The speed-ups over FSA of 271, 667 and 1313 times correspond to search windows (SW) of 15, 31 and 63, respectively.

Conclusion Fast Lossy In-Band Motion Estimation Algorithm
Anisotropic property of the motion field in the shift-invariant wavelet domain
Multiresolution-spatio-temporal context
Anisotropic double cross search
In this paper, we proposed a fast lossy in-band motion estimation algorithm, MR-STC-ADCS. The proposed algorithm is based on the anisotropic property of the motion field in the shift-invariant wavelet domain. We use this property to build an anisotropic double cross search algorithm that uses the multiresolution-spatio-temporal context.
