1
DoCoMo USA Labs. All Rights Reserved. Sandeep Kanumuri, NML
Fast super-resolution of video sequences using sparse directional transforms*
Sandeep Kanumuri, Onur G. Guleryuz
DoCoMo USA Labs
*Presented at the 2008 SIAM Conference on Imaging Science on 07/09/2008
(Animated slides, please use slide show mode)
2
Outline
System Model
Motivation
Prior Work
Our Solution: SWAT (Sparse Warped transform and Adaptive Thresholding)
– Algorithm Flowchart
– Over-complete Transform
– Warped (Directional) Transform
– Over-complete Inverse Transform
– Adaptive Thresholding
Performance Comparison
Conclusion
3
System Model
Design goals
1. High Quality Rendering
2. Fast Algorithm (Lower Complexity)
– Single Frame, Simple Transform
4
Motivation
5
Broadcast Video – TV application
Low-resolution video signal for mobile phones
Path A (A.1, A.2), via a docking station
– Low-resolution video is sent to the docking station
– Docking station uses the SWAT algorithm to convert low-resolution video to high-resolution video
– High-resolution video is sent to a TV or a large display
Path B, directly from the cell phone
– Low-resolution video is converted to high-resolution video by the cell phone itself using the SWAT algorithm, and high-resolution video is transmitted to the TV using local wireless technologies
Only one path (Path A or Path B) is used
BENEFIT: Broadcast programming aimed at mobile phones can also be used in stationary environments
6
Broadcast Video – VGA phones
Low-resolution video signal for mobile phones
VGA phone with SWAT capability
VGA phone without SWAT capability
BENEFIT: SWAT capability allows this cell phone to convert low-resolution video to high-resolution video
7
More Applications…
Video Quality Enhancement Service
– SWAT algorithm can be deployed as a service to enhance the resolution and quality of videos
Video Conferencing
– A SWAT-equipped terminal can show video at a higher zoom level and with improved quality
High-quality Image Zooming
– SWAT algorithm enables the mobile phone to convert a low-quality, low-resolution image into a high-quality, high-resolution image
8
Prior Work
Linear solutions
– Filter design
Non-linear solutions
– Regularization (Projection onto the model space)
Signal Sparsity
– Iterated Denoising / Shrinkage
– Lp-Norm Minimization
Optical Flow
Adaptive filtering
Example-based approaches
– Data Consistency (Projection onto the input space)
9
SWAT Algorithm Flowchart
Input Image/Video (low-resolution, low quality)
→ Linear Interpolation Filter (result: high-resolution, low quality)
→ Regularization: Directional Over-complete Transform → Adaptive Thresholding → Directional Over-complete Inverse Transform
→ Enforce Data Consistency
→ More iterations? (yes: repeat Regularization; no: output)
Output Image/Video (high-resolution, high quality)
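The control flow of the flowchart can be sketched as a short loop. This is only a skeleton under stated assumptions: the function name `super_resolve` and the three toy stages (`repeat_samples`, `smooth`, `keep_pair_means`) are hypothetical 1-D stand-ins chosen so the example runs, not the SWAT stages themselves.

```python
def super_resolve(low_res, interpolate, regularize, enforce_consistency, iterations=2):
    """Interpolate once, then alternate regularization and data consistency."""
    estimate = interpolate(low_res)              # high-resolution, low quality
    for _ in range(iterations):
        estimate = regularize(estimate)          # transform, threshold, inverse transform
        estimate = enforce_consistency(estimate, low_res)
    return estimate


# Toy stand-ins (assumptions for illustration only, not the SWAT stages).
def repeat_samples(signal):
    """2x upsampling by sample repetition."""
    return [v for v in signal for _ in range(2)]


def smooth(signal):
    """3-tap moving average with clamped borders, standing in for regularization."""
    n = len(signal)
    return [(signal[max(i - 1, 0)] + signal[i] + signal[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]


def keep_pair_means(estimate, low_res):
    """Shift each pair of samples so that its mean equals the low-resolution sample."""
    out = []
    for k, target in enumerate(low_res):
        a, b = estimate[2 * k], estimate[2 * k + 1]
        shift = target - (a + b) / 2.0
        out.extend([a + shift, b + shift])
    return out


low_res = [0.0, 4.0, 8.0]
result = super_resolve(low_res, repeat_samples, smooth, keep_pair_means)
```

Because data consistency is the last step of every pass, the output always agrees with the low-resolution input under the assumed downsampling, whatever the regularizer did.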
10
Linear Interpolation Filter
A linear interpolation filter is used to form an initial estimate of the high-resolution image/video
– However, the quality of interpolation is relatively low
Popular filter choice
– Low pass filter of Daubechies 7/9 Inverse Wavelet
– H.264 Interpolation Filter
A customized linear interpolation filter can be used if any of the following is known
– Downsampling filter (if the input was obtained by downsampling a higher resolution original)
– Filtering caused by the camera acquisition process
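As an illustration of one of the filter choices named above, here is a minimal 1-D sketch of 2x upsampling with the H.264 six-tap half-sample luma filter (1, -5, 20, 20, -5, 1)/32. The function name `upsample_2x`, the border clamping and the 8-bit clipping are assumptions made for this example; the slides do not say how the filter was applied to full frames.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # H.264 half-sample luma filter; taps sum to 32


def upsample_2x(samples):
    """Keep each integer sample and insert a filtered half-sample after it.

    `samples` is a list of integers in 0..255. Indices outside the signal are
    clamped to the nearest valid sample (an assumption for this sketch).
    """
    n = len(samples)
    out = []
    for i in range(n):
        out.append(samples[i])
        acc = 0
        for k, tap in enumerate(TAPS):
            j = min(max(i - 2 + k, 0), n - 1)  # positions i-2 .. i+3
            acc += tap * samples[j]
        half = (acc + 16) >> 5                 # divide by 32 with rounding offset
        out.append(min(max(half, 0), 255))     # clip to the 8-bit range
    return out


flat = upsample_2x([100] * 5)
ramp = upsample_2x([0, 10, 20, 30, 40, 50, 60, 70])
```

On a flat signal the filter reproduces the input value, and on a linear ramp the inserted sample falls midway between its two neighbours, which is the expected behaviour of a low-pass interpolator.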
11
Core idea – Exploit Signal Sparsity
[Figure: signal s(n), n = 0 to N-1 (Signal Domain); its coefficients S(k), k = 0 to N-1 (Sparse Decomposition Domain), with thresholds +T and -T; denoised coefficients Ĉ(k)]
C(k) = S(k) + W(k), where W(k) is "noise"
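The sparsity idea can be tried on a toy signal: transform, zero every coefficient whose magnitude does not exceed T, and transform back. The sketch below assumes an orthonormal DCT-II as the sparsifying transform and plain hard thresholding; the helper names and the test signal are made up for illustration.

```python
import math


def dct_matrix(n):
    """Rows of the orthonormal DCT-II basis of size n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def forward(basis, signal):
    """C(k): projection of the signal onto each basis row."""
    return [sum(b * s for b, s in zip(row, signal)) for row in basis]


def inverse(basis, coeffs):
    """Rebuild the signal; the transpose inverts an orthonormal basis."""
    n = len(coeffs)
    return [sum(basis[k][i] * coeffs[k] for k in range(n)) for i in range(n)]


def hard_threshold(coeffs, t):
    """Keep coefficients with |C(k)| > t and zero the rest."""
    return [c if abs(c) > t else 0.0 for c in coeffs]


basis = dct_matrix(8)
# A constant signal (sparse in the DCT domain) plus small alternating "noise".
noisy = [10.0 + (0.1 if i % 2 == 0 else -0.1) for i in range(8)]
denoised = inverse(basis, hard_threshold(forward(basis, noisy), 1.0))
```

Here the constant part of the signal lives in a single large coefficient, while the noise is spread over small ones, so thresholding removes the noise and leaves the signal.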
12
Over-complete Transform
Transform size: 4x4 (used for description), 3x3
Transform used: DCT, Hadamard
For an Over-complete Transform
– All possible 4x4 blocks in the image/frame are selected using a non-directional mask
– Each 4x4 block undergoes a transform to produce a set of transformed coefficients
– Each pixel is involved in multiple transforms (16 for interior pixels, fewer near the image borders)
– Total number of transformed coefficients ~ 16 x number of pixels
Directional Over-complete Transform
– Here, each of the 4x4 blocks is formed by applying a directional mask followed by a warping process (see next slide)
[Figure: Blocks of an Over-complete Transform. A non-directional mask selects each 4x4 block; blocks run from Block (1,1) to Block (H-3,W-3), where H = height of image and W = width of image]
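A minimal sketch of the non-directional over-complete transform: every 4x4 block at every pixel offset is transformed, giving (H-3) x (W-3) blocks of 16 coefficients each. It uses the 4x4 Hadamard transform, one of the two transforms listed above; the scaling by 1/4, the 0-based block indices and the function names are choices made for this example.

```python
# 4x4 Hadamard matrix (Sylvester ordering); it is symmetric.
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def hadamard_2d(block):
    """Separable 2-D Hadamard transform, Y = H X H^T / 4 (orthonormal scaling)."""
    tmp = [[sum(H4[i][k] * block[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    return [[sum(tmp[i][k] * H4[j][k] for k in range(4)) / 4.0 for j in range(4)]
            for i in range(4)]


def overcomplete_transform(image):
    """Transform every 4x4 block; keys are the (row, col) of the top-left pixel."""
    h, w = len(image), len(image[0])
    coeffs = {}
    for r in range(h - 3):
        for c in range(w - 3):
            block = [row[c:c + 4] for row in image[r:r + 4]]
            coeffs[(r, c)] = hadamard_2d(block)
    return coeffs


image = [[2.0] * 7 for _ in range(6)]        # flat 6x7 test image
coeffs = overcomplete_transform(image)       # (6-3) * (7-3) = 12 blocks
```

For the flat test image each block has a single non-zero (DC) coefficient, which is the sparse case the thresholding step relies on.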
13
Warped (Directional) Transform
Signal sparsity in DCT domain holds for horizontal and vertical edges but is violated on directional edges
Transform domain: 4x4 DCT
Let us consider 4 blocks along the edge
– First, using Non-directional masks
– Now, using Directional masks
– Directional masks lead to sparse representation
Transform support is warped
For Directional Over-complete Transform, Directional masks replace the Non-directional mask
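The effect of warping the transform support can be seen on a toy diagonal edge. The sketch below assumes a directional mask that shears the block by one pixel per row (the `shear` parameter is hypothetical; the slides do not give the mask geometry) and uses a 4x4 Hadamard transform instead of the DCT to keep the arithmetic exact.

```python
H4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def hadamard_2d(block):
    """Separable 2-D Hadamard transform, Y = H X H^T / 4."""
    tmp = [[sum(H4[i][k] * block[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    return [[sum(tmp[i][k] * H4[j][k] for k in range(4)) / 4.0 for j in range(4)]
            for i in range(4)]


def extract_block(image, top, left, shear=0, size=4):
    """Row i of the block starts at column left + i * shear.

    shear=0 is the ordinary square (non-directional) block; shear=1 follows a
    45-degree direction and "warps" it back into a square block.
    """
    return [[image[top + i][left + j + i * shear] for j in range(size)]
            for i in range(size)]


def count_nonzero(coeffs, eps=1e-9):
    return sum(1 for row in coeffs for v in row if abs(v) > eps)


# Toy image with a diagonal edge: bright above the main diagonal, dark below.
image = [[10 if c > r else 0 for c in range(12)] for r in range(12)]

plain = extract_block(image, 2, 1, shear=0)
warped = extract_block(image, 2, 1, shear=1)
plain_nonzero = count_nonzero(hadamard_2d(plain))
warped_nonzero = count_nonzero(hadamard_2d(warped))
```

In the warped block the diagonal edge becomes a vertical one, so every row is identical and far fewer coefficients are non-zero than in the square block.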
14
How to choose a mask?
Decision made for a block (4x4) of pixels
– At each pixel, a vote is cast for the mask that minimizes the signal variance along the mask direction
– The mask with the most votes is chosen
Reduces inconsistency in directions
[Figure: Example masks]
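The voting rule can be sketched as follows. The four candidate directions, the three-sample neighbourhood used to measure variance and the border clamping are all assumptions for this example; the actual mask set is only shown as a figure in the slides.

```python
from collections import Counter

# Hypothetical candidate directions as (row step, column step).
DIRECTIONS = {
    "horizontal": (0, 1),
    "vertical": (1, 0),
    "diagonal": (1, 1),
    "anti-diagonal": (1, -1),
}


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def variance_along(image, r, c, step, radius=1):
    """Variance of the samples centred at (r, c) along `step`, with clamped borders."""
    h, w = len(image), len(image[0])
    dr, dc = step
    samples = []
    for t in range(-radius, radius + 1):
        rr = min(max(r + t * dr, 0), h - 1)
        cc = min(max(c + t * dc, 0), w - 1)
        samples.append(image[rr][cc])
    return variance(samples)


def choose_mask(image, top, left, size=4):
    """Each pixel votes for its minimum-variance direction; the majority wins."""
    votes = Counter()
    for r in range(top, top + size):
        for c in range(left, left + size):
            best = min(DIRECTIONS,
                       key=lambda name: variance_along(image, r, c, DIRECTIONS[name]))
            votes[best] += 1
    return votes.most_common(1)[0][0]


# Intensity changes only from column to column, so the signal is constant vertically.
image = [[c * c for c in range(8)] for _ in range(8)]
chosen = choose_mask(image, 2, 2)
```

Taking one decision per block rather than per pixel is what gives the consistency mentioned above: isolated pixels with a noisy direction estimate are outvoted by their neighbours.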
15
Over-complete Inverse Transform
For an Over-complete Inverse Transform
– Each set of transformed coefficients is converted back to pixel domain
– Each pixel has multiple estimates from different blocks, and a weighted combination is used to arrive at its final estimate
[Figure: block estimates combined with weights W1, W2, W3, and so on with all the blocks]
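The weighted combination can be sketched as a normalized accumulation over all blocks that cover a pixel. The slides do not say how the weights are derived, so in this example they are simply passed in per block; `combine_estimates` and the toy data are hypothetical.

```python
def combine_estimates(height, width, block_estimates, block_weights, size=4):
    """Weighted average of overlapping block estimates.

    block_estimates maps (top, left) to a size x size block of pixel estimates
    (the inverse-transformed block); block_weights maps (top, left) to a weight.
    """
    acc = [[0.0] * width for _ in range(height)]
    weight_sum = [[0.0] * width for _ in range(height)]
    for (top, left), block in block_estimates.items():
        wgt = block_weights[(top, left)]
        for i in range(size):
            for j in range(size):
                acc[top + i][left + j] += wgt * block[i][j]
                weight_sum[top + i][left + j] += wgt
    return [[acc[r][c] / weight_sum[r][c] if weight_sum[r][c] > 0 else 0.0
             for c in range(width)]
            for r in range(height)]


# Two overlapping 4x4 blocks in a 4x5 image.
estimates = {
    (0, 0): [[1.0] * 4 for _ in range(4)],
    (0, 1): [[4.0] * 4 for _ in range(4)],
}
weights = {(0, 0): 1.0, (0, 1): 2.0}
combined = combine_estimates(4, 5, estimates, weights)
```

Pixels covered by a single block keep that block's estimate, while pixels in the overlap get (1 x 1.0 + 2 x 4.0) / 3 = 3.0.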
16
Adaptive Thresholding
Transform coefficients are thresholded for denoising
A master threshold is used for an initial pass
A local threshold is calculated and finally used
– E_lost: Energy lost due to thresholding when the master threshold is used as threshold
Parameters f_1 to f_n and E_1 to E_n are tuned to achieve a local optimum
[Figure: piecewise-linear function f(E_lost) with breakpoints (E_1, f_1), (E_2, f_2), …, (E_n, f_n); axes start at (0,0), with 1 marked on the vertical axis]
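One way to read this slide is sketched below, with two loud assumptions: that the local threshold is the master threshold scaled by f(E_lost), and that f is linearly interpolated between its breakpoints and held constant outside them. The breakpoint values used here are invented for the example; the tuned parameters are not given in the slides.

```python
def energy_lost(coeffs, threshold):
    """Energy of the coefficients that thresholding at `threshold` would zero."""
    return sum(c * c for c in coeffs if abs(c) <= threshold)


def piecewise_linear(x, xs, ys):
    """Linear interpolation through (xs[i], ys[i]); constant outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(1, len(xs)):
        if x <= xs[i]:
            t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])


def local_threshold(coeffs, master, e_points, f_points):
    """Assumed rule: local threshold = master threshold * f(E_lost)."""
    return master * piecewise_linear(energy_lost(coeffs, master), e_points, f_points)


# Hypothetical breakpoints (E_i, f_i); the real values are tuned parameters.
E_POINTS = [0.0, 50.0]
F_POINTS = [1.0, 0.5]

coeffs = [10.0, 3.0, -4.0, 0.5]
lost = energy_lost(coeffs, 5.0)                        # 9 + 16 + 0.25
local = local_threshold(coeffs, 5.0, E_POINTS, F_POINTS)
```

With these made-up breakpoints, a block that would lose a lot of energy under the master threshold gets a lower local threshold, so more of its detail survives.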
17
Enforcing Data Consistency
Role of data consistency module
– Ensure that the high-resolution estimate, when downsampled, can produce the low-resolution input
[Figure: Data Consistency module. The High-resolution Input passes through the Downsampling Filter and is subtracted from the Low-resolution Input; the difference passes through the Linear Interpolation Filter and is added to the High-resolution Input to give the High-resolution Output]
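The data consistency step can be sketched in 1-D as: output = estimate + interpolate(low-resolution input - downsample(estimate)). The pair-averaging downsampler and the sample-repeating interpolator below are assumptions picked so that the result is exactly consistent; the slides use other filters.

```python
def downsample(high):
    """Assumed downsampling filter: average each pair of samples."""
    return [(high[2 * k] + high[2 * k + 1]) / 2.0 for k in range(len(high) // 2)]


def interpolate(low):
    """Assumed interpolation filter: repeat each sample twice."""
    return [v for v in low for _ in range(2)]


def enforce_consistency(high_estimate, low_input):
    """Add the interpolated low-resolution residual back to the estimate."""
    residual = [y - d for y, d in zip(low_input, downsample(high_estimate))]
    correction = interpolate(residual)
    return [x + c for x, c in zip(high_estimate, correction)]


low_input = [1.0, 5.0, 3.0]
high_estimate = [0.0, 2.5, 4.0, 7.0, 3.0, 2.0]
consistent = enforce_consistency(high_estimate, low_input)
```

With this filter pair, downsampling the interpolated residual returns the residual itself, so the corrected estimate reproduces the low-resolution input in a single step. With other filter pairs the correction generally only reduces the mismatch.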
18
Performance Comparison
Super-resolution of QCIF to CIF sequences
– Low pass filter from Daubechies 7/9 wavelet filter bank
– Compression is done using H.264/AVC codec (JM12.0)
SWAT run with 2 iterations
Compared with
– Bilinear interpolation
– H.264 interpolation
– Simple Inverse
– Iterated Denoising / Shrinkage (ID): 2 iterations (similar complexity compared to SWAT) and 10 iterations
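The comparisons that follow are reported as PSNR. For reference, a minimal sketch of the standard definition, PSNR = 10 log10(peak^2 / MSE), assuming 8-bit samples (peak 255) and frames flattened to lists; the function name and sample values are illustrative only.

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf          # identical signals
    return 10.0 * math.log10(peak * peak / mse)


original = [100, 100, 100, 100]
reconstructed = [101, 99, 100, 100]
value = psnr(original, reconstructed)    # MSE = 0.5
```

A higher PSNR means the reconstructed frame is closer to the original high-resolution frame in the mean-squared sense.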
19
PSNR comparison (uncompressed)
20
PSNR comparison (uncompressed)
21
PSNR comparison (uncompressed)
22
Visual Comparison (uncompressed): H264, ID (2 iterations), SWAT
23
Visual Comparison (uncompressed): H264, ID (2 iterations), SWAT
24
PSNR comparison (compression at QP=20)
25
PSNR comparison (compression at QP=25)
26
Visual Comparison (compression at QP=25): H264, SWAT
27
Visual Comparison (compression at QP=25): H264, SWAT
28
Conclusion
SWAT algorithm renders high quality output and yet remains fast
– Quality comparable to ID (10 iterations)
– Complexity comparable to ID (2 iterations)
Enabling Features
– Over-complete transform representation
– Simple basic transform (Hadamard, Integer DCT)
– Sparse warped transform
– Adaptive thresholding
– Weighted inverse transform
Reference
– S. Kanumuri, O. G. Guleryuz and M. R. Civanlar, "Fast super-resolution reconstructions of mobile video using warped transforms and adaptive thresholding", SPIE Applications of Digital Image Processing XXX, August 2007
Flicker Reduction Application
– To appear in SPIE 2008 (Applications of Digital Image Processing XXXI)
E-mail:
– Sandeep Kanumuri (skanumuri@docomolabs-usa.com)
– Onur G. Guleryuz (guleryuz@docomolabs-usa.com)