Single Image Super-Resolution: A Benchmark
Chih-Yuan Yang (1), Chao Ma (2), Ming-Hsuan Yang (1)
(1) UC Merced, (2) Shanghai Jiao Tong University
Motivation We want to answer three questions. Which super-resolution algorithm performs best? How does the blur kernel width influence the results? Which metric should be used for evaluation?
Approach (step 1) We collect 11 state-of-the-art super-resolution algorithms:
1. Bicubic interpolation
2. Back projection (Irani 93: IP)
3. Fast image/video (Shan 07: SLJT)
4. Gradient profile (Sun 08: SSXS)
5. Self example (Glasner 09: GBI)
6. Sparse regression (Kim 10: KK)
7. Sparse representation (Yang 10: YWHM)
8. Local self example (Freedman 11: FF)
9. Adaptive regularization (Dong 11: DZSW)
10. Simple function (Yang 13: YY)
11. Anchored neighborhood regression (Timofte 13: TSG)
Approach (step 2) We vary 2 parameters, the scaling factor (6 values) and the blur kernel width (9 values), to generate super-resolution images from 2 datasets: the Berkeley segmentation dataset (200 images) and the LIVE dataset (29 images).
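The parameter grid of step 2 can be sketched in a few lines of Python. The scaling factors are the ones shown on the results slide (s = 2, 3, 4, 5, 6, 8). The slides state only that 9 kernel widths are used, so the values in `KERNEL_WIDTHS` are placeholders assumed for illustration, as are all names in the sketch.

```python
from itertools import product

# Scaling factors as listed on the results slide (s = 2, 3, 4, 5, 6, 8).
SCALING_FACTORS = [2, 3, 4, 5, 6, 8]

# ASSUMPTION: the slides say only that 9 kernel widths are used. These
# evenly spaced values are placeholders, not the benchmark's settings.
KERNEL_WIDTHS = [round(0.4 + 0.2 * i, 1) for i in range(9)]

# Image counts stated on the slide.
DATASET_SIZES = {"BSD": 200, "LIVE": 29}


def degradation_settings():
    """Return every (scaling factor, kernel width) pair in the grid."""
    return list(product(SCALING_FACTORS, KERNEL_WIDTHS))


settings = degradation_settings()
num_images = sum(DATASET_SIZES.values())

print(len(settings))               # 54 settings (6 x 9)
print(len(settings) * num_images)  # 12366 low-resolution images at most
```

The second number is an upper bound that assumes every image is degraded under every setting; the slides do not say whether all combinations are actually used.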
Approach (step 3) We conduct a human subject study to collect perceptual scores, then compute the rank correlation coefficient between the perceptual scores and 8 metric indices:
1. PSNR
2. Weighted PSNR
3. SSIM
4. Multi-scale SSIM
5. VIF (visual information fidelity)
6. UIQI (universal image quality index)
7. IFC (information fidelity criterion)
8. NQM (noise quality measure)
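A rank correlation between perceptual scores and a metric can be sketched as follows. The slides do not name the coefficient, so Spearman's rho (Pearson correlation applied to ranks, with ties given their average rank) is an assumption here. The sample scores and metric values are made up for illustration.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: one perceptual score and one metric value per image.
perceptual_scores = [2.0, 4.5, 6.0, 7.5, 9.0]
metric_values = [0.8, 1.1, 2.3, 2.0, 3.5]
rho = spearman(perceptual_scores, metric_values)
print(round(rho, 3))
```

A metric whose ordering of the images matches the human ordering exactly would give a coefficient of 1, so higher values indicate better agreement with perception.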
Flow chart
(1) Prepare ground truth images
(2) Generate low-resolution images
(3) Generate super-resolution images
(4) Compute metric indices
(5) Compute correlation coefficients
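As an illustration of the metric-computation step of the flow chart, PSNR (the first of the eight metric indices) can be sketched with its standard definition, 10 log10(peak^2 / MSE). This is a minimal pure-Python version that assumes grayscale images stored as lists of rows; the function name and sample pixels are hypothetical, and the benchmark's MATLAB code may differ in details such as color handling or border cropping.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are lists of rows of pixel intensities; `peak` is the largest
    possible intensity (255 for 8-bit images).
    """
    ref_pixels = [p for row in reference for p in row]
    est_pixels = [p for row in estimate for p in row]
    if len(ref_pixels) != len(est_pixels) or not ref_pixels:
        raise ValueError("images must be non-empty and the same size")
    mse = sum((r - e) ** 2 for r, e in zip(ref_pixels, est_pixels)) / len(ref_pixels)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 2x2 images: every pixel of the estimate is off by one level.
ground_truth = [[52, 55], [61, 59]]
super_resolved = [[53, 54], [62, 60]]
print(psnr(ground_truth, super_resolved))  # roughly 48.13 dB
```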
Averaged Metric Indices [Figure: averaged metric indices on the BSD dataset (200 images) and the LIVE dataset (29 images) for scaling factors s = 2, 3, 4, 5, 6, 8]
We find (BSD dataset, 200 images; s = 2, 3, 4) that the SLJT, FF, and DZSW methods generate misaligned super-resolution results and low metric indices.
We find (BSD dataset, 200 images; s = 2, 3, 4) that the best Gaussian kernel width is proportional to the scaling factor.
Reason The information retained in a low-resolution image is determined by 2 factors:
1. blurring
2. subsampling
When the subsampling ratio is larger, a larger kernel maximizes the information retained in the low-resolution image.
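The two factors can be illustrated with a minimal degradation sketch: Gaussian blur followed by subsampling. To stay short it works on a 1-D signal rather than an image. The function names, the kernel radius of three standard deviations, and the edge replication are assumptions made for this sketch, not details taken from the benchmark.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Return a normalized 1-D Gaussian kernel with standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if radius is None:
        radius = max(1, math.ceil(3 * sigma))  # ASSUMPTION: truncate at 3 sigma
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Convolve the signal with a Gaussian kernel, replicating edge samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    blurred = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * signal[j]
        blurred.append(acc)
    return blurred


def degrade(signal, scale, sigma):
    """Blur with kernel width sigma, then keep every scale-th sample."""
    return blur(signal, sigma)[::scale]


# Hypothetical high-resolution signal: a step edge of 16 samples.
high_res = [0.0] * 8 + [1.0] * 8
low_res = degrade(high_res, scale=4, sigma=1.2)
print(len(low_res))  # 4 samples remain after subsampling by 4
```

Calling `degrade` with different `sigma` values for a fixed `scale` gives a simple way to see how the kernel width changes what survives subsampling.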
We find that all algorithms work well for smooth images but poorly for highly textured images. [Figure: PSNR plotted against image index; images range from easiest to most challenging]
Reason All tested algorithms rely on appearance features and statistical approaches. They therefore handle smooth regions effectively but have difficulty reconstructing textures.
Perceptual correlations Best: IFC. Worst: VIF. [Figure labels: PSNR, SSIM]
Reason IFC is a metric modelled on natural image priors based on high-frequency features. Our test images are all natural images. The perceptual scores are determined by the reconstructed high-frequency details.
Conclusions The IFC metric shows higher correlation with perceptual scores than PSNR and SSIM. Existing algorithms have difficulty reconstructing high-frequency textures. A scaling factor of 4 is already challenging.
Future Work How can the limitations of visual features and statistical approaches be overcome? How can super-resolution results be evaluated without a ground truth image?
Code and datasets available 11 algorithms in MATLAB: 4 are our implementations (IP, SSXS, GBI, FF) and 7 are the original releases. 400 perceptual scores. 130,000 super-resolution images. 1M evaluation values.