On the Efficiency of Image Metrics for Evaluating the Visual Quality of 3D Models
Guillaume Lavoué, Université de Lyon, LIRIS
Mohamed Chaker Larabi, Université de Poitiers, XLIM-SIC
Libor Vasa, University of West Bohemia
[Figure: an illustration showing an original model and several distorted versions (watermarking [Wang et al. 2011], smoothing [Taubin 2000], watermarking [Cho et al. 2006], simplification [Lindstrom, Turk 2000], noise addition), all sharing the same maximum root mean square error (1.05 × 10⁻³) yet exhibiting very different visual quality.]
Quality metrics for static meshes
MSDM [Lavoué et al. 2006], MSDM2 [Lavoué 2011], [Torkhani et al. 2012].
Pipeline: local curvature statistics are computed on the original and distorted models, the two models are matched, local differences of statistics yield a local distortion map, and spatial pooling produces a global distortion score.
Our previous works: why not use image quality metrics instead?
Render views of the original and distorted models, compare them with an image metric, and combine the results into a distortion score. Such an image-based approach has already been used for driving simplification [Lindstrom, Turk 2000][Qu, Meyer 2008].
Our study
- Determine the best set of parameters for such an image-based quality assessment approach.
- Compare this approach to the best-performing model-based metrics.
Many parameters
- Which 2D metric to use?
- How many views, and which views?
- How to combine the 2D scores?
- Which rendering and lighting?
In our study, we consider:
- 6 image metrics
- 2 rendering algorithms
- 9 lighting conditions
- 5 ways of combining image metric results
- 4 databases to evaluate the results
Around 100,000 images in total.
Image quality metrics
Simple metrics: PSNR and root mean square error (RMSE).
State-of-the-art algorithms:
- MSSIM (multi-scale SSIM) [Wang et al. 2003]
- VIF (visual information fidelity) [Sheikh and Bovik 2006]
- IWSSIM (information content weighted SSIM) [Wang and Li 2011]
- FSIM (feature similarity index) [Zhang et al. 2011]
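As an illustration, here is a minimal sketch of comparing one rendered view of the distorted model against the same view of the reference, assuming scikit-image. Note that skimage only provides PSNR and single-scale SSIM; the MSSIM, IWSSIM, VIF and FSIM variants used in the study require dedicated implementations.

```python
# Minimal sketch, assuming scikit-image: compare two rendered views with the
# simpler metrics. compare_views is a hypothetical helper, not the authors' code.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def compare_views(ref_view: np.ndarray, dist_view: np.ndarray) -> dict:
    """ref_view, dist_view: grayscale float images in [0, 1], same camera."""
    return {
        "psnr": peak_signal_noise_ratio(ref_view, dist_view, data_range=1.0),
        "ssim": structural_similarity(ref_view, dist_view, data_range=1.0),
    }
```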
Generation of 2D views and lighting conditions
- 42 cameras placed uniformly around the object.
- Rendering uses a single white directional light source.
- The light is fixed either with respect to the camera or with respect to the object.
- 3 light positions: front, top, top-right.
- This gives 3 × 2 = 6 lighting conditions.
- We also consider averages over the object-light conditions, over the camera-light conditions, and over all conditions (global), for 9 conditions in total.
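The slides only state that the 42 cameras are uniform; one standard construction that yields exactly 42 near-uniform directions is a once-subdivided icosahedron (12 vertices plus 30 edge midpoints, projected onto the unit sphere). The sketch below uses that construction as an assumption, not the authors' exact placement.

```python
# Sketch: 42 near-uniform camera directions from a once-subdivided icosahedron.
import itertools
import numpy as np

def camera_directions_42() -> np.ndarray:
    phi = (1 + 5 ** 0.5) / 2  # golden ratio
    verts = np.array([(0, s1, s2 * phi) for s1 in (-1, 1) for s2 in (-1, 1)] +
                     [(s1, s2 * phi, 0) for s1 in (-1, 1) for s2 in (-1, 1)] +
                     [(s1 * phi, 0, s2) for s1 in (-1, 1) for s2 in (-1, 1)],
                     dtype=float)
    # The 30 icosahedron edges connect vertex pairs at the minimal
    # inter-vertex distance (squared distance 4 for these coordinates).
    edges = [(i, j) for i, j in itertools.combinations(range(12), 2)
             if np.isclose(((verts[i] - verts[j]) ** 2).sum(), 4.0)]
    mids = np.array([(verts[i] + verts[j]) / 2 for i, j in edges])
    dirs = np.vstack([verts, mids])               # 12 + 30 = 42 directions
    return dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
```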
Image rendering protocols
We consider two ways of computing the normals: with or without averaging over the neighborhood (i.e., smooth versus flat shading).
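A sketch of the two variants for a triangle mesh; the function names and the implicit area weighting are illustrative choices, since the slide only states "with or without averaging".

```python
# Flat per-face normals versus vertex normals averaged over incident faces.
import numpy as np

def face_normals(V: np.ndarray, F: np.ndarray) -> np.ndarray:
    """V: (n, 3) vertex positions, F: (m, 3) triangle indices -> flat shading."""
    n = np.cross(V[F[:, 1]] - V[F[:, 0]], V[F[:, 2]] - V[F[:, 0]])
    return n / np.linalg.norm(n, axis=1, keepdims=True)

def averaged_vertex_normals(V: np.ndarray, F: np.ndarray) -> np.ndarray:
    """Accumulate face normals at each vertex -> smooth shading."""
    fn = np.cross(V[F[:, 1]] - V[F[:, 0]], V[F[:, 2]] - V[F[:, 0]])
    vn = np.zeros_like(V)
    np.add.at(vn, F.ravel(), np.repeat(fn, 3, axis=0))  # sum over incident faces
    return vn / np.linalg.norm(vn, axis=1, keepdims=True)
```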
Pooling algorithms
How do we combine the per-image quality scores into a single one? The Minkowski norm is popular:

$Q = \left( \frac{1}{N} \sum_{i=1}^{N} q_i^{\,p} \right)^{1/p}$

where $q_i$ is the score of the $i$-th view, $N$ is the number of views, and $p$ is the Minkowski exponent.
We also consider image importance weights [Secord et al. 2011]:
- perceptual model of viewpoint preference
- surface visibility
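A sketch of Minkowski pooling with optional importance weights; normalizing the weights to sum to 1 is an assumption on my part.

```python
# Minkowski pooling over the per-view scores q_i, with optional importance
# weights (e.g., viewpoint-preference or visibility weights in the spirit of
# Secord et al. 2011). Weight normalization is assumed, not taken from the paper.
import numpy as np

def minkowski_pool(scores, p, weights=None):
    scores = np.asarray(scores, dtype=float)
    w = np.ones_like(scores) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()                               # uniform weights reproduce 1/N
    return float((w * scores ** p).sum() ** (1.0 / p))
```

With p = 1 this is a plain (weighted) average, while larger values of p put more emphasis on the worst-scoring views.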
The MOS databases
- LIRIS/EPFL General-Purpose database: 88 models (from 40K to 50K vertices) from 4 reference objects; non-uniform noise addition and smoothing.
- LIRIS Masking database: 26 models (from 9K to 40K vertices) from 4 reference objects; noise addition on smooth or rough regions.
- IEETA Simplification database: 30 models (from 2K to 25K vertices) from 5 reference objects; three simplification algorithms.
- UWB Compression database: 68 models from 5 reference objects; different kinds of artefacts from compression.
Results and analysis
We basically have a full factorial experiment, a design heavily used in statistics to study the effect of different factors on a response variable.
We consider 4 factors:
- the metric (6 possible values)
- the lighting (9 possible values)
- the pooling (5 possible values)
- the rendering (2 possible values)
That makes 540 possible combinations.
We consider two response variables (see the sketch below):
- Spearman correlation over all the objects
- Spearman correlation averaged per object
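A sketch of the two response variables, assuming scipy; `mos` holds the subjective scores, `pred` the objective metric scores, and `obj_ids` the reference object of each stimulus (all names are illustrative).

```python
import numpy as np
from scipy.stats import spearmanr

def response_variables(mos, pred, obj_ids):
    mos, pred, obj_ids = map(np.asarray, (mos, pred, obj_ids))
    rho_all, _ = spearmanr(mos, pred)             # correlation over all objects
    per_object = [spearmanr(mos[obj_ids == o], pred[obj_ids == o])[0]
                  for o in np.unique(obj_ids)]
    return rho_all, float(np.mean(per_object))    # correlation averaged per object
```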
Results and analysis
For a given factor with n possible values, we have n sets of paired Spearman coefficients. To estimate the effect of a factor on objective metric performance, we conduct pairwise comparisons between each pair of its values (i.e., n(n-1)/2 comparisons). Because the values are paired, we can do better than simply comparing the means: we apply a statistical significance test (not Student's t-test but the Wilcoxon signed-rank test, since normality of the differences is not guaranteed), and we study the median of the paired differences as well as the 25th and 75th percentiles.
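A sketch of one such pairwise comparison between two values of a factor (e.g., two image metrics) over the same experimental conditions, assuming scipy; `rho_a` and `rho_b` are the paired Spearman coefficients of the two values.

```python
import numpy as np
from scipy.stats import wilcoxon

def compare_factor_values(rho_a, rho_b):
    diff = np.asarray(rho_a) - np.asarray(rho_b)
    _, p_value = wilcoxon(rho_a, rho_b)           # paired, non-parametric test
    q25, median, q75 = np.percentile(diff, [25, 50, 75])
    return {"p_value": p_value, "median_diff": median, "q25": q25, "q75": q75}
```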
Influence of the metrics
- IWSSIM provides the best results.
- FSIM and MSSIM are second best, significantly better than MSE and PSNR.
- VIF provides unstable results (see the percentiles).
Influence of the lighting
- Indirect illumination provides better results.
- The light has to be linked to the camera.
- Object-front is not so bad, but its performance is not stable.
Influence of the pooling
- Low values of p are better.
- The weights do not bring significant improvements.
Comparisons with 3D metrics
- For easy scenarios, 2D metrics are excellent.
- However, when the task becomes more difficult, 3D metrics are better.
- Still, simple image-based metrics are better than simple geometric ones.