Color Image Quality Assessment Part II: Image Quality Metrics

Color Image Quality Assessment Part II: Image Quality Metrics Marius Pedersen The Norwegian Colour and Visual Computing Laboratory, Gjøvik University College, Gjøvik, Norway marius.pedersen@hig.no Jan P. Allebach School of ECE, Purdue University West Lafayette, Indiana allebach@purdue.edu

Synopsis. What is an image quality metric; classification of metrics (mathematically based, low-level based, and high-level based metrics); important factors for metrics (masking, pooling); evaluation of metrics; image quality attributes.

What is an image quality metric? An objective, mathematical way to calculate quality without asking observers. (Diagram: image → metric → measure of quality.)

Different types of metrics. Three main types of metrics: full-reference, no-reference, and reduced-reference. (Diagram: original and reproduction → metric → measure of quality.)

Existing image quality metrics. Metrics usually follow a common framework with several stages: original and reproduction → color space transforms → human visual system models → quality calculation → pooling → quality value. Unless stated otherwise, we focus on full-reference metrics.

Colour space transforms. Preparation for applying a model of the human visual system (HVS). This step is a transformation from RGB (or another colour space) into a more suitable space. This space is usually adapted to the filtering that follows, so that a better representation of the perception of colour is achieved, for example an opponent colour space.

Human visual system models. These models usually simulate low-level features of the HVS, such as contrast sensitivity functions (CSFs) or masking. Other possibilities are high-level features, such as those based on the idea that our human visual system is adapted to extract information or structures from the image.

Quality calculation. Usually the quality calculation is a distance between the original and the reproduction, which assumes that the original has the highest quality. A common choice is the Euclidean distance between two points A and B, $d(A, B) = \sqrt{\sum_i (A_i - B_i)^2}$, computed in a perceptually uniform color space, i.e. a nonlinearly transformed color space in which distance is proportional to one's ability to perceive changes in color. Recently, CIELAB has been the most commonly used. Distance = quality. Eq. from http://en.wikipedia.org/wiki/Euclidean_distance (25/09/12)
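
As a rough illustration of the colour space step that precedes the distance calculation, the sketch below converts a single RGB pixel to CIELAB. It assumes sRGB-encoded input with components in [0, 1] and the D65 white point; the slides do not state which RGB space or white point is used, and the function name `srgb_to_lab` is purely illustrative.

```python
def srgb_to_lab(r, g, b):
    """Convert one sRGB colour (components in [0, 1]) to CIELAB.

    Assumptions (not from the slides): sRGB primaries and D65 white point.
    """
    def linearize(c):
        # Undo the sRGB transfer function to get linear light.
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ (D65).
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # D65 reference white.
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # CIELAB compressive nonlinearity with its linear segment near zero.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    return 116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz)
```

Under these assumptions, white maps to a lightness of about 100 and black to about 0, with both chromatic coordinates close to zero.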

Pooling. Pooling is the reduction of quality values: a quality map (many values) is reduced to fewer values, or values from different metrics are combined into an overall value. Motivation: it is easier to manage one value than many. Most metrics pool by taking the average.

Classification of metrics. To understand the metrics we propose a classification of them into four groups. Mathematically based metrics (e.g. MSE or ΔE*ab) operate only on the intensity of the distortions. Low-level based metrics (e.g. S-CIELAB or S-DEE) take into account the visibility of the distortions using low-level models of the human visual system. High-level based metrics (e.g. SSIM or VIF) are based on the idea that our human visual system is adapted to extract information or structures from the image. Other metrics (e.g. VSNR or CISM) are either based on other strategies or combine two or more of the above groups. M Pedersen, JY Hardeberg. Full-Reference Image Quality Metrics: Classification and Evaluation. Foundations and Trends® in Computer Graphics and Vision 7 (1), 1-80

Mathematically based metrics: MSE. MSE is a mathematically based metric; it calculates the squared error between the original image and the distorted image, averaged over all pixels. MSE is given as: $\mathrm{MSE} = \frac{1}{MN} \sum_{y=0}^{N-1} \sum_{x=0}^{M-1} [I_o(x, y) - I_r(x, y)]^2$, where x and y indicate the pixel position, M and N are the image width and height, and $I_o$ and $I_r$ denote the original and the distorted image. These simple mathematical models are usually not well correlated with perceived image quality, but they have still influenced other metrics.
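
A minimal sketch of the MSE calculation, assuming two grayscale images of equal size stored as lists of rows; the function name and the data layout are illustrative choices, not taken from the slides.

```python
def mse(original, distorted):
    """Mean squared error between two equally sized grayscale images.

    Each image is a list of rows, each row a list of pixel values.
    """
    height = len(original)
    width = len(original[0])
    total = 0.0
    for row_o, row_d in zip(original, distorted):
        for p_o, p_d in zip(row_o, row_d):
            total += (p_o - p_d) ** 2
    # Average the accumulated squared error over all M x N pixels.
    return total / (width * height)
```

Because every pixel contributes equally and only the size of the error counts, a single large error in a flat region and the same error hidden in texture give the same MSE, which is one reason for the weak link to perceived quality.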

Mathematically based metrics: ΔE*ab. Metrics measuring color difference also belong to the group of mathematically based metrics. The color difference is given as: $\Delta E^*_{ab} = \sqrt{(L_r - L_o)^2 + (a_r - a_o)^2 + (b_r - b_o)^2}$, where $L_r, a_r, b_r$ is the sample color and $L_o, a_o, b_o$ is the reference color in CIELAB. ΔE*ab has served as a satisfactory tool for measuring perceptual difference between uniform color patches.

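Mathematically based metrics: ΔE*ab. ΔE*ab has also been used to measure natural images, where the color difference of each pixel of the image is calculated. The mean of these differences is the overall indicator: $\overline{\Delta E^*_{ab}} = \frac{1}{MN} \sum_{y=0}^{N-1} \sum_{x=0}^{M-1} \Delta E^*_{ab}(x, y)$, where M and N are the image width and height.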
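
The per-pixel colour difference and its mean can be sketched as follows, assuming both images are already in CIELAB and given as flat lists of (L, a, b) tuples in the same pixel order; the function names are illustrative.

```python
import math


def delta_e_ab(lab_reference, lab_sample):
    """Euclidean distance between two CIELAB colours (L, a, b)."""
    return math.sqrt(sum((s - r) ** 2 for r, s in zip(lab_reference, lab_sample)))


def mean_delta_e_ab(reference_pixels, sample_pixels):
    """Mean of the per-pixel colour differences between two CIELAB images."""
    differences = [delta_e_ab(r, s) for r, s in zip(reference_pixels, sample_pixels)]
    return sum(differences) / len(differences)
```

The list of per-pixel differences is also what an image difference map visualizes; the mean is only one way of reducing it to a single number.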

Example mathematically based metrics Original image from R. Halonen, M. Nuutinen, R. Asikainen, and P. Oittinen. Development and measurement of the goodness of test images for visual print quality evaluation. In S. P. Farnand and F. Gaykema, editors, Image Quality and System Performance VII, volume 7529, pages 752909–1–10, San Jose, CA, USA, Jan 2010. SPIE.

Example – image difference maps

Low-level based metrics. Low-level based metrics simulate the low-level features of the HVS, such as contrast sensitivity functions (CSFs) or masking. Contrast sensitivity is a measure of the ability to discern between different luminance levels in a static image.

Typical CSFs. As introduced in the first part, the CSF varies with many physical attributes: spatial frequency, orientation, light adaptation level, image area, viewing distance, and retinal eccentricity. Retinal eccentricity is a measure of how far a given point in the visual field is from the fixation point (fovea), and is measured in degrees. Figure from C. A. Bouman: Digital Image Processing - January 9, 2012 (25/09/12: https://engineering.purdue.edu/~bouman/ece637/notes/pdf/Opponent.pdf)

Low-level based metrics: S-CIELAB. ΔE*ab, which was developed for uniform color patches, was not well correlated with perceived image difference. Zhang and Wandell proposed a spatial extension based on ΔE*ab. They had two goals: a spatial filtering to simulate the blurring of the HVS, and consistency with the basic CIELAB calculation for large uniform areas. Zhang, X. & Wandell, B. A. A spatial extension of CIELAB for digital color image reproduction. Soc. Inform. Display 96 Digest, 1996, 731-734

Low-level based metrics: S-CIELAB. The workflow has four steps. Color separation: the image is transformed into the O1O2O3 opponent color space. Spatial filter: the data in each color channel is filtered by a 2-dimensional separable spatial kernel. Color difference: the filtered data is transformed into the CIELAB color space, where ΔE*ab is used to calculate color differences. Pooling: usually by taking the average. Figure from http://white.stanford.edu/~brian/scielab/scielab3/scielab3.pdf (14/09/12)

S-CIELAB CSFs Figure from Johnson, G. M. & Fairchild, M. D. Darwinism of Color Image Difference Models. Color Imaging Conference, 2001, 108-112

Example S-CIELAB. Want to test S-CIELAB? Matlab code is available online at http://white.stanford.edu/~brian/scielab/scielab.html. Load the images Hats and HatsCompressed: load images/hats; load images/hatsCompressed. Define the viewing conditions in samples per degree (SPD), computed as SPD = DPImonitor/((180/pi)*atan(1/NoINCH)), where DPImonitor is the monitor resolution in dots per inch and NoINCH is the viewing distance in inches. We choose two different conditions: SPD = 23 (18 in, 72 dpi) and SPD = 56 (44.5 in, 72 dpi). Then run the S-CIELAB code.
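
The samples-per-degree formula can be checked with a few lines of Python. This is a sketch rather than part of the S-CIELAB distribution; the function and parameter names are illustrative.

```python
import math


def samples_per_degree(dpi, viewing_distance_inches):
    """Number of pixels covered by one degree of visual angle."""
    # Visual angle (in degrees) subtended by one inch of the display.
    degrees_per_inch = math.degrees(math.atan(1.0 / viewing_distance_inches))
    return dpi / degrees_per_inch
```

For a 72 dpi monitor this gives roughly 22.6 samples per degree at 18 inches and roughly 55.9 at 44.5 inches, which round to the two conditions used in the example.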

Example S-CIELAB - maps. 18 inches viewing distance: mean = 3.4, min = 0.4, max = 52.6, median = 2.4. 44.5 inches viewing distance: mean = 2.3, min = 0.02, max = 28.2, median = 1.7.

Other low-level based metrics. Spatial-DEE (S-DEE): this metric follows the S-CIELAB framework, but ΔE*ab is replaced with ΔEE, and the spatial filters are those from Johnson and Fairchild. Adaptive Bilateral Filter (ABF): uses a bilateral filter to blur the image while preserving edges, which is not the case when using CSFs. Simone, G.; Oleari, C. & Farup, I. Performance of the Euclidean color-difference formula in log-compressed OSA-UCS space applied to modified-image-difference metrics. 11th Congress of the International Colour Association (AIC), 2009. Wang, Z. & Hardeberg, J. Y. Development of an adaptive bilateral filter for evaluating color image difference. Journal of Electronic Imaging, 2012, 21, 023021-1-023021-10

Comparison of filtering methods. Different filtering methods: CSFs (S-CIELAB), bilateral filter (from ABF), and CSFs in NSCT (Pedersen et al.). (Figure panels: original, S-CIELAB, ABF, NSCT.) Pedersen, M.; Liu, X. & Farup, I. Improved Simulation of Image Detail Visibility using the Non-Subsampled Contourlet Transform. Color and Imaging Conference, 2013

High level based metrics High-level based metrics quantify quality based on the idea that our HVS is adapted to extract information or structures from the image.

High level based metrics: SSIM SSIM defines the structural information in an image as those attributes that represent the structure of the objects in the scene, independent of the average luminance and contrast. Quantifies perceived change in structural information. Incorporates luminance masking and contrast masking. Figure from Wang, Z.; Bovik, A. C.; Sheikh, H. R. & Simoncelli, E. P. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 2004, 13, 600-612

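High level based metrics: SSIM. The SSIM index is given as: $\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$, where $\mu_x$ and $\mu_y$ are the mean intensities of signals x and y, $\sigma_x$ and $\sigma_y$ are their standard deviations, and $\sigma_{xy}$ is their covariance. Signals x and y are of size M×N. $C_1$ is a constant defined as $C_1 = (K_1 L)^2$, where L is the dynamic range of the image and $K_1 \ll 1$. $C_2$ is similar to $C_1$ and is defined as $C_2 = (K_2 L)^2$, where $K_2 \ll 1$. These constants stabilize the division when the denominator is close to zero.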

High level based metrics: SSIM. SSIM is calculated for local windows in the image. A single value is given as: $\mathrm{MSSIM}(X, Y) = \frac{1}{W} \sum_{j=1}^{W} \mathrm{SSIM}(x_j, y_j)$, where X and Y are the reference and the distorted images, $x_j$ and $y_j$ are the image content in local window j, and W indicates the total number of local windows.
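
The window-level SSIM and its average over windows can be sketched in plain Python. The sketch assumes each window is given as a flat list of grayscale values, weights all pixels in a window equally, and uses the unbiased (n - 1) estimates of variance and covariance; the reference Matlab implementation is not reproduced here, and the function names are illustrative.

```python
def ssim_window(x, y, k1, k2, dynamic_range):
    """SSIM between two windows given as flat lists of pixel values."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Unbiased estimates (an assumption of this sketch).
    var_x = sum((v - mu_x) ** 2 for v in x) / (n - 1)
    var_y = sum((v - mu_y) ** 2 for v in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mean_ssim(windows_x, windows_y, k1, k2, dynamic_range):
    """Average the window-level SSIM values over all W windows."""
    scores = [ssim_window(x, y, k1, k2, dynamic_range)
              for x, y in zip(windows_x, windows_y)]
    return sum(scores) / len(scores)
```

Two identical windows give a value of 1; any difference in mean, variance, or structure lowers the value.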

Example SSIM. Want to test SSIM? Matlab code is available at https://ece.uwaterloo.ca/~z70wang/research/ssim/. Transform the images to grayscale; in the following example rgb2gray() in Matlab was used. Run ssim_index(img,img2). Using default parameters K = [0.05 0.05]; window = ones(8); (window size) L = 100; (dynamic range)

Example SSIM - maps. Mean=0.89, Min=0.23, Max=0.995, median=0.92

Other approaches. Other metrics considered in this group are based on other approaches, or combine two or more of the above groups. Visual Signal to Noise Ratio (VSNR): based on near-threshold and suprathreshold properties of the HVS, incorporating both low-level features and mid-level features. Color image similarity measure (CISM): can be divided into two parts, one dealing with the HVS and one with structural similarity. Generalization: S-CIELAB framework + SSIM. Chandler, D. M. & Hemami, S. S. VSNR: A Wavelet-Based Visual Signal-to-Noise Ratio for Natural Images. IEEE Trans. Image Processing, 2007, 16, 2284-2298. J. Lee and T. Horiuchi. Image quality assessment for color halftone images based on color structural similarity. IEICE Trans. Fundamentals, E91A:1392–1399, 2008.

More on HVS modelling – masking. There are additional aspects of the HVS that can be modeled: luminance masking and contrast masking. Masking also occurs in sound: auditory masking occurs when the perception of one sound is affected by the presence of another sound.

Luminance masking Perception of lightness is a nonlinear function of luminance. Luminance masking: the luminance of the original image signal masks the variations in the distorted signal. Visibility threshold increases as background luminance increases Each image has the same amplitudes but different mean (lowest on the left). As can be seen, the pattern is more noticeable towards the left. When the average brightness is higher, the same amount of regional change amounts to a lower contrast as compared to a lower average brightness. Thus the same variation in a bright region would be less visible than in a darker region. 05/10/12: http://scien.stanford.edu/pages/labsite/1998/psych221/projects/98/dctune/yuke/page2.htm
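
The last point can be checked with simple arithmetic. The sketch below uses Weber contrast (luminance change divided by background luminance) as a stand-in for "contrast"; the slide does not name a specific contrast definition, so this choice, the function name, and the luminance values are assumptions made for illustration.

```python
def weber_contrast(delta_luminance, background_luminance):
    """Luminance change relative to the background it appears on."""
    return delta_luminance / background_luminance


# The same luminance change of 5 units on a dark and a bright background
# (hypothetical values, arbitrary units).
dark = weber_contrast(5, 20)
bright = weber_contrast(5, 100)
```

With these values the contrast is 0.25 on the dark background and 0.05 on the bright one, so the same variation is relatively smaller in the bright region.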

Contrast masking The reduction in visibility of one image component caused by the presence of another image component with similar spatial location and frequency content is called “contrast masking”. Contrast masking can occur within a colour channel, across channels, across subbands, across orientations.

Example contrast masking. A test contrast pattern (left) and three different masking contrast patterns (middle). The sums of the test and the masks are shown to the right. The test pattern is difficult to see when the frequencies of the test and mask are similar. Second example (beach image): Gaussian noise was added to both images, and both images have the same mean luminance. 17/09/15: https://foundationsofvision.stanford.edu/chapter-7-pattern-sensitivity/

Pooling – one step further. Pooling is very important for achieving an IQ metric correlated with the percept. (Diagram: quality map with many values → pooling → fewer values.) Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4):600–612, 2004. Z. Wang and X. Shang. Spatial pooling strategies for perceptual image quality assessment. In International Conference on Image Processing, pages 2945–2948, Atlanta, GA, Oct 2006. IEEE.

Type of pooling. Pooling can usually be applied in three different stages: 1) Spatial pooling: combining values in the quality map in the image domain. Spatial pooling is always needed. 2) Channel pooling: pooling values from different (color) channels. Needed only when the image is decomposed into different channels. 3) Quality attribute pooling: combining several quality maps generated from different quality attributes (e.g. color, lightness). Needed only when different quality maps are calculated for each quality attribute.

General formulation of pooling. A general form of a spatial pooling approach is given by $M = \frac{\sum_i w_i m_i}{\sum_i w_i}$, where $w_i$ is the weight given to the ith location and $m_i$ is the quality measure of the ith location. M is the pooled quality value. Most spatial pooling methods can be formulated in this way. In a simple average pooling method, $w_i$ is the same over the image space. Pooling can be divided into two categories: quality based pooling and content based pooling.
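
The general weighted form can be sketched as a short function, assuming the quality map has been flattened into a list; the function name and the default of equal weights are illustrative choices.

```python
def weighted_pool(quality_map, weights=None):
    """Pool a flattened quality map into one value using per-location weights.

    With no weights given, every location counts equally (average pooling).
    """
    if weights is None:
        weights = [1.0] * len(quality_map)
    weighted_sum = sum(w * m for w, m in zip(weights, quality_map))
    return weighted_sum / sum(weights)
```

Quality based and content based pooling differ only in how the weights are chosen: from the quality values themselves, or from the image content around each location.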

Quality based pooling. Quality based methods assume that the weights $w_i$ are related to the quality value $m_i$ at the ith location of the quality map, i.e. $w_i = f(m_i)$. These methods follow the principle that low quality values should be weighted more heavily compared to higher quality values. Common approaches: Minkowski pooling, monotonic function pooling (Wang and Shang), and percentile pooling (Moorthy and Bovik). Z. Wang and X. Shang. Spatial pooling strategies for perceptual image quality assessment. In International Conference on Image Processing, pages 2945–2948, Atlanta, GA, Oct 2006. IEEE. A. K. Moorthy and A. C. Bovik. Perceptually significant spatial pooling techniques for image quality assessment. In D. E. Rogowitz and T. N. Pappas, editors, Human Vision and Electronic Imaging XIV, volume 7240 of Proceedings of SPIE, page 724012, San Jose, CA, Jan 2009.
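
Minkowski pooling is the simplest of these to sketch. The version below assumes the map holds distortion values where larger means worse quality (as in a ΔE*ab map), so that an exponent above 1 gives more influence to the low-quality regions; the function name and this convention are assumptions of the sketch.

```python
def minkowski_pool(distortion_map, p):
    """Average of the distortion values raised to the power p.

    p = 1 is plain average pooling; larger p emphasizes large distortions.
    """
    return sum(abs(m) ** p for m in distortion_map) / len(distortion_map)
```

Note that the pooled value is on the scale of the distortion raised to the power p, so values obtained with different exponents are not directly comparable.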

Content based pooling. Content based methods assume that the weights $w_i$ are related to the image content in the local region around the ith pixel, i.e. $w_i$ is a function of $c_i$, where $c_i$ is a measure of the perceptual significance of the image content in the local region around the ith location. Assumption: an error that appears in a perceptually significant region is much more annoying than a distortion appearing in an inconspicuous area. Common methods: information-content weighted pooling, gaze based pooling, and saliency pooling.

Evaluation of pooling techniques. Comparing the results from different metrics with different pooling methods against perceptual data. The overall results indicate that pooling parameters are important and that pooling is metric dependent. Gong, M. & Pedersen, M. Spatial Pooling for Measuring Color Printing Quality Attributes. Journal of Visual Communication and Image Representation, 2012, 23, 685-696. http://www.sciencedirect.com/science/article/pii/S1047320312000600 (25/09/12)

Introduction: evaluation of metrics. In order to know if an image quality metric correlates with the human percept, some kind of evaluation of the metric is required. The most common approach is to compare the results of the metrics to the results of human observers.

Pair comparison. In pair comparison experiments observers judge quality based on a comparison of image pairs, i.e. which image in the pair is the best according to a given criterion, for example which has the highest quality or is the least different from an original. These experiments can be either forced-choice, where the observer needs to give an answer, or non-forced, where the observer may judge the two reproductions as equal (tie). No information on the distance between the images is recorded, making it less precise than category judgment, but also less complex. Pair comparison is the most popular method to evaluate e.g. gamut mapping*, and is often preferred due to its simplicity, requiring little knowledge from the user. * CIE. Guidelines for the evaluation of gamut mapping algorithms. Technical Report ISBN: 3-901-906-26-6, CIE TC8-03, 156:2004.

Example pair comparison experiment For the first trial the observer judged the left patch to be closer to the reference, the same with the second trial, and in the third trial the right. The observer judges all combinations of pairs.
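
The bookkeeping of a forced-choice pair comparison experiment can be sketched as below. The stimulus labels, the judgment format (winner, loser), and the function names are hypothetical; the slides do not prescribe a data format or an analysis method.

```python
from itertools import combinations


def all_pairs(stimuli):
    """Every unordered pair of stimuli that the observer has to judge."""
    return list(combinations(stimuli, 2))


def tally_preferences(judgments):
    """Count how often each stimulus was preferred.

    `judgments` is a list of (winner, loser) tuples, one per trial.
    """
    wins = {}
    for winner, loser in judgments:
        wins[winner] = wins.get(winner, 0) + 1
        wins.setdefault(loser, 0)
    return wins
```

For n stimuli there are n(n-1)/2 pairs, so four reproductions require six trials per observer, and the number of trials grows quickly as more reproductions are added.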

Category judgment. In category judgment the observer is instructed to judge an image according to a criterion, and the image is assigned to a category. Five or seven categories are commonly used, with or without a description of the categories. One advantage of category judgment is that information on the distance between images is recorded, but the task is more complex than pair comparison for the observers. Category judgment experiments are often faster than pair comparison, since fewer comparisons are necessary.

Category judgment experiment. (Figure: a reference patch and a test set of patches. In trial 1 the observer assigns each test patch to one of the categories 1-7.)

Rank order. The observer is presented with a number of images and is asked to rank them based on a given criterion. Rank order can be compared to doing a pair comparison of all images simultaneously. If the number of images is high, the task quickly becomes challenging for the observer. However, it is a fast way of judging many images and a simple type of experiment to implement.

Rank order example. The observer ranks the reproductions from best to worst according to a given criterion.

Correlation. The most common measure of correlation is the Pearson product-moment correlation coefficient, which measures the linear correlation between two variables (X and Y). The correlation value r is between −1 and +1.
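
A minimal sketch of the Pearson coefficient, here thought of as the correlation between metric values and observer scores for the same set of images; the function name and the list-based input are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(spread_x * spread_y)
```

A value near +1 or −1 indicates a strong linear relationship; which sign is desirable depends on whether the metric reports quality or difference.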

Non-linear correlation. Metric scores might not linearly fit the results from observers. Solution: non-linear fitting. Sheikh et al. proposed a 5-parameter logistic function: $\mathrm{Quality}(x) = \beta_1 \left( \frac{1}{2} - \frac{1}{1 + e^{\beta_2 (x - \beta_3)}} \right) + \beta_4 x + \beta_5$. Different researchers use different numbers of parameters. Overfitting can be a problem. H. R. Sheikh et al. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Trans. Image Processing, vol. 15, no. 11, pp. 3440-3451, 2006. Image from http://sse.tongji.edu.cn/linzhang/IQA/IQA.htm (04/07/13)
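
The mapping itself is easy to evaluate once the parameters are known. The sketch below only evaluates the 5-parameter logistic function in the form commonly attributed to Sheikh et al.; estimating the parameters requires a nonlinear least-squares fit, which is not shown, and the function and parameter names are illustrative.

```python
import math


def logistic5(x, b1, b2, b3, b4, b5):
    """Map a raw metric value x to a predicted subjective score."""
    logistic_part = 0.5 - 1.0 / (1.0 + math.exp(b2 * (x - b3)))
    return b1 * logistic_part + b4 * x + b5
```

After fitting, the Pearson correlation is computed between the mapped metric values and the observer scores, which is what "non-linear Pearson correlation" usually refers to.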

Other performance measures. Rank correlation (Spearman and Kendall tau). Root-mean-squared error. F-statistic for comparing the variance of two sets of sample points. Outlier ratio (percentage of the predictions that fall outside the range of ±2 standard deviations); this requires access to the individual scores, which are normally not given in databases. Rank order method (Pedersen and Hardeberg, CGIV, 2008). Video Quality Experts Group. Final report from the Video Quality Experts Group on the validation of objective models of multimedia quality assessment, phase I. 2008. Pedersen, M. & Hardeberg, J. Y. Rank Order and Image Difference Metrics. 4th European Conference on Colour in Graphics, Imaging, and Vision (CGIV), IS&T, 2008, 120-125
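
Two of these measures, Spearman rank correlation and root-mean-squared error, can be sketched with the standard library only. Ties receive the average of the ranks they span; the function names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks of the values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sum((a - mean_x) ** 2 for a in rx)
    spread_y = sum((b - mean_y) ** 2 for b in ry)
    return covariance / math.sqrt(spread_x * spread_y)


def rmse(predicted, observed):
    """Root-mean-squared error between predictions and observed scores."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

Rank correlation only asks whether the metric orders the images the same way as the observers, so it is unaffected by any monotonic non-linearity between metric values and scores.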

Example - evaluation of metrics Evaluation of metrics is very important to ensure their performance. Requires a database of images and corresponding subjective scores. Use an existing database or create a new database.

Existing image quality databases. (Table with the columns name, year, color or gray, number of reference images, number of distortion types, number of distortion levels, number of images, and number of observers, covering the following databases and subsets: CID:IQ, TID, LIVE (Release 2), Toyama, CPIQ, IRCCyN/IVC, VCL@FER, VAIQ, TUD, JPEGXR, HTI, IBBI, MMSP 3D, A57, WIQ, TID2013, TID2008, CSIQ, DRIQ, IVC, Watermarking, 3D image, Art image, TUD1, TUD2, Enrico, Broken Arrows, Fourier Subband, Meerwald.) Thanks to Xinwei Liu for putting together the table.

Our evaluation of metrics. 6 state-of-the-art databases (TID2008, Dugay, IVC, Ajagamelle, Simone, and Pedersen), covering compression, gamut mapping, noise, contrast, color, etc. 22 state-of-the-art metrics selected: SSIM, S-CIELAB, VSNR, SHAME, PSNR, etc. Compare the results from the observers to the quality values from the metrics, with correlation as the performance measure. M Pedersen, JY Hardeberg. Full-Reference Image Quality Metrics: Classification and Evaluation. Foundations and Trends® in Computer Graphics and Vision 7 (1), 1-80

Evaluation results. Results show that performance depends on the images, the type of distortion, and the magnitude of the distortion. Metrics perform better for simple and single distortions, and worse for complex and multiple distortions.

Evaluation CID:IQ database (www.colourlab.no/CID) 60 image quality metrics Results from 50 cm viewing distance Compare the results from the observers to the quality values from the metrics. Correlation as performance measure. Marius Pedersen. EVALUATION OF 60 FULL-REFERENCE IMAGE QUALITY METRICS ON THE CID:IQ. International Conference on Image Processing (ICIP). 5 pages. September 2015. Quebec, Canada.

Linear Pearson correlation, 50 cm. CID has the highest correlation coefficient, but it is not statistically significantly different from many other metrics, such as MAD, WSSI, colorPSNRHA, and VIF.

Non-linear Pearson correlation, 50 cm. WSSI has the highest Pearson correlation coefficient, but it is not statistically significantly different from MSSIM. The highest performing color metric is CID.

One metric for overall quality? Researchers still search for «the holy grail»: one metric to measure overall quality. However, image quality is complex, and one metric might not be suitable to measure all aspects. Solution: Divide overall quality into quality attributes. Image quality attributes = terms of perception Sharpness, contrast, color, etc.

Subset of quality attributes – CPQAs. Pedersen et al. proposed six Color Printing Quality Attributes (CPQAs). Color: contains aspects related to color, such as hue, saturation, and color rendition, except lightness. Lightness: considered so perceptually important that it is beneficial to separate it from the color CPQA. Lightness ranges from "light" to "dark". Contrast: can be described as the perceived magnitude of visually meaningful differences, global and local, in lightness and chromaticity within the image. Sharpness: related to the clarity of details and definition of edges. Artifacts: in color printing some artifacts can be perceived in the resulting image. These artifacts, like noise, contouring, and banding, degrade the quality of an image if they are detectable. Physical: contains all physical parameters that affect quality, such as paper properties and gloss. Even though these attributes are made for printing, they are general enough to be used in other areas, e.g. displays. The selection of metrics must be based on the properties of the attributes; for example, metrics for sharpness should account for details and edges. Pedersen, M.; Bonnier, N.; Hardeberg, J. Y. & Albregtsen, F. Attributes of Image Quality for Color Prints. Journal of Electronic Imaging, 2010, 19, 011016-1-13

Evaluation of printer workflows. Using the quality attributes proposed by Pedersen et al. (2010): color, lightness, sharpness, contrast, artifacts, and physical. Suitable metrics for each of the attributes were found. Four different printers were evaluated. Details can be found in Pedersen, M. Image quality metrics for the evaluation of printing workflows. University of Oslo, 2011. Pedersen, M.; Bonnier, N.; Hardeberg, J. Y. & Albregtsen, F. Attributes of Image Quality for Color Prints. Journal of Electronic Imaging, 2010, 19, 011016-1-13

Framework. Creating a digital version of the printed image, using the framework by Pedersen and Amirshahi: 1) print the images, 2) scan, 3) perform registration, 4) calculate metrics for different attributes, 5) visualize results. Pedersen, M. & Amirshahi, S. A. Framework for the evaluation of color prints using image quality metrics. 5th European Conference on Colour in Graphics, Imaging, and Vision (CGIV), IS&T, 2010, 75-82

Sharpness. Visualization of the results is done with spider plots. Pedersen, M. Image quality metrics for the evaluation of printing workflows. PhD thesis. University of Oslo, 2011

Noise Pedersen, M. Image quality metrics for the evaluation of printing workflows. PhD thesis. University of Oslo, 2011

Color Pedersen, M. Image quality metrics for the evaluation of printing workflows. PhD thesis. University of Oslo, 2011

Evaluation of projection systems Similar to the printing evaluation, but using a camera instead of a scanner. Ping Zhao, and Marius Pedersen. Measuring Perceived Sharpness of Projection Displays with A Calibrated Camera. Submitted. Ping Zhao, Marius Pedersen, Jon Yngve Hardeberg, and Jean-Baptiste Thomas . Measuring the Relative Image Contrast Of Projection Displays. Journal of Imaging Science and Technology (JIST), Volume 59, Issue 3, Page 030404-1-030404-13, Society for Imaging Science and Technology, May, 2015. Ping Zhao, Marius Pedersen, Jon Yngve Hardeberg, and Jean-Baptiste Thomas. Image Registration for Quality Assessment of Projection Displays. Published in Proceedings of 21st International Conference on Image Processing (ICIP 2014), Page 3488-3492, Paris, France, October, 2014.

Thank you for your attention Contact information: Marius Pedersen E-mail: marius.pedersen@hig.no Web: www.colourlab.no Phone: (+47) 61 13 52 46 Mobile: (+47) 93 63 43 85