Quality Evaluation and Comparison of SVC Encoders 371-10-21 Esti Hagag, Maayan David Dr. Ofer Hadar; Dan Grois
Introduction
Video coding: H.264 AVC & SVC
SVC (Scalable Video Coding) is an extension of the H.264 standard. It offers scalable video streams that can be displayed on devices with different display resolutions, computational capabilities and bandwidth limitations.
Introduction Cont.
Spatial Scalability: the base layer contains a reduced-size version of the original video sequence. Each additional layer enhances the resolution, resulting in a larger image with higher resolution. This is intended for devices with different display capabilities, e.g. small-screen cell phones.
Temporal Scalability: the base layer is a low frame-rate version of the original video, and each additional layer adds more frames, resulting in a higher frame rate.
Introduction Cont.
Quality Scalability: all layers have the same spatial resolution. Each layer adds to the quality of the video image, which is determined by the available bit-rate or the computational capabilities of the end device.
Project Goal
Evaluate and compare encoders that exist in the industry in order to determine the most suitable SVC-based encoder in terms of quality metrics and computational complexity. Different quality metrics are tested to see which one is the most accurate for quality evaluation.
Quality Metrics
Many types of metrics are available for evaluating video quality. The result graphs show only a few of the metrics we used for analysis and evaluation. The metrics are:
APSNR - Average Peak Signal-to-Noise Ratio
OPSNR - Overall Peak Signal-to-Noise Ratio
MSE - Mean Squared Error
MSAD - Mean Sum of Absolute Differences
NQI - New Quality Index
VQM - Video Quality Measurement Techniques
SSIM - Structural Similarity
Delta - Average difference between two sequences
Quality Metrics Cont.
APSNR - Average Peak Signal-to-Noise Ratio
The PSNR value is calculated for each frame, and the per-frame values are then averaged.
Advantages: if all video frames have similar complexity, APSNR is quite accurate.
Disadvantages: if a few frames score a much higher PSNR, they can skew the average PSNR over all frames.
Quality Metrics Cont.
Example
Codec A:
Frame 1: every pixel has a squared error of 100, so the PSNR for this frame is 20*log10(255/10) = 28.131 dB
Frame 2: every pixel has a squared error of 100, so the PSNR for this frame is 20*log10(255/10) = 28.131 dB
Frame 3: every pixel has a squared error of 100, so the PSNR for this frame is 20*log10(255/10) = 28.131 dB
Frame 4: every pixel has a squared error of 100, so the PSNR for this frame is 20*log10(255/10) = 28.131 dB
Frame 5: every pixel is perfect with 0 error, so the PSNR for this frame is infinite; we cap it at a maximum of 100 dB
Average PSNR = (28.131 + 28.131 + 28.131 + 28.131 + 100)/5 = 42.50 dB
Codec B:
Frame 1: every pixel has a squared error of 80, so the PSNR for this frame is 20*log10(255/sqrt(80)) = 29.100 dB
Frame 2: every pixel has a squared error of 80, so the PSNR for this frame is 20*log10(255/sqrt(80)) = 29.100 dB
Frame 3: every pixel has a squared error of 80, so the PSNR for this frame is 20*log10(255/sqrt(80)) = 29.100 dB
Frame 4: every pixel has a squared error of 80, so the PSNR for this frame is 20*log10(255/sqrt(80)) = 29.100 dB
Frame 5: every pixel has a squared error of 80, so the PSNR for this frame is 20*log10(255/sqrt(80)) = 29.100 dB
Average PSNR = 29.100 dB
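The APSNR arithmetic in this example can be checked with a short Python sketch. The function names `frame_psnr` and `apsnr` are illustrative and do not come from any tool used in the project; the 100 dB cap for error-free frames follows the example.

```python
import math

def frame_psnr(mse, peak=255.0, cap=100.0):
    """PSNR in dB of one frame from its MSE; error-free frames get the cap."""
    if mse == 0:
        return cap
    return min(cap, 20 * math.log10(peak / math.sqrt(mse)))

def apsnr(frame_mses):
    """Average PSNR: convert each frame to dB first, then average the dB values."""
    return sum(frame_psnr(m) for m in frame_mses) / len(frame_mses)

# Per-frame squared errors taken from the example.
codec_a = [100, 100, 100, 100, 0]
codec_b = [80, 80, 80, 80, 80]

print(f"Codec A APSNR: {apsnr(codec_a):.2f} dB")
print(f"Codec B APSNR: {apsnr(codec_b):.2f} dB")
```

Codec A scores far higher than Codec B even though both have the same average squared error, because the single capped frame dominates the average.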
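Quality Metrics Cont.
OPSNR - Overall PSNR
Here we average the MSE over all frames and then calculate the PSNR from the average MSE.
Codec A: mean squared error = (100 * 4 + 0 * 1)/5 = 80
Overall/global PSNR = 20*log10(255/sqrt(80)) = 29.100 dB
Codec B: mean squared error = 80
Overall/global PSNR = 20*log10(255/sqrt(80)) = 29.100 dB
Unlike APSNR, OPSNR rates the two codecs equally, because the single perfect frame of Codec A no longer dominates the result.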
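A minimal Python sketch of the OPSNR calculation, using the per-frame squared errors of Codec A and Codec B from the example. The function name `opsnr` and the `cap` parameter for an all-zero error are assumptions made for illustration.

```python
import math

def opsnr(frame_mses, peak=255.0, cap=100.0):
    """Overall PSNR: average the per-frame MSE first, then convert to dB once."""
    mean_mse = sum(frame_mses) / len(frame_mses)
    if mean_mse == 0:
        return cap  # assumed cap for a sequence with no error at all
    return 20 * math.log10(peak / math.sqrt(mean_mse))

codec_a = [100, 100, 100, 100, 0]
codec_b = [80, 80, 80, 80, 80]

print(f"Codec A OPSNR: {opsnr(codec_a):.3f} dB")
print(f"Codec B OPSNR: {opsnr(codec_b):.3f} dB")
```

Both sequences have a mean MSE of 80, so the two calls return the same value.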
Quality Metrics Cont.
MSE - Mean Squared Error
Advantages: simple, and gives the strength of the error between two values; a rough estimate of the difference between two compared images.
Disadvantages: the results do not reflect quality in terms of the HVS (human visual system).
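As a rough illustration of why MSE is only a coarse estimate, the sketch below computes it for two hypothetical distortions of the same tiny frame. The pixel values and the function name `mse` are made up for this example; frames are represented as flat lists of luma values.

```python
def mse(reference, test):
    """Mean squared error between two equally sized sequences of pixel values."""
    if len(reference) != len(test):
        raise ValueError("frames must have the same number of pixels")
    return sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)

# Hypothetical 2x2 luma frame, flattened row by row.
reference = [100, 120, 140, 160]
small_errors = [102, 122, 142, 162]   # every pixel is off by 2
one_big_error = [100, 120, 140, 164]  # a single pixel is off by 4

print(mse(reference, small_errors))
print(mse(reference, one_big_error))
```

Both distortions yield the same MSE although the error is distributed very differently, which is one reason the value alone says little about perceived quality.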
Quality Metrics Cont.
MSAD
The value of this metric is the mean absolute difference of the color components at corresponding points of the two images. This metric is used for testing codecs and filters.
Quality Metrics Cont.
Delta
The value of this metric is the mean difference of the color components at corresponding points of the two images. Unlike MSAD, the differences are signed, so the value can be negative. This metric is used for testing codecs and filters.
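The difference between MSAD and Delta can be sketched in a few lines of Python. The function names and sample pixel values are hypothetical, and the sign convention (reference minus test) is an assumption; a given tool may subtract in the opposite order.

```python
def msad(reference, test):
    """Mean of the absolute per-pixel differences."""
    return sum(abs(r - t) for r, t in zip(reference, test)) / len(reference)

def delta(reference, test):
    """Mean of the signed per-pixel differences (assumed: reference - test)."""
    return sum(r - t for r, t in zip(reference, test)) / len(reference)

# Hypothetical flattened frames.
reference = [100, 120, 140, 160]
mixed = [104, 116, 144, 156]     # errors of +4 and -4 alternate
brighter = [104, 124, 144, 164]  # every pixel is 4 levels brighter

print(msad(reference, mixed), delta(reference, mixed))
print(msad(reference, brighter), delta(reference, brighter))
```

In the first case the signed errors cancel and Delta is zero while MSAD is not; in the second case Delta reveals a uniform brightness shift that MSAD alone cannot distinguish from the first.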
Quality Metrics Cont.
VQM - Video Quality Measurement Techniques
VQM uses the DCT to model human visual perception.
VQM vs. MSE
Quality Metrics Cont.
SSIM - Structural Similarity Index
The SSIM index is based on measuring three components: luminance similarity, contrast similarity and structural similarity.
Quality Metrics Cont.
NQI - New Quality Index
An objective quality index composed of three factors:
Loss of correlation
Luminance distortion
Contrast distortion
The range is [-1, 1]; the highest value, 1, is achieved if and only if the source and tested images are the same.
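The three factors can be sketched in Python under the assumption that NQI follows the universal image quality index of Wang and Bovik, which has the same three factors and the same [-1, 1] range. This sketch computes the statistics over a whole frame at once, whereas practical tools typically evaluate local windows and average the results; the function names and pixel values are illustrative only.

```python
import math

def quality_index_factors(x, y):
    """Return (correlation, luminance, contrast) factors for two frames.

    Assumes both frames have non-zero mean and non-zero variance.
    """
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    correlation = cov / (sd_x * sd_y)                  # loss of correlation
    luminance = 2 * mu_x * mu_y / (mu_x ** 2 + mu_y ** 2)  # luminance distortion
    contrast = 2 * sd_x * sd_y / (var_x + var_y)       # contrast distortion
    return correlation, luminance, contrast

def quality_index(x, y):
    """Product of the three factors."""
    correlation, luminance, contrast = quality_index_factors(x, y)
    return correlation * luminance * contrast

source = [100, 120, 140, 160]
reversed_ramp = [160, 140, 120, 100]  # same mean and contrast, inverted structure

print(quality_index(source, source))
print(quality_index(source, reversed_ramp))
```

Identical frames give a value of 1 (up to rounding), while the inverted ramp keeps the luminance and contrast factors at 1 but drives the correlation factor, and hence the index, to about -1.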
Software Encoders
VSoft - Vanguard Software Solutions
VSoft's AVC Professional SDK offers the ability to integrate VSoft's video compression technology into a wide array of applications.
Supports Windows, Mac OS X and Linux development.
Ideal for:
High-quality broadcasting
Video conferencing
Surveillance
Real-time applications
Software Encoders Cont.
Features
Fourth-generation VSoft H.264 codec implementation, unparalleled in compression ratio, video quality and code efficiency.
SD and full HD encoding in real time on off-the-shelf PC systems.
Fully standard-compliant H.264 codec with support for the Baseline, Main and High Profiles of the H.264 standard, as well as proprietary enhancements.
Scalable Video Coding (SVC) implementation for multiple-resolution profiles.
Software Encoders Cont.
JSVM - Joint Scalable Video Model (open source)
The JSVM software is the reference software for the Scalable Video Coding (SVC) project of the Joint Video Team (JVT) of the ISO/IEC Moving Picture Experts Group (MPEG) and the ITU-T Video Coding Experts Group (VCEG).
Since the SVC project is still under development, the JSVM software is also under development and changes frequently.
The JSVM software is written in C++ and is provided as source code.
Software Encoders Cont.
Elecard Video Quality Estimator
A software tool for testing and comparing video sequences.
Calculates video quality metrics (PSNR, NQI, VQM, SSIM, DELTA, MSE, MSAD).
Configuration & Settings
Several sequences were taken in different resolutions:
QCIF: 176x144
CIF: 352x288
4CIF: 704x576
HD: 1280x720
Sequences with fast and slow moving objects
Input video format: YUV 4:2:0
Spatial scalability was chosen for SVC encoding:
Layer #0: QCIF
Layer #1: CIF
Layer #2: 4CIF
Constant bitrate (BR) mode
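To give a sense of the raw input sizes involved, the sketch below computes the size of one uncompressed YUV 4:2:0 frame at each listed resolution. It assumes 8 bits per sample and planar storage, neither of which is stated in the slides; the names `RESOLUTIONS` and `yuv420_frame_bytes` are illustrative.

```python
# Resolutions as listed in the configuration.
RESOLUTIONS = {
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "HD": (1280, 720),
}

def yuv420_frame_bytes(width, height, bytes_per_sample=1):
    """Size of one YUV 4:2:0 frame: a full-size luma plane plus two
    chroma planes subsampled by 2 horizontally and vertically."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return (luma + chroma) * bytes_per_sample

for name, (w, h) in RESOLUTIONS.items():
    print(f"{name:5s} {w}x{h}: {yuv420_frame_bytes(w, h)} bytes per frame")
```

Each spatial layer from QCIF to CIF to 4CIF doubles both dimensions, so every enhancement layer carries four times as many samples per frame as the layer below it.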
Evaluation & Results
"Station" is a video that zooms out of a train station. Therefore there are many motion vectors, and new objects appear from the borders of the frame.
We encoded the sequence in SVC spatial scalability mode with 3 resolution layers at the following bitrates:
Layer #0: 230 Kbps
Layer #1: 680 Kbps
Layer #2: 2048 Kbps
JSVM was better in almost all of the tests and outputs greater image quality than the VSoft encoder. Unlike JSVM, VSoft is built to be a real-time encoder, so it has tighter time constraints that can affect quality.
Evaluation & Results Cont.
The next sequence was encoded in AVC mode at HD resolution (1280x720) and is characterized by a small number of motion vectors.
Several bitrates were tested:
500 Kbps
1750 Kbps
3500 Kbps
The results show that both encoders are very much alike. The differences in quality were very small, and it is not possible to determine which encoder is better in terms of quality.
Evaluation & Results Cont.
The next tests were done on a sequence with many motion vectors and few new objects appearing during the sequence.
The sequences are in 3 different resolutions, each encoded at 3 bitrates:
QCIF 176x144: 80 Kbps, 264 Kbps, 448 Kbps
CIF 352x288: 160 Kbps, 524 Kbps, 896 Kbps
4CIF 704x576: 320 Kbps, 1184 Kbps, 2048 Kbps
Conclusions
Stream with slow motion
Quality Tools   VSOFT        JSVM          Better
APSNR           40.8371      40.961933     JSVM
OPSNR           40.629233    40.799433
NQI             0.7560333    0.7621
VQM             0.7568333    0.7375
SSIM            0.9607667    0.9619
Delta           -0.8447      -1.2401667    VSOFT
MSE             20.588333    20.038333
MSAD            5.3703333    5.3409
Conclusions Cont.
Stream with fast motion
Quality Tools   VSOFT        JSVM          Better
APSNR           37.34407     38.66901      JSVM
OPSNR           37.16446     38.3708
NQI             0.5502       0.623767
VQM             0.774467     0.686211
SSIM            0.931344     0.946833
Delta           -8.20157     -9.52664      VSOFT
MSE             201.6625     148.1047
MSAD            66.15974     55.51599
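The PSNR gap between the two encoders in the slow-motion and fast-motion results can be made explicit with a few lines of Python. The values are copied from the two result tables; the assignment of the first value to VSOFT and the second to JSVM follows the order of the column headers and should be treated as an assumption, as should the names used in the code.

```python
# (VSOFT, JSVM) PSNR values in dB, copied from the result tables.
SLOW = {"APSNR": (40.8371, 40.961933), "OPSNR": (40.629233, 40.799433)}
FAST = {"APSNR": (37.34407, 38.66901), "OPSNR": (37.16446, 38.3708)}

def jsvm_gain_db(table):
    """JSVM value minus VSOFT value for each metric, in dB."""
    return {metric: jsvm - vsoft for metric, (vsoft, jsvm) in table.items()}

for label, table in (("slow motion", SLOW), ("fast motion", FAST)):
    for metric, gain in jsvm_gain_db(table).items():
        print(f"{label:12s} {metric}: JSVM - VSOFT = {gain:+.3f} dB")
```

Under these assumptions the gap is roughly a tenth of a dB for the slow-motion stream and more than 1 dB for the fast-motion stream.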
Conclusions Cont.
Both encoders deliver similar quality results.
JSVM is better in sequences with a high rate of change and many motion vectors.
JSVM can deliver almost the same quality as VSoft at lower bitrates.
VSoft is more efficient in terms of memory consumption and run time, and can deliver good quality in a short time.
Future Work
Perform evaluation tests on hardware-based encoders
Acquire more encoders to evaluate