Perception-motivated High Dynamic Range Video Encoding

Presentation transcript:

Perception-motivated High Dynamic Range Video Encoding. Rafal Mantiuk, Grzegorz Krawczyk, Karol Myszkowski, Hans-Peter Seidel, MPI Informatik.

High Dynamic Range The human eye can see a range of luminance from pitch dark to sunlight, spanning about 12 orders of magnitude. This is far more than a standard monitor or projector can display, which is why we call such devices "low dynamic range". By high dynamic range we usually mean technology that can cover almost the complete range of luminance the human eye can see.

High vs Low Dynamic Range Video LDR video is intended for existing displays and stores relative pixel brightness; HDR video is intended for the human eye and stores photometric or radiometric units [cd/m², W/(m²·sr)]. How is HDR video different from ordinary LDR video, for instance MPEG-4? The major difference comes from different design goals: MPEG-4 video was meant to be displayed on existing display devices and is therefore limited by their precision. HDR video, on the other hand, tries to capture the physical world with an accuracy limited only by the capabilities of the human eye. HDR video therefore stores absolute photometric or radiometric values, which have physical meaning, whereas MPEG-4 encodes pixel values that represent only relative brightness.

High Dynamic Range Video Goal: efficient encoding of the full dynamic range of luminance perceived by the human observer. (1st demo)

Overview HDR Pipeline; HDR Video Encoding: luminance quantization, edge coding; Results: vs. MPEG-4, vs. OpenEXR; Demo & Applications. We first give an overview of existing HDR technology in the context of the HDR pipeline, then present some details of our HDR video compression, then show the results of benchmarking our compression against MPEG-4 and OpenEXR, and finally demonstrate our compression and show a few possible applications.

Related Work: HDR Pipeline (Acquisition → Storage → Display). We review related work in the context of the complete HDR pipeline, from acquisition to display.

Related Work: Acquisition. Sources of HDR video: global illumination rendering and HDR cameras, e.g. HDRC (IMS Chips), Lars III (Silicon Vision), Autobrite (SMaL Camera Technologies), LM9628 (National), Digital Pixel System (Pixim); see [Nayar2003] for a technology overview. Video is acquired or generated in the first stage of the HDR pipeline. The natural source of HDR sequences is global illumination rendering, where accurate physical lighting information is computed. To acquire video of a real scene we can use an HDR camera; several models are already available on the market.

Related Work: Storage. Still images: Radiance RGBE [Ward91], OpenEXR [Bogart2003], LogLuv TIFF [Ward98], HDR JPEG [Ward2004]. Video: no HDR video format exists. The naïve way, storing three floating-point values per pixel, produces too much data. To cope with this, several formats for still images have been proposed; the most recent, an HDR extension of the ordinary JPEG format, was presented at the APGV conference. So far, however, no HDR video compression has been proposed, and this is the subject of our paper. Why not take a still-image format and compress separate frames? Because true video compression is much more efficient.
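
To make the cost of the naïve representation concrete, here is a minimal Python sketch of the shared-exponent idea behind Radiance RGBE [Ward91]: three 8-bit mantissas plus one shared exponent byte, 32 bits per pixel instead of 96. The function names are ours, and rounding in the real Radiance code differs slightly.

```python
import math

def float_to_rgbe(r, g, b):
    """Pack linear RGB floats into 4 bytes: three mantissas sharing
    one exponent, as in the Radiance RGBE format [Ward91]."""
    v = max(r, g, b)
    if v < 1e-32:                      # too dark to represent: all zeros
        return (0, 0, 0, 0)
    m, e = math.frexp(v)               # v = m * 2**e, with 0.5 <= m < 1
    scale = m * 256.0 / v              # maps the largest channel below 256
    return (int(r * scale), int(g * scale), int(b * scale), e + 128)

def rgbe_to_float(r8, g8, b8, e8):
    """Unpack 4 RGBE bytes back to linear floats."""
    if e8 == 0:
        return (0.0, 0.0, 0.0)
    f = math.ldexp(1.0, e8 - 128 - 8)  # 2**(e - 128) / 256
    return (r8 * f, g8 * f, b8 * f)
```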

Related Work: Display. On LDR displays tone mapping is necessary; HDR displays are starting to appear (University of British Columbia [Seetzen2004]). If we want to show HDR images on a low dynamic range display such as an LCD or CRT, we need to apply tone mapping.

HDR Encoding Framework, detail level 1: input and output. (Diagram: LDR or HDR frames enter the video encoder, which produces a bitstream; white blocks are MPEG, orange blocks are HDR extensions.) This is an overview of the complete encoding framework and, in particular, of how it differs from MPEG-4. The first basic difference is the input: MPEG-4 takes 8-bit RGB low dynamic range data, while HDR video takes floating-point values in the absolute XYZ color space.

HDR Encoding Framework, detail level 2: color transform. (Diagram: the first encoder block is the color transform; MPEG uses YCrCb, the HDR encoder uses L_p u'v'; white blocks are MPEG, orange blocks are HDR extensions.) Looking at the lower level, the first block is the color transform, whose result is a different color space more suitable for encoding. The major difference is L_p u'v' instead of YCrCb: L_p u'v' can represent HDR data and is effective for encoding.

HDR Encoding Framework, detail level 3: edge coding. (Diagram: further down in the details appear a few more MPEG processing blocks: motion compensation, DCT coding, and variable-length coding. Our second major extension adds a parallel pipeline of edge coding and run-length coding for sharp edges. White blocks are MPEG, orange blocks are HDR extensions.)

Encoding of Color. (Pipeline diagram as in the framework slides, HDR XYZ / LDR RGB input through color transform, motion compensation, DCT, edge coding, and run-length and variable-length coding to the bitstream, with the color-transform block highlighted.) We now discuss our approach to the color transform.

Encoding of Color. How should color data be represented? Floating points compress ineffectively; integers work but require quantization. How should color data be quantized? Quantization errors must stay below the threshold of perception: use a perceptually uniform color space (L*u*v*, L*a*b*) [Ward98] and find the minimum number of bits. For the chromatic channels (u'v'), 8 bits are enough. The naïve approach, floating points, compresses poorly, so we use integers instead; a continuous variable such as luminance must be quantized before it can be represented as an integer. The best quantization is one whose errors are not visible, i.e. below the threshold of human perception. To ensure this we can use a uniform color space, such as Luv or Lab, and quantize uniformly with the proper number of bits; 8 bits turn out to be sufficient for the color information. The L component of those spaces was designed for LDR devices, however, and cannot be used here.
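
As an illustration, a small Python sketch of such a chromatic encoding: the u'v' formulas are the standard CIE 1976 ones, while the 8-bit scale factor of 410 is the constant used by LogLuv TIFF [Ward98]; whether this encoder uses exactly the same constant is an assumption here.

```python
def xyz_to_upvp(X, Y, Z):
    """CIE 1976 u'v' chromaticity: a perceptually near-uniform
    chromatic plane, independent of absolute luminance."""
    d = X + 15.0 * Y + 3.0 * Z
    if d <= 0.0:
        return (0.0, 0.0)
    return (4.0 * X / d, 9.0 * Y / d)

def quantize_chroma(up, vp, scale=410.0):
    """8-bit quantization of (u', v'). Scale 410 follows LogLuv TIFF
    [Ward98] (an assumption for this encoder): u', v' stay below
    about 0.62, so 410 * 0.62 < 256 and one byte per channel suffices."""
    q = lambda c: max(0, min(255, int(c * scale + 0.5)))
    return (q(up), q(vp))
```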

Encoding of Luminance. How should luminance be quantized: gamma correction? logarithm? (Figure: candidate mappings from the integer representation to log luminance Y.) While color can be represented using existing color spaces, we cannot do the same with luminance: those color spaces were not meant for HDR data. How do we find the proper mapping for L? For LDR data gamma correction works well, but human brightness perception cannot be approximated by a single power function over the full range of luminance. A logarithm does not work either, because the eye's response is not logarithmic.

Threshold Versus Intensity. Psychophysical measurements give the smallest perceivable difference ΔY for a certain adaptation level Y_A: the tvi function [Ferwerda96, CIE 12/2.1]. (Figure: log threshold ΔY versus log adaptation luminance Y_A.) Instead of approximating human perception with a single algebraic function, we would like to derive an optimal quantization from actual psychophysical measurements. One characteristic of human visual perception is the threshold-versus-intensity function, drawn on the right, which tells us the smallest luminance difference that is visible at a certain adaptation level.
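
For illustration, here is a Python sketch of a tvi function built from the piecewise fits of Ferwerda et al. [Ferwerda96]; the slide also cites CIE 12/2.1, whose curve differs somewhat, so treat these constants as one published approximation rather than the exact function used in the encoder.

```python
import math

def tvi(Y_a):
    """Smallest visible luminance difference (cd/m^2) at adaptation
    luminance Y_a, using the piecewise fits of [Ferwerda96]."""
    x = math.log10(max(Y_a, 1e-10))
    # cone (photopic) threshold
    if x <= -2.6:
        cones = -0.72
    elif x >= 1.9:
        cones = x - 1.255
    else:
        cones = (0.249 * x + 0.65) ** 2.7 - 0.72
    # rod (scotopic) threshold
    if x <= -3.94:
        rods = -2.86
    elif x >= -1.44:
        rods = x - 0.395
    else:
        rods = (0.405 * x + 1.6) ** 2.18 - 2.86
    # whichever mechanism is more sensitive sets the visible threshold
    return 10.0 ** min(cones, rods)
```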

Luminance Quantization. We keep the quantization just below the threshold of perception: the maximum quantization error is e_max = tvi(Y) / f, where f >= 1 is a safety factor. (Figure: luminance Y, on a log axis, mapped to the integers L_p.) The assumption is that the quantization error e_max stays below the threshold of human perception, tvi(Y)/f.

Luminance Quantization. From this assumption we can derive a differential equation for the mapping l(y) of luminance to L_p space, with the threshold decreased to tvi(y)/f: dy/dl = (2/f) * tvi(y(l)). A numerical solution of this equation gives the optimal compander function that can be used for quantization; a detailed derivation of this formula is given in the paper. 10 to 11 bits are enough. Related work: the capacity function [Ashikhmin02] and the Grayscale Standard Display Function [DICOM03].
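
A minimal sketch of that numerical solution, assuming forward Euler integration with one quantization step per integer code and reusing the tvi() sketch above; the luminance range and the safety factor f are assumed values, not the paper's.

```python
import math

def build_compander(y_min=1e-4, y_max=1e8, f=2.0):
    """Integrate dy/dl = 2*tvi(y)/f to tabulate the luminance y(l)
    assigned to each integer code l (uses tvi() from the sketch above)."""
    y, codes = y_min, [y_min]
    while y < y_max:
        y += 2.0 * tvi(y) / f   # largest step whose error stays invisible
        codes.append(y)
    return codes                # invert by binary search when encoding

codes = build_compander()
print(len(codes), "codes ->", math.ceil(math.log2(len(codes))), "bits")
```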

Luminance Quantizations: Comparison. (Figure: log contrast threshold versus log adapting luminance for the cvi curve, our 11-bit perceptual quantization, 32-bit LogLuv, and RGBE.) The x-axis spans the full dynamic range of luminance; the y-axis denotes the quantization error. The green line is the threshold of human perception: every distortion below that line should not be visible. RGBE is the common-exponent encoding used in the Radiance format, LogLuv is the logarithmic encoding used in the TIFF format, and the orange line shows our perceptual quantization. None of the three encodings causes visible distortions, as all are well below the green line. The difference is that the perceptual mapping is also aligned to the threshold of human perception, whereas the other encodings allocate too many bits to low luminance levels.

Edge Coding. (Pipeline diagram as before, with the edge-coding and run-length blocks highlighted.) Our second major modification is an additional pipeline for encoding sharp contrast edges.

Edge Coding: Motivation. HDR video can contain sharp contrast edges (light sources, shadows), and DCT coding of sharp contrast may cause high-frequency artifacts. (Images: DCT coding versus edge coding.) Why is this extension needed? HDR video can contain sharp edges of much larger contrast than LDR video, and for such edges the quantization of the DCT coefficients may produce high-frequency artifacts like those in the middle image. The proposed edge coding practically eliminates these artifacts, at the cost of a slightly higher bit rate.

Edge Coding: Solution. Encode sharp edges in the spatial domain and the rest in the frequency domain. The general idea is shown on this slide. The Discrete Cosine Transform is in general not effective for encoding a single sharp edge, which produces high values in all frequency coefficients. To avoid this, we split the original signal into two: one that contains only the sharp edges and one that contains the smooth values. The smooth signal has no high frequencies and can be encoded efficiently with the DCT; the edge signal contains only sparse values, so it can be run-length encoded.
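
A toy Python sketch of this split for a single row; the simple thresholding rule and the names are our illustration, not necessarily the paper's exact decomposition.

```python
import numpy as np
from scipy.fftpack import dct

def split_edges(row, threshold):
    """Split a 1-D signal into a smooth part and a sparse edge part:
    any neighbouring difference above `threshold` is moved into
    `edges` and removed from the running signal."""
    smooth = row.astype(float)
    edges = np.zeros(len(row) - 1)   # one value less: differences only
    shift = 0.0                      # sum of the jumps removed so far
    for i in range(1, len(row)):
        d = row[i] - row[i - 1]
        if abs(d) > threshold:
            edges[i - 1] = d         # sharp edge: stored separately
            shift += d
        smooth[i] = row[i] - shift   # remainder varies smoothly
    return smooth, edges

row = np.array([10, 11, 12, 200, 201, 202, 203, 204])
smooth, edges = split_edges(row, threshold=50)
coeffs = dct(smooth, norm='ortho')   # smooth part: compact DCT spectrum
# edges = [0, 0, 188, 0, ...] is sparse and run-length encodes cheaply
```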

Edge Coding: Algorithm. (Steps: I horizontal decomposition, II horizontal DCT, III vertical decomposition, IV vertical DCT; horizontal and vertical edges go into the edge block.) The decomposition shown on the previous slide is easy in the 1D case, but not so trivial for 2D data such as video frames. The extension to 2D turns out to be quite neat, however, if we combine the signal decomposition with the 2D Discrete Cosine Transform. We perform the 2D decomposition in four steps. First, for each 8x8-pixel block we remove the sharp edges from the rows and place them in the edge block. Note that the edge signal needs one column less than the original block, because it contains only contrast differences.

Edge Coding: Algorithm (continued). Observe how the signal of a single row changes after the sharp edges are removed.

Edge Coding: Algorithm (continued). Second, we perform a 1D Discrete Cosine Transform on the smoothed rows. An important observation: only the DC coefficients, i.e. the first column, can still contain sharp edges, because the horizontal signal was smoothed in the previous step and its high-frequency coefficients are usually low.

Edge Coding: Algorithm (continued). This is why, in the third step, we decompose only the first column and place that column's edge information in the blank space of the edge block.

Edge Coding: Algorithm (continued). Finally, we perform the vertical Discrete Cosine Transform to complete the hybrid coding; a code sketch of all four steps follows below.
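
Putting steps I to IV together, a hedged sketch of the hybrid coding of one 8x8 block, reusing split_edges() from the earlier sketch; how the edge data is packed into the bitstream is left open here.

```python
import numpy as np
from scipy.fftpack import dct

def hybrid_encode_block(block, threshold):
    """Steps I-IV from the slides for one 8x8 block; reuses
    split_edges() from the earlier sketch."""
    n = block.shape[0]
    rows = np.empty_like(block, dtype=float)
    row_edges = np.empty((n, n - 1))
    for i in range(n):                        # I: smooth each row
        rows[i], row_edges[i] = split_edges(block[i], threshold)
    rows = dct(rows, axis=1, norm='ortho')    # II: horizontal DCT
    # III: only the DC column may still carry sharp vertical edges
    dc_smooth, col_edges = split_edges(rows[:, 0], threshold)
    rows[:, 0] = dc_smooth
    coeffs = dct(rows, axis=0, norm='ortho')  # IV: vertical DCT
    return coeffs, row_edges, col_edges       # edges go to run-length coding
```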

Results. Our bit-stream is about 2x the size of tone-mapped MPEG-4 video, and 20-30x smaller than intra-frame compression (OpenEXR). We compared our HDR video encoding with low dynamic range MPEG-4 compression and with an intra-frame HDR format, OpenEXR. Since MPEG-4 is an LDR compression we had to compare tone-mapped video: tone mapping was applied before compression for MPEG-4 and after decompression for HDR video. We used the Universal Image Quality Index (Wang and Bovik) to make sure the quality of both tone-mapped sequences was the same. As the chart of bit-stream sizes shows, the MPEG-4 stream was approximately half the size of the HDR sequence; note, however, that the MPEG-4 sequence obviously contains only partial information compared to the HDR sequence. Against frame-by-frame compression we obtained savings of 20 to 30 times. This comparison is not entirely fair, since OpenEXR is a pure intra-frame compression with no mechanism for motion compensation, but it still indicates the order of magnitude of the possible savings.

Demo & Applications. Display-dependent rendering, choice of tone mapping, extended post-processing. We demonstrate the capabilities of our encoding with an HDR video player.

Conclusions. HDR video compression with modest changes to MPEG-4: the L_p u'v' color space, luminance quantization (10-11 bits), and edge coding. Applications: on-the-fly tone mapping; blooming, motion blur, night vision; video tuned for the display, LDR or HDR. In this talk we presented our HDR video compression. We showed that only moderate modifications to MPEG-4 are required to encode HDR data, derived a new color space for efficient representation of HDR data, and improved the compression of high-contrast edges. In the demo we showed several potential applications of HDR video, such as on-the-fly tone mapping adapted to the display, and real-time post-processing: physically accurate blooming, motion blur, and night vision. We also proposed video that is tuned for a particular monitor or TV set and its viewing conditions.

Acknowledgments. HDR images and sequences: Paul Debevec, SpheronVR, Jozef Zajac, Christian Fuchs, Patrick Reuter. HDR camera: HDRC(R) VGAx courtesy of IMS CHIPS, www.hdrc.com. Comments and help: Volker Blanz, Scott Daly, Michael Goesele, Jeffrey Schoner.

Thank you http://www.mpi-sb.mpg.de/resources/hdrvideo/