Optimal Slice Size for Streaming Regions of High-Resolution Video with Virtual Pan/Tilt/Zoom Functionality. Aditya Mavlankar, Pierpaolo Baccichet, David Varodayan and Bernd Girod. Information Systems Laboratory, Stanford University.
Outline: High-resolution video streaming with IROI; proposed coding scheme for IROI video streaming; analysis of optimal slice size selection; experimental results.
High-Resolution Video Streaming with IROI
Related work: interactive image browsing with JPEG-2000 [Taubman et al. 2003]; interactive streaming of lightfields [Ramanathan et al. 2004]; interactive streaming of panoramic videos [Heymann et al. 2005]; ...
Sources of high-resolution videos: high-resolution digital imaging sensors (CMOS technology); high-resolution videos stitched from multiple cameras.
Application scenarios: surveillance; instructional videos; snow cams in ski resorts; interactive TV with virtual pan/tilt/zoom.
Demo
H.264/AVC Based Coding Scheme
[Figure: block diagram. Labels: overview video; ROI resolution layer 1 through ROI resolution layer N, each with an upsampling (↑) and subtraction (-) stage; P slices; hierarchical B pictures.]
Need enough random access: should be able to stream portions of video from any resolution and any region. Temporal prediction only on the base layer.
Tradeoff due to Slice Size
Small slice size:
Entire scene takes more bits to encode: slice headers; lack of context continuation across slices for context-adaptive coding; cannot exploit inter-pixel correlation across slices.
Less pixel overhead: can adapt to the ROI due to the fine granularity of the slice grid.
[Figure: ROI on the slice grid, with the pixel overhead marked.]
Tradeoff Observed for Pedestrian Area, layer 2. [Plot: number of pixels transmitted per rendered pixel and bit per pixel for coding the given layer, versus slice size in pixels (160x160, 128x128, 64x64, 32x32); Pedestrian Area, zf 2.]
Tradeoff Observed for Pedestrian Area, layer 2. [Plot: bits transmitted per rendered pixel versus slice size in pixels (160x160, 128x128, 64x64, 32x32); Pedestrian Area, zf 2.]
Tradeoff Observed for Pedestrian Area, layer 3. [Plot: number of pixels transmitted per rendered pixel and bit per pixel for coding the given layer, versus slice size in pixels (160x160, 128x128, 64x64, 32x32); Pedestrian Area, zf 3.]
Tradeoff Observed for Pedestrian Area, layer 3. [Plot: bits transmitted per rendered pixel versus slice size in pixels (160x160, 128x128, 64x64, 32x32); Pedestrian Area, zf 3.]
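Pixel Overhead Analysis in 1-D
To simplify the analysis, consider the 1-D case. Imagine an infinitely long line of pixels, divided into segments labeled by a segment index. The number of pixels transmitted is a random variable.
[Figure: example line of pixels divided into segments by segment index, with the SOI marked.]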
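The 1-D setup can be sketched numerically. The following is a minimal illustration, not the authors' derivation: it assumes the SOI start position is uniformly distributed over the pixel positions within one segment, and the names `segments_touched`, `expected_pixels_1d`, `soi_len` and `seg_len` are hypothetical. Every segment that the SOI touches is transmitted in full, so the number of pixels transmitted depends on where the SOI falls relative to the segment boundaries.

```python
from fractions import Fraction


def segments_touched(start, soi_len, seg_len):
    """Number of segments overlapped by an SOI covering pixels
    start .. start + soi_len - 1 on a line cut into segments of seg_len."""
    first = start // seg_len
    last = (start + soi_len - 1) // seg_len
    return last - first + 1


def expected_pixels_1d(soi_len, seg_len):
    """Expected number of pixels transmitted, averaging over every SOI
    start offset within one segment (ASSUMPTION: offsets equally likely)."""
    total = sum(
        segments_touched(x, soi_len, seg_len) * seg_len
        for x in range(seg_len)
    )
    return Fraction(total, seg_len)
```

Under this uniform-offset assumption the average works out to `soi_len + seg_len - 1`, so the expected overhead grows with the segment length, which is the effect the slide sets out to quantify.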
Pixel Overhead Analysis in 2-D. [Figure: ROI on the slice grid.] Expected number of pixels transmitted.
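A brute-force sketch of the 2-D case follows. It is an illustration under stated assumptions rather than the formula from the slide: the ROI's horizontal and vertical offsets relative to the slice grid are assumed to be independent and uniformly distributed, and all function and parameter names are hypothetical. Every slice that the ROI touches is counted as transmitted in full.

```python
from fractions import Fraction


def slices_touched(start, roi_len, slice_len):
    """Slices overlapped along one axis by an ROI covering
    start .. start + roi_len - 1."""
    return (start + roi_len - 1) // slice_len - start // slice_len + 1


def expected_pixels_2d(roi_w, roi_h, slice_w, slice_h):
    """Expected number of pixels transmitted, averaged over all ROI offsets
    within one slice (ASSUMPTION: independent, uniform offsets)."""
    total = 0
    for x in range(slice_w):
        for y in range(slice_h):
            cols = slices_touched(x, roi_w, slice_w)
            rows = slices_touched(y, roi_h, slice_h)
            total += (cols * slice_w) * (rows * slice_h)
    return Fraction(total, slice_w * slice_h)


def pixels_per_rendered_pixel(roi_w, roi_h, slice_w, slice_h):
    """Expected pixels transmitted divided by the ROI area."""
    return expected_pixels_2d(roi_w, roi_h, slice_w, slice_h) / (roi_w * roi_h)
```

Because the two offsets are assumed independent, the average factors into the product of the two 1-D averages, `(roi_w + slice_w - 1) * (roi_h + slice_h - 1)`.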
Optimization Criterion and Constraints
Goal is to minimize the bits transmitted per second. The criterion has two factors: bit per pixel for coding the given layer, and number of pixels transmitted per rendered pixel.
Practical constraints narrow down the search: slice dimensions have to be multiples of the macroblock width; many values can be ruled out since they are likely to be suboptimal; constraints due to display dimensions, e.g., restrictions on translation of the ROI.
For instance, for a given resolution layer, if o_h,I is equal to the height of the display area, then the ROI cannot translate vertically; it can only move horizontally. Then you don't need slices in the vertical direction.
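The search described on this slide can be sketched as a small exhaustive comparison. This is a hedged illustration, not the authors' procedure: the cost is taken to be the product of the coding cost (bit per pixel for the layer) and the expected number of pixels transmitted per rendered pixel, the pixel-overhead term uses a closed form that holds only under an assumed uniform, independent ROI offset, and the macroblock width is taken as 16 pixels as in H.264/AVC. The function names and the input dictionary `bpp_by_slice` are hypothetical; the caller has to supply measured coding costs.

```python
def pixels_per_rendered_pixel(roi_w, roi_h, slice_w, slice_h):
    """Expected pixels transmitted per rendered pixel.
    ASSUMPTION: ROI offset uniform and independent in both directions."""
    expected = (roi_w + slice_w - 1) * (roi_h + slice_h - 1)
    return expected / (roi_w * roi_h)


def best_slice_size(bpp_by_slice, roi_w, roi_h, mb_width=16):
    """Return ((slice_w, slice_h), cost) with the lowest bits transmitted
    per rendered pixel, or None if no candidate is admissible.

    bpp_by_slice maps (slice_w, slice_h) to the bit per pixel needed to
    code the layer with that slice size (supplied by the caller).
    Candidates that are not multiples of mb_width are skipped.
    """
    best = None
    for (slice_w, slice_h), bpp in bpp_by_slice.items():
        if slice_w % mb_width or slice_h % mb_width:
            continue  # slice dimensions must be multiples of the macroblock width
        cost = bpp * pixels_per_rendered_pixel(roi_w, roi_h, slice_w, slice_h)
        if best is None or cost < best[1]:
            best = ((slice_w, slice_h), cost)
    return best


# Placeholder inputs for illustration only; these are NOT measurements
# from the talk, and the ROI dimensions are hypothetical.
example_bpp = {(32, 32): 0.5, (64, 64): 0.4, (160, 160): 0.3, (40, 40): 0.1}
choice = best_slice_size(example_bpp, roi_w=480, roi_h=240)
```

Small slices pay more in coding cost and large slices pay more in pixel overhead, so the product can have an interior minimum; the candidate `(40, 40)` is ignored because it is not a multiple of the macroblock width.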
Model vs. Experimental Results (Pedestrian Area, layer 2). [Plot: number of pixels transmitted per rendered pixel and bit per pixel for coding the given layer, model and experiments, versus slice size in pixels (160x160, 128x128, 64x64, 32x32).]
Model vs. Experimental Results (Pedestrian Area, layer 2). [Plot: bits transmitted per rendered pixel, model and experiments, versus slice size in pixels (160x160, 128x128, 64x64, 32x32).]
Model vs. Experimental Results (Pedestrian Area, layer 3). [Plot: number of pixels transmitted per rendered pixel and bit per pixel for coding the given layer, model and experiments, versus slice size in pixels (160x160, 128x128, 64x64, 32x32).]
Model vs. Experimental Results (Pedestrian Area, layer 3). [Plot: bits transmitted per rendered pixel, model and experiments, versus slice size in pixels (160x160, 128x128, 64x64, 32x32).]
Summary
Coding scheme provides random access to: arbitrary resolutions; arbitrary spatial regions within every resolution.
Slice size is optimized given: the video signal; the QP; the ROI display area dimensions.
Other coding parameters could be further optimized, for example, joint selection of the QP for the base layer and the enhancement layers.
The End
Backup Slides Follow Hereafter
Parts of the Client’s Display
[Figure labels: overview display area; ROI display area.]
Just to define the terminology: we call this the overview display and we call this the ROI display. The location of the ROI is shown by overlaying a rectangle on the overview video. You might have noticed that the size and color of the rectangle vary according to the zoom factor.
Region-of-Interest Trajectory
[Figure: ROI shown within each of the available resolutions, up to the highest resolution.]
The original video is available in multiple resolutions. We define the ROI trajectory as the path over which the ROI moves. Let's say these are the various resolutions, or zoom factors, possible. The ROI is allowed to move about within any of these resolutions, as in this animation.
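Pixel Overhead Analysis in 1-D
To simplify the analysis, consider the 1-D case. Imagine an infinitely long line of pixels, divided into segments labeled by a segment index.
[Figure: example line of pixels divided into segments by segment index, with the SOI marked.]
Pixel Overhead Theorem: [statement with symbols missing: a given condition, a quantity that increases monotonically with one variable, and independence from another.]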
Pixel Overhead Analysis in 2-D. [Figure: ROI on the slice grid.] Expected value of pixel overhead in 2-D; expected number of pixels to be transmitted.