Entropy Slices for Parallel Entropy Coding
K. Misra, J. Zhao and A. Segall
Entropy Slices Introduction: Entropy Slice
We introduce a partitioning of slices into smaller “entropy” slices.
An entropy slice:
Resets the context models
Restricts the neighborhood definition
Is processed identically to a current slice by the entropy decoder
Key difference: reconstruction may use information from neighboring entropy slices
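The parsing behavior above can be sketched as follows. This is a toy illustration, not the TMuC implementation: `init_contexts` and `decode_block` are hypothetical stand-ins for CABAC context initialization and bin decoding, but the structure shows the key point, that context state is reset at each entropy-slice boundary while reconstruction (not shown) may still look across slices.

```python
def init_contexts():
    # Toy stand-in for CABAC context-model initialization.
    return {"state": 0}

def decode_block(block, contexts):
    # Toy stand-in for bin decoding with context adaptation:
    # the context state evolves only within the current entropy slice.
    contexts["state"] += 1
    return (block, contexts["state"])

def parse_picture(entropy_slices):
    """Parse each entropy slice with its own freshly reset contexts.

    Only the parsing neighborhood is restricted; reconstruction
    (not modeled here) may use neighboring entropy slices.
    """
    parsed = []
    for es in entropy_slices:
        contexts = init_contexts()  # reset at each entropy-slice boundary
        parsed.append([decode_block(b, contexts) for b in es])
    return parsed
```

Note how the second slice starts from the same initial state as the first, which is exactly what makes the slices independently parsable.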
Entropy Slices
We now introduce the major advantages of the entropy slice concept.
Advantage #1 - Parallelization: Entropy slices do not depend on information outside the entropy slice and can be decoded independently. This allows parallelization of the entire entropy decoding loop, including context adaptation and bin coding.
Advantage #2 - Generalization: Entropy slices can be used with all entropy coding engines currently under study in the TMuC and TMuC software: PIPE, UVLC, V2V and CABAC. Moreover, we have software available for PIPE and CABAC.
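Advantage #1 rests on the independence of entropy slices. A minimal sketch of what that buys: independent slices can be handed to a thread pool, and because no state is shared, the parallel result is identical to serial decoding. The `decode_entropy_slice` body here is a toy placeholder, not a real entropy decoder.

```python
from concurrent.futures import ThreadPoolExecutor

def decode_entropy_slice(bitstream_chunk):
    # Toy stand-in: each entropy slice carries everything its
    # entropy decoder needs, so no cross-slice state is shared.
    return [b * 2 for b in bitstream_chunk]

def decode_parallel(entropy_slices, n_workers=4):
    """Decode all entropy slices concurrently.

    Independence of the slices guarantees the output matches
    serial decoding, regardless of scheduling order.
    """
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(decode_entropy_slice, entropy_slices))
```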
Entropy Slices Advantage #3 – No impact on single thread/core
Parallelization capability does not come at the expense of single thread/core applications.
A single thread/core process may:
Decode all entropy slices prior to reconstruction, OR
Decode an entropy slice and then reconstruct without the neighborhood reset
This is friendly to any architecture.
Entropy Slices Advantage #4 – Easy Adaptation to Decoder Design
The bit-stream can be partitioned into a large number of entropy slices with little overhead.
For example, we show performance of 32 entropy slices for 1080p on the next slide; this would translate to ~128 slices for 4K.
The decoder can schedule N entropy decoders easily, where N is arbitrary:
One example: for 32 slices, an architecture with parallelization of 4 (N=4) would assign 8 slices per decoder.
Another example: for 32 slices, an architecture with N=8 would assign 4 slices per decoder.
Additionally, for large resolutions (4K, 8K) it is possible to scale to 100s of decoders for GPU implementations.
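The scheduling examples above are simple integer division; a hypothetical round-robin assignment (not a prescribed scheduling policy) makes the arithmetic concrete:

```python
def assign_slices(num_slices, num_decoders):
    """Round-robin assignment of entropy-slice indices to N decoders.

    With 32 slices and N=4 each decoder gets 8 slices;
    with N=8 each gets 4, matching the examples on the slide.
    """
    buckets = [[] for _ in range(num_decoders)]
    for s in range(num_slices):
        buckets[s % num_decoders].append(s)
    return buckets
```

Because N is arbitrary, the same bit-stream serves a 4-way DSP and a many-core GPU without re-encoding.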
Entropy Slices Advantage #5 – Coding Efficiency
Insertion of entropy slices results in negligible impact on coding efficiency.
For example, if we configure the encoder for a parallelization factor of 32, we get:
Entropy Slices Advantage #6 – Specification
Entropy slices allow simple and direct specification of parallelization at the Profile and Level stage.
This is accomplished by:
Specifying the maximum number of bins in an entropy slice
Specifying the maximum number of entropy slices per picture
This also allows additional specification of PIPE/V2V configurations:
Maximum number of bins per bin coder in an entropy slice
Additional advantage: it is straightforward to determine conformance at the encoder.
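The profile/level constraints described above reduce to a pair of bound checks, which is why encoder-side conformance is straightforward. A sketch with hypothetical limit names (the actual syntax-element names would come from the specification):

```python
def conforms(slice_bin_counts, max_bins_per_slice, max_slices_per_picture):
    """Check one picture's entropy slices against level limits.

    slice_bin_counts: number of coded bins in each entropy slice.
    Both limits are illustrative profile/level parameters.
    """
    if len(slice_bin_counts) > max_slices_per_picture:
        return False  # too many entropy slices in this picture
    return all(bins <= max_bins_per_slice for bins in slice_bin_counts)
```

An encoder can evaluate these checks as it emits each slice, so non-conforming streams are caught at encode time rather than at the decoder.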
Entropy Slices Syntax
Slice header: indicate that the slice is an “entropy slice”
Send only the information necessary for entropy decoding
Conclusions
We have presented the concept of an “entropy slice” for the HEVC system.
Advantages include:
Parallel entropy decoding (both context adaptation and bin coding)
Generalization to any entropy coding system under study
No impact on serial implementations
Easy adaptation to different parallelization factors at the decoder
Negligible impact on coding efficiency (<0.2%)
Direct path for specifying parallelization at the profile/level stage
Software is available.
Entropy Slices
In the last meeting, two topics were discussed:
Size of entropy slice headers
Extension to potential architectures that do not decouple parsing and reconstruction
We address these in the next slides.
Entropy Slices Header Size
Very small (as asserted previously).
Quantitatively: 2 bytes + NALU (1 byte) for 1080p.
Scales with resolution due to first_lctb_in_slice.
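Some illustrative arithmetic for why the header scales with resolution: the first_lctb_in_slice field must address an LCTB index within the picture. This sketch assumes a fixed-length code and 64x64 LCTBs, which is a simplification for illustration, not the actual TMuC coding of the field:

```python
import math

def first_lctb_bits(width, height, lctb_size=64):
    """Bits needed for a fixed-length first_lctb_in_slice address.

    Illustrative only: assumes 64x64 LCTBs and a fixed-length code.
    """
    lctbs_per_row = math.ceil(width / lctb_size)
    lctb_rows = math.ceil(height / lctb_size)
    num_lctbs = lctbs_per_row * lctb_rows  # 510 for 1080p
    return max(1, math.ceil(math.log2(num_lctbs)))
```

Under these assumptions the address grows by only a couple of bits from 1080p (9 bits for 510 LCTBs) to 4K (11 bits for 2040 LCTBs), consistent with the header staying very small as resolution increases.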
Entropy Slices Extension to additional architectures
At the previous meeting there was interest in extending the method to architectures that do not buffer symbols between parsing and reconstruction.
This anticipates “joint-wavefront” processing of both the parsing and reconstruction loops.
We investigated this issue and concluded the following:
In the current TMuC design, we observe that wavefront processing of the parsing stage is not possible.
If we configure the TMuC to support wavefront parsing, the extension of entropy slices is straightforward.
Entropy Slices
Our approach: provide additional entry points without the neighbor restriction.
“Entropy slice” entry points
EC Init: use cabac_init_idc to initialize the entropy coder.
Entropy Slices
What is the worst case maximum buffer size?
Entropy + Reconstruction steps: 16
Entropy Slices Syntax
Signal that the bin coding engine will be reset at the start of each LCU row.
Allow signaling cabac_init_idc for the reset.
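The signaled reset can be pictured as follows. This is a toy model of the control flow only: the engine "state" is a stand-in for the real CABAC initialization tables keyed by cabac_init_idc, and the per-LCU adaptation is faked:

```python
def parse_rows(lcu_rows, cabac_init_idc):
    """Toy sketch: the bin coding engine is re-initialized from
    cabac_init_idc at the start of each LCU row, giving each row
    an independent entry point."""
    def init_engine(idc):
        return {"state": idc}  # stand-in for CABAC init from cabac_init_idc

    out = []
    for row in lcu_rows:
        engine = init_engine(cabac_init_idc)  # reset per LCU row
        decoded = []
        for lcu in row:
            engine["state"] += 1  # stand-in for context adaptation
            decoded.append((lcu, engine["state"]))
        out.append(decoded)
    return out
```

Because every row starts from the same signaled initialization, a decoder can begin parsing any row without having finished the one above it.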
Entropy Slices Performance
4x parallelism:
Maintain initial 32x parallelism
Additionally: four entry points in the entropy slice (aligned with LCU rows), giving a 4x speedup
RD performance: -0.3%
Max parallelism:
Maintain initial 32x parallelization
Additionally: one entry point for every LCU row (17x for 1080p)
RD performance %
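The "17x for 1080p" figure above follows directly from the picture geometry: with one entry point per LCU row, the row-level parallelism bound is the number of LCU rows. Assuming 64x64 LCUs (an assumption consistent with the 17x figure):

```python
import math

def max_row_parallelism(height, lcu_size=64):
    """One entry point per LCU row: the parallelism bound equals
    the number of LCU rows in the picture (assumes 64x64 LCUs)."""
    return math.ceil(height / lcu_size)
```

For 1080p this gives ceil(1080/64) = 17 rows; the same formula yields 34 for 2160p, showing how the bound scales with resolution.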
Entropy Slices Conclusion
Entropy slices are well tested and flexible:
Demonstrated in multiple environments (JM, JMKTA, TMuC)
Demonstrated with CABAC and CAV2V
Friendly to serial and parallel architectures (including both decoupled and coupled parsing/reconstruction architectures)
From the last meeting: “The basic concept of desiring enhanced high-level parallelism of the entropy coding stage to be in the HEVC design is agreed.”
We propose:
Adoption of the entropy slice technology into the TM
Evaluation of the “joint-wavefront” extension in a CE