
1 Pyramid Vector Quantization

2 What is Pyramid Vector Quantization?
A vector quantizer that:
- Has a simple algebraic structure
- Performs gain-shape quantization

3 Motivation

4 Why Vector Quantization?
3 classic advantages (Lookabaugh & Gray 1989):
- Space-filling advantage: VQ codepoints tile space more efficiently
  - Example in 2-D: squares vs. hexagons
  - Maximum possible gain for large dimension: 1.53 dB
- Shape advantage: VQ can use more points where the PDF is higher
  - 1.14 dB gain for a 2-D Gaussian, 2.81 dB for high dimension
- Memory advantage: exploit statistical dependence between vector components
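The 1.53 dB figure is the classic asymptotic space-filling gain: the ratio of the normalized second moment of the scalar quantizer's cubic cell (1/12) to that of an ideal sphere-like cell as dimension grows (1/(2πe)). A one-line Python check, illustrative only:

    import math

    # Asymptotic space-filling gain of an ideal sphere-like cell over a cube:
    # 10*log10((1/12) / (1/(2*pi*e))) = 10*log10(2*pi*e/12).
    gain_db = 10 * math.log10(2 * math.pi * math.e / 12)
    print(f"{gain_db:.2f} dB")  # -> 1.53 dB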

5 Why Vector Quantization?
3 classic advantages (Lookabaugh & Gray 1989):
- Space-filling advantage: VQ codepoints tile space more efficiently
  - Example in 2-D: squares vs. hexagons
  - Maximum possible gain for large dimension: 1.53 dB
- Shape advantage: VQ can use more points where the PDF is higher
  - Can be mitigated with entropy coding
- Memory advantage: exploit statistical dependence between vector components
  - Transform coefficients are not strongly correlated

6 Why Vector Quantization?
Important: the space-filling advantage applies even when values are totally uncorrelated
Another important advantage:
- Can have codebooks with less than 1 bit per dimension
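Fractional rates fall out of the codebook structure directly. A Python sketch using the standard PVQ codebook-size recurrence V(N, K) = V(N−1, K) + V(N, K−1) + V(N−1, K−1) (Fischer 1986); the N and K values below are arbitrary examples:

    from functools import lru_cache
    from math import log2

    @lru_cache(maxsize=None)
    def pvq_codebook_size(n, k):
        """Count integer vectors of dimension n whose absolute values sum to k."""
        if k == 0:
            return 1
        if n == 0:
            return 0
        return (pvq_codebook_size(n - 1, k) + pvq_codebook_size(n, k - 1)
                + pvq_codebook_size(n - 1, k - 1))

    # 64 dimensions with only 2 pulses: well under 1 bit per dimension.
    n, k = 64, 2
    bits = log2(pvq_codebook_size(n, k))
    print(f"V({n},{k}) = {pvq_codebook_size(n, k)}, {bits / n:.2f} bits/dim")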

7 Why Algebraic VQ?
Trained VQ is impractical for high rates and large dimensions
- High dimension → large LUTs, lots of memory
  - Exponential in bitrate
- No codebook structure → slow search
“Algebraic” VQ solves these problems
- Structured codebook: no LUTs, fast search
- The optimal space-filling lattice for arbitrary dimension is unknown: have to approximate
- PVQ is asymptotically optimal for Laplacian sources

8 Why Gain-Shape Quantization?
Separate the “gain” (energy) from the “shape” (spectrum)
- Vector = magnitude × unit vector (a point on the sphere)
Potential advantages:
- Can give each piece a different rate allocation
- Preserves energy (contrast) instead of low-passing
  - A scalar quantizer can only add energy by coding ±1’s
- Implicit activity masking
  - Can derive the quantization resolution from the explicitly coded energy
- Better representation of coefficients

9 How it Works (High-Level)

10 Simple Case: PVQ without a Predictor
- Scalar quantize the gain
- Place K unit pulses in N dimensions (a sketch follows below)
  - Up to N = 1024 dimensions for large blocks
  - Only has N−1 degrees of freedom
- Normalize to unit norm
- K is derived implicitly from the gain
  - Can also code K and derive the gain
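A minimal Python sketch of this encode/decode path, assuming a naive greedy pulse search (function names are invented for illustration; real implementations use a faster projection-based search):

    import numpy as np

    def pvq_quantize(x, k):
        """Return integer y with sum(|y|) == k that best matches the direction of x."""
        s = np.sign(x)
        ax = np.abs(x).astype(float)
        y = np.zeros(len(ax), dtype=int)
        for _ in range(k):  # place k unit pulses one at a time
            best_i, best_corr = 0, -1.0
            for i in range(len(ax)):
                y[i] += 1
                corr = np.dot(ax, y) / np.linalg.norm(y)  # match after normalization
                if corr > best_corr:
                    best_i, best_corr = i, corr
                y[i] -= 1
            y[best_i] += 1
        return (s * y).astype(int)

    def pvq_dequantize(y, gain):
        """Decoder side: normalize the pulse vector and scale by the coded gain."""
        y = np.asarray(y, dtype=float)
        return gain * y / np.linalg.norm(y)

    x = np.array([0.8, -0.5, 0.3])
    gain = np.linalg.norm(x)   # scalar-quantized in a real codec
    y = pvq_quantize(x, k=4)   # K would be derived from the coded gain
    print(y, pvq_dequantize(y, gain))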

11 Codebook for N=3 and different K

12 PVQ vs. Scalar Quantization

13 PVQ with a Predictor
Video provides us with useful predictors
- We want to treat vectors in the direction of the prediction as “special”
  - They are much more likely!
- Subtracting the prediction and coding the residual would lose energy preservation
- Solution: align the codebook axes with the prediction, and treat one dimension differently

14–19 2-D Projection Example (built up across six slides)
- Input + Prediction
- Compute the Householder reflection
- Apply the reflection
- Compute & code the angle θ
- Code the other dimensions
[Figure: 2-D diagram of the input and prediction vectors, with θ the angle between them after reflection]
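A small Python sketch of the reflection step, assuming the prediction is mapped onto the first coordinate axis (a real codec reflects onto the axis of the prediction's largest component to avoid cancellation):

    import numpy as np

    def householder_reflect(x, pred):
        """Reflect x with the Householder plane that maps pred onto +e_0."""
        r = pred / np.linalg.norm(pred)
        e = np.zeros_like(r)
        e[0] = 1.0
        v = r - e                       # normal of the reflection plane
        if np.linalg.norm(v) < 1e-12:   # prediction already lies on the axis
            return x.copy()
        v /= np.linalg.norm(v)
        return x - 2.0 * np.dot(v, x) * v

    x = np.array([0.9, 0.4])
    pred = np.array([1.0, 0.2])
    z = householder_reflect(x, pred)
    # The prediction now lies along e_0, so the angle to code is:
    theta = np.arccos(np.clip(z[0] / np.linalg.norm(z), -1.0, 1.0))
    print(z, np.degrees(theta))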

20 What does this accomplish?
- Creates another “intuitive” parameter, θ: “how much like the predictor are we?”
  - θ = 0 → use the predictor exactly
- θ determines how many pulses go in the “prediction” direction
  - K (and thus the bitrate) for the remaining N−1 dimensions is adjusted down
- The remaining N−1 dimensions have N−2 degrees of freedom (no redundancy)
- Can repeat for more predictors

21 Details...

22 Band Structure
- DC is coded separately with scalar quantization
- AC coefficients are grouped into bands
  - Gain, theta, etc. are signaled separately for each band
  - The layout is ad-hoc for now
- Scan order within each band is optimized for decreasing average variance
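As a toy illustration of per-band coding, here is a Python sketch; the band boundaries below are invented for the example (the slide notes the real layout is ad-hoc):

    import numpy as np

    # Hypothetical partition of an 8x8 block's 63 AC coefficients, indexed in
    # scan order; these boundaries are made up for illustration.
    BANDS_8x8 = [(0, 15), (15, 31), (31, 63)]

    def split_bands(ac_scan, bands=BANDS_8x8):
        """Each band would get its own gain/theta/PVQ coding."""
        return [ac_scan[a:b] for a, b in bands]

    ac = np.arange(63)  # stand-in for scanned AC coefficients
    for i, band in enumerate(split_bands(ac)):
        print(f"band {i}: {len(band)} coefficients")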

23 Band Structure
[Figure: band layouts and scan orders for 4x4, 8x8, and 16x16 blocks]
The scan order is possibly over-fit...

24 To Predict or Not to Predict...
- θ ≥ π/2 → prediction is not helping
  - Could code large θ’s, but it doesn’t seem that useful
  - Need to handle zero predictors anyway
- Current approach: code a “noref” flag
  - Currently jointly code up to 4 flags at once, with a fixed order-0 probability per band (5% of keyframe rate)
- Patches in review cut this down a lot
  - Force noref = 1 when the predictor is zero in keyframes
  - Separate probabilities for each block size
  - Adapt the probabilities
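A tiny Python sketch of the θ ≥ π/2 test; use_predictor is a hypothetical helper, not the codec's actual decision logic (which also has to weigh rate):

    import numpy as np

    def use_predictor(x, pred, eps=1e-12):
        """False (noref) when the predictor is zero or points away (theta >= pi/2)."""
        norm_pred = np.linalg.norm(pred)
        if norm_pred < eps:
            return False  # force noref for a zero predictor
        cos_theta = np.dot(x, pred) / (np.linalg.norm(x) * norm_pred)
        return cos_theta > 0.0  # theta < pi/2

    print(use_predictor(np.array([1.0, 0.5]), np.array([-1.0, 0.0])))  # False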

25 Quantization Matrix
Simple approach (what we’re doing now):
- Separate quantization resolution for each band
- Keep quantization flat within bands
Advanced approach?
- Scaling after normalization is complicated
  - Unit pulses are no longer “unit” (how do they sum to K?)
  - The Householder reflection scrambles things further
- Better(?): pre-scale the vector by the quantization factors
  - Effects on energy preservation?

26–27 Quantization Matrix Example
[Figure: flat quantizer (base Q=35) vs. adjusted per-band (base Q=23)]
Metrics: +15% PSNR, +12% SSIM, −18% PSNR-HVS

28 Activity Masking
Goal: use better resolution in flat areas
- Low contrast → low energy (gain)
- Derivations are in doc/video_pvq.lyx and doc/theoretical_results.lyx
  - Currently wrong/incomplete, will fix
Step 1: compand the gain g
- Goal: Q ∝ g^(2α) (x264 uses α = 0.173)
- Quantize ĝ = (Q_g·ĥ)^β, encode ĥ
  - β = 1/(1 − 2α)
  - Q_g = (Q/β)^β
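A Python sketch of the companding formulas exactly as written on the slide (which the slide itself flags as possibly wrong/incomplete); the Q value is arbitrary:

    ALPHA = 0.173                          # x264's value, per the slide
    BETA = 1.0 / (1.0 - 2.0 * ALPHA)       # beta = 1/(1 - 2*alpha)

    def compand_quantize_gain(g, q):
        """Quantize the gain so the step size grows roughly as g^(2*alpha)."""
        qg = (q / BETA) ** BETA            # Q_g = (Q/beta)^beta
        h = round(g ** (1.0 / BETA) / qg)  # integer h is what gets encoded
        g_hat = (qg * h) ** BETA           # decoder: g_hat = (Q_g * h)^beta
        return h, g_hat

    for g in (1.0, 10.0, 100.0):
        print(g, compand_quantize_gain(g, q=0.5))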

29 Activity Masking, contd.
Step 2: choose the θ resolution
- D = (g − ĝ)² + g·ĝ·(D_θ + sin θ · sin θ̂ · D_pvq)
  - D_θ = 2 − 2·cos(θ − θ̂) = distortion due to θ quantization
  - D_pvq = distortion due to PVQ
- Assume g = ĝ, ignore D_pvq...
- Q_θ = (dĝ/dĥ)/ĝ = β/ĥ
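Continuing the companding sketch above, the slide's Q_θ = β/ĥ makes the angle steps coarser at low coded gains, which is the masking effect (a direct transcription of the formula, under the same caveat that the derivation is flagged as incomplete):

    ALPHA = 0.173
    BETA = 1.0 / (1.0 - 2.0 * ALPHA)

    def theta_step(h_hat):
        """Angle resolution from the slide: Q_theta = (d g_hat/d h)/g_hat = beta/h."""
        return BETA / h_hat

    # Small coded gain index -> coarse theta steps; large -> fine steps.
    for h_hat in (1, 4, 16):
        print(h_hat, round(theta_step(h_hat), 3))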

