Slide 1: Audio/Video Compression
- Lecture 3: Multimedia Networks
- Lecture 4: Audio/Video Compression
- Image & Video Compression Standards
- Speech & Audio Compression Standards
- Wavelet Transform & its Application in Compression

Slide 2: Introduction to Audio/Video Compression
- With today’s technology, only compression makes storage and transmission of digital audio/video streams practical
- Compression exploits redundancy, guided by features of human perception

Slide 3: Introduction to Audio/Video Compression
- Spatial redundancy: values of neighboring pixels are strongly correlated in natural images
- Temporal redundancy: adjacent frames in a video sequence often show very little change; a strong audio signal in a given time segment can mask certain lower-level distortion in future and past segments

Slide 4: Introduction to Audio/Video Compression
- Spectral redundancy: in multispectral images, the spectral values of the same pixel are correlated across spectral bands; an audio signal can completely mask a sufficiently weaker signal in its frequency vicinity
- Redundancy across scale: distinct image features are invariant under scaling
- Redundancy in stereo: correlations between stereo images or audio channels

Slide 5: Introduction to Audio/Video Compression
- Spatial/spectral redundancies: transform coding
- Temporal redundancy: DPCM (differential pulse code modulation), motion estimation/motion compensation
- First compression methods: lossless
  - Huffman coding
  - Ziv-Lempel coding
  - Arithmetic coding
- Lossless methods alone are inadequate for transmission media of low bandwidth (e.g., ISDN) or for devices of low data throughput (e.g., CD-ROM)
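As a concrete illustration of the lossless methods listed above, here is a minimal Huffman coding sketch. The function name `huffman_code` and the sample string are hypothetical and chosen only for illustration; this is one common way to build the code, not a reference implementation.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Return a dict mapping each symbol to its Huffman bit string."""
    freq = Counter(symbols)
    if len(freq) == 1:                      # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: partial_code}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)     # the two least frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

data = "aaaaaaaabbbbccd"                    # 8 x 'a', 4 x 'b', 2 x 'c', 1 x 'd'
code = huffman_code(data)
encoded = "".join(code[s] for s in data)
print(code, len(encoded))
```

For this input the Huffman code spends 25 bits, against 30 bits for a fixed 2-bit code per symbol. The gain from entropy coding alone is limited by the symbol statistics, which is why the later slides combine it with source coding.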

Slide 6: Introduction to Audio/Video Compression
- Lossless vs. lossy compression
- Intraframe vs. interframe compression
- Symmetrical vs. asymmetrical compression
- Real-time: encoding-decoding delay <= 50 ms
- Scalable: frames coded at different resolutions or quality levels
- Recent advanced compression methods reduce bandwidth enormously without reducing perceptual quality

Slide 7: Introduction to Audio/Video Compression
- Entropy coding: arithmetic coding, Huffman coding, run-length coding
- Source coding: DPCM, DCT, DWT, motion estimation/motion compensation
- Hybrid coding: H.261, H.263, H.263+, JPEG, MPEG1, MPEG2, MPEG4, Perceptual Audio Coder
- Hybrid coding = source coding + entropy coding
- [Block diagram: Uncompressed data -> Preprocessing -> Source coding -> Entropy coding -> Compressed data]
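The split into a source coding step followed by an entropy coding step can be sketched as below. This is a minimal example under stated assumptions: the test signal is synthetic, the predictor is the simplest previous-sample DPCM, and the entropy coder is replaced by an empirical entropy estimate (the rate an ideal entropy coder would approach).

```python
import numpy as np

def entropy_bits(values):
    """Empirical first-order entropy in bits per sample."""
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

# Hypothetical test signal: a slowly varying, integer-valued sinusoid.
n = np.arange(1024)
x = np.round(127 + 100 * np.sin(2 * np.pi * n / 256)).astype(int)

# Source coding step (lossless DPCM): predict each sample by its predecessor.
residual = np.diff(x, prepend=0)            # residual[0] equals x[0]

# The decoder inverts DPCM exactly by accumulating the residuals.
reconstructed = np.cumsum(residual)

h_raw = entropy_bits(x)                     # bits/sample if x is entropy coded directly
h_dpcm = entropy_bits(residual)             # bits/sample after the DPCM step
print(h_raw, h_dpcm)
```

Because neighboring samples are strongly correlated, the residuals take only a few distinct values, so the entropy coder that follows has much less to encode.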

Slide 8: Wavelet Theory
- A unified framework for the analysis of non-stationary signals
- Wavelet transform (WT): an alternative to the classical Short-Time Fourier Transform (STFT) or Gabor transform
- In contrast to the STFT, the WT performs “constant-Q” or relative-bandwidth frequency analysis: short windows at high frequencies and long windows at low frequencies

Slide 9: Short-Time Fourier Transform
- Fourier Transform (FT):
- X(f): projection of signal x(t) along exp(j2πft)
- X(f) shows how signal energy is distributed over frequencies
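The projection described on this slide corresponds to the standard Fourier integral. The form below is a sketch that assumes the usual sign convention, with f in Hz:

```latex
X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt
```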

Slide 10: Short-Time Fourier Transform
- To know the local energy distribution, the STFT is introduced:
- g(t): a window of finite support
- STFT(τ, f) shows how signal energy is distributed over frequencies around local time τ
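A common way to write the STFT consistent with the description above is given below. The conjugated window g^*(t - τ) is an assumed convention; for a real window it makes no difference.

```latex
\mathrm{STFT}(\tau, f) = \int_{-\infty}^{\infty} x(t)\, g^{*}(t - \tau)\, e^{-j 2\pi f t}\, dt
```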

Slide 11: Short-Time Fourier Transform
- Given f, STFT(τ, f) is the output of a bandpass filter having the window function (modulated to f) as its impulse response
- Resolution in time/frequency determined by window g(t):
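A discrete version of the STFT can be sketched with NumPy as follows. The sampling rate, the test signal, the Hann window and the hop size are all assumptions made for this illustration; the helper name `stft` is hypothetical.

```python
import numpy as np

def stft(x, window, hop):
    """Discrete STFT: rows index local time (frames), columns index frequency bins."""
    n = len(window)
    frames = [x[s:s + n] * window for s in range(0, len(x) - n + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

fs = 1000                                   # assumed sampling rate in Hz
t = np.arange(2000) / fs
# Non-stationary test signal: 50 Hz during the first second, 200 Hz afterwards.
x = np.where(t < 1.0, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 200 * t))

win_len = 200
S = stft(x, np.hanning(win_len), hop=100)
freqs = np.fft.rfftfreq(win_len, d=1 / fs)  # bin spacing fs / win_len = 5 Hz

peak_first = freqs[np.argmax(np.abs(S[0]))]   # dominant frequency in the first frame
peak_last = freqs[np.argmax(np.abs(S[-1]))]   # dominant frequency in the last frame
print(peak_first, peak_last)
```

Each row of `S` is the spectrum seen through the window at one local time. The window length fixes both the time span of a frame and the spacing of the frequency bins, so the same resolution applies to every frequency.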

Slide 12: Short-Time Fourier Transform
- Uncertainty principle (Heisenberg):
- Once window g(t) is chosen, the resolution in time and frequency is fixed
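The bound referred to here is usually stated as follows. This sketch assumes the second-moment definitions of Δt and Δf for a window g(t) with Fourier transform G(f), and f measured in Hz; Gaussian windows meet the bound with equality.

```latex
\Delta t^{2} = \frac{\int t^{2}\,|g(t)|^{2}\,dt}{\int |g(t)|^{2}\,dt},
\qquad
\Delta f^{2} = \frac{\int f^{2}\,|G(f)|^{2}\,df}{\int |G(f)|^{2}\,df},
\qquad
\Delta t\,\Delta f \;\ge\; \frac{1}{4\pi}
```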

Slide 13: Continuous Wavelet Transform (CWT)
- If Δf/f can be kept constant, resolution in frequency becomes arbitrarily good at low frequencies while resolution in time becomes arbitrarily good at high frequencies
- The CWT follows this idea, but all impulse responses of the filter bank are defined as scaled versions of the same prototype, or basic wavelet, h(t)

Slide 14: Continuous Wavelet Transform (CWT)
- Let h_a(t) be the version of h(t) scaled by a:
- h(t): any bandpass function

Slide 15: Continuous Wavelet Transform (CWT)
- FT of h_a(t):
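A common definition of the scaled wavelet and its Fourier transform is sketched below. The energy-preserving factor 1/sqrt(|a|) is an assumed normalization, and H(f) denotes the Fourier transform of h(t).

```latex
h_a(t) = \frac{1}{\sqrt{|a|}}\, h\!\left(\frac{t}{a}\right),
\qquad
H_a(f) = \sqrt{|a|}\, H(a f)
```

Under this definition both the center frequency and the bandwidth of H_a(f) scale by 1/a, so their ratio stays constant across scales.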

Slide 16: Continuous Wavelet Transform (CWT)
- Resolution in frequency of h_a(t):

Slide 17: Continuous Wavelet Transform (CWT)
- Given a fixed frequency f_0, if scale a is chosen as

Slide 18: Continuous Wavelet Transform (CWT)
- By definition of CWT:
- Scale a is not linked to frequency modulation but to time scaling
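A standard form of the CWT, assuming the 1/sqrt(|a|) normalization and a > 0, is shown below. The second expression follows from the change of variable t -> at and makes the time-scaling interpretation explicit.

```latex
\mathrm{CWT}(\tau, a)
= \frac{1}{\sqrt{a}} \int x(t)\, h^{*}\!\left(\frac{t - \tau}{a}\right) dt
= \sqrt{a} \int x(a t)\, h^{*}\!\left(t - \frac{\tau}{a}\right) dt
```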

Slide 19: Continuous Wavelet Transform (CWT)
- Signal x(at) is seen through a constant-length filter centered at τ/a
- The larger the scale a, the more contracted the signal x(t) becomes
- The smaller the scale a, the more dilated the signal x(t) becomes
- Larger scales: CWT(τ, a) provides a more global view of signal x(t)
- Smaller scales: CWT(τ, a) provides a more detailed view of signal x(t)
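The effect of the scale parameter can be explored numerically with the sketch below. It makes several assumptions not found in the slides: the basic wavelet is the real-valued Mexican hat, the integral is approximated by a Riemann sum, and the normalization is 1/sqrt(a). The helper names `mexican_hat` and `cwt` are hypothetical.

```python
import numpy as np

def mexican_hat(t):
    """Real-valued bandpass prototype h(t) (an assumed choice of basic wavelet)."""
    return (1 - t**2) * np.exp(-t**2 / 2)

def cwt(x, dt, scales):
    """Riemann-sum approximation of CWT(tau, a) = a^(-1/2) * integral x(t) h((t - tau)/a) dt."""
    t = np.arange(len(x)) * dt
    out = np.empty((len(scales), len(x)))
    for i, a in enumerate(scales):
        # Row k of this matrix samples h((t - tau)/a) for tau = t[k].
        h = mexican_hat((t[None, :] - t[:, None]) / a)
        out[i] = (h @ x) * dt / np.sqrt(a)
    return out

dt = 0.004                                   # assumed sampling interval in seconds
t = np.arange(500) * dt                      # 2 s of signal
f0 = 5.0                                     # assumed tone frequency in Hz
x = np.sin(2 * np.pi * f0 * t)

scales = np.geomspace(0.02, 0.2, 25)
W = cwt(x, dt, scales)

# Strongest response per scale, measured away from the signal edges.
response = np.abs(W[:, 200:300]).max(axis=1)
best_scale = scales[np.argmax(response)]
print(best_scale)
```

For this particular wavelet and normalization, the response to a pure tone peaks near a = sqrt(2.5) / (2π f0), so lower tone frequencies are matched by larger scales.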

Slide 20: Continuous Wavelet Transform (CWT)
- Define wavelet h_{a,τ}(t):
- CWT(τ, a): inner product or correlation between x(t) and h_{a,τ}(t)
- CWT(τ, a) is called the analysis stage (of signal x(t)) at scale a

Slide 21: Continuous Wavelet Transform (CWT)
- x(t) can be recovered from the multi-scale analysis if

Slide 22: Continuous Wavelet Transform (CWT)
- Energy conservation:
- Signal energy distributed at scale a by:
- |CWT(τ, a)|^2: wavelet spectrogram, or scalogram, the distribution of signal energy in the time-scale plane (associated with area measure dτ da / a^2)

Slide 23: Continuous Wavelet Transform (CWT)
- Larger scales → more global view → coarser resolutions
- Smaller scales → more detailed view → finer resolutions
- CWT decomposition of a signal over scales → signal energy distribution at various resolutions

Slide 24: Discrete Wavelet Transform (DWT)
- Two methods developed independently in the late 1970s and early 1980s
  - Subband coding
  - Pyramid coding, or multiresolution signal analysis

Slide 25: Multiresolution Pyramid
- Given an original sequence x(n), n ∈ Z, define a lower-resolution signal y(n):
- where g(n) is a halfband lowpass filter

Slide 26: Multiresolution Pyramid
- An approximation a(n) of x(n) from y(n):
- where y’(2n) = y(n), y’(2n+1) = 0, and g’(n) is an interpolation filter

Slide 27: Multiresolution Pyramid
- If g(n) and g’(n) are perfect halfband filters, i.e.,
- then a(n) provides a perfect halfband lowpass approximation to x(n)

Slide 28: Multiresolution Pyramid
- It can be proved:

Slide 29: Multiresolution Pyramid
- Let d(n) = x(n) - a(n)
- Then x(n) = a(n) + d(n)
- But there is redundancy between a(n) and d(n):
  - If x(n) uses sampling rate f_s, then d(n) and y(n) use sampling rates f_s and f_s/2, respectively

Slide 30: Multiresolution Pyramid
- Pyramid decomposition: a redundant representation
- But the redundancy is upper bounded by 1 + 1/2 + 1/4 + … < 2 in a one-dimensional system
- [Figure: pyramid decomposition of x(n) into lowpass signals y(n) and difference signals d(n) over successive levels]
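The pyramid construction of slides 25 to 30 can be sketched as follows. The 3-tap filters are assumptions chosen for simplicity (they are far from perfect halfband filters), and the helper names `pyramid_stage` and `pyramid_reconstruct` are hypothetical. Reconstruction is exact regardless of filter quality because d(n) stores whatever a(n) misses.

```python
import numpy as np

def upsample2(y, length):
    """y'(2n) = y(n), y'(2n+1) = 0."""
    y_up = np.zeros(length)
    y_up[::2] = y
    return y_up

def pyramid_stage(x, g, g_interp):
    """One pyramid stage: returns (y, d) with y at half rate and d at full rate."""
    y = np.convolve(x, g, mode="same")[::2]                       # lowpass, keep even samples
    a = np.convolve(upsample2(y, len(x)), g_interp, mode="same")  # approximation a(n)
    return y, x - a                                               # d(n) = x(n) - a(n)

def pyramid_reconstruct(y, d, g_interp):
    """x(n) = a(n) + d(n), with a(n) rebuilt from y(n)."""
    return np.convolve(upsample2(y, len(d)), g_interp, mode="same") + d

g = np.array([1, 2, 1]) / 4          # assumed lowpass filter g(n)
g_interp = np.array([1, 2, 1]) / 2   # assumed interpolation filter g'(n)

x = np.random.default_rng(0).standard_normal(64)
details, y = [], x
for _ in range(3):                   # three pyramid levels
    y, d = pyramid_stage(y, g, g_interp)
    details.append(d)

total_samples = len(y) + sum(len(d) for d in details)

rec = y
for d in reversed(details):
    rec = pyramid_reconstruct(rec, d, g_interp)
print(total_samples, np.allclose(rec, x))
```

With 64 input samples and three levels the pyramid stores 64 + 32 + 16 + 8 = 120 samples, which stays below the bound of 2 x 64.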

Slide 31: Multiresolution Pyramid
- For perfect halfband lowpass filters g(n) and g’(n), d(n) clearly contains only the frequencies of x(n) above π/2, and thus can also be subsampled by two without loss of information
- In a pyramid, it is possible to use very good lowpass filters and derive visually pleasing coarse versions
- In a subband scheme, critical sampling is achieved at the price of a constrained filter design and a relatively poor lowpass version as a coarse approximation, which is undesirable if the coarse version is used for viewing in a compatible subchannel

Slide 32: Subband Coding
- One stage of a pyramid decomposition → a half-rate low-resolution signal + a full-rate difference signal
- Number of samples increased by 50%
- If filters g(n) and g’(n) meet certain conditions, oversampling can be avoided
- Subband coding, first popularized in speech compression, does not produce such redundancy

Slide 33: Subband Coding
- A full-band one-dimensional signal is decomposed into two subbands using an analysis filter bank
- Ideally, the analysis filter bank consists of a lowpass filter and a highpass filter with nonoverlapping frequency responses and unit gain over their respective bandwidths
- After filtering, the lowpass and highpass signals each have only half of the original bandwidth or “frequency content”, and thus can be downsampled by two
- But ideal filters are unrealizable

Slide 34: Subband Coding
- By using overlapping responses, frequency gaps in the subband signals can be prevented
- Aliasing is introduced when the lowpass and highpass signals are downsampled by two
- The aliasing effect can be eliminated to produce perfect reconstruction at the synthesis stage
- The lowpass and highpass signals each have a bandwidth of more than half the original bandwidth
- Quadrature Mirror Filters (QMF) are used for analysis/synthesis filtering

Slide 35: Subband Coding
- Output signals from the analysis bank after downsampling: y_1(n) = (h_1 * x)(2n), y_2(n) = (h_2 * x)(2n)
- After quantization, y_1(n) and y_2(n) are replaced by their quantized versions
- After upsampling, these become:

Slide 36: Subband Coding
- Output signals from the synthesis bank:
- Reconstructed signal:

Slide 37: Subband Coding
- Ignoring quantization and coding effects,
- If H_1(z), G_1(z) are ideal lowpass filters and H_2(z), G_2(z) are ideal highpass filters,

Slide 38: Subband Coding
- Then

Slide 39: Subband Coding
- Implying
- Indicating that the X(-z) term is the aliasing component when the filters are not ideal, which is desired to be zero
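For a two-channel bank with analysis filters H_1, H_2 and synthesis filters G_1, G_2, the standard z-domain relations, ignoring quantization, can be sketched as below. The hat notation for the reconstructed signal is an assumption; the first line gives each subband after downsampling and upsampling by two.

```latex
Y_i(z^{2}) = \tfrac{1}{2}\left[ H_i(z)\,X(z) + H_i(-z)\,X(-z) \right], \quad i = 1, 2
\\[4pt]
\hat{X}(z) = \tfrac{1}{2}\left[ G_1(z)H_1(z) + G_2(z)H_2(z) \right] X(z)
           + \tfrac{1}{2}\left[ G_1(z)H_1(-z) + G_2(z)H_2(-z) \right] X(-z)
```

The second bracket multiplies X(-z) and is the aliasing term that the filter design has to cancel.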

Slide 40: Subband Coding
- To have perfect reconstruction in the non-ideal filtering case, the iff conditions are:
- If H_2(z) = H_1(-z), G_1(z) = 2H_1(z), G_2(z) = -2H_1(-z), the aliased term becomes zero and the reconstructed signal is given by:
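The filter relations on this slide can be checked numerically with the sketch below. The 2-tap lowpass H_1(z) = (1 + z^-1)/2 is an assumed example filter, and the helper names `analyze` and `synthesize` are hypothetical. For this choice the filter bank output is the input delayed by one sample.

```python
import numpy as np

h1 = np.array([0.5, 0.5])             # assumed lowpass H1(z) = (1 + z^-1) / 2
sign = (-1.0) ** np.arange(len(h1))   # multiplying tap n by (-1)^n maps H(z) to H(-z)
h2 = sign * h1                        # H2(z) = H1(-z)
g1 = 2 * h1                           # G1(z) = 2 H1(z)
g2 = -2 * sign * h1                   # G2(z) = -2 H1(-z)

def analyze(x):
    """Filter with h1 and h2, then keep every second sample: y_i(n) = (h_i * x)(2n)."""
    return np.convolve(x, h1)[::2], np.convolve(x, h2)[::2]

def synthesize(y1, y2):
    """Upsample by two (zero insertion), filter with g1 and g2, and sum."""
    u1 = np.zeros(2 * len(y1))
    u1[::2] = y1
    u2 = np.zeros(2 * len(y2))
    u2[::2] = y2
    return np.convolve(u1, g1) + np.convolve(u2, g2)

x = np.random.default_rng(1).standard_normal(32)
y1, y2 = analyze(x)
x_hat = synthesize(y1, y2)

# The output equals the input delayed by one sample.
print(np.allclose(x_hat[1:len(x) + 1], x))
```

The two subbands together hold about as many samples as the input (plus one sample per band from the filter tail), in contrast to the 50% overhead of one pyramid stage.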

Slide 41: Subband Coding
- For perfect reconstruction, we need or
- Using a symmetric linear-phase FIR filter of length N for H_1 results in

Slide 42: Subband Coding
- For even N,
- [Figure: QMF filter responses, with axis marks at 0, π/2, π and 1]

Slide 43: Subband Coding
- If subband filters H_i(z), G_i(z) satisfy three conditions, perfect reconstruction results, too
- Aliased term

Slide 44: Multiresolution Wavelet Representation and Approximation
- Embedded linear spaces in L^2(R):
- Let A_j be an orthogonal projection on V_j:
- Let O_j be the orthogonal complement of V_j in V_{j+1}:

Slide 45: Multiresolution Wavelet Representation and Approximation
- Let D_j be an orthogonal projection on O_j:
- Then an original signal A_0 f can be decomposed as:
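From the definitions on slides 44 to 46, the decomposition can be sketched as below. The direct-sum notation and the summation index are assumptions chosen to agree with the A_{-J} f and D_{-j} f terms used on the following slide.

```latex
V_{j+1} = V_j \oplus O_j
\;\;\Longrightarrow\;\;
A_{j+1} f = A_j f + D_j f
\\[4pt]
A_0 f = A_{-J} f + \sum_{j=1}^{J} D_{-j} f
```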

Slide 46: Multiresolution Wavelet Representation and Approximation
- A_{-J} f = the orthogonal projection of A_0 f on V_{-J}
- D_{-j} f = the orthogonal projection of A_0 f on O_{-j}
- D_{-j} f and D_{-k} f: orthogonal to each other, or uncorrelated with each other
- D_{-j} f: orthogonal to A_{-J} f, or uncorrelated with A_{-J} f
- A_{-J} f: a coarse version of A_0 f
- D_{-J} f, …, D_{-1} f: details of A_0 f arranged from coarser to finer

Slide 47: Multiresolution Wavelet Representation and Approximation
- Let be an orthonormal basis of V_j:
- A_j f can be characterized by the coefficients of orthonormal expansion:
- The sequence denoted by and called a discrete approximation of f in V_j

Slide 48: Multiresolution Wavelet Representation and Approximation
- Let be an orthonormal basis of O_j
- D_j f characterized by the coefficients
- The sequence denoted by and called a discrete approximation of f in O_j

Slide 49: Multiresolution Wavelet Representation and Approximation
- Thus, A_0 f can be characterized by
- can be further characterized by
- This set of discrete signals is called the orthogonal “wavelet” representation
- It is organized as a coarse version plus increasingly fine details
- The orthogonal representation is a decorrelated representation
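An orthogonal wavelet representation of a finite sequence can be sketched with the Haar basis, as below. Haar is an assumed choice made for brevity (the slides do not name a specific wavelet), and the function names `haar_dwt` and `haar_idwt` are hypothetical.

```python
import numpy as np

def haar_dwt(x, levels):
    """Return (coarse, [detail_coarsest, ..., detail_finest]) in an orthonormal Haar basis."""
    approx, details = np.asarray(x, dtype=float), []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))   # detail coefficients at this level
        approx = (even + odd) / np.sqrt(2)          # approximation at the next coarser level
    return approx, details[::-1]

def haar_idwt(coarse, details):
    """Rebuild the signal by adding details from coarsest to finest."""
    approx = coarse
    for d in details:
        out = np.empty(2 * len(approx))
        out[0::2] = (approx + d) / np.sqrt(2)
        out[1::2] = (approx - d) / np.sqrt(2)
        approx = out
    return approx

x = np.random.default_rng(2).standard_normal(16)
coarse, details = haar_dwt(x, levels=3)

n_coeffs = len(coarse) + sum(len(d) for d in details)
energy = np.sum(coarse**2) + sum(np.sum(d**2) for d in details)
x_rec = haar_idwt(coarse, details)
print(n_coeffs, np.isclose(energy, np.sum(x**2)), np.allclose(x_rec, x))
```

The representation uses exactly as many coefficients as the input has samples, and because the basis is orthonormal the signal energy is preserved across the coarse and detail coefficients.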

Slide 50: Multiresolution Wavelet Representation and Approximation
- If we require:
- A_j f is band-limited such that it can be sampled at a rate of 2^j, i.e., 2^j samples per unit of time or length

Slide 51: Multiresolution Wavelet Representation and Approximation
- Translation invariant with A_0:
- Translation invariant with produced by

Slide 52: Multiresolution Wavelet Representation and Approximation
- Then ‘s can be constructed by a scaling function
- Furthermore, let then

Slide 53: Multiresolution Wavelet Representation and Approximation
- filtered by and downsampled by two
- Let

Slide 54: Multiresolution Wavelet Representation and Approximation
- Let then ‘s can be constructed by
- Let then

Slide 55: Multiresolution Wavelet Representation and Approximation
- filtered by and downsampled by two
- From
- H, G: Quadrature Mirror Filters

Slide 56: Multiresolution Wavelet Representation and Approximation

Slide 57: Multiresolution Wavelet Representation and Approximation

Slide 58: Multiresolution Wavelet Representation and Approximation
- Think of:
- Then, the analysis stage for subband or wavelet decomposition is the same
- Higher-resolution signal → two low-resolution signals through filtering by and downsampling by two

Slide 59: Multiresolution Wavelet Representation and Approximation
- The synthesis stage for subband or wavelet decomposition is different
- For subband: low-resolution signals are upsampled by two, followed by filtering by, followed by summation to reconstruct the higher-resolution signal
- For wavelet: low-resolution signals are filtered by the same, and downsampled by two, followed by summation to reconstruct the higher-resolution signal

Slide 60: Multiresolution Wavelet Representation and Approximation
- After filtering at the analysis stage, the two resulting signals have only half the resolution of the original signal
- Downsampling by two is therefore justifiable
- Before filtering at the synthesis stage, upsampling the two low-resolution signals by two in subband decomposition does not seem well justified

Slide 61: Multiresolution Wavelet Representation and Approximation

Slide 62: Multiresolution Wavelet Representation and Approximation

Slide 63: Multiresolution Wavelet Representation and Approximation

