Alex Chen Nader Shehad Aamir Virani Erik Welsh


1 W.A.V.S. Compression
Alex Chen, Nader Shehad, Aamir Virani, Erik Welsh

2 Overview
Approach
Psychoacoustic Modeling
Filter Banks
Quantization
Demonstration
Results
Further Research

3 Approach
Encoding: Input -> Filter Banks -> Quantization -> Encoded Signal (with Psychoacoustic Model)
Decoding: Encoded Signal -> Inverse Quantization -> Reconstruction Filter Banks -> Output

4 Psychoacoustic Model
Based on studies that show hearing capabilities are affected by:
  Environment
  Limitations of the human auditory system
Used to eliminate portions of the signal the average human won’t hear
Two key properties:
  Absolute threshold of hearing
  Auditory masking

5 Absolute Threshold of Hearing
Experiment: Plot the audible threshold of a tone
Observations:
  Auditory system is more sensitive to some frequencies than others
  Frequencies within a “critical bandwidth” are treated similarly
  Basis for the Bark scale
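The threshold curve and the Bark scale mentioned on this slide can be sketched numerically. The block below is a minimal illustration, not the presenters' model: it assumes a widely quoted closed-form approximation of the threshold in quiet (commonly attributed to Terhardt) and Zwicker's arctangent formula for converting Hz to Bark. The function names are hypothetical.

```python
import math

def threshold_in_quiet_db(f_hz):
    """Approximate absolute threshold of hearing (dB SPL) at f_hz.

    Assumed formula: the common Terhardt-style approximation,
    not necessarily the one used in the presentation.
    """
    k = f_hz / 1000.0  # frequency in kHz
    return (3.64 * k ** -0.8
            - 6.5 * math.exp(-0.6 * (k - 3.3) ** 2)
            + 1e-3 * k ** 4)

def hz_to_bark(f_hz):
    """Map frequency in Hz to the Bark scale (Zwicker's approximation)."""
    return (13.0 * math.atan(0.00076 * f_hz)
            + 3.5 * math.atan((f_hz / 7500.0) ** 2))

# Print the threshold and Bark value at a few frequencies.
for f in (100, 1000, 3300, 10000, 15000):
    print(f"{f:>6} Hz  {threshold_in_quiet_db(f):7.2f} dB SPL  {hz_to_bark(f):5.2f} Bark")
```

Under this approximation the curve dips lowest in the region of a few kHz, which matches the slide's observation that the ear is more sensitive to some frequencies than others.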

6 Auditory Masking
Tones and noise drown out less powerful sounds
  Affect neighboring frequencies
  Affect critical bandwidth
Effects add to produce an overall masking threshold
Masks quantization noise
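How individual masking effects might add into one overall threshold can be sketched as follows. This is an illustration under stated assumptions, not the presenters' model: the triangular spreading shape, its slopes (25 dB/Bark below the masker, 10 dB/Bark above), the 10 dB offset, and the choice to sum contributions in the power domain are all assumed for the example, and every name is hypothetical.

```python
import math

def spread_db(masker_bark, masker_db, z):
    """Masking threshold (dB) that one masker produces at Bark position z.

    Assumed shape: triangular in the Bark domain, 10 dB below the masker
    level at its own position, falling 25 dB/Bark toward lower frequencies
    and 10 dB/Bark toward higher frequencies.
    """
    dz = z - masker_bark
    drop = 25.0 * -dz if dz < 0 else 10.0 * dz
    return masker_db - 10.0 - drop

def global_threshold_db(maskers, quiet_db, z):
    """Combine the threshold in quiet and all maskers by adding powers."""
    power = 10.0 ** (quiet_db / 10.0)
    for masker_bark, masker_db in maskers:
        power += 10.0 ** (spread_db(masker_bark, masker_db, z) / 10.0)
    return 10.0 * math.log10(power)

# One 70 dB masker at 8.5 Bark; evaluate the threshold one Bark above it.
maskers = [(8.5, 70.0)]
threshold = global_threshold_db(maskers, quiet_db=3.0, z=9.5)
print(f"threshold at 9.5 Bark: {threshold:.2f} dB")
print("40 dB component audible:", 40.0 > threshold)
```

Any component, including quantization noise, that stays below the combined threshold at its frequency is treated as inaudible in this sketch.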

7 Filter Banks Theory
Array of bandpass filters
Break up signal into frequency subbands
Allows for variable coding scheme

8 Analysis and Synthesis Banks
1) Analysis filters divide up the signal
2) Down-sample
3) Quantize
4) Up-sample
5) Synthesis filters remove distortions
6) Reconstruct the signal
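The six steps above can be sketched with the simplest perfect-reconstruction filter bank, a two-channel Haar pair. This is a swapped-in stand-in chosen for brevity, not the 32-channel cosine-modulated bank the presentation describes; the function names and the quantization step size are hypothetical. Filtering and down-sampling are folded into one pass over sample pairs, so this toy version has no delay.

```python
import math

S = math.sqrt(0.5)  # Haar filter coefficient

def analysis(x):
    """Steps 1-2: split an even-length signal into low and high subbands,
    each already down-sampled by 2."""
    low = [S * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [S * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high

def quantize_subband(band, step):
    """Step 3: uniform quantization with the given step size (assumed)."""
    return [step * round(v / step) for v in band]

def synthesis(low, high):
    """Steps 4-6: up-sample, apply synthesis filters, and interleave."""
    out = []
    for l, h in zip(low, high):
        out.append(S * (l + h))
        out.append(S * (l - h))
    return out

signal = [0.0, 0.3, 0.5, 0.4, 0.1, -0.2, -0.4, -0.3]  # hypothetical input

# Without quantization the bank reconstructs the input exactly
# (up to floating-point rounding).
low, high = analysis(signal)
exact = synthesis(low, high)

# Coarser quantization of the high band, finer for the low band:
# a variable coding scheme across subbands.
lossy = synthesis(quantize_subband(low, 0.05), quantize_subband(high, 0.2))

print("max error, no quantization:", max(abs(a - b) for a, b in zip(signal, exact)))
print("max error, quantized:      ", max(abs(a - b) for a, b in zip(signal, lossy)))
```

Because the unquantized path returns the input, any error in the second result comes from the quantizer alone.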

9 Filter Bank Design Phase
Tradeoff between fine and coarse frequency resolution
  Piccolo vs. Castanets
Non-stationary signals
We used a non-adaptive approach

10 Filter Bank Implementation
We used Cosine Modulated PR (perfect reconstruction) filter banks with 32 filters each
Output is a delayed version of the input (linear phase)
Distortion arises from quantization only

11 Quantization
Two types:
  Narrow-band
    Based on current input
    Overhead cost
  Full-range
    Independent of current input
    No overhead
Figure labels: Sampled Input, Quantized Version, Reconstructed Input

12 Quantization
Narrow Band
  More accurate
  Lower compression ratio
Full-Range
  Less accurate
  Higher compression ratio
Using 3-bit Quantization
  Narrow Band: Total Error: 0.16
  Full-Range: Total Error: 0.34
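The contrast between the two quantizers can be sketched in a few lines. This is one reading of the slides, not the presenters' code: "full-range" is taken to mean a fixed range of [-1, 1] shared by every block, and "narrow-band" a range fitted to the minimum and maximum of the current block, which must then be stored alongside the codes (the overhead cost). The sample values and all names are hypothetical.

```python
def quantize(samples, lo, hi, bits=3):
    """Map each sample to the nearest of 2**bits uniform levels in [lo, hi]."""
    levels = 2 ** bits
    if hi == lo:  # constant block: every sample maps to the single level
        return [0] * len(samples)
    step = (hi - lo) / (levels - 1)
    return [min(levels - 1, max(0, round((s - lo) / step))) for s in samples]

def dequantize(codes, lo, hi, bits=3):
    """Reconstruct sample values from integer codes."""
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + c * step for c in codes]

def total_error(original, reconstructed):
    """Sum of absolute differences between input and reconstruction."""
    return sum(abs(a - b) for a, b in zip(original, reconstructed))

block = [0.10, 0.14, 0.22, 0.18, 0.12]  # hypothetical quiet passage

# Full-range: fixed [-1, 1], nothing extra to transmit.
full = dequantize(quantize(block, -1.0, 1.0), -1.0, 1.0)

# Narrow-band: range fitted to this block; lo and hi are side information.
lo, hi = min(block), max(block)
narrow = dequantize(quantize(block, lo, hi), lo, hi)

print("full-range total error: ", round(total_error(block, full), 4))
print("narrow-band total error:", round(total_error(block, narrow), 4))
```

With the same 3 bits per sample, the fitted range spends all eight levels on values the block actually contains, which is why its error is smaller at the price of sending the range with each block.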

13 Demonstration
Samples: Sine wave, Chime, Percussion, Modern
Versions: Full range, Narrow range, 8-bit

14 Sine Wave (time)
Panels: Full-Range Quantization, Narrow Quantization

15 Sine Wave (freq)
Panels: Full-Range Quantization, Narrow Quantization

16 Sine Wave (freq error)
Panels: Full-Range Quantization, Narrow Quantization

17 Modern (time)
Panels: Full-Range Quantization, Narrow Quantization

18 Modern (freq)
Panels: Full-Range Quantization, Narrow Quantization

19 Modern (freq error)
Panels: Full-Range Quantization, Narrow Quantization

20 Results
Full Range: Smallest File, Worst Sound Quality
Narrow Range: Better Sound Quality, Larger File
MP3: Industry Standard

21 Further Research
Filter Banks
  Wavelets
  Dynamic Frequency Ranges
Better Psychoacoustic Model
  Tone Designation
  Pre- and Post-Echo
Bit Allocation
Writing a File

