Published by John McPherson. Modified over 11 years ago.
1
W.A.V.S. Compression
Alex Chen, Nader Shehad, Aamir Virani, Erik Welsh
2
Overview
- Approach
- Psychoacoustic Modeling
- Filter Banks
- Quantization
- Demonstration
- Results
- Further Research
3
Approach
Encoding: Input -> Filter Banks -> Quantization -> Encoded Signal
  (the Psychoacoustic Model appears as a separate block in the encoder)
Decoding: Encoded Signal -> Inverse Quantization -> Reconstruction Filter Banks -> Output
4
Psychoacoustic Model
- Based on studies showing that hearing capabilities are affected by:
  - Environment
  - Limitations of the human auditory system
- Used to eliminate portions of the signal that the average human won't hear
- Two key properties:
  - Absolute threshold of hearing
  - Auditory masking
5
Absolute Threshold of Hearing
- Experiment: plot the audible threshold of a tone
- Observations:
  - The auditory system is more sensitive to some frequencies than others
  - Frequencies within a “critical bandwidth” are treated similarly
  - Basis for the Bark scale
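To make the threshold curve and the Bark scale concrete, here is a minimal sketch using two widely quoted closed-form approximations: Terhardt's formula for the threshold in quiet and the Zwicker and Terhardt formula for Hz-to-Bark conversion. The slides do not say which formulas the authors used, so both functions and their names are assumptions for illustration only.

```python
import math

def absolute_threshold_db(freq_hz):
    """Approximate threshold in quiet, in dB SPL (Terhardt's approximation)."""
    f = freq_hz / 1000.0  # the formula is written in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def hz_to_bark(freq_hz):
    """Approximate critical-band rate in Bark (Zwicker and Terhardt approximation)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

# Print frequency, threshold, and Bark value for a few test tones.
for f in (100, 1000, 3300, 10000):
    print(f, round(absolute_threshold_db(f), 1), round(hz_to_bark(f), 1))
```

The threshold dips in the low-kHz region, where hearing is most sensitive, and rises steeply toward both ends of the spectrum.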
6
Auditory Masking
- Tones and noise drown out less powerful sounds
  - Affect neighboring frequencies
  - Affect the critical bandwidth
- Effects add to produce an overall masking threshold
- The overall threshold is used to mask quantization noise
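The slide says the individual masking effects add, but not how. One common choice is to add them as powers rather than as dB values; the sketch below assumes that rule. The function names and the example levels are hypothetical.

```python
import math

def combine_thresholds_db(thresholds_db):
    """Add individual masking thresholds as powers; return the total in dB."""
    total_power = sum(10.0 ** (t / 10.0) for t in thresholds_db)
    return 10.0 * math.log10(total_power)

def is_masked(component_db, thresholds_db):
    """True if a component falls below the combined masking threshold."""
    return component_db < combine_thresholds_db(thresholds_db)

# Two maskers that each contribute a 40 dB threshold at the same frequency.
overall = combine_thresholds_db([40.0, 40.0])
print(round(overall, 2), is_masked(42.0, [40.0, 40.0]))
```

Under this rule two equal contributions raise the threshold by about 3 dB, so a 42 dB component that clears either masker alone is still hidden by the pair.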
7
Filter Banks Theory
- Array of bandpass filters
- Break up the signal into frequency subbands
- Allows for a variable coding scheme
8
Analysis and Synthesis Banks
1) Analysis filters divide up the signal
2) Down-sample
3) Quantize
4) Up-sample
5) Synthesis filters remove distortions
6) Reconstruct the signal
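The six steps can be sketched with the simplest perfect-reconstruction bank, a two-band Haar pair. This is a stand-in chosen for brevity, not the authors' filter bank. Filtering and down-sampling are merged into one step (and likewise up-sampling and synthesis), the input is assumed to have even length, and all names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def analysis(x):
    """Steps 1-2: split into low and high bands, keeping every second output."""
    pairs = range(0, len(x) - 1, 2)
    low = [(x[i] + x[i + 1]) / SQRT2 for i in pairs]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in pairs]
    return low, high

def quantize_band(band, step):
    """Step 3: uniform quantization; each band may use its own step size."""
    return [step * round(v / step) for v in band]

def synthesis(low, high):
    """Steps 4-6: up-sample, apply the synthesis filters, and sum the bands."""
    out = []
    for l, h in zip(low, high):
        out.append((l + h) / SQRT2)
        out.append((l - h) / SQRT2)
    return out

x = [0.1, 0.4, -0.3, 0.8, 0.25, -0.6]
low, high = analysis(x)
exact = synthesis(low, high)  # no quantization: input is recovered
lossy = synthesis(quantize_band(low, 0.1), quantize_band(high, 0.1))
print(max(abs(a - b) for a, b in zip(x, exact)))
print(max(abs(a - b) for a, b in zip(x, lossy)))
```

Giving each band its own step size is what makes a variable coding scheme possible: bands where noise is less audible can be quantized more coarsely.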
9
Filter Bank Design Phase
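- Tradeoff between fine and coarse frequency resolution
  - Piccolo vs. castanets (a steady tone favors fine frequency resolution; a sharp transient favors fine time resolution)
- Non-stationary signals
- We used a non-adaptive approach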
10
Filter Bank Implementation
- We used cosine-modulated PR (perfect reconstruction) filter banks with 32 filters each
- Output is a delayed version of the input (linear phase)
- Distortion arises from quantization only
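As an illustration of perfect reconstruction with 32 cosine-modulated bands, the sketch below uses the MDCT with a sine window, which is one well-known bank of this family. The slides do not say which design the authors used, so this is a substitute, and every name here is hypothetical. The input is zero-padded and the output realigned, so the delay a streaming implementation would show is hidden.

```python
import math

M = 32  # number of subbands, matching the 32 filters on the slide

# Sine window of length 2M; it satisfies w[n]^2 + w[n + M]^2 = 1.
WINDOW = [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]

def _basis(n, k):
    """Cosine modulation term shared by analysis and synthesis."""
    return math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))

def mdct(block):
    """Analysis: 2M windowed samples -> M subband coefficients."""
    return [sum(WINDOW[n] * block[n] * _basis(n, k) for n in range(2 * M))
            for k in range(M)]

def imdct(coeffs):
    """Synthesis: M coefficients -> 2M windowed, time-aliased samples."""
    return [WINDOW[n] * (2.0 / M) * sum(coeffs[k] * _basis(n, k) for k in range(M))
            for n in range(2 * M)]

def analysis_synthesis(x):
    """Pass x through the bank using 50% overlapped blocks and overlap-add."""
    padded = [0.0] * M + list(x) + [0.0] * M
    padded += [0.0] * (-len(padded) % M)  # pad to a multiple of M
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * M + 1, M):
        y = imdct(mdct(padded[start:start + 2 * M]))
        for n in range(2 * M):
            out[start + n] += y[n]  # aliasing cancels between neighbors
    return out[M:M + len(x)]

x = [math.sin(0.1 * n) for n in range(100)]
y = analysis_synthesis(x)
print(max(abs(a - b) for a, b in zip(x, y)))
```

With no quantizer between `mdct` and `imdct`, the only difference between input and output is floating-point rounding, which matches the claim that distortion arises from quantization only.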
11
Quantization
- Two types: narrow-band and full-range
- Narrow-band
  - Depends on the current input
  - Overhead cost
- Full-range
  - Independent of the current input
  - No overhead
- Sampled Input -> Quantized Version -> Reconstructed Input
12
Quantization
- Narrow Band
  - More accurate
  - Lower compression ratio
- Full-Range
  - Less accurate
  - Higher compression ratio
- Using 3-bit quantization (worked example; input, level, and reconstruction values not shown)
  - Narrow band: total error .16
  - Full-range: total error .34
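A minimal sketch of the two quantizers at 3 bits, assuming samples normalized to [-1, 1] and a uniform quantizer in both cases. The function names and the sample block are hypothetical, and the numbers do not reproduce the slide's example.

```python
def quantize(samples, lo, hi, bits=3):
    """Map each sample to the nearest of 2**bits evenly spaced levels in [lo, hi]."""
    levels = 2 ** bits
    step = (hi - lo) / (levels - 1) if hi > lo else 1.0
    out = []
    for v in samples:
        idx = max(0, min(levels - 1, round((v - lo) / step)))
        out.append(lo + idx * step)
    return out

def full_range(samples, bits=3):
    """Fixed range [-1, 1]: independent of the input, no side information."""
    return quantize(samples, -1.0, 1.0, bits)

def narrow_range(samples, bits=3):
    """Range fitted to this block: lo and hi must be stored as overhead."""
    lo, hi = min(samples), max(samples)
    return quantize(samples, lo, hi, bits), (lo, hi)

def total_error(original, reconstructed):
    """Sum of absolute differences between input and reconstruction."""
    return sum(abs(a - b) for a, b in zip(original, reconstructed))

block = [0.10, 0.12, 0.15, 0.11, 0.18]
full = full_range(block)
narrow, overhead = narrow_range(block)
print(total_error(block, full), total_error(block, narrow), overhead)
```

The narrow quantizer spends its eight levels only on the range the block actually occupies, so its error is smaller, but the `(lo, hi)` pair has to be sent with every block, which lowers the compression ratio.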
13
Demonstration
- Audio samples: sine wave, chime, percussion, modern
- Versions played: full range, narrow range, 8-bit
14
Sine Wave (time)
[Plots: Full-Range Quantization, Narrow Quantization]
15
Sine Wave (freq)
[Plots: Full-Range Quantization, Narrow Quantization]
16
Sine Wave (freq error)
[Plots: Full-Range Quantization, Narrow Quantization]
17
Modern (time)
[Plots: Full-Range Quantization, Narrow Quantization]
18
Modern (freq)
[Plots: Full-Range Quantization, Narrow Quantization]
19
Modern (freq error)
[Plots: Full-Range Quantization, Narrow Quantization]
20
Results
- Full Range: smallest file, worst sound quality
- Narrow Range: better sound quality, larger file
- MP3: industry standard
21
Further Research
- Filter Banks
  - Wavelets
  - Dynamic Frequency Ranges
- Better Psychoacoustic Model
  - Tone Designation
  - Pre- and Post-Echo
- Bit Allocation
- Writing a File