Alex Chen Nader Shehad Aamir Virani Erik Welsh

Name: Alex Chen Nader Shehad Aamir Virani Erik Welsh
Uploaded: 2017-12-15T16:38:45+00:00
Duration: PT5M21S
Channel: John McPherson
Description: Alex Chen Nader Shehad Aamir Virani Erik Welsh

W.A.V.S. Compression
Alex Chen, Nader Shehad, Aamir Virani, Erik Welsh

Overview
  Approach
  Psychoacoustic Modeling
  Filter Banks
  Quantization
  Demonstration
  Results
  Further Research

Approach
  Encoding: Input -> Filter Banks -> Quantization -> Encoded Signal
    Psychoacoustic Model (encoder side)
  Decoding: Encoded Signal -> Inverse Quantization -> Reconstruction Filter Banks -> Output

Psychoacoustic Model
  Based on studies showing that hearing capability is affected by:
    Environment
    Limitations of the human auditory system
  Used to eliminate portions of the signal the average human won’t hear
  Two key properties:
    Absolute threshold of hearing
    Auditory masking

Absolute Threshold of Hearing
  Experiment: plot the audible threshold of a tone
  Observations:
    Auditory system is more sensitive to some frequencies than to others
    Frequencies within a “critical bandwidth” are treated similarly
    Basis for the Bark scale
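
The two observations above can be sketched numerically. The slides do not say which formulas the authors used; this sketch assumes two widely quoted approximations: Terhardt's fit for the threshold in quiet and the Zwicker-Terhardt fit for the Bark scale. The function names are illustrative only.

```python
import math

def absolute_threshold_db(freq_hz):
    """Approximate threshold in quiet, in dB SPL (Terhardt's fit; an assumption here)."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

def hz_to_bark(freq_hz):
    """Approximate critical-band rate in Bark (Zwicker-Terhardt fit; an assumption here)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))

# The ear is most sensitive around 3-4 kHz, so the threshold dips there.
for f in (100, 1000, 3300, 10000):
    print(f, "Hz:", round(absolute_threshold_db(f), 2), "dB SPL,",
          round(hz_to_bark(f), 2), "Bark")
```

A component whose level falls below `absolute_threshold_db` at its frequency is a candidate for removal, which is the sense in which the model eliminates what the average listener will not hear.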

Auditory Masking
  Tones and noise drown out less powerful sounds
    Affect neighboring frequencies
    Affect the critical bandwidth
  Effects add to produce an overall masking threshold
  Threshold is used to mask quantization noise
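
A rough sketch of how individual masking effects could add into one overall threshold. The triangular spreading shape, the slopes, and the 10 dB offset below are hypothetical placeholders chosen for illustration; the slides do not give the model's actual parameters.

```python
import math

def spread_db(masker_level_db, masker_bark, bark,
              offset_db=10.0, lower_slope=27.0, upper_slope=10.0):
    """Masking caused at `bark` by one masker (triangular shape; all values hypothetical)."""
    dz = bark - masker_bark
    slope = upper_slope if dz >= 0 else lower_slope  # dB lost per Bark of distance
    return masker_level_db - offset_db - slope * abs(dz)

def overall_threshold_db(bark, maskers, quiet_db):
    """Add the threshold in quiet and every masker's contribution as powers."""
    power = 10 ** (quiet_db / 10)
    for level_db, masker_bark in maskers:
        power += 10 ** (spread_db(level_db, masker_bark, bark) / 10)
    return 10 * math.log10(power)

maskers = [(70.0, 8.5)]  # one 70 dB masker at 8.5 Bark (made-up example)
print(overall_threshold_db(9.0, maskers, quiet_db=3.0))   # close to the masker
print(overall_threshold_db(20.0, maskers, quiet_db=3.0))  # far from the masker
```

Quantization noise kept below this threshold in a given band should be inaudible under the model, which is why the threshold can steer how coarsely each band is quantized.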

Filter Banks Theory
  Array of bandpass filters
  Break up the signal into frequency subbands
  Allows for a variable coding scheme

Analysis and Synthesis Banks
  1) Analysis filters divide up the signal
  2) Down-sample
  3) Quantize
  4) Up-sample
  5) Synthesis filters remove distortions
  6) Reconstruct the signal
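
The six steps can be illustrated with the smallest possible case: a two-band Haar filter bank. This is a stand-in chosen for brevity, not the 32-band bank used in the project, and the function names are illustrative.

```python
import math

R = math.sqrt(0.5)

def analyze(x):
    """Steps 1-2: low/high band filtering and down-sampling by 2 in one pass.
    Assumes len(x) is even."""
    low = [R * (x[i] + x[i + 1]) for i in range(0, len(x) - 1, 2)]
    high = [R * (x[i] - x[i + 1]) for i in range(0, len(x) - 1, 2)]
    return low, high

def quantize(band, step):
    """Step 3: uniform quantization to multiples of `step` (step must be > 0)."""
    return [round(v / step) * step for v in band]

def synthesize(low, high):
    """Steps 4-6: up-sample, apply the synthesis filters, and sum the bands."""
    out = []
    for l, h in zip(low, high):
        out += [R * (l + h), R * (l - h)]
    return out

x = [0.0, 0.3, 0.5, 0.4, 0.1, -0.2, -0.5, -0.3]
low, high = analyze(x)
exact = synthesize(low, high)                                  # no quantization
lossy = synthesize(quantize(low, 0.25), quantize(high, 0.25))  # with quantization
print(max(abs(a - b) for a, b in zip(x, exact)))
print(max(abs(a - b) for a, b in zip(x, lossy)))
```

Without the quantizer the output matches the input up to rounding; with it, the only error left is the quantization error, and each band can be given its own step size.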

Filter Bank Design Phase
  Tradeoff between fine and coarse frequency resolution
    Piccolo vs. castanets
    Non-stationary signals
  We used a non-adaptive approach

Filter Bank Implementation
  We used cosine-modulated PR (perfect reconstruction) filter banks with 32 filters each
  Output is a delayed version of the input (linear phase)
  Distortion arises from quantization only
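
One well-known cosine-modulated perfect-reconstruction bank is the MDCT with a sine window. The sketch below uses it with 32 subbands to show the reconstruction property; it is an assumption made for illustration, since the slides do not give the authors' prototype filter or filter length.

```python
import math

M = 32  # number of subbands, matching the 32 filters mentioned above

# Sine window: satisfies w[n]^2 + w[n+M]^2 = 1 (Princen-Bradley condition).
WINDOW = [math.sin(math.pi * (n + 0.5) / (2 * M)) for n in range(2 * M)]

# Cosine modulation table, COS[k][n], for subband k and time index n.
COS = [[math.cos(math.pi / M * (n + 0.5 + M / 2) * (k + 0.5))
        for n in range(2 * M)] for k in range(M)]

def analysis(x):
    """Windowed frames of length 2M, hop M; each frame yields M subband samples."""
    frames = []
    for start in range(0, len(x) - 2 * M + 1, M):
        seg = [x[start + n] * WINDOW[n] for n in range(2 * M)]
        frames.append([sum(c * s for c, s in zip(COS[k], seg)) for k in range(M)])
    return frames

def synthesis(frames):
    """Inverse transform, window again, and overlap-add neighboring frames."""
    out = [0.0] * (M * (len(frames) + 1))
    for t, coeffs in enumerate(frames):
        for n in range(2 * M):
            y = sum(coeffs[k] * COS[k][n] for k in range(M))
            out[t * M + n] += (2.0 / M) * WINDOW[n] * y
    return out

x = [math.sin(0.1 * n) + 0.3 * math.cos(0.37 * n) for n in range(6 * M)]
y = synthesis(analysis(x))
# Only samples covered by two overlapping frames are fully reconstructed.
interior = range(M, len(x) - M)
print(max(abs(x[i] - y[i]) for i in interior))
```

Each output sample is complete only after the following frame has been processed, which is the block-based form of the delay noted above. With no quantizer between `analysis` and `synthesis`, the interior error stays at floating-point rounding level.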

Quantization
  Two types: narrow-band and full-range
  Narrow-band
    Depends on the current input
    Overhead cost
  Full-range
    Independent of the current input
    No overhead
  Sampled Input -> Quantized Version -> Reconstructed Input

Quantization
  Narrow Band: more accurate, lower compression ratio
  Full-Range: less accurate, higher compression ratio
  Using 3-bit Quantization:
    Input / Levels / Recon.: Total Error: .16
    Input / Output / Recon.: Total Error: .34
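
A minimal sketch of the two quantizer types at 3 bits. The input block, the fixed range of -1 to 1, and the use of the block's minimum and maximum as the narrow range are assumptions made for illustration; the example values on the slide were not preserved, so the errors printed here are not the slide's .16 and .34.

```python
def uniform_quantize(block, lo, hi, bits=3):
    """Map each sample to one of 2**bits codes over [lo, hi]; return codes and reconstruction."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    if step == 0:  # constant block: a single level is enough
        return [0] * len(block), [lo] * len(block)
    codes = [min(levels - 1, max(0, int((v - lo) / step))) for v in block]
    recon = [lo + (c + 0.5) * step for c in codes]  # mid-point of each cell
    return codes, recon

def quantize_full_range(block, bits=3):
    """Fixed range, independent of the input: no side information needed."""
    return uniform_quantize(block, -1.0, 1.0, bits)

def quantize_narrow(block, bits=3):
    """Range fitted to the current block: (lo, hi) must be stored as overhead."""
    lo, hi = min(block), max(block)
    codes, recon = uniform_quantize(block, lo, hi, bits)
    return codes, recon, (lo, hi)

def total_error(block, recon):
    return sum(abs(a - b) for a, b in zip(block, recon))

block = [0.10, 0.15, 0.22, 0.30, 0.27, 0.18]  # hypothetical input
_, full_recon = quantize_full_range(block)
_, narrow_recon, side_info = quantize_narrow(block)
print("full-range error:", total_error(block, full_recon))
print("narrow error:", total_error(block, narrow_recon), "overhead:", side_info)
```

The narrow quantizer spends its eight levels on the range the block actually occupies, so its error is smaller, but the range has to be transmitted with every block, which lowers the compression ratio.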

Demonstration
  Samples: Sine wave, Chime, Percussion, Modern
  Versions: Full range, Narrow range, 8-bit

Sine Wave plots (time, freq, freq error): Full-Range Quantization vs. Narrow Quantization

Modern plots (time, freq, freq error): Full-Range Quantization vs. Narrow Quantization

Results
  Full Range: smallest file, worst sound quality
  Narrow Range: better sound quality, larger file
  MP3: industry standard

Further Research
  Filter Banks
    Wavelets
    Dynamic Frequency Ranges
  Better Psychoacoustic Model
    Tone Designation
    Pre- and Post-Echo
  Bit Allocation
  Writing a File
