[Advanced] Speech & Audio Signal Processing ES 157/257: Speech and Audio Processing Prof. Patrick Wolfe, Harvard DEAS 02 February 2006.


2 [Advanced] Speech & Audio Signal Processing ES 157/257: Speech and Audio Processing Prof. Patrick Wolfe, Harvard DEAS 02 February 2006

3 State of the Art in Speech/Audio
Speech and audio processing may be divided into “low-level” and “high-level” inference.
Speech enhancement, compression, and coding are all widely used technologies. This low-level work is the most mature.
High-level tasks will drive future advances: speech/music database information retrieval; automatic speaker and speech recognition.
But low-level issues also remain…

4 Fundamental Questions
How to obtain highly structured representations of speech and audio signals? Time-frequency “atoms” as building blocks.
How can statistical inference enable advances in speech signal processing? A means to obtain an “atomic decomposition”.
Statistical modeling of time-frequency coefficients provides a principled solution.
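To make the idea of a time-frequency “atom” concrete, here is a minimal sketch in plain Python. It assumes a Gabor-style atom (a Gaussian-windowed complex sinusoid), which is one common choice and not necessarily the one used in the lecture. The names `gabor_atom` and `coefficient`, and all parameter values, are illustrative assumptions.

```python
import cmath
import math

def gabor_atom(length, center, freq, width):
    """Unit-energy Gaussian-windowed complex sinusoid (assumed atom shape).

    center and width are in samples; freq is in cycles per sample.
    """
    atom = [
        math.exp(-0.5 * ((n - center) / width) ** 2)
        * cmath.exp(2j * math.pi * freq * n)
        for n in range(length)
    ]
    norm = math.sqrt(sum(abs(a) ** 2 for a in atom))
    return [a / norm for a in atom]

def coefficient(signal, atom):
    """Inner product of the signal with one atom: one time-frequency coefficient."""
    return sum(s * a.conjugate() for s, a in zip(signal, atom))

# Toy signal: a short tone burst at 0.1 cycles/sample, centered at sample 64.
N = 128
signal = [
    math.exp(-0.5 * ((n - 64) / 8) ** 2) * math.cos(2 * math.pi * 0.1 * n)
    for n in range(N)
]

# The coefficient is large only when the atom matches the burst in both
# time and frequency.
matched = abs(coefficient(signal, gabor_atom(N, 64, 0.1, 8)))
off_freq = abs(coefficient(signal, gabor_atom(N, 64, 0.3, 8)))
off_time = abs(coefficient(signal, gabor_atom(N, 16, 0.1, 8)))
```

An atomic decomposition in this sense represents the signal with a small set of atoms whose coefficients are large. The statistical modeling of coefficients mentioned on the slide is not covered by this sketch.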

5 Representative Applications
Missing data in the context of VoIP. Examples: Original, Missing, Restored
Source / Speaker Separation. Examples: Source 1, Source 2; Mixture 1, Mixture 2; Recovery 1, Recovery 2

6 Digital Speech/Audio Processing

7 Speech Production

8 Time-Scale Modification

9 Male & Female Speaker: Original, Fast, Faster, Slower
Trumpet: Original, Fast, Slow
Speech and Quasi-Periodic Audio
Sinewave-based Modification
Voicing-dependent Rate Factor

10 More Time-Scale Modification
Falling Can, Bongo Drums, Loon: Original, Slow
Complex Non-Speech Signals
Phase-Vocoder-based Modification
Event-Dependent Phase Coherence

11 Pitch and Vocal Tract Change
Male & Female Speaker: Original, Low pitch/Long vocal tract, High pitch/Short vocal tract
Male Speaker: Original and Monotone
Sinewave-based Modification

12 Speech Coding
Female Speaker: Original, CELP 8000 bps, Sine 4800 bps, Sine 2400 bps
Male Speaker: Original, CELP 8000 bps, Sine 4800 bps, Sine 2400 bps
Sine: Sinewave-based. CELP: Code-Excited Linear Prediction

13 Noise Reduction
Cell Phone Noise, Cocktail Party, Automobile Noise: Original, Enhanced
Adaptive Wiener Filter
Adaptation Based on Spectral Change
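As a rough illustration of the Wiener filtering step, the sketch below computes the textbook per-frequency-bin Wiener gain, S / (S + N), from power estimates. It assumes the speech power is estimated by subtracting a noise power estimate from the noisy power, and it adds a gain floor. Both are common choices, not details taken from the slide. The adaptation based on spectral change is not modeled here, and the function name `wiener_gain` and the `floor` parameter are hypothetical.

```python
def wiener_gain(noisy_power, noise_power, floor=0.05):
    """Per-bin Wiener gain S / (S + N) from power estimates.

    noisy_power: power of the noisy signal in each frequency bin.
    noise_power: estimated noise power in each bin.
    floor: minimum gain (an assumed safeguard against over-suppression).
    """
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        # Power-subtraction estimate of the clean speech power (assumption).
        p_speech = max(p_noisy - p_noise, 0.0)
        gain = p_speech / (p_speech + p_noise) if p_noisy > 0 else 0.0
        gains.append(max(gain, floor))
    return gains

# Three bins with unit noise power: high SNR, 0 dB SNR, and noise only.
gains = wiener_gain([10.0, 2.0, 1.0], [1.0, 1.0, 1.0])
```

Bins dominated by speech keep a gain close to 1, while noise-only bins are attenuated down to the floor. An adaptive variant would update the power estimates from frame to frame.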

14 Compression
Reduction of Peak-to-RMS amplitude ratio
Based on Sinewave Analysis/Synthesis
Low-noise case: Original, 1.5 dB Reduction, 3.0 dB Reduction
High-noise case: Original, 1.5 dB Reduction, 3.0 dB Reduction
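The peak-to-RMS amplitude ratio (often called the crest factor) can be measured in decibels as 20·log10(peak / RMS). The sketch below only computes this quantity for two test signals. It does not implement the sinewave analysis/synthesis method the slide refers to, and the helper name `peak_to_rms_db` is an assumption.

```python
import math

def peak_to_rms_db(samples):
    """Peak-to-RMS amplitude ratio in dB; expects a non-silent signal."""
    peak = max(abs(x) for x in samples)
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)

n = 1000
# Ten full cycles of a unit-amplitude sine, and a square wave with the same sign.
sine = [math.sin(2 * math.pi * 10 * k / n) for k in range(n)]
square = [1.0 if x >= 0 else -1.0 for x in sine]

sine_db = peak_to_rms_db(sine)      # about 3.01 dB
square_db = peak_to_rms_db(square)  # 0 dB: peak equals RMS
```

On this scale, a 3.0 dB reduction corresponds to lowering the peak by a factor of about 1.41 relative to the RMS level.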

