"Bandwidth Extension of Speech Signals", 2nd Workshop on Wideband Speech Quality in Terminals and Networks: Assessment and Prediction, 22nd and 23rd June 2005


2 "Bandwidth Extension of Speech Signals"
2nd Workshop on Wideband Speech Quality in Terminals and Networks: Assessment and Prediction
22nd and 23rd June 2005, Mainz, Germany
Bernd Iser (biser@harmanbecker.com)

3 Contents
- Motivation
- Model for Speech Production Process
- Bandwidth Extension
  - Generation of the excitation signal
    - Non-linear characteristics
    - Results using non-linear characteristics
  - Generation of the spectral envelope
    - Codebook approach
    - Neural network approach
    - Linear mapping approach
  - Power adjustment
- Current Results
  - Audio samples
- Outlook

4 Motivation
Audio samples on the slide: band-limited audio signal, original audio signal.
- Problem: Degradation of speech quality due to suppression/cancellation of frequency bands (e.g., transmission over a telephone network)
- Idea: Extrapolate the missing frequency components from the band-limited signal
- Advantage: The network as well as the transmission system can remain unchanged
- But: In most cases the environment provides more bandwidth, e.g.,
  - MOST bus: 11025 Hz sampling rate
  - GSM: 8000 Hz sampling rate

5 Generation of the Excitation Signal
Block diagram of BWE (figure). Block labels: input signal, removing spectral envelope, excitation signal extension, phase manipulation, envelope estimation, narrowband parameters, power adjustment, band stop, output signal. Legend: excitation signal (source), spectral envelope (filter), model gain.

6 Generation of the Excitation Signal
Generation of a "broadband" excitation signal:
- Extension of the pitch structure in the case of voiced sounds.
- Generation of a noise-like excitation signal in the case of unvoiced sounds.

7 Generation of the Excitation Signal
Approaches for the generation of a "broadband" excitation signal:
- "Harmonic modeling"
  - Placing spectral components (pitch, voicing)
  - Function generators: sine (pitch, voicing), noise, ...
- Shifting / modulation approaches (frequency / time domain)
  - Fixed
  - Pitch adaptive (requires pitch analysis!)
- Application of non-linear characteristics
  - Piecewise defined characteristics (distributions): half-way rectification, full-way rectification, saturation, ...
  - Quadratic, cubic, tanh, ... characteristics (functions)
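
Sketch (not from the slides): the non-linear characteristics named above can be written as simple sample-wise functions. The saturation limit, the tanh drive factor, the test tone, and the helper `bin_magnitude` are assumptions chosen for illustration only.

```python
import math

def half_wave(x):
    """Half-way rectification: negative samples are set to zero."""
    return [s if s > 0.0 else 0.0 for s in x]

def full_wave(x):
    """Full-way rectification: absolute value of every sample."""
    return [abs(s) for s in x]

def saturate(x, limit=0.5):
    """Saturation: clip samples to [-limit, +limit] (limit is an assumed value)."""
    return [max(-limit, min(limit, s)) for s in x]

def quadratic(x):
    return [s * s for s in x]

def cubic(x):
    return [s ** 3 for s in x]

def tanh_characteristic(x, drive=2.0):
    """tanh characteristic; the drive factor is an assumed parameter."""
    return [math.tanh(drive * s) for s in x]

def bin_magnitude(x, k):
    """Normalized magnitude of DFT bin k (naive evaluation, fine for short frames)."""
    n = len(x)
    re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(x))
    im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(x))
    return math.hypot(re, im) / n

# Test tone with exactly 4 cycles in 64 samples, so its energy sits in bin 4 only.
N = 64
tone = [math.sin(2.0 * math.pi * 4 * i / N) for i in range(N)]

# Print the magnitudes at the fundamental (bin 4) and at bins 8 and 12.
for name, f in [("half-way", half_wave), ("full-way", full_wave),
                ("quadratic", quadratic), ("cubic", cubic)]:
    y = f(tone)
    print(name, [round(bin_magnitude(y, k), 3) for k in (4, 8, 12)])
```

For a pure sine, the quadratic characteristic produces the second harmonic (plus a DC term), while the cubic characteristic produces the third harmonic and keeps part of the fundamental.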

8 Generation of the Excitation Signal
Application of a non-linear characteristic:
Applied to a harmonic signal that has been filtered by a bandpass, the non-linear characteristic yields a signal that contains the missing harmonics. Note the aliasing in the upper frequencies.

9 Generation of the Excitation Signal
Application of a non-linear characteristic:
If the input signal is upsampled (e.g., by a factor of 4) before the half-way rectification is performed, almost no aliasing can be observed after lowpass filtering and downsampling.
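
Sketch (not from the slides): one possible way to realise "upsample, rectify, lowpass, downsample" in plain Python. The Hamming-windowed sinc filter, its length of 63 taps, and all function names are assumptions; the slides do not specify the filters used.

```python
import math

def lowpass_fir(cutoff, taps=63):
    """Hamming-windowed sinc lowpass; cutoff is a fraction of the sampling rate."""
    mid = (taps - 1) / 2.0
    h = []
    for n in range(taps):
        t = n - mid
        ideal = 2.0 * cutoff if t == 0 else math.sin(2.0 * math.pi * cutoff * t) / (math.pi * t)
        h.append(ideal * (0.54 - 0.46 * math.cos(2.0 * math.pi * n / (taps - 1))))
    dc = sum(h)
    return [c / dc for c in h]  # normalize to unity gain at DC

def filter_centered(x, h):
    """FIR filtering with the filter delay removed (zeros assumed outside x)."""
    mid = (len(h) - 1) // 2
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, c in enumerate(h):
            i = n + mid - k
            if 0 <= i < len(x):
                acc += c * x[i]
        out.append(acc)
    return out

def half_wave(x):
    """Half-way rectification: negative samples are set to zero."""
    return [s if s > 0.0 else 0.0 for s in x]

def half_wave_oversampled(x, factor=4):
    """Half-way rectification carried out at `factor` times the sampling rate."""
    h = lowpass_fir(0.5 / factor)          # cutoff at the original Nyquist frequency
    up = [0.0] * (len(x) * factor)
    for i, s in enumerate(x):
        up[i * factor] = s * factor        # zero stuffing, gain compensated
    up = filter_centered(up, h)            # interpolation filter
    rect = half_wave(up)                   # non-linearity at the high rate
    rect = filter_centered(rect, h)        # lowpass before downsampling
    return rect[::factor]                  # downsampling

def bin_magnitude(x, k):
    """Normalized magnitude of DFT bin k (naive evaluation)."""
    n = len(x)
    re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(x))
    im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(x))
    return math.hypot(re, im) / n

# Tone at bin 48 of a 256-point analysis; its 4th harmonic (bin 192) aliases to bin 64.
tone = [math.sin(2.0 * math.pi * 48 * i / 256) for i in range(320)]
direct = half_wave(tone)[32:288]               # skip the filter edge regions
clean = half_wave_oversampled(tone)[32:288]
print("alias at bin 64, direct:     ", bin_magnitude(direct, 64))
print("alias at bin 64, oversampled:", bin_magnitude(clean, 64))
```

Harmonics created by the rectifier between the original and the oversampled Nyquist frequency are removed by the second lowpass instead of folding back into the band, which is why the aliased component shrinks.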

10 Generation of the Excitation Signal
Application of a cubic characteristic in the time domain:
- Predictor error filter: predictor error filtering is used to extract the excitation signal.
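
Sketch (not from the slides): predictor error filtering via the autocorrelation method and the Levinson-Durbin recursion, followed by a cubic characteristic. The predictor order, the synthetic test signal, and the order of operations (filter first, then cube) are assumptions.

```python
import random

def autocorrelation(x, max_lag):
    """Autocorrelation values r[0..max_lag] of the frame x."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x))) for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return prediction error filter coefficients a (a[0] = 1) and the residual energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = sum(a[j] * r[i - j] for j in range(i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def prediction_error_filter(x, a):
    """e[n] = x[n] + a[1] x[n-1] + ... + a[p] x[n-p] (removes the spectral envelope)."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0) for n in range(len(x))]

def cubic(x):
    return [s ** 3 for s in x]

# Synthetic test signal: white noise shaped by a one-pole filter (assumed example data).
random.seed(0)
signal = []
state = 0.0
for _ in range(2000):
    state = 0.9 * state + random.gauss(0.0, 1.0)
    signal.append(state)

a, _ = levinson_durbin(autocorrelation(signal, 2), 2)
excitation = prediction_error_filter(signal, a)   # spectrally flattened signal
extended = cubic(excitation)                      # non-linear characteristic
print("estimated coefficients:", [round(c, 3) for c in a])
```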

11 Generation of the Spectral Envelope
Block diagram of BWE (figure, same block labels as in the earlier block diagram).

12 Generation of the Spectral Envelope
- Extension of the spectral envelope.
- Placing the formants of the estimated envelope where the broadband formants are.

13 Generation of the Spectral Envelope
Approaches for the generation of a "broadband" spectral envelope out of the "narrowband" information:
- Codebook
  - "Narrowband" and "broadband" codebooks are trained jointly using envelopes of wideband data and their band-limited counterparts
  - Weight the codebook entries with the inverse distance to the input envelope and sum them up (LSF)
  - Possibility of including features other than the spectral envelope in the "narrowband" codebook by using a special distance measure
  - Codebook approach as a classification stage with post-processing by, e.g., a neural network or linear mapping
  - Can be implemented taking predecessor and successor into account

14 Generation of the Spectral Envelope
Approaches for the generation of a "broadband" spectral envelope out of the "narrowband" information:
- Neural network
  - Exploit the quasi-stationarity of speech by using a memory
  - Feed the NN with features other than just the spectral envelope
  - Various architectures and training algorithms
  - Can be used as post-processing after codebook classification

15 Generation of the Spectral Envelope
Approaches for the generation of a "broadband" spectral envelope out of the "narrowband" information:
- Linear mapping
  - Can be implemented taking predecessor and successor into account
  - Can be used as post-processing after codebook classification

16 Generation of the Spectral Envelope
Codebook (figure). Labels: envelope of the input signal, "narrowband" codebook, comparison (distance measure), "broadband" codebook, output of the "broadband" counterpart, weighting of the codebook entries with the "inverse" distance.

17 Generation of the Spectral Envelope
Computation of the output LSFs (equation not preserved in this transcript), with N being the LSF order and M the codebook size.
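
Sketch (not from the slides): since the equation itself is missing here, the following shows one plausible reading of "weighting the codebook entries with the inverse distance and summing them up". The squared Euclidean distance, the normalization of the weights, the small constant `eps`, and the function name are assumptions.

```python
def estimate_broadband_lsf(nb_input, nb_codebook, bb_codebook, eps=1e-12):
    """Inverse-distance weighted sum of the broadband codebook entries.

    nb_input:    narrowband feature vector of the current frame
    nb_codebook: M narrowband entries
    bb_codebook: M broadband entries with N LSFs each (jointly trained counterparts)
    """
    # Assumed distance measure: squared Euclidean distance to each narrowband entry.
    dists = [sum((a - b) ** 2 for a, b in zip(nb_input, entry)) for entry in nb_codebook]
    weights = [1.0 / (d + eps) for d in dists]   # eps avoids division by zero
    total = sum(weights)
    order = len(bb_codebook[0])
    return [sum(w * entry[i] for w, entry in zip(weights, bb_codebook)) / total
            for i in range(order)]

# Toy example with M = 2 entries and N = 3 broadband LSFs (made-up numbers).
nb_codebook = [[0.0, 0.0], [1.0, 1.0]]
bb_codebook = [[0.1, 0.2, 0.3], [0.5, 0.6, 0.7]]
print(estimate_broadband_lsf([0.5, 0.5], nb_codebook, bb_codebook))
```

With normalized, non-negative weights the result is a convex combination of the codebook entries, so ascending LSF entries yield an ascending output vector. This is one reason why the weighted sum is convenient in the LSF domain.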

18 Generation of the Spectral Envelope
Training of codebook (LBG algorithm):
1. Initialising: Compute the centroid of the whole training data.
2. Splitting: Each centroid is split into two nearby vectors by applying a perturbation.
3. Quantization: The whole training data is assigned to the centroids using a certain distance measure, and afterwards the centroids are recalculated. Step 3 is repeated until the result no longer shows any significant changes.
4. If the desired codebook size is reached, stop. Otherwise continue with step 2.
Distance measures listed on the slide (formulas not preserved in this transcript):
- Spectral distortion
- City block distance
- Euclidean distance
- Minkowski distance
- Likelihood ratio distance measure
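
Sketch (not from the slides): the four LBG steps above in plain Python. The additive perturbation, the stopping threshold `tol`, and the toy training data are assumptions. `minkowski` covers the city block distance (p = 1) and the Euclidean distance (p = 2); the codebook size is assumed to be a power of two.

```python
def minkowski(a, b, p=2.0):
    """Minkowski distance; p = 1 is the city block, p = 2 the Euclidean distance."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def lbg(data, size, perturbation=0.01, tol=1e-6, max_iter=100, dist=minkowski):
    codebook = [centroid(data)]                          # step 1: initialising
    while len(codebook) < size:                          # step 4: size reached?
        codebook = [[c + s * perturbation for c in entry]
                    for entry in codebook for s in (1.0, -1.0)]   # step 2: splitting
        previous = float("inf")
        for _ in range(max_iter):                        # step 3: quantization
            cells = [[] for _ in codebook]
            total = 0.0
            for v in data:
                d, i = min((dist(v, entry), i) for i, entry in enumerate(codebook))
                cells[i].append(v)
                total += d
            # Recompute centroids; an empty cell keeps its previous entry.
            codebook = [centroid(cell) if cell else entry
                        for cell, entry in zip(cells, codebook)]
            if previous - total <= tol * total:          # no significant change
                break
            previous = total
    return codebook

# Toy training data with two obvious clusters (made-up numbers).
data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
        [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0]]
print(sorted(lbg(data, 2)))
```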

19 Generation of the Spectral Envelope
Linear mapping (formulas not preserved in this transcript):
- Narrowband input features (LPC, CC, LSF)
- Broadband input features (LPC, CC, LSF)
- Aim: find the mapping matrix
- Optimization criterion
- Leads to the optimal mapping matrix
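
Sketch (not from the slides): the slide's formulas are missing here, so the following gives the standard least-squares form of a linear mapping as one plausible reading. All symbols are assumptions: x(k) is the narrowband feature vector of training frame k, y(k) the corresponding broadband feature vector, K the number of training frames, and X, Y the matrices whose columns are x(k) and y(k).

```latex
\begin{align}
  \hat{\mathbf{y}}(k) &= \mathbf{W}\,\mathbf{x}(k) \\
  J(\mathbf{W}) &= \sum_{k=0}^{K-1} \bigl\| \mathbf{y}(k) - \mathbf{W}\,\mathbf{x}(k) \bigr\|^{2} \;\rightarrow\; \min \\
  \mathbf{W}_{\mathrm{opt}} &= \mathbf{Y}\,\mathbf{X}^{\mathrm{T}} \bigl( \mathbf{X}\,\mathbf{X}^{\mathrm{T}} \bigr)^{-1}
\end{align}
```

The last line follows from setting the gradient of J with respect to W to zero, which gives W X Xᵀ = Y Xᵀ; it requires X Xᵀ to be invertible.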

20 Generation of the Spectral Envelope

21 Generation of the Spectral Envelope
Linear mapping as a post-processing algorithm after codebook classification:
Note that this principle can be applied to other approaches. For example, the multiplication with the linear mapping matrix could be replaced by processing with a neural network that has been trained for the respective codebook entry selected by the classification.
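
Sketch (not from the slides): one way to combine codebook classification with a class-specific linear mapping. The exact formula on the slide is not preserved; mapping the deviation from the selected narrowband entry and adding it to the broadband counterpart is an assumption, as are all names and numbers below.

```python
def nearest_entry(x, codebook):
    """Classification: index of the codebook entry closest to x (squared Euclidean)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(x, codebook[i])))

def mat_vec(matrix, vector):
    return [sum(c * s for c, s in zip(row, vector)) for row in matrix]

def map_with_classification(x, nb_codebook, bb_codebook, matrices):
    """Pick a class, then apply that class's own mapping matrix."""
    i = nearest_entry(x, nb_codebook)
    deviation = [a - b for a, b in zip(x, nb_codebook[i])]
    correction = mat_vec(matrices[i], deviation)     # class-specific linear mapping
    return [c + d for c, d in zip(bb_codebook[i], correction)]

# Toy example: 2 classes, 2 narrowband and 3 broadband features (made-up numbers).
nb_codebook = [[0.0, 0.0], [1.0, 1.0]]
bb_codebook = [[0.0, 0.0, 0.0], [2.0, 2.0, 2.0]]
matrices = [[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]] * 2
print(map_with_classification([1.1, 0.9], nb_codebook, bb_codebook, matrices))
```

Replacing `mat_vec(matrices[i], ...)` by a call to a per-class model is the substitution the note on this slide describes.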

22 Power Adjustment
Block diagram of BWE (figure, same block labels as in the earlier block diagram).

23 Power Adjustment
Power comparison:
Computation of the gain from the ratio of the power of the extended signal to the power of the input signal within the telephone band.
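
Sketch (not from the slides): a frame-wise gain that makes the extended signal match the input signal's power inside the telephone band. The band edges of 300 Hz and 3400 Hz, the naive DFT, and the function names are assumptions; the slides do not give the formula.

```python
import math

def band_power(x, fs, f_lo, f_hi):
    """Sum of squared DFT magnitudes of x for bins between f_lo and f_hi (in Hz)."""
    n = len(x)
    power = 0.0
    for k in range(n // 2 + 1):
        if f_lo <= k * fs / n <= f_hi:
            re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(x))
            im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(x))
            power += re * re + im * im
    return power

def power_adjust_gain(input_frame, extended_frame, fs, f_lo=300.0, f_hi=3400.0):
    """Amplitude gain for the extended frame so that both in-band powers agree."""
    p_in = band_power(input_frame, fs, f_lo, f_hi)
    p_ext = band_power(extended_frame, fs, f_lo, f_hi)
    return math.sqrt(p_in / p_ext) if p_ext > 0.0 else 0.0

# Toy frames: the "extended" frame is twice as loud in-band and has an extra 3750 Hz tone.
fs, n = 8000.0, 64
frame_in = [math.sin(2.0 * math.pi * 1000.0 * i / fs) for i in range(n)]
frame_ext = [2.0 * s + math.sin(2.0 * math.pi * 3750.0 * i / fs)
             for i, s in enumerate(frame_in)]
gain = power_adjust_gain(frame_in, frame_ext, fs)
adjusted = [gain * s for s in frame_ext]
print("gain:", gain)
```

Because only the telephone band enters the comparison, the added high-frequency content does not influence the gain.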

24 Current Results
Setup used to produce results:
- Database
  - TIMIT processed with WM NetSim tool (training, English)
    - Phone filter / GSM / phone filter
- Algorithm
  - Excitation signal
    - Lower part extended using half-way rectification
    - Higher part extended using half-way rectification
  - Spectral envelope
    - Codebook classification using 64 entries
    - Post-processing with linear mapping

25 Current Results
Audio samples (not available in this transcript): telephone-limited and extended versions for four speakers (Female 1, Female 2, Male 1, Male 2).

26 Outlook
Outlook on future work:
- Integration of additional features into codebook training
  - Pitch information
  - Information on "voicedness"
- Add "comfort noise"
- Training of neural network
  - Using additional features
  - In combination with codebook

27 Thank you for your attention!

