„Bandwidth Extension of Speech Signals“ 2nd Workshop on Wideband Speech Quality in Terminals and Networks: Assessment and Prediction 22nd and 23rd June.


„Bandwidth Extension of Speech Signals“ 2nd Workshop on Wideband Speech Quality in Terminals and Networks: Assessment and Prediction 22nd and 23rd June Mainz, Germany Bernd Iser

Contents
- Motivation
- Model for Speech Production Process
- Bandwidth Extension
  - Generation of the excitation signal
    - Non-linear characteristics
    - Results using non-linear characteristics
  - Generation of the spectral envelope
    - Codebook approach
    - Neural network approach
    - Linear mapping approach
  - Power adjustment
- Current Results
  - Audio samples
- Outlook

Motivation
(Figure: band-limited audio signal and original audio signal)
Problem: Degradation of speech quality due to suppression/cancelation of frequency bands (e.g., transmission over a telephone network)
Idea: Extrapolate the missing frequency components from the band-limited signal
Advantage: Network as well as transmission system can remain unchanged
But: In most cases the environment provides more bandwidth (e.g., MOST-bus: Hz sampling rate, or GSM: 8000 Hz sampling rate)

Generation of the Excitation Signal
Block diagram of BWE (figure). Block labels: power adjustment, envelope estimation, band stop, narrowband parameters, removing spectral envelope, excitation signal extension, input signal, output signal, phase manipulation, excitation signal (source), spectral envelope (filter), model gain.

Generation of the Excitation Signal
Generation of a „broadband“ excitation signal:
- Extension of the pitch structure in case of voiced sounds.
- Generation of a noise-like excitation signal in case of unvoiced sounds.

Generation of the Excitation Signal
Approaches for the generation of a „broadband“ excitation signal:
- „Harmonic modeling“
  - Placing spectral components (pitch, voicing)
  - Function generators: sine (pitch, voicing), noise, ...
- Shifting / modulation approaches (frequency / time domain)
  - Fixed
  - Pitch adaptive (requires pitch analysis!)
- Application of non-linear characteristics
  - Piecewise defined characteristics (distributions): half-wave rectification, full-wave rectification, saturation, ...
  - Quadratic, cubic, tanh, ... characteristics (functions)
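The non-linear characteristics named in the last group can be sketched in a few lines. This is a minimal illustration, assuming NumPy; the function names, the dictionary `CHARACTERISTICS`, and the clipping limit of 0.5 are hypothetical choices, not taken from the slides.

```python
import numpy as np

def half_wave(x):
    """Half-wave rectification: keep positive samples, zero the rest."""
    return np.maximum(x, 0.0)

def full_wave(x):
    """Full-wave rectification: absolute value of every sample."""
    return np.abs(x)

def saturation(x, limit=0.5):
    """Hard clipping at +/- limit (the limit is an illustrative value)."""
    return np.clip(x, -limit, limit)

# Piecewise defined characteristics and smooth functions side by side.
CHARACTERISTICS = {
    "half_wave": half_wave,
    "full_wave": full_wave,
    "saturation": saturation,
    "quadratic": np.square,
    "cubic": lambda x: x ** 3,
    "tanh": np.tanh,
}

def apply_characteristic(x, name):
    """Apply one memoryless characteristic sample by sample."""
    return CHARACTERISTICS[name](np.asarray(x, dtype=float))

# A cosine at DFT bin 5 passed through the quadratic characteristic
# gains a component at bin 10, i.e. a new harmonic.
n = np.arange(64)
x = np.cos(2 * np.pi * 5 * n / 64)
spectrum = np.abs(np.fft.rfft(apply_characteristic(x, "quadratic")))
```

All of these are memoryless, so they need no pitch analysis; the harmonics they create follow the pitch of the input automatically.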

Generation of the Excitation Signal
Application of a non-linear characteristic:
Applied to a harmonic signal that has been filtered by a bandpass, the non-linear characteristic produces a signal that contains the missing harmonics. Note the aliasing in the upper frequencies.

Generation of the Excitation Signal
Application of a non-linear characteristic:
If the input signal is upsampled (e.g., by a factor of 4) before the half-wave rectification is performed, almost no aliasing can be observed after low-pass filtering and downsampling.
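The effect of upsampling before the rectification can be reproduced with a small experiment. The sketch below assumes NumPy and uses an idealized FFT-based resampler (`resample_fft`, a hypothetical helper that treats the frame as periodic and ignores the special handling of the Nyquist bin); a real system would use a polyphase or FIR resampler instead.

```python
import numpy as np

def resample_fft(x, num):
    """Resample x to num samples by zero-padding or truncating its spectrum."""
    X = np.fft.rfft(x)
    Y = np.zeros(num // 2 + 1, dtype=complex)
    k = min(len(X), len(Y))
    Y[:k] = X[:k]                      # truncation acts as an ideal low-pass
    return np.fft.irfft(Y, n=num) * (num / len(x))

def rectify_with_oversampling(x, factor=4):
    """Upsample, half-wave rectify, then low-pass filter and downsample."""
    up = resample_fft(x, len(x) * factor)
    rectified = np.maximum(up, 0.0)
    return resample_fft(rectified, len(x))

# Test tone at DFT bin 50 of a 256-sample frame. Its 4th harmonic (bin 200)
# lies above the Nyquist bin 128 and folds back to bin 256 - 200 = 56.
N, k0 = 256, 50
x = np.cos(2 * np.pi * k0 * np.arange(N) / N)
direct = np.maximum(x, 0.0)                     # rectified at the original rate
oversampled = rectify_with_oversampling(x, 4)   # rectified at 4x the rate

alias_direct = np.abs(np.fft.rfft(direct))[N - 4 * k0]
alias_oversampled = np.abs(np.fft.rfft(oversampled))[N - 4 * k0]
```

Comparing `alias_direct` with `alias_oversampled` shows how much of the folded-back 4th harmonic remains in each case.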

Generation of the Excitation Signal
Application of a cubic characteristic in the time domain:
- Predictor error filtering is used to extract the excitation signal.
(Figure: predictor error filter)
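A rough sketch of this step, assuming NumPy: the predictor coefficients are estimated with the autocorrelation method (an assumption, the slides do not state how the predictor is obtained), the predictor error filter removes the spectral envelope, and the cubic characteristic is applied to the residual. The names `lpc`, `prediction_error`, and `extend_excitation_cubic` as well as the order of 10 are hypothetical.

```python
import numpy as np

def lpc(x, order):
    """Predictor coefficients a[1..order] via the autocorrelation method."""
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    # Toeplitz autocorrelation matrix of the normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1 : order + 1])

def prediction_error(x, a):
    """e[n] = x[n] - sum_k a[k] x[n-k], i.e. the FIR predictor error filter."""
    h = np.concatenate(([1.0], -a))
    return np.convolve(x, h)[: len(x)]

def extend_excitation_cubic(x, order=10):
    """Extract the excitation and apply a cubic characteristic to it."""
    a = lpc(x, order)
    e = prediction_error(x, a)
    return e ** 3, a

# Synthetic first-order autoregressive signal as a stand-in for speech.
rng = np.random.default_rng(0)
w = rng.standard_normal(4000)
x = np.zeros(4000)
for n in range(1, 4000):
    x[n] = 0.9 * x[n - 1] + w[n]
a1 = lpc(x, 1)
residual = prediction_error(x, a1)
```

The cubic characteristic changes the signal level, which is one reason a separate power adjustment stage is needed afterwards.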

Generation of the Spectral Envelope
Block diagram of BWE (figure).

Generation of the Spectral Envelope
- Extension of the spectral envelope.
- Placing the formants of the estimated envelope where the broadband formants are.

Generation of the Spectral Envelope
Approaches for the generation of a „broadband“ spectral envelope out of the „narrowband“ information:
- Codebook
  - „Narrowband“ and „broadband“ codebook trained jointly using envelopes of wideband data and their band-limited counterparts
  - Weight codebook entries with the inverse distance to the input envelope and sum them up (LSF)
  - Possibility of including features other than the spectral envelope in the „narrowband“ codebook using a special distance measure
  - Codebook approach as classification stage with post processing by, e.g., neural network or linear mapping
  - Can be implemented taking predecessor and successor into account

Generation of the Spectral Envelope
Approaches for the generation of a „broadband“ spectral envelope out of the „narrowband“ information:
- Neural network
  - Exploit the quasi-stationarity of speech by using a memory
  - Feeding the NN with features other than just the spectral envelope
  - Various architectures and training algorithms
  - Can be used as post processing after codebook classification

Generation of the Spectral Envelope
Approaches for the generation of a „broadband“ spectral envelope out of the „narrowband“ information:
- Linear mapping
  - Can be implemented taking predecessor and successor into account
  - Can be used as post processing after codebook classification

Generation of the Spectral Envelope
Codebook (figure): the envelope of the input signal is compared (distance measure) with the entries of the „narrowband“ codebook, and the output is formed from the „broadband“ counterparts, weighting the codebook entries with the „inverse“ distance.

Generation of the Spectral Envelope
Computation of the output LSFs, with N being the LSF order and M the codebook size.
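One plausible reading of the inverse-distance weighting is sketched below, assuming NumPy. The normalization of the weights to sum to one, the Euclidean distance, and the small constant `eps` are assumptions; the function name `estimate_broadband_lsf` is hypothetical.

```python
import numpy as np

def estimate_broadband_lsf(nb_input, nb_codebook, bb_codebook, eps=1e-12):
    """Blend the M broadband entries using inverse-distance weights.

    nb_input:    narrowband feature vector of the current frame
    nb_codebook: (M, n_nb) narrowband codebook
    bb_codebook: (M, N) broadband codebook of LSF vectors of order N
    """
    # Distance of the input to each of the M narrowband entries.
    d = np.linalg.norm(nb_codebook - nb_input, axis=1)
    w = 1.0 / (d + eps)          # eps avoids division by zero on exact matches
    w /= w.sum()                 # normalize so the weights sum to one
    return w @ bb_codebook       # weighted sum of the broadband entries

nb_cb = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
bb_cb = np.array([[0.2, 0.9, 1.7], [0.4, 1.1, 2.0], [0.3, 1.4, 2.6]])
lsf_out = estimate_broadband_lsf(np.array([1.0, 1.0]), nb_cb, bb_cb)
```

With normalized, non-negative weights the output is a convex combination of the codebook entries, so if every broadband LSF vector is sorted in ascending order, the blended vector is sorted as well. This is a practical reason for doing the summation in the LSF domain.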

Generation of the Spectral Envelope
Training of codebook (LBG algorithm):
1. Initialization: Compute the centroid of the whole training data.
2. Splitting: Each centroid is split into two nearby vectors by applying a perturbation.
3. Quantization: The whole training data is assigned to the centroids using a certain distance measure, and the centroids are then recomputed. Step 3 is repeated until the result no longer changes significantly.
4. If the desired codebook size is reached, stop. Otherwise continue with step 2.
Distance measures:
- Spectral distortion: city block distance, Euclidean distance, Minkowski distance
- Likelihood ratio distance measure
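The four training steps can be sketched as follows, assuming NumPy, the Euclidean distance, an additive perturbation, and a codebook size that is a power of two. The function name `lbg` and the parameters `perturb`, `tol`, and `max_iter` are hypothetical.

```python
import numpy as np

def lbg(data, size, perturb=1e-2, tol=1e-6, max_iter=100):
    """Train a codebook with `size` entries (a power of two) by splitting."""
    # Step 1: centroid of the whole training data.
    codebook = data.mean(axis=0, keepdims=True)
    while len(codebook) < size:
        # Step 2: split every centroid into two nearby vectors.
        codebook = np.vstack([codebook + perturb, codebook - perturb])
        prev = np.inf
        for _ in range(max_iter):
            # Step 3: assign the training data, then recompute the centroids.
            d = np.linalg.norm(data[:, None, :] - codebook[None, :, :], axis=2)
            nearest = d.argmin(axis=1)
            for k in range(len(codebook)):
                members = data[nearest == k]
                if len(members):            # leave empty cells unchanged
                    codebook[k] = members.mean(axis=0)
            distortion = d.min(axis=1).mean()
            if prev - distortion <= tol * distortion:
                break                       # no significant change any more
            prev = distortion
    # Step 4: the desired codebook size is reached.
    return codebook

# Two well separated clusters as toy training data.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0.0, 0.1, (50, 2)), rng.normal(10.0, 0.1, (50, 2))])
codebook2 = lbg(data, 2)
```

For joint training of the „narrowband“ and „broadband“ codebooks, one option is to cluster on one feature set and average the paired vectors of the other set within each cell; the slides do not specify which variant was used.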

Generation of the Spectral Envelope
Linear mapping:
- Narrowband input features (LPC, CC, LSF)
- Broadband input features (LPC, CC, LSF)
- Aim: find the mapping matrix
- Optimization criterion
- Leads to the optimal mapping matrix
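A common way to obtain such a mapping matrix is ordinary least squares, shown here as a sketch assuming NumPy. The squared-error criterion and the row-vector convention (one frame per row, so that the closed form reads W = (X^T X)^(-1) X^T Y) are assumptions, and the function names are hypothetical.

```python
import numpy as np

def train_linear_mapping(X_nb, Y_bb):
    """Least-squares matrix W such that Y_bb is approximately X_nb @ W.

    X_nb: (frames, n_nb) narrowband features
    Y_bb: (frames, n_bb) broadband features of the same frames
    """
    # lstsq minimizes the squared error and also copes with a
    # badly conditioned X_nb, unlike an explicit matrix inverse.
    W, *_ = np.linalg.lstsq(X_nb, Y_bb, rcond=None)
    return W

def apply_linear_mapping(x_nb, W):
    """Estimate the broadband features of one frame."""
    return x_nb @ W

# Synthetic check: recover a known mapping from noise-free data.
rng = np.random.default_rng(0)
W_true = rng.normal(size=(4, 6))
X = rng.normal(size=(200, 4))
W_est = train_linear_mapping(X, X @ W_true)
```

A single matrix applied to LSF vectors does not guarantee that the output stays ordered, so a check or a correction of the estimated vector may be needed before it is converted to a filter.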

Generation of the Spectral Envelope
Linear mapping as post processing algorithm after codebook classification:
Note that this principle can be applied to other approaches. For example, the multiplication by the linear mapping matrix could be replaced by processing with a neural network that has been trained on the data classified to the respective codebook entry.
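The combination of classification and class-specific mapping might look like the sketch below, assuming NumPy, nearest-neighbour classification with the Euclidean distance, and least-squares training per class. All function names are hypothetical, and the fallback to a global matrix for sparsely populated classes is an added safeguard, not something stated in the slides.

```python
import numpy as np

def classify(X, codebook):
    """Index of the nearest codebook entry for every row of X."""
    d = np.linalg.norm(X[:, None, :] - codebook[None, :, :], axis=2)
    return d.argmin(axis=1)

def train_class_mappings(X_nb, Y_bb, nb_codebook):
    """One least-squares mapping matrix per narrowband codebook entry."""
    labels = classify(X_nb, nb_codebook)
    W_global, *_ = np.linalg.lstsq(X_nb, Y_bb, rcond=None)
    mappings = []
    for k in range(len(nb_codebook)):
        Xk, Yk = X_nb[labels == k], Y_bb[labels == k]
        if len(Xk) < X_nb.shape[1]:
            mappings.append(W_global)   # too few frames for this class
        else:
            Wk, *_ = np.linalg.lstsq(Xk, Yk, rcond=None)
            mappings.append(Wk)
    return mappings

def extend_envelope(x_nb, nb_codebook, mappings):
    """Classify one frame, then apply the matrix of its class."""
    k = classify(x_nb[None, :], nb_codebook)[0]
    return x_nb @ mappings[k]

# Toy data: two classes with different underlying mappings.
rng = np.random.default_rng(1)
X0, X1 = rng.normal(0.0, 1.0, (50, 3)), rng.normal(10.0, 1.0, (50, 3))
W0, W1 = rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
X, Y = np.vstack([X0, X1]), np.vstack([X0 @ W0, X1 @ W1])
codebook = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]])
mappings = train_class_mappings(X, Y, codebook)
```

Replacing `x_nb @ mappings[k]` with a call to a per-class neural network gives the variant mentioned in the note above.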

Power Adjustment
Block diagram of BWE (figure).

Power Adjustment
Power comparison: computation of the gain from the ratio of the power of the extended signal to the power of the input signal within the telephone band.
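As a sketch of the power comparison, assuming NumPy: the band powers are measured in the frequency domain, and the gain scales the extended signal so that its power within the telephone band matches that of the input. The band edges of 300 Hz and 3400 Hz, the frame-wise FFT, the requirement that both signals share sampling rate and length, and the function names are assumptions for illustration.

```python
import numpy as np

def band_power(x, fs, f_lo, f_hi):
    """Power of x between f_lo and f_hi (Hz), measured on the DFT bins."""
    X = np.fft.rfft(x)
    f = np.fft.rfftfreq(len(x), d=1.0 / fs)
    band = (f >= f_lo) & (f <= f_hi)
    return np.sum(np.abs(X[band]) ** 2) / len(x) ** 2

def power_adjust(extended, reference, fs, f_lo=300.0, f_hi=3400.0, eps=1e-12):
    """Scale `extended` so its telephone-band power matches `reference`."""
    p_ext = band_power(extended, fs, f_lo, f_hi)
    p_ref = band_power(reference, fs, f_lo, f_hi)
    gain = np.sqrt(p_ref / (p_ext + eps))   # amplitude gain from a power ratio
    return gain * extended, gain

# Toy frame: the extended signal is three times too loud in the telephone
# band and carries an additional component at 6 kHz.
fs, N = 16000, 1600
t = np.arange(N) / fs
reference = np.sin(2 * np.pi * 1000 * t)
extended = 3.0 * np.sin(2 * np.pi * 1000 * t) + 0.5 * np.sin(2 * np.pi * 6000 * t)
adjusted, gain = power_adjust(extended, reference, fs)
```

Because the same gain is applied to the whole extended signal, the newly generated components above the telephone band are scaled consistently with the part that can be compared against the input.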

Current Results
Setup used to produce results:
- Database
  - TIMIT processed with WM NetSim tool (training, English)
    - Phone filter / GSM / phone filter
- Algorithm
  - Excitation signal
    - Lower part extended using half-wave rectification
    - Higher part extended using half-wave rectification
  - Spectral envelope
    - Codebook classification using 64 entries
    - Post processing with linear mapping

Current Results
Audio samples: telephone-limited and extended versions for Female 1, Female 2, Male 1, and Male 2.

Outlook
Outlook on future work:
- Integration of additional features into codebook training
  - Pitch information
  - Information on „voicedness“
- Add „comfort noise“
- Training of neural network
  - Using additional features
  - In combination with codebook

Thank you for your attention!