

Method of Packet Errors Cancellation Suitable for any Speech and Sound Compression Scheme
STQ Workshop, Sophia-Antipolis, February 11th, 2003
Balazs KÖVESI, Dominique MASSALOUX

This document contains information proprietary to France Telecom. By accepting this document, the recipient acknowledges the confidential nature of its content and undertakes not to reproduce it, not to transmit it to a third party, not to reveal its content, and not to use it for commercial purposes without prior written consent from FTR&D.

France Telecom R&D. Diffusion of this document is subject to France Telecom authorization.

Introduction
• Context: applications such as VoIP or audio streaming
  – Possibly high packet loss rate (up to 10 %)
• Proposal: a frame error concealment (FEC) method
  – Copes with high packet loss rates
  – Relies on a CELP synthesis scheme
  – Independent of the codec type
  – Speech oriented but also suitable for music
  – Includes adaptive gain control
  – Avoids "robot" voice
  – Ensures the decoder memory update
  – Smoothing after an erased period

Plan
• Basic principle of the new FEC method
• Implementation in an MDCT codec
• Generalization to other codec types
• Conclusion

Basic principle of the new FEC
[Block diagram. Inputs: valid data, indication of erased data. Blocks: decoder, storage of decoded samples, synthesis of missing samples, decoder update, smoothing with the decoded signal. Signals: decoded signal, reconstructed signal.]

Implementation in an MDCT codec
• The MDCT transform
  – Analysis with 50 % overlap
  – Synthesis with overlap-add
[Figure: analysis windows n-1 and n on a time axis (20 ms frames). Analysis: windowing, T→F transform. Synthesis: F→T transform, windowing, overlap-add with the memory kept for the next frame. Synthesis windows n-1 and n, decoded frame n.]
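A minimal sketch of the analysis and synthesis chain on this slide, assuming a sine window and a direct O(N²) transform. The slides do not say which window or transform implementation the codec uses, and the names `sine_window`, `mdct`, `imdct`, `analyze` and `synthesize` are illustrative. It shows that 50 % overlap plus overlap-add reconstructs a frame once both synthesis windows covering it are available.

```python
import math

def sine_window(hop):
    """Sine window of length 2*hop; satisfies w[n]^2 + w[n+hop]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * hop)) for n in range(2 * hop)]

def mdct(block, hop):
    """Direct MDCT: 2*hop windowed samples -> hop coefficients."""
    return [
        sum(block[n] * math.cos(math.pi / hop * (n + 0.5 + hop / 2) * (k + 0.5))
            for n in range(2 * hop))
        for k in range(hop)
    ]

def imdct(coeffs, hop):
    """Direct IMDCT: hop coefficients -> 2*hop time-aliased samples."""
    return [
        (2.0 / hop) * sum(coeffs[k] * math.cos(math.pi / hop * (n + 0.5 + hop / 2) * (k + 0.5))
                          for k in range(hop))
        for n in range(2 * hop)
    ]

def analyze(signal, hop):
    """Windowing, then T->F transform, on blocks that overlap by 50 %."""
    win = sine_window(hop)
    frames = []
    for start in range(0, len(signal) - 2 * hop + 1, hop):
        block = [signal[start + n] * win[n] for n in range(2 * hop)]
        frames.append(mdct(block, hop))
    return frames

def synthesize(frames, hop):
    """F->T transform, windowing, overlap-add with the memory of the previous frame."""
    win = sine_window(hop)
    memory = [0.0] * hop
    out = []
    for coeffs in frames:
        y = imdct(coeffs, hop)
        y = [y[n] * win[n] for n in range(2 * hop)]
        out.extend(memory[n] + y[n] for n in range(hop))
        memory = y[hop:]  # kept for the next frame
    return out

HOP = 8  # tiny frame length, for illustration only
signal = [math.sin(0.3 * n) + 0.5 * math.cos(0.11 * n) for n in range(6 * HOP)]
decoded = synthesize(analyze(signal, HOP), HOP)
# The first output frame has no predecessor to overlap with, so it is skipped.
max_error = max(abs(decoded[n] - signal[n]) for n in range(HOP, len(decoded)))
```

The `memory` variable plays the role of the "memory for the next frame" in the figure: a lost frame leaves it stale, which is why the concealment has to update it.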

Implementation in an MDCT codec
• Effect of frame erasure
  – The loss of x bitstream frames affects x+1 output frames
  – These frames have to be synthesized in the decoder
[Figure: time axis with 20 ms frames. Erased: frame n. Disturbed: frames n-1 and n.]
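The "x lost frames, x+1 disturbed frames" rule can be checked with a few lines. This is only a sketch: the function name is hypothetical, and the indexing (bitstream frame n contributes to output frames n-1 and n) is an assumption read off this slide's figure labels.

```python
def disturbed_output_frames(erased):
    """Output frames that cannot be decoded normally.

    Assumption: bitstream frame n contributes to output frames n-1 and n,
    because each MDCT window spans two 20 ms frames.
    """
    disturbed = set()
    for n in erased:
        disturbed.update((n - 1, n))
    return sorted(disturbed)

single = disturbed_output_frames([10])         # one lost frame
burst = disturbed_output_frames([10, 11, 12])  # three consecutive lost frames
```

For a contiguous burst the overlapping pairs merge, which gives the x+1 count stated on the slide.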

Implementation in an MDCT codec
• Memorizing part
• After decoding a valid frame
  – The 40 ms output memory is updated
  – The energy of the frame is calculated
  – The energy memory buffer is updated (5 s)
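A possible shape for the memorizing part, as a sketch. The class and method names are hypothetical, the 16 kHz sampling rate is an assumption, and "energy" is taken here as mean squared amplitude per frame, which the slides do not specify.

```python
from collections import deque

class ConcealmentMemory:
    """Keeps the last 40 ms of decoded output and 5 s of frame energies."""

    def __init__(self, sample_rate=16000, frame_ms=20, output_ms=40, energy_s=5):
        self.frame_len = sample_rate * frame_ms // 1000
        output_len = sample_rate * output_ms // 1000
        # Fixed-size buffers: appending drops the oldest entries automatically.
        self.output = deque([0.0] * output_len, maxlen=output_len)
        self.energies = deque(maxlen=energy_s * 1000 // frame_ms)

    def store_valid_frame(self, frame):
        """Call after each correctly decoded frame."""
        self.output.extend(frame)
        energy = sum(s * s for s in frame) / len(frame)
        self.energies.append(energy)
        return energy

memory = ConcealmentMemory()
memory.store_valid_frame([0.5] * memory.frame_len)
```

With 20 ms frames, the 5 s energy buffer holds 250 values and the output memory holds the two most recent frames.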

Implementation in an MDCT codec
[Block diagram, detail of the synthesis of missing samples. Blocks: memory of past decoded signal; LPC analysis, giving the LPC parameters (A(z)); LTP analysis & V/UV detection, giving the LTP parameters (B(z)); calculation of the past excitation signal; LTP filtering 1/B(z); LPC synthesis 1/A(z); adaptive gain control. Output: synthesized signal.]

Implementation in an MDCT codec
• LPC analysis
  – The LPC filter models the spectral envelope
  – Coefficients are not transmitted
  – The LPC analysis order can be higher than in a usual CELP (… 16 kHz) → better performance on music
[Figure: past decoded signal (20 ms), classical method, LPC coefficients of the filter A(z).]
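As an illustration of an LPC analysis run on the past decoded signal, here is a sketch that assumes the autocorrelation method with the Levinson-Durbin recursion. The slide only says "classical method", so this choice, the function names, and the absence of windowing are assumptions.

```python
def autocorrelation(x, max_lag):
    """r[lag] = sum over n of x[n] * x[n - lag], for lag = 0..max_lag."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.

    Returns the coefficient list (a[0] == 1) and the prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

# Toy "past decoded signal": impulse response of 1 / (1 - 0.9 z^-1).
past = [0.9 ** n for n in range(200)]
coeffs, residual_energy = levinson_durbin(autocorrelation(past, 2), 2)
```

Because the analysis runs in the decoder on already decoded samples, the order passed to `levinson_durbin` costs no bitrate, which is the point made on the slide.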

Implementation in an MDCT codec
• LTP analysis & V/UV detection
  – Precise pitch estimation is crucial for good performance
  – Only integer pitch (P) values are examined [50 Hz, 600 Hz]
  – Normalized correlations on the last 2P samples
  – Pitch criteria: maximum correlation + multiple & fractional verifications
  – V/UV criteria: selected correlation value + energy value → 5 s energy memory, energy evolution of the last two pitch periods
[Figure: past decoded signal with pitch periods p1, p2, p3, p4; correlation calculations.]
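A sketch of an integer-lag pitch search in the spirit of this slide. It assumes that "correlations on the last 2P samples" means correlating the last P samples with the P samples before them. The submultiple check is a simplified stand-in for the slide's "multiple & fractional verifications", and all thresholds and names are hypothetical.

```python
import math

def normalized_correlation(x, lag):
    """Correlation between the last `lag` samples and the `lag` samples before them."""
    recent = x[-lag:]
    older = x[-2 * lag:-lag]
    num = sum(a * b for a, b in zip(recent, older))
    den = math.sqrt(sum(a * a for a in recent) * sum(b * b for b in older))
    return num / den if den > 0.0 else 0.0

def estimate_pitch(x, sample_rate, f_min=50.0, f_max=600.0, keep_ratio=0.9):
    """Return (pitch lag in samples, its normalized correlation)."""
    min_lag = math.ceil(sample_rate / f_max)
    max_lag = min(int(sample_rate / f_min), len(x) // 2)
    corr = {lag: normalized_correlation(x, lag)
            for lag in range(min_lag, max_lag + 1)}
    best = max(corr, key=corr.get)
    # Multiple check: prefer the shortest submultiple that correlates almost as well.
    for divisor in (4, 3, 2):
        candidate = round(best / divisor)
        if candidate >= min_lag and corr[candidate] >= keep_ratio * corr[best]:
            best = candidate
            break
    return best, corr[best]

def is_voiced(correlation, frame_energy, corr_threshold=0.6, energy_threshold=1e-4):
    """Toy V/UV decision from the selected correlation and an energy value."""
    return correlation > corr_threshold and frame_energy > energy_threshold

FS = 16000  # assumed sampling rate
# Periodic test signal with a period of 80 samples (200 Hz at 16 kHz).
x = [math.sin(2 * math.pi * n / 80) + 0.5 * math.sin(4 * math.pi * n / 80)
     for n in range(800)]
pitch, correlation = estimate_pitch(x, FS)
voiced = is_voiced(correlation, sum(s * s for s in x[-320:]) / 320)
```

Without the submultiple check, a lag of 160, 240 or 320 samples would score as well as 80 on this signal, which is the pitch-multiple error the slide's verification step guards against.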

Implementation in an MDCT codec
• LPC analysis filtering
[Figure: past decoded signal → A(z) → past excitation signal.]
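The past excitation is the past decoded signal filtered by A(z). A minimal sketch, assuming a zero initial filter state and the convention A(z) = 1 + a[1] z^-1 + ...; the function name is illustrative.

```python
def lpc_residual(signal, a):
    """Filter `signal` through A(z) = a[0] + a[1] z^-1 + ... with a[0] == 1.

    Samples before the start of `signal` are taken as zero.
    """
    order = len(a) - 1
    return [sum(a[j] * signal[n - j] for j in range(min(order, n) + 1))
            for n in range(len(signal))]

# Toy signal generated by 1 / (1 - 0.9 z^-1); its residual is a single impulse.
past_decoded = [0.9 ** n for n in range(50)]
past_excitation = lpc_residual(past_decoded, [1.0, -0.9])
```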

Implementation in an MDCT codec
• Excitation signal generation for the LPC synthesis filtering
• Voiced excitation: 2 components
  – Harmonic, lower frequency bands: LTP filter combined with a low-pass filter
  – Less harmonic, higher frequency bands: LTP filter combined with a high-pass filter + randomly evolving pitch

Implementation in an MDCT codec
• Excitation signal generation for the LPC synthesis filtering
• Unvoiced excitation
  – Non-harmonic, lower frequency bands
  – "Randomized" LTP filtering + low-pass filtering + sudden energy variations are suppressed
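The two-band voiced excitation can be sketched as below. This is an illustration under several assumptions, not the authors' filters: the band split is a 3-tap low-pass with its complement, the LTP filter is a plain repetition e[n] = e[n - lag], and the "randomly evolving pitch" is modelled by redrawing the lag within ±2 samples once per pitch period. All names are hypothetical, and the past excitation must be longer than the pitch plus the jitter.

```python
import random

def split_bands(x):
    """3-tap low-pass; the high band is its complement, so low + high == x."""
    low = []
    for n in range(len(x)):
        prev = x[n - 1] if n > 0 else x[n]
        nxt = x[n + 1] if n + 1 < len(x) else x[n]
        low.append(0.25 * prev + 0.5 * x[n] + 0.25 * nxt)
    high = [xi - li for xi, li in zip(x, low)]
    return low, high

def ltp_extend(past, pitch, count, jitter=0, rng=None):
    """Extend `past` by `count` samples with e[n] = e[n - lag].

    With jitter > 0 the lag is redrawn around `pitch` once per pitch period.
    """
    buf = list(past)
    lag = pitch
    for i in range(count):
        if jitter and rng is not None and i % pitch == 0:
            lag = max(1, pitch + rng.randint(-jitter, jitter))
        buf.append(buf[-lag])
    return buf[len(past):]

def voiced_excitation(past_excitation, pitch, count, rng):
    """Harmonic low band plus a less harmonic high band."""
    low, high = split_bands(past_excitation)
    harmonic = ltp_extend(low, pitch, count)
    less_harmonic = ltp_extend(high, pitch, count, jitter=2, rng=rng)
    return [h + l for h, l in zip(harmonic, less_harmonic)]

rng = random.Random(0)
past = [1.0 if n % 50 == 0 else 0.0 for n in range(200)]  # toy pulse train
excitation = voiced_excitation(past, 50, 100, rng)
```

An unvoiced variant along the lines of the slide could apply the jittered `ltp_extend` to the low band only; the suppression of sudden energy variations is not modelled here.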

Implementation in an MDCT codec
• LPC synthesis filtering
[Figure: excitation signal → 1/A(z) → synthesized signal.]
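A sketch of the all-pole synthesis filter 1/A(z), using the convention A(z) = 1 + a[1] z^-1 + .... The `history` argument is an assumption about how one might seed the filter with the last decoded samples so that the synthesized signal continues from them; the function name is illustrative.

```python
def lpc_synthesis(excitation, a, history=None):
    """Filter `excitation` through 1/A(z), with A(z) = 1 + a[1] z^-1 + ...

    `history` holds the last len(a) - 1 output samples, oldest first;
    it defaults to zeros.
    """
    order = len(a) - 1
    out = list(history) if history is not None else [0.0] * order
    for e in excitation:
        # y[n] = e[n] - sum over j of a[j] * y[n - j]
        out.append(e - sum(a[j] * out[-j] for j in range(1, order + 1)))
    return out[order:]

# An impulse through 1 / (1 - 0.9 z^-1) gives the sequence 0.9^n.
synthesized = lpc_synthesis([1.0] + [0.0] * 9, [1.0, -0.9])
```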

Implementation in an MDCT codec
• Adaptive gain control: important in case of long erased periods (> 20 ms)
• 2 adaptation laws:
  – stationary
  – non-stationary
  – The adaptations also depend on the pitch value
• Decision available from the LTP analysis
[Figure: two time plots showing the background noise level and a 40 ms interval.]
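The slide gives no formulas for the two adaptation laws, so the following is purely a hypothetical illustration of the idea: keep the gain for a short erasure, then attenuate toward a background-noise level, more slowly when the signal was stationary. Every constant and name is an assumption, and the pitch dependence mentioned on the slide is left out.

```python
def concealment_gain(erased_ms, stationary, noise_gain=0.1,
                     hold_ms=20.0, slow_ms=400.0, fast_ms=100.0):
    """Hypothetical attenuation law.

    Unity gain for the first `hold_ms`, then a linear decay toward
    `noise_gain`, slower when the signal was stationary.
    """
    if erased_ms <= hold_ms:
        return 1.0
    decay_ms = slow_ms if stationary else fast_ms
    progress = min(1.0, (erased_ms - hold_ms) / decay_ms)
    return 1.0 + progress * (noise_gain - 1.0)

gain_stationary = concealment_gain(60.0, stationary=True)
gain_transient = concealment_gain(60.0, stationary=False)
```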

Implementation in an MDCT codec
• Recoverable information
[Figure: time axis with 20 ms frames. Erased: frames n-1 and n. Synthesized frames n-1, n, n+1. Decoded frame n+2.]

Implementation in an MDCT codec
• Recoverable information for the first erased frame
[Figure: MDCT transform on the first 2 synthesized frames, IMDCT transform, partly recovered frame n-1.]

Implementation in an MDCT codec
• Decoder memory update
[Figure: MDCT transform on the last 2 synthesized frames; IMDCT transform (F→T + windowing); IMDCT memory to update → updated IMDCT memory.]
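A sketch of the memory update, assuming a sine-windowed MDCT with direct O(N²) transforms; the names and the window are illustrative, not the codec's. The last two synthesized frames go through MDCT, then IMDCT and windowing, and the second half of the result replaces the decoder's overlap-add memory. In the demo the "synthesized" frames are set equal to the true signal, so the first valid frame after the erasure is reconstructed exactly.

```python
import math

def sine_window(hop):
    return [math.sin(math.pi * (n + 0.5) / (2 * hop)) for n in range(2 * hop)]

def basis(n, k, hop):
    return math.cos(math.pi / hop * (n + 0.5 + hop / 2) * (k + 0.5))

def windowed_mdct(block, hop):
    """Windowing + T->F transform of 2*hop samples."""
    win = sine_window(hop)
    return [sum(block[n] * win[n] * basis(n, k, hop) for n in range(2 * hop))
            for k in range(hop)]

def windowed_imdct(coeffs, hop):
    """F->T transform + windowing, giving 2*hop samples to overlap-add."""
    win = sine_window(hop)
    return [win[n] * (2.0 / hop) * sum(coeffs[k] * basis(n, k, hop) for k in range(hop))
            for n in range(2 * hop)]

def updated_imdct_memory(last_two_synth_frames, hop):
    """Overlap memory the decoder would hold had the last erased frame arrived."""
    return windowed_imdct(windowed_mdct(last_two_synth_frames, hop), hop)[hop:]

HOP = 8  # tiny frame length, for illustration only
signal = [math.sin(0.3 * n) for n in range(3 * HOP)]
# Pretend signal[0:2*HOP] came from the concealment and matches the true signal.
memory = updated_imdct_memory(signal[:2 * HOP], HOP)
# The first valid bitstream frame after the erasure covers signal[HOP:3*HOP].
next_frame = windowed_imdct(windowed_mdct(signal[HOP:], HOP), HOP)
decoded = [memory[n] + next_frame[n] for n in range(HOP)]
max_error = max(abs(decoded[n] - signal[HOP + n]) for n in range(HOP))
```

In practice the synthesized frames differ from the lost signal, so the aliasing terms cancel only approximately. The overlap-add then acts as a crossfade between the synthesized and the decoded signal.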

Implementation in an MDCT codec
• Recoverable information for the last erased frame
[Figure: IMDCT transform, partly recovered frame n+1.]

Implementation in an MDCT codec
• Smoothing part: without smoothing
[Figure: synthesized frames n-1, n, n+1 (synthesized domain) followed by decoded frame n+2 (error-free domain), with a discontinuity between the two.]

Implementation in an MDCT codec
• Smoothing part: a codec-independent solution
[Figure: synthesized frames n-1, n, n+1 (synthesized domain) and decoded frame n+2 (error-free domain); extra synthesized samples are crossfaded with the decoded signal, weights going from 1 to 0.]
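The codec-independent smoothing amounts to a crossfade between extra synthesized samples and the start of the first correctly decoded frame. A minimal sketch, assuming a linear ramp; the slide shows weights going from 1 to 0 but does not give their shape, and the function name is illustrative.

```python
def crossfade(extra_synthesized, decoded):
    """Blend two equal-length segments.

    The weight of the synthesized samples falls linearly from 1 to 0,
    while the weight of the decoded samples rises from 0 to 1.
    """
    if len(extra_synthesized) != len(decoded):
        raise ValueError("segments must have the same length")
    length = len(decoded)
    out = []
    for n in range(length):
        w = 1.0 - n / (length - 1) if length > 1 else 0.0
        out.append(w * extra_synthesized[n] + (1.0 - w) * decoded[n])
    return out

smoothed = crossfade([1.0] * 5, [0.0] * 5)
```

The output starts on the synthesized signal and ends on the decoded one, which removes the discontinuity at the boundary between the synthesized and error-free domains.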

Implementation in an MDCT codec
• Smoothing part: with MDCT smoothing
  – smooth transition at frame n+1
  – overlap-add like crossfading
[Figure: time axis showing the synthesized domain and the error-free domain.]

Generalization to other codec types
• Can be adapted to any coding scheme
• Was successfully implemented in
  – temporal codecs (G.711, G.721, G.722)
  – a CELP codec (G.723.1)
  – a hierarchical codec composed of a CELP and a transform codec
• Memorizing and synthesis parts are codec independent
• Decoder memory update
  – very important for recursive codecs (CELP)
  – general solution: coding and decoding of the synthesized frames
  – too complex for CELP
  – less complex solution: backtracking
• Smoothing
  – a general solution: crossfading
  – more efficient smoothing can be found for some coding schemes (e.g. MDCT)
  – the decoder memory update ensures the smoothing in CELP codecs

Conclusion
• A general FEC method for any coding scheme
  – optimal for speech, good performance on music
  – avoids an overly synthetic sound for voiced frames
  – keeps the nature of the unvoiced frames
  – enhanced energy management
  – careful update of the decoder memory
  – smoothing after an erased period
• Informal subjective tests have shown its good behavior
• Successfully implemented in group communication applications
• Perspectives:
  – speech / music decision + enhanced music mode
  – …