Speech Coding Submitted To: Dr. Mohab Mangoud Submitted By: Nidal Ismail.


Outline
1. Introduction
   • Overview of Speech Coding
   • Properties of a Speech Coder
   • Modeling the Speech Production System
   • Linear Prediction
2. Different Coding Techniques
   • Waveform Coders
   • Parametric Coders
   • Hybrid Coders
   • Coding Standards
3. PCM & DPCM
4. Linear Predictive Coding
5. Conclusion
6. References

1. Introduction: Overview of Speech Coding
[Figure: Block diagram of a speech coding system]
• Sampling frequency = 8 kHz
• Number of bits per sample = 8
• Bit rate = 8 bits × 8 kHz = 64 kbps

1. Introduction: Properties of a Speech Coder
• Low bit-rate
• High speech quality
• Robustness across different speakers and languages
• Robustness in the presence of channel errors
• Good performance on non-speech signals
• Low memory size and low computational complexity
• Low coding delay

1. Introduction: Modeling the Speech Production System
Speech = voiced + unvoiced sounds

1. Introduction: Modeling the Speech Production System
[Figure: Autocorrelation values for the signal frames. Left: unvoiced. Right: voiced.]

1. Introduction: Modeling the Speech Production System
The signal from a source is filtered by a time-varying filter whose resonant properties are similar to those of the vocal tract.
The gain controls A_v and A_N determine the intensity of the voiced and unvoiced excitation.
The higher formants are attenuated by about 12 dB/octave (due to the nature of our speech organs).
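The difference between the two autocorrelation plots can be reproduced on toy data. The sketch below is only an illustration: the "voiced" frame is a synthetic decaying pulse repeated every 80 samples, the "unvoiced" frame is seeded white noise, and the lag search range of 20 to 147 samples is an assumption, not a value taken from the slides.

```python
import random

def autocorrelation(frame, max_lag):
    """R[l] = sum over n of frame[n] * frame[n + l], for l = 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def normalized_peak(frame, min_lag, max_lag):
    """Return (largest R[l]/R[0], lag of that peak) for min_lag <= l <= max_lag."""
    r = autocorrelation(frame, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
    return r[best_lag] / r[0], best_lag

FRAME_LEN = 180   # samples per frame (22.5 ms at 8 kHz)
PITCH_LAG = 80    # hypothetical pitch period: 80 samples = 100 Hz at 8 kHz

# Toy "voiced" frame: a decaying pulse that repeats every PITCH_LAG samples.
voiced = [0.9 ** (n % PITCH_LAG) for n in range(FRAME_LEN)]

# Toy "unvoiced" frame: seeded white Gaussian noise.
rng = random.Random(0)
unvoiced = [rng.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]

voiced_peak, voiced_lag = normalized_peak(voiced, 20, 147)
unvoiced_peak, _ = normalized_peak(unvoiced, 20, 147)
print(f"voiced:   peak {voiced_peak:.2f} at lag {voiced_lag}")
print(f"unvoiced: peak {unvoiced_peak:.2f}")
```

The periodic frame shows a strong secondary peak at its pitch lag, while the noise frame has no comparable peak away from lag 0.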

1. Introduction: Linear Prediction
Linear prediction as system identification.
• Linear prediction is a practical method of spectrum estimation in which the power spectral density (PSD) is captured using a few coefficients.
• These linear prediction coefficients can be used to construct the synthesis filter.

1. Introduction: Linear Prediction
Linear prediction as system identification.
• Predicted signal
• Prediction error
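As a rough illustration of how the predictor coefficients, the predicted signal, and the prediction error relate, the sketch below estimates the coefficients with the autocorrelation method and the Levinson-Durbin recursion. The sign convention (prediction equal to minus the weighted sum of past samples), the helper names, and the synthetic second-order test signal are assumptions made for this example; the slides do not fix them.

```python
import random

def autocorr(x, max_lag):
    """R[l] = sum over n of x[n] * x[n - l], for l = 0..max_lag."""
    return [sum(x[n] * x[n - lag] for n in range(lag, len(x)))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the normal equations for a[0..order] with a[0] = 1.

    Assumed convention: s_hat[n] = -sum_{i=1..order} a[i] * s[n - i].
    Returns (a, minimum prediction-error energy).
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for i in range(1, m):
            new_a[i] = a[i] + k * a[m - i]
        new_a[m] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def prediction_error(s, a):
    """e[n] = s[n] - s_hat[n] = s[n] + sum_i a[i] * s[n - i]."""
    order = len(a) - 1
    return [s[n] + sum(a[i] * s[n - i] for i in range(1, order + 1) if n - i >= 0)
            for n in range(len(s))]

# Synthetic test signal: white noise through a known all-pole filter,
# s[n] = 1.3 s[n-1] - 0.6 s[n-2] + w[n], so the true a is [1, -1.3, 0.6].
rng = random.Random(1)
s = [0.0, 0.0]
for _ in range(4000):
    s.append(1.3 * s[-1] - 0.6 * s[-2] + rng.gauss(0.0, 1.0))
s = s[2:]

a, err = levinson_durbin(autocorr(s, 2), 2)
e = prediction_error(s, a)
print("estimated coefficients:", [round(c, 2) for c in a])
```

Recovering the filter that generated the signal is the "system identification" view: the estimated coefficients approximate the all-pole filter, and the prediction error approximates the excitation.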


2. Different Coding Techniques: Waveform Coders
• The original shape of the signal waveform is preserved.
• These coders can be applied to any signal source.
• They are better suited to high bit-rate coding, since performance drops sharply as the bit-rate decreases.
• In practice, these coders work best at a bit-rate of 32 kbps and higher.
• Examples of this class include various kinds of pulse code modulation (PCM) and adaptive differential PCM (ADPCM).

2. Different Coding Techniques: Parametric Coders
• The speech signal is generated from a model, which is controlled by a set of parameters.
• The parameters are estimated from the input speech signal.
• No attempt is made to preserve the original shape of the waveform.
• The accuracy and sophistication of the model determine the quality.
• The most successful model is based on linear prediction. In this approach, the human speech production mechanism is summarized using a time-varying filter, with the coefficients of the filter found using the linear prediction analysis procedure.
• This class of coders works well at low bit-rates.
• The bit-rate is in the range of 2 to 5 kbps.
• Example coders of this class include linear prediction coding (LPC) and mixed excitation linear prediction (MELP).

2. Different Coding Techniques: Hybrid Coders
• A hybrid coder combines the strengths of a waveform coder with those of a parametric coder.
• As in waveform coders, an attempt is made to match the decoded signal to the original signal in the time domain.
• This class dominates the medium bit-rate coders, with the code-excited linear prediction (CELP) algorithm and its variants as the most outstanding representatives.
• A hybrid coder tends to behave like a waveform coder at high bit-rates and like a parametric coder at low bit-rates, with fair to good quality at medium bit-rates.

Coding Standards


3. PCM & DPCM: Pulse Code Modulation
• Invented 1926, deployed
• Basic idea: assign a smaller quantization step size to small-amplitude regions and a larger quantization step size to large-amplitude regions (non-uniform quantization).
• Two types of nonlinear compressing functions:
   • μ-law, adopted by North American telecommunications systems
   • A-law, adopted by European telecommunications systems
• μ-law (or A-law) companding reduces the signal to 8 bits/sample, or 64 kbps at 8 kHz sampling (without a compander, 12 bits/sample would be needed).

3. PCM & DPCM: Pulse Code Modulation
μ-law compression characteristic, where A is the peak input magnitude and μ is a constant that controls the degree of compression.

3. PCM & DPCM: Pulse Code Modulation
[Figure: μ-law examples]

3. PCM & DPCM: Pulse Code Modulation
A-law compression characteristic, with A_0 a constant that controls the degree of compression.

3. PCM & DPCM: Pulse Code Modulation
[Figure: A-law examples]
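A minimal sketch of μ-law companding follows. It assumes the common textbook form y = A · ln(1 + μ|x|/A) / ln(1 + μ) · sgn(x), with μ = 255 (the value used in North American telephony) and peak magnitude A = 1. The function names and the simple mid-rise quantizer are illustrative choices, not something specified in the slides.

```python
import math

def mu_law_compress(x, mu=255.0, peak=1.0):
    """y = peak * ln(1 + mu*|x|/peak) / ln(1 + mu) * sgn(x)."""
    return math.copysign(peak * math.log1p(mu * abs(x) / peak) / math.log1p(mu), x)

def mu_law_expand(y, mu=255.0, peak=1.0):
    """Inverse of mu_law_compress: |x| = (peak/mu) * ((1 + mu)**(|y|/peak) - 1)."""
    return math.copysign((peak / mu) * math.expm1(abs(y) / peak * math.log1p(mu)), y)

def quantize_uniform(v, bits, peak=1.0):
    """Mid-rise uniform quantizer over [-peak, peak] with 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 * peak / levels
    idx = min(levels - 1, max(0, int((v + peak) / step)))
    return -peak + (idx + 0.5) * step

# A small-amplitude sample, where non-uniform quantization helps most.
x = 0.01
uniform_8bit = quantize_uniform(x, 8)
companded_8bit = mu_law_expand(quantize_uniform(mu_law_compress(x), 8))
print(f"uniform 8-bit error:   {abs(uniform_8bit - x):.6f}")
print(f"companded 8-bit error: {abs(companded_8bit - x):.6f}")
```

Quantizing the compressed value uniformly is equivalent to using finer steps near zero and coarser steps near the peak, which is why the small sample is reproduced more accurately with the same 8 bits.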
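The A-law characteristic can be sketched in the same way. The version below assumes the usual two-segment textbook form: linear for |x|/A ≤ 1/A_0 and logarithmic above that, with A_0 = 87.6 (the value used in European telephony) and peak magnitude A = 1. The function names are illustrative.

```python
import math

def a_law_compress(x, a0=87.6, peak=1.0):
    """Linear segment for small inputs, logarithmic segment for large ones."""
    ax = abs(x) / peak
    denom = 1.0 + math.log(a0)
    if ax <= 1.0 / a0:
        y = a0 * ax / denom
    else:
        y = (1.0 + math.log(a0 * ax)) / denom
    return math.copysign(peak * y, x)

def a_law_expand(y, a0=87.6, peak=1.0):
    """Inverse of a_law_compress."""
    ay = abs(y) / peak
    denom = 1.0 + math.log(a0)
    if ay <= 1.0 / denom:
        x = ay * denom / a0
    else:
        x = math.exp(ay * denom - 1.0) / a0
    return math.copysign(peak * x, y)

for sample in (0.001, 0.01, 0.1, 1.0):
    print(f"x = {sample:<6} -> y = {a_law_compress(sample):.4f}")
```

Unlike μ-law, the curve is exactly linear near zero; the two segments meet at |x|/A = 1/A_0.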

3. PCM & DPCM: Differential Pulse Code Modulation
• Since speech signals vary slowly, temporal redundancy can be removed by prediction.
• The prediction-error signal is quantized instead of the input itself.
• The quantizer indices i[n] are fed into the quantizer's decoder to obtain the quantized prediction error, which is added to the prediction x_p[n] to form the quantized input.
[Figure: DPCM encoder (top) and decoder (bottom)]

3. PCM & DPCM: Differential Pulse Code Modulation
Comparison between PCM and DPCM
[Figure: PCM quantized signal (left) and quantization error (right)]
[Figure: DPCM quantized signal (left) and quantization error (right)]
• DPCM used half the bit rate of PCM and still achieved a higher SNR.


4. Linear Predictive Coding  Linear prediction coding relies on a highly simplified model for speech production The LPC model of speech production  Parameters of the model are estimated from the speech samples

4. Linear Predictive Coding The LPC model of speech production  Parameters of the model are estimated from the speech samples These include:  Voicing: whether the frame is voiced or unvoiced.  Gain: mainly related to the energy level of the frame.  Filter coefficients: specify the response of the synthesis filter.  Pitch period: in the case of voiced frames, time length between consecutive excitation impulses.

4. Linear Predictive Coding  By carefully allocating bits for each parameter so as to minimize distortion, an impressive compression ratio can be achieved.  For instance, the bit-rate of 2.4kbps for the FS1015 coder is 53.3 times lower than the corresponding bit-rate for 16-bit PCM  Estimating the parameters is the responsibility of the encoder.  The decoder takes the estimated parameters and uses the speech production model to synthesize speech

4. Linear Predictive Coding Block diagram of the LPC encoder.

4. Linear Predictive Coding Block diagram of the LPC decoder.

4. Linear Predictive Coding  The Voicing Detector is a key element to successful coding.  The purpose of the voicing detector is to classify a given frame as voiced or unvoiced.  Measurements that a voicing detector relies on to accomplish its task :  Energy or  Zero Crossing Rate  Prediction Gain

4. Linear Predictive Coding Top left: A speech waveform. Top right: Magnitude sum function. Bottom left: Zero crossing rate. Bottom right: Prediction gain.
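The measurements listed for the voicing detector can be computed per frame as sketched below. The frames are synthetic stand-ins (a low-frequency sine for "voiced", weak seeded noise for "unvoiced"), and the prediction gain uses only a first-order predictor to keep the example short; a real coder would use a higher order. No decision thresholds are given because the slides do not specify any.

```python
import math
import random

def energy(frame):
    return sum(v * v for v in frame)

def magnitude_sum(frame):
    return sum(abs(v) for v in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for prev, cur in zip(frame, frame[1:])
                    if (prev >= 0) != (cur >= 0))
    return crossings / (len(frame) - 1)

def first_order_prediction_gain_db(frame):
    """10*log10(signal energy / error energy) for the best one-tap predictor.

    With rho = R[1]/R[0], the error energy is R[0] * (1 - rho**2).
    """
    r0 = sum(v * v for v in frame)
    r1 = sum(p * c for p, c in zip(frame, frame[1:]))
    return -10.0 * math.log10(1.0 - (r1 / r0) ** 2)

FRAME_LEN = 180
# Synthetic stand-ins for a voiced and an unvoiced frame (assumptions).
voiced = [math.sin(2 * math.pi * 100 * n / 8000 + 0.3) for n in range(FRAME_LEN)]
rng = random.Random(2)
unvoiced = [0.1 * rng.gauss(0.0, 1.0) for _ in range(FRAME_LEN)]

for name, frame in (("voiced", voiced), ("unvoiced", unvoiced)):
    print(f"{name:9s} energy={energy(frame):7.2f} "
          f"msf={magnitude_sum(frame):7.2f} "
          f"zcr={zero_crossing_rate(frame):.3f} "
          f"pg={first_order_prediction_gain_db(frame):6.2f} dB")
```

Voiced frames tend to show high energy, a low zero crossing rate, and a high prediction gain; unvoiced frames show the opposite pattern.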

4. Linear Predictive Coding
• Bit rate: 2.4 kbps
• Samples/frame: 180 samples
• Frame size: 22.5 ms (about 44.4 frames/sec)
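The frame figures are linked by the 8 kHz sampling frequency given in the introduction, which also fixes the bit budget per frame at this bit-rate. A short check, with illustrative variable names:

```python
SAMPLING_RATE_HZ = 8000     # sampling frequency from the introduction
SAMPLES_PER_FRAME = 180
BIT_RATE_BPS = 2400

frame_duration_s = SAMPLES_PER_FRAME / SAMPLING_RATE_HZ
frames_per_second = SAMPLING_RATE_HZ / SAMPLES_PER_FRAME
bits_per_frame = BIT_RATE_BPS * SAMPLES_PER_FRAME / SAMPLING_RATE_HZ

print(f"frame duration: {frame_duration_s * 1000:.1f} ms")
print(f"frame rate:     {frames_per_second:.1f} frames/sec")
print(f"bit budget:     {bits_per_frame:.0f} bits/frame")
```

All the model parameters for one frame (voicing, gain, filter coefficients, pitch period) have to fit into that per-frame bit budget.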

4. Linear Predictive Coding: Speech Coder Standards
Standard   Coder      Excitation / description
FS1015     LPC10      10 coefficients
FS1016     CELP       Code Excitation
-          MELP       Mixed Excitation
IS-54      VSELP      Vector Sum Excited
IS-96      QCELP      Qualcomm Code Excited
G.728      LD-CELP    Low-Delay Code-Excited
G.729      CS-ACELP   Conjugate-Structure Algebraic-Code-Excited

5. Conclusion
• An overview of speech coding was presented, with a brief explanation of the speech production model.
• The properties of different coding techniques were compared.
• For wireline transmission coding, PCM and DPCM were covered.
• Linear prediction coding, which is the basis for coders in modern wireless systems, was also introduced.

6. References
• Wai C. Chu, Speech Coding Algorithms
• Bernard Sklar, Digital Communications