SPEECH CODING Maryam Zebarjad Alessandro Chiumento.

SPEECH PROPERTIES
Two categories: voiced and unvoiced
- Voiced: quasi-periodic in the time domain and harmonically structured in the frequency domain
- Unvoiced: random-like and broadband (like white noise)
Why speech coding?
- Efficient transmission
- Efficient storage
Challenge: high quality at the lowest possible bit rate

Performance measures
Two ways of measuring:
Objective
- SNR (long term)
- SEGSNR, segmental SNR (short term)
Subjective
- DRT: Diagnostic Rhyme Test
- DAM: Diagnostic Acceptability Measure
- MOS: Mean Opinion Score
Four standards for speech quality: Broadcast, Network, Communications, Synthetic
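The two objective measures can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the function names and the 160-sample frame length (20 ms at an assumed 8 kHz sampling rate) are assumptions. SNR is computed once over the whole signal, while SEGSNR averages per-frame SNR values in dB, so quiet frames weigh as much as loud ones.

```python
import math

def snr_db(reference, decoded):
    """Long-term SNR in dB, computed over the whole signal."""
    signal_energy = sum(x * x for x in reference)
    noise_energy = sum((x - y) ** 2 for x, y in zip(reference, decoded))
    return 10.0 * math.log10(signal_energy / noise_energy)

def segsnr_db(reference, decoded, frame_len=160):
    """Segmental SNR: mean of the per-frame SNRs in dB.

    frame_len=160 is an assumed default (20 ms at 8 kHz).
    """
    values = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        sig = sum(x * x for x in ref)
        err = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if sig > 0.0 and err > 0.0:  # skip silent or perfectly coded frames
            values.append(10.0 * math.log10(sig / err))
    if not values:
        raise ValueError("no frame with non-zero signal and error energy")
    return sum(values) / len(values)

# One loud frame and one quiet frame, both with the same absolute error.
reference = [1.0] * 160 + [0.1] * 160
decoded = [x + 0.01 for x in reference]
long_term = snr_db(reference, decoded)
segmental = segsnr_db(reference, decoded)
```

In this toy signal the long-term SNR is dominated by the loud frame, whereas the segmental value is pulled down by the quiet frame, which is why SEGSNR tends to track perceived quality more closely for speech.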

Coding Techniques
WAVEFORM CODERS
Digitize speech on a sample-by-sample basis. The goal is to have the output waveform closely match the input waveform.
- Scalar and vector quantization
- Sub-band coders
- Transform coders
SINUSOIDAL ANALYSIS-SYNTHESIS
These coders rely on the sinusoidal representation of the speech waveform.
- Short-Time Fourier Transform models
- Sinusoidal Transform Coding
- Multiband Excitation Coder
VOCODERS
Speech-specific coders.
- Formant Vocoders
- Channel Vocoders
- LPC Vocoders

Scalar and Vector Quantization
SQ: every sample is mapped individually to a specific code.
Examples: PCM, DPCM, DM, ADPCM, ...
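As an illustration of scalar quantization, here is a minimal first-order DPCM sketch. It is a simplified assumption rather than any standardized coder: the predictor is just the previous reconstructed sample, the quantizer is uniform with a fixed step, and the names `dpcm_encode` and `dpcm_decode` are hypothetical.

```python
import math

def dpcm_encode(samples, step):
    """First-order DPCM: quantize the difference from the previous
    reconstructed sample with a uniform quantizer of size `step`."""
    codes = []
    prediction = 0.0
    for x in samples:
        code = round((x - prediction) / step)  # integer index of the quantized difference
        codes.append(code)
        prediction += code * step              # mirror the decoder's reconstruction
    return codes

def dpcm_decode(codes, step):
    """Rebuild the signal by accumulating the dequantized differences."""
    out = []
    prediction = 0.0
    for code in codes:
        prediction += code * step
        out.append(prediction)
    return out

# A slowly varying test tone: differences are small, so the codes stay small.
step = 0.05
samples = [math.sin(2.0 * math.pi * n / 40.0) for n in range(80)]
codes = dpcm_encode(samples, step)
decoded = dpcm_decode(codes, step)
max_error = max(abs(x - y) for x, y in zip(samples, decoded))
```

Because the encoder predicts from its own reconstruction rather than from the original samples, the quantization error does not accumulate: each decoded sample stays within half a step of the input.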

Scalar and Vector Quantization
VQ: the data (speech) is compressed by encoding it in blocks. The incoming vectors are formed from consecutive data samples or from model parameters.
Examples: VPCM, GS-VQ, A-VQ, ...
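The block-wise encoding can be sketched as a nearest-neighbour search in a codebook. This is a minimal illustration: the tiny hand-made codebook and the function names are assumptions, and a real coder would train its codebook (for example with the LBG algorithm) rather than write it by hand.

```python
def vq_encode(samples, codebook, dim):
    """Split `samples` into blocks of `dim` values and map each block to the
    index of the closest codeword (squared Euclidean distance)."""
    indices = []
    for start in range(0, len(samples) - dim + 1, dim):
        block = samples[start:start + dim]
        best = min(
            range(len(codebook)),
            key=lambda i: sum((b - c) ** 2 for b, c in zip(block, codebook[i])),
        )
        indices.append(best)
    return indices

def vq_decode(indices, codebook):
    """Replace each index by its codeword and concatenate the result."""
    return [value for i in indices for value in codebook[i]]

# Hypothetical 2-dimensional codebook with four entries (2 bits per block).
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
samples = [0.9, 1.2, -0.8, -1.1, 0.1, -0.2, 0.8, -0.7]
indices = vq_encode(samples, codebook, dim=2)
decoded = vq_decode(indices, codebook)
```

Only the indices are transmitted, so with this codebook each pair of samples costs 2 bits regardless of the original sample resolution.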

Sub-band Coders
Unlike SQ and VQ, these coders rely more on the frequency-domain properties of speech. The signal band is divided into frequency sub-bands using a bank of bandpass filters. The output of each filter is then sampled (or down-sampled) and encoded.
Examples: AT&T, CCITT (G.722), ...

Transform Coders
- Work on spectral properties of speech (like SBC).
- They use unitary transforms whose coefficients are quantized at the transmitter, then decoded and inverse-transformed at the receiver.
- The potential for bit-rate reduction in transform coding lies in the fact that unitary transforms tend to generate nearly uncorrelated transform components, which can be coded independently.
- Although many transforms can be used (DCT, DFT, WHT, KLT, ...), all share the property of unitarity: the inverse of the transform matrix equals its conjugate transpose.
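The unitarity property can be checked numerically. The sketch below builds the orthonormal DCT-II matrix (chosen here as one example of a unitary transform; the block size of 8 and the helper names are assumptions) and shows that its transpose inverts it and that signal energy is preserved.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * m + 1) * k / (2 * n))
                     for m in range(n)])
    return rows

def matvec(a, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def transpose(a):
    """Transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*a)]

A = dct_matrix(8)
x = [0.3, 0.5, 0.4, 0.1, -0.2, -0.4, -0.3, 0.0]
y = matvec(A, x)                 # forward transform (transmitter side)
x_rec = matvec(transpose(A), y)  # inverse transform is just the transpose (receiver side)
energy_in = sum(v * v for v in x)
energy_out = sum(v * v for v in y)
```

Since the matrix is real, the conjugate transpose reduces to the plain transpose. A coder would quantize `y` before transmission; the reconstruction error then equals the quantization error in the transform domain, because the transform preserves energy.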

Example: Adaptive Transform Coder
It employs the DCT and achieves high performance.

Speech Coding Using Sinusoidal Analysis-Synthesis Models
These speech coders rely on the sinusoidal representation of the speech waveform.
Speech Analysis-Synthesis Using the Short-Time Fourier Transform
- Speech is slowly time-varying (quasi-stationary) and can be modeled by its short-time spectrum.
- The model is defined by an analysis expression and a synthesis expression.
- h(n) is the sliding analysis window, often constrained to a length of about 5-20 ms.
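For reference, the standard textbook form of the short-time Fourier transform pair is given below. This is an assumption about what the slide showed: the symbols x, X and the normalization by h(0) follow the common convention and may differ from the original notation.

```latex
% Analysis: short-time spectrum of x(m) seen through the sliding window h
X(n,\omega) = \sum_{m=-\infty}^{\infty} x(m)\, h(n-m)\, e^{-j\omega m}

% Synthesis: inverse transform evaluated at time n, valid when h(0) \neq 0
x(n) = \frac{1}{2\pi\, h(0)} \int_{-\pi}^{\pi} X(n,\omega)\, e^{j\omega n}\, d\omega
```

At a sampling rate of 8 kHz (an assumed value typical for telephone speech), a 5-20 ms window corresponds to 40-160 samples.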

Speech Coding Using Sinusoidal Analysis-Synthesis Models
Speech Analysis-Synthesis Using Sinusoidal Transform Coding (McAulay-Quatieri)
- Speech is represented by a linear combination of sinusoids with time-varying amplitudes, phases and frequencies.
- The number of sinusoids L is time-varying. The potential for bit-rate reduction comes from the fact that voiced speech is highly periodic, so L can be adjusted accordingly.
- Furthermore, the statistical properties of the short-time spectrum of unvoiced speech are preserved.
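The sinusoidal representation can be sketched for a single frame as follows. This is a simplification under stated assumptions: amplitudes, frequencies and phases are held constant within the frame (an actual sinusoidal transform coder interpolates them between frames), and the function name and the example pitch period of 40 samples are hypothetical.

```python
import math

def synthesize_frame(amplitudes, frequencies, phases, num_samples):
    """Sum of L sinusoids; frequencies are in radians per sample and all
    parameters are assumed constant over the frame."""
    return [
        sum(a * math.cos(w * n + p)
            for a, w, p in zip(amplitudes, frequencies, phases))
        for n in range(num_samples)
    ]

# Voiced-like frame: L = 3 harmonics of a pitch with a 40-sample period.
w0 = 2.0 * math.pi / 40.0
amplitudes = [1.0, 0.5, 0.25]
frequencies = [w0, 2.0 * w0, 3.0 * w0]
phases = [0.0, 0.0, 0.0]
frame = synthesize_frame(amplitudes, frequencies, phases, 80)
```

For voiced speech the frequencies are close to multiples of the pitch, so a handful of amplitudes and phases plus one pitch value describe the frame, which is where the bit-rate saving comes from.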

Vocoders
- Speech-specific
- Low bit rate, but performance degrades for non-speech signals
Four types: Channel, Formant, Homomorphic, LPC
LPC vocoders are divided into three categories based on their excitation models:
- Two-state excitation
- Mixed excitation
- Residual excitation

LPC Vocoder
In p-th order forward linear prediction, the present sample is predicted from a linear combination of the p past samples. The prediction parameters are obtained by minimizing the mean-square forward prediction error.
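In the usual textbook notation, the forward predictor and the resulting equations can be written as below. The symbols a_k, e(n) and r(k) and the sign convention are assumptions, since conventions differ between references.

```latex
% Forward prediction of the present sample from the p past samples
\hat{x}(n) = \sum_{k=1}^{p} a_k\, x(n-k)

% Forward prediction error and its mean-square value
e(n) = x(n) - \hat{x}(n), \qquad E = \mathrm{E}\{ e^2(n) \}

% Setting \partial E / \partial a_i = 0 gives the normal equations,
% with r(k) = \mathrm{E}\{ x(n)\, x(n-k) \} the autocorrelation
\sum_{k=1}^{p} a_k\, r(|i-k|) = r(i), \qquad i = 1, \dots, p
```

The coefficient matrix of this system is symmetric and Toeplitz, which is what allows it to be solved efficiently.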

The resulting system of equations can be solved using the Levinson-Durbin recursion.
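A minimal sketch of the recursion follows. It assumes the predictor convention x_hat(n) = sum over k of a[k] * x(n-k), so the system being solved is sum over k of a[k] * r(|i-k|) = r(i) for i = 1..p. The function names are hypothetical, and the autocorrelation helper uses the plain (unwindowed) estimate.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation estimates r[0..max_lag] of a finite frame."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Toeplitz normal equations for the predictor coefficients.

    Returns (a, err) where a[k-1] multiplies x(n-k) and err is the final
    mean-square prediction error.
    """
    a = [0.0] * (order + 1)  # a[0] is unused; a[1..order] hold the coefficients
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for order i
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        # Update the lower-order coefficients and append the new one
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

# Hand-picked autocorrelation values for a second-order example.
r = [1.0, 0.5, 0.1]
coeffs, error = levinson_durbin(r, 2)
```

The recursion needs on the order of p squared operations instead of the p cubed of general elimination, and the reflection coefficients k computed along the way are themselves commonly used as transmission parameters in LPC vocoders.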

Work plan
Implementation of:
- LPC Vocoder
- DCT Transform Coder
- DPCM Coder
Comparison of the three methods on specific speech signals