SPEECH CODING Maryam Zebarjad Alessandro Chiumento Supervisor : Sylwester Szczpaniak.


Outline
- Properties of speech signals
- Why coding?
- Implemented techniques: Differential Pulse-Code Modulation, DCT Transform Coder, LPC Vocoder
- Results

SPEECH PROPERTIES
Speech is produced when air is forced from the lungs through the vocal cords and along the vocal tract. It can be modeled by two states:
- Voiced speech: produced by the vibration of the vocal cords; quasi-periodic in the time domain and harmonically structured in the frequency domain.
- Unvoiced speech: produced, for example, by high-speed air passing through a constriction in the vocal tract (mouth and lips); random-like and broadband (like white noise).

Why coding?
The original speech signal has to be processed in order to:
- minimize dimensions (storage)
- minimize bitrate (transmission)
Typical uses: VoIP, mobile telephony.

DPCM
We applied DPCM to a wave file. The results for different prediction orders are:
- the coder and decoder signals for prediction orders 1, 2, 5, 10 and 19
- the corresponding wave files for each stage
- the SNR for each prediction order
The predictor was computed with the autocorrelation method, using the basic formulas stated previously.

The DPCM Method with autocorrelation
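The slide's own formulas are not preserved in this transcript. As a rough illustration only, the sketch below shows one common way to build a DPCM coder on top of the autocorrelation method: the predictor coefficients are obtained from the frame autocorrelation with the Levinson-Durbin recursion, and the prediction error is quantized uniformly inside the prediction loop. The function names, the uniform quantizer and the `step` parameter are assumptions made for this example, not details taken from the presentation.

```python
def autocorrelation(x, max_lag):
    """Return r[k] = sum_n x[n] * x[n + k] for k = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns a[0..order-1] such that x_hat[n] = sum_k a[k] * x[n - 1 - k].
    """
    a = [0.0] * (order + 1)  # a[0] is unused; a[1..order] are the coefficients
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # degenerate signal: leave remaining coefficients at zero
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient of stage i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:]


def dpcm_encode(x, coeffs, step):
    """Quantize the prediction error; prediction uses reconstructed samples."""
    order = len(coeffs)
    recon = [0.0] * order  # zero initial history (assumption)
    codes = []
    for sample in x:
        pred = sum(coeffs[k] * recon[-1 - k] for k in range(order))
        q = round((sample - pred) / step)  # uniform quantizer index
        codes.append(q)
        recon.append(pred + q * step)  # same value the decoder will rebuild
    return codes


def dpcm_decode(codes, coeffs, step):
    """Rebuild the signal from quantizer indices with the same predictor."""
    order = len(coeffs)
    recon = [0.0] * order
    for q in codes:
        pred = sum(coeffs[k] * recon[-1 - k] for k in range(order))
        recon.append(pred + q * step)
    return recon[order:]
```

Because the encoder predicts from the reconstructed samples rather than from the original ones, the decoder tracks it exactly and the per-sample error stays within half a quantizer step. Raising `order` (for example from 1 to 19) only changes the list returned by `levinson_durbin`.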

The Original Signal

Coder Signal for the Prediction Order of 1

Decoder Signal for the Prediction Order of 1

Coder Signal for the Prediction Order of 2

Decoder Signal for the Prediction Order of 2

Coder Signal for the Prediction Order of 5

Decoder Signal for the Prediction Order of 5

Coder Signal for the Prediction Order of 10

Decoder Signal for the Prediction Order of 10

Coder Signal for the Prediction Order of 19

Decoder Signal for the Prediction Order of 19

SNR
The decoder SNR is then calculated for each prediction order.
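The SNR formula from the slide is not preserved in this transcript. A minimal sketch, assuming the usual definition (signal energy divided by the energy of the difference between original and decoded signal, expressed in dB), could look like the following. The function name `snr_db` is hypothetical, and the presentation may have used a different variant such as a segmental SNR.

```python
import math


def snr_db(original, decoded):
    """SNR in dB: 10 * log10(sum(s^2) / sum((s - d)^2))."""
    signal_power = sum(s * s for s in original)
    noise_power = sum((s - d) ** 2 for s, d in zip(original, decoded))
    if noise_power == 0.0:
        return math.inf  # perfect reconstruction
    if signal_power == 0.0:
        return -math.inf  # silent reference signal
    return 10.0 * math.log10(signal_power / noise_power)
```

Both sequences are assumed to be time-aligned and of equal length; `zip` silently truncates to the shorter one otherwise.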

LPC Vocoder
Vocoders rely strongly on the properties of speech.
Two-state excitation model:
- pulses for voiced signal
- random noise for unvoiced signal
The vocal tract is modeled as an all-pole function.
Source-system synthesis model.
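The model equation on this slide is not preserved. As a hedged illustration of the source-system idea, the sketch below assumes the common all-pole form H(z) = G / (1 - sum_k a_k z^-k), which in the time domain reads s[n] = G u[n] + sum_k a_k s[n-k], where u[n] is the excitation. The function names, the unit-amplitude pulse train and the Gaussian noise source are assumptions for the example.

```python
import random


def make_excitation(n_samples, voiced, pitch_period=0, seed=0):
    """Two-state excitation: a pulse train if voiced, white noise otherwise."""
    if voiced:
        # One unit pulse every pitch_period samples (pitch_period must be > 0).
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)  # seeded so the example is reproducible
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def all_pole_filter(excitation, coeffs, gain):
    """Compute s[n] = gain * u[n] + sum_k coeffs[k - 1] * s[n - k]."""
    order = len(coeffs)
    out = []
    for n, u in enumerate(excitation):
        acc = gain * u
        for k in range(1, order + 1):
            if n - k >= 0:  # samples before the start are taken as zero
                acc += coeffs[k - 1] * out[n - k]
        out.append(acc)
    return out
```

For instance, `all_pole_filter(make_excitation(4, True, 100), [0.5], 2.0)` returns the first four samples of the impulse response of a one-pole filter: `[2.0, 1.0, 0.5, 0.25]`.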

LPC Vocoder
We have to find:
- pitch period
- gain
- poles of the system

LPC Vocoder
V/UV DETECTION is done by:
- taking the energy of each frame and comparing it to a threshold
- taking the zero-crossing rate and comparing it to a threshold
PITCH DETECTION is done by:
- the autocorrelation method: we correlate the signal with itself; the output has a maximum at a lag equal to the pitch period
POLES OF THE SYSTEM are estimated using:
- LPC, in our case with the Levinson-Durbin algorithm
GAIN is estimated as follows:
- if the frame is unvoiced, we take the square root of the average power of the frame
- if the frame is voiced, we use the average power over every pitch period
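The voiced/unvoiced decision and the pitch search described above can be sketched as follows. The thresholds, the lag search range and the rule for combining the two tests (high energy AND low zero-crossing rate means voiced) are assumptions; the slides do not state how the two comparisons were combined.

```python
def frame_energy(frame):
    """Average power of the frame."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0.0) != (b >= 0.0)
    )
    return crossings / (len(frame) - 1)


def is_voiced(frame, energy_threshold, zcr_threshold):
    """Assumed rule: voiced if energy is high and zero-crossing rate is low."""
    return (
        frame_energy(frame) > energy_threshold
        and zero_crossing_rate(frame) < zcr_threshold
    )


def pitch_period(frame, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag
```

Restricting the search to `min_lag..max_lag` keeps the trivial maximum at lag 0 out of the result; in practice the bounds would be chosen from the expected pitch range and the sampling rate.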

LPC Vocoder
ORIGINAL SAMPLE
SYNTHESIZED SAMPLES

DCT Transform Coder
There is no standard.
Same structure as the vocoder.

DCT Transform Coder
The Discrete Cosine Transform is a unitary transform that expresses the incoming signal as a finite sum of cosine functions. If the signal is periodic, a small number of cosines (coefficients) is enough; if the signal is not periodic, many more cosines are needed.
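The transform equation shown on the slide is not preserved. As an illustration, the sketch below assumes the orthonormal DCT-II, a common unitary variant, together with a simple rule that keeps only the largest-magnitude coefficients. Both the DCT type and the selection rule are assumptions; the presentation does not say which were used.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a real sequence (direct O(N^2) evaluation)."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)
        )
        out.append(scale * total)
    return out


def idct(c):
    """Inverse of dct(): rebuilds the signal as a finite sum of cosines."""
    n = len(c)
    out = []
    for i in range(n):
        total = c[0] * math.sqrt(1.0 / n)
        total += sum(
            math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(total)
    return out


def keep_largest(coeffs, count):
    """Zero every coefficient except the `count` largest in magnitude."""
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    keep = set(order[:count])
    return [c if k in keep else 0.0 for k, c in enumerate(coeffs)]
```

A coder along these lines would transmit only the retained coefficients and their positions, and the decoder would call `idct` on the zero-filled vector. Since the transform is unitary, the energy of the discarded coefficients equals the energy of the reconstruction error.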

DCT Transform Coder
- Voiced frame: waveform, DCT coefficients
- Unvoiced frame: waveform, DCT coefficients

DCT Transform Coder
- Original sample
- Synthesized sample: 22.5 ms, 720 coeff (V), 1460 coeff (UV)
- Synthesized sample: 22.5 ms, 40 coeff (V), 1460 coeff (UV)
- Synthesized sample: 50 ms, 720 coeff (V), 1460 coeff (UV)