CEPSTRAL ANALYSIS Cepstral analysis synthesis on the mel frequency scale, and an adaptive algorithm for it. Cecilia Caruncho Llaguno.



Sources
– Cepstral analysis synthesis on the mel frequency scale. Satoshi Imai, Tokyo Institute of Technology, 1983
– An adaptive algorithm for mel-cepstral analysis of speech. Toshiaki Fukada (Canon Inc., Kawasaki); Keiichi Tokuda, Takao Kobayashi, and Satoshi Imai (Tokyo Institute of Technology), 1992

Basic Concepts
Cepstral analysis
– Definition
– Features
Mel frequency scale

Cepstral analysis
Main features
– Good characteristics for representation
– Log spectral envelope → accurate & efficient
– Low sensitivity & quantization noise
– Small spectral distortion
– LMA (log magnitude approximation) filter → high-quality speech synthesis

Cepstral analysis
– Complex logarithm
– Inverse Z transform
– Inside the unit circle, |z| < 1
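The definition on this slide (logarithm of the spectrum followed by an inverse transform) can be sketched numerically. The block below is a minimal illustration, not the authors' implementation: it computes the real cepstrum (log magnitude only, phase discarded) with a naive DFT so that it needs no external library. The names `dft`, `real_cepstrum`, and `floor` are hypothetical, and the test signal is an assumption chosen because its cepstrum is known in closed form.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (no external dependencies)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t, value in enumerate(x):
            acc += value * cmath.exp(sign * 2j * math.pi * k * t / n)
        # The inverse transform carries the 1/N normalization.
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame, floor=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum.

    `floor` (an assumption of this sketch) guards against log(0).
    """
    spectrum = dft(frame)
    log_mag = [math.log(max(abs(s), floor)) for s in spectrum]
    return [c.real for c in dft(log_mag, inverse=True)]


# Assumed test signal: x[n] = delta[n] + 0.5 * delta[n - 1], zero-padded.
frame = [1.0, 0.5] + [0.0] * 62
cepstrum = real_cepstrum(frame)
```

For this minimum-phase signal the complex cepstrum at quefrency 1 is 0.5, so the real cepstrum (its even part) is close to 0.25 there. In practice a fast FFT routine would replace the naive `dft`.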

Mel frequency scale
– Human hearing → non-linear frequency scale
– Approximately linear up to 1000 Hz, logarithmic above
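One widely used closed-form approximation of the mel scale is mel = 2595 log10(1 + f/700). This formula does not appear on the slides and is given here only as an illustration of the "linear below about 1000 Hz, logarithmic above" behaviour; the function names are hypothetical.

```python
import math


def hz_to_mel(f_hz):
    """Common mel approximation (an assumption, not taken from the slides)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# Equal steps in Hz shrink on the mel axis as frequency grows.
low_step = hz_to_mel(200.0) - hz_to_mel(100.0)
high_step = hz_to_mel(4100.0) - hz_to_mel(4000.0)
```

With this formula 1000 Hz maps to roughly 1000 mel, and `low_step` is larger than `high_step`, which reflects the compression of high frequencies.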

Mel cepstral analysis system

Spectral envelope extraction by the improved cepstral method
– Approximation of the mel scale
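In the mel-cepstral literature the mel scale is usually approximated by the phase response of a first-order all-pass filter, (z^-1 − α)/(1 − α z^-1). The sketch below evaluates that warped frequency. The function name is hypothetical, and the default α = 0.35 is an assumption (a value commonly quoted for 10 kHz sampling); the slide itself does not state the formula or the value.

```python
import math


def warp_frequency(omega, alpha=0.35):
    """Warped frequency (rad) given by the first-order all-pass phase.

    omega is the normalized angular frequency in [0, pi].
    alpha is the warping parameter (assumed value, see lead-in).
    """
    return omega + 2.0 * math.atan2(
        alpha * math.sin(omega), 1.0 - alpha * math.cos(omega)
    )


# Warped positions of nine equally spaced frequencies from 0 to pi.
grid = [math.pi * k / 8 for k in range(9)]
warped = [warp_frequency(w) for w in grid]
```

The mapping fixes 0 and π and is monotonic; for α > 0 it stretches the low-frequency region, which is the qualitative behaviour of the mel scale.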

Spectral envelope extraction by the improved cepstral method
Former method:
– Fine structure → the spectral envelope is not sufficiently separated from the pitch parameter
Present method:
– Can extract the envelope without being affected by the fine structure

Mel Log Spectrum Approximation (MLSA) filter
Why do we use it?
– High quality
– Simple
– Coefficient sensitivities
– Quantization characteristics
Transfer function
Quantization of the filter parameter

MLSA transfer function
– Ideal
– Basic filter

MLSA transfer function
– Ideal MLSA filter: not realizable
– Padé approximation
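The ideal filter involves an exponential of a rational function, which is why it cannot be realized directly; a Padé approximant replaces exp(w) by a ratio of two polynomials. The sketch below uses the standard diagonal [L/L] Padé approximant of exp(w), with a scalar w standing in for the filter's exponent. The function names and the default order 4 are assumptions, and the coefficients used in the original papers may differ slightly from these textbook values.

```python
import math


def pade_coefficients(order):
    """Coefficients a_l of the [order/order] Pade approximant of exp(w)."""
    denom = math.factorial(2 * order)
    return [
        math.factorial(2 * order - l) * math.factorial(order)
        / (denom * math.factorial(l) * math.factorial(order - l))
        for l in range(order + 1)
    ]


def pade_exp(w, order=4):
    """Approximate exp(w) as P(w) / P(-w), with P built from a_l."""
    coeffs = pade_coefficients(order)
    numerator = sum(a * w ** l for l, a in enumerate(coeffs))
    denominator = sum(a * (-w) ** l for l, a in enumerate(coeffs))
    return numerator / denominator


coeffs_order4 = pade_coefficients(4)
approx = pade_exp(0.5)
```

For order 4 the coefficients are 1, 1/2, 3/28, 1/84 and 1/1680. The approximation is very accurate for small |w| and degrades as |w| grows, so the exponent has to stay bounded for the realized filter to remain close to the ideal one.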

Filter parameters

Data rate
– Filter coefficients → bounded
– Digitization → quantizer q → data amount b_s (bits/frame)

Data rate
– Spectral envelope: b_s bits/frame
– Pitch parameter: b_p bits/frame
– Period of transmission: T seconds
– Overall bit rate of this system: B (bits/second)
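From the definitions on this slide, the overall bit rate follows as B = (b_s + b_p) / T: the bits sent per frame divided by the frame period. The helper below is a small sketch of that arithmetic; the function name and the example numbers are hypothetical and are not values from the presentation.

```python
def overall_bit_rate(b_s, b_p, period_s):
    """Overall bit rate B in bits/second.

    b_s: bits per frame for the spectral envelope
    b_p: bits per frame for the pitch parameter
    period_s: transmission period T in seconds
    """
    if period_s <= 0:
        raise ValueError("period_s must be positive")
    return (b_s + b_p) / period_s


# Hypothetical example: 60 + 8 bits every 20 ms.
example_rate = overall_bit_rate(b_s=60, b_p=8, period_s=0.020)
```

With these assumed numbers the result is about 3400 bits/second, or 3.4 kbit/s.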

Data rate and speech quality
Table columns: T (ms), M, q, b_p (bits), B (kbit/s), speech quality
Speech quality by row: very high, fairly good, still good

Spectral distortion
– Distortion caused by the interpolation
– Distortion caused by the quantization

Spectral estimation based on mel-cepstral representation
– Model spectrum

Spectral estimation based on mel-cepstral representation
– Unbiased Estimator of Log Spectrum (UELS) by S. Imai and C. Furuichi → minimization of ε
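For reference, the model spectrum and the criterion ε are usually written as follows in the mel-cepstral literature. This is a sketch from that general literature, not a transcription of the slide: the symbols other than ε (the mel-cepstral coefficients, the warping parameter α, the analysis order M, and the modified periodogram I_N) are assumptions about notation.

```latex
% Model spectrum on the warped (mel-like) frequency axis
H(z) = \exp \sum_{m=0}^{M} \tilde{c}(m)\, \tilde{z}^{-m},
\qquad
\tilde{z}^{-1} = \frac{z^{-1} - \alpha}{1 - \alpha z^{-1}},
\quad |\alpha| < 1

% Criterion minimized with respect to the coefficients \tilde{c}(m)
\varepsilon = \frac{1}{2\pi} \int_{-\pi}^{\pi}
  \left\{ \exp R(\omega) - R(\omega) - 1 \right\} d\omega,
\qquad
R(\omega) = \log I_N(\omega) - \log \left| H(e^{j\omega}) \right|^{2}
```

The integrand is non-negative and vanishes only where R(ω) = 0, that is, where the model log spectrum matches the log periodogram.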

Spectral estimation based on mel-cepstral representation Newton-Raphson method:

Adaptive mel-cepstral analysis algorithm
– H → unit matrix
– μ: adaptation step size
– ε(n): estimate of ε at time n
– e(n): output of the inverse filter 1/D(z) at time n
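Replacing the Hessian H by the unit matrix turns the Newton-Raphson update into a plain gradient step scaled by μ. The toy sketch below illustrates only that simplification on a small quadratic cost with an exact gradient; it is not the speech algorithm itself, where the gradient is estimated from e(n). All names, the matrix `A`, the vector `B`, and the step size 0.5 are assumptions made for the illustration.

```python
def gradient(c, a, b):
    """Gradient of 0.5 * c^T A c - b^T c for a symmetric 2x2 matrix A."""
    return [
        a[0][0] * c[0] + a[0][1] * c[1] - b[0],
        a[1][0] * c[0] + a[1][1] * c[1] - b[1],
    ]


def newton_step(c, a, b):
    """Newton-Raphson update: c - H^{-1} g, with H = A solved in closed form."""
    g = gradient(c, a, b)
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    dx = (a[1][1] * g[0] - a[0][1] * g[1]) / det
    dy = (a[0][0] * g[1] - a[1][0] * g[0]) / det
    return [c[0] - dx, c[1] - dy]


def unit_hessian_step(c, a, b, mu):
    """Update with H replaced by the unit matrix: c - mu * g."""
    g = gradient(c, a, b)
    return [c[0] - mu * g[0], c[1] - mu * g[1]]


A = [[2.0, 0.5], [0.5, 1.0]]  # toy Hessian (assumed)
B = [1.0, 1.0]                # toy linear term (assumed)

c_newton = newton_step([0.0, 0.0], A, B)

c_adaptive = [0.0, 0.0]
for _ in range(100):
    c_adaptive = unit_hessian_step(c_adaptive, A, B, mu=0.5)
```

On this quadratic the Newton step reaches the minimum in one iteration but needs a matrix solve, while the unit-matrix version needs many cheap iterations and a step size small enough for stability. That trade-off is what makes the simplified update attractive for sample-by-sample adaptation.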

Adaptive mel-cepstral analysis algorithm

Conclusions
MLSA
– Simple
– Good statistical features
– Small spectral distortions
Adaptive algorithm
– Computationally efficient
– Fast convergence properties

Questions?
Thank you for your attention