Published by Baldwin Bridges. Modified over 6 years ago.
1
Frequency Domain Perceptual Linear Prediction (FDPLP)
With Marios Athineos, Dan Ellis, Sriram Ganapathy and Samuel Thomas
Figure labels: cosine transform, frequency
The FDPLP technique estimates models of the temporal trajectories of spectral energies in frequency sub-bands. It does so by computing a linear predictor on the cosine transform of the signal rather than on the signal itself, as conventional linear prediction does. Windowing the cosine transform at a given position ensures that the all-pole LP model fits only the part of the signal inside the frequency span set by the window width.
The technique yields features of the same nature as the MFCC coefficients commonly used in speaker ID, so it can directly replace them in most existing systems. However, when the gains of the FDLP models are excluded, the technique is more robust to ...
Decomposition into AM and FM components.
Straightforward alleviation of the effects of linear distortions and reverberations.
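The core step described on this slide, linear prediction applied to the cosine transform of the signal, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fdlp_envelope`, the model order, the rectangular sub-band window and the autocorrelation (Yule-Walker) solver are all choices made here for the example.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz


def fdlp_envelope(signal, order=20, band=None, n_points=None):
    """Fit an all-pole model to the DCT of `signal` (hypothetical helper).

    Returns (envelope, gain). The envelope is the all-pole "spectrum" of the
    DCT sequence, which approximates the squared temporal envelope.
    """
    x = np.asarray(signal, dtype=float)
    c = dct(x, type=2, norm="ortho")           # cosine transform of the signal
    if band is not None:                        # assumed rectangular window:
        lo, hi = band                           # keep one span of DCT indices,
        c = c[lo:hi]                            # i.e. one frequency sub-band
    # Autocorrelation of the DCT sequence for lags 0..order
    r = np.correlate(c, c, mode="full")[len(c) - 1:len(c) + order]
    # Yule-Walker equations: c[k] is predicted as sum_i a[i] * c[k - 1 - i]
    a = solve_toeplitz(r[:order], r[1:order + 1])
    gain = r[0] - np.dot(a, r[1:order + 1])     # prediction error energy
    # Evaluate gain / |A|^2 on [0, pi); this axis corresponds to time
    n_points = n_points or len(x)
    A = np.fft.rfft(np.concatenate(([1.0], -a)), 2 * n_points)[:n_points]
    return gain / np.abs(A) ** 2, gain


# Demo: a noise burst centred at sample 1400 of a 2000-sample signal
rng = np.random.default_rng(0)
n = np.arange(2000)
burst = rng.standard_normal(2000) * np.exp(-0.5 * ((n - 1400) / 50.0) ** 2)
env, gain = fdlp_envelope(burst, order=20)
```

Dropping `gain` (or dividing the envelope by it) gives the gain-excluded variant of the model mentioned above; passing `band=(lo, hi)` restricts the fit to one sub-band.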
2
Reverberant speech: digit recognition accuracy [%], ICSI Meeting Room Digit Corpus
Conditions: clean, reverberated. Features: PLP, FDPLP (gain included, gain excluded).
Improvements on real reverberations are similar (IEEE Signal Proc. Letters 08).
Telephone speech: phoneme recognition accuracy [%]
Conditions: TIMIT, HTIMIT. Features: PLP-MRASTA, FDPLP.
3
FDLP decomposition of the signal
AM component (temporal envelope) FM component (carrier)
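To make the AM/FM split concrete, here is a small sketch that separates a signal into a temporal envelope and a carrier. It is an assumption-laden stand-in: it uses the Hilbert envelope from `scipy.signal.hilbert` rather than an FDLP model (the FDLP all-pole fit approximates the squared Hilbert envelope), and the helper name `am_fm_decompose` is made up for this example.

```python
import numpy as np
from scipy.signal import hilbert


def am_fm_decompose(signal, floor=1e-12):
    """Split `signal` into an AM envelope and an FM carrier (hypothetical helper).

    Stand-in for the FDLP decomposition: the envelope here is the Hilbert
    envelope, not the all-pole FDLP envelope.
    """
    x = np.asarray(signal, dtype=float)
    am = np.abs(hilbert(x))              # temporal envelope (AM component)
    fm = x / np.maximum(am, floor)       # carrier (FM component)
    return am, fm


# Demo: a 1 kHz tone with a slow 4 Hz amplitude modulation, sampled at 8 kHz
t = np.arange(4000) / 8000.0
true_env = 1.0 + 0.5 * np.sin(2 * np.pi * 4 * t)
x = true_env * np.cos(2 * np.pi * 1000 * t)
am, fm = am_fm_decompose(x)
```

Multiplying `am` by `fm` gives back the original signal, so the two components together carry all the information in `x`.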
4
Model without its gain component
Reverberant speech (convolution with a long room impulse response)
Convolution turns into addition in the log spectral domain, as long as most of the room impulse response fits into the analysis window! Ignoring the FDLP model gain makes the representation invariant to reverberation.
Figure panels: 3 s window, 30 s window
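The claim that convolution becomes addition in the log spectral domain, and that this depends on the window length, can be checked numerically. The sketch below uses a synthetic decaying impulse response and white noise as assumed stand-ins for a room response and speech; it illustrates the principle only and is not the FDLP processing itself.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(1024)            # stand-in for clean speech
h = 0.9 ** np.arange(64)                 # assumed decaying "room" response
y = np.convolve(x, h)                    # reverberated signal

nfft = 2048                              # long enough to hold all of y


def log_mag(sig):
    """Log magnitude spectrum on a common FFT grid."""
    return np.log(np.abs(np.fft.rfft(sig, nfft)))


# Long analysis window: the whole impulse response fits inside it, so
# log|Y| equals log|X| + log|H| up to rounding error.
long_err = np.max(np.abs(log_mag(y) - (log_mag(x) + log_mag(h))))

# Short analysis window (32 samples, shorter than h): the additive relation
# breaks because the frame contains energy smeared in from outside it.
start, length = 500, 32
y_frame = y[start:start + length]
x_frame = x[start:start + length]
short_err = np.max(np.abs(log_mag(y_frame) - (log_mag(x_frame) + log_mag(h))))
```

When the additive relation holds, the channel contributes a fixed term to the log spectrum, which is why discarding a gain-like term can remove much of its effect.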