Improved ASR in noise using harmonic decomposition
Outline: Introduction, Pitch-Scaled Harmonic Filter, Recognition Experiments, Results, Conclusion
[Figure: production of /z/, showing the periodic and aperiodic contributions separately]
Motivation & Aims (Introduction)
Most speech sounds are predominantly voiced or unvoiced. What happens when the two components are "mixed"?
Voiced and unvoiced components have different natures:
- unvoiced: aperiodic signal from turbulence-noise sources
- voiced: quasi-periodic signal from vocal-fold vibration
Why not extract their features separately? Do the two contributions contain complementary information?
Human speech recognition still performs well in noise. How? Does it take advantage of harmonic properties?
Voiced and unvoiced parts of a speech signal (Introduction)
[Figure: production of /z/, decomposed into its periodic and aperiodic contributions]
Automatic Speech Recognition (Introduction)
Front end, feature extraction: conversion of the speech signal into a sequence of parameter vectors.
Pattern recognition, dynamic programming: matching of observation sequences against models of known utterances, turning the speech signal into speech labels.
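As an illustration of the front-end stage, here is a minimal sketch of MFCC feature extraction with deltas and accelerations, assuming the librosa library; the frame settings and function name are illustrative choices, not taken from the paper.

```python
import numpy as np
import librosa

def extract_features(path, n_mfcc=13):
    """Sketch of a standard front end: 13 MFCCs plus deltas and accelerations."""
    y, sr = librosa.load(path, sr=8000)                  # Aurora 2 waveforms are at 8 kHz
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                n_fft=256, hop_length=80)    # ~32 ms window, 10 ms step (illustrative)
    delta = librosa.feature.delta(mfcc)                  # first-order differences (Δ)
    delta2 = librosa.feature.delta(mfcc, order=2)        # second-order differences (Δ²)
    return np.vstack([mfcc, delta, delta2]).T            # (frames, 39) observation vectors
```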
PSHF block diagram (PSHF)
[Block diagram: the raw waveform s(n) and a raw pitch estimate f0_raw enter a pitch-optimisation stage, which outputs the optimised pitch f0_opt and window length N_opt; the windowed signal s_w(n), obtained with window w(n), passes to the harmonic decomposition stage, which estimates the periodic component v̂_w(n) and, by subtraction, the aperiodic component û_w(n), yielding the periodic waveform v(n) and the aperiodic waveform u(n).]
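A minimal sketch of the pitch-scaled harmonic filtering idea for a single frame: when the analysis window spans exactly four pitch periods, the harmonics of f0 fall on every fourth DFT bin, so a periodic estimate can be rebuilt from those bins and the aperiodic estimate is the residual. This is an illustrative simplification, not the full PSHF (no pitch optimisation or frame interpolation); the choice of four periods and the function name are assumptions here.

```python
import numpy as np

def pshf_frame(s_w, n_periods=4):
    """Decompose one frame whose length equals n_periods pitch periods
    into periodic and aperiodic parts by harmonic-bin selection."""
    N = len(s_w)
    S = np.fft.rfft(s_w)
    harmonic_bins = np.arange(n_periods, len(S), n_periods)   # bins at f0, 2*f0, 3*f0, ...
    V = np.zeros_like(S)
    V[harmonic_bins] = S[harmonic_bins]        # keep only the harmonic bins
    v_w = np.fft.irfft(V, n=N)                 # periodic estimate
    u_w = s_w - v_w                            # aperiodic estimate (residual)
    return v_w, u_w
```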
Decomposition example: waveforms (PSHF)
[Figure: waveforms of the original signal, the periodic part and the aperiodic part]
Decomposition example: spectrograms (PSHF)
[Figure: spectrograms of the original signal, the periodic part and the aperiodic part]
Decomposition example: MFCC spectrograms (PSHF)
[Figure: MFCC-domain representations of the original signal, the periodic part and the aperiodic part]
Parameterisations (Method)
- BASE: waveform → MFCC + Δ + Δ² features
- SPLIT: waveform → PSHF → MFCC + Δ + Δ² on each stream, concatenated
- PCA26, PCA78, PCA13, PCA39: as SPLIT, with the concatenated features decorrelated by PCA and truncated to the number of coefficients in the name (26, 78, 13 or 39)
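A hedged sketch of the SPLIT and PCA-style parameterisations, assuming scikit-learn and per-stream MFCC + Δ + Δ² matrices such as those produced by the extract_features sketch above; the helper name and the way PCA is fitted here are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

def split_features(periodic_feats, aperiodic_feats, n_components=None):
    """Concatenate the two 39-dim streams (-> 78 dims per frame);
    optionally project with PCA, e.g. n_components=26 for PCA26."""
    X = np.hstack([periodic_feats, aperiodic_feats])     # (frames, 78)
    if n_components is None:
        return X                                         # SPLIT parameterisation
    pca = PCA(n_components=n_components)                 # in practice, fit on training data only
    return pca.fit_transform(X)                          # PCAxx parameterisation
```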
Speech database: Aurora 2.0 (Method)
- TIdigits database at 8 kHz, filtered with the G.712 channel characteristic
- Connected English digit strings (male and female speakers)
Description of the experiments (Method)
- Baseline experiment [base]: standard parameterisation of the original waveforms (i.e. MFCC + Δ + Δ²)
- Split experiments [split]: adjustment of the stream weights (voiced vs. unvoiced)
- PCA experiments [pca26, pca78, pca13 and pca39]: decorrelation of the feature vectors and reduction of the number of coefficients
Split experiments results (Results)
[Three figure slides of plots for the split experiments; the numerical results are not reproduced in this transcript]
Summary of results (Results)
[Figure slide summarising the results; not reproduced in this transcript]
Conclusions & Further Work
Conclusions:
- The PSHF module split Aurora's speech waveforms into two synchronous streams (periodic and aperiodic).
- Used separately, each stream gave slightly degraded accuracy; used together, accuracy was substantially increased in noisy conditions.
- Periodic speech segments provide robustness to noise.
Further work:
- Apply Linear Discriminant Analysis (LDA) to the two-stream feature vector.
- Evaluate the performance of this front end on a more general task, such as phoneme recognition.
- Test the technique for speaker recognition.
COLUMBO PROJECT: Harmonic Decomposition applied to ASR
David M. Moreno¹, Philip J.B. Jackson², Javier Hernando¹, Martin J. Russell³
http://www.ee.surrey.ac.uk/Personal/P.Jackson/Columbo/
Pitch optimisation: vowel /u/
[Figure: cost function for the pitch optimisation, and a spectrum derived from a 268-point DFT]
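A hedged sketch of how a pitch optimisation of this kind can be set up: candidate values of f0 around a raw estimate are scored by how much of the frame's energy falls into the harmonic DFT bins when the frame spans exactly four pitch periods, and the best-scoring candidate is kept. The search range, step count and cost definition are illustrative assumptions, not the paper's actual cost function.

```python
import numpy as np

def optimise_pitch(s, fs, f0_raw, n_periods=4, search=0.2, steps=81):
    """Refine a raw pitch estimate by maximising the fraction of energy
    captured in the harmonic bins of an n_periods-long analysis frame."""
    best_f0, best_score = f0_raw, -np.inf
    for f0 in np.linspace((1 - search) * f0_raw, (1 + search) * f0_raw, steps):
        N = int(round(n_periods * fs / f0))            # frame length = n_periods pitch periods
        S = np.fft.rfft(s[:N])
        harmonic = np.arange(n_periods, len(S), n_periods)
        score = np.sum(np.abs(S[harmonic])**2) / np.sum(np.abs(S)**2)
        if score > best_score:
            best_f0, best_score = f0, score
    return best_f0
```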
Harmonic decomposition: vowel /u/
[Figure: harmonic decomposition of the vowel /u/]
Word accuracy results (%)
[Table of word-accuracy results in percent; values not reproduced in this transcript]
Observation probability, with stream weights
[Equation slide; the formula is not reproduced in this transcript]
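For reference, the standard multi-stream HMM observation probability with stream exponents (as used, for example, in HTK) has the form below; the slide presumably shows an expression of this kind, with S = 2 streams (periodic and aperiodic) and stream weights γ_s as adjusted in the split experiments, though the exact notation on the slide is not reproduced here.

```latex
b_j(\mathbf{o}_t) \;=\; \prod_{s=1}^{S}
\left[ \sum_{m=1}^{M_s} c_{jsm}\,
\mathcal{N}\!\left(\mathbf{o}_{st};\, \boldsymbol{\mu}_{jsm}, \boldsymbol{\Sigma}_{jsm}\right) \right]^{\gamma_s}
```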