Mel-spectrum to Mel-cepstrum Computation A Speech Recognition presentation October 1 2003 Ji Gu J.Gu@umail.LeidenUniv.nl
Mel-spectrum to Mel-cepstrum Computation Now we have known: The FFT processing step converts each frame of N samples from the time domain into the frequency domain. The result of the Mel-spectrum computation is:
Mel-spectrum to Mel-cepstrum Computation To compute Mel-cepstrum: We convert the log Mel-spectrum back to time domain using the Discrete Cosine Transform (DCT). (Because the Mel-spectrum coefficients and their logarithm are real numbers) The result obtained is called the Mel Frequency Cepstrum Coefficients (MFCC).
Mel-spectrum to Mel-cepstrum Computation Therefore : A DCT is applied to the natural logarithm of the Mel-spectrum to obtain the Mel-cepstrum,c[n] as: C is the number of the cepstral coefficients
Mel-spectrum to Mel-cepstrum Computation In SPHINX III Signal Processing Front End Specification First, the Cosine section of c[n] is computed: int32 fe_compute_melcosine(melfb_t *MEL_FB) { float period, freq; int32 i,j; period = (float)2*MEL_FB->num_filters; if ((MEL_FB->mel_cosine = (float **) fe_create_2d(MEL_FB->num_cepstra,MEL_FB->num_filters, sizeof(float)))==NULL){ fprintf(stderr,"memory alloc failed in fe_compute_melcosine()\n...exiting\n"); exit(0); }
Mel-spectrum to Mel-cepstrum Computation for (i=0; i<MEL_FB->num_cepstra; i++) { freq = 2*(float)M_PI*(float)i/period; for (j=0;j< MEL_FB->num_filters;j++) MEL_FB->mel_cosine[i][j] = (float)cos((double)(freq*(j+0.5))); } return(0); Second, a Cosine transform of the Logarithm of the Mel-spectrum:
Mel-spectrum to Mel-cepstrum Computation void fe_mel_cep(fe_t *FE, double *mfspec, double *mfcep) { int32 i,j; /* static int first_run=1; */ /* unreferenced variable */ int32 period; float beta; period = FE->MEL_FB->num_filters; for (i=0;i<FE->MEL_FB->num_filters; ++i) if (mfspec[i]>0) mfspec[i] = log(mfspec[i]); else mfspec[i] = -1.0e+5; }
Mel-spectrum to Mel-cepstrum Computation for (i=0; i< FE->NUM_CEPSTRA; ++i){ mfcep[i] = 0; for (j=0;j<FE->MEL_FB->num_filters; j++){ if (j==0) beta = 0.5; else beta = 1.0; mfcep[i] += beta*mfspec[j]*FE->MEL_FB->mel_cosine[i][j]; } mfcep[i] /= (float)period; return;
Mel-spectrum to Mel-cepstrum Computation By applying the procedure described above: For each speech frame, a set of mel-frequency cepstrum coefficients(MFCC) is computed. This set of coefficients is called an acoustic vector which represents the phonetically important characteristics of speech and is very useful for further analysis and processing in Speech Recognition. End