IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING MARCH 2010 Lan-Ying Yeh 99.04.12.


1 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING MARCH 2010 Lan-Ying Yeh 99.04.12

2 Introduction
- Music Information Retrieval (MIR)
- Singer identification and vocal-timbre similarity
- Feature extraction
- Influence from other instruments

3 Related Studies
- Using a statistically based speaker-identification method for speech signals in noisy environments
  - First estimate an accompaniment-only model from interlude sections
  - Obtain a vocal-only model by subtracting the accompaniment-only model from the vocal-plus-accompaniment model
  - Assumes that the singing voice and the accompaniment sounds are statistically independent
  - This assumption is not always satisfied, so the estimation can be problematic

4 Related Studies
- Using a vocal separation method
  - Similar to the authors' accompaniment sound reduction method
  - Did not deal with interlude sections
  - Conducted experiments using only vocal sections

5 Method Overview

6 Accompaniment Sound Reduction
- F0 estimation
  - PreFEst (predominant-F0 estimation method)
  - Express the observed power spectrum on a frequency axis in units of cents
  - Apply a band-pass filter designed to cover most melody lines
  - Treat the filtered frequency components as an observed pdf
  - Model each observed pdf as generated from a weighted mixture of tone models for all possible F0s
  - Estimate the weights with the EM algorithm (MAP estimation) and regard them as the F0's pdf
  - Track the dominant F0
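The weight-estimation step on this slide can be illustrated with a small sketch. It is not the PreFEst implementation: the tone model (Gaussians at each harmonic with 1/h weights), the 20-cent width, the candidate F0 list, and the cent reference frequency are all assumptions made for illustration, and the EM loop below is plain maximum likelihood without the MAP prior.

```python
import math

# Assumed reference for the cent scale (about 16.35 Hz); not stated on the slide.
REF_HZ = 440.0 * 2.0 ** (3.0 / 12.0 - 5.0)


def hz_to_cents(f_hz):
    """Convert a frequency in Hz to cents above REF_HZ."""
    return 1200.0 * math.log2(f_hz / REF_HZ)


def tone_model(x_cents, f0_cents, n_harmonics=4, sigma=20.0):
    """Hypothetical tone model: one Gaussian per harmonic, weighted by 1/h."""
    norm = sum(1.0 / h for h in range(1, n_harmonics + 1))
    density = 0.0
    for h in range(1, n_harmonics + 1):
        centre = f0_cents + 1200.0 * math.log2(h)
        z = (x_cents - centre) / sigma
        gauss = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
        density += (1.0 / h / norm) * gauss
    return density


def estimate_f0_weights(grid, observed, candidates, n_iter=30):
    """EM for the mixture weights of fixed tone models (ML, no prior)."""
    total = sum(observed)
    p = [v / total for v in observed]  # observed pdf on the grid
    models = [[tone_model(x, f0) for x in grid] for f0 in candidates]
    weights = [1.0 / len(candidates)] * len(candidates)
    for _ in range(n_iter):
        new_weights = [0.0] * len(candidates)
        for i, p_i in enumerate(p):
            denom = sum(w * m[i] for w, m in zip(weights, models))
            if denom <= 0.0:
                continue  # no candidate explains this bin
            for k, m in enumerate(models):
                # E-step responsibility, accumulated for the M-step
                new_weights[k] += p_i * weights[k] * m[i] / denom
        weights = new_weights
    return weights


# Toy data: the observed pdf is a single tone at 4800 cents.
grid = [3000.0 + 10.0 * i for i in range(601)]
candidates = [3600.0, 4200.0, 4800.0, 5400.0, 6000.0]
observed = [tone_model(x, 4800.0) for x in grid]
weights = estimate_f0_weights(grid, observed, candidates)
dominant_f0 = candidates[weights.index(max(weights))]
```

The candidates one octave below and above (3600 and 6000 cents) share harmonics with the true tone, which is why the weights are estimated jointly rather than by picking the strongest spectral peak.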

7 Accompaniment Sound Reduction
- Harmonic structure extraction
  - Extract the frequency and amplitude of the l-th overtone
  - Allow an error of r cents
  - Search for the local maximum amplitude within that range
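A minimal sketch of the overtone search described above, assuming the spectrum is given as parallel lists of bin frequencies and amplitudes. The function name, the default value of r, and the toy spectrum are illustrative choices, not taken from the slides.

```python
def extract_harmonics(freqs_hz, amps, f0_hz, n_harmonics, r_cents=20.0):
    """Return (frequency, amplitude) of the peak near each overtone l * f0_hz."""
    ratio = 2.0 ** (r_cents / 1200.0)  # r cents expressed as a frequency ratio
    harmonics = []
    for l in range(1, n_harmonics + 1):
        lo, hi = l * f0_hz / ratio, l * f0_hz * ratio
        best = None
        for f, a in zip(freqs_hz, amps):
            # keep the largest amplitude inside the allowed band
            if lo <= f <= hi and (best is None or a > best[1]):
                best = (f, a)
        # fall back to the nominal frequency with zero amplitude
        harmonics.append(best if best is not None else (l * f0_hz, 0.0))
    return harmonics


# Toy spectrum with 5 Hz bins: overtones of 220 Hz plus a louder distractor.
freqs = [5.0 * i for i in range(200)]
amps = [0.01] * 200
amps[44] = 1.0    # 220 Hz
amps[89] = 0.5    # 445 Hz, slightly sharp second overtone
amps[132] = 0.3   # 660 Hz
amps[100] = 2.0   # 500 Hz, not part of the harmonic structure
harmonics = extract_harmonics(freqs, amps, 220.0, 3, r_cents=50.0)
```

The distractor at 500 Hz is ignored even though it is the loudest bin, because it lies outside every allowed band.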

8 Accompaniment Sound Reduction
- Re-synthesis
  - Use a sinusoidal model
  - Approximate changes in phase with a quadratic function
  - Approximate changes in amplitude with a linear function
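The re-synthesis step can be sketched as follows, assuming each frame holds one (frequency, amplitude) pair per overtone and that frames are spaced `hop` samples apart. Within a frame the frequency changes linearly, which makes the phase quadratic in time, and the amplitude is interpolated linearly. The function name and data layout are hypothetical.

```python
import math


def resynthesize(frames, hop, sample_rate):
    """Sum sinusoids; frames[k] is a list of (freq_hz, amp) per overtone."""
    n_frames = len(frames)
    n_partials = len(frames[0])
    out = [0.0] * ((n_frames - 1) * hop)
    frame_dur = hop / sample_rate
    for l in range(n_partials):
        phase0 = 0.0  # phase at the start of the current frame
        for k in range(n_frames - 1):
            f_a, a_a = frames[k][l]
            f_b, a_b = frames[k + 1][l]
            for n in range(hop):
                t = n / sample_rate
                # linear amplitude interpolation
                amp = a_a + (a_b - a_a) * n / hop
                # quadratic phase: integral of a linearly changing frequency
                phase = (phase0 + 2.0 * math.pi * f_a * t
                         + math.pi * (f_b - f_a) * t * t / frame_dur)
                out[k * hop + n] += amp * math.sin(phase)
            # carry the phase over so consecutive frames join smoothly
            phase0 += math.pi * (f_a + f_b) * frame_dur
    return out


# A steady 100 Hz partial over three frames at 8 kHz.
steady = resynthesize([[(100.0, 1.0)]] * 3, hop=80, sample_rate=8000)
# A partial that fades in while gliding from 100 Hz to 120 Hz.
glide = resynthesize([[(100.0, 0.0)], [(120.0, 1.0)]], hop=80, sample_rate=8000)
```

Carrying `phase0` from frame to frame keeps the waveform continuous at frame boundaries, which is the main reason for modelling phase rather than restarting each sinusoid.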

9 Accompaniment Sound Reduction
- Evaluation

10 To be continued…

11 Feature Extraction
- LPC-derived mel cepstral coefficients (LPMCCs)
- ∆F0s

12 Reliable Frame Selection
- Feature vectors obtained from regions dominated by accompaniment sounds are unreliable
- Set up a vocal GMM and a nonvocal GMM
- Determine whether a feature vector x is reliable by comparing its likelihoods under the two GMMs against a threshold η
- It is difficult to determine a global η
- Instead, select α% of all the frames in each song as reliable frames
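The per-song selection rule can be sketched as below. It assumes the reliability score is the log-likelihood difference between the vocal and nonvocal GMMs, which the slide implies but does not spell out, and it uses diagonal-covariance GMMs. The function names, the GMM data layout, and the default α are illustrative only.

```python
import math


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of vector x under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples.
    """
    logs = []
    for weight, means, variances in gmm:
        log_p = math.log(weight)
        for x_d, mu, var in zip(x, means, variances):
            log_p += -0.5 * (math.log(2.0 * math.pi * var) + (x_d - mu) ** 2 / var)
        logs.append(log_p)
    peak = max(logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(v - peak) for v in logs))


def select_reliable_frames(frames, vocal_gmm, nonvocal_gmm, alpha=15.0):
    """Keep the alpha% of frames in one song that look most vocal."""
    scores = [
        gmm_log_likelihood(x, vocal_gmm) - gmm_log_likelihood(x, nonvocal_gmm)
        for x in frames
    ]
    n_keep = max(1, int(len(frames) * alpha / 100.0))
    ranked = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])  # frame indices in temporal order


# Toy one-dimensional models: vocal frames cluster near 0, nonvocal near 5.
vocal_gmm = [(1.0, [0.0], [1.0])]
nonvocal_gmm = [(1.0, [5.0], [1.0])]
song = [[0.1], [4.9], [0.2], [5.1], [2.0]]
reliable = select_reliable_frames(song, vocal_gmm, nonvocal_gmm, alpha=40.0)
```

Ranking within each song amounts to choosing a song-dependent threshold, which sidesteps the problem of finding a single global η.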

13 Reliable Frame Selection
- Evaluation

