IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING MARCH 2010 Lan-Ying Yeh 99.04.12.

Introduction
• Music Information Retrieval (MIR)
• Singer identification & Vocal-Timbre Similarity
• Feature extraction
• Influence from other instruments

Related Studies
• Using a statistically based speaker-identification method for speech signals in noisy environments
  - First estimated an accompaniment-only model from interlude sections
  - Then estimated a vocal-only model by subtracting the accompaniment-only model from the vocal-plus-accompaniment model
  - Assumes that the singing voice and the accompaniment sounds are statistically independent, which is not always satisfied
  - The estimation has problems

Related Studies
• Using a vocal separation method
  - Similar to the accompaniment sound reduction method in this paper
  - Did not deal with interlude sections
  - Conducted experiments using only vocal sections

Method Overview

Accompaniment Sound Reduction
• F0 estimation
  - PreFEst (Predominant-F0 Estimation method)
  - The observed power spectrum is expressed on a log-frequency axis in units of cents
  - A band-pass filter designed to cover most melody lines is applied
  - The filtered frequency components are treated as an observed pdf
  - Each observed pdf is assumed to be generated from a weighted mixture of tone models for all possible F0s
  - The weights are estimated with the EM algorithm (MAP estimation) and regarded as the F0's pdf
  - The dominant F0 is tracked over time
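The weight-estimation step above can be sketched roughly as follows. This is a minimal illustration, not the PreFEst implementation: the reference frequency used for the cent conversion, the toy tone model (equal-weight Gaussians at four harmonics), and all function names are assumptions, and the EM update shown is plain maximum likelihood without the prior that MAP estimation would add.

```python
import math

# Assumed reference frequency for the cent axis (about 16.35 Hz), so that 440 Hz maps to 5700 cents.
REF_HZ = 440.0 * 2.0 ** (3.0 / 12.0 - 5.0)


def hz_to_cent(f_hz):
    """Convert a frequency in Hz to cents above the assumed reference."""
    return 1200.0 * math.log2(f_hz / REF_HZ)


def tone_model(x_cent, f0_cent, n_harmonics=4, sigma=20.0):
    """Toy tone model: equal-weight Gaussians at the harmonics of f0 on the cent axis."""
    total = 0.0
    for h in range(1, n_harmonics + 1):
        mu = f0_cent + 1200.0 * math.log2(h)
        z = (x_cent - mu) / sigma
        total += math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return total / n_harmonics


def em_weights(bins_cent, power, candidates_cent, n_iter=50):
    """Estimate mixture weights of fixed tone models by (maximum-likelihood) EM.

    The power spectrum is normalized so it can be treated as an observed pdf.
    """
    total_power = sum(power)
    p = [v / total_power for v in power]
    k_count = len(candidates_cent)
    w = [1.0 / k_count] * k_count
    # Tone-model densities are fixed, so compute them once.
    dens = [[tone_model(x, f) for f in candidates_cent] for x in bins_cent]
    for _ in range(n_iter):
        new_w = [0.0] * k_count
        for px, row in zip(p, dens):
            mix = sum(wk * d for wk, d in zip(w, row))
            if mix <= 0.0:
                continue  # no tone model explains this bin
            for k in range(k_count):
                # E-step responsibility, weighted by the observed pdf (M-step sum)
                new_w[k] += px * w[k] * row[k] / mix
        s = sum(new_w)
        w = [v / s for v in new_w]
    return w
```

The candidate with the largest weight in a frame would be the dominant F0 for that frame; tracking it over time is not shown here.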

Accompaniment Sound Reduction
• Harmonic structure extraction
  - Extract the frequency and amplitude of the l-th overtone
  - Allow an error of r cents
  - Search for the local maximum of amplitude within that area
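A minimal sketch of this peak search, assuming the spectrum is given as parallel lists of bin frequencies and amplitudes and that "r cent error" means a window of ±r cents around l × F0. The function name and the fallback for an empty window are hypothetical.

```python
def extract_harmonics(freqs_hz, amps, f0_hz, n_overtones, r_cent=20.0):
    """For each overtone l = 1..n_overtones, pick the largest-amplitude bin
    within +/- r_cent cents of l * f0_hz. Returns a list of (freq_hz, amp)."""
    ratio = 2.0 ** (r_cent / 1200.0)  # frequency ratio corresponding to r cents
    result = []
    for l in range(1, n_overtones + 1):
        lo = l * f0_hz / ratio
        hi = l * f0_hz * ratio
        best = None
        for f, a in zip(freqs_hz, amps):
            if lo <= f <= hi and (best is None or a > best[1]):
                best = (f, a)
        # Assumed fallback: if no bin falls inside the window, use the nominal
        # overtone frequency with zero amplitude.
        result.append(best if best is not None else (l * f0_hz, 0.0))
    return result
```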

Accompaniment Sound Reduction
• Re-synthesis
  - Modeled with a sinusoidal model
  - A quadratic function approximates changes in phase
  - A linear function approximates changes in amplitude
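One way to read this slide is that, between two analysis frames, each overtone's amplitude is interpolated linearly and its frequency changes linearly, which makes the phase quadratic in time. The sketch below follows that reading; the function name, argument layout, and frame handling are assumptions rather than the authors' code.

```python
import math


def synth_frame(partials_start, partials_end, phases, n_samples, sr):
    """Synthesize one frame as a sum of sinusoids.

    partials_start / partials_end: lists of (freq_hz, amp) per overtone at the
    start and end of the frame. phases: starting phase (radians) per overtone.
    Returns (samples, end_phases) so the next frame can continue smoothly.
    """
    dur = n_samples / sr
    out = [0.0] * n_samples
    end_phases = []
    for (f0, a0), (f1, a1), ph in zip(partials_start, partials_end, phases):
        for n in range(n_samples):
            t = n / sr
            # Linear amplitude interpolation across the frame
            amp = a0 + (a1 - a0) * t / dur
            # Quadratic phase: the integral of a linearly changing frequency
            phase = ph + 2.0 * math.pi * f0 * t + math.pi * (f1 - f0) * t * t / dur
            out[n] += amp * math.sin(phase)
        # Phase reached at t = dur, wrapped to [0, 2*pi)
        end_phases.append((ph + math.pi * (f0 + f1) * dur) % (2.0 * math.pi))
    return out, end_phases
```

Feeding the extracted (frequency, amplitude) pairs of consecutive frames into such a function, and passing `end_phases` on as the next frame's `phases`, would give a signal containing only the melody's harmonic structure.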

Accompaniment Sound Reduction
• Evaluation

• To be continued…

Feature Extraction
• LPC-Derived Mel Cepstral Coefficients (LPMCCs)
• ∆F0s

Reliable Frame Selection
• Feature vectors obtained from regions dominated by accompaniment sounds are unreliable
• Set up a vocal GMM and a nonvocal GMM
• Determine whether a feature vector x is reliable by applying a threshold η to its likelihoods under the two GMMs
• It is difficult to determine a global η
• Instead, α% of all the frames in each song are selected as reliable frames
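The selection rule above might be sketched as follows. The slide's threshold formula is not reproduced here, so the score used (vocal log-likelihood minus nonvocal log-likelihood) is an assumption, as are the diagonal-covariance GMM representation, the choice to keep the highest-scoring frames, and the function names.

```python
import math


def gmm_logpdf(x, gmm):
    """Log-density of x under a diagonal-covariance GMM.

    gmm is a list of (weight, means, variances) tuples, one per component.
    """
    terms = []
    for w, mu, var in gmm:
        ll = math.log(w)
        for xi, m, v in zip(x, mu, var):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        terms.append(ll)
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


def select_reliable(frames, vocal_gmm, nonvocal_gmm, alpha):
    """Return the indices of the alpha% of frames that score highest under the
    assumed criterion log p(x | vocal) - log p(x | nonvocal)."""
    scores = [gmm_logpdf(x, vocal_gmm) - gmm_logpdf(x, nonvocal_gmm) for x in frames]
    n_keep = max(1, math.ceil(len(frames) * alpha / 100.0))
    ranked = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])
```

Keeping a fixed fraction per song amounts to choosing η separately for each song (the score of the last frame kept), which sidesteps the need for a single global threshold.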

Reliable Frame Selection
• Evaluation
