1
IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, MARCH 2010
Lan-Ying Yeh
99.04.12
2
Introduction
Music Information Retrieval (MIR)
Singer identification and vocal-timbre similarity
Feature extraction
Influence from other instruments
3
Related Studies
Using a statistically based speaker-identification method for speech signals in noisy environments
First estimated an accompaniment-only model from interlude sections
Then estimated a vocal-only model by subtracting the accompaniment-only model from the vocal-plus-accompaniment model
Assumes that the singing voice and the accompaniment sounds are statistically independent
This assumption is not always satisfied
The estimation therefore has problems
4
Related Studies
Using a vocal separation method
Similar to their accompaniment sound reduction method
Did not deal with interlude sections
Conducted experiments using only vocal sections
5
Method Overview
6
Accompaniment Sound Reduction
F0 estimation
PreFEst (Predominant-F0 Estimation method)
The observed power spectrum is expressed in units of cents
A band-pass filter designed to cover most melody lines is applied
The filtered frequency components are treated as an observed pdf
Each observed pdf is assumed to be generated from a weighted-mixture model of the possible tone models
The weights are estimated with the EM algorithm (MAP estimation) and regarded as the F0's pdf
The dominant F0 is then tracked
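The weight-estimation step can be illustrated with a small sketch. It is not the PreFEst implementation: the tone model here is a toy set of Gaussians at harmonic positions, the EM update is plain maximum likelihood (the MAP prior mentioned on the slide is omitted), and the reference frequency `REF_HZ`, the function names, and all parameter values are assumptions chosen for illustration.

```python
import math

# Assumed reference frequency for the cent scale (about 16.35 Hz).
REF_HZ = 440.0 * 2.0 ** (3.0 / 12.0 - 5.0)


def hz_to_cents(f_hz, ref_hz=REF_HZ):
    """Convert a frequency in Hz to cents relative to ref_hz."""
    return 1200.0 * math.log2(f_hz / ref_hz)


def tone_model(f0_cent, grid, n_harmonics=4, sigma=20.0):
    """Toy tone model: Gaussians at harmonic positions, normalized over the grid."""
    vals = []
    for x in grid:
        v = 0.0
        for h in range(1, n_harmonics + 1):
            centre = f0_cent + 1200.0 * math.log2(h)
            v += (1.0 / h) * math.exp(-0.5 * ((x - centre) / sigma) ** 2)
        vals.append(v)
    total = sum(vals)
    return [v / total for v in vals]


def em_weights(observed, models, n_iter=50):
    """Estimate mixture weights of fixed tone models by EM (ML, no prior)."""
    k = len(models)
    w = [1.0 / k] * k
    for _ in range(n_iter):
        new_w = [0.0] * k
        for i, p_x in enumerate(observed):
            mix = sum(w[j] * models[j][i] for j in range(k))
            if mix <= 0.0:
                continue  # no model explains this bin; it carries no information
            for j in range(k):
                # Responsibility of model j for bin i, weighted by observed mass.
                new_w[j] += p_x * w[j] * models[j][i] / mix
        s = sum(new_w)
        w = [v / s for v in new_w]
    return w


# Hypothetical demo: three candidate F0s; the observation matches the middle one.
grid = [3000.0 + 10.0 * i for i in range(401)]       # cent axis, 3000..7000
candidates = [4000.0, 4500.0, 5000.0]                # candidate F0s in cents
models = [tone_model(c, grid) for c in candidates]
observed = tone_model(4500.0, grid)
weights = em_weights(observed, models)
dominant_f0 = candidates[max(range(len(weights)), key=lambda j: weights[j])]
```

The estimated `weights` play the role of the F0 pdf for one frame; picking its maximum per frame is the simplest stand-in for the tracking step.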
7
Accompaniment Sound Reduction
Harmonic Structure Extraction
Extract the frequency and amplitude of the l-th overtone
Allow an error of r cents
Search for the local maximum amplitude within that area
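The overtone search can be sketched as follows, assuming the spectrum of one frame is available as parallel lists of bin frequencies and amplitudes. The function name, the default `r_cents=20.0`, and the demo spectrum are hypothetical, and "local maximum" is simplified here to the largest amplitude inside the ±r cent window.

```python
def extract_harmonics(freqs_hz, amps, f0_hz, n_overtones, r_cents=20.0):
    """For each overtone l, pick the strongest bin within +/- r_cents of l * F0.

    Returns a list of (l, frequency_hz, amplitude) tuples.
    """
    ratio = 2.0 ** (r_cents / 1200.0)  # frequency ratio corresponding to r cents
    result = []
    for l in range(1, n_overtones + 1):
        target = l * f0_hz
        lo, hi = target / ratio, target * ratio
        best = None
        for f, a in zip(freqs_hz, amps):
            if lo <= f <= hi and (best is None or a > best[1]):
                best = (f, a)
        if best is not None:
            result.append((l, best[0], best[1]))
    return result


# Hypothetical demo spectrum with 1 Hz resolution and slightly mistuned overtones.
freqs = [float(i) for i in range(1001)]
amps = [0.01] * len(freqs)
amps[200], amps[402], amps[597] = 1.0, 0.5, 0.25
harmonics = extract_harmonics(freqs, amps, f0_hz=200.0, n_overtones=3)
```

Allowing a tolerance in cents rather than in Hz keeps the search window proportional to the overtone frequency, so higher overtones get a wider window in Hz.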
8
Accompaniment Sound Reduction
Re-synthesis
Re-synthesize with a sinusoidal model
A quadratic function approximates changes in phase
A linear function approximates changes in amplitude
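A minimal sketch of this kind of re-synthesis is given below, assuming each frame supplies a (frequency, amplitude) pair per overtone. Interpolating the frequency linearly between frames makes the phase a quadratic function of time, and the amplitude is interpolated linearly. The function names, the frame layout, and the hop and sample-rate values are illustrative assumptions, not the paper's exact formulation.

```python
import math


def synth_segment(f_start, f_end, a_start, a_end, phase0, n_samples, sr):
    """Synthesize one overtone between two frames.

    Phase is quadratic in time (linear frequency glide);
    amplitude is linear in time.
    """
    dur = n_samples / sr
    samples = []
    for n in range(n_samples):
        t = n / sr
        amp = a_start + (a_end - a_start) * t / dur
        phase = phase0 + 2.0 * math.pi * (
            f_start * t + (f_end - f_start) * t * t / (2.0 * dur)
        )
        samples.append(amp * math.sin(phase))
    # Phase reached at the end of the segment, used to keep the next one continuous.
    end_phase = phase0 + 2.0 * math.pi * 0.5 * (f_start + f_end) * dur
    return samples, end_phase


def resynthesize(frames, hop, sr):
    """Sum all overtones over consecutive frame pairs.

    frames: list of frames, each a list of (freq_hz, amplitude) per overtone,
    with the same number of overtones in every frame.
    """
    n_partials = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1))
    phases = [0.0] * n_partials
    for i in range(len(frames) - 1):
        for k in range(n_partials):
            (f0, a0), (f1, a1) = frames[i][k], frames[i + 1][k]
            seg, phases[k] = synth_segment(f0, f1, a0, a1, phases[k], hop, sr)
            for n, s in enumerate(seg):
                out[i * hop + n] += s
    return out


# Hypothetical demo: a single steady 100 Hz overtone at 8 kHz sampling rate.
steady = resynthesize([[(100.0, 1.0)], [(100.0, 1.0)]], hop=80, sr=8000)
```

With constant frequency and amplitude the output reduces to a plain sine wave, which is a convenient sanity check before feeding in time-varying overtone tracks.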
9
Accompaniment Sound Reduction
Evaluation
10
To be continued…
11
Feature Extraction
LPC-Derived Mel Cepstral Coefficients (LPMCCs)
∆F0s
12
Reliable Frame Selection
Feature vectors obtained from accompaniment-sound regions are unreliable
Set up a vocal GMM and a nonvocal GMM
Determine whether a feature vector x is reliable by comparing it against a threshold η
A global η is difficult to determine
Instead, α% of all the frames in each song are selected as reliable frames
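The selection step can be sketched as below. The slide does not show the decision formula, so this sketch assumes the score is the log-likelihood ratio between the vocal and nonvocal GMMs; instead of comparing it with a fixed η, it keeps the α% highest-scoring frames of a song. The diagonal-covariance GMM representation, the function names, and the demo values are all hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_loglik(x, gmm):
    """Log-likelihood of x under a GMM given as a list of (weight, mean, var)."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    mx = max(terms)
    # Log-sum-exp for numerical stability.
    return mx + math.log(sum(math.exp(t - mx) for t in terms))


def select_reliable(frames, vocal_gmm, nonvocal_gmm, alpha):
    """Return the sorted indices of the alpha% of frames that look most vocal."""
    scores = [
        gmm_loglik(x, vocal_gmm) - gmm_loglik(x, nonvocal_gmm) for x in frames
    ]
    n_keep = max(1, round(len(frames) * alpha / 100.0))
    order = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    return sorted(order[:n_keep])


# Hypothetical demo with one-dimensional features and single-component GMMs.
vocal_gmm = [(1.0, [1.0], [1.0])]
nonvocal_gmm = [(1.0, [-1.0], [1.0])]
song_frames = [[-2.0], [0.5], [2.0], [-0.5]]
reliable = select_reliable(song_frames, vocal_gmm, nonvocal_gmm, alpha=50.0)
```

Selecting a fixed percentage per song makes the effective threshold adapt to each recording, which sidesteps the difficulty of choosing one global η.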
13
Reliable Frame Selection
Evaluation