1
Finding a single voice in music Christine Smit April 26, 2007
2
Outline
Introduction
Classification strategies: counting silent frequency bins, pitch cancellation, MFCCs
Trading recall for precision
What worked and what didn’t
3
Introduction What am I doing?
4
What is a ‘single voice’? A single note sounding at a time.
5
Why do this? single voice finder + instrument identifier = instrument sample library
6
What are the data sets?
Training set: 10 one-minute samples
Test set: 10 one-minute samples
25% single voice, 75% multi-voice or silence
Mixture of classical and folk music
7
What characterizes a single voice? [Figure: regions labeled non-solo, solo, non-solo]
8
What characterizes a single voice?
10
Strategies
11
Strategy #1: Silence detection
[Pipeline diagram: music -> find silence -> silence counts -> silent? -> raw classification -> HMM]
Nothing really worked.
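The slides do not say how silent bins were counted, so the following is only a minimal sketch of the idea, assuming that a bin counts as "silent" when its magnitude falls below a fixed fraction of the frame's peak. The function names, the `rel_threshold` value, and the toy sinusoid frames are all assumptions made for illustration. A frame with one note should leave more bins empty than a frame with several notes.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but fine for a short toy frame)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags

def count_silent_bins(frame, rel_threshold=0.01):
    """Count bins whose magnitude is below rel_threshold times the frame's peak magnitude.

    rel_threshold is a hypothetical tuning knob, not a value from the talk.
    """
    mags = magnitude_spectrum(frame)
    peak = max(mags)
    if peak == 0.0:
        return len(mags)  # an all-zero frame is entirely silent
    return sum(1 for m in mags if m < rel_threshold * peak)

# Toy frames: one sinusoid versus a mix of three sinusoids (stand-ins for real audio).
N = 64
single = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
mixed = [math.sin(2 * math.pi * 4 * t / N)
         + math.sin(2 * math.pi * 9 * t / N)
         + math.sin(2 * math.pi * 15 * t / N) for t in range(N)]

single_count = count_silent_bins(single)
mixed_count = count_silent_bins(mixed)
```

On these toy frames the single sinusoid leaves more silent bins than the three-sinusoid mix. Real instrument notes carry many harmonics and noise, which may be part of why this strategy did not work well in practice.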
12
Strategy #2: Pitch Cancellation
[Pipeline diagram: music -> filter pitch -> filtered music -> single voice? -> raw classification -> HMM -> final classification]
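One way to read "filter pitch" is as a comb filter that subtracts the signal delayed by one pitch period: if a single periodic note is sounding, almost nothing is left over. The sketch below assumes that interpretation. The brute-force lag search, the function name, and the toy signals are illustrative choices, not details taken from the talk.

```python
import math

def cancellation_residual(frame, min_lag, max_lag):
    """Return (best_lag, ratio) for the comb filter y[n] = x[n] - x[n - lag].

    best_lag is the lag that leaves the least energy; ratio is that residual
    energy divided by the input energy over the same samples.
    """
    best_lag, best_ratio = None, float("inf")
    for lag in range(min_lag, max_lag + 1):
        residual = sum((frame[n] - frame[n - lag]) ** 2 for n in range(lag, len(frame)))
        energy = sum(frame[n] ** 2 for n in range(lag, len(frame)))
        if energy == 0.0:
            continue  # nothing to cancel in a silent frame
        ratio = residual / energy
        if ratio < best_ratio:
            best_lag, best_ratio = lag, ratio
    return best_lag, best_ratio

# Toy signals: one note with a period of 20 samples (fundamental plus one harmonic),
# and two simultaneous notes with unrelated periods of 20 and 27 samples.
LENGTH = 200
one_note = [math.sin(2 * math.pi * t / 20) + 0.5 * math.sin(2 * math.pi * 2 * t / 20)
            for t in range(LENGTH)]
two_notes = [math.sin(2 * math.pi * t / 20) + math.sin(2 * math.pi * t / 27)
             for t in range(LENGTH)]

solo_lag, solo_ratio = cancellation_residual(one_note, 10, 30)
multi_lag, multi_ratio = cancellation_residual(two_notes, 10, 30)
```

Here the single note cancels almost completely at a lag of 20 samples, while no lag in the search range removes both notes of the mixture. A threshold on the residual ratio would then give the raw per-frame "single voice?" decision.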
13
Strategy #3: MFCCs
[Pipeline diagram: music -> MFCC -> 13 features -> GMM -> likelihood -> HMM -> final classification]
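The GMM stage can be sketched as scoring each feature vector under two mixture models, one trained on single-voice frames and one on everything else, and comparing the log-likelihoods. The sketch below assumes diagonal covariances and uses made-up two-dimensional parameters in place of trained models on 13 MFCCs; feature extraction and training are omitted, and none of the numbers come from the talk.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a Gaussian with diagonal covariance at point x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of x under a diagonal-covariance GMM, via log-sum-exp."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)
    return top + math.log(sum(math.exp(value - top) for value in logs))

def log_likelihood_ratio(x, solo_gmm, other_gmm):
    """Positive values favor the single-voice model, negative values the other model."""
    return gmm_log_likelihood(x, *solo_gmm) - gmm_log_likelihood(x, *other_gmm)

# Hypothetical "trained" models as (weights, means, variances); 2-D for brevity.
solo_gmm = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
other_gmm = ([1.0], [[4.0, 4.0]], [[1.0, 1.0]])

llr_solo_like = log_likelihood_ratio([0.5, 0.5], solo_gmm, other_gmm)
llr_other_like = log_likelihood_ratio([4.0, 4.0], solo_gmm, other_gmm)
```

The per-frame log-likelihood ratio plays the role of the "likelihood" arrow in the diagram: it is the score that a later smoothing stage turns into a final decision.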
14
Trading recall for precision
15
Quick reminder
Precision = out of the stuff we got, how much of it was right? (Are Google’s results relevant?)
Recall = out of all the right stuff, how much did we get? (If I asked Google for the UN, did I get all the UN’s websites?)
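In terms of counts, precision is true positives over everything that was returned, and recall is true positives over everything that should have been returned. A small sketch with made-up frame labels (1 = single voice, 0 = anything else) shows the arithmetic; the label lists are illustrative only.

```python
def precision_recall(predicted, actual):
    """Compute (precision, recall) from parallel lists of 0/1 labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # returned and right
    fp = sum(1 for p, a in pairs if p and not a)    # returned but wrong
    fn = sum(1 for p, a in pairs if not p and a)    # right but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical labels for eight frames.
actual = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0]

precision, recall = precision_recall(predicted, actual)
# Three frames were returned and two were right: precision = 2/3.
# Four frames were truly single voice and two were found: recall = 1/2.
```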
16
Precision is important
If I have a large enough database, I can afford to have relatively low recall. But I want high precision so what I do get is what I want.
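One common way to make this trade is to raise the decision threshold on the classifier's score until precision reaches a target, accepting whatever recall is left. The slides do not say how the trade was actually made, so the sketch below is an assumption: the function name, scores, and labels are hypothetical.

```python
def best_threshold_for_precision(scores, labels, target_precision):
    """Return (threshold, precision, recall) with the highest recall among
    thresholds whose precision meets the target, or None if none does.

    A frame is accepted when its score is >= threshold.
    """
    total_pos = sum(labels)
    best = None
    for thr in sorted(set(scores)):
        picked = [label for score, label in zip(scores, labels) if score >= thr]
        tp = sum(picked)
        precision = tp / len(picked)  # picked is never empty: thr is one of the scores
        recall = tp / total_pos if total_pos else 0.0
        if precision >= target_precision and (best is None or recall > best[2]):
            best = (thr, precision, recall)
    return best

# Hypothetical classifier scores and true labels (1 = single voice).
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

relaxed = best_threshold_for_precision(scores, labels, 0.8)
strict = best_threshold_for_precision(scores, labels, 0.9)
```

On this toy data, asking for 80% precision keeps every true single-voice frame, while asking for 90% precision forces a higher threshold and halves the recall.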
17
Strategy #2: Pitch Cancellation
[Pipeline diagram: music -> filter pitch -> filtered music -> single voice? -> raw classification -> HMM -> final classification]
18
Strategy #3: MFCCs
[Pipeline diagram: music -> MFCC -> 13 features -> GMM -> likelihood -> HMM -> final classification]
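The HMM stage turns noisy per-frame scores into a smoother final classification. The slides give no details of the model, so this is a minimal sketch assuming a two-state HMM (0 = other, 1 = single voice) with a uniform starting distribution and "sticky" transitions, decoded with the Viterbi algorithm. The `stay_prob` value and the toy likelihoods are assumptions.

```python
import math

def viterbi_two_state(frame_loglik, stay_prob=0.9):
    """Most likely 0/1 state sequence for a two-state HMM.

    frame_loglik: list of (loglik_other, loglik_single) pairs, one per frame.
    stay_prob: hypothetical probability of remaining in the same state.
    """
    log_stay = math.log(stay_prob)
    log_switch = math.log(1.0 - stay_prob)
    score = [math.log(0.5) + frame_loglik[0][0], math.log(0.5) + frame_loglik[0][1]]
    back = []  # back[t][s] = best previous state when frame t + 1 is in state s
    for loglik in frame_loglik[1:]:
        new_score, pointers = [], []
        for state in (0, 1):
            cands = [score[prev] + (log_stay if prev == state else log_switch)
                     for prev in (0, 1)]
            prev = 0 if cands[0] >= cands[1] else 1
            pointers.append(prev)
            new_score.append(cands[prev] + loglik[state])
        back.append(pointers)
        score = new_score
    # Trace the best path backwards from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

# Toy frame-wise evidence: the raw decisions would be 1, 1, 0, 1, 1, 0, 0, 0.
favors_single = (math.log(0.2), math.log(0.8))
favors_other = (math.log(0.8), math.log(0.2))
raw = [1, 1, 0, 1, 1, 0, 0, 0]
frames = [favors_single if r else favors_other for r in raw]

smoothed = viterbi_two_state(frames)
```

With these numbers the isolated "other" frame inside the single-voice run is smoothed away, while the sustained run of "other" frames at the end is kept.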
19
Results
20
Strategy #1: Silence detection (just for comparison)
21
Strategy #2: Pitch Cancellation
22
Strategy #3: MFCCs
23
Conclusion
Silence detection really didn’t work out.
MFCCs + GMM is really just as good as pitch cancellation.
At 90% precision, I get about 25% recall.
24
Acknowledgements
Many thanks to Professor Ellis for his assistance on this project.
25
Questions?