Published by Zoe Hines. Modified over 9 years ago.

Progressive Filtering and Its Application for Query-by-Singing/Humming
J.-S. Roger Jang (張智星)
Multimedia Information Retrieval Lab
CS Dept., Tsing Hua Univ., Taiwan
http://www.cs.nthu.edu.tw/~jang
2015/10/22

Recent Publications
- Journals
  - Jiang-Chun Chen, J.-S. Roger Jang, "TRUES: Tone Recognition Using Extended Segments", ACM Transactions on Asian Language Information Processing, 2008.
  - J.-S. Roger Jang and Hong-Ru Lee, "A General Framework of Progressive Filtering and Its Application to Query by Singing/Humming", IEEE Transactions on Audio, Speech, and Language Processing, Vol. 16, No. 2, pp. 350-358, Feb. 2008.
- Conferences
  - Liang-Yu Chen, Chun-Jen Lee, Jyh-Shing Roger Jang, "Minimum Phone Error Discriminative Training for Mandarin Chinese Speaker Adaptation", Proceedings of INTERSPEECH 2008, Brisbane, Australia, Sept. 2008.
  - Chao-Ling Hsu, Jyh-Shing Roger Jang, and Te-Lu Tsai, "Separation of Singing Voice from Music Accompaniment with Unvoiced Sounds Reconstruction for Monaural Recordings", Proceedings of 125th AES Convention, San Francisco, USA, Oct. 2008.
  - Zhi-Sheng Chen, Jia-Min Zen, Jyh-Shing Roger Jang, "Music Annotation and Retrieval System Using Anti-Models", Proceedings of 125th AES Convention, San Francisco, USA, Oct. 2008.

Outline
- Problem definition of QBSH
- Methods for QBSH
- Progressive Filtering
- Conclusions

Introduction to QBSH
- QBSH: Query by Singing/Humming
  - Input: Singing or humming from a microphone
  - Output: A ranked list retrieved from the song database
- Overview
  - First paper: Around 1994
  - Extensive studies since 2001
  - State of the art: QBSH tasks at ISMIR/MIREX

Challenges in QBSH Systems
- Reliable pitch tracking for acoustic input
  - Input from mobile devices
  - Input in a noisy karaoke box
- Song database preparation
  - Audio music vs. MIDI files
- Efficient/effective retrieval
  - Karaoke machine: ~10,000 songs
  - Internet music search engine: ~500,000,000 songs

Goal and Approach
- Goal: To retrieve songs effectively within a given response time, say 5 seconds or so
- Our strategy
  - Multi-stage progressive filtering
  - Data-driven design methodology based on DP (dynamic programming)

Approaches to QBSH
- Pitch Tracking
- Methods for QBSH

A Quick Demo of QBSH
- Demo page of MIR lab:
  - http://mirlab.org/mir_main/demo.htm
- Demo of QBSH
  - http://mirlab.org/Demo/MusicSearch/index.htm

Progressive Filtering
- Multi-stage representation
  - Each stage is a method for QBSH
- [Figure: cascade of stages: stage 1, stage 2, …, stage i, …]
- s_i: survival rate for stage i
- d_i: delay for stage i
- n_{i-1}: no. of input songs to stage i
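
The cascade above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `progressive_filter`, the two toy distance functions, and the rule of never keeping fewer than `top_k` candidates are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple

Pitch = Sequence[float]
Distance = Callable[[Pitch, Pitch], float]  # smaller value = better match


def progressive_filter(query: Pitch,
                       songs: Sequence[Pitch],
                       stages: Sequence[Tuple[Distance, float]],
                       top_k: int = 10) -> List[int]:
    """Run the query through each (distance, survival_rate) stage in order.

    Returns the indices of the top_k surviving songs, ranked by the last stage.
    """
    candidates = list(range(len(songs)))
    for distance, survival_rate in stages:
        # Rank the current candidates with this stage's method.
        ranked = sorted(candidates, key=lambda idx: distance(query, songs[idx]))
        # Keep the best fraction s_i, but never fewer than the final list size
        # (an assumption of this sketch).
        keep = max(top_k, round(len(ranked) * survival_rate))
        candidates = ranked[:keep]
    return candidates[:top_k]


def mean_gap(a: Pitch, b: Pitch) -> float:
    """Toy cheap stage: difference of mean pitch (hypothetical)."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


def l1_gap(a: Pitch, b: Pitch) -> float:
    """Toy expensive stage: point-by-point absolute difference (hypothetical)."""
    return sum(abs(x - y) for x, y in zip(a, b))
```

A cheap stage such as `mean_gap` would typically come first with a small survival rate, so that a costlier stage such as `l1_gap` only sees the survivors.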

Stage Characteristics for Effectiveness
- RS curve for stage i: recog. rate = r_i(s)
- [Figure: recog. rate (%) vs. survival rate s (%), running from (0, 0) to (100, 100); curves for a more effective method, a less effective method, and random guess]
- Example point on the curve: at a survival rate of 10%, the recog. rate is 65% (top-10% recog. rate is 65%)
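
One way to estimate such an RS curve from a design set is sketched below. It assumes, beyond what the slide states, that every design query has a known target song and that the stage's method ranks all songs for that query; `rs_curve` and its parameters are hypothetical names.

```python
import math
from typing import List, Sequence


def rs_curve(true_ranks: Sequence[int],
             n_songs: int,
             survival_rates: Sequence[float]) -> List[float]:
    """Recognition rate r(s) for each survival rate s.

    true_ranks[q] is the 1-based rank of the correct song for design query q
    under this stage's method. A query counts as recognized at survival rate s
    if its correct song is among the top ceil(s * n_songs) candidates.
    """
    curve = []
    for s in survival_rates:
        # Small tolerance so that e.g. 0.1 * 100 does not round up to 11.
        cutoff = math.ceil(s * n_songs - 1e-9)
        hits = sum(1 for rank in true_ranks if rank <= cutoff)
        curve.append(hits / len(true_ranks))
    return curve
```

With this definition r(1) is always 1, and a method that ranks at random yields roughly r(s) = s, which matches the diagonal "random guess" line in the figure.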

Stage Characteristics for Efficiency
- TS curve for stage i: average time = t_i(s)
- [Figure: average time (ms) vs. survival rate (%), with axis corners labeled (0, 0) and (100, 0); curves for a less efficient method and a more efficient method]
- Example point on the curve: when s = 10%, the average one-to-one comparison time is 5 ms
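
To see how per-comparison times add up over a cascade, here is a small worked sketch. It assumes that stage i costs n_{i-1} * t_i(s_i) + d_i, with n_{i-1} input songs, a per-comparison time t_i(s_i), and a fixed delay d_i; that cost model and the name `cascade_time` are assumptions of the example, not taken from the slide.

```python
from typing import Sequence, Tuple


def cascade_time(n_songs: int,
                 stages: Sequence[Tuple[float, float, float]]) -> float:
    """Total response time in seconds.

    Each stage is (survival_rate, seconds_per_comparison, delay_seconds).
    """
    total = 0.0
    remaining = n_songs
    for survival_rate, seconds_per_comparison, delay in stages:
        # Every remaining song is compared once with the query.
        total += remaining * seconds_per_comparison + delay
        # Only the surviving fraction is passed to the next stage.
        remaining = max(1, round(remaining * survival_rate))
    return total
```

For example, 10,000 songs through a 0.1 ms stage that keeps 10%, followed by a 5 ms stage with a 0.5 s delay, costs 1.0 + 5.0 + 0.5 = 6.5 seconds under this model.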
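
Formulation as an Optimization Problem
- Maximize: [objective formula not preserved in this transcript]
- Subject to the constraints: [constraint formulas not preserved in this transcript]
- n (= n_0): size of the song database
- T_max: maximum allowable response time, say, 5 sec.
- 10: the size of the retrieved ranking list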
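
Since the formulas did not survive extraction, the block below gives one plausible reading of the problem in the slide's own symbols. It is a reconstruction under assumptions, not the authors' exact statement: it assumes the overall recognition rate is the product of the per-stage rates r_i(s_i), that stage i costs n_{i-1} t_i(s_i) + d_i, and that m denotes the number of stages.

```latex
\begin{aligned}
\max_{s_1,\dots,s_m}\quad & \prod_{i=1}^{m} r_i(s_i) \\
\text{subject to}\quad
  & \sum_{i=1}^{m} \bigl[\, n_{i-1}\, t_i(s_i) + d_i \,\bigr] \le T_{\max}, \\
  & n \prod_{i=1}^{m} s_i \le 10, \\
  & n_i = n_{i-1}\, s_i, \qquad n_0 = n .
\end{aligned}
```

The first constraint caps the total response time; the second requires the cascade to narrow the database down to the size of the returned ranking list.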

DP-based Approach
- The original optimization task can be cast into DP:
  - Optimum-value function: R_i(s, t) is the optimum recog. rate at stage i, with a cumulative survival rate s and a cumulative computation time t.
  - A recurrence formula for R_i(s, t) can be derived by varying the survival rate of stage i, as follows.

Recurrence formula for R_i(s, t)
- [Formula not preserved in this transcript]
- [Figure: cascade of stages: stage 1, …, stage i-1, stage i, …]
- d_i: delay of stage i

DP-based Approach
- Boundary conditions for R_i(s, t): [formula not preserved in this transcript]
- Optimum recog. rate: [formula not preserved in this transcript]
- We can then backtrack to find the optimum s_1, s_2, …, s_m.
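
The DP idea can be sketched as a forward pass over states (number of surviving songs, elapsed time bin), which plays the role of R_i(s, t). This is an illustration under stated assumptions, not the authors' algorithm: it assumes the overall recognition rate is the product of stage rates, that stage i costs n_{i-1} * t_i(s_i) + d_i, that survival rates come from a small grid, and that time is discretized with a hypothetical `time_step`. Storing the chosen rates with each state replaces explicit backtracking.

```python
import math
from typing import Callable, Dict, Optional, Sequence, Tuple

# One stage: (r_i(s) recognition rate, t_i(s) seconds per comparison, d_i delay in s).
Stage = Tuple[Callable[[float], float], Callable[[float], float], float]
Solution = Tuple[float, Tuple[float, ...]]  # (recognition rate, chosen s_1..s_m)


def optimize_survival_rates(n: int,
                            stages: Sequence[Stage],
                            grid: Sequence[float],
                            t_max: float,
                            list_size: int = 10,
                            time_step: float = 0.01) -> Optional[Solution]:
    """Pick one survival rate per stage to maximize the product of r_i(s_i)."""
    # State (songs left, time bin) -> best (rate so far, survival rates so far).
    states: Dict[Tuple[int, int], Solution] = {(n, 0): (1.0, ())}
    for recog, seconds_per_cmp, delay in stages:
        next_states: Dict[Tuple[int, int], Solution] = {}
        for (n_prev, t_bin), (rate, path) in states.items():
            for s in grid:
                used = t_bin * time_step + n_prev * seconds_per_cmp(s) + delay
                if used > t_max:
                    continue  # violates the response-time constraint
                n_next = max(1, math.ceil(n_prev * s - 1e-9))
                # Elapsed time is rounded up to the next bin (tiny tolerance).
                key = (n_next, math.ceil(used / time_step - 1e-9))
                candidate = (rate * recog(s), path + (s,))
                if key not in next_states or candidate[0] > next_states[key][0]:
                    next_states[key] = candidate
        states = next_states
    # Only cascades that narrow the database down to the list size are feasible.
    feasible = [v for (n_left, _), v in states.items() if n_left <= list_size]
    return max(feasible, key=lambda v: v[0]) if feasible else None
```

Keeping only the best rate per state is safe here because everything that happens in later stages depends only on how many songs remain and how much time has been used.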

Five Stages for Our Study
- We chose 5 stages for the DP-based design method:
  - Range comparison
  - Modified edit distance
  - LS (linear scaling)
  - DTW (dynamic time warping) with down-sampled inputs
  - DTW
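
For readers unfamiliar with the last two stages, here is a textbook DTW distance together with a simple down-sampler. It is a generic sketch: the slides do not specify the authors' DTW variant, local constraints, or down-sampling scheme, so the absolute-difference cost, the three-way step pattern, and the names `dtw_distance` and `downsample` are assumptions.

```python
from typing import List, Sequence


def downsample(pitch: Sequence[float], factor: int) -> List[float]:
    """Keep every factor-th pitch point to make the later comparison cheaper."""
    return list(pitch[::factor])


def dtw_distance(query: Sequence[float], reference: Sequence[float]) -> float:
    """Classic dynamic time warping with both ends anchored.

    cost[i][j] is the cheapest alignment of query[:i] with reference[:j].
    """
    inf = float("inf")
    rows, cols = len(query), len(reference)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            local = abs(query[i - 1] - reference[j - 1])
            # Extend the best of: stretch query, stretch reference, or step both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[rows][cols]
```

Because DTW takes time proportional to the product of the two lengths, down-sampling both inputs by a factor of k cuts the work by roughly k squared, which is why a down-sampled DTW can serve as a cheaper stage ahead of the full one.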

Corpora
- QBSH corpus
  - 2797 8-second recordings (8 kHz, 8 bits) of 48 children's songs, by 118 subjects
  - 500 for the design set, the others for the test set
- Song database
  - 13320 songs
- Comparison mode
  - Anchored beginning

RS Curves
- [Figure only]

TS Curves
- [Figure only]

Optimum RR wrt Response Time
- [Figure only]

Survival Rates wrt Response Time
- [Figure only]

Conclusions & Future Work
- Conclusions
  - Advantages:
    - A scalable meta-method
    - Feasible for optimizing QBSH systems
    - Applicable (?) to other multimedia retrieval systems
  - Disadvantages:
    - Derivation of RS and TS curves is time-consuming
- Future work
  - More effective/efficient method for each stage