Demos for QBSH
J.-S. Roger Jang (張智星)
CSIE Dept., National Taiwan University
Intro. to QBSH
- QBSH: Query by Singing/Humming
- Challenges
  - Robust pitch tracking
  - Key transposition
  - Collection of song databases
  - Efficient comparison
    - Karaoke box: ~10,000 songs
    - Internet: 500M songs, 12M albums
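The key transposition challenge listed above can be illustrated with a minimal sketch. It assumes pitch is represented in semitones and that the key difference is removed by subtracting the mean of each pitch vector; this is one common normalization, not necessarily the one used in MIRACLE. The function names `hz_to_semitone` and `remove_key` are hypothetical.

```python
import math

def hz_to_semitone(freq_hz):
    """Convert a frequency in Hz to a MIDI-style semitone number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)

def remove_key(pitch):
    """Shift a pitch vector (in semitones) so that its mean is zero.

    Assumption: subtracting the mean is used here as a simple way to
    make two renditions in different keys comparable.
    """
    mean = sum(pitch) / len(pitch)
    return [p - mean for p in pitch]

# A reference melody and the same melody sung 5 semitones higher.
reference = [60.0, 62.0, 64.0, 65.0, 67.0]
query = [p + 5.0 for p in reference]

shifted_ref = remove_key(reference)
shifted_query = remove_key(query)

# After normalization the two vectors agree up to rounding error.
max_diff = max(abs(a - b) for a, b in zip(shifted_ref, shifted_query))
print(max_diff)
```

Mean subtraction only works when the query and the reference segment cover the same part of the melody; a practical system may also need to search over several candidate shifts.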
Efficient Retrieval in QBSH
- Methods for efficient retrieval
  - Multi-stage progressive filtering
  - Indexing for different comparison methods
  - Music phrase identification
  - Repeating pattern identification
  - Distributed & parallel computing
- Our focus
  - Parallel computing via GPU
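As a rough illustration of multi-stage progressive filtering, the sketch below ranks all candidates with a cheap distance first, keeps only the best few, and then re-ranks the survivors with a costlier distance. The two distance functions, the stage sizes, and the toy database are all assumptions made for this example, not details from the slides.

```python
def coarse_distance(query, candidate):
    """Cheap stage (hypothetical): compare only every 4th sample."""
    q = query[::4]
    c = candidate[::4]
    n = min(len(q), len(c))
    return sum(abs(q[i] - c[i]) for i in range(n)) / n

def full_distance(query, candidate):
    """Costlier stage (hypothetical): mean absolute difference over all samples."""
    n = min(len(query), len(candidate))
    return sum(abs(query[i] - candidate[i]) for i in range(n)) / n

def progressive_filter(query, database, stages):
    """Apply (distance_fn, keep_count) stages in order, cheapest first.

    Each stage sorts the surviving song names by distance and keeps
    only the keep_count best ones for the next stage.
    """
    survivors = list(database.keys())
    for distance_fn, keep_count in stages:
        survivors.sort(key=lambda name: distance_fn(query, database[name]))
        survivors = survivors[:keep_count]
    return survivors

# Toy database of pitch contours (semitones relative to the first note).
database = {
    "song_a": [0, 2, 4, 5, 7, 5, 4, 2],
    "song_b": [0, 0, 7, 7, 9, 9, 7, 7],
    "song_c": [0, -1, -3, -5, -7, -5, -3, -1],
    "song_d": [0, 2, 4, 5, 7, 7, 5, 4],
}
query = [0, 2, 4, 5, 7, 5, 4, 3]

result = progressive_filter(
    query, database, stages=[(coarse_distance, 2), (full_distance, 1)]
)
print(result)
```

The trade-off is that a correct song discarded by the cheap stage can never be recovered, so the early stages need a generous keep count.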
MIRACLE
- MIRACLE
  - Music Information Retrieval Acoustically via Clustered and paralleL Engines
- Database (~20K songs)
  - MIDI files
  - Solo vocals (<100)
  - Melody extracted from polyphonic music (<100)
- Comparison methods
  - Linear scaling
  - Dynamic time warping
- Top-10 accuracy
  - ~75%
- Platform
  - Single CPU+GPU
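The two comparison methods named above can be sketched in textbook form as follows. This is not MIRACLE's implementation: the range of scaling factors (0.5 to 2.0), the mean-subtraction step, the absolute-difference cost, and all function names are assumptions chosen for illustration.

```python
def resample(pitch, new_len):
    """Linearly interpolate a pitch vector to new_len points."""
    if new_len == 1:
        return [pitch[0]]
    out = []
    for i in range(new_len):
        pos = i * (len(pitch) - 1) / (new_len - 1)
        lo = int(pos)
        hi = min(lo + 1, len(pitch) - 1)
        frac = pos - lo
        out.append(pitch[lo] * (1 - frac) + pitch[hi] * frac)
    return out

def zero_mean(pitch):
    """Subtract the mean so that the comparison ignores the key."""
    mean = sum(pitch) / len(pitch)
    return [p - mean for p in pitch]

def linear_scaling_distance(query, reference, min_factor=0.5, max_factor=2.0):
    """Stretch or compress the query to several lengths and keep the best match
    against the beginning of the reference (mean absolute difference)."""
    shortest = max(2, int(round(len(query) * min_factor)))
    longest = min(len(reference), int(round(len(query) * max_factor)))
    best = float("inf")
    for n in range(shortest, longest + 1):
        q = zero_mean(resample(query, n))
        r = zero_mean(reference[:n])
        best = min(best, sum(abs(a - b) for a, b in zip(q, r)) / n)
    return best

def dtw_distance(query, reference):
    """Classic dynamic time warping with an absolute-difference local cost."""
    n, m = len(query), len(reference)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - reference[j - 1])
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# A query sung faster and 2 semitones higher than the reference.
query = [60.0, 62.0, 64.0, 62.0, 60.0]
reference = [58.0, 59.0, 60.0, 61.0, 62.0, 61.0, 60.0, 59.0, 58.0]
ls = linear_scaling_distance(query, reference)
dtw = dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
print(ls, dtw)
```

Linear scaling costs one pass per candidate length, while DTW fills an n-by-m table, so linear scaling is the cheaper of the two and DTW tolerates non-uniform tempo changes.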
MIRACLE (II)
- References (full list)
  - J.-S. Roger Jang and Ming-Yang Gao, "A Query-by-Singing System based on Dynamic Programming", International Workshop on Intelligent Systems Resolutions (the 8th Bellman Continuum), PP , Hsinchu, Taiwan, Dec
  - Jyh-Shing Roger Jang, Jiang-Chun Chen, Ming-Yang Kao, "MIRACLE: A Music Information Retrieval System with Clustered Computing Engines", International Symposium on Music Information Retrieval (ISMIR), 2001
  - …
  - Chung-Che Wang and Jyh-Shing Roger Jang, "Acceleration of Query by Singing/Humming Systems on GPU: Compare from Anywhere", International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2012
MIRACLE Before Oct
- Client-server distributed computing
- Cloud computing via clustered PCs
[Diagram: clients (PC, PDA/smartphone, cellular) -> master server -> slave servers (clustered servers). Request: pitch vector. Response: search result. Database size: ~12,000]
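One way to picture the master/slave arrangement above is that each slave searches its own part of the database and the master merges the partial rankings. The sketch below runs everything in one process for clarity; the sharding scheme, the distance function, and the names `slave_search` and `master_search` are hypothetical, and a real deployment would exchange the pitch vector and the results over the network.

```python
import heapq

def distance(query, pitch):
    """Hypothetical distance: mean absolute difference over the shared length."""
    n = min(len(query), len(pitch))
    return sum(abs(query[i] - pitch[i]) for i in range(n)) / n

def slave_search(query, shard, k):
    """One slave: score every song in its shard and return its k best
    (distance, song_id) pairs, sorted from best to worst."""
    scored = [(distance(query, pitch), song_id) for song_id, pitch in shard.items()]
    return heapq.nsmallest(k, scored)

def master_search(query, shards, k):
    """Master: send the pitch vector to every slave and merge the
    sorted partial results into a global top-k list."""
    partial_results = [slave_search(query, shard, k) for shard in shards]
    return heapq.nsmallest(k, heapq.merge(*partial_results))

# Toy database split across two slaves.
shards = [
    {"song_1": [60, 62, 64], "song_2": [60, 60, 60]},
    {"song_3": [60, 62, 65], "song_4": [70, 72, 74]},
]
top = master_search([60, 62, 64], shards, k=2)
top_ids = [song_id for _, song_id in top]
print(top_ids)
```

Because each slave returns its own k best candidates, the merged list is guaranteed to contain the global k best.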
Current MIRACLE
- Single server with GPU
  - NVIDIA GeForce GTX 560 Ti, 384 cores (speedup factor = 10)
[Diagram: clients (PC, PDA/smartphone, cellular) -> master server (single server). Request: pitch vector. Response: search result. Database size: ~13,000]
MIRACLE in the Future
- Multi-modal retrieval
  - Singing, humming, speech, audio, tapping…
[Diagram: clients (PC, PDA/smartphone, cellular) -> master server -> slave servers (clustered servers). Request: feature vector. Response: search result]
QBSH for Various Platforms
- PC
  - Web version
- Embedded systems
  - Karaoke machines
- Smartphones
  - iPhone/Android
- Toys
  - 16-bit microcontroller
QBSH Prototype in MATLAB
- To create a QBSH prototype in MATLAB
  - Get familiar with audio processing in MATLAB
    - See audio signal processing
  - Try the programming contests on
    - Pitch tracking
    - QBSH
Run exampleProgram/goDemo.m to test drive the QBSH prototype in MATLAB!
QBSH Demos
- QBSH demos by our lab
  - QBSH on the web: MIRACLE
  - QBSH on toys
- Existing commercial QBSH systems
  - www.midomi.com
  - www.soundhound.com
Returned Results
- Typical results of MIRACLE
Online Karaoke
- Synchronized lyrics
- Calorie consumption
- Real-time score
- Recording
- Live broadcast
- Real-time pitch display
- Automatic key adjustment
Future Work
- Multi-modal music retrieval
  - Query by user's inputs: singing, humming, whistling, speech, tapping, beatboxing
  - Query by exact examples: audio clips
- Speedup schemes
  - Repeating pattern identification, DTW indexing
- Database preparation
  - Polyphonic audio music as database: the ultimate challenge!