1 ACCELERATING QUERY-BY-HUMMING ON GPU
Pascal Ferraro, Pierre Hanna, Laurent Imbert, Thomas Izard. ISMIR 2009.
Presenter: Chung-Che Wang (focus on GPU performance)
2 Outline
- Introduction
- Aligning two pieces of music (omitted)
- Parallel implementation
- Tests and results (recognition rate omitted)
- Conclusions
3 Introduction
- Powerful music-matching methods have a high computational cost
- Up to 160 times faster by using a GPU
- The same program is executed on many data elements in parallel
- New challenges: memory operations and computational resource allocation
4 Parallel Implementation (1/4)
- CUDA can be seen as an extension of C that allows developers to define C functions, called kernels; kernels must be written in C
- The GPU operates as a coprocessor: CUDA threads execute on the GPU, while the rest of the program runs on the CPU
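The paper does not include source code; as a minimal sketch of what defining and launching a CUDA kernel looks like (all names and sizes here are invented for illustration, not taken from the paper):

```cuda
// Minimal sketch: one kernel instance per thread, each thread
// processes one element of the input array.
__global__ void scale(float *data, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // guard against overrun
        data[i] *= factor;
}

int main(void)
{
    const int n = 1 << 20;
    float *d_data;
    cudaMalloc(&d_data, n * sizeof(float));         // device (GPU) allocation
    // ... copy input to d_data with cudaMemcpy ...
    scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);  // launch on the GPU
    cudaDeviceSynchronize();                        // the CPU waits for the GPU
    cudaFree(d_data);
    return 0;
}
```

The `<<<blocks, threads>>>` launch syntax is the "extension of C" the slide refers to: the host (CPU) code calls the kernel, and the runtime spawns one thread per data element on the GPU.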
5 Parallel Implementation (2/4)
- Virtually launches N kernels executing the algorithm in parallel, where N is the number of entries in the database
- The database is usually large, so it must be stored in global memory
- Global memory is not cached on the GPU, so it is extremely important to follow the right access pattern
6 Parallel Implementation (3/4)
To optimize memory accesses:
- Store the database in "note"-major order (instead of "song"-major order)
- Only store the current row of each Smith-Waterman matrix
- Base allocation on the query's fixed size rather than on the variable sizes of the pieces of music
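To make the two layouts concrete, here is a sketch of the indexing (the names and padding scheme are assumptions, not the authors' code). In "note"-major order, note j of every song is stored contiguously, so the N threads that compare the query against N songs read consecutive addresses in the same step, which coalesces into few global-memory transactions:

```cuda
// Hypothetical layout sketch; songs padded to maxNotes.
// "Song"-major: the notes of one song are contiguous.
//     note j of song s  ->  db[s * maxNotes + j]
// "Note"-major: note j of every song is contiguous.
//     note j of song s  ->  db[j * numSongs + s]
__device__ char getNote(const char *db, int numSongs, int s, int j)
{
    // Threads s = 0..N-1 reading the same note index j touch
    // consecutive addresses: a coalesced global-memory access.
    return db[j * numSongs + s];
}
```

With song-major storage, the same step would have each thread read addresses maxNotes bytes apart, serializing the memory transactions.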
7 Parallel Implementation (4/4)
- Each multiprocessor executes: conversion of the query to a note vector, and comparison between the query and each reference
- Each processor stores its intermediate Smith-Waterman matrix (only one row) in its own shared memory space
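A sketch of the per-thread comparison, assuming one thread per database entry, a fixed query length, and a simplified local-alignment scoring scheme (the constants, names, and scoring are assumptions, not the paper's). Keeping only the current row of the Smith-Waterman matrix in shared memory reduces per-thread storage from O(m*n) to O(m), where m is the fixed query length:

```cuda
#define QLEN 64  // fixed query length (assumed)
#define TPB  64  // threads per block (assumed)

__global__ void match(const char *query, const char *db,
                      const int *songLen, int numSongs, float *best)
{
    int s = blockIdx.x * blockDim.x + threadIdx.x;  // one thread per song
    if (s >= numSongs) return;

    // Each thread keeps one Smith-Waterman row in its own slice
    // of shared memory (TPB * (QLEN+1) floats per block).
    __shared__ float rows[TPB][QLEN + 1];
    float *row = rows[threadIdx.x];
    for (int j = 0; j <= QLEN; ++j) row[j] = 0.0f;

    float bestScore = 0.0f;
    for (int i = 1; i <= songLen[s]; ++i) {
        char note = db[(i - 1) * numSongs + s];     // note-major, coalesced
        float diag = 0.0f;                          // H[i-1][0] is 0
        for (int j = 1; j <= QLEN; ++j) {
            float up  = row[j];                     // H[i-1][j]
            float sub = diag + (note == query[j - 1] ? 2.0f : -1.0f);
            float sc  = fmaxf(fmaxf(sub, up - 1.0f),          // match / delete
                              fmaxf(row[j - 1] - 1.0f, 0.0f)); // insert / restart
            diag  = up;                             // becomes H[i-1][j-1] for j+1
            row[j] = sc;
            bestScore = fmaxf(bestScore, sc);       // best local alignment score
        }
    }
    best[s] = bestScore;
}
```

Because the row length is the query's fixed size rather than each song's variable length, every thread allocates the same amount of shared memory, which matches the allocation strategy described on the previous slide.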
8 Tests and Results (1/3)
Query data corpus: MIR-QBSH, used in MIREX 2007/2008; 2797 queries
Databases:
- DB1: 2048 pieces (48 ground truth + 2000 noise files from the Essen Collection)
- DB2: 6030 pieces (48 ground truth + 5982 noise files from the whole Essen Collection)
- DB3: 17433 pieces, a subset of the RISM A/II collection, proposed during MIREX 2005
The ground-truth MIDIs are rather short, while the Essen Collection mainly consists of long data files
9 Tests and Results (2/3) Three different platforms (table omitted)
10 Tests and Results (3/3) Timings of the different algorithms on various GPUs and databases, in mm:ss (table omitted)
11 Conclusions
- Great care must be taken when programming memory operations: a bad allocation strategy can have a significant impact on computation time
- Future work: optimize the pre-processing phase, which currently runs exclusively on the CPU and takes 75-90% of the overall computation time; implement this stage on the GPU using the CUDA CUFFT library
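As a rough illustration of that future direction (not code from the paper): the pre-processing phase relies on spectral analysis of the hummed audio, and CUFFT can batch those transforms on the GPU. A minimal call sequence, with hypothetical frame parameters, might look like:

```cuda
#include <cufft.h>

// Hypothetical sketch: run one real-to-complex FFT per audio frame,
// batched across all frames already resident in device memory.
// frameLen and numFrames are assumptions, not the paper's values.
void fftFrames(cufftReal *d_frames, cufftComplex *d_spectra,
               int frameLen, int numFrames)
{
    cufftHandle plan;
    cufftPlan1d(&plan, frameLen, CUFFT_R2C, numFrames);  // batched plan
    cufftExecR2C(plan, d_frames, d_spectra);             // all frames at once
    cufftDestroy(plan);
}
```

Moving this stage to the GPU would also remove the host-to-device transfer of the converted queries, since the note vectors would be produced where they are consumed.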
12 The End