Published by Lora Flowers. Modified over 9 years ago.
1
A Music Search Engine Built upon Audio-based and Web-based Similarity Measures
P. Knees, T. Pohle, M. Schedl, G. Widmer
SIGIR 2007
2
INTRODUCTION
Basically all existing music search systems use manually assigned, subjective meta-information such as genre or style to index the underlying music collection.
  Explicit manual annotations
  A small set of meta-data
Recent approaches
  Content-based analysis of the audio files
  Collaborative recommendations
  Incorporating information from different sources
3
RELATED WORK
Query-by-example
Query-by-Humming/Singing (QBHS)
Operate on MIDI
Music piece → Meta-data
Cross-media
Semantic ontology
Semantic relations
Crawler on “audio blogs”
Word sense disambiguation
Text surrounding the links to audio files
Last.fm: listening habits & tags
4
PREPROCESSING THE COLLECTION
ID3 tags
  Artist
  Album
  Title
Ignored
  Speech-only pieces (skits in rap)
  Intro / Outro
  Duration below 1 minute
5
WEB-BASED FEATURES
Queries to Google
  1. “artist” music
  2. “artist” “album” music review
  3. “artist” “title” music review -lyrics
For each query, retrieve the 100 top-ranked pages
Remove HTML tags and stop words in 6 languages
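The three query patterns above can be sketched as a small helper. This is only an illustration: the function name `build_queries` and its parameters are hypothetical, and the step that actually sends the queries to a search engine is deliberately left out.

```python
def build_queries(artist, album, title):
    """Return the three query strings from the slide for one track.

    The artist, album and title values are assumed to come from the
    track's ID3 tags. Quoting them asks the search engine for exact
    phrase matches.
    """
    return [
        f'"{artist}" music',
        f'"{artist}" "{album}" music review',
        f'"{artist}" "{title}" music review -lyrics',
    ]
```

Each returned string would then be submitted separately, and the 100 top-ranked pages per query collected for the term extraction step.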
6
WEB-BASED FEATURES (CONT.)
Term list of each music piece
  Remove all terms with $df_{tm} \le 2$
Global term list
  Remove all terms that co-occur with fewer than 0.1% of the music pieces
  Resulting in about 78,000 terms (dimensions)
weight(t, m)
  tf * idf
  N: number of music pieces
  $mpf_t$: music piece frequency of term t
Cosine normalization
  Removes the influence of the length of pages
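A minimal sketch of the weighting and normalization steps follows. The slide only names tf * idf, N and mpf_t, so the logarithmic form (1 + log2 tf) * log2(N / mpf_t) used here is an assumption (a common tf-idf variant), and the function name `tfidf_vectors` and the dictionary layout are hypothetical.

```python
import math


def tfidf_vectors(term_counts):
    """Map {piece: {term: tf}} to {piece: {term: weight}} with unit length.

    Assumed weighting: (1 + log2(tf)) * log2(N / mpf_t), where N is the
    number of music pieces and mpf_t is the number of pieces whose term
    list contains t.
    """
    n_pieces = len(term_counts)

    # Music piece frequency: in how many pieces does each term occur?
    mpf = {}
    for counts in term_counts.values():
        for term, tf in counts.items():
            if tf > 0:
                mpf[term] = mpf.get(term, 0) + 1

    vectors = {}
    for piece, counts in term_counts.items():
        raw = {
            term: (1 + math.log2(tf)) * math.log2(n_pieces / mpf[term])
            for term, tf in counts.items()
            if tf > 0
        }
        # Cosine normalization: scale every vector to Euclidean length 1
        # so that pieces with longer web pages do not dominate.
        norm = math.sqrt(sum(w * w for w in raw.values()))
        vectors[piece] = {
            term: (w / norm if norm > 0 else 0.0) for term, w in raw.items()
        }
    return vectors
```

Note that a term occurring in every piece gets weight 0 under this form, because log2(N / mpf_t) is 0 when mpf_t equals N.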
7
AUDIO-BASED SIMILARITY
MFCCs, Gaussian Mixture Model, KL divergence
Problems
  Hubs: frequently similar to other pieces
  Outliers: never similar to others
  Triangle inequality: not fulfilled
The authors' previous work solves these problems
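To make the KL-divergence step concrete, here is a simplified sketch. It assumes each piece is summarized by a single Gaussian with diagonal covariance over its MFCC frames, because that case has a closed form; the slide names Gaussian Mixture Models, for which the KL divergence has no closed form and must be approximated. The function names and the `(means, variances)` tuple layout are hypothetical.

```python
import math


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two Gaussians with diagonal covariance."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(a, b):
    """Symmetrized KL divergence; a and b are (means, variances) tuples."""
    return kl_diag_gauss(*a, *b) + kl_diag_gauss(*b, *a)
```

The symmetrized value is non-negative and zero for identical models, but it is not a metric, which is one way to see why the triangle inequality listed above can fail.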
8
AUDIO-BASED SIMILARITY (CONT.)
Always similar (hubs)
  $n_{\mathrm{dist}}(A)$ = distance from $A$ to its $n$-th nearest neighbour
  $g(A, P_i) = D_{\mathrm{basic}}(A, P_i) / n_{\mathrm{dist}}(P_i)$, for all $i$
  Sort $g(A, P_i)$ in ascending order and pick the $n$-th value as $f(A)$
  $D_{n\text{-NN norm}}(A, B) = D_{\mathrm{basic}}(A, B) / (f(A) \cdot f(B))$
Never similar (outliers)
  Handled like above
Triangle inequality
  Sort $D_{\mathrm{basic}}(A, P_i)$, for all $i$
  Interpolate $D_{\mathrm{basic}}(A, B)$ into the sorted $D_{\mathrm{basic}}(A, P_i)$
  $D_P(A, B)$ is the rank of $D_{\mathrm{basic}}(A, B)$ among the $D_{\mathrm{basic}}(A, P_i)$
  $D_{pv}(A, B) = D_P(A, B) + D_P(B, A)$
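Both corrections can be sketched on a plain distance matrix. This is an illustration under assumptions: `dist` is a symmetric list of lists holding the basic distances, a piece is excluded from its own neighbour list, all distances between different pieces are positive, and ties in rank are broken arbitrarily. The function names are hypothetical.

```python
def nth_smallest(values, n):
    """Return the n-th smallest value (n is 1-based)."""
    return sorted(values)[n - 1]


def nn_normalize(dist, n):
    """n-NN normalization: D(A, B) / (f(A) * f(B))."""
    size = len(dist)

    def others(a):
        return [i for i in range(size) if i != a]

    # n_dist(A): distance from A to its n-th nearest neighbour.
    n_dist = [nth_smallest([dist[a][i] for i in others(a)], n)
              for a in range(size)]
    # f(A): n-th smallest value of g(A, P_i) = D(A, P_i) / n_dist(P_i).
    f = [nth_smallest([dist[a][i] / n_dist[i] for i in others(a)], n)
         for a in range(size)]
    return [[dist[a][b] / (f[a] * f[b]) for b in range(size)]
            for a in range(size)]


def proximity_verification(dist):
    """D_pv(A, B) = rank of B seen from A + rank of A seen from B."""
    size = len(dist)
    rank = [[0] * size for _ in range(size)]
    for a in range(size):
        order = sorted((i for i in range(size) if i != a),
                       key=lambda i: dist[a][i])
        for position, b in enumerate(order, start=1):
            rank[a][b] = position
    return [[rank[a][b] + rank[b][a] if a != b else 0 for b in range(size)]
            for a in range(size)]
```

Because the proximity-verified value is a sum of two ranks, it is symmetric by construction even when the ranks themselves are not.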
9
DIMENSIONALITY REDUCTION
χ² test
  s: 100 most similar tracks
  d: 100 most dissimilar tracks
  Calculate χ²(t, s)
  The N terms with the highest values are then joined into a global list

Contingency table
         s    d
   t     A    B
  !t     C    D

Resulting dimensionality
  n                -        50      100     150
  dimensionality   78000    4679    6975    8866
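The term selection can be sketched with the standard χ² statistic for a 2x2 contingency table, χ² = N(AD - BC)² / ((A+B)(A+C)(B+D)(C+D)) with N = A+B+C+D. The slide does not spell the formula out, so its use here is an assumption, as are the function names and the representation of each track's web documents as a set of terms.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]].

    a: tracks in s containing t      b: tracks in d containing t
    c: tracks in s without t         d: tracks in d without t
    """
    n = a + b + c + d
    denom = (a + b) * (a + c) * (b + d) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term cannot discriminate
    return n * (a * d - b * c) ** 2 / denom


def top_terms(similar_docs, dissimilar_docs, n_terms):
    """Return the n_terms terms with the highest chi-square value.

    similar_docs and dissimilar_docs are lists of term sets, one per track.
    """
    vocab = set().union(*similar_docs, *dissimilar_docs)
    scores = {}
    for term in vocab:
        a = sum(term in doc for doc in similar_docs)
        b = sum(term in doc for doc in dissimilar_docs)
        c = len(similar_docs) - a
        d = len(dissimilar_docs) - b
        scores[term] = chi_square(a, b, c, d)
    # Sort by descending score; the term itself breaks ties deterministically.
    return sorted(vocab, key=lambda t: (-scores[t], t))[:n_terms]
```

Running `top_terms` once per track and taking the union of the returned terms would give the global list described above.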
10
VECTOR ADAPTATION
Particularly necessary for tracks for which no related information could be retrieved from the web
Perform a simple smoothing
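The slide does not say how the smoothing is done. One possible reading, sketched below, replaces each track's term vector with a weighted average of its own vector and the vectors of its nearest neighbours by audio similarity. The Gaussian-shaped weights, the default of 10 neighbours, the normalization to a weighted average, and all names are assumptions made for illustration.

```python
import math


def smooth_vector(track, vectors, audio_neighbours, n=10):
    """Smooth a track's term vector over its n nearest audio neighbours.

    vectors: {track: [float, ...]} term vectors of equal length.
    audio_neighbours: {track: [track, ...]} neighbours, nearest first.
    """
    ranked = [track] + list(audio_neighbours[track][:n])
    # Assumed weights: the track itself (rank 0) counts most and the
    # influence decays with the neighbour's rank.
    weights = [math.exp(-0.5 * (rank / 2.0) ** 2) for rank in range(len(ranked))]
    total = sum(weights)
    dims = len(vectors[track])
    return [
        sum(w * vectors[t][k] for w, t in zip(weights, ranked)) / total
        for k in range(dims)
    ]
```

With this scheme a track whose web-based vector is all zeros inherits non-zero term weights from pieces that sound similar, which is the case the slide singles out.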
11
QUERYING THE MUSIC SEARCH ENGINE
Google search: original query + “music” -site:last.fm
Take the 10 top-most web pages
Map to vector space
Calculate Euclidean distances
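The query steps can be sketched as below. The sketch assumes the retrieved pages have already been turned into a query vector in the same term space as the tracks; the fetching and vectorization steps are omitted, and the function names are hypothetical.

```python
import math


def build_search_query(user_query):
    """Extend the user's query as on the slide, excluding last.fm pages."""
    return f"{user_query} music -site:last.fm"


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def rank_tracks(query_vector, track_vectors):
    """Return track ids ordered by increasing distance to the query."""
    return sorted(
        track_vectors,
        key=lambda t: (euclidean(query_vector, track_vectors[t]), t),
    )
```

If both the query vector and the track vectors are cosine-normalized, ranking by Euclidean distance gives the same order as ranking by cosine similarity.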
12
AUDIOSCROBBLER GROUND TRUTH
Common approach: genre information
  Several drawbacks
http://www.audioscrobbler.net
  Web services to access Last.fm data
Tag information provided by Last.fm
  Drawbacks
  Using top tags for tracks (227 tags in total)
13
PERFORMANCE EVALUATION
Dimensionality reduction
  Passes the significance test
  χ²/50 performs best
  Random permutation
14
PERFORMANCE EVALUATION
Vector adaptation (re-weighting)
  No significance
15
PERFORMANCE EVALUATION
Overall
  Precision after 10 documents
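Precision after 10 documents is the fraction of the first 10 returned tracks that are relevant. A minimal sketch, assuming a track counts as relevant when it carries the queried tag in the ground truth; the function and parameter names are hypothetical.

```python
def precision_at_k(ranked_tracks, relevant_tracks, k=10):
    """Fraction of the first k ranked tracks that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_tracks[:k]
    hits = sum(1 for track in top if track in relevant_tracks)
    # Dividing by k (not len(top)) penalizes result lists shorter than k.
    return hits / k
```

Averaging this value over all tag queries would give a single overall figure for the system.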
16
EXAMPLES
Rock with great riffs
Punk
Relaxing music
24
FUTURE WORK
System overview (diagram)
  12601 tracks, ID3 tag, Google search, Web-based feature, Dimensionality reduction
  Audio similarity, Vector adaptation
  Query, Google search, Vector space, results
Points annotated on the diagram
  Compilation albums, remixes
  Lyrics
  PLSA
  Indexing documents
  Computationally inefficient
  Ground truth?