Presenter: Simon de Leon Date: March 2, 2006 Course: MUMT611


1 Rhythmic Similarity
Presenter: Simon de Leon Date: March 2, 2006 Course: MUMT611

2 Agenda
Introduction
Similarity by spectral centroid
Similarity by beat spectrum
Conclusion
Discussion

3 Introduction
Goal: evaluate the similarity of songs based on tempo and rhythmic signature
  Independent of non-rhythmic features
Applications
  Automatic DJ, searching for similar music
Perceived dimensions of rhythm
  Meter, rapidity, uniformity-variation, simplicity-complexity, etc.
Describe two algorithms that extract approximations of the perceived feature set

4 Introduction
Methods are distinguished by the feature sets extracted as the basis for comparing songs
  Method 1 – loudness, brightness, and mel-frequency cepstral coefficients
  Method 2 – beat spectrum
Neither method has been thoroughly evaluated on a large dataset

5 Similarity by Spectral Centroid
Paulus and Klapuri - ISMIR 2002
Goals
  Extracted feature set depends only on the rhythm, not on the sounds used
  Comparison of feature sets is independent of tempo (time-warp alignment)
4 Steps
  1) Pre-processing
  2) Pattern segmentation
  3) Feature set extraction
  4) Dynamic time warping

6 Algorithm System overview [3]

7 Algorithm
1) Pre-processing
  Assume the rhythmic signature is based on percussive events only
  Use the stochastic element from a sinusoid+noise model
2) Pattern segmentation (on original audio)
  Eight amplitude envelopes of log-spaced subbands
    Half-wave rectification, squaring, lowpass filtering, natural logarithmic compression
  Periodicity detection for each amplitude envelope (autocorrelation-based s(ρ))
  Extract tatum, tactus, and measure length
    Tatum – from max spectral product of s(ρ)
    Tactus/Measure – from a priori likelihood functions (as a function of s(ρ))
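
The envelope and periodicity steps of pattern segmentation can be sketched roughly as follows. This is a minimal illustration for a single subband that is assumed to be already band-filtered; the one-pole lowpass, the `alpha` and `eps` parameters, and all function names are assumptions made for the example, not details given in the slides or in [3].

```python
import math

def amplitude_envelope(band, alpha=0.9, eps=1e-10):
    """Half-wave rectify, square, lowpass, then natural-log compress one subband."""
    env = []
    state = 0.0
    for x in band:
        rectified = max(x, 0.0)                        # half-wave rectification
        power = rectified * rectified                  # squaring
        state = alpha * state + (1.0 - alpha) * power  # one-pole lowpass (assumed filter)
        env.append(math.log(state + eps))              # log compression; eps avoids log(0)
    return env

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    return [sum(centered[n] * centered[n + lag] for n in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def dominant_period(envelope, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation in [min_lag, max_lag]."""
    ac = autocorrelation(envelope, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: ac[lag])

# Usage: a synthetic pulse train with one impulse every 50 samples.
pulses = [1.0 if n % 50 == 0 else 0.0 for n in range(1000)]
period = dominant_period(amplitude_envelope(pulses), 10, 120)
```

A full system would repeat this for each of the eight subbands and combine the resulting periodicity functions; that combination step is not shown here.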

8 Algorithm
3) Feature set extraction
  Input: stochastic component plus tatum, tactus, and measure boundaries
  Loudness extraction
    Log of mean squared energy
  Brightness extraction
    Spectral centroid (best results)
  Mel-frequency cepstral coefficients
  Normalize and store vectors in a 2D feature set matrix (loudness, brightness, MFCC over time)
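
As a rough illustration of the loudness and brightness features, the sketch below computes the log of the mean squared energy and a magnitude-weighted spectral centroid for one frame. The frame-based processing, the magnitude (rather than power) weighting, the `eps` constant, and the function names are assumptions for the example; the exact definitions used in [3] may differ. The naive DFT is only meant for short frames.

```python
import cmath
import math

def loudness(frame, eps=1e-12):
    """Log of the mean squared energy of one frame (eps avoids log(0))."""
    mean_sq = sum(x * x for x in frame) / len(frame)
    return math.log(mean_sq + eps)

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), fine for short frames)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0.0:
        return 0.0                      # silent frame: centroid is undefined, return 0
    bin_hz = sample_rate / len(frame)   # frequency spacing between DFT bins
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total

# Usage: a 64-sample sine that falls exactly on DFT bin 8.
frame = [math.sin(2.0 * math.pi * 8 * t / 64) for t in range(64)]
features = (loudness(frame), spectral_centroid(frame, 6400))
```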

9 Algorithm
4) Dynamic Time Warping
  Compare feature sets by finding the optimal path through a matrix of points representing all possible time alignments between the feature vector sets
  Allows feature sets to differ in length (tempo)
  Time-aligned features are used as the basis of comparison
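
The alignment idea can be sketched with a textbook dynamic time warping recurrence. This is a minimal version assuming a Euclidean local cost and the three standard step directions; the step constraints and cost normalization used in [3] are not given in the slides, and the function names are hypothetical.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal dimension."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(u, v)))

def dtw_distance(a, b):
    """Accumulated cost of the cheapest alignment path between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest cost of aligning the first i vectors of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = euclidean(a[i - 1], b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Usage: the same one-dimensional pattern at two tempos, and a different pattern.
slow = [[0.0], [0.0], [1.0], [1.0], [0.0], [0.0]]
fast = [[0.0], [1.0], [0.0]]
other = [[1.0], [0.0], [1.0]]
same_pattern_cost = dtw_distance(slow, fast)
different_pattern_cost = dtw_distance(slow, other)
```

Because vectors may be repeated along the path, the slow and fast versions align at zero cost even though their lengths differ, which is the tempo independence the method aims for.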

10 Evaluation
Pattern segmentation is a failure
  tactus – 67% accuracy
  measure – 77% accuracy
  phase (cross-correlate pulse train at tempo from tactus with amplitude envelopes) – 50% accuracy
  tatum – N/A
Similarity
  Compared 41 patterns - songo, samba, waltz, shuffle, 8-beat, etc.
  Did not use the pre-processed stochastic signal as input to the feature set extractor
  Best results with spectral centroid

11 Evaluation
Similarity matrix using spectral centroid of 41 rhythms [3]

12 Evaluation
Did not use pre-processing or pattern segmentation
Evaluated over a small corpus of percussion loops without non-percussive events
Of the feature set (loudness, spectral centroid, MFCC), only spectral centroid was used

13 Similarity by beat spectrum
Foote, Cooper, Nam – ISMIR 2002
Goals
  Rank songs by rhythmic similarity and tempo (no time warping)
4 steps
  1) Audio parameterization
  2) Similarity matrix
  3) Beat spectrum extraction from similarity matrix
  4) Compare beat spectra

14 Algorithm
1) Audio parameterization
  STFT, 256-point FFTs with 128-point overlap
2) Similarity matrix
  Normalized Euclidean distance between all pairwise combinations of FFT vectors
  Indices correspond to time frames
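
A minimal sketch of the framing and pairwise-distance steps is given below. The slides do not say how the Euclidean distance is normalized; scaling each vector to unit length first is an assumption made here, as are the function names. The FFT of each frame is omitted, so `distance_matrix` accepts any list of per-frame vectors.

```python
import math

def frame_signal(signal, size=256, hop=128):
    """Split a signal into frames of `size` samples; hop=128 gives a 128-sample overlap."""
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, hop)]

def unit_norm(v):
    """Scale a vector to unit length (assumed meaning of 'normalized')."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0.0 else list(v)

def distance_matrix(vectors):
    """D[i][j] = Euclidean distance between unit-normalized vectors i and j."""
    unit = [unit_norm(v) for v in vectors]
    n = len(unit)
    return [[math.sqrt(sum((p - q) ** 2 for p, q in zip(unit[i], unit[j])))
             for j in range(n)]
            for i in range(n)]

# Usage: three toy "spectra"; the first two differ only in overall level.
D = distance_matrix([[1.0, 0.0], [2.0, 0.0], [0.0, 3.0]])
```

Row and column indices of `D` correspond to frame (time) indices, so the matrix is symmetric with zeros on the main diagonal.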

15 Algorithm
Similarity matrix D(i,j) [1]

16 Algorithm
3) Beat spectrum extraction from similarity matrix
  S(i,j) = D(i,j)
  Autocorrelation finds periodicity in the similarity between FFT vectors, i.e. the periodicity of similar spectra
  Function of the time lag between similarity measurements
  Beat spectrum vectors are used as the feature set to compare songs
  Symmetry of S(i,j) across the diagonal makes the beat spectrum a vector
    One-dimensional correlation amongst diagonals
4) Compare beat spectra
  Normalized Euclidean distance
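
One simple way to obtain a lag-indexed vector from a symmetric similarity matrix is to average each diagonal, sketched below. This diagonal-average form stands in for the autocorrelation described on the slide and is an assumption of this example, as are the unit-length normalization before the Euclidean distance and the function names.

```python
import math

def beat_spectrum(S, max_lag):
    """B[lag] = mean of the lag-th diagonal of the square matrix S (max_lag < len(S))."""
    n = len(S)
    return [sum(S[k][k + lag] for k in range(n - lag)) / (n - lag)
            for lag in range(max_lag + 1)]

def normalized_euclidean(u, v):
    """Euclidean distance after scaling both vectors to unit length (assumed normalization)."""
    def unit(w):
        norm = math.sqrt(sum(x * x for x in w))
        return [x / norm for x in w] if norm > 0.0 else list(w)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(unit(u), unit(v))))

# Usage: a toy similarity matrix for a frame sequence that repeats every 4 frames.
labels = [0, 1, 2, 3] * 4
S = [[1.0 if a == b else 0.0 for b in labels] for a in labels]
B = beat_spectrum(S, 8)   # peaks at lags 0, 4 and 8
```

Two songs would then be compared by passing their beat spectrum vectors to `normalized_euclidean`; a smaller value indicates more similar rhythmic periodicity.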

17 Evaluation
Compare the squared Euclidean distance between the same pattern at different tempos (time-stretched) [2]

18 Evaluation
Took beat spectra from several parts of several songs (4 songs, 15 total patterns)
Beat spectra should most closely match other beat spectra from the same song
Found 96.7% accuracy
Alternatively, the 25 lowest FFT coefficients of the beat spectra performed equally well as feature set vectors

19 Conclusion
Difference between methods
  Feature set used for comparison (e.g. spectral centroid vs. beat spectrum)
  Definition of rhythmic and non-rhythmic data (e.g. the first method treats tempo as non-rhythmic data through dynamic time alignment)
Common issues
  Segmenting a rhythmic pattern is difficult; the second method did not even attempt it
  Tradeoff between computational demands and the ability to handle evolving rhythms in songs
  Methods not thoroughly evaluated and compared on large ground-truth datasets (MIREX)

20 References
[1] Foote, J. and S. Uchihashi. The Beat Spectrum: A New Approach to Rhythm Analysis. Proceedings of the International Conference on Multimedia and Expo.
[2] Foote, J., M. Cooper and U. Nam. Audio Retrieval by Rhythmic Similarity. Proceedings of the 3rd International Symposium on Musical Information Retrieval.
[3] Paulus, J., and A. Klapuri. Measuring the Similarity of Rhythmic Patterns. Proceedings of the 3rd International Symposium on Musical Information Retrieval.

