Similarity Matrix Processing for Music Structure Analysis. Yu Shiu, Hong Jeng, C.-C. Jay Kuo. ACM Multimedia 2006.


1 Similarity Matrix Processing for Music Structure Analysis. Yu Shiu, Hong Jeng, C.-C. Jay Kuo. ACM Multimedia 2006

2 System Framework

3 Pitch Class Profile (PCP): The PCP vector is a 12-dimensional vector that gives the relative intensities of the 12 pitch classes {C, C#, D, D#, E, F, F#, G, G#, A, A#, B}. It is normalized to a unit vector.
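A minimal sketch of how a PCP vector could be built, assuming the input is a list of spectral peaks given as (frequency, magnitude) pairs and that magnitudes are simply summed per pitch class. The slide does not say how intensities are accumulated, so `pitch_class`, `pcp_vector` and the A4 = 440 Hz reference are illustrative choices only.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class(freq_hz, ref_a4=440.0):
    """Map a frequency to a pitch-class index 0..11 (0 = C), assuming equal temperament."""
    midi = round(69 + 12 * math.log2(freq_hz / ref_a4))
    return midi % 12

def pcp_vector(peaks):
    """peaks: iterable of (frequency_hz, magnitude) pairs (an assumed input format)."""
    bins = [0.0] * 12
    for freq, mag in peaks:
        bins[pitch_class(freq)] += mag
    # Normalize to a unit vector, as stated on the slide.
    norm = math.sqrt(sum(b * b for b in bins))
    return [b / norm for b in bins] if norm > 0 else bins

# C major triad (C4, E4, G4) with equal magnitudes.
v = pcp_vector([(261.63, 1.0), (329.63, 1.0), (392.00, 1.0)])
active = [PITCH_CLASSES[k] for k, x in enumerate(v) if x > 0]
```

Here `active` lists the pitch classes that received energy, and `v` has unit Euclidean length.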

4 Pitch Class Profile (PCP)

5 Measure-based Similarity Matrix: Previous similarity matrices use a pre-defined window size, which results in a large similarity matrix that makes further processing more expensive. This paper uses the measure as the element of the similarity matrix.

6 Measure-based Similarity Matrix: PCP vector generation. Choose a window size equal to the duration of one half beat. To detect the onset signal, compute the change of spectral content between two adjacent shifting windows that are 20 ms long with 50% overlap.
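The onset signal described above can be sketched as a spectral-flux curve. This is only one plausible reading: the slide says "change of the spectral content" without naming a measure, so the half-wave rectified magnitude difference used here is an assumption, as are the function names. A naive DFT keeps the sketch dependency-free; a real system would use an FFT.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..n/2 (slow, but dependency-free)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def onset_signal(samples, sample_rate, window_ms=20.0):
    """Spectral change between adjacent windows of window_ms with 50% overlap."""
    win = int(sample_rate * window_ms / 1000)
    hop = win // 2  # 50% overlap
    prev, flux = None, []
    for start in range(0, len(samples) - win + 1, hop):
        spec = magnitude_spectrum(samples[start:start + win])
        if prev is not None:
            # Assumed measure: sum of positive magnitude increases (spectral flux).
            flux.append(sum(max(c - p, 0.0) for c, p in zip(spec, prev)))
        prev = spec
    return flux

# Toy signal at a 2 kHz sampling rate: 200 silent samples, then a 200 Hz tone.
rate = 2000
samples = [0.0] * 200 + [math.sin(2 * math.pi * 200 * t / rate) for t in range(200)]
flux = onset_signal(samples, rate)
peak = max(range(len(flux)), key=lambda i: flux[i])
```

With this toy input the flux stays at zero during the silence and peaks at the windows where the tone begins.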

7 Measure-based Similarity Matrix: The autocorrelation function (ACF) of the onset signal is calculated to determine the beat period. Example: at 100 BPM a beat lasts 600 ms, so a half beat is 300 ms, which is longer than the window size commonly used in previous work.
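A small sketch of beat-period estimation from the ACF, under the assumption that the beat period is the lag with the largest autocorrelation inside a plausible search range. The search range, the 10 ms hop (a 20 ms window with 50% overlap) and all function names are illustrative, not taken from the paper.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[t] * c[t + lag] for t in range(n - lag)) for lag in range(max_lag + 1)]

def beat_period_frames(onset, min_lag, max_lag):
    """Pick the lag (in frames) with the largest ACF value in [min_lag, max_lag]."""
    acf = autocorrelation(onset, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])

def half_beat_ms(bpm):
    """Duration of half a beat in milliseconds."""
    return 60000.0 / bpm / 2

HOP_MS = 10.0  # assumed frame hop: 20 ms window with 50% overlap

# Synthetic onset signal with one impulse every 60 frames (600 ms).
onset = [1.0 if t % 60 == 0 else 0.0 for t in range(600)]
period = beat_period_frames(onset, min_lag=30, max_lag=150)
bpm = 60000.0 / (period * HOP_MS)
window_ms = half_beat_ms(bpm)
```

For this synthetic input the estimated tempo is 100 BPM and the resulting PCP window is 300 ms, matching the example on the slide.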

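8 Measure-based Similarity Matrix: A measure is formed by grouping N successive PCP vectors. Since PCP vectors are unit vectors with non-negative components, the similarity s_ij between measures i and j satisfies 0 <= s_ij <= 1. Dynamic time warping (DTW) can be used to enhance the s_ij value.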
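The transcript does not preserve the formula for s_ij. The sketch below assumes s_ij is the average inner product of the N corresponding PCP vectors of two measures, which is consistent with the stated bound 0 <= s_ij <= 1 but is not confirmed by the slides. All names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def measure_similarity(m_i, m_j):
    """Assumed definition: mean inner product of corresponding PCP vectors."""
    return sum(dot(u, v) for u, v in zip(m_i, m_j)) / len(m_i)

def similarity_matrix(measures):
    """Build the measure-based similarity matrix from a list of measures."""
    n = len(measures)
    return [[measure_similarity(measures[i], measures[j]) for j in range(n)]
            for i in range(n)]

def one_hot(k):
    """Toy unit PCP vector with all energy in pitch class k."""
    return [1.0 if i == k else 0.0 for i in range(12)]

C, G = one_hot(0), one_hot(7)
measures = [[C, C], [C, G], [C, C]]  # three measures, N = 2 PCP vectors each
msm = similarity_matrix(measures)
```

In this toy case identical measures score 1.0 and measures that agree on one of two half beats score 0.5.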

9 Dynamic Time Warping
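The slide content for DTW is not preserved in the transcript. As a reminder of the general technique, here is a textbook DTW sketch that aligns the PCP sequences of two measures, using 1 minus the inner product as an assumed local cost. How the paper turns the alignment into an enhanced s_ij is not shown, so this only illustrates why warping helps when chords change at slightly different times.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def dtw_cost(seq_a, seq_b):
    """Classic DTW: minimum cumulative cost over monotonic alignments."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 1.0 - dot(seq_a[i - 1], seq_b[j - 1])  # assumed local cost
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

def one_hot(k):
    return [1.0 if i == k else 0.0 for i in range(12)]

C, G = one_hot(0), one_hot(7)
a = [C, G, G]  # chord changes after the first vector
b = [C, C, G]  # same chords, change happens one vector later

framewise = sum(1.0 - dot(u, v) for u, v in zip(a, b))
warped = dtw_cost(a, b)
```

The frame-by-frame comparison penalizes the timing shift, while the warped alignment absorbs it.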

10 Measure-based Similarity Matrix: After the simplification, a 3-minute song with a tempo of 100 BPM forms a 75 × 75 similarity matrix (300 beats, or 75 measures of four beats each). The measure-based similarity matrix (MSM) reveals chord similarity more than melody similarity.
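The matrix size follows from simple arithmetic, sketched here under the assumption of four beats per measure (the slide's figures imply this but do not state it). `measure_count` is a hypothetical helper.

```python
def measure_count(duration_s, bpm, beats_per_measure=4):
    """Number of whole measures in a song of the given length and tempo."""
    beats = duration_s / 60 * bpm
    return int(beats / beats_per_measure)

n = measure_count(duration_s=180, bpm=100)
matrix_cells = n * n
```

A 3-minute song at 100 BPM gives `n` = 75, so the MSM holds 5,625 entries.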

11 Two MSM Examples: Johnny Cash’s "Hurt" repeatedly uses the chord succession {Am, Am, C, D} in the 1st and 3rd sections and {G, A, F, C} in the 2nd and 4th sections. The Beatles’ "Yesterday" does not have chord successions of short periods; its music form structure is P = {I V V C V C V O}.

12 Detection of Local Similarity: Use a 2D moving window.

13 Detection of Local Similarity: Move the 2D moving window along the diagonal line of the MSM.
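The slides do not say what is computed inside the moving window. The following is a speculative sketch, assuming the window is a square block slid along the main diagonal and that the local chord-succession period is the lag whose off-diagonal has the highest mean similarity inside the block. The window size, `max_period` and every function name are assumptions.

```python
def window_at(msm, start, w):
    """Extract the w x w block whose top-left corner sits on the diagonal."""
    return [row[start:start + w] for row in msm[start:start + w]]

def best_period(block, max_period):
    """Lag p (1..max_period) with the highest mean similarity on the p-th off-diagonal."""
    w = len(block)
    def lag_mean(p):
        vals = [block[i][i + p] for i in range(w - p)]
        return sum(vals) / len(vals)
    return max(range(1, max_period + 1), key=lag_mean)

def local_periods(msm, w, max_period):
    """Slide the window along the diagonal and estimate one period per position."""
    return [best_period(window_at(msm, s, w), max_period)
            for s in range(len(msm) - w + 1)]

# Toy MSM for a chord succession repeating every 4 measures.
labels = list("ABCD" * 4)
msm = [[1.0 if a == b else 0.2 for b in labels] for a in labels]
periods = local_periods(msm, w=8, max_period=4)
```

For the toy matrix every window position reports a period of 4 measures.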

14 Detection of Long Range Similarity: The Viterbi algorithm is used to find segments with consecutive large similarity values along the 45-degree direction. The output of the second module, which provides the chord succession similarity, can be exploited to enhance the long range similarity detection.

15 Detection of Long Range Similarity: Interpret the x-axis as the “time” and the y-axis as the “state”.

16 Detection of Long Range Similarity: Use “scores” instead of “probabilities”. The score of a path is defined as the product of the similarity values of all states and the scores of all state transitions.

17 Detection of Long Range Similarity: P_T0 > P_T1 guarantees the preference along the 45-degree direction. The larger the ratio P_T0/P_T1, the more the path favors the 45-degree direction. In our experiment, the ratio P_T0/P_T1 is chosen to be 1.5.
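A minimal Viterbi sketch of the score-based path search described on slides 15 to 17. Several details are assumptions: P_T0 is taken to be the score of a diagonal move (state j-1 to j) and P_T1 the score of the other allowed moves, which are assumed here to be "stay" (j to j) and "skip" (j-2 to j). Only the ratio 1.5 comes from the slides, so the absolute values 1.5 and 1.0 are illustrative.

```python
P_T0 = 1.5  # assumed score of a move along the 45-degree direction
P_T1 = 1.0  # assumed score of any other allowed move (ratio 1.5 as on the slide)

def viterbi_path(s):
    """s[j][t] is the similarity at state j (y-axis) and time t (x-axis).
    Returns the state sequence with the highest product score."""
    n_states, n_times = len(s), len(s[0])
    q = [[0.0] * n_states for _ in range(n_times)]
    back = [[-1] * n_states for _ in range(n_times)]
    for j in range(n_states):
        q[0][j] = s[j][0]
    for t in range(1, n_times):
        for j in range(n_states):
            best_score, best_k = -1.0, -1
            # Assumed predecessors: diagonal, stay, skip.
            for k, p in ((j - 1, P_T0), (j, P_T1), (j - 2, P_T1)):
                if k >= 0 and q[t - 1][k] * p > best_score:
                    best_score, best_k = q[t - 1][k] * p, k
            q[t][j] = best_score * s[j][t]
            back[t][j] = best_k
    # Back-track from the best final state.
    j = max(range(n_states), key=lambda k: q[-1][k])
    path = [j]
    for t in range(n_times - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    path.reverse()
    return path

# Toy matrix: weak background with one strong 45-degree stripe.
s = [[0.1] * 4 for _ in range(6)]
for t in range(4):
    s[2 + t][t] = 0.9
path = viterbi_path(s)
```

On the toy matrix the best path follows the stripe through states 2, 3, 4 and 5. Plain products are used to mirror the slide's definition; a longer matrix might call for log scores to avoid underflow.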

18 Detection of Long Range Similarity: Pruning with chord succession information. Sections with repetitive chord successions of a certain period should be similar to sections of the same period. A period value p is tagged to each measure.

19 Detection of Long Range Similarity

20 Post-processing: We begin with the state j that gives the highest Q(L, j) at time L and perform a back-tracking process. Segments shorter than φ measures are removed; in our implementation, φ = 8. Segments whose mean similarity value is less than a threshold τ are removed, where τ = mean + standard deviation (over all s_ij).
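The two pruning rules can be sketched as follows, assuming a segment is represented as a list of (i, j) cells of the MSM and that the population standard deviation is meant (the slide does not say which variant). `threshold` and `keep_segments` are hypothetical names.

```python
import statistics

PHI = 8  # minimum segment length in measures, as given on the slide

def threshold(msm):
    """tau = mean + standard deviation over all s_ij (population std assumed)."""
    values = [v for row in msm for v in row]
    return statistics.mean(values) + statistics.pstdev(values)

def keep_segments(segments, msm, phi=PHI):
    """Drop segments shorter than phi or with mean similarity below tau."""
    tau = threshold(msm)
    kept = []
    for seg in segments:
        if len(seg) < phi:
            continue
        mean_sim = sum(msm[i][j] for i, j in seg) / len(seg)
        if mean_sim < tau:
            continue
        kept.append(seg)
    return kept

# Toy 20 x 20 MSM: strong main diagonal, one repeated stripe at lag 10.
n = 20
msm = [[1.0 if i == j else (0.9 if abs(i - j) == 10 else 0.1) for j in range(n)]
       for i in range(n)]
long_seg = [(i, i + 10) for i in range(10)]   # long and similar: kept
short_seg = [(i, i + 10) for i in range(5)]   # shorter than phi: removed
weak_seg = [(i, i + 3) for i in range(10)]    # low similarity: removed
tau = threshold(msm)
kept = keep_segments([long_seg, short_seg, weak_seg], msm)
```

Only the long, high-similarity segment survives both filters in this toy case.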

21 Post-processing: A segment should be divided if its two corresponding sections in the song overlap with each other, or if there is a significant difference between similarity values before and after a certain point in the segment. If there are conflicts between sections, the one with the higher similarity value has priority in keeping its boundaries. For songs in verse-chorus form, similarity values are clustered into two classes; sections with high similarity values are claimed to be the chorus.
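The slide does not name the clustering method. As one possible reading, the sketch below runs a one-dimensional 2-means on per-section similarity values and labels the higher cluster as chorus. The label "verse" for the lower cluster and the function name are assumptions.

```python
def two_class_labels(values, iterations=20):
    """1-D 2-means: the high-similarity cluster is labeled 'chorus'."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high:  # all values identical: nothing to split
            break
        lo, hi = sum(low) / len(low), sum(high) / len(high)
    return ["chorus" if abs(v - hi) < abs(v - lo) else "verse" for v in values]

# Hypothetical mean similarity values of five detected sections.
section_similarity = [0.62, 0.65, 0.91, 0.93, 0.60]
labels = two_class_labels(section_similarity)
```

With these made-up values the third and fourth sections fall in the high cluster.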

22

23 Experiment: A collection of 120 pop, country and rock songs from the 1960s onward. 100 of them are in verse-chorus form and 20 are in AAA or other forms. The audio is mono, sampled at 22,050 Hz with 16 bits per sample.

24 Experimental Results: The pattern extraction of a song is claimed to be correct if all patterns in the song are extracted, without distinguishing between verse and chorus. The accurate detection rate is 112/120 = 93.33%.

25 Experimental Results

