Published by Flora Terry. Modified over 8 years ago.
2
Melody track identification in music symbolic files
David Rizo, Pedro J. Ponce de León, Antonio Pertusa, Carlos Pérez-Sancho, José M. Iñesta
The 19th International FLAIRS Conference, 11-13 May 2006, Melbourne Beach, Florida
GRFIA, Group of Pattern Recognition and Artificial Intelligence, University of Alicante (Spain)
3
Outline
What’s a melody track?
Methodology overview
Features
Random Forest Classifier
Experiments
–Data
–Track classification
–Track selection
Conclusions
4
What’s a melody track?
“The Logical Song”, Supertramp: MP3 format (digital audio)
Sequencer view of the MIDI song score (Apple “Logic Express 7” sequencer), all tracks playing
[Figure labels: tracks, track names]
Digital score = several tracks or instruments, organized to play simultaneously
5
What’s a melody track?
“The Logical Song”, Supertramp
Melody track playing, accompaniment muted
[Figure labels: tracks, mute button]
6
What’s a melody track?
“The Logical Song”, Supertramp
Melody track muted, accompaniment playing
[Figure labels: tracks, mute button]
7
Objectives and state of the art
Objective: automatically select the melody track from digital scores (MIDI, XML, …)
Applications:
–Melody matching when searching digital score databases (e.g. query by humming)
–Building melody thumbnails for automatic collection indexing
Similar works related to this subject:
–Extracting a monophonic melody from a polyphonic score
[Diagram: each of Track 1, Track 2, …, Track N is asked “Is melody?”; one track answers yes, the others no]
8
Methodology overview
Multitrack MIDI file: Track 1, Track 2, …, Track N-1, Track N
Feature vectors, one per track, e.g.:
–AvgPitch = 26, LongestDur = 12.3, …
–AvgPitch = 65, LongestDur = 4.5, …
–AvgPitch = 80, LongestDur = 1.7, …
–AvgPitch = 67, LongestDur = 0.5, …
Random Forest Classifier: probability of being the melody track, e.g. p_1 = 0.3, p_2 = 0.8, …, p_(n-1) = 0.6, p_n = 0.7
Remove accompaniment with a threshold, e.g. 0.5:
–p_i >= threshold: possible melody track
–p_i < threshold: cannot be the melody track
Select melody track: highest p
9
MIDI Track characterization I
Category           Features
Track information  Duration, Number of notes, Occupation rate, Polyphony rate
Pitch              Highest, Lowest, Mean, Standard deviation
Pitch intervals    Number of different intervals, Largest, Smallest, Mean, Mode, Standard deviation
Note durations     Longest, Shortest, Mean, Standard deviation
Syncopation        Number of syncopated notes
Both absolute and normalized values are used: normalized value = (value_i - min) / (max - min), where max and min are taken over all the tracks in a given file
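The normalization above can be sketched in a few lines. This is only an illustration under stated assumptions: the track contents are made up, the descriptor names other than AvgPitch are hypothetical, and mapping a constant descriptor to 0.0 is a choice the slides do not specify.

```python
from statistics import mean, pstdev


def pitch_descriptors(pitches):
    """Absolute pitch descriptors for one track (MIDI note numbers)."""
    return {
        "HighestPitch": max(pitches),   # hypothetical name
        "LowestPitch": min(pitches),    # hypothetical name
        "AvgPitch": mean(pitches),      # name used on the slides
        "StdPitch": pstdev(pitches),    # hypothetical name
    }


def normalize_across_tracks(values):
    """Min-max normalize one descriptor over all tracks of a single file."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: when every track has the same value, map all to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up three-track file: pitches are MIDI note numbers.
tracks = {
    "bass": [36, 38, 41, 43],
    "piano": [60, 64, 67, 72],
    "vocal": [72, 74, 76, 79],
}
avg_pitch = [pitch_descriptors(p)["AvgPitch"] for p in tracks.values()]
norm_avg_pitch = normalize_across_tracks(avg_pitch)
```

The lowest track in the file always maps to 0 and the highest to 1, so the normalized value describes a track relative to the other tracks of the same song rather than on an absolute scale.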
10
MIDI Track characterization II
Some plots of features
… correlated but …
Some of them do not provide any information
11
MIDI Track characterization III
Combining them, a classification is possible
Using decision trees. E.g.: a track seems not to be a melody if:
1st: AvgAbsInterval > 20
2nd: and AvgNormalizedDuration > 0.5
3rd: and AvgPitch [40, 90]
4th: and TrackNumNotes > 2000
[Figure: plots of AvgAbsInterval, AvgNDuration, AvgPitch and TrackNumNotes with the 1st to 4th rule conditions marked; legend “Is melody?” yes / no]
12
Feature selection and random forest I
Decision tree
[Figure: example decision tree with the tests AvgPitch < 10?, TrackNumNotes < 2000? and AvgAbsInterval < 8.2? (yes / no branches); node class proportions shown as Melody / Not Melody: 12% / 88%, 20% / 70%, 11% / 89%, 52% / 48%, 98% / 2%, 23% / 77%; leaves are classified as Melody or as Not Melody]
Data too complex for a simple decision tree
13
Feature selection and random forest II
Random Forest
- collection of K tree predictors
- each node uses a random selection of F features
Learn from examples
Classify a new object (track) from an input vector
- each tree classification = vote
- choose the most voted class
Taken from: “Unsupervised Learning with Random Forest Predictors”, Steve Horvath
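The voting step can be illustrated with a minimal sketch. It does not train any trees: the K = 10 votes below are made up, and reading the share of votes as a melody probability is an assumption, since the slides do not say how p_i is derived from the forest.

```python
from collections import Counter


def forest_vote(tree_predictions):
    """Return the most voted class and the share of trees that voted for it."""
    counts = Counter(tree_predictions)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(tree_predictions)


# Made-up output of K = 10 trees for a single track.
votes = ["Melody"] * 7 + ["NotMelody"] * 3
label, share = forest_vote(votes)
```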
14
Track selection procedure
Each track yields a probability p_i of belonging to the Is-Melody class
Remove the tracks with p_i below the threshold
Select the track with the highest p_i among the rest
Scenarios
–No track has p_i >= threshold: the song has no melody track
–Several tracks share the highest p_i: the song has more than one melody track
–One selected track: select this track as the melody
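The procedure and its three scenarios can be sketched as a single function. The function name and the list-of-indices return value are illustrative choices, not something taken from the slides; the default threshold of 0.5 is the example value used in the methodology overview.

```python
def select_melody_tracks(probs, threshold=0.5):
    """Return the indices of the selected melody track(s).

    An empty list means no track reached the threshold; more than one
    index means several tracks tie for the highest probability.
    """
    candidates = [(i, p) for i, p in enumerate(probs) if p >= threshold]
    if not candidates:
        return []
    best = max(p for _, p in candidates)
    return [i for i, p in candidates if p == best]


# Probabilities from the methodology overview slide.
selected = select_melody_tracks([0.3, 0.8, 0.6, 0.7])
```

With those example probabilities, the first track is discarded by the threshold and the second one (index 1) is selected.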
15
Data
Corpora: MIDI files downloaded from the internet, three music genres
Lack of suitable tagged databases for this task, and too many files to be tagged by hand
Automatic tagging
List of track names from approx. 28,000 MIDI files:
Count  Track name
3984   Melody
2984   guitar
2860   untitled
2735   remixed
1924   basse
1445   midi
1413   melodie
1206   m?lodie
1115   brass
1095   guitare
984    flute
958    chant
936    organ
877    winjammer demo
819    sequenced by
799    vocal
730    trombone
683    choir
659    drum
654    trumpet
595    drums (bb)
583    piano (bb)
581    bass (bb)
564    sax
563    bajo
558    choeurs
537    snare
520    vocals
517    sequenced by:
From the 50 most repeated names: M = {melody, melodie, melodia, vocal, chant, voice, lead voice, voix, lead, lead vocal, canto}
Keep only files with at least one track whose name is in M
A track is tagged as a melody if its name is in M
Manual check on 100 files: 98 correctly tagged, 2% estimated error
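The tagging rule is simple enough to sketch directly. The set below is copied from the slide; lowercasing and stripping the track names before the lookup are assumptions, as the slides do not describe how names were compared.

```python
MELODY_NAMES = {
    "melody", "melodie", "melodia", "vocal", "chant", "voice",
    "lead voice", "voix", "lead", "lead vocal", "canto",
}


def tag_tracks(track_names):
    """Tag each track as melody (True) or not (False) from its name alone."""
    # Assumption: names are compared case-insensitively, ignoring outer spaces.
    return [name.strip().lower() in MELODY_NAMES for name in track_names]


def keep_file(track_names):
    """Keep a file only if at least one of its tracks is tagged as melody."""
    return any(tag_tracks(track_names))


# Made-up track lists for two files.
file_a = ["Melody", "guitar", "basse", "drums (bb)"]
file_b = ["untitled", "piano (bb)", "sax"]
tags_a = tag_tracks(file_a)
```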
16
Data
Corpus ID  Style      No. of files  No. of tracks
CL200      Classical  200           687
JZ200      Jazz       200           769
KR200      Popular    200           1370
CLA        Classical  131           581
JAZ        Jazz       998           4208
KAR        Popular    1360          9256
TOTAL                 3089          16871
Karaoke = “.kar” files of popular modern music
17
Experiments
A) Classify tracks as melody or not-melody, and set up the classifier parameters
B) Establish a threshold
C) Select the melody track among the tracks in a song
18
Track classification
Melody versus non-melody track classification experiment
–Extended WEKA toolkit to read MIDI files and gather the presented features
–10-fold cross validation scheme
Percentage of correctly processed instances (instances = tracks):
Corpus  Random Forest (K=10, F=5)  Bayesian network  K-NN   Multilayer perceptron
CL200   99.2%                      98.8%             96.0%  99.1%
JZ200   96.0%                      96.5%             96.3%  96.8%
KR200   94.8%                      91.8%             88.9%  93.0%
Random Forest (K=10, F=5): good results, and its rules can be analyzed
Use this setup for the other experiments
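As a reminder of what the 10-fold scheme does, here is a minimal index-splitting sketch. It is not WEKA's implementation: WEKA shuffles and stratifies the folds, whereas this version simply deals instances out round-robin. The function name is hypothetical.

```python
def k_fold_indices(n_instances, k=10):
    """Yield (train, test) index lists; every instance is tested exactly once."""
    folds = [list(range(i, n_instances, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# 687 is the number of tracks in the CL200 corpus.
splits = list(k_fold_indices(687, k=10))
```

Setting k to the number of instances turns the same scheme into leave-one-out.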
19
Melody track selection
Threshold value setup
–KR200+JZ200+CL200: 2826 tracks
–p_i in [0, 1] for each track
–Threshold chosen so that classification errors are minimal
Any threshold value in [0.41, 0.59] yielded 14 errors
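One way to run such a threshold search is a plain sweep over a grid, counting errors at each value. The sketch below uses made-up probabilities and labels, a grid step of 0.01, and hypothetical function names; it shows why a whole interval of thresholds can tie for the minimum, not the actual experiment.

```python
def count_errors(probs, is_melody, threshold):
    """Number of tracks whose thresholded prediction differs from its tag."""
    return sum((p >= threshold) != label for p, label in zip(probs, is_melody))


def best_thresholds(probs, is_melody, steps=100):
    """Return the minimum error count and every grid threshold reaching it."""
    grid = [i / steps for i in range(steps + 1)]
    errors = [count_errors(probs, is_melody, t) for t in grid]
    fewest = min(errors)
    return fewest, [t for t, e in zip(grid, errors) if e == fewest]


# Made-up classifier outputs and melody tags for six tracks.
probs = [0.05, 0.30, 0.40, 0.62, 0.80, 0.95]
labels = [False, False, False, True, True, True]
fewest, thresholds = best_thresholds(probs, labels)
```

Every threshold between two neighbouring probabilities gives the same predictions, so the error count is constant over that interval.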
20
Melody track selection
Melody track selection experiment
–Leave-one-out scheme
Style  % Success
CL200  100.0
JZ200  99.0
KR200  86.6
Lower success rate for KR200:
- melody tracks not properly tagged
- heterogeneous track layout
21
Style specificity II
Melody track selection across styles
Training corpora  Test corpus  % Success
KAR+JAZ           CLA          71.7
CLA+KAR           JAZ          90.7
CLA+JAZ           KAR          52.2
Poorer results when there is no training on the same style
22
Training set specificity
Generalization study:
–melody track selection by styles when training with all the data
Training corpora   Test corpus  % Success
CL200+JZ200+KR200  CLA          76.3
CL200+JZ200+KR200  JAZ          95.6
CL200+JZ200+KR200  KAR          79.9
Average success rate, taking into account the cardinalities of the datasets: 86.1%
Problems observed: duplicated tracks and canons; track layout
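The average can be checked with a short calculation. It assumes that "cardinalities" means the number of files per test corpus (131, 998 and 1360, from the corpus table); under that assumption the weighted mean comes out at about 86.0, in line with the reported 86.1% up to rounding of the per-corpus figures.

```python
# Number of files per test corpus (from the corpus table) and success rates.
files = {"CLA": 131, "JAZ": 998, "KAR": 1360}
success = {"CLA": 76.3, "JAZ": 95.6, "KAR": 79.9}

# Weighted mean: each corpus counts in proportion to its number of files.
weighted = sum(files[c] * success[c] for c in files) / sum(files.values())
print(f"{weighted:.1f}%")
```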
23
Conclusions
Method for identifying the melody track
–Promising results on some genres
–Layout (texture), i.e. how the song is sequenced, is very important: this is the case for popular music files
–A large amount of training data is needed to train the system
24
Conclusions
Current and future work
–Manual tagging of corpora: some problems in judging “what is a melody” for different musicians (solos, intros…)
–Characterize a melody from the rules learnt by the random forest
–Melody segmentation among tracks
25
THANK YOU FOR YOUR ATTENTION