Automatic music classification and the importance of instrument identification
  Cory McKay and Ichiro Fujinaga
  Music Technology Area
  Faculty of Music
  McGill University
  Montreal, Canada
Overview
  Examination of the relative importance of different high-level features in automatic music classification
  Performed an experiment involving automatic genre classification of MIDI files
  Found that features based on instrumentation (an abstraction of timbre) were of particular importance
Topics
  Introduction to automatic music classification
  Related research
  Details of experiment performed
    Features used
    Feature weighting
    Taxonomies used
    Classifiers and training data used
  Results
  Conclusions
10 March 2005 CIM Montreal / McKay & Fujinaga 4/25
Introduction to automatic music classification
  There are many ways in which computers can classify music
    Genre
    Composer
    Performer
    Geographical/temporal/cultural origin
    etc.
  Music classification can be difficult for both humans and computers
    Rarely have precise, clear and consistent guidelines delineating the musical characteristics of categories
Applications of automatic music classification
  Discovery of probable authorship of anonymous compositions
  Sociological and psychological research into how humans construct the notion of musical similarity and form musical groupings
  Automatic sorting of large databases
  Music recommendation systems
  Sorting of personal music collections
    e.g. based on mood or listening scenarios
  Automated transcription
  Detection of pirated recordings
Advantages of automatic music classification
  Computers can perform classifications faster and more consistently than humans
  Computers can analyze music in novel and non-intuitive ways that might not occur to humans
  Computers can avoid human preconceptions that might contaminate experimental results
How automatic classification works
  “Feature” extraction
    Properties or characteristics of recordings
    Percepts that classifiers base decisions on
    Can be extracted from audio (e.g. MP3) or symbolic (e.g. MIDI) recordings
    Good features are essential to successful classification
  Classification can be done using
    Expert systems: utilize pre-set heuristics
    Machine learning (AI): supervised or unsupervised learning
Features
  Low-level features
    Signal processing quantities
      e.g. spectral centroid and spectral flux
    Can be effective practically
    Can have psychoacoustic significance
    Have little direct theoretical meaning musicologically or sociologically
  High-level features
    Based on musical abstractions
      e.g. tempo and meter
    Currently difficult or impossible to extract from audio recordings
    Have more theoretical relevance than low-level features
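To make the low-level examples concrete, here is a minimal Python sketch of spectral centroid and spectral flux for short audio frames. It is an illustration only and not part of the system described in this talk: the function names are invented for this example, the DFT is a naive pure-Python one, and flux is taken as the Euclidean distance between consecutive magnitude spectra, which is only one of several definitions in use.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins of one frame (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of the frame, in Hz (0.0 for silence)."""
    mags = magnitude_spectrum(frame)
    total = sum(mags)
    if total == 0:
        return 0.0
    n = len(frame)
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def spectral_flux(prev_frame, frame):
    """Euclidean distance between the magnitude spectra of two consecutive frames."""
    a = magnitude_spectrum(prev_frame)
    b = magnitude_spectrum(frame)
    return math.sqrt(sum((y - x) ** 2 for x, y in zip(a, b)))


# Usage: a 1 kHz sine sampled at 8 kHz falls exactly on one DFT bin of a 64-sample frame,
# so its centroid should come out very close to 1000 Hz.
sample_rate, frame_size = 8000, 64
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(frame_size)]
print(spectral_centroid(tone, sample_rate))
```

Both quantities describe the signal itself and not any musical abstraction, which is why they are easy to compute from audio yet hard to interpret musicologically.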
Overview of this experiment
  Empirical examination of which features are most useful to classifiers
  Used high-level features because of their theoretical significance
  Used test task of genre classification
    A particularly difficult type of classification
    Related to many other types of classification
    Features useful for this task likely to be particularly robust
Related research
  Relatively little work has been done on features that could be useful for arbitrary types of music
    Cantometrics project (Lomax 1968)
    Tagg (1982)
    Cope (1991)
    Aarden and Huron (2001)
      Studied the correlation between musical features and geographical regions
  Automatic genre classification has received considerable attention recently
    Audio classification work of Tzanetakis and Cook (2002) is often cited
    Best results to date with symbolic data have been achieved by McKay and Fujinaga (2004)
Bodhidharma
  Experiments carried out with the Bodhidharma system
  A general-purpose symbolic feature extraction and classification system
    Easy-to-use
    Portable
    Applicable to a wide range of research tasks
Features studied
  111 high-level features implemented:
    Instrumentation
      e.g. whether modern instruments are present
    Musical Texture
      e.g. standard deviation of the average melodic leap of different lines
    Rhythm
      e.g. standard deviation of note durations
    Dynamics
      e.g. average note-to-note change in loudness
    Pitch Statistics
      e.g. fraction of notes in the bass register
    Melody
      e.g. fraction of melodic intervals comprising a tritone
  Largest available set of implemented high-level features
  42 more features have been proposed, but have not been implemented yet
  More information available in Cory McKay’s master’s thesis (2004)
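Three of the example features named above can be sketched in a few lines of Python. This is a rough illustration, not Bodhidharma's implementation: the note representation (MIDI pitch, duration in beats), the function names, and the bass-register ceiling of MIDI pitch 54 are all assumptions made for this example.

```python
from statistics import pstdev

BASS_CEILING = 54  # assumption: MIDI pitches 0-54 are treated as the bass register


def fraction_bass_notes(pitches):
    """Fraction of notes whose MIDI pitch lies in the assumed bass register."""
    if not pitches:
        return 0.0
    return sum(p <= BASS_CEILING for p in pitches) / len(pitches)


def fraction_tritones(pitches):
    """Fraction of successive melodic intervals that span exactly 6 semitones."""
    intervals = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    if not intervals:
        return 0.0
    return sum(i == 6 for i in intervals) / len(intervals)


def duration_std(durations):
    """Population standard deviation of note durations."""
    return pstdev(durations)


# Usage on a hypothetical four-note melody given as (MIDI pitch, duration in beats).
melody = [(60, 1.0), (66, 0.5), (48, 0.5), (54, 1.0)]
pitches = [p for p, _ in melody]
durations = [d for _, d in melody]
print(fraction_bass_notes(pitches))  # 0.5
print(fraction_tritones(pitches))    # two of the three intervals are tritones
print(duration_std(durations))       # 0.25
```

Each function reduces a whole recording to a single number, which is the form a classifier needs.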
Features to use
  An insufficient number of features can fail to provide classifiers with enough information to make good decisions
  Too many features can overwhelm and confuse classifiers
  Can be difficult to predict in advance which features will work well together
    Individual performance of a feature is not necessarily indicative of its performance in combination with other features
Feature weighting
  Feature weighting is a technique for experimentally determining the importance of various features by assigning weights to them
  Used genetic algorithms here
    “Evolve” a good set of weights
  The weights produced by the genetic algorithm provide an indication of the importance of particular features in particular contexts
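The idea of evolving feature weights can be illustrated with a toy genetic algorithm. The sketch below is not the algorithm used in the experiment: the fitness function (leave-one-out accuracy of a weighted 1-nearest-neighbour classifier), the population size, the crossover and mutation operators, and every function name are assumptions chosen to keep the example short.

```python
import random


def weighted_distance(a, b, weights):
    """Squared Euclidean distance with one weight per feature."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))


def fitness(weights, samples):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier under the given weights."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        nearest = min(others, key=lambda s: weighted_distance(features, s[0], weights))
        correct += nearest[1] == label
    return correct / len(samples)


def evolve_weights(samples, n_features, pop_size=20, generations=30, seed=0):
    """Evolve a weight vector in [0, 1]^n_features that maximises fitness()."""
    rng = random.Random(seed)
    population = [[1.0] * n_features]  # include the unweighted baseline
    population += [[rng.random() for _ in range(n_features)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, samples), reverse=True)
        parents = population[: pop_size // 2]  # elitism: the best half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mum, dad)]  # uniform crossover
            gene = rng.randrange(n_features)  # mutate one weight, clamped to [0, 1]
            child[gene] = min(1.0, max(0.0, child[gene] + rng.gauss(0, 0.2)))
            children.append(child)
        population = parents + children
    return max(population, key=lambda w: fitness(w, samples))


# Usage on hypothetical data: feature 0 separates the classes, feature 1 is large-scale noise.
samples = [
    ((0.10, 5.0), "A"), ((0.20, -3.0), "A"), ((0.15, 9.0), "A"),
    ((0.90, 4.5), "B"), ((0.80, -2.5), "B"), ((0.85, 8.0), "B"),
]
best = evolve_weights(samples, n_features=2)
print(best, fitness(best, samples), fitness([1.0, 1.0], samples))
```

Because the best individuals are carried over unchanged, the returned weights never score worse than the unweighted baseline on the training data, and a feature that ends up with a weight near zero is one the classifier can do without in that context.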
Types of classification performed
  The choice of “best” features is context-dependent
    e.g. the best features for distinguishing between Baroque and Romantic differ from those for distinguishing between Punk and Heavy Metal
  Performed three types of classification:
    Flat
    Hierarchical
    Round-robin
  Hierarchical and round-robin feature weighting allowed classifiers to use specialized weightings in order to improve performance
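Round-robin classification is commonly understood as training one classifier per pair of categories and letting the pairwise decisions vote. The slide does not spell out the details, so the following Python sketch rests on that assumption. It uses a simple nearest-centroid rule for each pair, and the optional per-pair weights stand in for the specialized weightings mentioned above; all names and data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def centroid(vectors):
    """Component-wise mean of a list of feature vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]


def sq_dist(a, b, weights):
    """Weighted squared Euclidean distance."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))


def train_round_robin(training, pair_weights=None):
    """training maps category -> feature vectors.
    pair_weights optionally maps a sorted (cat_a, cat_b) pair -> feature weights."""
    pair_weights = pair_weights or {}
    centroids = {cat: centroid(vecs) for cat, vecs in training.items()}
    n_features = len(next(iter(centroids.values())))
    pairs = {
        pair: pair_weights.get(pair, [1.0] * n_features)
        for pair in combinations(sorted(training), 2)
    }
    return centroids, pairs


def classify_round_robin(model, features):
    """Let every pairwise classifier vote and return the category with the most votes."""
    centroids, pairs = model
    votes = Counter()
    for (a, b), weights in pairs.items():
        closer_to_a = sq_dist(features, centroids[a], weights) <= sq_dist(features, centroids[b], weights)
        votes[a if closer_to_a else b] += 1
    return votes.most_common(1)[0][0]


# Usage with hypothetical two-feature training data.
training = {
    "Baroque": [[0.1, 0.2], [0.2, 0.1]],
    "Romantic": [[0.3, 0.8], [0.2, 0.9]],
    "Punk": [[0.9, 0.5], [0.8, 0.6]],
}
model = train_round_robin(training, pair_weights={("Baroque", "Romantic"): [0.0, 1.0]})
print(classify_round_robin(model, [0.2, 0.8]))
```

With n categories this trains n(n-1)/2 pairwise classifiers, each of which can weight features for the one distinction it has to make.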
Taxonomies used
  Used hierarchical taxonomies
    A recording could belong to more than one category
    A category could be a child of multiple parents in the taxonomical hierarchy
  Performed experiments with two taxonomies:
    Large (38 leaf categories): Used to test system under realistic conditions
    Small (9 leaf categories): Used to loosely compare system to existing systems
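A taxonomy in which a category can have several parents is a directed acyclic graph, not a tree. One simple way to represent it, sketched here in Python, is a mapping from each category to its list of parents. The category names below are a hypothetical fragment invented for this example and are not taken from the taxonomies used in the experiment.

```python
# Hypothetical fragment: each category maps to its parents (possibly more than one).
PARENTS = {
    "Jazz": [],
    "Latin": [],
    "Latin Jazz": ["Jazz", "Latin"],
    "Bossa Nova": ["Latin Jazz"],
}


def ancestors(category, parents=PARENTS):
    """All categories reachable by following parent links upwards."""
    found = set()
    stack = list(parents[category])
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(parents[node])
    return found


def roots(category, parents=PARENTS):
    """Top-level categories that a category ultimately belongs to."""
    candidates = ancestors(category, parents) | {category}
    return {c for c in candidates if not parents[c]}


print(sorted(ancestors("Bossa Nova")))
print(sorted(roots("Bossa Nova")))
```

A recording that belongs to more than one category can then simply carry a set of leaf labels.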
Large taxonomy
Small taxonomy
  Jazz
    Bebop
    Jazz Soul
    Swing
  Popular
    Rap
    Punk
    Country
  Western Classical
    Baroque
    Modern Classical
    Romantic
Training and testing
  Used ensembles of k-nearest neighbour and neural network classifiers
  950 MIDI files
    Hand-classified for training based on a variety of on-line databases
  5-fold cross-validation
    80% training, 20% testing
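The 80%/20% figures follow directly from 5-fold cross-validation: the recordings are split into five equal folds, and each fold serves once as the test set while the other four are used for training. A minimal Python sketch of that procedure, paired with a plain k-nearest-neighbour classifier, is given below. It leaves out the neural networks and the ensemble logic, and the function names and toy data are assumptions for illustration.

```python
import random
from collections import Counter


def k_fold_splits(n_items, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def knn_predict(train, features, k=3):
    """Majority label among the k training samples closest to the given features."""
    nearest = sorted(train, key=lambda s: sum((x - y) ** 2 for x, y in zip(s[0], features)))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_validate(samples, folds=5, k=3):
    """Average success rate over all folds."""
    rates = []
    for train_idx, test_idx in k_fold_splits(len(samples), folds):
        train = [samples[i] for i in train_idx]
        hits = sum(knn_predict(train, samples[i][0], k) == samples[i][1] for i in test_idx)
        rates.append(hits / len(test_idx))
    return sum(rates) / len(rates)


# With 950 items, every round trains on 760 and tests on 190.
for train_idx, test_idx in k_fold_splits(950):
    print(len(train_idx), len(test_idx))

# Usage on two well-separated hypothetical clusters.
toy = [((i * 0.1, i * 0.1), "A") for i in range(10)]
toy += [((10 + i * 0.1, 10 + i * 0.1), "B") for i in range(10)]
print(cross_validate(toy))
```

Every recording is tested exactly once, and never by a classifier that saw it during training.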
Average success rates
  9 Category Taxonomy
    Leaf: 90%
    Root: 98%
  38 Category Taxonomy
    Leaf: 57%
    Root: 81%
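Root rates are higher than leaf rates because a confusion between two sibling leaf categories still counts as correct at the root level. The Python sketch below shows that relationship under two assumptions: that "Root" means the top-level category was identified correctly, and that the small taxonomy groups its nine leaves under Jazz, Popular and Western Classical as read from the small taxonomy slide. It also simplifies to one label per recording. The predictions are invented for illustration and are not experimental data.

```python
# Assumed leaf -> root mapping for the small taxonomy.
PARENT = {
    "Bebop": "Jazz", "Jazz Soul": "Jazz", "Swing": "Jazz",
    "Rap": "Popular", "Punk": "Popular", "Country": "Popular",
    "Baroque": "Western Classical", "Modern Classical": "Western Classical",
    "Romantic": "Western Classical",
}


def success_rates(true_leaves, predicted_leaves):
    """Return (leaf rate, root rate) for paired lists of true and predicted leaf labels."""
    pairs = list(zip(true_leaves, predicted_leaves))
    leaf = sum(t == p for t, p in pairs) / len(pairs)
    root = sum(PARENT[t] == PARENT[p] for t, p in pairs) / len(pairs)
    return leaf, root


# Hypothetical predictions: one exact hit, two sibling confusions, one root-level miss.
truth = ["Bebop", "Swing", "Rap", "Baroque"]
predicted = ["Bebop", "Bebop", "Punk", "Rap"]
print(success_rates(truth, predicted))  # (0.25, 0.75)
```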
Success rates achieved in previous research
  Audio results:
    Many systems have been implemented
    Generally only used 10 categories or fewer
    Success rates generally below 80% for more than 5 categories
  Symbolic results:
    84% for 2-way classifications (Shan & Kuo 2003)
    89% for 2-way classifications (Ponce de Leon & Inesta 2004)
    63% for 3-way classifications (Chai & Vercoe 2001)
    60-70% for 6-way classifications (Basili, Serafini & Stellato 2004)
Feature performance
  Feature Group      Number of Features   Weighting Scaled by Number of Features (%)
  Instrumentation
  Pitch
  Rhythm
  Melody
  Texture            14                   1.7
  Dynamics           4                    1.6
  Features based on instrumentation were assigned 46.1% of all weightings (after scaling)
  At least one instrumentation feature played a major role in almost every classifier
  Two of the top three features were based on instrumentation
Importance of instrumentation
  Features based on instrumentation clearly dominant
    A high-level abstraction of timbre
    Implies that audio classification systems could benefit from instrument identification modules
  Caveat:
    These results present the overall averages of weightings
    Other features played a dominant role in certain stages of classification
    The best results were achieved by including a wide variety of features and applying feature weighting
Conclusions
  Features based on instrumentation can play an essential role in automatic music classification, and should be used if possible
  High-level features can produce good results, and should not be neglected in favour of low-level features
  Bodhidharma’s large feature library combined with feature weighting is an effective approach
  Very good genre classification success rates can be achieved with small taxonomies, and we are at least approaching a point where large taxonomies can be dealt with effectively
