THE 2005 MUSIC INFORMATION RETRIEVAL EVALUATION EXCHANGE (MIREX 2005) RESULTS OVERVIEW

J. Stephen Downie, Graduate School of Library and Information Science (GSLIS), University of Illinois at Urbana-Champaign
Kris West, School of Computing Sciences, University of East Anglia
Andreas Ehmann, Electrical Engineering, University of Illinois at Urbana-Champaign
Emmanuel Vincent, Electronic Engineering, Queen Mary University of London

MIREX Contest Results

Audio Melody Extraction
  Rank  Participant                 Overall Accuracy
  1     Dressler, K.                71.4%
  2     Ryynänen & Klapuri          64.3%
  3     Poliner & Ellis             61.1%
  3     Paiva, R.
  5     Marolt, M.                  59.5%
  6     Paiva, R.
  7     Goto, M.                    49.9%
  8     Vincent & Plumbley 1        47.9%
  9     Vincent & Plumbley 2        46.4%
  -     Brossier, P.                DNC

Audio Key Finding
  Rank  Participant                 Composite Score
  1     İzmirli, Ö.                 89.55%
  2     Purwins & Blankertz         89.00%
  3     Gómez, E.
  4     Gómez, E.
  5     Pauws, S.                   85.00%
  6     Zhu, Y.                     83.25%
  7     Chuan & Chew                79.10%

Symbolic Key Finding
  Rank  Participant                 Score
  1     Temperley, D.               91.4%
  2     Zhu, Y.                     85.9%
  3     Rizo & Iñesta               78.5%
  4     Ehmann, A.                  75.7%
  5     Mardirossian & Chew         74.6%

Audio Drum Detection
  Rank  Participant                 F-measure
  1     Yoshii, Goto & Okuno
        Tanghe, Degroeve & De Baets
        Tanghe, Degroeve & De Baets
        Tanghe, Degroeve & De Baets
        Dittmar, C.
        Paulus, J.
        Gillet & Richard
        Gillet & Richard

Audio Tempo Extraction
  Rank  Participant                 P-Score
  1     Alonso, David & Richard
        Uhle, C.
        Uhle, C.
        Gouyon & Dixon
        Peeters, G.
        Gouyon & Dixon
        Gouyon & Dixon
        Eck, D.
        Davies & Brossier
        Gouyon & Dixon
        Sethares, W.
        Brossier, P.
        Tzanetakis, G.

Audio Artist Identification
  Rank  Participant                 Avg Accuracy
  1     Mandel & Ellis              72.45%
  2     Bergstra, Casagrande & Eck
  3     Bergstra, Casagrande & Eck
  4     Pampalk, E.                 61.28%
  5     West & Lamere               47.24%
  6     Tzanetakis, G.              42.05%
  7     Logan, B.                   25.95%

Audio Genre Classification
  Rank  Participant                 Avg Accuracy
  1     Bergstra, Casagrande & Eck
  2     Bergstra, Casagrande & Eck
  3     Mandel & Ellis              78.81%
  4     West, K.                    75.29%
  5     Lidy & Rauber
  6     Pampalk, E.                 75.14%
  7     Lidy & Rauber
  8     Lidy & Rauber
  9     Scaringella, N.             73.11%
  10    Ahrendt, P.                 71.55%
  11    Burred, J.                  62.63%
  12    Soares, V.                  60.98%
  13    Tzanetakis, G.              60.72%
  -     Li, M.                      TO
  -     Kai & Sheng                 DNC

Audio Onset Detection
  Rank  Participant                 Avg F-measure
  1     Lacoste & Eck
  2     Lacoste & Eck
  3     Ricard, J.                  74.80%
  4     Brossier, P.                74.72%
  5     Röbel, A.
  6     Collins, N.                 72.10%
  7     Röbel, A.
  8     Pertusa, Klapuri & Iñesta   58.92%
  9     West, K.                    48.77%

Symbolic Melodic Similarity
  Rank  Participant                              Avg Dynamic Recall
  1     Grachten, Arcos & Mántaras               65.98%
  2     Orio, N.                                 64.96%
  3     Suyoto & Uitdenbogerd                    64.18%
  4     Typke, Wiering & Veltkamp                57.09%
  5     Lemström, Mikkilä, Mäkinen & Ukkonen
  6     Lemström, Mikkilä, Mäkinen & Ukkonen
  7     Frieler & Müllensiefen                   51.81%

Symbolic Genre Classification
  Rank  Participant                        Avg Accuracy
  1     McKay & Fujinaga                   77.17%
  2     Basili, Serafini & Stellato (NB)   72.08%
  3     Li, M.                             67.57%
  4     Basili, Serafini & Stellato (J48)  67.14%
  5     Ponce de León & Iñesta             37.76%

Note: DNC – Did Not Complete, TO – Time Out

Key Issues for Future MIREX
- Establish a communications mechanism specifically devoted to the establishment of standardized and stable evaluation metrics
- Open discussions on the selection of additional statistical significance testing procedures
- Establish new annotation tools and procedures to overcome the shortage of available ground-truth data
- Establish a more formal organizational structure for future MIREX contests
- Convene an online forum to produce a high-level development plan for the future of the M2K Toolkit
- Continue to develop the evaluator software and establish an open-source evaluation API (an illustrative sketch of such an evaluation routine follows this list)
- Make useful evaluation data publicly available year round
- Establish a webservices-based IMIRSEL/M2K online system prototype
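Several of the contests above report precision/recall-based F-measures (e.g., Audio Drum Detection and Audio Onset Detection). As a point of reference only, the following is a minimal sketch of how such a score might be computed from reference and detected event times. It is not the IMIRSEL evaluator code; the function name and the 50 ms matching tolerance are illustrative assumptions, not values taken from the MIREX 2005 task definitions.

    # Illustrative sketch only (not the actual MIREX/IMIRSEL evaluator).
    # Computes precision, recall and F-measure by matching detected event
    # times against reference (ground-truth) times, in the style of the
    # drum- and onset-detection scoring.  The 50 ms tolerance is an
    # assumption chosen for illustration.

    def event_f_measure(reference, detected, tolerance=0.05):
        reference = sorted(reference)
        detected = sorted(detected)
        used = [False] * len(detected)
        hits = 0
        for ref in reference:
            # greedily match each reference event to the closest unused detection
            best_i, best_d = None, tolerance
            for i, det in enumerate(detected):
                if not used[i] and abs(det - ref) <= best_d:
                    best_i, best_d = i, abs(det - ref)
            if best_i is not None:
                used[best_i] = True
                hits += 1
        precision = hits / len(detected) if detected else 0.0
        recall = hits / len(reference) if reference else 0.0
        f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return precision, recall, f

    # Example: one of three reference onsets is missed and one detection is spurious,
    # giving precision = recall = F-measure = 2/3.
    print(event_f_measure([0.50, 1.00, 1.50], [0.51, 1.02, 2.00]))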
MIREX 2005 Challenges
- The continued near impossibility of establishing a common set of evaluation databases, or of sharing databases among researchers, due primarily to intellectual property restrictions and their financial implications
- The ongoing lack of established evaluation metrics for the plethora of tasks in MIR
- The general tendency in the field to evaluate on small databases, with the difficulty of creating ground-truth data being a primary cause of this state of affairs

Special Thanks to: The Andrew W. Mellon Foundation, the National Science Foundation, the content providers and the MIR community, the Automated Learning Group (ALG) at the National Center for Supercomputing Applications at UIUC, Paul Lamere of Sun Labs, X. Hu, J. Futrelle, M. Callahan, M. C. Jones, D. Tcheng, M. McCrory, S. Kim, and J. H. Lee, all of IMIRSEL, the GSLIS technology services team, and the ISMIR 2005 organizing committee.