Automatic Music Classification
Cory McKay

2/47 Introduction
Many areas of research in music information retrieval (MIR) involve using computers to classify music in various ways:
- Genre or style classification
- Mood classification
- Performer or composer identification
- Music recommendation
- Playlist generation
- Hit prediction
- Audio to symbolic transcription
- etc.
Such areas often share similar central procedures.

3/47 Fundamental music classification tasks (1/3)
Musical data collection
- The instances (basic entities) to classify
- Audio recordings, scores, cultural data, etc.
Feature extraction
- Features represent characteristic information about instances
- Must provide sufficient information to segment instances among classes (categories)
Machine learning
- Algorithms (“classifiers” or “learners”) learn to associate feature patterns of instances with their classes
[Diagram: Basic Classification Tasks: Music; Musical Data Collection; Feature Extraction; Machine Learning; Classifications]

4/47 Fundamental music classification tasks (2/3)
Many classification tasks require metadata about instances
- Title, composer, performer, genre, date, etc.
Must be validated and corrected
- Raw information found in ID3 tags, Gracenote CDDB, etc. is often erroneous and inconsistent
[Diagram: Basic Classification Tasks: Music; Musical Data Collection; Metadata Analysis; Feature Extraction; Machine Learning; Classifications]

5/47 Fundamental music classification tasks (3/3)
Supervised learning requires training
- Correctly labeled model instances (“ground truth”) are used to teach classifiers to associate certain feature patterns with desired classes
- Trained classifiers can then classify novel instances
Success of classifiers is dependent on the quality of the ground truth
- It is therefore essential that the metadata labeling of the musical data be accurate
[Diagram: Basic Classification Tasks: Music; Metadata; Musical Data Collection; Metadata Analysis; Feature Extraction; Machine Learning; Classifier Training; Classifications]
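The train-then-classify cycle described on this slide can be sketched with a nearest-centroid classifier. This is only a minimal illustration and not a jMIR algorithm: the function names, the two-dimensional feature vectors and the genre labels are all invented for the example.

```python
from collections import defaultdict
from math import dist


def train_centroids(features, labels):
    """Training step: average the feature vectors of each labeled class."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(centroids, vec):
    """Assign a novel instance to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: dist(centroids[label], vec))


# Hypothetical ground truth: feature values and labels are invented.
ground_truth_features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
ground_truth_labels = ["jazz", "jazz", "classical", "classical"]

model = train_centroids(ground_truth_features, ground_truth_labels)
prediction = classify(model, [0.7, 0.3])  # a novel, unlabeled instance
```

If one of the ground-truth labels were wrong, the corresponding centroid would shift, which is why the slide stresses accurate metadata labeling.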

6/47 Consolidating fundamental tasks
Properly performing these tasks requires significant effort and knowledge in (at least):
- Data mining
- Signal processing
- Musicology
Result:
- Naïve or improperly performed research
- Duplication of effort
- Reluctance to use automatic music classification in musicological or other research where it could be useful
Solution: standardized MIR research software
- Makes automatic music classification technology available to researchers in many disciplines

7/47 Existing MIR software
Only a few MIR software systems have been built for use by other researchers
- e.g. Marsyas and M2K
- Tend to focus primarily on particular sub-tasks
  - e.g. audio feature extraction
- Not typically well integrated with other systems
- Do not sufficiently emphasize extensibility
- Typically have usability problems
  - Installation and licensing issues, poor documentation
Result:
- Emphasis on existing techniques rather than development of new approaches
- Difficulties in integrating research between labs
- Inaccessible to non-technical music researchers

8/47 jMIR
jMIR has been developed to meet the need for standardized MIR research software
- Has a separate software component to address each important aspect of automatic music classification
  - Each component can be used independently
  - Combinations of components can be used as an integrated whole
- Architectural emphasis on providing an extensible platform for iteratively developing new techniques and algorithms
  - Can also be used directly as is
- Interfaces designed for both technical and non-technical users
  - Well-documented
- Free and open source
  - Cross-platform Java implementation

9/47 Musical data collection
[Diagram: Basic Classification Tasks: Music; Metadata; Musical Data Collection; Metadata Analysis; Feature Extraction; Machine Learning; Classifier Training; Classifications]

10/47 Types of musical data
Audio recordings
- Sampled sound
- Wave, MP3, AAC, etc.
Symbolic recordings
- Abstract musical instructions
- Scores, MIDI, Humdrum, etc.
Cultural information
- Information external to musical content itself
  - e.g. playlists, album reviews, Billboard stats, etc.
- Based on web searches, surveys, expert opinion, etc.
[Diagram: Musical Data Collection: Symbolic Recordings (MIDI, scores, Humdrum, etc.); Audio Recordings (MP3, AAC, Wave, etc.); Cultural Information (web, surveys, experts, etc.)]

11/47 Connections between data types
Automatic transcription technologies are increasingly making it possible to automatically generate symbolic recordings from audio
Metadata annotations are necessary for linking cultural information with particular recordings
[Diagram: Musical Data Collection: Symbolic Recordings; Audio Recordings; Cultural Information; Metadata; Transcription]

12/47 jMIR Codaich
A research database of labeled MP3 recordings
- For use in training and testing algorithms
There are plans to eventually include additional format types in Codaich
- Including symbolic formats
[Diagram: Musical Data Collection: Symbolic Recordings; Audio Recordings; Cultural Information; Metadata; Transcription; jMIR Codaich]

13/47 Sharing Codaich
Codaich is intended to provide a common knowledge base that can be used by researchers in different labs to compare the effectiveness of their varying approaches
Overcoming copyright limitations on distributing music:
- On-demand Metadata Extraction Network (OMEN)
  - Implemented by Daniel McEnnis
- Researchers use distributed computing and the jMIR jAudio feature extractor to request local feature extraction at sites (e.g., libraries) that have legal access to individual recordings
- jAudio and OMEN allow custom original features and extraction parameters

14/47 Statistics on Codaich
MP3 recordings
- Constantly growing
2247 artists
55 genres
- Popular, classical, jazz and “world”
19 metadata fields

15/47 jMIR Bodhidharma MIDI Database
Collection of labeled MIDI recordings
950 recordings
38 genres
[Diagram: Musical Data Collection: Symbolic Recordings; Audio Recordings; Cultural Information; Metadata; Transcription; jMIR Codaich; jMIR Bodhidharma MIDI Database]

16/47 jMIR jMusicMetaManager
Metadata found with recordings is typically problematic
- Inconsistent
- Error-prone
jMusicMetaManager is software that automatically analyzes metadata across recordings
Is currently used to maintain Codaich
- There are plans to adapt it to MIDI as well
[Diagram: Musical Data Collection: Symbolic Recordings; Audio Recordings; Cultural Information; Metadata; Transcription; jMIR Codaich; jMIR jMusicMetaManager; jMIR Bodhidharma MIDI Database]

17/47 Tasks performed by jMusicMetaManager
Detects differing metadata values that should in fact be the same
- e.g. in a performer identification task, “Charlie Mingus” should not be misclassified as a different performer than “Mingus, Charles”
Detects redundant copies of recordings
- Could contaminate test sets
Generates inventory and statistical profile reports
- 39 reports in all

18/47 How jMusicMetaManager works
Calculates edit distance between pairs of field values
- Threshold based on field lengths
Performs 23 additional preprocessing equivalency operations
Considers varied word orderings and word subsets
Applies false error filtering
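The combination of edit distance, a length-dependent threshold and word-order handling can be sketched as follows. This is an illustrative approximation, not jMusicMetaManager's actual code: the function names, the `ratio` value of 0.2 and the single normalization step (lower-casing, stripping punctuation, sorting words) are assumptions standing in for the tool's own preprocessing operations.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(value):
    """Lower-case, replace punctuation with spaces, and sort the words."""
    cleaned = "".join(c if c.isalnum() or c.isspace() else " "
                      for c in value.lower())
    return " ".join(sorted(cleaned.split()))


def probably_same(a, b, ratio=0.2):
    """Flag two field values as likely variants of the same entry.

    `ratio` is an assumed setting: the allowed edit distance grows with
    the length of the longer normalized value.
    """
    na, nb = normalize(a), normalize(b)
    threshold = ratio * max(len(na), len(nb))
    return edit_distance(na, nb) <= threshold
```

With these assumptions, `probably_same("Charlie Mingus", "Mingus, Charles")` is true because both values normalize to two-word strings that differ by two character edits, while an unrelated name falls outside the threshold.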

19/47 jMusicMetaManager’s I/O
Parses metadata from Apple iTunes XML or MP3 ID3 tags
- And Gracenote CDDB, indirectly
- Can export to ACE XML or Weka ARFF
Generates reports in frames-based HTML

20/47 Musical data collection summary
[Diagram: Musical Data Collection: Symbolic Recordings; Audio Recordings; Cultural Information; Metadata; Transcription; jMIR Codaich; jMIR jMusicMetaManager; jMIR Bodhidharma MIDI Database]

21/47 Feature extraction
[Diagram: Basic Classification Tasks: Music; Metadata; Musical Data Collection; Metadata Analysis; Feature Extraction; Machine Learning; Classifier Training; Classifications]

22/47 Types of features
Low-level
- Associated with signal processing and basic auditory perception
- e.g. spectral flux or RMS
- Usually not intuitively musical
High-level
- Musical abstractions
- e.g. meter or pitch class distributions
Cultural
- Sociocultural information outside the scope of auditory or musical content
- e.g. playlist co-occurrence or purchase correlations
[Diagram: Feature Extraction: Low-Level Features; High-Level Features; Cultural Features]

23/47 jMIR jAudio
Implemented jointly with Daniel McEnnis
Extracts features from audio files
- MP3, WAV, AIFF, AU, SND
28 bundled core features
- Mainly low-level
- Some high-level
[Diagram: Feature Extraction: Audio Recordings; jMIR jAudio; Low-Level Features; High-Level Features; Cultural Features; Extracted Feature Values]

24/47 Developing features with jAudio
Two general ways of using jAudio
- Directly as an audio feature extractor
- Platform for developing and sharing new features
  - Can be independent features
  - Can be based on existing features
New features are added using a modular plugin interface
- jAudio (like all jMIR feature extractors) automatically calculates feature dependencies and scheduling at runtime

25/47 Metafeatures and aggregators
jAudio automatically calculates “metafeatures” of new or existing features
- e.g. running means, standard deviations or derivatives across sample windows
jAudio automatically calculates “aggregators” for new or existing features
- Functions that collapse a sequence of feature vectors into a single vector or smaller sequence of vectors
- Useful for representing in a low-dimensional way how different features change together
- e.g. the Area of Moments aggregator transforms a set of feature vectors into a two-dimensional image matrix and calculates two-dimensional moments
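The three metafeatures named on this slide can be sketched over a sequence of per-window values of a single feature. The function names and the `width` parameter are hypothetical and this is not jAudio's implementation; in particular, the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def running_mean(values, width):
    """Mean of each run of `width` consecutive per-window feature values."""
    return [mean(values[i:i + width])
            for i in range(len(values) - width + 1)]


def running_std(values, width):
    """Population standard deviation over the same sliding runs."""
    return [pstdev(values[i:i + width])
            for i in range(len(values) - width + 1)]


def derivative(values):
    """Difference between each per-window value and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]
```

Each function takes one feature's values across analysis windows and returns a new, slightly shorter sequence that can itself be treated as a feature.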

26/47 Using jAudio
Customizable extraction parameters
- Window size and overlap
- Normalization
- Downsampling
- Individual feature parameters
Records and synthesizes audio
Converts MIDI to audio
Displays audio in both the time and frequency domains
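To make the window size and overlap parameters concrete, the sketch below splits a list of samples into overlapping windows and computes RMS, one of the low-level features mentioned earlier, for each window. The function names and the convention of expressing overlap as a fraction of the window size are assumptions for illustration, not jAudio's interface.

```python
from math import sqrt


def windows(samples, size, overlap):
    """Split samples into windows of `size`, with `overlap` as a fraction in [0, 1)."""
    hop = max(1, int(size * (1 - overlap)))  # samples to advance between windows
    return [samples[start:start + size]
            for start in range(0, len(samples) - size + 1, hop)]


def rms(window):
    """Root mean square amplitude of one window."""
    return sqrt(sum(x * x for x in window) / len(window))


def rms_per_window(samples, size, overlap):
    """One RMS value per analysis window."""
    return [rms(w) for w in windows(samples, size, overlap)]
```

A larger overlap yields more windows for the same signal, which gives finer time resolution at the cost of more computation. Trailing samples that do not fill a whole window are dropped in this sketch.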

27/47 jMIR jSymbolic
Extracts high-level features from MIDI files
111 bundled features
- Currently being expanded to 160
- Many are original
[Diagram: Feature Extraction: Symbolic Recordings; Audio Recordings; jMIR jAudio; jMIR jSymbolic; Low-Level Features; High-Level Features; Cultural Features; Extracted Feature Values]

28/47 jSymbolic’s features
Features fall into 7 broad categories
- Instrumentation
- Musical Texture
- Rhythm
- Dynamics
- Pitch Statistics
- Melody
- Chords
Histogram aggregators are often used
- Rhythm, pitch, pitch class, melody, vertical interval and chord histograms
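As one example of a histogram feature, a pitch class histogram can be sketched from a list of MIDI note numbers, where the pitch class is the note number modulo 12 (so 60 and 72 are both C). The function name and the choice to normalize bins to fractions are assumptions; this is not jSymbolic's code.

```python
def pitch_class_histogram(midi_pitches):
    """Fraction of notes falling in each of the 12 pitch classes (0 = C)."""
    counts = [0] * 12
    for pitch in midi_pitches:
        counts[pitch % 12] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * 12
    return [count / total for count in counts]


# C4, C5, E4, G4 as MIDI note numbers: a C major triad with the root doubled.
histogram = pitch_class_histogram([60, 72, 64, 67])
```

The result is a fixed-length vector regardless of how many notes the piece contains, which is what makes histograms convenient as classifier input.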

29/47 jMIR jWebMiner
Extracts cultural features from the web using web services
- Google
- Yahoo!
Calculates the co-occurrence and cross tabulation of metadata fields
- e.g. how often does Bach co-occur on a web page with Baroque, compared to Stravinsky?
Currently in alpha development
[Diagram: Feature Extraction: Symbolic Recordings; Audio Recordings; Cultural Information; jMIR jAudio; jMIR jSymbolic; jMIR jWebMiner; Low-Level Features; High-Level Features; Cultural Features; Extracted Feature Values]

30/47 jWebMiner’s functionality
Parses search terms from:
- iTunes, ACE XML, Weka ARFF, text
Can assign higher weights to particular sites
- e.g. All Music, Wikipedia, Pitchfork, etc.
Can enforce filter words
- e.g. a site must include the word “music” to be considered
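The idea of counting co-occurrences subject to a filter word can be sketched without any web service by scanning a small in-memory list of page texts. Everything here is hypothetical: the function name, the simple substring matching and the five toy "pages" are invented for illustration, and site weighting is not covered.

```python
def cooccurrence(pages, term_a, term_b, filter_word=None):
    """Count pages that mention both terms, optionally requiring a filter word."""
    count = 0
    for text in pages:
        lowered = text.lower()
        if filter_word and filter_word.lower() not in lowered:
            continue  # page is ignored because the filter word is absent
        if term_a.lower() in lowered and term_b.lower() in lowered:
            count += 1
    return count


# Toy page texts invented for this example; they are not real search results.
pages = [
    "Bach is a central composer of Baroque music.",
    "Baroque music by Bach and Handel.",
    "Baroque architecture tour near Bach Street.",
    "Stravinsky's neoclassical music borrows from Baroque forms.",
    "Stravinsky wrote ballet music in the twentieth century.",
]

bach_baroque = cooccurrence(pages, "Bach", "Baroque", filter_word="music")
stravinsky_baroque = cooccurrence(pages, "Stravinsky", "Baroque", filter_word="music")
```

On this toy data the filter word removes the architecture page, which mentions both "Bach" and "Baroque" but has nothing to do with music.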

31/47 Feature extraction summary
[Diagram: Feature Extraction: Symbolic Recordings; Audio Recordings; Cultural Information; jMIR jAudio; jMIR jSymbolic; jMIR jWebMiner; Low-Level Features; High-Level Features; Cultural Features; Extracted Feature Values]

32/47 Machine learning
[Diagram: Basic Classification Tasks: Music; Metadata; Musical Data Collection; Metadata Analysis; Feature Extraction; Machine Learning; Classifier Training; Classifications]

33/47 Some types of machine learning
Supervised
- Learners trained on model labeled instances
Unsupervised
- Examines instances in terms of internal similarities rather than externally provided labels
Ensemble
- Multiple classifiers work together
- Hopefully perform better overall than individually
[Diagram: Machine Learning: Supervised Algorithms; Unsupervised Algorithms; Ensemble Algorithms]
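One simple way for multiple classifiers to work together is majority voting, sketched below. The three rule-based classifiers, their thresholds, the feature names and the class labels are all invented for the example; real ensembles such as bagging or boosting also control how the member classifiers are trained.

```python
from collections import Counter


def majority_vote(classifiers, instance):
    """Return the label predicted by the largest number of classifiers."""
    votes = Counter(classifier(instance) for classifier in classifiers)
    return votes.most_common(1)[0][0]


# Three hypothetical single-feature classifiers with made-up thresholds.
def by_tempo(features):
    return "dance" if features["tempo"] > 110 else "ambient"


def by_loudness(features):
    return "dance" if features["loudness"] > 0.5 else "ambient"


def by_density(features):
    return "dance" if features["note_density"] > 4 else "ambient"


ensemble = [by_tempo, by_loudness, by_density]
instance = {"tempo": 128, "loudness": 0.3, "note_density": 6}
label = majority_vote(ensemble, instance)
```

Here the loudness rule alone would have voted "ambient", but it is outvoted by the other two, which illustrates how an ensemble can compensate for the weakness of one member.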

34/47 Input to machine learning systems
Extracted feature values serve as the percepts of classifiers
Ground truth needed by supervised learners
A class ontology (structured set of relationships between classes) is sometimes used
- Some learners can capitalize on structuring
- Long-term goal is to allow arbitrary ontologies in jMIR
[Diagram: Machine Learning: Extracted Features; Ground Truth; Class Ontology; Supervised Algorithms; Unsupervised Algorithms; Ensemble Algorithms]

35/47 Training and testing sets
Data segmented into training and testing sets if classifiers need to be trained
- To avoid overtraining (failure to generalize training instance features to those of the general instance population)
Feature values are simply passed on if training is not needed
[Diagram: Machine Learning: Extracted Features; Ground Truth; Class Ontology; Training Sets and Testing Sets OR Features to Classify; Supervised Algorithms; Unsupervised Algorithms; Ensemble Algorithms]
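Segmenting data into training and testing sets can be sketched as a seeded shuffle followed by a cut. The function name, the default 25% test fraction and the fixed seed are assumptions chosen for illustration.

```python
import random


def split_train_test(instances, test_fraction=0.25, seed=0):
    """Shuffle a copy of the instances and hold out a fraction for testing."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


training_set, testing_set = split_train_test(range(20))
```

Because no instance appears in both sets, accuracy measured on the testing set reflects how well the classifier generalizes rather than how well it memorized the training instances. Duplicate recordings in the source data would undermine this, which is one reason duplicate detection matters.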

36/47 Dimensionality reduction algorithms
Too many features degrade classifier performance
- “Curse of dimensionality”
Too few features can fail to encapsulate sufficient information
Dimensionality reduction algorithms automatically find a good lower-dimensional subset or projection of the given features
[Diagram: Machine Learning: Extracted Features; Ground Truth; Class Ontology; Training Sets and Testing Sets OR Features to Classify; Dimensionality Reduction Algorithms; Supervised Algorithms; Unsupervised Algorithms; Ensemble Algorithms]
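Finding a good feature subset can be sketched as an exhaustive search that scores every subset of a given size. The scoring method used here (leave-one-out accuracy of a 1-nearest-neighbour classifier), the function names and the four toy instances are assumptions for illustration; exhaustive search is only practical for small numbers of features.

```python
from itertools import combinations
from math import dist


def loo_accuracy(features, labels, subset):
    """Leave-one-out accuracy of 1-nearest-neighbour using only `subset` columns."""
    correct = 0
    for i, vec in enumerate(features):
        probe = [vec[k] for k in subset]
        nearest = min(
            (j for j in range(len(features)) if j != i),
            key=lambda j: dist(probe, [features[j][k] for k in subset]),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(features)


def best_subset(features, labels, size):
    """Try every subset of `size` feature indices and keep the best scorer."""
    n_features = len(features[0])
    return max(combinations(range(n_features), size),
               key=lambda subset: loo_accuracy(features, labels, subset))


# Invented data: feature 0 separates the classes, feature 1 is misleading.
toy_features = [[0.0, 5.0], [0.1, 1.0], [1.0, 5.1], [1.1, 0.9]]
toy_labels = ["a", "a", "b", "b"]
chosen = best_subset(toy_features, toy_labels, 1)
```

On this toy data the search keeps feature 0 and discards feature 1, showing how removing an uninformative feature can help rather than hurt.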

37/47 Output of machine learning systems
Classifications of instances are output if no supervised training is needed
Metalearners can be used to choose appropriate classifier(s)
- Each algorithm has its own strengths and weaknesses
- Training output consists of evaluations of each algorithm as well as the trained classifiers
[Diagram: Machine Learning: Extracted Features; Ground Truth; Class Ontology; Training Sets and Testing Sets OR Features to Classify; Dimensionality Reduction Algorithms; Supervised Algorithms; Unsupervised Algorithms; Ensemble Algorithms; Classification Results OR Algorithm Evaluations and Trained Classifiers]

38/47 jMIR ACE
ACE is jMIR’s classifier and metalearner
- Automatically experiments with and selects classifier(s)
- Trains classifiers
- Classifies novel instances
[Diagram: Machine Learning: jMIR ACE; Extracted Features; Ground Truth; Class Ontology; Training Sets and Testing Sets OR Features to Classify; Dimensionality Reduction Algorithms; Supervised Algorithms; Unsupervised Algorithms; Ensemble Algorithms; Classification Results OR Algorithm Evaluations and Trained Classifiers]

39/47 Algorithms experimented with by ACE
Classifiers:
- Induction trees, naive Bayes, k-nearest neighbour, neural networks, support vector machines
- Classifier parameters are also varied automatically
Dimensionality reduction:
- Principal component analysis, exhaustive searches, feature selection using genetic algorithms
Classifier ensembles:
- Bagging, boosting
Additional algorithms will be added in the future:
- Including unsupervised learning algorithms
Researchers are encouraged to add their own algorithms
- ACE, like all jMIR components, emphasizes extensibility
- ACE utilizes the Weka general pattern recognition library

40/47 Details of ACE
ACE evaluates algorithms in terms of
- Classification accuracy
- Performance consistency
- Training complexity / time
- Classification complexity / time
There are future plans to utilize distributed computing to spread out the computational burden
- Will also add the ability to impose limits on the time available for the ACE metalearner to come up with algorithm selections
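Three of these evaluation criteria can be sketched for an already-trained classifier: accuracy as the mean over several test folds, consistency as the standard deviation across those folds, and classification time as elapsed wall-clock time. The function name, the dictionary keys and the toy threshold classifier are assumptions and do not reflect how ACE reports its results; training time is not measured here.

```python
from statistics import mean, pstdev
from time import perf_counter


def evaluate(classifier, folds):
    """Score a trained classifier on a list of (instances, labels) folds."""
    accuracies = []
    start = perf_counter()
    for instances, labels in folds:
        predictions = [classifier(instance) for instance in instances]
        hits = sum(p == truth for p, truth in zip(predictions, labels))
        accuracies.append(hits / len(labels))
    elapsed = perf_counter() - start
    return {
        "accuracy": mean(accuracies),       # average over folds
        "consistency": pstdev(accuracies),  # lower means more consistent
        "classification_time": elapsed,     # seconds for all folds
    }


# Hypothetical classifier and folds, invented for illustration.
def threshold_classifier(value):
    return "a" if value < 0.5 else "b"


report = evaluate(threshold_classifier,
                  [([0.1, 0.9], ["a", "b"]), ([0.2, 0.4], ["a", "b"])])
```

A metalearner could run such an evaluation for each candidate algorithm and compare the reports, trading accuracy against consistency and time.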

41/47 ACE’s interface
Command line
Java API
GUI
- In alpha development

42/47 jMIR ACE XML files
Allow jMIR components to communicate with each other
Allow jMIR output to be used by other software
- To help ensure interoperability, jMIR components also produce and parse Weka ARFF files
[Diagram: Machine Learning: jMIR ACE; jMIR ACE XML Files; Extracted Features; Ground Truth; Class Ontology; Training Sets and Testing Sets OR Features to Classify; Dimensionality Reduction Algorithms; Supervised Algorithms; Unsupervised Algorithms; Ensemble Algorithms; Classification Results OR Algorithm Evaluations and Trained Classifiers]
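For readers unfamiliar with Weka ARFF, the sketch below writes a minimal ARFF document with numeric feature attributes and one nominal class attribute. The function name, the relation name, the feature names and the values are invented for the example, and it assumes names contain no spaces or special characters that would require quoting. It is not a jMIR exporter.

```python
def to_arff(relation, feature_names, class_names, rows):
    """Build ARFF text from (feature_values, class_label) rows."""
    lines = [f"@relation {relation}", ""]
    lines += [f"@attribute {name} numeric" for name in feature_names]
    lines.append("@attribute class {" + ",".join(class_names) + "}")
    lines += ["", "@data"]
    for values, label in rows:
        lines.append(",".join(str(v) for v in values) + "," + label)
    return "\n".join(lines)


# Hypothetical feature values for two recordings.
arff_text = to_arff(
    "toy_music",
    ["rms", "spectral_flux"],
    ["jazz", "classical"],
    [([0.8, 0.3], "jazz"), ([0.5, 0.1], "classical")],
)
```

Each data row is a flat list of values ending in a single class label, so one instance carries exactly one class in this layout.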

43/47 Details of the ACE XML files
Information stored in ACE XML files:
- Feature values and information about features
- Model classifications and other metadata
- Class taxonomies
  - Will be expanded to general ontologies in the future
Advantages of ACE XML compared to general data mining file formats (e.g. Weka ARFF)
- Ability to assign multiple classes to individual instances
- Ability to classify both overall instances and their sub-sections
- Maintenance of logical groupings of multi-dimensional features
- Maintenance of internal identifying metadata about instances
- Ability to represent taxonomical class structures

44/47 Machine learning summary
[Diagram: Machine Learning: jMIR ACE; jMIR ACE XML Files; Extracted Features; Ground Truth; Class Ontology; Training Sets and Testing Sets OR Features to Classify; Dimensionality Reduction Algorithms; Supervised Algorithms; Unsupervised Algorithms; Ensemble Algorithms; Classification Results OR Algorithm Evaluations and Trained Classifiers]

45/47 Overview of jMIR
[Diagram: jMIR and its Components: Audio Music; Symbolic Music; Internet; Codaich; Bodhidharma; jMusicMetaManager; jAudio; jSymbolic; jWebMiner; ACE XML Files; ACE; Classification Output OR Algorithm Evaluations and Trained Classifiers]
[Diagram: Basic Classification Tasks: Music; Metadata; Musical Data Collection; Metadata Analysis; Feature Extraction; Machine Learning; Classifier Training; Classifications]

46/47 Goals of jMIR
Make sophisticated pattern recognition technologies accessible to music researchers with both technical and non-technical backgrounds
Increase cooperation between research groups
- Enable objective comparisons of algorithms
- Eliminate duplication of effort
- Facilitate iterative development and sharing of new MIR technologies
Facilitate research combining all 3 feature types
- Limited intersection of information encapsulated by each type
- Significant potential to improve classification performance

47/47 Contact information
Software available at: