Realtime Recognition of Orchestral Instruments
Ichiro Fujinaga
McGill University
Overview
- Introduction
- Lazy learning (exemplar-based learning)
- k-NN classifier
- Genetic algorithm
- Features
- Results
- Conclusions
Introduction
- Realtime recognition of isolated monophonic orchestral instruments
- Spectrum analysis by Miller Puckette’s fiddle~
- Adaptive system based on an exemplar-based classifier and a genetic algorithm
Overall Architecture
[Block diagram; labels: Live mic input; Sound file input; Data Acquisition & Data Analysis (fiddle); Recognition (k-NN classifier); Output (instrument name); Knowledge base (feature vectors); Genetic algorithm (k-NN classifier); Best weight vector; Off-line]
Exemplar-based learning
- The exemplar-based learning model is based on the idea that objects are categorized by their similarity to one or more stored examples.
- There is much evidence from psychological studies to support exemplar-based categorization by humans.
- This model differs from both rule-based and prototype-based (neural net) models of concept formation in that it assumes no abstraction or generalization of concepts.
- This model can be implemented using a k-nearest-neighbor classifier and is further enhanced by applying a genetic algorithm.
Exemplar-based categorization
- Objects are categorized by their similarity to one or more stored examples
- No abstraction or generalizations, unlike rule-based or prototype-based models of concept formation
- Can be implemented using k-nearest neighbor classifier
- Slow and large storage requirements?
K-nearest-neighbor classifier
Determine the class of a given sample by its feature vector:
- Distances between the feature vectors of an unclassified sample and previously classified samples are calculated
- The class represented by the majority of the k nearest neighbors is then assigned to the unclassified sample
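The two steps above can be sketched in a few lines. This is a minimal illustration, not the system's implementation: the function name `knn_classify`, the unweighted Euclidean distance, and the toy 2-D feature vectors are all assumptions made for the example.

```python
from collections import Counter
import math

def knn_classify(sample, exemplars, k=3):
    """Return the majority class among the k exemplars nearest to `sample`.

    `exemplars` is a list of (feature_vector, label) pairs.
    """
    # Step 1: distance from the unclassified sample to every stored exemplar.
    dists = sorted((math.dist(sample, vec), label) for vec, label in exemplars)
    # Step 2: majority vote among the k nearest neighbors.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors, for illustration only.
exemplars = [
    ((0.10, 0.20), "flute"),
    ((0.15, 0.25), "flute"),
    ((0.20, 0.10), "flute"),
    ((0.90, 0.80), "trumpet"),
    ((0.85, 0.90), "trumpet"),
]
print(knn_classify((0.2, 0.2), exemplars, k=3))  # flute
```

Nothing is learned ahead of time: all the work happens at query time against the stored exemplars, which is why storage and speed are the usual concerns with this approach.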
Example of k-NN classifier
Distance measures
- The distance in an N-dimensional feature space between two vectors X and Y can be defined as:
- A weighted distance can be defined as:
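The formulas themselves are not reproduced in this text. A common choice for both definitions, assumed here and not confirmed by the source, is the Euclidean distance and a variant with one weight per feature, where x_i and y_i are the components of X and Y and w_i is the weight for feature i (presumably the entries of the weight vector that the genetic algorithm optimizes):

```latex
d(X, Y) = \sqrt{\sum_{i=1}^{N} (x_i - y_i)^2}
\qquad
d_w(X, Y) = \sqrt{\sum_{i=1}^{N} w_i \, (x_i - y_i)^2}
```

Setting every w_i to 1 recovers the unweighted distance, and setting a w_i to 0 removes that feature from the comparison.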
Genetic algorithms
- Optimization based on biological evolution
- Maintenance of population using selection, crossover, and mutation
- Chromosomes = weight vectors
- Fitness function = recognition rate
- Leave-one-out cross validation
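A minimal sketch of how these pieces could fit together: each chromosome is a feature-weight vector, and its fitness is the leave-one-out recognition rate of a weighted k-NN classifier. The source names only selection, crossover, and mutation; the specific operators used here (truncation selection, elitism, one-point crossover, random-reset mutation), the function names, and all parameter values are assumptions chosen for brevity.

```python
import random
from collections import Counter

def weighted_dist2(x, y, w):
    # Squared weighted Euclidean distance; ranks neighbors the same as the root.
    return sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y))

def loo_recognition_rate(weights, data, k=1):
    """Fitness: fraction of samples classified correctly when each sample is
    held out in turn and classified against all the others."""
    correct = 0
    for i, (vec, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        nearest = sorted(rest, key=lambda e: weighted_dist2(vec, e[0], weights))[:k]
        vote = Counter(lab for _, lab in nearest).most_common(1)[0][0]
        correct += vote == label
    return correct / len(data)

def evolve_weights(data, n_features, pop_size=20, generations=30,
                   mutation_rate=0.1, k=1, seed=0):
    rng = random.Random(seed)
    # Each chromosome is a weight vector with entries in [0, 1).
    pop = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda w: loo_recognition_rate(w, data, k),
                        reverse=True)
        parents = scored[:max(2, pop_size // 2)]   # truncation selection
        children = [scored[0][:]]                  # elitism: keep the best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # One-point crossover (no cut is possible with a single feature).
            cut = rng.randrange(1, n_features) if n_features > 1 else 0
            child = a[:cut] + b[cut:]
            # Mutation: occasionally replace a gene with a fresh random weight.
            child = [rng.random() if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
    return max(pop, key=lambda w: loo_recognition_rate(w, data, k))

# Toy data (hypothetical): feature 0 separates the classes, feature 1 misleads.
data = [((0.0, 5.0), "a"), ((0.1, -3.0), "a"), ((0.2, 9.0), "a"),
        ((1.0, 4.9), "b"), ((1.1, -3.1), "b"), ((1.2, 9.1), "b")]
best = evolve_weights(data, n_features=2, pop_size=10, generations=5)
print(best, loo_recognition_rate(best, data))
```

Every fitness evaluation runs a full leave-one-out pass over the knowledge base, so the search is expensive. That cost fits a design in which the weights are optimized ahead of time and only the weighted k-NN lookup runs at recognition time.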
Features
Static features (per window)
- pitch
- mass or the integral of the curve (zeroth-order moment)
- centroid (first-order moment)
- variance (second-order central moment)
- skewness (third-order central moment)
- amplitudes of the harmonic partials
- number of strong harmonic partials
- spectral irregularity
- tristimulus
Dynamic features
- means and velocities of static features over time
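The moment-based features can be computed directly from a window's list of spectral peaks. The sketch below is illustrative only: it assumes each window is a list of (frequency, amplitude) pairs, treats amplitudes as the weights of the distribution, reports skewness as the raw third central moment as the list describes it, and takes "velocity" to mean the average change between successive windows. None of these details is confirmed by the source.

```python
def static_features(peaks):
    """Spectral moments of one analysis window.

    `peaks` is a list of (frequency_hz, amplitude) pairs.
    """
    mass = sum(a for _, a in peaks)                    # zeroth-order moment
    if mass == 0:
        return {"mass": 0.0, "centroid": 0.0, "variance": 0.0, "skewness": 0.0}
    centroid = sum(f * a for f, a in peaks) / mass     # first-order moment
    variance = sum(a * (f - centroid) ** 2 for f, a in peaks) / mass
    third = sum(a * (f - centroid) ** 3 for f, a in peaks) / mass
    return {"mass": mass, "centroid": centroid,
            "variance": variance, "skewness": third}

def dynamic_features(windows):
    """Mean and velocity of each static feature across successive windows."""
    per_window = [static_features(p) for p in windows]
    out = {}
    for name in per_window[0]:
        values = [w[name] for w in per_window]
        out[name + "_mean"] = sum(values) / len(values)
        # Velocity: average change per window (assumed definition).
        diffs = [b - a for a, b in zip(values, values[1:])]
        out[name + "_velocity"] = sum(diffs) / len(diffs) if diffs else 0.0
    return out

# Two hypothetical windows in which the upper partial grows louder.
windows = [
    [(100.0, 1.0), (300.0, 1.0)],
    [(100.0, 1.0), (300.0, 3.0)],
]
print(dynamic_features(windows)["centroid_velocity"])  # 50.0
```

In the example the centroid moves from 200 Hz to 250 Hz as the upper partial gains energy, which is the kind of change during the attack that a single static window cannot show.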
Data
- Original source: McGill Master Samples
- Over 1300 notes from 39 different timbres (23 orchestral instruments)
- Spectrum analysis by fiddle (2048 points)
- First 46–232 ms of attack (1–9 windows)
- Each analysis window (46 ms) consists of a list of amplitudes and frequencies of the peaks in the spectra
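The window figures above are mutually consistent under two assumptions that the source does not state: a 44.1 kHz sample rate and successive windows that overlap by half (a hop of 1024 points). A short check of the arithmetic:

```python
SAMPLE_RATE = 44_100   # Hz; assumed (CD-quality audio), not stated in the source
WINDOW = 2048          # analysis points per window, from the source
HOP = WINDOW // 2      # assumed 50% overlap between successive windows

def span_ms(n_windows):
    """Duration of signal covered by n overlapping analysis windows."""
    samples = WINDOW + (n_windows - 1) * HOP
    return 1000.0 * samples / SAMPLE_RATE

print(round(span_ms(1)), round(span_ms(9)))  # 46 232
```

Without overlap, nine 46 ms windows would span about 418 ms, so the 232 ms figure suggests that the windows do overlap.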
Results
- Experiment I: SHARC data, static features
- Experiment II: fiddle, dynamic features
- Experiment III: more features, redefinition of attack point
Conclusions
- Realtime timbre recognition system
- Analysis by Puckette’s fiddle
- Recognition using dynamic features
- Adaptive recognizer by k-NN classifier enhanced with genetic algorithm
- A successful implementation of an exemplar-based classifier in a time-critical environment
Future research
- Performer identification
- Speaker identification
- Tone-quality analysis
- Multi-instrument recognition
- Expert recognition of timbre
Recognition rate for different lengths of analysis window
Comparison with Human Performance