Published by Charlotte Walker. Modified over 8 years ago.
1
Genre Classification of Music by Tonal Harmony
Carlos Pérez-Sancho, David Rizo (Departamento de Lenguajes y Sistemas Informáticos, Universidad de Alicante, Spain)
Stefan Kersten, Rafael Ramirez (Music Technology Group, Universitat Pompeu Fabra, Barcelona, Spain)
2
Outline
- Introduction
- System architecture
- Audio to chord transcription
- Language modeling
- Experiments
- Conclusions and future work
3
Introduction
- Chord progressions as features for genre recognition
- In classical music, each period had its own harmonic framework for composition
- Typical chord progressions can be found in every style:
  - I-VI-ii-V (jazz)
  - I-IV-V (pop-rock)
- Chord progressions are symbolic features
- They can be used to classify audio data once a chord transcription system has converted the audio into chords
5
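System architecture
[Diagram: MUSIC AUDIO FILE -> AUDIO TO CHORD TRANSCRIPTION SYSTEM -> CHORD PROGRESSIONS (Am Dm Em …, transposed to C/Am) -> LANGUAGE MODELS (classical, jazz, popular) -> GENRE EVALUATION. The audio file and the transcription system sit in the audio domain; the chord progressions and language models sit in the symbolic domain.]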
7
Audio feature extraction
[Diagram of the audio to chord transcription system: MUSIC AUDIO FILE -> FEATURE EXTRACTION (Harmonic Pitch Class Profiles) -> KEY ESTIMATION and BEAT TRACKING / CHORD WEIGHTING -> CHORD PROGRESSIONS (Cm Fm Gm …) -> transposed to C/Am -> CHORD PROGRESSIONS (Am Dm Em …)]
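The transposition to C/Am can be illustrated with a minimal sketch. It assumes that the estimated key arrives as a root name plus a mode string, and that chord labels are a root followed by an optional suffix; the function names and the label format are hypothetical, not taken from the slides.

```python
# Pitch-class index for each note name (sharps and flats both accepted).
NOTE_TO_PC = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10, "B": 11,
}
# One spelling per pitch class for the output (an arbitrary choice).
PC_TO_NOTE = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]


def split_chord(label):
    """Split a label such as 'Bbm' into its root ('Bb') and suffix ('m')."""
    root_len = 2 if len(label) > 1 and label[1] in "#b" else 1
    return label[:root_len], label[root_len:]


def transpose_to_reference(chords, key_root, mode):
    """Shift all chords so that a major key lands on C and a minor key on A."""
    target = 0 if mode == "major" else 9  # C for major, A for minor
    shift = (target - NOTE_TO_PC[key_root]) % 12
    transposed = []
    for label in chords:
        root, suffix = split_chord(label)
        transposed.append(PC_TO_NOTE[(NOTE_TO_PC[root] + shift) % 12] + suffix)
    return transposed


print(transpose_to_reference(["Cm", "Fm", "Gm"], "C", "minor"))  # ['Am', 'Dm', 'Em']
```

The example reproduces the diagram's case: a progression Cm Fm Gm in C minor becomes Am Dm Em, so that songs in different keys share one chord vocabulary.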
8
Audio feature extraction (2)
- Frame-based extraction of the Harmonic Pitch Class Profile (HPCP)
- Spectral peak tracking and mapping of peaks into 36 pitch classes (enhanced chroma feature) [E. Gómez. “Tonal Description of Music Audio Signals”. PhD thesis, 2006]
- The most probable chord is selected for each frame
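The slides do not say how the most probable chord is computed from the HPCP. One common approach, shown here purely as an assumption, is to match each frame against binary major/minor triad templates. The sketch also assumes that the 36 bins are laid out as three consecutive bins per semitone starting at C, and it uses normalized template scores as a stand-in for chord probabilities.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "Eb", "E", "F", "F#", "G", "Ab", "A", "Bb", "B"]


def triad_templates():
    """Return (labels, 24x12 matrix) of binary major and minor triad templates."""
    labels, rows = [], []
    for root in range(12):
        for suffix, third in (("", 4), ("m", 3)):
            row = np.zeros(12)
            row[[root, (root + third) % 12, (root + 7) % 12]] = 1.0
            labels.append(NOTE_NAMES[root] + suffix)
            rows.append(row)
    return labels, np.array(rows)


def fold_to_12(hpcp36):
    """Sum each group of 3 adjacent bins into one semitone bin (assumed layout)."""
    return np.asarray(hpcp36, dtype=float).reshape(12, 3).sum(axis=1)


def most_probable_chord(hpcp36):
    """Return the best-matching triad label and its normalized score."""
    labels, templates = triad_templates()
    scores = templates @ fold_to_12(hpcp36)
    total = scores.sum()
    if total > 0:
        probs = scores / total
    else:  # silent frame: every chord is equally likely
        probs = np.full(len(labels), 1.0 / len(labels))
    best = int(np.argmax(probs))
    return labels[best], float(probs[best])


# A frame with energy on A, C and E should be labelled A minor.
frame = np.zeros(36)
frame[[1, 13, 28]] = 1.0
print(most_probable_chord(frame)[0])  # Am
```

The per-frame score returned alongside the label is what a later weighting step could consume.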
9
Audio feature extraction (3)
- Beat detection using two algorithms:
  - Dixon's multiple agent algorithm (BeatRoot) [S. Dixon. “Evaluation of the audio beat tracking system beatroot”. Journal of New Music Research, 36(1): 39–50, 2007]
  - Ellis' dynamic programming algorithm [D. P. W. Ellis. “Beat tracking by dynamic programming”. Journal of New Music Research, 36(1): 51–60, 2007]
- Selection of the onset stream with the least mean variance in tempo period
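The selection between the two beat trackers can be sketched as follows. The sketch reads "variance in tempo period" as the variance of the inter-beat intervals, which is an interpretation rather than something the slides define, and it takes the two beat streams as plain lists of onset times instead of calling either tracker.

```python
import numpy as np


def tempo_period_variance(beat_times):
    """Variance of the intervals between consecutive beats (in seconds squared)."""
    intervals = np.diff(np.asarray(beat_times, dtype=float))
    # A stream with fewer than two beats has no interval, so it can never win.
    return float(np.var(intervals)) if intervals.size > 0 else float("inf")


def select_beat_stream(candidates):
    """candidates maps a tracker name to its beat times; return the steadiest one."""
    return min(candidates, key=lambda name: tempo_period_variance(candidates[name]))


# Hypothetical outputs of the two trackers for the same excerpt.
streams = {
    "beatroot": [0.0, 0.5, 1.0, 1.5, 2.0],
    "ellis": [0.0, 0.4, 1.0, 1.45, 2.0],
}
print(select_beat_stream(streams))  # beatroot
```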
10
Audio feature extraction (4)
- Computation of beat-level features from frame-level features by building a chord histogram
- Chords are weighted by their probabilities, computed in the previous feature extraction step
- The maximum of the histogram is selected as the most salient chord among the frames in a given beat
- Key and mode of the song are determined from the HPCP in order to transpose the chord transcription
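The beat-level aggregation described above might look like the following sketch. It assumes each frame is given as a (time, chord, probability) tuple and that a beat spans the time from its onset to the next onset; both the data layout and the function name are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def beat_level_chords(frames, beat_times):
    """Pick one chord per beat from probability-weighted frame-level chords.

    frames: iterable of (time_sec, chord_label, probability).
    beat_times: sorted beat onsets in seconds.
    Returns one label per beat, or None for a beat that contains no frame.
    """
    histograms = [defaultdict(float) for _ in beat_times]
    for time, chord, prob in frames:
        beat = bisect_right(beat_times, time) - 1  # index of the enclosing beat
        if beat >= 0:  # frames before the first beat are ignored
            histograms[beat][chord] += prob
    return [max(h, key=h.get) if h else None for h in histograms]


frames = [
    (0.0, "Am", 0.6), (0.1, "C", 0.9), (0.2, "Am", 0.5),  # first beat
    (0.6, "Dm", 0.7), (0.7, "Dm", 0.4),                    # second beat
]
print(beat_level_chords(frames, [0.0, 0.5]))  # ['Am', 'Dm']
```

In the first beat, C has the single most confident frame, but Am accumulates more weight over the beat (1.1 against 0.9) and is therefore selected.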
12
Language models (LM)
Training:
- For each genre, we have a set of chord progression files
- n-grams are extracted from the chord progressions (… Am Dm Em …), and we obtain a language model for each genre
- Example for n = 2 (2-gram probabilities): Am Dm 0.012; Dm Em 0.008; …
- LMs can be constructed using different n-gram lengths (2-grams, 3-grams, 4-grams)
Test:
- A new problem file is evaluated against each LM
- The file is assigned the genre whose LM gives it the highest probability
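The training and test steps can be sketched with a small n-gram model. The slides do not specify the probability estimate or any smoothing, so the conditional probabilities and add-one smoothing used here are assumptions, as are the class name and the toy training progressions.

```python
import math
from collections import Counter


def ngrams(seq, n):
    """Return all length-n windows of seq as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


class ChordNgramModel:
    """n-gram model over chord labels with add-one smoothing (an assumption)."""

    def __init__(self, n=2, vocab_size=24):
        self.n = n
        self.vocab_size = vocab_size  # 24 major/minor triads
        self.ngram_counts = Counter()
        self.context_counts = Counter()

    def train(self, progressions):
        for seq in progressions:
            for gram in ngrams(seq, self.n):
                self.ngram_counts[gram] += 1
                self.context_counts[gram[:-1]] += 1

    def log_prob(self, seq):
        """Sum of log P(chord | previous n-1 chords) over the sequence."""
        total = 0.0
        for gram in ngrams(seq, self.n):
            numerator = self.ngram_counts[gram] + 1
            denominator = self.context_counts[gram[:-1]] + self.vocab_size
            total += math.log(numerator / denominator)
        return total


def classify(seq, models):
    """Return the genre whose model gives seq the highest log-probability."""
    return max(models, key=lambda genre: models[genre].log_prob(seq))


# Toy training data only; real models would be trained on many files per genre.
jazz, popular = ChordNgramModel(n=2), ChordNgramModel(n=2)
jazz.train([["C", "A", "Dm", "G", "C", "A", "Dm", "G"]])
popular.train([["C", "F", "G", "C", "F", "G", "C"]])
models = {"jazz": jazz, "popular": popular}
print(classify(["C", "A", "Dm", "G"], models))  # jazz
```

Working in log-probabilities avoids numerical underflow on long progressions, and changing `n` gives the 3-gram and 4-gram variants.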
14
Datasets
- DS1: ground truth of chord progressions
  - 761 Band-in-a-Box (symbolic) files in three genres (classical, jazz and popular), synthesized into audio
  - Full chords: Fm7 Bb7 Em7 A7 Dm7 Am7
  - Triads: Fm Bb Em A Dm Am
- DS2: 12 audio files extracted from commercial CDs of the same genres
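The relation between the full-chord and triad encodings in DS1 can be illustrated with a small reduction function. The label syntax and the rule that anything not marked minor becomes a major triad are assumptions; the slides do not describe how other qualities (such as diminished or augmented chords) were handled.

```python
import re

# Root letter with optional accidental, followed by any quality/extension text.
CHORD_RE = re.compile(r"^([A-G][#b]?)(.*)$")


def to_triad(label):
    """Reduce a chord label to a major or minor triad, e.g. 'Fm7' -> 'Fm'."""
    match = CHORD_RE.match(label)
    if match is None:
        raise ValueError(f"Unrecognized chord label: {label!r}")
    root, quality = match.groups()
    # 'm...' means minor unless it is the start of 'maj...'.
    is_minor = quality.startswith("m") and not quality.startswith("maj")
    return root + ("m" if is_minor else "")


full = ["Fm7", "Bb7", "Em7", "A7", "Dm7", "Am7"]
print([to_triad(chord) for chord in full])  # ['Fm', 'Bb', 'Em', 'A', 'Dm', 'Am']
```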
15
Chord transcription system proof of concept
[Diagram: DS1 chord progressions (… Am Dm Em …) are used to train language models under 10-fold cross-validation, both directly and after SYNTHESIS to audio and TRANSCRIPTION back to chord progressions.]
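A 10-fold cross-validation split over the DS1 files could be generated as in this sketch. The shuffling seed and the absence of per-genre stratification are assumptions; the slides only state that 10-fold cross-validation was used.

```python
import random


def k_fold_splits(items, k=10, seed=0):
    """Yield (train, test) lists so that every item is tested exactly once."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for held_out in range(k):
        test = [items[i] for i in folds[held_out]]
        train = [
            items[i]
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for i in fold
        ]
        yield train, test


# Hypothetical file identifiers standing in for the 761 DS1 files.
files = [f"file_{i:03d}" for i in range(761)]
for train, test in k_fold_splits(files):
    # Train one language model per genre on `train`, then classify `test`.
    pass
```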
16
Chord transcription system proof of concept
- Triads perform worse than full chords because they lack full chord structure information
- Errors introduced by the transcription process:
  - Incorrect chord recognition
  - Incorrect key estimation
- Results are good despite the simplicity of the features; note that a vocabulary of only 24 chords (major and minor triads) was used
17
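Evaluation with real audio data
[Diagram: LANGUAGE MODELS are trained on DS1 chord progressions (… Am Dm Em …); DS2 audio goes through TRANSCRIPTION to chord progressions (… C Em F …) and FEATURE EXTRACTION (n-grams), and is then scored against the language models for GENRE EVALUATION.]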
18
Evaluation with real audio data
- 3-grams and 4-grams perform the same, and both perform slightly better than 2-grams
- Better results are obtained when classifying jazz against academic (classical) or popular music
20
Conclusions
- Chord progressions seem to be a suitable representation for genre classification
- State-of-the-art transcription systems can be used to obtain chord progressions from audio files for this task
- Results are limited by the small vocabulary size
21
Future work
- Improve the chord transcription system
- Better chord recognition, including structures like 7th, dim, aug…
- Combination of the system with melody-based language models