Published by Ayana McNeil. Modified over 9 years ago.
1
Music Information Retrieval
George Tzanetakis (gtzan@cs.uvic.ca)
Associate Professor, IEEE Senior Member, Tier II Canada Research Chair
Computer Science Department (also in Music, ECE), University of Victoria, Canada
Copyright 2011 G.Tzanetakis
2
MIR
‣ Interdisciplinary science of retrieving information from music
‣ ISMIR: Int. Symposium -> Int. Conf. on MIR -> Int. Conf. of the Society of MIR
‣ First ISMIR in 2000
‣ Increasing presence in ICASSP, ICME, ACMM, TMM, TASLP, MMTA
‣ All proceedings are freely available online
‣ Mailing list: music-ir@listes.ircam.fr
3
Connections
‣ MUSIC at the centre, connected to: Machine Learning, Signal Processing, Psychology, Computer Science, Information Science, Human-Computer Interaction
4
Music today
‣ Music is produced, distributed and consumed digitally
‣ 2011 digital music sales > physical album sales
5
Industry
6
Music Collections
‣ Personal music collections ~ thousands
‣ Streaming music sites, stores ~ millions
‣ Great celestial jukebox in the sky ~ all of recorded music in human history
‣ A 5-minute music track is digitally represented using approximately 26 million floating point numbers
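The figure of roughly 26 million numbers can be checked with a short calculation. The slide does not state the audio format; the sketch below assumes CD-quality stereo (44.1 kHz, two channels), and the constant and function names are illustrative only.

```python
# Assumed parameters (not given on the slide): CD-quality stereo audio.
SAMPLE_RATE_HZ = 44_100
CHANNELS = 2


def sample_count(duration_s, sample_rate_hz=SAMPLE_RATE_HZ, channels=CHANNELS):
    """Number of sample values needed to store `duration_s` seconds of audio."""
    return duration_s * sample_rate_hz * channels


five_minutes = sample_count(5 * 60)
print(five_minutes)  # 26460000, i.e. about 26 million values
```

Under these assumptions a mono track would need half as many values, so the slide's estimate is consistent with stereo audio.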
7
Overview (focus on signal processing and audio)
‣ Audio Feature Extraction: Timbre, Pitch, Rhythm
‣ Analysis: Similarity, Classification, Modelling Time
‣ Tasks: Similarity, Genre classification, Tag annotation, Query-by-Humming, Audio-Score Alignment
8
Audio Feature Extraction
‣ Sound and sine waves
‣ Timbral Features: Short Time Fourier Transform (STFT), Mel-Frequency Cepstral Coefficients (MFCC), Perceptual Audio Compression
‣ Pitch and Harmony
‣ Rhythm
9
Linear Systems and Sinusoids
‣ Linearity: in1 -> out1, in2 -> out2, so in1 + in2 -> out1 + out2
‣ Sine wave parameters: Amplitude, Period = 1 / Frequency, Phase (0, 180, 360 degrees)
‣ True sine waves last forever
‣ sine wave -> LTI -> new sine wave (same frequency; only amplitude and phase change)
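Both properties on this slide can be checked numerically. The sketch below uses a small FIR filter as one example of an LTI system; the filter taps, sample rate and test frequencies are arbitrary choices for illustration, not values from the presentation.

```python
import numpy as np


def fir_filter(x, h):
    """Apply an FIR filter (one example of an LTI system) by convolution."""
    return np.convolve(x, h)[: len(x)]


def dominant_frequency(x, sample_rate):
    """Frequency (Hz) of the largest peak in the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]


sample_rate = 8000                          # assumed for illustration
t = np.arange(sample_rate) / sample_rate    # one second of samples
in1 = np.sin(2 * np.pi * 440 * t)
in2 = 0.5 * np.sin(2 * np.pi * 1000 * t + 0.3)
h = np.array([0.25, 0.5, 0.25])             # hypothetical smoothing filter

out1 = fir_filter(in1, h)
out2 = fir_filter(in2, h)
out_sum = fir_filter(in1 + in2, h)

# Linearity: filtering the sum equals the sum of the filtered signals.
superposition_holds = np.allclose(out_sum, out1 + out2)

# A sine stays a sine at the same frequency after the LTI system.
freq_in = dominant_frequency(in1, sample_rate)
freq_out = dominant_frequency(out1, sample_rate)
```

Here `freq_in` and `freq_out` should agree, while the amplitude of `out1` is slightly lower than that of `in1` because the filter attenuates higher frequencies.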
10
Fourier Transform
‣ Joseph Fourier, 1768-1830
11
Short Time Fourier Transform
‣ Time-varying spectra
‣ Fast Fourier Transform (FFT)
‣ [Figure labels: Input, Time (t, t+1, t+2), Filters, Oscillators, Output, Amplitude, Frequency]
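A minimal sketch of how time-varying spectra can be computed: the signal is cut into overlapping frames, each frame is windowed, and an FFT is taken per frame. The frame size, hop size and Hann window are common choices assumed here, not parameters taken from the slides.

```python
import numpy as np


def stft_magnitude(x, frame_size=1024, hop_size=512):
    """Magnitude spectrogram with shape (n_frames, frame_size // 2 + 1).

    Assumes len(x) >= frame_size; trailing samples that do not fill a
    whole frame are ignored.
    """
    window = np.hanning(frame_size)
    n_frames = 1 + (len(x) - frame_size) // hop_size
    frames = np.stack([
        x[i * hop_size: i * hop_size + frame_size] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1))


# Usage: a steady 1024 Hz sine sampled at 8192 Hz (illustrative values).
sample_rate = 8192
t = np.arange(sample_rate) / sample_rate
spectrogram = stft_magnitude(np.sin(2 * np.pi * 1024 * t))
peak_bins = np.argmax(spectrogram, axis=1)   # strongest bin in every frame
```

With these values the bin spacing is 8192 / 1024 = 8 Hz, so the sine falls on bin 128 in every frame.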
12
Spectrum and Shape Descriptors
‣ Descriptors: Centroid, Rolloff, Flux, Bandwidth, Moments, ...
‣ Feature vector = point in Feature Space
‣ [Figure labels: M, F, Centroid]
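The slide names the descriptors without giving formulas, and definitions vary across the literature. The sketch below uses one common set of definitions; the 85% rolloff fraction and the unnormalised flux are assumptions made for illustration.

```python
import numpy as np


def spectral_centroid(mag, freqs):
    """Magnitude-weighted mean frequency (the 'centre of gravity')."""
    total = np.sum(mag)
    return float(np.sum(freqs * mag) / total) if total > 0 else 0.0


def spectral_rolloff(mag, freqs, fraction=0.85):
    """Frequency below which `fraction` of the total magnitude lies."""
    cumulative = np.cumsum(mag)
    index = np.searchsorted(cumulative, fraction * cumulative[-1])
    return float(freqs[index])


def spectral_flux(mag, previous_mag):
    """Squared difference between successive magnitude spectra."""
    return float(np.sum((mag - previous_mag) ** 2))


# Usage on a toy 4-bin spectrum with all energy in the 100 Hz bin.
freqs = np.array([0.0, 100.0, 200.0, 300.0])
mag = np.array([0.0, 1.0, 0.0, 0.0])
feature_vector = np.array([
    spectral_centroid(mag, freqs),
    spectral_rolloff(mag, freqs),
    spectral_flux(mag, mag),
])
```

Stacking the descriptors into one array gives the feature vector for that frame, which is a single point in feature space.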
13
Mel Frequency Cepstral Coefficients
‣ Mel-scale filterbank: 13 linearly-spaced filters, 27 log-spaced filters
‣ Linear spacing of centre frequencies: CF-130, CF, CF+130
‣ Log spacing of centre frequencies: CF / 1.0718, CF, CF * 1.0718
‣ Pipeline: Mel-filtering -> Log -> DCT -> MFCCs
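The last two stages of the pipeline (Log, then DCT) can be sketched as follows, assuming the mel filterbank energies have already been computed. The number of coefficients kept, the small floor added before the logarithm and the unnormalised DCT-II are assumptions; real implementations differ in these details.

```python
import numpy as np


def cepstral_coefficients(filterbank_energies, n_coeffs=13, floor=1e-10):
    """Log-compress filterbank energies and decorrelate them with a DCT-II.

    `n_coeffs` and `floor` are illustrative choices, not slide values.
    """
    energies = np.asarray(filterbank_energies, dtype=float)
    log_energies = np.log(energies + floor)      # Log stage
    n_bands = len(log_energies)
    k = np.arange(n_coeffs)[:, None]             # coefficient index
    n = np.arange(n_bands)[None, :]              # filter (band) index
    basis = np.cos(np.pi * k * (n + 0.5) / n_bands)   # unnormalised DCT-II
    return basis @ log_energies                  # DCT stage


# Usage: a perfectly flat 40-band spectrum (13 + 27 filters).
flat = cepstral_coefficients(np.full(40, np.e))
```

For a flat spectrum all the information ends up in coefficient 0, and the higher coefficients, which describe spectral shape, are zero up to rounding.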
14
Audio Feature Extraction
15
Traditional Music Representations
16
Pitch content
‣ Harmony, melody = pitch concepts
‣ Music Theory
‣ Score = Music
‣ Bridge to symbolic MIR
‣ Automatic music transcription
‣ Non-transcriptive arguments
‣ Split the octave into discrete, logarithmically spaced intervals
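One familiar instance of splitting the octave into logarithmically spaced intervals is 12-tone equal temperament, where each step multiplies the frequency by 2^(1/12). The slide does not name a tuning system, so the 12 steps, the MIDI note numbering and the A4 = 440 Hz reference below are assumptions.

```python
def midi_to_hz(midi_note, a4_hz=440.0):
    """Frequency of a MIDI note number in 12-tone equal temperament.

    Assumes MIDI note 69 is A4 tuned to `a4_hz`.
    """
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)


c4 = midi_to_hz(60)                                # about 261.63 Hz
semitone_ratio = midi_to_hz(61) / midi_to_hz(60)   # about 1.0595
octave_ratio = midi_to_hz(72) / midi_to_hz(60)     # 12 steps double the frequency
```

Equal ratios between neighbouring notes mean equal distances on a logarithmic frequency axis.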
17
Pitch Detection
‣ Approaches: Time-domain, Frequency-domain, Perceptual
‣ Pitch is a PERCEPTUAL attribute, correlated with but not equivalent to fundamental frequency
18
Time Domain
‣ [Figure: C4 Clarinet Note, C4 Sine Wave]
‣ Number of zero-crossings: sensitive to noise, needs a low-pass filter (LPF)
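A minimal sketch of the zero-crossing idea: a sine crosses zero twice per period, so the crossing count gives a frequency estimate. The signal parameters and noise level are illustrative, and the noisy case is included only to show why the slide recommends low-pass filtering first.

```python
import numpy as np


def zero_crossing_count(x):
    """Number of sign changes between consecutive samples."""
    signs = np.signbit(x)
    return int(np.count_nonzero(signs[1:] != signs[:-1]))


def frequency_from_zero_crossings(x, sample_rate):
    """Estimate frequency assuming two zero crossings per period."""
    duration = len(x) / sample_rate
    return zero_crossing_count(x) / (2.0 * duration)


sample_rate = 8000                                   # assumed for illustration
t = np.arange(sample_rate) / sample_rate
clean = np.sin(2 * np.pi * 100 * t + 0.1)            # 100 Hz test tone
noisy = clean + 0.2 * np.random.default_rng(0).standard_normal(len(t))

clean_estimate = frequency_from_zero_crossings(clean, sample_rate)
noisy_estimate = frequency_from_zero_crossings(noisy, sample_rate)
```

The clean tone yields an estimate close to 100 Hz, while the added noise creates extra crossings near every true zero and pushes the estimate upward.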
19
AutoCorrelation
‣ Efficient computation possible for powers of 2 using FFT
‣ F(f) = FFT(X(t))
‣ S(f) = F(f) F*(f)
‣ R(l) = IFFT(S(f))
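The three steps on this slide map directly onto NumPy calls. The sketch below adds zero-padding to a power of two, which is not shown on the slide but avoids circular wrap-around. The pitch estimator that picks the strongest lag in a search range is an illustrative addition, and its range limits are assumptions.

```python
import numpy as np


def autocorrelation_fft(x):
    """Autocorrelation R(l) for lags 0..len(x)-1, computed through the FFT."""
    n = len(x)
    n_fft = 1 << (2 * n - 1).bit_length()   # power of two >= 2n - 1
    F = np.fft.rfft(x, n_fft)               # F(f) = FFT(X(t))
    S = F * np.conj(F)                      # S(f) = F(f) F*(f)
    R = np.fft.irfft(S, n_fft)              # R(l) = IFFT(S(f))
    return R[:n]


def estimate_f0(x, sample_rate, fmin=50.0, fmax=1000.0):
    """Fundamental frequency from the strongest lag (hypothetical helper)."""
    r = autocorrelation_fft(x)
    shortest_lag = int(sample_rate / fmax)
    longest_lag = int(sample_rate / fmin)
    lag = shortest_lag + int(np.argmax(r[shortest_lag:longest_lag]))
    return sample_rate / lag


sample_rate = 8000                                      # assumed for illustration
tone = np.sin(2 * np.pi * 200 * np.arange(2048) / sample_rate)
f0 = estimate_f0(tone, sample_rate)                     # period of 40 samples
```

The FFT route costs O(N log N) instead of the O(N^2) of the direct sum, which is the efficiency gain the slide refers to.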
20
Frequency Domain
‣ Fundamental frequency (as well as pitch) will correspond to peaks in the spectrum.
‣ The fundamental does not necessarily have the highest amplitude.
‣ [Figure: Sine C4, Clarinet C4]
21
Chroma – Pitch perception
22
Automatic Rhythm Description
23
Beat Histograms (Tzanetakis et al., AMTA01)
‣ Beat Histogram Features: max(h(i)), argmax(h(i))
24
Analysis Overview
‣ Musical Piece: Trajectory, Point, Cloud
25
Content-based Similarity Retrieval (or query-by-example)
‣ Point
‣ Input: Query example
‣ Output: Ranked list of similar audio files based on feature vector similarity
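If each audio file is summarised as a single feature vector (a point), query-by-example reduces to sorting the collection by distance to the query. The sketch below uses Euclidean distance, which is one possible similarity measure and not necessarily the one used in the presentation; the toy vectors are made up.

```python
import numpy as np


def rank_by_similarity(query, collection):
    """Return (indices, distances) of collection rows, nearest first.

    `collection` has shape (n_files, n_features) and `query` has shape
    (n_features,).
    """
    distances = np.linalg.norm(collection - query, axis=1)
    order = np.argsort(distances, kind="stable")
    return order, distances[order]


# Usage with three hypothetical 2-D feature vectors.
collection = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
query = np.array([0.9, 0.9])
ranking, ranked_distances = rank_by_similarity(query, collection)
```

In practice the features are usually normalised first, so that no single dimension dominates the distance.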
26
Classification
‣ Decision boundary: partitioning of feature space
‣ Generative vs discriminative models
‣ Bayes rule: P(class | x) = p(x | class) * P(class) / p(x)
‣ Example classes: Music, Speech
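The formula on this slide appears to be Bayes' rule, which is how a generative classifier turns class-conditional densities into class probabilities. The sketch below uses one-dimensional Gaussian densities for the two classes; the feature, the means, standard deviations and priors are all made-up values for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian, used here as p(x | class)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def class_posteriors(x, models):
    """P(class | x) = p(x | class) * P(class) / p(x) for every class.

    `models` maps a label to (mean, std, prior); all values hypothetical.
    """
    joint = {label: gaussian_pdf(x, mean, std) * prior
             for label, (mean, std, prior) in models.items()}
    evidence = sum(joint.values())              # p(x)
    return {label: value / evidence for label, value in joint.items()}


models = {"music": (0.2, 0.1, 0.5), "speech": (0.6, 0.15, 0.5)}
posteriors = class_posteriors(0.25, models)
predicted = max(posteriors, key=posteriors.get)
```

The decision boundary is the feature value where the two posteriors are equal. A discriminative model would learn that boundary directly instead of modelling p(x | class).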
27
Classification
‣ Genre/Style, Emotion/Mood, Artist, Instrument
‣ MIREX 2007: 10 genres, 700 30-second clips / genre
28
Multi-tag annotation
‣ Free-form tags (female voice, woman singing)
‣ Multi-label classification problems with twists
‣ Issues: synonyms, subpart relations, sparse, noisy
‣ Cold start problem
‣ Typically each tag is treated independently as a classification problem
‣ Inverse also interesting (query-by-keywords)
29
Stacking
30
Polyphonic Audio-Score Alignment
‣ Representation: Time Series of Chroma
‣ Matching Procedure: Dynamic Time Warping
31
Dynamic Time Warping
‣ Aligned performances of the same orchestral piece
‣ Attempting to align two different orchestral pieces
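A minimal sketch of Dynamic Time Warping between two feature sequences, for example two chroma time series. It uses the textbook step pattern (match, insertion, deletion) with Euclidean frame distance; these choices and the toy sequences are assumptions, not details from the presentation.

```python
import numpy as np


def dtw_cost(a, b):
    """Total cost of the best monotonic alignment between two sequences.

    `a` and `b` have shape (n_frames, n_features); the frame counts may differ.
    """
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_cost = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            D[i, j] = frame_cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[n, m])


# Usage: the same 'piece' played slower versus a different 'piece'.
piece = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
slower = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 1.0]])
different = slower[::-1]

same_cost = dtw_cost(piece, slower)        # tempo change is absorbed
other_cost = dtw_cost(piece, different)    # no good alignment exists
```

A low cost indicates that the two sequences can be warped onto each other, as for two performances of the same piece. A high cost is what one would expect when trying to align two different pieces.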
32
Query-by-humming
‣ User sings a melody
‣ Computer searches database for song containing the melody
‣ The challenge of difficult queries
33
The MUSART system
‣ Query preprocessing: Pitch contour extraction (audio), Note segmentation (symbolic)
‣ Target preprocessing (symbolic): Theme extraction, Model-forming, representation
‣ Search to find approximate match: Dynamic Time Warping, HMMs
34
Conclusions
‣ Through a combination of digital signal processing and machine learning techniques, a variety of music information retrieval tasks have been explored in the literature.
‣ The tasks covered in this presentation are representative of existing work, and there are already commercial implementations for them. Many more are actively being investigated.
‣ Music is a complex and fascinating signal, and we are just beginning to understand it better using computers.