Music Information Retrieval. George Tzanetakis, Associate Professor, IEEE Senior Member. Copyright 2011 G.Tzanetakis.


1 Music Information Retrieval
George Tzanetakis (gtzan@cs.uvic.ca)
Associate Professor, IEEE Senior Member, Tier II Canada Research Chair
Computer Science Department (also in Music, ECE), University of Victoria, Canada
Copyright 2011 G.Tzanetakis

2 MIR
‣ Interdisciplinary science of retrieving information from music
‣ ISMIR - Int. Symposium -> Int. Conf. on MIR -> Int. Conf. of the Society of MIR
‣ First ISMIR in 2000
‣ Increasing presence in ICASSP, ICME, ACMM, TMM, TASLP, MMTA
‣ All proceedings are freely available online
‣ music-ir@listes.ircam.fr

3 Connections
[diagram: MUSIC connected to Machine Learning, Signal Processing, Psychology, Computer Science, Information Science, Human-Computer Interaction]

4 Music today
‣ Music is produced, distributed and consumed digitally
‣ 2011 digital music sales > physical album sales

5 Industry
[figure]

6 Music Collections
‣ Personal music collections ~ thousands
‣ Streaming music sites, stores ~ millions
‣ Great celestial jukebox in the sky ~ all of recorded music in human history
‣ A 5-minute music track is digitally represented using approximately 26 million floating point numbers
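The 26 million figure on the Music Collections slide can be checked with a short calculation. The slide does not state the audio format; the sketch below assumes CD-quality stereo (44.1 kHz, 2 channels), which is an assumption that happens to reproduce the number.

```python
# Assumed parameters (not stated on the slide): CD-quality stereo audio.
SAMPLE_RATE_HZ = 44_100
CHANNELS = 2
DURATION_S = 5 * 60  # a 5-minute track


def sample_count(duration_s, sample_rate_hz=SAMPLE_RATE_HZ, channels=CHANNELS):
    """Number of individual sample values needed to represent the track."""
    return duration_s * sample_rate_hz * channels


n_samples = sample_count(DURATION_S)
print(f"{n_samples:,} samples")  # 26,460,000 samples
```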

7 Overview
‣ Focus on signal processing and audio
‣ Audio Feature Extraction: Timbre, Pitch, Rhythm
‣ Analysis: Similarity, Classification, Modelling Time
‣ Tasks: Similarity, Genre classification, Tag annotation, Query-by-Humming, Audio-Score Alignment

8 Audio Feature Extraction
‣ Sound and sine waves
‣ Timbral Features: Short Time Fourier Transform (STFT), Mel-Frequency Cepstral Coefficients (MFCC), Perceptual Audio Compression
‣ Pitch and Harmony
‣ Rhythm

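9 Linear Systems and Sinusoids
‣ Linearity: if in1 -> out1 and in2 -> out2, then in1 + in2 -> out1 + out2
‣ A sinusoid is described by its amplitude, period (= 1 / frequency) and phase (0 to 360 degrees)
‣ True sine waves last forever
‣ sine wave -> LTI -> new sine wave (same frequency; only amplitude and phase change)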

10 Fourier Transform
[portrait: Joseph Fourier, 1768-1830]

11 Short Time Fourier Transform
‣ Time-varying spectra computed with the Fast Fourier Transform (FFT)
[diagram: input frames at times t, t+1, t+2 pass through a bank of filters/oscillators; output is amplitude versus frequency]
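A minimal sketch of the time-varying spectra idea, assuming NumPy is available. The frame size, hop size, and Hann window are illustrative choices, not values from the slides, and `stft_magnitude` is a hypothetical helper name.

```python
import numpy as np


def stft_magnitude(signal, frame_size=1024, hop_size=512):
    """Return magnitude spectra, one row per frame (shape: frames x bins)."""
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop_size
    frames = np.stack([
        signal[i * hop_size : i * hop_size + frame_size] * window
        for i in range(n_frames)
    ])
    # One FFT per windowed frame: each row is the spectrum at one time step.
    return np.abs(np.fft.rfft(frames, axis=1))


sample_rate = 8000
t = np.arange(sample_rate) / sample_rate       # one second of audio
tone = np.sin(2 * np.pi * 1000 * t)            # 1 kHz test tone
spectra = stft_magnitude(tone)
peak_bin = int(spectra.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 1024        # convert bin index to Hz
```

For a stationary tone every frame peaks at the same bin; for music the peak positions move from frame to frame, which is what the later feature slides summarise.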

12 Spectrum and Shape Descriptors
‣ Descriptors: Centroid, Rolloff, Flux, Bandwidth, Moments, ...
[diagram: magnitude (M) versus frequency (F) spectrum with the centroid marked; each frame becomes a feature vector in feature space]
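The descriptors named on this slide can be sketched as follows, assuming NumPy. These are common textbook definitions rather than necessarily the exact ones used in the talk; the 0.85 rolloff fraction and the normalisation inside the flux are assumptions.

```python
import numpy as np


def spectral_centroid(mag, freqs):
    """Magnitude-weighted mean frequency (the 'centre of gravity' of the spectrum)."""
    total = mag.sum()
    return float((freqs * mag).sum() / total) if total > 0 else 0.0


def spectral_rolloff(mag, freqs, fraction=0.85):
    """Frequency below which `fraction` of the total magnitude lies."""
    cumulative = np.cumsum(mag)
    idx = np.searchsorted(cumulative, fraction * cumulative[-1])
    return float(freqs[min(idx, len(freqs) - 1)])


def spectral_flux(mag, prev_mag):
    """Squared difference between successive normalised spectra."""
    a = mag / (mag.sum() + 1e-12)
    b = prev_mag / (prev_mag.sum() + 1e-12)
    return float(((a - b) ** 2).sum())


freqs = np.array([0.0, 100.0, 200.0, 300.0, 400.0])   # bin frequencies in Hz
prev_mag = np.array([0.0, 1.0, 2.0, 1.0, 0.0])        # spectrum of the previous frame
mag = np.array([0.0, 0.0, 1.0, 2.0, 1.0])             # spectrum of the current frame

feature_vector = [
    spectral_centroid(mag, freqs),
    spectral_rolloff(mag, freqs),
    spectral_flux(mag, prev_mag),
]
```

Stacking such values per frame gives one point in feature space per frame.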

13 Mel Frequency Cepstral Coefficients
‣ Mel-scale filterbank: 13 linearly-spaced filters, 27 log-spaced filters
‣ Linear filters span CF-130 to CF+130; log filters span CF / 1.0718 to CF * 1.0718 (CF = centre frequency)
‣ Pipeline: Mel-filtering -> Log -> DCT -> MFCCs
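The Mel-filtering -> Log -> DCT pipeline can be sketched as below, assuming NumPy. The filter counts, the 130 spacing and the 1.0718 ratio come from the slide; the first centre frequency (130 Hz), the triangular filter shape, the unnormalised DCT-II and the choice of 13 output coefficients are assumptions made for illustration, so this is not a reference MFCC implementation.

```python
import numpy as np


def filter_edges(first_cf=130.0, n_linear=13, n_log=27, step_hz=130.0, ratio=1.0718):
    """Centre frequencies plus one extra edge on each side (n_linear + n_log + 2 points)."""
    linear = first_cf + step_hz * np.arange(-1, n_linear)   # lower edge + linear centres
    log = linear[-1] * ratio ** np.arange(1, n_log + 2)      # log centres + upper edge
    return np.concatenate([linear, log])


def triangular_filterbank(edges, n_fft, sample_rate):
    """One triangle per centre, rising from the previous point and falling to the next."""
    bin_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((len(edges) - 2, len(bin_freqs)))
    for i in range(len(edges) - 2):
        lo, cf, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (bin_freqs - lo) / (cf - lo)
        falling = (hi - bin_freqs) / (hi - cf)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def dct_ii(x, n_coeffs):
    """Unnormalised DCT-II of a 1-D vector, keeping the first n_coeffs terms."""
    n = len(x)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (np.arange(n) + 0.5) / n)
    return basis @ x


def mfcc(frame, sample_rate, n_coeffs=13):
    """Mel-filtering -> log -> DCT for a single windowed frame."""
    magnitude = np.abs(np.fft.rfft(frame))
    bank = triangular_filterbank(filter_edges(), len(frame), sample_rate)
    log_energy = np.log(bank @ magnitude + 1e-10)   # small offset avoids log(0)
    return dct_ii(log_energy, n_coeffs)


rng = np.random.default_rng(0)
frame = rng.standard_normal(2048) * np.hanning(2048)   # synthetic test frame
coeffs = mfcc(frame, sample_rate=44_100)
```

With these assumed settings the highest filter edge sits a little below 12 kHz, so the sample rate must be at least about 24 kHz for all 40 filters to be populated.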

14 Audio Feature Extraction

15 Traditional Music Representations

16 Pitch content
‣ Harmony, melody = pitch concepts
‣ Music Theory Score = Music
‣ Bridge to symbolic MIR
‣ Automatic music transcription
‣ Non-transcriptive arguments
Split the octave into discrete logarithmically spaced intervals
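One common way to split the octave into discrete logarithmically spaced intervals is 12-tone equal temperament with MIDI note numbering. The slide does not name a tuning system, so the 12 steps per octave, the A4 = 440 Hz reference and the helper names below are assumptions.

```python
import math

A4_HZ = 440.0    # assumed reference tuning
A4_MIDI = 69     # MIDI note number of A4
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_hz(note):
    """Each semitone multiplies the frequency by 2 ** (1 / 12)."""
    return A4_HZ * 2.0 ** ((note - A4_MIDI) / 12)


def hz_to_midi(freq_hz):
    """Nearest equal-tempered note number for a frequency in Hz."""
    return round(A4_MIDI + 12 * math.log2(freq_hz / A4_HZ))


def note_name(note):
    """Pitch class plus octave, e.g. 60 -> 'C4'."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"


print(note_name(hz_to_midi(261.63)))  # C4
```

Taking `note % 12` alone discards the octave and keeps only the pitch class, which is the idea behind chroma representations.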

17 Pitch Detection
‣ Approaches: Time-domain, Frequency-domain, Perceptual
‣ Pitch is a PERCEPTUAL attribute, correlated with but not equivalent to fundamental frequency

18 Time Domain
[waveforms: C4 clarinet note, C4 sine wave]
‣ Counting zero-crossings is sensitive to noise, so it needs a low-pass filter (LPF)
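A rough sketch of zero-crossing pitch estimation and of why a low-pass filter helps, assuming NumPy. The moving-average filter, its width of 41 samples and the noise level are illustrative choices, not taken from the slides.

```python
import numpy as np


def zero_crossing_pitch(signal, sample_rate):
    """Estimate frequency in Hz: a periodic wave crosses zero twice per period."""
    signs = np.signbit(signal)
    crossings = np.count_nonzero(signs[1:] != signs[:-1])
    duration_s = len(signal) / sample_rate
    return crossings / (2.0 * duration_s)


def moving_average(signal, width):
    """Crude low-pass filter (LPF): mean over `width` neighbouring samples."""
    return np.convolve(signal, np.ones(width) / width, mode="same")


sample_rate = 44_100
t = np.arange(sample_rate) / sample_rate               # one second of audio
clean = np.sin(2 * np.pi * 261.63 * t)                  # C4 sine wave
noise = 0.3 * np.random.default_rng(0).standard_normal(len(t))
noisy = clean + noise

clean_estimate = zero_crossing_pitch(clean, sample_rate)
noisy_estimate = zero_crossing_pitch(noisy, sample_rate)
filtered_estimate = zero_crossing_pitch(moving_average(noisy, 41), sample_rate)
```

The noisy estimate comes out far too high because noise adds spurious crossings near every true crossing; smoothing first brings the estimate back close to the clean one.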

19 AutoCorrelation
‣ Efficient computation possible for powers of 2 using FFT
F(f) = FFT(X(t))
S(f) = F(f) F*(f)
R(l) = IFFT(S(f))
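The three steps on the AutoCorrelation slide translate almost directly into NumPy. Zero-padding to a power of two (to avoid circular wrap-around) and the 50 to 1000 Hz search range in the pitch helper are assumptions added for this sketch.

```python
import numpy as np


def autocorrelation_fft(x):
    """R(l) = IFFT(F(f) * conj(F(f))), zero-padded to a power of two."""
    n = len(x)
    n_fft = 1 << (2 * n - 1).bit_length()    # power of two >= 2n - 1
    spectrum = np.fft.rfft(x, n_fft)          # F(f) = FFT(X(t))
    power = spectrum * np.conj(spectrum)      # S(f) = F(f) F*(f)
    return np.fft.irfft(power, n_fft)[:n]     # R(l) for lags 0 .. n-1


def autocorrelation_pitch(x, sample_rate, min_hz=50.0, max_hz=1000.0):
    """Pick the lag with the strongest self-similarity inside the search range."""
    r = autocorrelation_fft(x)
    min_lag = int(sample_rate / max_hz)
    max_lag = int(sample_rate / min_hz)
    best_lag = min_lag + int(np.argmax(r[min_lag : max_lag + 1]))
    return sample_rate / best_lag


sample_rate = 44_100
t = np.arange(2048) / sample_rate
estimate = autocorrelation_pitch(np.sin(2 * np.pi * 220.0 * t), sample_rate)
```

Because lags are whole numbers of samples, the estimate is quantised: a 220 Hz tone lands on the nearest integer lag rather than on exactly 220 Hz.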

20 Frequency Domain
‣ Fundamental frequency (as well as pitch) will correspond to peaks in the spectrum
‣ The fundamental does not necessarily have the highest amplitude
[spectra: sine C4, clarinet C4]

21 Chroma – Pitch perception

22 Automatic Rhythm Description

23 Beat Histograms (Tzanetakis et al., AMTA01)
‣ Beat Histogram Features: max(h(i)), argmax(h(i))

24 Analysis Overview
[diagram labels: Musical Piece, Trajectory, Point, Cloud]

25 Content-based Similarity Retrieval (or query-by-example)
[diagram label: Point]
‣ Input: Query example
‣ Output: Ranked list of similar audio files based on feature vector similarity
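Query-by-example retrieval reduces to sorting a collection by distance to the query's feature vector. The sketch below uses Euclidean distance, which is only one possible similarity measure; the file names and two-dimensional feature vectors are made up for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(query, collection):
    """Return (name, distance) pairs sorted from most to least similar."""
    scored = [(name, euclidean(query, vec)) for name, vec in collection.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical collection: one feature vector per audio file.
collection = {
    "a.wav": [0.10, 0.90],
    "b.wav": [0.80, 0.20],
    "c.wav": [0.20, 0.80],
}
ranking = rank_by_similarity([0.12, 0.90], collection)
print([name for name, _ in ranking])  # ['a.wav', 'c.wav', 'b.wav']
```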

26 Classification
‣ Decision boundary
‣ Partitioning of feature space
‣ Generative vs discriminative models
‣ Bayes' rule for the Music vs Speech example, with feature vector x: P(class | x) = p(x | class) * P(class) / p(x)
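A generative classifier applies Bayes' rule with one likelihood model per class. The sketch below uses a single one-dimensional Gaussian per class; the means, standard deviations and equal priors are hypothetical numbers chosen only to make the example run.

```python
import math


def gaussian_pdf(x, mean, std):
    """Likelihood p(x | class) under a 1-D Gaussian model."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(x, models, priors):
    """Bayes' rule: P(class | x) = p(x | class) * P(class) / p(x)."""
    joint = {c: gaussian_pdf(x, *models[c]) * priors[c] for c in models}
    evidence = sum(joint.values())   # p(x), summed over all classes
    return {c: joint[c] / evidence for c in joint}


# Hypothetical (mean, std) of one feature for each class.
models = {"music": (0.3, 0.1), "speech": (0.6, 0.1)}
priors = {"music": 0.5, "speech": 0.5}

near_music = posterior(0.3, models, priors)
on_boundary = posterior(0.45, models, priors)
```

With equal priors and equal spreads the decision boundary falls midway between the two means, where both posteriors equal one half.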

27 Classification
‣ Genre/Style
‣ Emotion/Mood
‣ Artist
‣ Instrument
MIREX 2007: 10 genres, 700 30-second clips / genre

28 Multi-tag annotation
‣ Free-form tags (female voice, woman singing)
‣ Multi-label classification problems with twists
‣ Issues: synonyms, subpart relations, sparse, noisy
‣ Cold start problem
‣ Typically each tag is treated independently as a classification problem
‣ Inverse also interesting (query-by-keywords)

29 Stacking

30 Polyphonic Audio-Score Alignment
‣ Representation: Time Series of Chroma
‣ Matching Procedure: Dynamic Time Warping
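Dynamic Time Warping finds the cheapest monotonic alignment between two sequences. This is a minimal textbook version under simple assumptions (unit step pattern, no path constraints); for audio-score alignment the sequence elements would be chroma vectors and `dist` a vector distance, whereas the example uses plain numbers to stay short.

```python
def dtw(a, b, dist):
    """Return (total cost, alignment path) between sequences a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Cheapest way to reach (i, j): from above, from the left, or diagonally.
            step = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = dist(a[i - 1], b[j - 1]) + step
    # Backtrack from the end to recover which elements were matched.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


total, path = dtw([1, 2, 3], [1, 2, 2, 3], dist=lambda x, y: abs(x - y))
print(total, path)  # 0.0 [(0, 0), (1, 1), (1, 2), (2, 3)]
```

The repeated element in the second sequence is absorbed by matching it twice against the same element of the first, which is how a slower performance is aligned to a score.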

31 Dynamic Time Warping
[figures: aligned performances of the same orchestral piece; attempting to align two different orchestral pieces]

32 Query-by-humming
‣ User sings a melody
‣ Computer searches database for song containing the melody
‣ The challenge of difficult queries

33 The MUSART system
‣ Query preprocessing: Pitch contour extraction (audio), Note segmentation (symbolic)
‣ Target preprocessing (symbolic): Theme extraction; Model-forming, representation
‣ Search to find approximate match: Dynamic Time Warping, HMMs

34 Conclusions
‣ Through a combination of digital signal processing and machine learning techniques, a variety of music information retrieval tasks have been explored in the literature
‣ The tasks covered in this presentation are representative of existing work, and commercial implementations already exist for them. Many more are actively being investigated.
‣ Music is a complex and fascinating signal, and we are just beginning to understand it better using computers

