Copyright Nov. 2002, George Tzanetakis Digital Music & Music Processing George Tzanetakis PostDoctoral Fellow Computer Science Department Carnegie Mellon.

1 Copyright Nov. 2002, George Tzanetakis Digital Music & Music Processing George Tzanetakis PostDoctoral Fellow Computer Science Department Carnegie Mellon University gtzan@cs.cmu.edu http://www.cs.cmu.edu/~gtzan

2 Overview: Music Information Retrieval (MIR) and Computer Audition; Motivation; Techniques; Applications; Computer Music and Sound Synthesis; Examples, demos

3 MIR, Music History [timeline: 9000 B.C., 1000, 1700, 1877, 1960, 2002]

4 Music: 4 million recorded CD tracks; 4000 CDs / month; Mp3 bandwidth %; Global, Pervasive, Persistent. Why?

5 The future of MIR: Library of all recorded music. Tasks: organize, search, retrieve, classify, recommend, browse, listen, annotate. Examples:

6 Audio MIR Pipeline: Signal Processing, Machine Learning, Human Computer Interaction; Hearing, Representation, Understanding, Analysis, Reacting, Interaction

7 Traditional Music Representations

8 Time domain waveform [axes: time, pressure]; Decompose into building blocks [axes: time, frequency]

9 MIDI: Musical Instrument Digital Interface; Hardware interface; File format; Note events: duration, discrete pitch, “instrument”; Extensions: General MIDI; Notation, OMR, continuous pitch

10 Symbolic vs Audio MIR: Audio, Polyphonic Transcription, Symbolic Representation (MIDI), MIR; Audio, Computer Audition, MIR; Machine Learning, Models

11 Feature extraction

12 Timbral Texture: Timbre = what differentiates sounds of the same loudness and pitch; Timbral Texture = what differentiates mixtures of sounds; Global, statistical and fuzzy properties

13 Spectrum [figure: spectra M at times t and t+1]

14 Fourier Transform: P = 1/f (period P, frequency f)

15 Short Time Fourier Transform (STFT): Filterbank interpretation; Filters, Oscillators; Amplitude, Frequency; output

16 Short Time Fourier Transform II [figure: spectra M at times t and t+1]
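
Note: the two STFT slides can be illustrated with a minimal sketch, assuming a Hann window and a naive DFT per frame. The function names, frame size, hop size and test tone are illustrative assumptions, not taken from the slides. Each output bin behaves like one band of the filterbank interpretation: the frame is correlated with a complex oscillator at that bin's frequency and the amplitude is kept.

```python
import cmath
import math


def hann(n_points):
    """Periodic Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_points) for n in range(n_points)]


def stft_magnitudes(signal, frame_size, hop_size):
    """Return one magnitude spectrum (bins 0..frame_size//2) per frame."""
    window = hann(frame_size)
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        frame = [signal[start + n] * window[n] for n in range(frame_size)]
        spectrum = []
        for k in range(frame_size // 2 + 1):
            # Naive DFT: correlate the frame with a complex oscillator at bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


# Hypothetical test tone: a sinusoid that falls exactly on bin 8 of a 64-point frame.
sample_rate = 8000
frame_size = 64
freq = 8 * sample_rate / frame_size  # 1000 Hz
signal = [math.sin(2 * math.pi * freq * n / sample_rate) for n in range(256)]

spectra = stft_magnitudes(signal, frame_size, hop_size=32)
peak_bins = [max(range(len(s)), key=s.__getitem__) for s in spectra]
print(len(spectra), "frames, peak bin per frame:", peak_bins)
```

A real system would use an FFT instead of the quadratic-time DFT loop; the loop is kept here only because it makes the oscillator-per-band reading explicit.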

17 Formants. From “Real Sound Synthesis for Interactive Applications”, P. Cook, A K Peters, used by permission

18 Linear Prediction Coefficients: Impulses @ f0, White Noise; Source, Filter, Speech; Lossless tubes

19 MPEG Audio Coding (mp3): Analysis Filterbank; Psychoacoustic Model; Available bits; 32 linearly spaced bands; Encoder: slower, complicated; Decoder: faster, simpler; Perceptual Audio Coding

20 Spectral Shape: Centroid, Rolloff, Flux, RMS, Moments, … [figure: spectrum M at time t]
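
Note: the spectral shape features named on this slide can be sketched as follows. This is a minimal illustration under common textbook definitions, not necessarily the exact formulas used in the talk. The 0.85 rolloff fraction, the normalization inside the flux, and the toy spectra are assumptions.

```python
import math


def centroid(mag):
    """Magnitude-weighted mean bin index (the spectrum's 'center of gravity')."""
    total = sum(mag)
    return sum(k * m for k, m in enumerate(mag)) / total if total else 0.0


def rolloff(mag, fraction=0.85):
    """Smallest bin at or below which `fraction` of the total magnitude lies."""
    target = fraction * sum(mag)
    running = 0.0
    for k, m in enumerate(mag):
        running += m
        if running >= target:
            return k
    return len(mag) - 1


def flux(mag, prev_mag):
    """Squared difference between two successive spectra, each normalized to sum 1."""
    def normalize(v):
        s = sum(v)
        return [x / s for x in v] if s else list(v)
    a, b = normalize(mag), normalize(prev_mag)
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rms(samples):
    """Root-mean-square level of a block of time-domain samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


# Hypothetical magnitude spectra at frames t and t+1.
mag_t = [0, 1, 2, 1, 0, 0, 0, 0]
mag_t1 = [0, 0, 0, 0, 1, 2, 1, 0]
feature_vector = [centroid(mag_t1), rolloff(mag_t1), flux(mag_t1, mag_t)]
print(feature_vector)
```

Collecting these numbers per frame gives the feature vector that the later analysis slides operate on.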

21 Summary of Timbral Texture Features: Time-Frequency analysis; Signal Processing (STFT, DWT); Source-filter (LPC); Perceptual (MP3); Spectral Shape to feature vector

22 Pitch Content: Harmony-melody = pitch concepts; Music theory: score = music; Bridge to symbolic MIR; Automatic music transcription; Non-transcriptive arguments

23 Automatic Pitch Detection: P = 1/f; Time-domain, Frequency-domain, Perceptual; Zerocrossings; Autocorrelation analysis = peaks of function correspond to dominant pitches
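
Note: the autocorrelation idea on this slide can be sketched for the single-pitch case. The search range of 50 to 1000 Hz, the function names and the test tone are assumptions made for illustration. The lag of the strongest autocorrelation peak is taken as the period P, and the pitch follows from f = 1/P.

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def detect_pitch(samples, sample_rate, f_min=50.0, f_max=1000.0):
    """Estimate the dominant pitch in Hz from the strongest autocorrelation peak."""
    min_lag = int(sample_rate / f_max)  # shortest period considered
    max_lag = int(sample_rate / f_min)  # longest period considered
    acf = autocorrelation(samples, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])
    return sample_rate / best_lag  # f = 1 / P, with P measured in samples


# Hypothetical test tone: 200 Hz sine, 0.1 s at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(800)]
print(detect_pitch(tone, sample_rate))
```

The resolution is limited to integer lags, so estimates for high pitches are coarse; interpolating around the peak is a common refinement that is left out here.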

24 Pitch Histograms: Jazz, Irish; Chroma - folded; Height - unfolded
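
Note: the folded and unfolded histograms can be sketched as below, assuming detected pitches are first mapped to MIDI note numbers (A4 = 440 Hz as note 69) and that folding means taking the note number modulo 12. The helper names and the example notes are illustrative assumptions.

```python
import math


def freq_to_midi(freq_hz):
    """Nearest MIDI note number, with A4 = 440 Hz mapped to note 69."""
    return int(round(69 + 12 * math.log2(freq_hz / 440.0)))


def unfolded_histogram(midi_notes):
    """Pitch height: one bin per MIDI note (0..127)."""
    hist = [0] * 128
    for note in midi_notes:
        hist[note] += 1
    return hist


def folded_histogram(midi_notes):
    """Chroma: all octaves folded onto 12 pitch classes (0 = C)."""
    hist = [0] * 12
    for note in midi_notes:
        hist[note % 12] += 1
    return hist


# Hypothetical detected notes: C4, C5, E4, G4, C3.
notes = [60, 72, 64, 67, 48]
print(folded_histogram(notes))
print(freq_to_midi(261.63))
```

The folded version discards octave information and so emphasizes harmonic content, while the unfolded version keeps the pitch range of the piece.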

25 Automatic Music Transcription: Original, Transcribed; Mixture signal; Noise suppression; Predominant Pitch Estimation; Remove detected sound; Estimate # of voices

26 Rhythm: Movement in time; Origins in Poetry (iambic, trochaic); Foot tapping definition; Hierarchical semi-periodic structure at multiple levels of detail; Links to motion, dance; Running vs global

27 Self similarity: DWT, Autocorrelation, Peak Picking, Beat Histograms, Envelope Extraction

28 Beat Histograms
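
Note: a simplified sketch of the beat histogram idea from the two slides above. It keeps envelope extraction, autocorrelation and peak picking but skips the DWT band decomposition and works on a single analysis window, so it is an assumption-laden illustration and not the method as presented. The hop size, BPM range, number of peaks and the click-track input are all made up for the example.

```python
def envelope(samples, hop):
    """Full-wave rectify, average over non-overlapping hops, then remove the mean."""
    env = [sum(abs(s) for s in samples[i:i + hop]) / hop
           for i in range(0, len(samples) - hop + 1, hop)]
    mean = sum(env) / len(env)
    return [e - mean for e in env]


def beat_histogram(samples, sample_rate, hop=10, bpm_range=(40, 200), num_peaks=3):
    """Map the strongest envelope-autocorrelation peaks to BPM bins."""
    env = envelope(samples, hop)
    env_rate = sample_rate / hop
    min_lag = int(60 * env_rate / bpm_range[1])
    max_lag = int(60 * env_rate / bpm_range[0])
    # Autocorrelation of the envelope, one lag beyond each end for peak picking.
    acf = {lag: sum(env[i] * env[i + lag] for i in range(len(env) - lag))
           for lag in range(min_lag - 1, max_lag + 2)}
    # Peak picking: positive local maxima inside the allowed lag range.
    peaks = [lag for lag in range(min_lag, max_lag + 1)
             if acf[lag] > 0 and acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]]
    peaks.sort(key=lambda lag: acf[lag], reverse=True)
    histogram = {}
    for lag in peaks[:num_peaks]:
        bpm = round(60 * env_rate / lag)
        histogram[bpm] = histogram.get(bpm, 0.0) + acf[lag]
    return histogram


# Hypothetical click track: 8 s at 1 kHz, a 20-sample burst every 0.5 s (120 BPM).
sample_rate = 1000
clicks = [0.0] * 8000
for start in range(0, 8000, 500):
    for n in range(20):
        clicks[start + n] = 1.0 if n % 2 == 0 else -1.0

hist = beat_histogram(clicks, sample_rate)
print(sorted(hist.items()))
```

In a fuller version the histogram would be accumulated over many windows of a piece, so that the main tempo and its related periodicities build up as distinct peaks.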

29 Analysis: Classification, Segmentation, Similarity Retrieval, Clustering, Thumbnailing, Fingerprinting

30 Analysis Overview: Musical Piece, Trajectory, Point

31 Query-by-example Content-based Retrieval: Ranked list of k nearest neighbors
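
Note: query-by-example retrieval as a ranked list of k nearest neighbors can be sketched as below, assuming each piece is summarized by one feature vector and that similarity is plain Euclidean distance. The track names, vectors and distance choice are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_by_example(query, collection, k=3):
    """Return the k collection entries closest to the query, nearest first."""
    ranked = sorted(collection.items(), key=lambda item: euclidean(query, item[1]))
    return [(name, euclidean(query, vec)) for name, vec in ranked[:k]]


# Hypothetical collection: one 2-D feature vector per track.
collection = {
    "track_a": [0.1, 0.9],
    "track_b": [0.8, 0.2],
    "track_c": [0.2, 0.8],
    "track_d": [0.5, 0.5],
}
for name, distance in query_by_example([0.0, 1.0], collection, k=3):
    print(name, round(distance, 3))
```

Features with different ranges should be normalized before computing distances, otherwise the largest-valued feature dominates the ranking.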

32 QBE examples: Rock: Beatles; Jazz: Bobby Hutcherson; Funk: Mano Negra; World: Tibetan singer; Computer Music: Paul Lansky [columns: Query, Match]

33 Automatic Musical Genre Classification: Categorical music descriptions created by humans; Fuzzy boundaries; Statistical properties; Timbral texture, rhythmic structure, harmonic content; Automatic musical genre classification: Evaluate musical content features; Structure audio collections

34 Genregram demo: Dynamic real-time visualization for classification of radio signals

35 Audio segmentation: Detect changes of audio texture

36 Multifeature automatic segmentation methodology: Time series of feature vector v(t); Detect abrupt changes in trajectory
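
Note: detecting abrupt changes in the trajectory of v(t) can be sketched as below. This assumes Euclidean distance between successive feature vectors and a fixed threshold, which is a simplification chosen for illustration; the slides do not specify the distance measure. The function names, threshold and feature values are hypothetical.

```python
import math


def novelty(features):
    """Distance between each feature vector v(t) and its predecessor v(t-1)."""
    return [math.sqrt(sum((a - b) ** 2 for a, b in zip(features[t], features[t - 1])))
            for t in range(1, len(features))]


def segment_boundaries(features, threshold):
    """Frame indices where the trajectory jumps: thresholded local maxima of the novelty."""
    d = novelty(features)
    boundaries = []
    for i, value in enumerate(d):
        left = d[i - 1] if i > 0 else 0.0
        right = d[i + 1] if i + 1 < len(d) else 0.0
        if value > threshold and value >= left and value > right:
            boundaries.append(i + 1)  # d[i] compares frame i+1 with frame i
    return boundaries


# Hypothetical trajectory: three textures lasting 4, 4 and 3 frames.
v = [[0.1, 0.2]] * 4 + [[0.9, 0.8]] * 4 + [[0.5, 0.1]] * 3
print(segment_boundaries(v, threshold=0.5))
```

With real features the vectors would be noisy, so smoothing the novelty curve and choosing the threshold relative to its statistics would matter more than in this toy input.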

37 Context & Content Aware User Interfaces: Automatic results not perfect; Music listening is personal and subjective; Browsing vs retrieval; “Overview, zoom and filter, details on demand” (Shneiderman mantra); Adapt UI to music content and context; Computer Audition, Visualization

38 Content and Context: Content ~ file: Genre, male voice, saxophone; Content ~ file, collection: Similarity, Slow-fast; Multiple visualizations; Same content, Different context

39 Timbregrams: Content & Context; Similarity + Time Structure; Principal Component Analysis; Map feature vectors to color
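
Note: mapping feature vectors to color through Principal Component Analysis can be sketched as below. This assumes only the first principal component is used and that it is mapped to a gray level per frame; an actual Timbregram may use more components and a color map. The power-iteration PCA, the function names and the feature values are assumptions.

```python
import math


def first_principal_component(vectors, iterations=100):
    """Center the data and find the dominant covariance eigenvector by power iteration."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(dim)]
    centered = [[v[j] - mean[j] for j in range(dim)] for v in vectors]
    cov = [[sum(c[a] * c[b] for c in centered) / n for b in range(dim)]
           for a in range(dim)]
    # Assumption: the all-ones start vector is not orthogonal to the dominant axis.
    axis = [1.0] * dim
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * axis[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        axis = [x / norm for x in nxt]
    return centered, axis


def timbregram_stripes(vectors):
    """One gray level (0..255) per frame from the projection on the first component."""
    centered, axis = first_principal_component(vectors)
    proj = [sum(c[j] * axis[j] for j in range(len(axis))) for c in centered]
    lo, hi = min(proj), max(proj)
    span = (hi - lo) or 1.0
    return [round(255 * (p - lo) / span) for p in proj]


# Hypothetical feature vectors: two similar frames followed by two different ones.
frames = [[0.10, 0.20, 0.10], [0.12, 0.18, 0.10],
          [0.90, 0.80, 0.70], [0.88, 0.82, 0.70]]
print(timbregram_stripes(frames))
```

Frames with similar texture get similar shades, so changes of texture over time show up as visible stripes.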

40 Timbrespaces

41 Islands of Music

42 Auditory Scene Analysis

