Copyright Nov. 2002, George Tzanetakis. Digital Music & Music Processing. George Tzanetakis, Postdoctoral Fellow, Computer Science Department, Carnegie Mellon University.
Overview: Music Information Retrieval (MIR) and Computer Audition (motivation, techniques, applications); Computer Music and Sound Synthesis; examples, demos.
MIR: Music History. 9000 B.C.
Music: 4 million recorded CD tracks; 4000 CDs / month; Mp3 bandwidth %; global, pervasive, persistent. Why?
The future of MIR: library of all recorded music. Tasks: organize, search, retrieve, classify, recommend, browse, listen, annotate. Examples:
Audio MIR Pipeline: Signal Processing, Machine Learning, Human Computer Interaction. Hearing (representation), understanding (analysis), reacting (interaction).
Traditional Music Representations
Time domain waveform (figure: pressure vs. time). Decompose into building blocks (figure: frequency vs. time).
MIDI (Musical Instrument Digital Interface): hardware interface; file format; note events (duration, discrete pitch, “instrument”). Extensions: General MIDI. Notation, OMR, continuous pitch.
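A minimal sketch of the note-event view of MIDI, assuming the usual equal-tempered convention in which note number 69 is A4 at 440 Hz. The `NoteEvent` record and its field names are hypothetical, chosen only to mirror the duration, discrete pitch and "instrument" attributes of a note event; they are not part of any MIDI file layout.

```python
import math
from dataclasses import dataclass


@dataclass
class NoteEvent:
    """Hypothetical note-event record (illustrative field names only)."""
    onset_s: float      # start time in seconds
    duration_s: float   # note length in seconds
    pitch: int          # discrete MIDI note number, 0..127
    instrument: int     # program ("instrument") number, 0..127


def midi_to_hz(note, a4_hz=440.0):
    """Map a MIDI note number to frequency, assuming 12-tone equal temperament."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)


def hz_to_midi(freq_hz, a4_hz=440.0):
    """Map a frequency to a (possibly fractional) MIDI note number."""
    return 69.0 + 12.0 * math.log2(freq_hz / a4_hz)
```

The fractional output of `hz_to_midi` is one way to picture what "continuous pitch" adds over discrete note numbers.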
Symbolic vs Audio MIR (diagram): audio, polyphonic transcription, symbolic representation (MIDI), MIR; audio, computer audition, MIR; machine learning models.
Feature extraction
Timbral Texture. Timbre = what differentiates sounds of the same loudness and pitch. Timbral texture = what differentiates mixtures of sounds. Global, statistical and fuzzy properties.
Spectrum (figure: spectra M at times t and t+1)
Fourier Transform: period P = 1/f for frequency f
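To make the relation between period and frequency concrete, here is a small sketch of a direct discrete Fourier transform in plain Python. It is the textbook O(n^2) definition, not an FFT, and the function names are illustrative. A sinusoid that completes 8 cycles in 64 samples (period P = 64/8 = 8 samples) shows up as a single peak at bin 8.

```python
import cmath
import math


def dft(x):
    """Direct DFT of a real or complex sequence (O(n^2), for illustration)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def magnitude_spectrum(x):
    """Magnitudes of the non-negative frequency bins 0..n/2."""
    n = len(x)
    return [abs(c) for c in dft(x)[: n // 2 + 1]]


def sinusoid(cycles, n):
    """n samples of a sine wave that completes `cycles` periods."""
    return [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]
```

For example, `magnitude_spectrum(sinusoid(8, 64))` has its largest value at index 8.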
Short Time Fourier Transform (STFT): filterbank interpretation. Filters, oscillators; amplitude and frequency output.
Short Time Fourier Transform II (figure: spectra M at times t and t+1)
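The short-time analysis can be sketched as: cut the signal into overlapping frames, apply a window, and take the magnitude spectrum of each frame. The frame size, hop size and Hann window below are assumptions made for illustration; the slides do not specify them.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitudes(signal, frame_size=64, hop=32):
    """Return one magnitude spectrum (bins 0..frame_size/2) per frame."""
    window = hann(frame_size)
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_size)]
        mags = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                frame[t] * cmath.exp(-2j * cmath.pi * k * t / frame_size)
                for t in range(frame_size)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames
```

Each row of the result plays the role of the spectrum M at one time step, so consecutive rows correspond to times t and t+1.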
Formants (figure from “Real Sound Synthesis for Interactive Applications”, P. Cook, A K Peters, used by permission)
Linear Prediction Coefficients: source (f0, white noise), filter, speech. Lossless tubes.
MPEG Audio Coding (mp3): perceptual audio coding. Analysis filterbank (32 linearly spaced bands), psychoacoustic model, available bits. Encoder: slower, complicated. Decoder: faster, simpler.
Spectral Shape (of spectrum M at time t): centroid, rolloff, flux, RMS, moments, …
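A hedged sketch of four of the spectral shape descriptors, using common textbook definitions that may differ in detail from the ones used in the talk. Centroid and rolloff are expressed in bin indices, the 0.85 rolloff fraction is an assumed default, and flux is computed between sum-normalized spectra of consecutive frames.

```python
import math


def centroid(mags):
    """Magnitude-weighted mean bin index ("center of gravity" of the spectrum)."""
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * m for k, m in enumerate(mags)) / total


def rolloff(mags, fraction=0.85):
    """Smallest bin below which `fraction` of the total magnitude lies."""
    threshold = fraction * sum(mags)
    running = 0.0
    for k, m in enumerate(mags):
        running += m
        if running >= threshold:
            return k
    return len(mags) - 1


def flux(mags, prev_mags):
    """Squared difference between normalized spectra of successive frames."""
    def normalize(v):
        s = sum(v)
        return [x / s for x in v] if s > 0 else list(v)

    a, b = normalize(mags), normalize(prev_mags)
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rms(frame):
    """Root-mean-square level of a time-domain frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))
```

Collecting these numbers per frame gives one feature vector per time step.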
Summary of Timbral Texture Features: time-frequency analysis; signal processing (STFT, DWT); source-filter (LPC); perceptual (MP3); spectral shape to feature vector.
Pitch Content: harmony and melody = pitch concepts; music theory: score = music; bridge to symbolic MIR; automatic music transcription; non-transcriptive arguments.
Automatic Pitch Detection: P = 1/f. Time-domain, frequency-domain, perceptual. Zero crossings. Autocorrelation analysis: peaks of the function correspond to dominant pitches.
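Two time-domain pitch estimators can be sketched as follows. Both assume a monophonic, roughly periodic input; the search range `fmin`/`fmax` is an assumed parameter, and the autocorrelation is the plain unnormalized sum, which is only one of several variants in use.

```python
import math


def zero_crossing_pitch(signal, sample_rate):
    """Estimate pitch from the sign-change count (two crossings per period)."""
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    duration_s = len(signal) / sample_rate
    return crossings / (2.0 * duration_s)


def autocorrelation_pitch(signal, sample_rate, fmin=50.0, fmax=500.0):
    """Pick the lag with the largest autocorrelation; pitch = rate / lag."""
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(len(signal) - 1, int(sample_rate / fmin))
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag
```

The lag of the strongest peak is the period P in samples, so the estimate is the sample rate divided by P. Zero crossings are cheap but break down quickly on noisy or harmonically rich signals.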
Pitch Histograms (examples: Jazz, Irish). Chroma = folded; height = unfolded.
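A minimal sketch of the two histogram types, assuming detected pitches have already been converted to MIDI note numbers. The unfolded version keeps pitch height (one bin per note number); the folded version collapses octaves into 12 pitch classes with a modulo. Function names are illustrative.

```python
def unfolded_histogram(midi_notes):
    """Pitch height: one bin per MIDI note number (0..127)."""
    hist = [0] * 128
    for note in midi_notes:
        hist[note] += 1
    return hist


def folded_histogram(midi_notes):
    """Chroma: octaves folded together, one bin per pitch class (C = 0)."""
    hist = [0] * 12
    for note in midi_notes:
        hist[note % 12] += 1
    return hist
```

For the notes 60, 72, 64 and 67 (two Cs, an E and a G), the folded histogram has counts 2, 1 and 1 in bins 0, 4 and 7.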
Automatic Music Transcription (original vs. transcribed): mixture signal, noise suppression, predominant pitch estimation, remove detected sound, estimate # of voices.
Rhythm: movement in time; origins in poetry (iambic, trochaic); foot-tapping definition; hierarchical semi-periodic structure at multiple levels of detail; links to motion, dance; running vs global.
Self similarity: DWT, autocorrelation, peak picking, beat histograms, envelope extraction.
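A simplified sketch of the beat histogram idea. It omits the DWT stage and works on a single full-band envelope, so it is a stand-in for the pipeline named above rather than a reproduction of it. The smoothing constant, decimation factor and BPM range are assumed values.

```python
def envelope(signal, alpha=0.99, decimate=16):
    """Rectify, smooth with a one-pole low-pass, downsample, remove the mean."""
    env, state = [], 0.0
    for i, x in enumerate(signal):
        state = (1.0 - alpha) * abs(x) + alpha * state
        if i % decimate == 0:
            env.append(state)
    mean = sum(env) / len(env)
    return [e - mean for e in env]


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation for lags 0..max_lag."""
    return [
        sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        for lag in range(max_lag + 1)
    ]


def beat_histogram(signal, sample_rate, min_bpm=40, max_bpm=200,
                   alpha=0.99, decimate=16):
    """Map positive autocorrelation peaks of the envelope to BPM bins."""
    env = envelope(signal, alpha, decimate)
    env_rate = sample_rate / decimate
    min_lag = max(1, int(env_rate * 60 / max_bpm))
    max_lag = min(len(env) - 1, int(env_rate * 60 / min_bpm))
    ac = autocorrelation(env, max_lag)
    hist = {}
    for lag in range(min_lag, max_lag):
        is_peak = ac[lag] > ac[lag - 1] and ac[lag] >= ac[lag + 1]
        if is_peak and ac[lag] > 0:
            bpm = round(60 * env_rate / lag)
            hist[bpm] = hist.get(bpm, 0.0) + ac[lag]
    return hist
```

On a regular click track the strongest bin sits at the click rate, with weaker bins at integer fractions of it, which reflects the hierarchical periodicity of rhythm.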
Beat Histograms
Analysis: classification, segmentation, similarity retrieval, clustering, thumbnailing, fingerprinting.
Analysis Overview: musical piece, trajectory, point.
Query-by-example content-based retrieval: ranked list of k nearest neighbors.
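Query-by-example retrieval can be sketched as a nearest-neighbor search over per-track feature vectors. The Euclidean distance and the dictionary layout (track name to feature vector) are assumptions for illustration; any distance over the chosen features would fit the same outline.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_by_example(query, collection, k=3):
    """Return the names of the k tracks closest to the query, nearest first."""
    ranked = sorted(
        collection.items(), key=lambda item: euclidean(query, item[1])
    )
    return [name for name, _ in ranked[:k]]
```

With a toy collection `{"a": [0, 0], "b": [1, 1], "c": [5, 5], "d": [0.1, 0]}` and query `[0, 0]`, asking for `k=2` returns `["a", "d"]`.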
QBE examples (query, match). Rock: Beatles. Jazz: Bobby Hutcherson. Funk: Mano Negra. World: Tibetan singer. Computer Music: Paul Lansky.
Automatic Musical Genre Classification. Genres: categorical music descriptions created by humans; fuzzy boundaries; statistical properties (timbral texture, rhythmic structure, harmonic content). Automatic musical genre classification: evaluate musical content features; structure audio collections.
Genregram demo: dynamic real-time visualization for classification of radio signals.
Audio segmentation: detect changes of audio texture.
Multifeature automatic segmentation methodology: time series of feature vectors v(t); detect abrupt changes in trajectory.
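One way to sketch "detect abrupt changes in trajectory" is to measure the distance between consecutive feature vectors v(t-1) and v(t) and report local peaks that stand well above the average step. The Euclidean distance and the `threshold_factor` value are assumptions, not the distance or threshold used in the talk.

```python
import math


def segment_boundaries(vectors, threshold_factor=3.0):
    """Return frame indices t where v(t) jumps away from v(t-1)."""
    if len(vectors) < 2:
        return []
    dists = [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(vectors[t], vectors[t - 1])))
        for t in range(1, len(vectors))
    ]
    mean = sum(dists) / len(dists)
    boundaries = []
    for i, d in enumerate(dists):
        left = dists[i - 1] if i > 0 else float("-inf")
        right = dists[i + 1] if i + 1 < len(dists) else float("-inf")
        # Keep only local peaks that clearly exceed the average step size.
        if d > threshold_factor * mean and d >= left and d > right:
            boundaries.append(i + 1)
    return boundaries
```

Because the threshold scales with the mean step, a steady texture with small frame-to-frame variation yields no boundaries, while a sudden change of texture yields one.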
Context & Content Aware User Interfaces: automatic results are not perfect; music listening is personal and subjective; browsing vs retrieval; “Overview, zoom and filter, details on demand” (Shneiderman mantra); adapt UI to music content and context; computer audition, visualization.
Content and Context. Content ~ file: genre, male voice, saxophone. Context ~ file, collection: similarity, slow-fast. Multiple visualizations: same content, different context.
Timbregrams: content & context; similarity + time structure; Principal Component Analysis; map feature vectors to color.
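A rough sketch of the mapping from feature vectors to color, under strong simplifying assumptions: only the first principal component is used, it is found by power iteration rather than a full eigendecomposition, and the output is a gray level per frame instead of a full color. The actual color mapping used for Timbregrams is not given in the slides.

```python
import math


def first_principal_axis(vectors, iterations=200):
    """Center the data and estimate its dominant covariance eigenvector."""
    n, d = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(d)]
    centered = [[v[j] - mean[j] for j in range(d)] for v in vectors]
    cov = [
        [sum(c[i] * c[j] for c in centered) / n for j in range(d)]
        for i in range(d)
    ]
    axis = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-uniform start
    for _ in range(iterations):
        product = [sum(cov[i][j] * axis[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(p * p for p in product))
        if norm == 0.0:
            break
        axis = [p / norm for p in product]
    return centered, axis


def timbregram_grays(vectors):
    """Project each frame on the first axis and scale to gray levels 0..255."""
    centered, axis = first_principal_axis(vectors)
    scores = [sum(c[j] * axis[j] for j in range(len(axis))) for c in centered]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [128] * len(scores)
    return [round(255 * (s - lo) / (hi - lo)) for s in scores]
```

Frames with similar feature vectors receive similar gray levels, so drawing the levels as a strip in time order shows both similarity and time structure. The sign of a principal axis is arbitrary, so which texture comes out light and which dark is not meaningful on its own.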
Timbrespaces
Islands of Music
Auditory Scene Analysis