LAM: Musical Audio Similarity Michael Casey Centre for Cognition, Computation and Culture Department of Computing Goldsmiths College, University of London
Overview Machine Music Understanding Features / Classes / Clusters Real-Time Audio Matching Feature Extraction Feature Similarity (Indexing / Retrieval) PD/MSP Tools Music Similarity Applications Sound object matching Texture matching
Sound Understanding Signal Processing Sound Understanding
Feature Extraction
Statistical Learning for Decision Making P(class | features) = p(features | class) * P(class) / p(features) Decision boundary Partitioning of feature space Classes: Music, Speech
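The decision rule on this slide can be sketched as a small Bayes classifier. This is only an illustration: the one-dimensional Gaussian class models, the priors, and the names `gaussian_pdf`, `posterior`, `classify` and `CLASSES` are assumptions, not taken from the talk.

```python
import math

def gaussian_pdf(x, mean, std):
    """Class-conditional likelihood p(x | class) for a 1-D Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def posterior(x, classes):
    """Return P(class | x) for every class using Bayes' rule.

    `classes` maps a label to (prior, mean, std).
    """
    joint = {label: gaussian_pdf(x, mean, std) * prior
             for label, (prior, mean, std) in classes.items()}
    evidence = sum(joint.values())  # p(x), the normalizing term
    return {label: value / evidence for label, value in joint.items()}

def classify(x, classes):
    """Pick the class with the largest posterior probability."""
    post = posterior(x, classes)
    return max(post, key=post.get)

# Hypothetical class models for a single scalar feature (illustrative values).
CLASSES = {"music": (0.5, 2.0, 1.0), "speech": (0.5, -2.0, 1.0)}
```

With equal priors and equal variances the decision boundary falls midway between the two means, so `classify(1.5, CLASSES)` picks music and `classify(-1.5, CLASSES)` picks speech.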
MPEG-7 Audio Tools Audio
MPEG-7 Audio Tools Log Frequency Spectrogram Audio AudioSpectrumEnvelopeD
MPEG-7 Audio Tools Log Frequency Spectrogram Audio Log Amplitude Decorrelating Transform / Dimension Reduction AudioSpectrumEnvelopeD AudioSpectrumProjectionD
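A rough sketch of the chain on this slide (log-frequency spectrogram, log amplitude, decorrelating transform with dimension reduction), assuming NumPy. The octave-wide bands starting at 62.5 Hz, the Hann window, the frame sizes, and the use of PCA via SVD are illustrative assumptions; the function names are hypothetical and this is not the MPEG-7 reference implementation.

```python
import numpy as np

def log_frequency_spectrogram(signal, sr, n_fft=1024, hop=512,
                              n_bands=7, f_lo=62.5):
    """Short-time power spectrum summed into octave bands, in dB."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    edges = f_lo * 2.0 ** np.arange(n_bands + 1)  # assumed octave band edges
    bands = np.stack([power[:, (freqs >= lo) & (freqs < hi)].sum(axis=1)
                      for lo, hi in zip(edges[:-1], edges[1:])], axis=1)
    return 10.0 * np.log10(bands + 1e-10)  # log amplitude; offset avoids log(0)

def project(features, n_components):
    """Decorrelate and reduce dimension with PCA computed from an SVD."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components].T          # columns are principal directions
    return centered @ basis, basis
```

The rows of the first result play the role of the spectrum envelope frames, and the projected rows play the role of the reduced features that a sound model would be trained on.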
SoundModelStatePathD State Path Use estimated state sequence as a feature
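One common way to estimate a state sequence from a hidden Markov model is Viterbi decoding. The talk does not say which decoder was used, so the following is a minimal sketch under that assumption. It takes log initial probabilities, log transition probabilities, and a per-frame table of log observation likelihoods; the name `viterbi` and the argument layout are hypothetical.

```python
def viterbi(log_pi, log_a, log_b):
    """Most likely state path.

    log_pi[s]   : log initial probability of state s
    log_a[r][s] : log transition probability from state r to state s
    log_b[t][s] : log likelihood of frame t under state s
    """
    n_states = len(log_pi)
    delta = [log_pi[s] + log_b[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_b)):
        prev = delta
        pointers, delta = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_a[r][s])
            pointers.append(best)
            delta.append(prev[best] + log_a[best][s] + log_b[t][s])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

The returned list of state indices, one per frame, is the kind of sequence that can then be stored and compared as a feature.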
MPEG-7 Audio Tools Log Frequency Spectrogram Audio Log Amplitude Decorrelating Transform / Dimension Reduction AudioSpectrumEnvelopeD AudioSpectrumProjectionD Hidden Markov Model SoundModelDS
MPEG-7 Audio Strings Acoustic Lexicons Log Frequency Spectrogram Audio Log Amplitude Decorrelating Transform / Dimension Reduction AudioSpectrumEnvelopeD AudioSpectrumProjectionD Hidden Markov Model SoundModelDS State Path ? 7 1 V SoundModelStatePathD SYMBOL STRING
State Symbol Sequence (40 State Model) ?71V
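A state path becomes a symbol string once each state index is mapped to a printable character. The mapping below (state index added to the code of the character `0`) is an assumption: it happens to be consistent with the symbols `?71V` shown for a 40-state model, but the slides do not state the actual encoding. The function names are hypothetical.

```python
def states_to_string(path, offset=ord("0")):
    """Encode a list of state indices as one character per frame."""
    return "".join(chr(offset + state) for state in path)

def string_to_states(symbols, offset=ord("0")):
    """Inverse of states_to_string."""
    return [ord(char) - offset for char in symbols]
```

Under this assumed mapping, 40 states cover the characters from `0` to `W`, all of which are printable ASCII and can be stored in an ordinary text column.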
SoundModelStateHistogramD (figure axes: time in seconds, state index; 0.01 s frames)
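A state histogram can be sketched as the relative frequency of each state over a stretch of the state path. With 0.01 s frames, a window of 100 frames covers one second; that window length, and the function names, are assumptions for illustration.

```python
from collections import Counter

def state_histogram(path, n_states):
    """Fraction of frames spent in each state (sums to 1 for a non-empty path)."""
    total = len(path)
    if total == 0:
        return [0.0] * n_states
    counts = Counter(path)
    return [counts.get(state, 0) / total for state in range(n_states)]

def windowed_histograms(path, n_states, frames_per_window=100):
    """One histogram per consecutive, non-overlapping window of frames."""
    return [state_histogram(path[i:i + frames_per_window], n_states)
            for i in range(0, len(path), frames_per_window)]
```

The histogram discards frame order inside a window, which makes it a fixed-length vector that can be compared with ordinary vector distances.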
Self-Similarity Matrix
(Figure: self-similarity matrix with segments a and b marked)
S-Matrix
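A self-similarity matrix compares every frame of a recording with every other frame of the same recording. The sketch below uses cosine similarity between feature vectors; the slides do not name the measure used, so that choice and the function names are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm > 0.0 else 0.0

def self_similarity(frames):
    """Symmetric matrix S with S[i][j] = similarity of frame i and frame j."""
    n = len(frames)
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            s[i][j] = s[j][i] = cosine(frames[i], frames[j])
    return s
```

Repeated material shows up as bright off-diagonal blocks or stripes: two segments with similar features produce high values wherever their frames meet in the matrix.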
Efficient Storage / Retrieval Real-Time Access Large Databases Distributed Databases
PostgreSQL Database Representation of State Path “Strings” and Histograms
Similarity Compute distance between feature pairs Features == SoundModelStateHistogramD Similarity Metric dist(a,b) >= 0 dist(a,b) == 0 iff a == b dist(a,b) + dist(b,c) >= dist(a,c) Vector Dot Product
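The three conditions listed above can be checked numerically for a candidate distance. The sketch below tests Euclidean distance on a few histograms and, for contrast, a naive distance built from the raw dot product, which fails the identity condition on unnormalized vectors. For unit-length vectors the two are tied together by the identity ||a - b||^2 = 2 - 2 (a . b). All function names here are hypothetical.

```python
import math
from itertools import product

def dot(a, b):
    """Vector dot product."""
    return sum(x * y for x, y in zip(a, b))

def euclidean(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def unit(a):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot(a, a))
    return [x / norm for x in a]

def satisfies_metric_axioms(points, dist, tol=1e-12):
    """Check the three conditions from the slide on a finite set of points."""
    for a, b, c in product(points, repeat=3):
        if dist(a, b) < 0.0:
            return False                      # non-negativity
        if (dist(a, b) <= tol) != (a == b):
            return False                      # zero exactly when a == b
        if dist(a, b) + dist(b, c) < dist(a, c) - tol:
            return False                      # triangle inequality
    return True
```

A check over a finite sample can only refute a candidate, not prove it is a metric in general; it is useful mainly as a sanity test when trying alternative measures.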
Similarity of Feature Trajectories
Dynamic Time Warping
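Dynamic time warping aligns two feature trajectories that may differ in length or tempo and returns the cost of the best alignment. This is the standard textbook recurrence, shown as a sketch; the slides do not specify step constraints or a local distance, so both are assumptions, and `dtw` is a hypothetical name.

```python
def dtw(seq_a, seq_b, dist):
    """Minimum cumulative cost of aligning seq_a with seq_b.

    `dist` is the local distance between one element of each sequence.
    Allowed steps: advance in seq_a, advance in seq_b, or advance in both.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch seq_b
                                     cost[i][j - 1],      # stretch seq_a
                                     cost[i - 1][j - 1])  # step both
    return cost[n][m]
```

For scalar sequences `dist` can be `lambda x, y: abs(x - y)`; for trajectories of feature vectors any vector distance can be passed in instead.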
Acousticon Strings Distance Metric –String Edit Distance (Levenshtein) Scalable to Large Databases –PostgreSQL Implementation –Can use built-in Index Structures Scalable to Real-Time Implementation –Matching and audio streaming (< 20 ms)
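String edit distance counts the fewest single-character insertions, deletions and substitutions needed to turn one string into another. A minimal two-row dynamic-programming sketch follows; it illustrates the measure only and is not the PostgreSQL implementation mentioned on the slide.

```python
def levenshtein(s, t):
    """Levenshtein edit distance between strings s and t."""
    if len(s) < len(t):
        s, t = t, s                      # keep the shorter string in the inner loop
    prev = list(range(len(t) + 1))       # distances from the empty prefix of s
    for i, char_s in enumerate(s, 1):
        cur = [i]
        for j, char_t in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                       # deletion
                           cur[j - 1] + 1,                    # insertion
                           prev[j - 1] + (char_s != char_t))) # substitution
        prev = cur
    return prev[-1]
```

Applied to state-path strings, a small distance means two sounds pass through nearly the same sequence of model states. Memory use is proportional to the shorter string, which helps when many comparisons run per query.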
Information Retrieval for Creativity Utilize an extant sound database for new material. Take the structure of a music clip but replace the content. New interfaces for music creativity.
Audio Information Retrieval MPEG-7 Database A pre-indexed Collection of Sounds
Audio Query Extract MPEG-7 Database Segment Match Result List A Sound or Scene or List of Sounds Audio Information Retrieval
Audio Query Extract MPEG-7 Database Segment Match Result List Feature extraction from audio. Audio Information Retrieval
Audio Query Extract MPEG-7 Database Segment Match Result List Partitioning of audio into chunks. Audio Information Retrieval
Audio Query Extract MPEG-7 Database Segment Match Result List Find similar chunks of Audio Audio Information Retrieval
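The query pipeline in these slides (extract, segment, match, result list) can be sketched end to end on state paths. Everything below is illustrative: fixed-size chunks, histogram features, L1 distance, the in-memory dictionary standing in for the database, and all names are assumptions rather than the system described in the talk.

```python
import heapq

def chunks(path, size):
    """Segment: split a state path into consecutive full-size chunks."""
    return [path[i:i + size] for i in range(0, len(path) - size + 1, size)]

def histogram(chunk, n_states):
    """Extract: relative frequency of each state within a chunk."""
    return [chunk.count(state) / len(chunk) for state in range(n_states)]

def l1(a, b):
    """Sum of absolute differences between two histograms."""
    return sum(abs(x - y) for x, y in zip(a, b))

def query(path, database, n_states, size, k=1):
    """Match: for each query chunk, list the k closest database entries."""
    results = []
    for chunk in chunks(path, size):
        feature = histogram(chunk, n_states)
        ranked = heapq.nsmallest(k, database.items(),
                                 key=lambda item: l1(feature, item[1]))
        results.append([name for name, _ in ranked])
    return results
```

Replacing each query chunk by its best match keeps the temporal structure of the query while drawing the sound material from the database.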
Real-Time Matching
Musaics Real-Time Matching