Audio Segmentation, Classification, and Retrieval
  Princeton Sound Lab
  Prof. Perry Cook
  George Tzanetakis, PhD ’02 (CMU)
  Ari Lazier, ’03
  Ge Wang, G3
  Tom Briggs, G2
Roadmap
  Framework
    MARSYAS
  Demos:
    Smart Sound Editor
    Musical Genre Classification
    Content-based Query
Audio Framework
  MARSYAS (Tzanetakis, Cook, Lazier)
  Feature Extraction
  Source
  Segmentation
  Content-based Retrieval
  Classification
  General Approach / Not Domain-specific
  Highly Extensible
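The slide names feature extraction as a building block but does not list the features. As a minimal sketch, assume two common low-level descriptors, zero-crossing rate and RMS energy, computed per frame. The function names, the frame size, and the test tone are illustrative assumptions and not part of MARSYAS.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

def rms_energy(frame):
    """Root-mean-square amplitude of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def extract_features(signal, frame_size=256):
    """Cut the signal into non-overlapping frames; one feature vector per frame."""
    return [
        (zero_crossing_rate(signal[i:i + frame_size]),
         rms_energy(signal[i:i + frame_size]))
        for i in range(0, len(signal) - frame_size + 1, frame_size)
    ]

# Hypothetical input: a 440 Hz sine sampled at 8 kHz, 1024 samples long.
signal = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(1024)]
features = extract_features(signal)  # four (zcr, rms) pairs
```

Each frame becomes one point in feature space, which is the representation that segmentation, classification, and retrieval can all share.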
Smart Sound Editor
  Automatic Segmentation
  Music Speech Male Female …
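The slide does not describe how segment boundaries are found. One simple possibility, shown here only as a sketch, is to mark a boundary wherever the feature vector changes sharply from one frame to the next. The threshold and the feature values below are made-up assumptions.

```python
import math

def segment_boundaries(features, threshold):
    """Return frame indices where the feature vector moves by more than
    `threshold` (Euclidean distance) relative to the previous frame."""
    return [
        i for i in range(1, len(features))
        if math.dist(features[i - 1], features[i]) > threshold
    ]

# Hypothetical per-frame features, e.g. (zero-crossing rate, energy).
frames = [(0.10, 0.70), (0.11, 0.69), (0.10, 0.71),   # first texture
          (0.45, 0.20), (0.44, 0.22), (0.46, 0.21)]   # second texture
boundaries = segment_boundaries(frames, threshold=0.3)
```

With these values the only large jump is between the third and fourth frames, so a single boundary is reported at index 3. Labelling each resulting segment (music, speech, and so on) would be a separate classification step.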
Musical Genre Classification
  Training set: Large corpus of music and speech
  How good?
    90% correct on speech vs. music
    67% correct on forced genre decision (same agreement as humans)
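The slide reports accuracy but not the classifier. To illustrate what a forced decision means, here is a sketch using a nearest-centroid classifier, which is an assumption chosen for brevity and not necessarily the method used in the demo. The labels and feature values are invented.

```python
import math
from collections import defaultdict

def train_centroids(examples):
    """examples: list of (label, feature_vector). Returns label -> mean vector."""
    grouped = defaultdict(list)
    for label, vec in examples:
        grouped[label].append(vec)
    return {
        label: tuple(sum(col) / len(vecs) for col in zip(*vecs))
        for label, vecs in grouped.items()
    }

def classify(centroids, vec):
    """Forced decision: always return the label of the closest centroid."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))

# Hypothetical training set of labelled feature vectors.
training = [("speech", (0.40, 0.20)), ("speech", (0.44, 0.24)),
            ("music", (0.10, 0.70)), ("music", (0.12, 0.66))]
centroids = train_centroids(training)
label = classify(centroids, (0.15, 0.60))  # closest to the music centroid
```

Because `classify` never abstains, every input receives exactly one label, which is the setting the 67% genre figure refers to.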
Content-based Query
  Distance in Multi-Dimensional Feature-space
  Navigate Feature-space
  Nearest Neighbor / Similarity Retrieval
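Nearest-neighbor retrieval can be sketched as ranking every clip in a collection by its distance to the query's feature vector. Euclidean distance, the clip names, and the three-dimensional vectors below are assumptions for illustration; the slide does not fix a metric.

```python
import math

def most_similar(collection, query_vec, k=2):
    """Rank clips by Euclidean distance to the query; return the k nearest names."""
    ranked = sorted(
        collection, key=lambda name: math.dist(collection[name], query_vec)
    )
    return ranked[:k]

# Hypothetical collection: clip name -> feature vector.
collection = {
    "clip_a": (0.10, 0.70, 0.30),
    "clip_b": (0.12, 0.68, 0.28),
    "clip_c": (0.45, 0.20, 0.80),
    "clip_d": (0.20, 0.60, 0.35),
}
results = most_similar(collection, collection["clip_a"], k=3)
```

Querying by example with `clip_a` returns `clip_a` itself first (distance zero), then `clip_b` and `clip_d`. A real system would typically drop the query from its own results and normalize each feature dimension before measuring distance.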