
1 Matthias Gruhne, ghe@emt.iis.fhg.de Page 1 Fraunhofer Institut Integrierte Schaltungen Robust Audio Identification for Commercial Applications Matthias Gruhne ghe@emt.iis.fhg.de Fraunhofer IIS, AEMT, D-98693 Ilmenau, Germany

2 Overview
What is AudioID?
Requirements
System Architecture
MPEG-7
Recognition Performance
Applications
Conclusions
Demonstration

3 What is AudioID?

4 What is AudioID?
Purpose: Identify audio material (artist, song, etc.) by analysis of the signal itself (content-based identification)
Conditions:
– No associated information required (headers, ID3 tags)
– No embedded signals (e.g. watermarks) required
– Some knowledge about the music to be identified is available (reference database)

5 Requirements
Recognition rate: High recognition rates (> 95%), even with distorted signals
Robustness: Robust against various distortions:
– volume change, equalization, noise addition, audio coding (e.g. MP3), ...
– analog artifacts (e.g. D/A, A/D conversion)
Compactness: Small signature size
Scalability: Extensibility of database (> 10^6 items) while keeping processing time low (a few ms per item)

6 System Architecture - Overview

7 System Architecture
Feature Extractor:
– Signal preprocessing
– Extract the essence of the audio signal
Feature Processor:
– Increase discriminance and efficiency
– Temporal grouping of features (super vector)
– Statistics calculation (mean, variance, etc.)
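The feature processor steps can be sketched roughly as follows. This is only one possible reading of the slide: the function names `group_frames` and `summarize`, the group size, and the choice to concatenate means and variances into one vector are assumptions for illustration, not the AudioID implementation.

```python
from statistics import mean, pvariance

def group_frames(frames, group_size):
    """Split per-frame feature vectors into consecutive groups.

    An incomplete group at the end is dropped (an assumption of this sketch).
    """
    n_groups = len(frames) // group_size
    return [frames[i * group_size:(i + 1) * group_size] for i in range(n_groups)]

def summarize(group):
    """Per-dimension mean and population variance over one group, concatenated."""
    dims = list(zip(*group))  # transpose: one tuple per feature dimension
    return [mean(d) for d in dims] + [pvariance(d) for d in dims]

# Hypothetical 2-dimensional features for five frames.
frames = [[0.0, 1.0], [2.0, 1.0], [4.0, 3.0], [6.0, 3.0], [9.0, 9.0]]
processed = [summarize(g) for g in group_frames(frames, 2)]
print(processed)  # [[1.0, 1.0, 1.0, 0.0], [5.0, 3.0, 1.0, 0.0]]
```

Five frames grouped in pairs yield two summary vectors, so the amount of data handed to the later stages shrinks while each vector still describes a short stretch of time.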

8 System Architecture
Class generator:
– Clustering of processed feature vectors, to further reduce the amount of data and to enhance robustness (against overfitting)
– Add class with associated metadata to database
Classification:
– Compare feature vectors against classes in database by means of some metric
– Find the class yielding the best approximation
– Retrieve associated metadata
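A minimal sketch of class generation and classification, under simplifying assumptions: the slide does not say which clustering method or metric is used, so here each item is reduced to a single centroid (a stand-in for real clustering) and compared by squared Euclidean distance. All names (`add_class`, `classify`, the metadata fields) are hypothetical.

```python
def centroid(vectors):
    """Per-dimension average of a list of equal-length vectors."""
    dims = list(zip(*vectors))
    return [sum(d) / len(d) for d in dims]

def sq_dist(a, b):
    """Squared Euclidean distance (assumed metric; the slide leaves it open)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def add_class(database, metadata, vectors):
    """Store one centroid per item together with its metadata."""
    database.append({"metadata": metadata, "centroid": centroid(vectors)})

def classify(database, query_vectors):
    """Return the metadata of the class closest to the query vectors on average."""
    def score(entry):
        total = sum(sq_dist(q, entry["centroid"]) for q in query_vectors)
        return total / len(query_vectors)
    return min(database, key=score)["metadata"]

db = []
add_class(db, {"artist": "Artist A", "title": "Song 1"}, [[0.9, 0.1], [1.1, -0.1]])
add_class(db, {"artist": "Artist B", "title": "Song 2"}, [[-1.0, 2.0], [-1.2, 2.2]])

# A slightly distorted query still lands nearest to the first class.
result = classify(db, [[0.8, 0.2], [1.0, 0.1]])
print(result["title"])  # Song 1
```

This linear scan over all classes is the simplest possible lookup; a database with very many items would presumably need a faster search structure.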

9 MPEG-7 - Elements for Robust Audio Matching
Low level data: AudioSpectrumFlatness LLD
– Derived from: Spectral Flatness Measure (SFM)
– Describes the flatness or unflatness of the spectrum in frequency bands (tonal vs. noise-like)
Fingerprint: AudioSignature Description Scheme
– Statistical data summarization of AudioSpectrumFlatness LLD
– Textual description in XML syntax
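To illustrate the idea behind the Spectral Flatness Measure, the sketch below computes the ratio of the geometric mean to the arithmetic mean of power-spectrum values, per band. It is a minimal sketch only: the function names, the band edges, and the small floor value are assumptions, and the exact band layout and normalization of the MPEG-7 descriptor are not reproduced here.

```python
import math

def spectral_flatness(power):
    """Geometric mean divided by arithmetic mean of power values.

    Values near 1 indicate a flat (noise-like) spectrum,
    values near 0 indicate a peaky (tonal) spectrum.
    """
    eps = 1e-12  # assumed floor to avoid log(0); not taken from the standard
    vals = [max(p, eps) for p in power]
    geometric = math.exp(sum(math.log(v) for v in vals) / len(vals))
    arithmetic = sum(vals) / len(vals)
    return geometric / arithmetic

def band_flatness(power, band_edges):
    """Flatness per band; band_edges are hypothetical bin indices [lo, hi)."""
    return [spectral_flatness(power[lo:hi])
            for lo, hi in zip(band_edges, band_edges[1:])]

flat = [1.0] * 8               # equal power in every bin: noise-like
peaky = [0.001] * 7 + [100.0]  # one dominant bin: tonal

print(spectral_flatness(flat))   # close to 1
print(spectral_flatness(peaky))  # close to 0
print(band_flatness(flat + peaky, [0, 8, 16]))
```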

10 MPEG-7 - Benefits
– Standardized feature format guarantees worldwide interoperability
– Published, open format: descriptive data can be produced easily
– Large MPEG-7 compliant databases expected to be available in the near future (incl. fingerprints)
– Long-term format stability / lifetime

11 Recognition Performance - Conditions
Conditions: Training and test sets (mostly rock / pop):
– 15,000 items
– 90,000 items
Considered feature: Spectral Flatness Measure (SFM)
Classification performance: Number of correctly identified items (both single best and within top 10)
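The "single best" and "within top 10" figures correspond to what is commonly called a top-k rate. A small sketch of how such a rate can be computed, using made-up ranked result lists (the identifiers and data are hypothetical and unrelated to the actual test sets):

```python
def top_k_rate(ranked_results, truths, k):
    """Fraction of queries whose true item is among the first k ranked candidates."""
    hits = sum(1 for ranked, truth in zip(ranked_results, truths)
               if truth in ranked[:k])
    return hits / len(truths)

# Each inner list is the classifier's ranking for one query, best match first.
ranked = [["a", "b", "c"], ["b", "a", "c"], ["c", "a", "b"], ["a", "c", "d"]]
truths = ["a", "a", "b", "d"]

print(top_k_rate(ranked, truths, 1))  # 0.25
print(top_k_rate(ranked, truths, 3))  # 1.0
```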

12 Recognition Performance - 15k items
Feature: SFM, 16 bands; advanced matching with temporal tracking

Distortion        Top 1     Top 10
Cropping          100.0%    100.0%
MP3 @ 96 kbps     99.6%     99.8%
Loudsp./Mic.      98.0%     99.0%

13 Recognition Performance - 90k items
16 bands; advanced matching with temporal tracking

14 Applications
– Retrieve associated metadata by identifying audio content
– Automated search for audio content on the Internet
– Broadcast monitoring by logging the transmission of audio material
– Feature-based indexing of audio databases (similarity search)
– ...

15 Conclusions
– High recognition rates (> 99%, tested with 90,000 items)
– Robust to real-world signal distortions
– Fast and reliable extraction and classification
– Underlying feature is specified in the MPEG-7 standard, which ensures worldwide interoperability; licensing is available to everyone

16 Real Time Demonstration
– Demo running on a laptop (Pentium III @ 500 MHz)
– Local database with 15,000 items (rock / pop genre)
– Acoustic transmission: mp3 -> D/A -> Speakers -> Noisy Environment -> Microphone -> A/D -> AudioID

17 Thanks for your attention!

