Matthias Gruhne, Fraunhofer Institut Integrierte Schaltungen: Robust Audio Identification for Commercial Applications

Presentation transcript:

Page 1: Robust Audio Identification for Commercial Applications
Matthias Gruhne, Fraunhofer Institut Integrierte Schaltungen (Fraunhofer IIS), AEMT, D Ilmenau, Germany

Page 2: Overview
– What is AudioID?
– Requirements
– System Architecture
– MPEG-7
– Recognition Performance
– Applications
– Conclusions
– Demonstration

Page 3: What is AudioID?

Page 4: What is AudioID?
Purpose:
– Identify audio material (artist, song, etc.) by analysis of the signal itself (content-based identification)
Conditions:
– No associated information required (headers, ID3 tags)
– No embedded signals (e.g. watermarks) required
– Some knowledge about the music to be identified must be available (reference database)

Page 5: Requirements
Recognition rate:
– High recognition rates (> 95%), even with distorted signals
Robustness:
– Robust against various distortions: volume change, equalization, noise addition, audio coding (e.g. MP3), ...
– Robust against analog artifacts (e.g. D/A, A/D conversion)
Compactness:
– Small signature size
Scalability:
– Extensibility of the database (> 10^6 items) while keeping processing time low (a few ms/item)

Page 6: System Architecture - Overview

Page 7: System Architecture
Feature Extractor:
– Signal preprocessing
– Extract the essence of the audio signal
Feature Processor:
– Increase discriminance & efficiency
– Temporal grouping of features (super vector)
– Statistics calculation (mean, variance, etc.)
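The feature-processor steps named on this slide (temporal grouping and statistics calculation) can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how the super vector is built, so concatenating consecutive frame vectors is an assumption, and the function names `group_frames` and `mean_and_variance` are hypothetical.

```python
def group_frames(frames, group_size):
    """Concatenate `group_size` consecutive frame vectors into one super vector.

    Assumption: a "super vector" is a plain concatenation of neighbouring
    frames; trailing frames that do not fill a whole group are dropped.
    """
    groups = []
    for start in range(0, len(frames) - group_size + 1, group_size):
        block = frames[start:start + group_size]
        groups.append([value for frame in block for value in frame])
    return groups


def mean_and_variance(frames):
    """Per-dimension mean and population variance over a list of frame vectors."""
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    variances = [sum((f[d] - means[d]) ** 2 for f in frames) / n
                 for d in range(dims)]
    return means, variances


# Four frames with two feature values each (made-up numbers).
frames = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
super_vectors = group_frames(frames, 2)
means, variances = mean_and_variance(frames)
```

Either step reduces the amount of data per item: grouping keeps temporal order inside each super vector, while the statistics discard it in favour of a fixed-size summary.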

Page 8: System Architecture
Class generator:
– Clustering of processed feature vectors: further reduces the amount of data and enhances robustness (guards against overfitting)
– Add class with associated metadata to the database
Classification:
– Compare feature vectors against the classes in the database by means of some metric
– Find the class yielding the best approximation
– Retrieve the associated metadata
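The classification step on this slide (compare against classes by some metric, pick the best approximation, return its metadata) can be sketched as below. The slides do not name the metric or the database layout, so the Euclidean distance, the dictionary structure, and the names `euclidean` and `classify` are assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally long vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(query_vectors, database):
    """Return (item_id, metadata) of the best-matching database entry.

    `database` maps a hypothetical item id to (centroids, metadata), where
    `centroids` are the cluster centres produced by the class generator.
    The score of an item is the mean distance from each query vector to
    its nearest centroid; the lowest score wins.
    """
    if not database or not query_vectors:
        raise ValueError("need a non-empty database and query")
    best_id, best_score = None, math.inf
    for item_id, (centroids, _metadata) in database.items():
        score = sum(min(euclidean(q, c) for c in centroids)
                    for q in query_vectors) / len(query_vectors)
        if score < best_score:
            best_id, best_score = item_id, score
    return best_id, database[best_id][1]


# Made-up reference database with two items.
database = {
    "a": ([[0.0, 0.0], [1.0, 1.0]], {"title": "A"}),
    "b": ([[10.0, 10.0]], {"title": "B"}),
}
item_id, metadata = classify([[0.9, 1.1], [0.1, 0.0]], database)
```

This linear scan touches every item once, which matches the slide's framing of processing time per item; a real system at the database sizes mentioned in the talk would likely need a faster search structure.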

Page 9: MPEG-7 - Elements for Robust Audio Matching
Low level data: AudioSpectrumFlatness LLD
– Derived from: Spectral Flatness Measure (SFM)
– Describes the flatness or unflatness of the spectrum in frequency bands (tonal vs. noise-like)
Fingerprint: AudioSignature Description Scheme
– Statistical data summarization of the AudioSpectrumFlatness LLD
– Textual description in XML syntax
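The Spectral Flatness Measure is commonly defined as the ratio of the geometric mean to the arithmetic mean of the power spectrum coefficients within a band: values near 1 indicate a flat, noise-like band, values near 0 a tonal one. The sketch below follows that common definition and is not the normative MPEG-7 extraction; the function names, the `eps` guard, and the band edges are hypothetical.

```python
import math


def spectral_flatness(power, eps=1e-12):
    """Geometric mean divided by arithmetic mean of one band's power values.

    `eps` is an assumed small offset that avoids log(0) on silent bins.
    """
    if not power:
        raise ValueError("band must contain at least one coefficient")
    values = [p + eps for p in power]
    log_mean = sum(math.log(v) for v in values) / len(values)
    arithmetic_mean = sum(values) / len(values)
    return math.exp(log_mean) / arithmetic_mean


def band_flatness(power_spectrum, band_edges):
    """Flatness per band; `band_edges` are hypothetical bin indices [lo, hi)."""
    return [spectral_flatness(power_spectrum[lo:hi])
            for lo, hi in zip(band_edges, band_edges[1:])]


noise_like = [1.0] * 8           # flat band: every bin has the same power
tonal = [1e-6] * 7 + [1.0]       # one dominant bin: strongly tonal band
per_band = band_flatness(noise_like + tonal, [0, 8, 16])
```

Because the measure is a ratio of two means of the same values, scaling the whole band by a constant leaves it nearly unchanged, which fits the robustness to volume change listed among the requirements.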

Page 10: MPEG-7 - Benefits
– Standardized feature format guarantees worldwide interoperability
– Published, open format: descriptive data can be produced easily
– Large MPEG-7 compliant databases are expected to be available in the near future (incl. fingerprints)
– Long-term format stability / lifetime

Page 11: Recognition Performance - Conditions
Conditions: training and test sets (mostly rock / pop)
– 15,000 items
– 90,000 items
Considered feature:
– Spectral Flatness Measure (SFM)
Classification performance:
– Number of correctly identified items (both single best and within top 10)
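The performance measure described here (correct as the single best match, or correct within the top 10) can be computed as in the following sketch. The function name `top_k_rate` and the toy data are hypothetical; the figures are made up and are not the results reported in the talk.

```python
def top_k_rate(ranked_results, truths, k):
    """Fraction of queries whose true item appears among the first k candidates.

    `ranked_results[i]` is the candidate list for query i, best match first;
    `truths[i]` is the correct item for that query.
    """
    if not truths or len(ranked_results) != len(truths):
        raise ValueError("need one ranked list per ground-truth item")
    hits = sum(1 for ranked, truth in zip(ranked_results, truths)
               if truth in ranked[:k])
    return hits / len(truths)


# Three made-up queries, each returning a short ranked candidate list.
ranked = [["a", "b"], ["c", "a"], ["b", "c"]]
truths = ["a", "a", "a"]
top1 = top_k_rate(ranked, truths, 1)
top2 = top_k_rate(ranked, truths, 2)
```

The top-k rate can never be lower than the top-1 rate for the same queries, which is why the second figure in each "Top 1 / Top 10" pair is at least as high as the first.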

Page 12: Recognition Performance - 15k items
Feature: SFM in frequency bands; advanced matching with temporal tracking
Distortion       Top 1 / Top 10
Cropping         100.0% / 100.0%
96 kbps          99.6% / 99.8%
Loudsp./Mic.     98.0% / 99.0%

Page 13: Recognition Performance - 90k items
16 bands; advanced matching with temporal tracking

Page 14: Applications
– Retrieve associated metadata by identifying audio content
– Automated search for audio content on the Internet
– Broadcast monitoring by logging the transmission of audio material
– Feature-based indexing of audio databases (similarity search)
– ...

Page 15: Conclusions
– High recognition rates (> 99%, tested with 90,000 items)
– Robust to real-world signal distortions
– Fast and reliable extraction and classification
– The underlying feature is specified in the MPEG-7 standard, which ensures worldwide interoperability; licensing is available to everyone

Page 16: Real-Time Demonstration
– Demo running on a laptop (Pentium, 500 MHz)
– Local database with 15,000 items (rock / pop genre)
– Acoustic transmission: MP3 -> D/A -> speakers -> noisy environment -> microphone -> A/D -> AudioID

Page 17: Thanks for your attention!