Published by Ronald Page. Modified over 8 years ago.
1
Audio Fingerprinting Overview: RARE Algorithms, Resources
Chris Burges, John Platt, Jon Goldstein, Erin Renshaw
http://msrweb/~cburges/rare.htm
2
Let’s agree on names…
A ‘fingerprint’ is a vector that represents a given audio clip. It lives in a database with a lot of other fingerprints.
A ‘confirmation fingerprint’ is a second fingerprint used to confirm a match.
A ‘trace’ is generated from audio every 186 ms. It’s computed exactly the same way as a fingerprint.
3
Analyze a Stream
[Figure: 6 sec of audio → 64 floats / frame → In Database? (0/1) → Confirmed? (0/1). If 1, declare match.]
Design of the Funnel
Find 64 good projections of 6 seconds of audio.
[Figure: 6 sec of Song A, 6 sec of distorted Song A, and 6 sec of Song B shown along a good projection, with distances δ1 and δ2 marked.]
Good projections maximize δ2 / δ1.
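The two decision stages of the funnel can be sketched as a lookup followed by a confirmation. This is a minimal illustration, not the presenters' implementation: the Euclidean metric, the two thresholds, the dict-based database, and the names `distance` and `funnel` are all assumptions.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def funnel(trace, confirm_trace, database, lookup_thresh, confirm_thresh):
    """Return the matched clip name, or None.

    database maps name -> (fingerprint, confirmation_fingerprint).
    Stage 1 ("In Database?"): find the nearest fingerprint to the trace.
    Stage 2 ("Confirmed?"): compare a second trace against that clip's
    confirmation fingerprint.
    """
    best_name, best_dist = None, float("inf")
    for name, (fingerprint, _) in database.items():
        d = distance(trace, fingerprint)
        if d < best_dist:
            best_name, best_dist = name, d
    if best_name is None or best_dist > lookup_thresh:
        return None  # stage 1 outputs 0
    _, confirm_fp = database[best_name]
    if distance(confirm_trace, confirm_fp) > confirm_thresh:
        return None  # stage 2 outputs 0
    return best_name  # both stages output 1: declare match
```

A linear scan stands in for the database lookup here only to keep the sketch short.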
4
Feature Extraction (186 ms)
5
De-Equalization
De-equalize by flattening the log spectrum.
[Figure: spectrum before and after de-equalization]
6
De-Equalization Details
Goal: Remove slow variation in frequency space.
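One way to remove slow variation in frequency space is to estimate it from the lowest few DCT components of the log spectrum and subtract it. The sketch below assumes exactly that, with 6 components (borrowed from the DCT remark on the Client: Resources slide, on the assumption that it refers to this step). The slides do not say how the coefficients are modified, so plain removal is a guess, and `dct_basis` and `de_equalize` are hypothetical names.

```python
import math

def dct_basis(k, n):
    """k-th orthonormal DCT-II basis vector of length n."""
    scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    return [scale * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)]

def de_equalize(log_spectrum, n_low=6):
    """Subtract the slow variation captured by the first n_low DCT components.

    Only n_low coefficients are ever computed, so the cost is O(n_low * N)
    rather than the O(N log N) of a full transform.
    """
    n = len(log_spectrum)
    flattened = list(log_spectrum)
    for k in range(n_low):
        basis = dct_basis(k, n)
        coeff = sum(b * x for b, x in zip(basis, log_spectrum))
        flattened = [f - coeff * b for f, b in zip(flattened, basis)]
    return flattened
```

Because the basis vectors are orthonormal, fine spectral detail carried by higher components passes through unchanged.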
7
Perceptual Thresholding
Remove coefficients that are below a perceptual threshold to lower unwanted variance.
[Figure: coefficients labeled as inaudible or audible to a human listener]
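As a minimal sketch of the idea, the function below zeroes every coefficient whose magnitude falls below its threshold. Treating "remove" as "set to zero" and using one threshold per coefficient are assumptions; the slides do not give the threshold values or how they are derived.

```python
def perceptual_threshold(coeffs, thresholds):
    """Zero each coefficient whose magnitude is below its (assumed) threshold."""
    return [c if abs(c) >= t else 0.0 for c, t in zip(coeffs, thresholds)]
```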
8
Project to 64 Floats
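The projection step amounts to multiplying the extracted features by a matrix with 64 rows. The sketch below shows that product, plus the δ2 / δ1 criterion from the funnel design slide. It assumes δ1 is the projected distance between a clip and its distorted version and δ2 the projected distance to a different clip; the figure suggests this but the slides do not define the symbols. The function names are hypothetical.

```python
import math

def project(matrix, features):
    """Multiply a (rows x N) projection matrix by a length-N feature vector."""
    return [sum(w * x for w, x in zip(row, features)) for row in matrix]

def separation_ratio(matrix, clip, distorted_clip, other_clip):
    """delta2 / delta1 after projection, under the assumed definitions above."""
    p = project(matrix, clip)
    p_distorted = project(matrix, distorted_clip)
    p_other = project(matrix, other_clip)
    delta1 = math.dist(p, p_distorted)  # same clip, with distortion
    delta2 = math.dist(p, p_other)      # a different clip
    return delta2 / delta1              # raises ZeroDivisionError if delta1 == 0
```

In the real system the matrix would have 64 rows; the sketch works for any row count.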
9
Bitvector yields 50x Speedup
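The slide gives only the headline. One common way to get a speedup of this kind is to reduce each float to its sign bit and compare packed bit vectors by Hamming distance before computing any full distance. The sketch below assumes that scheme, which the slides do not confirm; the function names and the `max_bits` cutoff are hypothetical.

```python
def to_bitvector(floats):
    """Pack the sign of each float into one integer (bit i set if floats[i] > 0)."""
    bits = 0
    for i, x in enumerate(floats):
        if x > 0:
            bits |= 1 << i
    return bits

def hamming(a, b):
    """Number of differing bits between two packed bit vectors."""
    return bin(a ^ b).count("1")

def candidates(trace, db_bits, max_bits):
    """Indices of database entries within max_bits of the trace's bit vector."""
    trace_bits = to_bitvector(trace)
    return [i for i, fp_bits in enumerate(db_bits)
            if hamming(trace_bits, fp_bits) <= max_bits]
```

Only the surviving candidates would then need the full 64-float comparison.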
10
Example Architecture
[Figure: Client (audio stream → Feature Extraction → Optional Pruning) → Internet → Server (Lookup) → audio stream identity]
11
Client: Resources
Computing traces takes approximately 10% of the CPU on a 750 MHz Pentium III. However, we can get a speedup over the current DCT, since we’re only modifying the first 6 coefficients: O(N log N) → O(6N).
Total data loaded by the client is 2.1 MB.
12
Client Side Options
What can be done on the client side to offload the server lookup? Three ideas (in addition to only querying untagged music, and adding ID3 tags when found):
1. Leverage Zipf’s law (if it holds!)
2. Reduce rate at which traces are sent
3. Prune traces on the client
13
Client Side Pruning – Local Lookup
Having a database of fingerprints on the client for e.g. the top 10,000 songs would significantly reduce server load, but we don’t know by how much. It also requires updates (e.g. weekly?).
[Figure: Zipf’s Law, log(# times played) vs. log(rank)]
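Under Zipf's law, play count falls off as a power of rank, which appears as a straight line on a log-log plot. To get a feel for how much a local database might absorb, the sketch below computes the share of plays going to the top k songs. The exponent and catalogue size are hypothetical inputs, not measurements; the slides state that the actual reduction is unknown.

```python
def zipf_coverage(top_k, catalogue_size, exponent=1.0):
    """Fraction of all plays that go to the top_k ranked songs,
    assuming plays(rank) is proportional to rank ** -exponent."""
    weights = [rank ** -exponent for rank in range(1, catalogue_size + 1)]
    return sum(weights[:top_k]) / sum(weights)
```

For example, `zipf_coverage(10_000, 254_885)` gives about 0.75 with exponent 1, taking the fingerprint count from the Margin Trees slide as a stand-in catalogue size. The result is sensitive to both assumptions.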
14
Client Options, cont.
We can halve the trace rate (trace interval from 186 ms to 372 ms) at some (likely small) loss in accuracy. This would halve both client CPU and server load.
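The arithmetic behind the factor of 2 is shown below. It assumes client CPU and server load both scale linearly with the number of traces, which is the premise of the slide rather than something measured here.

```python
def traces_per_second(interval_ms):
    """Traces generated per second of audio at the given trace interval."""
    return 1000.0 / interval_ms

current = traces_per_second(186)   # about 5.4 traces per second
reduced = traces_per_second(372)   # about 2.7 traces per second
load_ratio = reduced / current     # 0.5 under the linear-scaling assumption
```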
15
Client Side Pruning – Margin Trees
Using a tree built from the first 24 components:
No overpopulating, but flip the 5 most error-prone bits in each trace.
Gets a factor of 2 reduction in throughput at a 0.5% increase in false negatives for very noisy data.
The number of nodes in the tree (for 254,885 fingerprints) was found to be 1,531,508.
Requires updates (e.g. weekly?).
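One reading of "flip 5 most error-prone bits" is sketched below: take the sign bits of the first 24 components as a key, treat the components closest to zero as the most error-prone, and try every combination of flips among those 5 bits (32 keys) when deciding whether the trace could be in the database. A plain set of keys stands in for the tree. Both that substitution and the closest-to-zero criterion are assumptions, and the function names are hypothetical.

```python
from itertools import product

def key_bits(components):
    """Sign bits of the components as a tuple of 0/1."""
    return tuple(1 if x > 0 else 0 for x in components)

def candidate_keys(trace, n_components=24, n_flip=5):
    """All keys obtained by flipping any subset of the n_flip bits whose
    components have the smallest magnitude (assumed most error-prone)."""
    head = trace[:n_components]
    base = key_bits(head)
    risky = sorted(range(len(head)), key=lambda i: abs(head[i]))[:n_flip]
    keys = []
    for flips in product((0, 1), repeat=len(risky)):
        key = list(base)
        for idx, flip in zip(risky, flips):
            key[idx] ^= flip
        keys.append(tuple(key))
    return keys

def keep_trace(trace, known_keys):
    """Client-side pruning: send the trace only if some candidate key is known."""
    return any(k in known_keys for k in candidate_keys(trace))
```

Traces for which `keep_trace` returns False would be dropped on the client and never sent to the server.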
16
A note on the code
Upper bound: 22,000 lines of C++.
File- and stream-based versions use the same libraries.