Audio Fingerprinting Overview: RARE Algorithms, Resources Chris Burges, John Platt, Jon Goldstein, Erin Renshaw


1 Audio Fingerprinting Overview: RARE Algorithms, Resources Chris Burges, John Platt, Jon Goldstein, Erin Renshaw http://msrweb/~cburges/rare.htm

2 Let’s agree on names… A ‘fingerprint’ is a vector that represents a given audio clip. It lives in a database with a lot of other fingerprints. A ‘confirmation fingerprint’ is a second fingerprint used to confirm a match. A ‘trace’ is generated from audio every 186 ms. It’s computed exactly the same way as a fingerprint.

3 Design of the Funnel Analyze a stream: 6 sec of audio → 64 floats / frame → In database? → Confirmed? → if 1, declare match. Find 64 good projections of 6 seconds of audio. Good projections maximize δ2 / δ1. [Figure: 6 sec of Song A, 6 sec of distorted Song A, and 6 sec of Song B along a good projection, with distances δ1 and δ2 marked.]

4 Feature Extraction (186 ms)

5 De-Equalization De-equalize by flattening the log spectrum. [Figure panels: Before, After.]

6 De-Equalization Details Goal: Remove slow variation in frequency space
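One way to remove slow variation in frequency space is to take a DCT of the log spectrum, rebuild a smooth envelope from the lowest few coefficients, and subtract it. The sketch below assumes that approach; the function names and the choice of zeroing exactly 6 coefficients are illustrative assumptions, not the RARE implementation.

```python
import math

def dct2(x):
    """Unnormalized DCT-II, written as an O(N^2) loop for clarity."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

def idct2(c):
    """Inverse of dct2 above (a scaled DCT-III)."""
    n = len(c)
    return [(c[0] / 2 + sum(c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                            for k in range(1, n))) * 2 / n
            for i in range(n)]

def de_equalize(log_spectrum, n_low=6):
    """Subtract the slowly varying envelope of a log spectrum.

    n_low is an assumed parameter: the number of low-order DCT
    coefficients treated as 'slow variation'.
    """
    coeffs = dct2(log_spectrum)
    low_only = coeffs[:n_low] + [0.0] * (len(coeffs) - n_low)
    envelope = idct2(low_only)
    return [v - e for v, e in zip(log_spectrum, envelope)]
```

Applied to a log spectrum made of a slow tilt plus a fast ripple, this keeps the ripple and discards the tilt, which is the flattening effect the slide describes.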

7 Perceptual Thresholding Remove coefficients that are below a perceptual threshold to lower unwanted variance. [Figure labels: inaudible to human; audible to human.]
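A minimal sketch of the thresholding step, assuming one threshold value per frequency coefficient. The threshold values and the function name are hypothetical; the slides do not say how the perceptual threshold is computed.

```python
def apply_perceptual_threshold(coeffs, thresholds):
    """Zero every coefficient that falls below its (assumed) audibility threshold."""
    if len(coeffs) != len(thresholds):
        raise ValueError("coeffs and thresholds must have the same length")
    return [c if c >= t else 0.0 for c, t in zip(coeffs, thresholds)]
```

Coefficients that a listener could not hear no longer contribute to the trace, so they cannot add variance between two copies of the same audio.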

8 Project to 64 Floats

9 Bitvector yields 50x Speedup
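The slide does not say how the bitvector is built or used. A common pattern, assumed here, is to keep one sign bit per float and use a cheap XOR-and-popcount comparison to discard most database entries before any floating-point distance is computed. All names and the `max_bits` cutoff below are hypothetical.

```python
def to_bitvector(vec):
    """Pack the sign of each component into one integer (bit i = 1 if vec[i] > 0)."""
    bits = 0
    for i, v in enumerate(vec):
        if v > 0:
            bits |= 1 << i
    return bits

def hamming(a, b):
    """Number of differing bits between two packed bitvectors."""
    return bin(a ^ b).count("1")

def prefilter(trace_bits, db_bits, max_bits):
    """Return indices of database entries within max_bits of the trace.

    Only these survivors would need the full 64-float comparison.
    """
    return [i for i, fp in enumerate(db_bits) if hamming(trace_bits, fp) <= max_bits]
```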

10 Example Architecture Client: audio stream → feature extraction → optional pruning. Internet. Server: lookup → audio stream identity.

11 Client: Resources Computing traces takes approximately 10% CPU on a 750 MHz P3. However, we can get a speedup over the current DCT, since we’re only modifying the first 6 coefficients: O(N log N) → O(6N). Total data loaded by the client is 2.1 MB.
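The O(N log N) → O(6N) remark can be illustrated by computing only the first few DCT coefficients directly, one length-N dot product per coefficient, instead of running a full fast transform. This is a sketch of that idea using an unnormalized DCT-II; the function name is an assumption and the slides do not show the actual code.

```python
import math

def dct_first_k(x, k=6):
    """First k coefficients of an unnormalized DCT-II, at O(k * N) cost."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * j / n) for i in range(n))
            for j in range(k)]
```

With k fixed at 6, the direct sums need fewer operations than a fast transform once log2(N) exceeds roughly 6, ignoring constant factors.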

12 Client Side Options What can be done on the client side to offload the server lookup? Three ideas (in addition to only querying untagged music, and adding ID3 tags when found): 1. Leverage Zipf’s law (if it holds!) 2. Reduce the rate at which traces are sent 3. Prune traces on the client

13 Client Side Pruning – Local Lookup Having a database of fingerprints for e.g. the top 10,000 songs would significantly reduce server load, but we don’t know by how much. Also requires updates (e.g. weekly?). [Figure: Zipf’s Law, log(# times played) vs. log(rank).]
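As a rough illustration of what Zipf's law would imply if it held, the sketch below estimates the share of plays covered by a local top-k database. It assumes play counts proportional to 1 / rank^s with s = 1, and it borrows 254,885 (the fingerprint count quoted for the margin tree) as a stand-in catalog size. Both are assumptions, not measurements.

```python
def zipf_coverage(top_k, catalog_size, exponent=1.0):
    """Fraction of all plays that hit the top_k songs under an assumed Zipf law."""
    top = sum(rank ** -exponent for rank in range(1, top_k + 1))
    total = sum(rank ** -exponent for rank in range(1, catalog_size + 1))
    return top / total

# Hypothetical scenario: top 10,000 songs out of 254,885.
coverage = zipf_coverage(10_000, 254_885)
```

Under these assumptions roughly three quarters of lookups would be answered locally, but the real figure depends entirely on the true exponent and catalog size, which the slide leaves open.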

14 Client Options, cont. Can reduce sampling by a factor of 2 (from 186 to 372 ms) at some (likely small) loss in accuracy. This would halve both client CPU and server load.

15 Client Side Pruning – Margin Trees Using a tree built from the first 24 components: no overpopulating, but flip the 5 most error-prone bits in each trace. Gets a factor of 2 reduction in throughput at a 0.5% increase in false negatives for very noisy data. The number of nodes in the tree (for 254,885 fingerprints) was found to be 1,531,508. Requires updates (e.g. weekly?).
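The slide does not spell out how the bit flipping works. One plausible reading, sketched below, is that each of the first 24 components is quantized to a sign bit, the bits whose components lie closest to zero are treated as the most error-prone, and every combination of flips over those 5 bits is tried as a lookup key. The function name, the sign quantization, and the enumeration of all 32 combinations are assumptions.

```python
from itertools import combinations

def candidate_keys(components, n_flip=5):
    """Generate lookup keys by flipping the n_flip least reliable sign bits."""
    bits = [1 if c > 0 else 0 for c in components]
    # Assumed reliability measure: a small magnitude means the sign is easily flipped by noise.
    risky = sorted(range(len(components)), key=lambda i: abs(components[i]))[:n_flip]
    keys = set()
    for r in range(len(risky) + 1):
        for subset in combinations(risky, r):
            flipped = list(bits)
            for i in subset:
                flipped[i] ^= 1
            keys.add(tuple(flipped))
    return keys
```

With 5 risky bits this yields 32 candidate keys per trace, so a client could test each against its local tree and drop traces that match none of them.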

16 A note on the code Upper bound: 22,000 lines of C++. File- and stream-based versions use the same libraries.

