Slide 1
Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss Transform
Nir Ailon, Bernard Chazelle (Princeton University)
Slide 2
Dimension Reduction
Algorithmic metric embedding technique: $(\mathbb{R}^d, \ell_q) \to (\mathbb{R}^k, \ell_p)$, $k \ll d$.
Useful in algorithms requiring exponential (in $d$) time/space.
Johnson-Lindenstrauss for $\ell_2$: what is the exact complexity?
Slide 3
Dimension Reduction Applications
- Approximate nearest neighbor [KOR00, IM98]
- Text analysis [PRTV98]
- Clustering [BOR99, S00]
- Streaming [I00]
- Linear algebra [DKM05, DKM06]: matrix multiplication, SVD computation, $\ell_2$ regression
- VLSI layout design [V98]
- Learning [AV99, D99, V98]...
Slide 4
Three Quick Slides on: Approximate Nearest Neighbor Searching...
Slide 5
Approximate Nearest Neighbor
$P$ = set of $n$ points. For a query $x$ with nearest neighbor $p_{\min} = \arg\min_{p \in P} \mathrm{dist}(x, p)$, return any $p \in P$ with $\mathrm{dist}(x, p) \le (1+\varepsilon)\,\mathrm{dist}(x, p_{\min})$.
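A minimal sketch (my illustration, not from the talk; points assumed stored as numpy rows) of checking this guarantee by brute force:

```python
# Brute-force check of the (1 + eps)-approximate nearest-neighbor guarantee.
import numpy as np

def is_approx_nn(x, P, p, eps):
    """True iff dist(x, p) <= (1 + eps) * dist(x, p_min)."""
    dists = np.linalg.norm(P - x, axis=1)        # distances from x to all n points
    return np.linalg.norm(p - x) <= (1 + eps) * dists.min()
```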
Slide 6
Approximate Nearest Neighbor
$d$ can be very large; $\varepsilon$-approximation beats the “curse of dimensionality”.
[IM98, H01] (Euclidean), [KOR00] (cube): time $O(\varepsilon^{-2} d \log n)$, space $n^{O(\varepsilon^{-2})}$.
Bottleneck: dimension reduction. Using the FJLT: $O(d \log d + \varepsilon^{-3} \log^2 n)$.
Slide 7
The d-Hypercube Case
[KOR00]: binary search on the distance $\ell \in [d]$. For distance $\ell$, multiply the space by a random matrix $B \in \mathbb{Z}_2^{k \times d}$, $k = O(\varepsilon^{-2} \log n)$, $B_{ij}$ i.i.d. $\sim$ biased coin. Preprocess lookup tables for $Bx \pmod 2$.
Our observation: $B$ can be made sparse, using a “handle” to $p \in P$ s.t. $\mathrm{dist}(x, p)$ is bounded.
Time for each step: $O(\varepsilon^{-2} d \log n) \Rightarrow O(d + \varepsilon^{-2} \log n)$.
How to make a similar improvement for $\ell_2$?
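A hedged illustration of this style of $\mathbb{Z}_2$ sketch (my reconstruction patterned on [KOR00], not the talk's code; the bias $q = 1/(2\ell)$ is an assumption). Each row of $B$ takes the parity of a random coordinate subset; two sketches disagree in a row with probability $(1 - (1-2q)^h)/2$ for Hamming distance $h$, so the empirical disagreement rate separates $h \le \ell$ from $h > (1+\varepsilon)\ell$:

```python
# GF(2) parity sketch for Hamming distance at scale ell.
import numpy as np

rng = np.random.default_rng(0)

def gf2_sketch(x, B):
    return (B @ x) % 2                          # k parities over Z_2

d, ell, k = 1024, 32, 400
q = 1.0 / (2 * ell)                             # inclusion bias tied to scale ell
B = (rng.random((k, d)) < q).astype(int)

x = rng.integers(0, 2, d)
y = x.copy(); y[:ell] ^= 1                      # Hamming distance exactly ell
disagree = np.mean(gf2_sketch(x, B) != gf2_sketch(y, B))
print(disagree, (1 - (1 - 2*q)**ell) / 2)       # empirical vs expected rate
```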
Slide 8
Back to Euclidean Space and Johnson-Lindenstrauss...
Slide 9
History of Johnson-Lindenstrauss Dimension Reduction
[JL84]: projection $\Phi$ of $\mathbb{R}^d$ onto a random subspace of dimension $k = c\,\varepsilon^{-2} \log n$. W.h.p.: $\forall\, p_i, p_j \in P$: $\|\Phi p_i - \Phi p_j\|_2 = (1 \pm O(\varepsilon))\,\|p_i - p_j\|_2$.
An $\ell_2 \to \ell_2$ embedding.
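To make the target dimension concrete, a tiny sketch (the constant $c = 4$ is an arbitrary illustrative choice; the lemma only guarantees some constant):

```python
# JL target dimension k = c * eps^-2 * ln n -- note it is independent of d.
import math

def jl_dim(n, eps, c=4.0):
    return math.ceil(c * math.log(n) / eps**2)

print(jl_dim(n=10**6, eps=0.1))   # ~5500: a million points land in a few thousand dims
```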
Slide 10
History of Johnson-Lindenstrauss Dimension Reduction
[FM87], [DG99]: simplified proof, improved constant $c$.
$\Phi \in \mathbb{R}^{k \times d}$: random orthogonal matrix with rows $\phi_1, \phi_2, \dots, \phi_k$, $\|\phi_i\|_2 = 1$, $\phi_i \cdot \phi_j = 0$.
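A minimal sketch (my illustration) of drawing such a matrix, via QR factorization of a Gaussian matrix:

```python
# Random k x d matrix with exactly orthonormal rows.
import numpy as np

def random_orthogonal_rows(k, d, rng):
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))  # d x k with orthonormal columns
    return Q.T                                        # k x d with orthonormal rows

rng = np.random.default_rng(0)
Phi = random_orthogonal_rows(8, 128, rng)
print(np.allclose(Phi @ Phi.T, np.eye(8)))            # True: ||phi_i|| = 1, phi_i . phi_j = 0
```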
Slide 11
History of Johnson-Lindenstrauss Dimension Reduction
[IM98]: $\Phi \in \mathbb{R}^{k \times d}$, $\Phi_{ij}$ i.i.d. $\sim N(0, 1/d)$. Rows $\phi_1, \dots, \phi_k$: $E\|\phi_i\|_2^2 = 1$, $E[\phi_i \cdot \phi_j] = 0$.
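A hedged sketch of this Gaussian projection (the final $\sqrt{d/k}$ rescaling is my normalization convention, chosen so expected squared norms are preserved):

```python
# Gaussian JL projection: entries i.i.d. N(0, 1/d), then rescale by sqrt(d/k).
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 500, 2048, 0.2
k = int(4 * np.log(n) / eps**2)

P = rng.standard_normal((n, d))                          # n points in R^d
Phi = rng.normal(0.0, 1.0 / np.sqrt(d), size=(k, d))     # Phi_ij ~ N(0, 1/d)
Q = (P @ Phi.T) * np.sqrt(d / k)                         # so E||Phi v||^2 = ||v||^2

v = P[0] - P[1]
print(np.linalg.norm(Q[0] - Q[1]) / np.linalg.norm(v))   # ~ 1, within 1 +/- eps
```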
Slide 12
History of Johnson-Lindenstrauss Dimension Reduction
[A03]: need only tight concentration of $|\phi_i \cdot v|^2$. $\Phi \in \mathbb{R}^{k \times d}$, $\Phi_{ij}$ i.i.d. $\sim \{+1$ w.p. $1/2$, $-1$ w.p. $1/2\}$ (suitably scaled); $E\|\phi_i\|_2^2 = 1$, $E[\phi_i \cdot \phi_j] = 0$.
Slide 13
History of Johnson-Lindenstrauss Dimension Reduction
[A03], sparse: $\Phi \in \mathbb{R}^{k \times d}$, $\Phi_{ij}$ i.i.d. $\sim \{+1$ w.p. $1/6$, $0$ w.p. $2/3$, $-1$ w.p. $1/6\}$ (suitably scaled); $E\|\phi_i\|_2^2 = 1$, $E[\phi_i \cdot \phi_j] = 0$.
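A hedged sketch covering both of these [A03] distributions (the per-row normalization constants are my convention, chosen so $E\|\phi_i\|_2^2 = 1$):

```python
# Dense sign matrix (+1/-1 w.p. 1/2) and sparse variant (+1, 0, -1 w.p. 1/6, 2/3, 1/6).
import numpy as np

rng = np.random.default_rng(0)

def sign_jl(k, d):
    return rng.choice([-1.0, 1.0], size=(k, d)) / np.sqrt(d)

def sparse_jl(k, d):
    vals = rng.choice([-1.0, 0.0, 1.0], size=(k, d), p=[1/6, 2/3, 1/6])
    return vals * np.sqrt(3.0 / d)            # raw variance 1/3 -> rescale to 1/d

k, d = 256, 4096
v = rng.standard_normal(d)
for Phi in (sign_jl(k, d), sparse_jl(k, d)):
    print(np.linalg.norm(Phi @ v) * np.sqrt(d / k) / np.linalg.norm(v))   # both ~ 1
```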
Slide 14
Sparse Johnson-Lindenstrauss
Sparsity parameter: $s = \Pr[\Phi_{ij} \neq 0]$.
$s$ cannot be $o(1)$, due to the “hidden coordinate”:
(figure: $v \in \mathbb{R}^d$, all zeros except a few isolated 1s; a sparse $\Phi$ is likely to miss them)
Slide 15
Uncertainty Principle
$v$ sparse $\Rightarrow$ $\hat{v}$ dense, where $\hat{v} = Hv$.
$H$: Walsh-Hadamard matrix; the Fourier transform on $\{0,1\}^{\log_2 d}$.
Computable in time $O(d \log d)$.
Isometry: $\|\hat{v}\|_2 = \|v\|_2$.
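A minimal sketch of the fast transform (standard textbook FWHT, not code from the talk); with the $1/\sqrt{2}$ butterfly scaling it is orthonormal, hence an isometry:

```python
# Normalized fast Walsh-Hadamard transform in O(d log d), for d a power of 2.
import numpy as np

def fwht(v):
    v = v.astype(float)
    h = 1
    while h < len(v):                          # log2(d) rounds of butterflies
        for i in range(0, len(v), 2 * h):
            a, b = v[i:i+h].copy(), v[i+h:i+2*h].copy()
            v[i:i+h], v[i+h:i+2*h] = a + b, a - b
        v /= np.sqrt(2.0)                      # keep the transform orthonormal
        h *= 2
    return v

v = np.zeros(8); v[3] = 1.0                    # a maximally sparse input...
print(fwht(v))                                 # ...spreads to magnitude 1/sqrt(8) everywhere
print(np.linalg.norm(fwht(v)))                 # isometry: norm preserved (1.0)
```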
Slide 16
Adding Randomization
$H$ deterministic, invertible $\Rightarrow$ we’re back to square one! (An adversary can hand us $H^{-1}$ of a sparse vector.)
Precondition $H$ with a random diagonal $D = \mathrm{Diag}(\pm 1, \dots, \pm 1)$.
Computable in time $O(d)$; an isometry.
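A small demo of the point (my illustration, reusing `fwht()` from the sketch above): $H$ is self-inverse, so the adversarial input $v = He_1$ satisfies $Hv = e_1$, a hidden coordinate again; the random signs in $D$ spread $HDv$ back out:

```python
# Why H alone is not enough, and why the O(d) diagonal D fixes it.
import numpy as np

rng = np.random.default_rng(1)
d = 1024
e1 = np.zeros(d); e1[0] = 1.0
v = fwht(e1)                                   # adversarial: H v = e1 exactly
print(np.abs(fwht(v)).max())                   # 1.0 -- all mass on one coordinate

D = rng.choice([-1.0, 1.0], size=d)            # random diagonal, O(d) to apply
print(np.abs(fwht(D * v)).max())               # ~ sqrt(log d / d): mass is spread
```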
Slide 17
The $\ell_\infty$-Bound Lemma
W.h.p.: $\forall\, p_i, p_j \in P \subseteq \mathbb{R}^d$: $\|HD(p_i - p_j)\|_\infty \le O(d^{-1/2} \log^{1/2} n)\,\|p_i - p_j\|_2$.
Rules out: $HD(p_i - p_j)$ = “hidden coordinate vector”! Instead...
Slide 18
Hidden Coordinate-Set
Worst case $v = p_i - p_j$ (assuming the $\ell_\infty$ bound; assume $\|v\|_2 = 1$): there is $J \subseteq [d]$, $|J| = \Theta(d / \log n)$, with
$\forall j \in J$: $|v_j| = \Theta(d^{-1/2} \log^{1/2} n)$
$\forall j \notin J$: $v_j = 0$
Slide 19
Fast J-L Transform
FJLT $= \Phi\, H\, D$:
- $D = \mathrm{Diag}(\pm 1)$ (random signs)
- $H$ = Hadamard
- $\Phi$ = sparse JL: $\Phi_{ij}$ i.i.d. $\sim \{0$ w.p. $1-s$, $N(0,1)$ w.p. $s\}$
- $\ell_2 \to \ell_1$: $s = \varepsilon^{-1} \log n / d$; bottleneck: bias of $|\phi_i \cdot v|$
- $\ell_2 \to \ell_2$: $s = \log^2 n / d$; bottleneck: variance of $|\phi_i \cdot v|^2$
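Putting the pieces together, a hedged end-to-end sketch of the $\Phi H D$ structure (reusing `fwht()` from above; the normalization and the choice of constants are illustrative, not the paper's tuned values, and $\Phi$ is materialized densely for clarity rather than exploiting its sparsity):

```python
# FJLT sketch: random signs D, fast Hadamard H, then sparse Gaussian Phi.
import numpy as np

rng = np.random.default_rng(0)

def fjlt(v, k, s, D, rng):
    w = fwht(D * v)                            # H D v in O(d log d)
    d = len(v)
    mask = rng.random((k, d)) < s              # sparse support of Phi
    Phi = np.where(mask, rng.standard_normal((k, d)), 0.0)
    return (Phi @ w) / np.sqrt(k * s)          # so E||result||^2 = ||v||^2

n, d, eps = 1000, 2048, 0.25
k = int(4 * np.log(n) / eps**2)
s = min(1.0, np.log(n)**2 / d)                 # the slide's l2 -> l2 sparsity rate
D = rng.choice([-1.0, 1.0], size=d)

v = np.zeros(d); v[0] = 1.0                    # even a 1-sparse input is fine now
print(np.linalg.norm(fjlt(v, k, s, D, rng)))   # ~ 1 = ||v||_2
```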
Slide 20
Applications
Approximate nearest neighbor in $(\mathbb{R}^d, \ell_2)$.
$\ell_2$ regression: minimize $\|Ax - b\|_2$, $A \in \mathbb{R}^{n \times d}$, over-constrained ($d \ll n$).
[DMM06]: approximate by sampling (non-constructive).
[Sarlos06]: using the FJLT $\Rightarrow$ constructive.
More applications...?
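A hedged sketch-and-solve illustration for the regression application (my example: for brevity $S$ is a plain Gaussian JL matrix; [Sarlos06] uses the FJLT for speed, and the sketch size $k$ is an arbitrary choice well above $d$):

```python
# Solve the k x d sketched least-squares problem instead of the n x d one.
import numpy as np

rng = np.random.default_rng(0)
n, d = 10000, 50
A = rng.standard_normal((n, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

k = 500                                         # sketch size: d << k << n
S = rng.standard_normal((k, n)) / np.sqrt(k)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

residual = lambda x: np.linalg.norm(A @ x - b)
print(residual(x_sketch) / residual(x_exact))   # ~ 1 + eps
```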
Slide 21
Interesting Problem I
Improvements and lower bounds for J-L computation.
Slide 22
Interesting Problem II
Dimension reduction is sampling. Sampling by random walk:
- expander graphs for uniform sampling
- convex bodies for volume estimation
[Kac59]: random walk on the orthogonal group. For $t = 1 \dots T$: pick $i, j \in_R [d]$, $\theta \in_R [0, 2\pi)$, and rotate:
$v_i \leftarrow v_i \cos\theta + v_j \sin\theta$
$v_j \leftarrow -v_i \sin\theta + v_j \cos\theta$
Output $(v_1, \dots, v_k)$ as the dimension reduction of $v$. How many steps for a J-L guarantee? [CCL01], [DS00], [P99]...
Thank You!
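A hedged sketch of this random walk (my illustration: the slide's two updates form one simultaneous Givens rotation, so a temporary is needed; the number of steps $T$ is exactly the open question, and the value below plus the $\sqrt{d/k}$ rescaling of the kept coordinates are arbitrary choices):

```python
# Kac-style random walk: T random Givens rotations, then keep k coordinates.
import numpy as np

rng = np.random.default_rng(0)

def kac_walk(v, k, T, rng):
    v = v.copy()
    d = len(v)
    for _ in range(T):
        i, j = rng.choice(d, size=2, replace=False)   # i, j in_R [d]
        theta = rng.uniform(0.0, 2.0 * np.pi)         # theta in_R [0, 2*pi)
        vi, vj = v[i], v[j]                           # rotate simultaneously
        v[i] = vi * np.cos(theta) + vj * np.sin(theta)
        v[j] = -vi * np.sin(theta) + vj * np.cos(theta)
    return v[:k] * np.sqrt(d / k)                     # rescale the kept block

d, k = 1024, 64
v = rng.standard_normal(d)
u = kac_walk(v, k, T=20 * d * int(np.log(d)), rng=rng)
print(np.linalg.norm(u) / np.linalg.norm(v))          # ~ 1 once the walk has mixed
```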