Published by Jason Page. Modified over 9 years ago.
Summer School on Hashing '14
Dimension Reduction
Alex Andoni (Microsoft Research)
Nearest Neighbor Search (NNS)
Motivation
Generic setup:
- Points model objects (e.g. images)
- Distance models (dis)similarity measure
Application areas:
- machine learning: k-NN rule
- speech/image/video/music recognition, vector quantization, bioinformatics, etc.
Distance can be:
- Hamming, Euclidean, edit distance, Earth-mover distance, etc.
Primitive for other problems:
- find the similar pairs in a set D, clustering, etc.
[Figure: example points as 6-bit strings: 000000, 011100, 010100, 000100, 010100, 011111, 000000, 001100, 000100, 110100, 111111]
2D case
High-dimensional case
Algorithm                    Query time    Space
Full indexing
No indexing (linear scan)
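The "no indexing" row can be sketched as follows. This is a minimal illustration, not code from the talk: it assumes Euclidean distance and points stored as plain tuples, and the function names are made up for the example. With n points in d dimensions, each query costs O(n·d) time and needs no extra space beyond the data itself.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def linear_scan_nn(points, query):
    """Return (index, distance) of the point nearest to `query`.

    No index is built: every query compares against all n points.
    """
    best_i, best_d = -1, float("inf")
    for i, p in enumerate(points):
        dist = euclidean(p, query)
        if dist < best_d:
            best_i, best_d = i, dist
    return best_i, best_d

points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
idx, dist = linear_scan_nn(points, (0.9, 1.2))
```

Swapping `euclidean` for a Hamming or edit-distance function leaves the scan itself unchanged, which is why linear scan serves as the baseline for every distance listed in the motivation.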
Dimension Reduction
Johnson Lindenstrauss Lemma
Idea:
1D embedding
Full Dimension Reduction
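The slide body did not survive extraction, so the following is only a sketch of the standard Gaussian construction usually associated with the Johnson-Lindenstrauss lemma, not a reconstruction of the slide. It assumes a k x d matrix G with independent N(0, 1) entries, scaled by 1/sqrt(k) so that squared norms are preserved in expectation. The names `gaussian_projection` and `project` and the `seed` parameter are illustrative.

```python
import math
import random

def gaussian_projection(d, k, seed=0):
    """Build a k x d matrix with i.i.d. N(0, 1/k) entries (rows as lists)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]

def project(G, x):
    """Apply the linear map G to a vector x of length d."""
    return [sum(g * xi for g, xi in zip(row, x)) for row in G]

def sq_norm(v):
    return sum(a * a for a in v)

d, k = 50, 200
G = gaussian_projection(d, k)
x = [float(i % 7) - 3.0 for i in range(d)]
ratio = sq_norm(project(G, x)) / sq_norm(x)  # concentrates around 1 as k grows
```

Because the map is linear, preserving the norm of every difference x - y is the same as preserving pairwise distances, which is why a norm-concentration bound per vector is enough.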
Concentration
Johnson Lindenstrauss: wrap-up
Faster JL?
Fast JL Transform
(label: normalization constant)
Fast JLT: sparse projection
FJLT: full
- Projection: sparse matrix
- “Spreading around”: Hadamard (Fast Fourier Transform) and Diagonal
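The three labelled pieces (diagonal, Hadamard, sparse projection) can be sketched as a composition P·H·D. This is an illustration under stated assumptions, not the construction from the slide: D is a random ±1 diagonal, H is the normalized Walsh-Hadamard transform (so the input length must be a power of two), and P keeps each entry with a hypothetical density `q`, drawing kept entries from N(0, 1/q). For clarity the projection loops over all k·d entries, whereas a real implementation would store and touch only the nonzeros.

```python
import math
import random

def fwht(x):
    """Normalized fast Walsh-Hadamard transform in O(d log d); len(x) must be a power of two."""
    y = list(x)
    n = len(y)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in y]

def fjlt(x, k, q=0.5, seed=0):
    """Sketch of x -> (1/sqrt(k)) * P * H * D * x.

    The random choices depend only on (len(x), k, q, seed), so calls that
    share those values apply the same linear map.
    """
    rng = random.Random(seed)
    # D: random sign flips; H: Hadamard. Together they spread the mass of x.
    signs = [rng.choice((-1.0, 1.0)) for _ in range(len(x))]
    y = fwht([s * xi for s, xi in zip(signs, x)])
    # P: each entry is nonzero with probability q, then Gaussian with variance 1/q.
    sigma = 1.0 / math.sqrt(q)
    out = []
    for _ in range(k):
        acc = 0.0
        for yj in y:
            if rng.random() < q:
                acc += rng.gauss(0.0, sigma) * yj
        out.append(acc / math.sqrt(k))
    return out

x = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # a maximally "spiky" input
z = fjlt(x, k=4)
```

The H·D step is norm-preserving, so all distortion comes from the sparse projection. Spreading a spiky vector first is what allows that projection to be sparse.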
Spreading around: intuition
- Projection: sparse matrix
- “Spreading around”: Hadamard (Fast Fourier Transform) and Diagonal
Spreading around: proof
FJLT: wrap-up
Dimension Reduction: beyond Euclidean
Weak embedding [Indyk’00]