Slide 1: Sketching and Embedding are Equivalent for Norms
Alexandr Andoni (Simons Institute), Robert Krauthgamer (Weizmann Institute), Ilya Razenshteyn (CSAIL MIT)
Slide 2: Sketching
- Compress a massive object to a small sketch
- Rich theories exist for high-dimensional vectors, matrices, and graphs
- Applications: similarity search, compressed sensing, numerical linear algebra
- Dimension reduction (Johnson, Lindenstrauss 1984): a random projection onto a low-dimensional subspace preserves distances
- Guiding question: when is sketching possible?
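As an illustration of the dimension-reduction phenomenon, here is a minimal numpy sketch (a plain Gaussian projection, not the original construction) showing that distances between high-dimensional points survive a drastic reduction in dimension:

```python
import numpy as np

rng = np.random.default_rng(0)

def jl_project(points, k):
    """Project d-dimensional points onto a random k-dimensional subspace
    (a Johnson-Lindenstrauss-style map), scaled so Euclidean distances
    are preserved in expectation."""
    d = points.shape[1]
    A = rng.normal(size=(k, d)) / np.sqrt(k)
    return points @ A.T

X = rng.normal(size=(20, 10_000))   # 20 points in d = 10,000
Y = jl_project(X, k=500)            # the same points in k = 500

orig = np.linalg.norm(X[0] - X[1])
proj = np.linalg.norm(Y[0] - Y[1])
ratio = proj / orig                 # concentrates near 1
```

With k = 500, the typical relative error is on the order of 1/sqrt(k), so the ratio lands close to 1 despite a 20x reduction in dimension.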
Slide 3: Similarity search
- Motivation: similarity search, where similarity is modeled as a metric
- Sketching may speed up computation and allow indexing
- Interesting metrics:
  - Euclidean ℓ_2: d(x, y) = (∑_i |x_i − y_i|²)^{1/2}
  - Manhattan/Hamming ℓ_1: d(x, y) = ∑_i |x_i − y_i|
  - ℓ_p distances: d(x, y) = (∑_i |x_i − y_i|^p)^{1/p} for p ≥ 1
  - Edit distance, Earth Mover's Distance, etc.
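The ℓ_p distances above are straightforward to compute; a small helper (hypothetical, for illustration only) recovering the Manhattan and Euclidean cases:

```python
import numpy as np

def lp_distance(x, y, p):
    """d(x, y) = (sum_i |x_i - y_i|^p)^(1/p), the l_p metric for p >= 1."""
    return float(np.sum(np.abs(np.asarray(x) - np.asarray(y)) ** p) ** (1.0 / p))

x, y = [1.0, 2.0, 3.0], [4.0, 6.0, 3.0]
d1 = lp_distance(x, y, 1)   # Manhattan: |1-4| + |2-6| + |3-3| = 7
d2 = lp_distance(x, y, 2)   # Euclidean: sqrt(9 + 16 + 0) = 5
```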
Slide 4: Sketching metrics
- Alice holds a point x and Bob holds a point y from a metric space
- Both send s-bit sketches sketch(x), sketch(y) to Charlie
- For r > 0 and D > 1, Charlie must distinguish d(x, y) ≤ r from d(x, y) ≥ Dr
- Shared randomness is allowed, as is a 1% probability of error
- The interesting question is the trade-off between s and D
Slide 5: Near Neighbor Search via sketches
- Near Neighbor Search (NNS): given an n-point dataset P and a query q within distance r of some data point, return any data point within Dr of q
- Sketches of size s imply NNS with space n^{O(s)} and a 1-probe query
- Proof idea: amplify the error probability down to 1/n by increasing the sketch size to O(s log n); then the sketch of q alone determines the answer
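The amplification step in the proof idea can be sanity-checked numerically: with a 1% error per comparison, a majority vote over O(log n) independent repetitions drives the failure probability below 1/n, after which a union bound over all n data points goes through. A small calculation, assuming independent repetitions:

```python
import math

def majority_error(p_err, reps):
    """Probability that the majority vote over `reps` independent sketch
    comparisons (each wrong with probability p_err) is wrong, i.e. that
    more than half of the comparisons err."""
    return sum(math.comb(reps, k) * p_err**k * (1 - p_err)**(reps - k)
               for k in range(reps // 2 + 1, reps + 1))

n = 1_000_000
reps = 15                        # O(log n) repetitions suffice
err = majority_error(0.01, reps) # far below 1/n
```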
Slide 6: Sketching the real line
- Goal: distinguish |x − y| ≤ 1 from |x − y| ≥ 1 + ε
- Cut the line into randomly shifted pieces of size w = 1 + ε/2, and record which piece the point falls into
- Repeat O(1/ε²) times
- Overall: D = 1 + ε with s = O(1/ε²)
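One repetition of this sketch is easy to simulate. The code below (an illustrative rendition, assuming the sketch simply records which piece a point lands in) checks the two collision probabilities that make the distinguisher work: far points never share a piece, while close points share one with constant probability per repetition.

```python
import random
random.seed(42)

def same_piece(x, y, eps):
    """One repetition of the real-line sketch: cut the line into pieces
    of width w = 1 + eps/2 at a uniformly random shift, and report
    whether x and y land in the same piece."""
    w = 1 + eps / 2
    shift = random.uniform(0, w)
    return int((x - shift) // w) == int((y - shift) // w)

def collision_rate(x, y, eps, reps=10_000):
    return sum(same_piece(x, y, eps) for _ in range(reps)) / reps

eps = 0.5
close = collision_rate(0.0, 1.0, eps)  # |x-y| = 1: collide w.p. (w-1)/w = 0.2
far = collision_rate(0.0, 1.5, eps)    # |x-y| = 1+eps > w: never collide
```

Repeating O(1/ε²) times separates the two collision rates with high confidence, giving the stated sketch size.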
Slide 7: Sketching ℓ_p for 0 < p ≤ 2
- (Indyk 2000): sketching ℓ_p with 0 < p ≤ 2 reduces to sketching reals via random projections
- If G_1, …, G_d are i.i.d. N(0, 1), then ∑_i x_i G_i − ∑_i y_i G_i is distributed as ‖x − y‖_2 · N(0, 1)
- For 0 < p < 2, use p-stable distributions instead of Gaussians
- Again, this gives D = 1 + ε with s = O(1/ε²)
- For p > 2 sketching ℓ_p is hard: achieving D = O(1) requires sketch size s = Θ̃(d^{1−2/p}) (Bar-Yossef, Jayram, Kumar, Sivakumar 2002), (Indyk, Woodruff 2005)
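The 2-stability fact underlying Indyk's reduction can be checked empirically; a short numpy experiment (for the Gaussian case p = 2 only — for p < 2 one would draw from a p-stable law such as Cauchy for p = 1 and recover the scale via the median):

```python
import numpy as np

rng = np.random.default_rng(1)

# 2-stability of the Gaussian: for i.i.d. G_i ~ N(0,1), the difference
# <x, G> - <y, G> is distributed as ||x - y||_2 * N(0, 1).
x = np.array([3.0, 0.0, 4.0])   # ||x - y||_2 = 5 here
y = np.zeros(3)

G = rng.normal(size=(100_000, 3))
samples = G @ x - G @ y
est = samples.std()             # concentrates near ||x - y||_2 = 5
```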
Slide 8: Anything else?
- A map f: X → Y is an embedding with distortion D′ if for all a, b ∈ X: d_X(a, b) / D′ ≤ d_Y(f(a), f(b)) ≤ d_X(a, b)
- Embeddings give reductions for geometric problems: if Y has s-bit sketches with approximation D, then X gets s-bit sketches with approximation DD′
Slide 9: Metrics with good sketches
- A metric X admits sketches with s, D = O(1) if X = ℓ_p for p ≤ 2, or X embeds into ℓ_p for p ≤ 2 with distortion O(1)
- Are there any other metrics with efficient sketches (both D and s are O(1))? We don't know!
- Are there new techniques waiting to be discovered, or are there no new techniques at all?
Slide 10: The main result
- Theorem: if a normed space X admits sketches of size s and approximation D, then for every ε > 0 the space X embeds (linearly) into ℓ_{1−ε} with distortion O(sD / ε)
- Combined with the known converse (embedding into ℓ_p for p ≤ 2 implies efficient sketches), this makes sketchability and embeddability equivalent for norms (Kushilevitz, Ostrovsky, Rabani 1998), (Indyk 2000)
- Recall: a vector space X with ‖·‖: X → R_{≥0} is a normed space if
  - ‖x‖ = 0 iff x = 0
  - ‖αx‖ = |α| ‖x‖
  - ‖x + y‖ ≤ ‖x‖ + ‖y‖
- Every norm gives rise to a metric: d(x, y) = ‖x − y‖
Slide 11: Sanity check
- ℓ_p spaces: p > 2 is hard, 1 ≤ p ≤ 2 is easy, and p < 1 is not a norm
- The result classifies mixed norms ℓ_p(ℓ_q): in particular, ℓ_1(ℓ_2) is easy while ℓ_2(ℓ_1) is hard! (Jayram, Woodruff 2009), (Kalton 1985)
- A non-example: edit distance is not a norm, and its sketchability is largely open (Ostrovsky, Rabani 2005), (Andoni, Jayram, Pătraşcu 2010)
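The claim that p < 1 fails to be a norm is easy to verify directly: the triangle inequality already breaks for the standard basis vectors. A quick check:

```python
def lp_norm(v, p):
    """The l_p 'norm' (sum_i |v_i|^p)^(1/p); a true norm only for p >= 1."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

e1, e2 = [1.0, 0.0], [0.0, 1.0]

# Triangle inequality holds for p = 1: ||e1 + e2||_1 = 2 <= 1 + 1.
ok = lp_norm([1.0, 1.0], 1) <= lp_norm(e1, 1) + lp_norm(e2, 1)

# For p = 1/2 it fails: ||e1 + e2|| = (1 + 1)^2 = 4 > 1 + 1 = 2.
bad = lp_norm([1.0, 1.0], 0.5)
```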
Slide 12: No embeddings → no sketches
- In the contrapositive: if a normed space does not embed into ℓ_{1−ε}, then it does not have good sketches
- This converts sophisticated non-embeddability results into lower bounds for sketches
Slide 13: Example 1: the Earth Mover's Distance
- For x ∈ R^{[Δ]×[Δ]} with ∑_{i,j} x_{i,j} = 0, the Earth Mover's Distance ‖x‖_EMD is the cost of the cheapest transportation of the positive part of x to the negative part (the Monge–Kantorovich norm)
- Best known upper bounds:
  - D = O(1/ε) and s = Δ^ε (Andoni, Do Ba, Indyk, Woodruff 2009)
  - D = O(log Δ) and s = O(1) (Charikar 2002), (Indyk, Thaper 2003), (Naor, Schechtman 2005)
- There is no embedding into ℓ_{1−ε} with distortion O(1) (Naor, Schechtman 2005)
- Hence there are no sketches with D = O(1) and s = O(1)
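For intuition about the Monge–Kantorovich cost, the one-dimensional special case has a closed form: on the grid {0, …, n−1}, the optimal transport cost of a balanced signed vector equals the ℓ_1 norm of its prefix sums. (This is only the 1-D analogue; the slide's [Δ]×[Δ] planar case has no such formula.) A minimal sketch:

```python
import numpy as np

def emd_1d(x):
    """Earth Mover's Distance of a signed measure on the 1-D grid
    {0, ..., n-1} with total mass zero: the optimal transport cost
    equals the l1 norm of the prefix sums, since all mass crossing
    the cut between positions i and i+1 pays cost 1 per unit."""
    assert abs(sum(x)) < 1e-9, "total mass must be zero"
    return float(np.abs(np.cumsum(x)).sum())

# Move one unit of mass from position 0 to position 3: cost 3.
cost = emd_1d([1.0, 0.0, 0.0, -1.0])
```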
Slide 14: Example 2: the Trace Norm
- For an n × n matrix A, the trace norm (nuclear norm) ‖A‖ is the sum of the singular values of A
- Previously: lower bounds only for certain restricted classes of sketches (Li, Nguyen, Woodruff 2014)
- Any embedding into ℓ_1 requires distortion Ω(n^{1/2}) (Pisier 1978)
- Hence any sketch must satisfy sD = Ω(n^{1/2} / log n)
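The trace norm itself is simple to compute via the SVD; a minimal numpy example:

```python
import numpy as np

def trace_norm(A):
    """Trace (nuclear) norm: the sum of the singular values of A."""
    return float(np.linalg.svd(A, compute_uv=False).sum())

# For a diagonal matrix the singular values are the absolute values
# of the diagonal entries, so the trace norm is 3 + 4 = 7.
A = np.diag([3.0, -4.0])
t = trace_norm(A)
```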
Slide 15: The plan of the proof
- Two steps: sketches ⇒ weak embedding into ℓ_2 (via information theory), and weak embedding into ℓ_2 ⇒ linear embedding into ℓ_{1−ε} (via nonlinear functional analysis)
- A map f: X → Y is an (s_1, s_2, τ_1, τ_2)-threshold map if
  - d_X(x_1, x_2) ≤ s_1 implies d_Y(f(x_1), f(x_2)) ≤ τ_1
  - d_X(x_1, x_2) ≥ s_2 implies d_Y(f(x_1), f(x_2)) ≥ τ_2
- The intermediate object is a (1, O(sD), 1, 10)-threshold map from X to ℓ_2
Slide 16: Sketch → threshold map
- Shown in the contrapositive: suppose there is no (1, O(sD), 1, 10)-threshold map from X to ℓ_2
- By convex duality, this yields Poincaré-type inequalities on X
- Via the direct sum theorem for information complexity (Andoni, Jayram, Pătraşcu 2010), these imply that ℓ^k_∞(X), equipped with ‖(x_1, …, x_k)‖ = max_i ‖x_i‖, has no sketches of size Ω(k) and approximation Θ(sD)
- Hence X has no sketches of size s and approximation D
Slide 17: Sketching direct sums
- If X has sketches of size s and approximation D, then ℓ^k_∞(X) has sketches of size O(s) and approximation Dk
- Protocol: draw shared random signs σ_1, …, σ_k (±1 with probability 1/2 each); Alice, holding (a_1, …, a_k), sends sketch(∑_i σ_i a_i), and Bob, holding (b_1, …, b_k), sends sketch(∑_i σ_i b_i)
- The analysis uses max_i ‖a_i − b_i‖ ≤ ‖∑_i σ_i (a_i − b_i)‖ ≤ ∑_i ‖a_i − b_i‖ ≤ k · max_i ‖a_i − b_i‖ (the first inequality holds in expectation over the signs; the second is the triangle inequality)
- This crucially uses the linear structure of X; being just a metric is not enough!
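The norm inequalities behind the sign trick can be checked numerically for a concrete norm, say X = ℓ_2. The snippet below verifies the triangle-inequality upper bound for every draw of signs, and the max lower bound in expectation (empirically, averaged over many draws):

```python
import numpy as np

rng = np.random.default_rng(7)

# z_i plays the role of a_i - b_i in the protocol.
k, d = 5, 8
z = rng.normal(size=(k, d))
norms = np.linalg.norm(z, axis=1)

# Norms of the random signed sums sum_i sigma_i z_i over many sign draws.
signs = rng.choice([-1.0, 1.0], size=(20_000, k))
signed_sums = np.linalg.norm(signs @ z, axis=1)
avg = signed_sums.mean()

upper_ok = bool(np.all(signed_sums <= norms.sum() + 1e-9))  # triangle inequality
lower_ok = bool(avg >= norms.max())                         # holds in expectation
```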
Slide 18: Threshold map → linear embedding
- Chain: a (1, O(sD), 1, 10)-threshold map from X to ℓ_2 yields a uniform embedding into ℓ_2, which in turn yields a linear embedding into ℓ_{1−ε} with distortion O(sD / ε)
- A uniform embedding is a map g: X → ℓ_2 such that L(‖x_1 − x_2‖) ≤ ‖g(x_1) − g(x_2)‖ ≤ U(‖x_1 − x_2‖), where L and U are non-decreasing, L(t) > 0 for t > 0, and U(t) → 0 as t → 0
- The second step is due to (Aharoni, Maurey, Mityagin 1985), (Nikishin 1973)
Slide 19: Threshold map → uniform embedding
- Start from a map f: X → ℓ_2 such that ‖x_1 − x_2‖ ≤ 1 implies ‖f(x_1) − f(x_2)‖ ≤ 1, and ‖x_1 − x_2‖ ≥ Θ(sD) implies ‖f(x_1) − f(x_2)‖ ≥ 10
- Building on (Johnson, Randrianarivony 2006): take a 1-net N of X, make f Lipschitz on N, then extend f from N to a Lipschitz function on all of X
Slide 20: Open problems
- Extend the result to as general a class of metrics as possible
- Connection to linear sketches, i.e. sketches of the form x → Ax? Conjecture: sketches of size s and approximation D can be converted into linear sketches with f(s) measurements and approximation g(D)
- Spaces that admit no non-trivial sketches (s = Ω(d) for D = O(1)): is there anything besides ℓ_∞?
- Can the theorem be strengthened to "sketchability implies embeddability into ℓ_1"? This is equivalent to an old open problem from functional analysis.
- Sketches imply NNS; is there a reverse implication?