Dimension reduction techniques for lp (1<p<2), with applications


1 Dimension reduction techniques for lp (1<p<2), with applications
Yair Bartal (Hebrew U.)   Lee-Ad Gottlieb (Ariel University)

2 Introduction
Fundamental result in dimension reduction: the Johnson-Lindenstrauss Lemma (JL-84) for Euclidean space.
Given: a set S of n points in R^d.
There exists: f : R^d → R^k with k = O(ln(n) / ε^2), such that for all u,v in S:
||u-v||_2 ≤ ||f(u)-f(v)||_2 ≤ (1+ε) ||u-v||_2

3 Introduction
The JL Lemma is specific to l_2. Dimension reduction for other l_p spaces?
Impossible for l_∞ and l_1.
Not known for other l_p spaces.
This paper: dimension reduction techniques for l_p (1<p<2).
Specifically, single-scale and snowflake embeddings.

4 JL transform
Given: a set S of n points in R^d.
There exists: f : R^d → R^k with k = O(ln(n) / ε^2), such that for all u,v in S:
||u-v||_2 ≤ ||f(u)-f(v)||_2 ≤ (1+ε) ||u-v||_2

5 JL transform
Proof by (randomized) construction:
f : R^d → R^k multiplies each vector by a random d x k matrix (a minimal sketch follows below).
Matrix entries can be {-1,+1} or Gaussians.
[Figure: a random Gaussian matrix applied to a vector.]
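Here is a minimal sketch of this construction in Python, assuming numpy; the constant 4 in the target dimension and the seeds are illustrative choices, not values from the paper.

```python
import numpy as np

def jl_transform(points, eps=0.5, rng=None):
    """Project n points in R^d down to k = O(ln(n)/eps^2) dimensions."""
    if rng is None:
        rng = np.random.default_rng(0)
    n, d = points.shape
    k = int(np.ceil(4 * np.log(n) / eps**2))  # illustrative constant
    A = rng.standard_normal((d, k))           # i.i.d. Gaussian entries
    return (points @ A) / np.sqrt(k)          # scale so norms are preserved in expectation

pts = np.random.default_rng(1).standard_normal((100, 1000))
emb = jl_transform(pts)
# a pairwise distance before and after projection (should agree to within 1+eps):
print(np.linalg.norm(pts[0] - pts[1]), np.linalg.norm(emb[0] - emb[1]))
```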

6 JL transform
Prove: with constant probability, for all u,v in S,
||u-v||_2 ≤ ||f(u)-f(v)||_2 ≤ (1+ε) ||u-v||_2
Observation: f is linear, so if w = u-v then f(w) = f(u-v) = f(u)-f(v).
It therefore suffices to prove ||w||_2 ≤ ||f(w)||_2 ≤ (1+ε) ||w||_2 for each difference vector w.

7 JL transform
Consider an embedding into R^1, with the matrix entries drawn from G = N(0,1).
Normals are 2-stable:
If X,Y ~ N(0,1), then aX ~ N(0, a^2), and also aX + bY ~ N(0, a^2+b^2) ~ √(a^2+b^2) N(0,1).
So ∑_i w_i g_i ~ √(∑_i w_i^2) N(0,1) = ||w||_2 N(0,1).
[Figure: (g1, g2, g3) · (a, b, c) = a·g1 + b·g2 + c·g3.]
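A quick empirical check of 2-stability, assuming numpy; the vector w and the sample size are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, m = 3.0, 4.0, 1_000_000
# a*X + b*Y for X,Y ~ N(0,1) should have standard deviation sqrt(a^2+b^2) = 5
combo = a * rng.standard_normal(m) + b * rng.standard_normal(m)
print(combo.std())   # ~ 5.0

# a Gaussian projection of w has standard deviation ||w||_2
w = np.array([1.0, 2.0, 2.0])           # ||w||_2 = 3
proj = rng.standard_normal((m, 3)) @ w  # one coordinate of f(w), m trials
print(proj.std())    # ~ 3.0
```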

8 JL transform
Even a single coordinate preserves magnitude:
Each coordinate is distributed as ||w||_2 N(0,1), so (up to scaling) E[||f(w)||_2] = ||w||_2.
But we need this to hold simultaneously for all point pairs.
Multiple coordinates: ||f(w)||_2^2 ~ ||w||_2^2 ∑ N^2(0,1) ~ ||w||_2^2 χ^2(k).
A sum of k squared coordinates is tightly concentrated around its mean.
One can show that when k = O(ln(n) / ε^2), all point pairs are preserved simultaneously.
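A small sketch of the concentration claim, assuming numpy: the normalized squared norm is a mean of k chi-squared variables, so its standard deviation shrinks like √(2/k). The trial count is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(2)
for k in (10, 100, 1000):
    # each row is f(w) for a unit vector w; ||f(w)||_2^2 / k should concentrate at 1
    samples = rng.standard_normal((5_000, k))
    norms_sq = (samples**2).mean(axis=1)
    print(k, norms_sq.std())   # shrinks like sqrt(2/k)
```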

9 Dimension reduction for l_p?
JL works well for l_2, so let's try to do the same thing for l_p (1<p<2).
Hint: it won't work... but it will be instructive.
p-stable distributions: if X,Y ~ F_p (p ≤ 2), then aX + bY ~ (a^p + b^p)^(1/p) F_p.
[Johnson-Schechtman 82, Datar-Immorlica-Indyk-Mirrokni 04, Mendel-Naor 04]
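Symmetric p-stable samples can be drawn with the standard Chambers-Mallows-Stuck recipe (the one commonly used in the LSH literature); below is a minimal sketch assuming numpy. scipy.stats.levy_stable offers a library alternative.

```python
import numpy as np

def p_stable(p, size, rng):
    """Symmetric p-stable samples via Chambers-Mallows-Stuck, 0 < p <= 2."""
    theta = rng.uniform(-np.pi / 2, np.pi / 2, size)  # uniform angle
    w = rng.exponential(1.0, size)                    # unit-rate exponential
    return (np.sin(p * theta) / np.cos(theta) ** (1 / p)
            * (np.cos((1 - p) * theta) / w) ** ((1 - p) / p))

# sanity check of stability: a*X + b*Y should match (a^p + b^p)^(1/p) * X in distribution
rng = np.random.default_rng(3)
p, a, b, m = 1.5, 2.0, 3.0, 200_000
lhs = a * p_stable(p, m, rng) + b * p_stable(p, m, rng)
rhs = (a**p + b**p) ** (1 / p) * p_stable(p, m, rng)
print(np.median(np.abs(lhs)), np.median(np.abs(rhs)))  # medians should agree
```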

10 Dimension reduction for l_p?
Suppose we embed into R^1, with entries drawn from G = F_p.
Then ||f(w)||_p is distributed as ||w||_p F_p, so (up to scaling) E[||f(w)||_p] = ||w||_p.
Multiple coordinates, from l_p into l_p or l_q (q ≤ p):
||f(w)||_p^p = ||w||_p^p ∑ g^p
||f(w)||_q^q = ||w||_p^q ∑ g^q
Looks good! But what are E[g^p] and E[g^q]?

11 p-stable distributions
Familiar examples: Gaussian (2-stable), Cauchy (1-stable).
Density function h(x):
Unimodal [SY-78, Y-78, H-84]
Bell-shaped [G-84]
Heavy-tailed when p<2: h(x) ≈ 1/(1+x^(p+1))
When p<2:
E[g^q] = ∫_0^∞ x^q h(x) dx ≈ ∫_0^∞ x^q/(1+x^(p+1)) dx ≈ ∫_0^1 x^q dx + ∫_1^∞ x^(q-(p+1)) dx ≈ [-x^(-(p-q))/(p-q)]_1^∞
For 0<q<p: E[g^q] ≈ 1/(p-q)  ← OK
For q≥p:   E[g^q] = ∞        ← Problem
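This tail behavior is easy to see numerically in the 1-stable (Cauchy) case, for which numpy has a built-in sampler; sample sizes here are arbitrary. The q = 0.5 < p column settles near a constant (sec(πq/2) ≈ 1.41 for the Cauchy), while the q = 1 ≥ p column keeps growing with the sample size.

```python
import numpy as np

rng = np.random.default_rng(4)
for m in (10**4, 10**5, 10**6):
    g = np.abs(rng.standard_cauchy(m))   # |g| for g ~ 1-stable (Cauchy)
    # empirical E[g^q] for q = 0.5 (finite moment) and q = 1.0 (infinite moment)
    print(m, (g**0.5).mean(), g.mean())
```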

12 Dimension reduction for l_p?
Problems using p-stables for dimension reduction:
Heavy tails for p<2 ⇒ E[g^p] = ∞.
When q<p, E[g^q] is finite, but how many coordinates are needed?

13 Dimension reduction for lp?
What’s known for non-Euclidean space? For l1 : Bounded range dimension reduction [OR-02] Dimension: O(R logn / ε3 ) Distortion: Distances in range [1,R] retained to (1+ε) Expansion: Distances <1 remain smaller Contraction: Distances >R remain larger Used as a subroutine for clustering, ANNS

14 Dimension reduction for l_p?
Our contributions for l_p (1<p<2):
Bounded-range dimension reduction (l_p → l_q, q ≤ p):
Dimension: O_ε(R log(n))
Distortion: distances in [1,R] retained to within (1+ε)
Expansion: distances < 1 remain smaller
Contraction: distances > R remain larger
Snowflake embedding: ||x-y||_p → (1±ε) ||x-y||_p^α, α ≤ 1
Dimension: O(ddim^2), where ddim is the doubling dimension
Previously known only for l_1, with dimension O(2^(2·ddim))
Both embeddings have applications to clustering.

15 Single-scale dimension reduction
Our single-coordinate embedding is as follows (sketch below):
f: R^d → R^1
s: upper distance threshold (~ R)
φ: random angle
F(v) = F_{φ,s}(v) = s · sin(φ + (1/s) ∑_i g_i v_i)
Motivated by [Mendel-Naor 04].
Intuition: sin(ε) ≈ ε, so small values are retained while large values are truncated.
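A minimal sketch of this single coordinate, assuming numpy; p_stable is the Chambers-Mallows-Stuck sampler sketched on slide 9, repeated here so the snippet runs on its own. Note that the random vector g and the phase φ must be shared across all points, so the helper returns a closure.

```python
import numpy as np

def p_stable(p, size, rng):
    """Symmetric p-stable samples via Chambers-Mallows-Stuck (see slide 9)."""
    theta = rng.uniform(-np.pi / 2, np.pi / 2, size)
    w = rng.exponential(1.0, size)
    return (np.sin(p * theta) / np.cos(theta) ** (1 / p)
            * (np.cos((1 - p) * theta) / w) ** ((1 - p) / p))

def make_single_scale_coord(d, s, p, rng):
    """One coordinate F(v) = s*sin(phi + <g,v>/s), with g and phi fixed at creation."""
    g = p_stable(p, d, rng)            # p-stable projection vector g
    phi = rng.uniform(0, 2 * np.pi)    # random phase phi
    return lambda v: s * np.sin(phi + (g @ v) / s)
```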

16 Single-scale dimension reduction
F(v) = F_{φ,s}(v) = s · sin(φ + (1/s) ∑_i g_i v_i)
Using the identity sin(A) - sin(B) = 2 sin((A-B)/2) cos((A+B)/2):
E[|F(u)-F(v)|^q] = s^q E[|sin(φ + (1/s) ∑_i g_i u_i) - sin(φ + (1/s) ∑_i g_i v_i)|^q]
= c (2s)^q E[|sin((1/(2s)) ∑_i g_i (u_i-v_i)) cos(φ + (1/(2s)) ∑_i g_i (u_i+v_i))|^q]
= c (2s)^q E[|sin((1/(2s)) ∑_i g_i (u_i-v_i))|^q]
Multiple dimensions: repeat the coordinate k = s^O(1) log(n) times; tight bounds via Bernstein's inequality.
Final embedding, writing w = ||u-v||_p:
Threshold: ||F(u)-F(v)||_q = O(s)
Distortion: when 1 < w < εs, ||F(u)-F(v)||_q = (1±ε) ||u-v||_p
Expansion: when w < 1, ||F(u)-F(v)||_q ≤ (1+ε) ||u-v||_p
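The sketch below repeats the coordinate k times and reads a distance estimate off the embedded vectors, reusing make_single_scale_coord from the previous snippet. Here k, s, and the test points are illustrative, not the paper's parameters, and the estimate tracks the true distance only up to a fixed constant that a real embedding would normalize away.

```python
import numpy as np

rng = np.random.default_rng(6)
d, p, q, s, k = 50, 1.5, 1.2, 10.0, 5000
coords = [make_single_scale_coord(d, s, p, rng) for _ in range(k)]

def embed(v):
    """Apply all k shared random coordinates to one point."""
    return np.array([F(v) for F in coords])

u = rng.standard_normal(d)
v = u + 0.1 * rng.standard_normal(d)
true_dist = np.sum(np.abs(u - v) ** p) ** (1 / p)           # ||u-v||_p
est = (np.abs(embed(u) - embed(v)) ** q).mean() ** (1 / q)  # normalized l_q distance
print(true_dist, est)  # proportional (up to a constant) when 1 < ||u-v||_p < eps*s
```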

17 Snowflake embedding
The snowflake embedding is created by concatenating many single-scale embeddings, an idea due to Assouad (1984); a schematic sketch follows below.
It requires several properties of the single-scale embedding: threshold, smoothness, and fidelity.
Thank you!
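For concreteness, here is a schematic of Assouad-style concatenation, reusing make_single_scale_coord from slide 15: one block per scale s = 2^i, weighted by s^(α-1) so that the block at scale near ||x-y||_p dominates and contributes roughly ||x-y||_p^α. The scale range, weights, and block sizes are illustrative guesses, not the paper's construction.

```python
import numpy as np

def snowflake_embed(d, alpha, p, scales=range(-4, 8), k_per_scale=200, rng=None):
    """Concatenate weighted single-scale blocks, one per scale s = 2^i."""
    if rng is None:
        rng = np.random.default_rng(7)
    blocks = []
    for i in scales:
        s = 2.0 ** i
        coords = [make_single_scale_coord(d, s, p, rng) for _ in range(k_per_scale)]
        # weight s^(alpha-1): the scale-s block is bounded by ~ s, so it tops out near s^alpha
        blocks.append((s ** (alpha - 1), coords))
    def embed(v):
        return np.concatenate([wt * np.array([F(v) for F in coords])
                               for wt, coords in blocks])
    return embed
```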

