Presentation is loading. Please wait.

Presentation is loading. Please wait.

Sublinear Algorithmic Tools 3

Similar presentations


Presentation on theme: "Sublinear Algorithmic Tools 3"β€” Presentation transcript:

1 Sublinear Algorithmic Tools 3
Alex Andoni

2 Plan Dimension reduction Sketching
Application: Numerical Linear Algebra Sketching Application: Streaming Application: Nearest Neighbor Search and more… Dimension reduction: linear map 𝑆: ℝ 𝑛 β†’ ℝ π‘˜ s.t: for any points 𝑝,π‘žβˆˆ ℝ 𝑛 : Pr 𝑆 ||𝑆 𝑝 βˆ’π‘†(π‘ž)|| ||π‘βˆ’π‘ž|| ∈ 1Β±πœ– β‰₯1βˆ’π›Ώ Sketch: linear map 𝑆: ℝ 𝑛 β†’ ℝ π‘š , estimator 𝑬, s.t. for any π‘βˆˆ β„œ 𝑛 : Pr 𝑆 𝑬(𝑆 𝑝 ) ||𝑝| | 1 ∈(1Β±πœ–) β‰₯1βˆ’π›Ώ

3 Approximate Near Neighbor Search
Preprocess: a set of 𝑛 point approximation 𝑐>1 Query: given a query point π‘ž, report a point 𝑝 βˆ— βˆˆπ‘ƒ with the smallest distance to π‘ž up to factor 𝑐>1 Near neighbor: threshold π‘Ÿ Applications to: closest pair, Max/Min inner product (MIPS), greedy coordinate descent, kernel approximation & estimation,… Parameters: space & query time π‘Ÿ 𝑝 βˆ— π‘π‘Ÿ π‘ž 𝑝 β€²

4 𝑃1= 𝑃2= Locality-Sensitive Hashing π‘›πœŒ, where
[Indyk-Motwani’98] π‘ž 𝑝′ Random hash function 𝑔 on ℝ 𝑑 satisfying: for close pair (when π‘žβˆ’π‘ β‰€π‘Ÿ) Pr⁑[𝑔(π‘ž)=𝑔(𝑝)] is β€œhigh” for far pair (when π‘žβˆ’π‘β€² >π‘π‘Ÿ) Pr⁑[𝑔(π‘ž)=𝑔(𝑝′)] is β€œsmall” Use several hash tables 𝑝 π‘ž 𝑃1= β€œnot-so-small” 𝑃2= Pr⁑[𝑔(π‘ž)=𝑔(𝑝)] 𝜌= log 1/ 𝑃 1 log 1/ 𝑃 2 1 π‘›πœŒ, where 𝑃1 𝑃2 π‘žβˆ’π‘ π‘Ÿ π‘π‘Ÿ

5 Locality sensitive hash functions
Hash function 𝑔 is usually a concatenation of β€œprimitive” functions: 𝑔 𝑝 = β„Ž 1 (𝑝), β„Ž 2 (𝑝),…, β„Ž π‘˜ (𝑝) Example: Hamming space 0,1 𝑑 β„Ž 𝑝 = 𝑝 𝑗 , i.e., choose 𝑗 π‘‘β„Ž bit for a random 𝑗 𝑔(𝑝) chooses π‘˜ bits at random Pr β„Ž 𝑝 =β„Ž π‘ž = 1 – π»π‘Žπ‘š 𝑝,π‘ž 𝑑 𝑃 1 =1βˆ’ π‘Ÿ 𝑑 β‰ˆ 𝑒 βˆ’π‘Ÿ/𝑑 𝑃 2 =1βˆ’ π‘π‘Ÿ 𝑑 β‰ˆ 𝑒 βˆ’π‘π‘Ÿ/𝑑 𝜌= log 1/ 𝑃 1 log 1/ 𝑃 2 = π‘Ÿ/𝑑 π‘π‘Ÿ/𝑑 = 1 𝑐

6 Full algorithm Data structure is just 𝐿= 𝑛 𝜌 hash tables: Query:
Each hash table uses a fresh random function 𝑔 𝑖 𝑝 = β„Ž 𝑖,1 (𝑝),…, β„Ž 𝑖,π‘˜ (𝑝) Hash all dataset points into the table Query: Check for collisions in each of the hash tables until we encounter a point within distance π‘π‘Ÿ Guarantees: Space: 𝑂 𝑛𝐿 =𝑂( 𝑛 1+𝜌 ), plus space to store points Query time: 𝑂 𝐿⋅(π‘˜+𝑑) =𝑂( 𝑛 𝜌 ⋅𝑑) (in expectation) 50% probability of success. Draw L hash tables

7 Analysis of LSH Scheme Choice of parameters π‘˜,𝐿 ?
𝐿 hash tables with 𝑔 𝑝 = β„Ž 1 (𝑝),…, β„Ž π‘˜ (𝑝) Pr[collision of far pair] = 𝑃 2 π‘˜ Pr[collision of close pair] = 𝑃 1 π‘˜ Hence 𝐿=𝑂( 𝑛 𝜌 ) β€œrepetitions” (tables) suffice! set π‘˜ s.t. = 1/𝑛 = 𝑃 2 𝜌 π‘˜ = 1/𝑛 𝜌 𝑃 1 𝑃 1 2 𝑃 2 π‘˜=1 𝑃 2 2 π‘˜=2 π‘Ÿ π‘π‘Ÿ

8 Analysis: Correctness
Let 𝑝 βˆ— be an π‘Ÿ-near neighbor If does not exists, algorithm can output anything Algorithm fails when: near neighbor 𝑝 βˆ— is not in the searched buckets 𝑔 1 π‘ž , 𝑔 2 π‘ž , …, 𝑔 𝐿 π‘ž Probability of failure: Probability π‘ž, 𝑝 βˆ— do not collide in a hash table: ≀1βˆ’ 𝑃 1 π‘˜ Probability they do not collide in 𝐿 hash tables at most 1βˆ’ 𝑃 1 π‘˜ 𝐿 = 1βˆ’ 1 𝑛 𝜌 𝑛 𝜌 ≀1/𝑒

9 Analysis: Runtime Runtime dominated by: Distance computations:
Hash function evaluation: 𝑂(πΏβ‹…π‘˜) time Distance computations to points in buckets Distance computations: Care only about far points, at distance >𝑐𝑅 In one hash table, we have Probability a far point collides is at most 𝑃 2 π‘˜ =1/𝑛 Expected number of far points in a bucket: 𝑛⋅ 1 𝑛 =1 Over 𝐿 hash tables, expected number of far points is 𝐿 Total: 𝑂 πΏπ‘˜ +𝑂 𝐿𝑑 =𝑂( 𝑛 𝜌 𝑑) in expectation

10 LSH maps: Euclidean space
[Datar-Indyk-Immorlica-Mirrokni’04, A.-Indyk’06] Space Time Exponent 𝒄=𝟐 𝑛 1+𝜌 𝑛 𝜌 𝜌=1/𝑐 𝜌=1/2 πœŒβ‰ˆ1/ 𝑐 2 𝜌=1/4 πœŒβ‰ˆ 1 2 𝑐 2 βˆ’1 𝜌=1/7 π‘ž 𝑝 𝑝′ Even better LSH maps? NO: example of isoperimetry [Motwani-Naor-Panigrahy’06, O’Donell-Wu-Zhou’11] Example question: Among bodies in 𝑅 𝑑 of volume 1, which has the lowest perimeter? Can get better maps, if 𝑔 allowed to (mildly) depend on the dataset [A-Indyk-Nguyen-Razenshteyn’14, A-Razenshteyn’15]

11 Plan Dimension reduction Sketching
Application: Numerical Linear Algebra Sketching Application: Streaming Application: Nearest Neighbor Search and more… Dimension reduction: linear map 𝑆: ℝ 𝑛 β†’ ℝ π‘˜ s.t: for any points 𝑝,π‘žβˆˆ ℝ 𝑛 : Pr 𝑆 ||𝑆 𝑝 βˆ’π‘†(π‘ž)|| ||π‘βˆ’π‘ž|| ∈ 1Β±πœ– β‰₯1βˆ’π›Ώ Sketch: linear map 𝑆: ℝ 𝑛 β†’ ℝ π‘š , estimator 𝑬, s.t. for any π‘βˆˆ β„œ 𝑛 : Pr 𝑆 𝑬(𝑆 𝑝 ) ||𝑝| | 1 ∈(1Β±πœ–) β‰₯1βˆ’π›Ώ

12 Sketching/NNS for other distances?
Earth Mover Distance: Given two sets A, B of points in a metric space EMD(A,B) = min cost bipartite matching between A and B Applications in image vision 𝑆 010110 𝐸𝑀𝐷(𝐴,𝐡) small or large? 𝑆 010101

13 Tool: metric embeddings
Map sets into vectors in β„“ 1 preserving the distance (approximately) Then use algorithms for β„“ 1 ! Say, EMD over 𝑠 2 Theorem [Cha02, IT03]: Exists a map 𝑓 mapping all π΄βŠ‚ 𝑠 2 into β„“ 1 with distortion 𝑂(log⁑𝑠): i.e., for any 𝐴,π΅βŠ‚ 𝑠 2 we have: 𝐸𝑀𝐷 𝐴,𝐡 ≀||𝑓 𝐴 βˆ’π‘“ 𝐡 | | 1 ≀𝑂( log 𝑠 )⋅𝐸𝑀𝐷(𝐴,𝐡) Implies sketch/NNS: 𝑂 log 𝑠 -approximation

14 Embeddability into β„“ 1 edit( banana , ananas ) = 2 edit(1234567,
Metric Upper bound Earth-mover distance (𝑠-sized sets in 2D plane) 𝑂 log 𝑠 [Cha02, IT03] (𝑠-sized sets in 0,1 𝑑 ) 𝑂( log 𝑠 β‹… log 𝑑 ) [AIK08] Edit distance over 0,1 𝑑 (#indels to transform x->y) 2 𝑂 log 𝑑 [OR05] Ulam (edit distance between permutations) 𝑂 log 𝑑 [CK06] Block edit distance 𝑂 log 𝑑 [MS00, CM07] edit( , ) = 2

15 Non-embeddability into β„“ 1
Metric Upper bound Earth-mover distance (𝑠-sized sets in 2D plane) 𝑂 log 𝑠 [Cha02, IT03] (𝑠-sized sets in 0,1 𝑑 ) 𝑂( log 𝑠 β‹… log 𝑑 ) [AIK08] Edit distance over 0,1 𝑑 (#indels to transform x->y) 2 𝑂 log 𝑑 [OR05] Ulam (edit distance between permutations) 𝑂 log 𝑑 [CK06] Block edit distance 𝑂 log 𝑑 [MS00, CM07] Lower bounds Ξ© log 𝑠 [NS07] Ξ© log 𝑠 [KN05] Ξ© log 𝑑 [KN05,KR06] Ξ© log 𝑑 [AK07] 4/3 [Cor03]

16 A sketch of the rest Sketching Efficient algorithms
Numerical Linear Algebra linear sketching Streaming (smaller space) Nearest neighbor search Locality Sensitive Hashing Sketching of graphs Sparsifiers: reduce number of edges so as to approximately preserve all cuts, distances, effective resistances etc [Benczur-Karger’94,…] Linear sketch for graph => data structures for dynamic connectivity [AGM’12, KKM’13] Characterization of sketching size for distance estimation: [BO10, LNW14,AKR15, BBCKY’17] E.g.: a normed space X has small sketch iff embeds into β„“ 0.9 Adaptive sketching: when we know we sketch set π΄βŠ‚ ℝ 𝑑 Then 𝑆 β‹… may depend (weakly) on 𝐴 Non-oblivious subspace embeddings [DMM’06,…, Woodruff’14]


Download ppt "Sublinear Algorithmic Tools 3"

Similar presentations


Ads by Google