Sublinear Algorithmic Tools 2
Alex Andoni
Plan
- Dimension reduction
- Sketching
- Application: Numerical Linear Algebra
- Application: Streaming
- Application: Nearest Neighbor Search
- and more…

Dimension reduction: a linear map φ: ℝ^d → ℝ^k such that for any points p, q ∈ ℝ^d:
  Pr_φ[ ||φ(p) − φ(q)|| / ||p − q|| ∈ 1 ± ε ] ≥ 1 − δ
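The ℓ_2 guarantee above is easy to check empirically. Below is a minimal pure-Python sketch, with the usual choice of i.i.d. Gaussian entries scaled by 1/√k; the concrete sizes d and k are illustrative and not taken from the slides:

```python
import math
import random

def gaussian_map(k, d, rng):
    """A random linear map phi: R^d -> R^k with i.i.d. N(0, 1/k) entries."""
    return [[rng.gauss(0, 1) / math.sqrt(k) for _ in range(d)] for _ in range(k)]

def apply_map(G, v):
    return [sum(g * vi for g, vi in zip(row, v)) for row in G]

def l2(v):
    return math.sqrt(sum(t * t for t in v))

rng = random.Random(0)
d, k = 100, 2000
p = [rng.uniform(-1, 1) for _ in range(d)]
q = [rng.uniform(-1, 1) for _ in range(d)]

G = gaussian_map(k, d, rng)
diff_before = [a - b for a, b in zip(p, q)]
diff_after = [a - b for a, b in zip(apply_map(G, p), apply_map(G, q))]
ratio = l2(diff_after) / l2(diff_before)
print(ratio)  # close to 1 with high probability
```

With k = 2000 the standard deviation of the ratio is about 1.6%, so the printed value sits very close to 1; shrinking k trades accuracy for space exactly as the ε⁻² dependence predicts.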
Dimension reduction in other norms/distances?
E.g., in ℓ_1? Essentially no [CS'02, BC'03, LN'04, JN'10, …].
For n points and approximation D, the required dimension is between n^{Ω(1/D)} and O(n/D) [BC'03, NR'10, ANN'10, …], even if the map is allowed to depend on the dataset!
In contrast, for ℓ_2 the [JL] lemma gives dimension O(ε⁻² log n), and the map does not depend on the dataset.
⇒ Generalize the notion of dimension reduction!
Computational view
A sketch is an arbitrary computation f: M → {0,1}^k, paired with an estimator C: {0,1}^k × {0,1}^k → ℝ₊; the distance between x and y is estimated as C(f(x), f(y)).
E.g., for ℓ_2: C(f(x), f(y)) = Σ_{i=1..k} (f_i(x) − f_i(y))².
A sketch f is a "functional compression scheme" for estimating distances; almost always lossy (1+ε distortion or more) and randomized.
Cons: less geometric structure (e.g., C need not be a metric).
Pros: more expressibility, i.e., a better trade-off between approximation and "dimension" k.
Sketching for ℓ_1
An analog of Euclidean projections?
For ℓ_2 we used the stability property of the Gaussian distribution: g_1 z_1 + g_2 z_2 + … + g_d z_d is distributed as g · ||z||_2 (for i.i.d. standard Gaussians g, g_1, …, g_d).
Is there something similar for the 1-norm? Yes: the Cauchy distribution! It is 1-stable: c_1 z_1 + c_2 z_2 + … + c_d z_d is distributed as c · ||z||_1.
What's wrong then? Cauchy variables are heavy-tailed: even |c| has no finite expectation. pdf(s) = 1/(π(s² + 1)).
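Both properties are easy to see by simulation. The snippet below (an illustration, not part of the slides) draws standard Cauchy variables by inverse-transform sampling, tan(π(u − 1/2)) for u uniform in [0,1), and checks that the median of |c| sits near 1 while a Cauchy combination of a vector z has its median of absolute values near ||z||_1:

```python
import math
import random
from statistics import median

def std_cauchy(rng):
    # Inverse-transform sampling: tan(pi * (u - 1/2)) is standard Cauchy.
    return math.tan(math.pi * (rng.random() - 0.5))

rng = random.Random(0)
z = [3.0, -1.0, 0.5]   # ||z||_1 = 4.5
trials = 20001

# The median of |c| for a standard Cauchy is 1 (medians, unlike means, behave).
med_abs = median(abs(std_cauchy(rng)) for _ in range(trials))

# 1-stability: c_1 z_1 + c_2 z_2 + c_3 z_3 is distributed as ||z||_1 * Cauchy.
med_sum = median(abs(sum(std_cauchy(rng) * zi for zi in z)) for _ in range(trials))

print(med_abs)   # near 1.0
print(med_sum)   # near 4.5 = ||z||_1
```

The empirical mean of |c| over the same samples would be erratic (it diverges in expectation), which is exactly why the estimator on the next slides uses a median.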
Sketching for ℓ_1 [Indyk'00]
Still, consider the analogous random map φ(x) = Cx = (C_1 x, C_2 x, …, C_k x), where the rows C_i have i.i.d. Cauchy entries.
Then φ(x) − φ(y) = Cx − Cy = C(x − y) = Cz, where z = x − y, and each coordinate C_i z is distributed as ||z||_1 × Cauchy.
Take the 1-norm ||Cz||_1? It has no finite expectation, but…
we can estimate ||z||_1 by median(|C_1 z|, |C_2 z|, …, |C_k z|).
Correctness claim: for each i,
  Pr[ |C_i z| > (1 − ε) ||z||_1 ] > 1/2 + Ω(ε)
  Pr[ |C_i z| < (1 + ε) ||z||_1 ] > 1/2 + Ω(ε)
Estimator for ℓ_1
Estimator: median(|C_1 z|, |C_2 z|, …, |C_k z|).
Correctness claim: for each i,
  Pr[ |C_i z| > (1 − ε) ||z||_1 ] > 1/2 + Ω(ε)
  Pr[ |C_i z| < (1 + ε) ||z||_1 ] > 1/2 + Ω(ε)
Proof: |C_i z| is distributed as | ||z||_1 · c | = ||z||_1 · |c| for a standard Cauchy c. Hence the claim is equivalent to
  Pr[ |c| > 1 − ε ] > 1/2 + Ω(ε)
  Pr[ |c| < 1 + ε ] > 1/2 + Ω(ε),
which is a matter of checking the pdf of the Cauchy distribution: Pr[|c| < s] = (2/π) arctan(s), which equals 1/2 at s = 1 (so the median of |c| is exactly 1), and the density near s = 1 is bounded away from 0.
Estimator for ℓ_1: high-probability bound
Estimator: median(|C_1 z|, |C_2 z|, …, |C_k z|).
Claim: for each i,
  Pr[ |C_i z| > (1 − ε) ||z||_1 ] > 1/2 + Ω(ε)
  Pr[ |C_i z| < (1 + ε) ||z||_1 ] > 1/2 + Ω(ε)
Take k = O(ε⁻² log(1/δ)). Let E_i = 1 if |C_i z| > (1 − ε)||z||_1 holds. Then E[Σ_i E_i] ≥ k(1/2 + Ω(ε)), hence Pr[Σ_i E_i ≤ k/2] < δ (Chernoff bound). Similarly for the indicators F_i = 1 if |C_i z| < (1 + ε)||z||_1 holds.
Together, this means median(|C_1 z|, …, |C_k z|) ∈ (1 ± ε) ||z||_1 except with probability at most 2δ.
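Assembled into code, the whole estimator is just a k × d Cauchy matrix plus a median. A minimal sketch follows; the concrete values of d and k are illustrative, with k playing the role of O(ε⁻² log(1/δ)):

```python
import math
import random
from statistics import median

def cauchy_matrix(k, d, rng):
    # k x d matrix of i.i.d. standard Cauchy entries, via inverse-transform sampling.
    return [[math.tan(math.pi * (rng.random() - 0.5)) for _ in range(d)] for _ in range(k)]

def sketch(C, x):
    # phi(x) = Cx
    return [sum(c * xi for c, xi in zip(row, x)) for row in C]

def estimate_l1(Cz):
    # median(|C_1 z|, ..., |C_k z|) estimates ||z||_1.
    return median(abs(t) for t in Cz)

rng = random.Random(1)
d, k = 50, 4001
z = [rng.uniform(-1, 1) for _ in range(d)]
true_l1 = sum(abs(t) for t in z)

C = cauchy_matrix(k, d, rng)
est = estimate_l1(sketch(C, z))
print(est / true_l1)   # close to 1
```

With k ≈ 4000, the sample median deviates from ||z||_1 by a few percent, matching the ε ≈ k^{-1/2} scaling in the analysis.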
Yesterday's application: ℓ_1 regression
Problem: x* = argmin_x ||Ax − b||_1.
With structured A and a preconditioner: time O(nnz(A) · log n) + poly(d/ε).
More: other norms (ℓ_p, M-estimators, Orlicz norms), low-rank approximation and optimization, matrix multiplication; see [Woodruff, FnTTCS'14, …].
Weak DR: a linear map φ: ℝ^n → ℝ^k such that for any z ∈ ℝ^n:
  Pr_φ[ 1 ≤ ||φ(z)||_1 / ||z||_1 ≤ 1/δ ] ≥ 1 − O(δ),
with entries φ_ij ~ Cauchy [I'00].
Weak(er) OSE: a linear map φ: ℝ^n → ℝ^k such that for any linear subspace U ⊆ ℝ^n of dimension d:
  Pr_φ[ ∀z ∈ U: 1 ≤ ||φ(z)||_1 / ||z||_1 ≤ poly(d) ] ≥ 0.9,
with k = O(d · log d) [SW'11, MM'13, WZ'13, WW'18].
Today's application: Streaming 1
Challenge: maintain statistics of the traffic log using small space.
(Figure: a router and a table of IP addresses with their frequencies.)
Streaming statistics
Let x_i = frequency of IP i.
1st moment (sum): Σ_i x_i. Trivial: keep a single total counter.
2nd moment: Σ_i x_i² = ||x||₂². Trivially: n counters, one per IP; too much space, and no exact algorithm can do better.
Small space is possible via (approximate) dimension reduction in ℓ_2.
Example: frequencies (3, 2, 2) give Σ_i x_i = 7 and Σ_i x_i² = 17.
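A tiny worked check of the two trivial baselines; the stream contents below are a hypothetical example chosen to be consistent with the totals 7 and 17:

```python
from collections import Counter

# Hypothetical stream of IP ids with frequencies 3, 2, 2.
stream = ["a", "a", "a", "b", "b", "c", "c"]

counts = Counter(stream)                             # trivial: one counter per IP
first_moment = sum(counts.values())                  # = stream length = 7
second_moment = sum(f * f for f in counts.values())  # 9 + 4 + 4 = 17
print(first_moment, second_moment)
```

The first moment needs only a single running counter; the `Counter` above is the n-counter solution for the second moment that the sketch on the next slide avoids.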
2nd frequency moment via dimension reduction
x_i = frequency of IP i; 2nd moment: Σ_i x_i² = ||x||₂².
Store φ(x) = (G_1 x, G_2 x, …, G_k x) = Gx, for a k × n matrix G of i.i.d. Gaussians.
Estimator: (1/k) ||Gx||₂² = (1/k) ((G_1 x)² + (G_2 x)² + … + (G_k x)²).
Updating the sketch: use linearity of the sketching function: when IP i arrives, G(x + e_i) = Gx + G e_i.
Correctness follows from the dimension-reduction guarantee.
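The linear-update rule means the sketch can be maintained one packet at a time: when item i arrives, add column i of G to the sketch. A minimal pure-Python version follows; note that here k > d purely so the toy estimate is accurate, whereas in a real deployment the point is k = O(ε⁻²) independent of the domain size n:

```python
import random

d, k = 50, 2000
rng = random.Random(2)
# G: k x d matrix of i.i.d. N(0,1) entries, fixed up front.
G = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)]

sketch = [0.0] * k
stream = [rng.randrange(d) for _ in range(1000)]   # hypothetical stream of item ids
for i in stream:
    # Linearity: G(x + e_i) = Gx + G e_i, i.e. add column i of G to the sketch.
    for j in range(k):
        sketch[j] += G[j][i]

# Exact second moment, for comparison (this needs one counter per item id).
freq = [0] * d
for i in stream:
    freq[i] += 1
true_f2 = sum(f * f for f in freq)

est_f2 = sum(s * s for s in sketch) / k
print(est_f2 / true_f2)   # close to 1
```

The estimator (1/k)||Gx||₂² has relative standard deviation about √(2/k), so with k = 2000 the ratio printed is within a few percent of 1.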
Streaming scenario 2
Two routers see frequency vectors x and y. Question: the difference in traffic, Σ_i |x_i − y_i| = ||x − y||_1.
(Figure: two IP-frequency tables that differ slightly, with ||x − y||_1 = 2.)
Similar questions: average delay/variance in a network; differential statistics between logs at different servers; etc.
Sketching for the difference
Use ℓ_1 sketching! Both routers apply the same random map φ(x) = Cx (common randomness for the 2 routers).
Estimator: median_i |φ(x)_i − φ(y)_i|; by linearity, φ(x) − φ(y) = φ(x − y), so this estimates ||x − y||_1.
Already proved: a 1 + ε approximation with k = O(1/ε²) counters.
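A sketch of the two-router protocol, with a shared seed standing in for the common randomness; the vector size, sketch size, and seeds are illustrative:

```python
import math
import random
from statistics import median

def cauchy_sketch(x, k, seed):
    # Both routers rebuild the same Cauchy matrix from a shared seed.
    rng = random.Random(seed)
    C = [[math.tan(math.pi * (rng.random() - 0.5)) for _ in range(len(x))]
         for _ in range(k)]
    return [sum(c * xi for c, xi in zip(row, x)) for row in C]

d, k, shared_seed = 40, 4001, 123
rng = random.Random(7)
x = [float(rng.randrange(10)) for _ in range(d)]
y = list(x)
y[3] += 1.0
y[17] -= 1.0                      # ||x - y||_1 = 2

sx = cauchy_sketch(x, k, shared_seed)   # computed at router 1
sy = cauchy_sketch(y, k, shared_seed)   # computed at router 2

# By linearity, sx - sy is a sketch of x - y; the median estimates ||x - y||_1.
est = median(abs(a - b) for a, b in zip(sx, sy))
print(est)    # close to 2
```

Only the k sketch coordinates need to be communicated, not the full frequency vectors; in practice the routers would share a seed or a pseudorandom generator rather than the explicit matrix.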
Sketching for ℓ_p norms
p-th moment: Σ_i x_i^p = ||x||_p^p.
p ≤ 2: about O(1/ε²) counters are enough; works via p-stable distributions [Indyk'00].
p > 2: can do (and need) Õ_ε(n^{1−2/p}) counters [AMS'96, SS'02, BYJKS'02, CKS'03, IW'05, BGKS'06, BO'10, AKO'11, G'11, BKSV'14, …].
Streaming 3: # distinct elements
Problem: compute the number of distinct elements in the stream.
Trivial solution: O(n) space for n distinct elements.
Will see: O(log n) space (approximate).
Distinct elements [Flajolet-Martin'85, Alon-Matias-Szegedy'96]
Algorithm:
  Initialize: minHash = 1; pick a random hash function h: items → [0,1]
  Process(i): if h(i) < minHash then minHash = h(i)
  Output: 1/minHash − 1
Main claim: E[minHash] = 1/(D + 1), where D is the number of distinct elements.
Proof: repeats of the same element don't matter, so z = minHash is the minimum of D random numbers in [0,1]. Pick another random number u ∈ [0,1]. Then Pr[u < z] is 1) exactly E[z] (conditioned on z, it equals z), and 2) the probability that u is the smallest among D + 1 i.i.d. uniform reals, i.e., 1/(D + 1). Hence E[z] = 1/(D + 1).
To get a 1 ± ε estimate with probability 1 − δ, combine k = O(ε⁻² log(1/δ)) independent repetitions (e.g., average the minHash values, or take a median of averaged groups).
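A minimal sketch of the algorithm. SHA-256 stands in for the ideal hash h: items → [0,1] (an implementation assumption), and the repetitions are combined by averaging the minHash values, matching the claim E[minHash] = 1/(D + 1); the number of repetitions is illustrative:

```python
import hashlib
from statistics import mean

def h(item, salt):
    # Hash an item to a pseudo-uniform real in (0, 1]; SHA-256 stands in for
    # the ideal random hash function h: items -> [0, 1].
    digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
    return (int.from_bytes(digest[:8], "big") + 1) / 2.0**64

def min_hash(stream, salt):
    # O(1) space per repetition: only the running minimum is stored.
    m = 1.0
    for item in stream:
        m = min(m, h(item, salt))
    return m

stream = [5, 7, 2, 1, 5, 7, 2, 2] * 10   # 4 distinct elements, many repeats
reps = 1000                               # illustrative; O(eps^-2 log(1/delta)) in theory
avg = mean(min_hash(stream, salt) for salt in range(reps))
est = 1.0 / avg - 1.0                     # E[minHash] = 1/(D+1)  =>  estimate of D
print(est)   # close to 4 distinct elements
```

Each repetition stores a single real number regardless of stream length, which is the source of the logarithmic space bound (up to the precision of the stored hashes).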