
1 Sublinear Algorithms for Big Data
Lecture 4. Grigory Yaroslavtsev. Slides are available at

2 Today
Dimensionality reduction: AMS as dimensionality reduction; Johnson-Lindenstrauss transform.
Sublinear time algorithms: definitions (approximation, property testing); basic examples (approximating diameter, testing properties of images); testing sortedness; testing connectedness.

3 $L_p$-norm Estimation
Stream: $m$ updates $(x_i, \Delta_i) \in [n] \times \mathbb{R}$ that define a vector $f$ where $f_j = \sum_{i:\, x_i = j} \Delta_i$.
Example: for $n = 4$ the stream $\langle (1,3), (3,0.5), (1,2), (2,-2), (2,1), (1,-1), (4,1) \rangle$ gives $f = (4, -1, 0.5, 1)$.
$L_p$-norm: $\|f\|_p = \left(\sum_i |f_i|^p\right)^{1/p}$.
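To make the update model concrete, here is a minimal Python sketch (not part of the original slides) that builds $f$ from the example stream above and evaluates the norms exactly; a streaming algorithm would of course avoid storing $f$ explicitly:

```python
n = 4
# Update stream from the slide: each pair (x_i, Delta_i) adds Delta_i to coordinate x_i.
stream = [(1, 3), (3, 0.5), (1, 2), (2, -2), (2, 1), (1, -1), (4, 1)]

f = [0.0] * n
for x, delta in stream:
    f[x - 1] += delta            # coordinates are 1-indexed on the slide

print(f)                         # [4.0, -1.0, 0.5, 1.0], matching the slide

def lp_norm(v, p):
    """(sum_i |v_i|^p)^(1/p); by convention p = 0 counts the non-zero entries."""
    if p == 0:
        return sum(1 for vi in v if vi != 0)
    return sum(abs(vi) ** p for vi in v) ** (1.0 / p)

print(lp_norm(f, 0), lp_norm(f, 1), lp_norm(f, 2))
```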

4 $L_p$-norm Estimation
$L_p$-norm: $\|f\|_p = \left(\sum_i |f_i|^p\right)^{1/p}$.
Two lectures ago:
$\|f\|_0$ = $F_0$-moment; $\|f\|_2^2$ = $F_2$-moment (via AMS sketching).
Space: $O\left(\frac{\log n}{\epsilon^2} \log \frac{1}{\delta}\right)$.
Technique: linear sketches.
For $\|f\|_0$: $\sum_{i \in S} f_i$ for a random set $S$; for $\|f\|_2$: $\sum_i \sigma_i f_i$ for random signs $\sigma_i$.

5 AMS as dimensionality reduction
Maintain a "linear sketch" vector $\mathbf{Z} = (Z_1, \dots, Z_k) = Rf$ with $Z_i = \sum_{j \in [n]} \sigma_{ij} f_j$, where $\sigma_{ij} \in_R \{-1, 1\}$.
Estimator $\mathbf{Y}$ for $\|f\|_2^2$: $\frac{1}{k} \sum_{i=1}^{k} Z_i^2 = \frac{\|Rf\|_2^2}{k}$.
"Dimensionality reduction": $x \to Rx$, but with a "heavy" tail (by Chebyshev's inequality):
$\Pr\left[\left|\mathbf{Y} - \|f\|_2^2\right| \ge c \sqrt{\tfrac{2}{k}}\, \|f\|_2^2\right] \le \frac{1}{c^2}$.
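A minimal Python simulation of this sign sketch (illustrative only; it uses fully independent random signs rather than the 4-wise independent hash functions a real streaming implementation would use):

```python
import random

def ams_estimate(f, k, rng=random.Random(0)):
    """Estimate ||f||_2^2 by averaging k sign-sketch coordinates Z_i = sum_j sigma_ij * f_j."""
    n = len(f)
    estimate = 0.0
    for _ in range(k):
        signs = [rng.choice((-1, 1)) for _ in range(n)]
        z = sum(s * fj for s, fj in zip(signs, f))
        estimate += z * z
    return estimate / k          # each Z_i^2 is an unbiased estimator of ||f||_2^2

f = [4, -1, 0.5, 1]
true_value = sum(fj * fj for fj in f)        # 18.25
print(true_value, ams_estimate(f, k=1000))   # the estimate concentrates around 18.25
```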

6 Normal Distribution
Normal distribution $N(0,1)$:
Range: $(-\infty, +\infty)$. Density: $\mu(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. Mean = 0, variance = 1.
Basic facts:
If $X$ and $Y$ are independent normally distributed r.v., then $X + Y$ is normally distributed.
$Var[cX] = c^2 Var[X]$.
If $X, Y$ are independent, then $Var[X + Y] = Var[X] + Var[Y]$.
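These facts are exactly what make the Gaussian sketch work in the next slide; a quick numerical sanity check (illustrative, not from the slides) with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
y = rng.standard_normal(1_000_000)

# Variances add for independent variables, and Var[cX] = c^2 Var[X].
print(np.var(x + y))   # ~2.0
print(np.var(3 * x))   # ~9.0
```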

7 Johnson-Lindenstrauss Transform
Instead of $\pm 1$, let the $\sigma_i$ be i.i.d. random variables from the normal distribution $N(0,1)$, and let $Z = \sum_i \sigma_i f_i$.
We still have $\mathbb{E}[Z^2] = \sum_i f_i^2 = \|f\|_2^2$, because $\mathbb{E}[\sigma_i]\,\mathbb{E}[\sigma_j] = 0$ for $i \ne j$ and $\mathbb{E}[\sigma_i^2]$ = "variance of $\sigma_i$" = 1.
Define $\mathbf{Z} = (Z_1, \dots, Z_k)$ and $\|\mathbf{Z}\|_2^2 = \sum_j Z_j^2$, so $\mathbb{E}[\|\mathbf{Z}\|_2^2] = k \|f\|_2^2$.
JL Lemma: there exists $C > 0$ s.t. for small enough $\epsilon > 0$:
$\Pr\left[\left| \|\mathbf{Z}\|_2^2 - k\|f\|_2^2 \right| > \epsilon k \|f\|_2^2\right] \le \exp(-C \epsilon^2 k)$.
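A small illustrative sketch (not from the slides) of the Gaussian version: project a vector with a $k \times n$ matrix of i.i.d. $N(0,1)$ entries and compare $\|\mathbf{Z}\|_2^2 / k$ with $\|f\|_2^2$:

```python
import numpy as np

rng = np.random.default_rng(1)

n, k = 1000, 400
f = rng.uniform(-1, 1, size=n)

R = rng.standard_normal((k, n))   # i.i.d. N(0,1) entries, one row per sketch coordinate
Z = R @ f                          # Z_i = sum_j sigma_ij * f_j

true_norm_sq = float(f @ f)
estimate = float(Z @ Z) / k        # E[||Z||^2] = k * ||f||^2, so divide by k

print(true_norm_sq, estimate)      # the two values agree up to a small relative error
```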

8 Proof of JL Lemma
JL Lemma: $\exists C > 0$ s.t. for small enough $\epsilon > 0$:
$\Pr\left[\left| \|\mathbf{Z}\|_2^2 - k\|f\|_2^2 \right| > \epsilon k \|f\|_2^2 \right] \le \exp(-C\epsilon^2 k)$.
Assume $\|f\|_2 = 1$ (w.l.o.g., since the sketch is linear in $f$, so everything scales with $\|f\|_2$). We have $Z_i = \sum_j \sigma_{ij} f_j$ and $\mathbf{Z} = (Z_1, \dots, Z_k)$, so $\mathbb{E}[\|\mathbf{Z}\|_2^2] = k\|f\|_2^2 = k$.
Alternative form of JL Lemma (upper tail): $\Pr\left[\|\mathbf{Z}\|_2^2 > k(1+\epsilon)^2\right] \le \exp(-\epsilon^2 k + O(k\epsilon^3))$.

9 Proof of JL Lemma
Alternative form of JL Lemma: $\Pr\left[\|\mathbf{Z}\|_2^2 > k(1+\epsilon)^2\right] \le \exp(-\epsilon^2 k + O(k\epsilon^3))$.
Let $Y = \|\mathbf{Z}\|_2^2$ and $\alpha = k(1+\epsilon)^2$.
For every $s > 0$ we have $\Pr[Y > \alpha] = \Pr[e^{sY} > e^{s\alpha}]$.
By Markov's inequality and independence of the $Z_i$'s:
$\Pr[e^{sY} > e^{s\alpha}] \le \frac{\mathbb{E}[e^{sY}]}{e^{s\alpha}} = e^{-s\alpha}\, \mathbb{E}\left[e^{s \sum_i Z_i^2}\right] = e^{-s\alpha} \prod_{i=1}^{k} \mathbb{E}\left[e^{s Z_i^2}\right]$.
Since $Z_i \sim N(0,1)$ (recall $\|f\|_2 = 1$), for $s < 1/2$:
$\mathbb{E}\left[e^{s Z_i^2}\right] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{s t^2} e^{-t^2/2}\, dt = \frac{1}{\sqrt{1-2s}}$.
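The last equality is just a rescaled Gaussian integral; filling in the one-line computation that the slide omits (substitute $u = t\sqrt{1-2s}$, valid for $s < 1/2$):
$\frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{st^2} e^{-t^2/2}\, dt = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(1-2s)t^2}{2}}\, dt = \frac{1}{\sqrt{1-2s}} \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-u^2/2}\, du = \frac{1}{\sqrt{1-2s}}$.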

10 Proof of JL Lemma
Alternative form of JL Lemma: $\Pr\left[\|\mathbf{Z}\|_2^2 > k(1+\epsilon)^2\right] \le \exp(-\epsilon^2 k + O(k\epsilon^3))$.
For every $0 < s < 1/2$ we have:
$\Pr[Y > \alpha] \le e^{-s\alpha} \prod_{i=1}^{k} \mathbb{E}\left[e^{s Z_i^2}\right] = e^{-s\alpha} (1-2s)^{-k/2}$.
Let $s = \frac{1}{2}\left(1 - \frac{k}{\alpha}\right)$ and recall that $\alpha = k(1+\epsilon)^2$. A calculation (written out below) finishes the proof:
$\Pr[Y > \alpha] \le \exp(-\epsilon^2 k + O(k\epsilon^3))$.
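For completeness, here is that calculation written out (it is only alluded to on the slide). With $s = \frac{1}{2}(1 - \frac{k}{\alpha})$ we get $1 - 2s = \frac{k}{\alpha}$ and $s\alpha = \frac{\alpha - k}{2}$, so
$e^{-s\alpha}(1-2s)^{-k/2} = \exp\left(-\frac{\alpha - k}{2} + \frac{k}{2}\ln\frac{\alpha}{k}\right) = \exp\left(-\frac{k(2\epsilon + \epsilon^2)}{2} + k\ln(1+\epsilon)\right) = \exp(-\epsilon^2 k + O(k\epsilon^3))$,
using $\ln(1+\epsilon) = \epsilon - \frac{\epsilon^2}{2} + O(\epsilon^3)$.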

11 Johnson-Lindenstrauss Transform
Single vector: $k = O\left(\frac{\log(1/\delta)}{\epsilon^2}\right)$. Tight: $k = \Omega\left(\frac{\log(1/\delta)}{\epsilon^2}\right)$ [Woodruff '10].
$n$ vectors simultaneously: $k = O\left(\frac{\log n \cdot \log(1/\delta)}{\epsilon^2}\right)$. Tight: $k = \Omega\left(\frac{\log n \cdot \log(1/\delta)}{\epsilon^2}\right)$ [Molinaro, Woodruff, Yaroslavtsev '13].
Distances between $n$ vectors = $O(n^2)$ difference vectors: $k = O\left(\frac{\log n \cdot \log(1/\delta)}{\epsilon^2}\right)$.
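An end-to-end illustration of the distance-preservation claim (the constant in the choice of $k$ below is ad hoc, not the constant from the lemma): project $n$ points with a scaled Gaussian matrix and measure the worst relative distortion over all pairs.

```python
import numpy as np

rng = np.random.default_rng(2)

n, d, eps = 50, 2000, 0.2
k = int(np.ceil(8 * np.log(n) / eps**2))       # ad hoc sketch dimension ~ log(n)/eps^2

X = rng.normal(size=(n, d))                    # n points in d dimensions
R = rng.standard_normal((k, d)) / np.sqrt(k)   # scaled Gaussian projection
Y = X @ R.T                                    # projected points in k dimensions

worst = 0.0
for i in range(n):
    for j in range(i + 1, n):
        orig = np.linalg.norm(X[i] - X[j])
        proj = np.linalg.norm(Y[i] - Y[j])
        worst = max(worst, abs(proj - orig) / orig)

print(k, worst)   # worst-case relative distortion over all pairs, typically below eps
```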

