
1 Linear sketching over $\mathbb{F}_2$
Grigory Yaroslavtsev (Indiana University, Bloomington) with Sampath Kannan (U. Pennsylvania), Elchanan Mossel (MIT) and Swagato Sanyal (NUS)

2 Linear sketching with parities
Input: $x \in \{0,1\}^n$.
Parity = linear function over $\mathbb{F}_2$: $\bigoplus_{i \in S} x_i$, e.g. $x_4 \oplus x_2 \oplus x_{42}$.
Deterministic linear sketch: a set of $k$ parities, $\ell(x) = \left(\bigoplus_{i_1 \in S_1} x_{i_1};\ \bigoplus_{i_2 \in S_2} x_{i_2};\ \dots;\ \bigoplus_{i_k \in S_k} x_{i_k}\right)$.
Randomized linear sketch: a distribution over sets of $k$ parities (random $S_1, S_2, \dots, S_k$).
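A minimal sketch of the definition above, assuming 0-based coordinates and sets $S_j$ stored as lists of indices. The helper names (`parity`, `linear_sketch`, `random_subsets`) are illustrative and do not come from the talk.

```python
import random

def parity(x, subset):
    """XOR of the bits of x at the (0-based) positions listed in subset."""
    return sum(x[i] for i in subset) % 2

def linear_sketch(x, subsets):
    """Evaluate one parity per subset; the result is the k-bit sketch l(x)."""
    return [parity(x, s) for s in subsets]

def random_subsets(n, k, rng):
    """Draw k uniformly random subsets of {0, ..., n-1} (a randomized sketch)."""
    return [[i for i in range(n) if rng.random() < 0.5] for _ in range(k)]

x = [1, 0, 1, 1, 0, 1]
subsets = [[0, 1], [2, 3, 5], [4]]        # a deterministic sketch with k = 3
print(linear_sketch(x, subsets))          # [1, 1, 0]

rng = random.Random(0)                    # the seed is an arbitrary choice
print(linear_sketch(x, random_subsets(len(x), 3, rng)))
```

The sketch has $k$ bits regardless of $n$; the interesting case is $k \ll n$.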

3 Linear sketching over $\mathbb{F}_2$
Given $f(x): \{0,1\}^n \to \{0,1\}$.
Question: can one recover $f(x)$ from a small ($k \ll n$) linear sketch over $\mathbb{F}_2$?
Allow randomized computation (99% success), with the probability taken over the choice of the random sets.
The sets are known at recovery time.
Recovery is deterministic (randomized recovery is also considered).

4 Motivation: Distributed Computing
Distributed computation among $M$ machines: $x = (x_1, x_2, \dots, x_M)$ (more generally, $x = \bigoplus_{i=1}^{M} x_i$).
The $M$ machines compute sketches locally: $\ell(x_1), \dots, \ell(x_M)$.
They send them to the coordinator, who computes $\ell_i(x) = \ell_i(x_1) \oplus \dots \oplus \ell_i(x_M)$ (coordinate-wise XORs).
The coordinator computes $f(x)$ with $kM$ communication.
[Figure: the input $x$ split into parts $x_1$, $x_2$ held by different machines.]
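The coordinator step relies only on linearity: the sketch of an XOR of inputs equals the XOR of the sketches. The following check illustrates this, assuming $M = 3$ machines with made-up shares and made-up sets; none of the concrete values come from the talk.

```python
from functools import reduce

def sketch(x, subsets):
    """k parities of x, one per subset of (0-based) coordinates."""
    return [sum(x[i] for i in s) % 2 for s in subsets]

def xor_vectors(a, b):
    """Coordinate-wise XOR of two equal-length bit vectors."""
    return [u ^ v for u, v in zip(a, b)]

subsets = [[0, 2], [1, 2, 3], [0, 3]]                  # k = 3 parities
shares = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0]]    # hypothetical local inputs

x = reduce(xor_vectors, shares)                        # x = XOR of all shares
local = [sketch(s, subsets) for s in shares]           # computed on each machine
merged = reduce(xor_vectors, local)                    # computed by the coordinator

print(x, merged)                                       # [0, 0, 1, 1] [1, 0, 1]
assert merged == sketch(x, subsets)
```

Each machine sends $k$ bits, so the total communication is $kM = 9$ bits in this toy instance.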

5 Motivation: Streaming
$x$ is generated through a sequence of updates.
Updates $i_1, \dots, i_m$: update $i_t$ flips the bit at position $i_t$.
[Figure: starting from $x_0$, the updates (1, 3, 8, 3) produce $x_1$, $x_2$, $x_3$ and the final $x$.]
$\ell(x)$ allows recovering $f(x)$ with $k$ bits of space.

6 Deterministic vs. Randomized
Fact: $f$ has a deterministic sketch of size $k$ if and only if $f = g\left(\bigoplus_{i_1 \in S_1} x_{i_1};\ \bigoplus_{i_2 \in S_2} x_{i_2};\ \dots;\ \bigoplus_{i_k \in S_k} x_{i_k}\right)$ for some $g$.
This is equivalent to "$f$ has Fourier dimension $k$".
Randomization can help. OR: $f(x) = x_1 \vee \dots \vee x_n$ has "Fourier dimension" $= n$.
Pick $t = \log(1/\delta)$ random sets $S_1, \dots, S_t$.
If there is a $j$ such that $\bigoplus_{i \in S_j} x_i = 1$, output 1; otherwise output 0.
Error probability: $\delta$.
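The randomized OR sketch can be simulated as follows. This is a sketch under two assumptions: the logarithm is taken base 2, and each $S_j$ is a uniformly random subset. For $x = 0$ every parity is 0, so the output is always correct. For $x \neq 0$ each random parity equals 1 with probability 1/2, so all $t$ of them miss with probability $2^{-t} \le \delta$.

```python
import math
import random

def or_sketch(x, t, rng):
    """Guess OR(x) from t random parities of x (one-sided error)."""
    n = len(x)
    for _ in range(t):
        subset = [i for i in range(n) if rng.random() < 0.5]
        if sum(x[i] for i in subset) % 2 == 1:
            return 1        # a parity equal to 1 certifies that x is non-zero
    return 0                # all t parities were 0: guess that x is all zeros

delta = 0.001
t = math.ceil(math.log2(1 / delta))   # 10 parities suffice for delta = 0.001

rng = random.Random(1)                # the seed is an arbitrary choice
x = [0, 0, 1, 0, 0, 0, 0, 0]
errors = sum(or_sketch(x, t, rng) != 1 for _ in range(2000))
print(t, errors / 2000)               # empirical error rate, expected near 2**-10
```

The sketch uses $t$ bits, independent of $n$, whereas any deterministic sketch for OR needs $n$ parities.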

7 Fourier Analysis
$f(x_1, \dots, x_n): \{0,1\}^n \to \{0,1\}$
Notation switch: $0 \to 1$, $1 \to -1$, giving $f': \{-1,1\}^n \to \{-1,1\}$.
Functions as vectors form a vector space: $f: \{-1,1\}^n \to \{-1,1\} \Leftrightarrow f \in \{-1,1\}^{2^n}$.
Inner product on functions = "correlation": $\langle f, g \rangle = 2^{-n} \sum_{x \in \{-1,1\}^n} f(x) g(x) = \mathbb{E}_{x \sim \{-1,1\}^n}[f(x) g(x)]$.
$\|f\|_2 = \sqrt{\langle f, f \rangle} = \sqrt{\mathbb{E}_{x \sim \{-1,1\}^n}[f^2(x)]} = 1$ (for Boolean functions only).

8 "Main Characters" are Parities
For $S \subseteq [n]$, let the character $\chi_S(x) = \prod_{i \in S} x_i$.
Fact: every function $f: \{-1,1\}^n \to \{-1,1\}$ is uniquely represented as a multilinear polynomial $f(x_1, \dots, x_n) = \sum_{S \subseteq [n]} \hat{f}(S) \chi_S(x)$.
$\hat{f}(S)$ is a.k.a. the Fourier coefficient of $f$ on $S$: $\hat{f}(S) \equiv \langle f, \chi_S \rangle = \mathbb{E}_{x \sim \{-1,1\}^n}[f(x) \chi_S(x)]$.
$\sum_{S} \hat{f}(S)^2 = 1$ (Parseval).
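For small $n$ the Fourier coefficients can be computed by brute force straight from the definition $\hat{f}(S) = \mathbb{E}[f(x)\chi_S(x)]$. The example below is a minimal sketch using the majority of three $\pm 1$ inputs as a test function; the function names are illustrative.

```python
from itertools import combinations, product
from math import prod

def fourier_coefficients(f, n):
    """Map each S (a tuple of 0-based indices) to E_x[f(x) * chi_S(x)]."""
    points = list(product([1, -1], repeat=n))
    coeffs = {}
    for size in range(n + 1):
        for S in combinations(range(n), size):
            # chi_S(x) is the product of the coordinates in S (1 for the empty set)
            total = sum(f(x) * prod(x[i] for i in S) for x in points)
            coeffs[S] = total / len(points)
    return coeffs

def maj3(x):
    """Majority of three +1/-1 values."""
    return 1 if sum(x) > 0 else -1

coeffs = fourier_coefficients(maj3, 3)
print(coeffs[(0,)], coeffs[(0, 1, 2)])        # 0.5 -0.5
print(sum(c * c for c in coeffs.values()))    # 1.0 (Parseval)
```

This matches the multilinear expansion $Maj_3(x) = \frac{1}{2}(x_1 + x_2 + x_3) - \frac{1}{2} x_1 x_2 x_3$. The loop costs $4^n$ evaluations, so it is only meant for toy sizes.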

9 Fourier Dimension
Fourier sets $S$ ≡ vectors in $\mathbb{F}_2^n$.
"$f$ has Fourier dimension $k$" = a $k$-dimensional subspace $A_k$ in the Fourier domain has all the weight: $\sum_{S \in A_k} \hat{f}(S)^2 = 1$.
$f(x_1, \dots, x_n) = \sum_{S \subseteq [n]} \hat{f}(S) \chi_S(x) = \sum_{S \in A_k} \hat{f}(S) \chi_S(x)$.
Pick a basis $S_1, \dots, S_k$ of $A_k$. Sketch: $\chi_{S_1}(x), \dots, \chi_{S_k}(x)$.
For every $S \in A_k$ there exists $Z \subseteq [k]$ with $S = \bigoplus_{i \in Z} S_i$, and then $\chi_S(x) = \bigoplus_{i \in Z} \chi_{S_i}(x)$.

10 Deterministic Sketching and Noise
Suppose the "noise" has a bounded norm: $f$ = $k$-dimensional $\oplus$ "noise".
Sparse Fourier noise (via [Sanyal'15]): $f$ = $k$-dim. + "Fourier $L_0$-noise".
$\|\text{noise}\|_0$ = number of non-zero Fourier coefficients of the noise (a.k.a. "Fourier sparsity").
Linear sketch size: $k + O(\|\text{noise}\|_0^{1/2})$.
Our work: this cannot be improved even with randomness and even for uniform $x$, e.g. for the "addressing function".

11 How Randomization Handles Noise
$L_0$-noise in the original domain (via hashing, à la OR): $f$ = $k$-dim. + "$L_0$-noise".
Linear sketch size: $k + O(\log \|\text{noise}\|_0)$.
Optimal (but only existentially, i.e. $\exists f$: …).
$L_1$-noise in the Fourier domain (via [Grolmusz'97]): $f$ = $k$-dim. + "Fourier $L_1$-noise".
Linear sketch size: $k + O(\|\text{noise}\|_1^2)$.
Example: $k$-dim. + small decision tree / DNF / etc.

12 Randomized Sketching: Hardness
$k$-dimensional affine extractors require $k$:
$f$ is an affine extractor for dimension $k$ if any restriction to a $k$-dim. affine subspace takes each of the values 0/1 with probability $\ge 0.1$.
Example (inner product): $f(x) = \bigoplus_{i=1}^{n/2} x_{2i-1} x_{2i}$.
Not $\gamma$-concentrated on $k$-dim. Fourier subspaces:
for every $k$-dim. Fourier subspace $A$: $\sum_{S \notin A} \hat{f}(S)^2 \ge 1 - \gamma$.
Any $k$-dim. linear sketch makes error $\frac{1 - \sqrt{\gamma}}{2}$.
The converse doesn't hold, i.e. concentration is not enough.

13 Randomized Sketching: Hardness
Not $\gamma$-concentrated on $o(n)$-dim. Fourier subspaces:
Almost all symmetric functions, i.e. $f(x) = h\left(\sum_i x_i\right)$, if not Fourier-close to a constant or to $\bigoplus_{i=1}^{n} x_i$.
E.g. Majority (not an extractor even for $O(\sqrt{n})$).
Tribes (balanced DNF).
Recursive majority: $Maj^{\circ k} = Maj_3 \circ Maj_3 \circ \dots \circ Maj_3$.

14 Approximate Fourier Dimension
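Not $\gamma$-concentrated on $k$-dim. Fourier subspaces: for every $k$-dim. Fourier subspace $A$, $\sum_{S \notin A} \hat{f}(S)^2 \ge 1 - \gamma$.
Any $k$-dim. linear sketch makes error $\frac{1}{2}(1 - \sqrt{\gamma})$.
Definition (Approximate Fourier Dimension): $\dim_\gamma(f)$ = smallest $d$ such that $f$ is $\gamma$-concentrated on some Fourier subspace $A$ of dimension $d$, i.e. $\sum_{S \in A} \hat{f}(S)^2 \ge \gamma$.
[Figure: the Fourier coefficients $\hat{f}(S_1)$, $\hat{f}(S_2)$, $\hat{f}(S_3)$, $\hat{f}(S_2 + S_3)$, $\hat{f}(S_1 + S_3)$, $\hat{f}(S_1 + S_2 + S_3)$ inside a subspace spanned by $S_1, S_2, S_3$.]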
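For toy sizes, $\dim_\gamma(f)$ can be found by exhaustive search over subspaces of $\mathbb{F}_2^n$. The code below is a brute-force illustration of the definition, not an algorithm from the talk. It assumes sets $S$ are encoded as $n$-bit masks, so that the sum of two sets in $\mathbb{F}_2^n$ becomes XOR of masks. With $\gamma = 1$ it returns the exact Fourier dimension.

```python
from itertools import combinations, product

def fourier_weights(f, n):
    """Map each set S (encoded as an n-bit mask) to the squared coefficient."""
    points = list(product([1, -1], repeat=n))
    weights = {}
    for mask in range(2 ** n):
        total = 0
        for x in points:
            chi = 1
            for i in range(n):
                if (mask >> i) & 1:
                    chi *= x[i]
            total += f(x) * chi
        weights[mask] = (total / len(points)) ** 2
    return weights

def span(basis):
    """All XOR-combinations of the given masks, i.e. the subspace they generate."""
    vectors = {0}
    for b in basis:
        vectors |= {v ^ b for v in vectors}
    return vectors

def approx_fourier_dim(f, n, gamma, tol=1e-9):
    """Smallest d such that some subspace of dimension <= d carries weight >= gamma."""
    weights = fourier_weights(f, n)
    for d in range(n + 1):
        for basis in combinations(range(1, 2 ** n), d):
            if sum(weights[s] for s in span(basis)) >= gamma - tol:
                return d
    return n

def maj3(x):
    return 1 if sum(x) > 0 else -1

def parity3(x):
    return x[0] * x[1] * x[2]

print([approx_fourier_dim(maj3, 3, g) for g in (0.25, 0.5, 0.75, 1.0)])  # [1, 2, 3, 3]
print(approx_fourier_dim(parity3, 3, 1.0))                               # 1
```

$Maj_3$ has weight 1/4 on each of $\{1\}$, $\{2\}$, $\{3\}$ and $\{1,2,3\}$. No 2-dimensional subspace contains three of these four sets, which is why $\dim_{0.75}(Maj_3) = 3$.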

15 Sketching over Uniform Distribution + Approximate Fourier Dimension
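Sketching error over the uniform distribution of $x$.
A $\dim_\epsilon(f)$-dimensional sketch gives error $1 - \epsilon$:
Fix a $\dim_\epsilon(f)$-dimensional $A$ with $\sum_{S \in A} \hat{f}(S)^2 \ge \epsilon$.
Output: $g(x) = \mathrm{sign}\left(\sum_{S \in A} \hat{f}(S) \chi_S(x)\right)$.
$\Pr_{x \sim U(\{-1,1\}^n)}[g(x) = f(x)] \ge \epsilon \Rightarrow$ error $1 - \epsilon$.
We show a basic refinement $\Rightarrow$ error $\frac{1 - \epsilon}{2}$:
Pick $\theta$ from a carefully chosen distribution.
Output: $g_\theta(x) = \mathrm{sign}\left(\sum_{S \in A} \hat{f}(S) \chi_S(x) - \theta\right)$.
[Figure: the threshold $\theta$ on the interval from $-1$ to $+1$.]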
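The first estimator, $g(x) = \mathrm{sign}(\sum_{S \in A} \hat{f}(S)\chi_S(x))$, is easy to evaluate on a toy function. The sketch below assumes sets are encoded as bit masks and breaks ties of the sign toward $+1$, an arbitrary choice the slide does not specify. The randomized threshold $\theta$ of the refinement is not implemented.

```python
from itertools import product

def chi(mask, x):
    """Character chi_S(x) for S encoded as a bit mask over +1/-1 inputs."""
    value = 1
    for i, xi in enumerate(x):
        if (mask >> i) & 1:
            value *= xi
    return value

def agreement_of_projection(f, n, subspace):
    """Return (Pr[g(x) = f(x)], Fourier weight of f on the subspace)."""
    points = list(product([1, -1], repeat=n))
    coeff = {s: sum(f(x) * chi(s, x) for x in points) / len(points)
             for s in subspace}

    def g(x):
        h = sum(coeff[s] * chi(s, x) for s in subspace)
        return 1 if h >= 0 else -1      # ties go to +1 (assumption)

    agree = sum(1 for x in points if g(x) == f(x)) / len(points)
    weight = sum(c * c for c in coeff.values())
    return agree, weight

def maj3(x):
    return 1 if sum(x) > 0 else -1

# A = span({1}, {2}) as masks: {0, 0b01, 0b10, 0b11}
print(agreement_of_projection(maj3, 3, [0, 1, 2, 3]))   # (0.75, 0.5)
```

Here the 2-dimensional subspace carries weight $\epsilon = 0.5$ and the estimator agrees with $Maj_3$ on 75% of inputs, consistent with the bound $\Pr[g(x) = f(x)] \ge \epsilon$.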

16 1-way Communication Complexity of XOR-functions
Shared randomness.
Alice: $x \in \{0,1\}^n$; Bob: $y \in \{0,1\}^n$.
Alice sends a message $M(x)$; Bob outputs $f^+(x, y) = f(x \oplus y)$.
Examples:
$f(z) = OR_{i=1}^{n}(z_i)$ ⇒ (not) Equality.
$f(z) = (\|z\|_0 > d)$ ⇒ Hamming Distance $> d$.
$R^1_\epsilon(f^+)$ = min. $|M|$ so that Bob's error probability is at most $\epsilon$.
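The (not) Equality example can be turned into a one-way protocol from the OR sketch: Alice sends $t$ random parities of $x$, Bob XORs them with the same parities of $y$, and by linearity the result is the sketch of $x \oplus y$. The code below is a minimal simulation in which a common seed stands in for shared randomness; the function names and the seed mechanism are assumptions.

```python
import random

def shared_subsets(n, t, seed):
    """Both parties derive the same t random subsets from the shared seed."""
    rng = random.Random(seed)
    return [[i for i in range(n) if rng.random() < 0.5] for _ in range(t)]

def alice_message(x, subsets):
    """Alice's t-bit message M(x): the linear sketch of her input."""
    return [sum(x[i] for i in s) % 2 for s in subsets]

def bob_output(message, y, subsets):
    """Bob's guess for OR(x XOR y): 1 means 'x != y', 0 means 'x == y'."""
    own = [sum(y[i] for i in s) % 2 for s in subsets]
    combined = [a ^ b for a, b in zip(message, own)]   # sketch of x XOR y
    return int(any(combined))

subsets = shared_subsets(n=8, t=20, seed=0)
x = [1, 0, 1, 1, 0, 0, 1, 0]
y = [1, 0, 1, 1, 0, 1, 1, 0]                          # differs from x in one bit
print(bob_output(alice_message(x, subsets), x, subsets))   # 0: equal inputs
print(bob_output(alice_message(x, subsets), y, subsets))   # 1 except with probability 2**-20
```

The message length is $t$ bits and does not depend on $n$. The error is one-sided: equal inputs are never reported as different.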

17 Communication Complexity of XOR-functions
Well-studied (often for 2-way communication):
[Montanaro, Osborne], arXiv’09
[Shi, Zhang], QIC’09
[Tsang, Wong, Xie, Zhang], FOCS’13
[O’Donnell, Wright, Zhao, Sun, Tan], CCC’14
[Hatami, Hosseini, Lovett], FOCS’16
Connections to the log-rank conjecture [Lovett’14]:
even the special case of XOR-functions is still open.

18 Deterministic 1-way Communication Complexity of XOR-functions
Alice: $x \in \{0,1\}^n$; Bob: $y \in \{0,1\}^n$.
Alice sends $M(x)$; Bob outputs $f^+(x, y) = f(x \oplus y)$.
$D^1(f^+)$ = min. $|M|$ so that Bob is always correct.
[Montanaro-Osborne’09]: $D^1(f^+) = D^{lin}(f)$, where $D^{lin}(f)$ = deterministic linear sketch complexity of $f$.
$D^1(f^+) = D^{lin}(f)$ = Fourier dimension of $f$.

19 1-way Communication Complexity of XOR-functions
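Shared randomness.
Alice: $x \in \{0,1\}^n$; Bob: $y \in \{0,1\}^n$.
Alice sends $M(x)$; Bob outputs $f(x \oplus y)$.
$R^1_\epsilon(f^+)$ = min. $|M|$ so that Bob's error probability is at most $\epsilon$.
$R^{lin}_\epsilon(f)$ = randomized linear sketch complexity of $f$ (error $\epsilon$).
$R^1_\epsilon(f^+) \le R^{lin}_\epsilon(f)$.
Question: $R^1_\epsilon(f^+) \approx R^{lin}_\epsilon(f)$?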

20 $R^1_\epsilon(f^+) \approx R^{lin}_\epsilon(f)$?
Holds for:
Majority, Tribes, recursive majority, addressing function
Linear threshold functions
(Almost all) symmetric functions
Degree-$d$ $\mathbb{F}_2$-polynomials: $R^{lin}_{5\epsilon}(f) = O(d \cdot R^1_\epsilon(f^+))$
The analogous question for 2-way communication is wide open: [HHL’16] $Q^{\oplus\text{-}dt}_\epsilon(f) = poly(R_\epsilon(f^+))$?

21 Distributional 1-way Communication under Uniform Distribution
Alice: $x \sim U(\{0,1\}^n)$; Bob: $y \sim U(\{0,1\}^n)$.
Alice sends $M(x)$; Bob outputs $f(x \oplus y)$.
$R^1_\epsilon(f^+) = \sup_D \mathfrak{D}^{1,D}_\epsilon(f^+)$.
$\mathfrak{D}^{1,U}_\epsilon(f^+)$ = min. $|M|$ so that Bob's error probability is at most $\epsilon$, taken over the uniform distribution of $(x, y)$.
It is enough to consider deterministic messages only.
Motivation: streaming/distributed computation with random input.

22 Sketching over Uniform Distribution
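Thm: If $\dim_\epsilon(f) = d - 1$ then $\mathfrak{D}^{1,U}_{\frac{1-\epsilon}{6}}(f^+) \ge \frac{d}{6}$.
Optimal up to the error, as a $d$-dim. linear sketch has error $\frac{1-\epsilon}{2}$.
Weaker: if $\epsilon_2 > \epsilon_1$ and $\dim_{\epsilon_1}(f) = \dim_{\epsilon_2}(f) = d - 1$, then $\mathfrak{D}^{1,U}_\delta(f^+) \ge d$, where $\delta = (\epsilon_2 - \epsilon_1)/4$.
Corollary: If $\hat{f}(\emptyset) < C$ for $C < 1$ then there exists $d$: $\mathfrak{D}^{1,U}_{\Theta(1/n)}(f^+) \ge d$.
Tight for the Majority function, etc.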

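23 $\mathfrak{D}^{1,U}_\epsilon$ and Approximate Fourier Dimension
Thm: If $\epsilon_2 > \epsilon_1 > 0$ and $\dim_{\epsilon_1}(f) = \dim_{\epsilon_2}(f) = d - 1$, then $\mathfrak{D}^{1,U}_\delta(f^+) \ge d$, where $\delta = (\epsilon_2 - \epsilon_1)/4$.
[Figure: communication matrix with rows indexed by $x \in \{0,1\}^n$ and columns by $y \in \{0,1\}^n$; the entry in row $x$, column $y$ is $f(x \oplus y) = f_x(y)$. Rows $f_{x_1}$, $f_{x_2}$, $f_{x_3}$ are grouped by the message $M(x) \in \{00, 01, 10, 11\}$.]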

24 $\mathfrak{D}^{1,U}_\epsilon$ and Approximate Fourier Dimension
If $|M(x)| = d - 1$, the average "rectangle" size is $2^{n-d+1}$.
A subspace $A$ distinguishes $x_1$ and $x_2$ if $\exists S \in A: \chi_S(x_1) \ne \chi_S(x_2)$.
Lem 1: Fix a $d$-dim. subspace $A_d$: typical $x_1$ and $x_2$ in a typical "rectangle" are distinguished by $A_d$.
Lem 2: If a $d$-dim. subspace $A_d$ distinguishes $x_1$ and $x_2$, and
1) $f$ is $\epsilon_2$-concentrated on $A_d$,
2) $f$ is not $\epsilon_1$-concentrated on any $(d-1)$-dim. subspace,
then $\Pr_{z \sim U(\{-1,1\}^n)}[f_{x_1}(z) \ne f_{x_2}(z)] \ge \epsilon_2 - \epsilon_1$.
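Both notions in the lemmas can be checked directly on small examples. The sketch below works in 0/1 notation with $f_x(z) = f(x \oplus z)$ and assumes a subspace is given by a basis of index lists. It tests only the definitions and the disagreement probability, not the lemmas themselves.

```python
from itertools import product

def xor(a, b):
    """Coordinate-wise XOR of two bit tuples."""
    return tuple(u ^ v for u, v in zip(a, b))

def distinguishes(basis, x1, x2):
    """True if some parity in span(basis) differs on x1 and x2.

    Testing the basis suffices: if every basis parity agrees on x1 and x2,
    then so does every XOR of basis parities.
    """
    diff = xor(x1, x2)
    return any(sum(diff[i] for i in s) % 2 == 1 for s in basis)

def disagreement(f, x1, x2):
    """Pr over uniform z of f(x1 XOR z) != f(x2 XOR z)."""
    points = list(product([0, 1], repeat=len(x1)))
    return sum(f(xor(x1, z)) != f(xor(x2, z)) for z in points) / len(points)

def maj3(z):
    """Majority of three bits in 0/1 notation."""
    return int(sum(z) >= 2)

x1, x2 = (0, 0, 0), (1, 0, 0)
print(distinguishes([[0]], x1, x2))        # True: the parity x_1 differs
print(distinguishes([[1], [2]], x1, x2))   # False: x1 and x2 agree on bits 2 and 3
print(disagreement(maj3, x1, x2))          # 0.5
```

In this example the rows $f_{x_1}$ and $f_{x_2}$ differ on half of the columns, so a message that cannot tell $x_1$ from $x_2$ forces Bob to err on many inputs.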

25 $\mathfrak{D}^{1,U}_\epsilon$ and Approximate Fourier Dimension
Thm: If $\epsilon_2 > \epsilon_1 > 0$ and $\dim_{\epsilon_1}(f) = \dim_{\epsilon_2}(f) = d - 1$, then $\mathfrak{D}^{1,U}_\delta(f^+) \ge d$, where $\delta = (\epsilon_2 - \epsilon_1)/4$.
$\Pr_{z \sim U(\{-1,1\}^n)}[f_{x_1}(z) \ne f_{x_2}(z)] \ge \epsilon_2 - \epsilon_1$.
Error for fixed $y$ = $\min\left(\Pr_{x \in R}[f_x(y) = 0],\ \Pr_{x \in R}[f_x(y) = 1]\right)$.
Average error for $(x, y) \in R$ = $\Omega(\epsilon_2 - \epsilon_1)$.
[Figure: a "typical rectangle" $R$ with rows $g_{x_1}$, $g_{x_2}$ and columns indexed by $y$.]

26 Application: Random Streams
$x \in \{0,1\}^n$ is generated via a stream of updates.
Each update flips a random coordinate.
Goal: maintain $f(x)$ during the stream (error $\epsilon$).
Question: how much space is necessary?
Answer: $\mathfrak{D}^{1,U}_\epsilon$, and the best algorithm is a linear sketch.
After the first $O(n \log n)$ updates the input $x$ is uniform.
Big open question: is the same true if $x$ is not uniform?
True for VERY LONG (Ω n) streams (via [LNW’14]).
How about short ones?
An answer would follow from our conjecture, if true.

27 Thanks! Questions? Other stuff:
Sketching Linear Threshold Functions: $O\left(\frac{\theta}{m} \log \frac{\theta}{m}\right)$
Resolves a communication conjecture of [MO’09].
Blog post:

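28 Example: Majority
$\mathfrak{D}^{1,U}_{O(1/\sqrt{n})}(Maj_n) \ge n$
Majority function: $Maj_n(z_1, \dots, z_n) \equiv \left[\sum_{i=1}^{n} z_i \ge n/2\right]$.
$\widehat{Maj_n}(S)$ only depends on $|S|$.
$\widehat{Maj_n}(S) = 0$ if $|S|$ is even (for odd $n$, since $Maj_n$ is then an odd function in $\pm 1$ notation).
$W^k(Maj_n) = \sum_{S: |S| = k} \widehat{Maj_n}(S)^2 = \alpha k^{-3/2} \left(1 \pm O(1/k)\right)$ for odd $k$.
$(n-1)$-dimensional subspace with most weight: $A_{n-1} = span(\{1\}, \{2\}, \dots, \{n-1\})$.
$\sum_{S \in A_{n-1}} \widehat{Maj_n}(S)^2 = 1 - \frac{\gamma}{\sqrt{n}} \pm O(n^{-3/2})$.
Set $\epsilon_2 = 1 - O(n^{-3/2})$, $\epsilon_1 = 1 - \frac{\gamma}{\sqrt{n}} + O(n^{-3/2})$.
$\mathfrak{D}^{1,U}_{O(1/\sqrt{n})}(Maj_n) \ge n$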

