Linear sketching over $\mathbb{F}_2$
Grigory Yaroslavtsev (Indiana University, Bloomington), http://grigory.us
with Sampath Kannan (U. Pennsylvania), Elchanan Mossel (MIT) and Swagato Sanyal (NUS)

Linear sketching with parities
- Input: $x \in \{0,1\}^n$.
- Parity = linear function over $\mathbb{F}_2$: $\oplus_{i \in S}\, x_i$, e.g. $x_4 \oplus x_2 \oplus x_{42}$.
- Deterministic linear sketch: a set of $k$ parities: $\ell(x) = \big(\oplus_{i \in S_1} x_i;\ \oplus_{i \in S_2} x_i;\ \dots;\ \oplus_{i \in S_k} x_i\big)$.
- Randomized linear sketch: a distribution over $k$ parities (random $S_1, S_2, \dots, S_k$).
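A minimal Python sketch of these definitions; the helper names and the choice of uniformly random sets are illustrative assumptions, not part of the slides:

```python
import random

def parity(x, S):
    """XOR of the bits of x indexed by S."""
    acc = 0
    for i in S:
        acc ^= x[i]
    return acc

def random_sets(n, k, rng):
    """k uniformly random subsets of {0, ..., n-1}, one per sketch coordinate."""
    return [[i for i in range(n) if rng.random() < 0.5] for _ in range(k)]

def sketch(x, sets):
    """The linear sketch of x: the tuple of its parities on the chosen sets."""
    return tuple(parity(x, S) for S in sets)

rng = random.Random(42)
n, k = 16, 3
sets = random_sets(n, k, rng)
x = [rng.randint(0, 1) for _ in range(n)]
print(sketch(x, sets))   # k bits, far shorter than the n input bits
```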

Linear sketching over $\mathbb{F}_2$
- Given $f(x): \{0,1\}^n \to \{0,1\}$.
- Question: can one recover $f(x)$ from a small ($k \ll n$) linear sketch over $\mathbb{F}_2$?
- Allow randomized computation (99% success):
  - probability is over the choice of the random sets;
  - the sets are known at recovery time;
  - recovery is deterministic (we also consider randomized recovery).

Motivation: Distributed Computing
- Distributed computation among $M$ machines: $x = (x^{(1)}, x^{(2)}, \dots, x^{(M)})$ (more generally $x = \oplus_{i=1}^{M} x^{(i)}$).
- The $M$ machines compute sketches locally: $\ell(x^{(1)}), \dots, \ell(x^{(M)})$.
- They send them to a coordinator, who computes $\ell_i(x) = \ell_i(x^{(1)}) \oplus \dots \oplus \ell_i(x^{(M)})$ (coordinate-wise XORs).
- The coordinator computes $f(x)$ with $kM$ bits of communication.
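A small self-contained simulation of this protocol, assuming the machines share the random sets (e.g. via common randomness); all concrete parameters are illustrative:

```python
import random

rng = random.Random(0)
n, k, M = 16, 4, 3
# k random parity sets, shared by all machines.
sets = [[i for i in range(n) if rng.random() < 0.5] for _ in range(k)]

def sketch(x):
    return [sum(x[i] for i in S) % 2 for S in sets]

machines = [[rng.randint(0, 1) for _ in range(n)] for _ in range(M)]

# The coordinator XORs the M local k-bit sketches coordinate-wise...
combined = [0] * k
for xm in machines:
    combined = [c ^ b for c, b in zip(combined, sketch(xm))]

# ...which, by linearity over F2, equals the sketch of the XOR of the inputs.
x_total = [0] * n
for xm in machines:
    x_total = [a ^ b for a, b in zip(x_total, xm)]
assert combined == sketch(x_total)
```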

Motivation: Streaming
- $x$ is generated through a sequence of updates.
- Updates $i_1, \dots, i_m$: update $i_t$ flips the bit at position $i_t$.
- Example: starting from $x = 0^n$, the updates $(1, 3, 8, 3)$ leave bits 1 and 8 set (bit 3 is flipped twice, returning to 0).
- $\ell(x)$ allows one to recover $f(x)$ with $k$ bits of space.
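A sketch of the streaming update rule under the same assumptions: flipping bit $i_t$ toggles exactly the stored parities whose set contains $i_t$, so $k$ bits of state suffice:

```python
import random

rng = random.Random(1)
n, k = 8, 3
sets = [{i for i in range(n) if rng.random() < 0.5} for _ in range(k)]

state = [0] * k              # the k-bit sketch is all that is stored
for i_t in (1, 3, 8, 3):     # the update stream from the slide (1-indexed)
    for j, S in enumerate(sets):
        if i_t - 1 in S:     # flipping a bit toggles the parities containing it
            state[j] ^= 1

# Sanity check against sketching the final vector directly.
x = [0] * n
for i_t in (1, 3, 8, 3):
    x[i_t - 1] ^= 1
assert state == [sum(x[i] for i in S) % 2 for S in sets]
```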

Deterministic vs. Randomized
- Fact: $f$ has a deterministic sketch if and only if $f = g\big(\oplus_{i \in S_1} x_i;\ \oplus_{i \in S_2} x_i;\ \dots;\ \oplus_{i \in S_k} x_i\big)$; equivalent to "$f$ has Fourier dimension $k$".
- Randomization can help. OR: $f(x) = x_1 \vee \dots \vee x_n$ has "Fourier dimension" $= n$, yet:
  - pick $t = \log(1/\delta)$ random sets $S_1, \dots, S_t$;
  - if there is a $j$ such that $\oplus_{i \in S_j} x_i = 1$, output 1; otherwise output 0;
  - error probability $\delta$.
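A hedged implementation of the OR example: $t = \lceil \log_2(1/\delta) \rceil$ random parities, output 1 iff some parity is 1 (if $x \ne 0$, each uniformly random parity equals 1 with probability exactly 1/2):

```python
import math
import random

def or_estimate(x, delta, rng):
    """Output 1 iff one of t = ceil(log2(1/delta)) random parities is 1.
    If x = 0 every parity is 0; if x != 0 each uniformly random parity
    equals 1 with probability 1/2, so all t miss with prob. <= delta."""
    n = len(x)
    t = math.ceil(math.log2(1 / delta))
    for _ in range(t):
        S = [i for i in range(n) if rng.random() < 0.5]
        if sum(x[i] for i in S) % 2 == 1:
            return 1
    return 0

rng = random.Random(7)
x = [0] * 100
x[42] = 1
print(or_estimate(x, delta=0.01, rng=rng))   # 1 with probability >= 0.99
```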

Fourier Analysis
- $f(x_1, \dots, x_n): \{0,1\}^n \to \{0,1\}$.
- Notation switch: $0 \to 1$, $1 \to -1$, giving $f': \{-1,1\}^n \to \{-1,1\}$.
- Functions as vectors form a vector space: $f: \{-1,1\}^n \to \{-1,1\} \Leftrightarrow f \in \{-1,1\}^{2^n}$.
- Inner product on functions = "correlation": $\langle f, g \rangle = 2^{-n} \sum_{x \in \{-1,1\}^n} f(x) g(x) = \mathbb{E}_{x \sim \{-1,1\}^n}[f(x) g(x)]$.
- $\|f\|_2^2 = \langle f, f \rangle = \mathbb{E}_{x \sim \{-1,1\}^n}[f^2(x)] = 1$ (for Boolean functions only).

"Main Characters" are Parities
- For $S \subseteq [n]$, let the character $\chi_S(x) = \prod_{i \in S} x_i$.
- Fact: every function $f: \{-1,1\}^n \to \{-1,1\}$ is uniquely represented as a multilinear polynomial $f(x_1, \dots, x_n) = \sum_{S \subseteq [n]} \hat{f}(S)\, \chi_S(x)$.
- $\hat{f}(S)$, a.k.a. the Fourier coefficient of $f$ on $S$: $\hat{f}(S) \equiv \langle f, \chi_S \rangle = \mathbb{E}_{x \sim \{-1,1\}^n}[f(x) \chi_S(x)]$.
- $\sum_S \hat{f}(S)^2 = 1$ (Parseval).
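A brute-force computation of these coefficients for a small function, feasible only for tiny $n$ but enough to check Parseval; the function names are illustrative:

```python
from itertools import product

def fourier_coefficients(f, n):
    """All coefficients of f: {-1,1}^n -> {-1,1} by brute force:
    hat f(S) = E_x[f(x) chi_S(x)], where chi_S(x) = prod_{i in S} x_i."""
    points = list(product([-1, 1], repeat=n))
    coeffs = {}
    for S in product([0, 1], repeat=n):      # S as an indicator vector
        total = 0
        for x in points:
            chi = 1
            for i in range(n):
                if S[i]:
                    chi *= x[i]
            total += f(x) * chi
        coeffs[S] = total / 2 ** n
    return coeffs

maj3 = lambda x: 1 if sum(x) > 0 else -1     # Maj_3 in the ±1 convention
c = fourier_coefficients(maj3, 3)
assert abs(sum(v * v for v in c.values()) - 1) < 1e-9   # Parseval
```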

Fourier Dimension
- Fourier sets $S$ ≡ vectors in $\mathbb{F}_2^n$.
- "$f$ has Fourier dimension $k$" = a $k$-dimensional subspace $A_k$ in the Fourier domain carries all the weight: $\sum_{S \in A_k} \hat{f}(S)^2 = 1$, so $f(x_1, \dots, x_n) = \sum_{S \subseteq [n]} \hat{f}(S) \chi_S(x) = \sum_{S \in A_k} \hat{f}(S) \chi_S(x)$.
- Pick a basis $S_1, \dots, S_k$ of $A_k$. Sketch: $\chi_{S_1}(x), \dots, \chi_{S_k}(x)$.
- For every $S \in A_k$ there exists $Z \subseteq [k]$ with $S = \oplus_{i \in Z} S_i$, hence $\chi_S(x) = \prod_{i \in Z} \chi_{S_i}(x)$.
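A toy recovery step under the assumption that $f$ has Fourier dimension $k$: any character in the span of the basis is the product of the stored basis characters. Sets are encoded as $n$-bit masks; all concrete values are illustrative:

```python
from itertools import product

n = 6
basis = [0b000111, 0b011100, 0b101010]     # basis sets S_1, S_2, S_3 as bitmasks
x = [1, -1, -1, 1, -1, 1]                  # input in the ±1 domain

def chi(mask, x):                          # chi_S(x) = prod_{i in S} x_i
    p = 1
    for i in range(n):
        if mask >> i & 1:
            p *= x[i]
    return p

stored = [chi(m, x) for m in basis]        # the k sketched values

S = basis[0] ^ basis[1]                    # an S in the span (here Z = {1, 2})
for Z in product([0, 1], repeat=len(basis)):   # brute-force search for Z
    acc, val = 0, 1
    for z, m, s in zip(Z, basis, stored):
        if z:
            acc ^= m
            val *= s
    if acc == S:
        assert val == chi(S, x)            # chi_S recovered from the sketch alone
        break
```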

Deterministic Sketching and Noise
- Suppose the "noise" has bounded norm: $f$ = $k$-dimensional $\oplus$ "noise".
- Sparse Fourier noise (via [Sanyal'15]): $f$ = $k$-dim. + "Fourier $L_0$-noise", where $\|noise\|_0$ = # of non-zero Fourier coefficients of the noise (a.k.a. "Fourier sparsity").
- Linear sketch size: $k + O(\|noise\|_0^{1/2})$.
- Our work: this can't be improved, even with randomness and even for uniform $x$, e.g. for the "addressing function".

How Randomization Handles Noise
- $L_0$-noise in the original domain (via hashing à la OR): $f$ = $k$-dim. + "$L_0$-noise"; linear sketch size: $k + O(\log \|noise\|_0)$. Optimal (but only existentially, i.e. $\exists f: \dots$).
- $L_1$-noise in the Fourier domain (via [Grolmusz'97]): $f$ = $k$-dim. + "Fourier $L_1$-noise"; linear sketch size: $k + O(\|\widehat{noise}\|_1^2)$.
- Example: $k$-dim. + small decision tree / DNF / etc.

Randomized Sketching: Hardness
- $k$-dimensional affine extractors require $k$: $f$ is an affine extractor for dimension $k$ if its restriction to any $k$-dim. affine subspace takes each of the values 0/1 with probability $\ge 0.1$.
- Example (inner product): $f(x) = \oplus_{i=1}^{n/2}\, x_{2i-1} x_{2i}$.
- Not $\gamma$-concentrated on $k$-dim. Fourier subspaces: for every $k$-dim. Fourier subspace $A$: $\sum_{S \notin A} \hat{f}(S)^2 \ge 1 - \gamma$. Then any $k$-dim. linear sketch makes error $\frac{1-\gamma}{2}$.
- The converse doesn't hold, i.e. concentration is not enough.
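A quick numerical check (with an assumed setup, $n = 4$) that the inner-product function is "spread out": every Fourier coefficient has magnitude $2^{-n/2}$, so no low-dimensional Fourier subspace captures much of the weight:

```python
from itertools import product

n = 4
points = list(product([0, 1], repeat=n))

def ip(x):   # IP(x) = x1*x2 XOR x3*x4, output mapped to the ±1 convention
    return 1 - 2 * ((x[0] & x[1]) ^ (x[2] & x[3]))

mags = []
for S in product([0, 1], repeat=n):
    # chi_S over the 0/1 domain: (-1)^(sum of x_i over i in S)
    c = sum(ip(x) * (-1) ** sum(si * xi for si, xi in zip(S, x))
            for x in points) / 2 ** n
    mags.append(abs(c))
print(sorted(set(mags)))   # a single value: 0.25 = 2^{-n/2}
```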

Randomized Sketching: Hardness
- Not $\gamma$-concentrated on $o(n)$-dim. Fourier subspaces: almost all symmetric functions, i.e. $f(x) = h\big(\sum_i x_i\big)$, if not Fourier-close to a constant or to $\oplus_{i=1}^n x_i$.
- E.g. Majority (not an extractor even for dimension $O(\sqrt{n})$), Tribes (balanced DNF), recursive majority: $Maj^{\circ k} = Maj_3 \circ Maj_3 \circ \dots \circ Maj_3$.

Approximate Fourier Dimension
- Not $\gamma$-concentrated on $k$-dim. Fourier subspaces: for every $k$-dim. Fourier subspace $A$: $\sum_{S \notin A} \hat{f}(S)^2 \ge 1 - \gamma$. Then any $k$-dim. linear sketch makes error $\frac{1}{2}(1 - \gamma)$.
- Definition (Approximate Fourier Dimension): $\dim_\gamma(f)$ = the smallest $d$ such that $f$ is $\gamma$-concentrated on some Fourier subspace $A$ of dimension $d$: $\sum_{S \in A} \hat{f}(S)^2 \ge \gamma$.
- (Figure: the coefficients $\hat{f}(S)$ for $S$ ranging over a subspace spanned by $S_1, S_2, S_3$.)

Sketching over Uniform Distribution + Approximate Fourier Dimension
- Sketching error over the uniform distribution of $x$.
- A $\dim_\epsilon(f)$-dimensional sketch gives error $1 - \epsilon$: fix a $\dim_\epsilon(f)$-dimensional $A$ with $\sum_{S \in A} \hat{f}(S)^2 \ge \epsilon$ and output $g(x) = \mathrm{sign}\big(\sum_{S \in A} \hat{f}(S) \chi_S(x)\big)$; then $\Pr_{x \sim U(\{-1,1\}^n)}[g(x) = f(x)] \ge \epsilon$, i.e. error $1 - \epsilon$.
- We show a basic refinement $\Rightarrow$ error $\frac{1-\epsilon}{2}$: pick $\theta$ from a carefully chosen distribution and output $g_\theta(x) = \mathrm{sign}\big(\sum_{S \in A} \hat{f}(S) \chi_S(x) - \theta\big)$.
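A toy version of the basic estimator for $f = Maj_3$, projecting onto an assumed 2-dimensional Fourier subspace $A$ and outputting the sign (the refinement with the random threshold $\theta$ is omitted):

```python
from itertools import product

n = 3
points = list(product([-1, 1], repeat=n))
f = lambda x: 1 if sum(x) > 0 else -1    # Maj_3 in the ±1 convention

def chi(S, x):                           # character chi_S(x) = prod_{i in S} x_i
    p = 1
    for i in S:
        p *= x[i]
    return p

def fhat(S):                             # brute-force Fourier coefficient
    return sum(f(x) * chi(S, x) for x in points) / len(points)

A = [(), (0,), (1,), (0, 1)]             # the span of {1} and {2} (0-indexed)
eps = sum(fhat(S) ** 2 for S in A)       # concentration of f on A
g = lambda x: 1 if sum(fhat(S) * chi(S, x) for S in A) >= 0 else -1
agree = sum(g(x) == f(x) for x in points) / len(points)
print(eps, agree)                        # here 0.5 and 0.75: agree >= eps
```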

1-way Communication Complexity of XOR-functions
- Shared randomness. Alice: $x \in \{0,1\}^n$; Bob: $y \in \{0,1\}^n$. Alice sends a message $M(x)$; Bob outputs $f^+(x,y) = f(x \oplus y)$.
- Examples: $f(z) = OR_{i=1}^n(z_i)$ $\Rightarrow$ (not-)Equality; $f(z) = (\|z\|_0 > d)$ $\Rightarrow$ Hamming Distance $> d$.
- $R^1_\epsilon(f^+)$ = min $|M|$ such that Bob's error probability is at most $\epsilon$.
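A sketch of the resulting one-way protocol for $f^+ = OR(x \oplus y)$, i.e. (in)equality, using the linearity $\ell(x \oplus y) = \ell(x) \oplus \ell(y)$; parameters are illustrative:

```python
import math
import random

rng = random.Random(3)
n, delta = 64, 0.01
t = math.ceil(math.log2(1 / delta))
# Shared randomness: t random parity sets known to both players.
sets = [[i for i in range(n) if rng.random() < 0.5] for _ in range(t)]
sk = lambda v: [sum(v[i] for i in S) % 2 for S in sets]

x = [rng.randint(0, 1) for _ in range(n)]
y = list(x)
y[17] ^= 1                                   # make x != y
message = sk(x)                              # Alice -> Bob: t bits
combined = [a ^ b for a, b in zip(message, sk(y))]   # = sk(x XOR y), by linearity
print(any(combined))                         # True w.p. >= 1 - delta since x != y
```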

Communication Complexity of XOR-functions
- Well-studied (often for 2-way communication): [Montanaro, Osborne] arXiv'09; [Shi, Zhang] QIC'09; [Tsang, Wong, Xie, Zhang] FOCS'13; [O'Donnell, Wright, Zhao, Sun, Tan] CCC'14; [Hatami, Hosseini, Lovett] FOCS'16.
- Connections to the log-rank conjecture [Lovett'14]: even the special case for XOR-functions is still open.

Deterministic 1-way Communication Complexity of XOR-functions
- Alice: $x \in \{0,1\}^n$; Bob: $y \in \{0,1\}^n$; message $M(x)$; output $f^+(x,y) = f(x \oplus y)$.
- $D^1(f^+)$ = min $|M|$ such that Bob is always correct.
- [Montanaro-Osborne'09]: $D^1(f^+) = D^{lin}(f)$, where $D^{lin}(f)$ = the deterministic linear sketch complexity of $f$.
- $D^1(f^+) = D^{lin}(f)$ = Fourier dimension of $f$.

1-way Communication Complexity of XOR-functions
- Shared randomness. Alice: $x \in \{0,1\}^n$; Bob: $y \in \{0,1\}^n$; message $M(x)$; output $f(x \oplus y)$.
- $R^1_\epsilon(f^+)$ = min $|M|$ such that Bob's error probability is at most $\epsilon$.
- $R^{lin}_\epsilon(f)$ = randomized linear sketch complexity (error $\epsilon$); clearly $R^1_\epsilon(f^+) \le R^{lin}_\epsilon(f)$.
- Question: $R^1_\epsilon(f^+) \approx R^{lin}_\epsilon(f)$?

$R^1_\epsilon(f^+) \approx R^{lin}_\epsilon(f)$?
- Holds for: Majority, Tribes, recursive majority, the addressing function; linear threshold functions; (almost all) symmetric functions.
- Degree-$d$ $\mathbb{F}_2$-polynomials: $R^{lin}_{5\epsilon}(f) = O\big(d \cdot R^1_\epsilon(f^+)\big)$.
- The analogous question for 2-way communication is wide open: [HHL'16] $Q^{\oplus\text{-}dt}_\epsilon(f) = \mathrm{poly}\big(R_\epsilon(f^+)\big)$?

Distributional 1-way Communication under Uniform Distribution
- Alice: $x \sim U(\{0,1\}^n)$; Bob: $y \sim U(\{0,1\}^n)$; message $M(x)$; output $f(x \oplus y)$.
- $R^1_\epsilon(f) = \sup_D \mathfrak{D}^{1,D}_\epsilon(f)$.
- $\mathfrak{D}^{1,U}_\epsilon(f)$ = min $|M|$ such that Bob's error probability is at most $\epsilon$ over the uniform distribution on $(x, y)$.
- Enough to consider deterministic messages only.
- Motivation: streaming/distributed computing with random input.

Sketching over Uniform Distribution
- Thm: If $\dim_\epsilon(f) = d$, then $\mathfrak{D}^{1,U}_{(1-\epsilon)/6}(f^+) \ge \frac{d}{6}$. Optimal up to the error, as a $d$-dim. linear sketch has error $\frac{1-\epsilon}{2}$.
- Weaker: If $\epsilon_2 > \epsilon_1$ and $\dim_{\epsilon_1}(f) = \dim_{\epsilon_2}(f) = d$, then $\mathfrak{D}^{1,U}_\delta(f) \ge d$, where $\delta = (\epsilon_2 - \epsilon_1)/4$.
- Corollary: If $\hat{f}(\emptyset) < C$ for some $C < 1$, then there exists $d$ with $\mathfrak{D}^{1,U}_{\Theta(1/n)}(f) \ge d$. Tight for the Majority function, etc.

$\mathfrak{D}^{1,U}_\epsilon$ and Approximate Fourier Dimension
- Thm: If $\epsilon_2 > \epsilon_1 > 0$ and $\dim_{\epsilon_1}(f) = \dim_{\epsilon_2}(f) = d$, then $\mathfrak{D}^{1,U}_\delta(f) \ge d$, where $\delta = (\epsilon_2 - \epsilon_1)/4$.
- (Figure: the communication matrix with rows $x \in \{0,1\}^n$ and columns $y \in \{0,1\}^n$, entries $f(x \oplus y) = f_x(y)$; the message $M(x)$ partitions the rows, e.g. rows $x_1, x_2, x_3$ into blocks labeled 00, 01, 10, 11.)

$\mathfrak{D}^{1,U}_\epsilon$ and Approximate Fourier Dimension
- If $|M(x)| = d - 1$, the average "rectangle" (set of rows sharing a message) has size $2^{n-d+1}$.
- A subspace $A$ distinguishes $x_1$ and $x_2$ if $\exists S \in A: \chi_S(x_1) \ne \chi_S(x_2)$.
- Lem 1: Fix a $d$-dim. subspace $A_d$: typical $x_1$ and $x_2$ in a typical "rectangle" are distinguished by $A_d$.
- Lem 2: If a $d$-dim. subspace $A_d$ distinguishes $x_1$ and $x_2$, and 1) $f$ is $\epsilon_2$-concentrated on $A_d$, and 2) $f$ is not $\epsilon_1$-concentrated on any $(d-1)$-dim. subspace, then $\Pr_{z \sim U(\{-1,1\}^n)}[f_{x_1}(z) \ne f_{x_2}(z)] \ge \epsilon_2 - \epsilon_1$.

$\mathfrak{D}^{1,U}_\epsilon$ and Approximate Fourier Dimension
- Thm (restated): If $\epsilon_2 > \epsilon_1 > 0$ and $\dim_{\epsilon_1}(f) = \dim_{\epsilon_2}(f) = d$, then $\mathfrak{D}^{1,U}_\delta(f) \ge d$, where $\delta = (\epsilon_2 - \epsilon_1)/4$.
- By Lem 2, $\Pr_{z \sim U(\{-1,1\}^n)}[f_{x_1}(z) \ne f_{x_2}(z)] \ge \epsilon_2 - \epsilon_1$ for typical $x_1, x_2$ in a typical rectangle $R$.
- The error for a fixed $y$ is $\min\big(\Pr_{x \in R}[f_x(y) = 0],\ \Pr_{x \in R}[f_x(y) = 1]\big)$.
- The average error over $(x, y) \in R$ is therefore $\Omega(\epsilon_2 - \epsilon_1)$.

Application: Random Streams
- $x \in \{0,1\}^n$ is generated via a stream of updates; each update flips a random coordinate.
- Goal: maintain $f(x)$ during the stream (error $\epsilon$). Question: how much space is necessary?
- Answer: $\mathfrak{D}^{1,U}_\epsilon$, and the best algorithm is a linear sketch, since after the first $O(n \log n)$ updates the input $x$ is uniform.
- Big open question: is the same true if $x$ is not uniform? True for VERY LONG ($2^{2^{2^{\Omega(n)}}}$) streams (via [LNW'14]). How about short ones? An answer would follow from our conjecture, if true.
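A back-of-envelope check of the mixing claim (per-coordinate bias only; the slide's statement is about the joint distribution, which mixes on the same $O(n \log n)$ timescale):

```python
import math

# Starting from x = 0^n, after m uniformly random single-bit flips each
# coordinate's parity has bias (1 - 2/n)^m / 2 exactly, since a fixed
# coordinate is hit independently with probability 1/n per step.
n = 64
for m in (n, int(n * math.log(n))):
    bias = 0.5 * (1 - 2 / n) ** m
    print(f"m = {m:4d}: per-coordinate bias = {bias:.2e}")
```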

Thanks! Questions?
Other results:
- Sketching linear threshold functions: $O(\theta_m \log \theta_m)$; resolves a communication conjecture of [MO'09].
- Blog post: http://grigory.us/blog/the-binary-sketchman

Example: Majority
- $\mathfrak{D}^{1,U}_{O(1/\sqrt{n})}(Maj_n) \ge n$.
- Majority function: $Maj_n(z_1, \dots, z_n) \equiv \big[\sum_{i=1}^n z_i \ge n/2\big]$.
- $\widehat{Maj_n}(S)$ depends only on $|S|$; $\widehat{Maj_n}(S) = 0$ if $|S|$ is even (majority on odd $n$ is an odd function).
- $W_k[Maj_n] = \sum_{S: |S| = k} \widehat{Maj_n}(S)^2 = \alpha k^{-3/2}\big(1 \pm O(1/k)\big)$.
- The $(n-1)$-dimensional subspace with the most weight: $A_{n-1} = \mathrm{span}\big(\{1\}, \{2\}, \dots, \{n-1\}\big)$, with $\sum_{S \in A_{n-1}} \widehat{Maj_n}(S)^2 = 1 - \frac{\gamma}{\sqrt{n}} \pm O(n^{-3/2})$.
- Set $\epsilon_2 = 1 - O(n^{-3/2})$ and $\epsilon_1 = 1 - \frac{\gamma}{\sqrt{n}} + O(n^{-3/2})$: then $\mathfrak{D}^{1,U}_{O(1/\sqrt{n})}(Maj_n) \ge n$.
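A brute-force check of these Fourier facts for a small odd $n$ (assumed $n = 9$ to keep the enumeration fast): even levels carry no weight and odd-level weights shrink roughly like $k^{-3/2}$:

```python
from itertools import product

n = 9
points = list(product([-1, 1], repeat=n))
maj = lambda x: 1 if sum(x) > 0 else -1   # Maj_n in the ±1 convention

W = [0.0] * (n + 1)                       # W[k] = total weight at level k
for S in product([0, 1], repeat=n):
    c = 0.0
    for x in points:
        chi = 1
        for i in range(n):
            if S[i]:
                chi *= x[i]
        c += maj(x) * chi
    W[sum(S)] += (c / len(points)) ** 2

print([round(w, 4) for w in W])           # even entries 0; odd entries decay
```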