Polynomial Optimization over the Unit Sphere

Vijay Bhattiprolu (CMU), Mrinal Ghosh (TTIC), Venkat Guruswami (CMU), Euiwoong Lee (CMU / Simons), Madhur Tulsiani (TTIC)

Problem
Input: an $n$-variate, degree-$d$, homogeneous polynomial $f(x_1, \dots, x_n) \in \mathbb{R}[x_1, \dots, x_n]$.
Task: maximize $|f(x_1, \dots, x_n)|$ subject to $x_1^2 + \dots + x_n^2 = 1$.
Notation: $\|f\|_2 := \sup_{\|x\|_2 = 1} |f(x)|$.
$d = 2$: spectral norm of a matrix.
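The $d = 2$ case can be checked numerically. The sketch below assumes a symmetric matrix `A` (a hypothetical random instance) defining $f(x) = x^T A x$, and compares the value at the top eigenvector with the spectral norm computed by NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2  # symmetric matrix defining f(x) = x^T A x

# For symmetric A, sup over unit x of |x^T A x| is the largest |eigenvalue|,
# which coincides with the spectral norm of A.
eigvals, eigvecs = np.linalg.eigh(A)
k = np.argmax(np.abs(eigvals))
x_star = eigvecs[:, k]              # unit-norm maximizer
f_norm = abs(x_star @ A @ x_star)   # value of |f| at the maximizer
spectral_norm = np.linalg.norm(A, 2)
```

Here `f_norm` and `spectral_norm` agree up to floating-point error, which is why the quadratic case is considered easy.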

$\|f\|_2$ in TCS
Unique Games and Small Set Expansion ($2 \to 4$ norm) [BBHKSZ12]: when $d \ge 4$ is an even integer, $G$ is a small-set expander iff $\|M\|_{2 \to d}$ is small for some $M = M(G)$.
Quantum computing (Quantum Merlin-Arthur) [ABDFS08, HM13, HNW17, BKS17]: the current best hardness result was proved via this connection.
Tensor decomposition / PCA [AGHKT14, MR14, BKS15, HSS15, HSSS16, MSS16, BGL17, SS17, PS17].
Planted Clique, Densest Sub-Hypergraph, refuting CSPs, etc.

Complexity
[Kozhasov 17] There can be exponentially many critical points when $d = 3$ (only $n$ when $d = 2$).
[Gurvits 03 / Nesterov 03] NP-hard to optimize exactly when $d \ge 3$.
[BBHKSZ12] ETH-hard to approximate within $2^{\log^{1/2 - \varepsilon} n}$ when $d = 4$.

Approximability ($d = O(1)$)
Approximation ratio [KN08, HLZ11, So13, …]: $O(n^{d/2-1})$.
The $O(n/\epsilon)$-degree SoS hierarchy gives a $(1+\epsilon)$-approximation [DW13]. Holds for $\Omega(n)$-degree.

Our Result ($d = O(1)$)
Previous: $O(n^{d/2-1})$-approximation in $n^{O(1)}$ time.
Our result: for $d \le q \le n$, $O(n/q)^{d/2-1}$-approximation in time $n^{O(q)}$. This is a smooth tradeoff between the previous results.
[Figure: approximation ratio versus runtime. Ratio axis: $n^{d/2-1}$, $O(1)$, $1+\epsilon$. Runtime axis: $n^{O(1)}$, $n^{n}$, $n^{O(n/\epsilon)}$.]
Motivation: analyze SoS in the sub-exponential regime ($2^{n^{\varepsilon}}$ runtime) for worst-case problems, which is the regime of interest in many of the aforementioned applications.
Nonnegative coefficients: $O(n/q)^{d/4-1/2}$-approximation in time $n^{O(q)}$. [BKS14] Connection to Small Set Expansion, Densest Sub-Hypergraph.
$m$ nonzero coefficients: $O(\sqrt{m/q})$-approximation in time $n^{O(q)}$.
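To get a feel for the tradeoff, the sketch below tabulates $(n/q)^{d/2-1}$ for a few values of $q$. The helper name `ratio_bound` is hypothetical, and the constant hidden in the $O(\cdot)$ is ignored, so the numbers only indicate the shape of the curve.

```python
def ratio_bound(n, q, d):
    """(n/q)^(d/2 - 1), the stated ratio with the hidden constant dropped."""
    if not d <= q <= n:
        raise ValueError("the result is stated for d <= q <= n")
    return (n / q) ** (d / 2 - 1)

n, d = 1024, 4
# q = d recovers the polynomial-time ratio; q = n gives a constant ratio.
table = {q: ratio_bound(n, q, d) for q in (4, 16, 64, 256, 1024)}
```

Larger $q$ means runtime $n^{O(q)}$ grows while the ratio shrinks, reaching a constant at $q = n$.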

What we will prove
Assume $f$ is a degree-$d$ form.
Stated result: $O(n/q)^{d/2-1}$-approximation in $n^{O(q)}$ time.
What we prove here: $O(n/q)^{d/2}$-approximation in $n^{O(q)}$ time.
At the end, we will briefly see how to get the $-1$ back.

First Step
Goal: $O(n/q)^{d/2}$-approximation in $n^{O(q)}$ time.
Let $q$ be a multiple of $d$ and let $F = f^{q/d}$. Then $F$ is a $q$-form and $\|f\|_2 = \|F\|_2^{d/q}$.
If we have an $O(n/q)^{q/2}$-approximation for $F$, it implies an $\left(O(n/q)^{q/2}\right)^{d/q} = O(n/q)^{d/2}$-approximation for $f$.
New goal: $O(n/q)^{q/2}$-approximation in $n^{O(q)}$ time when $g$ is *any* $q$-form.
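The reduction can be spelled out in two lines. In the sketch below, $\Lambda$ and $\alpha$ are names introduced here for illustration: $\Lambda$ is any computed upper bound on $\|F\|_2$ and $\alpha$ is its approximation factor.

```latex
\[
  \|F\|_2 \;=\; \sup_{\|x\|_2 = 1} |f(x)|^{q/d} \;=\; \|f\|_2^{\,q/d},
\]
\[
  \|F\|_2 \le \Lambda \le \alpha \,\|F\|_2
  \quad\Longrightarrow\quad
  \|f\|_2 \le \Lambda^{d/q} \le \alpha^{d/q}\,\|f\|_2 .
\]
```

Taking $\alpha = O(n/q)^{q/2}$ gives $\alpha^{d/q} = O(n/q)^{d/2}$, which is the factor claimed for $f$.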

Tuples and Monomials
Set of monomials of $n$-variate $q$-forms: $M = \{x_1^4,\ x_1 x_2 x_3 x_4,\ x_1^2 x_3 x_4,\ x_1 x_2^3,\ \dots\}$ (when $q = 4$).
Set of all $q$-tuples: $T = [n]^q$.
Natural many-to-one correspondence from $T$ to $M$: $(i_1, \dots, i_q) \mapsto x_{i_1} \cdots x_{i_q}$.
For example, $(1,1,2,3) \mapsto x_1^2 x_2 x_3$, $(3,1,2,1) \mapsto x_1^2 x_2 x_3$, etc.
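The correspondence is easy to enumerate for small parameters. The sketch below, with the hypothetical helper `tuple_to_exponents`, maps every $q$-tuple to the exponent vector of its monomial and groups tuples that land on the same monomial.

```python
from itertools import product
from collections import defaultdict

def tuple_to_exponents(t, n):
    """Map a q-tuple over {1..n} to the exponent vector of x_{i_1}...x_{i_q}."""
    exps = [0] * n
    for i in t:
        exps[i - 1] += 1
    return tuple(exps)

n, q = 3, 4
preimages = defaultdict(list)  # exponent vector -> all tuples mapping to it
for t in product(range(1, n + 1), repeat=q):
    preimages[tuple_to_exponents(t, n)].append(t)
```

Both `(1, 1, 2, 3)` and `(3, 1, 2, 1)` end up in `preimages[(2, 1, 1)]`, the bucket for $x_1^2 x_2 x_3$.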

Multi-Indices and Monomials
A multi-index $\gamma \in \mathbb{N}^n$ represents a multiset in which the element $i$ appears with multiplicity $\gamma_i$.
$|\gamma| := \sum_{i \in [n]} \gamma_i$ denotes the size of the multiset.
$\mathbb{N}^n_q := \{\gamma \in \mathbb{N}^n : |\gamma| = q\}$.
Correspondence between multi-indices and monomials: $\gamma \mapsto x^\gamma$, where $x^\gamma = \prod_{i \in [n]} x_i^{\gamma_i}$.
$|\gamma|$ is the degree of $x^\gamma$.
$\mathcal{O}(\gamma)$ denotes the set of distinct tuples corresponding to the multi-index $\gamma$.
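The size of $\mathcal{O}(\gamma)$ is the multinomial coefficient $|\gamma|! / (\gamma_1! \cdots \gamma_n!)$, a standard counting fact. The sketch below (helper names `orbit_size` and `orbit` are hypothetical) computes it and cross-checks by brute-force enumeration.

```python
from math import factorial
from itertools import permutations

def orbit_size(gamma):
    """Number of distinct tuples for multi-index gamma: |gamma|! / prod(gamma_i!)."""
    size = factorial(sum(gamma))
    for g in gamma:
        size //= factorial(g)
    return size

def orbit(gamma):
    """Brute-force enumeration of the distinct tuples corresponding to gamma."""
    base = [i + 1 for i, g in enumerate(gamma) for _ in range(g)]
    return set(permutations(base))
```

For $\gamma = (2, 1, 1)$, i.e. $x_1^2 x_2 x_3$, both give 12 tuples.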

Matrix Representations
Given an $n^{q/2} \times n^{q/2}$ matrix $A$:
Each row and each column is indexed by a tuple $(i_1, \dots, i_{q/2})$.
Each entry is indexed by a tuple $(i_1, \dots, i_q)$, obtained by concatenating the row and column indices.
This gives a many-to-one correspondence from entries of $A$ to monomials of $g$.
For a $q$-form $g$, we say $A \sim g$ ($A$ represents $g$) if, for every monomial, (its coefficient in $g$) = (sum of the corresponding entries in $A$).
Equivalently, $g(x_1, \dots, x_n) = (x^{\otimes q/2})^T A \, x^{\otimes q/2}$.
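The two definitions of "represents" can be compared directly. This sketch takes a random matrix `A` (a hypothetical instance with $n = 3$, $q = 4$), sums its entries per monomial to read off the form it represents, and checks the result against $(x \otimes x)^T A (x \otimes x)$.

```python
import numpy as np
from itertools import product
from collections import defaultdict

def form_from_matrix(A, n, q):
    """Coefficients of the q-form represented by A, keyed by exponent vector."""
    index = list(product(range(n), repeat=q // 2))  # row/column labels
    coeffs = defaultdict(float)
    for r, row in enumerate(index):
        for c, col in enumerate(index):
            exps = [0] * n
            for i in row + col:          # concatenated index (i_1, ..., i_q)
                exps[i] += 1
            coeffs[tuple(exps)] += A[r, c]
    return dict(coeffs)

def evaluate(coeffs, x):
    """Evaluate the polynomial given by exponent-vector -> coefficient."""
    return sum(c * np.prod(np.power(x, e)) for e, c in coeffs.items())

n, q = 3, 4
rng = np.random.default_rng(1)
A = rng.standard_normal((n ** (q // 2), n ** (q // 2)))
g = form_from_matrix(A, n, q)
x = rng.standard_normal(n)
xx = np.kron(x, x)                       # x tensor x, valid because q/2 = 2
lhs, rhs = evaluate(g, x), xx @ A @ xx
```

The values `lhs` and `rhs` agree up to rounding, and many different matrices `A` yield the same dictionary `g`.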

[Figure: two $n^2 \times n^2$ matrices for $n = 3$, $q = 4$, with rows and columns indexed by the pairs $(1,1), (1,2), \dots, (3,3)$. Both represent $\left(\sum_i x_i^2\right)^{q/2} = \sum_{i_1, \dots, i_{q/2}} x_{i_1}^2 \cdots x_{i_{q/2}}^2$. First matrix: $\lambda_{\max}(A) = 1$, $\mathrm{Rank}(A) = n$. Second matrix: $\lambda_{\max}(A) = n$, $\mathrm{Rank}(A) = 1$.]

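Our "Relaxation"
Let $A \in \mathbb{R}^{n^{q/2} \times n^{q/2}}$ be a matrix representing $g$.
For any $x \in \mathbb{S}^{n-1}$, $g(x) = (x^{\otimes q/2})^T A \, x^{\otimes q/2} \le \|A\|$, since $x^{\otimes q/2}$ is a unit vector.
Hence $\|g\|_2 \le \|A\|$ for all $A \sim g$.
Relaxation: with variable $B \in \mathbb{R}^{n^{q/2} \times n^{q/2}}$, compute $\inf \|B\|$ subject to $B \sim g$.
The optimal value is called $\|g\|_{sp}$ [BKS14].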
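The choice of representation matters. As a sketch, the two matrices below (chosen here for illustration) both represent $g(x) = (x_1^2 + \dots + x_n^2)^2$, whose value on the unit sphere is 1, yet their spectral norms differ by a factor of $n$.

```python
import numpy as np

n = 3
rng = np.random.default_rng(2)

# Two matrices that both represent g(x) = (x_1^2 + ... + x_n^2)^2 (q = 4).
A_identity = np.eye(n * n)      # entry ((i,j),(i,j)) = 1
v = np.eye(n).reshape(-1)       # indicator vector of the pairs (i,i)
A_rank_one = np.outer(v, v)     # entry ((i,i),(j,j)) = 1

def quad_form(A, x):
    """(x tensor x)^T A (x tensor x)."""
    xx = np.kron(x, x)
    return xx @ A @ xx

x = rng.standard_normal(n)
x /= np.linalg.norm(x)          # random point on the unit sphere
g_x = np.sum(x ** 2) ** 2       # equals 1 on the sphere

norm_identity = np.linalg.norm(A_identity, 2)   # upper bound 1 (tight)
norm_rank_one = np.linalg.norm(A_rank_one, 2)   # upper bound n (loose)
```

The infimum over representations is what makes $\|g\|_{sp}$ a useful bound: here it picks up the tight value rather than the loose one.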

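Strategy
Given a $q$-form $g(x) = \sum_{\gamma \in \mathbb{N}^n_q} g_\gamma \cdot x^\gamma$, we will show:
$$\max_\gamma \frac{|g_\gamma|}{\sqrt{|\mathcal{O}(\gamma)|}} \;\lesssim_q\; \|g\|_2 \;\le\; \|g\|_{sp} \;\lesssim_q\; \max_\gamma \frac{|g_\gamma|}{\sqrt{|\mathcal{O}(\gamma)|}} \cdot \left(\frac{n}{q}\right)^{q/2}$$
Intuition: for a single monomial, $\|g_\gamma x^\gamma\|_2 \approx_q \|g_\gamma x^\gamma\|_{sp} \approx_q \frac{|g_\gamma|}{\sqrt{|\mathcal{O}(\gamma)|}}$, because
$$\sqrt{|\mathcal{O}(\gamma)|} \;\approx_q\; \frac{|\gamma|^{|\gamma|/2}}{\gamma_1^{\gamma_1/2} \cdots \gamma_n^{\gamma_n/2}}.$$
Set $x_i := \sqrt{\gamma_i / |\gamma|}$.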
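The intuition for a single monomial can be checked numerically. The sketch below picks a hypothetical multi-index `gamma`, evaluates $x^\gamma$ at the point $x_i = \sqrt{\gamma_i / |\gamma|}$, and compares it with $1/\sqrt{|\mathcal{O}(\gamma)|}$; the two should agree up to a factor of the form $2^{O(q)}$.

```python
from math import factorial, prod, sqrt

def orbit_size(gamma):
    """|O(gamma)| = |gamma|! / prod(gamma_i!)."""
    size = factorial(sum(gamma))
    for g in gamma:
        size //= factorial(g)
    return size

def balanced_point(gamma):
    """The unit vector x with x_i = sqrt(gamma_i / |gamma|)."""
    q = sum(gamma)
    return [sqrt(g / q) for g in gamma]

def monomial_value(gamma, x):
    """x^gamma = prod_i x_i^{gamma_i}."""
    return prod(xi ** g for xi, g in zip(x, gamma))

gamma = (3, 2, 2, 1)                       # hypothetical example, q = 8
x = balanced_point(gamma)
value = monomial_value(gamma, x)           # x^gamma at the balanced point
estimate = 1 / sqrt(orbit_size(gamma))     # 1 / sqrt(|O(gamma)|)
ratio = value / estimate                   # at most 1, and not too small
```

The ratio never exceeds 1, because the multinomial coefficient is at most $|\gamma|^{|\gamma|} / \prod_i \gamma_i^{\gamma_i}$.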

Detour: Method of moments for $\|f\|_2$
Consider any $d$-form $f$, and let $F = f^{n/d}$. Our result implies:
$$\|f\|_2 \;\approx_d\; \max_\gamma \left(\frac{|F_\gamma|}{\sqrt{|\mathcal{O}(\gamma)|}}\right)^{d/n}$$
This is similar to the method of trace moments for random matrices, albeit involving the estimation of much higher-degree objects.
It can be used as a generic tool to estimate $\|\cdot\|_2$ of random polynomial ensembles.

When $g$ is multilinear
Can get an $O(n/q)^{q/2}$-approximation.
Let $B$ be the unique supersymmetric matrix representing $g$: $B[i_1, \dots, i_q] = c(x_{i_1} \cdots x_{i_q}) / q!$, where $c(\cdot)$ is the coefficient of the monomial in $g$.
Upper bound: by the Gershgorin disk theorem, $\|g\|_{sp} \le \|B\| \le n^{q/2} \cdot \text{max-entry}$ ($B$ has $n^{q/2}$ rows and $n^{q/2}$ columns).
Lower bound, first attempt: consider $y^T B z$ for $y, z \in \mathbb{S}^{n^{q/2}-1}$ with $y \otimes z = e_{i_1} \otimes \dots \otimes e_{i_q}$. This gives $\|g\|_2 \gtrsim \text{max-entry}$.
Lower bound, second attempt: consider $(z^{\otimes q/2})^T B \, z^{\otimes q/2}$ where $z = (e_{i_1} + \dots + e_{i_q}) / \sqrt{q}$. This gives $\|g\|_2 \gtrsim_q \text{max-entry} \cdot q^{q/2}$.
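Both bounds can be observed on a small instance. The sketch below builds a hypothetical random multilinear 4-form in 5 variables, forms its supersymmetric representation `B`, checks the Gershgorin-style upper bound, and evaluates $g$ at $z = (e_{i_1} + \dots + e_{i_q}) / \sqrt{q}$ for the largest coefficient.

```python
import numpy as np
from itertools import combinations, permutations, product
from math import factorial, sqrt

n, q = 5, 4
half = q // 2
rng = np.random.default_rng(3)

# Hypothetical multilinear q-form: one random coefficient per q-subset.
coeffs = {S: rng.standard_normal() for S in combinations(range(n), q)}

# Supersymmetric representation: spread each coefficient over its q! tuples.
index = {t: k for k, t in enumerate(product(range(n), repeat=half))}
B = np.zeros((n ** half, n ** half))
for S, c in coeffs.items():
    for t in permutations(S):
        B[index[t[:half]], index[t[half:]]] = c / factorial(q)

max_entry = np.abs(B).max()
spectral = np.linalg.norm(B, 2)
gershgorin = n ** half * max_entry   # each row has n^{q/2} entries

# Lower bound: z is supported on the variables of the largest coefficient,
# so only that one multilinear monomial is nonzero at z.
S_best = max(coeffs, key=lambda S: abs(coeffs[S]))
z = np.zeros(n)
z[list(S_best)] = 1 / sqrt(q)
zz = np.kron(z, z)                   # z tensor z, valid because q/2 = 2
g_z = zz @ B @ zz
```

Here `abs(g_z)` equals `max_entry * q! / q**(q/2)`, which is `max_entry * q**(q/2)` up to a $2^{O(q)}$ factor.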

Non-multilinear $g$
For general $g$, the idea is to "decompose" $g$ into multilinear parts.
Write $g$ uniquely as $\sum_\alpha x^{2\alpha} \cdot G_{2\alpha}(x)$, where $\deg(x^\alpha) \le q/2$.
For each monomial, take out the "maximally squared" part.
If $g = x_1^2 x_2 x_3 + x_1^2 x_2^2$, then $G_{x_1^2} = x_2 x_3$ and $G_{x_1^2 x_2^2} = 1$.
$G_{2\alpha}$ is a homogeneous multilinear polynomial of degree $q - 2|\alpha|$.
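The decomposition is mechanical: halve each exponent (rounding down) to get $\alpha$, and keep the parity as the multilinear remainder. A minimal sketch, assuming polynomials are stored as dictionaries from exponent vectors to coefficients (a representation chosen here, with the hypothetical helper `multilinear_decomposition`):

```python
def multilinear_decomposition(g):
    """Split g as sum_alpha x^{2 alpha} * G_{2 alpha}.

    g maps exponent vectors to coefficients. The result maps each alpha to
    G_{2 alpha}, itself a dict keyed by 0/1 exponent vectors.
    """
    parts = {}
    for gamma, coeff in g.items():
        alpha = tuple(e // 2 for e in gamma)  # maximally squared part
        beta = tuple(e % 2 for e in gamma)    # multilinear remainder
        parts.setdefault(alpha, {})[beta] = coeff
    return parts

# The slide's example: g = x1^2 x2 x3 + x1^2 x2^2 in n = 3 variables.
g = {(2, 1, 1): 1.0, (2, 2, 0): 1.0}
parts = multilinear_decomposition(g)
```

The result has `parts[(1, 0, 0)]` equal to $x_2 x_3$ and `parts[(1, 1, 0)]` equal to the constant 1, matching the example above.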

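Non-multilinear $g$
Goal: $O(n/q)^{q/2}$-approximation.
We know that for every $\alpha$,
$$\max_\beta \frac{|(G_{2\alpha})_\beta|}{\sqrt{|\mathcal{O}(\beta)|}} \;\lesssim_q\; \|G_{2\alpha}\|_2 \;\le\; \|G_{2\alpha}\|_{sp} \;\lesssim_q\; \max_\beta \frac{|(G_{2\alpha})_\beta|}{\sqrt{|\mathcal{O}(\beta)|}} \cdot \left(\frac{n}{q - 2|\alpha|}\right)^{q/2 - |\alpha|}$$
Strategy: show that
$$\frac{\|g\|_{sp}}{\|g\|_2} \;\le\; \max_\alpha \frac{\|G_{2\alpha}\|_{sp}}{\|G_{2\alpha}\|_2} \;\le\; \max_{0 \le t \le q/2} \left(\frac{O(n)}{q - 2t}\right)^{q/2 - t} \;\le\; \left(\frac{O(n)}{q}\right)^{q/2}$$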

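Multilinear Decomposition Inequality
New goal: $\dfrac{\|g\|_{sp}}{\|g\|_2} \le \max_\alpha \dfrac{\|G_{2\alpha}\|_{sp}}{\|G_{2\alpha}\|_2}$.
We will show:
$$\|g\|_{sp} \;\lesssim_q\; \max_\alpha \frac{\|G_{2\alpha}\|_{sp}}{|\mathcal{O}(\alpha)|}, \qquad \|g\|_2 \;\gtrsim_q\; \max_\alpha \frac{\|G_{2\alpha}\|_2}{|\mathcal{O}(\alpha)|}$$
Which yields:
$$\max_\gamma \frac{|g_\gamma|}{\sqrt{|\mathcal{O}(\gamma)|}} \;\lesssim_q\; \|g\|_2 \;\le\; \|g\|_{sp} \;\lesssim_q\; \max_\gamma \frac{|g_\gamma|}{\sqrt{|\mathcal{O}(\gamma)|}} \cdot \left(\frac{n}{q}\right)^{q/2}$$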

Finding a good Matrix Representation
Given $g = \sum_\alpha x^{2\alpha} G_{2\alpha}(x)$, write $g = \sum_{i=0}^{q/2} h_i$, where $h_i := \sum_{|\alpha| = i} x^{2\alpha} G_{2\alpha}(x)$.
In particular, $h_0 = G_{(0, \dots, 0)}$ and $h_1 = \sum_{i=1}^{n} x_i^2 \cdot G_{x_i^2}$.
Our $B = B_0 + \dots + B_{q/2}$, where $B_i \sim h_i$.
How to define $B_i$? For $i = 0$, $h_0$ is just the degree-$q$ multilinear part of $g$. Let $B_0$ be the unique supersymmetric representation of $h_0$.

Finding good $B_1$
$h_1 = \sum_{i=1}^{n} x_i^2 \cdot G_{x_i^2}$, where each $G_{x_i^2}$ has degree $q - 2$.
Divide the rows and columns into $n$ blocks, depending on their first coordinate.
Fill the best representation of $x_i^2 \cdot G_{x_i^2}$ into the $i$-th diagonal block.
$\|B_1\|$ is at most the maximum of $\|\cdot\|$ over the diagonal blocks.
[Figure: $n = 3$, $q = 4$; a $9 \times 9$ matrix with rows and columns indexed by $(1,1), (1,2), \dots, (3,3)$, whose three diagonal blocks hold $x_1^2 \cdot G_{x_1^2}$, $x_2^2 \cdot G_{x_2^2}$ and $x_3^2 \cdot G_{x_3^2}$.]
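The reason block-diagonal structure helps is that the spectral norm of a block-diagonal matrix is the largest spectral norm among its blocks. The sketch below illustrates this for $n = 3$, $q = 4$, with random symmetric blocks standing in (as an assumption) for the per-block representations.

```python
import numpy as np

rng = np.random.default_rng(4)
n, block = 3, 3   # n blocks, each of size n^{q/2 - 1} = 3 when q = 4

# Random symmetric placeholders for the diagonal blocks.
blocks = []
for _ in range(n):
    M = rng.standard_normal((block, block))
    blocks.append((M + M.T) / 2)

# Assemble B_1: block k occupies the rows/columns whose first coordinate is k.
B1 = np.zeros((n * block, n * block))
for k, blk in enumerate(blocks):
    B1[k * block:(k + 1) * block, k * block:(k + 1) * block] = blk

norm_B1 = np.linalg.norm(B1, 2)
max_block_norm = max(np.linalg.norm(blk, 2) for blk in blocks)
```

`norm_B1` matches `max_block_norm`, so bounding each block separately bounds the whole of $B_1$.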

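For each $x^\alpha$, $G_{2\alpha}$ is multilinear.
Put each $x^{2\alpha} G_{2\alpha}$ into a diagonal block.
Since each $B_i$ is block-diagonal and we add $(q/2 + 1)$ matrices, $\|B\| \le \left(\frac{q}{2} + 1\right) \cdot \max_\alpha \|G_{2\alpha}\|_{sp}$.
(Can do better by spreading $G_{2\alpha}$ amongst $|\mathcal{O}(\alpha)|$ diagonal blocks.)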
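The bound on $\|B\|$ is a triangle inequality followed by the block-diagonal norm identity. The sketch below assumes, as the slide suggests, that the diagonal block for $\alpha$ can be filled so that its norm equals $\|G_{2\alpha}\|_{sp}$.

```latex
\[
  \|B\| \;=\; \Big\| \sum_{i=0}^{q/2} B_i \Big\|
  \;\le\; \sum_{i=0}^{q/2} \|B_i\|
  \;\le\; \Big(\frac{q}{2} + 1\Big) \max_{0 \le i \le q/2} \|B_i\|,
\]
\[
  \|B_i\| \;=\; \max_{|\alpha| = i} \|\text{block}_\alpha\|
  \;=\; \max_{|\alpha| = i} \|G_{2\alpha}\|_{sp}
  \quad \text{(under the assumption above).}
\]
```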

Saving 1 in Exponent
Assume $f$ has degree $d$.
We saw an $O(n/q)^{d/2}$-approximation in $n^{O(q)}$ time.
To get an $O(n/q)^{d/2-1}$-approximation in $n^{O(q)}$ time, use the fact that $d = 2$ is easy.
Treat quadratic polynomials as "coefficients" and interpret $f$ as a polynomial of degree $d - 2$.
Reprove all previous claims for polynomials with coefficients in a Banach space.

Conclusion
Open problem: what is the tight approximation ratio for general polynomials?
$O(1)$-degree Sum-of-Squares: lower bound $n^{d/4 - 0.5}$ [HKPRSS'17] (random polynomial); upper bound $n^{d/2 - 1}$. When $d = 4$, this is $\sqrt{n}$ vs $n$.
Computational hardness: no APX-hardness without ETH.

Thank you!