Matrix & Operator scaling and their many applications

Slides:



Advertisements
Similar presentations
Quantum Lower Bounds The Polynomial and Adversary Methods Scott Aaronson September 14, 2001 Prelim Exam Talk.
Advertisements

Secure Linear Algebra against Covert or Unbounded Adversaries Payman Mohassel and Enav Weinreb UC Davis CWI.
C&O 355 Mathematical Programming Fall 2010 Lecture 9
Primitives Behaviour at infinity HZ 2.2 Projective DLT alg Invariants
Matrix Algebra Matrix algebra is a means of expressing large numbers of calculations made upon ordered sets of numbers. Often referred to as Linear Algebra.
Rank Bounds for Design Matrices and Applications Shubhangi Saraf Rutgers University Based on joint works with Albert Ai, Zeev Dvir, Avi Wigderson.
Notation Intro. Number Theory Online Cryptography Course Dan Boneh
1.5 Elementary Matrices and a Method for Finding
Matrix sparsification (for rank and determinant computations) Raphael Yuster University of Haifa.
Non-commutative computation with division Avi Wigderson IAS, Princeton Pavel Hrubes U. Washington.
Lecture 6 Matrix Operations and Gaussian Elimination for Solving Linear Systems Shang-Hua Teng.
3D Geometry for Computer Graphics
Totally Unimodular Matrices Lecture 11: Feb 23 Simplex Algorithm Elliposid Algorithm.
Introduction to Linear and Integer Programming Lecture 7: Feb 1.
Arithmetic Hardness vs. Randomness Valentine Kabanets SFU.
3D Geometry for Computer Graphics
Query Lower Bounds for Matroids via Group Representations Nick Harvey.
ECIV 301 Programming & Graphics Numerical Methods for Engineers REVIEW II.
Pablo A. Parrilo ETH Zürich Semialgebraic Relaxations and Semidefinite Programs Pablo A. Parrilo ETH Zürich control.ee.ethz.ch/~parrilo.
C&O 355 Mathematical Programming Fall 2010 Lecture 17 N. Harvey TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AA A.
Signal-Space Analysis ENSC 428 – Spring 2008 Reference: Lecture 10 of Gallager.
1 February 24 Matrices 3.2 Matrices; Row reduction Standard form of a set of linear equations: Chapter 3 Linear Algebra Matrix of coefficients: Augmented.
Recap of linear algebra: vectors, matrices, transformations, … Background knowledge for 3DM Marc van Kreveld.
Mathematics for Computer Graphics (Appendix A) Won-Ki Jeong.
Elementary Operations of Matrix
Systems of Linear Equation and Matrices
Modern Navigation Thomas Herring
Zeev Dvir Weizmann Institute of Science Amir Shpilka Technion Locally decodable codes with 2 queries and polynomial identity testing for depth 3 circuits.
Happy 60 th B’day Nati. Nati’s long view Avi Wigderson Institute for Advanced Study 2013 was a good year…
The sum-product theorem and applications Avi Wigderson School of Mathematics Institute for Advanced Study.
Review of Linear Algebra Optimization 1/16/08 Recitation Joseph Bradley.
1. Systems of Linear Equations and Matrices (8 Lectures) 1.1 Introduction to Systems of Linear Equations 1.2 Gaussian Elimination 1.3 Matrices and Matrix.
Happy 80th B’day Dick.
Ankit Garg Princeton Univ. Joint work with Leonid Gurvits Rafael Oliveira CCNY Princeton Univ. Avi Wigderson IAS Noncommutative rational identity testing.
Happy 60 th B’day Noga. Elementary problems encoding computational hardness Avi Wigderson IAS, Princeton or Some problems Noga never solved.
The sum-product theorem and applications Avi Wigderson School of Mathematics Institute for Advanced Study.
2 - 1 Chapter 2A Matrices 2A.1 Definition, and Operations of Matrices: 1 Sums and Scalar Products; 2 Matrix Multiplication 2A.2 Properties of Matrix Operations;
Linear Algebra Engineering Mathematics-I. Linear Systems in Two Unknowns Engineering Mathematics-I.
Lower Bounds on Extended Formulations Noah Fleming University of Toronto Supervised by Toniann Pitassi.
Mathematics-I J.Baskar Babujee Department of Mathematics
Lap Chi Lau we will only use slides 4 to 19
Lecture 2 Matrices Lat Time - Course Overview
Topics in Algorithms Lap Chi Lau.
Polynomial Norms Amir Ali Ahmadi (Princeton University) Georgina Hall
Amir Ali Ahmadi (Princeton University)
Umans Complexity Theory Lectures
Proving Algebraic Identities
Proving Algebraic Identities
Algorithms and Complexity
Systems of First Order Linear Equations
Proving Analytic Inequalities
Proving Analytic Inequalities
GROUPS & THEIR REPRESENTATIONS: a card shuffling approach
Polynomial Optimization over the Unit Sphere
Hiroshi Hirai University of Tokyo
Matrix Martingales in Randomized Numerical Linear Algebra
Maximum vanishing subspace problem, CAT(0)-space relaxation, and block-triangularization of partitioned matrix Hiroshi Hirai University of Tokyo
Vijay V. Vazirani Georgia Tech
Chapter 2 Determinants Basil Hamed
Richard Anderson Lecture 29 Complexity Theory
Chapter 3 Linear Algebra
Maths for Signals and Systems Linear Algebra in Engineering Lectures 13 – 14, Tuesday 8th November 2016 DR TANIA STATHAKI READER (ASSOCIATE PROFFESOR)
Elementary Linear Algebra
Basics of Linear Algebra
Panorama of scaling problems and algorithms
3.IV. Change of Basis 3.IV.1. Changing Representations of Vectors
PIT Questions in Invariant Theory
Much Faster Algorithms for Matrix Scaling
Presentation transcript:

Matrix & Operator scaling and their many applications Avi Wigderson IAS, Princeton

Matrix scaling applications Numerical analysis Making linear systems numerically stable Signal processing CAT scans,… Approximating the permanent Deterministically! Combinatorial geometry Incidence theorems

Operator scaling applications Non-commutative Algebra Word problem in free skew fields Invariant Theory Nullcone membership for Left-Right action Quantum Information Theory Rank decrease of completely positive operators Analysis Brascamp-Lieb inequalities Computational complexity (?) Rank of symbolic matrices, lower bounds Optimization (?) Solving certain general families of - Quadratic systems of equations - Exponentially large linear programs

Perfect Matchings (PMs) 1 V AG U Bipartite graphs G(U,V;E). |U|=|V|=n U V G U V G’ Fact: G has a PM iff Per(AG)>0 Pern(A)=Sn i[n] Xi(i) [Hall’35] G has a PM iff G has no axb minor with a+b>n [Jacobi‘90] PM  P (P = polynomial time)

PMs & symbolic matrices [Edmonds‘67] G U V 1 AG x31 x21 x11 x12 x13 AG(X) Of course, this is in PPM [Edmonds ‘67] G has a PM iff Det(AG(X))  0 ( P)

Symbolic matrices [Edmonds‘67] L(X) X = {x1,x2,… } F field (F=Q) Lij(X) = ax1+bx2+… :linear forms SINGULAR: Is Det(L(X)) = 0 ? [Edmonds ‘67] SINGULAR  P ?? [Lovasz ‘79] SINGULAR  RP (random alg.) [Valiant ‘79] SING  Word Prob for arithmetic formulas [Kabanets-Impagliazzo‘01] SING  P  circuit lower bds SINGULAR: max rank matrix in a linear space of matrices Special cases: Module isomorphism, graph rigidity,… Of course, this is in PPM

Symbolic matrices dual life X = {x1,x2,… xm} F field L(X) = A1x1+A2x2+…+Amxm Input: A1,A2,…,Am  Mn(F) SING: Is L(X) singular? xi commute in F(x1,x2,…,xm) [Edmonds ’67] SING  P? [Lovasz ’79] SING  RP! L31 L32 L33 L21 L22 L23 L11 L12 L13 xi do not commute in F<(x1,x2,…,xm)> (free skew field) [Cohn’75] NC-SING decidable! [CR’99] NC-SING  EXP! [GGOW’15] NC-SING  P! (F=Q) [IQS’16] NC-SING P! (F large) Of course, this is in PPM

Matrix scaling algorithm [Sinkhorn’64,LSW’01,GY’03] A non-negative matrix. Try making it doubly stochastic. (e.g. the adjacency matrix A=AG of a bipartite graph G) Allowed: Multiply rows & columns by scalars Find (if exists?) R,C diagonal s.t. RAC has row-sums & col-sums =1  1 1

Scaling algorithm A non-negative matrix. Try making it doubly stochastic. (e.g. the adjacency matrix A=AG of a bipartite graph G) 1 1/3

Scaling algorithm A non-negative matrix. Try making it doubly stochastic. (e.g. the adjacency matrix A=AG of a bipartite graph G) 3/7 1/7 1

Scaling algorithm A non-negative matrix. Try making it doubly stochastic. (e.g. the adjacency matrix A=AG of a bipartite graph G) 1 1/15 7/15

Scaling algorithm A non-negative matrix. Try making it doubly stochastic. (e.g. the adjacency matrix A=AG of a bipartite graph G) 1/2 1

Scaling algorithm [LSW ‘01] A non-negative matrix. Try making it doubly stochastic. (e.g. the adjacency matrix A=AG of a bipartite graph G) R(A) = diag(row sums)-1 C(A) = diag(column sums)-1 Repeat n3 times: Normalize rows A  R(A)A Normalize cols A  A C(A) 1

Scaling algorithm [LSW ‘01] A non-negative matrix. Try making it doubly stochastic. (e.g. the adjacency matrix A=AG of a bipartite graph G) R(A) = diag(row sums)-1 C(A) = diag(column sums)-1 Repeat n3 times: Normalize rows A  R(A)A Normalize cols A  A C(A) 1 1/2 1/3

Scaling algorithm [LSW ‘01] A non-negative matrix. Try making it doubly stochastic. (e.g. the adjacency matrix A=AG of a bipartite graph G) R(A) = diag(row sums)-1 C(A) = diag(column sums)-1 Repeat n3 times: Normalize rows A  R(A)A Normalize cols A  A C(A) 6/11 3/11 3/5 2/11 2/5 1

Scaling algorithm [LSW ‘01] A non-negative matrix. Try making it doubly stochastic. (e.g. the adjacency matrix A=AG of a bipartite graph G) R(A) = diag(row sums)-1 C(A) = diag(column sums)-1 Repeat n3 times: Normalize rows A  R(A)A Normalize cols A  A C(A) 1 15/48 33/48 10/87 22/87 55/87

Scaling algorithm [LSW ‘01] A non-negative (0,1) matrix. “Make it doubly stochastic” (e.g. the adjacency matrix A=AG of a bipartite graph G) R(A) = diag(row sums)-1 C(A) = diag(column sums)-1 Repeat n3 times: Normalize rows A  R(A)A Normalize cols A  A C(A) (C(A)=I) Test if R(A) I (up to 1/n) Yes: Per(A) > 0. No: Per(A) = 0. Analysis: - Initially Per(A) > exp(-n) (easy) - Per(A) grows by (1+1/n) (AMGM) - Per(A) ≤1 - |R(A)-I|< 1/n  Per(A) >0 (Cauchy-Schwarz) 1

Operator scaling algorithm [Gurvits ’04, GGOW’15] L=(A1,A2,…,Am). Try making it “doubly stochastic” iAiAit=I iAitAi=I. Allowed: L RLC, R,C invertible R(L) = (iAiAit)-1/2 C(L) = (iAitAi)-1/2 Repeat nc times: Normalize rows L  R(L)L Normalize cols L  L C(L) Test if C(L) I (up to 1/n) Yes: L NC-nonsingular No: L NC-singular Analysis: Capacity(L) > exp(-n) [G’04, GGOW’15] Cap(A) grows by (1+1/n) (AMGM) Cap(A) ≤ 1 |R(L)-I|< 1/n  Cap(A) 1 (Cauchy-Schwarz) Invertible [GGOW’15] - NC-rank  P - Computes Cap(L) & scaling factors - Alg continuous in L

Origins & Applications Non-commutative Algebra (Commutative) Invariant Theory Quantum Information Theory ------ Analysis Brascamp-Lieb inequalities

Non-commutative algebra Word problem for free skew fields X = {x1,x2,…} non-commutative, F (commutative) field F<X> polynomials, e.g. p(X) = 1+ xy+ yx F<(X)> rational expressions, e.g. r(X) = x-1 + y-1 r(X) = (x + zy-1w)-1 [Reutenauer’96] Nested inversion ∞ r(X) = (x + xy-1x)-1 = (x + y)-1 - x-1 Hua’s identity r(X) = 0 ? Word Problem [Amitsur’66] r(x1,x2,…) = 0  det r(D1,D2,…) =0 d, Di  Md(F) [Cohn’71] r  A1,A2,…,Am  Mn(F) m,n ≤ |r| such that r(x1,x2,…,xm)=0  L(X)= A1x1+A2x2+…+Amxm SINGULAR  det(i AiDi) = 0 d, Di  Md(F) [FR’04] Crank(L) ≤ NCrank(L) ≤ 2Crank(L) not pq-1 or p-1q In P Derksen-Weyman, Domokos-Zubkov, Schofield-Van-der-berg In P

Invariant theory Left-Right action Invariant ring G acts on V=Fk , and so on F[z1,z2,…,zk] VG = { p  F[z] : p(gZ) = p(Z) for all g  G } Ex1: G=Sn acts on V=Fn by permuting coordinates VG = < elementary symmetric polynomials > Ex2: G=SLn(F)2 acts on V=Mn(F) by Z RZC VG = < det(Z) > Left-Right action: L=(A1,A2,…,Am)  Mn(F)m = V. G=GLn(F)2 acts on V: L RLC =(RA1C,RA2C,…,RAmC) Derksen-Weyman, Domokos-Zubkov, Schofield-Van-der-berg Quiver representations A1 A2 A3 A4 A5 Fn Fn

Invariant theory Left-Right action L=(A1,A2,…,Am)  Mn(F)m = Fmn =V G=GLn(F)2 acts on V: RLC =(RA1C,RA2C,…,RAmC) Polynomial (semi-) invariants: (Zi)jk mn2 (commuting) vars VG = {p polynomial : p(RZC) = p(Z) for all R,C  SLn(F) } [DW,DZ,SV’00] VG = < det(i ZiDi) : d N, Di dd > Nullcone: Given L, does p(L)=0 for all pVG ? Orbit: Given L, L’ does p(L)=p(L’) for all pVG ? In RP Degree bounds [Hilbert’90] d< ∞ VG finitely generated [Popov’81] d< exp(exp(n)) [Derksen’01] d< exp(n)  [GGOW’15] (capacity analysis) [DM’15] d< poly(n)  [IQS’16] (combinatorial alg.) 2 In P Derksen-Weyman, Domokos-Zubkov, Schofield-Van-der-berg

Quantum information theory Completely positive maps [Gurvits ’04] L=(A1,A2,…,Am) Ai Mn(C) completely positive map: L(P)=iAiPAit (P psd  L(P) psd) L rank-decreasing if exists P psd s.t. rk(L(P)) < rk(P) [Cohn’71] L rank-decreasing  L NC-singular. L doubly stochastic if iAiAit=I iAitAi=I Capacity(L) = inf { det(L(P)) / det(P) : P psd } [Gurvits] L rank-decreasing  Cap(L) = 0 L doubly stochastic  Cap(L) = 1 L’ = RLC  Cap(L’) = Cap(L) det(R)2 det(C)2 Capacity tensorizes: Cap(A1D1, …,AmDm)=… [GGOW’15] Use degree bounds in capacity analysis Quantum (noise) operator “Hall Condition” breaking entanglements In P Quasi- convex

Analysis (and beyond!) Brascamp-Lieb Inequalities [BL’76,Lieb’90] B = (B1,B2,…,Bm) Bj:RnRnj p = (pj,p2,…,pm) pj ≥0 ∫xRn (∏j fj(Bj(x)))pj ≤ C ∏j (∫xjRnj fj(xj))pj fixed C  [0,∞) f = (f1,f2,…,fm) fj:Rnj R integrable Cauchy-Schwarz,Holder Precopa-Leindler Loomis-Whitney Nelson Hypercontractive Young’s convolution Brunn-Minkowski Bennett-Nez Nonlinear BL quantitative Helly Lieb’s Non-commutative BL Barthe Reverse BL Bourgain et al Multilinear estimates, decoupling, Kakeya, restriction, Weyl sums, maximal functions,… (B,p): BL data Derksen-Weyman, Domokos-Zubkov, Schofield-Van-der-berg

Quiver representations B= (B1,B2,…,Bm) Bj:RnRnj, p=(pj,p2,…,pm) pj=cj/d integers ∫xRn (∏j fj(Bj(x)))pj ≤ C ∏j (∫xjRnj fj(xj))pj (*) fixed C  [0,∞) f = (f1,f2,…,fm) fj:Rnj R integrable BL(B,p)= inf C in (*). When is BL(B,p)<∞? What is it? [BCCT’07] BL(B,p)<∞  p PB (the BL-polytope of B): (1) n= j pjnj, (2) For every V≤Rn dim(V) ≤ j pj dim(BjV) [Lieb’90] BL(B,p)= inf {(∏j det Mj)/(det j pj BjtMjBj): Mj>0} [GGOW’16] Efficient reduction BL  Operator Scaling: Computes BL(B,p) & optimizes over PB in poly(n,m,d) Directly thms ↑ & continuity of BL(B,p) in B [BBFL’15] Exp LP Derksen-Weyman, Domokos-Zubkov, Schofield-Van-der-berg Quasi- convex Quiver representations

B= (B1,B2,…,Bm) Bj:RnRnj, b: bit length p=(pj,p2,…,pm) pj=cj/d integers BL(B,p)= inf {(∏j det Mj)/(det j pj BjtMjBj): Mj>0} PB: (1) n=j pjnj, (2) For every V≤Rn dim(V) ≤ j pj dim(BjV) [GGOW’16] Efficient reduction BL  Operator Scaling: Computes BL(B,p) in poly(n,m,b,d) Separation oracle for PB in poly(n,m,b,d) BL(B,p) < exp(n,m,b,d) |B-B’|<δ  |BL(B,p)/BL(B’,p) -1|< δ.exp(n,m,b,d) A1 A2 A3 A4 A5 B1 B2 B3 Quiver reduction Derksen-Weyman, Domokos-Zubkov, Schofield-Van-der-berg

Conclusions & Open Problems Cross-cutting notions - Existence vs. algorithms vs. efficient algorithms - Symbolic matrices and their ranks - Operator scaling Open Problems - Compute or (1+ε)-approximate C-rk(L) in P - New applications of Op scaling to optimization? - Convex formulation of Capacity? - Does the Permanent have polynomial size formulae? - Does the Determinant have polynomial size formulae?