Matrix Concentration. Nick Harvey, University of British Columbia.


1 Matrix Concentration. Nick Harvey, University of British Columbia.

2 The Problem. Given random $n \times n$ symmetric matrices $Y_1,\dots,Y_k$, show that $\sum_i Y_i$ is probably “close” to $\mathrm{E}[\sum_i Y_i]$. Why? It is a matrix generalization of the Chernoff bound. There has been much research on the eigenvalues of a random matrix with independent entries; this is more general.

3 Chernoff/Hoeffding Bound. Theorem: Let $Y_1,\dots,Y_k$ be independent random scalars in $[0,R]$. Let $Y = \sum_i Y_i$. Suppose that $\mu_L \le \mathrm{E}[Y] \le \mu_U$. Then $\Pr[Y \ge (1+\epsilon)\mu_U] \le \left(\frac{e^{\epsilon}}{(1+\epsilon)^{1+\epsilon}}\right)^{\mu_U/R}$ and $\Pr[Y \le (1-\epsilon)\mu_L] \le \left(\frac{e^{-\epsilon}}{(1-\epsilon)^{1-\epsilon}}\right)^{\mu_L/R}$.
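A minimal numerical sketch of the upper-tail bound, assuming NumPy; the choice of Bernoulli(1/2) summands, the parameters $k$, $\epsilon$, and the seed are illustrative, not from the slides:

```python
# Check the upper-tail Chernoff bound
#   Pr[Y >= (1+eps)*mu_U] <= (e^eps / (1+eps)^(1+eps))^(mu_U / R)
# against a Monte Carlo estimate, for Y a sum of Bernoulli(1/2)'s (so R = 1).
import numpy as np

rng = np.random.default_rng(0)
k, eps = 200, 0.3
mu = k / 2          # E[Y]; here mu_L = mu_U = mu
R = 1.0

bound = (np.exp(eps) / (1 + eps) ** (1 + eps)) ** (mu / R)

trials = 20000
Y = rng.binomial(1, 0.5, size=(trials, k)).sum(axis=1)
empirical = np.mean(Y >= (1 + eps) * mu)

assert empirical <= bound   # the bound dominates the observed tail frequency
```

The bound is loose here (the true tail is far smaller), which is typical of Chernoff-type estimates.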

4 Rudelson’s Sampling Lemma. Theorem [Rudelson ’99]: Let $Y_1,\dots,Y_k$ be i.i.d. rank-1 PSD matrices of size $n \times n$ s.t. $\mathrm{E}[Y_i]=I$ and $\|Y_i\| \le R$. Let $Y = \sum_i Y_i$, so $\mathrm{E}[Y]=k \cdot I$. Then, with high probability, $(1-\epsilon)\,k\,I \preceq Y \preceq (1+\epsilon)\,k\,I$, provided $k = \Omega(R \log n / \epsilon^2)$. Example (balls and bins): throw $k$ balls uniformly into $n$ bins; let $Y_i$ be uniform over $\{\,n\,e_b e_b^{\mathsf T} : b = 1,\dots,n\,\}$, so $\mathrm{E}[Y_i]=I$ and $R = n$. If $k = O(n \log n / \epsilon^2)$, all bins receive the same load up to a factor $1 \pm \epsilon$.
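The balls-and-bins example can be simulated directly; a sketch assuming NumPy, where the constant 8 in the sample count and the seed are arbitrary demo choices:

```python
# Balls and bins: Y_i is a rescaled indicator of a uniformly random bin, so
# E[Y_i] = I.  With k = O(n log n / eps^2) samples, every bin's load should
# lie within a (1 +/- eps) factor of the mean load k/n.
import numpy as np

rng = np.random.default_rng(1)
n, eps = 50, 0.5
k = int(np.ceil(8 * n * np.log(n) / eps**2))   # constant 8 chosen for the demo

bins = rng.integers(0, n, size=k)              # ball t lands in bin bins[t]
loads = np.bincount(bins, minlength=n)

# Sum_i Y_i is n * diag(loads), so the multiplicative guarantee
# (1-eps) k I <= sum_i Y_i <= (1+eps) k I is a statement about each load.
assert loads.min() >= (1 - eps) * k / n
assert loads.max() <= (1 + eps) * k / n
```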

5 Rudelson’s Sampling Lemma. Theorem [Rudelson ’99]: Let $Y_1,\dots,Y_k$ be i.i.d. rank-1 PSD matrices of size $n \times n$ s.t. $\mathrm{E}[Y_i]=I$ and $\|Y_i\| \le R$. Let $Y = \sum_i Y_i$, so $\mathrm{E}[Y]=k \cdot I$. Pros: we’ve generalized to PSD matrices. Mild issue: we assume $\mathrm{E}[Y_i] = I$. Cons: the $Y_i$’s must be identically distributed, and only rank-1 matrices are allowed.

6 Rudelson’s Sampling Lemma. Theorem [Rudelson-Vershynin ’07]: Let $Y_1,\dots,Y_k$ be i.i.d. rank-1 PSD matrices s.t. $\mathrm{E}[Y_i]=I$ and $\|Y_i\| \le R$. Let $Y = \sum_i Y_i$, so $\mathrm{E}[Y]=k \cdot I$. Pros: we’ve generalized to PSD matrices. Mild issue: we assume $\mathrm{E}[Y_i] = I$. Cons: the $Y_i$’s must be identically distributed, and only rank-1 matrices are allowed.

7 Rudelson’s Sampling Lemma. Theorem [Rudelson-Vershynin ’07]: Let $Y_1,\dots,Y_k$ be i.i.d. rank-1 PSD matrices s.t. $\mathrm{E}[Y_i]=I$. Let $Y=\sum_i Y_i$, so $\mathrm{E}[Y]=k \cdot I$. Assume $Y_i \preceq R \cdot I$. Notation: $A \preceq B$ means $B-A$ is PSD; $\alpha I \preceq A \preceq \beta I$ means all eigenvalues of $A$ lie in $[\alpha, \beta]$. Mild issue: we assume $\mathrm{E}[Y_i] = I$.
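The PSD-order notation can be checked numerically; a minimal sketch assuming NumPy (the matrix size and seed are arbitrary):

```python
# Illustrate the notation: alpha*I <= A <= beta*I in the PSD order exactly
# when every eigenvalue of the symmetric matrix A lies in [alpha, beta].
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                      # a random symmetric matrix
eigs = np.linalg.eigvalsh(A)
alpha, beta = eigs.min(), eigs.max()

def psd_leq(X, Y, tol=1e-9):
    """X <= Y in the PSD order iff Y - X is positive semidefinite."""
    return np.linalg.eigvalsh(Y - X).min() >= -tol

I = np.eye(4)
assert psd_leq(alpha * I, A) and psd_leq(A, beta * I)
```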

8 Rudelson’s Sampling Lemma. Theorem [Rudelson-Vershynin ’07]: Let $Y_1,\dots,Y_k$ be i.i.d. rank-1 PSD matrices. Let $Z=\mathrm{E}[Y_i]$ and $Y=\sum_i Y_i$, so $\mathrm{E}[Y]=k \cdot Z$. Assume $Y_i \preceq R \cdot Z$. Proof: apply the previous theorem to $\{\,Z^{-1/2} Y_i Z^{-1/2} : i=1,\dots,k\,\}$. Use the fact that $A \preceq B \iff Z^{-1/2} A Z^{-1/2} \preceq Z^{-1/2} B Z^{-1/2}$. So $(1-\epsilon)\,k\,Z \preceq \sum_i Y_i \preceq (1+\epsilon)\,k\,Z \iff (1-\epsilon)\,k\,I \preceq \sum_i Z^{-1/2} Y_i Z^{-1/2} \preceq (1+\epsilon)\,k\,I$.
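The whitening step can be sketched numerically, assuming NumPy; the construction $Y_i = x_i x_i^{\mathsf T}$ with $x_i = Z^{1/2} g_i$ for Gaussian $g_i$ is an illustrative way to get rank-1 samples with mean $Z$, not the slides' construction:

```python
# Sample rank-1 matrices with mean Z, whiten by Z^{-1/2}, and check that
# (i) the whitened empirical mean is close to I, and (ii) congruence by
# Z^{-1/2} preserves the PSD order A <= B.
import numpy as np

rng = np.random.default_rng(3)
n, k = 5, 200000

M = rng.standard_normal((n, n))
Z = M @ M.T + n * np.eye(n)            # a positive definite "target" mean

w, V = np.linalg.eigh(Z)
Z_half = V @ np.diag(np.sqrt(w)) @ V.T
Z_ihalf = V @ np.diag(1 / np.sqrt(w)) @ V.T

# Y_i = x_i x_i^T with x_i = Z^{1/2} g_i, g_i standard normal => E[Y_i] = Z.
G = rng.standard_normal((k, n))
X = G @ Z_half
mean_raw = X.T @ X / k                 # (1/k) sum_i Y_i, approximately Z
mean_whitened = Z_ihalf @ mean_raw @ Z_ihalf
assert np.linalg.norm(mean_whitened - np.eye(n)) < 0.1

# If B - A is PSD, then so is Z^{-1/2} (B - A) Z^{-1/2}.
C = rng.standard_normal((n, n))
A = C @ C.T
B = A + np.eye(n)                      # B - A = I is PSD
D = Z_ihalf @ (B - A) @ Z_ihalf
assert np.linalg.eigvalsh(D).min() >= -1e-9
```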

9 Ahlswede-Winter Inequality. Theorem [Ahlswede-Winter ’02]: Let $Y_1,\dots,Y_k$ be i.i.d. PSD matrices of size $n \times n$. Let $Z=\mathrm{E}[Y_i]$ and $Y=\sum_i Y_i$, so $\mathrm{E}[Y]=k \cdot Z$. Assume $Y_i \preceq R \cdot Z$. Then the multiplicative guarantee $(1-\epsilon)\,k\,Z \preceq Y \preceq (1+\epsilon)\,k\,Z$ holds with high probability for $k = \Omega(R \log n / \epsilon^2)$. Pros: we’ve removed the rank-1 assumption, and the proof is much easier than Rudelson’s proof. Cons: we still need the $Y_i$’s to be identically distributed. (More precisely, poor results unless $\mathrm{E}[Y_a] = \mathrm{E}[Y_b]$.)

10 Tropp’s User-Friendly Tail Bound. Theorem [Tropp ’12]: Let $Y_1,\dots,Y_k$ be independent PSD matrices of size $n \times n$ s.t. $\|Y_i\| \le R$. Let $Y=\sum_i Y_i$. Suppose $\mu_L \cdot I \preceq \mathrm{E}[Y] \preceq \mu_U \cdot I$. Then $\Pr[\lambda_{\max}(Y) \ge (1+\epsilon)\mu_U] \le n \left(\frac{e^{\epsilon}}{(1+\epsilon)^{1+\epsilon}}\right)^{\mu_U/R}$ and $\Pr[\lambda_{\min}(Y) \le (1-\epsilon)\mu_L] \le n \left(\frac{e^{-\epsilon}}{(1-\epsilon)^{1-\epsilon}}\right)^{\mu_L/R}$. Pros: the $Y_i$’s do not need to be identically distributed; Poisson-like bound for the right tail; the proof is not difficult (but uses Lieb’s inequality). Mild issue: poor results unless $\lambda_{\min}(\mathrm{E}[Y]) \approx \lambda_{\max}(\mathrm{E}[Y])$.
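Both tails of Tropp's bound can be evaluated and compared against a simulation; a sketch assuming NumPy, reusing the balls-and-bins matrices $Y_i = n\,e_b e_b^{\mathsf T}$ (so $\mu_L = \mu_U = k$ and $R = n$), with demo parameters and seed:

```python
# Evaluate the matrix-Chernoff tail bounds and compare with the empirical
# failure frequency for Y = sum_i n * e_b e_b^T over uniform bins b.
import numpy as np

rng = np.random.default_rng(4)
n, k, eps = 10, 1000, 0.6
mu, R = float(k), float(n)

upper = n * (np.exp(eps) / (1 + eps) ** (1 + eps)) ** (mu / R)
lower = n * (np.exp(-eps) / (1 - eps) ** (1 - eps)) ** (mu / R)

trials, bad_hi, bad_lo = 300, 0, 0
for _ in range(trials):
    loads = np.bincount(rng.integers(0, n, size=k), minlength=n)
    lam = n * loads          # eigenvalues of Y = sum_i Y_i (Y is diagonal)
    bad_hi += lam.max() >= (1 + eps) * mu
    bad_lo += lam.min() <= (1 - eps) * mu

assert bad_hi / trials <= upper
assert bad_lo / trials <= lower
```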

11 Tropp’s User-Friendly Tail Bound. Theorem [Tropp ’12]: Let $Y_1,\dots,Y_k$ be independent PSD matrices of size $n \times n$. Let $Y=\sum_i Y_i$ and $Z=\mathrm{E}[Y]$. Suppose $Y_i \preceq R \cdot Z$. Then the same tails hold for $\sum_i Z^{-1/2} Y_i Z^{-1/2}$, whose expectation is $I$ (so $\mu_L = \mu_U = 1$).

12 Tropp’s User-Friendly Tail Bound. Theorem [Tropp ’12]: Let $Y_1,\dots,Y_k$ be independent PSD matrices of size $n \times n$ s.t. $\|Y_i\| \le R$. Let $Y=\sum_i Y_i$. Suppose $\mu_L \cdot I \preceq \mathrm{E}[Y] \preceq \mu_U \cdot I$. Example (balls and bins): for $b=1,\dots,n$ and $t=1,\dots,8\log(n)/\epsilon^2$, with probability ½ throw a ball into bin $b$; i.e., let $Y_{b,t} = e_b e_b^{\mathsf T}$ with probability ½, and $0$ otherwise.
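A sketch of this example, assuming NumPy; the exact matrix $Y_{b,t}$ on the slide is elided, and the reading $Y_{b,t} = e_b e_b^{\mathsf T}$ with probability ½ is an assumption. The point is that the $Y_{b,t}$ are independent but not identically distributed across bins, which Tropp's bound allows:

```python
# Each bin b independently receives Bin(T, 1/2) balls over T rounds, so
# Y = sum of Y_{b,t} is diagonal with E[Y] = (T/2) * I; all loads should
# concentrate within a (1 +/- eps) factor of T/2.
import numpy as np

rng = np.random.default_rng(5)
n, eps = 100, 0.5
T = int(np.ceil(8 * np.log(n) / eps**2))   # rounds per bin, as on the slide

counts = rng.binomial(T, 0.5, size=n)      # bin b's total load
mu = T / 2                                 # E[Y] = mu * I

assert counts.min() >= (1 - eps) * mu
assert counts.max() <= (1 + eps) * mu
```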

13 Additive Error. Previous theorems give multiplicative error: $(1-\epsilon)\,\mathrm{E}[\sum_i Y_i] \preceq \sum_i Y_i \preceq (1+\epsilon)\,\mathrm{E}[\sum_i Y_i]$. Additive error is also useful: $\|\sum_i Y_i - \mathrm{E}[\sum_i Y_i]\| \le \epsilon$. Theorem [Rudelson & Vershynin ’07]: Let $Y_1,\dots,Y_k$ be i.i.d. rank-1 PSD matrices. Let $Z=\mathrm{E}[Y_i]$. Suppose $\|Z\| \le 1$ and $\|Y_i\| \le R$. Then $\|\frac{1}{k}\sum_i Y_i - Z\| \le \epsilon$ with high probability for $k$ large enough. Theorem [Magen & Zouzias ’11]: If instead $\mathrm{rank}(Y_i) \le k := \Theta(R \log(R/\epsilon^2)/\epsilon^2)$, then the same additive guarantee holds.

14 Proof of Ahlswede-Winter. Key idea: bound the matrix moment generating function. Let $S_k = \sum_{i=1}^{k} Y_i$. Golden-Thompson inequality: $\mathrm{tr}\, e^{A+B} \le \mathrm{tr}(e^A \cdot e^B)$. By induction, $\mathrm{E}[\mathrm{tr}\, e^{t S_k}] \le n \cdot \prod_i \|\mathrm{E}[e^{t Y_i}]\|$. Weakness: pulling out a norm at every step is brutal.
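The Golden-Thompson inequality is easy to test numerically; a sketch assuming NumPy, with an eigendecomposition-based matrix exponential for symmetric inputs (dimensions and seed are arbitrary):

```python
# Numerical check of Golden-Thompson: tr e^{A+B} <= tr(e^A e^B)
# for random symmetric A, B.
import numpy as np

def expm_sym(A):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(np.exp(w)) @ V.T

rng = np.random.default_rng(6)
for _ in range(100):
    M1 = rng.standard_normal((4, 4)); A = (M1 + M1.T) / 2
    M2 = rng.standard_normal((4, 4)); B = (M2 + M2.T) / 2
    lhs = np.trace(expm_sym(A + B))
    rhs = np.trace(expm_sym(A) @ expm_sym(B))
    assert lhs <= rhs + 1e-8
```

Equality holds when $A$ and $B$ commute; for generic draws the inequality is strict.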

15 How to improve Ahlswede-Winter? Golden-Thompson inequality: $\mathrm{tr}\, e^{A+B} \le \mathrm{tr}(e^A \cdot e^B)$ for all symmetric matrices $A$, $B$. It does not extend to three matrices: $\mathrm{tr}\, e^{A+B+C} \le \mathrm{tr}(e^A \cdot e^B \cdot e^C)$ is FALSE. Lieb’s inequality: for any symmetric matrix $L$, the map $f : \text{PSD cone} \to \mathbb{R}$ defined by $f(A) = \mathrm{tr}\, \exp(L + \log A)$ is concave. So $f$ interacts nicely with expectation and Jensen’s inequality.
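Lieb's concavity can be sanity-checked via midpoint concavity on random positive definite inputs; a sketch assuming NumPy (the spectral helper, dimensions, regularizer 0.1·I, and seed are demo choices):

```python
# Check midpoint concavity of f(A) = tr exp(L + log A) on positive definite
# matrices: f((A+B)/2) >= (f(A) + f(B)) / 2, as Lieb's inequality implies.
import numpy as np

def fun_sym(A, fn):
    """Apply a scalar function to a symmetric matrix via its spectrum."""
    w, V = np.linalg.eigh(A)
    return V @ np.diag(fn(w)) @ V.T

def f(L, A):
    return np.trace(fun_sym(L + fun_sym(A, np.log), np.exp))

rng = np.random.default_rng(7)
for _ in range(50):
    M = rng.standard_normal((3, 3)); L = (M + M.T) / 2
    C1 = rng.standard_normal((3, 3)); A = C1 @ C1.T + 0.1 * np.eye(3)
    C2 = rng.standard_normal((3, 3)); B = C2 @ C2.T + 0.1 * np.eye(3)
    assert f(L, (A + B) / 2) >= (f(L, A) + f(L, B)) / 2 - 1e-6
```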

16 Beyond the Basics. Hoeffding (non-uniform bounds on the $Y_i$’s) [Tropp ’12]. Bernstein (uses a bound on $\mathrm{Var}[Y_i]$) [Tropp ’12]. Freedman (martingale version of Bernstein) [Tropp ’12]. Stein’s method (slightly sharper results) [Mackey et al. ’12]. Pessimistic estimators for the Ahlswede-Winter inequality [Wigderson-Xiao ’08].

17 Summary. We now have a beautiful, powerful, flexible extension of the Chernoff bound to matrices. Ahlswede-Winter has a simple proof; Tropp’s inequality is very easy to use. There have been several important uses to date, and hopefully more in the future.

