1
Matrix Concentration
Nick Harvey, University of British Columbia
2
The Problem
Given random n x n symmetric matrices Y_1, …, Y_k, show that Σ_i Y_i is probably "close" to E[Σ_i Y_i].
Why? This is a matrix generalization of the Chernoff bound. There is much research on the eigenvalues of a random matrix with independent entries; this setting is more general.
3
Chernoff/Hoeffding Bound
Theorem: Let Y_1, …, Y_k be independent random scalars in [0, R]. Let Y = Σ_i Y_i. Suppose that μ_L ≤ E[Y] ≤ μ_U. Then (in the standard form, for ε ∈ [0, 1])
  Pr[ Y ≤ (1−ε)·μ_L ] ≤ e^{−ε²μ_L / 2R}  and  Pr[ Y ≥ (1+ε)·μ_U ] ≤ e^{−ε²μ_U / 3R}.
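As a quick numerical illustration of the scalar bound (a minimal sketch using numpy; the choice of uniform [0, R] variables, the trial count, and all constants are illustrative, not from the slides):

```python
import numpy as np

# Empirical check of scalar Chernoff-style concentration:
# a sum of k independent scalars in [0, R] stays within (1 +/- eps) of its mean.
rng = np.random.default_rng(0)
k, R, eps = 20000, 1.0, 0.1

# Y_i uniform on [0, R], so E[Y] = k * R / 2.
trials = rng.uniform(0.0, R, size=(200, k)).sum(axis=1)
mu = k * R / 2

frac_outside = np.mean((trials < (1 - eps) * mu) | (trials > (1 + eps) * mu))
print(frac_outside)  # with k this large, the deviation band is many sigmas wide
```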
4
Rudelson's Sampling Lemma
Theorem: [Rudelson '99] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices of size n x n s.t. E[Y_i] = I and ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i, so E[Y] = k·I. Then, with high probability, (1−ε)·kI ≼ Y ≼ (1+ε)·kI, provided k ≥ Ω(R log n / ε²).
Example: Balls and bins
– Throw k balls uniformly into n bins
– Y_i = n·e_b e_bᵀ for a uniformly random bin b (so E[Y_i] = I and R = n)
– If k = O(n log n / ε²), all bins are equal up to a factor 1 ± ε
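The balls-and-bins example can be simulated directly. A minimal sketch (numpy assumed; the constant 4 in the sample count is an arbitrary illustrative choice), taking Y_i = n·e_b e_bᵀ for a uniformly random bin b, so that E[Y_i] = I and ‖Y_i‖ = n:

```python
import numpy as np

# Balls and bins as rank-1 matrix sampling.
# Each Y_i = n * e_b e_b^T for a uniform random bin b, so E[Y_i] = I, R = n.
rng = np.random.default_rng(1)
n, eps = 50, 0.5
k = int(4 * n * np.log(n) / eps**2)   # k = O(n log n / eps^2) samples

counts = np.bincount(rng.integers(0, n, size=k), minlength=n)
Y = np.diag(n * counts.astype(float))  # sum_i Y_i is diagonal in this example

eigs = np.linalg.eigvalsh(Y) / k       # should all lie near [1 - eps, 1 + eps]
print(eigs.min(), eigs.max())
```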
5
Rudelson's Sampling Lemma
Theorem: [Rudelson '99] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices of size n x n s.t. E[Y_i] = I and ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i, so E[Y] = k·I. Then, with high probability, (1−ε)·kI ≼ Y ≼ (1+ε)·kI, provided k ≥ Ω(R log n / ε²).
Pros: We've generalized to PSD matrices.
Mild issue: We assume E[Y_i] = I.
Cons:
– The Y_i's must be identically distributed
– rank-1 matrices only
6
Rudelson's Sampling Lemma
Theorem: [Rudelson-Vershynin '07] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices s.t. E[Y_i] = I and ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i, so E[Y] = k·I. Then, with high probability, (1−ε)·kI ≼ Y ≼ (1+ε)·kI, provided k ≥ Ω(R log n / ε²).
Pros: We've generalized to PSD matrices.
Mild issue: We assume E[Y_i] = I.
Cons:
– The Y_i's must be identically distributed
– rank-1 matrices only
7
Rudelson's Sampling Lemma
Theorem: [Rudelson-Vershynin '07] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices s.t. E[Y_i] = I. Let Y = Σ_i Y_i, so E[Y] = k·I. Assume Y_i ≼ R·I. Then, with high probability, (1−ε)·kI ≼ Y ≼ (1+ε)·kI.
Notation:
– A ≼ B means B − A is PSD
– α·I ≼ A ≼ β·I means all eigenvalues of A lie in [α, β]
Mild issue: We assume E[Y_i] = I.
8
Rudelson's Sampling Lemma
Theorem: [Rudelson-Vershynin '07] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices. Let Z = E[Y_i] and Y = Σ_i Y_i, so E[Y] = k·Z. Assume Y_i ≼ R·Z. Then, with high probability, (1−ε)·kZ ≼ Y ≼ (1+ε)·kZ.
Proof: Apply the previous theorem to { Z^{−1/2} Y_i Z^{−1/2} : i = 1, …, k }.
Use the fact that A ≼ B ⟺ Z^{−1/2} A Z^{−1/2} ≼ Z^{−1/2} B Z^{−1/2}.
So (1−ε)·kI ≼ Σ_i Z^{−1/2} Y_i Z^{−1/2} ≼ (1+ε)·kI implies (1−ε)·kZ ≼ Σ_i Y_i ≼ (1+ε)·kZ.
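The conjugation fact used in this reduction is easy to check numerically. A small sketch (numpy assumed; the random positive-definite instances are illustrative):

```python
import numpy as np

# Check the fact behind the reduction: if B - A is PSD (i.e. A <= B in the
# PSD order), then Z^{-1/2} (B - A) Z^{-1/2} is also PSD, for any PD Z.
rng = np.random.default_rng(2)
n = 6

def rand_psd(n):
    M = rng.standard_normal((n, n))
    return M @ M.T                     # PSD by construction

A = rand_psd(n)
B = A + rand_psd(n)                    # B - A is PSD, so A <= B
Z = rand_psd(n) + np.eye(n)            # positive definite

# Z^{-1/2} via eigendecomposition of the PD matrix Z
w, V = np.linalg.eigh(Z)
Zmh = V @ np.diag(1.0 / np.sqrt(w)) @ V.T

gap = np.linalg.eigvalsh(Zmh @ (B - A) @ Zmh).min()
print(gap >= -1e-8)  # conjugation preserves the PSD order
```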
9
Ahlswede-Winter Inequality
Theorem: [Ahlswede-Winter '02] Let Y_1, …, Y_k be i.i.d. PSD matrices of size n x n. Let Z = E[Y_i] and Y = Σ_i Y_i, so E[Y] = k·Z. Assume Y_i ≼ R·Z. Then, with high probability, (1−ε)·kZ ≼ Y ≼ (1+ε)·kZ.
Pros:
– We've removed the rank-1 assumption.
– The proof is much easier than Rudelson's proof.
Cons:
– We still need the Y_i's to be identically distributed. (More precisely, poor results unless E[Y_a] = E[Y_b].)
10
Tropp's User-Friendly Tail Bound
Theorem: [Tropp '12] Let Y_1, …, Y_k be independent PSD matrices of size n x n s.t. ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i. Suppose μ_L·I ≼ E[Y] ≼ μ_U·I. Then
  Pr[ λ_min(Y) ≤ (1−ε)·μ_L ] ≤ n·e^{−ε²μ_L / 2R}  and  Pr[ λ_max(Y) ≥ (1+ε)·μ_U ] ≤ n·( e^ε / (1+ε)^{1+ε} )^{μ_U / R}.
Pros:
– The Y_i's do not need to be identically distributed
– Poisson-like bound for the right tail
– The proof is not difficult (but uses Lieb's inequality)
Mild issue: Poor results unless λ_min(E[Y]) ≈ λ_max(E[Y]).
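A small simulation in the spirit of this theorem, with independent but non-identically distributed summands (numpy assumed; the rank-1 Bernoulli model and all constants are illustrative choices, not from Tropp's paper):

```python
import numpy as np

# Independent, NOT identically distributed PSD summands: each Y_i is a unit
# rank-1 matrix v v^T included with a probability p that varies with i, so
# ||Y_i|| <= R = 1 and E[Y_i] = (p/n) * I.
rng = np.random.default_rng(3)
n, k = 8, 4000

S = np.zeros((n, n))
mu = 0.0
for i in range(k):
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)            # uniform unit vector, ||v v^T|| = 1
    p = (0.2, 0.5, 0.8)[i % 3]        # inclusion probability varies with i
    mu += p / n                       # running value of mu_L = mu_U here
    if rng.random() < p:
        S += np.outer(v, v)

lam = np.linalg.eigvalsh(S)
print(lam.min() / mu, lam.max() / mu)  # both ratios should be near 1
```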
11
Tropp's User-Friendly Tail Bound
Theorem: [Tropp '12] Let Y_1, …, Y_k be independent PSD matrices of size n x n. Let Y = Σ_i Y_i and Z = E[Y]. Suppose Y_i ≼ R·Z. Then, with high probability, (1−ε)·Z ≼ Y ≼ (1+ε)·Z.
12
Tropp's User-Friendly Tail Bound
Theorem: [Tropp '12] Let Y_1, …, Y_k be independent PSD matrices of size n x n s.t. ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i. Suppose μ_L·I ≼ E[Y] ≼ μ_U·I. Then
  Pr[ λ_min(Y) ≤ (1−ε)·μ_L ] ≤ n·e^{−ε²μ_L / 2R}  and  Pr[ λ_max(Y) ≥ (1+ε)·μ_U ] ≤ n·( e^ε / (1+ε)^{1+ε} )^{μ_U / R}.
Example: Balls and bins
– For b = 1, …, n:
– For t = 1, …, 8 log(n)/ε²:
– With prob ½, throw a ball into bin b
– Let Y_{b,t} = e_b e_bᵀ with prob ½, otherwise 0.
13
Additive Error
Previous theorems give multiplicative error: (1−ε)·E[Σ_i Y_i] ≼ Σ_i Y_i ≼ (1+ε)·E[Σ_i Y_i].
Additive error is also useful: ‖Σ_i Y_i − E[Σ_i Y_i]‖ ≤ ε.
Theorem: [Rudelson & Vershynin '07] Let Y_1, …, Y_k be i.i.d. rank-1 PSD matrices. Let Z = E[Y_i]. Suppose ‖Z‖ ≤ 1 and ‖Y_i‖ ≤ R. Then ‖(1/k)·Σ_i Y_i − Z‖ ≤ ε with high probability, provided k = Ω(R·log(R/ε²) / ε²).
Theorem: [Magen & Zouzias '11] The rank-1 assumption can be dropped: k := Θ(R·log(R/ε²) / ε²) samples still suffice.
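The additive-error regime can be illustrated with i.i.d. rank-1 samples (a minimal sketch, numpy assumed; here Z = E[Y_i] = I/n for uniform unit vectors, so ‖Z‖ ≤ 1, and the sample sizes are illustrative):

```python
import numpy as np

# Additive error demo: for i.i.d. rank-1 Y_i = v_i v_i^T with uniform unit
# vectors v_i, Z = E[Y_i] = I/n, and ||(1/k) sum_i Y_i - Z|| shrinks with k.
rng = np.random.default_rng(4)
n = 10

def op_error(k):
    V = rng.standard_normal((k, n))
    V /= np.linalg.norm(V, axis=1, keepdims=True)   # rows are unit vectors
    S = V.T @ V / k                                 # (1/k) sum_i v_i v_i^T
    return np.linalg.norm(S - np.eye(n) / n, ord=2) # operator-norm error

errs = [op_error(k) for k in (100, 10000)]
print(errs[0] > errs[1])  # more samples -> smaller additive error
```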
14
Proof of Ahlswede-Winter
Key idea: Bound the matrix moment generating function E[tr e^{t·S_k}], where S_k = Σ_{i=1}^{k} Y_i.
Golden-Thompson inequality: tr e^{A+B} ≤ tr( e^A · e^B ).
By induction, using Golden-Thompson and independence, E[tr e^{t·S_k}] ≤ n · Π_i ‖E[e^{t·Y_i}]‖.
Weakness: this last bound is brutal (it is quite lossy).
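The Golden-Thompson step can be spot-checked numerically. A small sketch (numpy assumed; the matrix exponential of a symmetric matrix is computed via its eigendecomposition):

```python
import numpy as np

# Numerical sanity check of the Golden-Thompson inequality
# tr e^{A+B} <= tr(e^A e^B), which drives the Ahlswede-Winter induction.
rng = np.random.default_rng(5)
n = 5

def expm_sym(M):                       # exp of a symmetric matrix via eigh
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.exp(w)) @ V.T

def rand_sym():
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2

ok = all(
    np.trace(expm_sym(A + B)) <= np.trace(expm_sym(A) @ expm_sym(B)) + 1e-8
    for A, B in ((rand_sym(), rand_sym()) for _ in range(20))
)
print(ok)  # True: Golden-Thompson holds in every random trial
```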
15
How to improve Ahlswede-Winter?
Golden-Thompson Inequality: tr e^{A+B} ≤ tr( e^A · e^B ) for all symmetric matrices A, B.
It does not extend to three matrices! tr e^{A+B+C} ≤ tr( e^A · e^B · e^C ) is FALSE.
Lieb's Inequality: For any symmetric matrix L, the map f : PSD Cone → ℝ defined by f(A) = tr exp( L + log(A) ) is concave.
– So f interacts nicely with expectation and Jensen's inequality.
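Lieb's concavity can be spot-checked at midpoints (a minimal sketch, numpy assumed; exp and log of symmetric/PD matrices are applied via eigendecomposition, and the random instances are illustrative):

```python
import numpy as np

# Midpoint check of Lieb's concavity: for symmetric L, the map
# f(A) = tr exp(L + log A) satisfies f((A+B)/2) >= (f(A) + f(B))/2 on PD A, B.
rng = np.random.default_rng(6)
n = 5

def eig_fn(M, fn):                     # apply fn to eigenvalues of symmetric M
    w, V = np.linalg.eigh(M)
    return V @ np.diag(fn(w)) @ V.T

def f(L, A):
    return np.trace(eig_fn(L + eig_fn(A, np.log), np.exp))

M = rng.standard_normal((n, n))
L = (M + M.T) / 2                      # symmetric L
A = rng.standard_normal((n, n)); A = A @ A.T + np.eye(n)   # PD
B = rng.standard_normal((n, n)); B = B @ B.T + np.eye(n)   # PD

mid = f(L, (A + B) / 2)
print(mid >= (f(L, A) + f(L, B)) / 2 - 1e-8)  # True by Lieb's inequality
```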
16
Beyond the basics
– Hoeffding (non-uniform bounds on the Y_i's) [Tropp '12]
– Bernstein (uses a bound on Var[Y_i]) [Tropp '12]
– Freedman (martingale version of Bernstein) [Tropp '12]
– Stein's Method (slightly sharper results) [Mackey et al. '12]
– Pessimistic estimators for the Ahlswede-Winter inequality [Wigderson-Xiao '08]
17
Summary
We now have a beautiful, powerful, flexible extension of the Chernoff bound to matrices. Ahlswede-Winter has a simple proof; Tropp's inequality is very easy to use. There have been several important uses to date, and hopefully more in the future.