Matrix Concentration
Nick Harvey, University of British Columbia


The Problem
Given random n×n symmetric matrices Y_1, …, Y_k, show that Σ_i Y_i is probably "close" to E[Σ_i Y_i].
Why? This is a matrix generalization of the Chernoff bound. There is much research on the eigenvalues of a random matrix with independent entries; this setting is more general.

Chernoff/Hoeffding Bound
Theorem: Let Y_1, …, Y_k be independent random scalars in [0, R]. Let Y = Σ_i Y_i. Suppose that μ_L ≤ E[Y] ≤ μ_U. Then
Pr[ Y ≤ (1−ε)·μ_L ] ≤ ( e^(−ε) / (1−ε)^(1−ε) )^(μ_L/R)
Pr[ Y ≥ (1+ε)·μ_U ] ≤ ( e^(ε) / (1+ε)^(1+ε) )^(μ_U/R)
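A small simulation can sanity-check the upper tail of this bound. The sketch below (plain Python; the parameters k, p, ε are illustrative choices, not from the slides) compares the bound against the empirical tail of a sum of Bernoulli variables, where R = 1:

```python
import math
import random

# Sanity-check the Chernoff upper tail for a sum of independent
# Bernoulli(p) variables (so each Y_i lies in [0, R] with R = 1).
random.seed(0)

k, p, eps = 200, 0.5, 0.5
mu = k * p  # here mu_L = mu_U = E[Y]
bound = (math.exp(eps) / (1 + eps) ** (1 + eps)) ** mu  # upper-tail bound, R = 1

trials = 20000
hits = 0
for _ in range(trials):
    y = sum(1 for _ in range(k) if random.random() < p)
    if y >= (1 + eps) * mu:
        hits += 1
empirical = hits / trials

assert empirical <= bound  # the bound dominates the empirical tail frequency
```

With these parameters the bound is about 2×10⁻⁵, so the event Y ≥ 150 essentially never occurs in simulation.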

Rudelson's Sampling Lemma
Theorem [Rudelson '99]: Let Y_1, …, Y_k be i.i.d. rank-1, PSD matrices of size n×n s.t. E[Y_i] = I and ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i, so E[Y] = k·I. Then, for k = Ω(R log n / ε²), with high probability
(1−ε)·k·I ⪯ Y ⪯ (1+ε)·k·I.
Example: Balls and bins
– Throw k balls uniformly into n bins
– Y_i = n·e_b·e_bᵀ for a uniformly random bin b (so E[Y_i] = I and R = n)
– If k = O(n log n / ε²), all bins receive the same number of balls up to a factor 1 ± ε
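The balls-and-bins example can be checked numerically. In this sketch (NumPy assumed; n, ε and the constant 8 are illustrative choices), Y = Σ_i n·e_b·e_bᵀ is diagonal, so its eigenvalues are just n times the bin counts:

```python
import numpy as np

# Balls and bins as matrix concentration: each Y_i = n * e_b e_b^T
# for a uniformly random bin b, so E[Y_i] = I and ||Y_i|| = n = R.
# With k = O(n log n / eps^2) samples, all eigenvalues of the sum
# Y should lie in [(1-eps)k, (1+eps)k].
rng = np.random.default_rng(0)

n, eps = 50, 0.5
k = int(8 * n * np.log(n) / eps**2)

counts = np.bincount(rng.integers(0, n, size=k), minlength=n)
# Y is diagonal with entries n * counts[b], so these are its eigenvalues.
eigs = n * counts
assert eigs.min() >= (1 - eps) * k
assert eigs.max() <= (1 + eps) * k
```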

Rudelson's Sampling Lemma
Theorem [Rudelson '99]: Let Y_1, …, Y_k be i.i.d. rank-1, PSD matrices of size n×n s.t. E[Y_i] = I and ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i, so E[Y] = k·I. Then, for k = Ω(R log n / ε²), (1−ε)·k·I ⪯ Y ⪯ (1+ε)·k·I with high probability.
Pros: We've generalized to PSD matrices.
Mild issue: We assume E[Y_i] = I.
Cons:
– the Y_i's must be identically distributed
– rank-1 matrices only

Rudelson's Sampling Lemma
Theorem [Rudelson-Vershynin '07]: Let Y_1, …, Y_k be i.i.d. rank-1, PSD matrices s.t. E[Y_i] = I and ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i, so E[Y] = k·I. Then, for k = Ω(R log n / ε²), (1−ε)·k·I ⪯ Y ⪯ (1+ε)·k·I with high probability.
Pros: We've generalized to PSD matrices.
Mild issue: We assume E[Y_i] = I.
Cons:
– the Y_i's must be identically distributed
– rank-1 matrices only

Rudelson's Sampling Lemma
Theorem [Rudelson-Vershynin '07]: Let Y_1, …, Y_k be i.i.d. rank-1, PSD matrices s.t. E[Y_i] = I. Let Y = Σ_i Y_i, so E[Y] = k·I. Assume Y_i ⪯ R·I. Then, for k = Ω(R log n / ε²), (1−ε)·k·I ⪯ Y ⪯ (1+ε)·k·I with high probability.
Notation: A ⪯ B means B − A is PSD; α·I ⪯ A ⪯ β·I means all eigenvalues of A lie in [α, β].
Mild issue: We assume E[Y_i] = I.
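The notation can be made concrete with a tiny NumPy check (the matrix here is an arbitrary illustrative example): α·I ⪯ A ⪯ β·I exactly when every eigenvalue of A lies in [α, β].

```python
import numpy as np

# Loewner-order notation: alpha*I <= A <= beta*I (in the PSD order)
# iff every eigenvalue of A lies in [alpha, beta].
A = np.array([[2.0, 1.0], [1.0, 2.0]])  # eigenvalues 1 and 3
alpha, beta = 1.0, 3.0

eigs = np.linalg.eigvalsh(A)
# A - alpha*I and beta*I - A are both PSD...
assert np.linalg.eigvalsh(A - alpha * np.eye(2)).min() >= -1e-12
assert np.linalg.eigvalsh(beta * np.eye(2) - A).min() >= -1e-12
# ...which matches the eigenvalues lying in [alpha, beta].
assert eigs.min() >= alpha - 1e-12 and eigs.max() <= beta + 1e-12
```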

Rudelson's Sampling Lemma
Theorem [Rudelson-Vershynin '07]: Let Y_1, …, Y_k be i.i.d. rank-1, PSD matrices. Let Z = E[Y_i] and Y = Σ_i Y_i, so E[Y] = k·Z. Assume Y_i ⪯ R·Z. Then, for k = Ω(R log n / ε²), (1−ε)·k·Z ⪯ Y ⪯ (1+ε)·k·Z with high probability.
Proof: Apply the previous theorem to { Z^(−1/2)·Y_i·Z^(−1/2) : i = 1, …, k }.
Use the fact that A ⪯ B ⟺ Z^(−1/2)·A·Z^(−1/2) ⪯ Z^(−1/2)·B·Z^(−1/2).
So (1−ε)·k·Z ⪯ Σ_i Y_i ⪯ (1+ε)·k·Z ⟺ (1−ε)·k·I ⪯ Σ_i Z^(−1/2)·Y_i·Z^(−1/2) ⪯ (1+ε)·k·I.
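The reduction step can be illustrated numerically. This sketch (NumPy assumed; the matrices are random illustrative choices) checks that conjugating by Z^(−1/2) preserves the Loewner order, which is what lets the general case reduce to E[Y_i] = I:

```python
import numpy as np

# Conjugation by Z^{-1/2} preserves the Loewner order:
# A <= B (PSD order)  iff  Z^{-1/2} A Z^{-1/2} <= Z^{-1/2} B Z^{-1/2}.
rng = np.random.default_rng(1)

def is_psd(M, tol=1e-9):
    return np.linalg.eigvalsh((M + M.T) / 2).min() >= -tol

n = 5
G = rng.standard_normal((n, n))
A = G @ G.T                      # PSD
B = A + np.eye(n)                # A <= B by construction
H = rng.standard_normal((n, n))
Z = H @ H.T + np.eye(n)          # positive definite

# Z^{-1/2} via eigendecomposition (valid since Z is symmetric PD).
w, V = np.linalg.eigh(Z)
Zinv_half = V @ np.diag(w ** -0.5) @ V.T
assert np.allclose(Zinv_half @ Z @ Zinv_half, np.eye(n))

assert is_psd(B - A)
assert is_psd(Zinv_half @ (B - A) @ Zinv_half)  # order survives whitening
```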

Ahlswede-Winter Inequality
Theorem [Ahlswede-Winter '02]: Let Y_1, …, Y_k be i.i.d. PSD matrices of size n×n. Let Z = E[Y_i] and Y = Σ_i Y_i, so E[Y] = k·Z. Assume Y_i ⪯ R·Z. Then
Pr[ (1−ε)·k·Z ⪯ Y ⪯ (1+ε)·k·Z ] ≥ 1 − 2n·exp(−ε²·k / 4R).
Pros:
– We've removed the rank-1 assumption.
– The proof is much easier than Rudelson's proof.
Cons:
– Still needs the Y_i's to be identically distributed. (More precisely, poor results unless E[Y_a] = E[Y_b].)

Tropp's User-Friendly Tail Bound
Theorem [Tropp '12]: Let Y_1, …, Y_k be independent PSD matrices of size n×n s.t. ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i. Suppose μ_L·I ⪯ E[Y] ⪯ μ_U·I. Then
Pr[ λ_min(Y) ≤ (1−ε)·μ_L ] ≤ n·( e^(−ε) / (1−ε)^(1−ε) )^(μ_L/R)
Pr[ λ_max(Y) ≥ (1+ε)·μ_U ] ≤ n·( e^(ε) / (1+ε)^(1+ε) )^(μ_U/R)
Pros:
– The Y_i's do not need to be identically distributed
– Poisson-like bound for the right tail
– The proof is not difficult (but uses Lieb's inequality)
Mild issue: Poor results unless λ_min(E[Y]) ≈ λ_max(E[Y]).
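A simulation gives a feel for the theorem. The sketch below (NumPy assumed; the rank-1 sign-vector construction and all parameters are illustrative choices) builds independent PSD summands with E[Y] = (k/n)·I, evaluates both tail probabilities, and checks that the extreme eigenvalues of Y behave as predicted:

```python
import numpy as np

# Simulate independent rank-1 PSD matrices Y_i = x x^T / n with
# x uniform on {-1,+1}^n, so E[Y_i] = I/n, ||Y_i|| = 1 = R, and
# E[Y] = (k/n) I, i.e. mu_L = mu_U = k/n.  Compare the extreme
# eigenvalues of Y against Tropp's tail probabilities.
rng = np.random.default_rng(2)

n, k, eps = 10, 2000, 0.5
mu = k / n
lower_tail = n * (np.exp(-eps) / (1 - eps) ** (1 - eps)) ** mu
upper_tail = n * (np.exp(eps) / (1 + eps) ** (1 + eps)) ** mu

X = rng.choice([-1.0, 1.0], size=(k, n))
Y = (X.T @ X) / n              # = sum_i x_i x_i^T / n
eigs = np.linalg.eigvalsh(Y)

# Both tail bounds are tiny here, so the event should hold comfortably.
assert eigs[0] > (1 - eps) * mu and eigs[-1] < (1 + eps) * mu
```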

Tropp's User-Friendly Tail Bound
Theorem [Tropp '12]: Let Y_1, …, Y_k be independent PSD matrices of size n×n. Let Y = Σ_i Y_i and Z = E[Y]. Suppose Y_i ⪯ R·Z. Then
Pr[ (1−ε)·Z ⪯ Y ⪯ (1+ε)·Z fails ] ≤ n·( e^(−ε) / (1−ε)^(1−ε) )^(1/R) + n·( e^(ε) / (1+ε)^(1+ε) )^(1/R)
(apply the previous theorem to the matrices Z^(−1/2)·Y_i·Z^(−1/2), for which μ_L = μ_U = 1).

Tropp's User-Friendly Tail Bound
Theorem [Tropp '12]: Let Y_1, …, Y_k be independent PSD matrices of size n×n s.t. ‖Y_i‖ ≤ R. Let Y = Σ_i Y_i. Suppose μ_L·I ⪯ E[Y] ⪯ μ_U·I. Then
Pr[ λ_min(Y) ≤ (1−ε)·μ_L ] ≤ n·( e^(−ε) / (1−ε)^(1−ε) )^(μ_L/R)
Pr[ λ_max(Y) ≥ (1+ε)·μ_U ] ≤ n·( e^(ε) / (1+ε)^(1+ε) )^(μ_U/R)
Example: Balls and bins
– For b = 1, …, n:
– For t = 1, …, 8·log(n)/ε²:
– With probability ½, throw a ball into bin b
– Let Y_{b,t} = e_b·e_bᵀ with probability ½, and 0 otherwise.

Additive Error
Previous theorems give multiplicative error: (1−ε)·E[Σ_i Y_i] ⪯ Σ_i Y_i ⪯ (1+ε)·E[Σ_i Y_i].
Additive error is also useful: ‖ Σ_i Y_i − E[Σ_i Y_i] ‖ ≤ ε.
Theorem [Rudelson & Vershynin '07]: Let Y_1, …, Y_k be i.i.d. rank-1, PSD matrices. Let Z = E[Y_i]. Suppose ‖Z‖ ≤ 1 and ‖Y_i‖ ≤ R. Then
E‖ (1/k)·Σ_i Y_i − Z ‖ ≤ O(√(R·log(k) / k)).
Theorem [Magen & Zouzias '11]: If instead rank(Y_i) ≤ k := Θ(R·log(R/ε²)/ε²), then E‖ (1/k)·Σ_i Y_i − Z ‖ ≤ ε.
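A quick sketch of the additive-error regime (NumPy assumed; Gaussian samples are an illustrative simplification, since their norm is only bounded with high probability rather than surely): with Y_i = x_i·x_iᵀ and Z = E[Y_i] = I, the operator-norm deviation of the average shrinks roughly like √(n/k).

```python
import numpy as np

# Additive error: for Y_i = x_i x_i^T with x_i standard Gaussian,
# E[Y_i] = Z = I, and ||(1/k) sum_i Y_i - Z|| shrinks roughly like
# 2*sqrt(n/k).  (Gaussian x_i is a simplification: ||Y_i|| is only
# bounded with high probability, not surely.)
rng = np.random.default_rng(5)

n, k = 20, 200000
X = rng.standard_normal((k, n))
dev = np.linalg.norm(X.T @ X / k - np.eye(n), ord=2)  # operator norm
assert dev < 0.1  # additive error eps = 0.1 is comfortably achieved here
```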

Proof of Ahlswede-Winter
Key idea: bound the matrix moment generating function.
Let S_k = Σ_{i=1}^{k} Y_i. Then
Pr[ λ_max(S_k) ≥ t ] ≤ tr E[ exp(θ·S_k) ] · e^(−θt).
Golden-Thompson inequality: tr e^(A+B) ≤ tr( e^A · e^B ).
By induction, tr E[e^(θ·S_k)] ≤ tr E[ e^(θ·S_{k−1}) · e^(θ·Y_k) ] ≤ ‖ E[e^(θ·Y_k)] ‖ · tr E[ e^(θ·S_{k−1}) ] ≤ … ≤ n · Π_i ‖ E[e^(θ·Y_i)] ‖.
Weakness: this repeated application is brutal.
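The Golden-Thompson step can be checked numerically. This sketch (NumPy assumed; matrix exponentials are computed by eigendecomposition, which is valid here because the inputs are symmetric) verifies tr e^(A+B) ≤ tr(e^A·e^B) on random symmetric A and B:

```python
import numpy as np

# Numerically verify Golden-Thompson: tr e^{A+B} <= tr(e^A e^B)
# for symmetric A, B.  The matrix exponential is computed via
# eigendecomposition, valid because all inputs are symmetric.
rng = np.random.default_rng(3)

def sym_expm(M):
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.exp(w)) @ V.T

for _ in range(100):
    A = rng.standard_normal((4, 4)); A = (A + A.T) / 2
    B = rng.standard_normal((4, 4)); B = (B + B.T) / 2
    lhs = np.trace(sym_expm(A + B))
    rhs = np.trace(sym_expm(A) @ sym_expm(B))
    assert lhs <= rhs * (1 + 1e-9)  # equality only when A and B commute
```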

How to improve Ahlswede-Winter?
Golden-Thompson inequality: tr e^(A+B) ≤ tr( e^A · e^B ) for all symmetric matrices A, B.
It does not extend to three matrices! tr e^(A+B+C) ≤ tr( e^A · e^B · e^C ) is FALSE.
Lieb's inequality: For any symmetric matrix L, the map f : PSD cone → ℝ defined by f(A) = tr exp( L + log(A) ) is concave.
– So f interacts nicely with expectation, via Jensen's inequality.
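Lieb's concavity can also be probed numerically. The sketch below (NumPy assumed; random positive-definite inputs, with log and exp again computed via eigendecomposition) checks midpoint concavity, f((A₁+A₂)/2) ≥ (f(A₁)+f(A₂))/2:

```python
import numpy as np

# Check midpoint concavity of Lieb's map f(A) = tr exp(L + log A)
# on random positive-definite matrices.  log/exp of a symmetric
# matrix are computed by eigendecomposition.
rng = np.random.default_rng(4)

def sym_fun(M, fun):
    w, V = np.linalg.eigh(M)
    return V @ np.diag(fun(w)) @ V.T

def lieb_f(L, A):
    return np.trace(sym_fun(L + sym_fun(A, np.log), np.exp))

n = 4
L = rng.standard_normal((n, n)); L = (L + L.T) / 2
for _ in range(20):
    G1 = rng.standard_normal((n, n)); A1 = G1 @ G1.T + 0.1 * np.eye(n)
    G2 = rng.standard_normal((n, n)); A2 = G2 @ G2.T + 0.1 * np.eye(n)
    mid = lieb_f(L, (A1 + A2) / 2)
    avg = (lieb_f(L, A1) + lieb_f(L, A2)) / 2
    assert mid >= avg * (1 - 1e-9)  # concavity: midpoint value dominates
```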

Beyond the basics
– Hoeffding (non-uniform bounds on the Y_i's) [Tropp '12]
– Bernstein (uses a bound on Var[Y_i]) [Tropp '12]
– Freedman (martingale version of Bernstein) [Tropp '12]
– Stein's method (slightly sharper results) [Mackey et al. '12]
– Pessimistic estimators for the Ahlswede-Winter inequality [Wigderson-Xiao '08]

Summary
We now have a beautiful, powerful, flexible extension of the Chernoff bound to matrices. Ahlswede-Winter has a simple proof; Tropp's inequality is very easy to use. There have been several important uses to date, and hopefully more in the future.