Matroid Bases and Matrix Concentration
Nick Harvey (University of British Columbia)
Joint work with Neil Olver (Vrije Universiteit)
Scalar concentration inequalities
Theorem [Chernoff / Hoeffding bound]: Let Y_1, …, Y_m be independent, non-negative scalar random variables. Let Y = Σ_i Y_i and μ = E[Y]. Suppose Y_i ≤ 1 a.s. Then, for all δ ≥ 0,
    Pr[ Y ≥ (1+δ)μ ] ≤ ( e^δ / (1+δ)^{1+δ} )^μ.
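As a quick sanity check (my illustration, not part of the talk), here is a minimal numpy simulation comparing the empirical tail of a sum of independent Bernoulli variables against the bound above; the parameters m, p, delta are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, delta = 200, 0.05, 1.0            # arbitrary illustration parameters
mu = m * p                              # E[Y] for Y = sum of Bernoulli(p)

# Empirical tail probability of Y >= (1+delta)*mu over many trials.
Y = rng.binomial(1, p, size=(100_000, m)).sum(axis=1)
empirical = np.mean(Y >= (1 + delta) * mu)

# Chernoff bound: (e^delta / (1+delta)^(1+delta))^mu
bound = (np.exp(delta) / (1 + delta) ** (1 + delta)) ** mu
print(f"empirical tail {empirical:.4f} <= Chernoff bound {bound:.4f}")
```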
Scalar concentration inequalities
Theorem [Panconesi-Srinivasan '92, Dubhashi-Ranjan '96, etc.]: Let Y_1, …, Y_m be negatively dependent, non-negative scalar random variables. Let Y = Σ_i Y_i and μ = E[Y]. Suppose Y_i ≤ 1 a.s. Then the same bound holds:
    Pr[ Y ≥ (1+δ)μ ] ≤ ( e^δ / (1+δ)^{1+δ} )^μ.
Negative cylinder dependence: Y_i ∈ {0,1} with
    Pr[ Y_i = 1 for all i ∈ S ] ≤ Π_{i∈S} Pr[Y_i = 1]  and  Pr[ Y_i = 0 for all i ∈ S ] ≤ Π_{i∈S} Pr[Y_i = 0]  for all S ⊆ [m].
Stronger notions: negative association, determinantal distributions, strongly Rayleigh measures, etc.
Matrix concentration inequalities
Theorem [Tropp '12, etc.]: Let Y_1, …, Y_m be independent, PSD matrices of size n×n. Let Y = Σ_i Y_i and M = E[Y]. Suppose λ_max(Y_i) ≤ 1 a.s. Then, for all δ ≥ 0,
    Pr[ λ_max(Y) ≥ (1+δ)·λ_max(M) ] ≤ n · ( e^δ / (1+δ)^{1+δ} )^{λ_max(M)}.
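Again as an illustration (mine, not the speaker's), a small numpy experiment: independent Bernoulli indicators scale fixed rank-one PSD matrices, and the empirical tail of λ_max is compared with the bound. All parameter choices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p, delta, trials = 5, 200, 0.5, 0.5, 5000

# Fixed PSD matrices A_i = v_i v_i^T with |v_i| = 1, so lambda_max(A_i) = 1.
V = rng.normal(size=(m, n))
V /= np.linalg.norm(V, axis=1, keepdims=True)
A = np.einsum('ki,kj->kij', V, V)

M = p * A.sum(axis=0)                   # M = E[Y] for Y = sum_i b_i A_i
mu_max = np.linalg.eigvalsh(M)[-1]

# Empirical tail of lambda_max(Y) with independent b_i ~ Bernoulli(p).
B = (rng.random((trials, m)) < p).astype(float)
lam = np.array([np.linalg.eigvalsh(np.tensordot(b, A, axes=1))[-1] for b in B])
empirical = np.mean(lam >= (1 + delta) * mu_max)

bound = n * (np.exp(delta) / (1 + delta) ** (1 + delta)) ** mu_max
print(f"empirical tail {empirical:.4f} <= matrix Chernoff bound {bound:.4f}")
```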
Extensions of Chernoff Bounds

             | Independent        | Negatively dependent
    Scalars  | Chernoff-Hoeffding | Panconesi-Srinivasan, etc.
    Matrices | Tropp, etc.        | ?

This talk: a special case of the missing common generalization, where the negatively dependent distribution is a certain random walk in a matroid base polytope.
Negative Dependence
Arises in many natural scenarios.
Random spanning trees: let Y_e indicate whether edge e is in the tree T. Knowing that e ∈ T decreases the probability that f ∈ T.
Balls and bins: let Y_i be the number of balls in bin i.
Also: sampling without replacement, random permutations, random cluster models, etc.
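A tiny simulation (my illustration) of the balls-and-bins case: the empirical covariance between the loads of two bins is negative, matching the multinomial covariance −(balls)/(bins)².

```python
import numpy as np

rng = np.random.default_rng(2)
balls, bins, trials = 20, 5, 50_000

# Throw each ball into a uniformly random bin; record the load of each bin.
throws = rng.integers(0, bins, size=(trials, balls))
loads = np.stack([(throws == b).sum(axis=1) for b in range(bins)], axis=1)

# Loads of distinct bins are negatively correlated.
cov = np.cov(loads[:, 0], loads[:, 1])[0, 1]
print(f"Cov(Y_0, Y_1) ~ {cov:.3f} (theory: -balls/bins^2 = {-balls / bins**2})")
```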
Thin trees
Cut: δ(S) = { edge st : s ∈ S, t ∉ S }.
A spanning tree T is α-thin if |δ_T(S)| ≤ α·|δ_G(S)| for all S.
Global connectivity: K = min { |δ_G(S)| : ∅ ⊊ S ⊊ V }.
Conjecture [Goddyn '80s]: every n-vertex graph has an α-thin spanning tree with α = O(1/K). This would have deep consequences in graph theory.

Speaker notes: Let me begin the talk by motivating what sparsification is and why it's important. Sparsification is a concept familiar from our daily lives: you want to take some object that's dense and replace it with an object that's sparser, but almost as good as the original. Here's an example I learned about when I bought my first house a few years ago: floor joists used to be made of solid wood, but now they're "engineered", often with a truss structure and often with other materials. Another familiar example is image compression: we express the image in a particular basis (perhaps a wavelet basis), then keep the most important components and throw away the less important ones.
Thin trees
Theorem [Asadpour et al. '10]: every n-vertex graph has an α-thin spanning tree with α = O(log n / log log n)·(1/K). Uses negative dependence and Chernoff bounds.
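To make the thinness definition concrete, here is a brute-force checker (my sketch; it enumerates all 2^n cuts, so it is only for toy instances, and it assumes G is connected).

```python
from itertools import combinations

def cut_size(edges, S):
    """Number of edges with exactly one endpoint in S."""
    return sum((u in S) != (v in S) for u, v in edges)

def thinness(graph_edges, tree_edges, vertices):
    """Smallest alpha with |delta_T(S)| <= alpha * |delta_G(S)| for all cuts S."""
    alpha = 0.0
    for k in range(1, len(vertices)):
        for S in map(set, combinations(vertices, k)):
            alpha = max(alpha, cut_size(tree_edges, S) / cut_size(graph_edges, S))
    return alpha

# Toy example: K4 with the path 0-1-2-3 as spanning tree.
V = list(range(4))
G = [(i, j) for i in V for j in V if i < j]
T = [(0, 1), (1, 2), (2, 3)]
print(thinness(G, T, V))   # 2/3: worst cuts are S = {1} and S = {2}
```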
Asymmetric Traveling Salesman Problem [Julia Robinson, 1949]
Let D = (V, E, w) be a weighted, directed graph.
Goal: find a tour v_1, v_2, …, v_k = v_1 of vertices that visits every vertex of V at least once, has v_i v_{i+1} ∈ E for every i, and minimizes the total weight Σ_{1≤i<k} w(v_i v_{i+1}).
Asymmetric Traveling Salesman Problem
Reduction [Oveis Gharan, Saberi '11]: if you can efficiently find an (α/K)-thin spanning tree in any n-vertex graph, then you can find a tour whose weight is within O(α) of optimal.
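For concreteness, a brute-force exact solver for tiny instances (my illustration only; the talk's interest is in approximation). It first takes the metric closure via Floyd-Warshall, after which some optimal tour visits each vertex exactly once.

```python
from itertools import permutations

def atsp_brute_force(n, w):
    """Exact ATSP on a complete digraph with weights w[u][v], by enumeration."""
    # Floyd-Warshall metric closure, so d[u][v] = cheapest u -> v walk.
    d = [row[:] for row in w]
    for k in range(n):
        for u in range(n):
            for v in range(n):
                d[u][v] = min(d[u][v], d[u][k] + d[k][v])
    best = float('inf')
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        best = min(best, sum(d[a][b] for a, b in zip(tour, tour[1:])))
    return best

w = [[0, 2, 9], [6, 0, 4], [3, 7, 0]]    # arbitrary asymmetric weights
print(atsp_brute_force(3, w))            # 0 -> 1 -> 2 -> 0 costs 2 + 4 + 3 = 9
```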
Graph Laplacians
Laplacian of edge bc (on vertex set {a, b, c, d}):

               a   b   c   d
           a [ 0   0   0   0 ]
    L_bc = b [ 0   1  -1   0 ]
           c [ 0  -1   1   0 ]
           d [ 0   0   0   0 ]
Graph Laplacians
Laplacian of graph G: L_G = Σ_{e∈E} L_e.
Diagonal entries: the degree of each node. Off-diagonal entries: -1 for every edge.
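A minimal numpy sketch of these definitions (my code): L_G is built exactly as on the slide, as a sum of edge Laplacians.

```python
import numpy as np

def edge_laplacian(n, u, v):
    """Laplacian L_e of a single edge uv on n vertices."""
    L = np.zeros((n, n))
    L[u, u] = L[v, v] = 1
    L[u, v] = L[v, u] = -1
    return L

def graph_laplacian(n, edges):
    """L_G = sum of edge Laplacians: degrees on the diagonal, -1 per edge."""
    return sum(edge_laplacian(n, u, v) for u, v in edges)

edges = [(0, 1), (1, 2), (2, 3), (0, 2)]   # small example graph
print(graph_laplacian(4, edges))
```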
Spectrally-thin trees
A spanning tree T is α-spectrally-thin if L_T ⪯ α·L_G.
Effective resistance from s to t: R_st = the voltage difference when a 1-amp current source is placed between s and t.
Theorem [Harvey-Olver '14]: every n-vertex graph has an α-spectrally-thin spanning tree with α = O(log n / log log n). Uses matrix concentration bounds. Algorithmic.
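Another small numpy sketch (mine, mirroring the Laplacian builder above): effective resistances via the Laplacian pseudoinverse, and the spectral thinness of a tree as the largest eigenvalue of L_G^{+/2} L_T L_G^{+/2}.

```python
import numpy as np

def lap(n, edges):
    """Graph Laplacian: degrees on the diagonal, -1 for each edge."""
    L = np.zeros((n, n))
    for u, v in edges:
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    return L

def effective_resistance(L, s, t):
    """R_st = (e_s - e_t)^T L^+ (e_s - e_t), via the pseudoinverse."""
    e = np.zeros(len(L)); e[s], e[t] = 1.0, -1.0
    return e @ np.linalg.pinv(L) @ e

def spectral_thinness(LT, LG):
    """Smallest alpha with LT <= alpha * LG: lambda_max of LG^{+/2} LT LG^{+/2}."""
    w, U = np.linalg.eigh(LG)
    inv_sqrt = np.zeros_like(w)
    inv_sqrt[w > 1e-9] = 1.0 / np.sqrt(w[w > 1e-9])
    S = (U * inv_sqrt) @ U.T
    return np.linalg.eigvalsh(S @ LT @ S)[-1]

# Example: 4-cycle, with the path 0-1-2-3 as spanning tree.
LG = lap(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
LT = lap(4, [(0, 1), (1, 2), (2, 3)])
print(effective_resistance(LG, 0, 3))   # 0.75: resistances 1 and 3 in parallel
print(spectral_thinness(LT, LG))
```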
Spectrally-thin trees
Theorem: every n-vertex graph has an α-spectrally-thin spanning tree with α = O(max_e R_e). Follows from the Kadison-Singer solution of MSS '13. Not algorithmic.
Asymmetric Traveling Salesman Problem
Recent breakthrough [Anari, Oveis Gharan, Dec 2014]: shows how to build on the O(1)-spectrally-thin tree result to approximate the optimal weight of an ATSP solution to within poly(log log n) of optimal. But there is no algorithm to find the actual sequence of vertices!
Our Main Result
Let P ⊂ [0,1]^m be a matroid base polytope (e.g., the convex hull of characteristic vectors of spanning trees).
Let A_1, …, A_m be PSD matrices of size n×n.
Define Q = { x ∈ P : Σ_i x_i A_i ⪯ I } and suppose Q ≠ ∅.
There is an extreme point χ(S) of P with λ_max( Σ_{i∈S} A_i ) ≤ O(log n / log log n).
Our Main Result
Let P ⊂ [0,1]^m be a matroid base polytope. Let A_1, …, A_m be PSD matrices of size n×n. Define Q = { x ∈ P : Σ_i x_i A_i ⪯ I } and suppose Q ≠ ∅. There is an extreme point χ(S) of P with λ_max( Σ_{i∈S} A_i ) ≤ α.
What is the right dependence on α?
Easy: α ≥ 1.5, even with n = 2.
Standard random matrix theory: α = O(log n).
Our result: α = O(log n / log log n).
Ideally: α < 2; this would solve the Kadison-Singer problem.
MSS '13: solved Kadison-Singer, achieving α = O(1).
Our Main Result
Let P ⊂ [0,1]^m be a matroid base polytope. Let A_1, …, A_m be PSD matrices of size n×n. Define Q = { x ∈ P : Σ_i x_i A_i ⪯ I } and suppose Q ≠ ∅. There is an extreme point χ(S) of P with λ_max( Σ_{i∈S} A_i ) ≤ O(log n / log log n).
Furthermore, there is a random process that starts at any x_0 ∈ Q and terminates after m steps at such a point χ(S), whp. Each step of this process can be performed algorithmically, and the entire process can be derandomized.
Pipage rounding [Ageev-Sviridenko '04, Srinivasan '01, Calinescu et al. '07, Chekuri et al. '09]
Let P be any matroid base polytope. Given a fractional x:
Find coordinates a and b such that the line z ↦ x + z(e_a - e_b) stays in the current face.
Find the two points where this line leaves P.
Randomly choose one of those points so that the expectation is x.
Repeat until x = χ_T is integral.
x is a martingale: the expectation of the final χ_T is the original fractional x.
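A minimal sketch of the rounding loop (mine), using the simplest matroid: the uniform matroid, whose base polytope is { x ∈ [0,1]^m : Σ_i x_i = k }. A real implementation for a general matroid must additionally track tight sets to stay in the current face.

```python
import random

def pipage_round(x, eps=1e-9):
    """Pipage rounding in the uniform-matroid base polytope {x in [0,1]^m : sum = k}.

    Repeatedly picks two fractional coordinates and moves along e_a - e_b,
    randomizing between the two boundary points so E[next x] = current x.
    """
    x = list(x)
    while True:
        frac = [i for i, xi in enumerate(x) if eps < xi < 1 - eps]
        if len(frac) < 2:
            return [round(xi) for xi in x]
        a, b = frac[0], frac[1]
        up = min(1 - x[a], x[b])        # largest z with x + z(e_a - e_b) feasible
        down = min(x[a], 1 - x[b])      # largest z with x - z(e_a - e_b) feasible
        if random.random() < down / (up + down):   # so the expectation is x
            x[a] += up; x[b] -= up
        else:
            x[a] -= down; x[b] += down

print(pipage_round([0.5, 0.5, 0.75, 0.25]))   # a random 0/1 point with two ones
```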
Pessimistic estimators
Definition ("pessimistic estimator"): let E ⊆ {0,1}^m be an event, and let D(x) be the product distribution on {0,1}^m with expectation x. Then g : [0,1]^m → R is a pessimistic estimator for E if
    Pr_{y∼D(x)}[ y ∈ E ] ≤ g(x)  for all x.
Example: if E is the event { y : w^T y > t }, then Chernoff bounds give the pessimistic estimator
    g(x) = e^{-θt} · Π_i ( x_i e^{θ w_i} + 1 - x_i ).
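A numeric sanity check (my code) of this Chernoff estimator against the exact probability on a small instance, with arbitrary w, t, x:

```python
import numpy as np
from itertools import product

def g(x, w, t, theta):
    """Chernoff estimator: e^{-theta t} * prod_i (x_i e^{theta w_i} + 1 - x_i)."""
    return np.exp(-theta * t) * np.prod(x * np.exp(theta * w) + 1 - x)

x = np.array([0.5, 0.3, 0.8, 0.4])
w = np.array([1.0, 2.0, 1.5, 0.5])
t = 4.0

# Exact Pr[w^T y > t] under the product distribution D(x), by enumeration.
exact = 0.0
for y in product([0, 1], repeat=len(x)):
    ya = np.array(y, dtype=float)
    if w @ ya > t:
        exact += np.prod(np.where(ya == 1, x, 1 - x))

best = min(g(x, w, t, th) for th in np.linspace(0.1, 5.0, 50))
print(f"exact {exact:.4f} <= estimator {best:.4f}")
```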
Concavity under swaps
Definition: a function f : R^m → R is concave under swaps if z ↦ f( x + z(e_a - e_b) ) is concave for all x ∈ P and all a, b ∈ [m].
Example: the Chernoff estimator g(x) = e^{-θt} · Π_i ( x_i e^{θ w_i} + 1 - x_i ) is concave under swaps (along a swap it is a quadratic in z with non-positive leading coefficient).
Pipage rounding: let X_0 be the initial point and χ_T the final point visited by pipage rounding.
Claim: if f is concave under swaps then E[ f(χ_T) ] ≤ f(X_0). [By Jensen's inequality, applied to each swap.]
Pessimistic estimators: let E be an event and g a pessimistic estimator for E.
Claim: if g is concave under swaps, then Pr[ χ_T ∈ E ] ≤ g(X_0).
Matrix Pessimistic Estimators
Special case of Tropp '12: let A_1, …, A_m be n×n PSD matrices with λ_max(A_i) ≤ 1, let D(x) be the product distribution on {0,1}^m with expectation x, and let M = Σ_i x_i A_i. Define
    g_{t,θ}(x) = e^{-θt} · tr exp( Σ_i log( x_i e^{θ A_i} + (1 - x_i) I ) ).
Then
    Pr_{y∼D(x)}[ λ_max( Σ_i y_i A_i ) ≥ t ] ≤ g_{t,θ}(x)  and  g_{t,θ}(x) ≤ n · exp( -θt + (e^θ - 1)·λ_max(M) ),
so g_{t,θ} is a pessimistic estimator.
Main technical result: g_{t,θ} is concave under swaps. ⇒ Tropp's bound for independent sampling is also achieved by pipage rounding.
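A numeric spot-check (my code) of this estimator's concavity along a swap direction, computing the matrix exp and log by eigendecomposition; all instance parameters are arbitrary.

```python
import numpy as np

def sym_fun(A, f):
    """Apply a scalar function to a symmetric matrix via eigendecomposition."""
    w, U = np.linalg.eigh(A)
    return (U * f(w)) @ U.T

def g(x, A, t, theta):
    """g_{t,theta}(x) = e^{-theta t} tr exp( sum_i log( x_i e^{theta A_i} + (1-x_i) I ) )."""
    n = A.shape[1]
    S = sum(sym_fun(xi * sym_fun(theta * Ai, np.exp) + (1 - xi) * np.eye(n), np.log)
            for xi, Ai in zip(x, A))
    return np.exp(-theta * t) * np.trace(sym_fun(S, np.exp))

rng = np.random.default_rng(3)
m, n = 6, 3
V = rng.normal(size=(m, n))
A = np.einsum('ki,kj->kij', V, V)
A /= np.linalg.eigvalsh(A)[:, -1][:, None, None]    # scale so lambda_max(A_i) = 1

x = rng.uniform(0.2, 0.8, size=m)
t, theta = 3.0, 1.0

# Concavity along the swap direction e_0 - e_1: midpoint value >= average.
d = np.zeros(m); d[0], d[1] = 1, -1
zs = np.linspace(-0.1, 0.1, 3)
vals = [g(x + z * d, A, t, theta) for z in zs]
print(vals[1] >= (vals[0] + vals[2]) / 2)   # True if g is concave here
```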
Our Variant of Lieb's Theorem
Lieb '73: for any symmetric matrix L, the map A ↦ tr exp( L + log A ) is concave on PD matrices.
Our variant: for any symmetric matrix L, the map (A, B) ↦ tr exp( L + log A + log B ) is jointly concave on pairs of PD matrices.
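A quick numeric check (mine) of joint midpoint concavity of this map on random PD inputs:

```python
import numpy as np

def sym_fun(A, f):
    """Apply a scalar function to a symmetric matrix via eigendecomposition."""
    w, U = np.linalg.eigh(A)
    return (U * f(w)) @ U.T

def F(L, A, B):
    """F(A, B) = tr exp(L + log A + log B) for symmetric L and PD A, B."""
    return np.trace(sym_fun(L + sym_fun(A, np.log) + sym_fun(B, np.log), np.exp))

def rand_pd(n, rng):
    X = rng.normal(size=(n, n))
    return X @ X.T + 0.5 * np.eye(n)

rng = np.random.default_rng(4)
n = 4
Lsym = rng.normal(size=(n, n)); Lsym = (Lsym + Lsym.T) / 2
A0, A1, B0, B1 = (rand_pd(n, rng) for _ in range(4))

mid = F(Lsym, (A0 + A1) / 2, (B0 + B1) / 2)
avg = (F(Lsym, A0, B0) + F(Lsym, A1, B1)) / 2
print(mid >= avg)   # midpoint concavity holds on this sample
```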
Questions
Does Tropp's matrix concentration bound hold in a negatively dependent scenario?
Does our variant of Lieb's theorem have other uses?
O(max_e R_e)-spectrally-thin trees exist by MSS '13. Can they be constructed algorithmically?