
Algorithm Design Using Spectral Graph Theory Richard Peng Joint Work with Guy Blelloch, HuiHan Chin, Anupam Gupta, Jon Kelner, Yiannis Koutis, Aleksander Mądry, Gary Miller and Kanat Tangwongsan

OUTLINE
Motivating problem: image denoising
Fast solvers for SDD linear systems
Using solvers for L1 minimization and graph problems

IMAGE DENOISING Given image + noise, recover image.

IMAGE DENOISING: THE MODEL
Start with an 'original' noiseless image; noise from some distribution is added.
Input: original + noise, s. Goal: recover the original, x.
[Figure: input s, noise s - x, denoised image x]

EXPLICIT VS. IMPLICIT APPROACHES

                   Explicit                     Implicit
Goal               Recover x directly           Define conditions on x and s, solve for x
Basic operation    Averaging a set of pixels    Minimize an objective function
                   (filtering)
Runtime            O(n)                         O(n^2) or higher
Quality            Reasonable                   High

n > 10^6 for most images. First, a simplified objective that can be optimized fast.

SIMPLE OBJECTIVE FUNCTION
minimize Σ_i (x_i - s_i)^2 + Σ_{i~j} (x_i - x_j)^2
Equal to x^T A x - 2 s^T x, where x and s are length-n vectors and A is an n-by-n matrix.
Gradient: 2Ax - 2s. Optimum: 0 = 2Ax - 2s, so Ax = s and x = A^{-1} s.
The recovered solution has quality issues; we will come back to this later.
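As a concrete (if naive) sketch of the optimality condition above, the snippet below builds A = I + L for a 1-D "image" on a path graph and solves Ax = s with a dense solver; the function name and the path-graph setting are illustrative, not from the talk, whose whole point is doing this step in near-linear time:

```python
import numpy as np

def denoise_path(s):
    """Minimize sum_i (x_i - s_i)^2 + sum_{i~j} (x_i - x_j)^2 on a path graph.

    The objective equals x^T A x - 2 s^T x (plus a constant) with A = I + L,
    where L is the path-graph Laplacian, so the optimum solves A x = s.
    """
    n = len(s)
    L = np.zeros((n, n))
    for i in range(n - 1):          # path edges i ~ i+1
        L[i, i] += 1.0
        L[i + 1, i + 1] += 1.0
        L[i, i + 1] -= 1.0
        L[i + 1, i] -= 1.0
    A = np.eye(n) + L               # SDD: the identity makes every row strictly dominant
    return np.linalg.solve(A, s)    # dense O(n^3) solve, stand-in for a fast solver

noisy = np.array([1.0, 5.0, 1.0, 1.0])
x = denoise_path(noisy)             # the spike at index 1 is pulled toward its neighbors
```

Note that when s is already smooth (e.g. constant), L·s = 0 and the solver returns s unchanged, as expected.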

SPECIAL STRUCTURE OF A
A is Symmetric Diagonally Dominant (SDD) if:
- It is symmetric
- In each row, the diagonal entry is at least the sum of the absolute values of all off-diagonal entries
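The two conditions translate directly into a check; a minimal sketch (the helper name is ours, not the talk's):

```python
import numpy as np

def is_sdd(A, tol=1e-12):
    """Check the two SDD conditions: symmetry, and each diagonal entry
    at least the sum of absolute values of the off-diagonals in its row."""
    A = np.asarray(A, dtype=float)
    if not np.allclose(A, A.T):
        return False
    off_diag = np.sum(np.abs(A), axis=1) - np.abs(np.diag(A))
    return bool(np.all(np.diag(A) >= off_diag - tol))

print(is_sdd([[2.0, -1.0], [-1.0, 2.0]]))   # True
print(is_sdd([[1.0, -2.0], [-2.0, 1.0]]))   # False: off-diagonal outweighs diagonal
```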

OUTLINE
Motivating problem: image denoising
Fast solvers for SDD linear systems
Using solvers for L1 minimization and graph problems

FUNDAMENTAL PROBLEM: SOLVING LINEAR SYSTEMS
Given matrix A and vector b, find vector x such that Ax = b.
Size of A: n-by-n, with m non-zero entries.

SOLVING LINEAR SYSTEMS: EXPLICIT AND IMPLICIT

                   Direct (explicit)             Iterative (implicit)
'Unit' operation   Modifying an entry            Matrix-vector multiply
Main goal          Operations applied on the     Explore a large portion
                   matrix are reversible         of the rank space
Cost per step      O(1)                          O(m)
Number of steps    O(n^ω)                        O(n)
Total runtime      O(n^ω)                        O(nm)

EXPLICIT ALGORITHMS
[1st century CE] Gaussian elimination: O(n^3)
[Strassen `69] O(n^2.8)
[Coppersmith-Winograd `90] O(n^2.376)
[Stothers `10] O(n^2.374)
[Vassilevska Williams `11] O(n^2.373)

SDD LINEAR SYSTEMS

                   Direct (explicit)             Iterative (implicit)
'Unit' operation   Modifying an entry            Matrix-vector multiply
Main idea          Operations applied on the     Explore a large portion
                   matrix are reversible         of the rank space
Cost per step      O(1)                          O(m)
Number of steps    O(n^ω)                        O(n)
Total runtime      O(n^ω)                        O(nm)

[Vaidya `91]: Hybrid methods

NEARLY LINEAR TIME SOLVERS [SPIELMAN-TENG `04]
Input: n-by-n SDD matrix A with m non-zeros, vector b, where b = Ax for some x.
Output: approximate solution x' s.t. |x - x'|_A < ε|x|_A.
Runtime: nearly linear, O(m log^c n log(1/ε)) expected.

THEORETICAL APPLICATIONS OF SDD SOLVERS: MANY ITERATIONS
[Zhu-Ghahramani-Lafferty `03][Zhou-Huang-Scholkopf `05] learning on graphical models
[Tutte `62] planar graph embeddings
[Boman-Hendrickson-Vavasis `04] finite element PDEs
[Kelner-Mądry `09] random spanning trees
[Daitch-Spielman `08][Christiano-Kelner-Mądry-Spielman-Teng `11] maximum flow, min-cost flow
[Cheeger, Alon-Milman `85, Sherman `09, Orecchia-Sachdeva-Vishnoi `11] graph partitioning

SDD SOLVERS IN IMAGE DENOISING? Optical Coherence Tomography (OCT) scan of retina.

LOGS
Runtime: O(m log^c n log(1/ε)). Estimates on c:
[Spielman]: c ≤ 70  [Koutis]: c ≤ 15  [Miller]: c ≤ 32  [Teng]: c ≤ 12  [Orecchia]: c ≤ 6
When n = 10^6, log^6 n > 10^6.

PRACTICAL NEARLY LINEAR TIME SOLVERS [KOUTIS-MILLER-P `10, `11]
Input: n-by-n SDD matrix A with m non-zeros, vector b, where b = Ax for some x.
Output: approximate solution x' s.t. |x - x'|_A < ε|x|_A.
Runtime: O(m log n log(1/ε)).
[Blelloch-Gupta-Koutis-Miller-P-Tangwongsan `11]: parallel solver, O(m^{1/3}) depth and nearly-linear work.

GRAPH LAPLACIAN
A symmetric matrix A is a graph Laplacian if:
- All off-diagonal entries are non-positive
- All rows and columns sum to 0
[Gremban-Miller `96]: solving SDD linear systems reduces to solving graph Laplacians.
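A quick sketch of the construction (helper name ours): building the Laplacian from weighted edges automatically satisfies both conditions, since each edge contributes -w off-diagonal and +w to the two incident diagonals.

```python
import numpy as np

def graph_laplacian(n, edges):
    """Build the Laplacian of a weighted graph from (u, v, w) triples:
    off-diagonal entries get -w, diagonals accumulate incident weights,
    so every row and column sums to 0."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, v] -= w
        L[v, u] -= w
        L[u, u] += w
        L[v, v] += w
    return L

L = graph_laplacian(3, [(0, 1, 2.0), (1, 2, 3.0)])   # weighted path 0-1-2
```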

HIGH LEVEL OVERVIEW
Iterative methods / recursive solver
Spectral sparsifiers
Low-stretch spanning trees

PRECONDITIONING FOR LINEAR SYSTEM SOLVES
Can solve a linear system in A by iterating and solving a 'similar' one, B. Needs a way to measure and bound similarity.
[Vaidya `91]: since A is a graph, B should be as well. Apply graph theoretic techniques!

PROPERTIES B NEEDS
- Easier to solve
- Similar to A
Two ways to be easier: fewer vertices, fewer edges. The vertex count can be reduced once the edge count is small, so we focus on reducing edge count while preserving similarity.

GRAPH SPARSIFIERS
Sparse equivalents of dense graphs that preserve some property:
- Spanners: distances, diameter
- [Benczur-Karger `96] Cut sparsifiers: weights of all cuts
We need spectral sparsifiers.

WHAT WE NEED: ULTRASPARSIFIERS
Given graph G with n vertices, m edges, and parameter k, return graph H with n vertices and n-1+O(m log^p n / k) edges such that G ≤ H ≤ kG (spectral ordering).
[Spielman-Teng `04]: ultrasparsifiers with n-1+O(m log^p n / k) edges imply solvers with O(m log^p n) running time.

EXAMPLE: COMPLETE GRAPH
O(n log n) random edges (after scaling) suffice!

GENERAL GRAPH SAMPLING MECHANISM
For each edge, flip a coin with 'keep' probability P(e). If the coin says 'keep', scale the edge up by 1/P(e).
Expected value of each edge: unchanged. Expected number of edges kept: ∑_e P(e). Only concentration remains to be shown.
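The keep-and-rescale rule can be sketched in a few lines; the demo below (our own toy setup, not from the talk) samples 1000 parallel unit edges with P(e) = 0.3 and shows the total rescaled weight concentrating near the true total of 1000:

```python
import random

def sample_edges(edges, probs):
    """Keep edge e = (u, v, w) with probability P(e); if kept, rescale
    its weight by 1/P(e) so each edge's expected weight is unchanged."""
    kept = []
    for (u, v, w), p in zip(edges, probs):
        if random.random() < p:
            kept.append((u, v, w / p))
    return kept

random.seed(0)
edges = [(0, 1, 1.0)] * 1000                         # total weight 1000
trials = [sum(w for _, _, w in sample_edges(edges, [0.3] * 1000))
          for _ in range(20)]
avg = sum(trials) / len(trials)                      # concentrates near 1000
```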

EFFECTIVE RESISTANCE
View the graph as an electrical circuit. Measure the effective resistance between u and v, R(u,v), by passing 1 unit of current between them.
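For intuition, effective resistance can be computed from the Laplacian pseudoinverse as R(u,v) = (e_u - e_v)^T L^+ (e_u - e_v); this dense sketch (helper name ours) checks it on a unit-weight path, where resistors in series simply add:

```python
import numpy as np

def effective_resistance(L, u, v):
    """R(u, v) = (e_u - e_v)^T L^+ (e_u - e_v): the voltage gap when one
    unit of current is pushed from u to v, via the Laplacian pseudoinverse."""
    chi = np.zeros(L.shape[0])
    chi[u], chi[v] = 1.0, -1.0
    return float(chi @ np.linalg.pinv(L) @ chi)

# unit-weight path 0-1-2: two unit resistors in series, so R(0, 2) = 2
L = np.array([[ 1.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  1.0]])
print(round(effective_resistance(L, 0, 2), 6))   # 2.0
```

On this tree, ∑_e W(e)R(e) = 1 + 1 = 2 = n - 1, matching the fact the talk uses later.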

SPECTRAL SPARSIFICATION BY EFFECTIVE RESISTANCE
[Spielman-Srivastava `08]: setting P(e) to W(e)R(u,v)·O(log n) gives G ≤ H ≤ 2G (ignoring probabilistic issues).
Fact: ∑_e W(e)R(e) = n-1, so this is a spectral sparsifier with O(n log n) edges. Ultrasparsifier? Solver???

THE CHICKEN AND EGG PROBLEM
How to calculate effective resistances? [Spielman-Srivastava `08]: use the solver. [Spielman-Teng `04]: the solver needs a sparsifier.
Workaround: upper bound the effective resistances.

RAYLEIGH'S MONOTONICITY LAW
As we remove edges, the effective resistance between two vertices can only increase. So calculate effective resistances w.r.t. a spanning tree T.
Resistors in series: the effective resistance of a path with resistances r_1 … r_k is ∑_i r_i.

SAMPLING PROBABILITIES ACCORDING TO TREE
Sampling probability: edge weight times the effective resistance of the tree path; this quantity is the edge's stretch.
Number of edges kept: ∑_e P(e), so the total stretch must be kept small.
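On a unit-weight tree, an off-tree edge's stretch is just the number of tree edges on the path between its endpoints (resistors in series, each of resistance 1). A small sketch under that assumption, with a parent/depth representation of our own choosing:

```python
def tree_path_stretch(parent, depth, u, v):
    """Stretch of off-tree edge (u, v) w.r.t. a rooted spanning tree with
    unit weights: the number of tree edges on the u-v path."""
    length = 0
    while u != v:                   # walk the deeper endpoint up toward the LCA
        if depth[u] < depth[v]:
            u, v = v, u
        u = parent[u]
        length += 1
    return length

# star rooted at 0 with leaves 1 and 2: the off-tree edge (1, 2) has stretch 2
parent = {1: 0, 2: 0}
depth = {0: 0, 1: 1, 2: 1}
```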

LOW STRETCH SPANNING TREES
[Alon-Karp-Peleg-West `91]: a spanning tree with total stretch O(m^{1+ε}) can be found in O(m log n) time. Way too big!
[Elkin-Emek-Spielman-Teng `05]: total stretch O(m log^2 n) in O(m log n + n log^2 n) time. Number of edges sampled: O(m log^2 n).
[Abraham-Bartal-Neiman `08, Koutis-Miller-P `11, Abraham-Neiman `12]: total stretch O(m log n) in O(m log n) time.

WHAT ARE WE MISSING?
What we need: H with n-1+O(m log^p n / k) edges, G ≤ H ≤ kG.
What we generated: H with n-1+O(m log^2 n) edges, G ≤ H ≤ 2G.
Too many edges, but too good an approximation; we haven't used k yet.

WORK AROUND
Scale up the tree in G by a factor of k and copy over the off-tree edges to get a graph G' with G ≤ G' ≤ kG.
Stretch of a tree edge: 1. Stretch of a non-tree edge: reduced by a factor of k.
Expected number of edges in H: n-1 tree edges plus O(m log^2 n / k) off-tree edges.
So H has n-1+O(m log^2 n / k) edges with G' ≤ H ≤ 2G', hence G ≤ H ≤ 2kG. This gives an O(m log^2 n) time solver.

SOLVER IN ACTION
Find a good spanning tree. Scale up the tree. Sample off-tree edges.

SOLVER IN ACTION
Eliminate degree 1 or 2 nodes.

SOLVER IN ACTION
Eliminate degree 1 or 2 nodes. Recurse.

QUADRATIC MINIMIZATION IN PRACTICE
OCT scan of retina, denoised using the combinatorial multigrid (CMG) solver by Koutis and Miller.
Good news: fast. Bad news: missing boundaries between layers.

OUTLINE
Motivating problem: image denoising
Fast solvers for SDD linear systems
Using solvers for L1 minimization and graph problems

TOTAL VARIATION OBJECTIVE [RUDIN-OSHER-FATEMI `92]
minimize Σ_i (x_i - s_i)^2 + Σ_{i~j} |x_i - x_j|
Isotropic variant: partition the edges into k groups and take the L2 norm of each group. Encompasses many graph problems.
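The change from the earlier quadratic objective is only in the smoothness term: the squared differences become absolute values. A minimal evaluator for the (anisotropic) objective, with a function name of our own:

```python
import numpy as np

def tv_objective(x, s, edges):
    """Rudin-Osher-Fatemi-style objective: squared-L2 fidelity to the
    input s plus L1 total variation across the given edges."""
    fidelity = float(np.sum((x - s) ** 2))
    variation = sum(abs(float(x[i]) - float(x[j])) for i, j in edges)
    return fidelity + variation

s = np.array([0.0, 1.0])
print(tv_objective(s, s, [(0, 1)]))   # 1.0: zero fidelity cost, |0 - 1| variation
```

Unlike the quadratic version, this objective is not differentiable at x_i = x_j, which is why the talk approximates it with weighted quadratic terms next.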

TV USING L2 MINIMIZATION
[Chin-Mądry-Miller-P `12]: total variation with k groups can be approximated in Õ(m k^{1/3} ε^{-8/3}) time. A generalization of the approximate maximum flow / minimum cut algorithm from [Christiano-Kelner-Mądry-Spielman-Teng `11].
Minimize (x_i - x_j)^2 / w_ij instead of |x_i - x_j|; the two are equal when |x_i - x_j| = w_ij. Measure the difference using the Kullback-Leibler (KL) divergence, and decrease the KL divergence between w_ij and the differences in the optimum x.

L2^2-L1 MINIMIZATION IN PRACTICE
[Figure: comparison with the L2^2-L2^2 minimizer]

DUAL OF ISOTROPIC TV: GROUPED FLOW
Partition the edges into k groups. Given a flow f, the energy of a group S is √(∑_{e∈S} f(e)^2). Minimize the maximum energy over all groups.
Running time: Õ(m k^{1/3}).

APPLICATION OF GROUPED FLOW
A natural intermediate problem.
[Kelner-Miller-P `12]: k-commodity maximum concurrent flow in Õ(m^{4/3} poly(k, ε^{-1})) time.
[Miller-P `12]: approximate maximum flow on graphs with separator structures in Õ(m^{6/5}) time.

FUTURE WORK
Faster SDD linear system solvers? Higher-accuracy algorithms for L1 problems using solvers? Solvers for other classes of linear systems?

THANK YOU! Questions?