Graph Partitioning
James Demmel, www.cs.berkeley.edu/~demmel/cs170_Spr10
04/30/2010 CS267, Yelick



Outline of Graph Partitioning Lecture
  Review definition of Graph Partitioning problem
  Overview of heuristics
  Partitioning with Nodal Coordinates
    Ex: node at point in (x,y) or (x,y,z) space
  Partitioning without Nodal Coordinates
    Ex: In model of WWW, nodes are web pages
  Multilevel Acceleration
    BIG IDEA, appears often in computing
  Available software
  Comparison of Methods and Applications

Definition of Graph Partitioning
  Given a graph G = (N, E, WN, WE)
    N = nodes (or vertices), WN = node weights
    E = edges, WE = edge weights
  Ex: N = {tasks}, WN = {task costs}, edge (j,k) in E means task j sends WE(j,k) words to task k
  Goal: Choose a partition N = N1 U N2 U … U NP such that
    The sum of the node weights in each Nj is “about the same”
    The sum of all edge weights of edges connecting all different pairs Nj and Nk is minimized
  In short: “Balance the work load, while minimizing communication”
  Special case of N = N1 U N2: Graph Bisection
  [Figure: example graph on 8 nodes, node weights in parentheses, edge weights on the edges]
  Speaker note: give motivation based on work at nodes, communication on edges
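
The two quantities in the goal can be computed directly from a candidate partition. The following is a minimal sketch, not code from the lecture: the function name `partition_cost` and the 4-task graph are hypothetical, chosen only to illustrate the definition.

```python
def partition_cost(node_weight, edge_weight, part):
    """Return (load of each part, total weight of edges cut by the partition)."""
    loads = {}
    for node, w in node_weight.items():
        loads[part[node]] = loads.get(part[node], 0) + w
    # An edge is cut when its two endpoints lie in different parts.
    cut = sum(w for (j, k), w in edge_weight.items() if part[j] != part[k])
    return loads, cut

# Hypothetical example: WN = task costs, WE = words sent between tasks.
WN = {1: 2, 2: 2, 3: 1, 4: 3}
WE = {(1, 2): 2, (2, 3): 4, (3, 4): 1, (1, 4): 3}
part = {1: 0, 2: 0, 3: 1, 4: 1}          # a bisection N = N1 U N2
loads, cut = partition_cost(WN, WE, part)
print(loads, cut)
```

Here both parts carry load 4, so the work is balanced, but the cut weight is 7. Putting tasks 1 and 4 together lowers the cut to 3 at the price of loads 5 and 3, which is the trade-off the definition describes.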


Some Applications
  Telephone network design
    Original application, algorithm due to Kernighan
  Load Balancing while Minimizing Communication
  Operations on Sparse Matrices
    Analogous to graphs with |E| << |N|^2
    Important kernel in scientific computing
  VLSI Layout
    N = {units on chip}, E = {wires}, WE(j,k) = wire length
  Data mining and clustering
  Physical Mapping of DNA
  Image Segmentation
  Speaker note: invented by Kernighan, who also wrote the first “hello, world” program, coined the name “UNIX”, and cowrote the standard book on C. What else did Kernighan invent? Not C (just the book); the name of Unix; “hello, world”; AWK (K = Kernighan).

A Sparse Matrix/Graph You Use Every Day
  You probably use a sparse matrix every day, albeit indirectly. Here, I show a million x million submatrix of the web connectivity graph, constructed from an archive at the Stanford WebBase.

Cost of Graph Partitioning
  Many possible partitionings to search
  Just to divide n = |N| vertices into 2 equal subsets there are
    $\binom{n}{n/2} = \frac{n!}{((n/2)!)^2} \approx \sqrt{\frac{2}{\pi n}} \cdot 2^n$ possibilities
  Choosing the optimal partitioning is NP-complete
  We need good heuristics
    Goal: running time O(|E|)
  Speaker note: may want to animate edge counts and ask “which is best” without giving away the answer.
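
To get a feel for how quickly the search space grows, the exact count can be compared with the Stirling-based estimate. This is an illustrative sketch; the function names are made up for the example.

```python
import math

def num_bisections(n):
    """Exact number of ways to choose n/2 of n vertices (n even)."""
    return math.comb(n, n // 2)

def stirling_estimate(n):
    """The approximation sqrt(2 / (pi * n)) * 2**n from the slide."""
    return math.sqrt(2 / (math.pi * n)) * 2 ** n

for n in (10, 20, 40):
    exact = num_bisections(n)
    print(n, exact, round(stirling_estimate(n) / exact, 4))
```

The printed ratio approaches 1 as n grows, while the count itself roughly doubles with every added vertex, which is why exhaustive search is hopeless beyond tiny graphs.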


First Heuristic: Repeated Graph Bisection
  To partition N into 2^k parts, bisect the graph recursively k times
  Henceforth discuss mostly graph bisection
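
Recursive bisection is a thin wrapper around whatever bisection heuristic is available. In the sketch below, `split_in_half` is a hypothetical stand-in that simply cuts a list in two; a real heuristic (Kernighan/Lin, spectral bisection) would take its place.

```python
def recursive_bisection(nodes, k, bisect):
    """Split `nodes` into 2**k parts by applying `bisect` recursively k times."""
    if k == 0:
        return [nodes]
    a, b = bisect(nodes)
    return recursive_bisection(a, k - 1, bisect) + recursive_bisection(b, k - 1, bisect)

def split_in_half(nodes):
    """Placeholder bisection: ignores edges and splits the list in the middle."""
    mid = len(nodes) // 2
    return nodes[:mid], nodes[mid:]

parts = recursive_bisection(list(range(8)), 2, split_in_half)
print(parts)
```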

Overview of Bisection Heuristics
  Partitioning with Nodal Coordinates
    Each node has x,y coordinates and is (mostly) connected just to its nearest neighbors in space, so partition space
  Partitioning without Nodal Coordinates
    E.g., sparse matrix of Web documents
      A(j,k) = # times keyword j appears in URL k
  Multilevel acceleration (BIG IDEA)
    Approximate problem by “coarse graph,” do so recursively


Coordinate-Free: Kernighan/Lin
  Take an initial partition and iteratively improve it
    Kernighan/Lin (1970), cost = O(|N|^3) but easy to understand
    Fiduccia/Mattheyses (1982), cost = O(|E|), much better, but more complicated
  Given G = (N,E,WE) and a partitioning N = A U B, where |A| = |B|
    T = cost(A,B) = Σ {W(e) where e connects nodes in A and B}
    Find subsets X ⊂ A and Y ⊂ B with |X| = |Y|
    Consider swapping X and Y if it decreases cost:
      newA = (A - X) U Y and newB = (B - Y) U X
      newT = cost(newA, newB) < T = cost(A,B)
  Need to compute newT efficiently for many possible X and Y, choose smallest (best)

Coordinate-Free: Kernighan/Lin
  Def: Suppose N = A U B, a ∈ A, b ∈ B. Then
    gain(a,b) = gain from swapping a and b
              = sum of edge weights crossing between A and B
                - sum of edge weights crossing between A-{a}U{b} and B-{b}U{a}
  If gain(a,b) > 0, swapping a and b helps, else not
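
The gain does not require recomputing both cuts from scratch. A sketch of the usual Kernighan/Lin bookkeeping follows; the symbols E, I, D and w are notation introduced here for illustration and do not appear on the slide. E(a) is the weight of a's edges that cross the partition, I(a) the weight of its edges that stay inside its own side.

```latex
E(a) = \sum_{v \in B} w(a,v), \qquad
I(a) = \sum_{v \in A,\ v \neq a} w(a,v), \qquad
D(a) = E(a) - I(a)

\mathrm{gain}(a,b) = D(a) + D(b) - 2\,w(a,b)
```

Swapping a and b turns a's external edges into internal ones and vice versa, and likewise for b. The edge (a,b) itself, if present, is still cut after the swap but was counted in both E(a) and E(b), hence the correction term 2 w(a,b).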

Coordinate-Free: Kernighan/Lin
  Repeat
    … greedily try n/2 pairs (a,b) to swap, where a ∈ A, b ∈ B, n = |N|
    unmark all vertices
    while there are unmarked vertices
      pick unmarked pair (a,b) maximizing gain(a,b)
        … note that gain(a,b) might be negative
      mark a and b
      update all values of gain(x,y) for all unmarked x, y
        as though we had swapped a and b (but don’t swap them!)
    end while
    … at this point we have a sequence of possible pairs to swap
    …   (a_1,b_1), (a_2,b_2), …, (a_{n/2},b_{n/2}), with gain_1, gain_2, …, gain_{n/2}
    … Suppose we swap X = {a_1,…,a_m} and Y = {b_1,…,b_m},
    …   then the total gain would be Gain(m) = gain_1 + gain_2 + … + gain_m
    Choose m to maximize Gain(m)
    If Gain(m) > 0, swap X = {a_1,…,a_m} and Y = {b_1,…,b_m}
  until Gain(m) ≤ 0 … no improvement possible
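
The loop above can be sketched in Python as follows. This is an unoptimized illustration under stated assumptions, not the lecture's implementation: the graph is a symmetric dict of dicts built by the hypothetical helper `make_graph`, and gains are computed with the standard D-value formula gain(a,b) = D(a) + D(b) - 2 w(a,b), which is not spelled out on the slide.

```python
def make_graph(edges):
    """Symmetric weighted adjacency dict from (u, v, weight) triples."""
    w = {}
    for u, v, wt in edges:
        w.setdefault(u, {})[v] = wt
        w.setdefault(v, {})[u] = wt
    return w

def cut_weight(w, A, B):
    return sum(w[a].get(b, 0) for a in A for b in B)

def kl_pass(w, A, B):
    """One sweep of the Repeat loop. Returns (newA, newB, best Gain(m))."""
    A, B = set(A), set(B)
    D = {}  # D[v] = external weight - internal weight of v
    for v in A | B:
        own = A if v in A else B
        D[v] = sum(-wt if u in own else wt for u, wt in w[v].items())
    free_A, free_B = set(A), set(B)          # the unmarked vertices
    pairs, gains = [], []
    while free_A and free_B:
        # Pick the unmarked pair with the largest gain (it may be negative).
        g, a, b = max(((D[a] + D[b] - 2 * w[a].get(b, 0), a, b)
                       for a in free_A for b in free_B), key=lambda t: t[0])
        free_A.remove(a)
        free_B.remove(b)
        pairs.append((a, b))
        gains.append(g)
        # Update D as though a and b had been swapped (they are not, yet).
        for x in free_A:
            D[x] += 2 * w[x].get(a, 0) - 2 * w[x].get(b, 0)
        for y in free_B:
            D[y] += 2 * w[y].get(b, 0) - 2 * w[y].get(a, 0)
    # Choose the prefix length m that maximizes Gain(m).
    best_m, best_gain, running = 0, 0, 0
    for m, g in enumerate(gains, start=1):
        running += g
        if running > best_gain:
            best_m, best_gain = m, running
    X = {a for a, _ in pairs[:best_m]}
    Y = {b for _, b in pairs[:best_m]}
    return (A - X) | Y, (B - Y) | X, best_gain

def kernighan_lin(w, A, B):
    while True:
        A, B, gain = kl_pass(w, A, B)
        if gain <= 0:                        # no improvement possible
            return A, B

# Two triangles joined by the single edge (2, 3), started from a poor bisection.
w = make_graph([(0, 1, 1), (0, 2, 1), (1, 2, 1), (2, 3, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1)])
A, B = kernighan_lin(w, {0, 1, 3}, {2, 4, 5})
print(sorted(A), sorted(B), cut_weight(w, A, B))
```

Scanning all unmarked pairs at every step is what gives the O(|N|^3) cost per sweep quoted earlier. Fiduccia/Mattheyses avoids this with cleverer data structures.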

Comments on Kernighan/Lin Algorithm
  Some gain_k may be negative, but if later gains are large, then the final Gain(m) may be positive
    So the algorithm can escape “local minima” where switching no single pair helps
  How many times do we Repeat?
    K/L tested on very small graphs (|N| <= 360) and got convergence after 2-4 sweeps
    For random graphs (of theoretical interest) the probability of convergence in one step appears to drop like 2^(-|N|/30)
  Better: multilevel approach (next)


Introduction to Multilevel Partitioning
  If we want to partition G(N,E), but it is too big to do efficiently, what can we do?
    1) Replace G(N,E) by a coarse approximation Gc(Nc,Ec), and partition Gc instead
    2) Use partition of Gc to get a rough partitioning of G, and then iteratively improve it
  What if Gc is still too big?
    Apply same idea recursively

Multilevel Partitioning - High Level Algorithm
  (A,B) = Multilevel_Partition(N, E)
    … recursive partitioning routine returns A and B where N = A U B
    if |N| is small
      (1) Partition G = (N,E) directly to get N = A U B
          Return (A,B)
    else
      (2) Coarsen G to get an approximation Gc = (Nc, Ec)
      (3) (Ac,Bc) = Multilevel_Partition(Nc, Ec)
      (4) Expand (Ac,Bc) to a partition (A,B) of N
      (5) Improve the partition (A,B)
          Return (A,B)
    endif
  [Figure: “V-cycle”: steps (2,3) repeat going down to coarser graphs, step (1) at the bottom, steps (4) and (5) repeat coming back up]
  How do we Coarsen? Expand? Improve?

Multilevel Kernighan-Lin
  Coarsen graph and expand partition using maximal matchings
  Improve partition using Kernighan-Lin

Maximal Matching
  Definition: A matching of a graph G(N,E) is a subset Em of E such that no two edges in Em share an endpoint
  Definition: A maximal matching of a graph G(N,E) is a matching Em to which no more edges can be added and remain a matching
  A simple greedy algorithm computes a maximal matching (recall section 9.2.1):
    Em = { }
    for i = 1 to |E|
      if i-th edge shares no endpoint with edges in Em, add i-th edge to Em
    end
  A node-based version of the same greedy idea:
    let Em be empty
    mark all nodes in N as unmatched
    for i = 1 to |N| … visit the nodes in any order
      if i has not been matched
        mark i as matched
        if there is an edge e=(i,j) where j is also unmatched,
          add e to Em
          mark j as matched
        endif
      endif
    endfor
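
The node-based greedy algorithm translates almost line for line into Python. In this sketch the adjacency-list representation `adj` and the 5-node path used as input are assumptions made for the example.

```python
def maximal_matching(nodes, adj):
    """Greedy maximal matching; adj[i] lists the neighbors of node i."""
    matched = set()
    Em = []
    for i in nodes:                      # visit the nodes in any order
        if i in matched:
            continue
        matched.add(i)
        for j in adj[i]:
            if j not in matched:         # first unmatched neighbor wins
                Em.append((i, j))
                matched.add(j)
                break
    return Em

# Path 0 - 1 - 2 - 3 - 4
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
Em = maximal_matching(range(5), adj)
print(Em)
```

Node 4 ends up unmatched because its only neighbor is already taken. The matching is maximal (no edge can be added) but a different visiting order could produce a different matching.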

Maximal Matching: Example

Example of Coarsening

Expanding a partition of Gc to a partition of G
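
Since the figures for these three slides did not survive, here is one plausible way to coarsen with a matching and expand a coarse partition. It is a sketch under explicit assumptions: each matched edge collapses to one coarse node, each unmatched node becomes its own coarse node, and the weights of parallel edges between two coarse nodes are summed. The names `coarsen`, `expand` and `parent` are hypothetical.

```python
def coarsen(adj, matching):
    """adj[u][v] = edge weight (symmetric). Returns (coarse_adj, parent),
    where parent[u] is the coarse node that fine node u was merged into."""
    parent = {}
    for c, (u, v) in enumerate(matching):
        parent[u] = parent[v] = c        # a matched edge becomes one coarse node
    next_id = len(matching)
    for u in adj:
        if u not in parent:              # unmatched node: its own coarse node
            parent[u] = next_id
            next_id += 1
    coarse = {c: {} for c in range(next_id)}
    for u in adj:
        for v, wt in adj[u].items():
            cu, cv = parent[u], parent[v]
            if cu != cv:                 # edges inside a coarse node disappear
                coarse[cu][cv] = coarse[cu].get(cv, 0) + wt
    return coarse, parent

def expand(parent, Ac):
    """Fine node u goes to A exactly when its coarse node is in Ac."""
    A = {u for u, c in parent.items() if c in Ac}
    return A, set(parent) - A

# 4-cycle 0 - 1 - 2 - 3 - 0 with unit weights, matching {(0,1), (2,3)}
adj = {0: {1: 1, 3: 1}, 1: {0: 1, 2: 1}, 2: {1: 1, 3: 1}, 3: {2: 1, 0: 1}}
coarse, parent = coarsen(adj, [(0, 1), (2, 3)])
A, B = expand(parent, {0})
print(coarse, A, B)
```

Summing the weights means the cut weight of a coarse partition equals the cut weight of its expansion, so improving the coarse partition is meaningful for the fine graph.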

Coordinate-Free: Spectral Bisection
  Based on theory of Fiedler (1970s), popularized by Pothen, Simon, Liou (1990)
  Motivation, by analogy to a vibrating string
  Basic definitions
  Vibrating string, revisited
  Implementation using linear algebra

Motivation for Spectral Bisection
  Vibrating string
    Think of G = 1D mesh as masses (nodes) connected by springs (edges), i.e. a string that can vibrate
    Vibrating string has modes of vibration, or harmonics
    Label nodes by whether the mode is - or + to partition into N- and N+
  Same idea for other graphs (e.g. planar graph ~ trampoline)
  Speaker note: A above middle C = 440 Hz -> high A = 880 Hz -> high E (1319 Hz)

2nd eigenvector of L(planar mesh)

4th eigenvector of L(planar mesh)

Details for Vibrating String Analogy: Recall F = ma
  Force on mass j = k*[x(j-1) - x(j)] + k*[x(j+1) - x(j)]
                  = -k*[-x(j-1) + 2*x(j) - x(j+1)]
  F = ma yields m*x’’(j) = -k*[-x(j-1) + 2*x(j) - x(j+1)]   (*)
  Writing (*) for j = 1,2,…,n yields

  $$
  m \frac{d^2}{dt^2}
  \begin{bmatrix} x(1) \\ x(2) \\ \vdots \\ x(j) \\ \vdots \\ x(n) \end{bmatrix}
  = -k
  \begin{bmatrix} x(1) - x(2) \\ -x(1) + 2x(2) - x(3) \\ \vdots \\ -x(j-1) + 2x(j) - x(j+1) \\ \vdots \\ -x(n-1) + x(n) \end{bmatrix}
  = -k
  \begin{bmatrix} 1 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{bmatrix}
  \begin{bmatrix} x(1) \\ x(2) \\ \vdots \\ x(n) \end{bmatrix}
  = -k L x
  $$

  The first and last rows are written for end masses attached to one neighbor only (free ends). With fixed boundary conditions x(0) = x(n+1) = 0, the two corner entries of the matrix would be 2 instead of 1.
  -(m/k) x’’ = L*x

Details for Vibrating String (continued)
  -(m/k) x’’ = L*x, where x = [x1,x2,…,xn]^T
  Seek solution of form x(t) = sin(a*t) * x0
    L * x0 = (m/k)*a^2 * x0 ≡ λ * x0
    So λ is an eigenvalue of L and x0 is its eigenvector
    λ determines the frequency of vibration (a = sqrt(k*λ/m)), x0 is its “shape”
  L is called the Laplacian of the graph (a 1D mesh in this example):

  $$
  L = \begin{bmatrix} 1 & -1 & & & \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ & & & -1 & 1 \end{bmatrix}
  $$
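
For this particular L (the free-ended 1D mesh) the modes are known in closed form, which makes the eigenvalue relation easy to check numerically. The closed form used below, λ_k = 2 - 2 cos(πk/n) with shape cos(πk(j + 1/2)/n), is a standard result that the slides do not state. The function names are illustrative.

```python
import math

def path_laplacian(n):
    """Laplacian of the 1D mesh with n nodes (corner entries are 1)."""
    L = [[0] * n for _ in range(n)]
    for j in range(n - 1):               # edge between node j and node j + 1
        L[j][j] += 1
        L[j + 1][j + 1] += 1
        L[j][j + 1] = L[j + 1][j] = -1
    return L

def mode(n, k):
    """k-th mode (k = 0 .. n-1): eigenvalue and shape x0."""
    lam = 2 - 2 * math.cos(math.pi * k / n)
    x0 = [math.cos(math.pi * k * (j + 0.5) / n) for j in range(n)]
    return lam, x0

def residual(L, lam, x0):
    """Largest entry of |L*x0 - lam*x0|; near 0 for a true eigenpair."""
    return max(abs(sum(a * x for a, x in zip(row, x0)) - lam * xi)
               for row, xi in zip(L, x0))

n = 9
L = path_laplacian(n)
for k in range(n):
    lam, x0 = mode(n, k)
    print(k, round(lam, 4), residual(L, lam, x0) < 1e-12)
```

Mode k = 0 is the constant vector with λ = 0. Mode k = 1 changes sign exactly once along the string, which is the property spectral bisection exploits.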

Laplacian Matrix of a Graph
  Definition: The Laplacian matrix L(G) of a graph G(N,E) is an |N| by |N| symmetric matrix, with one row and column for each node. It is defined by
    L(G)(i,i) = degree of node i (number of incident edges)
    L(G)(i,j) = -1 if i ≠ j and there is an edge (i,j)
    L(G)(i,j) = 0 otherwise

Example of L(G) for Simple Meshes

Properties of the Laplacian Matrix
  The eigenvalues of L(G) are real and nonnegative: 0 = λ1 ≤ λ2 ≤ … ≤ λn
  The number of connected components of G is equal to the number of λi equal to 0
  Definition: λ2 is the algebraic connectivity of G
    The magnitude of λ2 measures connectivity
    In particular, λ2 ≠ 0 if and only if G is connected
  Theorem (Fiedler, 75): The number of edges cut by any bisection of G into two equal halves (including the optimal one) is at least λ2 * n / 4, where n = |N|
  Spectral bisection algorithm to partition N = N- U N+
    Compute eigenvector v2 for λ2: L(G) * v2 = λ2 * v2
    For each node n of G
      if v2(n) < 0 put node n in N-, else put node n in N+
  Theorem (Fiedler): N- and N+ are connected (if no v2(n) = 0)
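
The spectral bisection algorithm can be tried on a small graph with nothing but the standard library. The sketch below swaps in a shifted power iteration to approximate v2; this is a simpler technique chosen for illustration, not the Lanczos method used in practice. It assumes a connected graph and a starting vector that is not orthogonal to v2.

```python
def fiedler_vector(L, iters=500):
    """Approximate v2 by power iteration on (shift*I - L), with the constant
    eigenvector v1 projected out at every step."""
    n = len(L)
    shift = 2 * max(L[i][i] for i in range(n))   # >= largest eigenvalue of L
    x = [float(i) for i in range(n)]             # arbitrary deterministic start
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]              # remove the component along v1
        y = [shift * x[i] - sum(L[i][j] * x[j] for j in range(n))
             for i in range(n)]
        norm = sum(yi * yi for yi in y) ** 0.5
        x = [yi / norm for yi in y]
    return x

def spectral_bisection(L):
    v2 = fiedler_vector(L)
    N_minus = {i for i, c in enumerate(v2) if c < 0}
    return N_minus, set(range(len(v2))) - N_minus

# Laplacian of the 1D mesh 0 - 1 - 2 - 3 - 4 - 5
L = [[ 1, -1,  0,  0,  0,  0],
     [-1,  2, -1,  0,  0,  0],
     [ 0, -1,  2, -1,  0,  0],
     [ 0,  0, -1,  2, -1,  0],
     [ 0,  0,  0, -1,  2, -1],
     [ 0,  0,  0,  0, -1,  1]]
N_minus, N_plus = spectral_bisection(L)
print(sorted(N_minus), sorted(N_plus))
```

The string is split in the middle, cutting a single edge. Which half is labelled N- depends on the sign of the computed eigenvector, which is arbitrary.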

How do we compute v2 and λ2?
  Implementation via the Lanczos Algorithm
    Approximation, requiring a number of matrix-vector multiplies by L
    Studied in linear algebra classes
  Have we made progress?
    To optimize matrix-vector-multiply for a sparse matrix, we partition the graph of the matrix
    To partition the graph, we find an eigenvector of the Laplacian matrix L associated with the graph
    To find an eigenvector, we do matrix-vector-multiplies by L
    Sounds like chicken and egg…
  The first matrix-vector multiplies are slow, but we use them to learn how to make the rest faster
    Useful when we want to do many sparse-matrix (or similar) operations
    Useful when the goal is the partition itself (e.g. for image segmentation)

Multilevel Spectral Bisection
  Coarsen graph and expand partition using maximal independent sets

Maximal Independent Sets
  Definition: An independent set of a graph G(N,E) is a subset Ni of N such that no two nodes in Ni are connected by an edge
  Definition: A maximal independent set of a graph G(N,E) is an independent set Ni to which no more nodes can be added and remain an independent set
  A simple greedy algorithm computes a maximal independent set:
    let Ni be empty
    for k = 1 to |N| … visit the nodes in any order
      if node k is not adjacent to any node already in Ni
        add k to Ni
      endif
    endfor
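
A direct Python rendering of the greedy algorithm, as a sketch. The adjacency-list input `adj` and the 9-node 1D mesh are assumptions made for the example.

```python
def maximal_independent_set(nodes, adj):
    """Greedy maximal independent set; adj[k] lists the neighbors of node k."""
    Ni = set()
    for k in nodes:                              # visit the nodes in any order
        if all(j not in Ni for j in adj[k]):     # k touches nothing chosen so far
            Ni.add(k)
    return Ni

# 1D mesh with 9 nodes: 0 - 1 - 2 - ... - 8
n = 9
adj = {k: [j for j in (k - 1, k + 1) if 0 <= j < n] for k in range(n)}
Ni = maximal_independent_set(range(n), adj)
print(sorted(Ni))
```

Visiting the nodes left to right keeps every other node, so the coarse graph has about half as many nodes as the fine one.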

Example of Coarsening
  [Figure legend: each outlined region encloses a domain Dk = one node of Nc]

Expanding a partition of Gc to a partition of G
  Need to convert an eigenvector vc of L(Gc) to an approximate eigenvector v of L(G)
  Use interpolation:
    For each node j in N
      if j is also a node in Nc, then
        v(j) = vc(j) … use same eigenvector component
      else
        v(j) = average of vc(k) for all neighbors k of j in Nc
      endif
    endfor
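
The interpolation step is short enough to sketch in full. The coarse values `vc` below are made-up numbers on the nodes {0, 2, 4, 6, 8} of a 9-node 1D mesh, used only to show the averaging. Because Nc is a maximal independent set, every node outside Nc has at least one neighbor in Nc, so the average is always defined.

```python
def interpolate(adj, vc):
    """Extend values vc on the coarse nodes to all nodes of the fine graph."""
    v = {}
    for j in adj:
        if j in vc:
            v[j] = vc[j]                          # use same eigenvector component
        else:
            nbrs = [k for k in adj[j] if k in vc]
            v[j] = sum(vc[k] for k in nbrs) / len(nbrs)
    return v

n = 9
adj = {k: [j for j in (k - 1, k + 1) if 0 <= j < n] for k in range(n)}
vc = {0: 1.0, 2: 0.5, 4: 0.0, 6: -0.5, 8: -1.0}   # hypothetical coarse eigenvector
v = interpolate(adj, vc)
print([v[j] for j in range(n)])
```

The result is only an approximate eigenvector of L(G). It serves as a good starting guess for a few refinement iterations on the fine graph.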

Example: 1D mesh of 9 nodes

Available Software Implementations
  Multilevel Kernighan/Lin
    METIS (www.cs.umn.edu/~metis)
    ParMETIS - parallel version
  Multilevel Spectral Bisection
    S. Barnard and H. Simon, “A fast multilevel implementation of recursive spectral bisection …”, Proc. 6th SIAM Conf. On Parallel Processing, 1993
    Chaco (www.cs.sandia.gov/CRF/papers_chaco.html)
  Hybrids possible
    Ex: Using Kernighan/Lin to improve a partition from spectral bisection
  Recent package, collection of techniques
    Zoltan (www.cs.sandia.gov/Zoltan)

Extra Slides

Comparison of methods
  Compare only methods that use edges, not nodal coordinates
    CS267 webpage and KK95a (see below) have other comparisons
  Metrics
    Speed of partitioning
    Number of edge cuts
    Other application dependent metrics
  Summary
    No one method best
    Multi-level Kernighan/Lin fastest by far, comparable to Spectral in the number of edge cuts
      www-users.cs.umn.edu/~karypis/metis/publications/main.html
      see publications KK95a and KK95b
    Spectral gives much better cuts for some applications
      Ex: image segmentation
      See “Normalized Cuts and Image Segmentation” by J. Malik, J. Shi

Number of edges cut for a 64-way partition
  For Multilevel Kernighan/Lin, as implemented in METIS (see KK95a)

  Graph     Description      # of Nodes  # of Edges  # Edges cut  Expected # cuts  Expected # cuts
                                                     (64-way)     for 2D mesh      for 3D mesh
  144       3D FE Mesh           144649     1074393        88806             6427            31805
  4ELT      2D FE Mesh            15606       45878         2965             2111             7208
  ADD32     32 bit adder           4960        9462          675             1190             3357
  AUTO      (missing)            448695     3314611       194436            11320            67647
  BBMAT     2D Stiffness M.       38744      993481        55753             3326            13215
  FINAN512  Lin. Prog.            74752      261120        11388             4620            20481
  LHR10     Chem. Eng.            10672      209093        58784             1746             5595
  MAP1      Highway Net.         267241      334931         1388             8736            47887
  MEMPLUS   Memory circuit        17758       54196        17894             2252             7856
  SHYY161   Navier-Stokes         76480      152002         4365             4674            20796
  TORSO     (missing)            201142     1479989       117997             7579            39623

  Expected # cuts for 64-way partition of 2D mesh of n nodes:
    $n^{1/2} + 2(n/2)^{1/2} + 4(n/4)^{1/2} + \dots + 32(n/32)^{1/2} \approx 17\, n^{1/2}$
  Expected # cuts for 64-way partition of 3D mesh of n nodes:
    $n^{2/3} + 2(n/2)^{2/3} + 4(n/4)^{2/3} + \dots + 32(n/32)^{2/3} \approx 11.5\, n^{2/3}$
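
The two "expected" columns follow from the sums above: six levels of recursive bisection, where level i makes 2^i cuts, each through a mesh of n/2^i nodes. A small sketch (the function name is invented for the example) evaluates the sums directly:

```python
def expected_cuts(n, dim, levels=6):
    """Sum of 2**i * (n / 2**i) ** ((dim - 1) / dim) for i = 0 .. levels-1.
    levels=6 gives a 2**6 = 64-way partition."""
    exponent = (dim - 1) / dim
    return sum(2 ** i * (n / 2 ** i) ** exponent for i in range(levels))

for name, n in (("144", 144649), ("4ELT", 15606)):
    print(name, round(expected_cuts(n, 2)), round(expected_cuts(n, 3)))
```

For graph 144 this gives about 6427 and 31805, matching the table. The constants 17 and 11.5 are the rounded values of the sums 1 + 2^(1/2) + … + 2^(5/2) ≈ 16.9 and 1 + 2^(1/3) + … + 2^(5/3) ≈ 11.54.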

Speed of 256-way partitioning (from KK95a)
  Partitioning time in seconds

  Graph     Description      # of Nodes  # of Edges  Multilevel Spectral  Multilevel
                                                     Bisection            Kernighan/Lin
  144       3D FE Mesh           144649     1074393                607.3           48.1
  4ELT      2D FE Mesh            15606       45878                 25.0            3.1
  ADD32     32 bit adder           4960        9462                 18.7            1.6
  AUTO      (missing)            448695     3314611               2214.2          179.2
  BBMAT     2D Stiffness M.       38744      993481                474.2           25.5
  FINAN512  Lin. Prog.            74752      261120                311.0           18.0
  LHR10     Chem. Eng.            10672      209093                142.6            8.1
  MAP1      Highway Net.         267241      334931                850.2           44.8
  MEMPLUS   Memory circuit        17758       54196                117.9            4.3
  SHYY161   Navier-Stokes         76480      152002                130.0           10.1
  TORSO     (missing)            201142     1479989               1053.4           63.9

  Kernighan/Lin much faster than Spectral Bisection!

Beyond Simple Graph Partitioning
  Undirected graphs model symmetric matrices, not unsymmetric ones
  More general graph models include:
    Hypergraph: nodes are computation, edges are communication, but connected to a set (>= 2) of nodes
      HMETIS package
    Bipartite model: use bipartite graph for directed graph
    Multi-object, Multi-Constraint model: use when single structure may involve multiple computations with differing costs
  For more see Bruce Hendrickson’s web page
    www.cs.sandia.gov/~bahendr/partitioning.html
    “Load Balancing Myths, Fictions & Legends”

Coordinate-Free Partitioning: Summary
  Several techniques for partitioning without coordinates
    Breadth-First Search: simple, but not great partition
    Kernighan-Lin: good corrector given reasonable partition
    Spectral Method: good partitions, but slow
  Multilevel methods
    Used to speed up problems that are too large/slow
    Coarsen, partition, expand, improve
    Can be used with K-L and Spectral methods and others
  Speed/quality
    For load balancing of grids, multi-level K-L probably best
    For other partitioning problems (vision, clustering, etc.) spectral may be better
  Good software available

Is Graph Partitioning a Solved Problem?
  Myths of partitioning due to Bruce Hendrickson
    Edge cut = communication cost
    Simple graphs are sufficient
    Edge cut is the right metric
    Existing tools solve the problem
    Key is finding the right partition
    Graph partitioning is a solved problem
  Slides and myths based on Bruce Hendrickson’s “Load Balancing Myths, Fictions & Legends”

Myth 1: Edge Cut = Communication Cost
  Myth 1: The edge-cut deceit
    edge-cut = communication cost
  Not quite true:
    # vertices on boundary is the actual communication volume
      Do not communicate same node value twice
    Cost of communication depends on # of messages too (α term, i.e. per-message latency)
    Congestion may also affect communication cost
  Why is this OK for most applications?
    Mesh-based problems match the model: cost is ~ edge cuts
    Other problems (data mining, etc.) do not
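
A tiny example makes the gap between the two measures concrete. The graph below is hypothetical: two nodes on one processor, each connected to all three nodes on the other.

```python
def edge_cut(edges, part):
    """Number of edges whose endpoints lie in different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])

def boundary_vertices(edges, part):
    """Vertices with at least one neighbor in another part. Each value needs to
    be sent only once, however many cut edges the vertex has."""
    return {x for u, v in edges if part[u] != part[v] for x in (u, v)}

edges = [(a, b) for a in ("a1", "a2") for b in ("b1", "b2", "b3")]
part = {"a1": 0, "a2": 0, "b1": 1, "b2": 1, "b3": 1}
print(edge_cut(edges, part), len(boundary_vertices(edges, part)))
```

Six edges are cut, but only five node values have to cross the boundary. On meshes, where each boundary node has few cut edges, the two numbers track each other closely. On denser graphs they can diverge widely.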

Myth 2: Simple Graphs are Sufficient
  Graphs often used to encode data dependencies
    Do X before doing Y
  Graph partitioning determines data partitioning
    Assumes graph nodes can be evaluated in parallel
    Communication on edges can also be done in parallel
    Only dependence is between sweeps over the graph
  More general graph models include:
    Hypergraph: nodes are computation, edges are communication, but connected to a set (>= 2) of nodes
    Bipartite model: use bipartite graph for directed graph
    Multi-object, Multi-Constraint model: use when single structure may involve multiple computations with differing costs

Myth 3: Partition Quality is Paramount
  When structures are changing dynamically during a simulation, need to partition dynamically
    Speed may be more important than quality
    Partitioner must run fast in parallel
    Partition should be incremental
      Change minimally relative to prior one
    Must not use too much memory
  Example from Touheed, Selwood, Jimack and Berzins
    1 M elements with adaptive refinement on SGI Origin
    Timing data for different partitioning algorithms:
      Repartition time: 3.0 to 15.2 secs
      Migration time: 17.8 to 37.8 secs
      Solve time: 2.54 to 3.11 secs

References
  Details of all proofs on Jim Demmel’s 267 web page
  A. Pothen, H. Simon, K.-P. Liou, “Partitioning sparse matrices with eigenvectors of graphs”, SIAM J. Mat. Anal. Appl. 11:430-452 (1990)
  M. Fiedler, “Algebraic Connectivity of Graphs”, Czech. Math. J., 23:298-305 (1973)
  M. Fiedler, Czech. Math. J., 25:619-637 (1975)
  B. Parlett, “The Symmetric Eigenproblem”, Prentice-Hall, 1980
  www.cs.berkeley.edu/~ruhe/lantplht/lantplht.html
  www.netlib.org/laso

Summary
  Partitioning with nodal coordinates:
    Inertial method
    Projection onto a sphere
    Algorithms are efficient
    Rely on graphs having nodes connected (mostly) to “nearest neighbors” in space
  Partitioning without nodal coordinates:
    Breadth-First Search: simple, but not great partition
    Kernighan-Lin: good corrector given reasonable partition
    Spectral Method: good partitions, but slow
  Today:
    Spectral methods revisited
    Multilevel methods

Another Example
  Definition: The Laplacian matrix L(G) of a graph G(N,E) is an |N| by |N| symmetric matrix, with one row and column for each node. It is defined by
    L(G)(i,i) = degree of node i (number of incident edges)
    L(G)(i,j) = -1 if i != j and there is an edge (i,j)
    L(G)(i,j) = 0 otherwise
  [Figure: graph G on nodes 1 to 5 with edges (1,2), (1,3), (2,3), (3,4), (3,5), (4,5)]

  $$
  L(G) = \begin{bmatrix}
   2 & -1 & -1 &  0 &  0 \\
  -1 &  2 & -1 &  0 &  0 \\
  -1 & -1 &  4 & -1 & -1 \\
   0 &  0 & -1 &  2 & -1 \\
   0 &  0 & -1 & -1 &  2
  \end{bmatrix}
  $$
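
The definition translates into a few lines of code. This sketch builds L(G) from an edge list and reproduces the 5-node example, with the nodes renumbered 0 to 4 for Python indexing. The function name `laplacian` is chosen for the example.

```python
def laplacian(n, edges):
    """Laplacian of an undirected graph on nodes 0 .. n-1 without repeated edges."""
    L = [[0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1                 # each incident edge adds 1 to the degree
        L[j][j] += 1
        L[i][j] = L[j][i] = -1
    return L

# Two triangles sharing the middle node (node 3 on the slide, node 2 here).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
L = laplacian(5, edges)
for row in L:
    print(row)
```

Every row sums to zero, which is why the constant vector is always an eigenvector with eigenvalue 0.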