Published by Stanley Nichols. Modified over 6 years ago.
Hypergraph-Partitioning Approaches for Workload Decomposition
Ümit V. Çatalyürek (Department of Biomedical Informatics, The Ohio State University) and Cevdet Aykanat (Department of Computer Engineering, Bilkent University)
What do we mean by Decomposition
Decomposition/partitioning of computation into smaller units of work or work groups
= Workload partitioning
= Workload assignment (but not mapping)
Divide the work and data for efficient parallel computation
CSCAPES Seminar - March 1st, 2007
Outline
Partitioning-based Decomposition Models / Why Hypergraphs
Standard Graph Model for SpMxV
Hypergraph Models for 1D Decomposition for SpMxV
Fine-Grain Hypergraph Model for 2D Decomposition
Proposed Two-phase Coarse-grain Decomposition for both graph and hypergraph models
Application: SpMxV
  Coarse-grain (checkerboard) decomposition of SpMxV
  SpMxV: Experiment Results
Conclusion
What People Have Done: Existing Graph Models
Standard graph model
Multi-constraint graph partitioning
Skewed partitioning
Bipartite graph model
Are they any good? They might be sufficient for some cases but not for all.
Why they are not sufficient
Flaws
  Wrong cost metric for communication volume
  Latency: # messages also important
  Minimize the maximum volume and/or # messages
  Processor distance: # of switches etc.
  All of the above? Some of the above?
Limitations
  Standard graph model can only express symmetric dependencies
    Directed graph: convert to undirected graph, weighting
  Symmetric = identical partitioning of input and output data
  Multiple computation phases (a solution: multi-constraint partitioning; we developed a multi-constraint hypergraph partitioner)
From: Bruce Hendrickson’s shortcomings of standard graph partitioning approaches
Blue text: our work helps on these issues
Light blue: not explicit minimization, but we put an upper bound on # msgs.
Directed graph, convert to undirected graph, weighting: convert directed edges to undirected edges; give weight 1 to a one-way dependency and 2 to a two-way dependency
Hypergraph Models
We proposed the use of hypergraphs for workload partitioning (PhD thesis, Bilkent’99)
  Coarser-grain: owner computes (mapping [HiPC’95], LP decomposition [Para’95], SpMxV [Irr’96, TPDS’99])
    1D partition in SpMxV
  Fine-grain: assign each operation between input & output [Irr’01, PPSC’01]
    2D fine-grain decomposition in SpMxV ($y_i = a_{ij} x_j$)
  Coarse-grain: 2D checkerboard partitioning in SpMxV
Advantages
  Correct cost for communication volume
  Naturally handles asymmetry
  Practical to use
    Public tools
    Good tools
  Ensures a better upper bound on the number of messages
Application: Parallel Matrix-Vector Multiplication y = Ax
Parallel iterative solvers
1D rowwise or columnwise partitioning of A
  symmetric partitioning: processor $P_k$ computes linear vector operations on the k-th blocks of the vectors
  rowwise: $P_k$ computes $y_k = A^r_k x$
    entries of the x-vector are communicated
  columnwise: $P_k$ computes $y^k = A^c_k x_k$, where $y = \sum_k y^k$
    entries of the $y^k$ vectors are communicated
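As a rough illustration of the two 1D schemes on this slide, the sketch below runs both on a single machine and checks that they agree with the plain product. The matrix, the vector, the block boundaries, and the helper name `matvec` are made-up illustration data, not taken from the talk.

```python
# Minimal sketch of 1D rowwise and columnwise SpMxV on K = 2 "processors".
# A, x, and blocks are hypothetical illustration data.

def matvec(A, x):
    """Dense reference product; sparsity is ignored for brevity."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = [
    [2, 0, 1, 0],
    [0, 3, 0, 0],
    [4, 0, 5, 0],
    [0, 6, 0, 7],
]
x = [1, 2, 3, 4]
blocks = [(0, 2), (2, 4)]  # index range owned by each processor P_k

# Rowwise: P_k owns row stripe A^r_k and computes y_k = A^r_k x.
# It needs x entries owned by other processors, so x entries are communicated.
y_row = []
for lo, hi in blocks:
    y_row.extend(matvec(A[lo:hi], x))

# Columnwise: P_k owns column stripe A^c_k and computes y^k = A^c_k x_k,
# a full-length partial vector. The y^k entries are communicated and summed.
partials = []
for lo, hi in blocks:
    stripe = [row[lo:hi] for row in A]
    partials.append(matvec(stripe, x[lo:hi]))
y_col = [sum(vals) for vals in zip(*partials)]

y_ref = matvec(A, x)
print(y_row, y_col, y_ref)
```

Both schemes produce the same y; they differ only in which vector entries cross processor boundaries.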
Graph Model for Representing Sparse Matrices
[Figure: example graph on vertices $v_1, \dots, v_{10}$]
edge $(v_i, v_j) \in E$ $\Rightarrow$ $y_i \leftarrow y_i + a_{ij} x_j$ and $y_j \leftarrow y_j + a_{ji} x_i$
exchange of $x_i$ and $x_j$ values before local matrix-vector products
Graph Model Minimizes the Wrong Metric
[Figure: vertex $v_i$ with neighbors $v_j, v_k, v_l, v_m, v_h$ in other parts]
$cost(\Pi) = 2 \times 5 = 10$ words, but the actual communication volume is 7 words
$P_1$ sends $x_i$ to both $P_2$ and $P_4$; $P_2$ and $P_4$ send $\{x_j, x_k, x_l\}$ and $\{x_m, x_h\}$, respectively, to $P_1$ (2 + 3 + 2 = 7 words)
Hypergraph Model for Representing Sparse Matrices for 1D Decomposition
Each {vertex, net} pair represents a unique nonzero
net-cut metric: $cutsize(\Pi) = \sum_{n_j \in N_E} w(n_j)$
connectivity - 1 metric: $cutsize(\Pi) = \sum_{n_j \in N_E} w(n_j)\,(c(n_j) - 1)$
Hypergraph models the Correct Metric
[Figure: hypergraph of the same example, with nets $n_i, n_j, n_k, n_l, n_m, n_h$ and parts $P_1$ to $P_4$]
connectivity - 1 values: $c(n_i) - 1 = 2$, $c(n_j) - 1 = c(n_k) - 1 = c(n_l) - 1 = c(n_m) - 1 = c(n_h) - 1 = 1$
connectivity - 1 metric: $cutsize(\Pi) = 2 + 1 + 1 + 1 + 1 + 1 = 7$ words
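The 10-versus-7 discrepancy can be reproduced with a few lines. The sketch below is an illustration under stated assumptions: the vertex names and part numbers follow the slide text, while the net contents assume the column-net reading (net $n_c$ holds the rows with a nonzero in column c) and nonzero diagonal entries.

```python
# Sketch: graph-model cost vs. hypergraph connectivity-1 cost for the example.
# Assumption: diagonal entries are nonzero, so each net n_v contains v itself.

part = {"i": 1, "j": 2, "k": 2, "l": 2, "m": 4, "h": 4}  # vertex -> processor

# Standard graph model: one edge of weight 2 per symmetric nonzero pair.
edges = [("i", v) for v in "jklmh"]
graph_cost = sum(2 for u, v in edges if part[u] != part[v])

# Hypergraph model: net n_c holds the rows with a nonzero in column c.
nets = {"i": set("ijklmh")}
for v in "jklmh":
    nets[v] = {"i", v}

def connectivity(pins):
    """Number of distinct parts touched by the pins of one net."""
    return len({part[p] for p in pins})

hyper_cost = sum(connectivity(pins) - 1 for pins in nets.values())

print(graph_cost, hyper_cost)
```

The graph model charges the word $x_i$ once per cut edge, while the hypergraph model charges it once per receiving part.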
Fine-Grain Hypergraph Model for 2D Decomposition
$M \times M$ matrix A with Z nonzeros is represented by $H = (V, N)$
Z vertices: one vertex $v_{ij}$ for each $a_{ij} \neq 0$
2M nets: one net for each row and for each column of A, $N = N_R \cup N_C$
  row nets: $N_R = \{m_1, m_2, \dots, m_M\}$
  column nets: $N_C = \{n_1, n_2, \dots, n_M\}$
$v_{ij} \in m_i$ and $v_{ij} \in n_j$ iff $a_{ij} \neq 0$
column-net $n_j$ represents the dependency of atomic tasks on $x_j$
row-net $m_i$ represents the dependency of computing $y_i$ on partial $y'_i$ results
[Figure: row-net $m_i$ ($r_i$ / $y_i$) with pins $v_{ih}, v_{ii}, v_{ik}, v_{ij}$ and column-net $n_j$ ($c_j$ / $x_j$) with pins $v_{ij}, v_{jj}, v_{lj}$]
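The construction above can be sketched directly from a list of nonzero coordinates. The function name `fine_grain_hypergraph` and the 3 x 3 pattern are hypothetical, chosen only for illustration.

```python
# Sketch: build the fine-grain hypergraph H = (V, N) from a nonzero pattern.
# The 3 x 3 pattern below is hypothetical illustration data.

def fine_grain_hypergraph(nonzeros):
    """Return (vertices, row_nets, col_nets) for a list of (i, j) nonzeros."""
    vertices = sorted(set(nonzeros))                # one vertex v_ij per nonzero
    row_nets, col_nets = {}, {}
    for (i, j) in vertices:
        row_nets.setdefault(i, []).append((i, j))   # v_ij is a pin of row-net m_i
        col_nets.setdefault(j, []).append((i, j))   # v_ij is a pin of column-net n_j
    return vertices, row_nets, col_nets

nonzeros = [(1, 1), (1, 3), (2, 2), (3, 1), (3, 3)]
V, NR, NC = fine_grain_hypergraph(nonzeros)

print(len(V), "vertices,", len(NR) + len(NC), "nets")
```

Every vertex is a pin of exactly two nets, so the total pin count is 2Z.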
Fine-Grain Hypergraph Model
[Figure: fine-grain hypergraph of an 8 x 8 example matrix, with one vertex for each nonzero, row-nets $m_1, \dots, m_8$, and column-nets $n_1, \dots, n_8$]
Nonzero vertices (row, column): (1,1), (1,2), (1,6), (2,2), (2,3), (2,5), (3,2), (3,3), (3,5), (4,1), (4,4), (4,7), (5,5), (5,7), (6,6), (6,8), (7,4), (7,6), (7,7), (8,4), (8,8)
Fine-Grain Hypergraph Model for 2D Decomposition
unit net weighting: $w(n) = 1$ for each net $n \in N$
use connectivity - 1 metric: $cutsize(\Pi) = \sum_{n_j \in N_E} (c(n_j) - 1)$
minimizing cutsize corresponds to minimizing the total volume of communication
consistency of the model: exact correspondence between cutsize and communication volume
  maintain symmetric partitioning: $y_i$, $x_i$ assigned to the same processor
  consistency condition: $v_{ii} \in n_i$ and $v_{ii} \in m_i$ for each vertex $v_{ii}$ (holds iff $a_{ii} \neq 0$)
consider a K-way partition $\Pi = \{V_1, V_2, \dots, V_K\}$ of $H = (V, N)$
  $\Pi$ induces a partition on the nonzeros of matrix A
  decode: $v_{ii} \in V_k$ $\Rightarrow$ assign $y_i$ and $x_i$ to processor $P_k$
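The claimed correspondence between cutsize and communication volume can be checked on a small case. In the sketch below the nonzero pattern and the 2-way partition are arbitrary assumptions; the diagonal is fully nonzero so that the consistency condition holds and the decode rule is defined.

```python
# Sketch: connectivity-1 cutsize of the fine-grain model vs. communication
# volume counted directly. Pattern and partition are hypothetical.

nonzeros = [(1, 1), (1, 2), (2, 2), (2, 3), (3, 1), (3, 3)]
part = {(1, 1): 0, (1, 2): 1, (2, 2): 1, (2, 3): 0, (3, 1): 1, (3, 3): 0}

# Decode: v_ii in V_k  =>  x_i and y_i are owned by processor P_k.
owner = {i: part[(i, j)] for (i, j) in nonzeros if i == j}

def cutsize():
    """Sum of (c(n) - 1) over all row and column nets, unit net weights."""
    total = 0
    for axis in (0, 1):                      # 0: row nets, 1: column nets
        nets = {}
        for v in nonzeros:
            nets.setdefault(v[axis], set()).add(part[v])
        total += sum(len(parts) - 1 for parts in nets.values())
    return total

def volume():
    """Words moved: x_j to each non-owner part using it, plus one partial
    y_i result from each non-owner part contributing to row i."""
    words = set()
    for (i, j) in nonzeros:
        p = part[(i, j)]
        if p != owner[j]:
            words.add(("x", j, p))
        if p != owner[i]:
            words.add(("y", i, p))
    return len(words)

print(cutsize(), volume())
```

If some $a_{ii}$ were zero, the owner of $x_i$ and $y_i$ would not be guaranteed to lie among the parts connected by the nets, and the two counts could differ.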
Fine-Grain Hypergraph Model for 2D Decomposition
[Figure: partition of the fine-grain hypergraph of the 8 x 8 example matrix and the induced assignment of nonzeros and vector entries to processors]
$cutsize(\Pi) = 8$; communication volume = 8
Two-phase Coarse-grain Decomposition
Phase 1: Decompose the domain along one dimension, assigning each piece to a group of processors
  SpMxV: rowwise decomposition
  graph/hypergraph partitioning: minimize communication volume during the expand phase of reduction
Phase 2: Decompose the domain along the other dimension within each group
  SpMxV: columnwise decomposition
  multi-constraint graph/hypergraph partitioning: minimize communication volume during the gather phase of reduction
  maintains computational balance while preserving coherence among decompositions within different processor groups
SpMxV: checkerboard decomposition
Applicable to both graph and hypergraph models
Two-phase in SpMxV: Phase 1 Rowwise decomposition thru HP
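[Figure: example matrix after phase 1, split into row stripe $R_1$ (processors $P_{11}$ and $P_{12}$) and row stripe $R_2$ (processors $P_{21}$ and $P_{22}$); index order shown: 13 5 1 6 14 11 3 2 15 10 7 9 8 16 12 4]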
Two-phase in SpMxV: Phase 2 Columnwise decomposition thru Multi-constraint HP
[Figure: the same example matrix with row stripes $R_1$ ($P_{11}$ and $P_{12}$) and $R_2$ ($P_{21}$ and $P_{22}$) before the columnwise split; index order shown: 13 5 1 6 14 11 3 2 15 10 7 9 8 16 12 4]
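Two-phase in SpMxV: Phase 2 Columnwise decomposition thru Multi-constraint HP
[Figure: resulting checkerboard decomposition of the example matrix; index order shown: 13 5 1 6 14 11 3 2 15 10 7 9 8 16 12 4]
Processor loads: $P_{11}$: $W_{11} = 12$; $P_{12}$: $W_{12} = 11$; $P_{21}$: $W_{21} = 12$; $P_{22}$: $W_{22} = 12$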
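The per-processor loads $W_{kl}$ follow from the two partitions alone. The sketch below assumes one reading of the two-phase scheme, in which a single column partition is shared by all row stripes; `row_group`, `col_part`, and the 4 x 4 pattern are hypothetical stand-ins for partitioner output and are not the matrix in the figure.

```python
# Sketch: per-processor loads W_kl in a checkerboard decomposition.
# row_group and col_part are hypothetical outputs of phase 1 and phase 2.
from collections import Counter

nonzeros = [(0, 0), (0, 1), (0, 3), (1, 1), (1, 2),
            (2, 0), (2, 2), (3, 1), (3, 3)]
row_group = {0: 0, 1: 0, 2: 1, 3: 1}   # phase 1: row -> row stripe k
col_part = {0: 0, 1: 0, 2: 1, 3: 1}    # phase 2: column -> column stripe l

# Nonzero a_ij goes to processor P_kl with k = row_group[i], l = col_part[j].
loads = Counter((row_group[i], col_part[j]) for (i, j) in nonzeros)

avg = sum(loads.values()) / len(loads)
imbalance = max(loads.values()) / avg - 1   # observed imbalance ratio

for (k, l), w in sorted(loads.items()):
    print(f"P{k + 1}{l + 1}: W = {w}")
```

Because each column's nonzeros are spread over several row stripes, a column contributes one weight per stripe, which is why phase 2 is posed as a multi-constraint problem.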
Experimental Results: Communication Volume
Experimental Results
Experimental Results: Communication Volume
Experimental Results: Maximum # messages
Experimental Results: Partitioning Time
Experimental Results: Summary
Conclusion
A suite of models/approaches for workload partitioning
  1D decomposition: coarse-grain (owner computes)
  2D decomposition: fine-grain
    Doesn’t restrict the place of computation to the owner of input or output
  2D decomposition: coarse-grain (checkerboard)
    Two-phase, with a better upper bound on the number of messages
    Two-phase is applicable to both graph and hypergraph models
Which one to use?
  For a better-balanced workload and/or minimum communication volume: fine-grain
  If latency is important: use the proposed two-phase approach
End of Talk
Backup slides
Graph Partitioning
Graph $G = (V, E)$: set of vertices V and set of edges E
  every edge $e_{ij} \in E$ connects a pair of distinct vertices $v_i$ and $v_j$
K-way graph partition by edge separator: $\Pi = \{V_1, V_2, \dots, V_K\}$
  $V_k$ is a nonempty subset of V, i.e., $V_k \subseteq V$,
  parts are pairwise disjoint, i.e., $V_k \cap V_l = \emptyset$,
  union of the K parts is equal to V, i.e., $\bigcup_{k=1}^{K} V_k = V$
an edge $e_{ij}$ is said to be
  cut if $v_i \in V_k$ and $v_j \in V_l$ and $k \neq l$
  uncut if $v_i \in V_k$ and $v_j \in V_k$
a partition is said to be balanced if $W_k \leq W_{avg}(1 + \varepsilon)$
  $W_k$: weight of part $V_k$; $\varepsilon$: maximum imbalance ratio
cost of a partition: $cutsize(\Pi) = \sum_{e_{ij} \in E_E} w(e_{ij})$, where $E_E$ is the set of cut edges
Graph Partitioning
[Figure: 2-way partition of a weighted graph on vertices $v_1, \dots, v_{10}$ into parts $P_1$ and $P_2$]
Part weights: $W_1 = 16$, $W_2 = 16$
Balance equation: $W_k \leq W_{avg}(1 + \varepsilon)$
  this is a balanced partition with $\varepsilon = 0$
cut edges: $E_E = \{\{v_1, v_6\}, \{v_4, v_8\}, \{v_{?}, v_7\}, \{v_5, v_7\}\}$
$cutsize(\Pi) = \sum_{e \in E_E} w(e) = 7$
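The cutsize and balance definitions translate into two short functions. The sketch below uses a hypothetical 4-vertex graph, not the 10-vertex graph in the figure, and the function names are illustrative.

```python
# Sketch: edge-cut cost and balance check for a K-way graph partition.
# The graph, weights, and partition are hypothetical illustration data.

def graph_cutsize(edges, part):
    """Sum of weights of edges whose endpoints lie in different parts."""
    return sum(w for (u, v, w) in edges if part[u] != part[v])

def is_balanced(vertex_weight, part, k, eps):
    """Check W_k <= W_avg * (1 + eps) for every part."""
    W = [0] * k
    for v, w in vertex_weight.items():
        W[part[v]] += w
    avg = sum(W) / k
    return all(Wk <= avg * (1 + eps) for Wk in W)

edges = [("a", "b", 1), ("b", "c", 3), ("c", "d", 2), ("a", "d", 1)]
vertex_weight = {"a": 2, "b": 2, "c": 1, "d": 3}
part = {"a": 0, "b": 0, "c": 1, "d": 1}

print(graph_cutsize(edges, part), is_balanced(vertex_weight, part, 2, 0.0))
```

Here both parts weigh 4, so the partition is balanced even with $\varepsilon = 0$, and the cut edges b-c and a-d give a cutsize of 4.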
Hypergraph Partitioning
Hypergraph $H = (V, N)$: a set of vertices V and a set of nets N
  nets (hyperedges) connect two or more vertices
  every net $n_j \in N$ is a subset of vertices, i.e., $n_j \subseteq V$
  a graph is a special instance of a hypergraph
K-way hypergraph partition: $\Pi = \{V_1, V_2, \dots, V_K\}$
  a net that has at least one pin in a part is said to connect that part
  connectivity set $C(n_j)$ of a net $n_j$: set of parts connected by $n_j$
  connectivity $c(n_j) = |C(n_j)|$ of a net $n_j$: number of parts connected by $n_j$
a net $n_j$ is said to be
  cut if $c(n_j) > 1$
  uncut if $c(n_j) = 1$
two cutsize definitions widely used in the VLSI community:
  net-cut metric: $cutsize(\Pi) = \sum_{n_j \in N_E} w(n_j)$
  connectivity - 1 metric: $cutsize(\Pi) = \sum_{n_j \in N_E} w(n_j)\,(c(n_j) - 1)$
Hypergraph Partitioning
[Figure: example hypergraph partitioned into parts $V_1$, $V_2$, $V_3$]
cut nets: $N_E = \{n_1, n_8, n_{15}\}$
connectivity sets: $C(n_1) = \{V_1, V_2\}$, $C(n_8) = C(n_{15}) = \{V_1, V_2, V_3\}$
connectivity values: $c(n_1) = 2$, $c(n_8) = c(n_{15}) = 3$
cutsize values assuming unit net weights:
  net-cut metric: $cutsize(\Pi) = |N_E| = 3$
  connectivity - 1 metric: $cutsize(\Pi) = 1 + 2 + 2 = 5$
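Both cutsize metrics can be computed in one pass over the nets. The hypergraph below is hypothetical: it is not the one in the figure, but it was chosen so that its three cut nets have connectivities 2, 3, and 3, matching the values on the slide.

```python
# Sketch: net-cut and connectivity-1 metrics for a K-way hypergraph partition.
# The pins and the partition are hypothetical illustration data.

def connectivity(pins, part):
    """c(n): number of distinct parts connected by a net."""
    return len({part[v] for v in pins})

def cutsizes(nets, part, weight=None):
    """Return (net_cut, connectivity_minus_1); unit weights by default."""
    net_cut = conn_minus_1 = 0
    for name, pins in nets.items():
        w = 1 if weight is None else weight[name]
        c = connectivity(pins, part)
        if c > 1:                       # the net is cut
            net_cut += w
            conn_minus_1 += w * (c - 1)
    return net_cut, conn_minus_1

part = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}     # vertex -> part index
nets = {"a": [1, 3], "b": [1, 4, 5], "c": [2, 3, 6], "d": [5, 6]}

print(cutsizes(nets, part))
```

Net d lies entirely in one part and contributes to neither metric.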
Graph Model for Representing Sparse Matrices
standard graph model $G = (V, E)$ for matrix A
vertex set V: one vertex $v_i$ for each row/column i of A
  $v_i \in V$ represents task i of computing the inner product $y_i = \langle r_i, x \rangle$
edge set E: $(v_i, v_j) \in E$ $\Leftrightarrow$ $a_{ij} \neq 0$ and $a_{ji} \neq 0$
  each edge denotes a bidirectional interaction between tasks i and j
edge $(v_i, v_j) \in E$ $\Rightarrow$ $y_i \leftarrow y_i + a_{ij} x_j$ and $y_j \leftarrow y_j + a_{ji} x_i$
  exchange of $x_i$ and $x_j$ values before local matrix-vector products
  communication of two words
edge weighting: $w(v_i, v_j) = 2$
[Figure: vertices $v_i$ ($r_i$ / $c_i$) and $v_j$ ($r_j$ / $c_j$) joined by an edge labeled $a_{ij}, a_{ji}$]