Parallel Programming – Graph Algorithms David Monismith CS599 Notes are primarily based upon Introduction to Parallel Programming, Second Edition by Grama, Gupta, Karypis, and Kumar
Graphs – Directed vs. Undirected Undirected graph – pair of edges and vertices (V, E). – V is a finite set of points, representing vertices. – E is a finite set of unordered pairs of points, representing edges. Directed graph – also a pair of edges and vertices (V, E). – Edges are ordered pairs
Graphs – Adjacency and Paths We will primarily deal with undirected graphs in this class. Adjacency exists between a pair of points u and v in an undirected graph if an edge exists between u and v. A path from vertex v to vertex u exists if there exists a set of edges that connect vertex v to vertex u. The path length is defined as the number of edges in the path. Provided all vertices in the path are distinct, the path is called “simple”.
Graphs – Cyclic vs. Acyclic A path is called a cycle if the starting and ending points are the same. A graph with no cycles is called an acyclic graph. – Trees are an example of a connected acyclic graph. An undirected graph where every pair of vertices is connected by a path is called connected. A complete graph is a graph wherein every pair of vertices is adjacent. Within a tree, |E| = |V| - 1. Note that the pipes indicate cardinality in this context.
Representing Graphs on Computers Two general methods exist to represent graphs on computers. – Adjacency matrix – Adjacency list An adjacency matrix is an N*N matrix wherein each element of the matrix represents an edge within a graph. – The existence of an edge is represented by a 1 – Non-existence is represented by a zero. – An example of an adjacency graph will be drawn in class.
Representing Graphs on Computers The adjacency matrix can be modified to include weights as follows: – A i,j = 0, if i == j – A i,j = w i,j, if an edge exists between vertices i and j – A i,j = ∞, if no edge exists between vertices i and j Similarly, an adjacency list can be created by storing only the values for which edges exist.
Adjacency List vs. Matrix Generally, the choice to use a list over a matrix is based upon a single factor. – Are the edges sparse or dense? If edges are sparse, a list has a lower space complexity. If edges are dense, it will be faster to use a matrix. Sparseness can be determined by comparing |E| to |V| 2. If |E| << |V| 2, then the edges are sparse.
Goals Find the shortest path through a graph Find the shortest path using a parallel algorithm Determine which parallel algorithms are efficient and have high performance (under certain circumstances).
Minimum Spanning Tree (MST) Spanning tree – undirected graph containing all vertices of a graph G, and is a subgraph of G. Minimum Spanning Tree (MST) – a spanning tree for a weighted undirected graph with minimum weight. Example: find the shortest path between Maryville, MO and Benbrook, TX
Prim’s Algorithm – Finding the MST Greedy Algorithm First, select an arbitrary vertex. Grow the MST by selecting a new vertex and edge guaranteed to be in a MST. Repeat until all vertices have been selected.
Prim’s Algorithm MST(V, E, weight, root) Vtemp = {root}; d[root] = 0; for each vertex v not in Vtemp if an edge exists between v and root, d[v] = weight(root, v); else d[v] = ∞; endif endfor while Vtemp does not contain all vertices, select vertex u where d[u] is the minimum weight for all vertices not in Vtemp; Add u to Vtemp; for each vertex not in Vtemp, d[v] = min(d[v], weight(u,v)); endfor endwhile
Parallelization It is not straightforward to partition the while loop in Prim’s algorithm. Given p processes and n vertices, give each process n/p vertices. Let V i be the vertices for process P i. – Each P i stores the part of array d for V i. – P i computes d i [u] as the minimum weight for all vertices in V i that are not yet in V temp. – A global minimum is computed using an all-to-one reduction over d i [u] into P 0. – P 0 broadcasts u to all processes. – u is added to V temp.
Complexity of Prim’s Algorithm The serial version of Prim’s Algorithm is O(n 2 ). Determining the work and step complexity of the parallelized version is not straightforward. – Over each iteration the computations performed are O(n/p) by each process. – The communications performed by each process over each iteration are O(log p) – This does result in a work efficient algorithm, however, the cost of communication is somewhat high, therefore it is recommended that Prim’s Algorithm use at most O(n/lg n) processes
Dijkstra’s Algorithm Dijkstra’s algorithm is a single source, shortest path algorithm. It finds the shortest paths from one vertex to all other vertices in a graph. A shortest path is defined as a minimum weight path. Edge weights could represent distance, cost, penalty, loss, etc.
Dijkstra’s Algorithm This algorithm is very similar to Prim’s Algorithm. Incrementally finds the shortest paths from a source to other vertices in the graph. The algorithm is greedy – it chooses the edge to a vertex that appears closest. Its runtime is O(n 2 )
Dijkstra’s Algorithm DijkstraAlgorithm(V, E, weights, source) Vtemp = {source}; for each vertex v not in Vtemp, if edge(s,v) exists, l[v] = weights(s, v); else l[v] = ∞; endif endfor while Vtemp does not contain all vertices find a vertex u where l[u] is the smallest weight to a vertex not in Vtemp for each vertex v not in Vtemp l[v] = min(l[v], l[u] + weights(u,v)); endfor endwhile
Parallelization Parallelization of Dijkstra’s algorithm is almost identical to that of Prim’s algorithm. Thus, this parallelization results in the same time and space complexity.
All-Pairs Shortest Path Rather than finding the shortest path from one vertex to all other vertices, we may be interested in finding the shortest paths between all pairs of vertices. Next time we will investigate Floyd’s Algorithm, which is one method for finding an all-pairs shortest path. Thereafter, we will investigate depth-first search.