I/O-Efficient Graph Algorithms Norbert Zeh Duke University EEF Summer School on Massive Data Sets Århus, Denmark June 26 – July 1, 2002.

Motivation. For theoreticians: graph problems are neat and often difficult, hence interesting. For practitioners: massive graphs arise in GIS, web modelling, ... Problems in computational geometry can be expressed as graph problems. Many abstract problems are best viewed as graph problems. Extreme case: pointer-based data structures = graphs with extra information at their nodes.

Outline: Fundamental graph problems; list ranking; algorithms for trees; Euler tour; tree labelling; graph searching; BFS/DFS; connectivity; connected components; minimum spanning tree; single source shortest paths.

Outline: Techniques and data structures; graph contraction; time-forward processing; tournament tree; buffered repository tree; lower bounds; list ranking; connectivity; planar graphs.

Introduction and “Simple” Problems: list ranking; Euler tour; tree labelling; evaluating directed acyclic graphs; greedy graph algorithms.

List Ranking

Why Is List Ranking Non-Trivial? The internal memory algorithm spends Ω(N) I/Os in the worst case.

An Efficient List Ranking Algorithm Assume an independent set of size at least N/3 can be found efficiently (in O(sort(N)) I/Os)

An Efficient List Ranking Algorithm. Compressing L: (1) Sort the elements in L \ I. (2) Sort the elements in I by their successor pointers. (3) Scan the two lists to update the label of succ(v), for every element v ∈ I. The I/O-complexity of this procedure is O(sort(N)). Theorem: A list of size N can be ranked in O(sort(N)) I/Os.
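The compression step above can be sketched in ordinary in-memory Python. This is only an illustration of the recursion, not an I/O-efficient implementation: dictionaries stand in for the sorted lists, and the function name `list_rank`, the greedy choice of the independent set, and the convention that the rank of an element is the total weight from the head up to and including that element are all assumptions made for this sketch.

```python
def list_rank(succ, weight):
    """Rank a linked list by repeatedly bridging out an independent set.

    succ:   dict mapping each element to its successor (None for the tail)
    weight: dict mapping each element to its weight (1 for plain ranking)
    Returns a dict: element -> sum of weights from the head up to the element.
    """
    if not succ:
        return {}
    if len(succ) <= 2:
        # Base case: the list is tiny, walk it directly from the head.
        targets = set(succ.values())
        v = next(x for x in succ if x not in targets)
        rank, total = {}, 0
        while v is not None:
            total += weight[v]
            rank[v] = total
            v = succ[v]
        return rank

    pred = {s: v for v, s in succ.items() if s is not None}

    # Greedy independent set: take v if neither list neighbor was taken.
    indep = set()
    for v in succ:
        if pred.get(v) not in indep and succ[v] not in indep:
            indep.add(v)

    # Compress: bridge out every element of the independent set and
    # add its weight to the label of its successor.
    new_succ, new_weight = {}, {}
    for v in succ:
        if v in indep:
            continue
        s = succ[v]
        new_succ[v] = succ[s] if s in indep else s
        p = pred.get(v)
        new_weight[v] = weight[v] + (weight[p] if p in indep else 0)

    rank = list_rank(new_succ, new_weight)

    # Re-integrate: a removed element's rank follows from its predecessor.
    for v in indep:
        p = pred.get(v)
        rank[v] = weight[v] + (rank[p] if p is not None else 0)
    return rank


# Hypothetical example list: 3 -> 1 -> 4 -> 0 -> 2
example_succ = {3: 1, 1: 4, 4: 0, 0: 2, 2: None}
example_rank = list_rank(example_succ, {v: 1 for v in example_succ})
```

Because the independent set contains a constant fraction of the elements, the list shrinks geometrically from one level of the recursion to the next, which is what keeps the total cost within a constant factor of the first level.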

The Euler Tour Technique. Goal: Given a tree T, represent it by a list L so that certain computations on T can be performed by ranking L.

The Euler Tour Technique. Theorem: Given the adjacency lists of the vertices in T, an Euler tour can be constructed in O(scan(N)) I/Os. Let {v,w_1},…,{v,w_r} be the edges incident to v. Then succ((w_i,v)) = (v,w_{i+1}), where the indices wrap around, i.e. w_{r+1} = w_1.
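The successor rule can be illustrated with a few lines of Python. This is a minimal in-memory sketch: the function names `euler_tour_successors` and `walk` and the small example tree are made up for illustration, and the adjacency lists are plain Python lists rather than blocks on disk.

```python
def euler_tour_successors(adj):
    """Apply succ((w_i, v)) = (v, w_{i+1}) at every vertex v.

    adj: dict mapping each vertex to the list of its neighbors.
    Returns a dict mapping every directed edge to its successor in the tour.
    """
    succ = {}
    for v, nbrs in adj.items():
        r = len(nbrs)
        for i, w in enumerate(nbrs):
            # The index wraps around, so the last neighbor is followed by the first.
            succ[(w, v)] = (v, nbrs[(i + 1) % r])
    return succ


def walk(succ, start):
    """Follow successor pointers from `start` until the tour closes."""
    tour, e = [start], succ[start]
    while e != start:
        tour.append(e)
        e = succ[e]
    return tour


# Hypothetical example: a star with center 'a' and leaves 'b' and 'c'.
example_adj = {'a': ['b', 'c'], 'b': ['a'], 'c': ['a']}
example_tour = walk(euler_tour_successors(example_adj), ('a', 'b'))
```

Each adjacency list is read once and each directed edge receives exactly one successor, which matches the scanning bound stated in the theorem.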

Rooting a Tree. Choosing a vertex r as the root of a tree T defines parent-child relationships between adjacent nodes. Rooting tree T = computing, for every edge {v,w}, which endpoint is the parent and which is the child: v = p(w) if and only if rank((v,w)) < rank((w,v)). Theorem: A tree can be rooted in O(sort(N)) I/Os.

Computing a Preorder Numbering. Theorem: A preorder numbering of a rooted tree T can be computed in O(sort(N)) I/Os. preorder#(v) = rank((p(v),v)).

Computing Subtree Sizes Theorem: The nodes of T can be labelled with their subtree sizes in O(sort(N)) I/Os
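The three tree labellings (parents, preorder numbers, subtree sizes) can all be read off the ranks of the Euler tour edges. The sketch below is in-memory Python and makes several assumptions that the slides do not state: the tour is broken at the first edge leaving the root, a simple walk stands in for list ranking, preorder numbers are obtained by counting only parent-to-child edges (the root gets number 0), and the subtree size formula (rank((v,p(v))) - rank((p(v),v)) + 1) / 2 is the standard one rather than one taken from the transcript. The function name `root_and_label` is hypothetical, and the tree is assumed to have at least two vertices.

```python
def root_and_label(adj, root):
    """Root a tree and compute preorder numbers and subtree sizes.

    adj:  dict mapping each vertex to the list of its neighbors (a tree)
    root: the vertex chosen as the root
    """
    # Euler tour successors: succ((w_i, v)) = (v, w_{i+1}).
    succ = {}
    for v, nbrs in adj.items():
        for i, w in enumerate(nbrs):
            succ[(w, v)] = (v, nbrs[(i + 1) % len(nbrs)])

    # Rank the tour, starting at an edge leaving the root.
    # (A plain walk stands in for I/O-efficient list ranking.)
    rank, e, k = {}, (root, adj[root][0]), 0
    while e not in rank:
        rank[e] = k
        k += 1
        e = succ[e]

    # v = p(w) if and only if rank((v, w)) < rank((w, v)).
    parent = {root: None}
    for (v, w) in rank:
        if rank[(v, w)] < rank[(w, v)]:
            parent[w] = v

    # Preorder: count the parent-to-child edges seen so far along the tour.
    preorder, count = {root: 0}, 0
    for (v, w) in sorted(rank, key=rank.get):
        if parent[w] == v:
            count += 1
            preorder[w] = count

    # Between (p(v), v) and (v, p(v)) the tour crosses every edge of
    # v's subtree twice, which gives the subtree size.
    size = {root: len(adj)}
    for w, v in parent.items():
        if v is not None:
            size[w] = (rank[(w, v)] - rank[(v, w)] + 1) // 2
    return parent, preorder, size


# Hypothetical example tree: r has children a and b, a has child c.
example_adj = {'r': ['a', 'b'], 'a': ['r', 'c'], 'b': ['r'], 'c': ['a']}
example_parent, example_preorder, example_size = root_and_label(example_adj, 'r')
```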

Evaluating a Directed Acyclic Graph. More general: Given a labelling φ, compute a labelling ψ so that ψ(v) is computed from φ(v) and ψ(u_1),…,ψ(u_r), where u_1,…,u_r are v’s in-neighbors.

Time-Forward Processing. Assume nodes are given in topologically sorted order → use a priority queue Q to send data along the edges. [Figure: contents of Q, as triples such as (6,1,0), while the vertices are processed in order]

Time-Forward Processing. Analysis: Vertex set + adjacency lists scanned → O(scan(|V| + |E|)) I/Os. Priority queue: every edge is inserted into and deleted from Q exactly once → O(|E|) priority queue operations → O(sort(|E|)) I/Os. Theorem: A directed acyclic graph G = (V,E) can be evaluated in O(sort(|V| + |E|)) I/Os.
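A small in-memory sketch may help make the technique concrete. Python's `heapq` stands in for an external-memory priority queue, vertices are assumed to be numbered 1..n in topological order, and the names `time_forward`, `phi`, `psi` and `combine` are chosen for this illustration only. Queue entries are triples (target, source, value), keyed by the target vertex.

```python
import heapq


def time_forward(n, out_edges, phi, combine):
    """Evaluate a DAG whose vertices 1..n are in topological order.

    out_edges: dict vertex -> list of out-neighbors (all larger than the vertex)
    phi:       dict vertex -> input label
    combine:   function (phi[v], list of values from in-neighbors) -> psi[v]
    """
    queue = []  # entries are (target, source, value)
    psi = {}
    for v in range(1, n + 1):
        # Everything addressed to v is at the front of the queue now.
        incoming = []
        while queue and queue[0][0] == v:
            _, _, value = heapq.heappop(queue)
            incoming.append(value)
        psi[v] = combine(phi[v], incoming)
        # Send psi[v] forward in time along every outgoing edge.
        for w in out_edges.get(v, []):
            heapq.heappush(queue, (w, v, psi[v]))
    return psi


# Hypothetical example: length of the longest path ending at each vertex.
example_edges = {1: [2, 3, 4], 2: [4], 3: [4]}
example_psi = time_forward(
    4,
    example_edges,
    {v: 0 for v in range(1, 5)},
    lambda own, incoming: max((x + 1 for x in incoming), default=own),
)
```

Every edge causes one insertion and one deletion, which is the accounting used in the analysis above.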

Maximal Independent Set (MIS)
Algorithm GREEDYMIS:
  I ← ∅
  for every vertex v ∈ G do
    if no neighbor of v is in I then
      Add v to I
    end if
  end for
Observation: It suffices to consider all neighbors of v which have been visited in a previous iteration.

Maximal Independent Set (MIS)

Maximal Independent Set (MIS) Theorem: A maximal independent set of a graph G = (V,E) can be computed in O(sort(|V|+|E|)) I/Os.
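One way to read the theorem is that the greedy algorithm becomes a DAG evaluation once every edge is directed from its lower-numbered to its higher-numbered endpoint. The sketch below follows that reading in plain Python; it assumes vertices are numbered 1..n in the order in which the greedy loop visits them, uses `heapq` in place of an external priority queue, and the name `greedy_mis` is made up for this illustration.

```python
import heapq


def greedy_mis(n, edges):
    """Greedy maximal independent set on vertices 1..n.

    Each vertex tells its higher-numbered neighbors whether it joined
    the set; a vertex joins if none of its earlier neighbors did.
    """
    higher = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        higher[min(u, v)].append(max(u, v))

    queue, indep = [], []  # queue entries are (target, sender_is_in_set)
    for v in range(1, n + 1):
        blocked = False
        while queue and queue[0][0] == v:
            _, sender_in_set = heapq.heappop(queue)
            blocked = blocked or sender_in_set
        if not blocked:
            indep.append(v)
        for w in higher[v]:
            heapq.heappush(queue, (w, not blocked))
    return indep


# Hypothetical example: the path 1 - 2 - 3 - 4 - 5.
example_mis = greedy_mis(5, [(1, 2), (2, 3), (3, 4), (4, 5)])
```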

Large Independent Set of a List. Corollary: An independent set of size at least N/3 for a list L of size N can be found in O(sort(N)) I/Os. Every vertex in an MIS I prevents at most two other vertices (its list neighbors) from being in I → every MIS has size at least N/3.

Graph Connectivity: connected components; minimum spanning tree.

Connectivity A Semi-External Algorithm

Analysis: Scan the vertex set to load the vertices into main memory. Scan the edge set to carry out the algorithm. O(scan(|V| + |E|)) I/Os. Theorem: The connected components of a graph can be computed in O(scan(|V| + |E|)) I/Os, provided that |V| ≤ M.
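The transcript does not preserve the algorithm itself, so the following is only one plausible instance of a semi-external algorithm that matches the analysis: a union-find structure over the vertices is kept in memory and the edges are consumed as a stream. The use of union-find and the name `semi_external_cc` are assumptions of this sketch, as is the choice to label each component by its smallest vertex.

```python
def semi_external_cc(vertices, edge_stream):
    """Label connected components while scanning the edges once.

    vertices:    iterable of comparable vertex ids (assumed to fit in memory)
    edge_stream: iterable of (u, w) pairs, read sequentially
    Returns a dict vertex -> smallest vertex id in its component.
    """
    vertices = list(vertices)
    parent = {v: v for v in vertices}

    def find(v):
        # Path halving keeps the in-memory trees shallow.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, w in edge_stream:
        ru, rw = find(u), find(w)
        if ru != rw:
            # Hang the larger root below the smaller one, so that the
            # root of every tree is the smallest vertex of its component.
            parent[max(ru, rw)] = min(ru, rw)

    return {v: find(v) for v in vertices}


# Hypothetical example with three components.
example_labels = semi_external_cc(range(1, 7), [(1, 2), (2, 3), (4, 5)])
```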

Connectivity: The General Case. Idea: If |V| ≤ M, use the semi-external algorithm. If |V| > M: identify simple connected subgraphs of G; contract these subgraphs to obtain a graph G’ = (V’,E’) with |V’| ≤ c|V|, c < 1; recursively compute the connected components of G’; obtain the labelling of the connected components of G from the labelling of the components of G’.

Connectivity: The General Case. [Figure: example graph on vertices a through n, contracted into supervertices A through E]

Main steps: (1) Find smallest neighbors (easy). (2) Compute the connected components of the graph H induced by the selected edges. (3) Contract each component into a single vertex (easy). (4) Call the procedure recursively. (5) Copy the label of every vertex v ∈ G’ to all vertices in G represented by v (easy).

Connectivity: The General Case. Every connected component of H has size at least 2 → |V’| ≤ |V|/2 → O(log(|V|/M)) recursive calls until |V| ≤ M. Theorem: The connected components of a graph G = (V,E) can be computed in I/Os.
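The five main steps can be traced in a small in-memory sketch. It is not I/O-efficient and makes several assumptions: `memory_limit` plays the role of M, an in-memory union-find helper (`small_cc`, a hypothetical name) stands in both for the semi-external base case and for the computation of the components of H, and vertex ids are assumed to be comparable.

```python
def small_cc(vertices, edges):
    """In-memory stand-in: label each vertex by the smallest id in its component."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, w in edges:
        ru, rw = find(u), find(w)
        if ru != rw:
            parent[max(ru, rw)] = min(ru, rw)
    return {v: find(v) for v in parent}


def connected_components(vertices, edges, memory_limit=2):
    vertices = list(vertices)
    edges = [(u, v) for u, v in edges if u != v]
    if len(vertices) <= memory_limit or not edges:
        return small_cc(vertices, edges)

    # Step 1: every vertex with a neighbor selects the edge to its smallest neighbor.
    smallest = {}
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            if a not in smallest or b < smallest[a]:
                smallest[a] = b

    # Step 2: components of the graph H induced by the selected edges.
    comp = small_cc(vertices, list(smallest.items()))

    # Step 3: contract each component of H into a single vertex.
    new_vertices = set(comp.values())
    new_edges = {(comp[u], comp[v]) for u, v in edges if comp[u] != comp[v]}

    # Step 4: recurse on the contracted graph.
    labels = connected_components(new_vertices, new_edges, memory_limit)

    # Step 5: copy the label of each contracted vertex to the vertices it represents.
    return {v: labels[comp[v]] for v in vertices}


# Hypothetical example: two paths and one isolated vertex.
example_labels = connected_components(
    range(1, 9), [(1, 2), (2, 3), (3, 4), (5, 6), (6, 7)]
)
```

Every vertex with at least one neighbor lands in a component of H with at least two vertices, so the number of such vertices at least halves per level.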

Connectivity: The General Case. Later: BFS in O(|V| + sort(|E|)) I/Os → can be used to identify connected components. When |V| = |E|/B, this algorithm takes O(sort(|E|)) I/Os. Since |V| halves in each call, the recursion can stop after log(|V|B/|E|) recursive calls. Theorem: The connected components of a graph G = (V,E) can be computed in I/Os.

Biconnectivity. Theorem: The biconnected components of a graph G = (V,E) can be computed in I/Os. [Figure: example graph on vertices a through k and its biconnected components]

Minimum Spanning Tree (MST). Observation: The connectivity algorithm can be augmented to produce a spanning tree of G. [Figure: example graph on vertices a through n with supervertices A through E]

Minimum Spanning Tree (MST). To obtain a minimum spanning tree: for every vertex v, choose the edge of minimum weight incident to v. Some book-keeping: the weight of an edge e in the compressed graph = the minimum weight of all edges represented by e. When “e is added” to T, in fact add this minimum edge.

Minimum Spanning Tree (MST). Theorem: A MST of a graph G = (V,E) can be computed in I/Os. [Figure: example graph on vertices a through n with supervertices A through E]
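The augmented contraction procedure, including the book-keeping for compressed edges, can be sketched as follows. This is an in-memory illustration under stated assumptions: edge weights are distinct, the graph is simple, an in-memory union-find performs the contraction, and the name `contraction_mst` is hypothetical. Every compressed edge carries the original edge it represents, so that the original edge is what gets added to the tree.

```python
def contraction_mst(edges):
    """Minimum spanning forest by repeated contraction of minimum edges.

    edges: list of (weight, u, v) with distinct weights (assumption).
    Returns the list of original edges in the spanning forest.
    """
    # A compressed edge is (weight, end1, end2, original_edge).
    current = [(w, u, v, (w, u, v)) for w, u, v in edges if u != v]
    tree = []
    while current:
        # Every (super)vertex chooses its incident edge of minimum weight.
        best = {}
        for e in current:
            for x in (e[1], e[2]):
                if x not in best or e[0] < best[x][0]:
                    best[x] = e

        # Contract the chosen edges; union-find names the new supervertices.
        parent = {x: x for x in best}

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        for e in set(best.values()):
            ru, rv = find(e[1]), find(e[2])
            if ru != rv:
                parent[ru] = rv
                tree.append(e[3])  # add the original edge, not the compressed one

        # Relabel the remaining edges; among parallel edges keep the lightest.
        cheapest = {}
        for w, u, v, orig in current:
            cu, cv = find(u), find(v)
            if cu == cv:
                continue
            key = (min(cu, cv), max(cu, cv))
            if key not in cheapest or w < cheapest[key][0]:
                cheapest[key] = (w, cu, cv, orig)
        current = list(cheapest.values())
    return tree


# Hypothetical example: a 4-cycle a - b - c - d - a with distinct weights.
example_tree = contraction_mst(
    [(1, 'a', 'b'), (2, 'c', 'd'), (3, 'b', 'c'), (4, 'a', 'd')]
)
```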

A Fast MST Algorithm. Idea: Assume an MST can be computed in O(|V| + sort(|E|)) I/Os. Again, the recursion can then be stopped after log(|V|B/|E|) iterations, once |V| ≤ |E|/B. Prim’s algorithm:

A Fast MST Algorithm. Maintain a superset of the blue edges in a priority queue Q. When the edge {v,w} of minimum weight is retrieved, test whether v,w are both in T. Yes → discard the edge. No → add the edge to the MST and add all edges incident to w to Q, except {v,w} (assuming that w ∉ T). Problem: How to test whether v,w ∈ T?

A Fast MST Algorithm. If v,w ∈ T but {v,w} ∉ T, then both v and w have inserted edge {v,w} into Q → there are two copies of {v,w} in Q, and they are consecutive → perform two DELETEMIN operations, retrieving edges {v,w} and {y,z}. If {v,w} = {y,z}, discard both. Otherwise, add {v,w} to T and re-insert {y,z}.

A Fast MST Algorithm. Analysis: O(|V| + scan(|E|)) I/Os for retrieving adjacency lists. O(sort(|E|)) I/Os for priority queue operations. Theorem: A MST of a graph G = (V,E) can be found in O(|V| + sort(|E|)) I/Os. Corollary: A MST of a graph G = (V,E) can be found in I/Os.
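The duplicate-edge test can be sketched in a few lines of Python. The sketch assumes a connected simple graph with distinct edge weights, uses `heapq` as the priority queue, and stores with each queue entry the endpoint that did not insert it, so the new vertex is known without looking up membership in T. The function name `prim_two_copies` and the entry layout are assumptions of this illustration.

```python
import heapq


def prim_two_copies(adj, source):
    """Prim's algorithm without a 'visited' lookup.

    adj: dict vertex -> list of (weight, neighbor); weights are distinct.
    Queue entries are (weight, smaller endpoint, larger endpoint, far endpoint),
    so the two copies of an edge agree on the first three fields.
    """
    queue, tree = [], []

    def insert_edges(v, skip=None):
        for weight, w in adj[v]:
            if w != skip:
                heapq.heappush(queue, (weight, min(v, w), max(v, w), w))

    insert_edges(source)
    while queue:
        # Two DeleteMin operations.
        first = heapq.heappop(queue)
        second = heapq.heappop(queue) if queue else None
        if second is not None and second[:3] == first[:3]:
            continue  # both endpoints are already in T: discard both copies
        if second is not None:
            heapq.heappush(queue, second)  # a different edge: re-insert it
        weight, a, b, new = first
        old = a if new == b else b
        tree.append((weight, old, new))
        insert_edges(new, skip=old)
    return tree


# Hypothetical example: triangle a, b, c plus a pendant vertex d.
example_adj = {
    'a': [(1, 'b'), (3, 'c')],
    'b': [(1, 'a'), (2, 'c')],
    'c': [(2, 'b'), (3, 'a'), (4, 'd')],
    'd': [(4, 'c')],
}
example_tree = prim_two_copies(example_adj, 'a')
```

In the example, edge {a,c} of weight 3 is inserted once by a and once by c, the two copies come out of the queue back to back, and both are discarded.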

Graph Contraction and Sparse Graphs A graph G = (V,E) is sparse if for any graph H obtainable from G through a series of edge contractions, |E(H)| = O(|V(H)|). For a sparse graph, the number of vertices and edges in G reduces by a constant factor in each iteration of the connectivity and MST algorithms. Theorem: The connected components or a MST of a sparse graph with N vertices can be computed in O(sort(N)) I/Os.
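The step from "reduces by a constant factor in each iteration" to the O(sort(N)) bound is a geometric sum. The derivation below assumes the usual definition of the sorting bound, which the transcript does not restate, and writes c < 1 for the (unspecified) reduction factor and N_i for the size of the graph in iteration i.

```latex
\[
  \mathrm{sort}(N) = \Theta\!\left(\frac{N}{B}\,\log_{M/B}\frac{N}{B}\right),
  \qquad N_i \le c^{\,i} N \quad (0 < c < 1).
\]
Since $\mathrm{sort}(c^{\,i} N) \le c^{\,i}\,\mathrm{sort}(N)$, the total cost over all iterations is
\[
  \sum_{i \ge 0} O\bigl(\mathrm{sort}(N_i)\bigr)
  \;\le\; O\bigl(\mathrm{sort}(N)\bigr) \sum_{i \ge 0} c^{\,i}
  \;=\; O\!\left(\frac{\mathrm{sort}(N)}{1 - c}\right)
  \;=\; O\bigl(\mathrm{sort}(N)\bigr).
\]
```

Sparseness is what makes this work: contraction alone only guarantees fewer vertices, while the definition above ensures the edge count shrinks along with them.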

Three Techniques for Graph Algorithms. Time-forward processing: express graph problems as evaluation problems of DAGs. Graph contraction: reduce the size of G while maintaining the properties of interest; solve the problem recursively on the compressed graph; construct a solution for G from the solution for the compressed graph. Bootstrapping: switch to a generally less efficient algorithm as soon as (part of) the input is small enough.