BFS and DFS: BFS and DFS in directed graphs; BFS in undirected graphs; an improved undirected BFS algorithm.

The Buffered Repository Tree (BRT): Stores key-value pairs (k,v). Supported operations: INSERT(k,v) inserts a new pair (k,v) into T; EXTRACT(k) extracts all pairs with key k. Complexity: INSERT: O((1/B) log_2(N/B)) amortized; EXTRACT: O(log_2(N/B) + K/B) amortized (K = number of reported elements).

The Buffered Repository Tree (BRT): A (2,4)-tree. Leaves store between B/4 and B elements. Internal nodes have buffers of size B. The root is kept in main memory, the rest on disk.

INSERT(k,v): Emptying a buffer of size X ≥ B costs O(X/B) I/Os. Amortized charge per element and level: O(1/B). Height of the tree: O(log_2(N/B)). Insertion cost: O((1/B) log_2(N/B)) amortized.

EXTRACT(k): Number of traversed nodes: O(log_2(N/B) + K/B). I/Os per node: O(1). Cost of the operation: O(log_2(N/B) + K/B). But be careful with the removal of the extracted elements. [Figure: elements with key k in the tree]

Cost of Rebalancing: O(N/B) leaf creations and deletions ⇒ O(N/B) node splits, fusions, and merges. Each such operation costs O(1) I/Os ⇒ O(N/B) I/Os for rebalancing. Theorem: The BRT supports INSERT and EXTRACT operations in O((1/B) log_2(N/B)) and O(log_2(N/B) + K/B) I/Os amortized, respectively.
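The buffering idea behind INSERT and EXTRACT can be sketched in memory. This is a simplified illustration, not the structure from the slides: it assumes integer keys in a fixed range and a static binary tree instead of a rebalanced (2,4)-tree, and the class name ToyBRT and its parameters are hypothetical.

```python
class ToyBRT:
    """Toy buffered repository tree: static binary tree over keys 0..num_keys-1.

    Assumption: no rebalancing and no disk; buffers are plain Python lists.
    Node 1 is the root, node i has children 2i and 2i+1, leaves are size..2*size-1.
    """

    def __init__(self, num_keys, B=4):
        self.B = B
        self.size = 1
        while self.size < num_keys:
            self.size *= 2
        self.buf = [[] for _ in range(2 * self.size)]

    def insert(self, k, v):
        # INSERT only touches the root buffer; overflow is pushed down in bulk.
        self.buf[1].append((k, v))
        self._flush(1, 0, self.size)

    def _flush(self, node, lo, hi):
        # Empty an overfull internal buffer by distributing its pairs to the children.
        if node >= self.size or len(self.buf[node]) <= self.B:
            return
        pairs, self.buf[node] = self.buf[node], []
        mid = (lo + hi) // 2
        for k, v in pairs:
            self.buf[2 * node + (k >= mid)].append((k, v))
        self._flush(2 * node, lo, mid)
        self._flush(2 * node + 1, mid, hi)

    def extract(self, k):
        # EXTRACT walks the root-to-leaf path of k and removes every pair with key k.
        out = []
        node, lo, hi = 1, 0, self.size
        while True:
            keep = []
            for pair in self.buf[node]:
                (out if pair[0] == k else keep).append(pair)
            self.buf[node] = keep
            if node >= self.size:
                return out
            mid = (lo + hi) // 2
            if k < mid:
                node, hi = 2 * node, mid
            else:
                node, lo = 2 * node + 1, mid
```

Pairs with the same key may sit at several levels of the path at once, which is why EXTRACT has to inspect every buffer from the root down to the leaf.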

Directed DFS: The algorithm proceeds like the internal-memory algorithm: use a stack to determine the order in which vertices are visited. For the current vertex v: find an unvisited out-neighbor w, push w on the stack, and continue the search at w. If no unvisited out-neighbor exists: remove v from the stack and continue the search at v's parent. Stack operations cost O(N/B) I/Os. Problem: finding an unvisited vertex.

Directed DFS: Data structures: a BRT T that stores directed edges (v,w) with key v; priority queues P(v), one per vertex, each storing the unexplored out-edges of v. Invariant (figure legend; the edge colours were lost): every out-edge of v is either not in P(v), in P(v) and in T, or in P(v) but not in T.

Directed DFS: Finding the next vertex after vertex v (cost per step; total over the whole algorithm):
1. EXTRACT(v): retrieve the red edges from T. O(log_2(|E|/B) + K_1/B); total O(|V| log_2(|E|/B) + |E|/B).
2. Remove these edges from P(v) using DELETE. O(sort(K_1)); total O(|V| + sort(|E|)).
3. Retrieve the next edge (v,w) using DELETEMIN on P(v). O((1/B) log_m(|E|/B)); total O(sort(|E|)).
4. Insert the in-edges of w into T. O(1 + (K_2/B) log_2(|E|/B)); total O((|E|/B) log_2(|E|/B)).
5. Push w on the stack. O(1/B) amortized; total O(|V|/B).
Total: O((|V| + |E|/B) log_2(|E|/B)).

Directed DFS + BFS: BFS can be solved using the same algorithm. Only modification: use a queue (FIFO) instead of a stack. Theorem: Depth-first search and breadth-first search in a directed graph G = (V,E) can be solved in O((|V| + |E|/B) log_2(|E|/B)) I/Os. Exercise: Convince yourself that the priority queues P(v) are not necessary in the case of BFS.
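The interplay of T and P(v) in directed DFS can be sketched in memory as follows. This is only an illustration under assumptions: a dictionary of lists stands in for the BRT T, a binary heap with lazy deletion stands in for each P(v), every vertex is assumed to appear as a key of out_adj, and the function name directed_dfs is hypothetical. Note that no "visited" array is consulted.

```python
import heapq
from collections import defaultdict


def directed_dfs(out_adj, root):
    """Return the DFS visiting order from root, without a visited array."""
    in_adj = {v: [] for v in out_adj}
    for v, targets in out_adj.items():
        for w in targets:
            in_adj[w].append(v)

    T = defaultdict(list)      # stand-in for the BRT: T[u] holds edges (u, w) with w visited
    P = {v: list(targets) for v, targets in out_adj.items()}
    for heap in P.values():
        heapq.heapify(heap)    # stand-in for the priority queues P(v)
    dead = defaultdict(set)    # targets deleted from P(v) (lazy DELETE)

    order, stack = [], []

    def visit(w):
        order.append(w)
        stack.append(w)
        for u in in_adj[w]:
            T[u].append((u, w))            # INSERT the in-edges of w into T

    visit(root)
    while stack:
        v = stack[-1]
        for _, w in T.pop(v, []):          # EXTRACT(v): edges of v that lead to visited vertices
            dead[v].add(w)                 # DELETE them from P(v)
        while P[v] and P[v][0] in dead[v]:
            heapq.heappop(P[v])
        if P[v]:
            visit(heapq.heappop(P[v]))     # DELETEMIN yields an edge to an unvisited vertex
        else:
            stack.pop()                    # no unvisited out-neighbor: backtrack
    return order
```

Replacing the stack by a FIFO queue (and taking the current vertex from its front) would give the BFS variant mentioned above.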

Undirected BFS: Partition the graph into levels L(0), L(1), ... around the source. Observation: For v ∈ L(i), all its neighbors are in L(i-1) ∪ L(i) ∪ L(i+1). ⇒ Build the BFS-tree level by level: Initially, L(0) = {r}. Given levels L(i-1) and L(i): let X(i) = set of all neighbors of vertices in L(i); let L(i+1) = X(i) \ (L(i-1) ∪ L(i)). [Figure: levels L(0), L(1), L(2), L(3)]

Undirected BFS: Constructing L(i+1): Retrieve the adjacency lists of the vertices in L(i) ⇒ X(i). Sort X(i). Scan L(i-1), L(i), and X(i) to remove duplicates from X(i) and to compute X(i) \ (L(i-1) ∪ L(i)). Complexity: O(|L(i)| + sort(|L(i-1)| + |X(i)|)) I/Os per level ⇒ O(|V| + sort(|E|)) I/Os in total. Theorem: Breadth-first search in an undirected graph G = (V,E) can be solved in O(|V| + sort(|E|)) I/Os.
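The level-by-level construction can be sketched in a few lines. This is an in-memory illustration only: Python sets stand in for the sorted files and scans, and the function name simple_bfs_levels is an assumption.

```python
def simple_bfs_levels(adj, r):
    """Return the BFS levels L(0), L(1), ... of an undirected graph.

    adj maps each vertex to a list of its neighbors; r is the source.
    Only the two most recent levels are consulted, as in the observation above.
    """
    prev, cur = set(), {r}
    levels = []
    while cur:
        levels.append(sorted(cur))
        # X(i): all neighbors of vertices in L(i); the set removes duplicates.
        X = {w for v in cur for w in adj[v]}
        # L(i+1) = X(i) \ (L(i-1) ∪ L(i))
        prev, cur = cur, X - prev - cur
    return levels
```

For example, on the graph with edges 0-1, 0-2, 1-3, 2-3, 3-4 and source 0, the function returns [[0], [1, 2], [3], [4]].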

A Faster BFS-Algorithm: Problem with the simple BFS-algorithm: random accesses to retrieve adjacency lists. Idea for a faster algorithm: load more than one adjacency list at a time. This reduces the number of random accesses, but causes edges to be involved in more than one iteration of the algorithm ⇒ trade-off.

A Faster BFS-Algorithm (Randomized): Let 0 < μ < 1 be a parameter (specified later). Two phases: (1) build μ|V| disjoint clusters of diameter O(1/μ); (2) perform a modified version of SIMPLEBFS. Clusters C_1, ..., C_q are formed using BFS from a randomly chosen set V' = {r_1, ..., r_q} of masters. A vertex is chosen as a master with probability μ (coin flip). Observation: E[|V'|] = μ|V|. That is, the expected number of clusters is μ|V|.

Forming Clusters (Randomized): Apply SIMPLEBFS to form the clusters, starting with L(0) = V'. v ∈ C_i if v is a descendant of r_i.

Forming Clusters (Randomized): Lemma: The expected diameter of a cluster is 2/μ. E[k] ≤ 1/μ. Corollary: The clusters are formed in expected O((1/μ) sort(|E|)) I/Os. [Figure labels: x, v_1, v_2, v_3, v_4, v_5, ..., v_k, s]

Forming Clusters (Randomized): Form files F_1, ..., F_q, one per cluster. F_i = concatenation of the adjacency lists of the vertices in C_i. Augment every edge (v,w) ∈ F_i with the start position p_j of the file F_j s.t. w ∈ C_j: edge = triple (v,w,p_j).
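The cluster formation can be sketched as a multi-source BFS. This is a minimal in-memory sketch under assumptions: the source is always made a master (an assumption not stated on the slides, so that at least one master exists), a dictionary lookup replaces the level subtraction of SIMPLEBFS, and the names form_clusters and seed are hypothetical.

```python
import random


def form_clusters(adj, mu, source, seed=0):
    """Assign every reachable vertex to the cluster of its nearest master.

    Each vertex becomes a master with probability mu (coin flip);
    the source is always a master (assumption of this sketch).
    Returns a dict mapping vertex -> master of its cluster.
    """
    rng = random.Random(seed)
    masters = [v for v in adj if v == source or rng.random() < mu]
    cluster = {m: m for m in masters}      # L(0) = V'
    frontier = list(masters)
    while frontier:                        # grow all clusters one BFS level at a time
        nxt = []
        for v in frontier:
            for w in adj[v]:
                if w not in cluster:
                    cluster[w] = cluster[v]   # w becomes a descendant of v's master
                    nxt.append(w)
        frontier = nxt
    return cluster
```

The file F_i of a cluster would then be the concatenation of the adjacency lists of all vertices v with cluster[v] equal to r_i.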

The BFS-Phase: Maintain a sorted pool H of edges s.t. the adjacency lists of the vertices in L(i) are contained in H. One step (with its cost):
1. Scan L(i) and H to find the vertices in L(i) whose adjacency lists are not in H. O((|L(i)| + |H|)/B)
2. Form the list of start positions of the files containing these adjacency lists and remove duplicates. O(sort(|L(i)|))
3. Retrieve the files, sort them, and merge the resulting list H' with H. O(K + sort(|H'|) + |H|/B)
4. Scan L(i) and H to build X(i). O((|L(i)| + |H|)/B)
5. Construct L(i+1) from L(i-1), L(i), and X(i) as before. O(sort(|L(i)| + |L(i-1)| + |X(i)|))

The BFS-Phase: I/O-complexity of a single step: O(K + |H|/B + sort(|H'| + |L(i-1)| + |L(i)| + |X(i)|)) ⇒ Expected I/O-complexity: O(μ|V| + |E|/(μB) + sort(|E|)). Choose μ = √(|E|/(|V|B)), which balances the first two terms. Theorem: BFS in an undirected graph G = (V,E) can be solved in O(√(|V||E|/B) + sort(|E|)) I/Os.
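The choice of the parameter can be checked numerically. The expected bound contains the terms μ|V| and |E|/(μB); setting them equal suggests μ = √(|E|/(|V|B)). The sketch below assumes this balancing argument and caps μ at 1; the function names are hypothetical.

```python
import math


def choose_mu(num_vertices, num_edges, B):
    """Pick mu so that mu*|V| and |E|/(mu*B) are equal (capped at 1)."""
    return min(1.0, math.sqrt(num_edges / (num_vertices * B)))


def dominant_terms(num_vertices, num_edges, B, mu):
    """Return the two mu-dependent terms of the expected I/O bound."""
    return mu * num_vertices, num_edges / (mu * B)
```

For |V| = 10^6, |E| = 4·10^6 and B = 1024 this gives μ = 1/16, and both terms equal 62500 = √(|V||E|/B).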

Single Source Shortest Paths: The tournament tree; SSSP in undirected graphs; SSSP in planar graphs.

Single Source Shortest Paths: Need: an I/O-efficient priority queue; an I/O-efficient method to update only unvisited vertices.

The Tournament Tree = I/O-efficient priority queue. Supports: INSERT(x,p), DELETE(x), DELETEMIN, DECREASEKEY(x,p). All operations take O((1/B) log_2(N/B)) I/Os amortized. Note: N = size of the universe, not the number of elements in the tree.

The Tournament Tree: Static binary tree over all elements in the universe. Elements map to leaves, M elements per leaf. Internal nodes store between M/2 and M elements and have signal buffers of size M. The root is kept in main memory, the rest on disk.

The Tournament Tree: Elements stored at each node are sorted by priority. Elements at node v have smaller priority than elements at v's descendants. Convention: x ∈ T if and only if p(x) is finite.

The Tournament Tree: Deletions: Operation DELETE(x) ⇒ signal DELETE(x). [Figure labels: x, DELETE(x), UPDATE(x,∞), v]

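The Tournament Tree: Insertions and Updates: Operations INSERT(x,p) and DECREASEKEY(x,p) ⇒ signal UPDATE(x,p). When the signal reaches a node v with child w: If x is stored at v with current priority p': if p < p', update; if p ≥ p', do nothing. If x is not stored at v: if all elements at v have priority < p, forward the signal to w; if at least one element has priority ≥ p, insert x at v and send DELETE(x) to w.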

The Tournament Tree: Handling Overflow: Let y be the element with the highest priority p_y at v. Send the signal PUSH(y,p_y) to the appropriate child of v.

The Tournament Tree: Keeping the Nodes Filled: O(M/B) I/Os to move M/2 elements one level up the tree.

The Tournament Tree: Signal Propagation: Scan v's signal buffer and partition the signals into sets X_u and X_w. Load u into memory, apply the signals in X_u to u, and insert the signals into u's signal buffer. Do the same for w. Cost: O((|X| + M)/B) = O(|X|/B) I/Os.

The Tournament Tree: Analysis: Elements travel up the tree. Cost: O(1/B) I/Os amortized per element and level ⇒ O((K/B) log_2(N/B)) I/Os for K operations. Signals travel down the tree. Cost: O(1/B) I/Os amortized per signal and level; O(K) signals for K operations ⇒ O((K/B) log_2(N/B)) I/Os. Theorem: The tournament tree supports INSERT, DELETE, DELETEMIN, and DECREASEKEY operations in O((1/B) log_2(N/B)) I/Os amortized.

Single Source Shortest Paths: Modified Dijkstra: (1) retrieve the next vertex v from the priority queue Q using DELETEMIN; (2) retrieve v's adjacency list; (3) update the distances of all of v's neighbors, except the predecessor u on the path from s to v; repeat. Cost: O(|V| + (|E|/B) log_2(|V|/B)) I/Os using a tournament tree.

Single Source Shortest Paths: Problem: spurious updates of vertices that have already been visited. Observation: If v performs a spurious update of u, then u has tried to update v before. Record this update attempt of u on v by inserting u into another priority queue Q'. Priority: d(s,u) + w({u,v}).

Single Source Shortest Paths: Second modification: Retrieve the next vertex using two DELETEMINs, one on Q, one on Q'. Let (x,p_x) be the element retrieved from Q and (y,p_y) the element retrieved from Q'. If p_x ≤ p_y: re-insert (y,p_y) into Q' and proceed as normal. If p_y < p_x: re-insert (x,p_x) into Q and perform a DELETE(y) on Q.

Single Source Shortest Paths: Lemma: A spurious update is removed from Q before the targeted vertex can be retrieved using DELETEMIN. Consider an edge e = {u,v}. Event A: the spurious update happens (“time”: d(s,v)). Event B: vertex u is deleted by the retrieval of u from Q' (“time”: d(s,u) + w(e)). Event C: vertex u is retrieved from Q using a DELETEMIN operation (“time”: d(s,v) + w(e)).

Single Source Shortest Paths: Assume that all vertices have different distances from the source s ⇒ d(u) < d(v), and hence d(v) ≤ d(u) + w(e) < d(v) + w(e). Sequence of events: A → B → C. Theorem: The single source shortest path problem on an undirected graph G = (V,E) can be solved in O(|V| + (|E|/B) log_2(|V|/B)) I/Os.
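The two-queue scheme can be sketched in memory. This is an illustration under assumptions, not the external-memory implementation: a dictionary plus a binary heap stands in for the tournament tree Q, a plain heap stands in for Q', peeking at Q' replaces the DELETEMIN-and-re-insert on Q', and edge weights are positive with distinct distances from s. The names UpdatablePQ and sssp_undirected are hypothetical. Note that no set of visited vertices is kept.

```python
import heapq


class UpdatablePQ:
    """In-memory stand-in for the tournament tree (same operations, no I/O model)."""

    def __init__(self):
        self._prio = {}    # element -> (priority, predecessor)
        self._heap = []    # (priority, element); may contain stale entries

    def update(self, x, p, pred=None):
        # INSERT / DECREASEKEY: takes effect only if p beats the current priority.
        if x not in self._prio or p < self._prio[x][0]:
            self._prio[x] = (p, pred)
            heapq.heappush(self._heap, (p, x))

    def delete(self, x):
        self._prio.pop(x, None)

    def delete_min(self):
        while self._heap:
            p, x = heapq.heappop(self._heap)
            if x in self._prio and self._prio[x][0] == p:
                return x, p, self._prio.pop(x)[1]
        return None


def sssp_undirected(adj, s):
    """adj maps a vertex to a list of (neighbor, weight) pairs; returns distances from s."""
    Q, Q2 = UpdatablePQ(), []          # Q2 plays the role of Q'
    Q.update(s, 0)
    dist = {}
    while True:
        top = Q.delete_min()
        if top is None:
            return dist
        x, px, pred = top
        if Q2 and Q2[0][0] < px:       # p_y < p_x: cancel a possible spurious update first
            _, y = heapq.heappop(Q2)
            Q.update(x, px, pred)      # re-insert (x, p_x) into Q
            Q.delete(y)                # y was visited earlier, so any entry for y is spurious
            continue
        dist[x] = px                   # p_x <= p_y: proceed as normal
        for w, weight in adj[x]:
            if w == pred:
                continue               # never update the predecessor
            Q.update(w, px + weight, x)
            heapq.heappush(Q2, (px + weight, x))   # record x's update attempt on w
```

On the graph with edges s-a (1), s-b (4), a-b (2), a-c (6), b-c (2), vertex b spuriously re-inserts s with priority 7; the entry (s, 4) recorded in Q' when s relaxed the edge {s,b} removes it before it can be retrieved.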

Planar Graphs: Shortest paths in planar graphs; planar separators; planar DFS.

Shortest Paths in Planar Graphs [Figure: source s and the graph G_R]

Observation: For every separator vertex v, the distances from s to v in G and G_R are the same. ⇒ The distances from s to all separator vertices can be computed in G_R.

Shortest Paths in Planar Graphs: Observation: For every vertex v in G_i, dist(s,v) = min{dist(s,x) + dist(x,v) : x ∈ ∂G_i}. ⇒ dist(s,v) can be computed in the graph shown in the figure. [Figure labels: v, s]

Shortest Paths in Planar Graphs: Three main steps: (1) solve all-pairs shortest paths in the subgraphs G_i; (2) compute shortest paths from s to the separator vertices in G_R; (3) compute shortest paths from s to all remaining vertices.

Shortest Paths in Planar Graphs: Regular h-partition: O(N/h) subgraphs G_1, ..., G_r. Each G_i has size at most h. Each G_i has boundary size O(√h). Total number of separator vertices: O(N/√h). The number of boundary sets is O(N/h).

Shortest Paths in Planar Graphs: Three main steps: (1) solve all-pairs shortest paths in the subgraphs G_i; (2) compute shortest paths from s to the separator vertices in G_R; (3) compute shortest paths from s to all remaining vertices. Assume the given partition is a regular B^2-partition ⇒ Steps 1 and 3 take O(scan(N)) I/Os ⇒ the graph G_R has O(N/B) vertices and O(N) edges.

Shortest Paths in Planar Graphs: Data structures: a list L storing the tentative distances of all vertices; a priority queue Q storing the vertices with their tentative distances as priorities. One step: retrieve the next vertex v using DELETEMIN; get the distances of v's neighbors from L; update their distances in Q using DELETE and INSERT ⇒ O(N + sort(N)) I/Os.

Shortest Paths in Planar Graphs: One I/O per boundary set. Each boundary set is touched O(B) times: once per vertex on the boundary of the region. O(N/B^2) boundary sets ⇒ O(N/B) I/Os.

Planar Separator: Goal: Compute a separator S of size O(N/√h) whose removal partitions G into subgraphs of size at most h. Basic idea: compute a hierarchy of log(DB) graphs of geometrically decreasing size using graph contraction; compute a separator of the smallest graph; undo the contractions and maintain the separator while doing this. Assumption: M = Ω(h log^2 B).

Planar Separator [Figure: graph hierarchy G_0, G_1, G_2]

Properties: All G_i are planar. |G_{i+1}| ≤ |G_i|/2. Every vertex in G_{i+1} represents only a constant number of vertices in G_i. Every vertex in G_{i+1} represents at most 2^(i+2) vertices in G_0. r = log_2(DB) graphs G_0, ..., G_r ⇒ |G_r| = O(N/(DB)).

Planar Separator [Figure: graph hierarchy G_0, G_1, G_2]

Compute a separator S_r of G_r: |S_r| = [bound not recoverable]. S_r partitions G_r into connected components of size at most h·log^2(DB). Takes O(|G_r|) = O(N/B) I/Os [AD96].

Planar Separator: Compute S_i from S_{i+1}: Let S_i be the set of vertices in G_i represented by the vertices in S_{i+1}. The connected components of G_i - S_i have size at most c·h·log^2(DB). Partition every connected component of size more than h·log^2(DB) into components of size at most h·log^2(DB) ⇒ separator S_i. Takes O(sort(|G_i|)) I/Os: connected components take O(sort(|G_i|)) I/Os; the partitioning happens in internal memory. Total: O(sort(N)) I/Os.

Planar Separator: The separator S_0 partitions G_0 into connected components of size at most h·log^2(DB). Size of S_0: [bound not recoverable]

Planar Separator: Compute a superset S of S_0 so that no connected component of G - S has size more than h: partition every connected component of G - S_0 separately in internal memory. Total number of extra separator vertices: O(N/√h). Extra cost: O(sort(N)) I/Os. Theorem: A separator S of size O(N/√h) whose removal partitions G into subgraphs of size at most h can be obtained in O(sort(N)) I/Os, provided that M = Ω(h log^2 B).

Building the Graph Hierarchy: Properties: All G_i are planar. |G_{i+1}| ≤ |G_i|/2. Every vertex in G_{i+1} represents only a constant number of vertices in G_i. Every vertex in G_{i+1} represents at most 2^(i+2) vertices in G_0. Build G_{i+1} from G_i by contracting edges and by merging vertices of degree 2 with the same neighbors.

Building the Graph Hierarchy: Iterative approach: extract the set of edges that can be contracted; contract a subset of these edges to reduce the number of vertices by a factor of two; repeat until no contractible edges remain. Problem: The standard graph contraction procedure may contract too many vertices into a single vertex.

Building the Graph Hierarchy: Solution: compute a maximal matching of the contractible subgraph and contract the edges in the matching. New problem: We may not contract a sufficient number of edges to reduce the number of vertices by a constant factor. Two-stage contraction: (1) contract a maximal matching; (2) contract edges between matched and unmatched vertices.

Building the Graph Hierarchy: Why is this two-stage approach good? No unmatched vertex remains in the contractible subgraph. Every matched vertex represents at least two vertices before the contraction ⇒ the size of the graph reduces by a factor of two ⇒ if a single iteration takes O(sort(|G_i|)) I/Os, the whole construction of G_{i+1} from G_i takes O(sort(|G_i|)) I/Os.

A Single Contraction Phase: A maximal matching can be computed and contracted in O(sort(|H|)) I/Os, where H is the current contractible subgraph. Bipartite contraction: takes O(sort(|H|)) I/Os using a buffer tree as priority queue.

Building the Graph Hierarchy: Lemma: Graph G_{i+1} can be constructed from G_i in O(sort(|G_i|)) I/Os. Corollary: The whole graph hierarchy can be built in O(sort(|G_0|)) = O(sort(N)) I/Os.

Planar DFS [Figure: layers Level 0, Level 1, Level 2 around the source s]

Observation: Every cycle in the i-th layer is a boundary cycle of the graph G_i. ⇒ Every bicomp of a layer is a cycle. [Figure labels: Level > i, Level < i]

DFS in a Layer

Planar DFS: DFS in a single layer H_i takes O(sort(|H_i|)) I/Os: compute the bicomps; root the bicomp tree; remove one of the edges incident to the parent cutpoint in each cycle. ⇒ Total I/O-complexity: O(sort(N)).

Planar DFS [Figure: graph G_i and vertex v]

Building the Face-on-Vertex Graph

Lower Bounds and Open Problems: Lower bounds: list ranking, BFS, DFS, and shortest paths; connected and biconnected components. Open problems.

Lower Bounds: Split Proximate Neighbors [Figure: example with the elements 1 to 8, each occurring twice]

Lemma: Split proximate neighbors requires Ω(perm(N)) I/Os. [Figure: reduction on the elements 1 to 8, with one step of cost I(N) and one step of cost O(scan(N))] Total: O(I(N) + scan(N)) = O(I(N)) ⇒ I(N) = Ω(perm(N)).

Lower Bounds: List Ranking: Consider general algorithms for weighted list ranking. The algorithm is only allowed to use the associativity of the sum operator ⇒ the algorithm can be made to have the following property: for every vertex v, v and succ(v) are both in main memory at some point during the course of the algorithm. Note: The lower bound we show does not hold for unweighted list ranking or weighted list ranking over groups.

Lower Bounds: List Ranking: When both copies of x are in main memory, move them to a buffer of size B. When the buffer is full, flush it to disk. Split proximate neighbors could then be solved in O(I(N) + scan(N)) I/Os ⇒ I(N) = Ω(perm(N)). [Figure: example with the elements 1 to 8]

Lower Bounds: List Ranking, BFS, DFS, and Shortest Paths: Theorem: List ranking requires Ω(perm(N)) I/Os. List ranking can be solved using BFS, DFS, or SSSP from the head of the list. Theorem: BFS, DFS, and SSSP require Ω(perm(N)) I/Os. Note: Again, the lower bound holds only for algorithms that compute distances from the source only by adding path lengths.
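The bounds scan(N), sort(N), and perm(N) used throughout can be evaluated for concrete parameters. The sketch below assumes the standard I/O-model definitions, which these slides do not spell out: scan(N) = N/B, sort(N) = (N/B) log_{M/B}(N/B), and perm(N) = min(N, sort(N)). The function names are hypothetical and constant factors are ignored.

```python
import math


def scan_bound(N, B):
    """Number of blocks needed to read N items sequentially."""
    return math.ceil(N / B)


def sort_bound(N, M, B):
    """(N/B) * log_{M/B}(N/B), with the logarithm taken to be at least 1."""
    blocks = N / B
    return blocks * max(1.0, math.log(blocks) / math.log(M / B))


def perm_bound(N, M, B):
    """Permuting costs the smaller of N (one I/O per item) and sort(N)."""
    return min(N, sort_bound(N, M, B))
```

For realistic parameters (for example N = 2^20, M = 2^14, B = 2^7) perm(N) coincides with sort(N), which is why the lower bounds above are often read as sorting lower bounds.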

Lower Bounds: Segmented Duplicate Elimination: Let P ≤ N ≤ P^2. Elements are drawn from the interval [2P+1, 3P]. Construct a Boolean array C[2P+1..3P] s.t. C[i] = 1 iff i ∈ S. Proposition: Segmented duplicate elimination requires Ω(perm(N)) I/Os. [Figure: example sequence S over the values 17 to 24, divided into segments of length P/2]

Lower Bounds: Connected Components: Graph construction takes O(scan(N)) I/Os; |V| = Θ(P), |E| = N. [Figure: segments S_1, S_2, S_3, S_4 of the example sequence, value vertices 17 to 24, and segment vertices 1 to 4]

Lower Bounds: Connected and Biconnected Components: Theorem: Computing the connected components of a graph G = (V,E) requires Ω(perm(|E|)) I/Os. Theorem: Computing the biconnected components of a graph G = (V,E) requires Ω(perm(|E|)) I/Os.

More Classes of Sparse Graphs: Grid graphs: separators of size [bound not recoverable] in O(sort(N)) I/Os; BFS/SSSP: O(sort(N)); DFS: [not recoverable]. Graphs of bounded treewidth: separators: O(N/h) in O(sort(N)) I/Os; BFS/SSSP: O(sort(N)); DFS: ???

Open Problems: Optimal separators for grid graphs. DFS: grid graphs; graphs of bounded treewidth. Semi-external shortest paths. Optimal connectivity. Optimal BFS, DFS, and shortest paths or lower bounds. Directed graphs: topological sorting; strongly connected components.
