Outline: Transitive Closure Compression

Slides:



Advertisements
Similar presentations
Longest Common Subsequence
Advertisements

Divide and Conquer. Subject Series-Parallel Digraphs Planarity testing.
Transitive Closure Compression Jan. 2013Yangjun Chen ACS Outline: Transitive Closure Compression Motivation DAG decomposition into node-disjoint.
Jan. 2013Dr. Yangjun Chen ACS Outline Signature Files - Signature for attribute values - Signature for records - Searching a signature file Signature.
Yangjun Chen 1 Bipartite Graphs What is a bipartite graph? Properties of bipartite graphs Matching and maximum matching - alternative paths - augmenting.
Implementation of Graph Decomposition and Recursive Closures Graph Decomposition and Recursive Closures was published in 2003 by Professor Chen. The project.
An Efficient Algorithm for Answering Graph Reachability Queries Yangjun Chen, Yibin Chen Dept. Applied Computer Science, University of Winnipeg 515 Portage.
Introduction to Graphs
Yangjun Chen 1 Bipartite Graph 1.A graph G is bipartite if the node set V can be partitioned into two sets V 1 and V 2 in such a way that no nodes from.
Reachability Queries Sept. 2014Yangjun Chen ACS Outline: Reachability Query Evaluation What is reachability query? Reachability query evaluation.
Recursive Graph Deduction and Reachability Queries Yangjun Chen Dept. Applied Computer Science, University of Winnipeg 515 Portage Ave. Winnipeg, Manitoba,
Maximal Independent Set Distributed Algorithms for Multi-Agent Networks Instructor: K. Sinan YILDIRIM.
1 Efficient Discovery of Conserved Patterns Using a Pattern Graph Inge Jonassen Pattern Discovery Arwa Zabian 13/07/2015.
Important Problem Types and Fundamental Data Structures
9.2 Graph Terminology and Special Types Graphs
Chapter 9 – Graphs A graph G=(V,E) – vertices and edges
Chapter 2 Graph Algorithms.
GRAPHS CSE, POSTECH. Chapter 16 covers the following topics Graph terminology: vertex, edge, adjacent, incident, degree, cycle, path, connected component,
Union-find Algorithm Presented by Michael Cassarino.
Data Structures & Algorithms Graphs
Chapter 10 Graph Theory Eulerian Cycle and the property of graph theory 10.3 The important property of graph theory and its representation 10.4.
GRAPHS. Graph Graph terminology: vertex, edge, adjacent, incident, degree, cycle, path, connected component, spanning tree Types of graphs: undirected,
PC-Trees & PQ-Trees. 2 Table of contents Review of PQ-trees –Template operations Introducing PC-trees The PC-tree algorithm –Terminal nodes –Splitting.
Chapter 8: Relations. 8.1 Relations and Their Properties Binary relations: Let A and B be any two sets. A binary relation R from A to B, written R : A.
Finding Regular Simple Paths Sept. 2013Yangjun Chen ACS Finding Regular Simple Paths in Graph Databases Basic definitions Regular paths Regular simple.
Iterative Improvement for Domain-Specific Problems Lecturer: Jing Liu Homepage:
Data Structures and Algorithm Analysis Graph Algorithms Lecturer: Jing Liu Homepage:
Chapter 11. Chapter Summary  Introduction to trees (11.1)  Application of trees (11.2)  Tree traversal (11.3)  Spanning trees (11.4)
CS 361 – Chapter 13 Graph Review Purpose Representation Traversal Comparison with tree.
Capabilities, Minimization, and Transformation of Sequential Machines
Directed Graphs 12/7/2017 7:15 AM Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia,
A Linear-Space Top-down Algorithm for Tree Inclusion Problem
Maximum Flow c v 3/3 4/6 1/1 4/7 t s 3/3 w 1/9 3/5 1/1 3/5 u z 2/2
Special Graphs By: Sandeep Tuli Astt. Prof. CSE.
Chapter 5 : Trees.
Computing Connected Components on Parallel Computers
DATA STRUCTURES AND OBJECT ORIENTED PROGRAMMING IN C++
The minimum cost flow problem
GRAPHS Lecture 16 CS2110 Fall 2017.
Graph theory Definitions Trees, cycles, directed graphs.
V12: Network Flows V12 follows closely chapter 12.1 in
Bipartite Graphs What is a bipartite graph?
Heaps © 2010 Goodrich, Tamassia Heaps Heaps
Lectures on Network Flows
Directed Graphs 9/20/2018 1:45 AM Presentation for use with the textbook Data Structures and Algorithms in Java, 6th edition, by M. T. Goodrich, R. Tamassia,
(edited by Nadia Al-Ghreimil)
Depth-First Search.
The Taxi Scheduling Problem
Planarity Testing.
Graphs Graph transversals.
ICS 353: Design and Analysis of Algorithms
Maximum Flow c v 3/3 4/6 1/1 4/7 t s 3/3 w 1/9 3/5 1/1 3/5 u z 2/2
Graphs All tree structures are hierarchical. This means that each node can only have one parent node. Trees can be used to store data which has a definite.
Connected Components, Directed Graphs, Topological Sort
Interval Partitioning of a Flow Graph
Directed Graphs Directed Graphs Directed Graphs 2/23/ :12 AM BOS
Bipartite Graph 1. A graph G is bipartite if the node set V can be partitioned into two sets V1 and V2 in such a way that no nodes from the same set are.
Algorithms (2IL15) – Lecture 7
Longest increasing subsequence (LIS) Matrix chain multiplication
Maximum Flow c v 3/3 4/6 1/1 4/7 t s 3/3 w 1/9 3/5 1/1 3/5 u z 2/2
Text Book: Introduction to algorithms By C L R S
V12 Menger’s theorem Borrowing terminology from operations research
Advanced Algorithms Analysis and Design
On the Graph Decomposition
GRAPHS Lecture 17 CS2110 Spring 2018.
Graph Algorithms DS.GR.1 Chapter 9 Overview Representation
Important Problem Types and Fundamental Data Structures
Heaps & Multi-way Search Trees
INTRODUCTION A graph G=(V,E) consists of a finite non empty set of vertices V , and a finite set of edges E which connect pairs of vertices .
Minimum Spanning Trees
Presentation transcript:

Outline: Transitive Closure Compression Motivation DAG decomposition into node-disjoint chains - Graph stratification - Virtual nodes - Maximum set of node-disjoint paths Jan. 2017 Yangjun Chen ACS-7102

Motivation A simple method - store a transitive closure as a matrix G: b c d e a b c d e 1 c b a d e G: a b c d e a b c d e 1 c b a d e G*: Space overhead: O(n2) Query time: O(1) M* = Jan. 2017 Yangjun Chen ACS-7102

DAG Decomposition into Node-Disjoint Chains A DAG is a directed acyclic graph (a graph containing no cycles). A DAG is a directed acyclic graph (a graph containing no cycles). On a chain, if node v appears above node u, there is a path from v to u in G. On a chain, if node v appears above node u, there is a path from v to u in G. a f b c d e g h i g h i b c d a f e Jan. 2017 Yangjun Chen ACS-7102

Decomposition of a DAG into a set of node-disjoint chains b c d e g h i a c e f b d g h i Jan. 2017 Yangjun Chen ACS-7102

Based on such a chain decomposition, we can assign each node an index as follows: (1) Number each chain and number each node on a chain. The jth node on the ith chain will be assigned a pair (i, j) as its index. a c e f b d g h i (1, 1) (1, 2) (1, 3) (2, 1) (2, 2) (2, 3) (3, 1) (3, 2) (3, 3) Jan. 2017 Yangjun Chen ACS-7102

Each node v on the ith chain will also be associated with an index sequence of length k : (1, j1) … (i – 1, ji-1) (i + 1, ji+1) … (k, jk) such that any node with index (x, y) is a descendant of v if x = i and y < j or x  i but y  jx, where k is the number of the disjoint chains. a c e f b d g h i (1, 1) (2, 2)(3, 3) (2, 3)(3, _) (2, _)(3, _) (2, 1) (2, 2) (2, 3) (1, 2)(3, 3) (1, 3)(3, 3) (1, _)(3, _) (3, 1) (3, 2) (3, 3) (1, _)(2, 3) (1, 3)(2, _) (1, _)(2, _) (1, 2) (1, 3) The space complexity is bounded by O(kn). Jan. 2017 Yangjun Chen ACS-7102

Construction of Index Sequences Each leaf node is exactly associated with one index, which is trivially sorted. Let v1, ..., vl be the child nodes of v, associated with the index sequences L1, ..., Ll, respectively. Assume that |Li|  b (1 i  l) and the indexes in each Li are sorted according to the first element in each index. We will merge all Li’s into a new index sequence and associate it with v. This can be done as follows. First, make a copy of L1, denoted L. Then, we merge L2 into L by scanning both of them from left to right. Let (a1, b1) (from L) and (a2, b2) (from L2) be the index pair encountered. We will do the following checkings: Jan. 2017 Yangjun Chen ACS-7102

- If a2 > a1, we go to the index next to (a1, b1) and compare it with (a2, b2) in a next step. - If a1 > a2, insert (a2, b2) just before (a1, b1). Go to the index next to (a2, b2) and compare it with (a1, b1) in a next step. - If a1 = a2, we will compare b1 and b2. If b1 > b2, nothing will be done. If b2 > b1, replace b1 with b2. In both cases, we will go to the indexes next to (a1, b1) and (a2, b2), respectively. We will repeatedly merge L2, ..., Ll into L. Obviously, |L|  b and the indexes in L are sorted. The time spent on this process is O(dvk), where dv represents the outdegree of v. So the whole cost is bounded by O( dvk) = O(ke), å v where e is the number of edges of G. Jan. 2017 Yangjun Chen ACS-7102

Graph Stratification Definition (DAG stratification) Let G(V, E) be a DAG. The stratifi- cation of G is a decomposition of V into subsets V1, V2,..., Vh such that V = V1  V2  ... Vh and each node in Vi has its children appearing only in Vi-1, ..., V1 (i = 2, ..., h), where h is the height of G, i.e., the length of the longest path in G. For each node v in Vi, its level is said to be i, denoted l(v) = i. In addition, Cj(v) (j < i) represents a set of links with each pointing to one of v’s children, which appears in Vj. Therefore, for each v in Vi, there exist i1, ..., ik (il < i, l = 1, ..., k) such that the set of its children equals (v)  ...  (v). Assume that Vi = {v1, v2, ..., vk}. We use (j < i) to represent Cj(v1)  ...  Cj(vl). C i1 C ik Cj i Such a DAG decomposition can be done in O(e) time, by using the following algorithm. Jan. 2017 Yangjun Chen ACS-7102

G1/G2 - a graph obtained by deleting the edges of G2 from G1. G1  G2 - a graph obtained by adding the edges of G1 and G2 together. (v, u) - an edge from v to u. d(v) - v’s outdegree. Algorithm graph-stratification(G) begin 1. V1 := all the nodes with no outgoing edges; 2. for i = 1 to h - 1 do 3. { W := all the nodes that have at least one child in Vi; 4. for each node v in W do 5. { let v1, ..., vk be v’s children appearing in Vi; 6. Ci(v) := {links to v1, ..., vk}; 7. if d(v) > k then remove v from W; 8. G := G/{(v, v1), ..., (v, vk)}; 9. d(v) := d(v) - k;} 10. Vi+1 := W; 11. } end Jan. 2017 Yangjun Chen ACS-7102

no outgoing edges (see line 1). In the above algorithm, we first determine V1, which contains all those nodes having no outgoing edges (see line 1). In the subsequent computation, we determine V2, ..., Vh. In order to determine Vi (i > 1), we will first find all those nodes that have at least one child in Vi-1 (see line 3), which are stored in a temporary variable W. For each node v in W, we will then check whether it also has some children not appearing in Vi-1, which can be done in a constant time as demonstrated below. During the process, the graph G is reduced step by step, and so does d(v) for each v (see lines 8 and 9). First, we notice that after the jth iteration of the out-most for-loop, V1 , ..., Vj+1 are determined. Denote Gj(V, Ej) the reduced graph after the jth iteration of the out-most for-loop. Then, any node v in Gj, except those in V1  ...  Vj+1, does not have children appearing in V1  ...  Vj. Denote dj(v) the outdegree of v in Gj. Thus, in order to check whether v appearing in Gi-1 has some children not appearing in Vi, we need only to check whether di-1(v) is strictly larger than k, the number of the child nodes of v appearing in Vi (see line 7). Jan. 2017 Yangjun Chen ACS-7102

The nodes of the DAG are divided into four levels: V1 = {d, e, i}, b c d e g h i V4: a C3(a) = {c} f C3(f) = {b} V3: b C2(b) = {c} C1(b) = {i} g C2(g) = {h} C1(g) = {d} V2: c C1(c) = {d, e} h C1(h) = {e, i} V1: d e i The nodes of the DAG are divided into four levels: V1 = {d, e, i}, V2 = {c, h}, V3 = {b, g}, and V4 = {a, f}. Associated with each node at each level is a set of links pointing to its children at different levels. Jan. 2017 Yangjun Chen ACS-7102

Find a minimum set of node disjoint chains for a given DAG G such that on each chain if node v is above node u, then there is a path from v to u in G. Step 1: Stratify G into a series of bipartite graphs. Step 2: Find a maximum matching for each bipartite graph (which may contain the so-called virtual nodes.) All the matchings make up a set of node-disjoint chains. Step 3: resolve all the virtual nodes. Jan. 2017 Yangjun Chen ACS-7102

Example. V2: V1: V0: V1: M1: V0: M2: V2: V1: b e h d g b e h d g c f i j a V0: c f i j a V0: V1: b e h c f i j a b e h c f i j a M1: g V1: V2: b e h d g M2: d b e h Jan. 2017 Yangjun Chen ACS-7102

Example. a set of 6 chains a set of 5 chains M1  M2: M2: M1: b e h d g M2: M1  M2: d g M1: b e h c f i j a b e h c f i j a a set of 5 chains d g d g b e h b e h c f i j a c f i j a Jan. 2017 Yangjun Chen ACS-7102

Virtual Nodes Vi-1’. Vi’ = Vi  {virtual nodes introduced into Vi}. Ci =  {all the new edges from the nodes in Vi to the virtual nodes introduced into Vi-1}. G(Vi, Vi-1’, Ci) represents the bipartite graph containing Vj and Vi-1’. Jan. 2017 Yangjun Chen ACS-7102

Definition (virtual nodes for actual nodes) Let G(V, E) be a DAG, divided into V0, ..., Vh-1 (i.e., V = V0  ...  Vh-1). Let Mi be a maximum matching of the bipartite graph G(Vi, Vi-1’; Ci) and v be a free actual node (in Vi-1’) relative to Mi (i = 1, ..., h - 1). Add a virtual node v’ into Vi. In addition, for each node u  Vi+1, a new edge u  v’ will be created if one of the following two conditions is satisfied: 1. u  v  E; or 2. There exists an edge (v1, v2) covered by Mi such that v1 and v are connected through an alternating path relative to Mi; and u  Bi+1(v1) or u  Bi+1(v2). v is called the source of v’, denoted s(v’). Bj(v) represents a set of links with each pointing to one of v’s parents, which appears in Vj. Jan. 2017 Yangjun Chen ACS-7102

Example. V2: V1: V0: V1: M1: V0: V2: M2: V1’: b e h d g b e h d g c f i j a V0: c f i j a V0: V1: b e h M1: b e h c f i j a c f i j a V2: d g M2: d g V1’: b e i’ h a’ b e i’ h a’ Jan. 2017 Yangjun Chen ACS-7102

Example. a set of 5 chains M1: b e h c f i j a M1  M2: d g b e i’ h a’ M2: d g c f i j a b e i’ h a’ To obtain the final result, the virtual nodes have to be resolved. Jan. 2017 Yangjun Chen ACS-7102

Virtual node resolution Definition (alternating graph) Let Mi be a maximum matching of G(Vi, Vi-1’; Ci). The alternating graph with respect to Mi is a directed graph with the following sets of nodes and edges: V( ) = Vi  Vi-1’, and E( ) = {u  v | u  Vi-1’, v  Vi, and (u, v)  Mi}  {v  u | u  Vi-1’, v  Vi, and (u, v)  Ci\Mi}. Jan. 2017 Yangjun Chen ACS-7102

Example. V0: V1: V2: V1’: : : b e h c f i j b e i’ d g h a’ j h a i’ d Jan. 2017 Yangjun Chen ACS-7102

Combined graph: Combine and by connecting some nodes v’ in to some nodes u in if the following conditions are satisfied. v’ is a virtual node appearing in Vi’. (Note that V( ) = Vi+1  Vi’.) There exist a node x in Vi+1 and a node y in Vi such that (x, v’)  Mi+1, x  y  Ci+1, and (y, u)  Mi. X y b e i’ d g h a’ : X b e h : c f i j u Jan. 2017 Yangjun Chen ACS-7102

Example.  : : f e c b j h a i i’ d e a’ g i’ d e a’ g f c b j h a i Jan. 2017 Yangjun Chen ACS-7102

In order to resolve as many virtual nodes (appearing in Vi’) as possible, we need to find a maximum set of node-disjoint paths (i.e., no two of these paths share any nodes), each starting at virtual node (in ) and ending at a free node in , or ending at a free node in . i’ e a’ g f c b j h a i’ d e a’ g f c b j h a i i’ e a’ g f c b i Jan. 2017 Yangjun Chen ACS-7102

- The problem of finding a maximal set of node-disjoint paths can be solved by transforming it to a maximum flow problem. - Generally, to find a maximum flow in a network, we need O(n3) time. However, a network as constructed above is a 0-1 network. In addition, for each node v, we have either din(v)  1 or dout(v)  1, where din(v) and dout(v) represent the indegree and outdegree of v in  , respectively. It is because each path in  is an alternating path relative to Mi+1 or relative to Mi. So each node except sources and sinks is an end node of an edge covered by Mi+1 or by Mi. As shown in ([14]), it needs only O(n2.5) time to find a maximum flow in such kind of networks. Jan. 2017 Yangjun Chen ACS-7102

Virtual nodes will be removed. c f i j a d g M1  M2: h a’ d g b e i’ h a’ c f i j a i’ e a’ g f c b j h a Virtual nodes will be removed. Jan. 2017 Yangjun Chen ACS-7102

Definition (virtual nodes for virtual nodes) Let Mi be a maximum matching of the bipartite graph G(Vi, Vi-1’; Ci) and v’ be a free virtual node (in Vi-1’) relative to Mi (i = 1, ..., h - 1). Add a virtual node v’’ into Vi. Set s(v’’) to be w = s(v’). Let l(w) = j. For each node u  Vi+1, a new u  v’ will be created if there exists an edge (v1, v2) covered by Mj+1 such that v1 and w are connected through an alternating path relative to Mj+1; and u  Bi+1(v1) or u  Bi+1(v2). Example. p q c f i k V0: b d h V1: e g V2: p q V3: e g b d h c f i k Jan. 2017 Yangjun Chen ACS-7102

Example. V0: V1: M1: V1’: V2: M2 : V2’: V3: M3: c f i k b d h c f i k g V2: b f’ d h e g M2 : b f’’ d h V2’: p q V3: b’ M3: p q b’ f’’ d h Jan. 2017 Yangjun Chen ACS-7102

b f’ d h b’ f’’ e g c f i k p q b f’ d h p e g c f i k q q p e g b d h c f i k Jan. 2017 Yangjun Chen ACS-7102

Node-disjoint Paths in Combined Graphs Now we discuss an algorithm for finding a maximal set of node-disjoint paths in a combined graph  . Its time complexity is bounded by O(en1/2), where n = V(  ) and e = E(  ). It is in fact a modified version of Dinic’s algorithm [6], adapted to combined graphs, in which each path from a virtual node to a free node relative to Mi+1 or relative to Mi is an alternating path, and for each edge (u, v)  Mi+1  Mi, we have dout(u) = din(v) = 1. Therefore, for any three nodes v, v’, and v’’ on a path in  , we have dout(v) = din(v’) = 1, or dout(v’) = din(v’’) = 1. We call this property the alternating property, which enables us to do the task efficiently by using a dynamical arc-marking mechanism. An arc u  v with dout(u) = din(v) = 1 is called a bridge. Jan. 2017 Yangjun Chen ACS-7102

Our algorithm works in multiple phases. In each phase, the arcs in  will be marked or unmarked. We also call a virtual node in  an origin and a free node a terminus. An origin is said to be saturated if one of its outgoing arcs is marked; and a terminus is saturated if one of its incoming arcs is marked. In the following discussion, we denote  by A. At the very beginning of the first phase, all the arcs in A are unmarked. Jan. 2017 Yangjun Chen ACS-7102

Let V0 be the set of all the unsaturated origins (appearing in ). In the kth phase (k  1), a subgraph A(k) of A will be explored, which is defined as follows. Let V0 be the set of all the unsaturated origins (appearing in ). Define Vj (j > 0) as below: Ej-1 = { u  v  E(A) | u  Vj-1, v  V0  V1  ...  Vj-1, u  v is unmarked}  { v  u  E(A) | u  Vj-1, v  V0  V1  ...  Vj-1, v  u is marked}, Vj = { v  V(A) | for some u, u  v is unmarked and u  v  Ej-1}  { v  V(A) | for some u, v  u is marked and v  u  Ej-1}. Jan. 2017 Yangjun Chen ACS-7102

Define j* = min{j | Vj  {unsaturated terminus}  }. (Note that the terminus appearing in are the free nodes relative to Mi+1; and those appearing in are the free nodes relative to Mi.) A(k) is formed with V(A(k)) and E(A(k)) defined below. If j* = 1, then V(A(k)) = V0  (Vj*  {unsaturated terminus}), E(A(k)) = {u  v | u  Vj*-1, and v  {unsaturated terminus}}. If j* > 1, then V(A(k)) = V0  V1  ...  Vj*-1  (Vj*  {unsaturated terminus}), E(A(k)) = E0  E1  ...  Ej*-2  {u  v | u  Ej*-1, and v  {unsaturated terminus}}. The sets Vj are called levels. Jan. 2017 Yangjun Chen ACS-7102

In , a node sequence v1, ..., vj, vj+1, ..., vl is called a complete sequence if the following conditions are satisfied. (1) v1 is an origin and vl is a terminus. For each two consecutive nodes vj, vj+1 (j = 1, ..., l - 1), we have an unmaked arc vj  vj+1 in A(k), or a marked arc vj+1  vj in A(k). Our algorithm will explore to find a set of node-disjoint complete sequences (i.e., no two of them share any nodes.) Then, we mark and unmark the arcs along each complete sequence as follows. (i) If (vj, vj+1) corresponds to an arc in A(k), mark that arc. (ii) If (vj+1, vj) corresponds to an arc in A(k) , unmark that arc. Obviously, if for an A(k) there exists j such that Vj = Φ and Vi  {unsaturated terminus} = Φ for i < j, we cannot find a complete sequence in it. In this case, we set A(k) to Φ and then the kth phase is the last phase. Jan. 2017 Yangjun Chen ACS-7102

c e a c e g b h d f a g d f b h a c e g b h d f c e a g d f b h c e a Jan. 2017 Yangjun Chen ACS-7102

Algorithm subgraph-exploring() begin 1. let v be the first element in V0; 2. push(v, H); mark v ‘accessed’; 3. while H is not empty do { 4. v := top(H); (*the top element of H is assigned to v.*) 5. while neighbor(v)  F do { 6. let u be the first element in neighbor(v); 7. if u is accessed then remove u from neighbor(v) 8. else {push(u, H); mark u ‘accessed’; v := u;} } 10. if v is neither in Vj* nor in V0 then pop(H) 11. else {if v is in Vj* then output all the elements in H; (*all the elements in H make up a complete sequence.*) 12. remove all elements in H; 13. let v be the next element in V0; 14. push(v, H); mark v; 15. } end Jan. 2017 Yangjun Chen ACS-7102

Algorithm node-disjoint-paths(A) begin 1. k := 1; 2. construct A(1) ; 3. while A(k)   do { 4. call subgraph-exploring(A(k)); 5. let P1, ... Pl be all the found complete sequences; 6. for j = 1 to l do 7. { let Pj = v1, v2, ... , vm; 8. mark vi  vi+1 or unmark vi+1  vi (i = 1, ..., m - 1) according to (i) and (ii) above; 9. } 10. k := k + 1; construct A(k); 11. } end Jan. 2017 Yangjun Chen ACS-7102