Graph: representation and traversal CISC4080, Computer Algorithms

CIS, Fordham Univ. Instructor: X. Zhang

Outline
  Graph Definition
  Graph Representation
  Path, Cycle, Tree, Connectivity
  Graph Traversal Algorithms
    Breadth-first search/traversal
    Depth-first search/traversal
  …

Graphs - Background
Graphs = a set of nodes (vertices) with edges (links) between them.
G = (V, E) - graph
  V = set of vertices (size of V = n)
  E = set of edges (size of E = m)
Directed graph: each edge has a direction; each edge is an ordered pair.
Example: V = {1,2,3,4}, E = {(1,2), (2,2), (3,1), (2,3), (2,4), (3,4), (4,3)}

Graphs - Background
Graphs = a set of nodes (vertices) with edges (links) between them.
Notations: G = (V, E) - graph
  V = set of vertices (size of V = n)
  E = set of edges (size of E = m)
Undirected graph: no direction on edges; each edge can be represented as a set of two nodes.
Example: V = {1,2,3,4}, E = {{1,2}, {1,3}, {2,4}, {2,3}}

Graphs: application
Applications that involve not only a set of items, but also connections between them.
Undirected graph: the connection/relation between items is symmetric/mutual
  e.g., social networks, such as actors/actresses with the "co-starring in a film" relation
Directed graph: the relation between items is not symmetric
  e.g., the WWW (nodes are webpages, edges are hyperlinks)

Graph: State Space of Games
Tic-Tac-Toe and other board games: nodes: board configuration edges: legal moves MinMax: a decision rule in AI, game theory.. for minimizing possible loss for a worst case (maximum loss) scenario.

Graph and Puzzle
Many puzzles can be modeled as a graph:
  nodes: puzzle states (configurations)
  edges: legal moves
Such a graph representation (the state space of the puzzle) allows one to use graph algorithms (traversal, A* search) to solve the puzzle.
Exercise: how many nodes is the “initial State” node connected with? Is the graph directed?

Graph Representation How to store a graph, G=(V,E) in computer program? V: the set of vertices/nodes a vector or an array of nodes E: the set of edges (adjacency relation between nodes) adjacency lists, or adjacency matrix

Adjacency list, G=(V,E)
For each node u in V, maintain a list Adj[u], storing all vertices v such that there is an edge from u to v.
Organize the adjacency lists into an array.
Can be used for both directed and undirected graphs.
Example (undirected graph on nodes 1-5):
  Adj[1]: 2, 5
  Adj[2]: 1, 5, 3, 4
  Adj[3]: 2, 4
  Adj[4]: 2, 5, 3
  Adj[5]: 4, 1, 2
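A minimal Python sketch of the adjacency-list idea, assuming the graph is given as a list of edge pairs. The function name build_adjacency_list and the edge list are illustrative choices, not part of the slides; the edges are read off the Adj[] example on this slide.

```python
from collections import defaultdict

def build_adjacency_list(edges, directed=False):
    """Return a dict mapping each vertex u to its list Adj[u]."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)  # undirected: the edge appears in both lists
    return dict(adj)

# Assumed edge list for the 5-node undirected example.
edges = [(1, 2), (1, 5), (2, 5), (2, 4), (2, 3), (3, 4), (4, 5)]
adj = build_adjacency_list(edges)
print(adj[2])  # neighbors of node 2
```

In an undirected graph every edge is stored twice, so the total length of all lists is 2m.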

Properties of Adjacency List Representation
Memory required: Θ(m+n)
Preferred when the graph is sparse: m << n^2
Disadvantage: no quick way to determine whether there is an edge between nodes u and v
Time to determine if (u, v) exists: O(degree(u))
Time to list all vertices adjacent to u: Θ(degree(u))

Adjacency matrix representation, G = (V, E)
Assume vertices are numbered 1, 2, …, n.
Represent E using an n x n matrix A, where
  a_ij = 1 if (i, j) belongs to E, i.e., if there is an edge (i, j)
  a_ij = 0 otherwise
For undirected graphs the matrix A is symmetric: a_ij = a_ji, i.e., A = A^T.
(Figure: an undirected graph on nodes 1-5 and its 5 x 5 adjacency matrix.)
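A short Python sketch of the matrix representation, using a list of lists with 1-based vertex numbers (row and column 0 are left unused). The helper name build_adjacency_matrix is an assumption; the two edge sets are the directed and undirected examples from the background slides.

```python
def build_adjacency_matrix(n, edges, directed=False):
    """Return an (n+1) x (n+1) 0/1 matrix; index 0 is unused padding."""
    a = [[0] * (n + 1) for _ in range(n + 1)]
    for i, j in edges:
        a[i][j] = 1
        if not directed:
            a[j][i] = 1  # keep the matrix symmetric
    return a

directed_edges = [(1, 2), (2, 2), (3, 1), (2, 3), (2, 4), (3, 4), (4, 3)]
undirected_edges = [(1, 2), (1, 3), (2, 4), (2, 3)]
A_dir = build_adjacency_matrix(4, directed_edges, directed=True)
A_und = build_adjacency_matrix(4, undirected_edges)
print(A_dir[3][1], A_dir[1][3])  # edge (3,1) exists, (1,3) does not
```

Checking whether (u, v) is an edge is a single lookup, a[u][v], while listing the neighbors of u means scanning a whole row.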

Properties of Adjacency Matrix Representation
Memory required: Θ(n^2), independent of the number of edges in G
Preferred when
  the graph is dense: |E| is close to |V|^2
  we need to quickly determine if there is an edge between two vertices
Time to list all vertices adjacent to u: Θ(n)
Time to determine if (u, v) belongs to E: Θ(1)

Weighted Graphs Weighted graphs = graphs for which each edge has an associated weight w(u, v) w: E -> R, weight function Storing the weights of a graph Adjacency list: Store w(u,v) along with vertex v in u’s adjacency list Adjacency matrix: Store w(u, v) at location (u, v) in the matrix

Path, Cycle, Tree, Connectivity Graph Traversal Algorithms basis of other algorithms

Paths
A path in a graph G=(V,E) is a sequence of nodes v_1, v_2, …, v_k such that every consecutive pair v_{i-1}, v_i is joined by an edge in E, i.e., (v_{i-1}, v_i) is an edge.
A path is simple if all nodes in the path are distinct.
A cycle is a path v_1, v_2, …, v_k where v_1 = v_k, k > 2, and the first k-1 nodes are all distinct.
Exercises: for graphs G1 and G2 (figures, each on nodes 1, 2, 3, 4), are the following sequences of nodes a path, a simple path, a cycle?
  1,2,3    1,2,3,1,2,4    1,2,3,1    1,3,2    1,3,2    1,2,3,4

Trees
A graph is connected if, for any pair of nodes u, v, there is a path connecting u and v.
An undirected graph is a tree if it is connected and does not contain a cycle.
For any tree: |E| = |V| - 1


Searching in a Graph Graph searching/traversal = systematically follow edges of graph to visit all vertices of graph Two basic graph searching: Breadth-first search Depth-first search Difference: the order in which they explore unvisited edges of the graph

Breadth-First Search (BFS)
Input: graph G = (V, E) (directed or undirected), a source vertex s from V
Goal: follow edges of G to “discover” every vertex reachable from s
  first discover vertices one hop away from s
  then discover vertices two hops away
  …, until all reachable vertices have been visited
Output:
  d[v]: distance (smallest hop count) from s to v, for all v
  prev[v]: previous hop in a shortest path from s to v
Example (graph on nodes 1-5, s=1): d[2]=1, prev[2]=1; d[5]=1, prev[5]=1; d[3]=2, prev[3]=2; d[4]=2, prev[4]=2

Breadth-First Search (BFS)
Input: graph G = (V, E) (directed or undirected), a source vertex s from V
Output:
  d[v]: distance (smallest hop count) from s to v, for all v
  prev[v]: previous node in a shortest path from s to v: s, …, prev[v], v
Observation:
  The output (prev[]) is not unique
  All edges (prev[v], v) form a BFS tree (shown in red in the figure): (1,2), (1,5), (2,4), (2,3) for the example
Example (s=1): d[2]=1, prev[2]=1; d[5]=1, prev[5]=1; d[3]=2, prev[3]=2; d[4]=2, prev[4]=2 or prev[4]=5

BFS: Example Suppose we perform BFS on following graph, with s=1
What’s the output? d[v]: distance (shortest hop) from s to v, for all v prev[v]: previous node in shortest-hop path from s to v, s, …, prev[v], v

BFS Alg
To avoid cycles (revisiting vertices), color each vertex:
  White: not yet discovered
  Gray: discovered, but not explored yet
  Black: discovered, and explored
BFS algorithm
  1. Initially, all vertices are white
  2. The source node s is discovered first, so color it gray
  3. When a vertex is discovered, color it gray, and insert it into a FIFO queue
  4. Take the next gray vertex u from the queue, follow the edges coming out of u to discover new vertices (white => gray, and insert into the queue)
  5. After discovering all its adjacent vertices, u becomes black
  6. Go back to 4, until the queue is empty

Breadth-First Tree
BFS constructs a breadth-first tree:
  Initially it contains the root (source vertex s)
  When vertex v is discovered while scanning the adjacency list of a vertex u ⇒ vertex v and edge (u, v) are added to the tree
  A vertex is discovered only once ⇒ it has only one parent
  u is the predecessor (parent) of v in the breadth-first tree
The breadth-first tree contains the nodes that are reachable from the source node, and all edges from each node’s predecessor to the node.

BFS Application
BFS constructs a breadth-first tree.
BFS finds a shortest (hop-count) path from the source node to all other reachable nodes.
E.g., what is the shortest path from 1 to 3?
  Perform BFS using node 1 as source node
  Node 2 is discovered while exploring node 1’s adjacent nodes => pred. of node 2 is node 1
  Node 3 is discovered while exploring node 2’s adjacent nodes => pred. of node 3 is node 2
  So the shortest hop-count path is: 1, 2, 3
Useful when we want to find the minimal number of steps to reach a state.

BFS: Implementation Detail
G = (V, E) represented using adjacency lists
color[u]: color of vertex u in V
pred[u]: predecessor of u
  If u = s (root) or node u has not yet been discovered, then pred[u] = NIL
d[u]: distance (hop count) from source s to vertex u
Use a FIFO queue Q to maintain the set of gray vertices
(Figure: example with source 1. Nodes 2 and 5 have d=1, pred=1; node 3 has d=2, pred=2; node 4 has d=2, pred=5.)

BFS(V, E, s)
  for each u in V - {s}
    do color[u] = WHITE
       d[u] ← ∞
       pred[u] = NIL
  color[s] = GRAY
  d[s] ← 0
  pred[s] = NIL
  Q = empty
  Q ← ENQUEUE(Q, s)
(Figure: example graph on vertices r, s, t, u, v, w, x, y; after initialization d[s]=0, all other d values are ∞, and Q: s.)

BFS(V, E, s), continued
  while Q not empty
    do u ← DEQUEUE(Q)
       for each v in Adj[u]
         do if color[v] = WHITE
            then color[v] = GRAY
                 d[v] ← d[u] + 1
                 pred[v] = u
                 ENQUEUE(Q, v)
       color[u] = BLACK
(Figure: on the example graph, Q: s, then Q: w, then Q: w, r as the neighbors of s are discovered with d=1.)
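The BFS pseudocode can be sketched in Python as follows. This is one possible rendering, not the course's reference code: instead of an explicit color array it treats "v has no entry in d yet" as white, and it assumes adj is a dict of adjacency lists. The sample graph is the 5-node undirected example from the adjacency-list slide.

```python
from collections import deque

def bfs(adj, s):
    """Return (d, pred): hop counts and BFS-tree parents for nodes reachable from s."""
    d = {s: 0}
    pred = {s: None}
    queue = deque([s])           # FIFO queue of gray vertices
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in d:       # v is still white
                d[v] = d[u] + 1
                pred[v] = u
                queue.append(v)
        # all neighbors of u have been scanned: u is now black
    return d, pred

adj = {1: [2, 5], 2: [1, 5, 3, 4], 3: [2, 4], 4: [2, 5, 3], 5: [4, 1, 2]}
d, pred = bfs(adj, 1)
print(d, pred)
```

Nodes that are unreachable from s simply never appear in d, which plays the role of d[u] = ∞ in the pseudocode.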

Example
(Figure: BFS on the example graph with vertices r, s, t, u, v, w, x, y, source s; the d values found range from 0 to 3.)
Queue contents after each step:
  Q: s
  Q: w, r
  Q: r, t, x
  Q: t, x, v
  Q: x, v, u
  Q: v, u, y
  Q: u, y
  Q: y
  Q: ∅
CS 477/677 - Lecture 19

Analysis of BFS
  for each u ∈ V - {s}             O(|V|)
    do color[u] ← WHITE
       d[u] ← ∞
       pred[u] = NIL
  color[s] ← GRAY                  Θ(1)
  d[s] ← 0
  pred[s] = NIL
  Q ← ∅
  Q ← ENQUEUE(Q, s)

Analysis of BFS
  while Q not empty
    do u ← DEQUEUE(Q)              Θ(1)
       for each v in Adj[u]
         do if color[v] = WHITE
            then color[v] = GRAY
                 d[v] ← d[u] + 1
                 pred[v] = u
                 ENQUEUE(Q, v)     Θ(1)
       color[u] = BLACK
Scan Adj[u] for all vertices u in the graph:
  Each vertex u is processed only once, when the vertex is dequeued
  Sum of lengths of all adjacency lists = Θ(|E|)
  Scanning operations: O(|E|)
Total running time for BFS: O(|V| + |E|)


Exercise
For G=(V,E), where V={1,2,3,4,5,6}, with the following adjacency matrix:
  Is it directed or undirected? Why?
  Is 2,3,4,5,6 a path? Why?
  Perform a BFS traversal of G starting from node 4. Draw the resulting BFS tree.

Exercise
The first two glasses (one with a volume of 3 oz., one with a volume of 5 oz.) are empty. The third glass (8 oz.) is full of water. How can we get at least one glass to hold 4 oz. of water by pouring water from one glass to another (assuming no spillage)? What is the best way (with minimal pouring)?

Water Pouring Puzzle
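One way to attack the puzzle is to treat each triple of water levels as a node and each pour as an edge, then run BFS from the initial state. The sketch below assumes a pour continues until the source glass is empty or the target glass is full; the names CAPACITY, neighbors, and solve are illustrative.

```python
from collections import deque

CAPACITY = (3, 5, 8)  # glass volumes in oz.

def neighbors(state):
    """Yield every state reachable from `state` with one pour."""
    for i in range(3):
        for j in range(3):
            if i != j and state[i] > 0 and state[j] < CAPACITY[j]:
                amount = min(state[i], CAPACITY[j] - state[j])
                nxt = list(state)
                nxt[i] -= amount
                nxt[j] += amount
                yield tuple(nxt)

def solve(start=(0, 0, 8), target=4):
    """BFS over the state space; returns a shortest list of states, or None."""
    pred = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if target in state:
            path = []
            while state is not None:   # follow back pointers to the start
                path.append(state)
                state = pred[state]
            return path[::-1]
        for nxt in neighbors(state):
            if nxt not in pred:
                pred[nxt] = state
                queue.append(nxt)
    return None

path = solve()
print(len(path) - 1, "pours:", path)
```

Because BFS explores states in order of hop count, the first goal state it dequeues is reached with the fewest pours.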

Depth-First Search
Input: G = (V, E) (no source vertex given!)
Goal:
  Start exploring from a vertex
  Explore a vertex by following edges (if an edge is directed, in the direction of the edge) to discover vertices in V
  Then explore the most recently discovered vertex (search deeper whenever possible)
  Exploration may be repeated from multiple sources until all vertices are visited
Similarity to maze walking!
(Figure: example directed graph on vertices u, v, w, x, y, z.)

Depth-First Search
Search “deeper” in the graph whenever possible: explore edges of the most recently discovered vertex v (that still has unexplored edges).
Example on the graph with vertices u, v, w, x, y, z:
  1. Say we start from u
  2. explore edges of u, to discover v (now v is the most recently discovered node)
  3. explore edges of v to discover y (now y is…)
  4. explore edges of y to discover x (now x is…)

Depth-First Search
Explore edges of the most recently discovered vertex v (that still has unexplored edges).
After all edges of v have been explored, “backtrack” to the parent of v.
  5. explore edges of x to reach v, u (already discovered)
  6. x has no other (outgoing) edge, backtrack to y (we discovered x via y, so y is x’s parent)
  7. explore edges of y: no other edges, backtrack to v
  8. backtrack to u; u has another edge, leading to x, which is already discovered
  9. u has no parent (nowhere to backtrack). All nodes reachable from u have been explored…

Depth-First Search
Continue until all vertices reachable from the original source have been discovered.
If undiscovered vertices remain, choose one of them as a new source and repeat the search from that vertex.
  9. u has no parent (nowhere to backtrack). All nodes reachable from u have been explored…
  10. Choose w (or z) to explore next
  11. Follow edge (w,y): y already discovered/visited; follow edge (w,z) to discover z
  12. Follow edge (z,z): z already discovered. No other edge, backtrack to w (z turns black)
  13. w has no other edge (turns black), no parent
  14. All nodes have been discovered, done!

Depth-First Search Start search/exploration from any node (randomly selected) Search “deeper” in graph whenever possible follow edges of most recently discovered vertex v (that still has unexplored neighbors) After all neighbors of v have been explored, “backtracks” to parent/predecessor of v Continue until all vertices reachable from original source have been discovered If undiscovered vertices remain, choose one of them as a new source and repeat search from that vertex

Depth-First Search: data structure
Color vertex to remember white: not discovered, not explored gray: discovered, in the process of being explored black: discovered, and done exploring pred[u]: predecessor/parent node of node u the previous node on the path to u i.e., we discover u via pred[u]

DFS Additional Data Structures
A virtual clock: time (an integer)
  initialized to 0
  incremented when something of interest happens, i.e., a node is discovered or finished
d[u]: discovery time (time when u turns gray)
f[u]: finish time (time when u turns black)
A vertex is WHITE before d[u], GRAY between d[u] and f[u], and BLACK after f[u].
1 ≤ d[u] < f[u] ≤ 2|V|

Recursive DFS(V, E)
  for each u ∈ V
    do color[u] ← WHITE
       pred[u] ← NIL
  time ← 0
  for each u ∈ V
    do if color[u] = WHITE
       then DFS-VISIT(u)
Every time DFS-VISIT(u) is called from DFS, u becomes the root of a new tree in the depth-first forest.

DFS-VISIT(u): DFS exploration from u
  color[u] ← GRAY
  time ← time + 1
  d[u] ← time
  for each v ∈ Adj[u]
    do if color[v] = WHITE
       then pred[v] ← u
            DFS-VISIT(v)
  color[u] ← BLACK        // done with u
  time ← time + 1
  f[u] ← time
(Figure: on the example graph, u is discovered at time 1 (label 1/), then v at time 2 (label 2/).)
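A Python sketch of DFS and DFS-VISIT with discovery and finish times. The edge list for the u, v, w, x, y, z example is an assumption reconstructed from the walkthrough on the earlier slides (u→v, u→x, v→y, y→x, x→v, w→y, w→z, z→z); the slides show the graph only as a figure.

```python
def dfs(adj):
    """Return (d, f, pred) for a DFS over all vertices, in dict order."""
    color = {u: "WHITE" for u in adj}
    pred = {u: None for u in adj}
    d, f = {}, {}
    time = 0

    def visit(u):
        nonlocal time
        color[u] = "GRAY"
        time += 1
        d[u] = time
        for v in adj[u]:
            if color[v] == "WHITE":
                pred[v] = u
                visit(v)
        color[u] = "BLACK"   # done with u
        time += 1
        f[u] = time

    for u in adj:            # restart from every vertex that is still white
        if color[u] == "WHITE":
            visit(u)
    return d, f, pred

adj = {"u": ["v", "x"], "v": ["y"], "w": ["y", "z"],
       "x": ["v"], "y": ["x"], "z": ["z"]}
d, f, pred = dfs(adj)
print({n: (d[n], f[n]) for n in adj})
```

Python's default recursion limit is around 1000 frames, so for very deep graphs an explicit stack is the safer choice.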

DFS without recursion
Search “deeper” in the graph whenever possible: explore edges of the most recently discovered vertex v (that still has unexplored edges).
After all edges of v have been explored, “backtrack” to the parent of v.
Continue until all vertices reachable from the original source have been discovered.
If undiscovered vertices remain, choose one of them as a new source and repeat the search from that vertex.
Data structure: use a stack (Last In First Out!) to store all gray nodes.
Pseudocode (this is a sketch only! You need to fill in details):
  1. Start by pushing the source node onto the stack
  2. Explore the node at the stack top, i.e.,
     * push its next white adjacent node onto the stack
     * if it has no white adjacent node left, the node turns black; pop it from the stack
  3. Continue (go back to 2) until the stack is empty
  4. If there are white nodes remaining, go back to 1 using another white node as source node
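One way to fill in the details of the stack-based sketch is shown below. It keeps, for every gray node on the stack, an iterator over its adjacency list so that exploration resumes where it left off. This is an illustrative design choice, not the only one; the example graph's edge list is an assumption reconstructed from the DFS walkthrough.

```python
def dfs_iterative(adj):
    """Stack-based DFS; returns (d, f, pred) like the recursive version."""
    color = {u: "WHITE" for u in adj}
    pred = {u: None for u in adj}
    d, f = {}, {}
    time = 0
    for s in adj:                       # step 4: try every remaining white node
        if color[s] != "WHITE":
            continue
        time += 1
        color[s], d[s] = "GRAY", time
        stack = [(s, iter(adj[s]))]     # step 1: push the source
        while stack:                    # step 3: loop until the stack is empty
            u, neighbors = stack[-1]    # step 2: explore the node on top
            for v in neighbors:
                if color[v] == "WHITE":
                    time += 1
                    color[v], d[v], pred[v] = "GRAY", time, u
                    stack.append((v, iter(adj[v])))
                    break
            else:                       # no white neighbor left: finish u
                stack.pop()
                time += 1
                color[u], f[u] = "BLACK", time
    return d, f, pred

adj = {"u": ["v", "x"], "v": ["y"], "w": ["y", "z"],
       "x": ["v"], "y": ["x"], "z": ["z"]}
d, f, pred = dfs_iterative(adj)
print({n: (d[n], f[n]) for n in adj})
```

With the same adjacency-list order, this version visits nodes in the same order as the recursive one and produces the same timestamps.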

Example
(Figure: DFS step by step on the graph with vertices u, v, w, x, y, z; labels are discovery/finish times. u is discovered at 1, v at 2, y at 3, x at 4; edge (x,v) is marked B (back edge); x finishes at 5, y at 6, v at 7; edge (u,x) is marked F (forward edge).)

Example (cont.)
(Figure: u finishes at 8; w is discovered at 9; edge (w,y) is marked C (cross edge); z is discovered at 10, finishes at 11; w finishes at 12. Final labels: u 1/8, v 2/7, w 9/12, x 4/5, y 3/6, z 10/11.)
The results of DFS may depend on:
  The order in which nodes are explored in procedure DFS
  The order in which the neighbors of a vertex are visited in DFS-VISIT

Exercise Perform DFS

DFS: edge classification
  color[u] ← GRAY
  time ← time + 1
  d[u] ← time
  for each v ∈ Adj[u]
    do if color[v] = WHITE
       then pred[v] ← u       // (u,v) is a tree edge
            DFS-VISIT(v)
  color[u] ← BLACK            // done with u
  time ← time + 1
  f[u] ← time

Edge Classification
In DFS, when we follow an edge of u, and …
  reach a WHITE vertex v: (u, v) is a tree edge
    v was first discovered by exploring edge (u, v)
  reach a GRAY vertex v: (u, v) is a back edge, connecting a vertex u to an ancestor v in a depth-first tree
    Self loops (in directed graphs) are also back edges

Edge Classification
In DFS, when we follow an edge of u, and …
  reach a BLACK vertex v with d[u] < d[v]: (u, v) is a forward edge, a non-tree edge that connects a vertex u to a descendant v in a depth-first tree
  reach a BLACK vertex v with d[u] > d[v]: (u, v) is a cross edge
    It can go between vertices in the same depth-first tree (as long as there is no ancestor/descendant relation) or between different depth-first trees
    e.g., (w,y) in the example
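The four cases can be checked directly inside DFS-VISIT by looking at the color of v and, for black vertices, comparing discovery times. The sketch below is a hedged illustration for directed graphs; the function name classify_edges and the example edge list (reconstructed from the u, v, w, x, y, z walkthrough) are assumptions.

```python
def classify_edges(adj):
    """Return a dict mapping each directed edge (u, v) to its DFS edge type."""
    color = {u: "WHITE" for u in adj}
    d = {}
    kinds = {}
    time = 0

    def visit(u):
        nonlocal time
        color[u] = "GRAY"
        time += 1
        d[u] = time                      # only discovery times are needed here
        for v in adj[u]:
            if color[v] == "WHITE":
                kinds[(u, v)] = "tree"
                visit(v)
            elif color[v] == "GRAY":
                kinds[(u, v)] = "back"   # v is an ancestor of u (or u itself)
            elif d[u] < d[v]:
                kinds[(u, v)] = "forward"
            else:
                kinds[(u, v)] = "cross"
        color[u] = "BLACK"

    for u in adj:
        if color[u] == "WHITE":
            visit(u)
    return kinds

adj = {"u": ["v", "x"], "v": ["y"], "w": ["y", "z"],
       "x": ["v"], "y": ["x"], "z": ["z"]}
kinds = classify_edges(adj)
print(kinds)
```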

Analysis of DFS(V, E)
  for each u ∈ V                   Θ(|V|)
    do color[u] ← WHITE
       pred[u] ← NIL
  time ← 0
  for each u ∈ V                   Θ(|V|), without counting the time for DFS-VISIT
    do if color[u] = WHITE
       then DFS-VISIT(u)

Analysis of DFS-VISIT(u)
  color[u] ← GRAY
  time ← time + 1
  d[u] ← time
  for each v ∈ Adj[u]
    do if color[v] = WHITE
       then pred[v] ← u
            DFS-VISIT(v)
  color[u] ← BLACK
  time ← time + 1
  f[u] ← time
DFS-VISIT is called exactly once for each vertex.
The loop in DFS-VISIT(u) takes Θ(|Adj[u]|).
Total: Σ_{u∈V} |Adj[u]| + Θ(|V|) = Θ(|E|) + Θ(|V|) = Θ(|V| + |E|)

Properties of DFS
u = pred[v] ⟺ DFS-VISIT(v) was called during a search of u’s adjacency list
Vertex v is a descendant of vertex u in the depth-first forest ⟺ v is discovered while u is gray

Parenthesis Theorem
In any DFS of a graph G, for all u, v, exactly one of the following holds:
  [d[u], f[u]] and [d[v], f[v]] are disjoint, and neither of u and v is a descendant of the other
  [d[v], f[v]] is entirely within [d[u], f[u]] and v is a descendant of u
  [d[u], f[u]] is entirely within [d[v], f[v]] and u is a descendant of v
Example (discovery/finish times): s 1/10, z 2/9, y 3/6, x 4/5, w 7/8, t 11/16, v 12/13, u 14/15
Written along the time line 1 … 16:
  (s (z (y (x x) y) (w w) z) s) (t (v v) (u u) t)
Well-formed expression: the parentheses are properly nested.

Other Properties of DFS*
Corollary: vertex v is a proper descendant of u ⟺ d[u] < d[v] < f[v] < f[u]
Theorem (White-path Theorem): in a depth-first forest of a graph G, vertex v is a descendant of u if and only if at time d[u], there is a path u ⇝ v consisting of only white vertices.

Directed Acyclic Graph
Def: a directed graph that has no cycle (i.e., no path that starts and ends at the same vertex).
Commonly used to represent precedence of events or processes that have a partial order:
  for some pairs, there is a precedence relation, e.g., PutOnSocks has to be done before PutOnShoes
  for some other pairs of events, there is no precedence relation between them, e.g., PutOnSocks and PutOnWatch
Discussion: what is an algorithm to decide whether a directed graph G=(V,E) has a cycle or not?

Lemma
A directed graph G is acyclic ⟺ a DFS on G yields no back edges, i.e., when exploring the adjacent nodes of a node u, we never see a gray node.
Proof of “⇒” (acyclic ⇒ no back edge), by contrapositive: assume a back edge, prove a cycle.
  Assume there is a back edge (u, v)
  ⇒ v is an ancestor of u, i.e., v = pred[pred[…pred[u]…]]
  ⇒ there is a path from v to u in G (v ⇝ u)
  ⇒ v ⇝ u plus the back edge (u, v) yields a cycle
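The lemma suggests a simple cycle test for directed graphs: run DFS and report a cycle as soon as an edge leads to a gray node. A minimal Python sketch, assuming every vertex appears as a key of adj (the function name has_cycle and the test graphs are illustrative):

```python
def has_cycle(adj):
    """Return True if the directed graph contains a cycle (a back edge is found)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {u: WHITE for u in adj}

    def visit(u):
        color[u] = GRAY
        for v in adj[u]:
            if color[v] == GRAY:                 # back edge (u, v)
                return True
            if color[v] == WHITE and visit(v):
                return True
        color[u] = BLACK
        return False

    return any(color[u] == WHITE and visit(u) for u in adj)

print(has_cycle({"a": ["b"], "b": ["c"], "c": ["a"]}))       # a -> b -> c -> a
print(has_cycle({"a": ["b", "c"], "b": ["c"], "c": []}))     # a DAG
```

Note that reaching a black node is not evidence of a cycle; only gray nodes are on the current DFS path.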

Topological Sort
Topological sort of a DAG G = (V, E): a linear order of the vertices such that if there exists an edge (u, v), then u appears before v in the ordering.
In other words: an ordering of vertices so that all directed edges go from left to right.
(Figure: DAG of getting dressed, with nodes undershorts, pants, belt, shirt, tie, jacket, socks, shoes, watch, and the same nodes laid out in a topological order.)

Topological Sort: idea
If there is a path from u to v, then u has to appear before v, e.g., pants, belt, jacket.
During DFS on a DAG, if there is a path from u to v, then u finishes at a later time than v:
  if u starts first: v is discovered while u is gray, so v finishes before u
  if v starts first: there is no path from v back to u (no cycles), so v finishes before u is discovered
Ordering nodes in descending order of their finish time yields a topological ordering.

Topological Sort
TOPOLOGICAL-SORT(V, E)
  1. Call DFS(V, E)
  2. When each vertex is finished, insert it onto the front of a linked list
  3. Return the linked list of vertices as the topological order
Running time: Θ(|V| + |E|)
Example (discovery/finish times): shirt 1/8, tie 2/5, jacket 3/4, belt 6/7, watch 9/10, undershorts 11/16, pants 12/15, shoes 13/14, socks 17/18
Resulting order: socks, undershorts, pants, shoes, watch, shirt, belt, tie, jacket
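A compact Python sketch of TOPOLOGICAL-SORT. Instead of inserting at the front of a linked list, it appends each finished vertex to a Python list and reverses the list at the end, which gives the same descending-finish-time order. The input is assumed to be a DAG. The clothing edges below are the usual textbook version of this example and are an assumption, since the slides show them only in a figure.

```python
def topological_sort(adj):
    """Return the vertices of a DAG in descending order of DFS finish time."""
    visited = set()
    order = []

    def visit(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                visit(v)
        order.append(u)          # u is finished

    for u in adj:
        if u not in visited:
            visit(u)
    order.reverse()
    return order

# Assumed edges: "x": [y, ...] means x must be put on before y.
clothes = {
    "shirt": ["tie", "belt"], "tie": ["jacket"], "jacket": [],
    "belt": ["jacket"], "watch": [], "undershorts": ["pants", "shoes"],
    "pants": ["shoes", "belt"], "shoes": [], "socks": ["shoes"],
}
print(topological_sort(clothes))
```

Different DFS start orders give different, equally valid topological orders.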

Outline
  Path, Cycle, Tree, Connectivity
  Graph Traversal Algorithms
    Breadth-first search/traversal
    Depth-first search/traversal
  Algorithms on weighted graphs

Weighted Graphs Weighted graphs = graphs for which each edge has an associated weight w(u, v), a real number value w: E -> R, weight function Graph representation Adjacency list: Store w(u,v) along with vertex v in u’s adjacency list Adjacency matrix: Store w(u, v) at location (u, v) in the matrix

Spanning Tree
Given a connected, undirected, weighted graph G = (V, E), a spanning tree is a subgraph G’ = (V, E’), where E’ ⊆ E, that is acyclic and connects all vertices together.
(Figure: G = (V, E); E’ includes the edges highlighted by wider lines.)
How many different spanning trees are there for G?

Cost of Spanning Tree
Cost of a spanning tree T: the sum of edge weights in the spanning tree
  w(T) = ∑_{(u,v)∈T} w(u,v)
A minimum spanning tree (MST) is a spanning tree of minimum cost.
For a given G, there might be multiple MSTs (all with the same cost).
(Figure: two spanning trees of G, with w(T1) = 19 and w(T2) = 16.)

Cost of Spanning Tree
A minimum spanning tree (MST) is a spanning tree of minimum cost.
(Figure: G and a spanning tree T2 with w(T2) = 16.) This spanning tree has cost lower than or equal to any other spanning tree of G.
Can you come up with another MST for G?

Problem: Finding MST
Given: connected, undirected, weighted graph G
Find: minimum spanning tree T, an acyclic subset of the edges (E) that connects all vertices of G
Such problems are optimization problems: there are multiple viable solutions (here, spanning trees), and we want to find the best one (lowest cost, best performance).
Applications:
  Communication networks
  Circuit design
  Layout of highway systems

Greedy Algorithms A problem solving strategy
other strategies: brute force, divide and conquer, randomization, dynamic programming, … Greedy algorithms/strategies for solving optimization problem build up a solution step by step (local optimization) in each step always choose the option that offers best immediate benefits (a myopic approach) not worrying about global optimization… Performance of greedy algorithms: Sometimes yield optimal solution, sometimes yield suboptimal (i.e., not optimal) Sometimes: we can bound difference from optimal…

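MST: Pick edge with lowest cost
(Figure: step-by-step example on a weighted graph, Steps 1 to 5. In each step the lightest remaining edge is added, e.g., “Add e_{A,B}”. In Step 3.1, adding e_{A,D} would introduce a cycle, so it is skipped and the next edge is considered in Step 3.2.)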

Minimum Spanning Trees
// G=(V,E) is connected, undirected, weighted
// return Te: a subset of E that forms an MST
MST(V, E):
  N = |V|              // # of nodes
  Te = { }             // set of tree edges
  Order E by weights
  // a spanning tree needs N-1 edges
  while |Te| < N-1
    find next lightest edge, e (next element in E)
    if adding e into the tree does not introduce a cycle
      add e into Te
  return Te
=> This is Kruskal’s algorithm

Kruskal’s Algorithm
Implementation detail:
  Maintain sets of nodes that are connected by tree edges
  find(u): return the set that u belongs to
  find(u) = find(v) means u and v belong to the same group (i.e., u and v are already connected)
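A Python sketch of Kruskal’s algorithm with a simple union-find structure for the find(u) test. The edge format (weight, u, v), the path-halving shortcut in find, and the small sample graph are assumptions made for illustration; they do not come from the slides.

```python
def kruskal(vertices, edges):
    """edges: iterable of (weight, u, v). Returns the list of MST edges (u, v, weight)."""
    parent = {v: v for v in vertices}    # each node starts in its own set

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving keeps trees shallow
            u = parent[u]
        return u

    tree = []
    for w, u, v in sorted(edges):        # lightest edge first
        ru, rv = find(u), find(v)
        if ru != rv:                     # u, v not yet connected: no cycle
            parent[ru] = rv              # union the two sets
            tree.append((u, v, w))
            if len(tree) == len(vertices) - 1:
                break
    return tree

# Hypothetical sample graph.
vertices = ["A", "B", "C", "D"]
edges = [(1, "A", "B"), (2, "B", "C"), (3, "A", "C"), (4, "C", "D"), (5, "B", "D")]
mst = kruskal(vertices, edges)
print(mst, sum(w for _, _, w in mst))
```

Sorting dominates the cost, so the running time is O(|E| log |E|) with this kind of union-find.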

Kruskal algorithm for MST
Kruskal’s algorithm (the MST(V, E) procedure on the earlier slide) adds one edge in each step, and needs to check for a cycle each time.
Prim algorithm: grow the tree from one node; in each step, connect one more node into the tree.

Prim Algorithm for MST: Connect a node with lowest cost
Step 1: pick node D (an arbitrary node)
cost[u]: current cost of connecting u to the tree
prev[u]: via which node to connect u to the tree
// init cost, prev
for each non-tree node u:
  cost[u] = w(u,D)   if (u,D) is an edge
          = inf      if (u,D) is not an edge
(Figure labels: cost=2 prev=D; cost=3 prev=D; cost=inf; cost=6 prev=D; cost=4 prev=D)
Currently, the min cost to connect F to the tree is 6, via connecting to D.

Prim Algorithm for MST: Connect a node with lowest cost
Step 2 to N: grow the tree to connect a node with lowest cost
  u = non-tree node with lowest cost to connect to the tree
  add edge (prev[u], u) into the tree
  // update cost and prev
  for each non-tree node v:
    if w(v,u) < cost[v]
      cost[v] = w(v,u)
      prev[v] = u
(Figure: cost/prev labels before and after the update.)
Before: the min cost to connect F to the tree is 6, via connecting to D.
After: the min cost to connect F to the tree is 4, via connecting to C.

Prim Algorithm: Implementation
Step 2 to N: grow the tree to connect a node with lowest cost
  u <- the node with lowest cost
  add edge (prev[u], u) into the tree
  // update cost and prev
  for each non-tree node v:
    if w(v,u) < cost[v]
      cost[v] = w(v,u)
      prev[v] = u
Discussion: how to quickly find the node with lowest cost? Recall that the cost of nodes will be updated …
(Figure labels: cost=1 prev=C; cost=2 prev=D; cost=inf; cost=4 prev=D; cost=4 prev=C)
Add A and (A,C) into the tree, no updates …

Prim’s Algorithm
H is a priority queue (implemented as a min-heap, where the node with lowest cost is at the root).
deletemin() takes the node v with lowest cost out of the heap
  * this means node v is done (added to the tree)
  // v, and edge v - prev(v), are added to the tree
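A Python sketch of Prim’s algorithm using the standard heapq module as the min-heap H. Since heapq has no decrease-key operation, this version pushes a new entry whenever a node's cost drops and skips stale entries when they are popped; that is a substitution for the update step described in the slides, not the slides' own implementation. The sample graph is hypothetical.

```python
import heapq

def prim(adj, root):
    """adj: {u: [(v, weight), ...]} for an undirected graph. Returns MST edges."""
    in_tree = set()
    cost = {root: 0}
    prev = {root: None}
    heap = [(0, root)]
    tree = []
    while heap:
        c, u = heapq.heappop(heap)       # deletemin
        if u in in_tree:
            continue                     # stale entry: u was added earlier
        in_tree.add(u)
        if prev[u] is not None:
            tree.append((prev[u], u, c))
        for v, w in adj[u]:              # update cost and prev of non-tree nodes
            if v not in in_tree and w < cost.get(v, float("inf")):
                cost[v] = w
                prev[v] = u
                heapq.heappush(heap, (w, v))
    return tree

# Hypothetical sample graph.
adj = {
    "A": [("B", 1), ("C", 3)],
    "B": [("A", 1), ("C", 2), ("D", 5)],
    "C": [("A", 3), ("B", 2), ("D", 4)],
    "D": [("B", 5), ("C", 4)],
}
mst = prim(adj, "A")
print(mst, sum(w for _, _, w in mst))
```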

Dijkstra Algorithm for shortest path
Edsger Wybe Dijkstra, a Dutch systems scientist, programmer, software engineer, science essayist, and pioneer in computing science.
Quotes:
  The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.
  Program testing can be used to show the presence of bugs, but never to show their absence!

Distance of Path
A path in a graph is a sequence of nodes, in which each consecutive pair of nodes is connected by an edge.
Discussion: how many paths are there from A to E? How many simple paths (paths that do not contain cycles)?
Distance/cost of a path in a weighted graph: the sum of the weights of all edges on the path
  cost of path A, B, E is w(A,B) + w(B,E) = 2 + 3 = 5
  cost of path A, B, C, E is w(A,B) + w(B,C) + w(C,E) = 2 + 1 + 4 = 7

Shortest Distance Paths
Compute the shortest distance path from a node to another node?
Assumption: all weights are positive.
This implies there is no cycle in a shortest distance path. Proof by contradiction:
  Suppose a shortest path contains a cycle; let it be u->…v->t->..->t->x…->d
  Then removing the loop yields a shorter path u->…v->t->x..->d
  Contradiction, so a shortest path has no cycle
Discussion: can you find the shortest distance path from A to E? A brute-force way? Is it efficient (scalable to work on a large graph)?

Shortest Distance Paths
Compute shortest distance path from a node to another node? Assumption: all weights are positive This implies no cycle in shortest distance path Discussion: Can you find shortest distance path from a to q?

Dijkstra Alg: one to all
Input: G=(V,E) and (non-negative) weights for edges a source node s Output: what’s a shortest distance path from s to all other nodes in V? Observation (composition of shortest distance path) if A,B,E is shortest distance path from A to E, then A,B is shortest distance path from A to B. In general, if s, …, u, …v, …t is shortest distance path from s to t, then s, …u, is a shortest distance path from s to u, u…v is shortest path from u to v… (again prove by contradiction)

Input: weighted G=(V,E) and a source node s
Output: the shortest distance path from s to all other nodes in V
Let dist(u) be the shortest distance from s to u.
For all nodes u, we want to find dist(u) and the corresponding path s, …, u.
Relation between a node u’s dist(u) and the other nodes that have an edge towards u:
  e.g., the last intermediate node in the shortest distance path from A to E has to be one of {B, C, D}:
    A,…,B,E: follow the shortest distance path from A to B, then follow edge (B,E)
    A,…,C,E: follow the shortest distance path from A to C, then follow edge (C,E)
    A,…,D,E: ….
  The shortest distance path from A to E is the shortest among the three:
    dist(E) = min( dist(B)+w(B,E), dist(C)+w(C,E), dist(D)+w(D,E) )

Dijkstra Alg: one to all
Input: weighted G=(V,E) and a source node s
Output: the shortest distance path from s to all other nodes in V
Let dist(u) be the shortest distance from s to u.
For all nodes u, we want to find dist(u) and the corresponding path s, …, u.
  dist(E) = min { dist(B)+w(B,E), dist(C)+w(C,E), dist(D)+w(D,E) }
Similarly for all non-source nodes:
  dist(B) = min { dist(A)+w(A,B), dist(C)+w(C,B), dist(E)+w(E,B), dist(D)+w(D,B) }
  …
  dist(A) = 0   // for the source node s = A

Dijkstra Alg: lower dist(.)
Input: weighted G=(V,E) and a source node s
Output: the shortest distance path from s to all other nodes in V
dist(u): currently known shortest distance from s to u
prev(u): previous intermediate node on the currently known shortest path from s to u, i.e., the path is s, …, prev(u), u
Initialize dist(.) to infinity for all nodes, dist(s) = 0
Add all nodes into a min-heap
Repeat until the heap is empty:
  1. remove the node u with currently lowest dist(.) value (this node’s dist(.) cannot be lowered any more)
  2. update dist and prev of u’s adjacent nodes:
     for each v in adj(u)
       if dist(u) + w(u,v) < dist(v)     // we found a better way to go to v: s, …, u, v
         dist(v) = dist(u) + w(u,v)
         prev(v) = u
Observation: if a node has the lowest dist(.) among all nodes in the heap, we cannot lower its cost by going through other nodes.
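A Python sketch of the algorithm using heapq. As an implementation shortcut (not what the slide describes literally), it starts with only the source in the heap and pushes a new entry each time dist(v) is lowered, skipping stale entries on removal; heapq offers no decrease-key. The example graph is an assumption pieced together from the slides: w(A,B)=2, w(B,C)=1, w(B,E)=3, w(C,E)=4 come from the path-cost examples, w(A,C)=1 and w(B,D)=2 are inferred from the trace labels, and the D-E edge is left out because its weight is not stated.

```python
import heapq

def dijkstra(adj, s):
    """adj: {u: [(v, weight), ...]} with non-negative weights. Returns (dist, prev)."""
    dist = {u: float("inf") for u in adj}
    prev = {u: None for u in adj}
    dist[s] = 0
    heap = [(0, s)]
    done = set()
    while heap:
        du, u = heapq.heappop(heap)      # node with currently lowest dist
        if u in done:
            continue                     # stale heap entry
        done.add(u)                      # dist[u] is now final
        for v, w in adj[u]:
            if du + w < dist[v]:         # better way to v: s, ..., u, v
                dist[v] = du + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, prev

# Assumed example graph (undirected).
adj = {
    "A": [("B", 2), ("C", 1)],
    "B": [("A", 2), ("C", 1), ("D", 2), ("E", 3)],
    "C": [("A", 1), ("B", 1), ("E", 4)],
    "D": [("B", 2)],
    "E": [("B", 3), ("C", 4)],
}
dist, prev = dijkstra(adj, "A")
print(dist, prev)
```

With a binary heap the running time is O((|V| + |E|) log |V|).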

Lower dist(.) to their true values
(Figure: example graph; nodes circled in red have already found their true values. Labels are dist, prev.)
Remove A. Update A’s adjacent nodes B and C, using dist(A): dist(B) = min(dist(B), dist(A)+w(A,B))
  B: 2, A   C: 1, A   other nodes: infty
Remove C. Update C’s adjacent nodes B and E, using dist(C): dist(B) = min(dist(B), dist(C)+w(C,B)), dist(E) = min(dist(E), dist(C)+w(C,E))
  B: 2, A   C: 1, A   E: 5, C   D: infty
Remove B. Update B’s adjacent nodes D and E (note that A and C are done already, but it does not hurt to update them)
  B: 2, A   C: 1, A   D: 4, B   E: 5, C

Dijkstra Alg: lower dist(.)
Remove B (continued): update cost and prev of B’s adjacent nodes D and E. A and C are done already, but it does not hurt to update them: dist(A) = min(dist(A), dist(B)+w(B,A)); as the current dist(A) is already the smallest one, this update won’t change dist(A).
Remove D. Update D’s adjacent nodes B and E: no change for both.
  B: 2, A   C: 1, A   D: 4, B   E: 5, C
Done. We have found shortest distance paths for all nodes!

Dijkstra Alg: shortest distance paths
// construct the shortest distance path from s to u
// using prev[] info
ShortestDistPath(Node u)
  path = u                 // a string with the nodes on the path
  curNode = u
  // follow the back pointers (prev) to s
  while curNode != s
    curNode = prev[curNode]
    path = curNode + “,” + path
  return path
(Figure: final labels B: 2, A; C: 1, A; D: 4, B; E: 5, C. Done. We have found shortest distance paths for all nodes!)
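The back-pointer walk translates almost directly into Python. This sketch returns a list of nodes rather than a string and, as an added safeguard that is not in the pseudocode, returns None when u has no recorded predecessor chain back to s. The prev table is assumed to be the final one from the trace (B via A, C via A, D via B, E via C).

```python
def shortest_dist_path(prev, s, u):
    """Follow prev[] back pointers from u to s; return the path as a list, or None."""
    path = [u]
    while path[-1] != s:
        p = prev.get(path[-1])
        if p is None:
            return None          # u is not reachable from s
        path.append(p)
    path.reverse()
    return path

prev = {"A": None, "B": "A", "C": "A", "D": "B", "E": "C"}
print(shortest_dist_path(prev, "A", "D"))
print(shortest_dist_path(prev, "A", "E"))
```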

BFS and Dijkstra Alg
Both explore the nodes in a graph from a source node.
Both update a back pointer (pred[u]: the node that leads us to u).
BFS keeps track of d[u] (hop count of u).
  BFS explores the depth 0 node (source), then depth 1 nodes, depth 2 nodes…, expanding like a ripple
Dijkstra keeps track of dist[u] (current best cost from s to u).
  Dijkstra explores the node with lowest cost first

BFS(V, E, s) vs. Dijkstra(V, E, s): initialization
BFS(V, E, s)
  for each u in V - {s}
    do color[u] = WHITE
       d[u] ← ∞
       pred[u] = NIL
  color[s] = GRAY
  d[s] ← 0
  pred[s] = NIL
  Q = empty               // FIFO queue
  Q ← ENQUEUE(Q, s)
Dijkstra(V, E, s)
  for each u in V - {s}
    do dist[u] = infinity
       pred[u] = NIL
  dist[s] = 0
  pred[s] = NIL
  Q = empty               // priority queue (min-heap, using dist as key)
  insert all nodes in Q

BFS(V, E, s) vs. Dijkstra: main loop
BFS
  while Q not empty
    do u ← DEQUEUE(Q)
       for each v in Adj[u]
         do if color[v] = WHITE
            then color[v] = GRAY
                 d[v] ← d[u] + 1
                 pred[v] = u
                 ENQUEUE(Q, v)
       color[u] = BLACK
Dijkstra
  while Q not empty
    do u ← DEQUEUE(Q)     // take the node with smallest dist out
       for each v in Adj[u]
         do if dist[v] > dist[u] + w(u,v)
            then dist[v] = dist[u] + w(u,v)
                 pred[v] = u

Dijkstra’s algorithm
s = A. A snapshot: C will be removed from H next.
H: priority queue (min-heap in this case): C(dist=1), B(dist=2), D(dist=inf), E(dist=inf)
(Figure: graph with per-node labels, including A: prev=nil, dist=0; B: prev=A, dist=2; C: prev=A, dist=1; a “prev=C, dist=5” label; and prev=nil, dist=inf labels for nodes not yet reached.)

Dijkstra Alg Demo
(Figure: example graph with dist and pred tables, showing the best paths to each node via the nodes circled, and the associated distances.)
After removing A: pred: A: null, B: A, C: A, D: null, E: null. Q: C(2), B(4), D, E
After removing C: pred: A: null, B: C, C: A, D: C, E: C. Q: B(3), D(6), E(7)
After removing B: pred: A: null, B: C, C: A, D: B, E: B. Q: D(5), E(6)
After removing D: pred: A: null, B: C, C: A, D: B, E: B. Q: E(6)


Summary
Graphs are everywhere: they represent binary relations.
Graph Representation: adjacency lists, adjacency matrix
Path, Cycle, Tree, Connectivity
Graph Traversal Algorithms: a systematic way to explore a graph (its nodes)
  BFS yields a fat and short tree
    App: find the shortest hop path from a node to other nodes
  DFS yields a forest made up of lean and tall trees
    App: detect cycles and topological sorting (for DAGs)

Summary (2)
Minimum Spanning Tree algorithms: two examples of greedy algorithms
Shortest Distance Path algorithm: Dijkstra Alg (one to all)
  Applications: Link State Routing Protocols (used in the Internet to build routing tables)
  What if weights can be negative?
  What about the Shortest Distance Path for a given source and destination pair? A* search (maybe to be covered in AI)
