Data Structures and Algorithm Analysis Lecture 5

Data Structures and Algorithm Analysis Lecture 5
CSCI 256 Data Structures and Algorithm Analysis Lecture 5 Some slides by Kevin Wayne copyright 2005, Pearson Addison Wesley all rights reserved, and some by Iker Gondra

Graph Representation: Adjacency Matrix
Adjacency matrix: n-by-n matrix with Auv = 1 if (u, v) is an edge (1) Checking if (u, v) is an edge takes (1) time (2) Space proportional to n2 (3) Identifying all edges incident to a given node takes (n) time

Graph Representation: Adjacency Matrix
(2) not optimal if graph has many fewer edges than n2 (3) not optimal for nodes with significantly fewer than n edges incident to it Sparse graph – many fewer edges than n2

Graph Representation: Adjacency List
Adjacency list: Node-indexed array of lists Checking if (u, v) is an edge takes O(deg(u)) time Space proportional to m + n (much less than n2) # incident edges 1 2 3 2 1 3 4 5 3 1 2 5 7 8 4 2 5 5 2 3 4 6 6 5 7 3 8 8 3 7

Graph Representation: Adjacency List
Thus the adjacency list is the natural representation for exploring graphs.

Recall BFS algorithm BFS intuition: Explore outward from s in all possible directions, adding nodes one "layer" at a time BFS algorithm – (see precise algorithm Pg using array Discovered of length n where discovered[v] = true as soon as the search sees v) L0 = { s } L1 = all neighbors of L0 L2 = all nodes that do not belong to L0 or L1, and that have an edge incident to a node in L1 Li+1 = all nodes that do not belong to an earlier layer, and that have an edge incident to a node in Li Theorem: For each i, Li consists of all nodes at distance exactly i from s. There is a path from s to t iff t appears in some layer

Implementing Breadth First Search
Theorem. The above implementation of BFS runs in O(m + n) time if the graph is given by its adjacency representation. Pf. Easy to prove O(n2) running time: at most n lists L[i] each node occurs on at most one list; for loop runs  n times when we consider node u, there are  n incident edges (u, v), and we spend O(1) processing each edge Actually runs in O(m + n) time: when we consider node u, there are deg(u) incident edges (u, v) total time processing edges is uV deg(u) = 2m need O(n) additional time to set up the lists and manage the array Discovered

Implementing Breadth First Search
Algorithm pg used separate lists L[i] for each layer Li; we could implement the algorithm with a single list L which we maintain as a queue. In this way the algorithm processes the nodes in the order they are first discovered; each time a node is discovered it is added to the end to the queue; the algorithm processes the edges out of the node that is currently first in the queue.

Connected component containing node 1 = { 1, 2, 3, 4, 5, 6, 7, 8 }
Connected component: Find all nodes reachable from s Connected component containing node 1 = { 1, 2, 3, 4, 5, 6, 7, 8 }

Flood Fill Flood fill: Given lime green pixel in an image, change color of entire blob of neighboring lime pixels to blue Node: pixel Edge: two neighboring lime pixels Blob: connected component of lime pixels recolor lime green blob to blue recolor lime green blob to blue

Connected Component Connected component: Find all nodes reachable from s Theorem: Upon termination, R is the connected component of G containing s BFS = explore in order of distance from s DFS = explore in a different way, using paths R s u v it's safe to add v

Connected Component Theorem: Upon termination, R is the connected component of G containing s Proof: Clearly, for any node x ϵ R there is a path from s to x. Suppose R is not the connected component of G containing s. Then there is a node w /ϵ R and an s-w path P in G. Thus there is a first node v on P that does not belong to R; this node is not s so there is a node u immediately preceding v on P; so (u,v) is an edge. Since v is the first node on P not belonging to R, u ϵ R . Altogether we have that (u,v) is an edge, u ϵ R and v /ϵ R. This contradicts the stopping rule

Directed Graphs Directed graph: G = (V, E)
Edge (u, v) goes from node u to node v Ex: Web graph - hyperlink points from one web page to another. Directedness of graph is crucial Modern web search engines exploit hyperlink structure to rank web pages by importance

Directed graphs To represent a directed graph we modify the adjacency list representation and associate 2 lists with each node; one list consists of nodes to which it has edges and one list has nodes from which it has edges. BFS algorithm for directed graphs is almost the same; first layer is the nodes to which s has an edge, second layer is additional nodes to which nodes in first layer have an edge, etc. Suppose we want the set of nodes with paths to s. Let G be a directed graph and let Grev be the directed graph obtained from G by reversing the directions of all the edges. Run BFS on Grev – a node has a path from s in Grev iff it has a path to s in G

Strong Connectivity Def: Nodes u and v are mutually reachable if there is a path from u to v and also a path from v to u Def: A graph is strongly connected if every pair of nodes is mutually reachable Lemma: Let s be any node. G is strongly connected iff every node is reachable from s, and s is reachable from every node Pf:  Follows from definition Pf:  Path from u to v: concatenate u-s path with s-v path. Path from v to u: concatenate v-s path with s-u path ok if paths overlap s u v

Strong Connectivity Algorithm to determine is G is strongly connected
Pick any node s Run BFS from s in G Run BFS from s in Grev Return true iff all nodes reached in both BFS executions Correctness follows immediately from lemma on previous slide reverse orientation of every edge in G strongly connected not strongly connected

Directed Acyclic Graphs
Def: A DAG is a directed graph that contains no directed cycles Ex: Precedence constraints: edge (vi, vj) means vi must precede vj Def: A topological order of a directed graph G = (V, E) is an ordering of its nodes as v1, v2, …, vn so that for every edge (vi, vj) we have i < j v2 v3 v6 v5 v4 v1 v2 v3 v4 v5 v6 v7 v7 v1 a DAG a topological ordering

Precedence Constraints
Precedence constraints: Edge (vi, vj) means task vi must occur before vj Applications Course prerequisite graph: course vi must be taken before vj Compilation: module vi must be compiled before vj Pipeline of computing jobs: output of job vi needed to determine input of job vj

Proposition: If G has a topological order, then G is a DAG Proof – by contradiction: How to start this one?? (Proof in class)

Proof: Suppose that G has a topological ordering (v1, v2, …vn), and G is not a DAG; Since G has a topological order it must be a directed graph so if G is not a DAG, then G has a cycle; call the cycle C. Let vi be the node in C with the lowest ordering; Consider the node on C just before vi call it vj (this exists because C is a cycle); clearly (vj ,vi ) is an edge in G; The topological ordering (definition slide 17) tells us that j < i . Since vi has the lowest ordering of nodes in C, j > i; This is a contradiction.

From DAGs to Topological orderings
Proposition: If G is a DAG, then G has a topological ordering

Lemma. If G is a DAG, then G has a node with no incoming edges. Pf. (by contradiction) Suppose that G is a DAG and every node has at least one incoming edge. Let's see what happens. Pick any node v, and begin following edges backward from v. Since v has at least one incoming edge (u, v) we can walk backward to u. Then, since u has at least one incoming edge (x, u), we can walk backward to x. Repeat until we visit a node, say w, twice. Let C denote the sequence of nodes encountered between successive visits to w. C is a cycle a contradiction since by definition a DAG has no (directed) cycles▪ w x u v

Proposition. If G is a DAG, then G has a topological ordering. Pf. by induction on n [Show statement true for n=1 (base case), assume statement is true for n-1 and (use this assumption – called the induction hypothesis to) prove it is true for n] Base case: true if n = 1 (obvious). Given DAG G with n > 1 nodes, previous lemma tells us we can find a node v with no incoming edges. G - { v } is a DAG, since deleting v cannot create cycles. By induction hypothesis, G - { v } has a topological ordering. Place v first in an ordering; then place nodes of G - { v } in their topological ordering. This is a topological order (v has no incoming edges and we can use its outgoing edges to append it to G- { v }).

Find a topological ordering of a DAG

Data Structures and Algorithm Analysis Lecture 5

Similar presentations

Presentation on theme: "Data Structures and Algorithm Analysis Lecture 5"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Data Structures and Algorithm Analysis Lecture 5

Similar presentations

Presentation on theme: "Data Structures and Algorithm Analysis Lecture 5"— Presentation transcript:

Similar presentations

About project

Feedback