Graphs
Based on slides of Dan Suciu
Outline
  Graphs: Definitions, Properties, Examples
  Topological Sort
  Graph Traversals: Depth First Search, Breadth First Search
Read: textbook, Chapter 22
Graphs
Graphs = formalism for representing relationships between objects
A graph G = (V, E)
  V is a set of vertices, |V| = n
  E ⊆ V × V is a set of edges, |E| = m
Example (vertices Han, Leia, Luke):
  V = {Han, Leia, Luke}
  E = {(Luke, Leia), (Han, Leia), (Leia, Han)}
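The Han, Leia, Luke example can be written down directly as a pair of sets. This is a minimal sketch; the Python names V, E, n, and m simply mirror the slide's notation and are not part of any library.

```python
# Vertices and directed edges of the example graph from the slide.
V = {"Han", "Leia", "Luke"}
E = {("Luke", "Leia"), ("Han", "Leia"), ("Leia", "Han")}

# Every edge is an ordered pair of vertices, i.e. E is a subset of V x V.
assert all(u in V and v in V for (u, v) in E)

n, m = len(V), len(E)  # n = |V|, m = |E|
```

Storing edges as ordered pairs keeps (Han, Leia) and (Leia, Han) distinct, which is what a directed graph needs.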
Directed vs. Undirected Graphs
In directed graphs, edges have a specific direction.
In undirected graphs, they don't (edges are two-way).
Vertices u and v are adjacent if (u, v) ∈ E
[Figure: the Han, Luke, Leia graph drawn once with directed edges and once with undirected edges]
Weighted Graphs
Each edge has an associated weight or cost.
[Figure: weighted graph over Clinton, Mukilteo, Kingston, Edmonds, Bainbridge, Seattle, Bremerton with edge weights 20, 30, 35, 60]
How could we store weights in a matrix? List?
There may be more information in the graph as well.
Paths and Cycles
A path is a list of vertices {v1, v2, …, vn} such that (vi, vi+1) ∈ E for all 1 ≤ i < n.
A cycle is a path that begins and ends at the same node.
[Figure: graph over Seattle, Salt Lake City, Chicago, Dallas, San Francisco]
p = {Seattle, Salt Lake City, Chicago, Dallas, San Francisco, Seattle}
Path Length and Cost
Path length: the number of edges in the path
Path cost: the sum of the costs of each edge
[Figure: the graph over Seattle, Salt Lake City, Chicago, Dallas, San Francisco with edge costs ranging from 2 to 3.5]
length(p) = 5
cost(p) = 11.5
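As a rough illustration of the two definitions, the sketch below computes length and cost for the path p. The individual edge costs are assumptions: the figure's edge labels cannot be matched to edges in this text, so the values here are hypothetical and were chosen only so that they total the slide's 11.5.

```python
# Hypothetical edge costs (assumed, not read off the slide's figure).
cost = {
    ("Seattle", "Salt Lake City"): 2.0,
    ("Salt Lake City", "Chicago"): 2.5,
    ("Chicago", "Dallas"): 2.0,
    ("Dallas", "San Francisco"): 2.5,
    ("San Francisco", "Seattle"): 2.5,
}

def path_length(p):
    """Number of edges in the path: one fewer than the number of vertices."""
    return len(p) - 1

def path_cost(p, cost):
    """Sum of the costs of the consecutive edges (p[i], p[i+1])."""
    return sum(cost[(u, v)] for u, v in zip(p, p[1:]))

p = ["Seattle", "Salt Lake City", "Chicago", "Dallas", "San Francisco", "Seattle"]
length_p = path_length(p)
cost_p = path_cost(p, cost)
```

Because p starts and ends at Seattle, it is also a cycle in the sense of the previous slide.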
Connectivity
Undirected graphs are connected if there is a path between any two vertices.
Directed graphs are strongly connected if there is a path from any one vertex to any other.
Notes: Can a directed graph which is strongly connected be acyclic? No (except the trivial one-node case). There must be a path from A to B and a path from B to A, so there must be a cycle. What about a weakly connected directed graph? Yes! See the one on the slide. There are further definitions as well; for example, biconnectivity showed up in my satisfiability research: does the graph have two distinct paths between any two vertices?
Connectivity
Directed graphs are weakly connected if there is a path between any two vertices, ignoring direction.
A complete graph has an edge between every pair of vertices.
Connectivity
A (strongly) connected component is a maximal subgraph which is (strongly) connected.
[Figures: connected components (CC) in an undirected graph; strongly connected components (SCC) in a directed graph]
Distance and Diameter
The distance between two nodes, d(u,v), is the length of the shortest path from u to v, or ∞ if there is no path.
The diameter of a graph is the largest distance between any two nodes.
A directed graph is strongly connected iff its diameter < ∞.
An Interesting Graph
The Web!
  Each page is a vertex
  Each hyperlink is an edge
How many nodes? Is it connected? What is its diameter?
The Web Graph
Other Graphs
Geography:
  Cities and roads
  Airports and flights (diameter 20!!)
Publications:
  The co-authorship graph
  The reference graph
Phone calls: who calls whom
Almost everything can be modeled as a graph!
Implementing a Graph
Implement a graph in three ways:
  Adjacency list
  Adjacency matrix
  Pointers/memory for each node (actually a form of adjacency list)
Adjacency List List of pointers for each vertex
Undirected Adjacency List
Adjacency List
The sum of the lengths of the adjacency lists is 2|E| in an undirected graph, and |E| in a directed graph.
The amount of memory needed to store the adjacency lists is O(max(|V|, |E|)) = O(|V| + |E|).
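The 2|E| versus |E| claim can be checked on a small example. This is a minimal sketch; build_adjacency_list is a hypothetical helper name and the six-edge list is illustrative.

```python
from collections import defaultdict

def build_adjacency_list(edges, directed):
    """Map each vertex to the list of its neighbors."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        if not directed:
            adj[v].append(u)  # an undirected edge is stored in both lists
    return adj

edges = [(1, 2), (1, 3), (3, 4), (4, 1), (5, 2), (5, 4)]  # |E| = 6

directed_adj = build_adjacency_list(edges, directed=True)
undirected_adj = build_adjacency_list(edges, directed=False)

total_directed = sum(len(nbrs) for nbrs in directed_adj.values())
total_undirected = sum(len(nbrs) for nbrs in undirected_adj.values())
```

Each undirected edge appears once in each endpoint's list, which is where the factor of two comes from.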
Adjacency Matrix
     1  2  3  4  5
  1  0  1  1  0  0
  2  0  0  0  0  0
  3  0  0  0  1  0
  4  1  0  0  0  0
  5  0  1  0  1  0
Undirected Graph - Adjacency Matrix
     1  2  3  4  5
  1  0  1  1  1  0
  2  1  0  0  0  1
  3  1  0  0  1  0
  4  1  0  1  0  1
  5  0  1  0  1  0
Adjacency Matrix vs. List?
The matrix always uses Θ(|V|²) memory.
Usually easier to implement and perform lookup than an adjacency list.
The adjacency matrix is a good way to represent a weighted graph.
  In a weighted graph, the edges have weights associated with them.
  Update the matrix entry to contain the weight.
  Weights could indicate distance, cost, etc.
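One way to store weights in a matrix is to put the weight in entry [u][v] and a sentinel where there is no edge. The sketch below assumes vertices are numbered 0 to n-1, uses infinity as the "no edge" marker and 0 on the diagonal; those conventions, the helper name weighted_matrix, and the sample weights are all illustrative assumptions.

```python
import math

def weighted_matrix(n, weighted_edges, directed=False):
    """Build an n x n matrix w where w[u][v] is the weight of edge (u, v)."""
    w = [[math.inf] * n for _ in range(n)]  # inf marks "no edge"
    for i in range(n):
        w[i][i] = 0  # assumed convention: zero cost from a vertex to itself
    for u, v, weight in weighted_edges:
        w[u][v] = weight
        if not directed:
            w[v][u] = weight  # undirected graphs give a symmetric matrix
    return w

# Hypothetical weights on a 4-vertex undirected graph; vertex 3 is isolated.
W = weighted_matrix(4, [(0, 1, 20), (1, 2, 35), (0, 2, 60)])
```

Looking up a weight is a single index operation, at the price of Θ(|V|²) memory regardless of how many edges exist.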
Graph Density
A sparse graph has O(|V|) edges
A dense graph has Θ(|V|²) edges
Anything in between is either sparsish or densy depending on the context.
Notes: Graph density is a measure of the number of edges. Remember that we need to analyze graphs in terms of both |E| and |V|. If we know the density of the graph, we can express |E| in terms of |V| and analyze in terms of |V| alone.
Trees as Graphs
Every tree is a graph (can be directed or undirected) with some restrictions:
  there are no cycles (directed or undirected)
  there is a directed path from the root to every node
[Figure: graph over nodes A through H; each of the red areas breaks one of these constraints]
Notes: We won't always require that the tree be directed. If it's not directed, we can just pick a root and hang everything from that node (making the edges directed), so it's not a big deal.
Trees as Graphs
[Figure: the same graph with nodes I and J added, marked BAD!]
Directed Acyclic Graphs (DAGs)
DAGs are directed graphs with no cycles.
[Figure: call graph over main(), mult(), add(), read(), access()]
If a program's call graph is a DAG, then all procedure calls can be in-lined.
Trees ⊂ DAGs ⊂ Graphs
Notes: DAGs are a very common representation for dependence graphs. The DAG here shows the non-recursive call graph from a program. Remember topological sort? Any DAG can be topo-sorted. Any directed graph that's not a DAG cannot be topo-sorted.
Application of DAGs: Representing Partial Orders
[Figure: task graph over reserve flight, call taxi, pack bags, taxi to airport, check in airport, locate gate, take flight]
Notes: Not all orderings are total orderings, however. Now, everyone tell me where there should be an edge!
Topological Sort
Given a graph, G = (V, E), output all the vertices in V such that no vertex is output before any other vertex with an edge to it.
[Figure: task graph over reserve flight, call taxi, pack bags, taxi to airport, check in airport, locate gate, take flight]
Topological Sort
Output order: call taxi, pack bags, taxi to airport, reserve flight, check in airport, locate gate, take flight
Topo-Sort Take One
Label each vertex's in-degree (# of inbound edges)
While there are vertices remaining
  Pick a vertex with in-degree of zero and output it
  Reduce the in-degree of all vertices adjacent to it
  Remove it from the list of vertices
runtime: O(|V|²)
Notes: Let's try this. How well does this run? That's a bit less efficient than we can do.
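The steps of Take One can be sketched as follows. This is one possible reading of the pseudocode, not the slides' own implementation: the function name, the dictionary-based graph, and the cycle check are assumptions added for illustration. The linear scan for an in-degree-zero vertex on every iteration is what makes the total O(|V|²).

```python
def topo_sort_quadratic(vertices, adj):
    """Topological sort that rescans the vertex list on every iteration."""
    indegree = {v: 0 for v in vertices}
    for u in vertices:
        for v in adj.get(u, []):
            indegree[v] += 1

    remaining = set(vertices)
    order = []
    while remaining:
        # O(|V|) scan for any remaining vertex with in-degree zero.
        zero = next((v for v in vertices
                     if v in remaining and indegree[v] == 0), None)
        if zero is None:
            raise ValueError("graph has a cycle; no topological order exists")
        order.append(zero)
        remaining.remove(zero)
        for v in adj.get(zero, []):
            indegree[v] -= 1  # the edge zero -> v no longer counts
    return order

# Small illustrative DAG: a -> b, a -> c, b -> d, c -> d.
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
order = topo_sort_quadratic(["a", "b", "c", "d"], dag)
```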
Topo-Sort Take Two
Label each vertex's in-degree
Initialize a queue to contain all in-degree zero vertices
While there are vertices remaining in the queue
  Remove a vertex v with in-degree of zero and output it
  Reduce the in-degree of all vertices adjacent to v
  Put any of these with new in-degree zero on the queue
runtime: O(|V| + |E|)
Why use a queue? Can we use a stack instead of a queue?
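The queue-based version can be sketched as below. The travel-task edges are an assumption (the slide's figure does not survive in text form), and topo_sort and the cycle check are illustrative additions. Each vertex is enqueued once and each edge is examined once, which gives O(|V| + |E|).

```python
from collections import deque

def topo_sort(vertices, adj):
    """Queue-based topological sort over an adjacency-list graph."""
    indegree = {v: 0 for v in vertices}
    for u in vertices:
        for v in adj.get(u, []):
            indegree[v] += 1

    queue = deque(v for v in vertices if indegree[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj.get(u, []):
            indegree[v] -= 1
            if indegree[v] == 0:  # all of v's prerequisites are now output
                queue.append(v)

    if len(order) != len(vertices):
        raise ValueError("graph has a cycle; no topological order exists")
    return order

# Assumed dependency edges for the travel example.
tasks = ["reserve flight", "call taxi", "pack bags", "taxi to airport",
         "check in airport", "locate gate", "take flight"]
deps = {
    "reserve flight": ["check in airport"],
    "call taxi": ["taxi to airport"],
    "pack bags": ["taxi to airport"],
    "taxi to airport": ["check in airport"],
    "check in airport": ["locate gate"],
    "locate gate": ["take flight"],
}
order = topo_sort(tasks, deps)
```

Replacing `popleft()` with `pop()` turns the queue into a stack; the output order may change, but it remains a valid topological order because a vertex is only ever added once its in-degree reaches zero.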
Topo-Sort
Topo-sort takes linear time: O(|V| + |E|)
Recall: Tree Traversals
[Figure: tree over nodes a through l]
Visit order: a b f g k c d h i l j e
Depth-First Search
Pre-, post-, and in-order traversals are examples of depth-first search
  Nodes are visited deeply on the left-most branches before any nodes are visited on the right-most branches
  Visiting the right branches deeply before the left would still be depth-first! The crucial idea is "go deep first!"
In DFS the nodes "being worked on" are kept on a stack
DFS (not quite right)
void DFS(Node startNode) {
  Stack s = new Stack;
  s.push(startNode);
  while (!s.empty()) {
    x = s.pop();
    /* process x */
    for y in x.children() do
      s.push(y);
  }
}
DFS on Trees
[Figure: tree over nodes a through l]
Visited so far: a b f g k
DFS on Graphs
[Figure: graph over nodes a through l]
Visited so far: a b f g
Problem with cycles: may run into an infinite loop
DFS for connected graphs
for v in Nodes do
  v.visited = false;
DFS(startNode);

void DFS(Node v) {
  Stack s = new Stack;
  s.push(v);
  while (!s.empty()) {
    x = s.pop();
    if (!x.visited) {
      x.visited = true;
      process(x);
      /* scan children only on the first visit, so each edge is examined once */
      for y in x.children() do
        if (!y.visited) s.push(y);
    }
  }
}
Running time: O(|V| + |E|)
Key observation: each node enters the stack at most its indegree times (plus once for the start node)
DFS for general graphs
for v in V do
  v.visited = false;
for v in V do
  if (!v.visited) DFS(v);
Running time: O(|V| + |E|)
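A runnable sketch of the same idea, using an explicit stack and restarting from every unvisited vertex. The tree edges are an assumption reconstructed from the visit order on the tree-traversal slide; the back edge k → a and the isolated vertex z are hypothetical additions to exercise cycles and disconnected graphs. The names dfs_from and dfs_all are illustrative.

```python
def dfs_from(start, adj, visited, order):
    """Iterative DFS from one start vertex, recording the visit order."""
    stack = [start]
    while stack:
        x = stack.pop()
        if x in visited:
            continue  # x may have been pushed more than once
        visited.add(x)
        order.append(x)
        # Push in reverse so the first-listed child is popped first.
        for y in reversed(adj.get(x, [])):
            if y not in visited:
                stack.append(y)

def dfs_all(vertices, adj):
    """Run DFS from every unvisited vertex so that all vertices are reached."""
    visited, order = set(), []
    for v in vertices:
        if v not in visited:
            dfs_from(v, adj, visited, order)
    return order

adj = {
    "a": ["b", "c", "d", "e"],
    "b": ["f", "g"],
    "d": ["h", "i", "j"],
    "g": ["k"],
    "i": ["l"],
    "k": ["a"],  # hypothetical back edge creating a cycle
}
order = dfs_all(list("abcdefghijkl") + ["z"], adj)
```

The visited set is what prevents the infinite loop on the cycle a → b → g → k → a.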
Breadth-First Search
Traversing the tree level by level from top to bottom (alphabetic order)
[Figure: tree over nodes a through l]
Breadth-First Search
BFS characteristics
  Nodes being worked on are maintained in a FIFO queue, not a stack
  Iterative style procedures are often easier to design than recursive procedures
Breadth-First Search for connected graphs
Same structure as the stack-based DFS, with the stack replaced by a FIFO queue:
for v in Nodes do
  v.visited = false;
BFS(startNode);

void BFS(Node v) {
  Queue s = new Queue;
  s.enqueue(v);
  while (!s.empty()) {
    x = s.dequeue();
    if (!x.visited) {
      x.visited = true;
      process(x);
      for y in x.children() do
        if (!y.visited) s.enqueue(y);
    }
  }
}
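A runnable sketch of BFS with a FIFO queue. Unlike the pseudocode, this variant marks a vertex as visited when it is enqueued rather than when it is dequeued, so each vertex enters the queue at most once; that choice, the name bfs_order, and the tree edges (reconstructed from the slides' traversal traces) are assumptions.

```python
from collections import deque

def bfs_order(start, adj):
    """Return the vertices reachable from start in breadth-first order."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        x = queue.popleft()
        order.append(x)
        for y in adj.get(x, []):
            if y not in visited:
                visited.add(y)  # mark on enqueue to avoid duplicates
                queue.append(y)
    return order

tree = {
    "a": ["b", "c", "d", "e"],
    "b": ["f", "g"],
    "d": ["h", "i", "j"],
    "g": ["k"],
    "i": ["l"],
}
order = bfs_order("a", tree)
```

On this tree the result is the level-by-level, alphabetic order described earlier.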
QUEUE (contents after each step, front on the left)
  a
  b c d e
  c d e f g
  d e f g
  e f g h i j
  f g h i j
  g h i j
  h i j k
  i j k
  j k l
  k l
  l
[Figure: tree over nodes a through l]
Graph Traversals
DFS and BFS
Either can be used to determine connectivity:
  Is there a path between two given vertices?
  Is the graph (weakly) connected?
Important difference:
  Breadth-first search always finds a shortest path from the start vertex to any other (for unweighted graphs)
  Depth-first search may not!
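The difference can be seen on a four-vertex example. This is a sketch under assumptions: the graph, the function names bfs_path and dfs_path, and the parent-pointer bookkeeping are illustrative and do not come from the slides.

```python
from collections import deque

def bfs_path(adj, start, goal):
    """Shortest path by edge count in an unweighted graph, or None."""
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        x = queue.popleft()
        if x == goal:
            path = []
            while x is not None:  # walk parent pointers back to start
                path.append(x)
                x = parent[x]
            return path[::-1]
        for y in adj.get(x, []):
            if y not in parent:
                parent[y] = x
                queue.append(y)
    return None

def dfs_path(adj, start, goal, visited=None):
    """Some path found depth-first (not necessarily shortest), or None."""
    visited = visited if visited is not None else set()
    visited.add(start)
    if start == goal:
        return [start]
    for y in adj.get(start, []):
        if y not in visited:
            rest = dfs_path(adj, y, goal, visited)
            if rest is not None:
                return [start] + rest
    return None

# s has a direct edge to t, but DFS explores the branch through a first.
adj = {"s": ["a", "t"], "a": ["b"], "b": ["t"], "t": []}
short = bfs_path(adj, "s", "t")
deep = dfs_path(adj, "s", "t")
```

Both functions answer the connectivity question (a non-None result means a path exists), but only the breadth-first version is guaranteed to return a path with the fewest edges.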