CS 336 March 19, 2012 Tandy Warnow.



Basic Graph Terminology Nodes, vertices, edges, degrees, paths, cycles, connected components, adjacency, isolated vertices, trees, forests. Directed graphs: indegree, outdegree, trees.

Advanced terminology Cliques, independent sets, chromatic number and vertex colorings, Eulerian cycles and Eulerian paths, Hamiltonian paths, matchings, dominating sets, vertex covers.

Paths, Connected Components, etc. A path is a sequence of vertices v1, v2, …, vn so that vi is adjacent to vi+1 for i=1,2,…,n-1. A simple path is one that does not have repeated vertices. A graph is connected if every pair of vertices in the graph is connected by some path. A connected component is a maximal subset of the vertices that is connected.
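The definitions of connectivity and connected components can be made concrete with a breadth-first search. The following is a minimal Python sketch, not part of the lecture; the function names `connected_components` and `is_connected` and the vertex-list and edge-list input format are assumptions made for illustration.

```python
from collections import deque

def connected_components(vertices, edges):
    """Return the connected components of an undirected graph as a list of sets."""
    # Build an adjacency map; each undirected edge is stored in both directions.
    adj = {v: set() for v in vertices}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)

    seen = set()
    components = []
    for start in vertices:
        if start in seen:
            continue
        # Breadth-first search collects every vertex reachable from `start`.
        comp = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    queue.append(w)
        components.append(comp)
    return components

def is_connected(vertices, edges):
    """A (non-empty) graph is connected when it has exactly one component."""
    return len(connected_components(vertices, edges)) == 1
```

Each component returned is maximal because the search only stops when no unvisited neighbor remains.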

Cycles A cycle in a graph is a path that starts and ends at the same vertex. A simple cycle is a cycle that does not have any repeated vertices (other than the start and end vertex). A graph is acyclic if it has no simple cycles.

Trees Two types: rooted and unrooted. Unrooted (simplest): acyclic connected graph. Rooted: take an unrooted tree, pick one node to be the root, and direct all edges away from the root. Voila!

Theorems about trees Let T be a connected acyclic graph (i.e., a tree) with n vertices (n>0). Then: T has at least one leaf (node with degree 0 or 1). T has n-1 edges. Every edge in T is a cut-edge. Every tree can be 2-colored.

Theorem: Every tree has at least one leaf (node of degree 0 or 1) Theorem: For any tree T with at least one vertex, T has at least one leaf (node with degree 0 or 1). Proof: Let n be the number of vertices of T. If n=1, then T is a single vertex, which is a leaf. Else, n>1. Let P be a longest simple path in T, so P=v1,v2,…,vk. Since T is connected and n>1, we have k≥2, so vk has the neighbor vk-1. If vk has degree 1, we are done. Otherwise, vk has at least two neighbors, and so some neighbor w other than vk-1. If w is in P, then the portion of P from w to vk together with the edge (vk,w) is a simple cycle in T, contradicting that T is a tree. If w is not in P, then we can extend P by w and get a longer simple path, contradicting that P is a longest simple path in T. Hence, vk has degree 1, and we are done.

Theorem: Any tree with n>0 nodes has n-1 edges Proof: by induction on n. Base case: n=1 (trivial) Inductive hypothesis: for some positive n, any tree on n nodes has exactly n-1 edges. Let T be a tree on n+1 nodes. We want to show T has exactly n edges.

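Proof (cont’d) Since T is a tree with at least two nodes, it has a leaf, and that leaf has degree 1 (a node of degree 0 would not be connected to the rest of T). Let v be such a node in T with degree 1. Remove v and its incident edge from T. The result is a tree T’ with n nodes, and hence n-1 edges (by the inductive hypothesis). T’ contains one fewer edge and one fewer vertex (node) than T, and so T has n edges.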

Theorem: every edge in a tree is a cut-edge Proof (by contradiction). Suppose T is a tree, e=(v,w) is an edge in T that is not a cut-edge. Then G=T-{e} (but keeping v and w) is connected. Hence there is a simple path P from v to w in G. Since e is not in G, P does not include edge e. Therefore, we can form a simple cycle C by adding edge e to P. Since every edge in C is in T, this means that T is not acyclic, contradicting the assumption that T is a tree (connected acyclic graph).

Vertex Coloring A (proper) vertex coloring of a graph is a function c: V -> {1,2,…,k}, s.t. no two adjacent vertices are mapped to the same color. The chromatic number of a graph is the minimum number of colors needed to properly color the graph. How many colors does a tree need?

2-coloring a tree Theorem: every connected acyclic graph (i.e., tree) can be 2-colored. Proof: by induction on the number of vertices.

Proof that every tree can be 2-colored Let G be a tree on n vertices. The base case is n=1. Clearly every tree on 1 vertex can be 2-colored. The Inductive Hypothesis is that for some positive integer n, any tree on n vertices can be 2-colored. Let G be a tree with n+1 vertices. We want to show that G can be 2-colored.

Proof (cont’d) Let v be a node in G that has degree 1, and let w be its unique neighbor in G. Consider the graph G’ formed by deleting v (and its incident edge but not w) from G. G’ is also connected and acyclic (why?) and has n vertices. Therefore, by the inductive hypothesis, G’ can be 2-colored. We extend the coloring from G’ to G, by letting c(v) be 1 if c(w)=2, and c(v)=2 if c(w)=1. Note that this coloring is proper for G, since the only neighbor of v is w. Hence G can be 2-colored.
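The inductive proof attaches one leaf at a time and gives it the color opposite to its neighbor. A breadth-first traversal from any vertex does the same thing in a different order, so it yields a 2-coloring directly. The sketch below is illustrative only; the names `two_color_tree` and `is_proper` and the input format are assumptions, and the input is assumed to be a tree (connected and acyclic).

```python
from collections import deque

def two_color_tree(vertices, edges):
    """Return a dict mapping each vertex of a tree to color 1 or 2."""
    adj = {v: [] for v in vertices}
    for u, w in edges:
        adj[u].append(w)
        adj[w].append(u)

    root = vertices[0]          # any vertex works as the starting point
    color = {root: 1}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in color:
                # Give each newly reached vertex the color its neighbor does not have.
                color[w] = 2 if color[u] == 1 else 1
                queue.append(w)
    return color

def is_proper(coloring, edges):
    """Check that no edge joins two vertices of the same color."""
    return all(coloring[u] != coloring[w] for u, w in edges)

tree_edges = [(1, 2), (1, 3), (3, 4), (3, 5)]
coloring = two_color_tree([1, 2, 3, 4, 5], tree_edges)
```

On a graph with an odd cycle the same traversal would still assign colors, but `is_proper` would then return False, which is why the tree assumption matters.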

Structural Induction This was a proof by structural induction. Proofs by structural induction can be applied more generally!

Theorem about rooted trees A rooted tree in which every node has 0 or 2 children is called a “binary tree” Theorem: every binary tree with n nodes has (n-1)/2 internal nodes (defined to be nodes with more than 0 children). Proof: by strong induction on n. Base case: n=1. Such a tree has no internal nodes, so it is true.

Proof, cont’d. Strong Inductive hypothesis: for some n>0, and for all positive integers k up to n, all rooted binary trees with k nodes have (k-1)/2 internal nodes. Let T have n+1 nodes, and let the children of the root be A and B. (We know the root has two children, since if it had no children, T would have 1 node, contradicting our hypothesis.) We want to show Int(T) = n/2

We want to show Int(T) = n/2 TA, the subtree of T rooted at A, is a binary tree; let nA be the number of nodes in TA TB, the subtree of T rooted at B, is a binary tree; let nB be the number of nodes in TB Let Int(T) be the number of internal nodes of T, and Int(TA) and Int(TB) be similarly defined.

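We want to show Int(T) = n/2 Then nA and nB are both at most n, and by the inductive hypothesis Int(TA) = (nA-1)/2 and Int(TB) = (nB-1)/2. The internal nodes of T are the root together with the internal nodes of TA and TB, so Int(T) = Int(TA) + Int(TB) + 1. Therefore Int(T) = (nA-1)/2 + (nB-1)/2 + 1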

We want to show Int(T) = n/2 We have established that Int(T) = (nA-1)/2 + (nB-1)/2 + 1 Simplifying this, we get Int(T) = (nA-1 + nB -1 + 2)/2 = (nA + nB)/2 Note nT = nA + nB + 1 Therefore, Int(T) = (nT - 1)/2 Recall that nT = n+1. Therefore, Int(T) = n/2 Q.E.D.
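The theorem can be checked on small examples with a short Python sketch. The encoding is an assumption made only for illustration: a leaf is represented by `None`, and an internal node by a pair `(left, right)` of subtrees, so every node has 0 or 2 children by construction.

```python
def count_nodes(tree):
    """Total number of nodes in a binary tree encoded as None (leaf) or (left, right)."""
    if tree is None:
        return 1
    left, right = tree
    return 1 + count_nodes(left) + count_nodes(right)

def count_internal(tree):
    """Number of internal nodes: the root plus the internal nodes of both subtrees."""
    if tree is None:
        return 0
    left, right = tree
    return 1 + count_internal(left) + count_internal(right)

# A tree with 9 nodes: the root has two children, each of which is internal.
example = ((None, None), (None, (None, None)))
n = count_nodes(example)
internal = count_internal(example)
```

The recursion in `count_internal` mirrors the step Int(T) = Int(TA) + Int(TB) + 1 used in the proof.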

Genome Assembly Given a DNA sequence, technology can allow you to get a collection of k-mers (substrings of length k) that come from analyses of the sequence. From these k-mers, your objective is to come up with the sequence.

Genome Assembly Let X be a very long DNA sequence. Consider all k-mers in X, with k big enough so that no k-mer appears two or more times. Goal: reconstruct X from its set of k-mers.

Genome Assembly, attempt #1 Approach 1: Make a node for each k-mer, and put a directed edge from v to w if the k-1 suffix of v is the k-1 prefix of w. Create the graph for the following string, using k=5 ACATAGGATTCAC

Genome Assembly, attempt #1 Approach 1: Make a node for each k-mer, and put a directed edge from v to w if the k-1 suffix of v is the k-1 prefix of w. Every such graph has a Hamiltonian Path, as long as no k-mer appears more than once!
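Approach 1 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the lecture's own code: the function names (`kmers`, `overlap_graph`, `hamiltonian_path`, `spell`) are hypothetical, and the path search is plain backtracking, which can take exponential time in the worst case.

```python
def kmers(seq, k):
    """All length-k substrings of seq, in order of appearance."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def overlap_graph(kmer_set):
    """One node per k-mer; edge v -> w when the (k-1)-suffix of v is the (k-1)-prefix of w."""
    nodes = sorted(kmer_set)    # sorting discards the original order on purpose
    return {v: [w for w in nodes if w != v and v[1:] == w[:-1]] for v in nodes}

def hamiltonian_path(graph):
    """Backtracking search for a path that visits every node exactly once."""
    nodes = list(graph)

    def extend(path, visited):
        if len(path) == len(nodes):
            return path
        for w in graph[path[-1]]:
            if w not in visited:
                result = extend(path + [w], visited | {w})
                if result is not None:
                    return result
        return None

    for start in nodes:
        result = extend([start], {start})
        if result is not None:
            return result
    return None

def spell(path):
    """Reconstruct a sequence: the first k-mer plus the last letter of each later k-mer."""
    return path[0] + "".join(v[-1] for v in path[1:])

graph = overlap_graph(set(kmers("ACATAGGATTCAC", 5)))
path = hamiltonian_path(graph)
```

Here `spell(path)` returns the string the k-mers were taken from, because consecutive k-mers on the path overlap in k-1 letters.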

Hamiltonian Path A Hamiltonian Path in a graph visits every node exactly once

Genome Assembly Attempt #1 Create the graph for the following string, using k=5 ACATAGGATTCAC Does the graph have a Hamiltonian Path? Is it unique? Can you reconstruct the sequence from the path?

Hamiltonian Path A Hamiltonian Path in a graph visits every node exactly once. Determining if a graph has a Hamiltonian Path is NP-Complete, so no polynomial-time algorithm for it is known. So this approach to Genome Assembly is computationally intensive (infeasible for long sequences).

Eulerian Cycles An Eulerian cycle is one that goes through every edge exactly once. It is easy to see that if a graph has an Eulerian cycle, then every node has even degree. The converse is also true for connected graphs, but a bit harder to prove. For directed graphs, the cycle will need to follow the direction of the edges (also called “arcs”). In this case, a connected graph has an Eulerian cycle if and only if the indegree is equal to the outdegree for every node.

Eulerian Paths An Eulerian path is one that goes through every edge exactly once. It is easy to see that if a graph has an Eulerian path, then all but at most 2 nodes have even degree. The converse is also true for connected graphs, but a bit harder to prove. For directed graphs, the path will need to follow the direction of the edges (also called “arcs”). In this case, a connected graph has an Eulerian path (that is not a cycle) if and only if indegree(v)=outdegree(v) for all but 2 nodes (x and y), where indegree(x)=outdegree(x)+1, and indegree(y)=outdegree(y)-1. The path then starts at y and ends at x.
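The degree conditions for directed graphs are easy to test mechanically. The sketch below checks only the indegree/outdegree condition and assumes the graph is connected; the names `degree_balance` and `euler_degree_check` are hypothetical.

```python
from collections import Counter

def degree_balance(edges):
    """Map each node to outdegree(node) - indegree(node)."""
    balance = Counter()
    for u, w in edges:
        balance[u] += 1     # edge leaves u
        balance[w] -= 1     # edge enters w
    return balance

def euler_degree_check(edges):
    """Return 'cycle', 'path', or 'neither' based on the degree conditions alone.

    Connectivity is NOT checked here; it is assumed.
    """
    balance = degree_balance(edges)
    starts = [v for v, b in balance.items() if b == 1]    # outdegree = indegree + 1
    ends = [v for v, b in balance.items() if b == -1]     # indegree = outdegree + 1
    if any(b not in (-1, 0, 1) for b in balance.values()):
        return "neither"
    if not starts and not ends:
        return "cycle"
    if len(starts) == 1 and len(ends) == 1:
        return "path"
    return "neither"
```

For example, the directed triangle a -> b -> c -> a satisfies the cycle condition, while the two-edge chain a -> b -> c satisfies the path condition with start a and end c.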

de Bruijn Graph Input: the set of k-mers for the DNA sequence. Output: the de Bruijn Graph. Vertices: the (k-1)-mers. Directed edges: v->w if the (k-2)-suffix of v is the (k-2)-prefix of w, and the k-mer formed by starting with v and ending with w is one of the k-mers in the input.
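Since every input k-mer contributes exactly one edge, from its (k-1)-prefix to its (k-1)-suffix, the graph can be built in a single pass over the k-mers. A minimal Python sketch follows; the name `de_bruijn_graph` and the adjacency-dict output are assumptions for illustration.

```python
from collections import defaultdict

def de_bruijn_graph(kmer_set):
    """Adjacency lists of the de Bruijn graph: one edge prefix -> suffix per k-mer.

    Nodes without outgoing edges appear only as targets, not as keys.
    """
    graph = defaultdict(list)
    for kmer in sorted(kmer_set):
        prefix, suffix = kmer[:-1], kmer[1:]   # the two (k-1)-mers joined by this k-mer
        graph[prefix].append(suffix)
    return dict(graph)

seq, k = "ACATAGGATTCAC", 5
kmer_set = {seq[i:i + k] for i in range(len(seq) - k + 1)}
graph = de_bruijn_graph(kmer_set)
```

The prefix and suffix of a k-mer overlap in k-2 letters, which is exactly the edge condition in the definition.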

de Bruijn Graph If the k-mer set comes from a sequence and no k-mer appears more than once in the sequence, then the de Bruijn graph has an Eulerian path!

Using de Bruijn Graphs Given: set of k-mers from a DNA sequence. Algorithm: Construct the de Bruijn graph. Find an Eulerian path in the graph. The path defines a sequence with the same set of k-mers as the original.
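The three steps can be sketched end to end in Python. The lecture does not say how to find the Eulerian path; this sketch uses Hierholzer's algorithm, a standard linear-time method, as an assumed choice. The function names are hypothetical, and the code assumes a non-empty k-mer set whose de Bruijn graph really has an Eulerian path.

```python
from collections import defaultdict

def eulerian_path(edges):
    """Hierholzer's algorithm on a directed edge list; assumes an Eulerian path exists."""
    out_edges = defaultdict(list)
    balance = defaultdict(int)              # outdegree - indegree per node
    for u, w in edges:
        out_edges[u].append(w)
        balance[u] += 1
        balance[w] -= 1
    # Start at the node with one extra outgoing edge, or anywhere if the path is a cycle.
    start = next((v for v, b in balance.items() if b == 1), edges[0][0])
    stack, path = [start], []
    while stack:
        v = stack[-1]
        if out_edges[v]:
            stack.append(out_edges[v].pop())   # follow an unused edge
        else:
            path.append(stack.pop())           # dead end: node is finished
    return path[::-1]

def assemble(kmer_set):
    """Build the de Bruijn edges, find an Eulerian path, and spell out the sequence."""
    edges = [(kmer[:-1], kmer[1:]) for kmer in sorted(kmer_set)]
    path = eulerian_path(edges)
    return path[0] + "".join(node[-1] for node in path[1:])

seq, k = "ACATAGGATTCAC", 5
kmer_set = {seq[i:i + k] for i in range(len(seq) - k + 1)}
result = assemble(kmer_set)
```

When the Eulerian path is not unique, different paths may spell different sequences that share the same k-mer set, so the output is only guaranteed to match the original up to that ambiguity.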

de Bruijn Graph Create the de Bruijn graph for the following string, using k=5: ACATAGGATTCAC. Find the Eulerian path. Is the Eulerian path unique? Reconstruct the sequence from this path.