Bioinformatics Programming
EE, NCKU
Tien-Hao Chang (Darby Chang)

Graph

Graph: Königsberg bridge problem

Graph: Definitions
A graph G consists of two sets
  – a finite, nonempty set of vertices V(G)
  – a finite, possibly empty set of edges E(G)
  – G(V,E) represents a graph
An undirected graph is one in which the pair of vertices in an edge is unordered
  – (v_0,v_1) = (v_1,v_0)
A directed graph is one in which each edge is a directed pair of vertices
  – <v_0,v_1> ≠ <v_1,v_0>

Graph: Examples
complete undirected graph: n(n-1)/2 edges
complete directed graph: n(n-1) edges
G_1 (complete graph): V(G_1)={0,1,2,3}, E(G_1)={(0,1),(0,2),(0,3),(1,2),(1,3),(2,3)}
G_2 (incomplete graph): V(G_2)={0,1,2,3,4,5,6}, E(G_2)={(0,1),(0,2),(1,3),(1,4),(2,5),(2,6)}
G_3 (directed graph): V(G_3)={0,1,2}, E(G_3)={<0,1>,<1,0>,<1,2>}

Graph: Restrictions
A graph may not have an edge from a vertex, i, back to itself
  – such edges are known as self loops
A graph may not have multiple occurrences of the same edge
  – if we remove this restriction, we obtain a data object referred to as a multi-graph

[Figure: a graph with feedback loops; a multi-graph]

Graph: Adjacent and incident
If (v_0,v_1) is an edge in an undirected graph
  – v_0 and v_1 are adjacent
  – the edge (v_0,v_1) is incident on v_0 and v_1
If <v_0,v_1> is an edge in a directed graph
  – v_0 is adjacent to v_1, and v_1 is adjacent from v_0
  – the edge <v_0,v_1> is incident on v_0 and v_1

Graph: Sub-graph
A sub-graph of G is a graph G' such that V(G') ⊆ V(G) and E(G') ⊆ E(G)

[Figure: sub-graphs of G_1 and G_3]

Graph: Path
A path from vertex v_p to vertex v_q in a graph G is a sequence of vertices v_p, v_i1, v_i2, ..., v_in, v_q such that (v_p,v_i1), (v_i1,v_i2), ..., (v_in,v_q) are edges in G
  – a path such as (0,2), (2,1), (1,3) is also written as 0,2,1,3
The length of a path is the number of edges on it

Graph: Simple path and cycle
Simple path (simple directed path)
  – a path in which all vertices, except possibly the first and the last, are distinct
A cycle is a simple path in which the first and the last vertices are the same

Graph: Connected graph/component
Connected graph
  – in an undirected graph G, two vertices, v_0 and v_1, are connected if there is a path in G from v_0 to v_1
  – an undirected graph is connected if, for every pair of distinct vertices v_i, v_j, there is a path from v_i to v_j
Connected component
  – a connected component of an undirected graph is a maximal connected sub-graph
  – by maximal, we mean that no other sub-graph is both connected and properly contains the component
A tree is a graph that is connected and acyclic (i.e., has no cycle)

Connected component: Strongly connected
A directed graph is strongly connected if, for every pair of distinct vertices v_i, v_j, there is a directed path from v_i to v_j and also from v_j to v_i
A strongly connected component is a maximal sub-graph that is strongly connected

Any Questions?

How many strongly connected components are in this graph (G_3)?

G_3 is not strongly connected; it has 2 strongly connected components (maximal strongly connected sub-graphs)

Graph: Degree
The degree of a vertex is the number of edges incident to that vertex
For a directed graph
  – in-degree(v): the number of edges that have v as the head
  – out-degree(v): the number of edges that have v as the tail
If d_i is the degree of vertex i in a graph G with n vertices and e edges, the number of edges is
  – $e = \frac{1}{2}\sum_{i=0}^{n-1} d_i$

[Figure: vertex degrees in an undirected graph; in-/out-degrees in a directed graph (in:1, out:1; in:1, out:2; ...)]

Graph: Representations
Adjacency matrix
Adjacency lists
Adjacency multi-lists

Graph representation: Adjacency matrix

Graph representation: Adjacency list

Graph representation: Adjacency multi-list

Adjacency matrix
Let G = (V,E) be a graph with n vertices
The adjacency matrix of G is a two-dimensional n x n array, say adj_mat
If the edge (v_i,v_j) is in E(G), then adj_mat[i][j]=1, otherwise adj_mat[i][j]=0
The adjacency matrix for an undirected graph is symmetric; the adjacency matrix for a digraph need not be symmetric

[Figure: adjacency matrices for G_1, G_3, and G_4]

Adjacency matrix: Merits
For an undirected graph, the degree of any vertex, i, is its row sum: $\sum_{j=0}^{n-1} \mathrm{adj\_mat}[i][j]$
For a directed graph, the row sum is the out-degree, while the column sum is the in-degree
The complexity of counting the edges or checking whether G is connected
  – G is undirected: O(n^2/2)
  – G is directed: O(n^2)
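The row-sum and edge-counting remarks can be sketched in C as below. This is a minimal illustration, not the course's code: `MAX_VERTICES`, `build_adj_mat`, `degree`, and `edge_count` are names assumed here, and the graph is taken to be undirected.

```c
#include <assert.h>
#include <string.h>

#define MAX_VERTICES 8   /* assumed upper bound, for illustration only */

/* Build the adjacency matrix of an undirected graph from an edge list. */
void build_adj_mat(int adj_mat[MAX_VERTICES][MAX_VERTICES],
                   int n, const int edges[][2], int e)
{
    assert(n <= MAX_VERTICES);
    memset(adj_mat, 0, sizeof(int) * MAX_VERTICES * MAX_VERTICES);
    for (int k = 0; k < e; k++) {
        int i = edges[k][0], j = edges[k][1];
        adj_mat[i][j] = 1;
        adj_mat[j][i] = 1;   /* undirected: keep the matrix symmetric */
    }
}

/* Degree of vertex i is the sum of row i. */
int degree(int adj_mat[MAX_VERTICES][MAX_VERTICES], int n, int i)
{
    int d = 0;
    for (int j = 0; j < n; j++)
        d += adj_mat[i][j];
    return d;
}

/* Number of edges, scanning only the upper triangle (about n^2/2 entries). */
int edge_count(int adj_mat[MAX_VERTICES][MAX_VERTICES], int n)
{
    int e = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            e += adj_mat[i][j];
    return e;
}
```

For the complete graph G_1 (4 vertices, 6 edges), every vertex has degree 3 under this sketch.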

Adjacency Lists
There is one list for each vertex in G
  – the nodes in list i represent the vertices that are adjacent from vertex i
For an undirected graph with n vertices and e edges, this representation requires n head nodes and 2e list nodes

[Figure: adjacency lists for G_1, G_3, and G_4]

Any Questions?

Adjacency Lists: Interesting operations
Degree of a vertex in an undirected graph
  – # of nodes in its adjacency list
# of edges in a graph
  – determined in O(n+e)
Out-degree of a vertex in a directed graph
  – # of nodes in its adjacency list
In-degree of a vertex in a directed graph
  – traverse the whole data structure
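A minimal C sketch of these operations on a directed graph follows. The node layout and the names `graph`, `add_edge`, `out_degree`, and `in_degree` are assumptions made for illustration; the slides do not show this code.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_VERTICES 8   /* assumed upper bound, for illustration only */

typedef struct node *node_pointer;
struct node {
    int vertex;          /* the vertex this list node refers to */
    node_pointer link;   /* next node in the same list */
};

/* graph[i] heads the list of vertices adjacent from i (NULL when empty). */
node_pointer graph[MAX_VERTICES];

/* Insert the directed edge <u,v> at the front of u's list. */
void add_edge(int u, int v)
{
    node_pointer p = malloc(sizeof *p);
    assert(p != NULL);
    p->vertex = v;
    p->link = graph[u];
    graph[u] = p;
}

/* Out-degree: the number of nodes in u's own list. */
int out_degree(int u)
{
    int d = 0;
    for (node_pointer p = graph[u]; p; p = p->link)
        d++;
    return d;
}

/* In-degree: every list must be traversed, O(n+e). */
int in_degree(int n, int v)
{
    int d = 0;
    for (int u = 0; u < n; u++)
        for (node_pointer p = graph[u]; p; p = p->link)
            if (p->vertex == v)
                d++;
    return d;
}
```

The cost of `in_degree` is what motivates keeping a second, inverse set of lists when in-degrees are needed often.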

Adjacency list: Sequential representation
Sequentially pack the nodes on the adjacency lists
An undirected graph with n vertices and e edges requires an array of n+2e+1 nodes
  – node[0] … node[n-1]: node[i] gives the starting point of the list for vertex i
  – node[n]: n+2e+1
  – node[n+1] … node[n+2e]: the other end of the edge
The vertices adjacent from vertex i are stored in node[node[i]] … node[node[i+1]-1], 0 ≤ i < n

Example (n = 8, e = 7):
  node[0..8]   = 9 11 13 15 17 18 20 22 23   (list starts; node[8] = n+2e+1)
  node[9..10]  = 1 2   (vertex 0)
  node[11..12] = 0 3   (vertex 1)
  node[13..14] = 0 3   (vertex 2)
  node[15..16] = 1 2   (vertex 3)
  node[17]     = 5     (vertex 4)
  node[18..19] = 4 6   (vertex 5)
  node[20..21] = 5 7   (vertex 6)
  node[22]     = 6     (vertex 7)
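One way to build such a packed array is sketched below in C. `pack` and `MAX_NODES` are hypothetical names; the sketch assumes an undirected graph given as an edge list, and the order of neighbours inside each segment simply follows the order of the input edges.

```c
#include <assert.h>

#define MAX_NODES 64   /* assumed capacity of the packed array */

/* Pack an undirected graph (n vertices, e edges) into node[0 .. n+2e]. */
void pack(int node[], int n, const int edges[][2], int e)
{
    int deg[MAX_NODES] = {0};
    int next[MAX_NODES];   /* next free slot in each vertex's segment */

    assert(n + 2 * e + 1 <= MAX_NODES);

    /* Count the degree of every vertex. */
    for (int k = 0; k < e; k++) {
        deg[edges[k][0]]++;
        deg[edges[k][1]]++;
    }

    /* node[i] is where vertex i's segment starts; node[n] ends up n+2e+1. */
    node[0] = n + 1;
    for (int i = 0; i < n; i++)
        node[i + 1] = node[i] + deg[i];

    /* Fill the segments: each edge stores "the other end" on both sides. */
    for (int i = 0; i < n; i++)
        next[i] = node[i];
    for (int k = 0; k < e; k++) {
        int u = edges[k][0], v = edges[k][1];
        node[next[u]++] = v;
        node[next[v]++] = u;
    }
}
```

With the edges (0,1), (0,2), (1,3), (2,3), (4,5), (5,6), (6,7), which are read off the slide's example array, this sketch yields the same 23 entries as that example.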

Adjacency list: Inverse adjacency list
Finding in-degree of vertices
[Figure: inverse adjacency lists for G_3]

Adjacency list: Orthogonal representation
[Figure: orthogonal list representation of G_3]

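Adjacency list: Order is of no significance
Adjacency lists for G_1 with the nodes in a different order:
  [0] → 3 → 1 → 2 → NULL
  [1] → 2 → 0 → 3 → NULL
  [2] → 3 → 0 → 1 → NULL
  [3] → 2 → 1 → 0 → NULL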

Any questions about the previous representation?

Is there any waste?

Adjacency multi-lists
Lists in which nodes may be shared among several lists
  – an edge is shared by two different paths
There is exactly one node for each edge

[Figure: adjacency multi-lists for G_1]

Graph: Weighted edges
The edges have weights assigned to them
These weights may represent
  – the distance from one vertex to another
  – the cost of going from one vertex to an adjacent vertex
Adjacency matrix
  – adj_mat[i][j] would keep the weights
Adjacency lists
  – add a weight field to the node structure
A graph with weighted edges is called a network

Graph: Operations
Traversal
  – given G=(V,E) and a vertex v, find all w ∈ V such that w is connected to v
  – Depth First Search (DFS): similar to a preorder traversal
  – Breadth First Search (BFS): similar to a level order traversal
Spanning tree
Biconnected components

Traversal: An example
DFS: v_0, v_1, v_3, v_7, v_4, v_5, v_2, v_6
BFS: v_0, v_1, v_2, v_3, v_4, v_5, v_6, v_7

Depth First Search
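The code for this slide did not survive in the transcript, so what follows is only a sketch of a recursive depth first search over adjacency lists. The names `graph`, `add_arc`, `visited`, `order`, and `count` are assumptions, and the visit sequence is recorded in `order[]` rather than printed.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_VERTICES 8   /* assumed upper bound, for illustration only */

typedef struct node *node_pointer;
struct node {
    int vertex;
    node_pointer link;
};

node_pointer graph[MAX_VERTICES];   /* adjacency lists */
int visited[MAX_VERTICES];          /* 1 once a vertex has been reached */
int order[MAX_VERTICES];            /* vertices in the order they are visited */
int count;                          /* number of vertices visited so far */

/* Append v to the end of u's list so neighbours keep their input order. */
void add_arc(int u, int v)
{
    node_pointer p = malloc(sizeof *p), *tail = &graph[u];
    assert(p != NULL);
    p->vertex = v;
    p->link = NULL;
    while (*tail)
        tail = &(*tail)->link;
    *tail = p;
}

/* Visit v, then recurse into each unvisited neighbour in list order. */
void dfs(int v)
{
    visited[v] = 1;
    order[count++] = v;
    for (node_pointer w = graph[v]; w; w = w->link)
        if (!visited[w->vertex])
            dfs(w->vertex);
}
```

Assuming an 8-vertex graph with edges (0,1), (0,2), (1,3), (1,4), (2,5), (2,6), (3,7), (4,7), (5,7), (6,7), each inserted in both directions, `dfs(0)` visits 0, 1, 3, 7, 4, 5, 2, 6, the same sequence as the traversal example. Each list node is examined once, so the search takes O(n+e) with adjacency lists.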

Breadth First Search
It needs a queue to implement breadth-first search

    typedef struct queue *queue_pointer;
    struct queue {
        int vertex;           /* vertex waiting to be expanded */
        queue_pointer link;   /* next entry in the queue */
    };
    void addq(queue_pointer *, queue_pointer *, int);   /* front, rear, vertex */
    int deleteq(queue_pointer *);                       /* front */
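The queue declarations above could be completed along the following lines. This is a sketch under assumptions: the bodies of `addq` and `deleteq` are not in the transcript, and an adjacency matrix (`adj_mat`) is used here only to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_VERTICES 8   /* assumed upper bound, for illustration only */

typedef struct queue *queue_pointer;
struct queue {
    int vertex;
    queue_pointer link;
};

int adj_mat[MAX_VERTICES][MAX_VERTICES];
int visited[MAX_VERTICES];
int order[MAX_VERTICES];   /* vertices in the order they are visited */
int count;

/* Add vertex v at the rear of the linked queue. */
void addq(queue_pointer *front, queue_pointer *rear, int v)
{
    queue_pointer p = malloc(sizeof *p);
    assert(p != NULL);
    p->vertex = v;
    p->link = NULL;
    if (*front)
        (*rear)->link = p;
    else
        *front = p;
    *rear = p;
}

/* Remove and return the vertex at the front of the queue. */
int deleteq(queue_pointer *front)
{
    queue_pointer p = *front;
    assert(p != NULL);
    int v = p->vertex;
    *front = p->link;
    free(p);
    return v;
}

/* Breadth first search from v over the first n vertices. */
void bfs(int n, int v)
{
    queue_pointer front = NULL, rear = NULL;
    visited[v] = 1;
    order[count++] = v;
    addq(&front, &rear, v);
    while (front) {
        v = deleteq(&front);
        for (int w = 0; w < n; w++)
            if (adj_mat[v][w] && !visited[w]) {
                visited[w] = 1;   /* mark when enqueued, so w is queued once */
                order[count++] = w;
                addq(&front, &rear, w);
            }
    }
}
```

Marking a vertex when it is enqueued, not when it is dequeued, keeps each vertex in the queue at most once.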

Connected components
If G is an undirected graph, then one can determine whether or not it is connected
  – simply make a call to either dfs or bfs
  – then determine if there is any unvisited vertex
Adjacency list: O(n+e)
Adjacency matrix: O(n^2)
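Repeating the search from every vertex that is still unvisited also counts the connected components. The C sketch below assumes an adjacency matrix and uses hypothetical names (`count_components`, `dfs`); it therefore runs in O(n^2).

```c
#include <assert.h>

#define MAX_VERTICES 8   /* assumed upper bound, for illustration only */

int adj_mat[MAX_VERTICES][MAX_VERTICES];
int visited[MAX_VERTICES];

/* Mark every vertex reachable from v. */
static void dfs(int n, int v)
{
    visited[v] = 1;
    for (int w = 0; w < n; w++)
        if (adj_mat[v][w] && !visited[w])
            dfs(n, w);
}

/* Each search started from an unvisited vertex discovers one component. */
int count_components(int n)
{
    int components = 0;
    assert(n <= MAX_VERTICES);
    for (int i = 0; i < n; i++)
        visited[i] = 0;
    for (int i = 0; i < n; i++)
        if (!visited[i]) {
            dfs(n, i);
            components++;
        }
    return components;
}
```

The graph is connected exactly when `count_components` returns 1.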

Spanning tree
Definition: A tree T is said to be a spanning tree of a connected graph G if T is a subgraph of G and T contains all vertices of G
E(G): T (tree edges) + N (nontree edges)
  – T: set of edges used during search
  – N: set of remaining edges

Spanning tree: DFS/BFS
When graph G is connected, a depth first or breadth first search starting at any vertex will visit all vertices in G
We may use DFS or BFS to create a spanning tree
  – depth first spanning tree when DFS is used
  – breadth first spanning tree when BFS is used

Spanning tree: Properties
If a nontree edge (v,w) is introduced into any spanning tree T, then a cycle is formed
A spanning tree is a minimal subgraph, G', of G such that V(G') = V(G) and G' is connected
  – we define a minimal sub-graph as one with the fewest number of edges
  – a spanning tree has n-1 edges

Biconnected graph (G is an undirected, connected graph)
Definition: A vertex v of G is an articulation point iff the deletion of v, together with the deletion of all edges incident to v, leaves behind a graph that has at least two connected components
Definition: A biconnected graph is a connected graph that has no articulation points
Definition: A biconnected component of a connected graph G is a maximal biconnected subgraph H of G

Articulation point
Examples of articulation points (nodes 1, 3, 5, 7)
[Figure: panels labeled "connected graph", "two connected components", and "one connected graph"]

Any Questions?

Biconnected component

Biconnected component: Using DFS
Use the depth first spanning tree of a connected undirected graph
The depth first number (dfn) outside the vertices in the figures gives the DFS visit sequence
If u is an ancestor of v then dfn(u) < dfn(v)
[Figure: dfs(3), the depth first spanning tree starting from vertex 3]

Using DFS to find biconnected components: dfn and low
low(u)
  – the lowest dfn that we can reach from u using a path of descendants followed by at most one back edge
  – low(u) = min { dfn(u), min{ low(w) | w is a child of u }, min{ dfn(w) | (u,w) is a back edge } }
[Figure: nontree edges (back edges)]

Using DFS to find biconnected components: An example
dfn and low values for the dfs spanning tree with root = 3
  – low(u) = min { dfn(u), min{ low(w) | w is a child of u }, min{ dfn(w) | (u,w) is a back edge } }

The triple lists the three terms of the min in order (n = none); * marks an articulation point.

  V  dfn  low  triple   child  low_child  low:dfn  cut
  0  4    4    (4,n,n)  null   null       null:4
  1  3    0    (3,4,0)  0      4          4 ≥ 3    *
  2  2    0    (2,0,n)  1      0          0 < 2
  3  0    0    (0,0,n)  4,5    0,5        0,5 ≥ 0  *
  4  1    0    (1,0,n)  2      0          0 < 1
  5  5    5    (5,5,n)  6      5          5 ≥ 5    *
  6  6    5    (6,5,n)  7      5          5 < 6
  7  7    5    (7,8,5)  8,9    9,8        9,8 ≥ 7  *
  8  9    9    (9,n,n)  null   null       null:9
  9  8    8    (8,n,n)  null   null       null:8

Using DFS to find biconnected components: Articulation point
Any vertex u is an articulation point iff
  – u is the root of the spanning tree and has two or more children, or
  – u is not the root and has at least one child w such that we cannot reach an ancestor of u using a path that consists of only w, descendants of w, and a single back edge
  – that is, low(w) ≥ dfn(u)

Using DFS to find biconnected components: Determining dfn and low
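The code for this slide is missing from the transcript. The C sketch below computes dfn and low in one depth first search and also flags articulation points using the two conditions stated earlier. Only the name `dfnlow` and its (vertex, parent) argument order come from the slides; the adjacency matrix, `init`, `is_cut`, and the other names are assumptions.

```c
#include <assert.h>

#define MAX_VERTICES 10   /* assumed upper bound, for illustration only */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))

int adj_mat[MAX_VERTICES][MAX_VERTICES];
int dfn[MAX_VERTICES], low[MAX_VERTICES];
int is_cut[MAX_VERTICES];   /* 1 if the vertex is an articulation point */
int n_vertices, num;

/* Reset all state before a search; dfn < 0 means "not yet visited". */
void init(int n)
{
    assert(n <= MAX_VERTICES);
    n_vertices = n;
    num = 0;
    for (int i = 0; i < n; i++) {
        dfn[i] = low[i] = -1;
        is_cut[i] = 0;
    }
}

/* Visit u; v is the parent of u in the spanning tree (-1 for the root). */
void dfnlow(int u, int v)
{
    int children = 0;
    dfn[u] = low[u] = num++;
    for (int w = 0; w < n_vertices; w++) {
        if (!adj_mat[u][w])
            continue;
        if (dfn[w] < 0) {                 /* tree edge: w becomes a child of u */
            children++;
            dfnlow(w, u);
            low[u] = MIN2(low[u], low[w]);
            if (v >= 0 && low[w] >= dfn[u])
                is_cut[u] = 1;            /* non-root condition */
        } else if (w != v) {              /* back edge */
            low[u] = MIN2(low[u], dfn[w]);
        }
    }
    if (v < 0 && children >= 2)
        is_cut[u] = 1;                    /* root condition */
}
```

The dfn and low values depend on the order in which neighbours are visited, so they need not match the example table; the set of articulation points does not depend on that order.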

Using DFS to find biconnected components: The biconnected components
Partition the edges of the connected graph into their biconnected components
  – in dfnlow(), we know that low[w] has been computed following the return from the function call dfnlow(w,u)
  – if low[w] ≥ dfn[u], then we have identified a new biconnected component
  – we can output all edges in a biconnected component if we use a stack to save the edges when we first encounter them
  – the function bicon() contains the code modified from dfnlow()

Any Questions?

AvA Network
In: list of pairs and a minimum size, n
Out: AvA networks
Requirement
  - all vs. all network
  - complexity/teamwork report
  - using C would be the best
Bonus
  - allow some missing edges (most vs. most)

Deadline /4/20 23:59
Zip your code, step-by-step README, complexity analyses, and anything worthy of extra credit. to

Graph: Algorithms and applications
Shortest path
  – 4 → 5: 4,1,3,5
  – all pair shortest path
Topological order
  – a,b,c,e,d
  – O(n+e)
Bipartite
  – maximum matching: weighted or unweighted, O(n^2+ne)
  – all vs. all networks are similar to bipartite graphs, except that edges between vertices of the same group are allowed
  – an n vs. n network is formed by (n–1) vs. (n–1) networks

Minimum cost spanning tree
The cost of a spanning tree of a weighted, undirected graph is the sum of the costs (weights) of the edges in the spanning tree
A minimum-cost spanning tree is a spanning tree of least cost
Three different algorithms can be used to obtain a minimum cost spanning tree
  – Kruskal's algorithm
  – Prim's algorithm
  – Sollin's algorithm
All three use a design strategy called the greedy method

Minimum cost spanning tree: Greedy strategy
Construct an optimal solution in stages
At each stage, we make the best decision (selection) possible at this time
  – using the least-cost criterion for constructing minimum-cost spanning trees
Make sure that the decision will result in a feasible solution, where a feasible solution works within the constraints specified by the problem
  – our solution must satisfy the following constraints
      must use only edges within the graph
      must use exactly n – 1 edges
      may not use edges that produce a cycle

Minimum cost spanning tree: Kruskal's algorithm
Build a minimum cost spanning tree T by adding edges to T one at a time
Select the edges for inclusion in T in non-decreasing order of the cost
An edge is added to T if it does not form a cycle
Since G is connected and has n > 0 vertices, exactly n – 1 edges will be selected
Time complexity: O(e log e)
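A compact C sketch of Kruskal's algorithm is given below. The slides do not say how the cycle test is done; a union-find structure is assumed here, and `edge`, `find`, `by_cost`, and `kruskal` are hypothetical names. Sorting with `qsort` accounts for the O(e log e) bound.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_VERTICES 16   /* assumed upper bound, for illustration only */

typedef struct {
    int u, v, cost;
} edge;

static int parent[MAX_VERTICES];   /* union-find forest over the vertices */

/* Return the representative of x's set, halving the path on the way. */
static int find(int x)
{
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

/* qsort comparator: non-decreasing cost. */
static int by_cost(const void *a, const void *b)
{
    const edge *x = a, *y = b;
    return (x->cost > y->cost) - (x->cost < y->cost);
}

/* Sorts edges[] in place, stores the chosen edges in tree[], sets *picked
 * to their number, and returns the total cost of the selected edges. */
int kruskal(edge edges[], int e, int n, edge tree[], int *picked)
{
    int total = 0;
    assert(n <= MAX_VERTICES);
    for (int i = 0; i < n; i++)
        parent[i] = i;
    qsort(edges, e, sizeof edges[0], by_cost);
    *picked = 0;
    for (int k = 0; k < e && *picked < n - 1; k++) {
        int ru = find(edges[k].u), rv = find(edges[k].v);
        if (ru != rv) {            /* different trees: no cycle is formed */
            parent[ru] = rv;
            tree[(*picked)++] = edges[k];
            total += edges[k].cost;
        }
    }
    return total;
}
```

If fewer than n – 1 edges are picked, the input graph was not connected.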

Minimum cost spanning tree: Prim's algorithm
Build a minimum cost spanning tree T by adding edges to T one at a time
At each stage, add a least cost edge to T such that the set of selected edges is still a tree
Repeat the edge addition step until T contains n – 1 edges
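Prim's algorithm can be sketched in C with a cost matrix, giving an O(n^2) version. The names `cost`, `prim`, and `parent` are assumptions, and so is the convention that weights are positive and `cost[i][j] == 0` means "no edge".

```c
#include <assert.h>
#include <limits.h>

#define MAX_VERTICES 16   /* assumed upper bound, for illustration only */

/* cost[i][j] > 0 is the weight of edge (i,j); 0 means there is no edge. */
int cost[MAX_VERTICES][MAX_VERTICES];

/* Grow a tree from vertex 0. parent[w] receives the tree neighbour of w
 * (-1 for the start vertex). Returns the total cost, or -1 if the graph
 * is not connected. */
int prim(int n, int parent[])
{
    int in_tree[MAX_VERTICES] = {0};
    int dist[MAX_VERTICES];   /* cheapest known edge from the tree to each vertex */
    int total = 0;

    assert(n >= 1 && n <= MAX_VERTICES);
    for (int i = 0; i < n; i++) {
        dist[i] = INT_MAX;
        parent[i] = -1;
    }
    dist[0] = 0;

    for (int step = 0; step < n; step++) {
        /* Pick the vertex outside the tree with the least cost edge into it. */
        int u = -1;
        for (int i = 0; i < n; i++)
            if (!in_tree[i] && dist[i] != INT_MAX && (u < 0 || dist[i] < dist[u]))
                u = i;
        if (u < 0)
            return -1;   /* remaining vertices are unreachable */
        in_tree[u] = 1;
        total += dist[u];

        /* Adding u may offer cheaper edges to vertices still outside. */
        for (int w = 0; w < n; w++)
            if (!in_tree[w] && cost[u][w] > 0 && cost[u][w] < dist[w]) {
                dist[w] = cost[u][w];
                parent[w] = u;
            }
    }
    return total;
}
```

Unlike Kruskal's algorithm, the selected edges form a single tree at every stage, so no separate cycle test is needed.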

Minimum cost spanning tree: Sollin's algorithm
Selects several edges for inclusion in T at each stage
At the start of a stage, the selected edges form a spanning forest
During a stage we select, for each tree in the forest, a minimum cost edge that has exactly one vertex in the tree
Repeat until only one tree remains at the end of a stage or no edges remain for selection