Published by Griffin Tyrone Patrick. Modified over 8 years ago.
1
Course Name: Comparative Genomics. Conducted by Shigehiko Kanaya & Md. Altaf-Ul-Amin
2
Dates of Lectures: October 7, 14, 19, 26; November 4, 11, 18, 25
Network biology. Comparative analysis of biological networks.
3
Central dogma of molecular biology
4
From Genome to Phenome
Levels: Genome (gene set), Transcriptome, Proteome, Metabolome, Phenome (Phenotype X).
DNA: nucleotide sequence (ATCTGAT……), double helix.
mRNA and other RNAs: nucleotide sequence, single strand (dynamic).
Proteins: amino acid sequences (static).
Metabolites: biochemical molecules.
As genome projects have progressed, many kinds of "-omics" work have advanced, such as transcriptome, …. These provide dynamic information that is reflected in the phenome.
5
Lecture 1: Introduction to Graphs/Networks
Lecture 2: Different network models
Lecture 3: Different centrality measures, Hierarchical Clustering (12/9)
Lecture 4: Graph spectral analysis/Graph spectral clustering and its application to metabolic networks
Lecture 5: Properties of Protein-Protein Interaction Networks & Protein function prediction using network concepts
Lecture 6: On finding clusters in undirected simple graphs: application to protein complex detection
Lecture 7: Finding Biclusters in Bipartite Graphs, Properties of transcription and metabolic networks
Lecture 8: Application of network concepts in DNA sequencing, Line graphs
6
Introduction to Graphs/Networks
7
Königsberg bridge problem
Königsberg was a city in Prussia (present-day Kaliningrad, Russia) that included two islands and both banks of the Pregel River. The parts of the city were connected by 7 bridges.
Problem: start at any point, walk over each bridge exactly once, and return to the starting point. Is this possible?
11
Königsberg bridge problem
Problem: start at any point, walk over each bridge exactly once, and return to the starting point. Is this possible?
This problem was solved by Leonhard Euler in 1736 by means of a graph.
[Figure: graph with vertices A, B, C, D]
12
Königsberg bridge problem
Problem: start at any point, walk over each bridge exactly once, and return to the starting point. Is this possible?
[Figure: graph with vertices A, B, C, D] The circles A, B, C, D represent land masses and each line represents a bridge.
A necessary condition for the existence of the desired route is that each land mass be connected to an even number of bridges. The graph of the Königsberg bridge problem does not satisfy this condition, and hence the problem has no solution.
This notion has been used in solving the DNA sequencing problem.
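The even-degree condition can be checked mechanically. The sketch below is only an illustration: the slides do not say which land mass each letter denotes, so the assignment of bridges to A, B, C, D (A as the central island with five bridges) is an assumption, and the function names are hypothetical.

```python
from collections import Counter

# Assumed labelling (not given in the slides): A is the central island,
# B and C are the two river banks, D is the second island.
# Each tuple is one bridge; repeated tuples are parallel bridges.
BRIDGES = [("A", "B"), ("A", "B"), ("A", "C"), ("A", "C"),
           ("A", "D"), ("B", "D"), ("C", "D")]

def degrees(edges):
    """Count how many bridges touch each land mass."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def satisfies_even_degree_condition(edges):
    """True if every vertex has even degree (assumes the graph is connected)."""
    return all(d % 2 == 0 for d in degrees(edges).values())

print(degrees(BRIDGES))
print(satisfies_even_degree_condition(BRIDGES))
```

Under this labelling every land mass touches an odd number of bridges, so the condition fails at all four vertices.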
13
Definition
A graph $G=(V,E)$ consists of a set of vertices $V=\{v_1, v_2, \dots\}$ and a set of edges $E=\{e_1, e_2, \dots\}$ such that each edge $e_k$ is identified by a pair of vertices $(v_i, v_j)$, which are called the end vertices of $e_k$.
A graph is an abstract representation of almost any physical situation involving discrete objects and a relationship between them.
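One common way to hold such a graph in a program is an adjacency structure built from the edge list. This is a minimal sketch for an undirected graph; the vertex names and the helper `build_adjacency` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Map each vertex to the set of vertices it shares an edge with."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)

# A tiny example graph: V = {v1, v2, v3}, E = {(v1, v2), (v2, v3)}
E = [("v1", "v2"), ("v2", "v3")]
adj = build_adjacency(E)
print(sorted(adj["v2"]))
```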
14
[Figure: two drawings of the same graph on vertices A, B, C, D]
It is immaterial whether the vertices are drawn as rectangles or circles, or whether the edges are drawn straight or curved, long or short. Both these graphs are the same.
15
Many systems in nature can be represented as networks. The internet is a network of computers.
16
Many systems in nature can be represented as networks.
[Figure: road network and air route network. Figure labels: "Very high degree node", "No such node exists"]
17
Many systems in nature can be represented as networks. Printed circuit boards are networks. Network theory is extensively used to design the wiring and placement of components in electronic circuits.
18
Many systems in nature can be represented as networks. [Figure: protein-protein interaction network of E. coli]
19
Some basic concepts regarding networks: average path length, diameter, eccentricity, clustering coefficient, degree distribution.
23
Average path length
[Figure: example graph with nodes a, b, c, d, e, f]
The distance d(u,v) between nodes u and v is the least length of a path from u to v. What is d(a,e)?
Length of the path a-b-c-d-f-e is 5.
Length of the path a-c-d-f-e is 4.
Length of the path a-c-d-e is 3.
The minimum length of a path from a to e is 3, and therefore d(a,e) = 3.
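In an unweighted graph the distance can be found with a breadth-first search. The sketch below assumes an edge list for the example graph: the figure is not part of this transcript, so the seven edges are inferred from the paths listed above and from the degrees used on later slides.

```python
from collections import defaultdict, deque

# Edge list inferred from the slides (an assumption, the figure is missing).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]

def adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def distance(adj, source, target):
    """Length of a shortest path from source to target, or None if unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in adj[u]:
            if v not in dist:          # first visit gives the shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

adj = adjacency(EDGES)
print(distance(adj, "a", "e"))  # 3
```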
24
Average path length
[Figure: example graph with nodes a, b, c, d, e, f]
The average path length L of a network is defined as the mean distance between all pairs of nodes.
There are 6 nodes and $\binom{6}{2} = \frac{6!}{2!\,4!} = 15$ distinct pairs, for example (a,b), (a,c), …, (e,f). We have to calculate the distance between each of these 15 pairs and average them.
25
Average path length
[Figure: example graph with nodes a, b, c, d, e, f]
The average path length L of a network is defined as the mean distance between all pairs of nodes.
a to b: 1
a to c: 1
a to d: 2
a to e: 3
a to f: 3
… (remaining pairs)
15 pairs, total length 27
L = 27/15 = 1.8
The average path length of most real complex networks is small.
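The total of 27 over 15 pairs can be reproduced by running a breadth-first search from every node. As before, the edge list is an assumption inferred from the paths and degrees quoted on the slides, and the sketch assumes the graph is connected.

```python
from collections import defaultdict, deque
from itertools import combinations

# Edge list inferred from the slides (an assumption, the figure is missing).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]

def bfs_distances(adj, source):
    """Distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_path_length(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    nodes = sorted(adj)
    dist = {u: bfs_distances(adj, u) for u in nodes}
    pairs = list(combinations(nodes, 2))      # each unordered pair once
    total = sum(dist[u][v] for u, v in pairs)
    return total, len(pairs), total / len(pairs)

total, n_pairs, L = average_path_length(EDGES)
print(total, n_pairs, L)
```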
26
Average path length
Finding the average path length is not easy when the network is large. Even finding the shortest path between a single pair of nodes is not trivial. A well-known algorithm is given in: Dijkstra E.W., "A note on two problems in connexion with graphs", Numerische Mathematik, Vol. 1, 1959, 269-271.
Dijkstra's algorithm can be found in almost every book on graph theory. There are other algorithms for finding shortest paths between all pairs of nodes.
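A compact version of Dijkstra's algorithm using a binary heap is sketched below. This is a common textbook formulation, not necessarily the one presented in the lecture. The input format (a dict mapping each node to a list of `(neighbor, weight)` pairs) and the unit weights on the example graph are assumptions made for illustration.

```python
import heapq

def dijkstra(adj, source):
    """Shortest distances from source; adj maps node -> list of (neighbor, weight).

    Weights must be non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry, already improved
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Example graph with unit weights (edge list inferred from the slides).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
adj = {}
for u, v in EDGES:
    adj.setdefault(u, []).append((v, 1))
    adj.setdefault(v, []).append((u, 1))

print(dijkstra(adj, "a"))
```

With unit weights the result matches the breadth-first distances. The algorithm becomes necessary once edges carry different non-negative weights.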
27
Diameter
[Figure: example graph with nodes a, b, c, d, e, f]
The distance d(u,v) between nodes u and v is the least length of a path from u to v. The longest of the distances between any two nodes is called the diameter.
a to b: 1
a to c: 1
a to d: 2
a to e: 3
a to f: 3
… (15 pairs in total)
The diameter of this graph is 3.
28
Eccentricity and Radius
[Figure: example graph with nodes a, b, c, d, e, f, each node labelled with its eccentricity: 3, 3, 3, 3, 2, 2]
The eccentricity of a node u is the maximum of the distances from u to any other node in the graph. The radius of a graph is the minimum of the eccentricity values among all the nodes of the graph.
a to b: 1
a to c: 1
a to d: 2
a to e: 3
a to f: 3
Therefore the eccentricity of node a is 3. The radius of this graph is 2.
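Eccentricity, radius and diameter all follow from the same table of pairwise distances. The sketch below again uses the edge list inferred from the slides (an assumption) and assumes a connected graph.

```python
from collections import defaultdict, deque

# Edge list inferred from the slides (an assumption, the figure is missing).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]

def bfs_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def eccentricities(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Eccentricity of u = largest distance from u to any node.
    return {u: max(bfs_distances(adj, u).values()) for u in sorted(adj)}

ecc = eccentricities(EDGES)
radius = min(ecc.values())     # smallest eccentricity
diameter = max(ecc.values())   # largest eccentricity = longest distance
print(ecc, radius, diameter)
```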
29
Degree Distribution
The degree distribution is the probability distribution function P(k), which gives the probability that the degree of a randomly selected node is k.
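For a small graph, P(k) is simply the fraction of nodes with degree k. The sketch below computes it for the six-node example graph used on the earlier slides. Its edge list is inferred from those slides and is an assumption, and nodes with no edges would not be counted by this helper.

```python
from collections import Counter

# Edge list inferred from the slides (an assumption, the figure is missing).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]

def degree_distribution(edges):
    """Return {k: fraction of nodes with degree k}."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())   # how many nodes have each degree
    n = len(degree)
    return {k: counts[k] / n for k in sorted(counts)}

P = degree_distribution(EDGES)
print(P)
```

Four of the six nodes have degree 2 and two have degree 3, so P(2) = 4/6 and P(3) = 2/6.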
30
Degree Distribution
[Figure: bar chart; x-axis: degree (ticks 1, 2, 3, 4); y-axis: number of nodes having degree k (tick at 10)]
31
Degree Distribution
[Figure: bar chart; x-axis: degree (ticks 1, 2, 3, 4); y-axis: P(k) (tick at 1)]
Any randomness in the network will broaden the shape of this peak.
32
Degree Distribution
[Figure: bar chart; x-axis: degree (ticks 1, 2, 3, 4); y-axis: number of nodes having degree k (ticks at 2 and 4)]
33
Degree Distribution
[Figure: bar chart; x-axis: degree (ticks 1, 2, 3, 4); y-axis: P(k) (ticks at 0.25 and 0.5)]
34
Degree Distribution: Poisson distribution
The degree distribution of random graphs follows a Poisson distribution.
e = 2.71828... is the base of natural logarithms.
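The formula on this slide is not preserved in the transcript. For reference, the standard Poisson form is given below. The symbol $\lambda$ for the mean degree is an assumption, and the slide may use different notation.

```latex
P(k) = \frac{e^{-\lambda}\,\lambda^{k}}{k!}, \qquad k = 0, 1, 2, \dots
```

Here $P(k)$ is the probability that a randomly selected node has degree $k$, and the distribution peaks near $k \approx \lambda$.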
35
Degree Distribution: power-law distribution
$P(k) \sim k^{-\gamma}$
The degree distribution of many biological networks follows a power law. On a log-log plot, a power-law distribution is a straight line.
[Figure: P(k) versus connectivity k]
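The straight line on a log-log plot follows from taking logarithms of both sides. The proportionality constant $c$ below is introduced only for this derivation and is not named on the slide.

```latex
P(k) = c\,k^{-\gamma}
\quad\Longrightarrow\quad
\log P(k) = \log c - \gamma \log k
```

Plotting $\log P(k)$ against $\log k$ therefore gives a line with slope $-\gamma$ and intercept $\log c$.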
38
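Clustering coefficient
[Figure: example graph with nodes a, b, c, d, e, f]
$k_i$ = number of neighbors of node i; $E_i$ = number of edges among the neighbors of node i. The values below are computed as $C_i = 2E_i / (k_i (k_i - 1))$.
$C_a = 2 \cdot 1 / (2 \cdot 1) = 1$
$C_b = 2 \cdot 1 / (2 \cdot 1) = 1$
$C_c = 2 \cdot 1 / (3 \cdot 2) = 0.333$
$C_d = 2 \cdot 1 / (3 \cdot 2) = 0.333$
$C_e = 2 \cdot 1 / (2 \cdot 1) = 1$
$C_f = 2 \cdot 1 / (2 \cdot 1) = 1$
Total = 4.666
C = 4.666/6 ≈ 0.778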
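The per-node values can be recomputed directly from the definitions of $k_i$ and $E_i$. In this sketch the edge list is inferred from the slides (an assumption, since the figure is missing), and returning 0 for nodes with fewer than two neighbors is a convention chosen here, not something the slides state.

```python
from collections import defaultdict
from itertools import combinations

# Edge list inferred from the slides (an assumption, the figure is missing).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]

def clustering_coefficients(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    coeff = {}
    for node in sorted(adj):
        nbrs = adj[node]
        k = len(nbrs)                      # k_i: number of neighbors
        if k < 2:
            coeff[node] = 0.0              # convention assumed for this sketch
            continue
        # E_i: edges that connect two neighbors of the node
        e = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        coeff[node] = 2 * e / (k * (k - 1))
    return coeff

C = clustering_coefficients(EDGES)
average_C = sum(C.values()) / len(C)
print(C, average_C)
```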
39
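Clustering coefficient
[Figure: example graph with nodes a, b, c, d, e, f]
By studying the average clustering C(k) of nodes with a given degree k, information about the actual modular organization can be extracted.
$C_a = 2 \cdot 1 / (2 \cdot 1) = 1$
$C_b = 2 \cdot 1 / (2 \cdot 1) = 1$
$C_c = 2 \cdot 1 / (3 \cdot 2) = 0.333$
$C_d = 2 \cdot 1 / (3 \cdot 2) = 0.333$
$C_e = 2 \cdot 1 / (2 \cdot 1) = 1$
$C_f = 2 \cdot 1 / (2 \cdot 1) = 1$
$C(1) = 0$ (no node has degree 1)
$C(2) = (C_a + C_b + C_e + C_f)/4 = 1$
$C(3) = (C_c + C_d)/2 = 0.333$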
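Grouping the per-node coefficients by degree gives C(k). The sketch below takes the coefficients and degrees quoted on the slide as input. The helper name is hypothetical, and degrees with no nodes are simply omitted from the result rather than reported as 0.

```python
from collections import defaultdict

# Per-node values as quoted on the slide (1/3 written exactly).
CLUSTERING = {"a": 1.0, "b": 1.0, "c": 1 / 3, "d": 1 / 3, "e": 1.0, "f": 1.0}
DEGREE = {"a": 2, "b": 2, "c": 3, "d": 3, "e": 2, "f": 2}

def average_clustering_by_degree(clustering, degree):
    """Return {k: mean clustering coefficient of the nodes with degree k}."""
    groups = defaultdict(list)
    for node, c in clustering.items():
        groups[degree[node]].append(c)
    return {k: sum(vals) / len(vals) for k, vals in sorted(groups.items())}

C_of_k = average_clustering_by_degree(CLUSTERING, DEGREE)
print(C_of_k)
```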
40
Clustering coefficient
By studying the average clustering C(k) of nodes with a given degree k, information about the actual modular organization can be extracted. For most of the known metabolic networks, the average clustering follows a power law:
$C(k) \sim k^{-\gamma}$
41
Subgraphs
Consider a graph G=(V,E). The graph G'=(V',E') is a subgraph of G if V' and E' are subsets of V and E, respectively.
[Figure: graph G (nodes a, b, c, d, e, f) and example subgraphs of G (nodes a, b, c and nodes c, d, f)]
42
Induced Subgraphs
An induced subgraph of a graph G on a subset S of the nodes of G is obtained by taking S and all edges of G that have both end points in S.
[Figure: graph G (nodes a, b, c, d, e, f); induced subgraph of G for S={a, b, c}; induced subgraph of G for S={c, d, f}]
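The definition translates into a one-line filter over the edge list. The sketch below uses the edge list inferred from the earlier slides (an assumption, since the figure is missing), and `induced_subgraph` is a hypothetical helper name.

```python
# Edge list inferred from the slides (an assumption, the figure is missing).
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]

def induced_subgraph(edges, S):
    """Keep exactly those edges whose two end points both lie in S."""
    S = set(S)
    return [(u, v) for u, v in edges if u in S and v in S]

print(induced_subgraph(EDGES, {"a", "b", "c"}))
print(induced_subgraph(EDGES, {"c", "d", "f"}))
```

A partial subgraph on the same S would keep only some of these edges, whereas the induced subgraph keeps all of them.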
43
Graphlets
Graphlets are non-isomorphic induced subgraphs of large networks.
T. Milenkovic, J. Lai, and N. Przulj, GraphCrunch: A Tool for Large Network Analyses, BMC Bioinformatics, 9:70, January 30, 2008.
44
Partial subgraphs/Motifs
A partial subgraph of a graph G on a subset S of the nodes of G is obtained by taking S and some of the edges of G that have both end points in S. Partial subgraphs are sometimes called edge subgraphs.
[Figure: graph G (nodes a, b, c, d, e, f) and a partial subgraph of G for S={a, b, c}]
45
Partial subgraphs/Motifs
Nicholas M. Luscombe, M. Madan Babu, Haiyuan Yu, Michael Snyder, Sarah A. Teichmann & Mark Gerstein, "Genomic analysis of regulatory network dynamics reveals large topological changes", Nature, Vol. 431, 2004.
SIM = single input motif; MIM = multiple input motif; FFL = feed-forward loop.
This paper searched for these motifs in the transcriptional regulatory network of Saccharomyces cerevisiae.