Part 1: Biological Networks
1. Protein-protein interaction networks
2. Regulatory networks
3. Expression networks
4. Metabolic networks
5. … more biological networks
6. Other types of networks
[Qian, et al, J. Mol. Bio., 314: ] Expression networks
Regulatory networks [Horak, et al, Genes & Development, 16: ]
Metabolic networks [DeRisi, Iyer, and Brown, Science, 278: ]
Expression networks Regulatory networks Interaction networks Metabolic networks
... more biological networks Hierarchies & DAGs [Enzyme, Bairoch; GO, Ashburner; MIPS, Mewes, Frishman]
... more biological networks Neural networks [Cajal] Gene order networks Genetic interaction networks [Boone]
Other types of networks Disease Spread [Krebs] Social Network Food Web Electronic Circuit Internet [Burch & Cheswick]
Part 2: Graphs, Networks
Graph definition
Topological properties of graphs
- Degree of a node
- Clustering coefficient
- Characteristic path length
Random networks
Small World networks
Scale Free networks
Graph: a pair of sets G = (P, E), where P is a set of nodes and E is a set of edges, each connecting two elements of P.
Graphs can be directed or undirected.
Large, complex networks are ubiquitous in the world:
- Genetic networks
- Nervous system
- Social interactions
- World Wide Web
Degree of a node: the number of edges incident on the node. In the example shown, the degree of node i = 5.
Clustering coefficient (LOCAL property). The clustering coefficient C_i of node i is the ratio of the number of edges that exist among its neighbours to the number of edges that could exist among them. In the example shown, the clustering coefficient of node i = 1/6. The clustering coefficient C for the entire network is the average of all the C_i.
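The degree and clustering coefficient definitions can be sketched in a few lines of Python. This is a minimal illustration, assuming an undirected graph stored as a dictionary of neighbour sets; the function names and the toy graph are hypothetical and not taken from the slides. Nodes with fewer than two neighbours are given a coefficient of 0 here, which is one common convention.

```python
from itertools import combinations

def degree(adj, i):
    """Number of edges incident on node i (undirected graph, no self-loops)."""
    return len(adj[i])

def local_clustering(adj, i):
    """Edges that exist among the neighbours of i, over edges that could exist."""
    neighbours = adj[i]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumed convention: no neighbour pairs means coefficient 0
    possible = k * (k - 1) / 2
    existing = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return existing / possible

def average_clustering(adj):
    """Network clustering coefficient C: the mean of the local coefficients."""
    return sum(local_clustering(adj, i) for i in adj) / len(adj)

# Hypothetical toy graph: node "i" has 4 neighbours and a single edge (a-b)
# among them, so 1 of the 6 possible neighbour pairs is connected.
adj = {
    "i": {"a", "b", "c", "d"},
    "a": {"i", "b"},
    "b": {"i", "a"},
    "c": {"i"},
    "d": {"i"},
}
print(degree(adj, "i"))            # 4
print(local_clustering(adj, "i"))  # 1/6
print(average_clustering(adj))
```

The toy graph is chosen so that node "i" reproduces a coefficient of 1/6.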
Characteristic path length (GLOBAL property). L_ij is the number of edges in the shortest path between vertices i and j. The characteristic path length L of a graph is the average of the L_ij over every possible pair (i,j). Networks with small values of L are said to have the “small world property”.
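A possible way to compute the characteristic path length is a breadth-first search from every node. The sketch below assumes an unweighted, undirected, connected graph stored as a dictionary of neighbour sets; the function names are illustrative, and unreachable pairs are simply skipped.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first search: number of edges on the shortest path from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def characteristic_path_length(adj):
    """Average shortest-path length over all pairs of distinct, reachable nodes."""
    total, pairs = 0, 0
    for s in adj:
        for t, d in shortest_path_lengths(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs

# Hypothetical example: a chain of four nodes 0-1-2-3.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(characteristic_path_length(chain))  # (1+2+3+1+2+1)/6 = 10/6
```

Each pair is counted in both directions, which leaves the average unchanged for an undirected graph.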
Models for networks of complex topology: Erdős-Rényi (1960), Watts-Strogatz (1998), Barabási-Albert (1999)
The Erdős-Rényi [ER] model (1960)
Start with N vertices and no edges.
Connect each pair of vertices with probability P_ER.
Important result: many properties of these graphs appear quite suddenly, at a threshold value of P_ER(N):
- If P_ER ~ c/N with c < 1, then almost all vertices belong to isolated trees
- Cycles of all orders appear at P_ER ~ 1/N
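The ER construction is short enough to write out directly. The following is a minimal sketch using only the Python standard library; the function name and the `seed` parameter are assumptions made for illustration.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Start with n vertices and no edges; connect each pair with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

# Usage: with p = c/N the expected degree is roughly c.
n, c = 200, 0.5
graph = erdos_renyi(n, c / n, seed=0)
mean_degree = sum(len(nb) for nb in graph.values()) / n
print(mean_degree)
```

Varying c around 1 in this sketch is one way to explore the threshold behaviour described above.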
The Watts-Strogatz [WS] model (1998)
Start with a regular network with N vertices.
Rewire each edge with probability p.
For p = 0 (Regular Networks): high clustering coefficient, high characteristic path length.
For p = 1 (Random Networks): low clustering coefficient, low characteristic path length.
QUESTION: What happens for intermediate values of p?
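The rewiring procedure can be sketched as follows. This is an illustrative version, assuming the regular network is a ring in which each vertex is linked to its k nearest neighbours (k even, n > k); the function name, the `seed` parameter, and the choice to forbid self-loops and duplicate edges are assumptions of the sketch.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Ring lattice of n vertices with k neighbours each, rewired with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Regular network: link every vertex to its k/2 nearest neighbours on each side.
    for v in range(n):
        for offset in range(1, k // 2 + 1):
            w = (v + offset) % n
            adj[v].add(w)
            adj[w].add(v)
    # Visit each lattice edge (v, w) once and rewire its far end with probability p.
    for offset in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + offset) % n
            if rng.random() < p:
                candidates = [c for c in range(n) if c != v and c not in adj[v]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj

regular = watts_strogatz(20, 4, 0.0, seed=1)      # p = 0: unchanged lattice
small_world = watts_strogatz(20, 4, 0.1, seed=1)  # a few long-range shortcuts
```

Rewiring moves edges without adding or removing any, so the number of edges stays at n·k/2 for every p.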
1) There is a broad interval of p for which L is small but C remains large
2) Small world networks are common
The Barabási-Albert [BA] model (1999)
Look at the distribution of degrees. [Figure panels: ER model, WS model, actors, power grid, www]
In the ER and WS models, the probability of finding a highly connected node decreases exponentially with k.
Two problems with the previous models:
1. N does not vary
2. The probability that two vertices are connected is uniform
GROWTH: starting with a small number of vertices m0, at every timestep add a new vertex with m ≤ m0 edges.
PREFERENTIAL ATTACHMENT: the probability Π that a new vertex will be connected to vertex i depends on the connectivity k_i of that vertex: Π(k_i) = k_i / Σ_j k_j
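Growth and preferential attachment can be combined in a short simulation. The sketch below is one possible implementation, not the authors' code: it assumes m0 = m isolated seed vertices, lets the first new vertex link to all of them (since degree-proportional choice is undefined while every degree is zero), and samples targets from a list in which each vertex appears once per incident edge, so the chance of picking vertex i is proportional to its degree.

```python
import random

def barabasi_albert(n, m, seed=None):
    """Grow a graph to n vertices; each new vertex brings m edges."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(m)}  # assumed seed graph: m0 = m isolated vertices
    targets = list(range(m))            # the first new vertex links to every seed vertex
    endpoints = []                      # each vertex listed once per incident edge
    for new in range(m, n):
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
        endpoints.extend(targets)
        endpoints.extend([new] * m)
        # Preferential attachment: a uniform draw from `endpoints` selects
        # vertex i with probability k_i / sum_j k_j; repeat until m distinct targets.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return adj

graph = barabasi_albert(500, 3, seed=0)
print(max(len(nb) for nb in graph.values()))  # degree of the largest hub
```

Comparing the degree histogram of this graph with that of an ER graph of the same size is a simple way to see the difference between the two degree distributions.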
Scale Free Networks
a) Connectivity distribution with N = m0 + t = and m0 = m = 1 (circles), m0 = m = 3 (squares), m0 = m = 5 (diamonds) and m0 = m = 7 (triangles)
b) P(k) for m0 = m = 5 and system size N= (circles), N= (squares) and N= (diamonds)
Part 3: Machine Learning
Artificial Intelligence/Machine Learning
Definition of Learning
3 types of learning:
1. Supervised learning
2. Unsupervised learning
3. Reinforcement Learning
Classification problems, regression problems
Occam’s razor
Estimating generalization
Some important topics:
1. Naïve Bayes
2. Probability density estimation
3. Linear discriminants
4. Non-linear discriminants (Decision Trees, Support Vector Machines)
Classification Problems
PROBLEM: we are given a sample and we have to decide whether it is an a or a b.
Bayes’ Rule: minimum classification error is achieved by selecting the class with the largest posterior probability.
Regression Problems
PROBLEM: we are only given the red points, and we would like to approximate the blue curve (e.g. with polynomial functions).
QUESTION: which solution should I pick? And why?
Naïve Bayes
Example: given a set of features for each gene, predict whether it is essential.
[Table: one row per gene (Gene 1 … Gene n); columns F1, F2, F3, …, Fn and TARGET]
Bayes Rule: select the class with the highest posterior probability.
For a problem with two classes C1 and C2 and an observation x this becomes:
if P(C1 | x) > P(C2 | x) then choose class C1, otherwise choose class C2.
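Naïve Bayes approximation: the features x_1, …, x_n are treated as independent given the class, P(x | C) ≈ Π_i P(x_i | C).
For a two-class problem (classes C1 and C2): choose class C1 if [P(C1) / P(C2)] · Π_i L_i > 1,
where L_i = P(x_i | C1) / P(x_i | C2) is called the Likelihood Ratio for feature i.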
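The two-class Naïve Bayes rule amounts to multiplying the prior odds by the per-feature likelihood ratios. The sketch below illustrates this under stated assumptions: the function names, the class labels "positive" and "negative", and the numbers in the usage lines are hypothetical.

```python
from math import prod

def likelihood_ratio(p_feature_given_pos, p_feature_given_neg):
    """L_i: how much more likely the observed feature value is in the positive class."""
    return p_feature_given_pos / p_feature_given_neg

def naive_bayes_decision(prior_pos, ratios):
    """Choose the class with the larger posterior, assuming independent features."""
    prior_odds = prior_pos / (1 - prior_pos)
    posterior_odds = prior_odds * prod(ratios)
    return "positive" if posterior_odds > 1 else "negative"

# Hypothetical numbers: prior odds of 0.25 are overcome only by strong enough evidence.
print(naive_bayes_decision(0.2, [2.0, 3.0]))  # posterior odds 1.5
print(naive_bayes_decision(0.2, [2.0, 1.5]))  # posterior odds 0.75
```

Working with odds avoids computing the normalising constant of the posterior, since only the comparison between the two classes matters.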
Probability density estimation
- Assume a certain probabilistic model for each class
- Learn the parameters for each model (EM algorithm)
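As a small illustration of the density-estimation approach, the sketch below assumes a one-dimensional Gaussian model for each class and fits it directly from labelled samples with `statistics.NormalDist.from_samples`. This closed-form fit stands in for EM, which is needed for models with hidden variables such as mixtures; the data and function names are hypothetical.

```python
from statistics import NormalDist

def fit_class_models(samples_by_class):
    """Fit one Gaussian per class from its labelled samples."""
    return {label: NormalDist.from_samples(values)
            for label, values in samples_by_class.items()}

def classify(models, priors, x):
    """Pick the class maximising prior times class-conditional density."""
    return max(models, key=lambda label: priors[label] * models[label].pdf(x))

# Hypothetical one-dimensional measurements for two classes.
training = {
    "a": [1.0, 1.2, 0.8, 1.1],
    "b": [3.0, 3.3, 2.7, 3.1],
}
models = fit_class_models(training)
priors = {"a": 0.5, "b": 0.5}
print(classify(models, priors, 0.9))
print(classify(models, priors, 3.2))
```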
Linear discriminants
- Assume a specific functional form for the discriminant function
- Learn its parameters
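One classic way to learn the parameters of a linear discriminant is the perceptron rule, sketched below. The slides do not say which learning rule is meant, so the choice of the perceptron, the function names, and the toy data are all assumptions made for illustration. The discriminant has the form w·x + b, and labels are coded as +1 and -1.

```python
def train_perceptron(samples, labels, epochs=50, lr=1.0):
    """Learn weights w and bias b of the discriminant w.x + b (labels are +1 or -1)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified or on the boundary: update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    """Sign of the discriminant function."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Hypothetical linearly separable data: only (1, 1) is in the positive class.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]
w, b = train_perceptron(samples, labels)
print([predict(w, b, x) for x in samples])
```

The rule only converges when the classes are linearly separable, which is the case for this toy data.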
Decision Trees (C4.5, CART)
ISSUES:
- how to choose the “best” attribute
- how to prune the tree
Trees can be converted into rules!
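One common criterion for choosing the "best" attribute is information gain, the reduction in label entropy after splitting on that attribute. The sketch below shows this criterion only as an example: C4.5 uses a normalised variant (gain ratio) and CART typically uses Gini impurity. The attribute names and data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting on attribute."""
    n = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels):
    """Attribute with the highest information gain."""
    return max(rows[0], key=lambda a: information_gain(rows, labels, a))

# Hypothetical gene data: "conserved" separates the classes perfectly.
rows = [
    {"high_expression": True, "conserved": True},
    {"high_expression": True, "conserved": False},
    {"high_expression": False, "conserved": True},
    {"high_expression": False, "conserved": False},
]
labels = ["essential", "nonessential", "essential", "nonessential"]
print(best_attribute(rows, labels))
```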
Part 4: Network Predictions
Naïve Bayes for inferring Protein-Protein Interactions
The data [Jansen, Yu, et al., Science; Yu, et al., Genome Res.]
[Figure legend: Network; Gold-Standards (Gold-standard +, Gold-standard –); Feature 1, e.g. co-expression; Feature 2, e.g. same function]
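Likelihood Ratio for Feature i: L_i = (fraction of gold-standard + pairs that have feature i) / (fraction of gold-standard – pairs that have feature i)
[Figure legend: Network; Gold-Standards (Gold-standard +, Gold-standard –); Feature 1, e.g. co-expression; Feature 2, e.g. same function]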
Likelihood Ratio for Feature i: L1 = (4/4)/(3/6) = 2
[Figure legend: Network; Gold-Standards (Gold-standard +, Gold-standard –); Feature 1, e.g. co-expression; Feature 2, e.g. same function]
Likelihood Ratio for Feature i: L1 = (4/4)/(3/6) = 2, L2 = (3/4)/(3/6) = 1.5
For each protein pair: LR = L1 × L2, so log(LR) = log(L1) + log(L2)
[Figure legend: Network; Gold-Standards (Gold-standard +, Gold-standard –); Feature 1, e.g. co-expression; Feature 2, e.g. same function]
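The arithmetic on this slide can be checked with a few lines of Python. The function name and argument names are illustrative; the counts are the ones in the slide (4 gold-standard positives, 6 gold-standard negatives).

```python
from math import isclose, log

def likelihood_ratio(pos_with, pos_total, neg_with, neg_total):
    """Fraction of gold-standard positives with the feature over the same fraction for negatives."""
    return (pos_with / pos_total) / (neg_with / neg_total)

L1 = likelihood_ratio(4, 4, 3, 6)  # feature 1, e.g. co-expression: 2.0
L2 = likelihood_ratio(3, 4, 3, 6)  # feature 2, e.g. same function: 1.5
LR = L1 * L2                       # combined ratio for a pair having both features: 3.0

print(L1, L2, LR)
print(isclose(log(LR), log(L1) + log(L2)))  # True: log ratios add
```

Summing log ratios rather than multiplying raw ratios is a convenient way to combine many features without numerical overflow or underflow.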
1. Individual features are weak predictors (LR ~ 10).
2. Bayesian integration is much more powerful: an LR cutoff of 600 gives ~9000 interactions.