Part 1: Biological Networks 1.Protein-protein interaction networks 2.Regulatory networks 3.Expression networks 4.Metabolic networks 5.… more biological.

Part 1: Biological Networks
1. Protein-protein interaction networks
2. Regulatory networks
3. Expression networks
4. Metabolic networks
5. … more biological networks
6. Other types of networks

Expression networks [Qian, et al., J. Mol. Bio., 314: ]

Regulatory networks [Horak, et al, Genes & Development, 16: ]

Expression networks, Regulatory networks

Expression networks, Regulatory networks, Interaction networks

Metabolic networks [DeRisi, Iyer, and Brown, Science, 278: ]

Expression networks Regulatory networks Interaction networks Metabolic networks

... more biological networks
Hierarchies & DAGs [Enzyme, Bairoch; GO, Ashburner; MIPS, Mewes, Frishman]

... more biological networks
Neural networks [Cajal]
Gene order networks
Genetic interaction networks [Boone]

Other types of networks
Disease Spread [Krebs], Social Network, Food Web, Electronic Circuit, Internet [Burch & Cheswick]

Part 2: Graphs, Networks
Graph definition
Topological properties of graphs
- Degree of a node
- Clustering coefficient
- Characteristic path length
Random networks
Small World networks
Scale Free networks

Graph: a pair of sets G = {P, E}, where P is a set of nodes and E is a set of edges, each connecting two elements of P.
Graphs can be directed or undirected.
Large, complex networks are ubiquitous in the world:
- Genetic networks
- Nervous system
- Social interactions
- World Wide Web

Degree of a node: the number of edges incident on node i.
Example: degree of node i = 5
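The graph and degree definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the representation used in the lecture: the toy edge list and the helper names `build_adjacency` and `degree` are assumptions made for the example, chosen so that node "i" has degree 5 as on the slide.

```python
# Hypothetical toy graph: node "i" has five incident edges, as in the slide's example.
edges = [("i", "a"), ("i", "b"), ("i", "c"), ("i", "d"), ("i", "e"), ("a", "b")]


def build_adjacency(edge_list):
    """Return {node: set of neighbours} for an undirected graph G = {P, E}."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency


def degree(adjacency, node):
    """Number of edges incident on the node (no self-loops or parallel edges assumed)."""
    return len(adjacency[node])


adjacency = build_adjacency(edges)
print(degree(adjacency, "i"))
```

An adjacency-set dictionary is only one possible choice; it keeps neighbour lookups cheap, which the later topological measures rely on.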

Clustering coefficient  LOCAL property The clustering coefficient of node i is the ratio of the number of edges that exist among its neighbours, over the number of edges that could exist Clustering coefficient of node i = 1/6 The clustering coefficient for the entire network C is the average of all the

Characteristic path length  GLOBAL property is the number of edges in the shortest path between vertices i and j The characteristic path length L of a graph is the average of the for every possible pair (i,j) i j Networks with small values of L are said to have the “small world property”

Models for networks of complex topology
- Erdős-Rényi (1960)
- Watts-Strogatz (1998)
- Barabási-Albert (1999)

The Erdős-Rényi [ER] model (1960)
Start with N vertices and no edges.
Connect each pair of vertices with probability p_ER.
Important result: many properties of these graphs appear quite suddenly, at a threshold value of p_ER(N):
- If p_ER ~ c/N with c < 1, then almost all vertices belong to isolated trees
- Cycles of all orders appear at p_ER ~ 1/N
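The ER construction described above translates almost directly into code. The following is a minimal sketch; the function name `erdos_renyi`, the `seed` parameter, and the example sizes are assumptions for illustration.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Start with n vertices and no edges; connect each pair independently with probability p."""
    rng = random.Random(seed)
    adjacency = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adjacency[u].add(v)
            adjacency[v].add(u)
    return adjacency


# Example: a sparse graph below the threshold (p = c/N with c = 0.5).
graph = erdos_renyi(200, 0.5 / 200, seed=1)
edge_count = sum(len(neighbours) for neighbours in graph.values()) // 2
print(edge_count)
```

Varying `p` around 1/N and inspecting the component sizes is one way to observe the sudden change in structure that the slide refers to.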

The Watts-Strogatz [WS] model (1998)
Start with a regular network with N vertices.
Rewire each edge with probability p.
For p = 0 (regular networks): high clustering coefficient, high characteristic path length.
For p = 1 (random networks): low clustering coefficient, low characteristic path length.
QUESTION: What happens for intermediate values of p?
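A possible sketch of the WS procedure follows. The slide only says "a regular network"; the ring lattice in which each vertex is linked to its `k` nearest neighbours, and the rule that a rewired edge keeps one endpoint and avoids self-loops and duplicate edges, are assumptions taken from the usual formulation of the model. The names `watts_strogatz`, `k`, and `seed` are illustrative.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Ring lattice of n vertices (k nearest neighbours each, k even, k < n),
    then rewire each lattice edge with probability p."""
    rng = random.Random(seed)
    adjacency = {v: set() for v in range(n)}
    # Regular starting network: connect each vertex to k/2 neighbours on each side.
    for u in range(n):
        for offset in range(1, k // 2 + 1):
            v = (u + offset) % n
            adjacency[u].add(v)
            adjacency[v].add(u)
    # Visit every lattice edge (u, u + offset) once and rewire it with probability p.
    for offset in range(1, k // 2 + 1):
        for u in range(n):
            v = (u + offset) % n
            if v in adjacency[u] and rng.random() < p:
                candidates = [w for w in range(n) if w != u and w not in adjacency[u]]
                if candidates:
                    w = rng.choice(candidates)
                    adjacency[u].discard(v)
                    adjacency[v].discard(u)
                    adjacency[u].add(w)
                    adjacency[w].add(u)
    return adjacency


regular = watts_strogatz(20, 4, 0.0, seed=0)     # p = 0: lattice unchanged
small_world = watts_strogatz(20, 4, 0.1, seed=0)  # intermediate p
```

Because each rewiring removes one edge and adds one, the total number of edges stays at n·k/2 for every value of p.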

1) There is a broad interval of p for which L is small but C remains large
2) Small world networks are common

The Barabási-Albert [BA] model (1999)
Look at the distribution of degrees.
[Figure panels: ER model, WS model, actors, power grid, www]
In the ER and WS models, the probability of finding a highly connected node decreases exponentially with k.

Two problems with the previous models:
1. N does not vary
2. The probability that two vertices are connected is uniform
GROWTH: starting with a small number of vertices m_0, at every timestep add a new vertex with m ≤ m_0 edges.
PREFERENTIAL ATTACHMENT: the probability Π that a new vertex will be connected to vertex i depends on the connectivity k_i of that vertex:
$\Pi(k_i) = \frac{k_i}{\sum_j k_j}$
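Growth and preferential attachment can be sketched as below. Two details are assumptions rather than part of the slide: the network starts with m_0 = m isolated vertices and the first new vertex links to all of them (their degrees are zero, so degree-proportional choice is undefined at that point), and each new vertex picks m distinct targets. Sampling uniformly from a list in which vertex i appears once per incident edge gives a probability proportional to its degree k_i.

```python
import random


def barabasi_albert(n, m, seed=None):
    """Grow a network to n vertices; each new vertex brings m edges attached
    preferentially to high-degree vertices."""
    if not 1 <= m < n:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    adjacency = {v: set() for v in range(m)}  # assumed seed network: m_0 = m vertices
    targets = list(range(m))                  # first new vertex links to all seed vertices
    endpoints = []                            # vertex i appears k_i times in this list
    for new in range(m, n):
        adjacency[new] = set()
        for t in targets:
            adjacency[new].add(t)
            adjacency[t].add(new)
            endpoints.extend([new, t])
        # Choose m distinct targets for the next vertex, proportionally to degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = list(chosen)
    return adjacency


network = barabasi_albert(50, 3, seed=1)
degrees = sorted((len(nb) for nb in network.values()), reverse=True)
print(degrees[:5])
```

Plotting a histogram of `degrees` on log-log axes for a much larger `n` is the usual way to check for the heavy-tailed distribution that the model is meant to produce.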

a) Connectivity distribution with N = m_0 + t = and m_0 = m = 1 (circles), m_0 = m = 3 (squares), m_0 = m = 5 (diamonds) and m_0 = m = 7 (triangles)
b) P(k) for m_0 = m = 5 and system size N = (circles), N = (squares) and N = (diamonds)
→ Scale Free Networks

Part 3: Machine Learning
Artificial Intelligence/Machine Learning
Definition of Learning
3 types of learning
1. Supervised learning
2. Unsupervised learning
3. Reinforcement Learning
Classification problems, regression problems
Occam’s razor
Estimating generalization
Some important topics:
1. Naïve Bayes
2. Probability density estimation
3. Linear discriminants
4. Non-linear discriminants (Decision Trees, Support Vector Machines)

Classification Problems
PROBLEM: we are given an input pattern and we have to decide whether it is an a or a b.
Bayes’ Rule: minimum classification error is achieved by selecting the class with the largest posterior probability.

Regression Problems
PROBLEM: we are only given the red points, and we would like to approximate the blue curve (e.g. with polynomial functions).
QUESTION: which solution should I pick? And why?
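One common way to approach the question above is to fit polynomials of several degrees and compare their error on points that were not used for fitting. The sketch below is only an illustration of that idea under stated assumptions: the data are made up, the helper names are hypothetical, and the fit uses the normal equations with plain Gaussian elimination, which is adequate for low degrees but numerically fragile for high ones.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] via the normal equations."""
    size = degree + 1
    a = [[sum(x ** (r + c) for x in xs) for c in range(size)] for r in range(size)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(size)]
    # Gaussian elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for k in range(col, size):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * size
    for row in reversed(range(size)):
        tail = sum(a[row][k] * coeffs[k] for k in range(row + 1, size))
        coeffs[row] = (b[row] - tail) / a[row][row]
    return coeffs


def predict(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))


def mean_squared_error(coeffs, xs, ys):
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical noisy samples of a roughly quadratic curve, split into fit and held-out sets.
train_x, train_y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 2.9, 9.2, 18.8, 33.1, 50.9]
test_x, test_y = [0.5, 2.5, 4.5], [1.6, 13.4, 41.2]
for degree in (1, 2, 4):
    coeffs = polyfit(train_x, train_y, degree)
    print(degree, mean_squared_error(coeffs, train_x, train_y),
          mean_squared_error(coeffs, test_x, test_y))
```

Training error can only go down as the degree increases, so the held-out error is the more informative number; preferring the simplest model that does well there is one reading of Occam's razor.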

Naïve Bayes
Example: given a set of features for each gene, predict whether it is essential.
[Table: one row per gene (Gene 1 … Gene n); columns F_1, F_2, F_3, …, F_n and TARGET]

Bayes Rule: select the class with the highest posterior probability.
For a problem with two classes this becomes:
if $P(C_1 \mid x) > P(C_2 \mid x)$ then choose class $C_1$; otherwise, choose class $C_2$

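Naïve Bayes approximation: $P(x_1, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)$
For a two classes problem:
$\frac{P(C_1 \mid x)}{P(C_2 \mid x)} = \frac{P(C_1)}{P(C_2)} \prod_{i=1}^{n} L_i$
where $L_i = \frac{P(x_i \mid C_1)}{P(x_i \mid C_2)}$ is called the Likelihood Ratio for feature i.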
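Under the naïve Bayes assumption of conditionally independent features, the two-class decision reduces to multiplying the prior odds by the per-feature likelihood ratios, or equivalently adding their logarithms. The sketch below illustrates this; the function names, class labels, and example numbers are hypothetical.

```python
import math


def log_posterior_odds(prior_odds, likelihood_ratios):
    """log of (prior odds x product of per-feature likelihood ratios)."""
    return math.log(prior_odds) + sum(math.log(ratio) for ratio in likelihood_ratios)


def choose_class(prior_odds, likelihood_ratios):
    """Pick class 1 when the posterior odds exceed 1, i.e. when the log-odds are positive."""
    if log_posterior_odds(prior_odds, likelihood_ratios) > 0:
        return "class 1"
    return "class 2"


# Equal priors: two mildly informative features favour class 1.
print(choose_class(1.0, [2.0, 1.5]))
# A rare class 1 (prior odds 1:10): the same evidence is no longer enough.
print(choose_class(0.1, [2.0, 1.5]))
```

Working in log space avoids numerical underflow when many small ratios are multiplied together.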

Probability density estimation
- Assume a certain probabilistic model for each class
- Learn the parameters for each model (EM algorithm)

Linear discriminants
- Assume a specific functional form for the discriminant function
- Learn its parameters

Decision Trees (C4.5, CART)
ISSUES:
- how to choose the “best” attribute
- how to prune the tree
Trees can be converted into rules!

Part 4: Network Predictions
Naïve Bayes for inferring Protein-Protein Interactions

Network Gold-Standards
The data [Jansen, Yu, et al., Science; Yu, et al., Genome Res.]
[Figure: protein pairs in the Gold-standard + and Gold-standard – sets, annotated with Feature 1 (e.g. co-expression) and Feature 2 (e.g. same function)]

Network Gold-Standards
Likelihood Ratio for Feature i:
$L_i = \frac{\text{fraction of Gold-standard + pairs having feature } i}{\text{fraction of Gold-standard – pairs having feature } i}$

Network Gold-Standards
Likelihood Ratio for Feature 1: L_1 = (4/4)/(3/6) = 2

Network Gold-Standards
Likelihood Ratio for Feature i:
L_1 = (4/4)/(3/6) = 2
L_2 = (3/4)/(3/6) = 1.5
For each protein pair:
LR = L_1 × L_2
log(LR) = log(L_1) + log(L_2)
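The arithmetic on this slide can be reproduced directly. The sketch below assumes the counts are read as follows: 4 of 4 gold-standard positive pairs and 3 of 6 gold-standard negative pairs have Feature 1, while 3 of 4 positives and 3 of 6 negatives have Feature 2. The function and argument names are illustrative.

```python
import math


def likelihood_ratio(pos_with_feature, pos_total, neg_with_feature, neg_total):
    """Fraction of gold-standard positives with the feature over the same fraction for negatives."""
    return (pos_with_feature / pos_total) / (neg_with_feature / neg_total)


l1 = likelihood_ratio(4, 4, 3, 6)  # Feature 1, e.g. co-expression
l2 = likelihood_ratio(3, 4, 3, 6)  # Feature 2, e.g. same function

# Naïve Bayes integration for a protein pair that has both features.
combined = l1 * l2
log_combined = math.log(l1) + math.log(l2)
print(l1, l2, combined)
```

In practice a feature with zero gold-standard negatives would make the ratio undefined, so real pipelines typically need some smoothing of the counts; that step is omitted here.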

1. Individual features are weak predictors, LR ~ 10
2. Bayesian integration is much more powerful: LR cutoff = 600, ~9000 interactions