DYNAMICS OF COMPLEX SYSTEMS Self-similar phenomena and Networks Guido Caldarelli CNR-INFM Istituto dei Sistemi Complessi 4/6
STRUCTURE OF THE COURSE
1. SELF-SIMILARITY (ORIGIN AND NATURE OF POWER-LAWS)
2. GRAPH THEORY AND DATA
3. SOCIAL AND FINANCIAL NETWORKS
4. MODELS
5. INFORMATION TECHNOLOGY
6. BIOLOGY
STRUCTURE OF THE FOURTH LECTURE
4.1) DEFINITION OF THE MODELS
4.2) RANDOM GRAPHS
4.3) SMALL WORLD
4.4) MULTIPLICATIVE PROCESSES
4.5) BARABASI-ALBERT
4.6) REWIRING
4.7) FITNESS
4.1 MODELS DEFINITIONS
Standard Theory of Random Graphs (Erdős and Rényi 1960)
Random Graphs are built starting from n vertices. Every pair of vertices is connected by an edge with probability p.
Degrees are Poisson distributed. [Figure: P(k) versus k]
Small World (D. Watts and S.H. Strogatz 1998)
Small World Graphs are built by adding shortcuts to regular lattices.
Degrees are peaked around the mean value.
4.1 MODELS DEFINITIONS
“Intrinsic” Fitness/Static/Hidden variable Models (K.-I. Goh, B. Kahng, D. Kim 2001; G. Caldarelli, A. Capocci, P. De Los Rios, M.A. Muñoz 2002)
1) Growth or not: nodes can be fixed at the beginning or be added.
2) Attachment is related to intrinsic properties: the probability to be connected depends on the sites.
Degrees are Power law distributed.
Model of Growing Networks (A.-L. Barabási, R. Albert 1999)
1) Growth: every time step new nodes enter the system.
2) Preferential Attachment: the probability to be connected depends on the degree.
Degrees are Power law distributed. [Figure: P(k) versus k]
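4.2 RANDOM GRAPHS
The number m of edges in a Random Graph is a random variable whose expectation value is
$\langle m \rangle = p\,\frac{N(N-1)}{2}$
The probability to form a particular Graph G(N,m) is given by
$P(G(N,m)) = p^{m}(1-p)^{\frac{N(N-1)}{2}-m}$
The degree has expectation value
$\langle k \rangle = p(N-1) \simeq pN$
It is easy to check that the degree probability distribution is given by
$P(k) = \binom{N-1}{k}\,p^{k}(1-p)^{N-1-k} \simeq \frac{\langle k \rangle^{k} e^{-\langle k \rangle}}{k!}$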
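4.2 RANDOM GRAPHS
We can give an estimate of the Clustering Coefficient: for a complete graph it must be 1. If the graph is sparse enough, two points are linked to each other with probability p, so
$C \simeq p \simeq \frac{\langle k \rangle}{N}$
The same kind of estimate can be given for the average distance l between two vertices. If a graph has average degree $\langle k \rangle$ then
the first neighbours will be $\sim \langle k \rangle$
the second neighbours $\sim \langle k \rangle^{2}$
……………..
the n-th neighbours $\sim \langle k \rangle^{n}$
For the Diameter D: $\langle k \rangle^{D}$ is of order N → $D \simeq \frac{\ln N}{\ln \langle k \rangle}$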
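The Poisson form of the degree distribution can be checked numerically. The following is a minimal sketch, not part of the lecture: the function names, the seed and the values of n and p are illustrative assumptions. It builds one G(n, p) graph and prints the measured degree frequencies next to the Poisson values with the same mean.

```python
import math
import random
from collections import Counter

def erdos_renyi(n, p, rng):
    """Edge list of a G(n, p) graph: every pair is linked with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

def degree_histogram(n, edges):
    """Return {k: fraction of vertices with degree k}."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    counts = Counter(deg)
    return {k: c / n for k, c in sorted(counts.items())}

def poisson(k, mean):
    """Poisson probability of k for the given mean."""
    return mean ** k * math.exp(-mean) / math.factorial(k)

rng = random.Random(0)      # fixed seed (assumption, for reproducibility)
n, p = 1000, 0.006          # illustrative values, not from the lecture
edges = erdos_renyi(n, p, rng)
mean_k = 2 * len(edges) / n # measured average degree, close to p * (n - 1)
hist = degree_histogram(n, edges)

for k in range(13):
    print(k, round(hist.get(k, 0.0), 4), round(poisson(k, mean_k), 4))
```

The two printed columns should be close for a sparse graph; the agreement degrades when p is not small.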
4.3 SMALL WORLD
Take a regular lattice and rewire some of the links with probability $\phi$ (for analytical treatment, a slight modification is recommended: instead of rewiring, add new links in proportion to the existing links).
The total number of shortcuts is $\phi c N$ (for a ring of N vertices, each linked to its c nearest neighbours on either side)
Average degree is now $\langle k \rangle = 2c(1+\phi)$
Therefore for small $\phi$ the degree distribution is peaked around 2c
4.3 SMALL WORLD
Clustering Coefficient of the regular lattice ($\phi \to 0$ and lattice degree $2c < \frac{2}{3}N$, otherwise C=1):
$C = \frac{3(c-1)}{2(2c-1)}$
For the average distance there is no exact result, but we can define a distance in the problem, given by the mean distance between two shortcut endpoints.
In the regular lattice (start with c=1 and generalize) we have $l = \frac{N}{4}$ → $l = \frac{N}{4c}$
In the Random Graph we have $l \simeq \frac{\ln N}{\ln \langle k \rangle}$
4.3 SMALL WORLD
Now in Small World graphs, the behaviour must be intermediate between the regular lattice and the Random Graph.
We can define a characteristic length in the system, for example
$\xi$ = average distance between two endpoints of shortcuts (not of the same shortcut!)
$\xi$ diverges when the shortcut probability → 0
$\xi$ is the characteristic distance we can define in the model, so that we make the ansatz
$l = \xi\,F\!\left(\frac{N}{\xi}\right)$, with F a scaling function
Several conjectures have been made, but neither the actual distribution of path lengths nor the exact average distance l has been found.
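The intermediate behaviour can be seen numerically on a small example. This is a sketch under stated assumptions: it uses the "add shortcuts" variant rather than rewiring, the names `ring_with_shortcuts`, `mean_distance`, `phi` and `c` are chosen here for illustration, and the sizes are arbitrary. It compares the average distance of a plain ring with that of the same ring plus a few random shortcuts.

```python
import random
from collections import deque

def ring_with_shortcuts(n, c, phi, rng):
    """Ring of n vertices, each linked to c neighbours on either side,
    plus one random shortcut with probability phi per lattice link."""
    adj = [set() for _ in range(n)]
    for i in range(n):
        for d in range(1, c + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    for _ in range(n * c):              # the ring has n * c lattice links
        if rng.random() < phi:
            a, b = rng.randrange(n), rng.randrange(n)
            if a != b:
                adj[a].add(b)
                adj[b].add(a)
    return adj

def mean_distance(adj):
    """Average shortest-path length over all pairs (graph must be connected)."""
    n = len(adj)
    total = 0
    for source in range(n):
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:                    # breadth-first search from source
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist)
    return total / (n * (n - 1))

rng = random.Random(0)                  # fixed seed (assumption)
n, c = 400, 1                           # illustrative values
l_ring = mean_distance(ring_with_shortcuts(n, c, 0.0, rng))
l_sw = mean_distance(ring_with_shortcuts(n, c, 0.1, rng))
print("regular ring:", l_ring, "compare N/4 =", n / 4)
print("with shortcuts:", l_sw)
```

Since shortcuts are only added, the ring stays connected and the breadth-first search reaches every vertex.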
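4.4 MULTIPLICATIVE PROCESS
In a multiplicative process you have
$S(t) = \lambda(t)\,S(t-1) = \lambda(t)\,\lambda(t-1)\cdots\lambda(1)\,S(0)$
By the central limit theorem, the log of S(t) is normally distributed (with mean $\mu$ and variance $\sigma^{2}$), so S(t) is lognormal:
$P(S) = \frac{1}{S\sqrt{2\pi\sigma^{2}}}\exp\left[-\frac{(\ln S-\mu)^{2}}{2\sigma^{2}}\right]$
If the variance is large it can look like a power law with apparent slope $\left(\frac{\mu}{\sigma^{2}}-1\right)$:
$\ln P(S) = -\frac{(\ln S)^{2}}{2\sigma^{2}} + \left(\frac{\mu}{\sigma^{2}}-1\right)\ln S + \mathrm{const}$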
4.4 MULTIPLICATIVE PROCESS
In this case the apparent slope ($\mu/\sigma^{2}-1$, with $\mu$ and $\sigma^{2}$ the mean and variance of ln S) of the blue line is 0.6.
If there is a threshold on S, YOU OBTAIN REAL POWER-LAWS INSTEAD.
M. Mitzenmacher, Internet Mathematics (2004)
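A multiplicative process is easy to simulate. The sketch below is only an illustration: the uniform distribution of the random factors on (0.5, 1.5), the number of steps and the seed are assumptions, not values from the lecture. It estimates the mean and variance of ln S and the corresponding apparent slope.

```python
import math
import random
import statistics

def multiplicative_process(steps, rng, low=0.5, high=1.5):
    """Return S after `steps` multiplications by independent random factors.
    The uniform factor distribution is an illustrative assumption."""
    s = 1.0                                   # S(0) = 1
    for _ in range(steps):
        s *= rng.uniform(low, high)
    return s

rng = random.Random(1)                        # fixed seed (assumption)
steps, samples = 200, 2000                    # illustrative values
logs = [math.log(multiplicative_process(steps, rng)) for _ in range(samples)]

mu = statistics.mean(logs)                    # mean of ln S
sigma2 = statistics.pvariance(logs)           # variance of ln S
print("mean of ln S:", mu)
print("variance of ln S:", sigma2)
print("apparent slope mu/sigma^2 - 1:", mu / sigma2 - 1)
```

Both the mean and the variance of ln S grow linearly with the number of steps, which is the central limit theorem at work on the sum of the logarithms of the factors.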
4.5 BARABASI-ALBERT
This is by far the most successful and widely used model in the field, along with the related models:
FITNESS MODEL
REWIRING MODEL
AGING EFFECTS
TWO STEPS
1. GROWTH: Every time step you add a vertex
2. PREFERENTIAL ATTACHMENT: $\Pi(k_i) = \frac{k_i}{\sum_j k_j}$
This idea has been reformulated in different fields and has different names:
YULE PROCESS (G. Yule, Phil. Trans. Roy. Soc. (1925))
SIMON PROCESS (H.A. Simon, Biometrika (1955))
DE SOLLA PRICE MODEL (D.J. De Solla Price, Science (1965))
ST. MATTHEW EFFECT (R.K. Merton, Science (1968))
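4.5 BARABASI-ALBERT
Start with $m_0$ vertices and at every time step t add a vertex with m new links.
The basic approach is through continuum theory, where the degree is now a continuous variable:
$\frac{\partial k_i}{\partial t} = m\,\Pi(k_i) = m\,\frac{k_i}{\sum_j k_j} = \frac{k_i}{2t}$ → $k_i(t) = m\left(\frac{t}{t_i}\right)^{1/2}$
where $t_i$ is the time at which vertex i entered the system.
As for the degree distribution, we can compute
$P(k_i(t) < k) = P\left(t_i > \frac{m^{2}t}{k^{2}}\right)$
R. Albert, A.-L. Barabási, Reviews of Modern Physics (2002)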
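4.5 BARABASI-ALBERT
The distribution of incoming vertices is uniform in time
$P(t_i) = \frac{1}{m_0+t}$ → $P\left(t_i > \frac{m^{2}t}{k^{2}}\right) = 1 - \frac{m^{2}t}{k^{2}(m_0+t)}$
From which we obtain
$P(k) = \frac{\partial P(k_i(t) < k)}{\partial k} = \frac{2m^{2}t}{m_0+t}\,\frac{1}{k^{3}}$ → $\gamma = 3$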
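Growth plus preferential attachment can be sketched in a few lines. The code below is an illustration, not the authors' implementation: the seed graph (m + 1 fully connected vertices), the rule that the m targets of a new vertex are distinct, and the parameter values are assumptions. Preferential attachment is obtained by drawing uniformly from a list in which each vertex appears once per link end. Continuum theory suggests that the fraction of vertices with degree at least k falls roughly as (m/k)^2.

```python
import random
from collections import Counter

def barabasi_albert(n, m, rng):
    """Edge list of a growing network with linear preferential attachment."""
    # Seed graph (assumption): m + 1 vertices, all connected to each other.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each vertex appears in `stubs` once per link end, i.e. k_i times.
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:           # m distinct targets, chosen with prob. ~ k_i
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges

rng = random.Random(2)                    # fixed seed (assumption)
n, m = 5000, 3                            # illustrative values
edges = barabasi_albert(n, m, rng)
degree = Counter(v for edge in edges for v in edge)

for k in (m, 2 * m, 4 * m, 8 * m):
    tail = sum(1 for v in range(n) if degree[v] >= k) / n
    print("fraction with degree >=", k, ":", round(tail, 4),
          " (m/k)^2 =", round((m / k) ** 2, 4))
```

The measured fractions are expected to follow the (m/k)^2 trend only approximately, since the continuum estimate ignores the discreteness of the degree.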
4.5 BARABASI-ALBERT
The value of the exponent depends on the details of preferential attachment:
If $\Pi(k) \sim k^{\alpha}$ with $\alpha \neq 1$: NO POWER LAW
If $\Pi(k) \sim k + a$: $\gamma = 3 + a/m$
Clustering is larger than in Erdős-Rényi graphs (m>1…)
No clear Assortative/Disassortative behaviour
4.5 BARABASI-ALBERT: Fitness
TWO STEPS
1. GROWTH: Every time step you add a vertex
2. PREFERENTIAL ATTACHMENT: $\Pi(k_i) = \frac{\eta_i k_i}{\sum_j \eta_j k_j}$
But now vertices differ: some are good, some are bad. You measure that by assigning a “fitness” $\eta_i$.
For some choices of the distribution of fitnesses you still have a Power law degree distribution and also assortativeness.
Great success in reproducing the Internet (A. Vazquez, R. Pastor-Satorras, A. Vespignani, Phys. Rev. E (2002))
G. Bianconi, A.-L. Barabási, Europhys. Lett. (2001)
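The effect of fitness on growth can be illustrated with a small simulation. This is a sketch under assumptions: fitnesses are drawn uniformly in (0, 1], the attachment probability is taken proportional to fitness times degree, the seed graph is m + 1 fully connected vertices, and all names and parameter values are chosen here for illustration.

```python
import random

def fitness_growth(n, m, rng):
    """Grow a network where a vertex attracts links with probability
    proportional to fitness * degree. Returns (fitness, degree) lists."""
    # Fitness uniform in (0, 1]: an illustrative assumption.
    fitness = [1.0 - rng.random() for _ in range(n)]
    degree = [0] * n
    for i in range(m + 1):                # seed: m + 1 fully connected vertices
        degree[i] = m
    for new in range(m + 1, n):
        weights = [fitness[i] * degree[i] for i in range(new)]
        targets = set()
        while len(targets) < m:           # m distinct targets
            targets.add(rng.choices(range(new), weights=weights, k=1)[0])
        for t in targets:
            degree[t] += 1
        degree[new] = m
    return fitness, degree

rng = random.Random(5)                    # fixed seed (assumption)
n, m = 1500, 2                            # illustrative values
fitness, degree = fitness_growth(n, m, rng)

high = [degree[i] for i in range(n) if fitness[i] > 0.5]
low = [degree[i] for i in range(n) if fitness[i] <= 0.5]
print("mean degree, fitness > 0.5:", sum(high) / len(high))
print("mean degree, fitness <= 0.5:", sum(low) / len(low))
```

Vertices with larger fitness should end up, on average, with a larger degree, even when they enter the system late.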
4.5 BARABASI-ALBERT: Rewiring
TWO STEPS
With PROBABILITY p
1. GROWTH: Every time step you add a vertex
2. PREFERENTIAL ATTACHMENT
With PROBABILITY 1-p
1. REWIRING of existing nodes
P.L. Krapivsky, G.J. Rodgers, S. Redner, Phys. Rev. Lett. (2001)
M. Catanzaro, G. Caldarelli, L. Pietronero, Phys. Rev. E (2004)
4.5 BARABASI-ALBERT: Aging
Only m vertices enter in the dynamics. Those are the ACTIVE SITES.
DIFFERENT STEPS
1. GROWTH: Every time step you add a vertex. This new vertex draws a link to each of the m active vertices.
2. AGING: A vertex is deactivated with a probability proportional to $(k_i + a)^{-1}$
K. Klemm and V. M. Eguíluz, Phys. Rev. E (2002)
4.6 REWIRING
Consider the WWW. What is the “microscopic” process of growth?
You see a WWW page that you like (e.g. that of a friend of yours).
You copy it, and change it a little bit.
TWO STEPS
1. GROWTH: Every time step you copy a vertex and its m edges
2. MUTATION (for every one of the m edges)
With Probability $(1-\alpha)$ you keep it
With Probability $\alpha$ you change the destination vertex
R. Kumar et al., Computer Networks (1999)
A. Vazquez et al., Nature Biotechnology (2003)
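4.6 REWIRING
The rate of change of the in-degree of vertex i is given by
$\frac{\partial k_{in,i}}{\partial t} = (1-\alpha)\,\frac{k_{in,i}}{N} + \frac{m\alpha}{N}$
where $\alpha$ is the probability of changing the destination of a copied edge and N is the current number of vertices.
The first term is proportional to the in-degree, so it becomes clear that we have an effective preferential attachment.
It can be demonstrated (NOT HERE!) that the in-degree distribution is a power law:
$P(k_{in}) \sim k_{in}^{-\frac{2-\alpha}{1-\alpha}}$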
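The copy-and-mutate growth rule can be sketched directly. The code below is an illustration under assumptions: the seed graph (m + 1 vertices, each pointing to the other m), the name `alpha` for the mutation probability, the uniform choice of the new destination and the parameter values are not taken from the lecture. Multiple edges to the same destination are allowed for simplicity.

```python
import random
from collections import Counter

def copying_model(n, m, alpha, rng):
    """out[v] lists the m destinations of the edges leaving vertex v."""
    # Seed (assumption): m + 1 vertices, each pointing to the other m.
    out = [[u for u in range(m + 1) if u != v] for v in range(m + 1)]
    for new in range(m + 1, n):
        prototype = rng.randrange(new)        # the vertex being copied
        links = []
        for dest in out[prototype]:
            if rng.random() < alpha:          # mutation: pick a new destination
                dest = rng.randrange(new)
            links.append(dest)                # otherwise the copied edge is kept
        out.append(links)
    return out

rng = random.Random(3)                        # fixed seed (assumption)
n, m, alpha = 20000, 3, 0.5                   # illustrative values
out = copying_model(n, m, alpha, rng)
in_degree = Counter(dest for links in out for dest in links)

print("largest in-degrees:", [k for _, k in in_degree.most_common(5)])
print("vertices never pointed to:", n - len(in_degree))
```

A vertex that already has many incoming edges is more likely to be among the destinations of the copied vertex, which is where the effective preferential attachment comes from.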
4.6 REWIRING
In the completely different context of protein interaction networks, the same mechanism is in agreement with the current view of genome evolution.
When organisms reproduce, the duplication of their DNA is accompanied by mutations. Those mutations can sometimes entail the complete duplication of a gene. Since in this case the corresponding protein can be produced by two different copies of the same gene, point-like mutations on one of them can accumulate at a rate faster than normal, since a weaker selection pressure is applied. Consequently, proteins with new properties can arise by this process.
The new proteins arising by this mechanism share many physico-chemical properties with their ancestors. Many interactions remain unchanged, some are lost and some are acquired.
CLUSTERING VERY SIMILAR TO THAT OF THE WWW
4.7 FITNESS MODEL
Without introducing growth or preferential attachment we can have power-laws.
We consider “disorder” in the Random Graph model (i.e. vertices differ one from the other).
This mechanism is responsible for self-similarity in Laplacian Fractals.
[Figure: Dielectric Breakdown. Panels: "In reality", "In a perfect dielectric"]
K.-I. Goh, B. Kahng, D. Kim, Phys. Rev. Lett. (2001)
G. Caldarelli et al., Phys. Rev. Lett. (2002)
4.7 FITNESS MODEL
1. Assign to every vertex one real positive number x that we call fitness. Fitnesses are drawn from a probability distribution $\rho(x)$.
2. Link two vertices with fitnesses x and y according to a probability function f(x,y)=f(y,x) (choice function).
The model can be considered STATIC if N is kept fixed, DYNAMIC if N is growing.
This is a GOOD GETS RICHER model.
No preferential attachment is present.
V.D.P. Servedio, P. Buttà, G. Caldarelli, Phys. Rev. E (2004)
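The two steps above can be sketched for one concrete choice. This is an illustration under assumptions: fitnesses are drawn from an exponential distribution and the choice function is a threshold rule, f(x,y) = 1 if x + y > z and 0 otherwise, with the hypothetical threshold z = ln N; names and sizes are arbitrary.

```python
import math
import random

def threshold_fitness_graph(n, z, rng):
    """Static fitness model with exponential fitnesses and a threshold rule:
    two vertices are linked when the sum of their fitnesses exceeds z."""
    x = [rng.expovariate(1.0) for _ in range(n)]   # fitness drawn from exp(-x)
    degree = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] + x[j] > z:                    # choice function f(x, y)
                degree[i] += 1
                degree[j] += 1
    return x, degree

rng = random.Random(4)            # fixed seed (assumption)
n = 1500                          # illustrative size
z = math.log(n)                   # hypothetical threshold, chosen for illustration
x, degree = threshold_fitness_graph(n, z, rng)

# Vertices sorted by increasing fitness: the degree never decreases along this list.
order = sorted(range(n), key=lambda i: x[i])
for i in order[-5:]:
    print("fitness", round(x[i], 3), "degree", degree[i])
```

No vertex looks at the degree of the others when links are drawn, yet the fittest vertices collect most of the links, which is the "good gets richer" behaviour.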
4.7 FITNESS MODEL
[Figure: different realizations of the model.] Panels a), b), c) have $\rho(x)$ power law with exponent 2.5, 3, 4 respectively. Panel d) has $\rho(x)=\exp(-x)$ and a threshold rule.
4.7 FITNESS MODEL
[Figure: degree distributions.] Degree distribution for case d), with $\rho(x)=\exp(-x)$ and a threshold rule. Degree distribution for cases a), b), c), with $\rho(x)$ power law with exponent 2.5, 3, 4 respectively.
4.7 FITNESS MODEL
The Degree probability distribution P(k) is a functional of $\rho(x)$ and f(x,y).
DIRECT PROBLEM
Given a fitness distribution $\rho(x)$ → which choice function f(x,y) produces scale free graphs? i.e. $P(k) = c\,k^{-\gamma}$
INVERSE PROBLEM
Given a choice function f(x,y) → which fitness distribution $\rho(x)$ produces scale free graphs? i.e. $P(k) = c\,k^{-\gamma}$
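4.7 FITNESS MODEL
Fitness probability distribution: $\rho(x)$
Vertex degree (assumed non decreasing in x, so that it can be inverted): $k(x) = N\int_0^{\infty} f(x,y)\,\rho(y)\,dy$
Vertex degree Probability Distribution: $P(k) = \rho(x(k))\,\frac{dx(k)}{dk}$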
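4.7 FITNESS MODEL
Degree Correlation: $k_{nn}(x) = \frac{N}{k(x)}\int_0^{\infty} f(x,y)\,k(y)\,\rho(y)\,dy$
Vertex Clustering Coefficient: $c(x) = \frac{N^{2}}{k(x)^{2}}\int_0^{\infty}\!\!\int_0^{\infty} f(x,y)\,f(y,z)\,f(z,x)\,\rho(y)\,\rho(z)\,dy\,dz$
Here k(x) is the degree of a vertex of fitness x and $\rho$ is the fitness distribution.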
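4.7 FITNESS MODEL
We impose $P(k) = c\,(k(x))^{-\gamma}$ → $\frac{\rho(x)}{k'(x)} = c\,(k(x))^{-\gamma}$
Multiplying both sides of the equation by k'(x) and integrating from 0 to x:
$R(x) \equiv \int_0^{x}\rho(t)\,dt = \frac{c}{1-\gamma}\left[k(x)^{1-\gamma} - k(0)^{1-\gamma}\right]$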
4.7 FITNESS MODEL
We now have a constraint on the fitness distribution $\rho(x)$ and the choice function f(x,y).
Some exact results
4.7 FITNESS MODEL
Special case f(x,y)=g(x)g(y)
4.7 FITNESS MODEL
Special case f(x,y)=f(x+y)
4.7 FITNESS MODEL
Special case f(x,y)=f(x-y)
4.7 FITNESS MODEL
Using the intrinsic fitness model it is possible to create scale-free networks with any desired power-law exponent.
This is possible for any fitness probability distribution $\rho(x)$: it does not matter if it is (e.g.) exponential, power-law or Gaussian.
We found analytic expressions for the choice function f(x,y) in three cases:
f(x,y)=f(x)f(y) for every $\rho(x)$,
f(x,y)=f(x±y) for $\rho(x)=e^{-x}$.
If f(x,y)=f(x)f(y), both the vertex degree correlation and the clustering coefficient are constant.
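The last statement can be checked in a few lines. The sketch below assumes the usual hidden-variable expressions for the vertex degree, the average neighbour degree and the clustering coefficient (they are an assumption here, not quoted from this slide), writes the factorized choice function as f(x,y)=g(x)g(y), and uses $\langle g \rangle$ and $\langle g^{2} \rangle$ for averages over $\rho(x)$:

```latex
\begin{align*}
k(x) &= N\int f(x,y)\,\rho(y)\,dy = N\,g(x)\,\langle g\rangle \\
k_{nn}(x) &= \frac{N}{k(x)}\int f(x,y)\,k(y)\,\rho(y)\,dy
           = \frac{N \cdot N\,g(x)\,\langle g\rangle\,\langle g^{2}\rangle}{N\,g(x)\,\langle g\rangle}
           = N\,\langle g^{2}\rangle \\
c(x) &= \frac{N^{2}}{k(x)^{2}}\iint f(x,y)\,f(y,z)\,f(z,x)\,\rho(y)\,\rho(z)\,dy\,dz
      = \frac{N^{2}\,g(x)^{2}\,\langle g^{2}\rangle^{2}}{N^{2}\,g(x)^{2}\,\langle g\rangle^{2}}
      = \frac{\langle g^{2}\rangle^{2}}{\langle g\rangle^{2}}
\end{align*}
```

In both results the dependence on x cancels, so neither quantity depends on the fitness (and hence on the degree) of the vertex.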
CONCLUSIONS
There are plenty of models around. To check which one is more likely to reproduce the data we have to check a series of quantities:
Degree distribution
Assortativity
Clustering, Motifs etc.
Not all the ingredients are equally likely:
RANDOM GRAPH: You choose your partner at random.
INTRINSIC FITNESS: You choose your partner if you like her/him.
BARABÁSI-ALBERT: To choose your partner you must know how many partners she/he already had. The larger this number, the better.
COPYING: You choose the partners of your close friends.