Network Computing Laboratory The Structure and Function of Complex Networks (Episode I) M.E.J. Newman, Dept. of Physics, U. Michigan, presented by Sungwon P. Choe
Korea Advanced Institute of Science and Technology Network Computing Laboratory Contents: Overview; Introduction; Networks in the Real World (Social Networks, Information Networks, Technological Networks); Properties of Networks (The Small-World Effect, Transitivity (Clustering), Degree Distributions, Network Resilience, Mixing Patterns, Degree Correlations, Community Structure, Network Navigation, Other Properties)
Overview: One-Line Comment. The author presents a survey of recent techniques and models developed to understand complex networks. And it’s 48 pages (not including references)! I’ll only present the first 20… I chose to do this?!
Introduction: “Complex” Networks? The research focus has shifted from small to large networks: from hundreds of vertices to millions or billions of vertices, and from individual vertex properties to large-scale statistical properties. Why now? Large sets of data are available.
Introduction: Aims. I. Find statistical properties (e.g. path lengths, degree distributions). II. Create models to understand the properties. III. Predict behavior (e.g. how does network structure affect Internet traffic?); still much research to be done.
Networks in the Real World: Social Networks. “…a set of people… with some pattern of contacts or interactions between them”, e.g. friendships, business communities, sexual contacts, collaboration (movie actors), instant messaging, community web sites.
Networks in the Real World: Social Networks. The social sciences were the first to study real-world networks, as early as the 1920s and 30s (Moreno’s work on friendship patterns). Late 60s: Milgram’s “small world” experiments. Acquaintance network experiment: pass a letter to an acquaintance who you think might know a person X. On average, letters passed through 6 people: “6 degrees of separation”, i.e. path length = 6.
Networks in the Real World: Information Networks. Networks of information, a.k.a. knowledge networks, e.g. academic paper citations, WWW pages, preference networks, P2P file-sharing, semantic word
Networks in the Real World: Information Networks. Citation and WWW page networks: power-law in- and out-degree distributions, i.e. the fraction of vertices with degree k falls off as $k^{-\alpha}$ for some constant α (e.g. google.com vs. nclab.kaist.ac.kr/~sungwon). Preference networks: bipartite graphs (vertices of 2 distinct types with edges between them); the vertices are people and the objects they like (books, movies). Basis for collaborative filtering algorithms and recommender systems (Amazon.com, etc.).
Networks in the Real World: Technological Networks. “…man-made networks… for distribution of some commodity or resource”, e.g. the Internet, electric power grid, airline routes, railways, pedestrian traffic, (physical) telephone networks, package/mail delivery. The physical structure of the Internet is difficult to study because the infrastructure is maintained by different organizations. Traceroute programs are often used to reconstruct the network: with a large sample the picture is fairly complete (but… not complete!).
Networks in the Real World: Biological Networks, e.g. metabolic pathways, protein networks, genetic regulatory networks, food webs (predator/prey), neural networks, blood vessels & vascular networks, free energy minima networks.
Properties of Networks: Random Networks. 1950s-60s: early work on networks (Rapoport, Erdős & Rényi). Undirected edges placed at random between n vertices. But… most real networks are not random!
Properties of Networks: The Small-World Effect. Best of random networks (low diameter) and regular networks (regular structure). First speculated in a short story (1929) by Hungarian writer Frigyes Karinthy! Other early mathematical work by Pool and Kochen (’50s) and social science work by Milgram (’60s).
Properties of Networks: The Small-World Effect. Big network, small diameter! How to measure diameter? Mathematically, let $\ell$ be the mean geodesic distance between vertex pairs in an undirected network: $\ell = \frac{1}{\frac{1}{2} n(n+1)} \sum_{i \ge j} d_{ij}$, where $d_{ij}$ is the geodesic distance from vertex i to vertex j. For n vertices and m edges, measuring $\ell$ by breadth-first search takes O(mn) time.
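As a rough illustration of the measurement described on this slide, the sketch below runs one breadth-first search per vertex and averages the distances over all pairs with i ≥ j (self-distances are 0, so the denominator is n(n+1)/2). The function name, the adjacency-dict representation and the toy path graph are assumptions made for the example, not part of the survey.

```python
from collections import deque


def mean_geodesic_distance(adj):
    """Mean geodesic distance of a connected undirected graph.

    adj maps each vertex to an iterable of its neighbours (illustrative format).
    One BFS per vertex costs O(m), so the whole computation is O(mn).
    """
    n = len(adj)
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != n:
            raise ValueError("graph is not connected; some distances are infinite")
        total += sum(dist.values())
    # Every unordered pair was counted twice (once from each endpoint).
    pair_sum = total / 2
    return pair_sum / (n * (n + 1) / 2)


# Toy example (made up): the path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(mean_geodesic_distance(path))
```

For graphs with more than one component this sketch simply raises an error; other conventions for handling unreachable pairs exist and are not shown here.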
Properties of Networks: The Small-World Effect. What’s the “effect”? Information spreads fast, e.g. number of hops for a packet, spread of a disease. Basis of some “games”: Erdős numbers, Kevin Bacon numbers.
Properties of Networks: Transitivity (a.k.a. Clustering). If A is connected to B and B is connected to C, there is a heightened probability that A is connected to C, i.e. the friend of my friend is probably my friend, too! In a network: a greater number of “triangles”.
Properties of Networks: Transitivity (a.k.a. Clustering). Quantified by a clustering coefficient $C = \frac{3 \times \text{number of triangles}}{\text{number of connected triples}}$, where a “connected triple” is a single vertex with edges to 2 others. Example: 1 triangle and eight connected triples, so C = (3 × 1) / 8 = 3/8.
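The 3/8 figure can be reproduced with a small sketch. The five-vertex graph below (a triangle a, b, c with two extra vertices d and e attached to c) is an assumption chosen because it has exactly 1 triangle and 8 connected triples; the function name and data layout are likewise illustrative.

```python
from fractions import Fraction
from itertools import combinations


def global_clustering(adj):
    """C = 3 * (number of triangles) / (number of connected triples).

    adj maps each vertex to the set of its neighbours (illustrative format).
    """
    triples = 0  # connected triples, counted once per centre vertex
    closed = 0   # triples whose two outer vertices are also linked
    for centre, nbrs in adj.items():
        for a, b in combinations(sorted(nbrs), 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    # Each triangle closes three triples (one per corner), so closed == 3 * triangles.
    return Fraction(closed, triples) if triples else Fraction(0)


# Assumed example graph: triangle a-b-c, with d and e hanging off c.
example = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c"},
    "e": {"c"},
}
print(global_clustering(example))  # 3/8
```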
Properties of Networks: Transitivity (a.k.a. Clustering). An alternate definition of C uses a local value: $C_i = \frac{\text{number of triangles connected to vertex } i}{\text{number of triples centered on vertex } i}$. If the vertex degree is 0 or 1 (i.e. numerator and denominator are both 0), $C_i = 0$. Then $C = \frac{1}{n} \sum_i C_i$. This definition is easier to calculate computationally.
Properties of Networks: Transitivity (a.k.a. Clustering). The vertices have local clustering coefficients of, respectively, 1, 1, 1/6, 0 and 0, for a mean of 13/30.
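The local values quoted here can be checked with a short sketch of the local definition (links among a vertex's neighbours divided by the number of neighbour pairs, with 0 for degree 0 or 1). The graph is an assumption: a triangle a, b, c with two degree-1 vertices attached to c, which yields the same five values. Names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def local_clustering(adj, v):
    """Local clustering coefficient of vertex v; 0 if its degree is 0 or 1."""
    nbrs = sorted(adj[v])
    k = len(nbrs)
    if k < 2:
        return Fraction(0)
    # Each linked pair of neighbours forms one triangle through v.
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return Fraction(linked, k * (k - 1) // 2)


def mean_local_clustering(adj):
    """Network-level C as the mean of the local coefficients."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


# Assumed example graph: triangle a-b-c, with d and e hanging off c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c"},
    "e": {"c"},
}
print([str(local_clustering(graph, v)) for v in "abcde"])  # ['1', '1', '1/6', '0', '0']
print(mean_local_clustering(graph))                        # 13/30
```

The two definitions generally give different numbers for the same graph (3/8 versus 13/30 here), so it matters which one a reported value of C refers to.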
Properties of Networks: Degree Distributions. Degree = number of edges incident on a vertex. $p_k$ = fraction of vertices with degree k, i.e. the probability that a vertex chosen at random has degree k. Degree distribution: histogram of the degrees of the vertices.
Properties of Networks: Degree Distributions. Random graphs: each edge is present or absent with equal probability, giving a binomial degree distribution (Poisson for large networks). Most real network distributions: highly right-skewed, with a long right tail.
Properties of Networks: Degree Distributions. Measuring the right tail is difficult: not enough measurements, and the data is noisy. 2 solutions: 1) Histogram with exponentially increasing bin sizes, e.g. 1, 2-3, 4-7, 8-15, etc. Normalize each bin (number of samples / width of bin) and plot on a logarithmic scale, where the bins look evenly spaced. This doesn’t completely fix the problem. 2) Present the degree data with a cumulative distribution function.
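A minimal sketch of the first solution, assuming bins that double in width (1, 2-3, 4-7, 8-15, …) and normalizing each count by its bin width as the slide describes. The function name and the sample degree list are made up for illustration; vertices of degree 0 are left out because they cannot appear on a logarithmic axis.

```python
def log_binned_histogram(degrees):
    """Histogram with bins 1, 2-3, 4-7, 8-15, ... normalised by bin width.

    Returns a list of (lowest degree, highest degree, count / width) tuples.
    Degrees below 1 are ignored (they have no place on a log axis).
    """
    max_k = max(degrees)
    bins = []
    lo = 1
    while lo <= max_k:
        hi = 2 * lo  # this bin covers the degrees lo, lo + 1, ..., hi - 1
        count = sum(1 for k in degrees if lo <= k < hi)
        bins.append((lo, hi - 1, count / (hi - lo)))
        lo = hi
    return bins


# Hypothetical degree sequence.
sample_degrees = [1, 1, 2, 3, 3, 4, 7, 9]
for lo, hi, density in log_binned_histogram(sample_degrees):
    print(f"{lo}-{hi}: {density}")
```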
Properties of Networks: Degree Distributions. Cumulative distribution function: $P_k = \sum_{k'=k}^{\infty} p_{k'}$, i.e. the probability that the degree is ≥ k.
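The cumulative form needs no binning at all. The sketch below computes, for each observed degree k, the fraction of vertices whose degree is at least k; the function name and the sample degrees are assumptions for illustration.

```python
from collections import Counter


def cumulative_degree_distribution(degrees):
    """Map each observed degree k to the fraction of vertices with degree >= k."""
    n = len(degrees)
    counts = Counter(degrees)
    cumulative = {}
    running = 0
    # Walk from the largest degree down, accumulating the tail.
    for k in sorted(counts, reverse=True):
        running += counts[k]
        cumulative[k] = running / n
    return dict(sorted(cumulative.items()))


# Hypothetical degree sequence.
observed = [1, 1, 2, 3, 3, 4, 7, 9]
print(cumulative_degree_distribution(observed))
```

Because every data point contributes to the curve, the tail is less noisy than in a plain histogram, at the cost of successive points being correlated.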
Properties of Networks (figure). X axis: vertex degree k. Y axis: cumulative probability distribution of degrees (fraction of vertices with degree ≥ k). Panels c, d and f appear to have power-law degree distributions (roughly straight lines on log-log scales); panel e is exponential (log-linear scale).
Properties of Networks: Degree Distributions. Power-law degree distributions ($p_k \sim k^{-\alpha}$). Called “scale-free”, but only the degree distribution is scale-free! (Other properties usually scale.) Scale-free means f(ax) = b f(x), where the factor b depends only on a. Seen in: citation networks, WWW, the Internet, metabolic networks, telephone call graphs, human sexual contact networks.
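To see why a power law satisfies the scale-free condition, substitute $f(x) = x^{-\alpha}$ into f(ax) = b f(x); the short derivation below is a standard check rather than something shown on the slide.

```latex
f(ax) = (ax)^{-\alpha} = a^{-\alpha}\, x^{-\alpha} = a^{-\alpha} f(x),
\qquad \text{so } b = a^{-\alpha}.
```

Rescaling the degree axis therefore changes the distribution only by a constant factor, which is why a power law has no characteristic scale.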
Properties of Networks: Degree Distributions. Other common forms: exponential degree distributions (e.g. power grid, railway networks); power laws with exponential cut-offs (e.g. movie actors, some collaboration networks).
Properties of Networks: Network Resilience. How does vertex removal affect networks? Path lengths between vertices increase, and vertex pairs eventually become disconnected. Epidemiology: vertex “removal” = vaccination; vaccination destroys paths to potential victims.
Properties of Networks: Network Resilience. Internet model (Albert et al.), 6000 vertices. Vertex removal: random (blue squares) vs. highest-degree first (red circles). Random removal: virtually no effect. Targeted removal: sharp distance increase.
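The contrast between random and targeted removal can be sketched on a toy graph. Note the swap: the figure tracks mean vertex-vertex distance, whereas this sketch tracks the size of the largest connected component as a simpler proxy. The hub-and-spoke graph, the function names and the fixed random seed are all assumptions for illustration, not the 6000-vertex Internet data.

```python
import random
from collections import deque


def remove_vertex(adj, v):
    """Return a copy of the graph without vertex v (adj: vertex -> set of neighbours)."""
    return {u: nbrs - {v} for u, nbrs in adj.items() if u != v}


def largest_component_size(adj):
    """Size of the largest connected component, found by repeated BFS."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def targeted_attack(adj, steps):
    """Remove the current highest-degree vertex, `steps` times."""
    for _ in range(steps):
        hub = max(adj, key=lambda u: len(adj[u]))
        adj = remove_vertex(adj, hub)
    return adj


def random_failure(adj, steps, rng):
    """Remove `steps` vertices chosen uniformly at random."""
    for v in rng.sample(sorted(adj), steps):
        adj = remove_vertex(adj, v)
    return adj


# Toy graph (made up): hub 0 linked to vertices 1..5, plus one edge 1-2.
toy = {0: {1, 2, 3, 4, 5}, 1: {0, 2}, 2: {0, 1}, 3: {0}, 4: {0}, 5: {0}}
print(largest_component_size(targeted_attack(toy, 1)))
print(largest_component_size(random_failure(toy, 1, random.Random(0))))
```

On a graph this small a random failure hits the hub one time in six, so the effect is only suggestive; the qualitative gap between the two strategies is what the slide reports for the much larger Internet model.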
Properties of Networks: Mixing Patterns. Which vertices pair with which others? E.g. food web: links between plants and herbivores, herbivores and carnivores. Internet: links between end-users and ISPs, ISPs and backbones.
Properties of Networks: Mixing Patterns, a.k.a. assortative mixing. 1958 couples in San Francisco:
Properties of Networks: Mixing Patterns. How to quantify? Let $E_{ij}$ = number of edges connecting vertices of type i and type j, and let E be the matrix with elements $E_{ij}$ (like the couple table). Then the normalized mixing matrix is $e = \frac{E}{\|E\|}$, where $\|E\|$ is the sum of all elements of E. Element $e_{ij}$ of e = fraction of edges between type i and type j.
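The normalization step amounts to dividing every entry by the grand total. In the sketch below the 2×2 edge-count matrix is a made-up example (it is not the San Francisco couples data), and the function name is illustrative.

```python
def normalize_mixing_matrix(E):
    """Divide every entry of the edge-count matrix E by the sum of all entries."""
    total = sum(sum(row) for row in E)
    return [[count / total for count in row] for row in E]


# Hypothetical edge counts between two vertex types.
E_counts = [[40, 10],
            [10, 40]]
e_matrix = normalize_mixing_matrix(E_counts)
print(e_matrix)
```

After normalization the entries sum to 1, so each one can be read as the fraction of edges joining the corresponding pair of types.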
Properties of Networks: Mixing Patterns. Given a vertex of type i, what is the probability that its neighbor is of type j? $P(j \mid i) = e_{ij} / \sum_j e_{ij}$. Assortativity coefficient: $r = \frac{\operatorname{Tr} e - \|e^2\|}{1 - \|e^2\|}$ (Tr = trace, the sum of the diagonal elements; $\|e^2\|$ = sum of all elements of $e^2$).
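Both quantities on this slide follow directly from the normalized mixing matrix e: the conditional probability is e_ij divided by the sum of row i, and the assortativity coefficient is (Tr e − ‖e²‖) / (1 − ‖e²‖), where ‖e²‖ is the sum of all elements of e². The sketch uses the identity that this sum equals the sum over types of (row sum × column sum). The example matrix is hypothetical and the function names are illustrative.

```python
def conditional_neighbor_probability(e, i, j):
    """P(j | i): probability that a neighbour of a type-i vertex is of type j."""
    return e[i][j] / sum(e[i])


def assortativity_coefficient(e):
    """r = (Tr e - ||e^2||) / (1 - ||e^2||) for a normalised mixing matrix e."""
    n = len(e)
    trace = sum(e[i][i] for i in range(n))
    row_sums = [sum(e[i]) for i in range(n)]
    col_sums = [sum(e[i][j] for i in range(n)) for j in range(n)]
    # Sum of all elements of e*e equals sum_k (column sum k) * (row sum k).
    sum_e_squared = sum(a * b for a, b in zip(row_sums, col_sums))
    return (trace - sum_e_squared) / (1 - sum_e_squared)


# Hypothetical normalised mixing matrix for two vertex types.
e_example = [[0.4, 0.1],
             [0.1, 0.4]]
print(conditional_neighbor_probability(e_example, 0, 0))
print(assortativity_coefficient(e_example))
```

A matrix with all its weight on the diagonal gives r = 1 (perfectly assortative), while r = 0 corresponds to no preference for like over unlike types.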
Properties of Networks: Degree Correlations. A special case of assortative mixing: do high-degree vertices preferentially attach to other high-degree vertices, or to low-degree vertices? Newman calculated the Pearson correlation coefficient r of the degrees at either end of an edge. If r > 0, assortative; if r < 0, disassortative.
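A minimal sketch of that correlation, assuming an undirected graph given as an edge list: each edge contributes the degrees at its two ends in both orientations, and the Pearson coefficient is computed over those pairs. The function name and the two toy graphs are assumptions for illustration.

```python
def degree_correlation(edges):
    """Pearson correlation of the degrees at the two ends of each undirected edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    count = len(xs)
    mean = sum(xs) / count  # xs and ys hold the same values, so they share a mean
    covariance = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / count
    variance = sum((x - mean) ** 2 for x in xs) / count
    if variance == 0:
        raise ValueError("all edge ends have the same degree; r is undefined")
    return covariance / variance


# Toy graphs (made up): a star and a four-vertex path.
star = [(0, 1), (0, 2), (0, 3)]
path_edges = [(0, 1), (1, 2), (2, 3)]
print(degree_correlation(star))
print(degree_correlation(path_edges))
```

The star is the extreme disassortative case, since every edge joins the single high-degree hub to a degree-1 leaf.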
Properties of Networks: Community Structure. Groups of vertices with a high density of edges within them and a lower density of edges between groups.
Top-bottom split: high school/middle school
Properties of Networks: Community Structure. Examined with cluster analysis (a.k.a. hierarchical clustering): assign a “connection strength” to vertex pairs, then add edges in decreasing order of strength, e.g. “edge betweenness”, the count of geodesic paths through an edge. The process is represented by a dendrogram: a tree of union operations between vertex sets.
Properties of Networks: Network Navigation. Milgram’s experiment showed the existence of the “small world effect” with short diameter. Kleinberg pointed out another result in 2000: ordinary people are good at finding short paths! Creating easy-to-navigate artificial networks could result in efficient database structures and better P2P networks.
Properties of Networks: Other Properties. Largest component. “Betweenness centrality”: the number of geodesic paths between other vertices that run through a vertex i (the vertex counterpart of “edge betweenness”); also a measure of network resilience. Recurrent motifs: small subgraphs that occur repeatedly.