Network properties Slides are modified from Networks: Theory and Application by Lada Adamic
How do you characterize it? Outline What is a network? a bunch of nodes and edges How do you characterize it? with some basic network metrics Network models
What are networks? Networks are collections of points joined by lines. “Network” ≡ “Graph” node edge points lines vertices edges, arcs math nodes links computer science sites bonds physics actors ties, relations sociology
Network elements: edges Directed (also called arcs) A -> B A likes B, A gave a gift to B, A is B’s child Undirected A <-> B or A – B A and B like each other A and B are siblings A and B are co-authors Edge attributes weight (e.g. frequency of communication) ranking (best friend, second best friend…) type (friend, relative, co-worker) properties depending on the structure of the rest of the graph: e.g. betweenness
Directed networks girls’ school dormitory dining-table partners (Moreno, The sociometry reader, 1960) first and second choices shown 2 1 Ada Cora Louise Jean Helen Martha Alice Robin Marion Maxine Lena Hazel Hilda Frances Eva Ruth Edna Adele Jane Anna Mary Betty Ella Ellen Laura Irene
Edge weights can have positive or negative values One gene activates/ inhibits another One person trusting/ distrusting another Research challenge: How does one ‘propagate’ negative feelings in a social network? Is my enemy’s enemy my friend? Transcription regulatory network in baker’s yeast
Representing edges (who is adjacent to whom) as a matrix Adjacency matrices Representing edges (who is adjacent to whom) as a matrix Aij = 1 if node i has an edge to node j = 0 if node i does not have an edge to j Aii = 0 unless the network has self-loops Aij = Aji if the network is undirected, or if i and j share a reciprocated edge j i i i j 1 2 3 4 Example: 5 1 A =
Adjacency lists Edge list Adjacency list 2 3 2 4 3 2 3 4 4 5 5 2 5 1 Adjacency list is easier to work with if network is large sparse quickly retrieve all neighbors for a node 1: 2: 3 4 3: 2 4 4: 5 5: 1 2 2 3 1 4 5
Node network properties Nodes Node network properties from immediate connections indegree how many directed edges (arcs) are incident on a node outdegree how many directed edges (arcs) originate at a node degree (in or out) number of edges incident on a node indegree=3 outdegree=2 degree=5
Node degree from matrix values 2 Node degree from matrix values 3 1 4 1 5 Outdegree = A = example: outdegree for node 3 is 2, which we obtain by summing the number of non-zero entries in the 3rd row 1 Indegree = A = example: the indegree for node 3 is 1, which we obtain by summing the number of non-zero entries in the 3rd column
Characterizing networks: Is everything connected?
Network metrics: connected components Strongly connected components Each node within the component can be reached from every other node in the component by following directed links B F G Strongly connected components B C D E A G H F C A H D E Weakly connected components: every node can be reached from every other node by following links in either direction A B C D E F G H Weakly connected components A B C D E G H F In undirected networks one talks simply about ‘connected components’
network metrics: size of giant component if the largest component encompasses a significant fraction of the graph, it is called the giant component
How do you characterize it? Outline What is a network? a bunch of nodes and edges How do you characterize it? with some basic network metrics Network models
Several other graph metrics exist Structural Metrics Degree distribution Average path length Centrality Betweenness Closeness Graph density Clustering coefficient Several other graph metrics exist Assortativity Modularity …
degree sequence and degree distribution Degree sequence: An ordered list of the (in,out) degree of each node In-degree sequence: [2, 2, 2, 1, 1, 1, 1, 0] Out-degree sequence: [2, 2, 2, 2, 1, 1, 1, 0] (undirected) degree sequence: [3, 3, 3, 2, 2, 1, 1, 1] Degree distribution: A frequency count of the occurrence of each degree In-degree distribution: [(2,3) (1,4) (0,1)] Out-degree distribution: [(2,4) (1,3) (0,1)] (undirected) distribution: [(3,3) (2,2) (1,3)]
Structural Metrics: Degree distribution
Characterizing networks: How far apart are things?
Structural metrics: Average path length
Characterizing networks: Who is most central?
Centrality: betweenness The fraction of all directed paths between any two vertices that pass through a node paths between j and k that pass through i betweenness of vertex i all paths between j and k Normalization undirected: (N-1)*(N-2)/2 directed graph: (N-1)*(N-2) e.g.
Centrality: closeness How close the vertex is to others depends on inverse distance to other vertices Normalization
network metrics: graph density Of the connections that may exist between n nodes directed graph emax = n*(n-1) undirected graph emax = n*(n-1)/2 What fraction are present? density = e/ emax For example, out of 12 possible connections, this graph has 7, giving it a density of 7/12 = 0.583 Would this measure be useful for comparing networks of different sizes (different numbers of nodes)?
Structural Metrics: Clustering coefficient
How do you characterize it? Outline What is a network? a bunch of nodes and edges How do you characterize it? with some basic network metrics Network models
Four structural models Regular networks Random networks Small-world networks Scale-free networks
Regular networks – fully connected
Regular networks – Lattice
Regular networks – Lattice: ring world
modeling networks: random networks Nodes connected at random Number of edges incident on each node is Poisson distributed Poisson distribution
Random networks
Erdos-Renyi random graphs What happens to the size of the giant component as the density of the network increases? http://ccl.northwestern.edu/netlogo/models/run.cgi?GiantComponent.884.534
Random Networks
modeling networks: small worlds a friend of a friend is also frequently a friend but only six hops separate any two people in the world Arnold S. – thomashawk, Flickr; http://creativecommons.org/licenses/by-nc/2.0/deed.en
Duncan Watts and Steven Strogatz Small world models Duncan Watts and Steven Strogatz a few random links in an otherwise structured graph make the network a small world: the average shortest path is short regular lattice: my friend’s friend is always my friend small world: mostly structured with a few random connections random graph: all connections random Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.
Watts Strogatz Small World Model As you rewire more and more of the links and random, what happens to the clustering coefficient and average shortest path relative to their values for the regular lattice? http://projects.si.umich.edu/netlearn/NetLogo4/SmallWorldWS.html
Small-world networks
Scale-free networks
Many real world networks contain hubs: highly connected nodes Scale-free networks Many real world networks contain hubs: highly connected nodes Usually the distribution of edges is extremely skewed many nodes with few edges number of nodes with so many edges fat tail: a few nodes with a very large number of edges number of edges no “typical” number of edges
But is it really a power-law? A power-law will appear as a straight line on a log-log plot: A deviation from a straight line could indicate a different distribution: exponential lognormal log(# nodes) log(# edges)
Scale-free networks