Λ14 Διαδικτυακά Κοινωνικά Δίκτυα και Μέσα Network Measurements.

Slides:



Advertisements
Similar presentations
Analysis and Modeling of Social Networks Foudalis Ilias.
Advertisements

VL Netzwerke, WS 2007/08 Edda Klipp 1 Max Planck Institute Molecular Genetics Humboldt University Berlin Theoretical Biophysics Networks in Metabolism.
Information Networks Generative processes for Power Laws and Scale-Free networks Lecture 4.
Advanced Topics in Data Mining Special focus: Social Networks.
On Power-Law Relationships of the Internet Topology Michalis Faloutsos Petros Faloutsos Christos Faloutsos.
CS 599: Social Media Analysis University of Southern California1 The Basics of Network Analysis Kristina Lerman University of Southern California.
Web Graph Characteristics Kira Radinsky All of the following slides are courtesy of Ronny Lempel (Yahoo!)
Weighted networks: analysis, modeling A. Barrat, LPT, Université Paris-Sud, France M. Barthélemy (CEA, France) R. Pastor-Satorras (Barcelona, Spain) A.
1 Evolution of Networks Notes from Lectures of J.Mendes CNR, Pisa, Italy, December 2007 Eva Jaho Advanced Networking Research Group National and Kapodistrian.
Complex Networks Third Lecture TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AA TexPoint fonts used in EMF. Read the.
Network Models Social Media Mining. 2 Measures and Metrics 2 Social Media Mining Network Models Why should I use network models? In may 2011, Facebook.
CS 728 Lecture 4 It’s a Small World on the Web. Small World Networks It is a ‘small world’ after all –Billions of people on Earth, yet every pair separated.
Web as Graph – Empirical Studies The Structure and Dynamics of Networks.
Peer-to-Peer and Grid Computing Exercise Session 3 (TUD Student Use Only) ‏
Global topological properties of biological networks.
Advanced Topics in Data Mining Special focus: Social Networks.
1 Algorithms for Large Data Sets Ziv Bar-Yossef Lecture 7 May 14, 2006
Data Mining for Complex Network Introduction and Background.
Network analysis and applications Sushmita Roy BMI/CS 576 Dec 2 nd, 2014.
Systems Biology, April 25 th 2007Thomas Skøt Jensen Technical University of Denmark Networks and Network Topology Thomas Skøt Jensen Center for Biological.
Summary from Previous Lecture Real networks: –AS-level N= 12709, M=27384 (Jan 02 data) route-views.oregon-ix.net, hhtp://abroude.ripe.net/ris/rawdata –
Statistical Properties of Massive Graphs (Networks) Networks and Measurements.
Peer-to-Peer and Social Networks Random Graphs. Random graphs E RDÖS -R ENYI MODEL One of several models … Presents a theory of how social webs are formed.
Large-scale organization of metabolic networks Jeong et al. CS 466 Saurabh Sinha.
The Erdös-Rényi models
Optimization Based Modeling of Social Network Yong-Yeol Ahn, Hawoong Jeong.
Information Networks Power Laws and Network Models Lecture 3.
(Social) Networks Analysis III Prof. Dr. Daning Hu Department of Informatics University of Zurich Oct 16th, 2012.
Topic 13 Network Models Credits: C. Faloutsos and J. Leskovec Tutorial
Models and Algorithms for Complex Networks Networks and Measurements Lecture 3.
Biological Networks Lectures 6-7 : February 02, 2010 Graph Algorithms Review Global Network Properties Local Network Properties 1.
Models and Algorithms for Complex Networks Power laws and generative processes.
Clustering of protein networks: Graph theory and terminology Scale-free architecture Modularity Robustness Reading: Barabasi and Oltvai 2004, Milo et al.
COLOR TEST COLOR TEST. Social Networks: Structure and Impact N ICOLE I MMORLICA, N ORTHWESTERN U.
School of Information Sciences University of Pittsburgh TELCOM2125: Network Science and Analysis Konstantinos Pelechrinis Spring 2013 Figures are taken.
Online Social Networks and Media
Social Network Analysis Prof. Dr. Daning Hu Department of Informatics University of Zurich Mar 5th, 2013.
Analysis of biological networks Part III Shalev Itzkovitz Shalev Itzkovitz Uri Alon’s group Uri Alon’s group July 2005 July 2005.
Complex Networks Measures and deterministic models Philippe Giabbanelli.
Network Computing Laboratory The Structure and Function of Complex Networks (Episode I) M.E.J. Newman, Dept. of Physics, U. Michigan, presented.
Lecture 10: Network models CS 765: Complex Networks Slides are modified from Networks: Theory and Application by Lada Adamic.
LECTURE 2 1.Complex Network Models 2.Properties of Protein-Protein Interaction Networks.
Most of contents are provided by the website Network Models TJTSD66: Advanced Topics in Social Media (Social.
Online Social Networks and Media Network Measurements.
Clusters Recognition from Large Small World Graph Igor Kanovsky, Lilach Prego Emek Yezreel College, Israel University of Haifa, Israel.
Bioinformatics Center Institute for Chemical Research Kyoto University
How Do “Real” Networks Look?
Class 2: Graph Theory IST402. Can one walk across the seven bridges and never cross the same bridge twice? Network Science: Graph Theory THE BRIDGES OF.
Information Retrieval Search Engine Technology (10) Prof. Dragomir R. Radev.
Hierarchical Organization in Complex Networks by Ravasz and Barabasi İlhan Kaya Boğaziçi University.
Algorithms and Computational Biology Lab, Department of Computer Science and & Information Engineering, National Taiwan University, Taiwan Network Biology.
Complex Networks Mark Jelasity. 2 ● Where are the networks? – Some example computer systems ● WWW, Internet routers, software components – Large decentralized.
Lecture 23: Structure of Networks
Structures of Networks
Topics In Social Computing (67810)
How Do “Real” Networks Look?
Lecture 23: Structure of Networks
Network Science: A Short Introduction i3 Workshop
Section 8.6 of Newman’s book: Clustering Coefficients
How Do “Real” Networks Look?
How Do “Real” Networks Look?
Peer-to-Peer and Social Networks Fall 2017
How Do “Real” Networks Look?
Network Science: A Short Introduction i3 Workshop
Clustering Coefficients
Lecture 23: Structure of Networks
Lecture 9: Network models CS 765: Complex Networks
Network Models Michael Goodrich Some slides adapted from:
Advanced Topics in Data Mining Special focus: Social Networks
Advanced Topics in Data Mining Special focus: Social Networks
Presentation transcript:

Λ14 Διαδικτυακά Κοινωνικά Δίκτυα και Μέσα Network Measurements

Measuring Networks Degree distributions and power-laws Clustering Coefficient Small world phenomena Components Motifs Homophily

The basic random graph model The measurements on real networks are usually compared against those on “random networks” The basic G n,p (Erdös-Renyi) random graph model: – n : the number of vertices – 0 ≤ p ≤ 1 – for each pair (i,j), generate the edge (i,j) independently with probability p – Expected degree of a node: z = np

Degree distributions Problem: find the probability distribution that best fits the observed data degree frequency k fkfk f k = fraction of nodes with degree k = probability of a randomly selected node to have degree k

Power-law distributions The degree distributions of most real-life networks follow a power law Right-skewed/Heavy-tail distribution – there is a non-negligible fraction of nodes that has very high degree (hubs) – scale-free: no characteristic scale, average is not informative In stark contrast with the random graph model! – Poisson degree distribution, z=np – highly concentrated around the mean – the probability of very high degree nodes is exponentially small p(k) = Ck -α

Power-law signature Power-law distribution gives a line in the log-log plot α : power-law exponent (typically 2 ≤ α ≤ 3) degree frequency log degree log frequency α log p(k) = -α logk + logC

Examples Taken from [Newman 2003]

A random graph example

Exponential distribution Observed in some technological or collaboration networks Identified by a line in the log-linear plot p(k) = λe -λk log p(k) = - λk + log λ degree log frequency λ

Measuring power-laws How do we create these plots? How do we measure the power-law exponent? Collect a set of measurements: – E.g., the degree of each page, the number of appearances of each word in a document, the size of solar flares(continuous) Create a value histogram – For discrete values, number of times each value appears – For continuous values (but also for discrete): Break the range of values into bins of equal width Sum the count of values in the bin Represent the bin by the mean (median) value Plot the histogram in log-log scale – Bin representatives vs Value in the bin

Discrete Counts

Measuring power laws Simple binning produces a noisy plot

Logarithmic binning Exponential binning – Create bins that grow exponentially in size – In each bin divide the sum of counts by the bin length (number of observations per bin unit) Still some noise at the tail

Cumulative distribution Compute the cumulative distribution – P[X≥x]: fraction (or number) of observations that have value at least x – It also follows a power-law with exponent α-1

Pareto distribution A random variable follows a Pareto distribution if Power law distribution with exponent α=1+β

Zipf plot There is another easy way to see the power- law, by doing the Zipf plot – Order the values in decreasing order – Plot the values against their rank in log-log scale i.e., for the r-th value x r, plot the point (log(r),log(x r )) – If there is a power-law you should see something a straight line

Zipf’s Law A random variable X follows Zipf’s law if the r-th largest value x r satisfies Same as Pareto distribution X follows a power-law distribution with α=1+1/γ Named after Zipf, who studied the distribution of words in English language and found Zipf law with exponent 1

Zipf vs Pareto

Computing the exponent

Collective Statistics (M. Newman 2003)

Power Laws - Recap A (continuous) random variable X follows a power- law distribution if it has density function A (continuous) random variable X follows a Pareto distribution if it has cumulative function A (discrete) random variable X follows Zipf’s law if the frequency of the r-th largest value satisfies power-law with α=1+β power-law with α=1+1/γ

Average/Expected degree

Maximum degree For random graphs, the maximum degree is highly concentrated around the average degree z For power law graphs Rough argument: solve nP[X≥k]=1

The 80/20 rule Top-heavy: Small fraction of values collect most of distribution mass

The effect of exponent

Generating power-law values

Clustering (Transitivity) coefficient Measures the density of triangles (local clusters) in the graph Two different ways to measure it: The ratio of the means

Example

Clustering (Transitivity) coefficient Clustering coefficient for node i The mean of the ratios

Example The two clustering coefficients give different measures C (2) increases with nodes with low degree

Collective Statistics (M. Newman 2003)

Clustering coefficient for random graphs The probability of two of your neighbors also being neighbors is p, independent of local structure – clustering coefficient C = p – when the average degree z=np is constant C =O(1/n)

Small worlds Millgram’s experiment: Letters were handed out to people in Nebraska to be sent to a target in Boston People were instructed to pass on the letters to someone they knew on first-name basis The letters that reached the destination followed paths of length around 6 Six degrees of separation: (play of John Guare) Also: – The Kevin Bacon game – The Erdös number Small world project:

Measuring the small world phenomenon d ij = shortest path between i and j Diameter: Characteristic path length: Harmonic mean Also, distribution of all shortest paths Problem if no path between two nodes

Collective Statistics (M. Newman 2003)

Small worlds in real networks For all real networks there are (on average) short paths between nodes of the network. – Largest path found in the IMDB actor network: 7 Is this interesting? – Random graphs also have small diameter (d=logn/loglogn when z=ω(logn)) Short paths are not surprising and should be combined with other properties – ease of navigation – high clustering coefficient

Connected components For undirected graphs, the size and distribution of the connected components – is there a giant component? – Most known real undirected networks have a giant component For directed graphs, the size and distribution of strongly and weakly connected components

Connected components – definitions Weakly connected components (WCC) – Set of nodes such that from any node can go to any node via an undirected path Strongly connected components (SCC) – Set of nodes such that from any node can go to any node via a directed path. – IN: Nodes that can reach the SCC (but not in the SCC) – OUT: Nodes reachable by the SCC (but not in the SCC) SCC WCC

The bow-tie structure of the Web The largest weakly connected component contains 90% of the nodes

SCC and WCC distribution The SCC and WCC sizes follows a power law distribution – the second largest SCC is significantly smaller

Another bow-tie Who lends to whom

Web Cores Cores: Small complete bipartite graphs (of size 3x3, 4x3, 4x4) Found more frequently than expected on the Web graph Correspond to communities of enthusiasts (e.g., fans of japanese rock bands)

Motifs Most networks have the same characteristics with respect to global measurements – can we say something about the local structure of the networks? Motifs: Find small subgraphs that over- represented in the network

Example Motifs of size 3 in a directed graph

Finding interesting motifs Sample a part of the graph of size S Count the frequency of the motifs of interest Compare against the frequency of the motif in a random graph with the same number of nodes and the same degree distribution

Generating a random graph Find edges (i,j) and (x,y) such that edges (i,y) and (x,j) do not exist, and swap them – repeat for a large enough number of times i j x y G i j x y G-swapped degrees of i,j,x,y are preserved

The feed-forward loop Over-represented in gene-regulation networks – a signal delay mechanism X YZ Milo et al. 2002

Homophily Love of the same: People tend to have friends with common interests – Students separated by race and age

Measuring homophily Friendships in elementary school The connections of people with the same interests should be higher than on a random experiment

References M. E. J. Newman, The structure and function of complex networks, SIAM Reviews, 45(2): , 2003 R. Albert and A.-L. Barabási, Statistical mechanics of complex networks, Reviews of Modern Physics 74, (2002). S. N. Dorogovstev and J. F. F. Mendez, Evolution of Networks: From Biological Nets to the Internet and WWW. Michalis Faloutsos, Petros Faloutsos and Christos Faloutsos. On Power-Law Relationships of the Internet Topology. ACM SIGCOMM E. Ravasz, A. L. Somera, D. A. Mongru, Z. N. Oltvai, and A.-L. Barabási, Hierarchical organization of modularity in metabolic networks, Science 297, (2002). R Milo, S Shen-Orr, S Itzkovitz, N Kashtan, D Chklovskii & U Alon, Network Motifs: Simple Building Blocks of Complex Networks. Science, 298: (2002). R Milo, S Itzkovitz, N Kashtan, R Levitt, S Shen-Orr, I Ayzenshtat, M Sheffer & U Alon, Superfamilies of designed and evolved networks. Science, 303: (2004).